home *** CD-ROM | disk | FTP | other *** search
- Tech Chapter 3 - Delayed indexing MetalBase 5.1
- -------------------------------------------------------------------------------
-
- If you take a look at the chapter on MetalBase's binary tree algorithms, you'll
- notice that the operation within mb_add() which links a new record into
- all a relation's indices is a complete, distinct routine. Without this
- routine, a record would exist in the physical heap, but would not be
- accessed by following any indices in any direction.
-
- Sounds awful, doesn't it?
-
- It's not all that bad. as of MetalBase 5.1, functions have been added which
- allow you to deal with records which have not been linked into the indices.
- The primary benefit of this is that add operations are nearly instantaneous;
- for example, if you were importing 5,000 records into a 10,000-record
- relation, mb_add_q() would be infinitely faster than using mb_add(). The
- downside is that these records can't be queried as well as others; to make
- them do that, there are two ways to get them into the primary heap:
-
- 1st- the routine mb_xfer() will take the first unindexed record that
- it finds and index it. In essence, calling mb_add_q() followed
- by mb_xfer() is identical to just calling mb_add(). The advantage
- here is that you can index records when your computer is free.
-
- 2nd- the utility MBDIAG, when it rebuilds indices, will take into account
- -all- records, including those not previously indexed. In this
- way, the 5000 records mentioned above can be indexed in a much,
- much shorter amount of time.
-
-
- To make this as fast as possible, records which haven't been indexed are
- kept in a contiguous chunk at the end of the .REL file:
-
- [ relation header ]
- [ idx1 ][ idx2 ][ idx3 ][ record 0 ] #0
- [ idx1 ][ idx2 ][ idx3 ][ record 1 ] #1
- [ idx1 ][ idx2 ][ idx3 ][ record 2 ] #2
- [ idx1 ][ idx2 ][ idx3 ][ record 3 ] #3
- [ 0 ][ 0 ][ 0 ][ record 4 ] #4
- [ 0 ][ 0 ][ 0 ][ record 5 ] #5
- [ 0 ][ 0 ][ 0 ][ record 6 ] #6
-
- In the example above, records 4-6 have not been indexed. If mb_xfer were
- called at this point, record 4 would be indexed, and 5 and 6 would be left
- in this unindexed-record heap.
-
-
- MB_ADD() ----------------------------------------------------------------------
-
- However, this leads to the problem of keeping all those records in a chunk
- when you've used mb_add(), or mb_del()'ed a record which is indexed. First
- off, mb_add()... in the case below, "record 7" has been added with mb_add()
- (so it had to be -before- the secondary heap, into position #4):
-
- [ relation header ]
- [ idx1 ][ idx2 ][ idx3 ][ record 0 ] #0
- [ idx1 ][ idx2 ][ idx3 ][ record 1 ] #1
- [ idx1 ][ idx2 ][ idx3 ][ record 2 ] #2
- [ idx1 ][ idx2 ][ idx3 ][ record 3 ] #3
- [ idx1 ][ idx2 ][ idx3 ][ record 7 ] #4
- [ 0 ][ 0 ][ 0 ][ record 5 ] #5
- [ 0 ][ 0 ][ 0 ][ record 6 ] #6
- [ 0 ][ 0 ][ 0 ][ record 4 ] #7
-
- So "record 4" was rolled to the end of the secondary heap, leaving a block
- of space for "record 7" to be inserted. Simple, eh?
-
- Note that the normal _check_dup() function uses _search() to find duplicates,
- which only knows about the primary (indexed) heap. So, before mb_add(),
- mb_add_q() and mb_upd() can accept a record, it has to do a second check
- throughout the entire secondary heap to ensure no nodups index will be
- violated by the addition.
-
-
- MB_DEL() ----------------------------------------------------------------------
-
- Now what about mb_del()? Using the above chart--if you deleted "record 5",
- "record 4" would replace it and that's that--no indices to fix, because
- it isn't indexed. It's not that easy if you delete, say, "record 2". In
- that case, the chart would end up looking like this:
-
- [ relation header ]
- [ idx1 ][ idx2 ][ idx3 ][ record 0 ] #0
- [ idx1 ][ idx2 ][ idx3 ][ record 1 ] #1
- [ idx1 ][ idx2 ][ idx3 ][ record 7 ] #2
- [ idx1 ][ idx2 ][ idx3 ][ record 3 ] #3
- [ 0 ][ 0 ][ 0 ][ record 4 ] #4
- [ 0 ][ 0 ][ 0 ][ record 5 ] #5
- [ 0 ][ 0 ][ 0 ][ record 6 ] #6
-
- So how did it get there? Well, first, it unlinked "record 2" from all
- relevant indices. Then it found the last record in the primary heap,
- "record 7", and wrote it over "record 2"'s slot... and corrected all the
- indices to point to slot #2 instead of slot #4, where "record 7" was
- before it moved. That left a hole at slot #4, so "record 4", in the
- last slot in the secondary heap, was rolled backward over that slot.
- Simple.
-
-
- MB_SEL() ----------------------------------------------------------------------
-
- This one was really, really simple... basically because LTEQ, LTHAN, EQUAL,
- GTHAN and GTEQ aren't supported on unindexed records. FIRST, NEXT, LAST
- and PREVIOUS are, however, so there's a separate routine in mb_misc.c
- called _find_queue(), which simply moves rel->pos and reads in the
- appropriate record.
-
-
- MB_UPD() ----------------------------------------------------------------------
-
- This one was even more simple than the others; basically, if updating an
- indexed record, the routine is the same as always. If updating an unindexed
- record, we just overwrite the record data with the new data, and that's
- it.
-
-