Il CD di internet

home *** CD-ROM | disk | FTP | other *** search

/ Il CD di internet / CD.iso / SOURCE / CONTRIB / MBASE / MBASE51.TAR / mbase51 / tech / chapter.3 < prev next >

Wrap

Text File | 1993-09-04 | 5.4 KB | 116 lines

Tech Chapter 3 - Delayed indexing MetalBase 5.1 ------------------------------------------------------------------------------- If you take a look at the chapter on MetalBase's binary tree algorithms, you'll notice that the operation within mb_add() which links a new record into all a relation's indices is a complete, distinct routine. Without this routine, a record would exist in the physical heap, but would not be accessed by following any indices in any direction. Sounds awful, doesn't it? It's not all that bad. as of MetalBase 5.1, functions have been added which allow you to deal with records which have not been linked into the indices. The primary benefit of this is that add operations are nearly instantaneous; for example, if you were importing 5,000 records into a 10,000-record relation, mb_add_q() would be infinitely faster than using mb_add(). The downside is that these records can't be queried as well as others; to make them do that, there are two ways to get them into the primary heap: 1st- the routine mb_xfer() will take the first unindexed record that it finds and index it. In essence, calling mb_add_q() followed by mb_xfer() is identical to just calling mb_add(). The advantage here is that you can index records when your computer is free. 2nd- the utility MBDIAG, when it rebuilds indices, will take into account -all- records, including those not previously indexed. In this way, the 5000 records mentioned above can be indexed in a much, much shorter amount of time. To make this as fast as possible, records which haven't been indexed are kept in a contiguous chunk at the end of the .REL file: [ relation header ] [ idx1 ][ idx2 ][ idx3 ][ record 0 ] #0 [ idx1 ][ idx2 ][ idx3 ][ record 1 ] #1 [ idx1 ][ idx2 ][ idx3 ][ record 2 ] #2 [ idx1 ][ idx2 ][ idx3 ][ record 3 ] #3 [ 0 ][ 0 ][ 0 ][ record 4 ] #4 [ 0 ][ 0 ][ 0 ][ record 5 ] #5 [ 0 ][ 0 ][ 0 ][ record 6 ] #6 In the example above, records 4-6 have not been indexed. If mb_xfer were called at this point, record 4 would be indexed, and 5 and 6 would be left in this unindexed-record heap. MB_ADD() ---------------------------------------------------------------------- However, this leads to the problem of keeping all those records in a chunk when you've used mb_add(), or mb_del()'ed a record which is indexed. First off, mb_add()... in the case below, "record 7" has been added with mb_add() (so it had to be -before- the secondary heap, into position #4): [ relation header ] [ idx1 ][ idx2 ][ idx3 ][ record 0 ] #0 [ idx1 ][ idx2 ][ idx3 ][ record 1 ] #1 [ idx1 ][ idx2 ][ idx3 ][ record 2 ] #2 [ idx1 ][ idx2 ][ idx3 ][ record 3 ] #3 [ idx1 ][ idx2 ][ idx3 ][ record 7 ] #4 [ 0 ][ 0 ][ 0 ][ record 5 ] #5 [ 0 ][ 0 ][ 0 ][ record 6 ] #6 [ 0 ][ 0 ][ 0 ][ record 4 ] #7 So "record 4" was rolled to the end of the secondary heap, leaving a block of space for "record 7" to be inserted. Simple, eh? Note that the normal _check_dup() function uses _search() to find duplicates, which only knows about the primary (indexed) heap. So, before mb_add(), mb_add_q() and mb_upd() can accept a record, it has to do a second check throughout the entire secondary heap to ensure no nodups index will be violated by the addition. MB_DEL() ---------------------------------------------------------------------- Now what about mb_del()? Using the above chart--if you deleted "record 5", "record 4" would replace it and that's that--no indices to fix, because it isn't indexed. It's not that easy if you delete, say, "record 2". In that case, the chart would end up looking like this: [ relation header ] [ idx1 ][ idx2 ][ idx3 ][ record 0 ] #0 [ idx1 ][ idx2 ][ idx3 ][ record 1 ] #1 [ idx1 ][ idx2 ][ idx3 ][ record 7 ] #2 [ idx1 ][ idx2 ][ idx3 ][ record 3 ] #3 [ 0 ][ 0 ][ 0 ][ record 4 ] #4 [ 0 ][ 0 ][ 0 ][ record 5 ] #5 [ 0 ][ 0 ][ 0 ][ record 6 ] #6 So how did it get there? Well, first, it unlinked "record 2" from all relevant indices. Then it found the last record in the primary heap, "record 7", and wrote it over "record 2"'s slot... and corrected all the indices to point to slot #2 instead of slot #4, where "record 7" was before it moved. That left a hole at slot #4, so "record 4", in the last slot in the secondary heap, was rolled backward over that slot. Simple. MB_SEL() ---------------------------------------------------------------------- This one was really, really simple... basically because LTEQ, LTHAN, EQUAL, GTHAN and GTEQ aren't supported on unindexed records. FIRST, NEXT, LAST and PREVIOUS are, however, so there's a separate routine in mb_misc.c called _find_queue(), which simply moves rel->pos and reads in the appropriate record. MB_UPD() ---------------------------------------------------------------------- This one was even more simple than the others; basically, if updating an indexed record, the routine is the same as always. If updating an unindexed record, we just overwrite the record data with the new data, and that's it.