14. Ultra-fast Searching with Indexes

FileFlex excels at index files. Letting FileFlex use indexes is like giving a prized racehorse the chance to just let it all out, throttling up a perfectly tuned race car, or punching a fighter jet into afterburner. When we say that FileFlex can locate any record in a sea of billions of records faster than the blink of an eye, we're talking FileFlex indexes.

Whenever possible, we recommend you use indexed-based searching. Indexes work by algorithm, rather than brute force. FileFlex looks at the string you're search for, does a mathematical calculation that basically tells it how far into the file to move, and boom, it's on the record you need. By contrast, both DBQuery and the full-text search DBFindMemo scan on a record by record basis. This means that if the data you're looking for is at the end of the file, DBQuery and DBFindMemo must individually check all the preceding records prior to finding the match. DBSeek (the interface to indexes) just does a calculation and whammo! You're on the record.

Indexes are not without their price, however. Since indexes can't check every record and rely instead on complex offset calculations, they don't support complex queries. But they're ideal for most queries you'll need. Need to find someone's address? Construct an index combining last name and first name, do a DBSeek, and--poof!--you're on the record. Need to find everyone in the Southeast Region who's booked over $1.25 million and who hasn't gotten a recent raise? Use DBQuery and be prepared to wait a while.

Indexes also take disk space, often quite a lot. Each index is it's own file and that file contains the complete data of the field it's indexing, as well as some overhead space used internally by FileFlex. So, if you've got this handy name and address database and you want to index on last name, followed by first name, you've got one index file. If you want to index based on zipcode, you've got another index file, and so forth. But what the heck. CD-ROMs are big, text is small, and new hard drives are cheap. Use indexes and rejoice in the raw speed!

Note: It is possible to use DBSeek and DBQuery incorrectly and get dog-poor performance. These tools provide you with the capabilities. But it's up to you to design something that works efficiently. Just because you've got a hot database engine is no excuse for poor application design.

Index Files Supported

FileFlex supports the use and updating of dBASE III-compatible index files only. Other index file architectures such as FoxPro indexes cannot be used. However, FileFlex can reindex a file using the dBASE index file structures. This makes it easy to use files which have been indexed using other methods.

When Indexes are Updated Automatically

Any time you have opened one or more index files related to an open database file and you make changes to that database file, FileFlex automatically updates those indexes to reflect the new file contents.

Opening and Using Index Files

Among the functions you may wish to perform on an index file are the following:

Opening an Index File

Use the FileFlex DBUseIndex function to open an index file for use. Supply the index file's name as an argument. Assign the result of this function to a variable (usually global) because you'll need to refer to the index file's ID in other scripts and handlers. Here's an example of the use of this function:
  put DBUseIndex("STARS") into StarIndex
Note that the name of the index file should include any extension the file might have. In the Macintosh environment, extensions are normally omitted, but in DOS environments, the file extension ".NDX" will generally be used for dBASE III-compatible index files.

We regularly get tech support calls from customers who get index file errors. The most common reason is that the database the index file indexes must be open and selected (DBUse and DBSelect) before executing a DBUse. One good practice is to do all your DBUse's and DBUseIndex's at the start of your application's life (for instance, in a startMovie handler). Use DBUse to open a data file and assign an ID to a global variable. Then call DBUseIndex repeatedly, opening all the index files for that data file, again assigning the IDs to globals. Repeat this until all your data and index files are open.

Note: Windows 3.1/DOS users may need to check the FILES= option in your CONFIG.SYS to make sure you're allowed to open this many files. If not, you'll need another file opening strategy, like only opening those files in use by a given module.

Checking an Index File for Currency

If you suspect that an index file may not be "in sync" with its database file, you can confirm its currency with the DBCheckIndex function. This function returns an error code of -8 if its contents do not match those of the database file. Otherwise, it returns the normal result code of 0.

This function would be most likely used in conjunction with the DBReindex function described below in a script line that looks like this:
  if DBCheckIndex(indexID) = -8 then 
    put DBReindex(indexID) into temp
  end if
The value of indexID would, of course, be the result of the previous call to the DBUseIndex function as described above.

Selecting an Open Index File

You must inform FileFlex which of (potentially) several open index files it should use to access the current file with the DBSelectIndex function. Supply the ID of the index file (returned by the DBUseIndex function) as an argument:
  put DBSelectIndex(StarIndex) into dbResult
Only one index file can be active at any time. It dictates the order in which record retrieval takes place. However, all open index files are updated each time a record is added or permanently.

If you're jumping between databases as well as indexes, don't forget to execute a DBSelect for the appropriate database file prior to calling DBSelectIndex on it's associated indexes.

Closing an Index File

Once you have opened an index file for use, you must also close it as part of cleaning up after your application. The FileFlex DBCloseIndex function closes a designated index file:
  put DBCloseIndex(StarIndex) into dbResult
To close an index file use the DBCloseIndex function along with its associated ID (returned by the DBUseIndex function) to close it.

Finding a Record by Index

One of the main reasons for using indexes is to enable the database to find a record by the specific content of a specific field, namely, the one on which the index is based. The other reason is to keep the file in a specific order, or at least to give the appearance of retrieving information in that specific order. Remember that FileFlex never physically reorders your data. Indexes are the FileFlex equivalent of sort...but an equivalent that lets you switch between an unlimited number of sort orders instantly.

To locate a record which matches a specific expression in the indexed field, using the currently open index file, use the FileFlex DBSeek function, supplying a parameter containing the expression to be matched in the indexed field:
  put DBSeek("Fred Jones") into foundRec
The seekExpr should contain the value for which you wish to search the indexed field. If this command fails to find the record (i.e., the search moves beyond the end of the file), the function returns a value of 3. You can then retrieve the contents of the located record.

Once you have found a record that matches the index key given as an argument, subsequent uses of the function will move the current record pointer to succeeding records in the file until a value of 3 is returned, indicating no more records match the criterion.

Building a Seek Expression

Seek expressions can be complex. One important thing to know is that the seek expression must match the exact number of characters as the data stored. So, in the example above with Fred Jones, DBSeek would have indicated an exact match only if the field being searched was defined to contain exactly ten characters and the contents was "Fred Jones".

But what if you're not looking for a name that exactly fits the field width? The short answer is you need to pad out the string to the right number of characters. Let's assume we're searching a name field defined as 15 characters wide. We'd need the following search expression to get an exact match ('~' indicates a space character):
  put DBSeek("Fred Jones~~~~~") into foundRec
New in 2.0: In previous versions of FileFlex, you had to hand-construct a function to pad the string to the appropriate number of spaces. But in FileFlex 2.0, we introduce a new utility function called DBBuildSeekExpr. This function requires as it's parameters the ID of the index to be searched, as well as one search expression for each field in the index. It returns a constructed search string you can then pass to DBSeek.

Let's assume, for example, that the index we're using has the ID of 1. We'd call the following to get a perfectly constructed seek expression:
  put DBBuildSeekExpr(1,"Fred Jones") into theExpr
  put DBSeek(theExpr) into foundRec
DBBuildSeekExpr is very smart. If the index has been defined to convert all data to upper case, the DBBuildSeekExpr will convert your search string to upper case. If you pass a number as a string, it will appropriate pad the spaces and decimal points to be sure that your search string is in the appropriate order. Further, DBBuildSeekExpr will construct complex queries with exactly the right spacing for use in complex searches.

Say, for example, that you created a multi-field index, searching last name, then first name. Using the following code would generate a perfectly balanced and sized search string:
  put DBBuildSeekExpr(1,"Jones", "Fred") into theExpr
  put DBSeek(theExpr) into foundRec

Getting Index File Information

New in 2.0: In previous versions of FileFlex, once you created an index, you better darn well name it clearly because there was no way to find out the original index expression. But now, you can get at the index expression in two ways: DBIndexExpr and DBListIndexFields.

DBIndexExpr

New in 2.0: The new DBIndexExpr returns the index expression that was used to create the index file. Make sure the index file in question has been opened, then pass the index ID as the sole parameter to DBIndexExpr. The function will return the actual string used to create the index:
  put DBIndexExpr(1) into theExpr

  -- theExpr contains the following:
  UPPER(NAME)+STR(SALARY,6,2)

DBListIndexFields

New in 2.0: The new DBListIndexFields takes two parameters, the index file ID and a delimiter and returns a list of fields indexed by the specified index, separated by the delimiter. Here's an example:
  put DBListIndexFields(1,RETURN) into theResult
  -- theResult contains
  NAME
  SALARY

Creating New Indexes and Updating Old Ones

FileFlex includes functions that enable you to create a new index on a database file and to re-index a file whose index may be out of date. Like with creating a database, FileFlex 2.0 includes the FileFlex Database Designer that takes much of the work out of creating index files. Simply launch the Database Designer (in Director 5, it's under the Tools menu) and choose Index. Drag and drop your way into an easily designed index.

Creating a New Index

If you have a database file open and you wish to use programming to index it on a field for which no index is presently available, you can create a new index file with the FileFlex DBCreateIndex function. This function requires four arguments, as explained below, and follows this format:
  put DBCreateIndex(fileName, indexExpr, unique, safety)
  into dbResult
The four arguments required are:

Re-Indexing a File

There are times that an index file and its associated data file can become unsynchronized. This might happen in a mixed-platform database environment, for example, where many people are using FileFlex to access a central database but a database administrator is using Access or FoxPro to maintain the files separately. You can determine if a particular index file is synchronized with the database in use by using the FileFlex DBCheckIndex function described earlier. If you find that it is not synchronized, you can use the FileFlex function DBReindex with the following syntax:
  put DBReindex(indexID) into dbResult
Note that this operation uses an existing and previously opened index file and replaces it with the newly generated index.

Note: From a performance perspective, it's often not wise to do a DBCheckIndex and then a DBReindex if the index is out of sync. DBCheckIndex will scan each record until it finds a mismatch, and this takes time. It's often easier just to run a DBReindex and go for a cup of coffee, go out to lunch, or go home for the night (depending on the size of the database).

Multi-Field Indexes

FileFlex will permit you to index a single file on multiple fields. To create such an index, simply list the names of all the fields you wish to use in the index field group as a single string, with a plus sign concatenating them. Here's an example:
  put DBCreateIndex("TERRSALE","TERRITORY+SALES",0,0) into dbResult
Fields should be listed from primary key to secondary key. Indexing will take place in major-minor order from left to right. In the above example, the file will end up being sorted by the field called TERRITORY and, within TERRITORY, by SALES.

There is no theoretical limit to the number of fields that can be combined into an index provided, of course, that the length of the argument to the DBCreateIndex command does not exceed the limitations of the environment in which you are running FileFlex, usually 256 characters.

Remember, once you create a multi-field index, you're going to want to search for data. The easiest way is to create a multi-field index seek expression using DBBuildSeekExpr as described above.

Note: Multi-field indexes only work on character fields. All fields must be character fields, so use the intrinsic functions to convert appropriately.

Intrinsic Functions in Indexes

To create an index using an intrinsic function, simply include it inside the string defining the index expression and call DBCreateIndex. Here's an example:
  put DBCreateIndex("TERRSALE","UPPER(SALES)",0,0) into dbResult
The above would index the field SALES, but would sort as though all the letters in the SALES field were converted to upper case.

DBCreateIndex supports the following intrinsic functions:
  DTOC
  DEL
  DELETED
  RECNO
  STR
  SUBSTR
  UPPER
And yes, you can actually do bizarre index expressions like:
  "UPPER(LAST)+UPPER(FIRST)+STR(AGE)"
Intrinsic functions are documented in detail in the Intrinsic Function Reference.

You Must Index on a String Data Type

Because FileFlex was originally designed for use with scripting environments, it expects all its parameters to be strings.

Even so, if you were to index a FileFlex file by a numeric field, the indexing process would work. Problem is, you'd never be able to seek out any data.

That's because FileFlex expects to receive string values when running DBSeek. Indexing on a numeric value would require FileFlex to somehow convert the string it always receives to a numeric value. It doesn't do this. So, if you want to index something in numeric or date order, use the intrinsic functions DTOC and STR.



  [Previous Chapter]    [Table of Contents]    [Next Chapter]


Copyright (c) 1996 David Gewirtz under license to Component Software Corp. All rights reserved worldwide.