Searching Documents in Multiple Languages

The following test drive demonstrates a query form that queries in different languages and uses the stemming feature of Index Server. Stemming is explained later on this page.

Note   Do this test drive only if you have installed Index Server with all of its supported languages, or at least with German and U.S. English.

Querying in Another Language

Before taking this tour, you should change the scope in the sample file Query.idq to point only to the Corpus directory. This change will cause the Sample HTM/IDQ/HTX Search Form to search only the Corpus directory.

To change the scope
  1. Open the Query.idq file in your preferred text editor.
     When you installed Index Server, Query.idq was copied to the Inetpubs\Iissamples\Issamples directory.

  2. Search for the following line:

    CiScope=%CiScope%

  3. Change the value from %CiScope% to /corpus, and save and exit the file.
     The line should now look like this:

    CiScope=/corpus

  4. With the Sample HTM/IDQ/HTX Search Form in your browser, click the browser's Refresh button.
  5. Type Index Server in the Enter your query below box.
  6. Click the Go button.
     You should get results only from the Corpus directory. This step ensures the form has been set up correctly.

  7. Click the New Query link.
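
If you prefer to script the scope change rather than edit the file by hand, the following is a minimal sketch in Python. It operates on the text of an .idq file held in a string. The function name set_ciscope and the sample fragment are illustrative assumptions, not part of Index Server or of the shipped Query.idq.

```python
import re

def set_ciscope(idq_text, new_scope):
    """Return idq_text with the value on its CiScope= line replaced.

    Hypothetical helper for illustration; not an Index Server API.
    """
    # Match "CiScope=" at the start of a line, then the rest of that line.
    # [^\r\n]* stops before the line ending, so CRLF files stay intact.
    pattern = re.compile(r"^([ \t]*CiScope[ \t]*=)[^\r\n]*",
                         re.MULTILINE | re.IGNORECASE)
    updated, count = pattern.subn(lambda m: m.group(1) + new_scope, idq_text)
    if count == 0:
        raise ValueError("no CiScope line found")
    return updated

# Made-up fragment standing in for the contents of Query.idq.
sample = "[Query]\nCiScope=%CiScope%\nCiFlags=DEEP\n"
print(set_ciscope(sample, "/corpus"))
```

To apply this to the real file, you would read Query.idq into a string, pass it through the function, and write the result back, after making a backup copy.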

Before you can search for documents in another language, you have to change the MS.LOCALE setting in the Query.htm file. This change tells Index Server to apply the stemming rules for that language.

To change MS.LOCALE to German
  1. Open the Query.htm file in your preferred text editor.
     When you installed Index Server, Query.htm was copied to the Inetpubs\Iissamples\Issamples directory.

  2. Search for MS.LOCALE.
  3. Change the value from CONTENT=EN-US to CONTENT=DE.
     The line of code should now look like this:

    <META NAME="MS.LOCALE" CONTENT="DE">

  4. Save and exit Query.htm.
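
The same edit can be sketched in Python for readers who want to switch locales repeatedly. This is a minimal sketch that assumes the META tag is written with double-quoted attribute values, as in the line shown above. The function name set_ms_locale is a hypothetical helper, not part of Index Server.

```python
import re

def set_ms_locale(html, locale):
    """Return html with the CONTENT value of its MS.LOCALE META tag replaced.

    Hypothetical helper for illustration; assumes double-quoted attributes.
    """
    # Group 1 is everything up to and including the opening quote of CONTENT;
    # [^"]* is the old locale value; group 2 is the closing quote.
    pattern = re.compile(
        r'(<META\s+NAME="MS\.LOCALE"\s+CONTENT=")[^"]*(")',
        re.IGNORECASE,
    )
    updated, count = pattern.subn(
        lambda m: m.group(1) + locale + m.group(2), html
    )
    if count == 0:
        raise ValueError("no MS.LOCALE META tag found")
    return updated

# Made-up fragment standing in for the contents of Query.htm.
page = '<HEAD>\n<META NAME="MS.LOCALE" CONTENT="EN-US">\n</HEAD>'
print(set_ms_locale(page, "DE"))
```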

 

To search for all forms of the German verb gehen
  1. With the Sample HTM/IDQ/HTX Search Form in your browser, click the browser's Refresh button.
  2. Type gehen** in the Enter your query below box.
     Be sure to type the two asterisks (**) as shown.

  3. Click the Go button.

This query looks for all documents that contain any form of the German word gehen, which means "to go" in English. The two asterisks instruct Index Server to stem the word. Stemming is a linguistic process that takes a given word and reduces it to its root linguistic form. For example, the English stem for swam is swim. After stemming is performed, Index Server inflects the stemmed form into all the grammatically correct variants. For English, stemming swam would generate the root form swim and all the other variants, such as swimmer, swimmers, swam, swum, and so on.
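
The stem-then-inflect idea can be illustrated with a toy Python sketch. This is not Index Server's implementation: the lookup tables below are tiny and hand-written for illustration, and the names STEMS, VARIANTS, expand_query_term, and matches are all assumptions made for this example.

```python
import re

# Toy data, hand-written for illustration only. A real linguistic module
# covers the whole language; these tables cover two verbs.
STEMS = {
    "swam": "swim", "swum": "swim", "swims": "swim", "swim": "swim",
    "gegangen": "gehen", "ging": "gehen", "gehen": "gehen",
}
VARIANTS = {
    "swim": ["swim", "swims", "swam", "swum", "swimmer", "swimmers"],
    "gehen": ["gehen", "gehe", "gehst", "geht", "ging", "gegangen"],
}

def expand_query_term(term):
    """Expand a term ending in ** into its variants; leave other terms alone."""
    if not term.endswith("**"):
        return [term.lower()]
    word = term[:-2].lower()
    root = STEMS.get(word, word)        # step 1: reduce to the root form
    return VARIANTS.get(root, [root])   # step 2: inflect the root

def matches(document_text, term):
    """Return True if the document contains any variant of the query term."""
    words = set(re.findall(r"\w+", document_text.lower()))
    return any(variant in words for variant in expand_query_term(term))

print(expand_query_term("swam**"))
print(matches("Er ist nach Hause gegangen.", "gehen**"))  # True
print(matches("Er ist nach Hause gegangen.", "gehen"))    # False
```

The last two calls show the difference the asterisks make: the stemmed query matches a document that contains only gegangen, while the plain query does not.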

In this German query, Index Server will stem gehen, inflect it to all its forms, and post a query using the variants. Index Server knows to use German linguistics for stemming this word because you set MS.LOCALE to DE in Query.htm.

Executing the query may take some time because Index Server needs to load the German linguistics modules. Subsequent German queries will take much less time because the modules are already loaded.

Examining the Query Results

Index Server returns the file Ixgerman.doc. Note that this file does not contain the word gehen anywhere in the text. It does, however, contain the word gegangen, which is the past participle of gehen. Index Server stemmed gehen and inflected it out to its linguistic forms, in this case, including gegangen.

Note also that the numeric values and the time and date stamps in the references have been formatted to German conventions (that is, using a period instead of a comma for thousands separators, and so on).
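
As a small illustration of the separator convention, the following Python sketch formats a number in the German style by swapping the separators of the U.S.-style output. The function name format_german_number is a hypothetical helper for this example; it only mimics the convention and is unrelated to how Index Server formats its results.

```python
def format_german_number(value, decimals=0):
    """Format value with '.' for thousands and ',' for the decimal point."""
    # Start from U.S. style, for example "1,234,567.50".
    us_style = f"{value:,.{decimals}f}"
    # Swap the two separators in a single pass.
    return us_style.translate(str.maketrans({",": ".", ".": ","}))

print(format_german_number(1234567))    # 1.234.567
print(format_german_number(1234.5, 2))  # 1.234,50
```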

Index Server can be configured to use a default locale and language so that the language need not be specified by every query and query form. This form also allows the user to override any default locale and language settings for the purposes of the exercise. For more information, see Support for Multiple Languages.


© 1997 by Microsoft Corporation. All rights reserved.