--------------------------
| N L H |
| N A T U R A L |
| L A N G U A G E |
| H Y P E R T E X T |
--------------------------
(Version 1.4 17-Jan-1992)
Natural Language HyperText
==========================
This program lets the user quickly get proper answers by writing specific
questions in common natural language; NLH then lets him navigate the answers
using the HyperText technique.
Fee and Licence
===============
This package is completely free of charge. No bureaucracy, no constraint and
no legal issue can prevent you from making any copy you want. I made it for
my pleasure, spending my RS&S days (Raining Saturdays & Sundays). I ask you
only one thing: to translate the following old Latin phrase by Cicero :
"Plerique amicos, tanquam pecudes, eos potissimum diligunt, ex quibus sperant
se maximum fructum esse capturos" (Cicero 106 B.C.)
Maurizio Ammannato
Two programs
============
This package consists of two main programs : NLC.EXE to create a new ADB
(Answer Data Base) and NLH.EXE to execute Natural Language HyperText. To use
NLH, refer to NLUSERI.DOC (Italian) or NLUSERE.DOC (English).
Keywords
========
Before describing the source file structure, let me make some considerations
on how to capture the 'meaningful words' of a generic phrase.
I want to build an answer for the question "What communications port have I
to use?". Well, 1) I know the question is about a problem, 2) we are speaking
about a communications package, 3) if the question is about a port, it is
because the user wants to use this port.
Looking at this example we can conclude that the one-word keyword "PORT" is
enough to identify the related answer.
If I have to answer the following question: "How can I change my
communications port parity?", we probably have to use "CHANGE PARITY" as a
two-word keyword to identify the proper answer.
If instead the question is: "How can I change my communications port data
bits?", we probably have to use "CHANGE DATA BITS" as a three-word keyword to
identify the proper answer.
Finally, if the question is: "How can I access PCMB ADDRESS BOOK Basket?", I
probably have to use the four-word keyword "PCMB ACCESS ADDRESS BASKET" to
identify the proper answer. Obviously I can have different keywords pointing
to the same answer, e.g. :
'-------------------
@#$
Access Address Basket (First Keyword)
This basket......................
.................................
.................................
..............................ok?
'-------------------
@#$
Address Book Basket (2nd Keyword)
@SEE:Access Address Basket
'-------------------
@#$
Access Book Basket (3rd Keyword)
@SEE:Access Address Basket
The above structure allows all three of the three-word keywords to point to
the same answer.
Master Keyword
==============
When the NLH user writes his question into the Bottom Input Window, he either
gets an answer or gets none (if NLH didn't find any keyword at all).
You can use the Master Keyword option to make sure the user gets an answer in
ANY CASE. That is the purpose of the Master Keyword.
******************************
ONLY ONE MASTER KEYWORD PER DB
******************************
To use this option, simply enclose the Master Keyword between two '@'
characters (decimal 64, without quotes).
In the PCMB.SRC source Answer Data Base I've chosen 'Troubleshooting' as the
Master Keyword and it looks like this :
@#$
@Troubleshooting@
........................................................
........................................................
@#$
Now when the user doesn't get any answer, the Troubleshooting text appears.
Take this opportunity to write the Master Text very clearly and in detail,
with a lot of Text Keywords (see below), so the user can navigate to the
answer closest to his question.
Apart from that, the Master Keyword is a keyword like any other (you can
access it directly from the input text, or you can enclose it between |....|
like a usual Text Keyword, etc).
Creating New ADB
================
***************************************************************
Source ADB file must be ASCII, any filename with .SRC extension
***************************************************************
Take your preferred Text Editor, or a Word Processor capable of exporting
documents as ASCII, listable, printable files (e.g. C:\NL\PE2 pippo.src).
An NLH Answer Data Base consists of two main parts. The first one describes
the words (from user input) that NLH has to replace before executing the
search.
For example, Quik-Comm is an ADB keyword. The user can correctly write
Quik-Comm, but also QuikComm or QuickComm or QC etc.
To facilitate the NLH search, you can instruct it to replace all the above
words with the correct word Quik-Comm in the following way :
'--------------------- beginning of source PIPPO.SRC file (ASCII)
@REPLACE-START@
QuikComm,Quik-Comm
QuickComm,Quik-Comm
Quik-Com,Quik-Comm
QC,Quik-Comm
file,files
driver,drivers
diskette,disk
communication,communications
respond,answer
....................
....................
@REPLACE-END@
Once NLH gets the question from the bottom window, it looks up the alias of
each word (if one exists) and replaces the word according to the above list.
*********************************************
FIRST word will be replaced by the SECOND one
*********************************************
For example if you write in the REPLACE section :
...................
QuickComm,Quik-Comm
...................
that means each time NLH receives the input word "QuickComm" from the user,
it replaces it with the word "Quik-Comm"; then NLH starts the search.
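The replacement step can be sketched in Python as follows. This is only an
illustration of the idea, not the NLH code itself (NLH is written in BASIC);
the function names parse_replace_section and apply_aliases are hypothetical,
and matching the input words regardless of letter case is an assumption.

```python
def parse_replace_section(lines):
    """Collect 'first,second' pairs between @REPLACE-START@ and @REPLACE-END@."""
    aliases = {}
    inside = False
    for raw in lines:
        line = raw.strip()
        if line == "@REPLACE-START@":
            inside = True
        elif line == "@REPLACE-END@":
            break
        elif inside and "," in line and not line.startswith("'"):
            first, second = line.split(",", 1)
            # Assumption: input words are matched regardless of letter case.
            aliases[first.strip().lower()] = second.strip()
    return aliases


def apply_aliases(question, aliases):
    """Replace each input word by the SECOND word of its pair, if one exists."""
    words = question.replace("?", " ").split()
    return [aliases.get(word.lower(), word) for word in words]


source = [
    "@REPLACE-START@",
    "QuickComm,Quik-Comm",
    "QC,Quik-Comm",
    "diskette,disk",
    "@REPLACE-END@",
]
aliases = parse_replace_section(source)
print(apply_aliases("How can I install QC from diskette?", aliases))
```

Words without an alias (here "How", "can", "I", "install", "from") pass
through unchanged; only "QC" and "diskette" are replaced.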
Once the alias list is declared, you can start creating the body of the
Answer Data Base.
Any answer has three parts:
******************************************************
1) ANSWER SEPARATOR, 2) ANSWER KEYWORD, 3) ANSWER BODY
******************************************************
ANSWER SEPARATOR
================
Put the @#$ character string on top of each new answer so that the program
understands where the old answer ends and where the new one starts.
ANSWER KEYWORD
==============
It's the keyword to be searched. It can be a one-word keyword (e.g. PORT), a
two-word keyword (e.g. PORT NUMBER), a three-word keyword (e.g.
COMMUNICATIONS PORT NUMBER) or finally a four-word keyword (e.g. MODIFY PCMB
ADDRESS BOOK).
***********************************************
AN ANSWER KEYWORD CAN HAVE A MAXIMUM OF 4 WORDS
***********************************************
ANSWER BODY
===========
It is the text of the answer, which can have up to 78 characters per line and
an unlimited number of lines. Just an example :
*********************************************
THE ORDER OF THE KEYWORD WORDS DOESN'T MATTER
("Comm Port" is the same as "Port Comm")
*********************************************
'-------------------
@#$ 1st) ANSWER SEPARATOR
Communications Port 2-WORDS KEYWORD
First answer row......................
...................................... ANSWER BODY
Last answer row.......................
'-------------------
@#$ 2nd) ANSWER SEPARATOR
Port 1-WORD KEYWORD
First answer row......................
...................................... ANSWER BODY
Last answer row.......................
***********************************************
PRINT PCMB.SRC FILE FOR A COMPLETE REAL EXAMPLE
(C:\PRINT PCMB.SRC)
************************************************
Special Keys
============
As said before, you can have several keywords pointing to the same answer.
You only have to use the "@SEE:" command followed by the pointed keyword. An
example :
'-------------------------------------- any ' means comment
@#$
Communications Port
First answer row....................
....................................
....................................
....................................
....................................
Last answer row.....................
'-----------------------------------
@#$
Communication Port
@SEE:Communications Port
'-----------------------------------
@#$
Telecommunication Port
@SEE:Communications Port
'-----------------------------------
@#$
Communication Board
@SEE:Communications Port
'-----------------------------------
@#$
Port
@SEE:Communications Port
'-----------------------------------
@#$
UART
@SEE:Communications Port
Whoever asks the question and writes "Communications Port" or "Communication
Port" or "Port" or "Ports" etc. always gets the proper answer, and you don't
have to duplicate a sometimes long answer for each of these similar
questions.
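How the "@SEE:" pointers lead back to one shared answer can be sketched in
Python like this. It is a minimal illustration, not the way NLC really stores
its data: the names parse_answers and resolve are hypothetical, keywords are
compared exactly as written, and only one level of "@SEE:" is followed (both
are assumptions).

```python
def parse_answers(lines):
    """Split source lines into {keyword: body lines}, one entry per @#$ block."""
    answers = {}
    keyword = None
    expect_keyword = False
    for raw in lines:
        line = raw.rstrip()
        if line.startswith("'"):        # a leading ' marks a comment line
            continue
        if line.strip() == "@#$":       # answer separator
            expect_keyword = True
            continue
        if expect_keyword:              # the line after the separator is the keyword
            keyword = line.strip()
            answers[keyword] = []
            expect_keyword = False
        elif keyword is not None:       # everything else belongs to the answer body
            answers[keyword].append(line)
    return answers


def resolve(keyword, answers):
    """Return the answer body, following a @SEE: pointer if the body is one."""
    body = answers[keyword]
    if body and body[0].startswith("@SEE:"):
        return answers[body[0][len("@SEE:"):].strip()]
    return body


source = [
    "'-----------------------------------",
    "@#$",
    "Communications Port",
    "First answer row",
    "Last answer row",
    "'-----------------------------------",
    "@#$",
    "UART",
    "@SEE:Communications Port",
]
answers = parse_answers(source)
print(resolve("UART", answers))
```

Both "UART" and "Communications Port" resolve to the same two answer rows, so
the long text is written only once.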
Text Keyword
============
Sometimes it is useful to let the user navigate through more than one window,
letting him go deeper into the explanation with more details. Let's imagine
your user is asking something about the communications port : "How can I
change data bits communications parameters?"
@#$
Communications Data Bits
You can change these communications parameters in the following way :
...................................................................
Other than data bits you probably have also to change parity. See
also |Changing Parity| keyword etc.................................
Writing a word between two '|' characters (decimal 124) marks this text word
as a keyword. When the user presses the ENTER key, NLH looks for this keyword
and displays a new window with a second answer. Obviously "Changing Parity"
MUST exist in the ADB. At this moment you cannot have more than 100 keywords
per screen.
Text Keyword Type
=================
There are three types of Text Keywords : text, program and image keywords.
1) Text Keyword
---------------
It's exactly the same as in the previous example. An example :
"Other than this explanation, see also : |Change Parity|" (and Change Parity
is another keyword with its related text and possibly other text keywords).
2) Program Keyword
------------------
******************************************************
PROGRAM KEYWORDS MUST HAVE .EXE .COM OR .BAT EXTENSION
******************************************************
If the keyword has one of the above extensions, NLH simply SHELLs (executes)
this program and, when it ends, comes back to exactly the same point. An
example :
"Now if you have understood well, execute the |TX.EXE| program and, after
pressing the F2 key twice, come back here for comments!"
3) Image Keyword
----------------
***************************************
IMAGE KEYWORDS MUST HAVE .SCR EXTENSION
***************************************
It can be useful (and PCMB.ADB uses this option extensively) to display a
screen (of the program we are speaking about) exactly as it appears in
reality. See below for the use of the NLW program. An example :
"Put highlight to SERVICES basket (|PCMB-R01.SCR| press ENTER now to see how
the PCMB screen must look).........."
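The three keyword types can be told apart by the extension alone, as in the
following Python sketch. The function name classify_text_keywords is
hypothetical, and treating extensions regardless of letter case is an
assumption; this only illustrates the rule, it is not the NLH code.

```python
import re


def classify_text_keywords(line):
    """Return (keyword, type) pairs for every |...| marker found in the line."""
    result = []
    for keyword in re.findall(r"\|([^|]+)\|", line):
        upper = keyword.upper()
        if upper.endswith((".EXE", ".COM", ".BAT")):
            kind = "program"   # NLH would execute it and come back
        elif upper.endswith(".SCR"):
            kind = "image"     # NLH would display the captured screen
        else:
            kind = "text"      # NLH would look the keyword up in the ADB
        result.append((keyword, kind))
    return result


line = "See |Change Parity|, run |TX.EXE| or look at |PCMB-R01.SCR|."
print(classify_text_keywords(line))
```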
NLW Program
===========
This TSR (Terminate and Stay Resident) utility allows you to capture video
screens to a file, which MUST have the .SCR extension. To capture, for
example, PCMB screens to be displayed using NLH, do as follows.
At the DOS prompt execute C:>NLW.
Then execute PCMB. When you want to save the current screen, press Alt-B (the
Alt and B keys together). NLW asks for a filename, then saves the file to
disk.
After quitting PCMB, copy all these .SCR files to the NL directory where the
programs and DBs are. Use "NLW /U" to uninstall this program from memory.
When you are writing your answer data base, put the |PCMB-01.SCR| keyword at
the right point; then, when the user selects this keyword, a standard PCMB
screen will appear. Press any key to come back to the answer text in NLH.
Updating Existing ADB
=====================
Simply edit the listable, ASCII .SRC source file. Make your corrections or
updates, then save it again.
****************************************************
REMEMBER TO EXECUTE NLC.EXE PROGRAM AFTER ANY UPDATE
****************************************************
Adding New Keywords
===================
In the same manner as described before, you can increase the knowledge of
your ADB by appending new answers to new client questions, without any limit
(other than the physical hardware ones).
************************************************************
REMEMBER TO EXECUTE NLC.EXE PROGRAM AFTER ADDING NEW ANSWERS
************************************************************
Compiling .SRC Source
=====================
Once you have defined all the possible answers in the "filename.SRC" Source
ADB (Answer Data Base) file
*****************************************************
SOURCE ADB MUST BE AN ASCII, LISTABLE, PRINTABLE FILE
*****************************************************
put this file in the same directory as the NLH programs (e.g. the NLH
directory), then execute NLC.EXE followed by a space and the source filename
to build the ADB file and its related indexes (for example C:\NLH\NLC
PIPPO.SRC).
If there are no errors, NLC creates all the needed files :
- PIPPO.ADB   the ADB with all the answers
- PIPPO.AX1   1-word index file
- PIPPO.AX2   2-word index file
- PIPPO.AX3   3-word index file
- PIPPO.AX4   4-word index file
- PIPPO.WL    list of the words used in the ADB (input words that don't
              match any word in this file are discarded to speed up the
              search process)
- PIPPO.RL    list of the words to be replaced
- PIPPO.MK    master keyword record
During compilation you can get two different errors :
1) A @SEE: command points to a non-existing keyword (the pointed keyword must
be defined BEFORE the @SEE: command)
2) A Text Keyword points to a non-existing keyword (the text is probably
misspelled)
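The two checks can be sketched in Python as follows, so you can see which
mistakes in a source file would trigger them. This is only a guess at the
logic, not the NLC code: the names normalize and check_source are
hypothetical, and it is an assumption that program and image keywords (.EXE,
.COM, .BAT, .SCR) are files rather than ADB entries, and that a Text Keyword
may point to a keyword defined anywhere in the file.

```python
import re

FILE_EXTENSIONS = (".EXE", ".COM", ".BAT", ".SCR")


def normalize(keyword):
    """Upper-case a keyword and sort its words, since word order doesn't matter."""
    return " ".join(sorted(keyword.upper().split()))


def check_source(lines):
    """Return error messages for @SEE: and |...| pointers to missing keywords."""
    defined = set()
    errors = []
    text_keywords = []          # (line number, keyword), checked at the end
    expect_keyword = False
    for number, raw in enumerate(lines, start=1):
        line = raw.strip()
        if line.startswith("'"):            # comment line
            continue
        if line == "@#$":                   # answer separator
            expect_keyword = True
        elif expect_keyword:                # keyword line (strip Master '@' marks)
            defined.add(normalize(line.strip("@")))
            expect_keyword = False
        elif line.startswith("@SEE:"):
            target = line[len("@SEE:"):].strip()
            if normalize(target) not in defined:   # must be defined BEFORE
                errors.append(f"line {number}: @SEE: points to '{target}'")
        else:
            for keyword in re.findall(r"\|([^|]+)\|", line):
                text_keywords.append((number, keyword))
    for number, keyword in text_keywords:
        if keyword.upper().endswith(FILE_EXTENSIONS):
            continue                        # assumption: files are not ADB entries
        if normalize(keyword) not in defined:
            errors.append(f"line {number}: text keyword '{keyword}' not found")
    return errors


source = [
    "@#$",
    "Port",
    "@SEE:Communications Port",
    "@#$",
    "Communications Port",
    "See also |Changing Parity| and |TX.EXE|.",
]
for message in check_source(source):
    print(message)
```

In this sample the "@SEE:" on line 3 is reported because "Communications
Port" is only defined afterwards, and "Changing Parity" on line 6 is reported
because no answer has that keyword.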
NLC.CFG
=======
Edit this ASCII file to change the language or the PRINTER ON/OFF option.
For Technical People
====================
All programs are written in QuickBASIC 4.5 using the QB4BAS and QuickPak
Assembler routines library for processing, QBVID 'C' routines for video
management and finally my own routines calling DOS INTERRUPT services for
windows.
The Answer Data Base file is a random-access text file with fixed-length
records, pointed to by sorted index files. This file structure allows a very
fast dichotomic (binary) search, which takes, in the worst case, 17 random
accesses to find one record among 100,000. That means that the ADB and its
related indexes can grow CONSIDERABLY without noticeably increasing the
answer response time.
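For the curious, the cost of a dichotomic search can be checked with a few
lines of Python. This is a generic binary search over a sorted list, written
only to count the accesses; it is not the NLH index code, and the integers
here merely stand in for sorted index records (an assumption).

```python
def binary_search(sorted_keys, target):
    """Return (index or -1, number of accesses) for target in sorted_keys."""
    low, high = 0, len(sorted_keys) - 1
    accesses = 0
    while low <= high:
        mid = (low + high) // 2
        accesses += 1                    # one random access to the index
        if sorted_keys[mid] == target:
            return mid, accesses
        if sorted_keys[mid] < target:
            low = mid + 1                # continue in the upper half
        else:
            high = mid - 1               # continue in the lower half
    return -1, accesses


keys = list(range(100_000))              # stand-in for 100,000 sorted records
worst = max(binary_search(keys, key)[1] for key in keys)
print(worst)
```

For 100,000 records this prints 17, and doubling the number of records adds
only one more access.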
Searching Process
-----------------
Once NLH gets a carriage return from the bottom input window, it splits the
phrase into words. Then it tries to replace each word with its alias (if
found in the .RL index file).
Then, still before starting the search, it compares each input word with the
.WL index and discards all words which don't match that list (words not used
in the Answer Data Base keywords).
Finally NLH starts searching 4 words at a time (the order doesn't matter,
thus the user can write "change data bit value" or "bit change data value" or
"value data bit change", up to 24 combinations); if it doesn't find this
keyword, it looks for 3 words at a time, then 2 words, and finally it
searches word by word.
An Example
----------
User writes :
How can I send and receive from Host my QC?
I Step: NLH replaces "QC" with "mail" (first list in PCMB.SRC)
II Step: NLH discards "can" "I" "and" "from" "my" because they are not found
in the PCMB.WL index
III Step: now we have the following phrase "HOW SEND RECEIVE HOST MAIL" and
NLH takes the first 4 words and searches for this 4-word keyword in the .AX4
index 24 times, once for each ordering of the 4 words :
A B C D   A C B D   A B D C   A C D B   A D B C   A D C B
B A C D   B C A D   B A D C   B C D A   B D A C   B D C A
C B A D   C A B D   C B D A   C A D B   C D B A   C D A B
D B C A   D C B A   D B A C   D C A B   D A B C   D A C B
Then, if not found, it takes another group of 4 words and tries again. If
still not found, it takes 3 words and their combinations, then 2 words, and
finally it searches word by word.
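The whole search can be sketched in Python as follows. It is a simplified
illustration, not the real NLH code: the name search is hypothetical, a
Python set stands in for the sorted .AX1 to .AX4 index files, the small
keyword list is made up for the example, and it is an assumption that "other
4 words" means every group of 4 words taken in input order.

```python
from itertools import combinations, permutations


def search(words, index, word_list, max_len=4):
    """Return the first keyword found, trying 4-word groups, then 3, 2 and 1."""
    # Discard the words used in no keyword (the role of the .WL file).
    kept = [word.upper() for word in words if word.upper() in word_list]
    for size in range(min(max_len, len(kept)), 0, -1):
        # Assumption: every group of `size` words is tried, in input order.
        for group in combinations(kept, size):
            # Try every ordering of the group (24 orderings for 4 words).
            for candidate in permutations(group):
                if candidate in index:
                    return " ".join(candidate)
    return None


# Hypothetical keywords, standing in for a compiled ADB.
keywords = ["SEND RECEIVE MAIL", "HOST", "CHANGE DATA BITS"]
index = {tuple(keyword.split()) for keyword in keywords}
word_list = {word for keyword in keywords for word in keyword.split()}

question = "How can I send and receive from Host my mail"
print(search(question.split(), index, word_list))
```

With this sample data no 4-word keyword exists, so the search falls back to
3 words and finds "SEND RECEIVE MAIL". When nothing at all is found the
function returns None, which is the case where NLH would show the Master
Keyword text.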
History
=======
1.4 . Printer option for the compiler (NLC) in the NLC.CFG file instead
of a screen prompt.
. Some cosmetic changes to the NLC program.
. Better string length control for the NLH program.
. Added NLH.CFI and NLC.CFI configuration files in Italian
(copy them to NLH.CFG and NLC.CFG respectively).
1.3 . Fixed a bug when reading an ADB without REPLACE KEYWORDS
section.
. Fixed a minor bug when reading comment lines in NLH.CFG file.
. Added Function Keys within Alt-keys (for some PCs not
fully MS-DOS compatible). Print NLH.CFG to see changes.
. No more Exit Window when the related message is a null string.
. Added message when saving text to external file NLH.OUT.
. Rewritten Window Routines to increase speed.
. Printer not ready message.
. Revised documentation.
. Other minor cosmetics.
1.2 . New utility and new management of Picture keywords (.SCR)
1.1 . Cosmetics.
1.0 . Initial version.