Since I hate writing manuals, this one is very succinct. The program is pretty much self-explanatory. Right now, it is a PPC-only executable. Let me know if you are interested in a 68k version. Sulu requires System 7.
I have decided that this first version is going to be free, although it already represents many hours of programming. If you really like this tool and would like to see future directions explored, I am sure you know what to do.
Sulu is a mixture of a browser and an indexing engine designed to operate on composite text files, such as mailing list digests, newsgroup digests, CompuServe Navigator archive and session files, TidBits files... After having indexed those files, you can navigate in a hierarchical view of the messages and perform lightning-fast searches on author and title.
The first step is to index the files containing the messages you want to browse into a sulu document. To do so, select the 'Index...' command of the File menu and pick a file containing messages. If you want to index all the files in a given directory, you can select the bottom button in the file selection window. The Index command will put the newly found messages in the top window.
Double-clicking on an item in a message window displays its contents. If it contains a thread, you will get a thread window, much like the image above. If it contains messages, you will get a message window:
The message window contains four buttons of interest:
- the left and right arrows take you to the next message at the same level
- plus and minus respectively expand and collapse the thread selected in the left-hand part of the window. They are inactive if a message without descendants is selected.
Sulu is especially optimized for finding messages based on author and/or title. Search space is limited to the content of the topmost window at the time the Find command is selected. You can therefore search the entire document if the root window is on top, or only one of the sub threads if a thread window is on top. You can even do composite searches by issuing a Find command when a Find results window is showing. The result will be the messages which contain the first search string *and* the second search string. The title of the find window gives you an indication of what the search space is going to be.
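The way the search space narrows can be pictured with a small Python sketch. This is not Sulu's code: the `Message` record and the `find` helper are hypothetical names, and case-insensitive substring matching is an assumption. Each Find only looks at the messages of the topmost window, so a Find issued on a results window keeps the messages matching both strings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    # Hypothetical record: only the two fields used by the search.
    author: str
    title: str


def find(search_space, text):
    """Return the messages whose author or title contains `text`.

    Case-insensitive substring matching is an assumption of this sketch.
    """
    needle = text.lower()
    return [m for m in search_space
            if needle in m.author.lower() or needle in m.title.lower()]


# Contents of the root window: the whole document.
root_window = [
    Message("Alice", "PCCTS grammar question"),
    Message("Bob", "Re: PCCTS grammar question"),
    Message("Alice", "Digest formats"),
]

# First Find: the search space is the root window.
first_results = find(root_window, "pccts")

# Second Find, issued while the results window is on top:
# the search space is now the previous result.
second_results = find(first_results, "alice")
```

Here `second_results` only holds messages that match "pccts" and "alice", which is the composite search described above.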
It is also possible to search based on message body content, but since content is currently not indexed, don't expect miracles.
If the files that you want to index can be modified after indexing, or if you can add or remove files from a given directory, you can set up a Watch directory. Select the 'Watch directory' item from the File menu and pick a directory. From now on, every time the document is opened or whenever you select the 'Update' item of the File menu on a window containing a watch directory, this directory will be checked: old files will be removed, new files will be indexed and modified files will be re-indexed.
The indexing facility of Sulu has basic scripting capabilities. You can for example do something like:
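The three cases of a watch directory update (removed, new and modified files) can be sketched in Python as a comparison between two snapshots of the directory. This is only an illustration under assumptions: `snapshot` and `plan_update` are hypothetical names, and using the modification time to detect a changed file is a guess, not a description of how Sulu actually does it.

```python
import os


def snapshot(directory):
    """Map each regular file name in `directory` to its modification time."""
    with os.scandir(directory) as entries:
        return {e.name: e.stat().st_mtime for e in entries if e.is_file()}


def plan_update(previous, current):
    """Compare two snapshots and return (removed, added, modified) names.

    removed  : files to drop from the index
    added    : files to index for the first time
    modified : files to re-index (modification time changed; an assumption)
    """
    removed = sorted(previous.keys() - current.keys())
    added = sorted(current.keys() - previous.keys())
    modified = sorted(name for name in previous.keys() & current.keys()
                      if previous[name] != current[name])
    return removed, added, modified
```

A caller would keep the snapshot taken at the last update, take a new one when the document is opened or 'Update' is selected, and act on the three lists.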
-- Ask the user for a file to index
choose file with prompt "Choose File:"
set theFile to the result
-- Have Sulu index the chosen file into its first document
tell application "sulu"
    import theFile into the first document
end tell
When an import fails, the trace window, which can be displayed from the Window menu, gives some feedback on what the parser is actually doing with your files. I will definitely need this trace, along with the file you are trying to import, if you want to send me a bug report. 'Invalid Token' errors are not a problem for the result of the parsing. 'Syntax error' messages might prevent the parser from continuing correctly.
The current version of Sulu is a bit of a memory pig. If you are importing large files (i.e. more than 5 MB), make sure you give it plenty of memory.
Sulu is a bit more clever with MacNav (né CompuServe Navigator) session files than with other files. It can do two additional things when it recognizes a session file:
- first, it will attempt a second pass on messages looking like digests that are found inside the session. The end result is that if you get a CMSP digest through CompuServe mail, for example, the digest will be correctly interpreted
- second, it is able to do incremental updates on CompuServe session files: if the session file is in a watch directory and Sulu detects that the file has changed and that it has grown, it will attempt to resume indexing at the place where it stopped at the last update. The result is that if you put your current session file.
Here is a list of what has currently been tested:
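The incremental update on session files can be illustrated with a short Python sketch. It is not taken from Sulu: `read_unindexed` is a hypothetical helper, the remembered size and offset are assumed bookkeeping, and starting over from the beginning when the file has not grown is an assumption based on the general rule that modified files are re-indexed.

```python
import io


def read_unindexed(stream, last_size, last_offset):
    """Return (start, data): where indexing restarts and the bytes left to index.

    last_size   : size of the file at the previous update
    last_offset : position where indexing stopped at the previous update
    """
    current_size = stream.seek(0, io.SEEK_END)
    # Resume only if the file has grown; otherwise start over (assumption).
    start = last_offset if current_size > last_size else 0
    stream.seek(start)
    return start, stream.read()


# A session file that held one 10-byte message at the previous update
# and has since received a second message.
session = io.BytesIO(b"message 1\nmessage 2\n")
start, data = read_unindexed(session, last_size=10, last_offset=10)
```

Only the appended part (`data`) would have to go through the parser again, which is why an incremental update is cheaper than a full re-index.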
If a format is in this list it should be recognized. If it is not, it might be. Don't hesitate to send me samples of text files you would like to be indexed.
During the indexing process, the progress bar gives you an indication of how well parsing is progressing. The progress bar only changes when a new message is found. If it gets stuck and then progresses by a big hop, this is usually the sign that something is wrong.
I have no doubt that the parser can be fooled (and easily at that, but I won't tell you how), but given the nature of the files that can potentially be encountered, making a foolproof parser is next to impossible.
The parser, which is the heart of Sulu, was developed using PCCTS, the Purdue Compiler Construction Tool Set, by Terence Parr. This excellent alternative to Lex/Yacc can be found here.
Kudos go to Éric Paillé for beta testing.
Don't hesitate to send me a message. The latest version of this program will be living here.
v1.0 : First public release
Copyright © 1996 Patrice Gautier, all rights reserved