
User Manual
Overview
WebWolf is a personal WebBot that scans Web sites on the net and compiles
a list of files, links, FTP sites and other resources found on the World Wide
Web. It works on the same concept as other bots on the net, such as
AltaVista and WebCrawler. However, it only targets sites with specific content,
and the results are returned in real time and are thus always current.
The specialty of WebWolf is locating files within the Internet's Web space.
These are usually missed by FTP search engines and conventional crawlers.
WebWolf has a built-in ability to learn and map out the network each time
you use it, and it will become better at locating files with each use.
Using a 28.8K dial-up line, WebWolf will eventually be able to locate and
catalogue tens of thousands of resources per hour.
Unregistered Version Restrictions
If you have not yet registered, you will have cause to despair, for you will
not be able to utilise the full potential of this product. There are three
restrictions in the unregistered version. First, you will not be able to use
more than ten concurrent connections, which means that WebWolf will run
slower. Second, you lose the benefit of previous sessions, as you will not
have restart capability. Third, the search will stop after locating
2000 files.
If you have not done so already, please register your copy. You will find it
most worthwhile.
Configuring WebWolf
Before running the program, you need to configure WebWolf.
Select OPTIONS followed by PREFERENCES from the main menu. The WebWolf
Preferences window should come up. If you are using a proxy server, make sure
that the proxy settings and the port are correct.
Specifying a URL
You may specify the URL of a site where each hunt for resources is to begin.
The site should be related to the type of information you are seeking, and
ideally have links to other sites with similar content. If you do not
specify a URL, WebWolf may consult an Internet search engine to obtain
a starting point.
Selecting Keywords
Selecting the right combination of keywords means everything!
WebWolf will only extract information from a Web page if the page
contains one of the specified keywords. You can specify as many keywords or
phrases as you want, separated by commas (,). Keywords are case insensitive.
If your keywords are too specific, WebWolf may soon run out of links
to crawl, but if they are too general, too much information on unrelated
topics may be retrieved.
For example, if you are seeking information, files and resources about databases,
a good choice of keywords would be:
oracle,sybase,database,sql
The two product names 'oracle' and 'sybase' are very specific, whereas 'database'
and 'sql' are more general, and will help WebWolf locate pages that may be
linked indirectly via sites that cover this topic, but not individual products.
After you use WebWolf for a while, you should begin to get a feel for which
combinations of keywords yield the best results.
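The keyword rule described above can be pictured with a short Python sketch.
It is only an illustration: the function names are hypothetical, and it assumes
plain case-insensitive substring matching on any one keyword, which may differ
from how WebWolf matches pages internally.

```python
def parse_keywords(spec: str) -> list[str]:
    """Split a comma-separated keyword string into lower-cased keywords or phrases."""
    return [part.strip().lower() for part in spec.split(",") if part.strip()]


def page_matches(page_text: str, keywords: list[str]) -> bool:
    """Return True if the page contains at least one keyword, ignoring case."""
    text = page_text.lower()
    return any(keyword in text for keyword in keywords)


keywords = parse_keywords("oracle,sybase,database,sql")
print(page_matches("Tuning tips for Oracle servers", keywords))  # True
print(page_matches("Holiday photos from the coast", keywords))   # False
```

Under this assumption, a very general keyword such as 'computer' would make
page_matches return True for a large number of unrelated pages.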
Note: The most specific keyword should always be first, and should not be
a phrase. You should also avoid using general keywords that may be common
on unrelated pages. For example, by also adding the keyword 'computer':
oracle,sybase,database,sql,computer
...you broaden the search, and any page containing this word will be crawled.
Initially you will get the right results, but after a few levels, the link
with the databases topic will get lost altogether in the bulk of material
that will be located.
You may also use advanced query and Boolean syntax. For more information and
examples, select Query Syntax from the Help menu.
Once you have entered your keywords, click on [Start] to begin the hunt.
Selecting Search Parameters
When you start a search, a Search Parameters window will be displayed
with a number of search options. WebWolf's search strategy is determined
by the settings of these options.
Search All Library links
If enabled, WebWolf will connect to every link in the current library and
search and index its content.
Update the Library with new Finds
If enabled, whenever a page is encountered that contains
appropriate file and/or FTP links, it will be added to the library for
future reference.
Use Search Engines to locate New Threads
If enabled, WebWolf will consult Internet search engines to locate
starting links, which will serve as entry points to new web rings for
exploration.
Recurse into and Search all new links
If enabled, WebWolf will connect to any link present on a page that
is related to the current search topic. If disabled, only the initial
page from the specified URL or the library will be searched, and all
other links found will not be crawled. For example, by disabling both this
option and the use of search engines above, you can force WebWolf
to scan library links only and stop.
Limit search to specified Site
If you have specified a starting URL on the main page, only pages and
files on that site will be searched and indexed. All other links will
be ignored.
FTP Links
If enabled, links to FTP sites will be indexed.
File Links
If enabled, links to files will be indexed.
Unix Files
If enabled, Unix files (e.g. .Z, .tar, .gz) will be indexed and included
in search results.
MAC Files
If enabled, Mac files (e.g. .hqx) will be indexed and included in search
results.
Text Documents
If enabled, text files and documents will be indexed and included in search
results.
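To make the effect of the "Recurse into and Search all new links" and "Limit
search to specified Site" options more concrete, here is a minimal Python
sketch. The function and parameter names are hypothetical, the check that a
link is related to the search topic is left out, and "same site" is assumed to
mean "same host name"; WebWolf's own rules may differ.

```python
from urllib.parse import urljoin, urlparse


def links_to_crawl(page_url, hrefs, start_url=None, recurse=True, limit_to_site=False):
    """Return the absolute links that would be queued for crawling from one page."""
    if not recurse:
        # Recursion disabled: only the initial page is searched.
        return []
    # Resolve relative links against the page they were found on.
    links = [urljoin(page_url, href) for href in hrefs]
    if limit_to_site and start_url:
        # Assumption: a link is on the same site if its host name matches.
        site = urlparse(start_url).netloc.lower()
        links = [link for link in links if urlparse(link).netloc.lower() == site]
    return links


found = ["tools.html", "http://other.example.org/files/", "/pub/"]
print(links_to_crawl("http://example.com/db/index.html", found,
                     start_url="http://example.com/", limit_to_site=True))
```

With the site limit enabled, the link to other.example.org is dropped and only
the two links on example.com remain.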
Restarting a Session
All WebWolf sessions are automatically restartable. Any previous results are
retained and included in the current search. To restart a session, click the
topmost button on the right. This button may read either [Restart] or
[Start]. If you are using an unregistered version, Restart is disabled and
you will need to start a new session.
Note: It is a good idea to start a new project, or clear out the library,
if you are starting a search for a topic unrelated to the previous one.
Starting a new session
To start a new session, select CLEAR from the main menu. Any previous results
will be deleted (you will be prompted for confirmation). You can then click the
[START] button to begin with a clean slate.
Note: The library links will not be deleted.
Resetting a session
Selecting RESET from the main menu will reset the status of all known
links. Bad links will be deleted, and all others re-scanned.
Unless you do a reset (or clear), links that have already been visited
are never searched again.
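The behaviour of RESET can be modelled with a small Python sketch. The class,
its method names and the three status values are assumptions made for
illustration only; the manual does not describe how WebWolf stores link status.

```python
class LinkTable:
    """Hypothetical record of every known link and its crawl status."""

    def __init__(self):
        self.status = {}  # url -> "new", "visited" or "bad"

    def add(self, url):
        # A link that is already known keeps its current status.
        self.status.setdefault(url, "new")

    def mark(self, url, status):
        self.status[url] = status

    def pending(self):
        """Links that would still be searched in the current session."""
        return [url for url, state in self.status.items() if state == "new"]

    def reset(self):
        """Drop bad links and make every other link eligible for scanning again."""
        self.status = {url: "new" for url, state in self.status.items() if state != "bad"}
```

In this model a visited link never reappears in pending() until reset() is
called, which mirrors the rule that visited links are not searched again.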
Displaying Results
The buttons on the left-hand side display the number of files, links and
other items of interest that have been found to date. Clicking on one
of these buttons will launch a browser to display the results. The
[View] button can also be used. This brings up a menu of result files.
Please keep in mind that the current version of WebWolf does not attempt to
validate file and FTP links, so these are only as up to date
as the source where they have been posted.
The FIND button
Once your file lists get very large, it will become difficult to locate
specific files of interest. In this case you can use the FIND button to search
for specific keywords in the Search Results.
Click the [FIND] button, and when the Find window comes up, enter a keyword or
a comma-delimited (,) list of keywords to search for. A list of all files
matching the search criteria will be displayed.
Setting a Watch
The watch screen can be used to display any new search results based on a
given keyword. For example if you enter 'sql', WebWolf will
tell you whenever a new entry with this keyword is added to the database.
The [Browse] button can then be used to display all files that have
'sql' as part of the filename, title or URL.
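As an illustration of the kind of matching a watch implies, the Python sketch
below checks a keyword against the filename, title and URL of an entry. The
Entry record and the function name are hypothetical, and case-insensitive
substring matching is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    """Hypothetical search result record."""
    filename: str
    title: str
    url: str


def matches_watch(entry: Entry, keyword: str) -> bool:
    """Return True if the keyword occurs in the filename, title or URL."""
    needle = keyword.lower()
    fields = (entry.filename, entry.title, entry.url)
    return any(needle in field.lower() for field in fields)


entry = Entry("tools.zip", "SQL utilities", "http://example.com/pub/tools.zip")
print(matches_watch(entry, "sql"))  # True, because the title contains "SQL"
```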
Using a Library
Unless disabled, WebWolf will use and maintain a session
history in a library file. This file contains information from previous
sessions and includes a list of all sites where file and/or FTP links were
found. Neither the RESET nor the CLEAR menu option will delete library entries.
Please note that each project has its own library.
If library use is enabled, library links will always be scanned first
(or second, if a starting URL is given).
For this reason, if you wish to explore a new area of the web, or search
for a different topic, you should clear out the existing library, or start
a new search project.
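The scan order described above can be sketched in a few lines of Python. The
function and parameter names are hypothetical; the sketch only shows the
ordering, not how WebWolf actually schedules its connections.

```python
def scan_order(start_url, library_links, use_library=True):
    """Return the initial scan queue: the starting URL first, then library links."""
    queue = []
    if start_url:
        queue.append(start_url)
    if use_library:
        # Skip library links that are already queued (e.g. the starting URL).
        for link in library_links:
            if link not in queue:
                queue.append(link)
    return queue


library = ["http://example.com/files/", "http://example.org/pub/"]
print(scan_order("http://example.net/start.html", library))
print(scan_order(None, library))
```

Because old library links are always queued ahead of newly discovered ones, a
library built for one topic will steer a search on an unrelated topic toward
the old sites.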
If you frequently search for different types of information, unless you
maintain a separate project for each, it may be a good idea to disable library
use altogether.
Note: Any modifications made to the current library may not be saved while
a search is in progress.
WebWolf URL Library

The main URL Library screen is accessed by clicking the Library button.
If the Library Update checkbox is enabled, an entry will be added
automatically whenever a new site that contains files and/or FTP sites is
located. This screen is a summary of all library links
in the current project and can also be used to delete, view and visit the
displayed sites.
Whenever you clear or reset the current project, the library links are retained
and, when enabled, will always be scanned first if a starting URL is not
specified.
To edit or add a new link, click on the [?] next to each entry or use the
[Editor] button.
Library Link Editor Screen

The link editor can be used to add new links, or to view,
modify or delete existing listings in the project library.
The priority or position of each link can be modified by using the left
and right arrow buttons [<]   [>].
Note: The library entries may not be modified while a search is in progress.
Important Considerations
We recommend that you start a new session periodically, and clear out
old results, as you may start experiencing memory problems once WebWolf's
links database gets very large.