# *************************************************************
# Templeton, copyright 1995-1998 N.A. Krawetz
# All rights reserved.
# *************************************************************
# configuration for Templeton
#
# Lines beginning with a '#' are comments and are ignored.
# Lines should not be more than 80 characters.
# Operands in this file are in the form:
# parameter value
# The parameter is case insensitive, except where a text string or URL
# is required.
# Boolean values ("true" or "false") are case insensitive.
# Numeric values should be numbers -- non-numbers are regarded as 0.
# All other types of values ARE case sensitive.
# ******************** Registration ****************************
# Register: registration code
# Software that is registered contains a unique registration
# code. This code should be entered exactly as it is provided.
# If your site contains multiple registrations, you may list
# each registration code on a line starting with the
# key word "Register".
# Please read the licensing agreement for registration
# information.
# Register 12-34567-891011
# ******************* File System *****************************
# LocalPath: absolute path
# LocalPath informs the program where to store the downloaded files.
# IF this path is:
# LocalPath none
# THEN no files are generated. Only a log file containing the remote
# server's WWW map is created in the current directory.
#
# Currently, files should be stored in the root directory of the file system.
# For WWW servers, this is the server's root directory.
# (This limitation will be removed in future releases.)
# For DOS based machines, this path may include a drive letter:
# LocalPath e:\server.www\
#
# Either slash "/" or backslash "\" is valid for specifying a directory.
# The trailing slash or backslash is optional.
#
# This option is only used when the "Interactive" option is FALSE.
LocalPath /
# FATFormat: boolean
# Determines the filename format for the current operating system.
# DOS based machines using drives formatted with a File Allocation Table (FAT)
# can only handle filenames containing 8 characters and a 3 character
# extension. Setting this option to TRUE will generate 8.3 character file
# names. The default is FALSE, and will generate unlimited length filenames.
# NOTE: Under DOS, this option is always TRUE (DOS only supports FAT file
# names). Under OS/2, this value becomes TRUE automatically if the destination
# path (LocalPath) is located on a FAT partition.
FATFormat FALSE
# User: e-mail address
# In case of emergency, this is the person who is running the program
# and who should be contacted to stop the program from running.
# This MUST be a valid e-mail address, and SHOULD also be available with
# a "talk" command.
# As a side note, it is never a good idea to let automatic software run
# unsupervised (especially this type of software). The "User" should be
# available to read their e-mail at all times during the execution of this
# program.
# The default is the account running the program on the current machine.
# User webmaster@host.machine.org
# ********************* Network *****************************
# DNSLookup: boolean
# A single machine may be referred to by many hostnames. For example, "www",
# "www.crumple.com", and "paper.crumple.com" are all the same webserver.
# When DNSLookup is TRUE, Templeton will correctly identify that these
# different hostnames are the same machine. When DNSLookup is FALSE, each of
# these hostnames is treated as a different host. The DNS (domain name service)
# may take time to resolve a hostname (up to 2 minutes) so setting DNSLookup
# to FALSE can dramatically increase Templeton's speed, especially when
# processing HTML documents with many links to other machines.
# The default setting is TRUE.
# DNSLookup TRUE
# ProxyHost: hostname or IP address
# Proxy agents are machines that act as a gateway through a firewall.
# If your local network uses a proxy agent, specify the name of
# the proxy agent here. If you are uncertain about your network, consult your
# network manager or provider.
# A proxy server is only used when a proxy host is specified.
# ProxyHost proxyhost.network.net
# ProxyPort: integer
# When using a proxy server (see ProxyHost), the port on the proxy server
# should be specified. The default port is 80. This value is not
# used if no proxy host is specified with ProxyHost.
ProxyPort 80
# Spoof: text-string
# Some WWW servers make incorrect assumptions about browsers and robots. (Most
# of these are Netscape servers.) These servers assume that, since the
# browser is not "Netscape", it cannot handle the HTML documents, and
# therefore the document is not transferred. By "spoofing" a different name,
# the WWW robot can use a qualified browser name to retrieve the HTML
# document.
# NOTE: The first word of the spoof-name is used for restrictions when
# robot exclusion is honored (see Exclusion). This means, if Templeton tells
# the WWW server that it is "Netscape" and the server does not permit
# Netscape browsers, then the server will also not permit Templeton.
# Common spoof names (and browsers) are:
# Mozilla (Netscape Browser)
# WebCrawler (WebCrawler robot)
# InfoSeek (InfoSeek robot)
# WebExplorer (IBM WebExplorer for OS/2)
# Harvest (a web robot)
# Mosaic (NCSA Mosaic)
# Lynx (Lynx, text browser)
# Microsoft Internet Explorer
# PRODIGY-WB (Prodigy browser)
# Spoof Mozilla (Templeton)
# ********************* Restrictions *****************************
# RestrictHost: boolean
# This parameter informs the program not to leave the designated host. Links
# to other machines are not traversed.
RestrictHost TRUE
# RestrictPath: absolute path
# This parameter is only used when a host is restricted.
# When a host is restricted, a subpath on that host may also be restricted.
# Hypertext references to documents outside this subtree are not traversed.
# Either slash "/" or backslash "\" is valid for specifying a directory.
# The trailing slash or backslash is optional.
RestrictPath /
# RestrictDepth: numeric value
# Hyperlinks are traversed in a breadth-first search. An unrestricted search
# may download an entire WWW server's data. By restricting the depth,
# only immediate portions of the server will be received.
# Images and non-href links are considered to be at the same depth as the
# document.
# A restricted depth of 0 means no restriction.
# The default is 1
RestrictDepth 1
# RestrictImages: boolean
# Most HTML documents contain both text and graphics. Frequently, these
# graphics come from links to other computers. When restricting to a specific
# host, these images would not be retrieved (not on the host). Setting
# RestrictImages to FALSE allows inline graphics and image maps to be located
# on a remote host, but will not affect restrictions to hyperlinks.
# Setting the value to TRUE will apply all restrictions to all files
# (images, text, etc.).
# This option is available for people who want to mirror entire web documents,
# not just sites. The default value is FALSE, indicating that entire
# documents *should* be retrieved.
# Note: images and image maps restricted by the Deny configuration option
# or by robot exclusion are not retrieved.
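# Example (a sketch, not a shipped default): to mirror one subtree while
# still fetching inline images stored on other hosts, the restriction
# options could be combined as shown. The "/docs/" path and the depth of 3
# are hypothetical values chosen for illustration.
# RestrictHost TRUE
# RestrictPath /docs/
# RestrictDepth 3
# RestrictImages FALSE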
# RemoveRestricted: boolean
# This parameter informs the program to remove untraversed links. Links to
# restricted machines or restricted depths are removed from the HTML file,
# but the visible text is still available (just not a hyperlink).
# The default value is FALSE.
RemoveRestricted FALSE
# RestrictDuration: HH:MM
# Templeton can run for hours or days. You can specify a runtime duration
# by entering the maximum number of hours (HH) and minutes (MM).
# The number of hours does not need to be restricted to a 24-hour period.
# Entering 0:0 disables this option. The default value is 0:0.
# RestrictDuration 2:30
# RestrictStopTime: HH:MM
# Templeton can run for hours or days. You can set a specific stop time
# by entering the hour (HH) and minute (MM). Times are provided in military
# notation, where 1PM is 13:00, etc. This option only works over a 24-hour
# period. Midnight is 24:00, but 1 minute after midnight is 00:01. Invalid
# times, such as 28:00, are ignored. Specifying 0:0 (the default value)
# disables this option.
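# Example (illustrative value): to stop at 6:30 in the evening, the time in
# military notation would be written as:
# RestrictStopTime 18:30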
# Add: URL
# Place a specific URL on the list of URLs to process.
# Be aware that restrictions apply.
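# Example (hypothetical URL): a starting point could be queued as shown.
# With RestrictHost set to TRUE, a URL on another host may not be traversed.
# Add http://www.crumple.com/products/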
# Exclusion: boolean
# This parameter determines whether Templeton will support server provided
# robot exclusion files (robots.txt). Many servers maintain exclusion files
# to prevent robots from wandering around virtual directory trees, from
# retrieving very temporary or incomplete files, or copyrighted materials. It
# is considered "polite" for web agents to obey the exclusion files when they
# exist. The default value, TRUE, means that robot exclusion files are obeyed.
# Setting Exclusion to FALSE will ignore robot exclusion files.
Exclusion TRUE
# Deny: URL
# The URL provided, as well as all subtrees of the URL, are not processed.
# Many times specific directory subtrees are not desirable. You can deny
# retrieval of these URLs using this setting.
# For example, to NOT retrieve the "archive" subtree of the host loco.com,
# you would specify:
# Deny http://loco.com/archive/
# If you do not include the trailing slash (http://loco.com/archive) then
# all subdirectories beginning with "archive" are not processed. This
# includes "archive.1", "archive.old", "archive_from_1994", etc.
# Deny statements may also include a '*' as a wildcard character. This
# symbol represents 0 or more characters for matching. If, for example,
# you do not wish to retrieve GIF files, you would use:
# Deny *.gif
# Multiple Deny statements may be specified.
# Allow: URL
# Similar to "Deny", "Allow" explicitly specifies that a subtree is
# retrievable. When used in conjunction with Deny URL, branches of a
# subtree may be specified for access, while other subtrees are ignored.
# Multiple Allow statements may be specified.
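# Example (a sketch): to skip the "archive" subtree of loco.com except for
# one branch, Deny and Allow might be combined as shown. The "current"
# branch name is hypothetical, and this assumes the more specific Allow
# overrides the broader Deny, which the description implies but does not
# state outright.
# Deny http://loco.com/archive/
# Allow http://loco.com/archive/current/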
# Authorize: "realm" base64-code
# This complex command allows you to specify a username and password
# for basic WWW-authentication. The realm is a quoted string.
# The base64-code contains the encoded username and password. Use
# the pwd64.exe program to generate your base64-code.
# The realm is a case-sensitive string provided by the WWW server. If you
# do not know the realm for the pages you wish to retrieve, use Templeton
# to interactively retrieve the page. Templeton will display the realm
# name and ask for your username and password.
# Be aware that realms are not unique. If different documents use the
# same realm but require different passwords, Templeton will require
# you to enter the username and password.
# To skip a realm, use the username "-" and password "-", or the
# base64-code: LTot
# Authorize "Secret Password" ZHIubmVhbDpyZWdpc3RlciBtZQ==
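# Example (a sketch): under HTTP basic authentication the code is the base64
# encoding of "username:password", so the skip code LTot is the encoding of
# "-:-". A realm could be skipped with a line such as the following (the
# realm name "Members Only" is hypothetical):
# Authorize "Members Only" LTot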
# Proxy-Authorize: "realm" base64-code
# Similar to "Authorize", this complex command allows you to specify a
# realm and password for a secure HTTP proxy server.
# Proxy-Authorize "Secret Password" ZHIubmVhbDpyZWdpc3RlciBtZQ==
# Sleep: numeric
# Sleep determines the number of seconds to pause before sending a request to
# a WWW server. SLEEP IS IMPORTANT.
# Warning: Templeton can generate thousands of requests per minute. Many
# WWW servers cannot handle a sudden onslaught of requests. Setting the
# Sleep parameter to 0 (zero) may generate too many requests for the server
# and kill the server. This is bad.
# A sleep setting of 0 (zero) is known to kill the following types of servers:
# All WWW servers that run under Microsoft Windows (TM)
# Old generation (HTTP/1.0) CERN servers on all platforms
# Low sleep values may also generate large amounts of network traffic and
# hog network resources.
# For safety, you should set the sleep interval to at least 5 seconds.
# The longer, the better. Remember, this program is automated and can
# easily run for hours. What's the rush?
Sleep 10
# ********************* Preferences *****************************
# FileOverwrite: boolean or "modified"
# Files that already exist on the local system are normally overwritten.
# Setting the FileOverwrite option to FALSE will not overwrite files on the
# local file system. Setting the FileOverwrite option to "Modified"
# (no quotes) will only retrieve documents (non-HTML) that have been changed
# since the last retrieval. The modified option is useful when retrieving
# the same URL multiple times; modified will not waste time retrieving GIF
# and JPG files that have already been retrieved.
# FileOverwrite does NOT affect HTML documents -- HTML documents are always
# retrieved. Templeton can only determine links by retrieving HTML documents.
# Skipping an HTML document would mean skipping possible links.
# Default value is MODIFIED, only retrieving newer non-HTML files.
FileOverwrite modified
# Index: filename
# For hypertext references that only specify a directory, this is the
# default html file in the directory.
# NOTE: if FATFormat is TRUE, the 8.3 name translation will be applied to
# this filename.
# The default name is "index.html"
Index index.html
# ISMAP: absolute path to executable
# For WWW servers, many imagemaps use a program that takes coordinates from
# a selected image <IMG SRC=... ISMAP> and returns a new URL. Some of the
# more common methods use a data file containing known coordinates and a
# program to identify which URL is activated. Commonly, this program is
# called "imagemap" or "imagemap.exe".
# The ISMAP parameter specifies the WWW server's path to the imagemap program.
ISMAP /cgi-bin/imagemap
# MapType: NCSA or CERN
# For the executable specified in the ISMAP parameter (see above), this
# option determines the format of the file. If the image map file can be
# retrieved, then it is converted into this specified format.
# Valid options are either "CERN" or "NCSA". The default is NCSA.
MapType NCSA
# ********************* Logging *****************************
# Mailto-File: filename
# Similar to "Server-File" logging, the filename listed on the "Mailto-File"
# line contains a list of e-mail addresses found in the HTML documents. Only
# e-mail addresses that are active (hyperlinks) are used. E-mail addresses
# displayed as plain text in the document or contained in CGI scripts are not
# listed in the mailto logfile.
# NOTE: This list MAY contain duplicate entries. Duplication removal may be
# added in later versions.
# (Some people have found this to be a very useful feature for generating
# mailing lists.)
# Setting the filename to "none" disables logging.
# The default is no mailto logging.
# Mailto-File mailtolist
# RemoteMapping: boolean
# Determines whether remote mapping will be done. The default is TRUE,
# which does perform mapping. The map filename is mapindex.html and is
# either located at the root of the LocalPath or in the current directory
# if the system is not mirroring files.
# Note: if you change the default index name, for example, to "welcome.html"
# then the default map file will be "mapwelcome.html".
RemoteMapping TRUE
# Server-File: filename
# A data file is generated containing the host name, IP address, and
# WWW server type for each server visited. For servers listed as IP
# address only, the host name is also the IP address.
# Setting the filename to "none" disables logging.
# The default is no server logging.
# Server-File serverlist
# Update-File: filename
# The update file list is useful for downloading only files which have been
# modified. Although the option "FileOverwrite modified" will update
# newer images, it does not work with HTML documents. The Update-File
# option is useful for refreshing HTML documents as well as images.
# To use the saved update-file, include the file name on the command line.
# Setting the filename to "none" disables the update file.
# The default is "none".
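# Example (the file name "updatelist" is hypothetical, following the naming
# used in the Server-File and Mailto-File examples):
# Update-File updatelist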
# ********************* Advanced *****************************
# The advanced configuration commands should be used with caution.
# These commands allow other applications to perform tasks on the
# retrieved documents. Applications that are spawned (operate
# concurrently) with Templeton may overwhelm the user or operating system.
# Spawned applications include those begun with "start" under OS/2,
# or followed by "&" under Unix.
# NOTE: Templeton has the capability to spawn thousands of applications
# in a few seconds.
# On Unix-type systems, Templeton introduces security risks when executed
# as root.
# For applications that are not spawned, Templeton will pause until
# the application has ended. This allows for a guaranteed order of processing
# for the called applications.
# Command_html: string
# Command_image: string
# Command_map: string
# Command_default: string
# Execute a system command on each document stored on the file system.
# The different command types are for HTML documents, images, map files,
# or the default command when any of the other commands are not set.
# These are useful for counting documents, storing statistics, printing,
# converting, etc.
# The string "none" turns off these commands. The default is "none".
# The command string will replace special characters with desired information:
# characters:  becomes:
# %d           depth
# %h           host (server)
# %p           remote parent URL (first URL containing a link to this URL)
# %P           local parent file (first file containing a link to this URL)
# %l           local file
# %n           current time in GMT (see %t)
# %N           current time in local time (see %T)
# %r           remote file (URL without server)
# %s           saved file (same as %l)
# %t           file timestamp (RFC 822 format) in GMT
# %t{rfc822}   file timestamp in RFC 822 format
# %t{rfc850}   file timestamp in RFC 850 format
# %t{ansi-c}   file timestamp in ANSI C format
# %t{iso8601}  file timestamp in ISO 8601 format
# %t{iso8601c} file timestamp in ISO 8601 compressed format
# %T           similar to %t, but times provided in local time
# %u           url
# %%           %
# The special characters ARE case sensitive.
# NOTE: Command_image and Command_default do not distinguish between
# different file formats.
# Example: to convert all HTML documents to text using the program
# html2txt (not provided with the Templeton distribution), you would use:
# Command_html html2txt %s
# Command_url: string
# Similar to Command_html, this command line string is executed for *every*
# URL found. This includes other protocols such as "ftp://", "gopher://"
# and "mailto:". No effort is made toward uniqueness; the same URL may be
# seen hundreds of times.
# Because this command is processed each and every time a URL is found, it may
# significantly slow the runtime performance of Templeton.
# The string "none" turns off this command. The default is "none".
# This command replaces the same characters as Command_html, except for
# %l and %s; the local filename is unavailable.
# The time formats, %t and %T, show the time the URL was found by Templeton,
# *not* the timestamp of the file.
# The execution of the Command_url string does not affect the execution of
# the Command_html, Command_image, Command_map, or Command_default strings.
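# Example (a sketch): to append every URL found, along with its depth and
# host, to a text file. This assumes the command string is passed to a shell
# that provides "echo" and the ">>" redirection; the file name urls.txt is
# hypothetical.
# Command_url echo %d %h %u >> urls.txt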
# Interactive: boolean
# Determines whether the user should be prompted for
# configuration information or if Templeton should
# start running automatically.
# The default setting is TRUE.
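# Example: for an unattended run, prompting can be turned off. As noted
# under LocalPath, that option is only used when Interactive is FALSE.
# Interactive FALSE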