- Simple Offline USENET Packet Format (SOUP) Version 1.2
-
- Copyright (c) 1992-1993 Rhys Weatherley
-
- rhys@cs.uq.oz.au
-
- Last Update: 14 August 1993
-
- DISTRIBUTION
-
- Permission to use, copy, and distribute this material for any purpose
- and without fee is hereby granted, provided that the above copyright notice
- and this permission notice appear in all copies, and that the name of Rhys
- Weatherley not be used in advertising or publicity pertaining to this material
- without specific, prior written permission. RHYS WEATHERLEY MAKES NO
- REPRESENTATIONS ABOUT THE ACCURACY OR SUITABILITY OF THIS MATERIAL FOR ANY
- PURPOSE. IT IS PROVIDED "AS IS", WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES.
-
- NOTE: This document is NOT in the public domain. It is copyrighted.
- However, the free distribution of this document is unlimited.
-
- If you create a product which uses this packet format, it is suggested
- that you include an UNMODIFIED copy of this document to inform your users
- as to the packet format. All queries about this format, or requests for
- the latest version should be directed to Rhys Weatherley at the above
- e-mail address.
-
- INTRODUCTION
-
- For many years, the FidoNet community has been using QWK and other formats to
- enable users to download their mail and conferences to be read while off-line.
- This not only saves phone charges and prevents tying up BBS lines for long
- periods of time; it also allows a user to use much more powerful tools on
- their own machine to process the downloaded "packets" than what can be made
- available in an on-line environment.
-
- To date, however, very little work has been done in the USENET and dial-in Unix
- community to facilitate the same user operations. Some attempts have been
- made to use QWK, but due to QWK's limitations and unsuitability for the USENET
- message formats, such efforts have not been very successful.
-
- Within USENET, the tendency seems to be either "dial-in to some other machine
- and put up with it", or "set up your own USENET site". The former keeps the
- user at the mercy of whatever user interfaces the admin of the other machine
- sees fit to install, and the latter requires far more computing knowledge than
- the average computer user is expected to have. Both of these can serve to
- lock out large portions of the computer-literate public from experiencing
- USENET. The latter option can also give rise to security problems in the form
- of forged USENET messages, which a more controlled dial-in system avoids.
-
- The purpose of this document is to define a new packet format which is aware
- of the conventions used in the USENET community, forming a middle ground
- between dial-in user interfaces and full USENET connectivity. It is not
- limited to downloading USENET news however. The same format could be used
- to enable a Unix user to package up their Unix mailbox and download it for
- later perusal. The format is extensible to other kinds of news or conference
- systems, so it is feasible, although not yet defined, that QWK or FidoNet
- messages could be accommodated within the same packet as USENET messages.
-
- REVISION HISTORY
-
- 1.2 Add COMMANDS and ERRORS files. Renamed to "Simple Offline USENET
- Packet Format". A few extra fields and type codes for the AREAS and
- LIST files. Message area summaries.
-
- 1.1 Add description of the LIST file. Everything else is identical to 1.0.
-
- 1.0 Original version of the document.
-
- Previously, this document was known as the "Helldiver Packet Format" (HDPF).
- A variant of HDPF, called the "Simple Local News Packet format" (SLNP) was
- created by Philippe Goujard (ppg@oasis.icl.co.uk). This document combines
- the features of both previous formats and the name was changed to make it
- less product-oriented.
-
- TERMINOLOGY
-
- Packet: a set of files, collected into a compressed archive.
-
- Message packet: the primary kind of packet which contains messages for
- the user to read.
-
- Reply packet: a special kind of packet which contains replies composed by
- the user, usually in response to the messages in a message packet.
-
- Packet generator: a program which generates packets to be downloaded and
- read, and which processes uploaded reply packets.
-
- Packet reader: a program which reads packets, usually by presenting the
- messages in a packet to the user, and which generates reply packets.
-
- Packet processor: either a packet generator or a packet reader.
-
- Generating host: the computer on which the packet generator executes.
-
- Reading host: the computer on which the packet reader executes.
-
- Download: the transfer of a packet from the generating host to the reading
- host. This transfer may take place in any fashion, although the
- most common method is through the use of a file transfer protocol
- such as Zmodem or Kermit.
-
- Upload: the transfer of a packet from the reading host to the generating host.
-
- Packet stream: a logical link between the generating and reading hosts over
- which downloads and uploads of packets take place.
-
- Message area: a collection of messages which are related by a common topic
- or purpose. Examples of message areas include USENET newsgroups,
- Unix mailboxes, and FidoNet conferences.
-
- Reply message area: a special kind of message area which contains replies
- being uploaded to a generating host.
-
- Text file: an ASCII file consisting of lines terminated by linefeed characters
- (LF, 10 decimal). Some operating systems terminate lines in a text
- file by CRLF pairs: such files must be converted to LF-terminated
- lines for transmission in a packet.
-
- ANATOMY OF A PACKET
-
- A packet is a group of files, collected into a compressed archive. The
- standard compression technique defined by this document is ZIP. Other
- techniques such as ARJ, ZOO, ARC, LZH, etc. can also be used. It is also
- possible for Unix's tar.Z format to be used to transmit packets. The minimum
- requirement is a method to collect a group of files into a single packet,
- and a method to expand the packet back into the original files. ZIP is
- specified to provide a common compression format for packet processors.
- Each of the filenames in a packet should be stored in upper case on those
- systems where case matters (e.g. Unix).
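-
- As a rough illustration of the paragraph above, the following Python sketch
- collects a set of files into a ZIP archive and expands it again. The
- function names build_packet and read_packet are hypothetical and are not
- part of this specification; the only behaviour taken from the text is the
- use of ZIP and the upper-casing of stored filenames.
-
```python
import io
import zipfile


def build_packet(files: dict[str, bytes]) -> bytes:
    """Collect files into a ZIP archive, storing every name in upper case."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for name, data in files.items():
            # The specification asks for upper-case names inside the packet.
            archive.writestr(name.upper(), data)
    return buffer.getvalue()


def read_packet(packet: bytes) -> dict[str, bytes]:
    """Expand a ZIP packet back into a name -> contents mapping."""
    with zipfile.ZipFile(io.BytesIO(packet)) as archive:
        return {name: archive.read(name) for name in archive.namelist()}
```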
-
- The following file specifications may appear in a packet:
-
- INFO Optional textual information.
- LIST List of message areas on the generating host.
- AREAS Index of the message areas within the packet.
- REPLIES Index of the reply message areas from the reading host.
- *.MSG Text of the messages in a particular message area.
- *.IDX Index information for messages in a message area.
- COMMANDS Extra commands sent along with a packet.
- ERRORS Errors that occurred during the execution of commands.
-
- Other filenames may also appear in the packet, but are not defined by this
- specification, so they should be avoided by generating software, and ignored
- by receiving software.
-
- The INFO file is an optional text file which may contain any kind of textual
- information from the generating system. Typically this file would only be
- present if there is some kind of urgent message that must be sent to the
- receiving user. Use of this file to store the name of the generating host
- and other such static information is possible, but discouraged to save space
- and transmission time. If such information is required, then the COMMANDS
- file can be used to transfer it.
-
- The LIST file is an optional text file which contains a list of all message
- areas that are available on the generating host, together with the format of
- the messages. It is specified further in the section "LIST FILE".
-
- The AREAS file is a text file which contains an index of the message areas
- present within the packet, specifying the name of the message area, the
- filename the messages may be found in, and the message format. This is
- specified further in the next section.
-
- The REPLIES file is a text file which contains an index of the message areas
- present within the packet that contain replies from the user which should
- be mailed or posted on the generating host. In most cases, a packet will
- contain either an AREAS file or a REPLIES file, but both may be present.
- See the section "REPLIES FILE" below for more information.
-
- The *.MSG files contain the text of the messages from a single message area.
- The actual format of this file depends on the type of message area specified
- in the AREAS file. See the section "MESSAGE FILES" below for more information.
-
- The *.IDX files provide an index into the *.MSG files, usually specifying
- where each message starts and the contents of some of the common message
- header fields. These files are intended for use by reading software on the
- recipient's system to quickly display an overview of the messages present in
- a message area. See the section "INDEX FILES" below for more information.
-
- The COMMANDS file is a text file which contains commands to be executed on
- the reading or generating hosts to change the behaviour of the hosts at
- each end of a packet stream. The ERRORS file contains textual error messages
- to report to a human at the host the packet is destined for. These two files
- are explained further in the section "SENDING COMMANDS BETWEEN SYSTEMS" below.
-
- AREAS FILE
-
- The AREAS file is a text file containing zero or more lines, each of which
- specifies a single message area, its encoding and the name of the message/index
- file pair in which the messages appear. In particular, each line has the
- following form:
-
- prefix<TAB>area name<TAB>encoding[<TAB>description[<TAB>number]]
-
- where "prefix" specifies the name of the message/index file pair, "area name"
- is the name of the message area, "encoding" specifies the formats of the
- message and index files and the type of message area, "description" is a
- descriptive name for the message area, and "number" is the number of messages
- in the message file. The last two fields are optional. Additional fields may
- be added in a future version of this specification.
-
- The message and index files corresponding to the message area have the names
- "prefix.MSG" and "prefix.IDX" respectively. If "prefix" contains alphabetic
- characters, they must be upper case.
-
- The message area name may be any sequence of printable ASCII characters (space
- through tilde). Under USENET, this is typically a dotted name like
- "comp.lang.c". Other networks may include spaces or other unusual characters
- in the area names, so the receiving software must be aware of this fact,
- and act accordingly. Also, receiving software must deal gracefully with
- characters that have the high bit set, or names that contain control
- characters, since people in other countries that speak a language other than
- English may wish to use their country's native encoding for the message area
- name. The only hard rule is that the name may not contain TAB, CR or LF.
- Receiving software should treat the name as an indivisible string to be
- displayed to the user.
-
- The encoding field consists of two or three ASCII characters (usually
- alphabetic). The first specifies the format of the message file, the second
- specifies the format of the index file, and the optional third specifies the
- kind of area (private or public). The following message file formats are
- currently defined (case is significant):
-
- u USENET news articles
- m Unix mailbox articles
- M Mailbox articles in the MMDF format
- b Binary 8-bit clean mail format
- B Binary 8-bit clean news format
- i Index file only
-
- The individual message file encodings are explained further in the next
- section. The format 'i' indicates that no message file is present, and
- the index file should be used as a summary of the messages in the message
- area. This is explained further in the section "MESSAGE AREA SUMMARIES".
- The following index file formats are currently defined (again, case is
- significant):
-
- n No index file
- c C-news overview database format
- C Shorter C-news overview database format
- i Offset/length pairs delineating the messages
-
- These types are explained further in the section "INDEX FILES" below.
-
- See the section "MINIMAL CONFORMANCE" for information on the minimal number
- of message and index formats that should be supported by packet generators
- and packet readers.
-
- The following kinds of message areas are currently defined (again, case is
- significant):
-
- m The message area contains private mail
- n The message area contains public messages, or news
- u The message area kind is unknown (the default)
-
- This third letter is optional. If it is not present or unknown, the kind
- of area depends on the message file type. Message types 'm', 'M', and 'b'
- default to kind 'm', and message types 'u', 'B' and 'i' default to kind 'n'.
- It is not recommended that the value 'u' for this third letter be used,
- although future versions of this specification may add additional letters,
- necessitating 'u' to be placed in the third letter if the kind is unknown.
- If the message area kind can be solely determined from the message file
- type, it is recommended that the third letter be omitted to save space and
- transmission time.
-
- Further types may be defined in future versions of this specification. If
- the packet processor does not recognise a message file type, it should ignore
- the corresponding message and index files. If the packet processor does
- not recognise an index file type, it can either ignore the message file, or
- attempt to break down the message file into separate messages by some other
- means. If the packet processor does not recognise a message area kind,
- the kind should be treated as unknown. The user should be warned if a message
- area has been ignored.
-
- The optional message area description in the AREAS file consists of any
- sequence of printable ASCII characters. This may be used to insert a
- "readable" name for the message area. It may not contain TAB, CR or LF.
-
- A message area may appear more than once in the AREAS file, each time with a
- different prefix, but this is discouraged. This could be used to split large
- message areas across more than one message file, but this is more conveniently
- handled by generating a separate packet containing the area continuation.
-
- The following examples demonstrate the capabilities of the AREAS file:
-
- 0000000<TAB>Email<TAB>mn
- 0000001<TAB>comp.lang.c<TAB>uc<TAB>C Programming Language Discussions<TAB>125
- 0000002<TAB>news.future<TAB>Bc<TAB>Future of USENET<TAB>38
-
- EMAIL<TAB>/usr/spool/mail/fred<TAB>unm<TAB>Private e-mail for fred
- U000001<TAB>comp.bbs.misc<TAB>MCn
- U000002<TAB>comp.bbs.waffle<TAB>ui
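-
- The AREAS line format described in this section could be read along the
- following lines. This is only a sketch in Python: the Area class and the
- parse_areas_line function are hypothetical names, and treating a
- non-numeric "number" field as absent is an assumption rather than a rule
- of this specification. The default area kinds follow the table given above.
-
```python
from dataclasses import dataclass
from typing import Optional

# Area kind implied by each message file format when the third encoding
# letter is missing, 'u', or not recognised.
DEFAULT_KIND = {"m": "m", "M": "m", "b": "m", "u": "n", "B": "n", "i": "n"}


@dataclass
class Area:
    prefix: str
    name: str
    msg_format: str
    idx_format: str
    kind: str
    description: str = ""
    count: Optional[int] = None


def parse_areas_line(line: str) -> Area:
    """Split one TAB-separated AREAS line into its fields."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 3 or len(fields[2]) < 2:
        raise ValueError(f"malformed AREAS line: {line!r}")
    prefix, name, encoding = fields[0], fields[1], fields[2]
    kind = encoding[2] if len(encoding) > 2 else "u"
    if kind not in ("m", "n"):
        # Unknown or absent kind: derive it from the message file format.
        kind = DEFAULT_KIND.get(encoding[0], "u")
    description = fields[3] if len(fields) > 3 else ""
    # Assumption: a missing or non-numeric count is reported as None.
    count = int(fields[4]) if len(fields) > 4 and fields[4].isdigit() else None
    return Area(prefix, name, encoding[0], encoding[1], kind, description, count)
```
-
- The message and index files for a parsed area would then be named
- area.prefix + ".MSG" and area.prefix + ".IDX". Any fields beyond the
- fifth are ignored, since later versions of the format may add more.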
-
- MESSAGE FILES
-
- The format of the message file depends on the message file format specified in
- the AREAS file. This version of the specification defines three formats,
- which are in common use in the USENET and Unix community, and two additional
- binary formats which permit messages to be stored with no modification or
- assumptions about line lengths and byte values.
-
- For each of these formats, lines are terminated with LF characters. Any CR
- characters in the messages should be considered as data characters, or ignored
- on receipt. In particular, MS-DOS systems should strip CR characters from
- text messages before writing them to a packet.
-
- A 'u' (USENET) message file is a text file consisting of one or more messages
- prefixed with an rnews header. This header has the form "#! rnews n" where
- "n" is the number of bytes in the message that follows the header, excluding
- the line-feed character which terminates the header. If the number in the
- header is followed by white space and other characters, these other characters
- should be ignored, until the terminating LF character is encountered.
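-
- A minimal Python sketch of reading a 'u' message file under the rules
- above. The function name split_rnews is hypothetical. The sketch assumes
- the whole file fits in memory and raises an error on a malformed header;
- how a real packet reader should recover from damage is not specified here.
-
```python
def split_rnews(data: bytes) -> list[bytes]:
    """Split a 'u' message file into messages using its rnews headers."""
    marker = b"#! rnews "
    messages = []
    pos = 0
    while pos < len(data):
        end_of_header = data.index(b"\n", pos)  # ValueError if unterminated
        header = data[pos:end_of_header]
        if not header.startswith(marker):
            raise ValueError(f"expected rnews header at offset {pos}")
        # Only the first token after "rnews" is the byte count; anything
        # that follows it on the header line is ignored.
        length = int(header[len(marker):].split()[0])
        start = end_of_header + 1
        messages.append(data[start:start + length])
        pos = start + length
    return messages
```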
-
- A note about the rnews header: although a terser separator could be used, the
- rnews header has the following advantages: (a) the messages can be extracted
- in the absence of index files, or where the index files have an unknown type,
- and (b) the message files can be imported into a USENET system as standard
- rnews batches. Thus, if the user wishes to set up a real USENET site, or
- simply use dedicated USENET software to read packets, they can use their
- existing packet provider as a convenient read-only newsfeed, with no extra
- burden placed on the system administrator of the generating system.
-
- A 'm' (Unix mailbox) message file is a text file consisting of one or more
- messages. The first line of each message must start with the character
- sequence "From ". Any remaining lines in the message which start with
- "From " should have the character '>' prepended. Thus the "From " lines
- delimit the message file into separate messages.
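-
- The "From " convention can be sketched in Python as follows. The names
- lf_lines, escape_body and split_mailbox are hypothetical. The sketch
- treats only LF as a line terminator, as this document requires, and it
- assumes that text before the first "From " line (if any) belongs to the
- first message.
-
```python
def lf_lines(text: str) -> list[str]:
    """Split text into lines, keeping each LF; only LF ends a line."""
    parts = text.split("\n")
    lines = [part + "\n" for part in parts[:-1]]
    if parts[-1]:
        lines.append(parts[-1])
    return lines


def escape_body(body: str) -> str:
    """Prepend '>' to body lines that would look like a message separator."""
    return "".join(
        ">" + line if line.startswith("From ") else line
        for line in lf_lines(body)
    )


def split_mailbox(text: str) -> list[str]:
    """Split an 'm' message file at each line that starts with 'From '."""
    messages: list[list[str]] = []
    for line in lf_lines(text):
        if line.startswith("From ") or not messages:
            messages.append([])
        messages[-1].append(line)
    return ["".join(message) for message in messages]
```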
-
- A 'M' (MMDF mailbox) message file is a sequence of one or more messages,
- separated by at least 4 Control-A characters. The message file may optionally
- start and end with a sequence of such characters. If a sequence of 4 or more
- Control-A characters occurs in a message, it should be "adjusted" by the
- insertion of spaces to split the sequence. The use of Control-A characters
- within a message is discouraged.
-
- The 'm' and 'M' formats were chosen for mail because of their common
- occurrence in the Unix community. The generating system may elect to instead
- convert a mailbox into the USENET format if it wishes, and set the area kind
- to 'm' to inform the packet reader that the message area contains private
- e-mail rather than news.
-
- The 'b' (binary mail) and 'B' (binary news) formats are identical. The
- contents of each message must conform to RFC-822/1036 and may contain content
- information compatible with RFC-1341 (MIME). The only difference between
- the messages of these formats and the preceding formats is that no assumption
- is made about line lengths, and any of the 256 values for a byte may be used
- in any position. Each message is preceded by a 4-byte value which indicates
- the length of the message in bytes, stored in big-endian order (i.e. high
- byte first, low byte last). The difference between 'b' and 'B' is a semantic
- one: message files of type 'b' are expected to contain mail messages, and
- message files of type 'B' are expected to contain news messages. Thus, reader
- software can make a distinction between the two if it desires.
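-
- The length-prefixed layout of the 'b' and 'B' formats might be handled as
- in the Python sketch below. The function names are hypothetical, and the
- sketch assumes the 4-byte length is unsigned, which this document does not
- state explicitly.
-
```python
import struct


def pack_binary(messages: list[bytes]) -> bytes:
    """Write each message preceded by its length as 4 big-endian bytes."""
    return b"".join(struct.pack(">I", len(m)) + m for m in messages)


def unpack_binary(data: bytes) -> list[bytes]:
    """Read back the messages of a 'b' or 'B' message file."""
    messages = []
    pos = 0
    while pos < len(data):
        (length,) = struct.unpack_from(">I", data, pos)
        pos += 4
        if pos + length > len(data):
            raise ValueError("message file is truncated")
        messages.append(data[pos:pos + length])
        pos += length
    return messages
```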
-
- For most practical purposes, 'u', 'm' and 'M' should be sufficient. The binary
- 'b' and 'B' types should be used for articles that contain 8-bit binary data.
- It is possible to use type 'u' for binary data as well, but 'm' and 'M'
- cannot be because the message contents may be modified. When MIME becomes
- more wide-spread, it is expected that binary messages containing programs,
- sound, pictures and video will become popular, necessitating these binary
- types.
-
- Note that MIME messages can be stored in 'u', 'm' and 'M' message files, but
- any binary components should be encoded with quoted-printable or base64 (which
- is expected to be the most common usage of MIME in the near future). It is
- not required that 'b' or 'B' be used for MIME messages: only those containing
- raw unencoded binary data (as indicated by the Content-transfer-encoding
- header value "binary").
-
- INDEX FILES
-
- This specification defines four index file types, which provide varying
- degrees of support for packet readers.
-
- Type 'n' indicates that no index file is present, and it is up to the packet
- reader to extract messages from the message file. It is useful where the
- generating system is providing a USENET newsfeed using packets, and the
- receiving system is not interested in the index information.
-
- A type 'c' index file is a text file (LF terminated lines), with one line per
- message that occurs in the message file. The lines in the index file should
- be in the same order as the corresponding messages. Each line has the
- following form:
-
- offset<TAB>subject<TAB>author<TAB>date<TAB>mesgid<TAB>
- refs<TAB>bytes<TAB>lines[<TAB>selector]
-
- [Note: the line-wrapping here is for document-formatting purposes only. No
- line-wrapping occurs in the index files]. The fields have the following
- semantics:
-
- offset Seek position in the message file of where the corresponding
- message starts. The first seek position is 0. For the 'u'
- format, this indicates the start of the line following the
- rnews header line. For the 'm' format, this indicates the
- start of the "From " line and for the 'M' format, this
- indicates the start of the article after the Control-A
- sequence. For the 'b' and 'B' formats, this indicates the
- first byte of the message after the 4-byte message length.
-
- subject The "Subject:" line from the message.
-
- author The "From:" line from the message.
-
- date The "Date:" line from the message.
-
- mesgid The "Message-Id:" line from the message.
-
- refs The "References:" line from the message.
-
- bytes The number of bytes in the message. If this field is zero,
- then it indicates that there is no corresponding message
- in the message file. This is used for summaries: see the
- section "MESSAGE AREA SUMMARIES" for more details.
-
- lines The "Lines:" line from the message. Note that this field
- is pretty useless these days on USENET, but is still popular.
- It is meant to indicate the number of lines in the body of
- the message. Generating software may elect to re-generate
- this value if it is not present in the original message,
- but this is not required.
-
- selector A string used for summaries to request that a message be
- sent in a future packet. See the section "MESSAGE AREA
- SUMMARIES" for more details. This string will usually be
- a number, but other values such as Message-ID's could be
- used. Packet readers should treat this string as an
- indivisible string to be sent in a "sendme" command in the
- COMMANDS file. A zero-length string indicates that there
- is no selector string.
-
- If any of these fields contained TABs, newlines or other white space in the
- original articles, they should be converted into single spaces. All fields
- must be present, but some may be empty. The "bytes" field must not be empty,
- since it provides necessary information for packet readers. Each field must
- conform to the Internet RFC documents RFC-822 or RFC-1036.
-
- Optionally, a header line may end with one or more extra TAB-separated fields
- for other RFC-compliant header fields, together with the header field names.
- e.g. "Supersedes: <1234@foovax>". These fields are not defined by this
- version of the specification, and are by arrangement between the generating
- host and the reading host only.
-
- This format is compatible with the news overview (NOV) database format of
- C-news. The only differences are the substitution of an offset for the
- article number used by C-news, and the addition of the "selector" field.
- The C-news format was designed to assist threading newsreaders, so this packet
- format should provide similar assistance to threading packet readers.
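-
- A type 'c' index line could be decoded roughly as in this Python sketch.
- IndexEntry and parse_c_index_line are hypothetical names. The sketch
- assumes that the ninth field, when present, is the selector and that any
- further fields are the privately arranged extras described above. The
- "lines" field is kept as text because it may be empty.
-
```python
from typing import NamedTuple


class IndexEntry(NamedTuple):
    offset: int
    subject: str
    author: str
    date: str
    mesgid: str
    refs: str
    nbytes: int
    lines: str
    selector: str
    extra: tuple


def parse_c_index_line(line: str) -> IndexEntry:
    """Decode one TAB-separated line of a type 'c' index file."""
    f = line.rstrip("\n").split("\t")
    if len(f) < 8:
        raise ValueError(f"too few fields in index line: {line!r}")
    selector = f[8] if len(f) > 8 else ""  # empty string means no selector
    return IndexEntry(int(f[0]), f[1], f[2], f[3], f[4], f[5],
                      int(f[6]), f[7], selector, tuple(f[9:]))
```
-
- An entry whose nbytes value is zero has no message in the message file
- and only summarises a message that can be requested later.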
-
- The 'C' format is similar to 'c', except that the "mesgid" and "refs" fields
- are dropped. These fields can commonly be quite long and are mainly of use to
- packet readers which perform Message-ID based message threading. Packet
- readers which perform subject threading (i.e. sort on the subject line and
- then on the date and/or arrival order) do not require such information. The
- format of the header lines in this case is as follows:
-
- offset<TAB>subject<TAB>author<TAB>date<TAB>bytes<TAB>lines[<TAB>selector]
-
- Further TAB-separated fields may be added in future versions of this
- specification.
-
- The "author" field is slightly different to the 'c' format. Instead of
- an RFC-822 format address, it is just the author's name, extracted from the
- "From:" line of the message. Most RFC-822 and RFC-1036 "From:" lines have one
- of the following forms:
-
- address
- address (name)
- name <address>
-
- Names may sometimes be surrounded by double-quote characters, have embedded
- "(...)" sequences, or contain "useless" information after a comma (",") or
- slash ("/"). The main requirement is that the generating software produce
- some kind of (more or less) meaningful string for the name of the author which
- can be displayed to the user by a packet reader. See RFC-822 and RFC-1036
- for more information on the syntax of the "From:" line in messages.
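-
- One possible way for generating software to derive the 'C' format author
- name is sketched below in Python, using parseaddr from the standard
- email.utils module. The function name author_name is hypothetical, and
- falling back to the address when no name is present is an assumption: the
- text above only requires a more or less meaningful string.
-
```python
from email.utils import parseaddr


def author_name(from_line: str) -> str:
    """Return a displayable author name taken from a "From:" value."""
    name, address = parseaddr(from_line)
    # Prefer the real name; otherwise fall back to the bare address, and
    # finally to the raw text if nothing could be parsed.
    return name or address or from_line.strip()
```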
-
- The 'i' index format is purely binary, using 8 bytes for each message in the
- corresponding message file. The first 4 bytes specify the offset into the
- message file of the message and the remaining 4 bytes specify the number of
- bytes in the message. Each 4-byte quantity is stored in big-endian order
- (high byte first). This format is supplied to provide a trade-off between
- transmission time and easy extraction of messages from a message file.
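-
- The 'i' index layout lends itself to a very small reader. The Python
- sketch below uses hypothetical function names and assumes both 4-byte
- quantities are unsigned.
-
```python
import struct


def read_i_index(index_data: bytes) -> list[tuple[int, int]]:
    """Decode an 'i' index file into (offset, length) pairs."""
    if len(index_data) % 8 != 0:
        raise ValueError("index file is not a multiple of 8 bytes")
    return list(struct.iter_unpack(">II", index_data))


def extract_messages(msg_data: bytes, index: list[tuple[int, int]]) -> list[bytes]:
    """Slice each indexed message out of the message file contents."""
    return [msg_data[offset:offset + length] for offset, length in index]
```
-
- For a 'u' message file the offsets point just past each rnews header
- line, so the same slicing works without parsing the headers.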
-
- REPLIES FILE
-
- One of the requirements for an off-line reading system is a mechanism for a
- user to upload replies or new messages to a generating system for mailing or
- posting. While it is possible to re-use the AREAS file for this purpose,
- keeping the download and upload sections separate will help prevent messages
- being fed back into a network erroneously.
-
- The REPLIES file has a similar format to the AREAS file. Each line has the
- following form:
-
- prefix<TAB>reply kind<TAB>encoding
-
- The "prefix" and "encoding" fields are as before. The "reply kind" field
- indicates the mechanism to use when transmitting the messages in the message
- file. The following values are currently defined:
-
- mail Transmit an RFC-822 compliant personal mail message
- news Transmit an RFC-1036 compliant USENET news posting
-
- On a Unix system, transmission of mail and news is usually performed with the
- "sendmail" and "inews" programs respectively. Additional kinds may be
- specified in a future version of this specification for other message formats.
- Note: using the kinds "mail" and "news" for anything other than RFC-compliant
- messages is discouraged. In particular, FidoNet or QWK messages
- should use a different reply kind. Messages of the same reply kind can be
- placed in the same message file, or in separate message files.
-
- Further TAB-separated fields may be added to the lines in the REPLIES file
- in a future version of this specification.
-
- It is recommended that a message file type of 'b' or 'B' be used for sending
- replies to minimise the chance of message corruption. The recommended index
- file types for replies are 'i' and 'n'. The index types 'c' and 'C' are
- discouraged because they do not provide useful information for reply purposes.
-
- The format of the messages in the message files should follow the relevant
- RFC standards, with the following restriction: any "From:", "Sender:",
- "Control:", "Approved:" or other similar "dangerous" header lines should be
- ignored by the system transmitting the replies to prevent forgeries from
- occurring. In particular, the "From:" header should be determined from the
- user's login name, or some other similar means, rather than from any data
- supplied in the user's message.
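-
- The restriction above could be enforced on the generating host along the
- lines of this Python sketch. The function name sanitize_reply and the
- exact set of dropped headers are assumptions: the specification names
- four headers and leaves "other similar" ones to the implementer. The
- user_address argument stands for whatever address the host derives from
- the user's login.
-
```python
DANGEROUS_HEADERS = {"from", "sender", "control", "approved"}


def sanitize_reply(message: str, user_address: str) -> str:
    """Drop user-supplied dangerous headers and set a trusted From: line."""
    head, separator, body = message.partition("\n\n")
    kept = []
    skipping = False
    for line in head.split("\n"):
        if line[:1] in (" ", "\t"):
            # Continuation line: shares the fate of the header it continues.
            if not skipping:
                kept.append(line)
            continue
        name = line.split(":", 1)[0].strip().lower()
        skipping = name in DANGEROUS_HEADERS
        if not skipping:
            kept.append(line)
    kept.insert(0, f"From: {user_address}")
    return "\n".join(kept) + separator + body
```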
-
- In most cases, mail messages will contain "To:", "Subject:", "Cc:", "Bcc:"
- and "Reply-To:" header lines, and news messages will contain "Newsgroups:",
- "Subject:", "Followup-To:", "Keywords:", "Summary:" and "Reply-To:" header
- lines. Other optional headers (especially MIME content headers) may also
- be present.
-
- The automatic addition of a signature by the generating host which receives
- the reply packet is discouraged. Signatures should be added by the user's
- packet reading software instead, if desired.
-
- A method for allowing replies from more than one person to be stored in the
- same packet was considered, but was rejected for security reasons.
-
- The following example demonstrates the capabilities of the REPLIES file:
-
- R001<TAB>mail<TAB>bn
- R002<TAB>mail<TAB>bi
- R003<TAB>news<TAB>Bn
- R004<TAB>news<TAB>Bi
-
- LIST FILE
-
- The LIST file may be used to send a list of available message areas to the
- receiving system. Its format is similar to the AREAS file, with the prefix
- field deleted. Each line has the following form:
-
- area name<TAB>encoding[<TAB>description]
-
- where "area name" is the name of the message area, "encoding" is a 2, 3 or 4
- letter message, index, area kind, and subscription code, and "description"
- is an optional message area description. Further optional fields may be
- added in a future version of this specification.
-
- The message, index, and area kind codes are the same as for the AREAS file.
- The subscription code has one of the following values:
-
- y The user is subscribed to the message area
- n The user is not subscribed to the message area
-
- If this field is not present, it defaults to 'n'.
-
- Note that the message areas in the LIST file should only be those that can
- be subscribed to or unsubscribed from using a request in the COMMANDS file.
- Private e-mail message areas will normally not appear in the list.
-
- The following example demonstrates the capabilities of the LIST file:
-
- alt.flame<TAB>ucnn
- comp.bbs.misc<TAB>ucny
- comp.bbs.waffle<TAB>ucny
- comp.lang.c<TAB>ucnn<TAB>C Programming Language Discussions
- news.future<TAB>ucny<TAB>Future of USENET
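-
- Reading a LIST line is similar to reading an AREAS line. The Python
- sketch below uses the hypothetical name parse_list_line and returns a
- plain dictionary. It applies the rule that a missing subscription code
- defaults to 'n'.
-
```python
def parse_list_line(line: str) -> dict:
    """Decode one TAB-separated LIST line."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 2 or len(fields[1]) < 2:
        raise ValueError(f"malformed LIST line: {line!r}")
    name, encoding = fields[0], fields[1]
    return {
        "name": name,
        "msg_format": encoding[0],
        "idx_format": encoding[1],
        "kind": encoding[2] if len(encoding) > 2 else "u",
        # The fourth letter is the subscription code; absent means 'n'.
        "subscribed": len(encoding) > 3 and encoding[3] == "y",
        "description": fields[2] if len(fields) > 2 else "",
    }
```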
-
- SENDING COMMANDS BETWEEN SYSTEMS
-
- The COMMANDS and ERRORS files contain information for changing the behaviour
- of each end of a packet stream, or for reporting errors in the execution of
- commands or the generation of packets. Each is a text file with LF-terminated
- lines.
-
- The ERRORS file is the simplest: it consists of error messages from the
- program which generated the packet to report on the progress of previously
- executed commands. The format of these error messages is not defined, but
- they should be human readable so that packet readers may present the errors
- to the user for perusal.
-
- The COMMANDS file consists of a sequence of commands, one per line, which
- modify the behaviour of the packet processor at the other end of the
- packet stream. Usually these commands are sent from the packet reader
- to the packet generator to change the subscribed message areas, send
- files, etc. The names of the commands are NOT case significant, but SHOULD
- be sent in lower case. Any commands that are not understood by a program
- should be ignored.
-
- version n.m
-
- This command specifies the version of this specification that the
- packet conforms to. For this document the version is "1.2".
-
- date dd mmm ccyy hh:mm:ss [zone]
-
- The date and time when the packet was created. To prevent confusion
- with different countries' date formats, the date MUST always appear
- as "dd mmm ccyy". For example, "25 Jul 1993". This date format can
- be converted to local conventions if desired. "hh:mm:ss" is a
- 24-hour clock time value. The "zone" field is the number of hours
- and minutes that the timezone is offset from Greenwich Mean Time as
- "+HHMM" or "-HHMM". For example, US Eastern Standard Time (EST) is
- "-0500", and Australian Eastern Standard Time is "+1000". If the
- zone is omitted, it defaults to "local time", however the zone should
- only be omitted if there is no way to determine it.
-
- subscribe name
-
- This command requests the packet generating program to subscribe to
- a new message area. The area name may contain spaces, but not TABs.
- Additional fields may be added in a future version of this
- specification after a separating TAB. For now, ignore anything after
- a TAB. This command may generate an error message if the message area
- does not exist, or cannot be subscribed to.
-
- unsubscribe name
-
- This command requests the packet generating program to unsubscribe
- from a message area. The same remarks about TABs and errors above
- also apply to this command.
-
- catchup [name]
-
- This command requests the packet generating program to catch up on
- the nominated message area. That is, to mark all messages in the
- area as read and continue batching from the next message received.
- If the area name is not present, the packet generating program
- should catch up on all message areas.
-
- list [always|never]
-
- This command requests the packet generating program to send a
- full list of all available message areas as a LIST file in
- the next packet. If the argument "always" is present, then
- the LIST file should be sent in every packet. The argument
- value "never" reverses this. For minimal compliance,
- "list always" should be treated as "list", and "list never"
- should be ignored.
-
- hostname string
-
- This command specifies the name of the host or BBS the packet was
- generated on. It serves an informational role only. The string
- can be any sequence of printable ASCII characters.
-
- software string
-
- This command specifies the name and version of the software which
- generated the packet. It serves an informational role only. The
- string can be any sequence of printable ASCII characters.
-
- sendme<TAB>area<TAB>selector[<TAB>selector[...]]
-
- This command requests that the packet generator send a number of
- messages from the nominated message area. The "selector" arguments
- are taken from the "selector" fields in a 'c' or 'C' index file.
- Multiple "sendme" commands for the same message area may be present
- in a COMMANDS file. The maximum length for this command is 500
- characters. Note that other commands use spaces to separate
- arguments, but this command uses TABs.
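-
- Because several "sendme" commands may name the same area, a packet
- reader can stay under the 500 character limit by splitting a long
- selector list across lines. The Python sketch below illustrates this.
- The function name "build_sendme" is hypothetical, and the sketch assumes
- that the limit counts the command text without its line terminator,
- which the specification does not state.
-

```python
MAX_SENDME_LENGTH = 500

def build_sendme(area, selectors, limit=MAX_SENDME_LENGTH):
    """Return TAB-separated "sendme" lines, each at most `limit` characters."""
    prefix = "sendme\t" + area
    lines = []
    current = prefix
    for selector in selectors:
        candidate = current + "\t" + selector
        if len(candidate) > limit and current != prefix:
            # The current line is full: emit it and start a new command.
            lines.append(current)
            candidate = prefix + "\t" + selector
        if len(candidate) > limit:
            raise ValueError("selector too long for a single sendme command")
        current = candidate
    if current != prefix:
        lines.append(current)  # emit the last, partly filled line
    return lines
```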
-
- mail y
- mail n
-
- This command changes whether or not private e-mail should be sent
- in generated packets.
-
- deletemail y
- deletemail n
-
- This command changes whether or not the user's private mailbox should
- be deleted after being batched into a packet.
-
- mailindex x
-
- Set the preferred mail index format, where 'x' is one of the values
- 'n', 'c', 'C' or 'i'.
-
- newsindex x
-
- Set the preferred news index format, where 'x' is one of the values
- 'n', 'c', 'C' or 'i'.
-
- get filename [putname]
-
- Request that a file on the generating side be placed into a packet
- and sent to the packet reader. "putname" specifies the "filename"
- argument for the corresponding "put" command. If "putname" is
- not specified, the default is to use the base name of "filename".
- If directory paths are specified, the separator must be '/'. It
- should be noted that security could be breached through the use
- of this command, so programs which support this command should be
- very careful, preferably restricting requests to a particular
- directory tree.
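-
- One possible way to restrict requests to a particular directory tree is
- sketched below in Python. The function name "resolve_in_tree" and the
- root directory passed to it are assumptions for illustration. The check
- is purely textual: it does not follow symbolic links or handle
- platform-specific forms such as drive letters, which a real program
- would also need to consider.
-

```python
import posixpath

def resolve_in_tree(root, requested):
    """Return the normalized path for `requested` if it lies inside `root`, else None."""
    root = posixpath.normpath(root)
    # join() keeps `requested` unchanged when it is absolute, so both
    # relative and absolute requests go through the same check.
    candidate = posixpath.normpath(posixpath.join(root, requested))
    # Refuse the root itself and anything that is not strictly below it.
    if candidate == root or not candidate.startswith(root.rstrip("/") + "/"):
        return None
    return candidate
```

- The same kind of check could be applied to the destination filename of
- a "put" command before anything is written.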
-
- put pktname filename
-
- This command is usually sent in response to a "get" command, although
- it can be sent on its own. "pktname" specifies the name of the file
- in the packet which contains the requested file's contents. The
- "filename" argument specifies the destination file to write the contents
- to. Note that security could be breached with this command, so
- the destination filename should be checked, or restricted to a
- particular directory tree. It is also recommended that the user
- be prompted for confirmation before writing the file. If directory
- paths are specified in "filename", the separator must be '/'. It
- is recommended that the extension "FIL" be used for files in a
- packet which contain data sent with this command. For example,
- "put 001.FIL abc.zip".
-
- supported cmd ...
-
- This command is usually sent from a packet generator to inform a
- packet reader as to which commands are supported by the generating
- program. The argument is a space-separated list of command names.
- For example, "supported subscribe unsubscribe list", or "supported
- subscribe unsubscribe catchup list mail deletemail".
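-
- A packet reader could work out which commands it may send with a
- sketch like the following Python fragment. The name
- "supported_commands" is hypothetical. When no "supported" line is
- present, the sketch falls back to "subscribe", "unsubscribe" and "list",
- which is the default this specification recommends assuming.
-

```python
DEFAULT_SUPPORTED = frozenset({"subscribe", "unsubscribe", "list"})

def supported_commands(commands_text):
    """Return the set of command names the packet generator says it supports."""
    supported = None
    for line in commands_text.splitlines():
        words = line.split()
        if words and words[0].lower() == "supported":
            # The last "supported" line in the file takes precedence.
            supported = frozenset(w.lower() for w in words[1:])
    return DEFAULT_SUPPORTED if supported is None else supported
```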
-
- It is recommended that at least "subscribe", "unsubscribe" and "list" (with
- no arguments) be supported. It is also recommended that packet generators
- add a "supported" line to every packet to inform the packet reader
- which commands can be used. In the absence of a "supported" line, only
- "subscribe", "unsubscribe" and "list" should be assumed to be supported.
-
- If more than one command is received for the same item (e.g. "subscribe",
- "unsubscribe", "list", "mail", ...), then the last command in the COMMANDS
- file takes precedence over any previous commands.
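-
- For the subscription commands, the precedence rule can be sketched in
- Python by letting later lines overwrite earlier ones. The function name
- "effective_subscriptions" is an assumption, and the sketch treats area
- names as case sensitive, which the specification does not settle.
-

```python
def effective_subscriptions(commands_text):
    """Map each area name to True (subscribe) or False (unsubscribe).

    When an area is named more than once, the last command wins.
    """
    wanted = {}
    for line in commands_text.splitlines():
        parts = line.split(" ", 1)
        name = parts[0].lower()
        if name not in ("subscribe", "unsubscribe") or len(parts) < 2:
            continue
        # Area names may contain spaces but not TABs; anything after a
        # TAB is reserved for future fields and is ignored here.
        area = parts[1].split("\t", 1)[0].strip()
        if area:
            wanted[area] = (name == "subscribe")
    return wanted
```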
-
- The following example demonstrates a typical COMMANDS file sent from a
- packet generator:
-
- version 1.2
- date 25 Jul 1993 12:34:38 +1000
- hostname frobozz.domain.com
- software Fubar 1.3
- supported subscribe unsubscribe catchup list sendme get
- put 001.FIL abc.zip
- put 002.FIL def.txt
-
- The following example demonstrates a typical COMMANDS file sent from a
- packet reader:
-
- subscribe comp.lang.c
- subscribe comp.lang.misc
- unsubscribe alt.swedish.chef.bork.bork.bork
- list
- get xyzzy.zip
- get /usr/local/lib/fubar.txt frobozz.txt
-
- MESSAGE AREA SUMMARIES
-
- The preceding sections have described a number of features for supporting
- message area summaries. This section provides greater detail.
-
- Since some message areas, notably USENET newsgroups, can get quite large,
- the user may want to download a summary of a message area instead of all
- of the messages, and then request that messages of interest be sent at
- some later time for reading. Usually the summary will list the messages'
- subjects, authors, and other similar "header information". Optionally,
- the user may request that the first few lines of the messages also be
- sent so that the user may peruse the beginning of the message and decide
- whether to retrieve the rest of the message.
-
- This activity is supported in the following fashion in this packet format:
- summary information is sent in an index file of type 'c' or 'C', usually
- with no accompanying message file. Therefore, the message file format in
- the AREAS file will be set to 'i'. Each line in the index file has its
- "bytes" field set to 0 to indicate that the message is not present in
- the message file, and the "selector" field is set to some string that can
- be used to request the message by way of a "sendme" command. Usually this
- selection string will be the message number of the message on the generating
- host, but other values such as Message-IDs are allowable.
-
- If the first few lines of each message are also desired, the message file
- format is set to something other than 'i', and the "offset" and "bytes" fields
- in the index file may be used to extract the trimmed-down messages for
- perusal. The "selector" field is once again used to request that an entire
- message be sent at some later time, by way of a "sendme" command.
-
- It is possible to create a message area which contains both ordinary messages
- and summary messages. If the "selector" field is not present, or is
- zero-length, then the message should be processed in the usual way, and if
- the "selector" field is present and not zero-length, then it is a summary
- message and the "bytes" field can be used to determine if the first few
- lines of a message exist in the message file or not. This mixture can be
- useful in some situations where the user wishes to download all messages
- less than a certain length, and download the larger messages as summaries,
- so that the larger messages can be explicitly requested only if the user
- really wants them.
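-
- The decision described above can be sketched in Python as follows. The
- function name "classify_index_entry" and the three result labels are
- hypothetical; the arguments stand for the "selector" and "bytes" fields
- of one index line after it has been parsed by other code.
-

```python
def classify_index_entry(selector, nbytes):
    """Classify one index entry from its "selector" and "bytes" fields."""
    if not selector:
        # Selector missing or zero-length: an ordinary, complete message.
        return "ordinary"
    if nbytes == 0:
        # Summary only: nothing for this message is in the message file.
        return "summary"
    # Summary whose first few lines are present in the message file.
    return "summary-with-excerpt"
```

- In the last two cases the selector is what a packet reader would place
- in a "sendme" command to request the whole message.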
-
- MINIMAL CONFORMANCE
-
- This section describes the minimal amount of work that a packet processor
- must do to be compliant with this specification.
-
- Packet generators should be able to generate message areas for the 'b'
- and 'u' message formats for private and public message areas respectively,
- and process replies for the 'b' and 'B' message formats. For minimal
- conformance, index format 'n' must be supported, and if message area
- summaries are required, one of index formats 'c' or 'C' should be supported.
- It is recommended that either 'c' or 'C' be supported in all packet
- generators, even when message summaries are not required. If message
- summaries are supported, the minimal requirement is to send an index file
- with the message file format set to 'i'. Packet generators should support
- the "subscribe", "unsubscribe" and "list" commands, and also the "sendme"
- command if message area summaries are required.
-
- Packet readers should be able to read all message and index formats, and
- generate replies for the 'b' and 'B' message formats. If message area
- summaries are not supported, all areas with message format 'i' should be
- flagged to the user as not understood. Packet readers should also be
- able to display the INFO and LIST files if they are present in a packet
- and be able to prompt the user for "subscribe" and "unsubscribe" requests
- to be sent to the packet generator.
-
- FUTURE ENHANCEMENTS
-
- The obvious enhancement that can be made is to support other message formats,
- especially FidoNet formats. Currently the message area file code 'q' is
- reserved for QWK-format messages. This will be defined in a future version
- of this specification if demand warrants.
-
- Experimentation with other formats and auxiliary files is encouraged, but
- please contact the author first to prevent double-ups from occurring.
- The author may be contacted via e-mail at rhys@cs.uq.oz.au.
-