- Newsgroups: comp.archives.admin
- Path: sparky!uunet!math.fu-berlin.de!Harpo.Chemie.FU-Berlin.DE!heiko
- From: heiko@Harpo.Chemie.FU-Berlin.DE (Heiko Schlichting)
- Subject: Re: genindex 1.0 released
- Message-ID: <O5R6LX@math.fu-berlin.de>
- Sender: news@math.fu-berlin.de (Math Department)
- Organization: Free University of Berlin, Dept. of Chemistry
- References: <R0G6YKV@mailgzrz.tu-berlin.de> <18d6ulINN7tv@nigel.msen.com>
- Date: Mon, 7 Sep 1992 04:34:55 GMT
- Lines: 252
-
- emv@msen.com (Edward Vielmetti) writes:
-
- >The FTP admins in Berlin have come up with a standardized tool to generate
- >/INDEX, /ls-lR, /INDEX-diffs/* etc. files, based on a format that's easier
- >to grep and to dump into archie than the ls-lR format. It lists only
- >readable files and avoids the slight nagging differences between ls
- >programs.
- >
- >See
- > ftp.cs.tu-berlin.de:/projects/genindex/*
- >for details.
-
- I've put genindex.tar.Z on ftp.uu.net as /tmp/genindex.tar.Z so that the
- slow lines to Berlin are not a problem. The file is only 14 kBytes.
- You also need GNU find to use the software.
-
- >The code is based on GNU find, and comes with filters so
- >that archie admins can load these index files directly. Installation
- >involves running a configure script, compiling the code, and throwing in a
- >crontab entry.
-
- I should mention that this is version 1.0 and there is a small glitch in
- the source: one feature (IGNORENAME, a file listing pathnames that
- are to be left out of the index) does not work. There is a very small
- patch for this, and when the author of genindex is back from vacation he
- will look at bug reports and make a new implementation. "genindex 1.0"
- is the first release and is still somewhat experimental.
-
- BTW: it is a config file in shell syntax (config.sh), not a configure script.
-
- But genindex-1.0 works - look at FTP.FU-Berlin.DE:/INDEX[.Z].
- Because the connections to Berlin are so slow, here are the first 20
- lines of this INDEX to show the format:
-
- #NAME ftp.FU-Berlin.DE
- #ADDRESS 130.133.4.50
- #ORGANIZATION Freie Universitaet Berlin
- #LOCATION Berlin, Germany; 52 31 N / 13 24 E
- #ACCESS no restrictions
- #MAIL ftp-adm@FU-Berlin.DE
- #SERVER-TIMEZONE +0200 (MET DST)
- #
- #INDEX-TIMEZONE +0000 (UTC)
- #VERSION 1.0 (28-Aug-1992); de-mirror INDEX format
- #CREATED 07-Sep-1992 01:00
- #
- DR-X 0 24-Aug-1992 00:02 /
- DRWX 0 04-Sep-1992 01:43 /incoming
- DRWX 0 24-Aug-1992 16:57 /incoming/amiga
- FRW- 4435 01-Jul-1992 16:48 /incoming/amiga/JOES.amiga.sites
- FRW- 583680 24-Aug-1992 16:57 /incoming/amiga/MED_Songs.dms
- FRW- 452 24-Aug-1992 16:57 /incoming/amiga/MED_Songs.readme
- FRW- 168103 03-Jul-1992 12:23 /incoming/amiga/ReOrg_V2.3.lha
- FRW- 210243 07-Jul-1992 17:51 /incoming/amiga/vortaro.lha
-
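A minimal Python sketch of how the sample lines above could be read. It follows the field order visible in the sample (type and mode, size, date, time, pathname), which differs from the order given in the draft grammar further down in this post, so treat it as an illustration only. The names `IndexEntry` and `parse_sample_line` are hypothetical and not part of genindex. It also assumes that the fields are separated by runs of spaces or tabs, so a pathname with leading white-space would not survive, and that all times are UTC as announced by the #INDEX-TIMEZONE line.

```python
import re
from datetime import datetime, timezone
from typing import NamedTuple

# English month abbreviations as they appear in the sample (no locale lookup).
MONTHS = {name: number + 1 for number, name in enumerate(
    "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec".split())}


class IndexEntry(NamedTuple):
    kind: str        # "D" (directory), "F" (file) or "L" (symlink)
    mode: str        # three permission characters, e.g. "RW-"
    size: int        # size in bytes; 0 for directories
    mtime: datetime  # timestamp, interpreted as UTC
    path: str


# Field order as shown in the sample: flags, size, date, time, pathname.
SAMPLE_LINE = re.compile(
    r"^([DFLdfl])([RWXrwx-]{3})[ \t]+(\d+)[ \t]+"
    r"(\d{2})-([A-Z][a-z]{2})-(\d{4})[ \t]+(\d{2}):(\d{2})[ \t]+(.*)$")


def parse_sample_line(line: str) -> IndexEntry:
    """Parse one non-'#' line of the sample INDEX into an IndexEntry."""
    match = SAMPLE_LINE.match(line)
    if match is None:
        raise ValueError("unrecognised index line: %r" % line)
    kind, mode, size, day, month, year, hour, minute, path = match.groups()
    if month not in MONTHS:
        raise ValueError("unknown month name: %r" % month)
    mtime = datetime(int(year), MONTHS[month], int(day),
                     int(hour), int(minute), tzinfo=timezone.utc)
    return IndexEntry(kind.upper(), mode.upper(), int(size), mtime, path)


entry = parse_sample_line(
    "FRW- 4435 01-Jul-1992 16:48 /incoming/amiga/JOES.amiga.sites")
print(entry.kind, entry.size, entry.mtime.isoformat(), entry.path)
```

Lines starting with "#" are header lines and would have to be filtered out before calling this function.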
-
- This format was discussed on the German FTP-mirror mailing list
- (de-mirror@informatik.tu-muenchen.de; to join, write to
- de-mirror-owner@informatik.tu-muenchen.de, but note that the list
- language is German). The format takes its name from this
- mailing list: "de-mirror INDEX format".
-
- A naming format for INDEX-diff files is also defined, so that
- archie and similar tools can fetch only the diffs and not the whole INDEX file.
-
- Below is the draft text for the "de-mirror INDEX format", but it is
- *NOT* necessary for everyone to write their own tools to create INDEX files
- in this format. "genindex.tar.Z" already includes such an sh script.
- If it does not work on your machine, please report bugs to
- ftpadm@cs.tu-berlin.de.
-
- (author: Carsten Rossenhoevel <cross@cs.tu-berlin.de>)
-
- [as discussed in de-mirror up to Aug 07, 1992]
- [editorial changes / bug fixes by {cross,marion}@cs.tu-berlin.de Aug 15, 1992]
- [info-line extensions by {heiko,vera,cross}@{cs.t,f}u-berlin.de Aug 22, 1992]
- [info-line extensions frozen Aug 28, 1992]
-
- DRAFT Syntax for FTP server index files
- ---------------------------------------
-
-
- index = *( line CRLF )
-
- line = comment-line
- / info-line
- / index-line
-
- comment-line = "#" white-space text
- -- user information, format undefined
-
- info-line = "#" 1*<any CHAR, excluding white-space and CRLF>
- white-space text
- -- reserved for further parsable
- information (contact addresses,
- IP addresses...)
- (preliminary standard see below)
-
- index-line = directory-line
- / symlink-line
- / file-line
-
- directory-line = ("D" / "d")
- ("R" / "r" / "-")
- ("W" / "w" / "-")
- ("X" / "x")
- white-space
- date-time white-space
- "0" " "
- pathname
- -- directories always have size 0.
- the search permission 'x' must
- be set.
-
- file-line = ("F" / "f")
- ("R" / "r")
- ("W" / "w" / "-")
- ("X" / "x" / "-")
- white-space
- date-time white-space
- size " "
- pathname
- -- files must be readable.
- There is NO white-space between
- size and pathname to support
- pathnames with leading
- white-space.
-
- symlink-line = ("L" / "l")
- "---"
- white-space
- date-time white-space
- "0" " "
- pathname
- " " "-" ">" " "
- pathname
- -- symlinks have size 0 and
- mode '---' by convention.
- There is NO white-space
- between the two pathnames
- in order to support pathnames
- containing white-space.
-
- date-time = date " " time
-
- date = day "-" month-name "-" year
-
- time = hour ":" minute
- -- time should be given in GMT
-
- day = ("0" / "1" / "2" / "3")
- DIGIT
- -- days < 10 have a leading 0.
-
- month-name = ("Jan" / "Feb" / "Mar" / "Apr" / "May" /
- "Jun" / "Jul" / "Aug" / "Sep" / "Oct" /
- "Nov" / "Dec")
-
- year = 4*DIGIT
- -- year should be >= 1970.
-
- hour = 2*DIGIT
- -- in 24 hour notation; entries
- < 10 have leading zeros
-
- minute = 2*DIGIT -- entries < 10 with leading
- zeros
-
- size = ( <any DIGIT except "0"> *( DIGIT ) )
- / "0"
-
- pathname = <any text not containing CRLF or " -> ">
- -- path components should be
- divided by "/" (UNIX
- pathname convention).
-
- text = *( CHAR )
-
- white-space = *( <ASCII SPACE> / <ASCII TAB>)
-
- token = <text without white-space>
-
- CHAR = <any ISO-8859-1 (8-bit ISO10646 subset) character>
- -- in POSIX.1, file names may
- contain any printable
- 8-bit-character (at least,
- I think so ;-)).
-
- NUMBER = DIGIT *( DIGIT )
-
- DIGIT = <any ASCII digit "0" .. "9">
-
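The grammar above can be turned into a small checker. The following Python sketch is one possible reading of the draft, not the genindex implementation; `classify`, `parse_draft_line` and `MODE_RULES` are hypothetical names. It assumes that a "#" followed directly by a non-white-space character marks an info-line, that the year has exactly four digits, and that fields come in the order the grammar gives (mode, date-time, size, pathname). A symlink line is split at the first " -> ", which is safe because a pathname may not contain that sequence.

```python
import re

# Permission characters allowed per entry type, following the draft grammar.
MODE_RULES = {"D": r"[Rr-][Ww-][Xx]", "F": r"[Rr][Ww-][Xx-]", "L": r"---"}

# mode, white-space, date-time, white-space, size, ONE space, rest of line.
DRAFT_LINE = re.compile(
    r"^(?P<kind>[DFLdfl])(?P<mode>\S{3})[ \t]+"
    r"(?P<date>[0-3]\d-[A-Z][a-z]{2}-\d{4}) (?P<time>\d{2}:\d{2})[ \t]+"
    r"(?P<size>0|[1-9]\d*) (?P<rest>.*)$")


def classify(line):
    """Return 'comment', 'info' or 'index' for a line without its CRLF."""
    if not line.startswith("#"):
        return "index"
    if len(line) > 1 and line[1] not in " \t":
        return "info"      # "#" directly followed by a keyword
    return "comment"       # "#" alone or followed by white-space


def parse_draft_line(line):
    """Parse one index-line laid out as the draft grammar describes it."""
    m = DRAFT_LINE.match(line)
    if m is None:
        raise ValueError("not an index-line: %r" % line)
    kind = m["kind"].upper()
    if not re.fullmatch(MODE_RULES[kind], m["mode"]):
        raise ValueError("mode %r not allowed for type %s" % (m["mode"], kind))
    size = int(m["size"])
    if kind != "F" and size != 0:
        raise ValueError("directories and symlinks must have size 0")
    path, target = m["rest"], None
    if kind == "L":
        # The first " -> " separates the link name from its target.
        path, sep, target = m["rest"].partition(" -> ")
        if not sep:
            raise ValueError("symlink-line without ' -> '")
    return {"kind": kind, "mode": m["mode"], "date": m["date"],
            "time": m["time"], "size": size, "path": path, "target": target}
```

Because exactly one space is consumed after the size, a pathname that starts with white-space keeps it, which is the case the draft's note on file-line is concerned with.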
-
-
- valid Info Line keywords && formats (preliminary)
- -------------------------------------------------
- ( M = multiple lines allowed, ! = mandatory )
-
- !M server-name = "#NAME" white-space token
- -- any CNAME, first occurrence
- is preferred
-
- M contact-address = "#MAIL" white-space *( token )
- -- RFC822 mail addresses
-
- ! index-version = "#VERSION" white-space NUMBER "." NUMBER
- -- INDEX format version. MANDATORY.
- Indices without a #VERSION-line
- are illegal.
-
- ! create-date = "#CREATED" white-space date-time
- -- index creation time, same timezone
- and format as other INDEX dates.
- Mandatory because it is used for
- diffs.
-
- timezone = ( "+" / "-" ) <four digits>
- white-space text
- -- offset to GMT as in RFC822
- text is optional and describes
- the timezone (not parsed)
-
- index-timezone = "#INDEX-TIMEZONE" white-space timezone
- -- timezone of INDEX dates,
- always +0000
-
- server-timezone = "#SERVER-TIMEZONE" white-space timezone
- -- server's home timezone
-
- organization = "#ORGANIZATION" white-space text
- -- name of server's organization;
- not parsed
-
- location = "#LOCATION" white-space text
- -- server coordinates etc. (not spec)
-
- access = "#ACCESS" white-space text
- -- access restrictions, not parsed.
-
- sort-field = "date" / "path" / "filename"
-
- sort-specifier = [ "casesensitive" white-space]
- [ "reverse" white-space]
- sort-field
-
- sort = "#SORT" sort-specifier
- -- no sort line means unsorted
-
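To illustrate the info-line keywords, here is a hedged Python sketch that collects the "#KEYWORD value" lines of a header, reports missing mandatory keywords (#NAME, #VERSION, #CREATED) and decodes a timezone value. The function names are hypothetical. Note that the sample header earlier in this post carries extra text after the version number, so the sketch only looks at the leading NUMBER "." NUMBER; that leniency is an assumption, not something the draft states.

```python
import re
from datetime import timedelta

MANDATORY = ("#NAME", "#VERSION", "#CREATED")
TIMEZONE = re.compile(r"^([+-])(\d{2})(\d{2})(?:[ \t]+(.*))?$")


def collect_info(lines):
    """Map each info-line keyword to the list of its values, in file order."""
    info = {}
    for line in lines:
        # Skip index-lines and comment-lines ("#" alone or "#" + white-space).
        if not line.startswith("#") or len(line) < 2 or line[1] in " \t":
            continue
        parts = line.split(None, 1)
        value = parts[1].strip() if len(parts) > 1 else ""
        info.setdefault(parts[0], []).append(value)
    return info


def missing_keywords(info):
    """Return the mandatory keywords that do not occur in the header."""
    return [keyword for keyword in MANDATORY if keyword not in info]


def parse_version(value):
    """Return (major, minor) from the start of a #VERSION value."""
    match = re.match(r"(\d+)\.(\d+)", value)
    if match is None:
        raise ValueError("bad #VERSION value: %r" % value)
    return int(match.group(1)), int(match.group(2))


def parse_timezone(value):
    """Return (offset from GMT, free-text description) for a timezone value."""
    match = TIMEZONE.match(value)
    if match is None:
        raise ValueError("bad timezone value: %r" % value)
    sign = -1 if match.group(1) == "-" else 1
    offset = sign * timedelta(hours=int(match.group(2)),
                              minutes=int(match.group(3)))
    return offset, match.group(4) or ""


header = [
    "#NAME ftp.FU-Berlin.DE",
    "#SERVER-TIMEZONE +0200 (MET DST)",
    "#",
    "#INDEX-TIMEZONE +0000 (UTC)",
    "#VERSION 1.0 (28-Aug-1992); de-mirror INDEX format",
    "#CREATED 07-Sep-1992 01:00",
]
info = collect_info(header)
print(missing_keywords(info))
print(parse_timezone(info["#SERVER-TIMEZONE"][0]))
```

Values are kept as lists because #NAME and #MAIL may occur more than once; for #NAME the first entry is the preferred one.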
-
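The #SORT line could be handled along the following lines. This Python sketch is an assumption-heavy illustration: the draft does not say what "casesensitive" and "reverse" mean exactly, so the code assumes that comparison ignores case unless "casesensitive" is given, that "reverse" means descending order, and that "filename" is the last path component. `parse_sort` and `sort_entries` are hypothetical names, and entries are represented as simple (timestamp, pathname) pairs.

```python
import posixpath

SORT_FIELDS = ("date", "path", "filename")


def parse_sort(value):
    """Return (case_sensitive, reverse, field) for a #SORT value."""
    tokens = value.split()
    case_sensitive = bool(tokens) and tokens[0] == "casesensitive"
    if case_sensitive:
        tokens.pop(0)
    reverse = bool(tokens) and tokens[0] == "reverse"
    if reverse:
        tokens.pop(0)
    if len(tokens) != 1 or tokens[0] not in SORT_FIELDS:
        raise ValueError("bad sort specifier: %r" % value)
    return case_sensitive, reverse, tokens[0]


def sort_entries(entries, value):
    """Sort (timestamp, pathname) pairs as the sort specifier asks."""
    case_sensitive, reverse, field = parse_sort(value)

    def key(entry):
        timestamp, path = entry
        if field == "date":
            return timestamp
        text = path if field == "path" else posixpath.basename(path)
        return text if case_sensitive else text.lower()

    return sorted(entries, key=key, reverse=reverse)
```

The optional words must appear in the order the grammar gives ("casesensitive" before "reverse"); a specifier such as "reverse casesensitive date" is rejected.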
- --
- Heiko Schlichting | Freie Universitaet Berlin
- heiko@groucho.chemie.fu-berlin.de | Institut fuer Kristallographie
-