- Newsgroups: alt.gopher
- Path: sparky!uunet!snorkelwacker.mit.edu!thunder.mcrcim.mcgill.edu!sifon!news
- From: peterd@bunyip.com (Peter Deutsch)
- Subject: Re: index for all of gopherspace - try the new archie...
- Message-ID: <1992Jul28.001000.8292@sifon.cc.mcgill.ca>
- Sender: news@sifon.cc.mcgill.ca
- Nntp-Posting-Host: expresso.cc.mcgill.ca
- Organization: Bunyip Information Services (the archie people)
- References: <a-steiner-270792125230@steiner.acns.nwu.edu>
- Distribution: na
- Date: Tue, 28 Jul 1992 00:10:00 GMT
- Lines: 113
-
- In article <a-steiner-270792125230@steiner.acns.nwu.edu>
- a-steiner@nwu.edu (Albert Steiner) writes:
- > In article <1992Jul27.163746.974@nstn.ns.ca>, daniel@nstn.ns.ca
- > (Daniel MacKay) wrote:
- > >
- >
- > > Then- we just run an index on the abstracts, and gopher managers can
- > > put a search item onto their menus- something like "Gopherspace
- > > Resource Search" that points to the central registry.
- > >
- > > Our hapless user picks it, and asks for "recipes" or "food" or
- > > "cooking" and gets a pointer back to the resource- wherever it is.
- > > (who cares where it is?)
-
- Actually, as we discovered with archie, they will care once they find
- they get the same thing from multiple sites, and one is on the same
- regional network and one is behind three slow pipes to Europe or
- Australia.
-
- > This is an idea I think is very important. I would like to see a way to
- > encode an address in "about" entries for "important" directory levels.
- > When the server selected an about entry with pointer, it would separate
- > the entry and the pointer into two items returned to the client. The
- > first item is the about, the second item is a directory entry to the
- > actual location elsewhere.
- >
- > This still leaves the problem of verifying that the indexes are up to
- > date. Perhaps they should be checked every week. The original directory
- > checked again for its "about" file, the directory address added, and
- > reposted in the directory of directories. In addition, new entries could
- > be added by mailing the address of a new "interesting" directory to the
- > "index master"
-
- While we are talking about indexing Gopherspace, the next release of archie
- will allow gathering of arbitrary collections of information and as our
- first demo of its new capabilities we plan to convert the existing
- archie.mcgill.ca into a pilot Internet Yellow Pages service (we figure
- there are enough primary archies out there now, and we want to
- demo the new software on an unsolved problem). This will track all
- sorts of services, not just anonymous FTP, and we plan to put
- in pointers to the various Gopher servers as a proof of concept.
-
- The idea is that archie will revisit your server periodically to a)
- verify you're still alive, and b) pick up a description of your site in
- some format (provided you can give this to us). We were planning to
- look into asking the Gopher gang to standardize on such site
- descriptions as we set things up. Sound reasonable? Have such standards
- been worked out and I missed it?
-
- A simple twist on this would be to build a "Gopher archie", which
- in effect indexes every Gopher entry (as we do now for the 2,100,000
- entries in anonymous FTP). Users would then be able to search for
- specific items in the "Gopher archie" databases. Carrying this further,
- you could allow a hierarchy of textual "descriptions", which could
- be picked up and indexed for rapid searching.
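-
- To make the "Gopher archie" idea concrete, here is a minimal sketch of
- a word index over gathered entries with archie-style substring search.
- It is an illustration only, not how archie stores its databases: the
- (title, host, selector) tuple layout and the names build_index and
- search are assumptions made up for this example.

```python
from collections import defaultdict


def build_index(entries):
    """Map each lowercased title word to the positions of entries using it.

    `entries` is a list of (title, host, selector) tuples; this layout is
    an assumption for the sketch, not a real archie record format.
    """
    index = defaultdict(set)
    for position, (title, _host, _selector) in enumerate(entries):
        for word in title.lower().split():
            index[word].add(position)
    return index


def search(index, entries, term):
    """Return every entry whose title has a word containing `term`."""
    term = term.lower()
    hits = set()
    for word, positions in index.items():
        if term in word:  # substring match against each indexed word
            hits |= positions
    # Sort so results come back in the order they were gathered.
    return [entries[position] for position in sorted(hits)]
```

- Note that a search can return the same title from several hosts, which
- is exactly the case where the user starts to care where the copy lives.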
-
- Now, the missing link here seems to be a simple way to get the index
- info out of Gopher. One alternative (kinda kludgy, but it would work)
- is to build such info "off-line" and offer it through anonymous FTP
- (our work through IAFA is aimed at standardizing ways of encoding
- this kind of stuff for tools such as ours to pick up and this could
- be a candidate for a "Services" template).
-
- Another approach is a simple extension to the Gopher protocol to allow
- us to say "dump the equivalent of ls -lR" to the sender and have
- the server give back the entire listing in a format we can parse, or
- even "dump your hierarchy of descriptions files".
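-
- Short of a protocol extension, a gatherer could approximate the
- "ls -lR" dump by walking a server's menus itself. The sketch below
- assumes the usual Gopher menu layout (a one-character type followed by
- the title, then selector, host and port separated by tabs, with a lone
- "." ending the menu). The names parse_menu and dump_listing, the output
- format, and the `fetch` callable standing in for the network request
- are all hypothetical.

```python
from typing import Callable, List, NamedTuple, Optional, Set


class MenuItem(NamedTuple):
    item_type: str  # one-character Gopher type, e.g. "0" file, "1" directory
    title: str
    selector: str
    host: str
    port: int


def parse_menu(raw: str) -> List[MenuItem]:
    """Parse one menu: a tab-separated item per line, a lone '.' ends it."""
    items = []
    for line in raw.splitlines():
        if line == ".":
            break
        fields = line.split("\t")
        if len(fields) < 4 or not fields[0] or not fields[3].isdigit():
            continue  # skip malformed lines rather than guess at them
        items.append(MenuItem(fields[0][0], fields[0][1:], fields[1],
                              fields[2], int(fields[3])))
    return items


def dump_listing(fetch: Callable[[str], str], host: str, selector: str = "",
                 seen: Optional[Set[str]] = None) -> List[str]:
    """Return a flat, depth-first listing of everything reachable on `host`.

    `fetch(selector)` must return the raw menu text for that selector.
    Directories on other hosts are listed but not followed.
    """
    seen = set() if seen is None else seen
    if selector in seen:  # guard against menus that link back to themselves
        return []
    seen.add(selector)
    lines = []
    for item in parse_menu(fetch(selector)):
        lines.append(f"{item.item_type}\t{item.selector}\t{item.title}")
        if item.item_type == "1" and item.host == host:
            lines.extend(dump_listing(fetch, host, item.selector, seen))
    return lines
```

- Walking menus this way costs one request per directory, which is the
- overhead a single server-side dump command would avoid.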
-
- There are people more qualified in Gopherspeak than I am in this group
- who might want to comment on the feasibility (and desirability) of
- doing this, but we basically are testing the code to do the general
- gathering and database building now.
-
- If you guys gave us the data gathering capability, we are just about
- ready to periodically pick the stuff up and drop it into the
- appropriate archie database and index it for you. At that point it
- would look pretty much like the anonFTP archie as far as the user was
- concerned, with the word anonFTP crossed out and the word "Gopher"
- written in in red crayon. Of course, Gopher already has a gateway
- to archie, so accessing this service from Gopher is a done deal.
-
- FYI, the next release will have support for WAIS databases out
- of the box, so indexing large text files will be easy. The new
- architecture is designed to allow arbitrary data gathering, parsing
- and database maintenance, so you should be able to build almost any
- database you want out of distributed information. Anything up to a
- couple of Gigabytes of information seems feasible with a combination
- of WAIS, cheap disk space and enough workstations to handle the
- anticipated load (I think a Yellow Pages service could easily generate
- as much traffic as the anonFTP archie does now, maybe more).
-
- Comments on this approach most welcome, as we are working now to get
- things ready for the first live test in the next week or so.
-
-
-
- peterd
-
-
-