- Path: sparky!uunet!cis.ohio-state.edu!magnus.acs.ohio-state.edu!usenet.ins.cwru.edu!agate!ames!bionet!snorkelwacker.mit.edu!thunder.mcrcim.mcgill.edu!sifon!peterd
- From: peterd@cc.mcgill.ca (Peter Deutsch)
- Newsgroups: alt.gopher
- Subject: Re: index for all of gopherspace
- Message-ID: <1992Jul29.153142.2281@sifon.cc.mcgill.ca>
- Date: 29 Jul 92 15:31:42 GMT
- References: <1992Jul27.180509.27470@mercury.unt.edu> <1992Jul27.201122.17017@msuinfo.cl.msu.edu> <1992Jul28.194722.19492@nstn.ns.ca>
- Sender: news@sifon.cc.mcgill.ca
- Organization: Bunyip Information Systems (the archie people)
- Lines: 103
- Nntp-Posting-Host: expresso.cc.mcgill.ca
-
- In article <1992Jul28.194722.19492@nstn.ns.ca> daniel@nstn.ns.ca (Daniel MacKay) writes:
- >Hello!
- . . .
- >Peter Deutsche writes:
-
- First, a little administrivia note - that's "Deutsch", not
- "Deutsche". Think of me as a noun, not an adjective... :-)
-
-
- >> While talking about indexing Gopherspace, the next release of archie
- >> will allow gathering of arbitrary collections of information and ...
- >> [...]
- >> and we plan to put
- >> in pointers to the various Gopher servers as a proof of concept.
- >
- >I was describing my idea to a visitor to my office, and he suggested
- >exactly the same thing- there be a file with a well-known-name available on
- >every gopher that is authoritative for some resource. This file would have
- >a wordy and search-rich description of the resources, i.e. an abstract for
- >the resource written in a way that it will be full of keywords that people
- >are likely to use when they're looking for it. The one for the recipes
- >database would have words like "recipes" "food" "cooking" (but would it
- >contain an entry for "dessert" "entree"? Hmm.)
-
- Yup, this sounds like an IAFA-like template for each site
- admin to fill in, with liberal use of the "Keywords" and
- "Comments" fields. I like this approach because it is
- easy for admins (so they'll do it) and has enough
- structure to make searching a lot easier.
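
As a rough illustration of the template idea above, here is a minimal Python sketch that parses one admin-written record and indexes its "Keywords" and "Comments" fields. The "Field: value" layout with indented continuation lines, the sample record, and the names `parse_template`, `build_index`, and `example-host/recipes` are all assumptions made for this example; they are not taken from the IAFA drafts or from archie.

```python
import re
from collections import defaultdict


def parse_template(text):
    """Parse 'Field: value' lines; indented lines continue the previous field.

    The layout is an assumption for this sketch, not the IAFA specification.
    """
    record = {}
    last = None
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        if line[0].isspace() and last is not None:
            # Continuation line: append to the field we saw most recently.
            record[last] += " " + line.strip()
        elif ":" in line:
            name, value = line.split(":", 1)
            last = name.strip()
            record[last] = value.strip()
    return record


def build_index(records):
    """Map each lowercase word in Keywords/Comments to the records using it."""
    index = defaultdict(set)
    for rec_id, rec in records.items():
        words = rec.get("Keywords", "") + " " + rec.get("Comments", "")
        for word in re.findall(r"[a-z0-9]+", words.lower()):
            index[word].add(rec_id)
    return index


# Hypothetical record, as a site admin might fill it in.
SAMPLE = """\
Title: Recipes database
Keywords: recipes, food, cooking
Comments: Searchable collection of recipes,
  including desserts and entrees.
"""

records = {"example-host/recipes": parse_template(SAMPLE)}
index = build_index(records)
print(sorted(index["desserts"]))
```

Because the admin writes the description once, a search for a word that never appears in a menu title (such as "desserts" here) can still find the resource.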
-
- >Peter's, or someone's, robot could sweep through the gopher servers
- >periodically, collect the files, and build the Index Into Gopherspace.
- >
- >My point: there are not *that* many things in gopherspace once you take
- >out all the redundancy. If we only collect descriptions from people who
- >are authoritative for their resource, the problem dwindles into something
- >quite doable- think of how much smaller Peter's archie database would be if
- >there was only one entry in it for every ftp'able resource!
-
- This is exactly what we are saying about anonFTP, too. I
- believe we have already reduced duplication to some extent
- with archie: now that people have some expectation that
- they can find something again, they no longer feel quite so
- inclined to store old copies. Of course, there is still a
- lot of "pack-ratting" going on. To address this completely,
- we really need unique document identifiers and resource
- serial numbers deployed to allow the tools to detect and
- eliminate the duplicates. Work is going on through the
- IETF to get these defined and deployed ASAP.
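
To make the duplicate-detection point concrete, here is a small Python sketch of how a tool might collapse listings once every copy carries a shared identifier. The identifier strings, host names, paths, and the function `merge_by_identifier` are hypothetical; the post only says such identifiers were still being defined, so nothing here reflects an actual scheme.

```python
from collections import defaultdict


def merge_by_identifier(listings):
    """Group (doc_id, host, path) tuples so each doc_id appears once."""
    merged = defaultdict(list)
    for doc_id, host, path in listings:
        merged[doc_id].append((host, path))
    return dict(merged)


# Hypothetical listings gathered from two archive sites.
listings = [
    ("doc-0001", "host-a.example", "/pub/tools/foo.tar.Z"),
    ("doc-0001", "host-b.example", "/mirrors/foo.tar.Z"),
    ("doc-0002", "host-a.example", "/pub/docs/bar.txt"),
]

catalog = merge_by_identifier(listings)
for doc_id, copies in catalog.items():
    print(doc_id, len(copies), "copies")
```

The catalog then holds one entry per resource, with the mirror locations kept as a list under that entry rather than as separate database rows.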
-
- >Anyway, it results in an entity that's like a couple of things:
- > a) Peter's "whatis" project.
- > b) My gopherizing of the SRI NISC List-of-Lists, available on the
- > nstn.ns.ca gopher; check it out as
- > Internet Resources
- > Mail Lists
- > Mail List Subject Search
- >
- >The important thing is that the *name* of something usually doesn't tell
- >you much about it. So a robot indexing all the words of the menu items it
- >finds in gopherspace is relatively useless. Someone carefully writing a
- >description of their resource, keeping in mind the kind of keywords a user
- >might use when looking for it, is *much* better. Yeah, writing it is work.
- >But- garbage in, garbage out. If you're authoritative for a resource,
- >presumably you should be able to take the time to describe the resource
- >once.
-
- Again, this is exactly the same problem we found with our
- anonFTP experience and was a driving force behind our work
- with IAFA. Once we have a standardized way of encoding
- information, the tools can be deployed to index and search
- it (given that we now have archie, WAIS, etc., the tools
- are already written at this point).
-
- >Peter continues:
- >> Now, the missing link here seems to be a simple way to get the index
- >> info out of Gopher.
- >I don't think that's much of a problem. *The* thing that gopher's best at
- >is delivering documents. And I think it's quite reasonable to have a
- >Well-Known-Directory containing descriptions of the data for which that
- >gopher is authoritative, (e.g. me and CA*Net news, Canadian Weather).
- >
- >On a related topic- Peter, does that mean that the archie data and the
- >whatis data will be integrated? I had a complaint from a user the other
- >day that my gopher/archie gateway didn't deliver whatis data, and I
- >thought- right! Why doesn't it?
-
- Well, the info templates we gather will automate the
- maintenance of the "whatis" database, but I believe that
- the reason you can't get to the current archie whatis data
- is simply that the current Gopher gateway doesn't support
- the "whatis" command. This is a minor loss right now,
- since the database is not automatically maintained, but
- hopefully someone will improve the gateway once the new
- release is deployed to allow Gophereens to access all the
- new databases. If someone wants to join the 3.0
- client development team, drop us a line (actually, send
- the note to Alan Emtage "bajan@bunyip.com", since he's
- coordinating this work). We're just about to release stuff
- to the client writers for 3.0.
-
-
- - peterd
-