NetNews Usenet Archive 1992 #20

Text File | 1992-09-08 | 8.4 KB | 162 lines

Newsgroups: alt.hypertext
Path: sparky!uunet!psinntp!wrldlnk!usenet
From: "Ernest Perez" <demoep@psilink.com>
Subject: Re: Interaction with a hypertext
In-Reply-To: <1992Sep6.224648.3218@memstvx1.memst.edu>
Message-ID: <2924988206.0.demoep@psilink.com>
Sender: usenet@worldlink.com
Nntp-Posting-Host: 127.0.0.1
Organization: Access Information Assoc., Inc.
Date: Mon, 7 Sep 1992 22:51:23 GMT
X-Mailer: PSILink (3.2)
Lines: 148

Some more comments about dynamic hypertext and hypertext structuring -

In his article, Mark C. Langston <langston@memstvx1.memst.edu> writes...

>* If the HT structure is theory-determined or research-determined (as
>  opposed to just thrown together, a very poor way to develop links),
>  a strong indexing/inference engine would need to be incorporated into
>  the system to correctly link user annotations/additions.

Unfortunately, it would seem that a "strong indexing/inference engine"
just ain't in the realm of reasonably current possibility. E.g., the
library & information science profession spent a *lot* of research time
and money on that approach in the '70s and '80s. Their studies covered
knowledge representation, knowledge structure, and, with special focus,
systems for automated indexing or indexing assistance.

Their findings generally showed that it was not yet possible to build a
subjective text/language interpreter with any reliable performance or
predictability. Study after study showed that the automated
indexing/classification was not comparable in quality to that produced
by human indexers ("harmless drudges," wrestling with knowledge, like
Dr. Johnson's lexicographers).

[Writing about another topic, the writer implies recognition of the
poverty of these systems...]

>the computational power
>required approaches that of natural-language understanding systems, and, I
>fear, would work about as well.

Anyone seriously going into this area should take a look at the writing
of Cyril Cleverdon, Wilfred Lancaster, A.C.
Foskett, Tefko Saracevic, Donald Cleveland, and that crowd.

[As an aside here, I am writing from the viewpoint of a librarian and
information science professional. That is, after all, the discipline
that has historically concerned itself with "putting stuff away in such
a manner that you can dependably find it again." Yes, as Vannevar Bush
pointed out, they developed all kinds of arcane schemes and
classifications, but remember that they have been dealing with the
problem for centuries, heretofore hampered by manual methods and
physical knowledge representations. Also, those schemes *were* mostly
standardized or community-property conventions, just like ASCII and
ISO.]

>HT's in general are developed 'off-the-cuff', with no empirical or
>theoretical basis (correct me if I'm wrong, it would be welcome news).
>If one assumes that the HT does have empirical/theoretical foundations
>for links, a sensible approach would be the knowledge annealing system
>addressed in a previous post. However, one would hope that the system
>could be more automated, and not require current updating by a
>moderator.

I believe that the "in general" off-the-cuff development of HTs is a
pretty sloppy approach. It is sadly representative of an amateurish
approach to the small systems that most HT and hypermedia researchers
have produced or studied. I am not putting down their work, but they
are approaching it: 1) with no real knowledge of classification or
knowledge representation; and 2) pretty much dealing with dime
store/economy-size knowledge or content collections.

Liora Alschuler wrote about the conversion of the Hypertext '87
proceedings into three different hypertext systems, done by the
developers of the systems themselves. ("Hand-crafted hypertext --
lessons from the ACM experiment," in _The Society of Text..._, 1989.
ed. by Edward Barrett. pp.
342-361) Alschuler noted the inconsistency of implementation, the poor
indexing, the disorganization of index lists, and the team reports of
extreme difficulty. She reports, after talking with one of the
principals, "By the end of the project, they were 'practically
fabricating' meaningful connections in order to install more links."
(p.358)

Not surprising, from my point of view. Would we expect the editors of
the _Encyclopedia Britannica_ to produce a good index? No, they hire
professionals to act as colleagues for that part of the editorial
creation.

At Hypertext '91, Frank Halasz updated his "Seven Issues" for HT
research emphasis. Automated search and query concerns predictably
topped the earlier list (the computer jock brute force approach). But
he also prominently mentioned structure problems, ergo
humanly-decipherable content representation. In the new list of
priorities, he voiced two totally new concerns in the area of
macro-scaled HT systems. They were 1) "User Interfaces for Large
Information Spaces"; and 2) [methods for controlling] "Very Large
Hypertexts." His platform response to an audience question of "How big
is large?" went something like, "I don't know...maybe 1000, even 2000
nodes." To the librarian/indexer professionals in the audience, it was
hard not to snicker too loudly. :-)

"Serious" or "big" systems are in production. Many of these tend to use
classification hierarchies or indexing systems as a base. For example:

*** _Facts on File_ is a standard printed library reference tool. They
have translated a 12-year cumulation of the product into a CD-ROM
hypermedia, including photos, maps, audio clips, and (I believe) video
clips. For easy pinpoint or specific retrieval, they have a search
engine producing dynamic link lists. BUT, they also
translated/incorporated all the human editorial and indexing
cross-references by linking the internal printed cross-references, and
the standard printed index.
A user can backtrack from a given article to the index entries for that
page, and thus see a map to related topics.

*** _McGraw-Hill Encyclopedia of Science & Technology_ - Yeah, they use
search, but again they build on the intellectual investment in print
product access points.

*** _Oxford English Dictionary_ - Read Robert Glushko's articles for a
real project engineering approach to exploiting the intellectual and
the print format information access points.

*** _DaTa_ (Deloitte & Touche's internal CD-ROM hypertext on the
accounting/auditing professional area) - From another "pre-planned"
approach, they use a complex taxonomy network, maintained with
MaxThink's network-outliner software. This sophisticated
"classification scheme" lays the groundwork for presenting a topical
matrix used to give access to about 200 meg of hypertext. Updated
quarterly *by one guy*, a CPA professional, no less!

Do remember that for a *long* time to come, the majority of electronic
information systems are going to be byproducts, spin-offs, from print
product publishing. You wouldn't have the Chem Abstracts database
without _Chemical Abstracts_, or the _New York Times_ online without
the _New York Times_. And print publishing information access tools are
not "off-the-cuff"; there's 500 years of heuristics of
how-to-do-it-good.

[In regard to large domains and user annotation...]

>...this raises the question of computational explosion....

True, but it more so raises the question of "cognitive explosion."
However, in a *large* system (Chemical Abstracts, Engineering Index,
MedLine) that's the breaks, Charlie. Even in your friendly local
library online (card?) catalog, you've got three or four "links" to
each of anywhere from 50,000 to a couple of million books/"nodes". The
problem is ameliorated by human intent; at any one time I'm only
interested in 3, 5, maybe 15, of those nodes.

Same comment applies to the mention of user annotation.
Okay, they may grow, but they're -

* not all gonna be on the same screen;
* not all going to appear to be of beckoning interest to me;
* not there with some rule that I've *got* to follow every link
  (Just Say No) :-)

I believe that for the foreseeable future, it's gonna take human
hyper-editorial and hyper-building talent to produce quality hypermedia
systems. Yeah, there will be authoring and system-building utilities.
But I don't think we're going to have a magical "automated moderator"
of any real performance ability or quality.

Cheers,
ernest

..............................
Ernest Perez, Ph.D.
Access Information Associates
2183 Buckingham, Suite 106
Richardson TX 75081
214-530-4800
INTERNET: eperez@utdallas.edu
BITNET: eperez@utdallas
..............................