- Comments: Gated by NETNEWS@AUVM.AMERICAN.EDU
- Path: sparky!uunet!paladin.american.edu!auvm!NWU.EDU!ALBERT-LUNDE
- X-Sender: lunde@casbah.acns.nwu.edu
- Message-ID: <9211192141.AA01063@casbah.acns.nwu.edu>
- Newsgroups: bit.listserv.cwis-l
- Date: Thu, 19 Nov 1992 15:41:51 -0600
- Sender: "Campus-Wide Information Systems" <CWIS-L@WUVMD.BITNET>
- From: Albert Lunde <Albert-Lunde@NWU.EDU>
- Subject: SUMMARY - Tools to clean-up text
- Comments: To: CWIS-L LISTSERV <CWIS-L@WUVMD.Wustl.Edu>,
- gopher-news@boombox.micro.umn.edu
- Lines: 136
-
- I sent out the question below:
-
- > To: CWIS-L and comp.infosystems.gopher
- >
- >I am writing to ask about the availability of tools for cleaning up text for
- >posting to gopher or other CWIS servers. I'm especially interested in the
- >process of converting word-processed text from Macs or PCs to a least
- >common denominator that will work on micro gopher clients and timesharing
- >terminals.
-
- Here is a summary of replies:
-
- ** Several people suggested saving text in a mono-spaced font. The best
- collection of directions for this was pointed to by Steve Watkins
- (watkins%SCILIBX.UCSC.EDU@WUVMD.Wustl.Edu).
-
- These are on the University of Oregon Gopher under the headings:
-
- DuckScoop -- U of O campus Information
- Becoming a DuckScoop Provider
- B. Create-Convert Guidelines
-
- (These are adapted from a document created by Computing and Network
- Services at the University of Alberta.)
-
-
- ** Doug Anderson (danderson@frmnvax1.bitnet) suggests:
-
- >There is a shareware tool for PCs called TEXTCON that will do some of
- >what you are interested in. As I understand it, it was developed to
- >deal with the problems of transferring documents from one type of
- >system to another, especially from dedicated word processors like
- >Wang and NBI to PCs.
- >
- >It will deal with the newline problem and will expand tabs to spaces.
- >It can also deal with the use of CR without LF that is sometimes used
- >to produce bold, underline, etc. It is a pretty smart program and
- >allows a great deal of control through the use of command line arguments.
-
- I couldn't find an Internet source for this, but I found it on CompuServe,
- in the "IBMAPP" forum in the Library section "Word Processing (A)" as
- TEXTCN.ZIP.
-
- The shareware payment is $25 to:
- CrossCourt Systems
- 1521 Greenview Ave.
- East Lansing, MI 48823
- (They take phone orders at (517) 332-4353 with Mastercard or Visa.)
-
- I haven't tested it, but it looks like it could be useful.
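
For readers unfamiliar with the "CR without LF" convention mentioned above, here is a minimal shell sketch of the idea. It is not TEXTCON and makes no claim about how TEXTCON works. It assumes that any text following a bare CR inside a line is an overprint (the same words again for bold, or underscores for underline) and can simply be dropped. The function name strip_overstrike is made up for illustration.

```shell
#!/bin/sh
# Illustrative only: NOT TEXTCON, and not based on its source or options.
CR=$(printf '\r')

# strip_overstrike (hypothetical name): reads stdin, writes stdout.
#   1. drop a CR at the end of a line (PC-style CRLF line endings)
#   2. drop any remaining bare CR together with the overprinted text after it
#   3. expand tabs to spaces
strip_overstrike() {
    sed -e "s/${CR}\$//" -e "s/${CR}.*//" | expand
}
```

It would be used as a filter, for example `strip_overstrike < exported.txt > clean.txt` (both file names are placeholders).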
-
- ** Jon P. Knight (jon@hill.lut.ac.uk) suggests:
-
- >* Change tabs to spaces using expand(1) on a UNIX machine (it's pretty
- >common; it's on SunOS and HP-UX at least and is probably available in the
- >BSD Net2 release if your machine hasn't got it).
-
- >* Consider using TeX as an intermediate form - there are many
- >wordprocessor to TeX conversion utilities and then the dvi2tty display
- >programs which create plaintext, nicely formatted files. This also gets
- >rid of the wordwrapping problem as TeX makes its own, usually pretty
- >sane, decisions about where lines should break. Check out some of the
- >papers on hill.lut.ac.uk CFD gopher on port 5000; some of these are
- >ASCII-fied Mac WORD files and others are LaTeX files converted directly
- >to text.
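
Both suggestions can be tried from a Unix shell. The sketch below is only a starting point: detab and tex_to_text are hypothetical wrapper names, and the second one assumes that latex and dvi2tty are installed and that the document has already been converted to LaTeX. The -w option sets the dvi2tty output width in columns and -o names the output file.

```shell
#!/bin/sh
# detab (hypothetical name): expand tabs to spaces on stdin -> stdout.
# Optional argument: distance between tab stops (default 8 columns).
detab() {
    expand -t "${1:-8}"
}

# tex_to_text (hypothetical name): LaTeX source -> plain text via a DVI file.
# Assumes latex and dvi2tty are on the PATH; writes NAME.txt in the
# current directory for an input file called NAME.tex.
tex_to_text() {
    base=$(basename "$1" .tex)
    latex "$1" && dvi2tty -w 80 -o "$base.txt" "$base.dvi"
}
```

For example, `detab 4 < notes.txt > notes.spaces.txt` or `tex_to_text paper.tex` (file names are placeholders).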
-
- ** JQ Johnson (jqj@ns.uoregon.edu) offered this Unix shell script:
-
- - - -
- #!/bin/sh
- # this program attempts to correct some of the common problems in text
- # files submitted to gopher
- # Usage:
- # $0 [ -c ] file [ ... file ]
- # or
- # $0 < file > newfile
- # Problems addressed:
- # Mac files: cr instead of lf, long lines, stray mac characters, tabs
- # PC files: ^Z at end, long lines (n.b.: CRLF => LFLF)
- # (note: still need to handle files that don't end with LF or CR)
- case $# in
- 0 )
- # filter mode -- incompatible with compare mode
- tr '\015\032' '\012\012' | expand |
- sed -e 's/$//' -e 's//\
- /g' \
- -e 's/"/"/g' -e 's/"/"/g' -e 's/ /*/g' -e 's/ /.../g' \
- -e "s/'/'/g" -e 's/ /Copyright /' -e 's/ /--/g' |
- fmt -s
- ;;
- * ) # file mode
- MVORCOMPARE=mv
- if [ "$1" = "-c" ]; then
- # compare mode
- MVORCOMPARE=cmp
- shift
- fi
- for file do
- <$file tr '\015\032' '\012\012' | expand |
- sed -e 's/"/"/g' -e 's/"/"/g' -e 's/ /*/g' -e 's/ /.../g' \
- -e "s/'/'/g" -e 's/ /Copyright /' -e 's/ /--/g' |
- fmt -s >/tmp/fixit.$$
- if [ $MVORCOMPARE = mv ]; then
- mv /tmp/fixit.$$ $file
- else
- cmp -s /tmp/fixit.$$ $file || echo $0: $file needs fixing
- rm -f /tmp/fixit.$$
- fi
- done
- ;;
- esac
-
- - - -
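
The 8-bit Macintosh characters on the left-hand side of the sed substitutions above do not appear to have survived mailing. As an illustration only, the sketch below assumes they were the Mac Roman curly double quotes, bullet, ellipsis, curly apostrophe, copyright sign and em dash, and builds each byte with printf so that the script itself stays 7-bit clean. The byte values and the function name mac_to_ascii are assumptions, not taken from the original script.

```shell
#!/bin/sh
# ASSUMPTION: these Mac Roman byte values (octal) stand in for the
# characters that were lost from the posted script.
LDQUO=$(printf '\322')    # left curly double quote
RDQUO=$(printf '\323')    # right curly double quote
BULLET=$(printf '\245')   # bullet
ELLIP=$(printf '\311')    # ellipsis
APOS=$(printf '\325')     # right curly single quote / apostrophe
COPYR=$(printf '\251')    # copyright sign
EMDASH=$(printf '\321')   # em dash

# mac_to_ascii (hypothetical name): stdin -> stdout.
# LC_ALL=C makes sed treat the input as single bytes.
mac_to_ascii() {
    LC_ALL=C sed -e "s/$LDQUO/\"/g" -e "s/$RDQUO/\"/g" \
        -e "s/$BULLET/*/g" -e "s/$ELLIP/.../g" \
        -e "s/$APOS/'/g" -e "s/$COPYR/Copyright /" -e "s/$EMDASH/--/g"
}
```

It could take the place of the sed stage in a pipeline like the one above, for example `tr '\015\032' '\012\012' < file | expand | mac_to_ascii | fmt -s`.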
-
- ** kai@berkp.uadv.uci.edu (Kai-Joachim Kamrath) comments:
-
- >Your question brings to mind a broader question: consistency in layout and
- >design. Perhaps it would make sense for the list to discuss standards of
- >consistency in layout of documents available at a gopher site. For example,
- >I've noticed that most documents don't contain a date within them. As
- >quickly as things change in computing, it would be helpful to see the date an
- >article was written; you could easily skip anything which appears to be out
- >of date based on date written alone. I'd suggest that any gopher-able
- >document should contain a date written in the first few lines of text, so
- >users can easily skip if deemed old.
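
One way to follow this suggestion on a Unix server is to stamp each file as it is installed. A minimal sketch, assuming a "Date written:" label (no such convention has been agreed on the list) and a hypothetical function name stamp_date:

```shell
#!/bin/sh
# stamp_date (hypothetical name): print FILE to stdout with a date line on top.
# An optional second argument supplies the date text; the default is today's date.
stamp_date() {
    printf 'Date written: %s\n\n' "${2:-$(date '+%d %b %Y')}"
    cat "$1"
}
```

For example, `stamp_date policy.txt '19 Nov 1992' > policy.stamped` (file names are placeholders).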
-
-
- ** As for myself, I favor QUED/M for the Mac (a commercial programmer's
- editor) for cleaning up text, but as I said in my original posting, I don't
- have the operation neatly "packaged". It can be used to word wrap to a
- fixed width, expand tabs, convert line breaks, delete odd characters, and
- search and replace arbitrary characters. It has a macro capability, but I
- haven't used this to put these features all together.
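
On a Unix host the same operations (convert line breaks, delete odd characters, expand tabs, wrap to a fixed width) can be packaged as one filter. What follows is a minimal sketch for Mac text with CR line ends, not a QUED/M macro. The function name clean_for_gopher and the default width of 72 columns are assumptions.

```shell
#!/bin/sh
# clean_for_gopher (hypothetical name): stdin -> stdout.
# Optional argument: line width (default 72 columns).
clean_for_gopher() (
    LC_ALL=C; export LC_ALL         # treat the input as bytes
    tr '\015' '\012' |              # convert line breaks: Mac CR -> LF
    tr -cd '\011\012\040-\176' |    # keep only tab, LF and printable ASCII
    expand |                        # expand tabs to spaces
    fold -s -w "${1:-72}"           # wrap long lines at spaces
)
```

For PC files, which end lines with CR LF, the first stage would need to delete the CR instead (tr -d '\015'), otherwise every line break turns into two.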
- ---
- Albert Lunde Albert-Lunde@nwu.edu
- alunde@nuacvm.bitnet
-