This directory holds three useful sets of scripts:
- the Perl scripts that I used to get the FAQ html files from Ohio State:
- ftplib.pl
- url.pl
- w3get.orig - original source as obtained from the net
- hget
- w3get - source tweaked for my site
- w3get.perl
- cleanup scripts that correct links in html files obtained using w3get
  to match the setup on my machine:
- cleanupURL ( copies and cleans up dir contents to cwd )
- cleanupURLf ( copies and cleans up named files to cwd )
- Perl scripts to convert man pages to html ( a usage sketch follows this list ):
- man2html ( original source as obtained from the net )
- tk2html ( tweaked version of the above that can be run in the
  tcl/tk doc dir and also strips trailing change bars )
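  A hypothetical invocation ( the exact interface of these local scripts
  is not documented here; this sketch assumes they filter nroff man page
  source on stdin to html on stdout, and the path and file names are
  placeholders ):
      cd /usr/local/lib/tk/doc        # placeholder for the tcl/tk doc dir
      tk2html < foo.n > foo.html      # foo.n is a placeholder man page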
What I do to get the FAQ files from Ohio State:
1) Get files ( copies to dir ~/tmp/html_get ); a loop version is
   sketched after the commands
w3get http://www.cis.ohio-state.edu:80/hypertext/faq/usenet/tcl-faq/part1/faq.html
w3get http://www.cis.ohio-state.edu:80/hypertext/faq/usenet/tcl-faq/part3/faq.html
w3get http://www.cis.ohio-state.edu:80/hypertext/faq/usenet/tcl-faq/part4/faq.html
w3get http://www.cis.ohio-state.edu:80/hypertext/faq/usenet/tcl-faq/part5/faq.html
w3get http://www.cis.ohio-state.edu:80/hypertext/faq/usenet/tcl-faq/part2/faq.html
( part2 is the largest ).
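   Equivalently, as a loop ( a sketch, assuming a Bourne-style shell ):
       for p in 1 2 3 4 5
       do
           w3get http://www.cis.ohio-state.edu:80/hypertext/faq/usenet/tcl-faq/part$p/faq.html
       done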
2) If you get connection errors, retry after a short pause; a retry
   loop is sketched below.
   You can break out when it starts trying to re-get existing files,
   or just let it run.
   Ignore errors about imperfectly formed http URLs for ftp files.
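   A sketch of the retry for one file, assuming w3get exits nonzero on
   a connection error ( not verified; interrupt by hand if it loops on
   some other failure ):
       url=http://www.cis.ohio-state.edu:80/hypertext/faq/usenet/tcl-faq/part2/faq.html
       until w3get $url
       do
           sleep 30    # short pause before retrying
       done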
3) Copy the fetched files to the working dir, correcting addresses on
   the way; self-references within the FAQ are converted from absolute
   to relative refs ( a loop version is sketched after the commands ).
cd to the target dir ( ~hops/doc/xmosaic/tcl/TclFAQ, which holds part[1234] )
cd part1
../../w3g/cleanupURL ~/tmp/html_get/www.cis:80/hypertext/faq/usenet/tcl-faq/part1
cd ../part2
../../w3g/cleanupURL ~/tmp/html_get/www.cis:80/hypertext/faq/usenet/tcl-faq/part2
cd ../part3
../../w3g/cleanupURL ~/tmp/html_get/www.cis:80/hypertext/faq/usenet/tcl-faq/part3
cd ../part4
../../w3g/cleanupURL ~/tmp/html_get/www.cis:80/hypertext/faq/usenet/tcl-faq/part4
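   The same as a loop, run from the TclFAQ target dir ( a sketch; it
   assumes the part subdirs already exist ):
       for p in part1 part2 part3 part4
       do
           ( cd $p && ../../w3g/cleanupURL ~/tmp/html_get/www.cis:80/hypertext/faq/usenet/tcl-faq/$p )
       done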
4) Check the refs are correct by reading them from Mosaic.
5) Make a tar file and ftp it to harbor ( sketched below ).
   Update the external WWW site.
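   A sketch of the tar and ftp ( the remote dir and the ftp subcommands
   are placeholders; only the host name harbor comes from this file ):
       cd ~hops/doc/xmosaic/tcl
       tar cf TclFAQ.tar TclFAQ
       ftp harbor    # then: binary; cd <remote dir>; put TclFAQ.tar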
6) Done.