Save Web pages, open multiple documents


Tip
Web pages are simply ASCII text files with embedded HTML codes. When you want to send the text of a Web page as an e-mail message or use it for some off-line purpose, you need to save the file to disk, then strip out all of its HTML codes. You can remove the codes using any of today's word processors, but each program requires a different technique.
In Word 6.0 for Windows and Word 7.0 for Windows 95, you can download the Word Internet Assistant from the Microsoft site at http://www.microsoft.com. This lets you open HTML files for editing. You can then save them as text to remove the HTML tags. Alternatively, you can use the Pattern Matching feature in the Replace dialogue box to replace the codes, but the process is far from intuitive. Here's how to remove the codes in both versions of Word:
1. Select File--Open, select All Files in the "Files of type" drop-down list, and open the HTML document you want to strip.
2. Select Edit--Replace. In the Replace dialogue box, type \<*\> in the Find What box, check the Use Pattern Matching box, and click Replace All. After all the codes are removed, you may have to delete a few excess blank lines.
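If you'd rather script the job than click through a dialogue box, the same find-and-replace idea can be sketched in a few lines of Python. This is only a rough illustration, not part of Word: the names TAG_PATTERN and strip_tags are made up for the example, and the regular expression is an approximate counterpart of the \<*\> pattern above (a "<", then everything up to the next ">").

```python
import re

# Approximate counterpart of Word's \<*\> pattern: a "<", any run of
# characters that are not ">", then the closing ">". Because [^>] also
# matches line breaks, tags split across two lines are caught as well.
TAG_PATTERN = re.compile(r"<[^>]*>")


def strip_tags(html_text: str) -> str:
    """Delete everything that looks like an HTML tag from html_text."""
    return TAG_PATTERN.sub("", html_text)


if __name__ == "__main__":
    sample = "<h1>Welcome</h1>\n<p>Hello, <b>world</b>!</p>"
    print(strip_tags(sample))
```

Like the Word technique, this removes only the tags themselves. Character codes such as &amp; are left in the text, and you may still need to tidy up blank lines afterwards.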
In WordPerfect 6.1 for Windows, you can't automatically strip out all HTML codes; WordPerfect just can't handle punctuation in a search. But you can get rid of most of the codes, leaving just a few to remove manually:
1. Select File--Open, choose All Files (*.*) in the List Files of Type drop-down list, and open the HTML document.
2. When WordPerfect prompts you to select a file type, choose ASCII (DOS) Text in the Convert File Format From drop-down list, then click OK.
3. Select Edit--Find and Replace. Type < in the Find box, then select Match--Codes.
4. In the Codes dialogue box, choose * (Many Char) from the Find Codes list, then click Insert and Close.
5. Type > after the inserted code. Confirm that the "Replace With" line says "<Nothing>", click Replace All, then click Close.
This will remove most of the HTML codes from the document. You'll have to delete any remaining codes or blank lines from the file manually. When you're done, save the document with a new file name, choosing ASCII (DOS) Text (*.*) in the Save File as Type list.
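The manual clean-up of blank lines can also be automated once the file has been saved as text. The following is a minimal sketch, assuming the stripped text is already in a Python string; the function name collapse_blank_lines is invented for this example and has nothing to do with WordPerfect itself.

```python
import re


def collapse_blank_lines(text: str) -> str:
    """Squeeze runs of blank lines left behind after tags are removed."""
    # Trim trailing spaces and tabs so whitespace-only lines become empty.
    lines = [line.rstrip() for line in text.splitlines()]
    cleaned = "\n".join(lines)
    # Three or more consecutive newlines mean two or more blank lines in a
    # row; reduce each such run to a single blank line.
    cleaned = re.sub(r"\n{3,}", "\n\n", cleaned)
    # Drop blank lines at the very start and end, and finish with one newline.
    return cleaned.strip() + "\n"


if __name__ == "__main__":
    messy = "Title\n\n\n\n  \nBody text\n\n\n"
    print(collapse_blank_lines(messy), end="")
```

This keeps a single blank line between paragraphs, which is usually what you want in an e-mail message. Any leftover HTML codes still have to be removed separately.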
If you use Word Pro, you're in luck: Word Pro can open and edit an HTML file directly. The text of the file looks the same in Word Pro as it does on the Web -- formatting and all. As a result, you just have to save the document in ASCII format to remove the HTML codes from the file.
1. Select File--Open, select HTML (*.HTM) from the Files of type drop-down list, and open the HTML file.
2. Once the file has finished loading, select File--Save As, enter a new name, and select the Text-ASCII (DOS) (*.*) file type. The resulting file will be clean as a whistle, with no HTML codes at all.
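The open-as-HTML, save-as-text approach has a rough scripted counterpart: let an HTML parser read the page and keep only the text it finds. The sketch below uses the HTMLParser class from Python's standard library. The names TextExtractor, html_to_text and save_as_ascii are invented for the example, and the result will not reproduce Word Pro's output exactly.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect the text of a page, ignoring tags, scripts and style sheets."""

    def __init__(self):
        # convert_charrefs=True turns codes such as &amp; into plain characters.
        super().__init__(convert_charrefs=True)
        self.parts = []
        self._skip_depth = 0  # greater than 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)


def html_to_text(html_text: str) -> str:
    """Return only the readable text contained in html_text."""
    parser = TextExtractor()
    parser.feed(html_text)
    parser.close()  # flush any text still held in the parser's buffer
    return "".join(parser.parts)


def save_as_ascii(text: str, path: str) -> None:
    """Write text as plain ASCII; characters outside ASCII become '?'."""
    with open(path, "w", encoding="ascii", errors="replace") as handle:
        handle.write(text)


if __name__ == "__main__":
    page = "<html><body><p>Fish &amp; chips</p></body></html>"
    print(html_to_text(page))
```

Because the parser understands the structure of the page instead of just deleting anything between angle brackets, it also converts character codes and skips embedded scripts, which a simple find-and-replace leaves behind.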
- George Campbell

Category: Word processing, Internet
Issue: Nov 1996
Pages: 162-163

These Web pages are produced by Australian PC World © 1997 IDG Communications