From: Neil Bothwick
Date: 25 Jun 2001 at 12:36:41
Subject: Re: how do batch process 350 html files?
Terry Chadban said,
> I have a related problem which hopefully someone a bit more
> knowledgeable can help with as well - I need to batch convert a number
> of HTML files into a format suitable for import into a database or
> spreadsheet program - Comma Separated Values would probably be the
> best option. In my case the HTML files are in a columnar form (output
> from a database), so it SOUNDS easy enough in theory, but so far I
> haven't found a solution. HTTX will strip all the HTML tags and give
> clean, columnar format, but how do I convert to a format suitable for
> database or spreadsheet import,
Does HTTX give the same column layout each time? If so, you can read
each line of the file output by HTTX and use PARSE to split it into its
component parts.
e.g. if each line is laid out like
Item 1   Item 2    Item 3    Item 4
you could use
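   drop Items.                     /* clear any values left from the last line */
   line = readln(file)             /* read the next line of HTTX output */
   parse var line Items.1 10 Items.2 20 Items.3 30 Items.4
   do i = 1 to 4
      Items.i = trim(Items.i)      /* remove the padding after each item */
   end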
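This assumes that the second, third and fourth columns start at
character positions 10, 20 and 30; change the numbers in the template to
match the real HTTX output. Once the line is split, join Items.1 to
Items.4 with commas and write the result out to get a CSV line.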
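For anyone who wants to check the slicing logic away from the Amiga, here is the same idea as a minimal sketch in Python rather than ARexx. The column starts (1, 10, 20, 30) are the assumed positions from the example above, and the names COLUMN_STARTS, split_fixed_width and to_csv_line are made up for illustration; none of this comes from HTTX itself.

```python
import csv
import io

# Assumed 1-based column starts, matching the PARSE template above.
COLUMN_STARTS = [1, 10, 20, 30]


def split_fixed_width(line, starts=COLUMN_STARTS):
    """Cut a line at fixed 1-based column positions and strip the padding."""
    fields = []
    for i, start in enumerate(starts):
        # Each field runs up to the character before the next column start;
        # the last field runs to the end of the line.
        end = starts[i + 1] - 1 if i + 1 < len(starts) else len(line)
        fields.append(line[start - 1:end].strip())
    return fields


def to_csv_line(fields):
    """Format one row as CSV, quoting any field that contains a comma."""
    buf = io.StringIO()
    csv.writer(buf).writerow(fields)
    return buf.getvalue().rstrip("\r\n")
```

Calling to_csv_line(split_fixed_width(line)) on each line of HTTX output would give one CSV row per line. Using a CSV writer instead of a plain join matters only if an item can itself contain a comma or a quote.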
> and most importantly, can it be done
> in a batch mode, the way that Sarkis is talking about.
Put the file reading and processing into a function, then have a script
that reads a directory and calls the function for each file. Search
this list's archives for options for directory reading.
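The "one function per file, one loop over the directory" structure could look roughly like this, again sketched in Python rather than ARexx. The function names, the "*.txt" pattern, the ".csv" output naming and the latin-1 encoding are all assumptions for illustration; split_line stands for whatever routine splits one line into its items.

```python
import csv
from pathlib import Path


def convert_file(src, dst, split_line):
    """Read one text file and write its non-blank lines to dst as CSV rows."""
    # latin-1 is an assumption about the encoding of the stripped files.
    with open(src, encoding="latin-1") as infile, \
            open(dst, "w", newline="", encoding="latin-1") as outfile:
        writer = csv.writer(outfile)
        for line in infile:
            line = line.rstrip("\n")
            if line.strip():  # skip blank lines
                writer.writerow(split_line(line))


def convert_directory(in_dir, out_dir, split_line, pattern="*.txt"):
    """Call convert_file for every matching file; return the files written."""
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)
    written = []
    for src in sorted(Path(in_dir).glob(pattern)):
        dst = out_path / (src.stem + ".csv")
        convert_file(src, dst, split_line)
        written.append(dst)
    return written
```

Keeping the per-file work in its own function means the directory loop stays the same if the column layout, and so the splitting routine, changes later.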
Cheers
Neil