From: Terry Chadban
Date: 27 Jun 2001 at 03:48:12
Subject: Re: how do batch process 350 html files?
Hello Neil,
On 25-Jun-01, you wrote
about Re: [arexx] Re: how do batch process 350 html files?:
>> I have a related problem which hopefully someone a bit more
>> knowledgeable can help with as well - I need to batch convert a
>> number of HTML files into a format suitable for import into a
>> database or spreadsheet program - Comma Separated Values would
>> probably be the best option. In my case the HTML files are in a
>> columnar form (output from a database), so it SOUNDS easy enough in
>> theory, but so far I haven't found a solution. HTTX will strip all
>> the HTML tags and give clean, columnar format, but how do I convert
>> to a format suitable for database or spreadsheet import,
>
> Does HTTX give the same column layout each time? If so, you can read
> each line of the file output by HTTX and use PARSE to split it into
> its component parts.
>
> e.g. if each line is laid out like
>
> Item 1 Item 2 Item 3 Item 4
>
Nah, that would be too easy! :-)
The output is in a form similar to this:
NAME1
Personal details, each separated by a couple of spaces
DATE FIELD1 FIELD2 FIELD3 FIELD4 FIELD5 FIELD6 COMMENTS
DATE FIELD1 FIELD2 FIELD3 FIELD4 FIELD5 FIELD6 COMMENTS
DATE FIELD1 FIELD2 FIELD3 FIELD4 FIELD5 FIELD6 COMMENTS
DATE FIELD1 FIELD2 FIELD3 FIELD4 FIELD5 FIELD6 COMMENTS
-------------------------------------------------------------
NAME2
Personal details
DATE FIELD1 FIELD2 FIELD3 FIELD4 FIELD5 FIELD6 COMMENTS
DATE FIELD1 FIELD2 FIELD3 FIELD4 FIELD5 FIELD6 COMMENTS
DATE FIELD1 FIELD2 FIELD3 FIELD4 FIELD5 FIELD6 COMMENTS
DATE FIELD1 FIELD2 FIELD3 FIELD4 FIELD5 FIELD6 COMMENTS
DATE FIELD1 FIELD2 FIELD3 FIELD4 FIELD5 FIELD6 COMMENTS
DATE FIELD1 FIELD2 FIELD3 FIELD4 FIELD5 FIELD6 COMMENTS
-------------------------------------------------------------
NAME3
etc. The personal details info is not in a columnar format, which
doesn't help, and the comments field can vary greatly in length. At
the moment I am undecided whether to try to import the data into a
database program (I have SBase and a couple of registered shareware
progs), into TurboCalc (I have v3.5), or to bite the bullet and try to
write an ARexx prog from scratch. But I am an absolute beginner with
ARexx, even though I have some BASIC programming skills, so that would
probably be the hardest option for me, if not for some of you guys.
Your parse suggestion should work fine for the main bulk of the data,
but I am not too sure about the personal details section.
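
For illustration only, here is a rough sketch of how both parts might
be handled, written in Python rather than ARexx. It rests on guesses
about the HTTX output that would need checking against the real files:
that a line of dashes ends each record, that the first line of a record
is the name and the second holds the personal details separated by two
or more spaces, and that DATE and FIELD1 to FIELD6 never contain a
space, so everything after the seventh word is the comment. The names
records_to_rows, rows_to_csv and N_FIXED are made up for this example.

```python
import csv
import io
import re

N_FIXED = 7  # DATE plus FIELD1..FIELD6; assumed from the sample layout


def records_to_rows(text):
    """Turn dash-separated record blocks into flat rows, one per data line."""
    rows = []
    # Assumption: a line made up only of dashes ends each record.
    for block in re.split(r"(?m)^-{5,}\s*$", text):
        lines = [ln.strip() for ln in block.splitlines() if ln.strip()]
        if len(lines) < 2:
            continue  # skip empty or incomplete blocks
        name = lines[0]
        # Personal details: split wherever two or more spaces occur.
        details = re.split(r"\s{2,}", lines[1])
        for line in lines[2:]:
            # Split off the first seven words; the remainder is the comment.
            parts = line.split(None, N_FIXED)
            parts += [""] * (N_FIXED + 1 - len(parts))  # pad short lines
            rows.append([name, " | ".join(details)] + parts)
    return rows


def rows_to_csv(rows):
    """Write rows as CSV text; the csv module quotes embedded commas."""
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(rows)
    return out.getvalue()
```

Repeating the name and personal details on every row keeps the result
flat, which is what a spreadsheet or database import usually wants.
Limiting the split to seven words is what lets the comment vary in
length without upsetting the earlier columns.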
Thank you to all who have already contributed - I do appreciate it.
Bye for now,
Terry