home *** CD-ROM | disk | FTP | other *** search
- Ch 8 − Using CSV files
-
- 8.1 What are CSV files?
-
- CSV stands for “comma-separated values” and is the name given to data files
- in which each item of data is separated from the next by a comma. Such files
- are widely used to transfer data from one application to another, e.g. from
- a database to a spreadsheet or from a RISC OS database to a PC database. The
- following points should be noted:−
-
- (1) Data items which are non-numeric (e.g. plain text items such as
- names and addresses) are often enclosed in double quotes (“”) whereas
- numeric items are not. This makes it possible for an application reading a
- CSV file to distinguish between numbers and strings (which might of course
- contain numerals) and also allows a comma to be used as a character within a
- string without being mistaken for a data separator. (Addresses often contain
- commas e.g. 112, Keighley Road). For many purposes the quotes aren’t
- necessary and may be omitted.
-
- (2) Null data items are usually included and can be located by looking
- for two commas with either nothing in between them or with only two
- quotation marks between them, i.e. ,, or ,“”,. Each line of data (record) in
- such a file will always contain the same number of data items (fields).
- Powerbase does, however, allow nulls to be omitted entirely when creating a
- file in which case the number of fields per record will vary.
-
- (3) The way in which each record is terminated varies from one system to
- another. The last item of data in a record is not followed by a comma but by
- a line terminator. On RISC OS systems this is the line-feed character (ASCII
- value 10) and on PCs it is usually the carriage-return character (ASCII
- value 13). You might, however, encounter CSV files in which both these
- characters are used, i.e. LF CR or CR LF.
-
- (4) Separator other than commas are sometimes used. The Tab character
- (ASCII value 9) is often used and such files are called TSV files. Powerbase
- allows you to separate data items and terminate lines (see (3) above) with
- any character or character pair.
-
- (5) Some applications which accept a CSV file as input expect the first
- line to contain the names of the fields which comprise the subsequent
- records, e.g. if each record consists of a name and a four-part address this
- header record might read:−
-
- “NAME”,“STREET”,“TOWN”,“COUNTY”,“POSTCODE”
-
- Powerbase is able to save data in the form of true CSV files or files using
- another data separator. If the separator is a comma then the file created is
- of type &dfe and its icon displays the letters CSV. If another separator,
- eg. Tab, is used (see (4) above) then the file is of the ordinary Text type,
- i.e. &fff. Files of these types may also be used to enter data into a
- Powerbase database. In what follows we will, for convenience, refer to them
- all as “CSV files” whether or not the separator is a comma. The choices on
- the main menu which control these actions are Export CSV and CSV options.
-
- 8.2 Setting the CSV options
-
- The main-menu choice CSV options leads to a dialogue box which lets you
- specify all the file characteristics described earlier. Selection of the
- first three option buttons causes a saved file to have, respectively:−
-
- • quotes round non-numeric fields (see (1) above)
-
- • a header record specifying the field names (see (5) above).
-
- • null fields included (see (2) above).
-
- Note that the field names referred to in (b) are, by default, the tags of
- the corresponding Powerbase fields, but may be changed to the descriptors by
- altering the setting in the Print options window. The field separator and
- record terminator may be changed by means of menus which appear when you
- click on the pop-up menu icons. Each of these menus includes a writable
- entry allowing you to define your own separator and/or terminator of one or
- two printing characters.
-
- 8.3 Exporting data as a CSV file
-
- Having set up your options as described above, creating a CSV file is very
- like printing a list. First highlight the fields to be exported by clicking
- on each with ADJUST. Remember that the order in which the fields are
- highlighted is important. Next choose Export CSV from the main menu. A
- window appears featuring the Query panel. Type in the search formula to
- determine which records are exported. Finally, enter the name of the file
- and drag the file icon to a filer window. By default the file is saved in
- PrintJobs as usual, and you may simply click on Export or type Return.
- Experiment with saving CSV files with different settings of the CSV options
- and then loading the resulting files into Edit to examine them.
-
- 8.4 Using CSV files to import data
-
- If you drag a CSV file onto the record display of an open database the
- options window (see CSV options) appears with a changed title and some
- additional icons, one of which displays the pathname of the CSV file.
- Clicking on Import will make Powerbase try to create new database records
- from the file. If you decide not to do this you should click on Cancel. If
- you do wish to import the data there are some important consideration which
- will now be explained.
-
- 8.4.1 Ensuring that the correct options are selected
-
- If the CSV file originally came from a Powerbase application the settings in
- the CSV options window should be exactly the same as they were when the file
- was exported. The exception to this is the Quotes button which is shaded on
- import because Powerbase doesn’t need it. If the CSV file came from a PC or
- another RISC OS application you might have to load it into Edit to find out
- what separator and terminator are used.
-
- There are two more option switches at the bottom of the window. One causes
- each record to be displayed as it is imported. The import process is slower
- with this turned on but much more informative if you like to know how things
- are progressing.
-
- The other option strips any trailing spaces in the imported data-fields.
- Some database programs pad all fields to their maximum length by adding
- spaces to the end of the data where necessary. If you import such a file
- into Powerbase you will find that the caret will always be at the far right
- of the field even though visible characters do not fill the field, and some
- queries won’t work properly. Setting the Strip spaces switch before
- importing the file overcomes the problem.
-
- 8.4.2 Directing imported data to the correct fields
-
- If no fields on the record screen are highlighted (i.e. with ADJUST) and the
- CSV file does not contain a header record (see 8.1) then the import process
- proceeds according to the following rules:−
-
- • The first CSV field will be read into the first Powerbase field for
- which importing is allowed (Graphics fields, Buttons or fields which are
- merely labels will be ignored). The next CSV field will be read into the
- second Powerbase field and so on.
-
- • If the end of the CSV record is reached before all the relevant
- fields have been filled (data underflow) then the next CSV record will start
- a new Powerbase record, i.e. the reading won’t get out of step. It does not
- matter, therefore, if the CSV file omits null items at the end of a line.
-
- • If all relevant fields are filled before reaching the end of the CSV
- record (data overflow) Powerbase ignores the remainder of the line and skips
- to the beginning of the next CSV record before starting a new Powerbase
- record. This is also to keep the operation in step.
-
- There will be occasions when you don’t want to fill the Powerbase fields
- sequentially as just described. There are two ways of making the process
- more specific:−
-
- • Highlight the required Powerbase fields with ADJUST before starting
- the import. Data will then be read only into the highlighted fields, all
- other fields being ignored. The order in which the fields are filled is the
- order in which you highlighted them. The rules given above about underflow
- and overflow of data still apply.
-
- • Give the file a header record containing the tags or descriptors of
- required fields in the Powerbase record. (There is nothing to prevent you
- manually adding such a header to a CSV file which did not originate from a
- Powerbase application.) Importing then occurs just as if those fields were
- highlighted. The Print options window must reflect whether the tags or
- descriptors of fields are used.
-
- Do not use both a header and highlighting.
-
- 8.4.3 Importing plain text files
-
- It was previously explained that files created with Export CSV can have
- separators other than a comma. Such files will be of type &fff (plain text)
- instead of &dfe (CSV). They can still be imported into a Powerbase database
- but a certain amount of caution is needed because there are other
- circumstances in which a text file might be dropped on the record window. A
- properly-written script file (see Ch 9) would be recognised as such and
- therefore cause no problem, but any text file dropped onto the appropriate
- type of External field (i.e. a Text or Text Block) field would become linked
- to that field instead of being treated like a CSV file. If you are importing
- data from a plain text file and your record contains fields of the
- aforementioned types be sure to drop the file on the window background, not
- on the External field. You are strongly advised to use proper CSV files if
- at all possible.
-
- 8.4.4 What if the imported data won’t fit?
-
- There are two situations in which this can happen. The database might not
- contain enough free records to hold all the imported data and so you get a
- “Database full when reading CSV file” error. To avoid this either make sure
- the database is big enough before you start or place a suitable value in the
- Increment for expansion icon in the Change length window. The latter is
- accessible from the Utilities submenu of the icon-bar menu.
-
- The second situation is where an item is too long for the destined database
- field. When importing data Powerbase maintains a file called TooBig inside
- the database’s PrintJobs directory. Anything which won’t fit in the target
- field is written to this file together with information about where it was
- intended to go. No writable Powerbase field may be longer than 246
- characters and if an item of imported data exceeds this a note will be made
- in the TooBig file advising you to define an External field (Text Block or
- Text) for such data.
-
- 8.4.5 Advanced features of CSV import
-
- By selecting a switch in the CSV options window you can make each record of
- a saved CSV file include the primary key of the Powerbase record. Try
- creating CSV files with and without this option set and compare the files
- using Edit. If such a file is dragged onto a record window Powerbase will
- attempt to locate records with the same primary keys as the records in the
- CSV file. If a matching key is not found the CSV record is simply ignored.
- If it is found then the CSV data goes into the same record, overwriting any
- data which the target fields already contain. In other words it is possible
- to use CSV importing to modify existing records as opposed to creating new
- ones.
-
- The operation described should be used with care. It is only useful for
- transferring data between Powerbase databases which have the same primary
- key field. It can lead to trouble if the primary key isn’t unique, since the
- record located by Powerbase, and into which the CSV data is read, might not
- be the correct one. You have been warned!
-
- There is an option button in the CSV options window called With field data
- which only becomes active when the With header button is selected. With the
- switch selected the header record of a saved CSV file contains not only the
- field names but also the field lengths and types. Again it is suggested that
- you create a file and look at it in Edit. Each item in the header begins
- with a number. This is the field length in characters, i.e. the maximum
- allowed length in the Powerbase record field. Next comes a ¤ character which
- separates the length from the field name. Another ¤ separates the field name
- from the concluding number which determines the field type. (The type
- numbers may be inspected by looking at the ValStrings file in the Powerbase
- directory.)
-
- A file created in this form is not meant to be dropped onto the window of an
- open database. It should be dropped onto the Powerbase icon on the icon-bar
- when no database is open. Powerbase will attempt to convert the file into a
- functioning database. All the fields will be ranged on the left of the
- record window, one beneath another and the primary key will consist of the
- first four characters of the first field. Don’t try to create databases
- containing Check box, Button or External fields using this method; it’s only
- meant for the most basic type of database.
-
- 8.5 CSV files and validation tables
-
- You can save the contents of a validation table as a CSV file (see 5.6).
- You can also import data into a validation table by dropping a CSV file onto
- it, the procedure being just the same as that described for importing into
- database records.
-
-