home *** CD-ROM | disk | FTP | other *** search
- Xref: sparky comp.databases.ingres:1941 comp.databases.sybase:363 comp.databases:7954
- Newsgroups: comp.databases.ingres,comp.databases.sybase,comp.databases
- Path: sparky!uunet!zaphod.mps.ohio-state.edu!pacific.mps.ohio-state.edu!linac!uchinews!gsbacd.uchicago.edu!cs_mj
- From: cs_mj@gsbacd.uchicago.edu (Mark Jaeger)
- Subject: re: Validating data imported from text files.
- Message-ID: <1992Nov19.134218.1@gsbacd.uchicago.edu>
- Lines: 54
- Sender: news@uchinews.uchicago.edu (News System)
- Organization:
- Date: Thu, 19 Nov 1992 19:42:18 GMT
-
- In <18NOV199217192059@stars.gsfc.nasa.gov>, thompson@stars.gsfc.nasa.gov
- (William Thompson, code 682.1, x2040) writes:
-
- > I was just told something startling. In an E-mail message from Myron China
- > <mryo@ind_maroon.gwl.com>, I was told
- >
- > problem with flat file xfers is verifying successful completions of both
- > the dump and the import. oracle products, for instance, always signal
- > a 0 return code whether or not it had a problem. so you end up spooling
- > log files and checking them for errors yourself. we also check file
- > record layouts for validity before anything goes through.
- >
- > That's an astonishing statement to me! Our current plan calls for *ALL* data
- > imported into the database to come from flat ASCII files generated by other
- > programs. Populating the database is supposed to be an automatic process that
- > works unattended. Problems are supposed to be signalled when they occur.
- >
- > Is this the same with the other major SQL databases: Ingres, Sybase, Informix?
-
- I can only speak for INGRES. I think this reflects a bias in relational
- databases in general and SQL in particular towards small transactions.
- In INGRES, bulk loads from flat files (using the SQL "copy table"
- statement) are optimized for fast operation. This means that, in
- particular, all integrities on the table are ignored. So you could end
- up with data in the table that violates the integrities. I think that
- there are other properties of a table that are ignored by "copy", but
- I'm not sure which ones (possibly unique indexes, enforcement of
- noduplicates, or triggers and rules).
-
- However, INGRES _is_ smart enough to detect certain kinds of errors, and
- has options for (1) aborting the whole operation on the first error, or
- after n errors; (2) continuing on errors and writing the bad input
- records to another file. If you choose the abort option (the default),
- then I'm pretty sure that the "copy" statement returns an SQL error code
- in the SQLCODE if there's an error.
-
- Ingres recommends changing the storage structure of your tables to heap
- (no index) before loading them with "copy table", and only _then_
- applying indexing, integrities, etc.
-
- If you want the full power of the database for enforcing rigid rules on
- the data, then you will have to write a program to load the data one row
- at a time, using the SQL "insert" statement instead. This will give you
- full error handling and assurance that what's in the table is what you
- expect. If you use INGRES's repeated queries, or database procedures,
- the performance may not be bad either. It would be interesting to do a
- benchmark comparison of these.
-
- --Mark Jaeger internet: cs_mj@gsbvax.uchicago.edu
- Graduate School of Business yellnet: (312) 702-0328
- University of Chicago faxnet: (312) 702-0233
- Disclaimer: My opinions are my own and not those of my employer.
- Ich bin ein Virus. Mach' mit und kopiere mich in Deine .signature.
-
-