Add
the file csvdriver.jar to your classpath or extract the jar file in the directory of the
application.
The
class name of the driver is jstels.jdbc.csv.CsvDriver.
The
connection URL is jdbc:jstels:csv:csvdir, where csvdir - an absolute or relative to the application pathname
of a local directory contained table files, e.g.:
jdbc:jstels:csv:c:\mydir\example or jdbc:jstels:csv:example |
The
driver supports a number of parameters that change default behavior of the driver.
These
properties are:
separator
used
to specify a different column separator (Default is ‘\t’ (tab))
suppressHeaders
used
to specify if the first line contains column header information (Default is
false
;
column headers are on first line).
fileExtension
used to specify a different file extension (Default is ".txt")
charset used to specify a different than default charset encoding of input file (Default is same VM default charset)
commentLine
used to specify a string denoting comment line (By default is not-defined)
dateFormat used to specify a sequence of date/time formats separated by the '|' character, e.g: “dd.MM.yy| dd.MM| dd”. (Default is "yyyy-MM-dd HH:mm:ss.SSS | yyyy-MM-dd HH:mm:ss | yyyy-MM-dd | HH:mm:ss.SSS | HH:mm:ss”). For more details about date/time format used in the driver please see chapter "time format syntax" in the SimpleDateFormat class documentation.
caching
used to speed up the performance of the driver (Default is true).
Using this option, you can
considerably increase
the rate of processing text files by the
driver due to caching tables in the operating memory. Keep in mind that Java
Virtual Machine must have enough free memory allotted for storing large tables (use
-Xms and -Xmx JVM options). You can also use the DROP TABLE <table
name> FROM CACHE command to force tables to be removed from the cache.
This
following example code shows how these properties are used:
Properties props = new java.util.Properties();
props.put("separator","|"); // separator is a bar props.put("suppressHeaders","true"); // first line contains data props.put("fileExtension",".txt"); // file extension is .txt props.put("charset","ISO-8859-2"); // file encoding is "ISO-8859-2" props.put("commentLine","--"); // string denoting comment line is "--" // date/time format props.put("dateFormat","yyyy-MM-dd HH:mm | dd/MM/yyyy");
Connection conn = DriverManager.getConnection("jdbc:jstels:csv:" + args[0],props); |
You can also append the properties to the URL like this:
Connection conn =
DriverManager.getConnection(
"jdbc:jstels:csv:path?suppressHeaders=true&dateFormat=yyyy-MM-dd HH:mm | dd/MM/yyyy&caching=false"); |
Connection Example
This
example code shows how the driver is used:
import java.sql.*;
public class DriverTest { public static void main(String[] args) { try { // load the driver into memory Class.forName("jstels.jdbc.csv.CsvDriver");
// create a connection. The first command line parameter is assumed to // be the directory in which the .csv files are held Connection conn = DriverManager.getConnection("jdbc:jstels:csv:" + args[0] );
// create a Statement object to execute the query with Statement stmt = conn.createStatement();
// execute a query ResultSet rs = stmt.executeQuery("SELECT * FROM \"test.txt\"");
// read the data and put it to the console while (rs.next()) { for(int j=1; j <= rs.getMetaData().getColumnCount(); j++){ System.out.print(rs.getObject(j)+ "\t"); } System.out.println(); }
// close the objects rs.close(); stmt.close(); conn.close(); } catch(Exception e) { e.printStackTrace(); } } }
|
<schema>
<table
name="my_table1.txt">
<column name="my_field1"
type="Integer"/>
<!-- the attribute 'name' used to define column name
-->
<!--
the attribute 'pos' used to define column’s position in the
table-->
<!--
note: tag 'column' must have defined either 'name' or 'pos' attribute -->
<!-- the attribute 'type' used to assign SQL data type to
a column-->
<column
pos="2" type="Integer"/>
<column name="my_field3"
type="Date"/>
</table>
<table
name="my_table2.txt">
...
... ...
</schema> |
SELECT
SUM(a) AS col1, MAX(b) /
MAX(c) AS col2
FROM “test.txt” GROUP
BY a HAVING AVG(a) >
30
SELECT a, b, to_string (c) FROM “test.txt” WHERE ( a>5 AND b<>3 ) OR
c IS NULL ORDER
BY a
SELECT name FROM
“salesreps.txt” WHERE ( rep_office IN ( 22, 11, 12 ) ) OR (
manager IS NULL AND hire_date >= to_date ( ‘01-05-2002’,’dd-MM-yyyy’ ) OR ( sales > quota AND
NOT sales > 600000.0 )
SELECT city, target, sales FROM “offices.txt” WHERE region = ‘Eastern’ AND sales > target ORDER BY city
SELECT trim ( “football club” ) AS club, to_int ( replace_string ( goals, '-' ,'0' ) ) AS goals FROM “football.txt” |
You can use your own SQL functions in the driver. To use this feature, you should do the following:
1)
Create a
static method that will act as an SQL function
Mind that:
The method must have the public modifier
The number of arguments in the method is arbitrary, but they must belong to one of the following java types: java.lang.String, java.lang.Integer, java.lang.Float and java.util.Date
It is advisable to process the null values for arguments in the method code
The method can return the null value as well
For example:
package my_pack; public class MyFuncs{ // user-defined SQL function that formats the given argument into a date/time string with specified format
public static
String format_date( java.util.Date d, String format ) {
if (d ==
null || format == null)
return
sdf.format(d); |
2) Register the function using the static method jstels.jdbc.csv.CsvDriver.registerFunction (String funcName, String handler). Where funcName argument is the name by which this method will be called in SQL queries and handler argument is the fully-qualified method name.
For example:
jstels.jdbc.csv.CsvDriver.registerFunction ( "format_date", "my_pack.MyFuncs.format_date" ) |
3) Call the function in an SQL query
For example:
Statement st = connection.createStatement(); st.execute( "select format_date( date_column , ‘yyyy-MM-dd’ ) from test" ); |