StelsCSV JDBC Driver v1.4 Documentation

 

 

Contents 

 

Installation

Driver Name

URL Syntax

Driver Properties

Connection Example

Database Schema

Supported SQL Syntax

User-defined SQL functions

 

Installation

 

Add the file csvdriver.jar to your classpath or extract the jar file in the directory of the application.

 

Driver Name

 

The class name of the driver is jstels.jdbc.csv.CsvDriver.

 

URL Syntax

 

The connection URL is jdbc:jstels:csv:csvdir, where csvdir - an absolute or relative to the application pathname of a local directory contained table files, e.g.:

 

jdbc:jstels:csv:c:\mydir\example or jdbc:jstels:csv:example

   

 

Driver Properties

 

The driver supports a number of parameters that change default behavior of the driver.

These properties are:

 

separator used to specify a different column separator (Default is ‘\t’ (tab))

 

suppressHeaders used to specify if the first line contains column header information (Default is false; column headers are on first line).

 

fileExtension used to specify a different file extension (Default is ".txt")

 

charset used to specify a different than default charset encoding of input file (Default is same VM default charset

 

commentLine used to specify a string denoting comment line (By default is not-defined)

 

dateFormat used to specify a sequence of date/time formats separated by the '|' character, e.g: “dd.MM.yy| dd.MM| dd”. (Default is "yyyy-MM-dd HH:mm:ss.SSS | yyyy-MM-dd HH:mm:ss | yyyy-MM-dd |  HH:mm:ss.SSS |  HH:mm:ss”). For more details about date/time format used in the driver please see chapter "time format syntax" in the SimpleDateFormat class documentation. 

 

caching used to speed up the performance of the driver (Default is true). Using this option, you can considerably increase the rate of processing text files by the driver due to caching tables in the operating memory.  Keep in mind that Java Virtual Machine must have enough free memory allotted for storing large tables (use -Xms and -Xmx JVM options). You can also use the DROP TABLE <table name> FROM CACHE command to force tables to be removed from the cache.
 

This following example code shows how these properties are used:

 
Properties props = new java.util.Properties();
 
props.put("separator","|");              // separator is a bar
props.put("suppressHeaders","true");     // first line contains data
props.put("fileExtension",".txt");       // file extension is .txt
props.put("charset","ISO-8859-2");       // file encoding is "ISO-8859-2"
props.put("commentLine","--");           // string denoting comment line is "--"
// date/time format
props.put("dateFormat","yyyy-MM-dd HH:mm | dd/MM/yyyy");       
 
Connection conn = DriverManager.getConnection("jdbc:jstels:csv:" + args[0],props);
 
You can also append the properties to the URL like this:
 
Connection conn = 
DriverManager.getConnection(
"jdbc:jstels:csv:path?suppressHeaders=true&dateFormat=yyyy-MM-dd HH:mm | dd/MM/yyyy&caching=false");
 
 
Connection Example

This example code shows how the driver is used:

 
import java.sql.*;
 
public class DriverTest
{
  public static void main(String[] args)
  {
    try
    {
      // load the driver into memory
      Class.forName("jstels.jdbc.csv.CsvDriver");
 
      // create a connection. The first command line parameter is assumed to
      //  be the directory in which the .csv files are held
      Connection conn = DriverManager.getConnection("jdbc:jstels:csv:" + args[0] );
 
      // create a Statement object to execute the query with
      Statement stmt = conn.createStatement();
 
      // execute a query
      ResultSet rs = stmt.executeQuery("SELECT * FROM \"test.txt\"");
 
      // read the data and put it to the console
      while (rs.next())
      {
            for(int j=1; j <= rs.getMetaData().getColumnCount(); j++){
                   System.out.print(rs.getObject(j)+ "\t");
            }
            System.out.println();
      }
 
      // close the objects
      rs.close();
      stmt.close();
      conn.close();
    }
    catch(Exception e)
    {
      e.printStackTrace();
    }
  }
}
      

 

Database Schema

 

The database schema file is called “schema.xml”. This is XML file that contains the information about the table files in the CSV directory.

The schema has the following format:

 

<schema>

  <table name="my_table1.txt">

    <column name="my_field1" type="Integer"/>

    <!-- the attribute 'name' used to define column name -->

<!-- the attribute 'pos' used to define column’s position in the table-->

<!-- note: tag 'column' must have defined either 'name' or 'pos' attribute -->

    <!-- the attribute 'type' used to assign SQL data type to a column-->

    <column pos="2" type="Integer"/>

    <column name="my_field3" type="Date"/>

  </table>

  <table name="my_table2.txt">

... ... ...   

</schema>

 

The driver supports the following column data types: CHAR, VARCHAR, LONGVARCHAR, INTEGER, TINYINT, SMALLINT, BIGINT, DOUBLE, REAL, FLOAT, DECIMAL, NUMERIC, DATETIME, DATE, TIME, TIMESTAMP that are finally transformed to INTEGER, FLOAT, VARCHAR and TIMESTAMP types.

 

Supported SQL Syntax

 

“SELECT” operatorsee specification, syntax and functions here.

 

Examples:

         SELECT SUM(a) AS col1, MAX(b) / MAX(c) AS col2 FROM “test.txt” GROUP BY a  HAVING AVG(a) > 30

    SELECT a, b, to_string (c) FROM “test.txt” WHERE ( a>5 AND b<>3 ) OR c IS NULL ORDER BY a

    SELECT name FROM “salesreps.txt” WHERE ( rep_office IN ( 22, 11, 12 ) )  OR ( manager IS NULL AND hire_date >= to_date ( ‘01-05-2002’,’dd-MM-yyyy’ ) OR ( sales > quota AND NOT sales > 600000.0 )

    SELECT city, target, sales FROM “offices.txt” WHERE region = ‘Eastern’ AND sales > target ORDER BY city

SELECT trim ( “football club” ) AS club, to_int ( replace_string ( goals, '-' ,'0' ) ) AS goals FROM  “football.txt”

 

“CREATE TABLE”,  UPDATE” and “DELETE” operators are coming with new versions.

 

 

User-defined SQL functions

 

You can use your own SQL functions in the driver. To use this feature, you should do the following: 

1) Create a static method that will act as an SQL function
    Mind that:

For example:

package my_pack;

public class MyFuncs{

// user-defined SQL function that formats the given argument into a date/time string with specified format

public static String format_date( java.util.Date d, String format ) {
    // process the null values

    if (d == null || format == null)
    return null;
    java.text.SimpleDateFormat sdf = new java.text.SimpleDateFormat(format);
   
// return a function result with java.lang.String type

    return sdf.format(d);
}
}

 

2) Register the function using the static method jstels.jdbc.csv.CsvDriver.registerFunction (String funcName, String handler). Where funcName argument is the name by which this method will be called in SQL queries and handler argument is the fully-qualified method name.

For example:

jstels.jdbc.csv.CsvDriver.registerFunction ( "format_date",  "my_pack.MyFuncs.format_date" )

 

3) Call the function in an SQL query

For example:

Statement st = connection.createStatement();

st.execute( "select format_date( date_column , ‘yyyy-MM-dd ) from test" );

 

[HOME]   [TOP]