Parser docs

Intro

This parsing tool converts C function prototypes into Java wrappers for those functions and converts structs and trivial C++ classes into Java classes. It was designed for exposing functions from DLLs and providing access to them in Java using J/Direct. It handles all of J/Direct's rules for fixed arrays, alignment issues, and string conversion for you.
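As a rough illustration of the kind of output this implies, here is a hand-written sketch of J/Direct declarations for one function prototype and one struct. The class names, the choice of USER32 as the library, and the mapping of the handle to a Java int are assumptions made for the example; the parser's actual output may differ. The @dll directives sit inside comments, so they only take effect under a J/Direct-aware compiler and VM.

```java
// Hand-written sketch, not actual parser output. Class names are hypothetical.
class User32Sketch {
    // C prototype: BOOL WINAPI SwitchDesktop(HDESK hDesktop);
    // Assumption: the handle is passed as a Java int and BOOL maps to boolean.
    /** @dll.import("USER32") */
    static native boolean SwitchDesktop(int hDesktop);
}

// C struct: typedef struct tagPOINT { LONG x; LONG y; } POINT;
// Assumption: each 32-bit LONG field becomes a Java int.
/** @dll.struct() */
class PointSketch {
    int x;
    int y;
}
```

Calling SwitchDesktop only works where the VM can resolve the native method; elsewhere the declarations compile but the call fails at link time.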

How effective is it?

This tool was used to generate almost everything in com.ms.win32.*. Nontrivial issues this tool handles include:

However there were many cases that had to be checked or totally redone by hand. Some of the items this tool will not translate include:

The parser was written to parse C code reasonably well, build up an accurate representation of the C code, then apply some Java conversion heuristics. While this works for simple types most of the time, there are cases developers may want to handle specially. For this reason, developers are encouraged to edit the program, adding conversion rules wherever appropriate. For example, you could increase Java performance by declaring some classes as final. But since this was designed as a generalized tool, the author felt allowing customizability through subclassing would be more important than the slight performance gain.

Inheritance isn't supported - J/Direct currently has no way to marshal the virtual function pointer correctly.

Requirements

Steps

  1. Make sure your code is in an acceptable format. COM objects can't be parsed. Non-trivial classes are also unparsable, or at the very least problematic. All exported functions should have WINAPI, APIENTRY, or CALLBACK (or another #define'd token that expands to __stdcall or __cdecl) between the return type and the function name. Example:

    BOOL WINAPI SwitchDesktop(HDESK hDesktop);

  2. Create symbol files for your DLL. If you only want to export one DLL, you can skip this step, but it's still recommended, since skipping it adds one more step later (step 9) and a decent amount of filtering. Run this command on each DLL:

    dumpbin /exports your_lib.dll > your_lib.sym

    This list of symbols tells the parser which classes your functions belong in. All functions that aren't listed in a symbol file are put in Misc.java and need special attention (discussed below).

  3. Create symbolfiles.txt in the same directory that the parser will be run in, listing all the symbol files you generated. Format of the file is one file per line, comment character is #.

  4. Create a header file that includes all necessary headers. Make a file called (for example) gen.h that includes everything you need. This is optional if you only have one header file to parse, or one header file that includes all the other headers needed. Also make sure that each header is included only once, or, if headers are included multiple times, that you wrap each one in an #ifndef include guard. Here's what your header should look like:

    #ifndef __YOUR_HEADER_H__
    #define __YOUR_HEADER_H__
    
    <all of your header file code>
    
    #endif
    

    (In addition to being necessary for running the parser, this may speed up compile time for your code. The only time you don't want this is if you have a circular dependency in your header files, which is bad style.)

  5. Run the C preprocessor over your header. Set up all the #define's you need for your code to compile in the current environment, then run the C preprocessor over your header files. Example:

    cl /Iinclude-path [/Dyour_symbol] /E gen.h > gen.prep

    After changing into an include directory, here's what I used:

    cl /I. /E windows.h > windows.prep

    This generates a C preprocessed version of your headers with all the macros expanded, which is easier to parse.

  6. Edit the parser source for your files. A few places need to be changed to set up output files and packages.
  7. Run the parser over your .prep file. Check the output for errors - anything starting with "Ack!" is a pretty serious problem that will probably leave at least one function or structure out of the output. Example:

    jview Parser gen.prep

  8. Check through punt.txt, which should have been generated by the parser. It should contain a (slightly convoluted) list of all functions or structures the parser wouldn't translate or read in properly.
  9. If you didn't generate a symbol file, go back through Misc.java, pulling out all of your functions. Change "<unknown_library>" in every @dll.import line to your DLL's name. There may be other functions in this class that might not be in your DLL and if you erroneously say they are, you'll get linking errors.
  10. Test with jvc. Other tests you could do include getting the size of the struct in Java and comparing that with the size of the struct in C. And you could get the size of all the parameters passed to functions in Java and C. These are reasonable first-step test passes, but they don't guarantee correctness.
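The placeholder replacement in step 9 can be partly mechanized. The following is a minimal sketch, assuming the lines pulled out of Misc.java contain the literal text <unknown_library> inside their @dll.import directive; ImportFixer and its method are hypothetical names and are not part of the parser.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the parser.
class ImportFixer {
    static final String PLACEHOLDER = "<unknown_library>";

    // Replaces the placeholder only on lines that contain an @dll.import
    // directive; every other line is copied through unchanged.
    static List<String> fix(List<String> lines, String dllName) {
        List<String> out = new ArrayList<>();
        for (String line : lines) {
            if (line.contains("@dll.import")) {
                out.add(line.replace(PLACEHOLDER, dllName));
            } else {
                out.add(line);
            }
        }
        return out;
    }
}
```

Run it only over the functions you have confirmed are exported by your DLL; applying it to all of Misc.java would mislabel the functions that live elsewhere and lead to the linking errors mentioned in step 9.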

Define scanner

To process #define'd constants, use defscan, a separate tool for scanning through files. As input, it expects files containing only #defines that span only one line with no other text in between. A good way to generate these is as follows:

grep "#define" your_header.h > your_header.def

After you do that, scan for C-style comments that span multiple lines. If you don't do this, you may end up commenting out most of your constants! For any lines that pop up from the following command, fix them up, usually by adding a */ in the correct place.

grep "/\*" your_header.def | grep -v "\*/"
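If grep isn't available, the two passes above can be sketched in Java. This is a minimal illustration of the same line selection, not part of defscan; DefFilter and its method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that mirrors the two grep passes.
class DefFilter {
    // First pass: keep every line that mentions #define.
    static List<String> defineLines(List<String> headerLines) {
        List<String> out = new ArrayList<>();
        for (String line : headerLines) {
            if (line.contains("#define")) {
                out.add(line);
            }
        }
        return out;
    }

    // Second pass: report lines that open a C-style comment with "/*"
    // but have no "*/" anywhere on the same line.
    static List<String> unclosedComments(List<String> defLines) {
        List<String> out = new ArrayList<>();
        for (String line : defLines) {
            if (line.contains("/*") && !line.contains("*/")) {
                out.add(line);
            }
        }
        return out;
    }
}
```

Every line returned by unclosedComments needs a hand fix in the .def file before it is passed to defscan.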

defscan cannot parse macros, although a few extremely simple ones have been special-cased in the code. If you only have a handful of macros, you could add them to the code. Casting is supported to a limited extent. Strings can be written to the output or suppressed, depending on the value of defscan.SuppressStrings.

Run defscan with the following command if you have a small number of files:

jview defscan your_file.def

If there are many files, then list them all in files and run defscan with no arguments.

The output is a set of interface files containing all the constants. Since this was written to handle the Win32 constants, the output files all start with win. They are broken down alphabetically into multiple interfaces (win[a-z].java), plus there's one master interface file that contains them all (win.java).
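To make the layout concrete, here is a sketch of what such interfaces and their use might look like. It assumes constants are bucketed by the first letter of their name and that the master interface gathers the per-letter interfaces by extending them; both are guesses about the generated files, and WinConstants and ShowWindowCaller are hypothetical names. SW_SHOW and WM_CLOSE carry their standard Win32 values.

```java
// Hypothetical excerpts of two per-letter interfaces.
interface wins { int SW_SHOW = 5; }
interface winw { int WM_CLOSE = 0x0010; }

// Assumption: the master interface extends the per-letter interfaces.
interface win extends wins, winw { }

class WinConstants {
    // Assumed bucketing rule: "win" plus the lowercased first letter
    // of the constant's name.
    static String interfaceFor(String constantName) {
        char first = Character.toLowerCase(constantName.charAt(0));
        return "win" + first;
    }
}

// Implementing the master interface brings every constant into scope.
class ShowWindowCaller implements win {
    int showCommand() { return SW_SHOW; }
}
```

Implementing only the per-letter interface you need keeps the set of inherited names smaller than implementing the master interface.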