This tool was used to generate a significant portion of the Win32 API classes. It is being included for you to use and modify to fit your specific needs. Remember that C header files were not designed to be language-independent descriptions, and that there is more than one correct way to represent some data types in Java. Thus, some functions will require hand-translation. For information on how to do this, see the J/Direct documentation.
Notes:
- Currently we prefer the Unicode version of a function when one exists. On NT this avoids the overhead of converting the Java string to ASCII, only for the ASCII wrapper to convert the string back to Unicode internally. Most of the time, however, we output the auto keyword on structs and functions.
- COM uses a different object and function handling method.
- Inside structs the output only includes strings.
Class Parser
public class Parser
{
    // Fields
    protected static final String PackageName;
    protected PrintWriter PuntOutput;
    protected Vector Functions;
    protected Vector Symbols;
    protected Hashtable TypeLUT;
    protected Hashtable StringTypes;
    protected Hashtable OutputClasses;
    protected Hashtable StructureLUT;
    protected Hashtable Precedence;
    protected Vector IncludeHeaders;
    protected Vector ExcludeHeaders;
    protected Vector ExcludeFunctions;
    protected static final boolean ReadSymbols;
    protected static final boolean Suppress_UnknownLib_Functions;
    protected static final boolean Suppress_Unused_Structures;
    protected static final boolean Comment_Variant_Types;
    protected static final int DEBUG;
    protected static final String AnonymousString;
    protected static final String UnknownFileString;
    protected static final String UnknownLibraryString;
    protected static final String ExcludeFunctionFile;
    public static final String CallbackString;

    // Constructors
    public Parser();

    // Methods
    public void finalizer();
    protected void ReadExcludeFunctions();
    public static final void usage();
    protected void PopulateTypeLUT();
    protected void SetStringTypes();
    public void Convert() throws InvalidParameterException;
    protected void OutputToClassFile(Function func)
        throws InvalidParameterException;
    public String ConvertFunction(Function func)
        throws InvalidParameterException;
    public String ConvertArgumentType(Variable var, Function func)
        throws UnrecognizedCodeException, InvalidParameterException;
    public void ParseFile(String FileIn) throws UnrecognizedCodeException,
        InvalidParameterException;
    protected void MungeVariables(Function func, StreamTokenizer st)
        throws IOException, UnrecognizedCodeException;
    protected multiFieldDataStructure ReadStructure(StreamTokenizer st,
        boolean insideStructure) throws UnrecognizedCodeException,
        InvalidParameterException, IOException;
    protected Variable readField(StreamTokenizer st, char separator,
        char terminator, boolean isInsideStruct, boolean allowAnonymous)
        throws UnrecognizedCodeException, InvalidParameterException, IOException,
        PuntException;
    public boolean isCOperator(char c);
    public Operator readOperator(StreamTokenizer st, boolean couldBePrefix)
        throws IOException;
    protected void PackHandler(StreamTokenizer st, Stack packsize)
        throws InvalidParameterException, IOException, UnrecognizedCodeException;
    protected void ParseSymbolFile(String File) throws BadInputFileException,
        InvalidParameterException;
    protected void CompareFunctionWithSymbols();
    public void UnifyFunctions();
    public void UnifyStructures();
    protected void SetupFileFilters();
    protected void SetupOutputClasses();
    protected void SetupPrecedenceTable();
    protected boolean CheckFile(String File);
    protected Function findFunction(String Name);
    protected void FindLibrary(Function func) throws InvalidParameterException;
    public void OfficeFunctions(String OfficeFileName, String MissingFileName);
    public void WriteOutFunctions(PrintWriter pw);
    public void ReadListofSymbolFiles(String list)
        throws InvalidParameterException;
    public static final void main(String args[]);
}
Constructors
public Parser();
Methods
protected boolean CheckFile(String File);
Decides whether we should examine the current file. Checks IncludeHeaders, ExcludeHeaders, whether it's an IDL file, and whether its name starts with "mm".
Return Value
Returns true if we should parse it, else false.
Parameter | Description
File | Header file we may want to parse.
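The checks listed for CheckFile can be pictured as a small filter. The following is only a sketch under stated assumptions: the class name HeaderFilterSketch, the order of the checks, and the decision to reject IDL files and names starting with "mm" are all hypothetical, since the text above says only that these properties are examined.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

/** Hypothetical sketch of a header filter; not the Parser's actual code. */
final class HeaderFilterSketch {
    private final List<String> includeHeaders;
    private final List<String> excludeHeaders;

    HeaderFilterSketch(List<String> includeHeaders, List<String> excludeHeaders) {
        this.includeHeaders = includeHeaders;
        this.excludeHeaders = excludeHeaders;
    }

    /** Returns true if the named header should be parsed. */
    boolean checkFile(String file) {
        String name = file.toLowerCase(Locale.ROOT);
        if (excludeHeaders.contains(name)) return false; // explicitly excluded
        if (includeHeaders.contains(name)) return true;  // explicitly included
        if (name.endsWith(".idl")) return false;         // assumption: IDL files are skipped
        if (name.startsWith("mm")) return false;         // assumption: "mm" headers are skipped
        return true;                                     // assumption: unknown headers are parsed
    }

    public static void main(String[] args) {
        HeaderFilterSketch filter = new HeaderFilterSketch(
                Arrays.asList("winbase.h", "mmsystem.h"),
                Arrays.asList("ole2.h"));
        System.out.println(filter.checkFile("winbase.h"));
        System.out.println(filter.checkFile("ole2.h"));
        System.out.println(filter.checkFile("mmreg.h"));
    }
}
```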
protected void CompareFunctionWithSymbols();
Compares parsed functions with symbols from a DLL. Assumes a file has been parsed and a symbols file has been read in.
Return Value
No return value.
public void Convert() throws InvalidParameterException;
Converts all functions and structures from C to Java.
Return Value
No return value.
public String ConvertArgumentType(Variable var, Function func)
throws UnrecognizedCodeException, InvalidParameterException;
Converts a C function argument's type into the equivalent Java type. Also determines how to do any string conversion (ANSI vs. Unicode) by setting the function's stringformat.
Return Value
Returns String holding Java type name.
Parameter | Description
var | Variable object to convert.
func | Function containing this Variable.
Exceptions
UnrecognizedCodeException
if the argument's C type cannot be converted to Java.
public String ConvertFunction(Function func)
throws InvalidParameterException;
Converts a C function prototype to a Java wrapper.
Return Value
Returns String of converted Java wrapper or "" if conversion failed.
Parameter | Description
func | A Function object to convert.
Exceptions
InvalidParameterException
if func is null.
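To give a feel for what a converted wrapper might look like, here is a minimal sketch that assembles a J/Direct-style declaration as text. The exact output format of ConvertFunction is not shown in this reference, so the layout below is an assumption, and the names WrapperSketch and convert are hypothetical. Unlike ConvertFunction, the sketch returns "" for missing input instead of throwing.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch; the real ConvertFunction works on Function objects. */
final class WrapperSketch {
    /** Builds a J/Direct-style declaration, or "" if any input is missing. */
    static String convert(String library, String returnType, String name, List<String> params) {
        if (library == null || returnType == null || name == null || params == null) {
            return "";
        }
        StringBuilder sb = new StringBuilder();
        // The auto modifier leaves the ANSI/Unicode choice to the runtime.
        sb.append("/** @dll.import(\"").append(library).append("\", auto) */\n");
        sb.append("public static native ").append(returnType).append(' ').append(name);
        sb.append('(').append(String.join(", ", params)).append(");");
        return sb.toString();
    }

    public static void main(String[] args) {
        // Assumed mapping for illustration: ATOM AddAtom(LPCTSTR lpString)
        System.out.println(convert("KERNEL32", "short", "AddAtom",
                Arrays.asList("String lpString")));
    }
}
```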
public void finalizer();
protected Function findFunction(String Name);
Finds a Function with the given name in the Functions vector, returning the Function object.
Return Value
Returns reference to the Function or null if a function with that name didn't exist.
Parameter | Description
Name | Name of the Function to look for.
protected void FindLibrary(Function func) throws InvalidParameterException;
Finds which library a function occurs in, based on the loaded symbol files. Is effective only if the symbol tables have already been set up. Changes the Function's library field.
Return Value
No return value.
Parameter | Description
func | Function to search for in symbol tables.
public boolean isCOperator(char c);
Is this character an operator or a valid first token in an operator in C?
Return Value
Returns true if c is a C operator or the first character in a C operator, else false.
Parameter | Description
c | char to test.
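A test like this can be as small as a lookup in a string of candidate characters. The sketch below is illustrative only: the class name and the exact character set are assumptions, and the real method may accept a different set.

```java
/** Hypothetical sketch of an operator-start test; the character set is an assumption. */
final class OperatorCharSketch {
    // Characters that are, or can begin, a C operator such as "+=", "<<" or "->".
    private static final String OPERATOR_STARTS = "+-*/%=<>!&|^~?:.,";

    static boolean isCOperator(char c) {
        return OPERATOR_STARTS.indexOf(c) >= 0;
    }

    public static void main(String[] args) {
        System.out.println(isCOperator('<'));
        System.out.println(isCOperator('a'));
    }
}
```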
public static final void main(String args[]);
Main. Runs the application.
Return Value
No return value.
Parameter | Description
args[] | Array of Strings containing command line parameters.
protected void MungeVariables(Function func, StreamTokenizer st)
throws IOException, UnrecognizedCodeException;
MungeVariables takes the current function name and a StreamTokenizer positioned right after the first parenthesis. It prints the function name, a tab, the type of a parameter, then the parameter name on a line for every parameter.
Example: void WINAPI foo(int, char ch);
Translates into:
foo\t void
foo\t int\t <anonymous>
foo\t char\t ch
Return Value
No return value.
Parameter | Description
func | Function object whose arguments are being munged.
st | StreamTokenizer positioned right after beginning '(' of function arguments.
Exceptions
IOException
if StreamTokenizer has a problem
UnrecognizedCodeException
if it can't parse a variable (unlikely, but possible)
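The parameter munging described for MungeVariables can be sketched with a StreamTokenizer as follows. This is a simplified illustration, not the actual implementation: it works on a function name instead of a Function object, returns the lines instead of printing them, omits the return-type line, ignores '*' and other punctuation, and treats a lone word as the type of an anonymous parameter. The names MungeSketch, munge, and mungeText are hypothetical.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of parameter munging; not the Parser's actual code. */
final class MungeSketch {
    /** Reads parameters up to the closing ')' and returns one line per parameter. */
    static List<String> munge(String functionName, StreamTokenizer st) throws IOException {
        List<String> lines = new ArrayList<>();
        List<String> words = new ArrayList<>();
        int tok;
        while ((tok = st.nextToken()) != StreamTokenizer.TT_EOF) {
            if (tok == StreamTokenizer.TT_WORD) {
                words.add(st.sval);
            } else if (tok == ',' || tok == ')') {
                if (!words.isEmpty()) {
                    lines.add(describe(functionName, words));
                    words.clear();
                }
                if (tok == ')') {
                    break;
                }
            }
        }
        return lines;
    }

    /** The last word is taken as the name unless it is the only word. */
    private static String describe(String functionName, List<String> words) {
        int last = words.size() - 1;
        if (last == 0) {
            return functionName + "\t" + words.get(0) + "\t<anonymous>";
        }
        String type = String.join(" ", words.subList(0, last));
        return functionName + "\t" + type + "\t" + words.get(last);
    }

    /** Convenience wrapper: text is everything after the opening '('. */
    static List<String> mungeText(String functionName, String text) {
        StreamTokenizer st = new StreamTokenizer(new StringReader(text));
        st.wordChars('_', '_'); // underscores are not word characters by default
        try {
            return munge(functionName, st);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        for (String line : mungeText("foo", "int, char ch);")) {
            System.out.println(line);
        }
    }
}
```

Because a lone word is always treated as a type, a multi-word anonymous type such as unsigned int would be misread as type unsigned with name int; a real parser needs a type table to tell the two apart.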
public void OfficeFunctions(String OfficeFileName, String MissingFileName);
Given a file name for a list of function names, it will print out the ones not in the parser's internal storage. Writes out the missing function names to the screen and to a file.
Return Value
No return value.
Parameter | Description
OfficeFileName | File containing a list of function names, separated by whitespace.
MissingFileName | File to output all missing function names to.
protected void OutputToClassFile(Function func)
throws InvalidParameterException;
Writes out a Function to the correct class file for that function.
Return Value
No return value.
Parameter | Description
func | Function to write to a class.
Exceptions
InvalidParameterException
if func is null.
protected void PackHandler(StreamTokenizer st, Stack packsize)
throws InvalidParameterException, IOException, UnrecognizedCodeException;
Handles #pragma pack lines. Adjusts packsize stack as needed.
Return Value
No return value.
Parameter | Description
st | StreamTokenizer positioned on a #pragma pack (specifically on "pack").
packsize | Stack of PackContainers representing current alignment.
Exceptions
InvalidParameterException
if st isn't positioned on the word "pack"
IOException
if st encounters an I/O error.
UnrecognizedCodeException
if PackHandler encounters syntax error.
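The stack handling behind #pragma pack can be sketched as below. This is a simplified model under stated assumptions: it takes the text between the parentheses instead of a StreamTokenizer, stores plain integers instead of PackContainer objects, expects the stack to start with one entry, assumes a default alignment of 8, and supports only the forms pack(n), pack(), pack(push), pack(push, n), and pack(pop). The class and method names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical sketch of #pragma pack handling; not the Parser's actual code. */
final class PackSketch {
    static final int DEFAULT_PACK = 8; // assumption: compiler default alignment

    /**
     * Applies one directive. args is the text between the parentheses,
     * for example "push, 4". The top of packSize is the current alignment,
     * and the stack must already hold at least one entry.
     */
    static void handle(String args, Deque<Integer> packSize) {
        String trimmed = args.trim();
        if (trimmed.isEmpty()) {              // pack(): back to the default
            replaceTop(packSize, DEFAULT_PACK);
            return;
        }
        String[] parts = trimmed.split("\\s*,\\s*");
        if (parts[0].equals("push")) {        // pack(push) or pack(push, n)
            packSize.push(packSize.peek());
            if (parts.length > 1) {
                replaceTop(packSize, Integer.parseInt(parts[1]));
            }
        } else if (parts[0].equals("pop")) {  // pack(pop): restore previous value
            if (packSize.size() > 1) {
                packSize.pop();
            }
        } else {                              // pack(n): set current alignment
            replaceTop(packSize, Integer.parseInt(parts[0]));
        }
    }

    private static void replaceTop(Deque<Integer> stack, int value) {
        stack.pop();
        stack.push(value);
    }

    public static void main(String[] args) {
        Deque<Integer> packSize = new ArrayDeque<>();
        packSize.push(DEFAULT_PACK);
        handle("push, 1", packSize);
        System.out.println(packSize.peek());
        handle("pop", packSize);
        System.out.println(packSize.peek());
    }
}
```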
public void ParseFile(String FileIn) throws UnrecognizedCodeException,
InvalidParameterException;
ParseFile(String) reads in the file whose name you pass it and stores the functions and structures from that file in a vector of functions or a hash table of structures. This is the main input processing function.
Return Value
No return value.
Parameter | Description
FileIn | Filename to parse.
Exceptions
UnrecognizedCodeException
if there was a parsing problem.
InvalidParameterException
if there's a problem with a function called by this one.
protected void ParseSymbolFile(String File) throws BadInputFileException,
InvalidParameterException;
Reads in a symbol file, adding all symbols to the Symbol table, noting which file each symbol came from for use later when putting functions in files.
Symbol files can be generated by calling dumpbin on a library, like this:
dumpbin /exports c:\windows\system\kernel32.dll > kernel32.sym
Do not edit symbol files, except to get rid of function names with question marks or other really odd names in them. You can leave the Microsoft dumpbin header and trailer info, or you can remove them if you need to. (It's not used by this program, but there is a keyword used to stop skipping over the header.) Symbol files should contain info like this:
1 0 AddAtomA (000079FE)
2 1 AddAtomW (00004478)
Return Value
No return value.
Parameter | Description
File | Symbol file name.
Exceptions
BadInputFileException
if file is not strictly the output of dumpbin /exports
InvalidParameterException
if one of the functions called here failed.
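Reading export lines of the form shown above can be sketched with a regular expression. This is an illustration only: the real method recognizes a keyword to skip the header and throws BadInputFileException on bad input, whereas the sketch simply ignores any line that does not look like an export entry. The class name, the method name, and the choice of a map from symbol name to symbol file name are assumptions.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch of symbol file reading; not the Parser's actual code. */
final class SymbolFileSketch {
    // Matches lines such as "1 0 AddAtomA (000079FE)": ordinal, hint, name, address.
    private static final Pattern EXPORT_LINE = Pattern.compile(
            "^\\s*(\\d+)\\s+([0-9A-Fa-f]+)\\s+(\\S+)\\s+\\(([0-9A-Fa-f]+)\\)\\s*$");

    /** Returns a map from symbol name to the symbol file it came from. */
    static Map<String, String> parse(String symbolFileName, String contents) {
        Map<String, String> symbols = new LinkedHashMap<>();
        for (String line : contents.split("\\r?\\n")) {
            Matcher m = EXPORT_LINE.matcher(line);
            if (m.matches()) {
                symbols.put(m.group(3), symbolFileName);
            }
            // Header and trailer lines do not match and are skipped.
        }
        return symbols;
    }

    public static void main(String[] args) {
        String text = "Dump of file kernel32.dll\n"
                + "    1    0   AddAtomA  (000079FE)\n"
                + "    2    1   AddAtomW  (00004478)\n"
                + "  Summary\n";
        System.out.println(parse("kernel32.sym", text).keySet());
    }
}
```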
protected void PopulateTypeLUT();
Inserts C type names and their corresponding Java types into TypeLUT, this class's internal hashtable. Edit this function if you want to handle another type name in a different way.
Tricky types are handled by leaving their type names as they were. Then you are forced to deal with them yourself when you try to compile the resulting file.
Took out most pointer to function types, hoping to recognize those at runtime. They need some special case handling anyway that I don't think we can do easily.
Return Value
No return value.
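A type lookup table of this kind, including the rule that unknown type names pass through unchanged, might look like the following sketch. The entries shown are a small illustrative subset chosen for this example and are assumptions, not a copy of the table that PopulateTypeLUT builds; the class and method names are hypothetical.

```java
import java.util.Hashtable;

/** Hypothetical sketch of a C-to-Java type table; entries are illustrative. */
final class TypeLutSketch {
    static Hashtable<String, String> build() {
        Hashtable<String, String> lut = new Hashtable<>();
        lut.put("int", "int");
        lut.put("DWORD", "int");      // 32-bit unsigned maps to Java int
        lut.put("WORD", "short");     // 16-bit unsigned maps to Java short
        lut.put("BYTE", "byte");
        lut.put("BOOL", "boolean");
        lut.put("LPCSTR", "String");
        return lut;
    }

    /** Unknown types keep their C name, so the generated file fails to compile visibly. */
    static String toJava(Hashtable<String, String> lut, String cType) {
        String javaType = lut.get(cType);
        return javaType != null ? javaType : cType;
    }

    public static void main(String[] args) {
        Hashtable<String, String> lut = build();
        System.out.println(toJava(lut, "DWORD"));
        System.out.println(toJava(lut, "WNDPROC"));
    }
}
```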
protected void ReadExcludeFunctions();
Reads through list of functions to exclude, in the file described by ExcludeFunctionFile.
Return Value
No return value.
protected Variable readField(StreamTokenizer st, char separator,
char terminator, boolean isInsideStruct, boolean allowAnonymous)
throws UnrecognizedCodeException, InvalidParameterException, IOException,
PuntException;
Given a StreamTokenizer, reads a variable type and name, including the more complex user-defined data types like unions and structs. Assumes it is being called on text within a structure or a union, although it should work with functions too. Recursively calls ReadStructure if it hits a structure embedded within this one. Reads a field until the ending separator or terminator, leaving st there. Also tries to handle some slightly different conversion rules while reading fields from structures.
Return Value
Returns a Variable object (or if isInsideStruct is true, a Field) representing the field read in.
Parameter | Description
st | StreamTokenizer positioned at beginning of a field.
separator | Character used to separate multiple fields.
terminator | Character used to end a list of fields.
isInsideStruct | Whether we're reading a data structure.
allowAnonymous | Whether we can have anonymous data types declared in place.
Exceptions
UnrecognizedCodeException
if the function gets lost.
InvalidParameterException
if separator or terminator equal StreamTokenizer.TT_WORD, or if st is null.
IOException
if StreamTokenizer has an IO problem.
PuntException
if the parser doesn't understand this field or is told to ignore it, based on name and/or type.
public void ReadListofSymbolFiles(String list)
throws InvalidParameterException;
Reads in a file containing filenames of symbol files, then subsequently parses each symbol file. Filenames should be separated by newlines. Comment char is '#'.
Return Value
No return value.
Parameter | Description
list | Name of file containing paths to symbol files.
Exceptions
InvalidParameterException
if list file isn't in the correct format.
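The list file format (one filename per line, '#' as the comment character) can be read with a few lines of code. The sketch below makes two assumptions that the description does not confirm: a '#' starts a comment anywhere on a line, and blank lines are ignored. It returns the filenames instead of parsing each symbol file, and its names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of list-file reading; not the Parser's actual code. */
final class SymbolListSketch {
    /** Returns the symbol file paths named in the list text. */
    static List<String> parse(String contents) {
        List<String> files = new ArrayList<>();
        for (String line : contents.split("\\r?\\n")) {
            int hash = line.indexOf('#');
            if (hash >= 0) {
                line = line.substring(0, hash); // assumption: comment runs to end of line
            }
            line = line.trim();
            if (!line.isEmpty()) {
                files.add(line);
            }
        }
        return files;
    }

    public static void main(String[] args) {
        String list = "# core libraries\nkernel32.sym\n\nuser32.sym # windowing\n";
        System.out.println(parse(list));
    }
}
```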
public Operator readOperator(StreamTokenizer st, boolean couldBePrefix)
throws IOException;
Reads in a C++ operator, given a set of constraints on what this operator could be.
Return Value
Returns Operator instance of token we just read.
Parameter | Description
st | StreamTokenizer positioned at start of operator.
Exceptions
IOException
if StreamTokenizer has problems.
protected multiFieldDataStructure ReadStructure(StreamTokenizer st,
boolean insideStructure) throws UnrecognizedCodeException,
InvalidParameterException, IOException;
Parses a multiFieldDataStructure, reading in its fields, etc. Returns the Struct or Union object.
Return Value
Returns a new multiFieldDataStructure object.
Parameter | Description
st | StreamTokenizer positioned at the struct or union keyword.
insideStructure | true if ReadStructure is nested in a struct or union.
Exceptions
UnrecognizedCodeException
if the struct could not be parsed.
InvalidParameterException
if StreamTokenizer wasn't positioned on struct.
IOException
if StreamTokenizer can't read stream.
protected void SetStringTypes();
Fills the StringTypes Hashtable with all String types and how to convert them from Unicode to whatever format is needed. Punts on TCHAR and derivatives, setting them to auto.
Return Value
No return value.
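One way to picture the StringTypes table is as a map from C string type names to a conversion format. The sketch below uses the common Win32 string typedefs as illustrative entries; the format names "ansi", "unicode", and "auto" follow the J/Direct modifiers, but the actual keys and values stored by SetStringTypes are not shown in this reference, so everything here should be read as an assumption.

```java
import java.util.Hashtable;

/** Hypothetical sketch of a string-type table; entries are illustrative. */
final class StringTypesSketch {
    static Hashtable<String, String> build() {
        Hashtable<String, String> types = new Hashtable<>();
        types.put("LPSTR", "ansi");
        types.put("LPCSTR", "ansi");
        types.put("LPWSTR", "unicode");
        types.put("LPCWSTR", "unicode");
        // TCHAR-based types depend on the build, so the decision is deferred.
        types.put("LPTSTR", "auto");
        types.put("LPCTSTR", "auto");
        return types;
    }

    /** Returns the string format for a C type, or null if it is not a string type. */
    static String formatFor(Hashtable<String, String> types, String cType) {
        return types.get(cType);
    }

    public static void main(String[] args) {
        Hashtable<String, String> types = build();
        System.out.println(formatFor(types, "LPCWSTR"));
        System.out.println(formatFor(types, "LPTSTR"));
    }
}
```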
protected void SetupFileFilters();
Builds file Include and Exclude lists. Here is where we set IncludeHeaders and ExcludeHeaders to their original values. This should be edited when you add a new set of libraries, although the default rules should let your own header files be parsed with a warning. Remember, this program can't parse COM.
Return Value
No return value.
protected void SetupOutputClasses();
Initializes the hash table containing package files for each of the DLLs. To add a new output class, create a PrintWriter for it, output the header info and class name to it, and add it to the hash table, using the symbol file name as the key. This function totally controls how various functions are routed into their own Java classes.
Return Value
No return value.
protected void SetupPrecedenceTable();
Sets up a table of C operator precedence. Taken from the VC++ 5 online help. Keys are Strings containing the operator and values are ints describing precedence, with 0 being the lowest. Duplicate entries are handled by appending odd characters that convey some sense of the meaning with them. I spaced out the precedence numbers to leave room for new operators, in case the table was incomplete or if the ISO committee goes change-happy.
Return Value
No return value.
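A table built along these lines might look like the sketch below. The numeric values, the step of 10 between levels, and the "-unary" key used to tell unary minus from binary minus are all assumptions made for illustration; only the relative order of the levels follows standard C precedence.

```java
import java.util.Hashtable;

/** Hypothetical sketch of a spaced-out precedence table; values are illustrative. */
final class PrecedenceSketch {
    static Hashtable<String, Integer> build() {
        Hashtable<String, Integer> p = new Hashtable<>();
        p.put(",", 0);        // lowest precedence
        p.put("=", 10);
        p.put("?:", 20);
        p.put("||", 30);
        p.put("&&", 40);
        p.put("|", 50);
        p.put("^", 60);
        p.put("&", 70);
        p.put("==", 80);
        p.put("<", 90);
        p.put("<<", 100);
        p.put("+", 110);
        p.put("-", 110);      // binary minus
        p.put("*", 120);
        p.put("-unary", 130); // hypothetical key for the duplicate "-" entry
        return p;
    }

    /** True if operator a binds more tightly than operator b. */
    static boolean bindsTighter(Hashtable<String, Integer> table, String a, String b) {
        return table.get(a) > table.get(b);
    }

    public static void main(String[] args) {
        Hashtable<String, Integer> table = build();
        System.out.println(bindsTighter(table, "*", "+"));
        System.out.println(bindsTighter(table, "||", "&&"));
    }
}
```

The gaps between values mean a new level can be slotted in, for example at 115, without renumbering the existing entries.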
public void UnifyFunctions();
Scans through the functions that have been read in, looking for functions that have both an ASCII and a Unicode version. If it finds both, it strips off the last character, merging them into one function call. Deals with the 4 special cases I found in the Win32 API.
Return Value
No return value.
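The basic merge can be sketched on a plain collection of function names. This illustration assumes the usual Win32 convention of an A suffix for the ASCII version and a W suffix for the Unicode version, and it does not attempt the special cases mentioned above. The class and method names are hypothetical, and the real method works on Function objects, not strings.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical sketch of A/W unification; not the Parser's actual code. */
final class UnifySketch {
    /** Replaces each FooA/FooW pair with a single Foo; other names are kept. */
    static Set<String> unify(Collection<String> names) {
        Set<String> all = new TreeSet<>(names);
        Set<String> result = new TreeSet<>();
        for (String name : all) {
            if (name.length() > 1 && (name.endsWith("A") || name.endsWith("W"))) {
                String base = name.substring(0, name.length() - 1);
                // Merge only when both versions are present.
                if (all.contains(base + "A") && all.contains(base + "W")) {
                    result.add(base);
                    continue;
                }
            }
            result.add(name);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(unify(Arrays.asList(
                "AddAtomA", "AddAtomW", "Sleep", "GetDataA")));
    }
}
```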
public void UnifyStructures();
Scans through read in functions, looking for the ASCII and Unicode versions of any such functions. If it finds them both, it will strip off the last character, merging them into one function call. Deals with the 4 special cases I found in the Win32 API.
Return Value
No return value.
public static final void usage();
Prints the command line syntax to stdout.
Return Value
No return value.
public void WriteOutFunctions(PrintWriter pw);
Prints the functions out to the PrintWriter. Uses the format specified in Function::toString().
Return Value
No return value.
Parameter | Description
pw | PrintWriter to send output to.
Fields
- AnonymousString
- CallbackString
- Comment_Variant_Types
- CopyrightNotice
- DEBUG
- ExcludeFunctionFile
- ExcludeFunctions
- ExcludeHeaders
- Functions
- IncludeHeaders
- OutputClasses
- PackageName
- Precedence
- PuntOutput
- ReadSymbols
- StringTypes
- StructureLUT
- Suppress_UnknownLib_Functions
- Suppress_Unused_Structures
- Symbols
- TypeLUT
- UnknownFileString
- UnknownLibraryString