public class Parser {
    // Fields
    protected static final String PackageName;
    protected PrintWriter PuntOutput;
    protected Vector Functions;
    protected Vector Symbols;
    protected Hashtable TypeLUT;
    protected Hashtable StringTypes;
    protected Hashtable OutputClasses;
    protected Hashtable StructureLUT;
    protected Hashtable Precedence;
    protected Vector IncludeHeaders;
    protected Vector ExcludeHeaders;
    protected Vector ExcludeFunctions;
    protected static final boolean ReadSymbols;
    protected static final boolean Suppress_UnknownLib_Functions;
    protected static final boolean Suppress_Unused_Structures;
    protected static final boolean Comment_Variant_Types;
    protected static final int DEBUG;
    protected static final String AnonymousString;
    protected static final String UnknownFileString;
    protected static final String UnknownLibraryString;
    protected static final String ExcludeFunctionFile;
    public static final String CallbackString;

    // Constructors
    public Parser();

    // Methods
    public void finalizer();
    protected void ReadExcludeFunctions();
    public static final void usage();
    protected void PopulateTypeLUT();
    protected void SetStringTypes();
    public void Convert() throws InvalidParameterException;
    protected void OutputToClassFile(Function func) throws InvalidParameterException;
    public String ConvertFunction(Function func) throws InvalidParameterException;
    public String ConvertArgumentType(Variable var, Function func) throws UnrecognizedCodeException, InvalidParameterException;
    public void ParseFile(String FileIn) throws UnrecognizedCodeException, InvalidParameterException;
    protected void MungeVariables(Function func, StreamTokenizer st) throws IOException, UnrecognizedCodeException;
    protected multiFieldDataStructure ReadStructure(StreamTokenizer st, boolean insideStructure) throws UnrecognizedCodeException, InvalidParameterException, IOException;
    protected Variable readField(StreamTokenizer st, char separator, char terminator, boolean isInsideStruct, boolean allowAnonymous) throws UnrecognizedCodeException, InvalidParameterException, IOException, PuntException;
    public boolean isCOperator(char c);
    public Operator readOperator(StreamTokenizer st, boolean couldBePrefix) throws IOException;
    protected void PackHandler(StreamTokenizer st, Stack packsize) throws InvalidParameterException, IOException, UnrecognizedCodeException;
    protected void ParseSymbolFile(String File) throws BadInputFileException, InvalidParameterException;
    protected void CompareFunctionWithSymbols();
    public void UnifyFunctions();
    public void UnifyStructures();
    protected void SetupFileFilters();
    protected void SetupOutputClasses();
    protected void SetupPrecedenceTable();
    protected boolean CheckFile(String File);
    protected Function findFunction(String Name);
    protected void FindLibrary(Function func) throws InvalidParameterException;
    public void OfficeFunctions(String OfficeFileName, String MissingFileName);
    public void WriteOutFunctions(PrintWriter pw);
    public void ReadListofSymbolFiles(String list) throws InvalidParameterException;
    public static final void main(String args[]);
}
Parser for Win32 API header files
By Brian Grunkemeyer, June-August 1997
This tool was used to generate a significant portion of the Win32 API classes. It is being included for you to use and modify to fit your specific needs. Remember that C header files were not designed to be language-independent descriptions, and that there is more than one correct way to represent some data types in Java. Thus, some functions will require hand-translation. For information on how to do this, see the J/Direct documentation.
Notes:
public Parser();
protected boolean CheckFile(String File);
Decides whether we should examine the current file. Checks IncludeHeaders, ExcludeHeaders, whether it's an IDL file, and whether its name starts with "mm".
Return Value
Returns true if we should parse it, else false.
Parameter  Description
File       Header file we may want to parse.
protected void CompareFunctionWithSymbols();
Compares parsed functions with symbols from a DLL. Assumes a file has been parsed and a symbols file has been read in.
Return Value
No return value.
public void Convert() throws InvalidParameterException;
Converts all functions and structures from C to Java.
Return Value
No return value.
public String ConvertArgumentType(Variable var, Function func) throws UnrecognizedCodeException, InvalidParameterException;
Converts a C function argument's type into the equivalent Java type. Also determines how to do any string conversion (ANSI vs. Unicode) by setting the function's stringformat.
Return Value
Returns String holding Java type name.
Parameter  Description
var        Variable object to convert
func       Function containing this Variable
Exceptions
UnrecognizedCodeException if the argument's C type cannot be converted to Java.
public String ConvertFunction(Function func) throws InvalidParameterException;
Converts a C function prototype to a Java wrapper.
Return Value
Returns String of converted Java wrapper or "" if conversion failed.
Parameter  Description
func       A Function object to convert.
Exceptions
InvalidParameterException if func is null.
public void finalizer();
protected Function findFunction(String Name);
Finds a Function with the given name in the Functions vector, returning the Function object.
Return Value
Returns reference to the Function or null if a function with that name didn't exist.
Parameter  Description
Name       Name of the Function to look for.
protected void FindLibrary(Function func) throws InvalidParameterException;
Finds which library a function occurs in, based on the loaded symbol files. To be effective, this assumes the symbol tables have been set up. Changes the Function's library field.
Return Value
No return value.
Parameter  Description
func       Function to search for in symbol tables.
public boolean isCOperator(char c);
Is this character an operator or a valid first token in an operator in C?
Return Value
Returns true if c is a C operator or the first character in a C operator, else false.
Parameter  Description
c          char to test
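A minimal sketch of such a test follows. The class name COperatorSketch and the exact set of characters are assumptions made for illustration; the set actually accepted by Parser is not listed in this reference.

```java
class COperatorSketch {
    // Assumed set: every character that is, or can begin, a C operator.
    private static final String OPERATOR_START_CHARS = "+-*/%=<>!&|^~?:.,";

    /** Returns true if c is a C operator or the first character of one. */
    static boolean isCOperator(char c) {
        return OPERATOR_START_CHARS.indexOf(c) >= 0;
    }

    public static void main(String[] args) {
        System.out.println(isCOperator('<')); // '<' begins <, <=, << and <<=
        System.out.println(isCOperator('a')); // letters never begin an operator
    }
}
```

A lookup string keeps the sketch short; a real tokenizer would follow up by reading further characters to decide which operator it is.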
public static final void main(String args[]);
Main. Runs the application.
Return Value
No return value.
Parameter  Description
args[]     Array of Strings containing command line parameters.
protected void MungeVariables(Function func, StreamTokenizer st) throws IOException, UnrecognizedCodeException;
MungeVariables takes the current function name and a StreamTokenizer positioned right after the first parenthesis. It prints the function name, a tab, the type of a parameter, then the parameter name on a line for every parameter.
Example: void WINAPI foo(int, char ch);
Translates into:
foo\t void
foo\t int\t <anonymous>
foo\t char\t ch
Return Value
No return value.
Parameter  Description
func       Function object whose arguments are being munged.
st         StreamTokenizer positioned right after beginning '(' of function arguments.
Exceptions
IOException if StreamTokenizer has a problem
UnrecognizedCodeException if it can't parse a variable (unlikely, but possible)
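The munging step described above can be sketched with java.io.StreamTokenizer as follows. This is an illustration under simplifying assumptions, not the Parser's own code: the class MungeSketch and its munge method are hypothetical, every parameter is assumed to be a one-word type optionally followed by a name (no pointers or multi-word types), and the first output line is assumed to carry the return type.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class MungeSketch {
    /**
     * argText is the text right after the opening '(' of a prototype,
     * for example "int, char ch);". Returns one tab-separated line for the
     * function itself and one for each parameter.
     */
    static List<String> munge(String funcName, String returnType, String argText) {
        StreamTokenizer st = new StreamTokenizer(new StringReader(argText));
        st.wordChars('_', '_'); // allow underscores inside identifiers
        List<String> lines = new ArrayList<>();
        lines.add(funcName + "\t" + returnType);
        List<String> words = new ArrayList<>(); // words of the current parameter
        try {
            int tok;
            while ((tok = st.nextToken()) != StreamTokenizer.TT_EOF) {
                if (tok == StreamTokenizer.TT_WORD) {
                    words.add(st.sval);
                } else if (tok == ',' || tok == ')') {
                    if (!words.isEmpty()) {
                        String type = words.get(0);
                        // A lone word is a type with no parameter name.
                        String name = words.size() > 1
                                ? words.get(words.size() - 1) : "<anonymous>";
                        lines.add(funcName + "\t" + type + "\t" + name);
                        words.clear();
                    }
                    if (tok == ')') {
                        break; // end of the argument list
                    }
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : munge("foo", "void", "int, char ch);")) {
            System.out.println(line);
        }
    }
}
```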
public void OfficeFunctions(String OfficeFileName, String MissingFileName);
Given the name of a file containing a list of function names, prints out the names that are not in the parser's internal storage. Writes the missing function names to the screen and to a file.
Return Value
No return value.
Parameter        Description
OfficeFileName   File containing a list of function names, separated by whitespace
MissingFileName  File to output all missing function names to
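The comparison at the heart of this method can be sketched as below. The class MissingFunctionsSketch is hypothetical, and the sketch works on in-memory text and a Set of known names instead of the files and the Functions vector used by Parser.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class MissingFunctionsSketch {
    /** Returns the whitespace-separated names in listText that are not in known. */
    static List<String> findMissing(String listText, Set<String> known) {
        List<String> missing = new ArrayList<>();
        for (String name : listText.trim().split("\\s+")) {
            if (!name.isEmpty() && !known.contains(name)) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        Set<String> known = new HashSet<>(Arrays.asList("CreateFileA", "Sleep"));
        // Names may be separated by any mix of spaces and newlines.
        for (String name : findMissing("CreateFileA  Sleep\nFooBar", known)) {
            System.out.println(name);
        }
    }
}
```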
protected void OutputToClassFile(Function func) throws InvalidParameterException;
Writes out a Function to the correct class file for that function.
Return Value
No return value.
Parameter  Description
func       Function to write to a class.
Exceptions
InvalidParameterException if func is null.
protected void PackHandler(StreamTokenizer st, Stack packsize) throws InvalidParameterException, IOException, UnrecognizedCodeException;
Handles #pragma pack lines. Adjusts packsize stack as needed.
Return Value
No return value.
Parameter  Description
st         StreamTokenizer positioned on a #pragma pack (specifically on "pack").
packsize   Stack of PackContainer objects representing current alignment.
Exceptions
InvalidParameterException if st isn't positioned on the word "pack"
IOException if st encounters an I/O error.
UnrecognizedCodeException if PackHandler encounters syntax error.
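A stack-based treatment of #pragma pack can be sketched as follows. The sketch is an assumption-laden stand-in: it stores plain integers where Parser uses PackContainer objects, takes the text between the parentheses as a String where Parser uses a StreamTokenizer, assumes a default alignment of 8, and ignores the named-identifier forms of the pragma. The class PackSketch and its methods are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class PackSketch {
    static final int DEFAULT_PACK = 8; // assumed default alignment

    /** Creates a stack whose top entry is the current alignment. */
    static Deque<Integer> newStack() {
        Deque<Integer> stack = new ArrayDeque<>();
        stack.push(DEFAULT_PACK);
        return stack;
    }

    /** Applies the text between the parentheses of one #pragma pack(...) line. */
    static void apply(String args, Deque<Integer> packsize) {
        String a = args.replaceAll("\\s+", "");
        if (a.isEmpty()) {                       // pack(): back to the default
            replaceTop(packsize, DEFAULT_PACK);
        } else if (a.equals("pop")) {            // pack(pop): restore previous value
            if (packsize.size() > 1) {
                packsize.pop();
            }
        } else if (a.equals("push")) {           // pack(push): save current value
            packsize.push(packsize.peek());
        } else if (a.startsWith("push,")) {      // pack(push, n): save, then set n
            packsize.push(Integer.parseInt(a.substring(5)));
        } else {                                 // pack(n): set n
            replaceTop(packsize, Integer.parseInt(a));
        }
    }

    private static void replaceTop(Deque<Integer> stack, int value) {
        stack.pop();
        stack.push(value);
    }

    public static void main(String[] args) {
        Deque<Integer> stack = newStack();
        apply("push, 1", stack);
        System.out.println(stack.peek());
        apply("pop", stack);
        System.out.println(stack.peek());
    }
}
```

Keeping the current alignment on top of the stack means a structure parser only ever has to peek at it when laying out fields.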
public void ParseFile(String FileIn) throws UnrecognizedCodeException, InvalidParameterException;
ParseFile(String) reads in the file whose name you pass it and stores the functions and structures from that file in a vector of functions or a hash table of structures. This is the main input processing function.
Return Value
No return value.
Parameter  Description
FileIn     Filename to parse.
Exceptions
UnrecognizedCodeException if there was a parsing problem.
InvalidParameterException if there's a problem with a function called by this one.
protected void ParseSymbolFile(String File) throws BadInputFileException, InvalidParameterException;
Reads in a symbol file, adding all symbols to the Symbol table, noting which file each symbol came from for use later when putting functions in files.
Symbol files can be generated by calling dumpbin on a library, like this:
dumpbin /exports c:\windows\system\kernel32.dll > kernel32.sym
Do not edit symbol files, except to get rid of function names with question marks or other really odd names in them. You can leave the Microsoft dumpbin header and trailer info, or you can remove them if you need to. (It's not used by this program, but there is a keyword used to stop skipping over the header.) Symbol files should contain info like this:
1 0 AddAtomA (000079FE)
2 1 AddAtomW (00004478)
Return Value
No return value.
Parameter  Description
File       Symbol file name.
Exceptions
BadInputFileException if file is not strictly the output of dumpbin /exports
InvalidParameterException if one of the functions called here failed.
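Reading export lines of the form shown in the sample above can be sketched as below. The sketch assumes each export line holds an ordinal, a hint, a name, and an address in parentheses, and it silently skips every other line, which is looser than the strict checking implied by BadInputFileException. The class SymbolFileSketch and its methods are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SymbolFileSketch {
    // ordinal, hint, name, then an address in parentheses
    private static final Pattern EXPORT_LINE = Pattern.compile(
            "\\s*\\d+\\s+[0-9A-Fa-f]+\\s+(\\S+)\\s+\\([0-9A-Fa-f]+\\)\\s*");

    /** Returns the exported name on this line, or null if it is not an export line. */
    static String exportName(String line) {
        Matcher m = EXPORT_LINE.matcher(line);
        return m.matches() ? m.group(1) : null;
    }

    /** Maps each exported name in contents to the symbol file it came from. */
    static Map<String, String> readSymbols(String contents, String symbolFile) {
        Map<String, String> table = new LinkedHashMap<>();
        for (String line : contents.split("\\r?\\n")) {
            String name = exportName(line);
            if (name != null) {
                table.put(name, symbolFile);
            }
        }
        return table;
    }

    public static void main(String[] args) {
        String sample = "1 0 AddAtomA (000079FE)\n2 1 AddAtomW (00004478)\n";
        System.out.println(readSymbols(sample, "kernel32.sym").keySet());
    }
}
```

Recording the source file next to each name is what later allows a function to be routed to the class for its DLL.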
protected void PopulateTypeLUT();
Inserts C type names and their corresponding Java types into TypeLUT, this class's internal hashtable. Edit this function if you want to handle another type name in a different way.
Tricky types are handled by leaving their type names as they were. Then you are forced to deal with them yourself when you try to compile the resulting file.
Took out most pointer to function types, hoping to recognize those at runtime. They need some special case handling anyway that I don't think we can do easily.
Return Value
No return value.
protected void ReadExcludeFunctions();
Reads through list of functions to exclude, in the file described by ExcludeFunctionFile.
Return Value
No return value.
protected Variable readField(StreamTokenizer st, char separator, char terminator, boolean isInsideStruct, boolean allowAnonymous) throws UnrecognizedCodeException, InvalidParameterException, IOException, PuntException;
Given a StreamTokenizer, reads a variable type and name, including the more complex user-defined data types like unions and structs. Assumes it is being called on text within a structure or a union, although it should work with functions too. Recursively calls ReadStructure if it hits an embedded structure within this one. Reads a field until the ending separator or terminator, leaving st there. Also tries to handle some slightly different conversion rules while reading fields from structures.
Return Value
Returns a Variable object (or if isInsideStruct is true, a Field) representing the field read in.
Parameter       Description
st              StreamTokenizer positioned at beginning of a field.
separator       character used to separate multiple fields
terminator      character used to end a list of fields
isInsideStruct  whether we're reading a data structure
allowAnonymous  whether we can have anonymous data types declared in place.
Exceptions
UnrecognizedCodeException if the function gets lost.
InvalidParameterException if separator or terminator equal StreamTokenizer.TT_WORD, or if st is null.
IOException if StreamTokenizer has an IO problem.
PuntException if the parser doesn't understand this field or is told to ignore it, based on name and/or type.
public void ReadListofSymbolFiles(String list) throws InvalidParameterException;
Reads in a file containing filenames of symbol files, then subsequently parses each symbol file. Filenames should be separated by newlines. Comment char is '#'.
Return Value
No return value.
Parameter  Description
list       name of file containing paths to symbol files.
Exceptions
InvalidParameterException if list file isn't in the correct format.
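The list format described above can be read with a few lines of code. In this sketch a '#' is assumed to start a comment that runs to the end of the line, which the reference does not spell out, and the class SymbolListSketch works on the file contents as a String so that it stays self-contained.

```java
import java.util.ArrayList;
import java.util.List;

class SymbolListSketch {
    /** Returns the symbol file names listed in contents, one per line. */
    static List<String> parseList(String contents) {
        List<String> files = new ArrayList<>();
        for (String line : contents.split("\\r?\\n")) {
            int hash = line.indexOf('#');
            if (hash >= 0) {
                line = line.substring(0, hash); // drop the comment part
            }
            line = line.trim();
            if (!line.isEmpty()) {
                files.add(line);
            }
        }
        return files;
    }

    public static void main(String[] args) {
        String list = "# system DLLs\nkernel32.sym\n\nuser32.sym # windowing\n";
        for (String file : parseList(list)) {
            System.out.println(file);
        }
    }
}
```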
public Operator readOperator(StreamTokenizer st, boolean couldBePrefix) throws IOException;
Reads in a C++ operator, given a set of constraints on what this operator could be.
Return Value
Returns Operator instance of token we just read.
Parameter  Description
st         StreamTokenizer positioned at start of operator.
Exceptions
IOException if StreamTokenizer has problems.
protected multiFieldDataStructure ReadStructure(StreamTokenizer st, boolean insideStructure) throws UnrecognizedCodeException, InvalidParameterException, IOException;
Parses a multiFieldDataStructure, reading in its fields, etc. Returns the Struct or Union object.
Return Value
Returns a new multiFieldDataStructure object.
Parameter        Description
st               StreamTokenizer positioned at the struct or union keyword.
insideStructure  true if ReadStructure is nested in a struct or union.
Exceptions
UnrecognizedCodeException if the struct could not be parsed.
InvalidParameterException if StreamTokenizer wasn't positioned on struct.
IOException if StreamTokenizer can't read stream.
protected void SetStringTypes();
Fills the StringTypes Hashtable with all String types and how to convert them from Unicode to whatever format is needed. Punts on TCHAR and derivatives, setting them to auto.
Return Value
No return value.
protected void SetupFileFilters();
Builds file Include and Exclude lists. Here is where we set IncludeHeaders and ExcludeHeaders to their original values. This should be edited when you add a new set of libraries, although the default rules should let your own header files be parsed with a warning. Remember, this program can't parse COM.
Return Value
No return value.
protected void SetupOutputClasses();
Initializes the hash table containing package files for each of the DLLs. To add a new output class, create a PrintWriter for it, output the header info and class name to it, and add it to the hash table, using the symbol file name as the key. This function totally controls how various functions are routed into their own Java classes.
Return Value
No return value.
protected void SetupPrecedenceTable();
Sets up a table of C operator precedence. Taken from the VC++ 5 online help. Keys are Strings containing the operator and values are ints describing precedence, with 0 being the lowest. Duplicate entries are handled by appending odd characters that convey some sense of the meaning with them. I spaced out the precedence numbers to add new operators, in case the table was incomplete or if the ISO committee goes change-happy.
Return Value
No return value.
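A table of this shape can be sketched as follows. The operators chosen, the numeric values, and the "-u" key for unary minus are all illustrative assumptions; the entries and suffix characters actually used by Parser are not listed in this reference. Only the relative order of the values follows standard C precedence.

```java
import java.util.Hashtable;

class PrecedenceSketch {
    static final Hashtable<String, Integer> TABLE = build();

    static Hashtable<String, Integer> build() {
        Hashtable<String, Integer> t = new Hashtable<>();
        // 0 is the lowest precedence; gaps leave room for new operators.
        t.put(",", 0);
        t.put("=", 10);
        t.put("||", 30);
        t.put("&&", 40);
        t.put("==", 80);
        t.put("<", 90);
        t.put("+", 110);
        t.put("-", 110);  // binary minus
        t.put("*", 120);
        t.put("/", 120);
        t.put("-u", 140); // unary minus; the "u" suffix is a hypothetical disambiguator
        t.put("!", 140);
        return t;
    }

    /** Returns true if operator a binds more tightly than operator b. */
    static boolean bindsTighter(String a, String b) {
        return TABLE.get(a) > TABLE.get(b);
    }

    public static void main(String[] args) {
        System.out.println(bindsTighter("*", "+"));
        System.out.println(bindsTighter("-u", "-"));
    }
}
```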
public void UnifyFunctions();
Scans through the functions that have been read in, looking for the ASCII and Unicode versions of the same function. If it finds both, it strips off the last character, merging them into one function call. Deals with the 4 special cases I found in the Win32 API.
Return Value
No return value.
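The merging rule can be sketched on plain function names as below. The sketch assumes the ASCII and Unicode versions end in "A" and "W" respectively, as in the AddAtomA and AddAtomW sample, and it leaves out the 4 special cases, which this reference does not name. The class UnifySketch is hypothetical and works on Strings where Parser works on Function objects.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class UnifySketch {
    /** Merges each FooA/FooW pair into Foo; all other names pass through unchanged. */
    static List<String> unify(List<String> names) {
        Set<String> all = new HashSet<>(names);
        Set<String> merged = new LinkedHashSet<>(); // keeps first-seen order
        for (String name : names) {
            if (name.isEmpty()) {
                continue;
            }
            String base = name.substring(0, name.length() - 1);
            boolean hasUnicodeTwin = name.endsWith("A") && all.contains(base + "W");
            boolean hasAnsiTwin = name.endsWith("W") && all.contains(base + "A");
            merged.add(hasUnicodeTwin || hasAnsiTwin ? base : name);
        }
        return new ArrayList<>(merged);
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("AddAtomA", "AddAtomW", "Sleep", "lstrcpyA");
        System.out.println(unify(names));
    }
}
```

Requiring both versions to be present keeps a name such as lstrcpyA intact when its W counterpart was never parsed.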
public void UnifyStructures();
Scans through read in functions, looking for the ASCII and Unicode versions of any such functions. If it finds them both, it will strip off the last character, merging them into one function call. Deals with the 4 special cases I found in the Win32 API.
Return Value
No return value.
public static final void usage();
Prints the command line syntax to stdout.
Return Value
No return value.
public void WriteOutFunctions(PrintWriter pw);
Prints the functions out to the PrintWriter. Uses the format specified in Function::toString().
Return Value
No return value.
Parameter  Description
pw         PrintWriter to send output to.