A regular expression is a general pattern notation that allows you to describe patterns in text, ranging from simple to extremely complex. A regular expression is much more powerful than the simple glob-style matching many people are familiar with, such as "*.*" or the "Like" operator in Visual Basic. A regular expression is actually a small program that is compiled on the fly and run using text as its input. The result of running the regular expression "program" is usually a matching offset and length within the input text. Most regular expression engines allow you to request the next matching offset and length, and PatternPro is no exception.
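The idea of a match as an offset and a length, with the option to ask for the next match, can be sketched as follows. This is a minimal illustration using Python's standard `re` module rather than PatternPro itself, whose object model is not shown here; the helper name `match_spans` and the sample text are assumptions made for the example.

```python
import re

def match_spans(pattern, text):
    """Return (offset, length) for each successive non-overlapping match."""
    compiled = re.compile(pattern)  # the pattern is compiled once, then run on the text
    return [(m.start(), m.end() - m.start()) for m in compiled.finditer(text)]

# Hypothetical input: find every run of digits and report where it starts and how long it is.
spans = match_spans(r"[0-9]+", "order 66 shipped 1024 units")
print(spans)  # [(6, 2), (17, 4)]
```

Each tuple corresponds to one "next match" request: the first run of digits starts at offset 6 and is 2 characters long, the second starts at offset 17 and is 4 characters long.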
PatternPro follows a basic regular expression syntax with a few POSIX- and VB-specific additions. It is beyond the scope of this document to discuss regular expressions in great detail, although we do supply a brief introduction. Numerous resources on the subject of regular expressions are available on the Web and through your local bookstore.
PatternPro is a powerful ActiveX DLL that allows developers to implement regular expression pattern matching within their applications quickly and easily. In most cases, even the most complex pattern matching requirements can be easily met using PatternPro. The ActiveX packaging of the library makes it possible to use PatternPro in any COM-capable development language, including Visual Basic, PowerBuilder, FoxPro, C/C++, and Delphi. In addition, PatternPro may be used in COM-capable scripted applications, including Excel, Word, and Access. PatternPro regular expressions are an excellent add-in for your Sax, Cypress, or WinWrap Basic VBA engine. In addition to regular expressions, PatternPro exposes classes that aid in the creation of tokenizers and scanners.
A regular expression is constructed from a series of pattern descriptions. For clarity we will call these pattern descriptions "expressions". An expression is optionally followed by a repetition operator that alters the number of times the pattern is required to match. For example, the expression "A+" matches the first three characters of the string "AAABC". When one pattern immediately follows another, it is implicit that both are required for a complete match and that they must match in the same order. The "|" operator designates that only one of a series of expressions is required to match. For example, the expression A|B matches either of the letters A or B. Expressions may also be grouped into a subexpression using parentheses "(...)". Continuing our example from above, the expression (A|B)+ matches the first four characters of the string "AAABC".
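The three examples above can be checked with a short sketch. It uses Python's standard `re` module as a stand-in, on the assumption that PatternPro treats "+", "|", and "(...)" the same way for these simple cases; the helper name `leading_match` is hypothetical.

```python
import re

def leading_match(pattern, text):
    """Return the text matched at the start of the input, or None if there is no match."""
    m = re.match(pattern, text)
    return m.group(0) if m else None

text = "AAABC"

repeat = leading_match(r"A+", text)       # repetition: one or more A
either = leading_match(r"A|B", text)      # alternation: a single A or a single B
grouped = leading_match(r"(A|B)+", text)  # grouping: one or more of (A or B)

print(repeat, either, grouped)  # AAA A AAAB
```

The grouped form stops at "C" because neither alternative inside the parentheses can match it.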
Click here to view a table of the supported syntax.
Regular expression matching can be an intensive operation, and although the PatternPro engine is highly optimized, it is possible to craft expressions that are inefficient and slow. Optimizing your expressions is paramount to obtaining the best possible matching speed. An example, although somewhat unrealistic, will prove useful here. Assume for a moment that the following expression is to be used to find the longest match in a string of "A" and "B":
(A|B)+
If we knew that our input was going to contain many more letters "A" than "B", the expression above would not be a wise choice. The following expression would be much better suited for the purpose of locating the longest match:
(A+|B)+
The second expression is better suited to our input because the first looks for and matches A or B equally fast, one character per repetition, while the second is crafted to match a run of A's and then a B, which better fits input dominated by A's. For a much more complete discussion of regular expressions and their optimization, please see Mastering Regular Expressions by Jeffrey E. F. Friedl, available from O'Reilly.
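When rewriting an expression for speed, it helps to confirm that the rewrite still finds the same text and then to measure it. The sketch below does both for the two expressions above, using Python's standard `re` and `timeit` modules as a stand-in for PatternPro. The sample input and the helper names `longest_match` and `seconds_for` are assumptions, and the relative timings depend on the engine and the input, so results from one engine should not be assumed to carry over to another.

```python
import re
import timeit

# Hypothetical input dominated by the letter A, ending in a character neither pattern accepts.
text = "A" * 50 + "B" + "A" * 50 + "C"

simple = re.compile(r"(A|B)+")      # one character per repetition
run_based = re.compile(r"(A+|B)+")  # a whole run of A's per repetition

def longest_match(compiled, text):
    """Return the longest match found anywhere in text, or '' if there is none."""
    return max((m.group(0) for m in compiled.finditer(text)), key=len, default="")

def seconds_for(compiled, repeats=200):
    """Time repeated searches of the sample text with the given compiled pattern."""
    return timeit.timeit(lambda: compiled.search(text), number=repeats)

# Both expressions must agree on what they match before speed is compared.
same_result = longest_match(simple, text) == longest_match(run_based, text)
print(same_result, len(longest_match(simple, text)))
print(seconds_for(simple), seconds_for(run_based))
```

Timings are printed rather than asserted because they vary from machine to machine and from run to run.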
Click here to download a zip archive containing the setup files for the Evaluation DLL and the demo project. Important: Register the DLL before using it.
Q: How much does it cost?
A: The demo version of the software is a fully functional working copy. Try it, you'll like it.
Q: What is the difference between the demo version and the commercial version?
A: Speed. The demo version is compiled to p-code, while the commercial version is an optimized binary. The commercial version is much faster.
Q: What is the cost of upgrading to the commercial version?
A: Only $20.00 (US Currency, Business Check or Money Order)
Copyright 1999, 2000 BlackBox Software & Consulting