home *** CD-ROM | disk | FTP | other *** search
- Most of the interfaces are covered by the documentation for GNU regex.
-
- This file will accumulate factoids about other interfaces until
- somebody writes a manual.
-
-
-
-
-
-
- * RX_SEARCH
-
- For an example of how to use rx_search, you can look at how
- re_search_2 is defined (in rx.c). Basicly you need to define three
- functions. These are GET_BURST, FETCH_CHAR, and BACK_REF. They each
- operate on `struct rx_string_position' and a closure of your design
- passed as void *.
-
- struct rx_string_position
- {
- const unsigned char * pos; /* The current pos. */
- const unsigned char * string; /* The current string burst. */
- const unsigned char * end; /* First invalid position >= POS. */
- int offset; /* Integer address of current burst */
- int size; /* Current string's size. */
- int search_direction; /* 1 or -1 */
- int search_end; /* First position to not try. */
- };
-
- On entry to GET_BURST, all these fields are set, but POS may be >=
- END. In fact, STRING and END might both be 0.
-
- The function of GET_BURST is to make all the fields valid without
- changing the logical position in the string. SEARCH_DIRECTION is a
- hint about which way the matcher will move next. It is usually 1, and
- is -1 only when fastmapping during a reverse search. SEARCH_END
- terminates the burst.
-
- typedef enum rx_get_burst_return
- (*rx_get_burst_fn) (struct rx_string_position * pos,
- void * app_closure,
- int stop);
-
-
- The closure is whatever you pass to rx_search. STOP is an argument to
- rx_search that bounds the search. You should never return a string
- position from with SEARCH_END set beyond the position indicated by
- STOP.
-
-
- enum rx_get_burst_return
- {
- rx_get_burst_continuation,
- rx_get_burst_error,
- rx_get_burst_ok,
- rx_get_burst_no_more
- };
-
- Those are the possible return values of get_burst. Normally, you only
- ever care about the last two. An error return indicates something
- like trouble reading a file. A continuation return means suspend the
- search and resume by retrying GET_BURST if the search is restarted.
-
- GET_BURST is not quite as trivial as you might hope. If you have a
- fragmented string, you really have to keep two adjacent fragments at
- all times, even though the GET_BURST interface looks like you only
- need one. This is because of operators like `word-boundry' that try
- to look at two adjacent characters. Such operators are implemented
- with FETCH_CHAR.
-
- typedef int (*rx_fetch_char_fn) (struct rx_string_position * pos,
- int offset,
- void * app_closure,
- int stop);
-
- That takes the same closure passed to GET_BURST. It returns the
- character at POS or at one past POS according to whether OFFSET is 0
- or 1.
-
- It is guaranteed that POS + OFFSET is within the string being searched.
-
-
-
- The last function compares characters at one position with characters
- previously matched by a parenthesized subexpression.
-
- enum rx_back_check_return
- {
- rx_back_check_continuation,
- rx_back_check_error,
- rx_back_check_pass,
- rx_back_check_fail
- };
-
- typedef enum rx_back_check_return
- (*rx_back_check_fn) (struct rx_string_position * pos,
- int lparen,
- int rparen,
- unsigned char * translate,
- void * app_closure,
- int stop);
-
- LPAREN and RPAREN are the integer indexes of the previously matched
- characters. The comparison should translate both characters being
- compared by mapping them through TRANSLATE. POS is the point at which
- to begin comparing. It should be advanced to the last character
- matched during backreferencing.
-
-
-