re.getPatternInfo
Fills a table with detailed information about a compiled regular expression pattern.
Syntax
re.getPatternInfo (cp, adrInfoTable)
Params
- cp is a binary param containing the compiled pattern that was returned by a call to “re.compile”.
- adrInfoTable is an address param determining where a table will be created and filled with detailed information about the compiled pattern.
Returns
True.
Errors
An error will be generated if the cp param is not a valid byte-code representation of a regular expression.
Notes
- After a successful call to re.getPatternInfo, the info table will contain the following elements:
  - backRefMax, a number indicating the index of the highest backreference in the pattern.
  - captureCount, a number indicating the total number of capturing parentheses in the pattern.
  - firstByte, a char value indicating the character that must start any possible match. If there is no such character, the value is the number -1 when the pattern only matches at the beginning of a string or the beginning of any line, and the number -2 otherwise.
  - firstByteTable, a sub-table containing boolean entries numbered from 000 to 255 whose values indicate whether the corresponding ASCII character could start a possible match.
  - lastLiteral, a char value indicating the rightmost character that must exist in any matched string, other than at its start, or the number -1 if no such character has been found.
  - nameTable, a sub-table containing number entries whose names are the names of all named capturing parentheses and whose values are the corresponding group numbers.
  - options, a sub-table containing boolean entries corresponding to all the boolean parameters of “re.compile”.
  - size, a number indicating the size (in bytes) of the compiled pattern.
  - studySize, a number indicating the size (in bytes) of the table of possible first characters that might start a match.
- The kernel implementation of the re verbs is based on the Perl Compatible Regular Expressions (PCRE) library. This is the same library used by Python, Apache, and PHP. Mandatory excerpt from the PCRE license:
  - Regular expression support is provided by the PCRE library package, which is open source software, written by Philip Hazel, and copyright by the University of Cambridge, England.
  - The source code for the PCRE library is available for download from ftp://ftp.csx.cam.ac.uk/pub/software/programming/pcre/
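Example

To make a few of the info table entries more concrete, the sketch below collects comparable metadata from a compiled pattern using Python's standard re module as a stand-in. It is an analogy only: it does not call re.getPatternInfo, the helper name pattern_info and the option names ignoreCase, multiLine and dotAll are hypothetical, and only the captureCount, nameTable and options entries have an obvious Python counterpart.

```python
import re


def pattern_info(compiled):
    """Collect metadata from a compiled Python pattern into a plain dict.

    The key names mirror the info table entries described in the Notes.
    This is an illustration, not Frontier's implementation.
    """
    return {
        # Total number of capturing groups, named or unnamed.
        "captureCount": compiled.groups,
        # Group name -> group number, comparable to the nameTable sub-table.
        "nameTable": dict(compiled.groupindex),
        # One boolean per option, comparable to the options sub-table.
        # These option names are hypothetical, chosen for this sketch.
        "options": {
            "ignoreCase": bool(compiled.flags & re.IGNORECASE),
            "multiLine": bool(compiled.flags & re.MULTILINE),
            "dotAll": bool(compiled.flags & re.DOTALL),
        },
    }


# One named group ("year") and one unnamed group.
info = pattern_info(re.compile(r"(?P<year>\d{4})-(\d{2})", re.IGNORECASE))
print(info["captureCount"])  # 2
print(info["nameTable"])     # {'year': 1}
```

In this pattern the named group "year" is group 1 and the unnamed group is group 2, so the capture count is 2 while the name table holds a single entry.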