|
Recursive patterns
Consider the problem of matching a string in parentheses,
allowing for unlimited nested parentheses. Without the use
of recursion, the best that can be done is to use a pattern
that matches up to some fixed depth of nesting. It is not
possible to handle an arbitrary nesting depth. Perl 5.6 has
provided an experimental facility that allows regular
expressions to recurse (among other things). The special
item (?R) is provided for the specific case of recursion.
This PCRE pattern solves the parentheses problem (assume
the PCRE_EXTENDED option is set so that white space is
ignored):
First it matches an opening parenthesis. Then it matches any number of substrings, each of which is either a sequence of non-parentheses or a recursive match of the pattern itself (i.e. a correctly parenthesized substring). Finally it matches a closing parenthesis.
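
The three steps just described can be sketched as a small recursive function. This is a hand-written Python illustration of the same structure, not PCRE itself and not a regular expression. The name `match_parens` and its return convention (the index just past the match, or -1) are assumptions made for this example. The inner loop consumes a whole run of non-parentheses and never gives any of it back, which plays roughly the role that a once-only subpattern plays in a pattern.

```python
def match_parens(s: str, i: int = 0) -> int:
    """Return the index just past a balanced group starting at s[i], or -1.

    Hypothetical helper for illustration; it mirrors the three steps of
    the description, not any PCRE API.
    """
    # Step 1: an opening parenthesis.
    if i >= len(s) or s[i] != "(":
        return -1
    i += 1

    # Step 2: any number of items, each either a run of
    # non-parentheses or a nested, correctly parenthesized group.
    while i < len(s) and s[i] != ")":
        if s[i] == "(":
            # Recursive match of the whole rule.
            i = match_parens(s, i)
            if i == -1:
                return -1
        else:
            # Consume the entire run in one go, with no backtracking into it.
            while i < len(s) and s[i] not in "()":
                i += 1

    # Step 3: a closing parenthesis.
    if i < len(s) and s[i] == ")":
        return i + 1
    return -1
```

For instance, `match_parens("(ab(cd)ef)")` returns 10, the index just past the final closing parenthesis, while `match_parens("(a(b)")` returns -1 because the outer group is never closed. The function recurses once per nesting level, so in this sketch very deep nesting is bounded by Python's recursion limit.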
This particular example pattern contains nested unlimited
repeats, and so the use of a once-only subpattern for matching
strings of non-parentheses is important when applying
the pattern to strings that do not match. For example, when
it is applied to
The values set for any capturing subpatterns are those from
the outermost level of the recursion at which the subpattern
value is set. If the pattern above is matched against
If the syntax for a recursive subpattern reference (either by number or
by name) is used outside the parentheses to which it refers, it operates
like a subroutine in a programming language. An earlier example
pointed out that the pattern
The maximum length of a subject string is the largest positive number that an integer variable can hold. However, PCRE uses recursion to handle subpatterns and indefinite repetition. This means that the available stack space may limit the size of a subject string that can be processed by certain patterns. |