C++ Boost

Boost.Regex

Boost-Extended Format String Syntax

Boost.Regex Index

Boost-Extended format strings treat all characters as literals except for '$', '\', '(', ')', '?', ':' and '\'.

Grouping

The characters '(' and ')' perform lexical grouping, use \( and \) if you want a to output literal parenthesis.

Conditionals

The character '?' begins a conditional expression, the general form is:

?Ntrue-expression:false-expression

where N is decimal digit.

If sub-expression N was matched, then true-expression is evaluated and sent to output, otherwise false-expression is evaluated and sent to output.

You will normally need to surround a conditional-expression with parenthesis in order to prevent ambiguities.

Placeholder Sequences

Placeholder sequences specify that some part of what matched the regular expression should be sent to output as follows:

Placeholder Meaning
$& Outputs what matched the whole expression.
$` Outputs the text between the end of the last match found (or the start of the text if no previous match was found), and the start of the current match.
$' Outputs all the text following the end of the current match.
$$ Outputs a literal '$'
$n Outputs what matched the n'th sub-expression.

Any $-placeholder sequence not listed above, results in '$' being treated as a literal.

Escape Sequences

An escape character followed by any character x, outputs that character unless x is one of the escape sequences shown below.

Escape Meaning
\a Outputs the bell character: '\a'.
\e Outputs the ANSI escape character (code point 27).
\f Outputs a form feed character: '\f'
\n Outputs a newline character: '\n'.
\r Outputs a carriage return character: '\r'.
\t Outputs a tab character: '\t'.
\v Outputs a vertical tab character: '\v'.
\xDD Outputs the character whose hexadecimal code point is 0xDD
\x{DDDD} Outputs the character whose hexadecimal code point is 0xDDDDD
\cX Outputs the ANSI escape sequence "escape-X".
\D If D is a decimal digit in the range 1-9, then outputs the text that matched sub-expression D.
\l Causes the next character to be outputted, to be output in lower case.
\u Causes the next character to be outputted, to be output in upper case.
\L Causes all subsequent characters to be output in lower case, until a \E is found.
\U Causes all subsequent characters to be output in upper case, until a \E is found.
\E Terminates a \L or \U sequence.


Revised 24 Nov 2004

© Copyright John Maddock 2004

Use, modification and distribution are subject to the Boost Software License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)