[section Introduction]

[h2 What is xpressive?]

xpressive is an object-oriented regular expression library. Regular expressions (regexes) can be written as strings that are parsed dynamically at runtime (dynamic regexes), or as expression templates that are parsed at compile time (static regexes). Dynamic regexes have the advantage that they can be accepted from the user as input at runtime or read from an initialization file. Static regexes have several advantages. Since they are C++ expressions instead of strings, they can be syntax-checked at compile time. Also, they can refer to other regexes and to themselves, giving static regexes the power of context-free grammars. Finally, since they are statically bound, the compiler can generate faster code for static regexes.

xpressive's dual nature is unique and powerful. Static xpressive is a bit like the _spirit_fx_. Like _spirit_, you can build grammars with static regexes using expression templates. (Unlike _spirit_, xpressive does exhaustive backtracking, trying every possibility to find a match for your pattern.) Dynamic xpressive is a bit like _regexpp_. In fact, xpressive's interface should be familiar to anyone who has used _regexpp_. xpressive's innovation comes from allowing you to mix and match static and dynamic regexes in the same program, and even in the same expression! You can embed a dynamic regex in a static regex, and the dynamic regex will participate fully in the search, backtracking as needed to make the match succeed.

[h2 Hello, world!]

Enough theory. Let's have a look at ['Hello World], xpressive style:

    #include <iostream>
    #include <boost/xpressive/xpressive.hpp>

    using namespace boost::xpressive;

    int main()
    {
        std::string hello( "hello world!" );

        sregex rex = sregex::compile( "(\\w+) (\\w+)!" );
        smatch what;

        if( regex_match( hello, what, rex ) )
        {
            std::cout << what[0] << '\n'; // whole match
            std::cout << what[1] << '\n'; // first capture
            std::cout << what[2] << '\n'; // second capture
        }

        return 0;
    }

This program outputs the following:

[pre
hello world!
hello
world
]

The first thing you'll notice about the code is that all the types in xpressive live in the `boost::xpressive` namespace.

[note Most of the rest of the examples in this document will leave off the `using namespace boost::xpressive;` directive. Just pretend it's there.]

Next, you'll notice the type of the regular expression object is `sregex`. If you are familiar with _regexpp_, this is different from what you are used to. The "`s`" in "`sregex`" stands for "`string`", indicating that this regex can be used to find patterns in `std::string` objects. I'll discuss this difference and its implications in detail later.

Notice how the regex object is initialized:

    sregex rex = sregex::compile( "(\\w+) (\\w+)!" );

To create a regular expression object from a string, you must call a factory method such as _regex_compile_. This is another area in which xpressive differs from other object-oriented regular expression libraries. Other libraries encourage you to think of a regular expression as a kind of string on steroids. In xpressive, regular expressions are not strings; they are little programs in a domain-specific language. Strings are only one ['representation] of that language. Another representation is an expression template. For example, the above line of code is equivalent to the following:

    sregex rex = (s1= +_w) >> ' ' >> (s2= +_w) >> '!';

This describes the same regular expression, except it uses the domain-specific embedded language defined by static xpressive.

As you can see, static regexes have a syntax that is noticeably different from standard Perl syntax. That is because we are constrained by C++'s syntax. The biggest difference is the use of `>>` to mean "followed by".
In Perl, you can just put sub-expressions next to each other:

    abc

But in C++, there must be an operator separating sub-expressions:

    a >> b >> c

In Perl, parentheses `()` have special meaning. They group, but as a side-effect they also create back-references like [^$1] and [^$2]. In C++, there is no way to overload parentheses to give them side-effects. To get the same effect, we use the special `s1`, `s2`, etc. tokens. Assign to one to create a back-reference (known as a sub-match in xpressive).

You'll also notice that the one-or-more repetition operator `+` has moved from postfix to prefix position. That's because C++ doesn't have a postfix `+` operator. So:

    "\\w+"

is the same as:

    +_w

We'll cover all the other differences [link boost_xpressive.user_s_guide.creating_a_regex_object.static_regexes later].

[endsect]