FBB::Pattern(3bobcat)

Pattern matcher
(libbobcat-dev_4.08.03-x.tar.gz)

2005-2018

NAME

FBB::Pattern - Performs RE pattern matching

SYNOPSIS

#include <bobcat/pattern>
Linking option: -lbobcat

DESCRIPTION

Pattern objects may be used for Regular Expression (RE) pattern matching. The class is a wrapper around the regcomp(3) family of functions. By default it uses `extended regular expressions', requiring you to escape multipliers and bounding-characters when they should be interpreted as ordinary characters (i.e., *, +, ?, ^, $, |, (, ), [, ], {, } should be escaped when used as literal characters).
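
The effect of escaping can be seen with the underlying regcomp(3) functions directly. The following is a minimal sketch, not Bobcat code: matchesExtended is a hypothetical helper that compiles its first argument as a POSIX extended regular expression and reports whether the second argument contains a match. Note that in C++ source text each backslash has to be doubled.

```cpp
#include <regex.h>      // POSIX regcomp(3), regexec(3), regfree(3)
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of Bobcat): returns true if `text` contains
// a match of the POSIX extended regular expression `re`.
bool matchesExtended(std::string const &re, std::string const &text)
{
    regex_t compiled;

    // REG_EXTENDED selects extended regular expressions,
    // REG_NOSUB: only a yes/no answer is needed here
    if (regcomp(&compiled, re.c_str(), REG_EXTENDED | REG_NOSUB) != 0)
        throw std::runtime_error("regcomp failed for: " + re);

    bool found = regexec(&compiled, text.c_str(), 0, nullptr, 0) == 0;

    regfree(&compiled);
    return found;
}
```

With this helper matchesExtended("a+b", "aaab") succeeds because + acts as a multiplier, whereas the literal text a+b is only matched once the + is escaped, as in matchesExtended("a\\+b", "a+b").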

The Pattern class supports the use of the following (Perl-like) special escape sequences:
\b - indicating a word-boundary
\d - indicating a digit ([[:digit:]]) character
\s - indicating a white-space ([[:space:]]) character
\w - indicating a word ([[:alnum:]]) character

The corresponding capitals (e.g., \W) define the complementary character sets. The capitalized character set shorthands are not expanded inside explicit character-classes (i.e., [ ... ] constructions). So [\W] represents a set of two characters: \ and W.
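
As an illustration of the shorthands listed above, the sketch below rewrites them as POSIX bracket expressions and hands the result to regcomp(3). It is not Bobcat's implementation: expandShorthands and matches are hypothetical names, the complementary sets are assumed to be plain negated bracket expressions, and the sketch does not treat bracket expressions that are already present in the pattern in any special way.

```cpp
#include <cstddef>
#include <regex.h>
#include <stdexcept>
#include <string>

// Hypothetical illustration: replaces \d, \s, \w and their capitals by
// POSIX bracket expressions. All other escape sequences are copied as is.
std::string expandShorthands(std::string const &re)
{
    std::string out;

    for (std::size_t idx = 0; idx != re.size(); ++idx)
    {
        // ordinary character, or a backslash ending the pattern
        if (re[idx] != '\\' || idx + 1 == re.size())
        {
            out += re[idx];
            continue;
        }

        char next = re[++idx];
        switch (next)
        {
            case 'd': out += "[[:digit:]]";  break;
            case 'D': out += "[^[:digit:]]"; break;
            case 's': out += "[[:space:]]";  break;
            case 'S': out += "[^[:space:]]"; break;
            case 'w': out += "[[:alnum:]]";  break;
            case 'W': out += "[^[:alnum:]]"; break;
            default:                        // keep other escapes unchanged
                out += '\\';
                out += next;
            break;
        }
    }
    return out;
}

// Hypothetical helper: true if `text` contains a match of the expanded `re`.
bool matches(std::string const &re, std::string const &text)
{
    regex_t compiled;
    if (regcomp(&compiled, expandShorthands(re).c_str(),
                REG_EXTENDED | REG_NOSUB) != 0)
        throw std::runtime_error("regcomp failed for: " + re);

    bool found = regexec(&compiled, text.c_str(), 0, nullptr, 0) == 0;
    regfree(&compiled);
    return found;
}
```

Here expandShorthands("\\d+") yields [[:digit:]]+, while an escape such as \. in "a\\.b" is passed on unchanged for regcomp(3) to interpret.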

As the backslash (\) is treated as a special character it should be handled carefully. Pattern converts the escape sequences \d \s \w (and outside of explicit character classes the sequences \D \S \W) to their respective character classes. All other escape sequences are kept as is, and the resulting regular expression is offered to the pattern matching compilation function regcomp(3). This function will again interpret escape sequences. Consequently some care should be exercised when defining patterns containing escape sequences. Here are the rules:

NAMESPACE

FBB
All constructors, members, operators and manipulators, mentioned in this man-page, are defined in the namespace FBB.

INHERITS FROM

-

TYPEDEF

CONSTRUCTORS

Pattern offers copy and move constructors.

MEMBER FUNCTIONS

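Pattern does not inherit from other classes (see INHERITS FROM), so only Pattern's own members are available. The program in the EXAMPLE section uses match, pattern, before, matched, beyond, end, position and the index operator.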

OVERLOADED OPERATORS

EXAMPLE

/*
                              driver.cc
*/

#include <cstddef>
#include <exception>
#include <iostream>
#include <string>

#include <bobcat/pattern>

using namespace std;
using namespace FBB;


void showSubstr(string const &str)
{
    static int 
        count = 1;

    cout << "String " << count++ << " is '" << str << "'\n";
}


int main(int argc, char **argv)
{
//    {
//        Pattern one("one");
////        Pattern two(one);
//        Pattern three("a");
//        Pattern four;
//        three = three;
//    }


//    try 
//    {
//        Pattern pattern("aap|noot|mies");
//
//        {
//            Pattern extra(Pattern(pattern));
//        }
//    
//        if (pattern << "noot")
//            cout << "noot matches\n";
//        else
//            cout << ": noot doesn't match\n";
//    }
//    catch (exception const &e)
//    {
//        cout << e.what() << ": compilation failed" << endl;
//    }
//        
    string pat = "\\d+";

    while (true)
    {
        cout << "Pattern: '" << pat << "'\n";

        try
        {
            Pattern patt(pat, argc == 1);   // case sensitive by default,
                                            // any arg for case insensitive

            cout << "Compiled pattern: " << patt.pattern() << endl;

            Pattern pattern;
            pattern = patt;                 // assignment operator

            while (true)
            {
                cout << "string to match : ";

                string st;
                getline(cin, st);
                if (st == "")
                    break;
                cout << "String: '" << st << "'\n";
                try
                {
                    pattern.match(st);

                    Pattern p3(pattern);
        
                    cout << "before:  " << p3.before() << "\n"
                            "matched: " << p3.matched() << "\n"  
                            "beyond:  " << pattern.beyond() << "\n"  
                            "end() = " << pattern.end() << endl;
        
                    for (size_t idx = 0; idx < pattern.end(); ++idx)
                    {
                        string str = pattern[idx];
            
                        if (str == "")
                            cout << "part " << idx << " not present\n";
                        else
                        {
                            Pattern::Position pos = pattern.position(idx);
        
                            cout << "part " << idx << ": '" << str << "' (" <<
                                    pos.first << "-" << pos.second << ")\n";
                        }
                    }
                }
                catch (exception const &e)
                {
                    cout << e.what() << ": " << st << " doesn't match" << endl;
                    continue;
                }
            }
        }            
        catch (exception const &e)
        {
            cout << e.what() << ": compilation failed" << endl;
        }

        cout << "New pattern: ";

        if (!getline(cin, pat) || !pat.length())
            return 0;
    }
}





FILES

bobcat/pattern - defines the class interface

SEE ALSO

bobcat(7), regcomp(3), regex(3), regex(7)

BUGS

Using Pattern objects as static data members of classes (or as global objects) is potentially dangerous. If the object files defining these static data members are stored in a dynamic library, the objects may not be initialized properly or in time, and their eventual destruction may result in a segmentation fault. This is a well-known problem with static data, see, e.g., http://www.parashift.com/c++-faq-lite/ctors.html#faq-10.15. In such situations prefer a (shared, unique) pointer to a Pattern, and initialize the pointer when it is needed, e.g., when it is first used.
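
The pointer-based approach might look like the sketch below. To keep it self-contained it does not use Bobcat: Matcher is a hypothetical stand-in for FBB::Pattern built directly on regcomp(3), and Parser and number are illustrative names only. The static unique_ptr starts out empty, which requires no run-time construction, and the wrapped object is only created at the first call of number(). The sketch is not thread-safe.

```cpp
#include <memory>
#include <regex.h>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for FBB::Pattern: a small wrapper around regcomp(3).
class Matcher
{
    regex_t d_regex;

    public:
        explicit Matcher(std::string const &re)
        {
            if (regcomp(&d_regex, re.c_str(),
                        REG_EXTENDED | REG_NOSUB) != 0)
                throw std::runtime_error("cannot compile: " + re);
        }

        ~Matcher()
        {
            regfree(&d_regex);
        }

        Matcher(Matcher const &) = delete;
        Matcher &operator=(Matcher const &) = delete;

        bool matches(std::string const &text) const
        {
            return regexec(&d_regex, text.c_str(), 0, nullptr, 0) == 0;
        }
};

// Illustrative class owning the matcher through a static pointer instead of
// through a static object.
class Parser
{
    static std::unique_ptr<Matcher> s_number;   // empty until first use

    public:
        static Matcher const &number()
        {
            if (!s_number)                      // create at first use
                s_number.reset(new Matcher("^[[:digit:]]+$"));

            return *s_number;
        }
};

std::unique_ptr<Matcher> Parser::s_number;      // no Matcher constructed yet
```

Callers use Parser::number().matches("2018") and never touch the pointer itself, so the moment of construction is controlled by the first call rather than by the order in which static objects are initialized.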

DISTRIBUTION FILES

BOBCAT

Bobcat is an acronym of `Brokken's Own Base Classes And Templates'.

COPYRIGHT

This is free software, distributed under the terms of the GNU General Public License (GPL).

AUTHOR

Frank B. Brokken (f.b.brokken@rug.nl).