bisonc++api(3)
bisonc++ API
(bisonc++.6.02.00.tar.gz)
2005-2018
NAME
bisonc++ - Application programmer's interface of bisonc++ generated classes
DESCRIPTION
Bisonc++ derives from bison++(1), originally derived from
bison(1). Like these programs bisonc++ generates a parser for an LALR(1)
grammar. Bisonc++ generates C++ code: an expandable C++ class.
Refer to bisonc++(1) for a general overview. This manual page covers the
application programmer's interface of classes generated by bisonc++. It contains
the following sections:
- DESCRIPTION: this section;
- PUBLIC SYMBOLS: constructor, enums, members, and types that can
be used by calling software;
- PRIVATE ENUMS AND -TYPES: enumerations and types only
available to the Parser class;
- PRIVATE MEMBER FUNCTIONS: member functions that are only
available to the Parser class;
- PRIVATE DATA MEMBERS: data members that are only available to
the Parser class;
- TYPES AND VARIABLES IN THE ANONYMOUS NAMESPACE: an overview of
the types and variables that are used to define and store the
grammar-tables generated by bisonc++;
- SEE ALSO: references to other programs and documentation;
- AUTHOR: at the end of this man-page.
All identifiers ending in an underscore character are reserved for
bisonc++. Member functions ending in an underscore character must not be
redefined. Data members ending in an underscore character are available in
the generated parser class, and may be modified by user-defined members of the
parser class. Some members (e.g., error, exceptionHandler, lex) are defined
in the parser class and must remain present, but their implementations may be
altered by the user. Members for which no default implementation is provided
in the parser's internal header file (e.g., Parser.ih) may not be
redefined or masked by user-provided code.
UNDERSCORES
Starting with version 6.02.00 bisonc++ reserved identifiers no longer end
in two underscore characters, but in one. This modification was necessary
because according to the C++ standard identifiers having two or more
consecutive underscore characters are reserved by the language. In
practice this could require some minor modifications of existing source
files using bisonc++'s facilities, most likely limited to changing Tokens__
into Tokens_ and changing Meta__ into Meta_.
The complete list of affected names is:
- Enums:
DebugMode_, ErrorRecovery_, Return_, Tag_, Tokens_
- Enum values:
PARSE_ABORT_, PARSE_ACCEPT_, UNEXPECTED_TOKEN_, sizeofTag_
- Type / namespace designators:
Meta_, PI_, STYPE_
- Member functions:
clearin_, errorRecovery_, errorVerbose_, executeAction_, lex_,
lookup_, nextCycle_, nextToken_, popToken_, pop_, print_,
pushToken_, push_, recovery_, redoToken_, reduce_, savedToken_,
shift_, stackSize_, startRecovery_, state_, token_, top_, vs_
- Protected data members:
d_acceptedTokens_, d_actionCases_, d_debug_, d_nErrors_,
d_requiredTokens_, d_val_, idOfTag_, s_nErrors_
PUBLIC SYMBOLS
Parser classes generated by bisonc++ offer the following public constructor,
enums, members and types (in the following overview parser class-name prefixes
(e.g., Parser::) were omitted):
- Constructors: the generated parser class merely defines the default
constructor. Copy and move constructors are not available. The default
constructor is a real default: it is declared as such in the parser's
header file. Additional constructors can easily be added to the parser
class's interface. Since the initialization of the parser's base class
is performed by the parser base class's default constructor,
constructors that are added to the generated parser class
automatically call the base class constructor, so additional
constructors do not have to explicitly initialize the parser's base
class.
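A minimal sketch of adding a constructor is shown next. ParserBase and
Parser are heavily simplified stand-ins for the generated classes, and the
d_source member and source() accessor are hypothetical; the point is merely
that the added constructor needs no base class initializer:

```cpp
#include <cstddef>
#include <string>
#include <utility>

// Simplified stand-in for the generated base class (not the real one).
class ParserBase
{
    protected:
        std::size_t d_nErrors_ = 0;

    public:
        ParserBase() = default;
};

class Parser: public ParserBase
{
    std::string d_source;                       // hypothetical user data

    public:
        Parser() = default;                     // the generated default
        explicit Parser(std::string source);    // user-added constructor

        std::string const &source() const { return d_source; }
        std::size_t nErrors() const { return d_nErrors_; }
};

// ParserBase's default constructor runs implicitly before the body:
// no explicit base class initializer is required here.
Parser::Parser(std::string source)
:
    d_source(std::move(source))
{}
```

A call like Parser parser("input.txt") then yields a fully initialized
base class part as well.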
- DebugMode_:
The values of this enum are used to configure the type of debug
information that will be displayed (assuming that the debug
option/directive was specified when bisonc++ generated the parser's
code). It has three values:
OFF: no debug information is displayed when the generated
parser's parse function is called;
ON: extensive debug information about the parsing process is
displayed when the generated parser's parse function is called;
ACTIONCASES: just before executing the grammar's action blocks
the action block number is written to the standard output
stream. These action block numbers refer to case labels of the switch
that is defined in the parser's executeAction_ function. It is
commonly used to find the action block where a fatal semantic value
mismatch was observed.
The bitwise-or operator (|) can be used to combine ON and
ACTIONCASES (see the member function setDebug(DebugMode_
mode) below).
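How such a combination might work is sketched below. The enumerator
values, the operator| overload, and the helper decode (with its DebugFlags
struct) are assumptions made for this illustration; the generated header
may define them differently:

```cpp
// Assumed flag values: each mode occupies its own bit.
enum DebugMode_
{
    OFF         = 0,
    ON          = 1 << 0,
    ACTIONCASES = 1 << 1
};

// Combining two modes yields a DebugMode_ again.
constexpr DebugMode_ operator|(DebugMode_ lhs, DebugMode_ rhs)
{
    return static_cast<DebugMode_>(
                static_cast<int>(lhs) | static_cast<int>(rhs));
}

// Hypothetical helper: splits a mode into the two switches it encodes.
struct DebugFlags
{
    bool debug;
    bool actionCases;
};

constexpr DebugFlags decode(DebugMode_ mode)
{
    return DebugFlags{ (mode & ON) != 0, (mode & ACTIONCASES) != 0 };
}
```

With these definitions decode(ON | ACTIONCASES) reports both switches as
set, whereas decode(OFF) reports neither.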
- LTYPE_:
The parser's location type (user-definable). Available only when
either %lsp-needed, %ltype or %locationstruct has been
declared.
- STYPE_:
The parser's stack-type (user-definable), defaults to int.
- Tokens_:
The enumeration type of all the symbolic tokens defined in the grammar
file (i.e., bisonc++'s input file). The scanner should be prepared to
return these symbolic tokens. Note that, since the symbolic tokens are
defined in the parser's class and not in the scanner's class, the
lexical scanner must prefix the parser's class name to the symbolic
token names when they are returned. E.g., return Parser::IDENT
should be used rather than return IDENT.
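The following sketch illustrates the prefixing requirement with a small
hand-written scanner. The Parser struct is a stand-in, the token names
IDENT and NUMBER and their values are hypothetical, and the Scanner class
is not the interface of any particular scanner generator:

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <utility>

struct Parser                       // stand-in for the generated class
{
    enum Tokens_
    {
        IDENT = 257,                // hypothetical token values
        NUMBER
    };
};

class Scanner
{
    std::string d_input;
    std::size_t d_pos = 0;
    std::string d_matched;

    public:
        explicit Scanner(std::string input)
        :
            d_input(std::move(input))
        {}

        int lex();
        std::string const &matched() const { return d_matched; }
};

int Scanner::lex()
{
    auto at = [&](std::size_t idx)
              {
                  return static_cast<unsigned char>(d_input[idx]);
              };

    while (d_pos < d_input.size() && std::isspace(at(d_pos)))
        ++d_pos;                    // skip white space

    if (d_pos == d_input.size())
        return 0;                   // zero signals end of input

    d_matched.clear();

    if (std::isalpha(at(d_pos)))
    {
        while (d_pos < d_input.size() && std::isalnum(at(d_pos)))
            d_matched += d_input[d_pos++];
        return Parser::IDENT;       // symbolic token: class name prefixed
    }

    if (std::isdigit(at(d_pos)))
    {
        while (d_pos < d_input.size() && std::isdigit(at(d_pos)))
            d_matched += d_input[d_pos++];
        return Parser::NUMBER;
    }

    d_matched = d_input[d_pos];
    return at(d_pos++);             // plain character token
}
```

For the input "x1 + 42" successive calls of lex return Parser::IDENT, the
character '+', Parser::NUMBER, and finally 0.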
- int parse():
The parser's parsing member function. It returns 0 when parsing was
successfully completed; 1 if errors were encountered while parsing the
input.
- void setDebug(bool mode):
This member can be used to activate or deactivate the debug-code
compiled into the parsing function. It is always defined but is only
operational if the debug directive option was specified when bisonc++
generated the parse function. If so, it is not active by
default; to activate the debug output call setDebug(true), to
suppress the debug output call setDebug(false).
- void setDebug(DebugMode_ mode):
This member can also be used to activate or deactivate the debug-code
compiled into the parsing function. Like setDebug(bool) it is
always defined but only operational if the debug directive option
was specified when bisonc++ generated the parse function. If so, it
is not active by default; to activate, call
setDebug(Parser::ON), setDebug(Parser::ACTIONCASES), or
setDebug(Parser::ON | Parser::ACTIONCASES). To suppress the
debug code output call setDebug(Parser::OFF) or simply
setDebug(false).
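A calling program typically combines setDebug and parse as sketched
below. StubParser is a hypothetical stand-in offering only the two public
members used here, and the function run is an illustrative helper, not
part of the generated code:

```cpp
#include <cstdlib>

// Hypothetical stand-in: parse() merely reports a preset result.
class StubParser
{
    bool d_debug = false;
    bool d_fail;

    public:
        explicit StubParser(bool fail)
        :
            d_fail(fail)
        {}

        void setDebug(bool mode) { d_debug = mode; }
        bool debug() const       { return d_debug; }
        int parse()              { return d_fail ? 1 : 0; }
};

// Maps parse's return value (0: success, 1: errors) to an exit status.
template <typename ParserType>
int run(ParserType &parser, bool debug)
{
    parser.setDebug(debug);
    return parser.parse() == 0 ? EXIT_SUCCESS : EXIT_FAILURE;
}
```

In a real program main could simply return run(parser, debug), where
debug is derived from, e.g., a command-line option.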
When the %polymorphic directive is used:
- Meta_:
Templates and classes that are required for implementing the
polymorphic semantic values are all declared in the Meta_
namespace. The Meta_ namespace itself is nested under the
namespace that may have been declared by the %namespace
directive.
- Tag_:
The (strongly typed) enum class Tag_ contains all the
tag-identifiers specified by the %polymorphic directive. It is
declared outside of the Parser's class, but within the namespace
that may have been declared by the %namespace directive.
PRIVATE ENUMS AND -TYPES
The following enumerations and types can be used by members of parser
classes generated by bisonc++. They are actually protected members inherited from
the parser's base class.
- Base::ErrorRecovery_:
This enumeration defines one value:
UNEXPECTED_TOKEN_
When the parsing process throws UNEXPECTED_TOKEN_ the recovery
procedure is started (i.e., it is started whenever a syntactic error
is encountered or ERROR() is called).
The recovery procedure consists of (1) looking for the first state on
the state-stack having an error-production, followed by (2) handling
all state transitions that are possible without retrieving a terminal
token. Then, in the state requiring a terminal token and starting with
the initial unexpected token (3) all subsequent terminal tokens are
ignored until a token is retrieved which is a continuation token in
that state.
If the error recovery procedure fails (i.e., if no acceptable token is
ever encountered) error recovery falls back to the default recovery
mode: the parsing process terminates.
- Base::Return_:
This enumeration defines two values:
PARSE_ACCEPT_ = 0,
PARSE_ABORT_ = 1
(which are also used as the parse function's return values).
When the %polymorphic directive is used:
- Meta_::sizeofTag_:
sizeofTag_ defines the number of tags that were defined for
polymorphic semantic values.
PRIVATE MEMBER FUNCTIONS
The following members can be used by members of parser classes generated
by bisonc++. When prefixed by Base:: they are actually protected members
inherited from the parser's base class. These members are shown
below. Following the description of those members several more are listed:
those members are used during the parsing process, and should not be modified
or masked by user-defined code.
- void Base::ABORT() const throw(Return_):
This member can be called from any member function (called from any of
the parser's action blocks) to indicate a failure while parsing thus
terminating the parsing function with an error value 1. Note that this
offers a marked extension and improvement of the macro YYABORT
defined by bison++ in that YYABORT could not be called from
outside of the parsing member function.
- void Base::ACCEPT() const throw(Return_):
This member can be called from any member function (called from any of
the parser's action blocks) to indicate successful parsing and thus
terminating the parsing function. Note that this offers a marked
extension and improvement of the macro YYACCEPT defined by
bison++ in that YYACCEPT could not be called from outside of
the parsing member function.
- void Base::ERROR() const throw(ErrorRecovery_):
This member can be called from any member function (called from any of
the parser's action blocks) to generate an error, and results in the
parser executing its error recovery code. Note that this offers a
marked extension and improvement of the macro YYERROR defined by
bison++ in that YYERROR could not be called from outside of
the parsing member function.
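The signatures of ABORT and ACCEPT suggest that they end the parsing
function by throwing a Return_ value. The sketch below shows how that
mechanism can work in principle. MiniParser, its check helper and its
vector-driven parse function are hypothetical simplifications, not the
generated code:

```cpp
#include <vector>

enum Return_
{
    PARSE_ACCEPT_ = 0,
    PARSE_ABORT_  = 1
};

class MiniParser
{
    void ABORT() const  { throw PARSE_ABORT_; }
    void ACCEPT() const { throw PARSE_ACCEPT_; }

    // A helper called from an "action": it may end parsing although it
    // is not itself the parsing function.
    void check(int value) const
    {
        if (value < 0)
            ABORT();
    }

    public:
        int parse(std::vector<int> const &values);
};

int MiniParser::parse(std::vector<int> const &values)
try
{
    for (int value: values)
    {
        check(value);               // negative value: abort
        if (value == 0)
            ACCEPT();               // zero: stop early, successfully
    }
    return PARSE_ACCEPT_;
}
catch (Return_ ret)                 // thrown by ABORT or ACCEPT
{
    return ret;
}
```

Because the value travels as an exception, the call may be nested at any
depth below the parsing function, which the YYABORT and YYACCEPT macros
could not offer.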
- void error():
By default implemented inline in the parser.ih internal header file,
it writes a simple message to the standard error stream. It is called
when a syntactic error is encountered, and its default implementation
may safely be altered.
- void exceptionHandler(std::exception const &exc):
This member's default implementation is provided inline in the
parser.ih internal header file. It consists of a mere throw
statement, rethrowing a caught exception.
The parse member function's body essentially consists of a
while statement, in which the next token is obtained via the
parser's lex member. This token is then processed according to the
current state of the parsing process. This may result in executing
actions over which the parsing process has no control and which may
result in exceptions being thrown.
Such exceptions do not necessarily have to terminate the parsing
process: they could be thrown by code, linked to the parser, that
simply checks for semantic errors (like divisions by zero)
throwing exceptions if such errors are observed.
The member exceptionHandler receives and may handle such
exceptions without necessarily ending the parsing process. It receives
any std::exception thrown by the parser's actions, as though the
action block itself was surrounded by a try ... catch statement.
It is of course still possible to use an explicit try ... catch
statement within action blocks. However, exceptionHandler can
be used to factor out code that is common to various action blocks.
The next example shows an explicit implementation of
exceptionHandler: any std::exception thrown by the parser's
action blocks is caught, showing the exception's message, and
increasing the parser's error count. After this parsing continues as
if no exception had been thrown:
    void Parser::exceptionHandler(std::exception const &exc)
    {
        std::cout << exc.what() << '\n';    // show the exception's message
        ++d_nErrors_;                       // count it as a parse error
    }
- int lex():
By default implemented inline in the parser.ih internal header file,
it can be pre-implemented by bisonc++ using the scanner option or
directive (see bisonc++(1)); alternatively it must be implemented by the
programmer. It interfaces to the lexical scanner, and should return the
next token produced by the lexical scanner, either as a plain character
or as one of the symbolic tokens defined in the Parser::Tokens_
enumeration. Zero or negative token values are interpreted as `end of
input'.
- void print():
By default implemented inline in the parser.ih internal header file,
this member calls print_ to display the last received token and
corresponding matched text. The print_ member is only implemented
if the --print-tokens option or %print-tokens directive was
used when the parsing function was generated. Calling print_ from
print is unconditional, but can easily be controlled by the using
program, by defining, e.g., a command-line option.
- size_t stackSize_() const:
Returns the current number of elements in the parser's state-stack.
- size_t state_() const:
Returns the current parsing-state.
- bool Base::recovery_() const:
Returns true while recovering from a syntax error.
- int Base::token_() const:
Returns the currently considered token.
The following members are required during the parsing process. They should not
be modified or masked by user-defined code:
- Base::ParserBase()
- void Base::clearin_()
- void errorRecovery_()
- void Base::errorVerbose_()
- void executeAction_(int)
- int lex_(int token)
- int Base::lookup_()
- LTYPE_ const &lsp_(int) const
(only available when %lsp-needed, %ltype or
%locationstruct was specified).
- void nextCycle_()
- void nextToken_()
- void Base::pop_()
- void Base::popToken_()
- void print_()
- void Base::push_()
- void Base::pushToken_()
- void Base::shift_(int state)
- void Base::redoToken_(int rule)
- void Base::reduce_(int rule)
- void Base::savedToken_()
- void Base::symbol_()
- void Base::startRecovery_()
- void Base::top_()
- int Base::token_() const
- void Base::vs_(int idx)
PRIVATE DATA MEMBERS
The following data members can be used by members of parser classes
generated by bisonc++. All data members are actually protected members inherited
from the parser's base class.
- size_t d_acceptedTokens_:
Counts the number of accepted tokens since the start of the parse()
function or since the last detected syntactic error. It is initialized
to d_requiredTokens_ to allow an early error to be detected as
well.
- bool d_actionCases_:
When the debug option has been specified, this variable (false
by default) determines whether the number of the action block that is
about to be executed by the parser's member executeAction_ is
written to the standard output stream.
- bool d_debug_:
When the debug option has been specified, this variable (true
by default) determines whether debug information is actually
displayed.
- LTYPE_ d_loc_:
The location type value associated with a terminal token. It can be
used by, e.g., lexical scanners to pass location information of a
matched token to the parser in parallel with a returned token. It is
available only when %lsp-needed, %ltype or %locationstruct has
been defined.
Lexical scanners may be offered the facility to assign a value to this
variable in parallel with a returned token. In order to allow a
scanner access to d_loc_, d_loc_'s address should be passed
to the scanner. This can be realized, for example, by defining a
member void setLoc(LTYPE_ *) in the lexical scanner, which is
then called from the parser's constructor as follows:
    d_scanner.setLoc(&d_loc_);
Subsequently, the lexical scanner may assign a value to the parser's
d_loc_ variable through the pointer to d_loc_ stored inside
the lexical scanner.
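A compact sketch of this arrangement follows. The LTYPE_ struct shown
here is a hypothetical two-field simplification, the token value 257 is
arbitrary, and both classes are stand-ins for the real scanner and the
generated parser:

```cpp
struct LTYPE_                       // hypothetical, simplified location
{
    int first_line = 0;
    int first_column = 0;
};

class Scanner
{
    LTYPE_ *d_loc = nullptr;        // points at the parser's d_loc_
    int d_line = 1;

    public:
        void setLoc(LTYPE_ *loc) { d_loc = loc; }

        // Pretends that every call matches one token on a new line.
        int lex()
        {
            if (d_loc != nullptr)
            {
                d_loc->first_line = d_line;
                d_loc->first_column = 1;
            }
            ++d_line;
            return 257;             // arbitrary token value
        }
};

class Parser
{
    LTYPE_ d_loc_;
    Scanner d_scanner;

    public:
        Parser() { d_scanner.setLoc(&d_loc_); }
        Parser(Parser const &) = delete;    // the pointer must stay valid

        int lex() { return d_scanner.lex(); }
        LTYPE_ const &loc() const { return d_loc_; }
};
```

After each call of lex the parser's d_loc_ holds the location that the
scanner stored for the token just returned.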
- size_t d_nErrors_:
The number of errors counted by parse. It is initialized by the
parser's base class initializer, and is updated while parse
executes. When parse has returned it contains the total number
of errors counted by parse. Errors are not counted if suppressed
(i.e., if d_acceptedTokens_ is less than d_requiredTokens_).
- size_t d_requiredTokens_:
Defines the minimum number of accepted tokens that the parse
function must have processed before a syntactic error can be
generated.
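The interplay between d_acceptedTokens_, d_requiredTokens_ and
d_nErrors_ can be sketched as follows. ErrorCounter and its members
accepted and syntaxError are hypothetical names; the sketch only encodes
the rules stated above and is not the generated implementation:

```cpp
#include <cstddef>

class ErrorCounter
{
    std::size_t d_requiredTokens_;
    std::size_t d_acceptedTokens_;
    std::size_t d_nErrors_ = 0;

    public:
        // d_acceptedTokens_ starts at d_requiredTokens_, so that an
        // error on the very first token is counted as well.
        explicit ErrorCounter(std::size_t required)
        :
            d_requiredTokens_(required),
            d_acceptedTokens_(required)
        {}

        void accepted() { ++d_acceptedTokens_; }

        void syntaxError()
        {
            // Errors shortly after a previous error are suppressed.
            if (d_acceptedTokens_ >= d_requiredTokens_)
                ++d_nErrors_;
            d_acceptedTokens_ = 0;  // restart counting after an error
        }

        std::size_t nErrors() const { return d_nErrors_; }
};
```

With three required tokens an immediate error is counted, a second error
after just one accepted token is suppressed, and an error following three
more accepted tokens is counted again.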
- STYPE_ d_val_:
The semantic value of a returned token or nonterminal symbol. With
nonterminal tokens it is assigned a value through the action rule's
symbol $$. Lexical scanners may be offered the facility to assign
a semantic value to this variable in parallel with a returned
token. In order to allow a scanner access to d_val_,
d_val_'s address should be passed to the scanner. This can be
realized, for example, by passing d_val_'s address to the lexical
scanner's constructor.
Subsequently, the lexical scanner may assign a value to the parser's
d_val_ variable through the pointer to d_val_ stored in a
data member of the lexical scanner.
Note that in some cases this approach must be used to make
the correct semantic value available to the parser. In particular,
when a grammar state defines multiple reductions, depending on the
next token, the reduction's action only takes place following the
retrieval of the next token, thus losing the initially matched token
text.
If STYPE_ is a polymorphic semantic value, specific requirements for
assigning values to d_val_ apply.
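For the default, non-polymorphic case, passing d_val_'s address to the
scanner's constructor can be sketched as shown below. STYPE_ is taken to
be int (the documented default), the token value 257 is arbitrary, and
both classes are simplified stand-ins:

```cpp
#include <string>
#include <utility>

using STYPE_ = int;                 // the documented default stack type

class Scanner
{
    STYPE_ *d_val;                  // points at the parser's d_val_
    std::string d_text;

    public:
        Scanner(STYPE_ *val, std::string text)
        :
            d_val(val),
            d_text(std::move(text))
        {}

        // Pretends the whole text is one number token.
        int lex()
        {
            *d_val = std::stoi(d_text);     // store the semantic value
            return 257;                     // arbitrary token value
        }
};

class Parser
{
    STYPE_ d_val_ = 0;
    Scanner d_scanner;

    public:
        explicit Parser(std::string text)
        :
            d_scanner(&d_val_, std::move(text))
        {}
        Parser(Parser const &) = delete;    // the pointer must stay valid

        int lex() { return d_scanner.lex(); }
        STYPE_ value() const { return d_val_; }
};
```

Once lex has returned, the semantic value is available in d_val_ even if
the scanner's matched text is later overwritten by a look-ahead token.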
BUGS
With bisonc++ version 6.00.00 the following members were modified. Where
necessary alternatives are mentioned:
- bool Base::debug() const: use the d_debug_ data member;
- void error(char const *): replaced by void error();
- void exceptionHandler_(std::exception const &exc): omit the final
underscore: void exceptionHandler(std::exception const
&exc)
- void executeAction(int): add one underscore to the
declaration in the parser class interface: void
executeAction_(int)
- int lookup(bool): omit this member from the parser class
interface.
- void nextCycle_(int): add this member declaration to the
parser class interface.
- void nextToken(int): add one underscore to the declaration
in the parser class interface: void nextToken_(int)
- size_t d_nextToken_: removed from the interface.
- int d_state_: use state_().
- int d_token_: use token_().
- LTYPE_ d_vsp_: removed from the interface. Use vsp_()
instead.
TYPES AND VARIABLES IN THE ANONYMOUS NAMESPACE
In the file defining the parse function the following types and
variables are defined in the anonymous namespace. These are mentioned here for
the sake of completeness, and are not normally accessible to other parts of
the parser.
- char const author[]:
Defining the name and e-mail address of Bisonc++'s author.
- Reserved_:
This enumeration defines some token values used internally by the
parsing functions. They are:
UNDETERMINED_ = -2,
EOF_ = -1,
errTok_ = 256,
These tokens are used by the parser to determine whether another token
should be requested from the lexical scanner, and to handle
error-conditions.
- StateType:
This enumeration defines the state types used internally by the
parsing functions. They are:
NORMAL,
ERR_ITEM,
REQ_TOKEN,
ERR_REQ, // ERR_ITEM | REQ_TOKEN
DEF_RED, // state having default reduction
ERR_DEF, // ERR_ITEM | DEF_RED
REQ_DEF, // REQ_TOKEN | DEF_RED
ERR_REQ_DEF // ERR_ITEM | REQ_TOKEN | DEF_RED
These values are used by the parser to define the types of the various
states of the analyzed grammar.
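The comments in the list above describe the combined values as bitwise
combinations. Assuming the enumerators receive the default values 0
through 7 in the order listed (an assumption; the generated file is
authoritative), those identities hold automatically, as this sketch
verifies. The three query functions are hypothetical helpers:

```cpp
enum StateType
{
    NORMAL,
    ERR_ITEM,
    REQ_TOKEN,
    ERR_REQ,
    DEF_RED,
    ERR_DEF,
    REQ_DEF,
    ERR_REQ_DEF
};

// With default numbering the basic types are the bits 1, 2 and 4.
static_assert(ERR_REQ == (ERR_ITEM | REQ_TOKEN), "ERR_REQ");
static_assert(ERR_DEF == (ERR_ITEM | DEF_RED), "ERR_DEF");
static_assert(REQ_DEF == (REQ_TOKEN | DEF_RED), "REQ_DEF");
static_assert(ERR_REQ_DEF == (ERR_ITEM | REQ_TOKEN | DEF_RED),
              "ERR_REQ_DEF");

// Hypothetical helpers testing the individual properties of a state.
constexpr bool hasErrorItem(StateType type)
{
    return (type & ERR_ITEM) != 0;
}

constexpr bool requiresToken(StateType type)
{
    return (type & REQ_TOKEN) != 0;
}

constexpr bool hasDefaultReduction(StateType type)
{
    return (type & DEF_RED) != 0;
}
```

A state's properties can thus be tested with a single bitwise-and
operation.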
- StateTransition:
This enumeration only defines a single symbolic constant: ACCEPT_,
which is used in the state transition tables to indicate that the
accepting state has been reached.
- PI_ (Production Info):
This struct provides information about production rules. It has two
fields: d_nonTerm is the identification number of the production's
nonterminal, d_size represents the number of elements of the
production rule.
- static PI_ s_productionInfo:
Used internally by the parsing function.
- SR_ (Shift-Reduce Info):
This struct provides the shift/reduce information for the various
grammatic states. SR_ values are collected in arrays, one array
per grammatic state. These arrays, named s_<nr>,
where <nr> is a state number, are defined in the anonymous namespace
as well. The SR_ elements consist of two unions,
defining fields that are applicable to, respectively, the first,
intermediate and the last array elements.
The first element of each array consists of (1st field) a StateType
and (2nd field) the index of the last array element;
intermediate elements consist of (1st field) a symbol value and (2nd
field) (if negative) the production rule number reducing to the
indicated symbol value or (if positive) the next state when the symbol
given in the 1st field is the current token;
the last element of each array consists of (1st field) a placeholder for
the current token and (2nd field) the (negative) rule number to reduce
to by default or the (positive) number of an error-state to go to when
an erroneous token has been retrieved. If the 2nd field is zero, no
error or default action has been defined for the state, and
error-recovery is attempted.
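The array layout described above lends itself to a sentinel search, as
sketched below. This is an illustration under assumptions: the struct SR
uses two plain int fields instead of unions, the tables s_0 and s_1 and
their contents are made up, and the function lookup is not the generated
lookup_ member:

```cpp
struct SR                           // simplified: plain ints, no unions
{
    int first;
    int second;
};

// First element: { state type, index of last element }
// Intermediate:  { symbol, next state (> 0) or -rule number (< 0) }
// Last element:  { placeholder for the token, default action or 0 }
SR s_0[] =
{
    { 2,   3 },                     // token required; last index is 3
    { 257, 4 },                     // on token 257: go to state 4
    { '+', -2 },                    // on '+': reduce by rule 2
    { 0,   0 },                     // no default: error recovery
};

SR s_1[] =
{
    { 4,   2 },                     // state having a default reduction
    { ';', 5 },                     // on ';': go to state 5
    { 0,   -3 },                    // otherwise: reduce by rule 3
};

// Returns the action for token in the given state's array.
int lookup(SR *state, int token)
{
    SR *last = state + state[0].second;
    last->first = token;            // sentinel: the search always ends

    SR *elem = state + 1;
    while (elem->first != token)
        ++elem;

    return elem->second;            // last element: default action or 0
}
```

Storing the current token in the last element's first field guarantees
that the loop terminates without a separate bounds check.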
- STACK_EXPANSION_:
An enumeration value specifying the number of additional elements that
are added to the state- and semantic value stacks when full.
- static SR_ s_<nr>[]:
Here, <nr> is a numerical value representing a state number.
Used internally by the parsing function.
- static SR_ *s_state[]:
Used internally by the parsing function.
SEE ALSO
bison(1), bison++(1),
bisonc++(1), bisonc++input(7),
bison.info (using texinfo),
flexc++(1),
https://fbb-git.github.io/bisoncpp/
Lakos, J. (2001) Large Scale C++ Software Design, Addison Wesley.
Aho, A.V., Sethi, R., Ullman, J.D. (1986) Compilers, Addison Wesley.
AUTHOR
Frank B. Brokken (f.b.brokken@rug.nl).