Most of these base-class members might also be defined directly in the
parser class, but were defined in the parser's base-class. This design results
in a very lean parser class, declaring only members that are actually defined
by the programmer or that have to be defined by bisonc++ itself (e.g., the
member function parse as well as some support functions requiring access
to facilities that are only available in the parser class itself, rather than
in the parser's base class).
This design does not require any virtual members: the members which are
not involved in the actual parsing process may always be (re)implemented
directly by the programmer. Thus there is no need to apply or define virtual
member functions.
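The effect of this design can be illustrated with a minimal sketch. The
classes and the member describe shown below are hypothetical stand-ins, not
code generated by bisonc++: the derived class simply hides a non-virtual
base-class member, so no virtual table is needed.

```cpp
#include <string>
#include <type_traits>

// Hypothetical stand-in for a generated base class. Its members are
// ordinary (non-virtual) member functions.
class ParserBase
{
    protected:
        std::string describe() const    // not involved in the parsing itself
        {
            return "base";
        }
};

// Hypothetical stand-in for the lean parser class.
class Parser: public ParserBase
{
    public:
        std::string describe() const    // hides ParserBase::describe
        {
            return "parser";
        }

        std::string baseDescribe() const    // the base version stays reachable
        {
            return ParserBase::describe();
        }
};

// Neither class contains virtual members.
static_assert(!std::is_polymorphic<Parser>::value,
              "Parser has no virtual members");
```

Calling describe on a Parser object selects the parser's own version at
compile time; the base-class version remains available through explicit
qualification.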
AUTHOR: at the end of this man-page.
The bisonc++input(7) man-page covers the details of the
grammar-specification file. This man-page offers these sections:
- DESCRIPTION: a short description of bisonc++ and its grammar
file(s);
- DIRECTIVES: bisonc++'s grammar-specification directives;
- POLYMORPHIC SEMANTIC VALUES: how to use polymorphic semantic
values in parsers generated by bisonc++;
- DOLLAR NOTATIONS: available $-shorthand notations with single,
union, and polymorphic semantic value types;
- RESTRICTIONS ON TOKEN NAMES: name restrictions for user-defined
symbols;
- OBSOLETE SYMBOLS: symbols available to bison(1), but not
to bisonc++;
- EXAMPLE: an example of using bisonc++;
- USING PARSER-CLASS SYMBOLS IN LEXICAL SCANNERS: how to refer
to tokens defined in the grammar from within a lexical scanner;
- SEE ALSO: references to other programs and documentation;
- AUTHOR: at the end of this man-page.
The bisonc++api(3) man-page describes the application programmer's
interface, containing these sections:
- DESCRIPTION: a short description of bisonc++ and its application
programmer's interface;
- PUBLIC SYMBOLS: constructor, enums, members, and types that can
be used by calling software;
- PRIVATE ENUMS AND -TYPES: enumerations and types only
available to the Parser class;
- PRIVATE MEMBER FUNCTIONS: member functions that are only
available to the Parser class;
- PRIVATE DATA MEMBERS: data members that are only available to
the Parser class;
- TYPES AND VARIABLES IN THE ANONYMOUS NAMESPACE: an overview of
the types and variables that are used to define and store the
grammar-tables generated by bisonc++;
- SEE ALSO: references to other programs and documentation;
- AUTHOR: at the end of this man-page.
FROM BISONC++ < 6.00.00 TO BISONC++ >= 6.00.00
This section is only relevant when re-generating parser code previously
generated by bisonc++ versions before 6.00.00.
Bisonc++ version 6.00.00 generates code that significantly differs from code
generated by earlier versions. The identifiers of all members (both data and
functions) that are generated by bisonc++ and accessible to the generated
parser-class end in an underscore character. Member functions whose
identifiers end in an underscore are `owned' by bisonc++, are rewritten each
time bisonc++ is run, and should not be modified. Some members are defined as
members of the generated parser-class, and are declared in the parser class
header file (e.g., parser.h) and some members are given default
implementations in the parser's internal header file (e.g.,
parser.ih). Once generated, these files are left alone by
bisonc++. Therefore, when using bisonc++ version 6.00.00 or beyond to re-generate a
parser which was originally generated by an earlier bisonc++ version, the
existing parser header and internal header files need some minor
modifications:
- void error(char const *) was changed to void error(). A
default implementation is provided in the parser's internal header file. The
current implementation directly inserts the text Syntax error into the
standard output stream;
- void exceptionHandler_(std::exception const &exc) was changed to
void exceptionHandler(std::exception const &exc). A
default implementation is provided in the parser's internal header file, and
only its trailing underscore characters need to be removed;
- int lookup(bool recovery): remove this member declaration from the
previously generated parser class;
- The following members are declared without a trailing underscore
character in the generated parser class. An underscore character should
be added to their identifiers: executeAction, errorRecovery, nextToken;
- The member void nextCycle_() must be declared in the private
section of the generated parser class.
Previously, several data members of the parser's base class were directly
accessible to the parser class. Bisonc++ version 6.00.00 restricts access to
those members. They can still be read, but no longer modified by the parser
class. This applies to the following members:
- d_token_: use int token_() instead;
- d_state_: use size_t state_() instead.
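The read-only access described above might look as follows. This is a
reduced, hypothetical model, not the code generated by bisonc++: only the
names d_token_, d_state_, token_ and state_ come from this man-page, while
the const qualifiers, the initial values and the members inInitialState and
currentToken are assumptions made for the sketch.

```cpp
#include <cstddef>

// Reduced, hypothetical model of the base class: the data members are
// private, and read-only accessors are offered to the derived parser class.
class ParserBase
{
    int d_token_ = 0;                   // assumed initial value
    std::size_t d_state_ = 0;           // assumed initial value

    protected:
        int token_() const
        {
            return d_token_;
        }

        std::size_t state_() const
        {
            return d_state_;
        }
};

class Parser: public ParserBase
{
    public:
        // Code that used to read d_state_ directly now calls state_().
        bool inInitialState() const
        {
            return state_() == 0;
        }

        // Code that used to read d_token_ directly now calls token_().
        int currentToken() const
        {
            return token_();
        }
};
```

In such a setup an assignment to d_token_ or d_state_ inside Parser no
longer compiles, which is how existing code that modifies these members
shows up when a parser is re-generated.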
OPTIONS
Where available, single-letter options are listed between parentheses
after their associated long-option variants. Single-letter options require
arguments if their associated long options also require arguments. Options
affecting the class- or implementation header files are ignored if these
files already exist. Options accepting a `filename' do not accept path names,
i.e., they cannot contain directory separators (/); options accepting a
`pathname' may contain directory separators.
Some options may cause errors. This happens when they conflict with the
contents of a file which bisonc++ cannot modify (e.g., a --namespace option
was specified, but an existing parser class header file doesn't define a
namespace).
To resolve the error, the offending option could be omitted, the existing
file could be removed, or the existing file could be hand-edited according to
the option's specification.
Note that bisonc++ currently does not handle the opposite error condition: if a
previously used option is omitted, then bisonc++ does not report an
inconsistency. In those cases compilation errors may be observed.
- --analyze-only (-A)
Only analyze the grammar. No files are (re)written. This option can
be used to test the grammatical correctness of modifications `in
situ', without overwriting previously generated files. If the
grammar contains syntactic errors, only syntax analysis is
performed.
- --baseclass-header=filename (-b)
Filename defines the name of the file to contain the parser's
base class. This class defines, e.g., the parser's symbolic
tokens. Defaults to the name of the parser class plus the suffix
base.h. It is generated, unless otherwise indicated (see
--no-baseclass-header and --dont-rewrite-baseclass-header
below).
It is an error if this option is used and an already
existing parser class header file does not contain #include
"filename".
- --baseclass-preinclude=pathname (-H)
Pathname defines the path to the file preincluded in the
parser's base-class header. This option is needed in situations
where the base class header file refers to types which might not
yet be known. E.g., with polymorphic semantic values a
std::string value type might be used. Since the string
header file is not by default included in parserbase.h we
somehow need to inform the compiler about this and possibly other
headers. The suggested procedure is to use a pre-include header
file declaring the required types. By default `header' is
surrounded by double quotes: #include "header" is used when
the option -H header is specified. When the argument is
surrounded by angle brackets, #include <header> is
used. In the latter case, quotes might be required to prevent
interpretation by the shell (e.g., using -H '<header>').
- --baseclass-skeleton=pathname (-B)
Pathname defines the path name to the file containing the
skeleton of the parser's base class. It defaults to the
installation-defined default path name (e.g.,
/usr/share/bisonc++/ plus bisonc++base.h).
- --class-header=filename (-c)
Filename defines the name of the file to contain the parser
class. Defaults to the name of the parser class plus the suffix
.h.
It is an error if this option is used and an already
existing implementation header file does not contain #include
"filename".
- --class-name className
Defines the name of the C++ class that is generated. If
neither this option nor the %class-name directive is
specified, then the default class name (Parser) is used.
It is an error if this option is used and className differs
from the name of the class that is defined in an already existing
parser-class header file and/or if an already existing
implementation header file does not define members of the class
`className'.
- --class-skeleton=pathname (-C)
Pathname defines the path name to the file containing the
skeleton of the parser class. It defaults to the
installation-defined default path name (e.g.,
/usr/share/bisonc++/ plus bisonc++.h).
- --construction
Details about the construction of the parsing tables are written to
the same file as written by the --verbose option (i.e.,
<grammar>.output, where <grammar> is the input file read
by bisonc++). This information is primarily useful for developers. It
augments the information written to the verbose grammar output
file, generated by the --verbose option.
- --debug
Provide the generated parse and its support functions with
debugging code, optionally showing the actual parsing process on
the standard output stream. When included, the debugging output is
active by default, but its activity may be controlled using the
setDebug(bool on-off) member. Bisonc++ does not use #ifdef
DEBUG macros. Rerun bisonc++ without the --debug option to
remove the debugging code.
Note that this option does not show the actions of bisonc++'s own
parser, its own lexical scanner or merely the numbers of the
case-entries executed by the parser's parse function. If that
is what you want, use the --own-debug, --action-cases,
--scanner-debug and/or --own-tokens options.
When polymorphic semantic values are used (see the section POLYMORPHIC
SEMANTIC VALUES in bisonc++input(7))
then the generated parser might attempt to retrieve an incorrect
polymorphic value. In that case a fatal error is generated, ending
bisonc++'s run. The error message itself cannot refer to the action
block where the error occurred. If this situation is encountered,
rerun bisonc++, specifying --debug and call
parser.setDebug(Parser::ACTIONCASES): as a debugging aid the
generated parser then shows the executeAction switch's case entry
numbers just before their execution.
- --default-actions=off|quiet|warn (-d)
When warn is specified (which is the default) an action block
executing $$ = $1 (or $$ = STYPE_{} for empty production
rules) is added to production rules that do not explicitly define
their own final action blocks, while issuing a warning. When
quiet is specified these action blocks are silently added. It
is an error when the types of $$ and $1 differ (but it is OK if
neither $$ nor $1 is associated with a specific type). When
off is specified no action blocks are added (a warning is
issued, unless the option/directive tag-mismatches off has
been specified).
- --error-verbose
When a syntactic error is reported, the generated parse function
dumps the parser's state stack to the standard output
stream. The stack dump shows on separate lines a stack index
followed by the state stored at the indicated stack element. The
first stack element is the stack's top element.
- --filenames=filename (-f)
Filename is a generic file name that is used for all header
files generated by bisonc++. Options defining specific file names are
also available (which then, in turn, overrule the name specified
by this option).
- --flex
Bisonc++ generates code calling d_scanner.yylex() to obtain the
next lexical token, and calling d_scanner.YYText() for the
matched text, unless overruled by options or directives explicitly
defining these functions. By default, the interface defined by
flexc++(1) is used. This option is only interpreted if the
--scanner option or %scanner directive is also used.
- --help (-h)
Write basic usage information to the standard output stream and
terminate.
- --implementation-header=filename (-i)
Filename defines the name of the file to contain the
implementation header. It defaults to the name of the generated
parser class plus the suffix .ih.
The implementation header should contain all directives and
declarations only used by the implementations of the parser's
member functions. It is the only header file that is included by
the source file containing parse's implementation. User-defined
implementations of other class members may use the same
convention, thus concentrating all directives and declarations
that are required for the compilation of other source files
belonging to the parser class in one header file.
- --implementation-skeleton=pathname (-I)
Pathname defines the path name to the file containing the
skeleton of the implementation header. It defaults to the
installation-defined default path name (e.g.,
/usr/share/bisonc++/ plus bisonc++.ih).
- --insert-stype
This option is only effective if the debug option (or
%debug directive) has been specified. When insert-stype
has been specified the parsing function's debug output also shows
selected semantic values. It should only be used if objects or
variables of the semantic value type STYPE_ can be inserted
into ostreams.
- --max-inclusion-depth=value
Set the maximum number of nested grammar files. Defaults to 10.
- --namespace identifier
Define all of the code generated by bisonc++ in the namespace
identifier. By default no namespace is defined. If this
option is used the implementation header is provided with a
commented out using namespace declaration for the specified
namespace. In addition, the parser and parser base class
header files also use the specified namespace to define their
include guard directives.
It is an error if this option is used and an already existing
parser-class header file and/or implementation header file does
not define namespace identifier.
- --no-baseclass-header
Do not write the file containing the parser class' base class, even
if that file doesn't yet exist. By default the file containing the
parser's base class is (re)written each time bisonc++ is called. Note
that this option should normally be avoided, as the base class
defines the symbolic terminal tokens that are returned by the
lexical scanner. When the construction of this file is suppressed,
modifications of these terminal tokens are not communicated to the
lexical scanner.
- --no-decoration (-D)
Do not include user-defined or default actions when generating the
parser's parse member. This effectively generates a parser
which merely performs syntax checks, without performing the
actions which are normally executed when rules have been
matched. This may be useful in situations where a (partially or
completely) decorated grammar is reorganized, and the syntactic
correctness of the modified grammar must be verified, or in
situations where the grammar has already been decorated, but
functions which are called from the rules' actions have not yet
been implemented.
- --no-lines
Do not put #line preprocessor directives in the file containing
the parser's parse function. By default the file containing
the parser's parse function also contains #line
preprocessor directives. These directives allow the compiler and
debuggers to associate errors with lines in your grammar
specification file; when this option is used, errors are associated
with the source file containing the parse function itself.
- --no-parse-member
Do not write the file containing the parser's predefined parser
member functions, even if that file doesn't yet exist. By default
the file containing the parser's parse member function is
(re)written each time bisonc++ is called. Note that this option
should normally be avoided, as this file contains parsing
tables which are altered whenever the grammar definition is
modified.
- --own-debug
Displays the actions performed by bisonc++'s parser when it processes
the grammar specification file(s) (lots of output!). This implies
the --verbose option.
- --own-tokens (-T)
The tokens returned as well as the text matched by bisonc++'s lexical
scanner are shown when this option is used.
This option does not result in the generated parsing
function displaying returned tokens and matched text. If that is
what you want, use the --print-tokens option.
- --parsefun-skeleton=pathname (-P)
Pathname defines the path name of the file containing the
parsing member function's skeleton. It defaults to the
installation-defined default path name (e.g.,
/usr/share/bisonc++/ plus bisonc++.cc).
- --parsefun-source=filename (-p)
Filename defines the name of the source file to contain the
parser member function parse. Defaults to parse.cc.
- --polymorphic-code-skeleton=pathname (-L)
Pathname defines the path name of the file containing the
non-template members of the polymorphic Base class. It defaults
to the installation-defined default path name (e.g.,
/usr/share/bisonc++/ plus bisonc++polymorphic.code).
- --polymorphic-skeleton=pathname (-M)
Pathname defines the path name of the file containing the
skeleton of the polymorphic template classes. It defaults to the
installation-defined default path name (e.g.,
/usr/share/bisonc++/ plus bisonc++polymorphic).
- --print-tokens (-t)
The generated parsing function implements a function print_
displaying (on the standard output stream) the tokens returned by
the parser's scanner as well as the corresponding matched
text. This implementation is suppressed when the parsing function
is generated without using this option. The member print_ is
called from Parser::print, which is defined in-line in the
parser's class header. Calling Parser::print_ can thus easily
be controlled from print, using, e.g., a variable that is set by
the program using the parser generated by bisonc++.
This option does not show the tokens returned and text matched
by bisonc++ itself when it is reading its input file(s). If
that is what you want, use the --own-tokens option.
- --prompt
When adding debugging code (using the debug option or
directive) the debug information is displayed continuously while
the parser processes its input. When using the prompt option
(or directive) the generated parser displays a prompt (a question
mark) at each step of the parsing process. Caveat: when using this
option the parser's input cannot be provided at the parser's
standard input stream.
- --required-tokens=number
Following a syntactic error, require at least number
successfully processed tokens before another syntactic error can
be reported. By default number is zero.
- --scanner=pathname (-s)
Pathname defines the path name to the file defining the
scanner's class interface (e.g., "../scanner/scanner.h"). When
this option is used the parser's member int lex() is
predefined as
    int Parser::lex()
    {
        return d_scanner.lex();
    }
and an object Scanner d_scanner is composed into the parser
(but see also option scanner-class-name). The example shows
the function that's called by default. When the --flex option
(or %flex directive) is specified the function
d_scanner.yylex() is called. Any other function to call can be
specified using the --scanner-token-function option (or
%scanner-token-function directive).
By default bisonc++ surrounds pathname by double quotes (using,
e.g., #include "pathname"). When pathname is surrounded
by angle brackets, #include <pathname> is used.
It is an error if this option is used and an already existing
parser class header file does not include `pathname'.
- --scanner-class-name scannerClassName
Defines the name of the scanner class, declared by the pathname
header file that is specified at the scanner option or
directive. By default the class name Scanner is used.
It is an error if this option is used and either the
scanner option was not provided, or the parser class interface
in an already existing parser class header file does not declare a
scanner class d_scanner object.
- --scanner-debug
Show the scanner's matched rules and returned tokens. This
extensively displays the rules and tokens matched and returned by
bisonc++'s scanner, instead of just showing the tokens and matched
text which are received by bisonc++. If you want the latter, use the
option --own-tokens.
- --scanner-matched-text-function=function-call
The scanner function returning the text that was matched at the
last call of the scanner's token function. A complete function
call expression should be provided (including a scanner object, if
used). This option overrules the d_scanner.matched() call used
by default when the %scanner directive is specified, and it
overrules the d_scanner.YYText() call used when the %flex
directive is provided. Example:
--scanner-matched-text-function "myScanner.matchedText()"
- --scanner-token-function=function-call
The scanner function returning the next token, called from the
parser's lex function. A complete function
call expression should be provided (including a scanner object, if
used). This option overrules the d_scanner.lex() call used
by default when the %scanner directive is specified, and it
overrules the d_scanner.yylex() call used when the %flex
directive is provided. Example:
--scanner-token-function "myScanner.nextToken()"
It is an error if this option is used and the scanner token
function is not called from the code in an already
existing implementation header.
- --show-filenames
Writes the names of the generated files to the standard error
stream.
- --skeleton-directory=directory (-S)
Specifies the directory containing the skeleton files. In addition
to specifying a common directory for the skeleton files, the locations
of individual skeleton files can be specified using the options
-B, -C, -I, -L, -M and -P.
- --stack-expansion=size
Defines the number of elements to be added to the generated
parser's semantic value stack when it must be enlarged. By default
10 elements are added to the stack. This option/directive is
interpreted only once, and only if size at least equals the
default stack expansion size of 10.
- --tag-mismatches off|on
When on is specified (which is the default), a warning is
issued if no $$ assignment was detected in an action block, or if
adding a default $$ = ... action was suppressed (cf. the
default-actions off option or directive).
- --target-directory=pathname
Pathname defines the directory where generated files should be
written. By default this is the directory where bisonc++ is
called.
- --thread-safe
Only used with polymorphic semantic values, and then only required
when the parser is used in multiple threads: it ensures that each
thread's polymorphic code only accesses its own parser's error
counting variable.
- --usage
Writes basic usage information to the standard output stream and
terminates.
- --verbose (-V)
Writes a file containing verbose descriptions of the parser states
and what is done for each type of look-ahead token in that state.
This file also describes all conflicts detected in the grammar,
both those resolved by operator precedence and those that remain
unresolved. It is not created by default, but if requested the
information is written to <grammar>.output, where
<grammar> is the grammar specification file passed to bisonc++.
- --version (-v)
Displays bisonc++'s version number and terminates.
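The composition set up by the --scanner option can be sketched as
follows. The Scanner class shown here is a hypothetical stand-in returning
canned token values (with 0 assumed to signal the end of the input), and
countTokens merely takes the place of the generated parse member; only
d_scanner and the forwarding lex member follow the description given at the
--scanner option.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical scanner: returns canned token values, then 0 at the end.
class Scanner
{
    std::vector<int> d_tokens;
    std::size_t d_next = 0;

    public:
        explicit Scanner(std::vector<int> tokens)
        :
            d_tokens(std::move(tokens))
        {}

        int lex()
        {
            return d_next < d_tokens.size() ? d_tokens[d_next++] : 0;
        }
};

class Parser
{
    Scanner d_scanner;                  // composed scanner object

    int lex()                           // forwards to the scanner
    {
        return d_scanner.lex();
    }

    public:
        explicit Parser(Scanner scanner)
        :
            d_scanner(std::move(scanner))
        {}

        // Stand-in for parse(): merely counts the tokens obtained via lex().
        int countTokens()
        {
            int count = 0;
            while (lex() != 0)
                ++count;
            return count;
        }
};
```

When --scanner-token-function is used, only the expression in Parser::lex
changes; the rest of the parser keeps calling lex.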
QUICK START
Bisonc++ may be used as follows:
GENERATED FILES
Bisonc++ may create the following files:
- A file containing the implementation of the member function parse
and its support functions. The member parse is a public member that can be
called to parse a token sequence according to a specified LALR(1)
grammar. By default the implementations of these members are written to the
file parse.cc. The programmer should not modify the contents of this file;
it is rewritten every time bisonc++ is called.
- A file containing an initial setup of the parser class, containing
the declaration of the public member parse and of its (private) support
members. New members may safely be declared in the parser class, as it is only
created by bisonc++ if not yet existing, using the filename <parser-class>.h
(where <parser-class> is the name of the defined parser class).
- A file containing the parser class' base class. This base
class should not be modified by the programmer. It contains types defined by
bisonc++, as well as several (protected) data members and member functions, which
should not be redefined by the programmer. All symbolic parser terminal tokens
are defined in this class, thereby escalating these definitions to a separate
class (cf. Lakos (2001)), which in turn prevents circular dependencies
between the lexical scanner and the parser (here, circular dependencies may
easily be encountered, as the parser needs access to the lexical scanner class
when defining the lexical scanner as one of its data members, whereas the
lexical scanner needs access to the parser class to know about the grammar's
symbolic terminal tokens; escalation is a way out of such circular
dependencies). By default this file is (re)written any time bisonc++ is called,
using the filename <parser-class>base.h.
- A file containing an implementation header. The
implementation header rather than the parser's class header file should be
included by the parser's source files implementing member functions declared
by the programmer. The implementation header first includes the parser class's
header file, and then provides default in-line implementations for its members
error and print (which may be altered by the programmer). The member
lex may also receive a standard in-line implementation. Alternatively, its
implementation can be provided by the programmer (see below). Any directives
and/or namespace directives required for the proper compilation of the
parser's additional member functions should be declared next. The
implementation header is included by the file defining parse. By default
the implementation header is created if not yet existing, receiving the
filename <parser-class>.ih.
- A verbose description of the generated parser. This file is
comparable to the verbose output file originally generated by bison++. It
is generated when the option --verbose or -V is provided. If so, bisonc++
writes the file <grammar>.output, where <grammar> is the name of the
file containing the grammar definition.
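The escalation of the token definitions to the base class, mentioned
above, can be sketched as follows. All names and token values in this sketch
(Tokens_, NUMBER, IDENTIFIER, fetch) are illustrative assumptions; the point
is merely that the scanner depends on the base class only, while the parser
depends on both the base class and the scanner.

```cpp
// Hypothetical base class: only the symbolic tokens are defined here.
struct ParserBase
{
    enum Tokens_
    {
        NUMBER = 257,
        IDENTIFIER,
    };
};

// The scanner needs the token definitions, but not the Parser class.
class Scanner
{
    public:
        int lex()
        {
            return ParserBase::NUMBER;  // stand-in for real scanning
        }
};

// The parser needs both the base class and the scanner class.
class Parser: public ParserBase
{
    Scanner d_scanner;

    public:
        int fetch()                     // hypothetical helper
        {
            return d_scanner.lex();
        }
};
```

In a real project each class would live in its own header file: the
scanner's header includes only the base class header, so neither of the two
headers needs the Parser class header.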
FILES
- bisonc++base.h: skeleton of the parser's base class;
- bisonc++.h: skeleton of the parser class;
- bisonc++.ih: skeleton of the implementation header;
- bisonc++.cc: skeleton of the member parse;
- bisonc++polymorphic: skeleton of the declarations used by
%polymorphic;
- bisonc++polymorphic.code: skeleton of the non-inline
implementations of the members declared in bisonc++polymorphic.
- debugdecl.in: skeleton declaring members of the parser's base
class that are only required when the debug option or directive
was specified.
- debugfunctions1.in: skeleton defining the members declared in
debugdecl.in.
- debugfunctions2.in: skeleton implementing symbol_, which is
only needed when the print-tokens option or directive was
specified.
- debugfunctions3.in: skeleton implementing errorVerbose_,
which is only needed when the error-verbose option or directive was
specified.
- debugincludes.in: skeleton specifying the #include directives for
the header files that are required when the debug option
or directive was specified.
- debuglookup.in: skeleton containing extra code required in the
Parser::lookup member when the debug option or directive was
specified.
- lex.in: skeleton implementing the Parser::lex function.
- ltypedata.in: skeleton declaring the location variables.
- ltype.in: skeleton defining the default or user defined
LTYPE_.
- print.in: skeleton implementing the actions of Parser::print
if the print-tokens option or directive was specified.
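How Parser::print may control the call of print_ (see the --print-tokens
option) can be sketched as follows. This is a hypothetical model: the
members d_showTokens, d_out, showTokens and step are assumptions made for the
sketch, and print_ here only writes a fixed text where the generated member
would show the token and its matched text.

```cpp
#include <ostream>
#include <sstream>      // std::ostringstream, used when trying the sketch

class Parser
{
    bool d_showTokens = false;          // hypothetical switch
    std::ostream &d_out;                // hypothetical: keeps the sketch testable

    void print_()                       // stand-in for the generated member
    {
        d_out << "token\n";
    }

    void print()                        // decides whether print_ is called
    {
        if (d_showTokens)
            print_();
    }

    public:
        explicit Parser(std::ostream &out)
        :
            d_out(out)
        {}

        void showTokens(bool on)
        {
            d_showTokens = on;
        }

        void step()                     // stand-in for parse() calling print
        {
            print();
        }
};
```

With a std::ostringstream passed to the constructor, a call of step
produces no output until showTokens(true) has been called.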
SEE ALSO
bison(1), bison++(1),
bisonc++api(3), bisonc++input(7),
bison.info (using texinfo),
flexc++(1),
https://fbb-git.github.io/bisoncpp/
Lakos, J. (2001) Large Scale C++ Software Design, Addison Wesley.
Aho, A.V., Sethi, R., Ullman, J.D. (1986) Compilers, Addison Wesley.
BUGS
Parser-class header files (e.g., Parser.h) and parser-class internal
header files (e.g., Parser.ih) generated with bisonc++ < 6.00.00 require
several minor hand-modifications when re-generating the parser with bisonc++
>= 6.00.00. See the earlier section FROM BISONC++ < 6.00.00 TO BISONC++
>= 6.00.00 for details.
To avoid collisions with names defined by the parser's (base) class, the
following identifiers should not be used as token names:
- Identifiers ending in an underscore;
- Any of the following identifiers: ABORT, ACCEPT, ERROR,
debug, error, or setDebug.
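The two restrictions listed above can be expressed as a small check. The
function usableAsTokenName is a hypothetical helper, not part of bisonc++; it
only tests these two rules and does not verify that the name is a valid
identifier in other respects.

```cpp
#include <algorithm>
#include <array>
#include <string>

// Returns true if `name' does not end in an underscore and is not one of
// the identifiers listed above. Hypothetical helper, not part of bisonc++.
bool usableAsTokenName(std::string const &name)
{
    static std::array<char const *, 6> const reserved =
    {{
        "ABORT", "ACCEPT", "ERROR", "debug", "error", "setDebug"
    }};

    if (name.empty() || name.back() == '_')
        return false;

    return std::none_of(reserved.begin(), reserved.end(),
                [&](char const *word)
                {
                    return name == word;
                });
}
```

The comparison is case sensitive: ERROR is rejected, whereas a name like
Error passes this check.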
ABOUT bisonc++
Bisonc++ was based on bison++, originally developed by Alain
Coetmeur (coetmeur@icdc.fr), R&D department (RDT), Informatique-CDC, France,
who based his work on bison, GNU version 1.21.
Bisonc++ version 0.98 and beyond is a complete rewrite of an LALR(1) parser
generator, closely following the construction process as described in Aho,
Sethi and Ullman's (1986) book Compilers (i.e., the Dragon book). It
uses the same grammar specification as bison and bison++, and it uses
practically the same options and directives as bisonc++ versions earlier than
0.98. Variables, declarations and macros that are obsolete were removed.
Compared to bison and bison++, the number and functions of the
various %define declarations were thoroughly modified. All of
bison's %define declarations were replaced by their (former) first
arguments. Furthermore, `macro-style' declarations are not supported or
required. Finally, all directives only use lower-case characters and do not
contain underscore characters (but sometimes hyphens). E.g., %define DEBUG
is now declared as %debug; %define LSP_NEEDED is now declared as
%lsp-needed (note the hyphen).
AUTHOR
Frank B. Brokken (f.b.brokken@rug.nl).