Kea 2.7.3
Tokenizer for parsing DNS master files.
#include <master_lexer.h>
Classes | |
class | LexerError |
Exception thrown from a wrapper version of MasterLexer::getNextToken() for non-fatal errors. | |
class | ReadError |
Exception thrown when we fail to read from the input stream or file. | |
Public Types | |
enum | Options { NONE = 0 , INITIAL_WS = 1 , QSTRING = 2 , NUMBER = 4 } |
Options for getNextToken. | |
Public Member Functions | |
MasterLexer () | |
The constructor. | |
~MasterLexer () | |
The destructor. | |
const MasterToken & | getNextToken (MasterToken::Type expect, bool eol_ok=false) |
Parse the input for the expected type of token. | |
const MasterToken & | getNextToken (Options options=NONE) |
Parse and return another token from the input. | |
size_t | getPosition () const |
Return the position of lexer in the pushed sources so far. | |
size_t | getSourceCount () const |
Get number of sources inside the lexer. | |
size_t | getSourceLine () const |
Return the input source line number. | |
std::string | getSourceName () const |
Return the name of the current input source. | |
size_t | getTotalSourceSize () const |
Return the total size of pushed sources. | |
void | popSource () |
Stop using the most recently opened input source (file or stream). | |
bool | pushSource (const char *filename, std::string *error=0) |
Open a file and make it the current input source of MasterLexer. | |
void | pushSource (std::istream &input) |
Make the given stream the current input source of MasterLexer. | |
void | ungetToken () |
Return the last token back to the lexer. | |
Static Public Attributes | |
static const size_t | SOURCE_SIZE_UNKNOWN |
Special value for input source size meaning "unknown". | |
Friends | |
class | master_lexer_internal::State |
Tokenizer for parsing DNS master files.
The MasterLexer class provides tokenizing interfaces for parsing DNS master files. It understands some special rules of master files as defined in RFC 1035, such as comments, character escaping, and multi-line data, and provides the user application with the actual data in a more convenient form such as a std::string object.
In order to support the $INCLUDE notation, this class is designed to operate on multiple files or input streams in a nested way. The pushSource() and popSource() methods correspond to the push and pop operations.
While this class is public, it is less likely to be used by normal applications; it's mainly expected to be used within this library, specifically by the MasterLoader class and Rdata implementation classes.
try and catch (depending on the underlying implementation of the exception handling). For these reasons, some methods of this class do not throw for an error that would be reported as an exception in other classes.
Definition at line 303 of file master_lexer.h.
Options for getNextToken.
A compound option, indicating that multiple options are set, can be specified using the bitwise OR operator (operator|()).
Enumerator | |
---|---|
NONE | No option. |
INITIAL_WS | recognize begin-of-line spaces after an end-of-line |
QSTRING | recognize quoted string |
NUMBER | recognize numeric text as integer |
Definition at line 345 of file master_lexer.h.
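The way a compound option is built and tested can be sketched as follows. This is a minimal, self-contained illustration: the enum is a stand-in that only mirrors the enumerator values listed above, and `hasOption()` is a hypothetical helper, not part of the library.

```cpp
#include <cassert>

// Stand-in mirroring the values of MasterLexer::Options listed above.
enum Options { NONE = 0, INITIAL_WS = 1, QSTRING = 2, NUMBER = 4 };

// Combine two options into a compound option. The casts are needed because
// the built-in | on an unscoped enum yields an int, not an Options value.
inline Options operator|(Options lhs, Options rhs) {
    return static_cast<Options>(static_cast<int>(lhs) | static_cast<int>(rhs));
}

// Hypothetical helper: true if `flag` is set in the compound option `set`.
inline bool hasOption(Options set, Options flag) {
    return (static_cast<int>(set) & static_cast<int>(flag)) != 0;
}

// Usage: request both quoted-string and number recognition.
inline void optionsDemo() {
    const Options opts = QSTRING | NUMBER;
    assert(hasOption(opts, QSTRING));
    assert(hasOption(opts, NUMBER));
    assert(!hasOption(opts, INITIAL_WS));
}
```

Because each enumerator occupies a distinct bit, any combination can be passed as a single argument.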
isc::dns::MasterLexer::MasterLexer | ( | ) |
The constructor.
std::bad_alloc | Internal resource allocation fails (rare case). |
isc::dns::MasterLexer::~MasterLexer | ( | ) |
The destructor.
It internally closes any remaining input sources.
const MasterToken & isc::dns::MasterLexer::getNextToken | ( | MasterToken::Type | expect, |
bool | eol_ok = false ) |
Parse the input for the expected type of token.
This method is a wrapper of the other version, customized for the case where a particular type of token is expected as the next one. More specifically, it's intended to be used to get tokens for RDATA fields. Since most RDATA types are of fixed format, the token type is often predictable and the method interface can be simplified.
This method basically works as follows: it gets the type of the expected token, calls the other version, getNextToken(Options), and returns the token if it's of the expected type (given the usage assumption this should normally be the case). There are some non-trivial details though:
If the eol_ok parameter is true (very rare case), MasterToken::END_OF_LINE and MasterToken::END_OF_FILE are recognized and returned if they are found instead of the expected type of token.
In some very rare cases where the RDATA has an optional trailing field, the eol_ok parameter would be set to true. This way the caller can handle both cases (the field does or does not exist) with a single call to this method. In all other cases eol_ok should be set to false; that is the default, so the argument can be omitted.
Unlike the other version, getNextToken(Options), this method throws an exception of type LexerError for non-fatal errors such as broken syntax or encountering an unexpected type of token. This way the caller can write RDATA parser code without bothering to handle errors for each field. For example, pseudo parser code for MX RDATA would look like this:
In the case where a LexerError exception is thrown, it's expected to be handled comprehensively by the parser of the RDATA or at a higher layer. The token_ member variable of the corresponding LexerError exception object stores a token of type MasterToken::ERROR that indicates the reason for the error.
Due to the specific intended usage of this method, only a subset of MasterToken::Type values are acceptable for the expect parameter: MasterToken::STRING, MasterToken::QSTRING, and MasterToken::NUMBER. Specifying other values will result in an InvalidParameter exception.
InvalidParameter | The expected token type is not allowed for this method. |
LexerError | The lexer finds a non-fatal error or an unexpected type of token. |
other | Anything the other version of getNextToken() can throw. |
expect | Expected type of token. Must be either STRING, QSTRING, or NUMBER. |
eol_ok | true iff END_OF_LINE or END_OF_FILE is acceptable. |
const MasterToken & isc::dns::MasterLexer::getNextToken | ( | Options | options = NONE | ) |
Parse and return another token from the input.
It reads a bit of the last opened source and produces another token found in it.
This method does not provide the strong exception guarantee. Generally, if it throws, the object should not be used any more and should be discarded. It was decided all the exceptions thrown from here are serious enough that aborting the loading process is the only reasonable recovery anyway, so the strong exception guarantee is not needed.
options | The options can be used to modify the tokenization. This parameter can make the method report things that are usually ignored. Multiple options can be passed at once by bitwise OR (e.g. option1 | option2). See the description of available options. |
isc::InvalidOperation | in case the source is not available. This may mean that pushSource() has not been called yet, or that the current source has been read past the end. |
ReadError | in case there's a problem reading from the underlying source (e.g. an I/O error in the file on the disk). |
std::bad_alloc | in case allocation of some internal resources or the token fails. |
size_t isc::dns::MasterLexer::getPosition | ( | ) | const |
Return the position of lexer in the pushed sources so far.
This method returns the position in terms of the number of recognized characters from all sources that have been pushed by the time of the call. Conceptually, the position in a single source is the offset from the beginning of the file or stream to the current "read cursor" of the lexer. The return value of this method is the sum of the positions in all the pushed sources. If any of the sources has already been popped, the position of the source at the time of the pop operation will be used for the calculation.
If the lexer reaches the end of every pushed source, the return value should be equal to that of getTotalSourceSize(). It's generally expected that a source is popped when the lexer reaches the end of the source. So, when the application of this class parses all contents of all sources, possibly with multiple pushes and pops, the return value of this method and that of getTotalSourceSize() should be identical (unless the latter returns SOURCE_SIZE_UNKNOWN). But this is not guaranteed, as the application can pop a source in the middle of parsing it.
Before pushing any source, it returns 0.
The return values of this method and getTotalSourceSize() give the caller an idea of the progress of the lexer at the time of the call. Note, however, that since it's not predictable whether more sources will be pushed after the call, the progress determined this way may not be accurate; it can only serve as an informational hint.
Note that the conceptual "read cursor" moves backward after a call to ungetToken(), in which case this method will return a smaller value. That is, unlike those of getTotalSourceSize(), the return values of this method may not always monotonically increase.
None |
Referenced by isc::dns::MasterLoader::MasterLoaderImpl::getPosition().
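A progress hint derived from the two values could be computed along these lines. This is only a sketch: `progressPercent()` is a hypothetical helper, and `kSizeUnknown` is an assumed stand-in for SOURCE_SIZE_UNKNOWN, whose actual value is not stated in this section. In real use the arguments would be the results of getPosition() and getTotalSourceSize().

```cpp
#include <cassert>
#include <cstddef>
#include <limits>

// Assumption: stand-in for MasterLexer::SOURCE_SIZE_UNKNOWN. The real value is
// not given in this section; only that it is "sufficiently large".
const std::size_t kSizeUnknown = std::numeric_limits<std::size_t>::max();

// Returns an informational progress hint in percent (0..100), or -1 when no
// estimate is possible (nothing pushed yet, or the total size is unknown).
inline int progressPercent(std::size_t position, std::size_t total) {
    if (total == 0 || total == kSizeUnknown) {
        return -1;
    }
    if (position >= total) {
        return 100;
    }
    // Floating point avoids overflow of position * 100 for large inputs.
    return static_cast<int>(100.0 * static_cast<double>(position) /
                            static_cast<double>(total));
}

inline void progressDemo() {
    assert(progressPercent(0, 0) == -1);              // no source pushed yet
    assert(progressPercent(50, 200) == 25);
    assert(progressPercent(200, 200) == 100);
    assert(progressPercent(10, kSizeUnknown) == -1);  // size unknown
}
```

Since the position can move backward after ungetToken() and the total can grow when another source is pushed, a caller displaying this value should tolerate it decreasing.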
size_t isc::dns::MasterLexer::getSourceCount | ( | ) | const |
Get number of sources inside the lexer.
This method never throws.
size_t isc::dns::MasterLexer::getSourceLine | ( | ) | const |
Return the input source line number.
If there is an opened source, the return value will be a non-0 integer indicating the line number of the current source where the MasterLexer is currently working. The expected usage of this value is to print a helpful error message when parsing fails by specifically identifying the position of the error.
If there is no opened source at the time of the call, this method returns 0.
None |
std::string isc::dns::MasterLexer::getSourceName | ( | ) | const |
Return the name of the current input source.
If it's a file, it will be the C string given at the corresponding pushSource() call, that is, its filename. If it's a stream, it will be formatted as "stream-%p" where p is the hexadecimal representation of the address of the stream object.
If there is no opened source at the time of the call, this method returns an empty string.
std::bad_alloc | Resource allocation failed for string construction (rare case) |
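One way to produce a name of the "stream-%p" form for a stream source is sketched below. `streamSourceName()` is a hypothetical helper and not necessarily how the library builds the string; the exact text of a formatted pointer is implementation-defined, so only the prefix should be relied on.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: builds "stream-" followed by the address of the stream
// object, as formatted by the standard library for a void pointer.
inline std::string streamSourceName(const std::istream& input) {
    std::ostringstream out;
    out << "stream-" << static_cast<const void*>(&input);
    return out.str();
}

inline void streamNameDemo() {
    std::istringstream a("example.org. 3600 IN A 192.0.2.1\n");
    std::istringstream b("example.org. 3600 IN A 192.0.2.2\n");
    // The name always starts with the fixed prefix.
    assert(streamSourceName(a).compare(0, 7, "stream-") == 0);
    // The same object yields the same name; distinct live objects differ.
    assert(streamSourceName(a) == streamSourceName(a));
    assert(streamSourceName(a) != streamSourceName(b));
}
```

Such a name identifies the stream object rather than its content, so it is mainly useful for telling sources apart in error messages.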
size_t isc::dns::MasterLexer::getTotalSourceSize | ( | ) | const |
Return the total size of pushed sources.
This method returns the sum of the size of sources that have been pushed to the lexer by the time of the call. It would give the caller some hint about the amount of data the lexer is working on.
The size of a normal file is equal to the file size at the time the source is pushed. The size of other types of input stream is the size of the data available in the stream at the time the source is pushed.
In some special cases, it's possible that the size of the file or stream is unknown. It happens, for example, if the standard input is associated with a pipe from the output of another process and it's specified as an input source. If the size of any of the pushed sources is unknown, this method returns SOURCE_SIZE_UNKNOWN.
The total size won't change when a source is popped. So the return values of this method will either monotonically increase or be SOURCE_SIZE_UNKNOWN; once it returns SOURCE_SIZE_UNKNOWN, any subsequent call will also result in that value, by the above definition.
Before pushing any source, it returns 0.
None |
Referenced by isc::dns::MasterLoader::MasterLoaderImpl::getSize().
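The accumulation rule described above (sizes add up on push, nothing changes on pop, and "unknown" is sticky) can be modelled with a small bookkeeping class. `TotalSizeTracker` and `kSizeUnknown` are hypothetical names used for illustration only; the sketch ignores overflow of the running sum.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>

// Assumption: stand-in for MasterLexer::SOURCE_SIZE_UNKNOWN.
const std::size_t kSizeUnknown = std::numeric_limits<std::size_t>::max();

// Hypothetical model of the total-size bookkeeping.
class TotalSizeTracker {
public:
    // Called when a source of the given size is pushed.
    void onPush(std::size_t source_size) {
        if (total_ == kSizeUnknown) {
            return;  // once unknown, always unknown
        }
        if (source_size == kSizeUnknown) {
            total_ = kSizeUnknown;
            return;
        }
        total_ += source_size;
    }

    // Popping a source leaves the total untouched.
    void onPop() {}

    std::size_t total() const { return total_; }

private:
    std::size_t total_ = 0;
};

inline void totalSizeDemo() {
    TotalSizeTracker tracker;
    assert(tracker.total() == 0);   // before pushing any source
    tracker.onPush(100);
    tracker.onPush(40);
    assert(tracker.total() == 140);
    tracker.onPop();
    assert(tracker.total() == 140); // unchanged by a pop
    tracker.onPush(kSizeUnknown);
    tracker.onPush(10);
    assert(tracker.total() == kSizeUnknown);
}
```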
void isc::dns::MasterLexer::popSource | ( | ) |
Stop using the most recently opened input source (file or stream).
If it's a file, the previously opened file will be closed internally. If it's a stream, MasterLexer will simply stop using the stream; the caller can assume it will never be used in MasterLexer thereafter.
This method must not be called when there is no source pushed for MasterLexer. This method is otherwise exception free.
isc::InvalidOperation | Called with no pushed source. |
bool isc::dns::MasterLexer::pushSource | ( | const char * | filename, |
std::string * | error = 0 ) |
Open a file and make it the current input source of MasterLexer.
The opened file can be explicitly closed by the popSource() method; if popSource() is not called within the lifetime of the MasterLexer, it will be closed in the destructor.
In the case of system errors in opening the file (most likely because of specifying a non-existent or unreadable file), it returns false, and if the optional error parameter is non-null, it will be set to a description of the error (any existing content of the string will be discarded). If opening the file succeeds, the given error parameter is left untouched.
Note that this method has two styles of error reporting: one by returning false (and optionally setting error) and the other by throwing an exception. See the note for the class description about the distinction.
InvalidParameter | filename is null |
filename | A non null string specifying a master file |
error | If non null, a placeholder to set error description in case of failure. |
Referenced by isc::dns::rdata::generic::Generic::Generic(), isc::dns::rdata::generic::detail::TXTLikeImpl< Type, typeCode >::TXTLikeImpl(), isc::dns::MasterLoader::MasterLoaderImpl::pushSource(), and isc::dns::MasterLoader::MasterLoaderImpl::pushStreamSource().
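The two styles of error reporting can be illustrated with a standalone function that follows the same contract. `openSource()` is a hypothetical helper built on std::ifstream, not the library's implementation, and the error text it produces is made up for the example.

```cpp
#include <cassert>
#include <fstream>
#include <stdexcept>
#include <string>

// Hypothetical helper following the contract described above:
// - a null filename is a programming error and throws;
// - a file that cannot be opened is a "user error": return false and, if
//   `error` is non-null, overwrite it with a description;
// - on success, `error` is left untouched.
inline bool openSource(std::ifstream& file, const char* filename,
                       std::string* error = nullptr) {
    if (filename == nullptr) {
        throw std::invalid_argument("filename is null");
    }
    file.open(filename);
    if (!file.is_open()) {
        if (error != nullptr) {
            *error = std::string("failed to open ") + filename;
        }
        return false;
    }
    return true;
}

inline void openSourceDemo() {
    std::ifstream file;
    std::string error = "stale text";
    // A path that is assumed not to exist on the machine running this.
    const bool ok = openSource(file, "/nonexistent-dir/example.zone", &error);
    assert(!ok);
    assert(error == "failed to open /nonexistent-dir/example.zone");
}
```

A caller that wants to continue after a misspelled file name checks the return value, while the exception path is reserved for misuse of the interface.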
void isc::dns::MasterLexer::pushSource | ( | std::istream & | input | ) |
Make the given stream the current input source of MasterLexer.
The caller still holds the ownership of the passed stream; it's the caller's responsibility to keep it valid as long as it's used in MasterLexer and to release any resource for the stream after that. The caller can explicitly tell MasterLexer to stop using the stream by calling the popSource() method.
The data in input must be complete at the time of this call. The behavior of the lexer is undefined if the caller builds or adds data in input after pushing it.
Except for rare system errors such as memory allocation failure, this method is generally expected to be exception free. However, it can still throw if it encounters an unexpected failure when it tries to identify the "size" of the input source (see getTotalSourceSize()). It's an unexpected result unless the caller intentionally passes a broken stream; otherwise it would mean some system-dependent unexpected behavior or possibly an internal bug. In these cases it throws an Unexpected exception. Note that this version of the method doesn't return a boolean, unlike the other version that takes a file name; since this failure is really unexpected and can be critical, it doesn't make sense to give the caller an option to continue (other than by explicitly catching the exception).
Unexpected | An unexpected failure happens in initialization. |
input | An input stream object that produces textual representation of DNS RRs. |
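Identifying the size of a stream at push time could look roughly like the sketch below, which also shows why the data must be complete before the call: the size is measured once, by seeking to the end. `streamSize()` and `kSizeUnknown` are hypothetical, and treating a failed seek as "unknown" while treating a failed tellg() as an unexpected error is an assumption about one reasonable policy, not a description of the library's code.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <limits>
#include <sstream>
#include <stdexcept>
#include <string>

// Assumption: stand-in for MasterLexer::SOURCE_SIZE_UNKNOWN.
const std::size_t kSizeUnknown = std::numeric_limits<std::size_t>::max();

// Hypothetical helper: measure the data currently available in `input` and
// leave the read position at the beginning.
inline std::size_t streamSize(std::istream& input) {
    input.seekg(0, std::ios_base::end);
    if (input.fail()) {
        // Not seekable (for example a pipe): report "unknown" and recover.
        input.clear();
        return kSizeUnknown;
    }
    const std::streampos end = input.tellg();
    input.seekg(0, std::ios_base::beg);
    if (end == std::streampos(-1) || input.fail()) {
        throw std::runtime_error("unexpected failure while sizing the stream");
    }
    return static_cast<std::size_t>(static_cast<std::streamoff>(end));
}

inline void streamSizeDemo() {
    const std::string text = "example.org. 300 IN A 192.0.2.1\n";
    std::istringstream input(text);
    assert(streamSize(input) == text.size());
    // The stream is still readable from the start afterwards.
    std::string first;
    input >> first;
    assert(first == "example.org.");
}
```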
void isc::dns::MasterLexer::ungetToken | ( | ) |
Return the last token back to the lexer.
The method undoes the last call to getNextToken(). If you call getNextToken() again with the same options, it'll return the same token. If the options are different, it may return a different token, but it acts as if the previous getNextToken() was never called.
It is possible to return only one token back in time (you can't call ungetToken() twice in a row without successfully calling getNextToken() in between).
It does not work after a change of source (by pushSource() or popSource()).
isc::InvalidOperation | If called a second time in a row or if getNextToken() was not called since the last change of the source. |
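The one-token pushback rule can be modelled with a toy tokenizer over a fixed list of strings. `OneTokenPushback` is a hypothetical class for illustration; it uses std::logic_error where the real lexer reports isc::InvalidOperation, and it does not model options or source changes.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical model: a token stream that allows undoing only the most
// recent successful read.
class OneTokenPushback {
public:
    explicit OneTokenPushback(std::vector<std::string> tokens)
        : tokens_(std::move(tokens)), pos_(0), can_unget_(false) {}

    const std::string& next() {
        if (pos_ >= tokens_.size()) {
            throw std::logic_error("no more tokens");
        }
        can_unget_ = true;  // a successful read re-enables unget()
        return tokens_[pos_++];
    }

    void unget() {
        if (!can_unget_) {
            throw std::logic_error("nothing to unget");
        }
        --pos_;
        can_unget_ = false;  // only one token can be returned
    }

private:
    std::vector<std::string> tokens_;
    std::size_t pos_;
    bool can_unget_;
};

inline void pushbackDemo() {
    std::vector<std::string> tokens;
    tokens.push_back("10");
    tokens.push_back("mail.example.com.");
    OneTokenPushback lexer(tokens);
    assert(lexer.next() == "10");
    lexer.unget();
    assert(lexer.next() == "10");  // same token again after unget()
    assert(lexer.next() == "mail.example.com.");
}
```

A typical use is lookahead: read a token, decide it belongs to the next parsing step, and hand it back before delegating.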
friend class master_lexer_internal::State | friend |
Definition at line 304 of file master_lexer.h.
const size_t isc::dns::MasterLexer::SOURCE_SIZE_UNKNOWN | static |
Special value for input source size meaning "unknown".
This constant value will be used as a return value of getTotalSourceSize() when the size of one of the pushed sources is unknown. Note that this value itself is a valid integer in the range of the type, so there's still a small possibility of ambiguity. In practice, however, the value should be large enough to make that possibility negligible.
Definition at line 339 of file master_lexer.h.