presage  0.8.7
Tokenizer Class Reference

#include <tokenizer.h>

Classes

class  StreamGuard

Public Member Functions

 Tokenizer (std::istream &stream, const std::string blankspaces, const std::string separators)
virtual ~Tokenizer ()
virtual int countTokens ()=0
virtual bool hasMoreTokens () const =0
virtual std::string nextToken ()=0
virtual double progress () const =0
void blankspaceChars (const std::string)
std::string blankspaceChars () const
void separatorChars (const std::string)
std::string separatorChars () const
void lowercaseMode (const bool)
bool lowercaseMode () const
std::string streamToString () const

Protected Member Functions

bool isBlankspace (const int character) const
bool isSeparator (const int character) const

Protected Attributes

std::istream & stream
std::ios::iostate sstate
std::streamoff offbeg
std::streamoff offend
std::streamoff offset

Private Attributes

std::string blankspaces
std::string separators
bool lowercase

Detailed Description

The Tokenizer class takes an input stream and parses it into "tokens", allowing the tokens to be read one at a time.

The parsing process is controlled by the character classification sets supplied to the constructor: blankspace characters and separator characters.

Each byte read from the input stream is regarded as a character in the range '\u0000' through '\u00FF'.

In addition, an instance has flags that control its behavior, such as the lowercase mode set through lowercaseMode().

Because Tokenizer declares pure virtual methods, a typical application first constructs an instance of a concrete subclass (ForwardTokenizer or ReverseTokenizer), supplying the input stream to be tokenized, the set of blankspaces, and the set of separators. It then loops, calling the nextToken method for as long as hasMoreTokens returns true.
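The loop described above can be sketched without the presage headers. The class SketchTokenizer and the helper collectTokens below are hypothetical stand-ins written for illustration; they are not part of presage. The sketch assumes that blankspace and separator characters both act as token boundaries and are never returned as tokens, which may differ from what ForwardTokenizer and ReverseTokenizer actually do.

```cpp
#include <cctype>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in that mirrors the shape of the Tokenizer interface.
// Assumption: blankspaces and separators are both treated as boundaries.
class SketchTokenizer {
public:
    SketchTokenizer(std::istream& stream,
                    const std::string& blankspaces,
                    const std::string& separators)
        : stream_(stream),
          blankspaces_(blankspaces),
          separators_(separators),
          lowercase_(false) {}

    void lowercaseMode(bool value) { lowercase_ = value; }
    bool lowercaseMode() const { return lowercase_; }

    // Skips boundary characters, then reports whether any input remains.
    bool hasMoreTokens() const {
        skipBoundaries();
        return stream_.peek() != std::char_traits<char>::eof();
    }

    // Reads characters up to the next boundary character or end of input.
    std::string nextToken() {
        skipBoundaries();
        std::string token;
        int c;
        while ((c = stream_.peek()) != std::char_traits<char>::eof()
               && !isBoundary(c)) {
            stream_.get();
            token += lowercase_ ? static_cast<char>(std::tolower(c))
                                : static_cast<char>(c);
        }
        return token;
    }

private:
    // Membership test against the two character classification sets.
    bool isBoundary(int c) const {
        const char ch = static_cast<char>(c);
        return blankspaces_.find(ch) != std::string::npos
            || separators_.find(ch) != std::string::npos;
    }

    void skipBoundaries() const {
        int c;
        while ((c = stream_.peek()) != std::char_traits<char>::eof()
               && isBoundary(c)) {
            stream_.get();
        }
    }

    std::istream& stream_;
    std::string blankspaces_;
    std::string separators_;
    bool lowercase_;
};

// Typical usage: construct, configure, then loop while tokens remain.
std::vector<std::string> collectTokens(const std::string& text) {
    std::istringstream input(text);
    SketchTokenizer tokenizer(input, " \t\n", ".,;:!?");
    tokenizer.lowercaseMode(true);

    std::vector<std::string> tokens;
    while (tokenizer.hasMoreTokens()) {
        tokens.push_back(tokenizer.nextToken());
    }
    return tokens;
}
```

With the real library, the same loop would be written against a ForwardTokenizer or ReverseTokenizer instance rather than this stand-in.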

Definition at line 64 of file tokenizer.h.


Constructor & Destructor Documentation

Tokenizer::Tokenizer ( std::istream &  stream,
const std::string  blankspaces,
const std::string  separators 
)

Definition at line 27 of file tokenizer.cpp.

References blankspaceChars(), offbeg, offend, offset, separatorChars(), sstate, and stream.

Tokenizer::~Tokenizer ( ) [virtual]

Definition at line 53 of file tokenizer.cpp.

References sstate, and stream.
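The constructor references offbeg, offend, offset and sstate, and the destructor references sstate and stream, which suggests that the stream extent is measured on construction and the stream state is put back afterwards. The following is a minimal sketch of that general technique, not the presage implementation: PositionGuard, StreamExtent and measureExtent are hypothetical names, and the sketch assumes a seekable stream.

```cpp
#include <ios>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical RAII helper: remembers the read position and state flags
// of a stream and restores both when it goes out of scope.
class PositionGuard {
public:
    explicit PositionGuard(std::istream& stream)
        : stream_(stream), state_(stream.rdstate()), pos_(stream.tellg()) {}

    ~PositionGuard() {
        stream_.clear();        // drop flags set while probing the stream
        stream_.seekg(pos_);    // return to the remembered position
        stream_.clear(state_);  // reinstate the original state flags
    }

    PositionGuard(const PositionGuard&) = delete;
    PositionGuard& operator=(const PositionGuard&) = delete;

private:
    std::istream& stream_;
    std::ios::iostate state_;
    std::istream::pos_type pos_;
};

// Hypothetical record of the three offsets named in this class.
struct StreamExtent {
    std::streamoff begin;
    std::streamoff end;
    std::streamoff current;
};

// Measures where the stream starts, where it ends, and where the read
// position currently is, leaving the stream as it was found.
StreamExtent measureExtent(std::istream& stream) {
    StreamExtent extent{};
    PositionGuard guard(stream);

    extent.current = stream.tellg();
    stream.seekg(0, std::ios::end);
    extent.end = stream.tellg();
    stream.seekg(0, std::ios::beg);
    extent.begin = stream.tellg();
    return extent;
}
```

Offsets captured this way are what a progress() implementation could compare against the current position, although how the concrete subclasses compute progress is not stated here.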


Member Function Documentation

void Tokenizer::blankspaceChars ( const std::string  chars)

Sets blankspace characters.

Definition at line 61 of file tokenizer.cpp.

References blankspaces.

std::string Tokenizer::blankspaceChars ( ) const

Gets blankspace characters.

Definition at line 66 of file tokenizer.cpp.

References blankspaces.

Referenced by Tokenizer().

virtual int Tokenizer::countTokens ( ) [pure virtual]

Returns the number of tokens left.

Implemented in ForwardTokenizer, and ReverseTokenizer.

virtual bool Tokenizer::hasMoreTokens ( ) const [pure virtual]

Tests if there are more tokens.

Implemented in ForwardTokenizer, and ReverseTokenizer.

bool Tokenizer::isBlankspace ( const int  character) const [protected]

Definition at line 91 of file tokenizer.cpp.

References blankspaces.

Referenced by ForwardTokenizer::nextToken(), and ReverseTokenizer::nextToken().

bool Tokenizer::isSeparator ( const int  character) const [protected]

Definition at line 101 of file tokenizer.cpp.

References separators.

Referenced by ForwardTokenizer::nextToken(), and ReverseTokenizer::nextToken().

void Tokenizer::lowercaseMode ( const bool  value)

Sets lowercase mode.

Definition at line 81 of file tokenizer.cpp.

References lowercase.

Referenced by main().

bool Tokenizer::lowercaseMode ( ) const

Gets lowercase mode.

Definition at line 86 of file tokenizer.cpp.

References lowercase.

Referenced by ForwardTokenizer::nextToken(), and ReverseTokenizer::nextToken().

virtual std::string Tokenizer::nextToken ( ) [pure virtual]

Returns the next token.

Implemented in ForwardTokenizer, and ReverseTokenizer.

virtual double Tokenizer::progress ( ) const [pure virtual]

Returns progress percentage.

Implemented in ForwardTokenizer, and ReverseTokenizer.

void Tokenizer::separatorChars ( const std::string  chars)

Sets separator characters.

Definition at line 71 of file tokenizer.cpp.

References separators.

std::string Tokenizer::separatorChars ( ) const

Gets separator characters.

Definition at line 76 of file tokenizer.cpp.

References separators.

Referenced by Tokenizer().

std::string Tokenizer::streamToString ( ) const [inline]

Definition at line 109 of file tokenizer.h.

References offbeg, and offend.


Member Data Documentation

std::string Tokenizer::blankspaces [private]

Definition at line 154 of file tokenizer.h.

Referenced by blankspaceChars(), and isBlankspace().

bool Tokenizer::lowercase [private]

Definition at line 157 of file tokenizer.h.

Referenced by lowercaseMode().

std::streamoff Tokenizer::offbeg [protected]

std::streamoff Tokenizer::offend [protected]

std::streamoff Tokenizer::offset [protected]

std::string Tokenizer::separators [private]

Definition at line 155 of file tokenizer.h.

Referenced by isSeparator(), and separatorChars().

std::ios::iostate Tokenizer::sstate [protected]

Definition at line 145 of file tokenizer.h.

Referenced by Tokenizer(), and ~Tokenizer().

std::istream& Tokenizer::stream [protected]

The documentation for this class was generated from the following files: