Main Page   Namespace List   Class Hierarchy   Alphabetical List   Compound List   File List   Namespace Members   Compound Members   File Members  

KYGrammar Class Reference

Class implementing the Kieffer-Yang grammar based data compression. More...

#include <kygrammar.h>

Inheritance diagram for KYGrammar:

GrammarBase UserOfTextCodec List of all members.

Public Methods

 KYGrammar (void)
 A constructor. More...

virtual ~KYGrammar (void)
 A destructor. More...

virtual void purge (void)
 Purge the grammar. More...

virtual void setAlphabetBaseSize (size_t size)
 Set the base size of the terminal alphabet. More...

virtual void setContext (ContextBase *ctxt, bool useForOutput=true)
 Set context for encoding and decoding. More...

void setOutputDevice (IODevice *device)
 Set output device for output. More...

virtual void append (TerminalValue)
 Append one terminal symbol to the root rule of the grammar. More...

virtual void appendToRootRule (RuleElement *)
 Append one symbol to the root rule. More...

virtual void flush (void)
 Ensure that all appended terminal symbols have been processed. More...

virtual RulefindRule (RuleId id)
 Find rule with given id. More...

virtual void print (void)
 Print contents of the grammar. More...

virtual void reconstructInput (void)
 Reconstruct data represented by the grammar. More...

virtual void printDigrams (void)
 Print list of all digrams occurring in the rules. More...


Protected Methods

virtual RulecreateRootRule (void)
 Create root rule in empty grammar. More...

virtual RulecreateRule (RuleElement *, RuleElement *)
 Create new rule with give right side. More...

virtual bool ruleMatchesInput (TestedRule *testedRule, bool flush=false)
 Test if the rule matches the current input. More...

virtual void eatData (bool=false)
 Process all the data in input queue.

virtual void reconstructRule (Rule *rule, IODevice *outputDevice)
 Reconstruct data represented by one rule. More...

virtual size_t getRuleLength (Rule *rule)
 Get the length of the data string represented by the rule.

virtual void reductionRule1 (Rule *)
 Apply Reduction Rule 1 (as given by Kieffer and Yang). More...

virtual RulereductionRule2 (RuleElement *, RuleElement *)
 Apply Reduction Rule 2 (as given by Kieffer and Yang). More...

virtual RulereductionRule3 (Rule *, RuleElement *, RuleElement *)
 Apply Reduction Rule 3 (as given by Kieffer and Yang). More...

virtual void initKYGrammar (void)
 Perform some basic initialization steps. More...


Protected Attributes

ContextBasecontext
 A coding context.

bool useContextForOutput
 Indication is the arithmetic context is used for output (when compressing).

long newSymbolToInstall
 A value of the symbol that has to be installed.

long lastFixedContextSymbol
 Last symbol fixed in the initial table of the context. More...

size_t size
 The amount of the memory occupied by the grammar.

IODeviceoutputDevice
 Output device. More...

InputIteminputFirst
 First item of the input queue.

InputIteminputLast
 Last item of the input queue.

TestedRulestestedRules
 Rules that match the current input during eating the data.

RuleSetruleSet
 Set of all grammar rules. More...

RulerootRule
 Pointer to the root rule. More...

TerminalDigramsterminalDigrams
 Structure with information about dograms starting with terminals. More...

VariableDigramsvariableDigrams
 Structure with information about digrams starting with variables. More...

MatchingRule bestTestedRule
 Description of best rule matching the contents of the input queue. More...

bool I
 Indicator of extension of variable set. More...

RuleId ruleCounter
 Counter for numbering rules. More...

Queue< RuleId > * availableRuleNumbers
 Queue of available rule identificators. More...

unsigned long runCount
 Number of the "transformation runs". More...

unsigned long inputQueueLength
 The size of the input queue.

unsigned long maxInputQueueLength
 The maximum size of the input queue reached.

size_t totalTime
 Total time needed for compression/decompression.

size_t totalSymbols
 Total symbols processed during compression.

size_t variablesEncoded
 Total number of variables encoded during compression.

size_t terminalsEncoded
 Total number of terminals encoded during compression.

size_t eatDataPeriodicity
 The periodicity of calls to eatData().

RuleElement ** flushStack
 The stack of rule elements used when flushing the grammar.

int flushStackPos
 The position in the flush stack.

Rule ** ruleStack
 The stack of rules used when eating the data.

int ruleStackPos
 The position in the rule stack.


Detailed Description

Class implementing the Kieffer-Yang grammar based data compression.

Grammar structure implementing the Kieffer-Yang grammar-based compression algorithm.

Definition at line 84 of file kygrammar.h.


Constructor & Destructor Documentation

KYGrammar::KYGrammar void   
 

A constructor.

Method initKYGrammar() with parameter TERMINAL_ALPHABET_DEFAULT_SIZE (default size of terminal alphabet) is called.

Definition at line 241 of file kygrammar.cpp.

References initKYGrammar.

KYGrammar::~KYGrammar void    [virtual]
 

A destructor.

Contents of the grammar are freed.

Definition at line 307 of file kygrammar.cpp.

References availableRuleNumbers, Collection< Rule >::count, DELETE, DELETE_ARRAY, UserOfTextCodec::deleteDefaultTextCodec, flush, flushStack, ExaltOptions::getOption, maxInputQueueLength, OUTPUTE, OUTPUTEENDLINE, OUTPUTENL, print, ExaltOptions::PrintGrammar, ruleSet, ruleStack, runCount, terminalDigrams, terminalsEncoded, testedRules, totalSymbols, totalTime, variableDigrams, variablesEncoded, ExaltOptions::Verbose, and ExaltOptions::Yes.


Member Function Documentation

void KYGrammar::append TerminalValue    value [virtual]
 

Append one terminal symbol to the root rule of the grammar.

New terminal symbol with value value is inserted into the input queue.

Parameters:
value  Value of inserted terminal symbol.
See also:
eatData()

Implements GrammarBase.

Definition at line 1115 of file kygrammar.cpp.

References eatData, eatDataPeriodicity, INCREASE_INPUT_QUEUE_LENGTH, inputFirst, inputLast, KY_GRAMMAR_EATDATA_PERIODICITY, NEW, InputItem::next, TerminalValue, totalSymbols, totalTime, and InputItem::value.

void KYGrammar::appendToRootRule RuleElement   newElement [virtual]
 

Append one symbol to the root rule.

This method appends one element (terminal or variable) to the root rule. Depending on the previous state of the grammar, various steps are performed. If the appending of the symbol causes some digram to appear twice in the range of the grammar, Reduction Rule 2 (digram appears twice within one rule) or Reduction Rule 3 (digram appears twice in two distinct rules) is performed, in some cases followed by Reduction Rule 1 (some rule appears only once).

Parameters:
newElement  Pointer to appended element.
See also:
reductionRule1() , reductionRule2() , reductionRule3()

Definition at line 1180 of file kygrammar.cpp.

References List< RuleElementNeighbour >::append, context, UserOfTextCodec::createDefaultTextCodec, DBG, DIGRAM_ADD, ContextBase::encode, FATAL, HashTable< unsigned long, NeighboursDescription, List, 98317 >::find, HashTable< unsigned long, NeighboursDescription, List, 3079 >::find, List< RuleElementNeighbour >::first, I, Rule::id, ContextBase::installSymbol, ContextBase::isInitialized, KY_GRAMMAR_MAX_SIZE, Rule::last, ContextBase::lastFixedSymbol, NeighboursDescription::neighbours, NEW, newSymbolToInstall, RuleElement::next, List< RuleElementNeighbour >::next, ContextBase::NotKnown, RuleElementNeighbour::origin, outputDevice, RuleElement::prev, purge, reconstructRule, reductionRule1, reductionRule2, reductionRule3, RIGHT_SIDE_APPEND, rootRule, RuleElementNeighbour::rule, RuleElement::rule, runCount, setAlphabetBaseSize, size, TextCodec::suggestAlphabetBaseSize, Terminal, terminalDigrams, terminalsEncoded, UserOfTextCodec::textCodec, RuleElement::type, RuleElement::value, Variable, variableDigrams, variablesEncoded, and IODevice::writeData.

Referenced by XmlCodec::decode, and eatData.

Rule * KYGrammar::createRootRule void    [protected, virtual]
 

Create root rule in empty grammar.

This method creates new root rule.

Warning:
If the grammar has a root rule already, it will be overwritten by the new one. This can cause fatal consequences.
Returns:
Pointer to newly created root rule.

Definition at line 458 of file kygrammar.cpp.

References List< Rule >::append, Rule::body, Rule::counter, Rule::id, Rule::last, NEW, Rule::refCount, rootRule, ruleCounter, ruleSet, and WRN.

Referenced by initKYGrammar, and purge.

Rule * KYGrammar::createRule RuleElement   start,
RuleElement   last
[protected, virtual]
 

Create new rule with give right side.

This method creates new rule with unique id. The right side of the rule consists of symbols start and last.

Parameters:
start  First symbol of the right side of the rule.
last  Last symbol of the right side of the rule.
Returns:
Pointer to newly created rule.

Definition at line 497 of file kygrammar.cpp.

References List< Rule >::append, availableRuleNumbers, Rule::body, context, Rule::counter, DBG, DELETE, Queue< RuleId >::dequeue, Rule::id, Ignore, Collection< RuleId >::isEmpty, Rule::last, ContextBase::lastFixedSymbol, Rule::matchResult, Rule::matchRun, NEW, newSymbolToInstall, Rule::refCount, ruleCounter, RuleId, and ruleSet.

Referenced by reductionRule3.

Rule * KYGrammar::findRule RuleId    id [virtual]
 

Find rule with given id.

This method finds the rule with given rule.

Parameters:
id  The id of the rule.
Returns:
Pointer to the rule, or NULL.

Definition at line 552 of file kygrammar.cpp.

References List< Rule >::first, Rule::id, List< Rule >::next, RuleId, and ruleSet.

Referenced by XmlCodec::decode.

void KYGrammar::flush void    [virtual]
 

Ensure that all appended terminal symbols have been processed.

After call to this method, all symbols in the input queue are processed.

See also:
eatData()

Implements GrammarBase.

Definition at line 1160 of file kygrammar.cpp.

References DBG, and eatData.

Referenced by ruleMatchesInput, and ~KYGrammar.

void KYGrammar::initKYGrammar void    [protected, virtual]
 

Perform some basic initialization steps.

Some necessary initialization actions are performed.

Definition at line 252 of file kygrammar.cpp.

References availableRuleNumbers, context, createRootRule, eatDataPeriodicity, flushStack, flushStackPos, I, inputFirst, inputLast, inputQueueLength, KY_GRAMMAR_EATDATA_PERIODICITY, KY_GRAMMAR_FLUSH_STACK_SIZE, KY_GRAMMAR_RULE_STACK_SIZE, maxInputQueueLength, NEW, newSymbolToInstall, outputDevice, rootRule, ruleCounter, ruleSet, ruleStack, ruleStackPos, runCount, HashTable< unsigned long, NeighboursDescription, List, 98317 >::setAutoDelete, HashTable< unsigned long, NeighboursDescription, List, 3079 >::setAutoDelete, Collection< RuleId >::setAutoDelete, size, terminalDigrams, terminalsEncoded, testedRules, totalSymbols, totalTime, useContextForOutput, variableDigrams, and variablesEncoded.

Referenced by KYGrammar.

void KYGrammar::print void    [virtual]
 

Print contents of the grammar.

This method prints human readable representation of the grammar on the standard input.

Implements GrammarBase.

Definition at line 1928 of file kygrammar.cpp.

References Rule::body, List< Rule >::first, getRuleLength, Rule::id, RuleElement::next, List< Rule >::next, OUTPUTE, OUTPUTEENDLINE, OUTPUTENL, Rule::refCount, RIGHT_SIDE_IS_EMPTY, RuleElement::rule, ruleSet, Terminal, RuleElement::type, and RuleElement::value.

Referenced by ~KYGrammar.

void KYGrammar::printDigrams void    [virtual]
 

Print list of all digrams occurring in the rules.

Prints all digrams that appear within the grammar.

Definition at line 1984 of file kygrammar.cpp.

References Collection< RuleElementNeighbour >::count, HashTable< unsigned long, NeighboursDescription, List, 98317 >::find, HashTable< unsigned long, NeighboursDescription, List, 3079 >::find, List< RuleElementNeighbour >::first, List< Rule >::first, Rule::id, NeighboursDescription::neighbours, List< RuleElementNeighbour >::next, List< Rule >::next, RuleElementNeighbour::origin, OUTPUTE, OUTPUTEENDLINE, OUTPUTENL, NeighboursDescription::representedBy, RuleElementNeighbour::rule, RuleElement::rule, Terminal, terminalDigrams, RuleElement::type, RuleElement::value, and variableDigrams.

void KYGrammar::purge void    [virtual]
 

Purge the grammar.

Empties the grammar, deletes all the rules and creates a new (emtpy) grammar.

Definition at line 365 of file kygrammar.cpp.

References availableRuleNumbers, createRootRule, DELETE, flushStack, flushStackPos, I, KY_GRAMMAR_FLUSH_STACK_SIZE, KY_GRAMMAR_RULE_STACK_SIZE, NEW, rootRule, ruleCounter, ruleSet, ruleStack, ruleStackPos, runCount, HashTable< unsigned long, NeighboursDescription, List, 98317 >::setAutoDelete, HashTable< unsigned long, NeighboursDescription, List, 3079 >::setAutoDelete, Collection< RuleId >::setAutoDelete, size, terminalDigrams, testedRules, and variableDigrams.

Referenced by appendToRootRule.

void KYGrammar::reconstructInput void    [virtual]
 

Reconstruct data represented by the grammar.

This method reconstructs the data represented by the grammar. Output is done via outputDevice. If the device is NULL, standard output is used.

See also:
reconstructRule()

Implements GrammarBase.

Definition at line 1860 of file kygrammar.cpp.

References outputDevice, reconstructRule, and rootRule.

void KYGrammar::reconstructRule Rule   rule,
IODevice   outputDevice
[protected, virtual]
 

Reconstruct data represented by one rule.

Use of this method causes recursive reconstruction of the data represented by given rule. Output is done via outputDevice. If the device is NULL, standard output is used.

Parameters:
rule  Pointer to the rule to be reconstructed.
outputDevice  pointer of the IO device.
See also:
reconstructInput()

Definition at line 1877 of file kygrammar.cpp.

References Rule::body, RuleElement::next, OUTPUT, RuleElement::rule, Terminal, RuleElement::type, RuleElement::value, and IODevice::writeData.

Referenced by appendToRootRule, and reconstructInput.

void KYGrammar::reductionRule1 Rule   rule [protected, virtual]
 

Apply Reduction Rule 1 (as given by Kieffer and Yang).

Right side of the rule that appears only once is inserted into the rule that references this rule.

Parameters:
rule  Pointer to the rule that starts with a variable that is used only once.

Definition at line 1410 of file kygrammar.cpp.

References availableRuleNumbers, Rule::body, CHECK_POINTER, DELETE, DIGRAM_ADD, DIGRAM_CHANGE, DIGRAM_REMOVE, Queue< RuleId >::enqueue, FATAL, HashTable< unsigned long, NeighboursDescription, List, 98317 >::find, HashTable< unsigned long, NeighboursDescription, List, 3079 >::find, Rule::id, NEW, RuleElement::next, RuleElement::prev, List< Rule >::remove, REPRESENTED_BY_APPEND, NeighboursDescription::representedBy, RuleElement::rule, RuleId, ruleSet, size, Terminal, terminalDigrams, RuleElement::type, RuleElement::value, Variable, and variableDigrams.

Referenced by appendToRootRule.

Rule * KYGrammar::reductionRule2 RuleElement   elOrig1,
RuleElement   elNew1
[protected, virtual]
 

Apply Reduction Rule 2 (as given by Kieffer and Yang).

Both occurences of the digram that appears twice within one rule are replaced by a new variable representing this digram.

Parameters:
elOrig1  Pointer to the first symbol of the first digram.
elNew1  Pointer to the second symbol of the second digram.
Returns:
Pointer to newly created rule.

Definition at line 1541 of file kygrammar.cpp.

References Rule::body, CHECK_POINTER, DELETE, DIGRAM_ADD, DIGRAM_CHANGE, DIGRAM_REMOVE, NEW, RuleElement::next, RuleElement::prev, Rule::refCount, REPRESENTED_BY_APPEND, NeighboursDescription::representedBy, RuleElement::rule, Terminal, RuleElement::type, RuleElement::value, and Variable.

Referenced by appendToRootRule.

Rule * KYGrammar::reductionRule3 Rule   ruleOrig,
RuleElement   elOrig1,
RuleElement   elNew1
[protected, virtual]
 

Apply Reduction Rule 3 (as given by Kieffer and Yang).

Both occurences of the digram that appears in two distinct rules (one of the rules is always the root rule) are replaced by a new variable representing this digram.

Parameters:
ruleOrig  Pointer to the non-root rule.
elOrig1  Pointer to the first symbol of the first digram.
elNew1  Pointer to the second symbol of the second digram.
Returns:
Pointer to newly created rule.

Definition at line 1685 of file kygrammar.cpp.

References Rule::body, CHECK_POINTER, createRule, DELETE, DIGRAM_ADD, DIGRAM_CHANGE, DIGRAM_REMOVE, HashTable< unsigned long, NeighboursDescription, List, 3079 >::find, Rule::last, NEW, RuleElement::next, RuleElement::prev, Rule::refCount, REPRESENTED_BY_APPEND, NeighboursDescription::representedBy, rootRule, RuleElement::rule, Terminal, terminalDigrams, RuleElement::type, RuleElement::value, and Variable.

Referenced by appendToRootRule.

bool KYGrammar::ruleMatchesInput TestedRule   testedRule,
bool    flush = false
[protected, virtual]
 

Test if the rule matches the current input.

Tests if given rule matches the current input string. If flush is FALSE, also the rules that represents more than the current input are considered.

Parameters:
testedRule  Pointer to the rule that is being tested.
flush  Indication if we are in the flush-mode.
Return values:
TRUE  The rule matches.
FALSE  The rule doesn't match.

Definition at line 574 of file kygrammar.cpp.

References Rule::body, CHECK_POINTER, TestedRule::currentElements, DoesntMatch, FATAL, flush, flushStack, flushStackPos, inputFirst, KY_GRAMMAR_FLUSH_STACK_SIZE, TestedRule::lastInputItem, TestedRule::matchedLength, Matches, TestedRule::matchesCompletely, Rule::matchResult, Rule::matchRun, RuleElement::next, InputItem::next, RuleElement::rule, TestedRule::rule, runCount, Terminal, TESTEDRULES_POP, TESTEDRULES_PUSH, RuleElement::type, InputItem::value, and RuleElement::value.

Referenced by eatData.

void KYGrammar::setAlphabetBaseSize size_t    size [virtual]
 

Set the base size of the terminal alphabet.

This method sets the base size of the alphabet of the arithmetic context.

Parameters:
size  The size of the alphabet.

Implements GrammarBase.

Definition at line 419 of file kygrammar.cpp.

References context, DynamicContext, Fibonacci::encodeToBuffer, ContextBase::initialize, ContextBase::isInitialized, outputDevice, ContextBase::setType, size, and IODevice::writeData.

Referenced by appendToRootRule.

void KYGrammar::setContext ContextBase   ctxt,
bool    useForOutput = true
[virtual]
 

Set context for encoding and decoding.

Sets the arithmetic context used by the grammar.

Parameters:
ctxt  Pointer to the context.
useForOutput  Indication whether the context should be used for the output of data (only compression).

Definition at line 2047 of file kygrammar.cpp.

References context, and useContextForOutput.

Referenced by XmlCodec::decode, XmlCodec::encode, and XmlCodec::initializePushCoder.

void KYGrammar::setOutputDevice IODevice   device
 

Set output device for output.

The device used for the output of data is set.

Parameters:
device  The output device.

Definition at line 447 of file kygrammar.cpp.

References outputDevice.

Referenced by XmlCodec::decode, XmlCodec::encode, and XmlCodec::initializePushCoder.


Member Data Documentation

Queue<RuleId>* KYGrammar::availableRuleNumbers [protected]
 

Queue of available rule identificators.

After applying RR1, the identifier of deleted rule can be reused again. This identifier is stored in the queue to be available when newRule() is called.

See also:
ruleCounter, createRule()

Definition at line 233 of file kygrammar.h.

Referenced by createRule, initKYGrammar, purge, reductionRule1, and ~KYGrammar.

MatchingRule KYGrammar::bestTestedRule [protected]
 

Description of best rule matching the contents of the input queue.

This item contains information about the best rule matching current contents of the input queue.

See also:
testedRules , append()

Definition at line 206 of file kygrammar.h.

Referenced by eatData.

bool KYGrammar::I [protected]
 

Indicator of extension of variable set.

Whenever new variable is introduced, this indicator equals true; otherwise it equals false.

Definition at line 213 of file kygrammar.h.

Referenced by appendToRootRule, initKYGrammar, and purge.

long KYGrammar::lastFixedContextSymbol [protected]
 

Last symbol fixed in the initial table of the context.

Used for encoding newly created variables. Each new variable is encoded by a number which is a sum of the variable id and the number of lastFixedContextSymbol.

Definition at line 144 of file kygrammar.h.

IODevice* KYGrammar::outputDevice [protected]
 

Output device.

In some cases the grammar needs to output some data to the output stream (for example, the information about the alphabet size etc.). In order to be this possible, the grammar should have direct access to the output device.

See also:
setOutputDevice().

Definition at line 157 of file kygrammar.h.

Referenced by appendToRootRule, initKYGrammar, reconstructInput, setAlphabetBaseSize, and setOutputDevice.

Rule* KYGrammar::rootRule [protected]
 

Pointer to the root rule.

Used mainly for optimization and lucidity.

Definition at line 181 of file kygrammar.h.

Referenced by appendToRootRule, createRootRule, eatData, initKYGrammar, purge, reconstructInput, and reductionRule3.

RuleId KYGrammar::ruleCounter [protected]
 

Counter for numbering rules.

Warning:
Number of rules is limited by the range of unsigned long type.
Used for generation of unique rule numbers during createRule(). When availableRuleNumbers queue is non-empty, the first available rule id from it is used, otherwise the ruleCounter is used.

See also:
availableRuleNumbers, createRule()

Definition at line 223 of file kygrammar.h.

Referenced by createRootRule, createRule, initKYGrammar, and purge.

RuleSet* KYGrammar::ruleSet [protected]
 

Set of all grammar rules.

The set is represented by a linked list with items of type Rule.

Definition at line 174 of file kygrammar.h.

Referenced by createRootRule, createRule, eatData, findRule, initKYGrammar, print, purge, reductionRule1, and ~KYGrammar.

unsigned long KYGrammar::runCount [protected]
 

Number of the "transformation runs".

New input run occurs when some symbol was appended to the right side of the root rule.

Definition at line 243 of file kygrammar.h.

Referenced by appendToRootRule, eatData, initKYGrammar, purge, ruleMatchesInput, and ~KYGrammar.

TerminalDigrams* KYGrammar::terminalDigrams [protected]
 

Structure with information about dograms starting with terminals.

The structure is implemented as a hash array that maps values of terminal symbols to lists of elements which follow given terminal.

Definition at line 188 of file kygrammar.h.

Referenced by appendToRootRule, eatData, initKYGrammar, printDigrams, purge, reductionRule1, reductionRule3, and ~KYGrammar.

VariableDigrams* KYGrammar::variableDigrams [protected]
 

Structure with information about digrams starting with variables.

The structure is implemented as a hash array that maps id numbers of variable symbols to lists of elements which follow given variable.

Definition at line 195 of file kygrammar.h.

Referenced by appendToRootRule, initKYGrammar, printDigrams, purge, reductionRule1, and ~KYGrammar.


The documentation for this class was generated from the following files:
Generated on Wed Feb 5 10:43:06 2003 for Exalt by doxygen1.2.14 written by Dimitri van Heesch, © 1997-2002