/* This file is part of kdev-pg-qt
Copyright (C) 2010 Jonathan Schmidt-Dominé <devel@the-user.org>
This library is free software; you can redistribute it and/or
modify it under the terms of the GNU Library General Public
License as published by the Free Software Foundation; either
version 2 of the License, or (at your option) any later version.
This library is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
Library General Public License for more details.
You should have received a copy of the GNU Library General Public License
along with this library; see the file COPYING.LIB. If not, write to
the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor,
Boston, MA 02110-1301, USA.
*/
#ifndef KDEV_PG_REGEXP
#define KDEV_PG_REGEXP
#include <kdev-pg-char-sets.h>
#include <vector>
#include <set>
#include <map>
#include <algorithm>
#include <stack>
#include <string>
#include <cassert>
#include <QBitArray>
#include <QString>
#include <QFile>
#include <QStringList>
#include <QTextStream>
namespace KDevPG
{
template<typename T> class DFA;
template<typename T> class NFA;
template<CharEncoding enc> class SeqCharSet;
template<CharEncoding enc> class TableCharSet;
// General
class GNFA;
enum AutomatonType { SAscii, SLatin1, SUtf8, SUcs2, SUtf16, SUcs4, TAscii, TLatin1, TUtf8, TUcs2, TUtf16/*, TUcs4*/ };
/**
* Deterministic finite automaton
*/
class GDFA
{
union
{
DFA<SeqCharSet<Ascii> > *s0;
DFA<SeqCharSet<Latin1> > *s1;
DFA<SeqCharSet<Ucs2> > *s2;
DFA<SeqCharSet<Ucs4> > *s3;
DFA<TableCharSet<Ascii> > *t0;
DFA<TableCharSet<Latin1> > *t1;
DFA<TableCharSet<Ucs2> > *t2;
// DFA<TableCharSet<Ucs4> > *t3;
};
friend class GNFA;
public:
static AutomatonType type;
/// Generation of the core state-machine
void codegen(QTextStream& str);
/// Minimization of the automaton
GDFA& minimize();
/// Code used for the detected tokens in the generated code
void setActions(const std::vector<QString>& actions);
GDFA(const GDFA& o);
~GDFA();
GDFA& operator=(const GDFA& o);
/// Debugging-information
void inspect();
/// Nice output in .dot-format
void dotOutput(QTextStream& o, const QString& name);
/// Conversion to an NFA
GNFA nfa();
private:
/// Has to be generated using a GNFA
GDFA();
};
/// Non-deterministic finite automaton
class GNFA
{
union
{
NFA<SeqCharSet<Ascii> > *s0;
NFA<SeqCharSet<Latin1> > *s1;
NFA<SeqCharSet<Ucs2> > *s2;
NFA<SeqCharSet<Ucs4> > *s3;
NFA<TableCharSet<Ascii> > *t0;
NFA<TableCharSet<Latin1> > *t1;
NFA<TableCharSet<Ucs2> > *t2;
// NFA<TableCharSet<Ucs4> > *t3;
};
friend class GDFA;
public:
GNFA(const GNFA& o);
~GNFA();
GNFA& operator=(const GNFA& o);
explicit GNFA(const std::vector<GNFA*>& init);
/// Intersection
GNFA& operator&=(const GNFA& o);
/// Concatenation
GNFA& operator<<=(const GNFA& o);
/// Union
GNFA& operator|=(const GNFA& o);
/// Difference
GNFA& operator^=(const GNFA& o);
/// Kleene-star
GNFA& operator*();
/// Complement
GNFA& negate();
/// Whether it accepts the empty word
bool acceptsEpsilon() const;
/// Whether it represents the empty set
bool isEmpty() const;
/// Whether it contains reachable loops
bool isUnbounded() const;
/// Length of the shortest accepted input (-1 iff isEmpty)
int minLength() const;
/// Length of the longest accepted input (-1 iff isEmpty, -2 iff isUnbounded)
int maxLength() const;
/// Conversion to a DFA
GDFA dfa();
/// Minimize
GNFA& minimize();
/// Debugging-information
void inspect();
/// Nice output in .dot-format
void dotOutput(QTextStream& o, const QString& name);
/// Accepts nothing
static GNFA empty();
/// Accepts any single character
static GNFA anyChar();
/// Accepts the given word
static GNFA word(const QString& str);
/// Accepts any of the chars in the string
static GNFA collection(const QString& str);
/// Accepts only the empty word
static GNFA emptyWord();
/// Accepts anything
static GNFA anything();
/// Accepts any character between begin and end (including begin, excluding end)
static GNFA range(quint32 begin, quint32 end);
/// Accepts a single codepoint (or nothing if it is not representable with the charset)
static GNFA character(quint32 codepoint);
private:
GNFA(); // has to be constructed using a static member
};
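/*
 * Usage sketch (an illustration only, not taken from the kdev-pg sources).
 * It assumes that GDFA::type has already been set to the desired
 * AutomatonType and that operator*() applies the Kleene star in place,
 * as its reference return type suggests. All variable names are hypothetical.
 *
 * An automaton for non-empty decimal numbers such as "0" or "42" might be
 * composed like this:
 *
 *   GNFA digit = GNFA::range('0', '9' + 1); // end is exclusive, hence + 1
 *   GNFA tail = digit;
 *   tail.operator*();                       // tail now accepts digit*
 *   GNFA number = digit;
 *   number <<= tail;                        // digit followed by digit*
 *   GDFA dfa = number.dfa();                // a GDFA can only be obtained from a GNFA
 *   dfa.minimize();
 *
 * Under these assumptions number.minLength() would be 1 and
 * number.isUnbounded() would hold, because the star introduces a loop.
 */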
}
#endif