Last updated: 07/20/1999, 11:04
-------- 08/01/1998 tcLex version 1.0 --------
-------- 09/02/1998 tcLex version 1.0p1 --------
1. Corrected potential bug when a global lexer was created from within a
namespace. For example:
    namespace eval foo {
        lexer ::bar::baz ...
    }
The created command was ::foo::bar::baz instead of ::bar::baz. Also, the return
value is now the fully qualified name (like with proc) and not the specified
name (namespace-relative).
2. Corrected major bug with incremental processing. When used with rule
rejection, some rules were incorrectly bypassed. This correction is also a
performance enhancement for incremental processing.
3. Minor typos corrected in the man page: the example demonstrating the
difference between inclusive and exclusive conditions was incorrect.
4. Corrected syntax error in the default pkgIndex.tcl file provided with the
previous version. This file didn't work due to extra curly braces :(. Hopefully
doing "pkg_mkIndex" worked.
5. Added configure files for Unix, many thanks to John Ellson of Lucent
<ellson@lucent.com> for these files!
6. Changed "static const char*" to "static char*" in some places to avoid
compiler warnings on Unix. Thanks to John Ellson <ellson@lucent.com> and Paul
Vogel <vogel@cygnet.rsn.hp.com> for pointing that out.
7. Added .txt extension to all text files in the distrib. This makes them easier
to read under Windows.
8. Added the current file (changes.txt)
-------- 11/11/1998 tcLex version 1.1a1 --------
1. Completely rewrote the regexp interface. A patched version of Tcl's regexp
package is now included in the code. Although it makes the code a bit bigger
(the binary is a few KB more), it allows for better handling of string overrun
cases, which was a major limitation in previous versions. Also allows
newline-sensitive regexps in Tcl8.0 (see below). Added files tcLexRE.h and
tcLexRE.c -- which in turn includes RE80.c or RE81.c (modified regexp engines)
depending on the Tcl version number.
2. Completely reworked the string handling code, so it is Unicode-clean now
under Tcl8.1. Now it stores the Unicode string instead of UTF-8, so that using
string indices is easier (UTF-8 uses variable-byte chars and thus needs special
parsing procedures). Also brings significant performance enhancement with big
strings (previously, the whole UTF-8 string was converted to Unicode by the
regexp package every time a rule was tried).
3. Corrected bug with index under Tcl8.1 (the correction is related to the above
changes). The returned index was the byte index and not the character index.
4. Renamed tcLexPrivate.h to tcLexInt.h for more consistency with Tcl.
5. Added -args option to allow extra arguments passing, using the same syntax as
proc. For example:
    lexer foo -args {a b {c 3}} ...
    foo eval $string 1 2; # a=1, b=2, c defaults to 3
6. Added -lines flag for line-sensitive processing. This changes the behavior of
"^$" and "." in regexps, and provides a portable way to use line-sensitive
regexps (Tcl8.0 doesn't support them, and Tcl8.1 requires special syntax). This
has been implemented thanks to the inclusion of the regexp code.
7. Added a TcLex_Buffer structure to allow future improvements: different types
of inputs (string, variable, file, channel) as well as multiple input buffers.
8. Reorganized the code to make future improvements easier to implement.
9. The return value of "lexer" is now an empty string, like with proc (contrary
to what I previously wrote).
10. Fixed bug due to overzealous memory deallocation, thanks to Claude BARRAS
<barras@etca.fr>.
11. Added "input" and "unput" subcommands, following the suggestions of Neil
Walker <neil.walker@mrc-bsu.cam.ac.uk>. They are similar to flex's input() and
unput() functions, except that unput can't put arbitrary chars back into the
input string (this is a design choice, not a technical limitation).
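
Items 2 and 3 above turn on the difference between byte offsets and character
indices in UTF-8. The following is a minimal sketch in C of that difference,
not code taken from tcLex: the helper name utf8_char_index is hypothetical, and
it assumes well-formed UTF-8 input.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of tcLex): convert a byte offset into a
 * UTF-8 string to a character index by counting lead bytes. Continuation
 * bytes have the bit pattern 10xxxxxx and do not start a character. */
static size_t utf8_char_index(const char *s, size_t byte_offset)
{
    size_t i, chars = 0;

    assert(s != NULL);
    for (i = 0; i < byte_offset && s[i] != '\0'; i++) {
        if (((unsigned char)s[i] & 0xC0) != 0x80) {
            chars++;
        }
    }
    return chars;
}
```

Because the conversion has to scan the string, every index lookup on a UTF-8
buffer costs time proportional to the offset. Storing fixed-width characters,
as item 2 describes, makes the character index and the array index the same
number.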
-------- 11/19/1998 tcLex version 1.1a2 --------
1. Added -nocase flag for case-insensitivity. Under Tcl8.0, it needed further
incursion into the regexp code.
2. Added -longest flag to choose the longest matching rule (as in flex) instead
of the first matching rule (the default).
3. Reworked the rule rejection code so that it works correctly and efficiently
with -longest. It also made it safer.
-------- 11/25/1998 tcLex version 1.1a3 --------
1. Corrected major bug in the modified Tcl8.0 regexp engine, which caused some
regexps to fail (especially those with ?-marked subexpressions). For instance,
the expression "a?b" matched the string "b", but not the string "ab".
2. Added "create" and "current" subcommands to the lexer command. The first is
optional and is used when creating lexers:
    lexer ?create? <name> ?args ... args?
The second can be used during a processing to get the name of the currently
active lexer, for example:
    [lexer current] index
This avoids using the name of the lexer everywhere, and is useful when lexers
are renamed, aliased or imported. Suggestion made by Leo Schubert
<leo@bj-ig.de>. These new subcommands introduce a potential incompatibility:
lexers cannot be named "create" or "current" anymore (but this shouldn't be a
problem).
-------- 12/18/1998 tcLex version 1.1b1 --------
1. TcLex is now intended to be linked against Tcl8.0.4 or Tcl8.1b1. Some changes
have been made in the source files to take the new import directives into
account when building Windows DLLs (introduced in Tcl8.0.3).
2. Slightly modified the Windows makefile.vc to build the object files into
distinct directories depending on some settings (debug, Tcl version).
3. File RE81.c is now based on the regexp source from Tcl8.1b1.
4. Completely rewrote the documentation. This now includes a comparison with
flex, as well as a classical man page. It uses HTML + CSS so that newer browsers
can display enhanced presentation while still allowing text-based browsers to
display properly formatted text.
5. Added several examples, some from Neil Walker (thanks, Neil!), some from me
(Frédéric BONNET).
-------- 01/11/1999 tcLex version 1.1b2 --------
1. Added SafeTcl entry point (Tclex_SafeInit).
2. Corrected bug that seemed to occur only on some Unix systems (e.g. SGI and
Solaris) but potentially affected others as well. This caused some lexers to be
incorrectly reported as inactive even when returned by [lexer current]. The
source of the bug was a missing lower bound in the lexer state deallocator
(StateDelete) that caused subsequent states to be given a negative index,
causing the "inactive lexer" error. Bug reported by Claude BARRAS and Neil
Walker.
3. Corrected bug in the modified Tcl8.0 regexp engine that caused newlines to be
treated as any characters even in line-sensitive mode, when used with * or +.
Bug reported by Neil Walker.
4. Improved handling of ^$ in line-sensitive mode under Tcl8.0 so that they
behave the same as under Tcl8.1.
5. Corrected bug with empty string match handling: some actions were called
twice, once for the matched string and once for an empty string at the end of
the previous one.
6. Fixed Unix warnings previously reported by Claude BARRAS but forgotten in the
previous version: the struct regexec_state in RE80.c (modified Tcl8.0 regexp
engine) was used before defined. This warning was silent under Windows (too low
warning level?).
-------- 04/04/1999 tcLex version 1.1 final --------
1. Corrected minor typo in RE80.c: in function findChar, parameter c was
declared as int* instead of int. This had no influence (it got cast to a char
anyway) but generated warnings with some compilers (not mine unfortunately )-:
Reported by Volker Hetzer <hetzer.abg@sni.de>.
2. TcLex is now intended to be linked against Tcl8.0.4 (or higher patchlevel) or
Tcl8.1b2. On the latter, tcLex is configured by default to use the new stubs
facility. Only minor code modifications were needed. Tcl8.1b1 isn't supported
anymore.
3. Removed compatibility macros from tcLexInt.h now that the old functions are
back in Tcl8.1b2.
4. Fixed major bug occurring with longest-preferred matching lexers. When several
rules matched the same number of characters, the last defined rule was chosen
instead of the first one, due to a bad comparison operator ('<' was used instead
of '<=' in RuleTry). This broke the "pascal" example.
5. Reformatted the code so that it uses 4-space indentation instead of 2, to
better conform with Tcl C coding conventions. This is rather cosmetic but makes
the code a bit more readable.
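
The tie-breaking problem described in item 4 can be illustrated with a small
sketch. This is not the actual RuleTry code, which is not reproduced in this
file: the function pick_longest, its parameters and the "buggy" switch are
hypothetical and only show how '<' versus '<=' in a skip test changes which
rule wins when two rules match the same number of characters.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sketch, not the actual RuleTry code. lens[i] is the number
 * of characters matched by rule i, or -1 if the rule did not match.
 * Returns the index of the selected rule, or -1 if no rule matched. */
static int pick_longest(const int *lens, size_t nrules, int buggy)
{
    int best = -1, bestLen = -1;
    size_t i;

    assert(lens != NULL);
    for (i = 0; i < nrules; i++) {
        if (lens[i] < 0) {
            continue;               /* rule did not match at all */
        }
        /* Skip rules that do not improve on the current best. The buggy
         * test uses '<', so a rule with an equal length is not skipped
         * and replaces the earlier rule. */
        if (buggy ? (lens[i] < bestLen) : (lens[i] <= bestLen)) {
            continue;
        }
        best = (int)i;
        bestLen = lens[i];
    }
    return best;
}
```

With match lengths {3, 5, 5, 2}, the '<=' test keeps rule 1 (the first rule
that matched 5 characters), while the '<' test ends up with rule 2 (the last
one), which is the behavior item 4 reports as broken.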
-------- 04/30/1999 tcLex version 1.1.1 --------
1. TcLex is now intended to be linked against Tcl8.0.4 (or higher patchlevel) or
Tcl8.1b3. Tcl8.1b2 isn't supported anymore.
2. Removed redefinition of TclUtfToUniCharDString and TclUniCharToUtfDString
that were needed by stub-enabled Tcl8.1b2, now that Tcl_UtfToUniCharDString and
Tcl_UniCharToUtfDString are publicly available in Tcl8.1b3.
3. Removed the hack needed by TclRegCompObj not being exported by stub-enabled
Tcl8.1b2. Tcl8.1b3 now exports the public Tcl_GetRegExpFromObj which does the
same thing.
4. Fixed regexp inconsistency between Tcl8.0 and Tcl8.1 with line-sensitive
matching. Regexps with negated ranges (eg. [^a]) could span multiple lines under
Tcl8.0 but couldn't under Tcl8.1 (the right behavior).
5. Cleaned up the modified regexp exec code and proposed it as a patch to the
Tcl core.
6. Rewrote arguments parsing code using Tcl_GetIndexFromObj to use symbolic
constants rather than integer indices.
7. Added links to Neil Walker's tcLex page (thanks Neil!) from the doc.
-------- 04/25/1999 tcLex version 1.1.2 --------
1. Corrected bug in line-sensitive matching. This bug was introduced by the
above change #4, and was located in the negated range processing code in certain
cases.
-------- 06/24/1999 tcLex version 1.1.3 --------
1. Corrected major bug with Tcl 8.1.1. The new regexp caching scheme introduced
by Tcl 8.1.1 conflicted with the way tcLex stored compiled regexps. The regexp
handling code has been completely reworked. Bug reported by Claude BARRAS.
2. Added URL to Scriptics' regexp-HOWTO in the doc
(http://www.scriptics.com/support/howto/regexp81.html).
-------- 07/20/1999 tcLex version 1.1.4 --------
1. Corrected major bug with Tcl 8.1. The functions BufferNotStarving()
and BufferAtEnd() mixed character and byte indices, which resulted in string
overflows. Bug reported by Neil Walker. It is surprising that this bug did not
show up earlier, because the string overflows eventually occurred in virtually
every case; however, they crashed tcLex only in very specific cases (hard to
reproduce on Windows).
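
A minimal sketch of this class of bug, assuming a buffer of 16-bit characters
whose length is known both in characters and in bytes. The SketchBuffer type
and both functions are hypothetical; the real TcLex_Buffer layout and the real
BufferAtEnd() are not shown in this file.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical buffer of 16-bit characters. */
typedef struct {
    const unsigned short *chars; /* character data */
    size_t charLength;           /* length in characters */
    size_t byteLength;           /* length in bytes */
} SketchBuffer;

/* Buggy: compares a character index with a byte length, so the end is
 * reported too late and callers read past the last character. */
static int at_end_buggy(const SketchBuffer *b, size_t charIndex)
{
    assert(b != NULL);
    return charIndex >= b->byteLength;
}

/* Fixed: both sides of the comparison are counted in characters. */
static int at_end_fixed(const SketchBuffer *b, size_t charIndex)
{
    assert(b != NULL);
    return charIndex >= b->charLength;
}
```

For a three-character buffer, the fixed test reports the end at index 3, while
the buggy one still reports "not at end" there. Reading past the end in this
way does not necessarily crash, which would be consistent with the overflow
going unnoticed for a long time.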
-------- 09/03/1999 tcLex version 1.2a1 --------
1. Added support for Tcl8.2 and higher. Now that Tcl8.2's regexp engine provides
the features needed by tcLex (i.e. string overrun detection and matching at the
beginning of the string), tcLex no longer needs a patched version of this
engine. This makes the code much simpler as it now uses standard Tcl library
functions. Added file RE82.c.
2. The input string is now stored as a Tcl_Obj instead of a Tcl_DString.
Reworked the related code accordingly (RuleTry(), RuleExec(),
RuleGetRange()). Under Tcl8.0, use the obj's 8-bit string. Under Tcl8.2, use the
obj's Unicode (not UTF-8) string (actually, only pass the string obj to the Tcl
library procs, which in turn use the obj's Unicode representation). Under
Tcl8.1, added a Unicode object type and related procs (eg. Tcl_NewUnicodeObj(),
Tcl_GetUnicode() and Tcl_GetCharLength()) to be source compatible with Tcl8.2.
These new Unicode objects use Unicode Tcl_DStrings as their internal rep.
3. Modified "lexer begin initial" behavior so that it empties the conditions
stack rather than pushing the "initial" condition on top of it. This makes some
lexers easier to write (eg. Neil Walker's flex examples).
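
The new "lexer begin initial" behavior from item 3 can be sketched with a small
conditions stack. This is an assumption-laden illustration, not tcLex source:
the CondStack type, MAX_DEPTH and both functions are hypothetical, and the
sketch assumes that "begin" pushes any condition other than "initial".

```c
#include <assert.h>
#include <string.h>

#define MAX_DEPTH 16

/* Hypothetical conditions stack; names and layout are illustrative only. */
typedef struct {
    const char *names[MAX_DEPTH];
    int depth; /* number of pushed conditions; 0 means "initial" */
} CondStack;

/* "begin <name>": push a condition, except that "initial" empties the
 * stack instead of being pushed on top of it.
 * Returns 0 on success, -1 if the stack is full. */
static int cond_begin(CondStack *st, const char *name)
{
    assert(st != NULL && name != NULL);
    if (strcmp(name, "initial") == 0) {
        st->depth = 0;
        return 0;
    }
    if (st->depth >= MAX_DEPTH) {
        return -1;
    }
    st->names[st->depth++] = name;
    return 0;
}

/* Current condition: the top of the stack, or "initial" when it is empty. */
static const char *cond_current(const CondStack *st)
{
    assert(st != NULL);
    return st->depth > 0 ? st->names[st->depth - 1] : "initial";
}
```

After beginning "comment" and then "string", a "begin initial" leaves the stack
empty, so no stale conditions remain underneath the initial one.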