Tcllib package indexing
=======================
This document describes the possibilities for using one or more
pkgIndex.tcl files in an installation of tcllib to provide the
information about all of its packages to a Tcl interpreter, discusses
their pros and contras, and makes a choice for Tcllib 1.4. A roadmap of
future changes is available as an appendix.
Background against which the solutions should be seen:
There are three levels of grouping:
- The tcllib project itself
- Modules in the project (== subdirectory of 'modules')
- Packages in a module.
Each module currently contains one package index file.
Some modules contain more than one package. They share the index.
Most packages require specific versions of the Tcl
interpreter. They perform the checks in their package index
file and do not register if the pre-requisites are not
fulfilled.
Other checks are possible, but currently not in use.
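
As a minimal sketch of such a check, a module package index could look
like the following. The package name 'example', its version and the
required Tcl version are invented for illustration; $dir is set by the
package management before it sources the index.

```tcl
# Hypothetical module index (pkgIndex.tcl). Names and versions are
# assumptions, not taken from an actual tcllib module.

# Do not register anything if the interpreter does not satisfy the
# requirement of the package.
if {![package vsatisfies [package provide Tcl] 8.2]} {return}

package ifneeded example 1.0 [list source [file join $dir example.tcl]]
```

Because the check sits in the module index, every solution below which
reuses the module indices inherits it without further work.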
Note I:
Whether a solution is actually applicable depends on external
factors, like the chosen directory layout of an installed
tcllib.
Note II:
All solutions currently depend on the specific implementation
of [tclPkgUnknown] coming with the basic core, simply by the
fact that the files looked at are called 'pkgIndex.tcl'. This
is therefore no contra argument against any specific solution,
but against all. We ignore this as currently there is no
better replacement in existence.
Note III:
We have to support Tcl releases before 8.3, as some packages in
tcllib allow this.
[i1/ng] No global package index
-------------------------------
In this solution the module package indices are the only index
files present in an installation.
This solution is applicable if and only if one of the flat
directory layouts (L2/Fa or L2/Fb) has been chosen.
Pro:
Simple. No need for complex management.
[i2/ad] Global package index, auto_path extension, direct
---------------------------------------------------------
A single global package index is present in the toplevel
directory of the installation.
This solution is applicable if and only if the deep directory
layout (L2/D) has been chosen.
The package index contains a series of statements extending
the auto_path variable with all module directories. The list
of names of the module directories is hardcoded. In other
words, it is _not_ determined via [glob].
Example:
lappend auto_path [file join $dir md4]
lappend auto_path [file join $dir md5]
lappend auto_path [file join $dir sha1]
...
Pro:
[[0]] Compared to [i3/ag] this should be a bit faster
as glob'ing the directory tree of tcllib is
avoided. This performance-boost is not a big
pro according to the opinions below.
[[1]] Relies on the module package index files for
the actual registration of packages, thus
automatically inherits the correct constraints
on the registration of packages. No additional
complexities.
[[2]] Easier to generate than [i6/dr].
Contra:
[[3]] Hard coding the directory names implies that
adding modules to the installed tcllib is not
as easy as just creating a new directory for
the module/package. The global index has to be
updated too.
Contra-Contra:
<<Don: New, updated packages should be
installed separately, outside of
tcllib. The ticked version number
ensures that it is preferred over the
package in tcllib.>>
<<AK: Agree fully>>
[[4]] Extending the 'auto_path' list causes the
package management of the tcl core to re-read
the list and glob through all of them for new
package indices. This has a high cost in terms
of filesystem access, i.e. is an issue of
performance.
Contra-Contra:
<<Don: IMHO, it's not really tcllib's
job to fix [tclPkgUnknown]'s
performance problems. If performance
is a problem, users could try the
patch with Tcl Feature Request 680169
and see if it helps.>>
<<AK: Not sure yet about this>>
[[5]] This enables auto-loading in each module
(according to any tclIndex file installed).
This should not be done by the package
indexer, but by the package itself. See
the module 'control' for an example.
[[10]] Will not work with Tcl releases prior to
8.3.1. Only then was [tclPkgUnknown]
"enhanced" to deal with changing ::auto_path
values. If tcllib 1.4 wishes to continue
supporting pre-8.3.1 Tcl, then this option has
to be supplemented with a fallback.
[i3/ag] Global package index, auto_path extension, glob
-------------------------------------------------------
This is like [i2/ad], except that the list of sub directories
is not hardcoded into the index, but determined through glob.
Example:
foreach subdir [glob -nocomplain -type d $dir/*] {
lappend auto_path $subdir
}
Pro:
Anti-[[3]]
[[1]]
Contra:
All the contras of [i2/ad] and Anti-[[0]].
[i4/sd] Global package index, sourcing module indices, direct
-------------------------------------------------------------
A single global package index is present in the toplevel
directory of the installation.
This solution is applicable if and only if the deep directory
layout (L2/D) has been chosen.
The package index contains a series of statements source'ing
the package index files of the modules in tcllib. The list
of names of the module directories is hardcoded. In other
words, it is _not_ determined via [glob].
Example:
set main $dir
set dir [file join $main md4] ; source [file join $dir pkgIndex.tcl]
set dir [file join $main md5] ; source [file join $dir pkgIndex.tcl]
set dir [file join $main sha1] ; source [file join $dir pkgIndex.tcl]
...
Pro:
[[0]], but compared to [i5/sg].
[[1]]
[[2]]
[[6]] In contrast to [i2/ad] and [i3/ag] repeated
glob'ing for package index files is
avoided. This cuts down on costly FS accesses.
I.e. another perf. boost.
Contra:
[[3]]
[i5/sg] Global package index, sourcing module indices, glob
-----------------------------------------------------------
This is like [i4/sd], except that the list of package indices
to source is not hardcoded into the index, but determined
through glob.
Example:
foreach subdir [glob -nocomplain -type d $dir/*] {
set dir $subdir
source [file join $dir pkgIndex.tcl]
}
Pro:
Anti-[[3]]
[[1]]
[[2]]
Contra:
All the contras of [i4/sd], and Anti-[[0]]
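
The glob-based example in this section raises an error when a
subdirectory contains no pkgIndex.tcl, and it leaves 'dir' pointing at
the last module. A possible, more defensive variant is sketched below.
It is an illustration only, not the code of any tcllib release; the
variable name 'main' follows the [i4/sd] example.

```tcl
# Sketch of a more defensive [i5/sg] index.
# $dir is provided by the package management when it sources this file.
set main $dir

# Glob directly for the index files, so that subdirectories without a
# pkgIndex.tcl are skipped instead of causing an error.
foreach index [glob -nocomplain [file join $main * pkgIndex.tcl]] {
    # Module indices expect 'dir' to name their own directory.
    set dir [file dirname $index]
    source $index
}

# Restore 'dir' for anything evaluated after this index.
set dir $main
```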
[i6/dr] Global package index, direct registration
-------------------------------------------------
A single global package index is present in the toplevel
directory of the installation.
This solution is applicable if and only if the deep directory
layout (L2/D) has been chosen.
The package index contains a series of statements which
directly register all the tcllib packages.
Example:
if {[constraint]} {return}
package ifneeded md4 <version> [list source [file join $dir md4 md4.tcl]]
package ifneeded md5 <version> [list source [file join $dir md5 md5.tcl]]
package ifneeded sha1 <version> [list source [file join $dir sha1 sha1.tcl]]
... more constraints ... package ifneeded
Pro:
[[7]] This is the fastest solution as the number of
accesses to the filesystem is minimal.
Contra:
[[3]]
Anti-[[1]] Care has to be taken to ensure that
the constraints the module indices
place on the registration of packages
are replicated in the global
index. All other solutions simply use
the module indices and thus get it
right automatically. Now supporting
code is required to detect such
constraints and then to properly
recreate them globally.
= High complexity for the maintainer.
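
To illustrate the Anti-[[1]] argument, the sketch below shows what
replicating per-module constraints in a global index might look like.
All version numbers, the Tcl requirements and the module name
'hypothetical' are invented for this example; the real values would
have to be extracted from each module index.

```tcl
# Sketch of an [i6/dr] index with replicated constraints.
# $dir is provided by the package management.

# Constraint copied from the (assumed) md4 and md5 module indices.
if {[package vsatisfies [package provide Tcl] 8.2]} {
    package ifneeded md4 1.0 [list source [file join $dir md4 md4.tcl]]
    package ifneeded md5 1.0 [list source [file join $dir md5 md5.tcl]]
}

# A module with a different (assumed) requirement needs its own guard.
if {[package vsatisfies [package provide Tcl] 8.4]} {
    package ifneeded hypothetical 0.1 \
        [list source [file join $dir hypothetical hypothetical.tcl]]
}
```

Each guard has to be kept in sync with the module index it was copied
from, which is the maintenance cost named above.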
[i7/ad] Global package index, auto_path extension, main directory
-----------------------------------------------------------------
A single global package index is present in the toplevel
directory of the installation.
This solution is applicable if and only if the deep directory
layout (L2/D) has been chosen.
The package index contains a single statement extending the
auto_path variable with the tcllib main directory. The
standard package management will then find all module sub
directories and the package indices in them.
Example:
lappend auto_path $dir
Pro:
[[1]]
[[8]] This is the easiest solution by far in terms
of code to write, and complexities to solve
(none).
[[9]] <<Don: I believe this is the only proposal listed
that actually fixes tcllib Bug 720318
(successful [package require] of packages
within a SafeBase) because it is the only one
that changes the value of ::auto_path.>>
<<AK: This is true, yet brittle. It depends on
when the SafeBase sees the auto_path. If it
happens to be before a [package require
something] forced the reading of all package
indices (and thus the extension of
'auto_path') we are still SOL.>>
Contra:
[[4]]
[[10]]
[i8/pm] Global package index, pkg_mkIndex
-----------------------------------------
Just use [pkg_mkIndex modules */*.tcl] to generate the master index.
Pro:
Easy to do.
Contra:
Does not handle constraints in subordinate package
indices, simply because they are actually ignored
during processing.
Adding code to handle constraints evolves this into
[i6/dr].
Note: The contra is hard enough IMHO to make this solution not
applicable for 1.4, which does have constraints, and handling
them wrong (not at all) is a bug.
General discussion
------------------
Given that a deep directory layout was chosen [i1/ng] is not
applicable and therefore dropped from the discussion.
In the pro and contra arguments listed above three independent axes of
reasoning emerged:
a) Performance of the solution, with the number of accesses to
filesystem the main factor determining it.
b) Complexity/difficulty of the solution with regard to
adding/updating packages.
c) Complexity of generating the master index.
Axis (b) has essentially been thrown out. Trying to modify the
installation of tcllib itself is bad practice. Install new/updated
packages separately. The version numbering takes care of the rest,
i.e. usage of the new over the older version found in tcllib.
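
The effect of the version numbering can be tried out in a plain tclsh.
In the sketch below the package 'demo' and both registrations are made
up; the first stands for the copy inside tcllib, the second for a
separately installed update.

```tcl
# Two registrations of the same (hypothetical) package.
package ifneeded demo 1.0 {set ::demo_origin tcllib;   package provide demo 1.0}
package ifneeded demo 1.1 {set ::demo_origin separate; package provide demo 1.1}

# Without a version argument the highest registered version is chosen.
set loaded [package require demo]
puts "$loaded from $::demo_origin"   ;# prints "1.1 from separate"
```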
With respect to axis (c), complexity of generation, [i7/ad] is the
definite winner, with the other *d solutions close behind (all use
fixed scripts, [i7/ad] wins on size). This is followed by the *g
solutions as they require actual dynamic generation of code. And at
the bottom of the ladder is [i6/dr] with its need for close inspection
of the sub-ordinate indices to get everything right.
On axis (a), performance, [i6/dr] is most likely the winner as it
causes only one index to be read and nothing else. This is followed by
all the *d solutions, which read the subordinate indices, but do not
need much glob'ing. The actual order in this group is difficult to
determine. I guess that the auto_path extending methods are slower
than the sourcing methods, and that adding one directory is faster than
adding all of them, as the latter looks for many more subdirectories.
The last group are the *g solutions, as they perform their own glob'ing
in addition to that done by the package management.
Two final rankings:
(c), then (a)    (a), then (c)
-------------    -------------
[i7/ad]          [i6/dr]
[i4/sd]          [i4/sd]
[i2/ad]          [i7/ad]
[i5/sg]          [i2/ad]
[i3/ag]          [i5/sg]
[i6/dr]          [i3/ag]
-------------    -------------
[i4/sd] seems to be a good compromise solution between performance and
complexity of generation, but [i7/ad] is not too bad either.
[i4/sd] reminder:
set main $dir
set dir [file join $main md4] ; source [file join $dir pkgIndex.tcl]
set dir [file join $main md5] ; source [file join $dir pkgIndex.tcl]
set dir [file join $main sha1] ; source [file join $dir pkgIndex.tcl]
...
[i7/ad] reminder:
lappend auto_path $dir
Other opinions:
Don Porter prefers [i7/ad], with [i6/dr] as second choice and
also as the [i7/ad] fallback for Tcl releases before 8.3.1.
Joe English strictly opposes any solution modifying the
auto_path, as this violates his position that index scripts
should have no side-effects beyond registering a package.
Chosen solution for Tcllib 1.4
------------------------------
After comparing the code for the combination of [i7/ad] and [i6/dr] as
submitted by Don Porter, and for [i4/sd] as submitted by myself
(Andreas), and a small discussion on the Tcl'ers chat between Don and
me, we took [i4/sd] for the main body of the index, and the header of
Don's code. Basically the chosen package index is a combination of
[i7/ad] and of [i4/sd] as fallback.
This is still as easy to generate as [i4/sd], the index is only a
bit more complex, and speed should be okay too.
Don convinced me that while extending auto_path is definitely bad in
the long-term it is still okay for the short-term and release 1.4.
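
A rough sketch of how such a combined index might be laid out follows.
It is not the file shipped with Tcllib 1.4; the version test and the
module list (md4, md5, sha1, taken from the examples above) are
assumptions made for illustration.

```tcl
# Sketch of a combined [i7/ad] + [i4/sd] index.
# $dir is provided by the package management.

# [i7/ad]: let the standard package management find the module indices.
if {[lsearch -exact $::auto_path $dir] == -1} {
    lappend ::auto_path $dir
}

# From Tcl 8.3.1 on a changed auto_path is picked up, so nothing more
# is needed. The catch guards against patchlevels which the running
# interpreter does not accept as a version number; in that case we
# simply fall through to the fallback.
if {![catch {package vcompare [info patchlevel] 8.3.1} cmp] && $cmp >= 0} {
    return
}

# [i4/sd] fallback for older interpreters: source the module indices.
set main $dir
set dir [file join $main md4]  ; source [file join $dir pkgIndex.tcl]
set dir [file join $main md5]  ; source [file join $dir pkgIndex.tcl]
set dir [file join $main sha1] ; source [file join $dir pkgIndex.tcl]
set dir $main
```

The fallback part is the same fixed script as in [i4/sd], so generating
the index stays as simple as before.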
Roadmap
-------
Once Tcllib has reached the state of one package per module
directory and has switched to a flat directory layout for its
installation, we will switch to [i1/ng] for the indexing structure.
-----------------------------------
This document is in the public domain.
Andreas Kupries <andreas_kupries@users.sf.net>