% This file was created automatically from hash2.msk.
% DO NOT EDIT!
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%
%A hash2.msk GAP documentation Gene Cooperman
%A Scott Murray
%A Alexander Hulpke
%%
%A @(#)$Id: hash2.msk,v 1.8 2002/04/15 10:02:30 sal Exp $
%%
%Y (C) 2000 School Math and Comp. Sci., University of St. Andrews, Scotland
%Y Copyright (C) 2002 The GAP Group
%%
\PreliminaryChapter{Dictionaries and General Hash Tables}
People and computers spend a large amount of time searching.
A dictionary is an abstract data structure that facilitates searching for
certain objects. An important way of implementing dictionaries is via hash
tables.
*The functions and operations described in this chapter have been added
very recently and are still undergoing development. It is conceivable that
names of variants of the functionality might change in future versions. If
you plan to use these functions in your own code, please contact us.*
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\Section{Dictionaries}
\>IsDictionary( <obj> ) C
A dictionary is a growable collection of objects that permits adding
objects (with associated values) and checking whether an object is
already known.
\>IsLookupDictionary( <obj> ) C
A *lookup dictionary* is a dictionary that permits not only checking
whether an object is contained, but also retrieving the associated value,
using the operation `LookupDictionary'.
\>KnowsDictionary( <dict>, <key> ) O
checks whether <key> is known to the dictionary <dict> and returns
`true' or `false' accordingly. <key> *must* be an object of the kind for
which the dictionary was specified; otherwise the results are
unpredictable.
\>LookupDictionary( <dict>, <key> ) O
looks up <key> in the lookup dictionary <dict> and returns the
associated value. If <key> is not known to the dictionary, `fail' is
returned.
Dictionaries can be implemented in several ways: as lists, as
sorted lists, as hash tables, or via binary lists. A user, however, only
has to call `NewDictionary' to obtain a ``suitable'' dictionary
for the kind of objects she wants to store. It is nevertheless possible to
explicitly create hash tables (see~"General hash table definitions and operations")
and dictionaries using binary lists (see~"DictionaryByPosition").
\>NewDictionary( <obj>, <look>[, <objcoll>] ) F
creates a new dictionary for objects such as <obj>. If <objcoll> is
given, the dictionary will be for objects only from this collection;
knowing this can improve the performance. If <objcoll> is given, <obj>
may be replaced by `false', i.e. no sample object is needed.
The function tries to find the right kind of dictionary for the basic
dictionary functions to be quick.
If <look> is `true', the dictionary will be a lookup dictionary,
otherwise it is an ordinary dictionary.
The use of two objects, <obj> and <objcoll>, to parametrize the objects a
dictionary is able to store might look confusing. However, there are
situations where either of them might be needed:
The first situation is that of objects, for which no formal ``collection
object'' has been defined. A typical example here might be subspaces of
a vector space. {\GAP} does not formally define a ``Grassmannian'' or
anything else to represent the multitude of all subspaces. So it is only
possible to give the dictionary a ``sample object''.
The other situation is that of an object which might represent quite
varied domains. The permutation $(1,10^6)$ might be the nontrivial
element of a cyclic group of order 2, or it might be a representative of
$S_{10^6}$. In the former case the best approach might be just to
have two entries for the two possible objects; in the latter case a
much more elaborate approach might be needed.

An algorithm that creates a dictionary will usually know a priori from what
domain all the objects will come. Giving this domain permits the use of a
more efficient dictionary.
This is particularly true for vectors. From a single vector one cannot
decide whether a calculation will take place over the smallest field
containing all its entries or over a larger field.
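
The basic dictionary operations can be illustrated by the following sketch,
which stores permutations together with strings as associated values. It
relies on `AddDictionary( <dict>, <key>, <value> )' to add entries; this
operation is not described in this section, so its use here is an
assumption about the surrounding library. Only a sample object is given,
no domain. No output is shown, since the kind of dictionary chosen by
`NewDictionary' may vary.

\begintt
# sample object (1,2,3); the second argument true requests a lookup dictionary
dict := NewDictionary( (1,2,3), true );
# assumption: AddDictionary stores a key together with its associated value
AddDictionary( dict, (1,2), "transposition" );
AddDictionary( dict, (1,2,3), "3-cycle" );
# membership tests: the first key was added, the second was not
KnowsDictionary( dict, (1,2) );
KnowsDictionary( dict, (1,3) );
# retrieval: for a key that is not known, fail is returned
LookupDictionary( dict, (1,2,3) );
LookupDictionary( dict, (1,3) );
\endtt
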
As there are situations where the approach via binary lists is explicitly
desired, such dictionaries can be created deliberately.
\>DictionaryByPosition( <list>, <lookup> ) F
creates a new (lookup) dictionary which uses `PositionCanonical' in
<list> for indexing. The dictionary will have an entry `<dict>!.blist'
which is a bit list corresponding to <list> indicating the known
values. If <lookup> is `true', the dictionary will be a lookup dictionary,
otherwise it is an ordinary dictionary.
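
As a minimal sketch, a position dictionary over a short list of integers
might be used as follows. The variable names are illustrative, and
`AddDictionary' is again assumed to be available although it is not
described in this section. Every key must occur in the list, since it is
located there via `PositionCanonical'.

\begintt
dom := [ 2, 3, 5, 7, 11 ];
dict := DictionaryByPosition( dom, true );
# assumption: AddDictionary marks the position of 5 in dom as known
AddDictionary( dict, 5, "five" );
KnowsDictionary( dict, 5 );
KnowsDictionary( dict, 7 );
LookupDictionary( dict, 5 );
# the bit list recording which positions of dom are known
dict!.blist;
\endtt
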
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\Section{General Hash Tables}
This section describes hash tables for general objects.
Each key is hashed and stored together with an associated value. Keys
cannot be removed from the table, but the corresponding value can be
changed. Fast access to the last hash index allows you to efficiently store
more than one array of values; this facility should be used with care.

This code works for any kind of object, provided there is a `DenseIntKey'
or `SparseIntKey' method to convert the key into a positive integer.
These methods should ideally be implemented efficiently in the core.

Note that, for efficiency, it is currently impossible to create a
hash table with non-positive integer keys.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\Section{General hash table definitions and operations}
\>IsHash( <obj> ) C
The category of hash tables for arbitrary objects (provided an `IntKey'
function is defined).
\>PrintHashWithNames( <hash>, <keyName>, <valueName> ) O
Print a hash table with the given names for the keys and values.
\>GetHashEntry( <hash>, <key> ) O
If <key> is in <hash>, return the corresponding value. Otherwise
return `fail'. Note that it is therefore not a good idea to use `fail' as
a value.
\>AddHashEntry( <hash>, <key>, <value> ) O
Add the key and value to the hash table.
\>RandomHashKey( <hash> ) O
Return a random key from the hash table (`Random' returns a random value).
\>HashKeyEnumerator( <hash> ) O
Enumerates the keys of the hash table (`Enumerator' enumerates values).
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\Section{Hash keys}
The crucial step of hashing is to transform key objects into integers such
that equal objects produce the same integer.
\>TableHasIntKeyFun( <hash> ) P
If this filter is set, the hash table has an `IntKey' function in its
component `<hash>!.intKeyFun'.
The actual function used depends strongly on the type of the objects. However,
{\GAP} already provides key functions for some commonly encountered objects.
\>DenseIntKey( <objcoll>, <obj> ) O
returns a function that can be used as hash key function for objects
such as <obj> in the collection <objcoll>. <objcoll> typically will be a
large domain. If the domain is not available, it can be given as
`false' in which case the hash key function will be determined only
based on <obj>. (For a further discussion of these two arguments
see~`NewDictionary', section~"NewDictionary").
The function returned by `DenseIntKey' is guaranteed to give different
values for different objects.
If no suitable hash key function has been predefined, `fail' is returned.
\>SparseIntKey( <objcoll>, <obj> ) O
returns a function that can be used as hash key function for objects
such as <obj> in the collection <objcoll>. In contrast to `DenseIntKey',
the function returned may return the same key value for different
objects.
If no suitable hash key function has been predefined, `fail' is returned.
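
The following sketch shows how the result of `SparseIntKey' might be
handled. It assumes that the returned function takes a single object as
its argument; whether a key function is predefined for permutations
depends on the installed methods, so the possible result `fail' is checked
first. The variable names are illustrative.

\begintt
# no domain is supplied, so false is passed in place of the collection
keyfun := SparseIntKey( false, (1,2,3) );
if keyfun <> fail then
    # (1,2)*(1,3) equals (1,2,3); equal objects must produce the same integer
    same := keyfun( (1,2,3) ) = keyfun( (1,2)*(1,3) );
fi;
\endtt
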
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\Section{Dense hash tables}
Dense hash tables are used for hashing dense sets without collisions,
in particular integers.
A dense hash table stores its keys as an unordered list and its values as an
array with holes. The position of a value is given by the attribute
`IntKeyFun' or by the function returned by `DenseIntKey',
and so this function must be one-to-one.
\>DenseHashTable( ) F
Construct an empty dense hash table. This is the only correct way to
construct such a table.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\Section{Sparse hash tables}
Sparse hash tables are used for hashing sparse sets.
A sparse hash table stores its keys as an array, with `fail'
denoting an empty position, and its values as an array with holes.
The position of a key is obtained by applying `HashFunct' to the integer
given by `IntKeyFun' (respectively by the function returned by
`SparseIntKey') for the key. `DefaultHashLength'
is the default initial length of the hash table; the table is doubled in
size when it becomes half full.
\>SparseHashTable( [<intkeyfun>] ) F
Construct an empty sparse hash table. This is the only correct way to
construct such a table.
If the argument <intkeyfun> is given, this function will be used to
obtain numbers for the keys passed to it.
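
A minimal sketch of the hash table operations described in this chapter,
assuming that <intkeyfun> is called with the key as its only argument. The
key function used here is a hypothetical choice for illustration: the keys
are positive integers that serve as their own integer keys.

\begintt
# hypothetical key function: a positive integer is its own integer key
hash := SparseHashTable( n -> n );
AddHashEntry( hash, 17, "seventeen" );
AddHashEntry( hash, 4, "four" );
# retrieve the value stored for a known key
val := GetHashEntry( hash, 17 );
# 5 was never added, so fail is returned
missing := GetHashEntry( hash, 5 );
# keys cannot be removed, but the value of an existing key can be changed
SetHashEntry( hash, 4, "vier" );
\endtt
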
\>GetHashEntryIndex( <hash>, <key> ) F
If <key> is in <hash>, return its index in the hash array.
\>DoubleHashArraySize( <hash> ) F
Double the size of the hash array and rehash all the entries.
This will also happen automatically when the hash array is half full.
In sparse hash tables, the integer obtained from the hash key is then
transformed to an index position. This transformation is done using the hash
function `HashFunct':
\>HashFunct( <key>, <i>, <size> ) F
This is intended to be a good double hashing function for any reasonable
integer key function (see Cormen, Leiserson and Rivest, Introduction to
Algorithms, 1st edition, p. 235).
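
In the textbook double hashing scheme, the $i$-th probe for an integer key
$k$ in a table of length $m$ is
$h(k,i) = (h_1(k) + i\,h_2(k)) \bmod m$, where $h_2(k)$ is coprime to $m$,
so that the probes visit every position. The following sketch illustrates
this scheme; it is *not* the definition of `HashFunct'. The function name
`ProbeIndex', the choice of $h_1$ and $h_2$, and the assumption that the
table length is a power of 2 (at least 2) are made for illustration only.

\begintt
# Hypothetical illustration of double hashing, not the actual HashFunct.
ProbeIndex := function( key, i, size )
    local h1, h2;
    h1 := key mod size;
    # an odd step is coprime to a table length that is a power of 2
    h2 := 1 + 2 * ( key mod QuoInt( size, 2 ) );
    # shift by 1, since list positions start at 1
    return 1 + ( h1 + i * h2 ) mod size;
end;
# probe positions for key 10 in a table of length 8:
# this gives [ 3, 8, 5, 2, 7, 4, 1, 6 ], each position exactly once
List( [ 0 .. 7 ], i -> ProbeIndex( 10, i, 8 ) );
\endtt
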
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\Section{Fast access to last hash index}
These functions allow you to use the index of the last hash access or
modification. Note that this index is global across all hash tables. If you
want to have two hash tables with identical layouts, the following works:
\begintt
GetHashEntry( hashTable1, object ); GetHashEntryAtLastIndex( hashTable2 );
\endtt
These functions should be used with extreme care, as they bypass most
of the inbuilt error checking for hash tables.
\>GetHashEntryAtLastIndex( <hash> ) O
Returns the value of the last hash entry accessed.
\>SetHashEntryAtLastIndex( <hash>, <newValue> ) O
Resets the value of the last hash entry accessed.
\>SetHashEntry( <hash>, <key>, <value> ) O
Resets the value corresponding to <key>.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%
%E