File: table.html

package info (click to toggle)
gobo 1.5-2
  • links: PTS
  • area: main
  • in suites: potato
  • size: 7,728 kB
  • ctags: 13,070
  • sloc: ansic: 85,961; lex: 2,758; yacc: 2,298; sh: 1,464; makefile: 64
file content (298 lines) | stat: -rw-r--r-- 16,390 bytes parent folder | download
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
<html>

<head>
<meta http-equiv="Content-Type"
content="text/html; charset=iso-8859-1">
<meta name="GENERATOR" content="Microsoft FrontPage 2.0">
<title>Tables</title>
</head>

<body bgcolor="#FFFFFF">

<table border="0" width="100%">
    <tr>
        <td><font size="6"><strong>Tables</strong></font></td>
        <td align="right"><a href="list.html"><img
        src="../image/previous.gif" alt="Previous" border="0"
        width="40" height="40"></a><a href="dispenser.html"><img
        src="../image/next.gif" alt="Next" border="0" width="40"
        height="40"></a></td>
    </tr>
</table>

<hr size="1">

<p align="center"><!--webbot bot="ImageMap" startspan
rectangle=" (131,235) (233, 267)  flatshort/ds_hash_table.html"
rectangle=" (124,160) (242, 191)  flatshort/ds_sparse_table.html"
rectangle=" (258,85) (350, 115)  flatshort/ds_resizable.html"
rectangle=" (141,84) (230, 116)  flatshort/ds_bilinear.html"
rectangle=" (22,85) (107, 116)  flatshort/ds_table.html"
rectangle=" (18,9) (110, 40)  flatshort/ds_container.html"
src="image/table.gif" border="0" width="368" height="279" --><MAP NAME="FrontPageMap0"><AREA SHAPE="RECT" COORDS="131, 235, 233, 267" HREF="flatshort/ds_hash_table.html"><AREA SHAPE="RECT" COORDS="124, 160, 242, 191" HREF="flatshort/ds_sparse_table.html"><AREA SHAPE="RECT" COORDS="258, 85, 350, 115" HREF="flatshort/ds_resizable.html"><AREA SHAPE="RECT" COORDS="141, 84, 230, 116" HREF="flatshort/ds_bilinear.html"><AREA SHAPE="RECT" COORDS="22, 85, 107, 116" HREF="flatshort/ds_table.html"><AREA SHAPE="RECT" COORDS="18, 9, 110, 40" HREF="flatshort/ds_container.html"></MAP><img src="image/table.gif" border="0" width="368" height="279" usemap="#FrontPageMap0"><!--webbot
bot="ImageMap" i-checksum="35398" endspan --></p>

<h2>Class <em><tt>DS_TABLE</tt></em></h2>

<p>Tables are data structures whose items are accessible by keys.
Therefore the class <a href="flatshort/ds_table.html"><font
color="#008080"><em><tt>DS_TABLE</tt></em></font></a> has two
generic parameters. As for all other container classes, the first
parameter <font color="#008080"><em><tt>G </tt></em></font>represents
the type of the items, whereas the second parameter <font
color="#008080"><em><tt>K</tt></em></font> is the type of the
keys which are associated with the items. The features provided
by class <font color="#008080"><em><tt>DS_TABLE </tt></em></font>are
very similar to some of those of class <a
href="container.html#DS_INDEXABLE"><font color="#008080"><em><tt>DS_INDEXABLE</tt></em></font></a>.
The difference comes from the fact that in <font color="#008080"><em><tt>DS_INDEXABLE
</tt></em></font>keys are contiguous integers whereas in <font
color="#008080"><em><tt>DS_TABLE </tt></em></font>keys can be of
any type, and hence not necessarily contiguous. The main features
introduced in class <font color="#008080"><em><tt>DS_TABLE </tt></em></font>are
<a href="flatshort/ds_table.html#item"><font color="#008080"><em><tt>item</tt></em></font></a><font
color="#008080"><em><tt> </tt></em></font>to access items by key,
<a href="flatshort/ds_table.html#put"><font color="#008080"><em><tt>put</tt></em></font></a>
to insert new items and <a href="flatshort/ds_table.html#replace"><font
color="#008080"><em><tt>replace</tt></em></font></a> to associate
an existing key with another item, <a
href="flatshort/ds_table.html#remove"><font color="#008080"><em><tt>remove</tt></em></font></a>
to remove an item and its associated key from the table, <a
href="flatshort/ds_table.html#valid_key"><font color="#008080"><em><tt>valid_key</tt></em></font></a>
to check whether a key can be used in the table and <a
href="flatshort/ds_table.html#has"><font color="#008080"><em><tt>has</tt></em></font></a>
to check whether an item has already been inserted in the table
with a given key.</p>

<h2>Class <em><tt>DS_HASH_TABLE</tt></em></h2>

<p>One possible implementation of tables is hash tables. A hash
table is typically made up of an array where items are accessed
by integer index. Therefore the keys used in the hash tables
should provide a means to yield such integer value through a
hashing mechanism. This is exactly what feature <font
color="#008080"><em><tt>hash_code</tt></em></font> from <font
color="#008080"><em><tt>HASHABLE</tt></em></font> is for, and
therefore the second generic parameter of <a
href="flatshort/ds_hash_table.html"><font color="#008080"><em><tt>DS_HASH_TABLE</tt></em></font></a>
is constrained by <font color="#008080"><em><tt>HASHABLE</tt></em></font>.
Thanks to this implementation, features of hash tables are
usually more efficient than linked implementations since access
time in an array is bounded by a constant regardless of the
number of items in the container. However the hash code
associated with the keys is not necessarily unique, and therefore
collisions may happen and hence slow down the process. No
optimization effort has been made when writing <font
color="#008080"><em><tt>DS_HASH_TABLE</tt></em></font> in order
to take care of collisions. In particular it doesn't take
advantage of the well known algorithms using prime numbers when
collisions occur. The implementation used here is very simple and
was deemed satisfactory enough for its usage in the rest if the <em>Gobo
Eiffel </em>libraries. In particular, some benchmarks have proven
that <font color="#008080"><em><tt>DS_HASH_TABLE </tt></em></font>behaved
better than class <font color="#008080"><em><tt>HASH_TABLE</tt></em></font>
from <a href="see_also.html#EiffelBase"><em>EiffelBase</em></a>
which uses a collision-resolution algorithm based on prime
numbers. It is not clear to me whether this difference of
performance comes from the algorithm itself or from other
differences between the implementation of both hash tables, but
it is unlikely that the implementation of <font color="#008080"><em><tt>DS_HASH_TABLE
</tt></em></font>will use more sophisticated algorithms in the
future based on these observations.</p>

<p>Another interesting remark about efficiency is that the
performance of hash tables depends on the number of collisions
that may occur in the table. Therefore the implementation for the
<font color="#008080"><em><tt>hash_code</tt></em></font> of the
keys is very important since returning often the same value for
different keys will trigger too many collisions and yield
performance degradations. If the hash table appears to be slow
and the type of the keys is one of the basic Eiffel types such as
<font color="#008080"><em><tt>STRING</tt></em></font>, it is
recommended to try with another Eiffel compiler whose
implementation of <font color="#008080"><em><tt>hash_code </tt></em></font>may
be better in order to see if it makes a difference. Finally, if
the default implementation of <font color="#008080"><em><tt>hash_code
</tt></em></font>is not optimal for a given set of keys, one can
inherit from <font color="#008080"><em><tt>DS_HASH_TABLE </tt></em></font>and
redefine its feature <font color="#008080"><em><tt>hash_code </tt></em></font>to
provide a better implementation. For example, consider a school
which keeps track of project assignments in a table indexed by
students. The obvious solution is to declare:</p>

<blockquote>
    <pre><em>assignments</em>: <em>DS_HASH_TABLE</em> [<em>PROJECT</em>, <em>STUDENT</em>]</pre>
</blockquote>

<p>However the implementation of <font color="#008080"><em><tt>hash_code
</tt></em></font>in class <font color="#008080"><em><tt>STUDENT</tt></em></font>
inherited from <font color="#008080"><em><tt>PERSON</tt></em></font>
just returns the age of the person. This implementation of <font
color="#008080"><em><tt>hash_code</tt></em></font> in class <font
color="#008080"><em><tt>PERSON</tt></em></font> is perfectly
valid in most cases, but it is clear that when dealing with
students in the same classroom it is likely that they will all
have more or less the same age and hence the same hash code. In
this particular case it is better to provide a better
implementation for <font color="#008080"><em><tt>hash_code </tt></em></font>in
<font color="#008080"><em><tt>DS_HASH_TABLE </tt></em></font>to
avoid the numerous collisions:</p>

<blockquote>
    <pre><font color="#000080"><em><strong>class</strong></em></font><em> ASSIGNMENTS
</em><font color="#000080"><em><strong>inherit</strong></em></font><em>
    DS_HASH_TABLE </em>[<em>PROJECT</em>,<em> STUDENT</em>]<em>
        </em><font color="#000080"><em><strong>redefine</strong></em></font><em>
            hash_code
        </em><font color="#000080"><em><strong>end
creation</strong></em></font><em>
    make
</em><font color="#000080"><em><strong>feature</strong></em></font><em> </em>{<em>NONE</em>} <font
color="#008000">-- Implementation</font><em>
    hash_code </em>(<em>s</em>:<em> STUDENT</em>):<em> INTEGER </em><font
color="#000080"><em><strong>is</strong></em></font><em>
            </em><font color="#008000">-- Hash code of student</font> <em>s
        </em><font color="#000080"><em><strong>do</strong></em></font><em>
            </em><font color="#008080"><em>Result</em></font><em> </em>:=<em> s</em>.<em>name</em>.<em>hash_code
        </em><font color="#000080"><em><strong>end
end</strong></em></font><em> </em><font color="#008000">-- class ASSIGNMENTS</font></pre>
</blockquote>

<p>The keys are directly stored in the hash table without being
copied. Therefore it is important that the hash code associated
with each key doesn't change while the key is stored in the hash
table. Otherwise the hashing mechanism would be broken and it
would be impossible to access the item associated with this key.
Likewise, keys are compared in the hash table using the feature <font
color="#008080"><em><tt>is_equal</tt></em></font> from class <font
color="#008080"><em><tt>GENERAL</tt></em></font>. Therefore if
the critera used to implement the function <font color="#008080"><em><tt>is_equal
</tt></em></font>are changed while the key is stored in the hash
table, this key might not be recognized properly anymore within
this hash table. Needless to say that if two keys are considered
equal, they should have the same hash code. The solution when the
hash code or equality criteria of a key are likely to vary while
the key is stored in the hash table is clone that key when
inserting an item associated with it.</p>

<p>The class <font color="#008080"><em><tt>DS_HASH_TABLE </tt></em></font>provides
traversal facilities inherited from <a
href="traversal.html#DS_BILINEAR"><font color="#008080"><em><tt>DS_BILINEAR</tt></em></font></a>.
Although all items will be visited once and only once during a
traversal, they will be traversed in an unpredictable order and
subsequent traversals may traverse the items in different orders.
This is because a hash table is not an ordered containter as can
a list be. Items are not inserted before or after other items in
the hash table but based on a hashing mechanism and
collision-resolution algorithm. Therefore, altering the hash
table by adding or removing items or by resizing the table may
change the order of the items and hence will invalidate current
traversals. For this reason, most of these routines will move <font
color="#008080"><em><tt>off </tt></em></font>all existing cursors
currently traversing the hash table (see the header comments of
these routines for details).</p>

<p>As you may now realize after reading the first paragraph
above, the performance of hash tables is one of their <em>raison
d'tre</em>. This had of course to be taken into account when
implementing the routines inherited from <font color="#008080"><em><tt>DS_TABLE</tt></em></font>.
Because of the precondition of <a
href="flatshort/ds_hash_table.html#item"><font color="#008080"><em><tt>item</tt></em></font></a>
which states that in order to be able to query an item associated
with a given key, that item has to exist in the first place, one
has to write:</p>

<blockquote>
    <pre><font color="#000080"><em><strong>if</strong></em></font><em> table</em>.<em>has </em>(<em>k</em>)<em> </em><font
color="#000080"><em><strong>then</strong></em></font><em>
    v </em>:= <em>table</em>.<em>item </em>(<em>k</em>)<em>
</em><font color="#000080"><em><strong>else</strong></em></font><em>
    ...
</em><font color="#000080"><em><strong>end</strong></em></font></pre>
</blockquote>

<p>However, both <a href="flatshort/ds_hash_table.html#has"><font
color="#008080"><em><tt>has</tt></em></font></a><font
color="#008080"><em><tt> </tt></em></font>and <a
href="flatshort/ds_hash_table.html#item"><font color="#008080"><em><tt>item</tt></em></font></a><font
color="#008080"><em><tt> </tt></em></font>will have to compute
the hash code of <font color="#008080"><em><tt>k</tt></em></font>
and deal with possible collisions in the hash table. In other
words we do twice the same thing. The solution adopted in the
current implementation of <font color="#008080"><em><tt>DS_HASH_TABLE
</tt></em></font>is to keep track of the result of last hashing
operation in the hash table in a cache. Therefore, in the code
above, most of the work of accessing the item at key <font
color="#008080"><em><tt>k</tt></em></font> will be done only once
in the routine <font color="#008080"><em><tt>has</tt></em></font>,
and <font color="#008080"><em><tt>item </tt></em></font>will
realize that the key given as argument is the same and hence
avoid calling the hashing mechanism again.</p>

<p>Another solution to avoid that would have been to get rid of
the precondition in <font color="#008080"><em><tt>item </tt></em></font>and
return <font color="#008080"><em><tt>Void</tt></em></font> when
there is no item associated with key <font color="#008080"><em><tt>k</tt></em></font>.
However this solution goes against the principle of <em>Design by
Contract</em> since getting <font color="#008080"><em><tt>Void</tt></em></font>
could either mean that there is no item for key <font
color="#008080"><em><tt>k</tt></em></font> or that there is
actually one which happens to be <font color="#008080"><em><tt>Void</tt></em></font>.
Therefore this solution has not been adopted in <font
color="#008080"><em><tt>DS_HASH_TABLE </tt></em></font>but a
better designed alternative also based on the <em>try-and-see</em>
principle is available:</p>

<blockquote>
    <pre><em>table</em>.<em>search </em>(<em>k</em>)
<font color="#000080"><em><strong>if</strong></em></font><em> table</em>.<em>found </em><font
color="#000080"><em><strong>then</strong></em></font><em>
    v </em>:= <em>table</em>.<em>found_item
</em><font color="#000080"><em><strong>else</strong></em></font><em>
    ...
</em><font color="#000080"><em><strong>end</strong></em></font></pre>
</blockquote>

<p>Both code excerpts should have the same execution time
performance thanks to <font color="#008080"><em><tt>DS_HASH_TABLE</tt></em></font>'s
internal optimizations. Using one or the other is just a question
of taste. I personally prefer the first pattern.</p>

<hr size="1">

<table border="0" width="100%">
    <tr>
        <td><address>
            <font size="2"><b>Copyright  1999</b></font><font
            size="1"><b>, </b></font><font size="2"><strong>Eric
            Bezault</strong></font><strong> </strong><font
            size="2"><br>
            <strong>mailto:</strong></font><a
            href="mailto:ericb@gobosoft.com"><font size="2">ericb@gobosoft.com</font></a><font
            size="2"><br>
            <strong>http:</strong></font><a
            href="http://www.gobosoft.com"><font size="2">//www.gobosoft.com</font></a><font
            size="2"><br>
            <strong>Last Updated:</strong> 25 September 1999</font><br>
            <!--webbot bot="PurpleText"
            preview="
$Date: 1999/10/02 11:59:41 $ 
$Revision: 1.6 $"
            --> 
        </address>
        </td>
        <td align="right" valign="top"><a
        href="http://www.gobosoft.com"><img
        src="../image/home.gif" alt="Home" border="0" width="40"
        height="40"></a><a href="index.html"><img
        src="../image/toc.gif" alt="Toc" border="0" width="40"
        height="40"></a><a href="list.html"><img
        src="../image/previous.gif" alt="Previous" border="0"
        width="40" height="40"></a><a href="dispenser.html"><img
        src="../image/next.gif" alt="Next" border="0" width="40"
        height="40"></a></td>
    </tr>
</table>
</body>
</html>