OpenCores
URL https://opencores.org/ocsvn/scarts/scarts/trunk

Subversion Repositories scarts

[/] [scarts/] [trunk/] [toolchain/] [scarts-gcc/] [gcc-4.1.1/] [libstdc++-v3/] [docs/] [html/] [ext/] [pb_assoc/] [resize_policies.html] - Blame information for rev 20

Details | Compare with Previous | View Log

Line No. Rev Author Line
1 20 jlechner
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
2
<html>
3
    <head>
4
        <title>Resize Policies</title>
5
        <meta name="GENERATOR" content="Microsoft Visual Studio .NET 7.1">
6
        <meta name="vs_targetSchema" content="http://schemas.microsoft.com/intellisense/ie5">
7
    </head>
8
<body bgcolor = "white">
9
<h1>Hash-Based Continers' Resize Policies</h1>
10
 
11
<p>
12
    This subsection describes resize policies. It is organized as follows:
13
</p>
14
 
15
<ol>
16
    <li> The <a href = "#general">General Terms</a> Subsection describes general
17
        terms.
18
    </li>
19
    <li> The <a href = "#size_policies">Size Policies</a> Subsection describes size
20
    policies.
21
    </li>
22
    <li> The <a href = "#trigger_policies">Trigger Policies</a> Subsection describes trigger
23
    policies.
24
    </li>   <li> The <a href = "#imp_pb_assoc">Implementation in <tt>pb_assoc</tt></a>
25
        Subsection describes the implementation of these concepts in <tt>pb_assoc</tt>.
26
    </li>
27
</ol>
28
 
29
 
30
<h2><a name = "general">General Terms</a></h2>
31
 
32
<p>
33
    Hash-tables, as opposed to trees, do not naturally grow or shrink. It
34
is necessary to specify policies to determine how and when a hash table should change
35
its size.
36
</p>
37
 
38
<p>
39
    In general, resize policies can be decomposed into (probably orthogonal)
40
policies:
41
</p>
42
<ol>
43
    <li> A <i>size policy</i> indicating <i>how</i> a hash table should
44
grow (<i>e.g.,</i> it should multiply by powers of 2).
45
    </li>
46
    <li> A <i>trigger policy</i> indicating <i>when</i> a hash table should
47
grow (<i>e.g.,</i> a load factor is exceeded).
48
    </li>
49
</ol>
50
 
51
 
52
 
53
<h2><a name = "size_policies">Size Policies</a></h2>
54
 
55
<p>
56
    Size policies determine how a hash table
57
changes size. These policies are simple, and there are relatively
58
few sensible options. An exponential-size policy (with the initial
59
size and growth factors both powers of 2) works well with a
60
mask-based range-hashing function (see the <a href = "hash_policies.html#hash_policies_range_hashing_policies">Range-Hashing Policies</a> subsection),
61
and is the
62
hard-wired policy used by Dinkumware
63
[<a href = "references.html#dinkumware_stl">dinkumware_stl</a>]. A
64
prime-list based policy works well with a modulo-prime range
65
hashing function (see the <a href = "hash_policies.html#hash_policies_range_hashing_policies">Range-Hashing Policies</a> subsection),
66
and is the
67
hard-wired policy used by SGI's implementation
68
[<a href = "references.html#sgi_stl">sgi_stl</a>].
69
</p>
70
 
71
 
72
 
73
 
74
<h2><a name = "trigger_policies">Trigger Policies</a></h2>
75
 
76
<p>
77
    Trigger policies determine when a hash table changes size.
78
Following is a description of two polcies: <i>load-check</i>
79
policies, and a collision-check policies.
80
</p>
81
 
82
<p>
83
    Load-check policies are straightforward. The user
84
specifies two factors, <i>&alpha;<sub>min</sub></i> and <i>&alpha;<sub>max</sub></i>, and
85
the hash table maintains the invariant that
86
</p>
87
<p>
88
    <i>
89
        <a name = "eqn:load_factor_min_max">
90
        &alpha;<sub>min</sub>
91
        &le;
92
        (number of stored elements) / (hash-table size)
93
        &le;
94
        &alpha;<sub>max</sub>
95
        </a>
96
    </i>
97
    (1)
98
    .
99
</p>
100
 
101
<p>
102
    Collision-check policies work in the opposite direction of
103
load-check policies. They focus on keeping the number of
104
collisions moderate and hoping
105
that the size of the table will not grow very large,
106
instead of keeping a moderate load-factor and
107
hoping that the number of collisions will be small.
108
A
109
maximal collision-check policy resizes when the shortest
110
probe-sequence grows too large.
111
</p>
112
 
113
 
114
<p>
115
    Consider Figure
116
<a href = "#balls_and_bins">Balls and bins</a>.
117
    Let the size of the hash table be denoted by <i>m</i>, the
118
length of a probe sequence be denoted by <i>k</i>, and some load
119
factor be denoted by &alpha;. We would like to calculate the
120
minimal length of <i>k</i>, such that if there were <i>&alpha; m</i> elements
121
in the hash table, a probe sequence of length <i>k</i> would be found
122
with probability at most <i>1/m</i>.
123
</p>
124
 
125
<h6 align = "center">
126
<a name = "balls_and_bins">
127
<img src = "balls_and_bins.jpg" width = "60%" alt = "no image">
128
</a>
129
Balls and bins.
130
</h6>
131
 
132
 
133
<p>
134
    Denote the probability that a probe sequence of length <i>k</i>
135
appears in bin <i>i</i> by <i>p<sub>i</sub></i>, the length of the probe sequence
136
of bin <i>i</i> by <i>l<sub>i</sub></i>, and assume uniform distribution.
137
Then
138
</p>
139
    <p>
140
    <a name = "eqn:prob_of_p1">
141
        <i>p<sub>1</sub>
142
        = </i>(3)
143
    </a>
144
    </p>
145
    <p>
146
    <i>
147
    <b>P</b>(l<sub>1</sub> &ge; k)
148
    =
149
    </i>
150
    </p>
151
    <p>
152
    <i><b>P</b>(l<sub>1</sub> &ge; &alpha; ( 1 + k / &alpha; - 1 )
153
    &le; </i>(a)
154
    </p>
155
    <p>
156
    <i>
157
    e
158
    ^
159
    (
160
        -
161
        (
162
            &alpha; ( k / &alpha; - 1 )<sup>2</sup>
163
        )
164
        /2
165
    )
166
    </i>
167
    ,
168
</p>
169
<p>
170
    where (a) follows from the Chernoff bound
171
[<a href = "references.html#motwani95random">motwani95random</a>].
172
To
173
calculate the probability that <i>some</i> bin contains a probe
174
sequence greater than <i>k</i>, we note that the <i>l<sub>i</sub></i> are
175
negatively-dependent
176
[<a href = "references.html#dubhashi98neg">dubhashi98neg</a>].
177
Let <i><b>I</b>(.)</i>
178
denote the indicator function. Then
179
    <p>
180
    <a name = "eqn:at_least_k_i_n_some_bin">
181
        <i><b>P</b>( exists<sub>i</sub> l<sub>i</sub> &ge; k )
182
        = </i>(3)
183
    </a>
184
    </p>
185
    <p>
186
    <i>
187
   <b>P</b>
188
    (
189
        &sum; <sub>i = 1</sub><sup>m</sup>
190
        <b>I</b>(l<sub>i</sub> &ge; k) &ge; 1
191
    )
192
    =
193
    </i>
194
    </p>
195
    <p>
196
    <i>
197
    <b>P</b>
198
    (
199
        &sum; <sub>i = 1</sub><sup>m</sup>
200
        <b>I</b>
201
        (
202
            l<sub>i</sub> &ge; k
203
        )
204
        &ge;
205
        m  p<sub>1</sub> ( 1 + 1 / (m p<sub>1</sub>) - 1 )
206
    )
207
    &le; </i>(a)
208
    </p>
209
    <p>
210
    <i>
211
    e
212
    ^
213
    (
214
        (
215
            -
216
            m p<sub>1</sub>
217
            (
218
                1 / (m p<sub>1</sub>) - 1
219
            )
220
            <sup>2</sup>
221
        )
222
        /
223
        2
224
    )
225
    ,
226
    </i>
227
    </p>
228
<p>
229
where (a) follows from the fact that the Chernoff bound can be
230
applied to negatively-dependent variables
231
[<a href = "references.html#dubhashi98neg">dubhashi98neg</a>].
232
Inserting <a href = "#prob_of_p1">(2)</a> into
233
<a href = "#at_least_k_i_n_some_bin">(3)</a>, and equating with <i>1/m</i>,
234
we obtain
235
</p>
236
<p>
237
    <i>
238
    k
239
    ~
240
    &radic;
241
    (
242
        2 &alpha; </i>ln<i> 2 m </i>ln<i>(m) )
243
    )
244
    </i>
245
    .
246
</p>
247
 
248
 
249
 
250
 
251
 
252
 
253
 
254
 
255
 
256
<h2><a name = "imp_pb_assoc">Implementation in <tt>pb_assoc</tt></a></h2>
257
 
258
<p>
259
    The resize policies in the previous subsection are conceptually straightforward. The design
260
of hash-based containers' size-related interface is complicated by some factors.
261
</p>
262
<ol>
263
    <li> Most containers, <i>i.e.</i> lists, trees, and vectors, have a single "size" concept. There is no
264
distinction between the number of entries the container holds and the number of entries it is using. This,
265
of course, is not the case for hash-based containers. Moreover, even describing the
266
"actual" size of a hash-based container (as opposed to its logical size) is difficult - a probing-based container
267
holds entries to elements, even those it does not use, while a chaining-based container holds pointers to entries.
268
    </li>
269
    <li>
270
    The policies mentioned above operate in terms of invariants. <i>E.g.</i> a load-check trigger policy
271
maintains an invariant concerning the load factor of a container object. This is sometimes too rigid:
272
    <ol>
273
        <li>In some cases it is desirable to allow controlled override of an entire policy, <i>e.g.</i> by externally resizing a container object (or giving it an initial size, which is a special case of externally resizing the container).
274
        </li>
275
        <li>
276
            In other cases it is desirable to allow changing the specifics of a policy in runtime, <i>e.g.</i>, changing the load factors of a load-check policy.
277
        </li>
278
    </ol>
279
    </li>
280
    <li>
281
    Resize policies interact strongly with hash policies. Performance-wise, for example, it is undesirable to use an exponential size policy of powers of two with a modulo range-hashing function, and it is undesirable to use a prime size policy with a mask range-hashing function. In other cases, the effects are more dramatic. For example, using a quadratic probe function with an exponential size policy will probably cause cases where the container object has available entries which are never reached by the probe function. (<a href = "hash_policies.html">Hash Policies</a>
282
discusses the previous concepts.)
283
    </li>
284
</ol>
285
 
286
<p>
287
    Clearly, the more of these points an interface addresses, the greater its flexibility but the lower its encapsulation and uniformity between associative containers.
288
</p>
289
 
290
 
291
 
292
<p>
293
        This library attempts to address these types of problems by delegating all size-related functionality to
294
policy classes. Hash-based containers
295
are parameterized by a resize-policy class (among others), and derive publicly from
296
the resize-policy class
297
[<a href = "references.html#alexandrescu01modern">alexandrescu01modern</a>]
298
 <i>E.g.</i>, a collision-chaining
299
hash table is defined as follows:
300
</p>
301
<pre>
302
cc_ht_map&lt;
303
  <b>class</b> Key,
304
  <b>class</b> Data,
305
  ...
306
  <b>class</b> Resize_Policy
307
  ...&gt; :
308
    <b>public</b> Resize_Policy
309
</pre>
310
 
311
<p>
312
    The containers themselves lack any functionality or public interface for manipulating sizes. A container
313
object merely forwards events to its resize policy object and queries it for needed actions.
314
</p>
315
 
316
<p>
317
    Figure
318
<a href = "#insert_resize_sequence_diagram1">
319
Insert resize sequence diagram
320
</a>
321
shows a (possible) sequence diagram of an insert operation.
322
The user inserts an element; the hash table
323
notifies its resize policy that a search has started (point A);
324
in this case, a single collision is encountered - the table
325
notifies its resize policy of this (point B); the container
326
finally notifies its resize policy that the search has ended (point C);
327
it then queries its resize policy whether a resize is needed, and if so,
328
what is the new size (points D to G); following the resize, it notifies
329
the policy that a resize has completed (point H); finally, the element
330
is inserted, and the policy notified (point I).
331
</p>
332
 
333
<h6 align = "center">
334
<a name = "insert_resize_sequence_diagram1">
335
<img src = "insert_resize_sequence_diagram1.jpg" width = "50%" alt = "no image">
336
</a>
337
</h6>
338
<h6 align = "center">
339
Insert resize sequence diagram.
340
</h6>
341
 
342
<p>
343
    This addresses, to some extent, the problems mentioned above:
344
</p>
345
<ol>
346
    <li>
347
        Different instantiations of range-hashing policies can be met with different instantiations of
348
    resize policies.
349
    </li>
350
    <li>
351
        Questions on size-related interface are avoided, since the containers have no size-related methods. Thus
352
    a container has no method for querying its actual size. It merely continuously forwards enough information to
353
    its resize policy to answer such queries; the designer of the resize policy can decide whether, or how, to design     the appropriate method. Also, a container has no methods for setting its size. It merely queries its
354
resize policy for an initial size, queries it on a new size (if the resize policy indicates a resize is needed), and
355
supports a <tt><b>protected virtual</b></tt> function for external resize.
356
    </li>
357
</ol>
358
 
359
<p>
360
    The library contains a single class for instantiating a resize policy,
361
<tt>pb_assoc</tt> contains
362
a standard resize policy,
363
<a href = "hash_standard_resize_policy.html"><tt>hash_standard_resize_policy</tt></a> (the name is explained shortly).
364
In terms of interface, it is parameterized by a boolean constant indicating whether its public interface supports
365
queries of actual size and external resize operations (the inclusion and exclusion of these methods in the interface have obvious tradeoffs in terms of encapsulation and flexibility).
366
([<a href = "references.html#alexandrescu01modern">alexandrescu01modern</a>] shows many techniques for
367
changing between alternative interfaces at compile time.)
368
</p>
369
 
370
<p>
371
As noted before,
372
    size and trigger policies are usually orthogonal.
373
<a href = "hash_standard_resize_policy.html"><tt>hash_standard_resize_policy</tt></a>
374
is parameterized by size and trigger policies. For example,
375
a collision-chaining hash table
376
is typically be defined as follows:
377
</p>
378
<pre>
379
cc_ht_map&lt;
380
  key,
381
  data,
382
  ...
383
  hash_standard_resize_policy&lt;
384
    some_trigger_policy,
385
    some_size_policy,
386
    ...&gt; &gt;
387
</pre>
388
 
389
<p>
390
 The sole function of
391
<a href = "hash_standard_resize_policy.html"><tt>hash_standard_resize_policy</tt></a>
392
 is to
393
act as a standard delegator
394
[<a href = "references.html#gamma95designpatterns">gamma95designpatterns</a>] for these
395
policies.
396
 
397
<p>
398
    Figures
399
<a href = "#insert_resize_sequence_diagram2">Standard resize policy trigger sequence diagram</a>
400
    and
401
<a href = "#insert_resize_sequence_diagram3">Standard resize policy size sequence diagram</a>
402
    show sequence diagrams illustrating the interaction between
403
    the standard resize policy and its trigger and size policies, respectively.
404
</p>
405
 
406
<h6 align = "center">
407
<a name = "insert_resize_sequence_diagram2">
408
<img src = "insert_resize_sequence_diagram2.jpg" width = "50%" alt = "no image">
409
</a>
410
</h6>
411
<h6 align = "center">
412
Standard resize policy trigger sequence diagram.
413
</h6>
414
 
415
<h6 align = "center">
416
<a name = "insert_resize_sequence_diagram3">
417
<img src = "insert_resize_sequence_diagram3.jpg" width = "50%" alt = "no image">
418
</a>
419
</h6>
420
<h6 align = "center">
421
Standard resize policy size sequence diagram.
422
</h6>
423
 
424
<p>
425
    The library (currently) supports the following instantiations of size
426
and trigger policies:
427
</p>
428
 
429
<ol>
430
    <li>
431
        <a href = "hash_load_check_resize_trigger.html"><tt>hash_load_check_resize_trigger</tt></a> implements
432
    a load check trigger policy.
433
    </li>
434
    <li>
435
        <a href = "cc_hash_max_collision_check_resize_trigger.html"><tt>cc_hash_max_collision_check_resize_trigger</tt></a>
436
    implements a collision check trigger policy.
437
    </li>
438
    <li>
439
<a href = "hash_exponential_size_policy.html"><tt>hash_exponential_size_policy</tt></a> implemens
440
an exponential-size policy (which should be used with mask range hashing).
441
    </li>
442
    <li>
443
<a href = "hash_prime_size_policy.html"><tt>hash_prime_size_policy</tt></a> implementing
444
a size policy based on a sequence of primes
445
[<a href = "references.html#sgi_stl">sgi_stl</a>] (which should be used with mod range hashing
446
    </li>
447
</ol>
448
 
449
<p>
450
    The trigger policies also support interfaces for changing their specifics which depend on compile time constants.
451
</p>
452
 
453
 
454
<p>
455
    Figure
456
<a href = "#resize_policy_cd">Resize policy class diagram</a> gives an overall picture
457
of the resize-related classes.
458
<tt>Container</tt> (which stands for any of the hash-based containers) is parameterized
459
by <tt>Resize_Policy</tt>, from which it subclasses publicly
460
[<a href = "references.html#alexandrescu01modern">alexandrescu01modern</a>].
461
This class is currently instantiated only by
462
<a href = "hash_standard_resize_policy.html"><tt>hash_standard_resize_policy</tt></a>.
463
<a href = "hash_standard_resize_policy.html"><tt>hash_standard_resize_policy</tt></a> itself
464
is parameterized by <tt>Trigger_Policy</tt> and <tt>Size_Policy</tt>.
465
Currently, <tt>Trigger_Policy</tt> is instantiated by
466
<a href = "hash_load_check_resize_trigger.html"><tt>hash_load_check_resize_trigger</tt></a>,
467
or
468
<a href = "cc_hash_max_collision_check_resize_trigger.html"><tt>cc_hash_max_collision_check_resize_trigger</tt></a>; <tt>Size_Policy</tt> is instantiated by
469
<a href = "hash_exponential_size_policy.html"><tt>hash_exponential_size_policy</tt></a>,
470
or
471
<a href = "hash_prime_size_policy.html"><tt>hash_prime_size_policy</tt></a>.
472
</p>
473
 
474
 
475
<h6 align = "center">
476
<a name = "resize_policy_cd">
477
<img src = "resize_policy_cd.jpg" width = "40%" alt = "no image">
478
</a>
479
</h6>
480
<h6 align = "center">
481
Resize policy class diagram.
482
</h6>
483
 
484
 
485
</body>
486
 
487
</html>

powered by: WebSVN 2.1.0

© copyright 1999-2024 OpenCores.org, equivalent to Oliscience, all rights reserved. OpenCores®, registered trademark.