1 |
20 |
jlechner |
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
|
2 |
|
|
<html>
|
3 |
|
|
<head>
|
4 |
|
|
<title>Data-Structure Genericity</title>
|
5 |
|
|
<meta name="GENERATOR" content="Microsoft Visual Studio .NET 7.1">
|
6 |
|
|
<meta name="vs_targetSchema" content="http://schemas.microsoft.com/intellisense/ie5">
|
7 |
|
|
</head>
|
8 |
|
|
<body bgcolor = "white">
|
9 |
|
|
<h1>Data-Structure Genericity</h1>
|
10 |
|
|
|
11 |
|
|
<p>
|
12 |
|
|
This section describes genericity over different underlying data-structures. It is organized as follows.
|
13 |
|
|
</p>
|
14 |
|
|
<ol>
|
15 |
|
|
<li><a href = "#problem">The Basic Problem</a></li>
|
16 |
|
|
<li><a href = "#ds_hierarchy">Container Hierarchy</a></li>
|
17 |
|
|
<li><a href = "#ds_traits">Data-Structure Tags and Traits</a></li>
|
18 |
|
|
<li><a href = "#find_range">Find-Type and Range-Type Methods and Iterators</a></li>
|
19 |
|
|
</ol>
|
20 |
|
|
|
21 |
|
|
<h2><a name = "problem">The Basic Problem</a></h2>
|
22 |
|
|
|
23 |
|
|
<p>
|
24 |
|
|
The design attempts to address the following problem. When writing a function manipulating a generic container object, what is the behaviour of the object? <i>E.g.</i>, suppose one writes
|
25 |
|
|
</p>
|
26 |
|
|
<pre>
|
27 |
|
|
<b>template</b><
|
28 |
|
|
<b>class</b> Cntnr>
|
29 |
|
|
<b>void</b> some_op_sequence
|
30 |
|
|
(Cntnr &r_cntnr)
|
31 |
|
|
{
|
32 |
|
|
...
|
33 |
|
|
}
|
34 |
|
|
</pre>
|
35 |
|
|
then one needs to address the following questions in the body
|
36 |
|
|
of <tt>some_op_sequence</tt>:
|
37 |
|
|
<ol>
|
38 |
|
|
<li> Which types and methods does <tt>Cntnr</tt> support? Containers based on hash tables can be queries for the hash-functor type and object; this is meaningless for tree-based containers. Containers based on trees can be split, joined, or can erase iterators and return the following iterator; this cannot be done by hash-based containers. </li>
|
39 |
|
|
<li>
|
40 |
|
|
What are the guarantees of <tt>Cntnr</tt>? A container based on a probing hash-table invalidates all iterators when it is modified; this is not the case for containers based on node-based trees. Containers based on a node-based tree can be split or joined without exceptions; this is not the case for containers based on vector-based trees.
|
41 |
|
|
</li>
|
42 |
|
|
<li> How does the container maintain its elements? containers based on splay trees or lists with update policies "cache" "frequently accessed" elements; containers based on most other underlying data-structures do not.</li>
|
43 |
|
|
</ol>
|
44 |
|
|
|
45 |
|
|
<h2><a name = "ds_hierarchy">Container Hierarchy</a></h2>
|
46 |
|
|
|
47 |
|
|
<p>
|
48 |
|
|
Figure
|
49 |
|
|
<a href = "#cd">Class hierarchy</a>
|
50 |
|
|
shows the container hierarchy.
|
51 |
|
|
</p>
|
52 |
|
|
<ol>
|
53 |
|
|
<li>
|
54 |
|
|
<a href = "basic_assoc_cntnr.html"><tt>basic_assoc_cntnr</tt></a>
|
55 |
|
|
contains types and methods shared by all associative containers, <i>e.g.</i>, the type <tt>allocator</tt> and the method <tt>find</tt>.
|
56 |
|
|
</li>
|
57 |
|
|
<li><a href = "basic_assoc_cntnr.html"><tt>basic_hash_assoc_cntnr</tt></a> subclasses
|
58 |
|
|
<a href = "basic_assoc_cntnr.html"><tt>basic_assoc_cntnr</tt></a>, and contains
|
59 |
|
|
types and methods shared by all hash-based containers, <i>e.g.</i>, the type <tt>hash_fn</tt>.
|
60 |
|
|
</li>
|
61 |
|
|
<ol>
|
62 |
|
|
<li>
|
63 |
|
|
<a href = "cc_hash_assoc_cntnr.html"><tt>cc_hash_assoc_cntnr</tt></a>
|
64 |
|
|
and
|
65 |
|
|
<a href = "gp_hash_assoc_cntnr.html"><tt>gp_hash_assoc_cntnr</tt></a>
|
66 |
|
|
each subclass
|
67 |
|
|
<a href = "basic_hash_assoc_cntnr.html"><tt>basic_hash_assoc_cntnr</tt></a>, and encapsulate collision-chaining and (general) probing hash tables, respectively. These two types of hash tables have somewhat different policies and methods (<i>i.e.</i>, constructors and policy-access methods).
|
68 |
|
|
</li>
|
69 |
|
|
</ol>
|
70 |
|
|
<li>
|
71 |
|
|
<a href = "tree_assoc_cntnr.html"><tt>tree_assoc_cntnr</tt></a>
|
72 |
|
|
subclasses one of
|
73 |
|
|
<a href = "basic_tree_assoc_cntnr.html"><tt>basic_tree_assoc_cntnr</tt></a> which
|
74 |
|
|
subclasses
|
75 |
|
|
<a href = "basic_assoc_cntnr.html"><tt>basic_assoc_cntnr</tt></a>.
|
76 |
|
|
<a href = "tree_assoc_cntnr.html"><tt>tree_assoc_cntnr</tt></a>
|
77 |
|
|
encapsulates a tree-based container, and is parameterized by which underlying data-structure to use (<i>e.g.</i>, a red-black tree);
|
78 |
|
|
<a href = "basic_assoc_cntnr.html"><tt>basic_assoc_cntnr</tt></a>.
|
79 |
|
|
is specialized to the capabilities of the underlying structure.
|
80 |
|
|
<a href = "tree_assoc_cntnr.html"><tt>tree_assoc_cntnr</tt></a> contains some additional methods over
|
81 |
|
|
<a href = "basic_assoc_cntnr.html"><tt>basic_assoc_cntnr</tt></a>,
|
82 |
|
|
<i>e.g.</i>, split and join methods.
|
83 |
|
|
</li>
|
84 |
|
|
<li>
|
85 |
|
|
<a href = "lu_assoc_cntnr.html"><tt>lu_assoc_cntnr</tt></a>
|
86 |
|
|
subclasses
|
87 |
|
|
<a href = "basic_assoc_cntnr.html"><tt>basic_assoc_cntnr</tt></a>,
|
88 |
|
|
and encapsulates a list with update policies.
|
89 |
|
|
</li>
|
90 |
|
|
</ol>
|
91 |
|
|
|
92 |
|
|
<p>
|
93 |
|
|
The hierarchy is composed naturally, such that each container inherits
|
94 |
|
|
all types and methods from its base. <a href = "#ds_traits">Data-Structure Tags and Traits</a> discusses how to query which types and methods each container supports.
|
95 |
|
|
</p>
|
96 |
|
|
|
97 |
|
|
|
98 |
|
|
|
99 |
|
|
<h2><a name = "ds_traits">Data-Structure Tags and Traits</a></h2>
|
100 |
|
|
|
101 |
|
|
<p>
|
102 |
|
|
<tt>pb_assoc</tt> contains a tag and traits mechanism similar to that of the STL's iterators.
|
103 |
|
|
</p>
|
104 |
|
|
|
105 |
|
|
<p>
|
106 |
|
|
<tt>pb_assoc</tt> contains a tag hierarchy corresponding to the hierarchy
|
107 |
|
|
in Figure
|
108 |
|
|
<a href = "#cd">Class hierarchy</a>.
|
109 |
|
|
The tag hierarchy is shown in Figure
|
110 |
|
|
<a href = "#ds_tag_cd">Data-structure tag class hierarchy</a>.
|
111 |
|
|
</p>
|
112 |
|
|
|
113 |
|
|
<h6 align = "center">
|
114 |
|
|
<a name = "cd">
|
115 |
|
|
<img src = "ds_tag_cd.jpg" width = "70%" alt = "no image">
|
116 |
|
|
</h6>
|
117 |
|
|
</a>
|
118 |
|
|
<h6 align = "center">
|
119 |
|
|
Data-structure tag class hierarchy.
|
120 |
|
|
</h6>
|
121 |
|
|
|
122 |
|
|
<p>
|
123 |
|
|
<a href = "basic_assoc_cntnr.html"><tt>basic_assoc_cntnr</tt></a> publicly defines
|
124 |
|
|
<tt>ds_category</tt> as one of the classes in Figure
|
125 |
|
|
.
|
126 |
|
|
Given any container <tt>Cntnr</tt>, the tag of the underlying data-structure can be found via <tt><b>typename</b> Cntnr::ds_category</tt>.
|
127 |
|
|
</p>
|
128 |
|
|
|
129 |
|
|
<p>
|
130 |
|
|
Additionally, a traits mechanism can be used to query a container type for its attributes. Given any container <tt>Cntnr</tt>, then
|
131 |
|
|
<tt><a href = "ds_traits.html">ds_traits</a><Cntnr></tt>
|
132 |
|
|
is a traits class identifying the properties of the container.
|
133 |
|
|
</p>
|
134 |
|
|
|
135 |
|
|
<p>
|
136 |
|
|
To find if a container can throw when a key is erased (which is true for vector-based trees, for example), one can use
|
137 |
|
|
</p>
|
138 |
|
|
<a href = "ds_traits.html"><tt>ds_traits</tt></a><tt><Cntnr>::erase_can_throw</tt>,
|
139 |
|
|
for example.
|
140 |
|
|
|
141 |
|
|
<p>
|
142 |
|
|
Some of the definitions in
|
143 |
|
|
<a href = "ds_traits.html"><tt>ds_traits</tt></a>
|
144 |
|
|
are dependent on other definitions. <i>E.g.</i>, if
|
145 |
|
|
<a href = "ds_traits.html"><tt>ds_traits</tt></a><tt><Cntnr>::split_join</tt>
|
146 |
|
|
is <tt><b>true</b></tt> (which is the case for containers based on trees),
|
147 |
|
|
then
|
148 |
|
|
<a href = "ds_traits.html"><tt>ds_traits</tt></a><tt><Cntnr>::split_join_can_throw</tt>
|
149 |
|
|
indicates whether splits or joins can throw exceptions (which is true for vector-based trees); otherwise
|
150 |
|
|
<a href = "ds_traits.html"><tt>ds_traits</tt></a><tt><Cntnr>::split_join_can_throw</tt>
|
151 |
|
|
will yield a compilation error. (This is somewhat similar to a compile-time
|
152 |
|
|
version of the COM model
|
153 |
|
|
[<a href = "references.html#mscom">mscom</a>]).
|
154 |
|
|
|
155 |
|
|
|
156 |
|
|
<h2><a name = "find_range">Find-Type and Range-Type Methods and Iterators</a></h2>
|
157 |
|
|
|
158 |
|
|
<p>
|
159 |
|
|
<tt>pb_assoc</tt> differentiates between two types of methods: find-type methods, and range-type methods. For example, <tt>find</tt> is a find-type method, since a container object searches for an element with a given key; <tt>insert</tt> is a find-type method, since, by STL convention, a container object returns an iterator corresponding to an element with a given key; <tt>begin</tt> and <tt>end</tt> are range-type methods, since they are not used to find a specific element, but rather to go over all elements in a container object.
|
160 |
|
|
</p>
|
161 |
|
|
|
162 |
|
|
<p>
|
163 |
|
|
Correspondingly, containers in <tt>pb_assoc</tt> define two families of iterators. <tt>const_find_iterator</tt> and <tt>find_iterator</tt> are the iterator types returned by find-type methods; <tt>const_iterator</tt> and <tt>iterator</tt> are the iterator types returned by range-type methods.
|
164 |
|
|
</p>
|
165 |
|
|
|
166 |
|
|
<p>
|
167 |
|
|
The relationship between these iterator types varies between container types. In a tree-based container, for example, <tt>const_find_iterator</tt> and <tt>const_iterator</tt> are synonymous, and <tt>find_iterator</tt> and <tt>iterator</tt> are synonymous; in a hash-based container, for example, this is not the case. Futhermore, find-type iterators in a hash-based container lack movement operators, such as
|
168 |
|
|
<tt><b>operator++</b></tt>.
|
169 |
|
|
All containers, however, maintain the invariants shown in Figure
|
170 |
|
|
|
171 |
|
|
.
|
172 |
|
|
</p>
|
173 |
|
|
|
174 |
|
|
|
175 |
|
|
<p>
|
176 |
|
|
This distinction between find-type and range-type iterators and methods, while complicating the interface, has several advantages:
|
177 |
|
|
</p>
|
178 |
|
|
|
179 |
|
|
<h3>Iterators in unordered container types</h3>
|
180 |
|
|
|
181 |
|
|
<p>
|
182 |
|
|
Given an unordered container type, <i>e.g.</i>, a hash-based container, it makes no sense to move an iterator returned by a find-type method.
|
183 |
|
|
Let <tt>cntnr</tt> be an associative-container object, and
|
184 |
|
|
consider:
|
185 |
|
|
</p>
|
186 |
|
|
|
187 |
|
|
<pre>
|
188 |
|
|
std::for_each(m.find(1), m.find(5), foo);
|
189 |
|
|
</pre>
|
190 |
|
|
|
191 |
|
|
<p>
|
192 |
|
|
which applies <tt>foo</tt> to all elements in <tt>m</tt>
|
193 |
|
|
between <tt>1</tt> and <tt>5</tt>.
|
194 |
|
|
</p>
|
195 |
|
|
|
196 |
|
|
<p>If <tt>cntnr</tt> is a
|
197 |
|
|
tree-based container object, then an in-order walk will apply <tt>foo</tt>
|
198 |
|
|
to the relevant elements, <i>e.g.</i>, as in Figure
|
199 |
|
|
<a href = "#range_it_in_hts">Range iteration in different data-structures</a>
|
200 |
|
|
-A. If <tt>m</tt> is a
|
201 |
|
|
hash-based container, then the order of elements between any two
|
202 |
|
|
elements is undefined (and probably time-varying); there is no
|
203 |
|
|
guarantee that the elements traversed will coincide with the
|
204 |
|
|
<i>logical</i> elements between 1 and 5, <i>e.g.</i>, as in
|
205 |
|
|
Figure <a href = "#range_it_in_hts">Range iteration in different data-structures</a>-B.
|
206 |
|
|
</p>
|
207 |
|
|
|
208 |
|
|
<p>
|
209 |
|
|
The application of a
|
210 |
|
|
range function <i>e.g.</i>, <tt>for_each</tt>, to a
|
211 |
|
|
pair of hash-based container's iterators is possibly sensical only
|
212 |
|
|
if the iterators are those returned by <tt>begin</tt> and <tt>end</tt>,
|
213 |
|
|
respectively. Therefore, the iterator returned by
|
214 |
|
|
<tt>m</tt>'s <tt>find</tt> method should be immovable.
|
215 |
|
|
</p>
|
216 |
|
|
|
217 |
|
|
<p>
|
218 |
|
|
Another point also indicates that hash-based containers'
|
219 |
|
|
find-type iterators and range-type iterators should be distinct.
|
220 |
|
|
Consider Figure
|
221 |
|
|
<a href = "#find_its_in_hash_tables">
|
222 |
|
|
Find-type iterators in hash tables</a>-A.
|
223 |
|
|
An
|
224 |
|
|
(immovable) find-type iterator, designed only to access an
|
225 |
|
|
element, requires at most a single pointer to the element's link.
|
226 |
|
|
Conversely, an iterator designed for range operations
|
227 |
|
|
requires some more information <i>e.g.</i>, the bucket number),
|
228 |
|
|
since a cross-list traversal might be necessary. Alternatively,
|
229 |
|
|
the lists might be linked, forming a monolithic total-element
|
230 |
|
|
list, as in Figure
|
231 |
|
|
<a href = "#find_its_in_hash_tables">
|
232 |
|
|
Find-type iterators in hash tables</a>-B (this seems
|
233 |
|
|
similar to the Dinkumware design
|
234 |
|
|
[<a href = "references.html#dinkumware_stl">dinkumware_stl</a>]). This,
|
235 |
|
|
however, complicates the hash-table's operations.
|
236 |
|
|
|
237 |
|
|
<h6 align = "center">
|
238 |
|
|
<a name = "range_it_in_hts">
|
239 |
|
|
<img src = "find_iterators_range_ops_1.jpg" width = "70%" alt = "no image">
|
240 |
|
|
</a>
|
241 |
|
|
</h6>
|
242 |
|
|
<h6 align = "center">
|
243 |
|
|
Range iteration in different data-structures.
|
244 |
|
|
</h6>
|
245 |
|
|
|
246 |
|
|
|
247 |
|
|
<h6 align = "center">
|
248 |
|
|
<a name = "find_its_in_hash_tables">
|
249 |
|
|
<img src = "find_iterators_range_ops_2.jpg" width = "70%" alt = "no image">
|
250 |
|
|
</a>
|
251 |
|
|
</h6>
|
252 |
|
|
<h6 align = "center">
|
253 |
|
|
Find-type iterators in hash tables.
|
254 |
|
|
</h6>
|
255 |
|
|
|
256 |
|
|
<p>
|
257 |
|
|
As a consequence of this design,
|
258 |
|
|
</p>
|
259 |
|
|
|
260 |
|
|
<pre>
|
261 |
|
|
std::for_each(m.find(1), m.find(5), foo);
|
262 |
|
|
</pre>
|
263 |
|
|
|
264 |
|
|
<p>
|
265 |
|
|
will compile for tree-based containers, but will not compile
|
266 |
|
|
for hash-tables or other types. The returned type of <tt>find</tt>
|
267 |
|
|
is a find-type iterator. For tree-based containers, this is synonymous
|
268 |
|
|
with a range-type iterator, and therefore supports <tt><b>operator</b>++</tt>;
|
269 |
|
|
for other types of containers, a find-type iterator lacks <tt><b>operator</b>++</tt>.
|
270 |
|
|
</p>
|
271 |
|
|
|
272 |
|
|
<h3>Invalidation Guarantees</h3>
|
273 |
|
|
|
274 |
|
|
<p>
|
275 |
|
|
Consider the following snippet:
|
276 |
|
|
</p>
|
277 |
|
|
|
278 |
|
|
<pre>
|
279 |
|
|
it = c.find(3);
|
280 |
|
|
|
281 |
|
|
c.erase(5);
|
282 |
|
|
</pre>
|
283 |
|
|
|
284 |
|
|
<p>
|
285 |
|
|
Following the call to <tt>erase</tt>, what is the validity
|
286 |
|
|
of <tt>it</tt>: can it be dereferenced? can it be incremented?
|
287 |
|
|
</p>
|
288 |
|
|
|
289 |
|
|
<p>
|
290 |
|
|
The answer depends on the underlying data-structure of the container.
|
291 |
|
|
Figure
|
292 |
|
|
<a href = "#invalidation_guarantee_erase">Effect of erase in different underlying data-structures</a>
|
293 |
|
|
shows three cases: A1 and A2 show a red-black tree;
|
294 |
|
|
B1 and B2 show an ordered-vector tree; C1 and C2
|
295 |
|
|
show a collision-chaining hash table.
|
296 |
|
|
</p>
|
297 |
|
|
|
298 |
|
|
<h6 align = "center">
|
299 |
|
|
<a name = "invalidation_guarantee_erase">
|
300 |
|
|
<img src = "invalidation_guarantee_erase.jpg" width = "70%" alt = "no image">
|
301 |
|
|
</h6>
|
302 |
|
|
</a>
|
303 |
|
|
<h6 align = "center">
|
304 |
|
|
Effect of erase in different underlying data-structures.
|
305 |
|
|
</h6>
|
306 |
|
|
|
307 |
|
|
|
308 |
|
|
<ol>
|
309 |
|
|
<li>
|
310 |
|
|
`Erasing 5 from A1 yields A2. Clearly, an iterator to 3
|
311 |
|
|
can be dereferenced and incremented.
|
312 |
|
|
</li>
|
313 |
|
|
<li>
|
314 |
|
|
Erasing 5 from B1 yields B2. Clearly, an iterator to 3 is
|
315 |
|
|
not valid at all.
|
316 |
|
|
</li>
|
317 |
|
|
<li>
|
318 |
|
|
Erasing 5 from C1 yields C2. Here the situation is more complicated.
|
319 |
|
|
On the one hand, incrementing <tt>it</tt> can be undefined. On the other
|
320 |
|
|
hand, there is no problem in dereferencing <tt>it</tt>. In
|
321 |
|
|
classic STL, it is not possible to express whether <tt>it</tt>
|
322 |
|
|
is valid or not.
|
323 |
|
|
</li>
|
324 |
|
|
</ol>
|
325 |
|
|
|
326 |
|
|
<p>
|
327 |
|
|
Thus again, the iterator concept seems overloaded. Distinguishing
|
328 |
|
|
between find and range types allows fine-grained invalidation guarantees.
|
329 |
|
|
<a href = #invalidation_guarantee_cd">Invalidation guarantees class hierarchy</a>
|
330 |
|
|
shows tags corresponding to different types of invalidation guarantees.
|
331 |
|
|
</p>
|
332 |
|
|
|
333 |
|
|
<h6 align = "center">
|
334 |
|
|
<a name = "invalidation_guarantee_cd">
|
335 |
|
|
<img src = "invalidation_guarantee_cd.jpg" width = "70%" alt = "no image">
|
336 |
|
|
</h6>
|
337 |
|
|
</a>
|
338 |
|
|
<h6 align = "center">
|
339 |
|
|
Invalidation guarantees class hierarchy.
|
340 |
|
|
</h6>
|
341 |
|
|
|
342 |
|
|
<ol>
|
343 |
|
|
<li> <a href = "basic_invalidation_guarantee.html"><tt>basic_invalidation_guarantee</tt></a> corresponds to a basic guarantee that a find-type iterator, a found pointer, or a found reference, remains valid as long as the container object is not modified.
|
344 |
|
|
</li>
|
345 |
|
|
<li> <a href = "find_invalidation_guarantee.html"><tt>find_invalidation_guarantee</tt></a> corresponds to a guarantee that a find-type iterator, a found pointer, or a found reference, remains valid even if the containter object is modified.
|
346 |
|
|
</li>
|
347 |
|
|
<li> <a href = "range_invalidation_guarantee.html"><tt>range_invalidation_guarantee</tt></a> corresponds to a guarantee that a range-type iterator remains valid even if the containter object is modified.
|
348 |
|
|
</li>
|
349 |
|
|
</ol>
|
350 |
|
|
|
351 |
|
|
|
352 |
|
|
<p>
|
353 |
|
|
To find the invalidation guarantee of a container, one can use
|
354 |
|
|
</p>
|
355 |
|
|
<pre>
|
356 |
|
|
<b>typename</b> <a href = "ds_traits.html">ds_traits</a><Cntnr>::invalidation_guarantee
|
357 |
|
|
</pre>
|
358 |
|
|
|
359 |
|
|
<p>
|
360 |
|
|
which is one of the classes in Figure
|
361 |
|
|
<a href = #invalidation_guarantee_cd">Invalidation guarantees class hierarchy</a>.
|
362 |
|
|
</p>
|
363 |
|
|
|
364 |
|
|
|
365 |
|
|
|
366 |
|
|
</body>
|
367 |
|
|
|
368 |
|
|
</html>
|