URL https://opencores.org/ocsvn/or1k_soc_on_altera_embedded_dev_kit/or1k_soc_on_altera_embedded_dev_kit/trunk

Subversion Repositories or1k_soc_on_altera_embedded_dev_kit

[/] [or1k_soc_on_altera_embedded_dev_kit/] [trunk/] [linux-2.6/] [linux-2.6.24/] [Documentation/] [networking/] [fib_trie.txt] - Blame information for rev 17

                        LC-trie implementation notes.
 
Node types
----------
leaf
        An end node with data. This has a copy of the relevant key, along
        with an 'hlist' of routing table entries sorted by prefix length.
        See struct leaf and struct leaf_info.
 
trie node or tnode
        An internal node, holding an array of child (leaf or tnode) pointers,
        indexed through a subset of the key. See Level Compression.
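
As a rough illustration of the two node types, here is a minimal sketch. It
is not the kernel's definition of struct leaf or struct tnode: the names
sketch_node, sketch_leaf, sketch_tnode and sketch_tnode_new() are made up for
this example, and the hlist of routing table entries is reduced to a plain
counter.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef uint32_t t_key;	/* a 32-bit IPv4 destination address */

enum node_kind { NODE_LEAF, NODE_TNODE };

/* Common header, so that a child pointer can refer to either node type. */
struct sketch_node {
	enum node_kind kind;
	t_key key;
};

/* Leaf: an end node with data; it keeps a full copy of the key. */
struct sketch_leaf {
	struct sketch_node hdr;
	unsigned nr_prefixes;	/* stand-in for the hlist of routing entries */
};

/* Tnode: an internal node with 1 << bits child slots. */
struct sketch_tnode {
	struct sketch_node hdr;
	unsigned pos;	/* where the index segment starts in the key */
	unsigned bits;	/* width of the index segment */
	struct sketch_node *child[];	/* 1 << bits entries, NULL if empty */
};

/* Allocate a tnode whose child slots are all empty (NULL). */
static struct sketch_tnode *sketch_tnode_new(t_key key, unsigned pos,
					     unsigned bits)
{
	struct sketch_tnode *tn;
	size_t slots;

	assert(bits >= 1 && pos + bits <= 32);
	slots = (size_t)1 << bits;
	tn = calloc(1, sizeof(*tn) + slots * sizeof(tn->child[0]));
	if (!tn)
		return NULL;
	tn->hdr.kind = NODE_TNODE;
	tn->hdr.key = key;
	tn->pos = pos;
	tn->bits = bits;
	return tn;
}
```

The shared header is one possible way to let a single child array hold both
leaves and tnodes; it is a design choice of this sketch only.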
 
A few concepts explained
------------------------
Bits (tnode)
        The number of bits in the key segment used for indexing into the
        child array - the "child index". See Level Compression.
 
Pos (tnode)
        The position (in the key) of the key segment used for indexing into
        the child array. See Path Compression.
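
To make "pos" and "bits" concrete, the following sketch extracts a child
index from a key. It assumes a 32-bit key and that positions are counted from
the most significant bit; the helper name child_index() is hypothetical and
chosen for this example.

```c
#include <assert.h>
#include <stdint.h>

#define KEYLENGTH 32

typedef uint32_t t_key;

/*
 * Return the `bits`-wide segment of `key` that starts `pos` bits from the
 * most significant end. This is the "child index" into a tnode's array.
 */
static unsigned child_index(t_key key, unsigned pos, unsigned bits)
{
	assert(bits >= 1 && pos + bits <= KEYLENGTH);
	/* Shift the segment to the top, then down to the bottom. */
	return (unsigned)((t_key)(key << pos) >> (KEYLENGTH - bits));
}
```

With the key 192.168.1.0 (0xC0A80100), pos 8 and bits 8 select the second
octet, so child_index() yields 168.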
 
Path Compression / skipped bits
        Any given tnode is linked to from the child array of its parent, using
        a segment of the key specified by the parent's "pos" and "bits".
        In certain cases, this tnode's own "pos" will not be immediately
        adjacent to the parent's (pos+bits); instead, some bits in the key
        are skipped over because they represent a single path with no
        deviations. These "skipped bits" constitute Path Compression.
        Note that the search algorithm will simply skip over these bits when
        searching, making it necessary to save the keys in the leaves to
        verify that they actually do match the key we are searching for.
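
The arithmetic behind skipped bits can be sketched as follows. Both helpers,
skipped_bits() and prefix_equal(), are hypothetical names for this example,
and the sketch assumes 32-bit keys with positions counted from the most
significant bit.

```c
#include <assert.h>
#include <stdint.h>

#define KEYLENGTH 32

typedef uint32_t t_key;

/* Number of key bits skipped between a parent tnode and its child tnode. */
static unsigned skipped_bits(unsigned parent_pos, unsigned parent_bits,
			     unsigned child_pos)
{
	assert(child_pos >= parent_pos + parent_bits);
	return child_pos - (parent_pos + parent_bits);
}

/* Nonzero if a and b agree on their `len` most significant bits. */
static int prefix_equal(t_key a, t_key b, unsigned len)
{
	assert(len <= KEYLENGTH);
	if (len == 0)
		return 1;
	return ((a ^ b) >> (KEYLENGTH - len)) == 0;
}
```

A parent with pos 0 and bits 4 whose child tnode has pos 10 skips 6 bits.
Since those bits are never used as an index, a comparison such as
prefix_equal() against the key stored in the leaf is what confirms the match.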
 
Level Compression / child arrays
        The trie is kept level balanced by moving, under certain conditions,
        the children of a full child (see "full_children") up one level, so
        that instead of a pure binary tree, each internal node ("tnode") may
        contain an arbitrarily large array of links to several children.
        Conversely, a tnode with a mostly empty child array (see
        "empty_children") may be "halved", having some of its children moved
        downwards one level, in order to avoid ever-increasing child arrays.
 
empty_children
        The number of positions in the child array of a given tnode that are
        NULL.
 
full_children
        The number of children of a given tnode that aren't path compressed.
        (In other words, they aren't NULL or leaves and their "pos" is equal
        to this tnode's "pos"+"bits".)
 
        (The word "full" here is used more in the sense of "complete" than
        as the opposite of "empty", which might be a tad confusing.)
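
The two counters can be derived from a child array as in the sketch below.
The struct node layout and the function count_children() are assumptions for
this example; an implementation may well maintain the counters incrementally
instead of recounting.

```c
#include <assert.h>
#include <stddef.h>

struct node {
	int is_leaf;
	unsigned pos, bits;	/* meaningful for tnodes only */
	struct node **child;	/* 1 << bits slots, tnodes only */
};

struct child_stats {
	unsigned empty;	/* slots that are NULL */
	unsigned full;	/* child tnodes that are not path compressed */
};

/* Count the empty and full children of the tnode `tn`. */
static struct child_stats count_children(const struct node *tn)
{
	struct child_stats s = { 0, 0 };
	unsigned i, n;

	assert(tn && !tn->is_leaf);
	n = 1u << tn->bits;
	for (i = 0; i < n; i++) {
		const struct node *c = tn->child[i];

		if (!c)
			s.empty++;
		else if (!c->is_leaf && c->pos == tn->pos + tn->bits)
			s.full++;
	}
	return s;
}
```

Note that a leaf or a path-compressed child tnode counts as neither empty
nor full.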
 
Comments
---------
 
We have tried to keep the structure of the code as close to fib_hash as
possible to allow verification and ease reviewing.
 
fib_find_node()
        A good start for understanding this code. This function implements a
        straightforward trie lookup.
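
A lookup of this kind might be sketched as below. This is not the kernel's
fib_find_node(): the struct node layout and the names child_index() and
find_leaf() are assumptions, and the sketch only compares the full key at
the leaf, as described under Path Compression.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define KEYLENGTH 32

typedef uint32_t t_key;

struct node {
	int is_leaf;
	t_key key;
	unsigned pos, bits;	/* tnodes only */
	struct node **child;	/* 1 << bits slots, tnodes only */
};

/* The `bits`-wide key segment starting `pos` bits from the top. */
static unsigned child_index(t_key key, unsigned pos, unsigned bits)
{
	assert(bits >= 1 && pos + bits <= KEYLENGTH);
	return (unsigned)((t_key)(key << pos) >> (KEYLENGTH - bits));
}

/* Descend from `n` and return the leaf holding exactly `key`, or NULL. */
static const struct node *find_leaf(const struct node *n, t_key key)
{
	while (n && !n->is_leaf)
		n = n->child[child_index(key, n->pos, n->bits)];

	/* Skipped bits were never compared, so check the whole key here. */
	if (n && n->key == key)
		return n;
	return NULL;
}
```

A key that differs from a stored key only in skipped bits reaches the same
leaf during the descent and is rejected by the final comparison.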
 
fib_insert_node()
        Inserts a new leaf node in the trie. This is a bit more complicated than
        fib_find_node(). Inserting a new node means we might have to run the
        level compression algorithm on part of the trie.
 
trie_leaf_remove()
        Looks up a key, deletes it and runs the level compression algorithm.
 
trie_rebalance()
        The key function for the dynamic trie. After any change in the trie,
        it is run to optimize and reorganize. It will walk the trie upwards
        towards the root from a given tnode, doing a resize() at each step
        to implement level compression.
 
resize()
        Analyzes a tnode and optimizes the child array size by either inflating
        or shrinking it repeatedly until it fulfills the criteria for optimal
        level compression. This part follows the original paper pretty closely
        and there may be some room for experimentation here.
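
One way to express fill-factor criteria of this kind is sketched below. The
inequalities, the function names should_inflate() and should_halve(), and the
threshold parameters are assumptions for illustration; these notes do not
state the thresholds, so check the source for what resize() really tests.
The inflate test estimates the share of non-empty slots in the doubled
array, counting each full child twice because its children move up a level.

```c
#include <assert.h>

/*
 * Would the array, once doubled, be at least `inflate_threshold` percent
 * non-empty? Non-empty slots after doubling: full + (child_len - empty),
 * out of 2 * child_len slots.
 */
static int should_inflate(unsigned child_len, unsigned empty, unsigned full,
			  unsigned inflate_threshold)
{
	assert(empty <= child_len && full <= child_len - empty);
	return full > 0 &&
	       50 * (full + child_len - empty) >= inflate_threshold * child_len;
}

/* Is the array less than `halve_threshold` percent non-empty? */
static int should_halve(unsigned bits, unsigned child_len, unsigned empty,
			unsigned halve_threshold)
{
	assert(empty <= child_len);
	return bits > 1 &&
	       100 * (child_len - empty) < halve_threshold * child_len;
}
```

With an assumed inflate threshold of 50, a tnode with four slots that are
all full children qualifies for inflation; with an assumed halve threshold
of 25, an eight-slot tnode with seven empty slots qualifies for halving.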
 
inflate()
        Doubles the size of the child array within a tnode. Used by resize().
 
halve()
        Halves the size of the child array within a tnode - the inverse of
        inflate(). Used by resize().
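
The effect of inflate() and halve() on child slots can be sketched as index
arithmetic. The helpers below are hypothetical and cover only children that
are relocated as a whole (leaves and path-compressed tnodes); as noted under
Level Compression, the children of a full child are moved up one level
instead. Keys are assumed to be 32 bits, indexed from the most significant
bit.

```c
#include <assert.h>
#include <stdint.h>

#define KEYLENGTH 32

typedef uint32_t t_key;

/* The `bits`-wide key segment starting `pos` bits from the top. */
static unsigned child_index(t_key key, unsigned pos, unsigned bits)
{
	assert(bits >= 1 && pos + bits <= KEYLENGTH);
	return (unsigned)((t_key)(key << pos) >> (KEYLENGTH - bits));
}

/*
 * After inflating, the index is one bit wider: the old index gains the
 * next key bit as its new least significant bit.
 */
static unsigned slot_after_inflate(unsigned old_index, unsigned next_key_bit)
{
	return (old_index << 1) | (next_key_bit & 1u);
}

/* After halving, old slots 2i and 2i+1 both fold into slot i. */
static unsigned slot_after_halve(unsigned old_index)
{
	return old_index >> 1;
}
```

For the key 0xC0A80100 at pos 0, the 3-bit index is 6; inflating gives the
4-bit index 12 and halving gives the 2-bit index 3, matching what
child_index() returns for those widths.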
 
fn_trie_insert(), fn_trie_delete(), fn_trie_select_default()
        The route manipulation functions. Should conform pretty closely to the
        corresponding functions in fib_hash.
 
fn_trie_flush()
        This walks the full trie (using nextleaf()) and searches for empty
        leaves which have to be removed.
 
fn_trie_dump()
        Dumps the routing table ordered by prefix length. This is somewhat
        slower than the corresponding fib_hash function, as we have to walk the
        entire trie for each prefix length. In comparison, fib_hash is organized
        as one "zone"/hash per prefix length.
 
Locking
-------
 
fib_lock is used as an RW-lock in the same way as in fib_hash.
However, the functions are somewhat separated for other possible locking
scenarios. It might conceivably be possible to run trie_rebalance() via RCU
to avoid read_lock in the fn_trie_lookup() function.
 
Main lookup mechanism
---------------------
fn_trie_lookup() is the main lookup function.
 
The lookup is in its simplest form just like fib_find_node(). We descend the
trie, key segment by key segment, until we find a leaf. check_leaf() does
the fib_semantic_match in the leaf's sorted prefix hlist.
 
If we find a match, we are done.
 
If we don't find a match, we enter prefix matching mode. The prefix length,
starting out at the same as the key length, is reduced one step at a time,
and we backtrack upwards through the trie trying to find the longest matching
prefix. The goal is always to reach a leaf and get a positive result from the
fib_semantic_match mechanism.
 
Inside each tnode, the search for longest matching prefix consists of searching
through the child array, chopping off (zeroing) the least significant "1" of
the child index until we find a match or the child index consists of nothing but
zeros.
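
The order in which child indices are tried can be sketched with the usual
"clear the lowest set bit" idiom. The function names chop_lowest_one() and
probe_sequence() are made up for this example, and the lookup code itself
may track the chopped-off bits differently.

```c
#include <assert.h>

/* Zero the least significant "1" of the child index. */
static unsigned chop_lowest_one(unsigned cindex)
{
	return cindex & (cindex - 1);
}

/*
 * Record the child indices tried inside one tnode, starting with `cindex`
 * itself and ending with index 0. Returns how many entries were written.
 */
static unsigned probe_sequence(unsigned cindex, unsigned *out, unsigned max)
{
	unsigned n = 0;

	assert(out && max > 0);
	while (n < max) {
		out[n++] = cindex;
		if (cindex == 0)
			break;
		cindex = chop_lowest_one(cindex);
	}
	return n;
}
```

For a child index of 1011 (binary), the sequence is 1011, 1010, 1000 and
finally 0000, each step corresponding to a shorter candidate prefix.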
 
At this point we backtrack (t->stats.backtrack++) up the trie, continuing to
chop off part of the key in order to find the longest matching prefix.
 
While backtracking we repeatedly descend subtries to look for a match, and there
are some optimizations available that can provide us with "shortcuts" to avoid
descending into dead ends. Look for "HL_OPTIMIZE" sections in the code.
 
To alleviate any doubts about the correctness of the route selection process,
a new netlink operation has been added. Look for NETLINK_FIB_LOOKUP, which
gives userland access to fib_lookup().

Blame: rev 3, author xianfeng.
