==========================
| Component Descriptions |
==========================

1.0 Timing Closure Components

The timing closure components are intended for designing custom blocks and pipeline
stages. Each block provides timing closure for block outputs, or for block inputs
and outputs.

The two most common design methodologies today are registered-output (RO) and
registered-input-registered-output (RIRO). The library is generally built around
an assumption of an RO design style but also supports RIRO.

1.1 sd_input

When using an RO design style, the sd_input provides timing closure for a block's
consumer interface. The only block output for the consumer interface is c_drdy.
sd_input also provides a one-word buffer on c_data, but doesn't provide timing
closure for this input.

1.2 sd_output

The sd_output is the companion block to sd_input, providing timing closure for a
block's producer interface (or interfaces). It provides timing closure on p_srdy
and p_data.

1.3 sd_iohalf

The sd_iohalf can be used as either an input or output timing closure block, as
it closes timing on all of its inputs and outputs. It has an efficiency of 0.5,
meaning it can accept data at most every other clock, so it is useful for
low-rate interfaces.

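The efficiency figure can be illustrated with a small behavioral sketch. This is not
the library's Verilog; it is a hypothetical one-slot buffer model that assumes the
producer always has data and the consumer is always ready. The function name and
the model itself are illustrative assumptions.

```python
def simulate_half_buffer(num_cycles):
    """Count words accepted by a hypothetical one-slot buffer over num_cycles clocks."""
    full = False   # True while the single storage slot holds a word
    accepted = 0
    for _ in range(num_cycles):
        if full:
            # Slot is occupied: the word drains to the consumer this cycle,
            # and no new word can be accepted until the next cycle.
            full = False
        else:
            # Slot is empty: accept one word from the producer.
            full = True
            accepted += 1
    return accepted


cycles = 100
efficiency = simulate_half_buffer(cycles) / cycles
```

Under these assumptions the model accepts one word every second clock, which matches
the stated efficiency of 0.5.
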
1.4 sd_iofull

Provided for completeness, this block can be used with a RIRO design style to
provide timing closure for all of a block's inputs and outputs. It combines an
sd_input and an sd_output.

2.0 Buffers

The buffers section of the library contains FIFOs for rate-matching and storage.
Each buffer consists of a "head" (write) block and a "tail" (read) block, so that
the user can construct their own FIFOs from the blocks provided without having to
modify the library code. Each buffer is built around a synthesizable memory-like
block, so the buffers can be synthesized as-is or the top-level blocks can be
used as a template for creating your own FIFO around a library-specific memory.

ECC generate/correct blocks can also be placed inside this wrapper if error
correction is needed (see https://sourceforge.net/projects/xtgenerate/ for an ECC
generator/checker).

2.1 sd_fifo_s

This "small" (or "sync") FIFO is used for rate-matching between blocks. It also
has built-in Gray code conversion, so it can be used for crossing clock domains.
When the "async" parameter is set, the FIFO switches to using Gray code pointers,
and instantiates double-sync flops between the head and tail blocks.

sd_fifo_s only supports sizes that are powers of 2, due to the async support.

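The link between Gray code pointers and the power-of-2 restriction can be sketched
in a few lines. This is a general property of reflected binary Gray code, not code
taken from the library; the function names are illustrative assumptions. A pointer
that crosses clock domains should change only one bit per increment, and that holds
at the wrap-around point only when the pointer range is a power of 2.

```python
def bin_to_gray(value):
    """Convert a binary pointer value to reflected binary Gray code."""
    return value ^ (value >> 1)


def gray_to_bin(gray):
    """Convert a Gray code value back to binary."""
    value = 0
    while gray:
        value ^= gray
        gray >>= 1
    return value


def max_bits_changed(depth):
    """Largest number of bits that flip between successive Gray pointers, wrap included."""
    worst = 0
    for ptr in range(depth):
        nxt = (ptr + 1) % depth
        diff = bin_to_gray(ptr) ^ bin_to_gray(nxt)
        worst = max(worst, bin(diff).count("1"))
    return worst


flips_depth_8 = max_bits_changed(8)   # power of 2
flips_depth_6 = max_bits_changed(6)   # not a power of 2
```

With a range of 8, every increment (including the wrap from 7 to 0) flips a single
bit. With a range of 6, the wrap from 5 to 0 flips three bits, which a double-sync
flop stage could sample inconsistently.
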
2.2 sd_fifo_b

This "big" FIFO supports non-power-of-2 sizes, as well as abort/commit behavior on
both of its interfaces. It is intended for packet FIFOs where the writer may want
to "forget" about a partially-written packet when an error is detected. It is also
useful for blocks which want to read ahead in the FIFO without actually removing data
(p_abort rewinds the read pointer), or for retransmission.

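The abort/commit behavior can be pictured as a pair of pointers on each side: a
committed pointer and a speculative one. The sketch below is a toy software model
under that assumption, not the library's implementation. The class and method names
are hypothetical; only p_abort is named in the text above.

```python
class PacketFifo:
    """Toy model of a FIFO with commit/abort on both the write and read sides."""

    def __init__(self, depth):
        self.depth = depth            # any size, not only powers of 2
        self.mem = [None] * depth
        self.wr_committed = 0         # write pointer visible to the reader
        self.wr_spec = 0              # speculative write pointer
        self.rd_committed = 0         # read pointer that frees space for the writer
        self.rd_spec = 0              # speculative (read-ahead) pointer

    def write(self, word):
        """Store a word speculatively; return False if the FIFO is full."""
        if self.wr_spec - self.rd_committed >= self.depth:
            return False
        self.mem[self.wr_spec % self.depth] = word
        self.wr_spec += 1
        return True

    def write_commit(self):
        self.wr_committed = self.wr_spec      # packet becomes visible to the reader

    def write_abort(self):
        self.wr_spec = self.wr_committed      # "forget" the partial packet

    def read(self):
        """Read ahead without freeing space; return None if nothing is committed."""
        if self.rd_spec == self.wr_committed:
            return None
        word = self.mem[self.rd_spec % self.depth]
        self.rd_spec += 1
        return word

    def read_commit(self):
        self.rd_committed = self.rd_spec      # actually remove the data

    def read_abort(self):
        self.rd_spec = self.rd_committed      # rewind the read pointer


fifo = PacketFifo(depth=6)
fifo.write("bad0")
fifo.write("bad1")
fifo.write_abort()                 # error detected: drop the partial packet
fifo.write("c")
fifo.write("d")
fifo.write_commit()
first_pass = [fifo.read(), fifo.read()]
fifo.read_abort()                  # rewind, for example to retransmit
second_pass = [fifo.read(), fifo.read(), fifo.read()]
```

In this model the aborted words never reach the reader, and the reader sees the same
committed packet again after rewinding.
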
3.0 Forks and Joins

This section provides pipeline fork (split) and join blocks. A fork refers to any
block which has multiple producer interfaces, usually with a single consumer
interface. A join is the corresponding block with multiple consumer interfaces and
a single producer interface.

3.1 sd_mirror

This block is used to implement a mirrored fork, i.e. one in which all producer
interfaces carry the same data. This is useful in control pipelines when a single
item of data needs to go to multiple blocks, which may all acknowledge at different
times.

It has an optional c_dst_vld input, which can be used to "steer" data to one or more
destinations, instead of all of them. c_dst_vld should be asserted with c_srdy, if
it is being used. If not used, tie this input to 0 and it will mirror to all
outputs.

Note that sd_mirror is low-throughput, as it waits until all downstream blocks have
acknowledged before accepting another word.

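The acknowledge tracking described above can be sketched as a bitmask of outstanding
destinations. This is a behavioral illustration only; the class and method names are
hypothetical, and cycle-level timing is not modeled. The dst_vld argument stands in
for c_dst_vld, with 0 meaning "mirror to all outputs" as stated above.

```python
class Mirror:
    """Toy model of a mirrored fork that waits for every selected output to acknowledge."""

    def __init__(self, num_outputs):
        self.all_mask = (1 << num_outputs) - 1
        self.pending = 0      # bit i set: output i still has to acknowledge
        self.data = None

    def can_accept(self):
        return self.pending == 0

    def accept(self, data, dst_vld=0):
        """Take a new word if no acknowledges are outstanding."""
        if not self.can_accept():
            return False
        self.data = data
        # A destination mask of 0 selects every output.
        self.pending = (dst_vld & self.all_mask) or self.all_mask
        return True

    def acknowledge(self, output):
        self.pending &= ~(1 << output)


mirror = Mirror(num_outputs=3)
mirror.accept("w0")                    # mirrored to outputs 0, 1 and 2
mirror.acknowledge(0)
mirror.acknowledge(2)
blocked = not mirror.can_accept()      # output 1 has not acknowledged yet
mirror.acknowledge(1)
mirror.accept("w1", dst_vld=0b101)     # steered to outputs 0 and 2 only
mirror.acknowledge(0)
mirror.acknowledge(2)
ready_again = mirror.can_accept()
```

The slowest destination sets the pace, which is why the block is described as
low-throughput.
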
3.2 sd_rrmux

This block implements a round-robin arbiter/mux. It has multiple modes, which
control whether an input that receives a grant will "hold" the grant, or
whether the arbiter moves on.

Mode 0 multiplexes between single words of data. Mode 1 allows an interface to burst,
so once the interface begins transmitting it can transmit until it deasserts srdy.

Mode 2 is for multiplexing packets, or other data where multiple words need to be
kept together. Once srdy is asserted, the block will not switch inputs until the
end pattern is seen, even if srdy is deasserted.

The block also has a slow (1 cycle per input) and a fast (immediate) arbitration mode.

Validation note: modes 1 and 2 have not been verified to date.

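The three modes can be compared with a small behavioral sketch. It is one reading of
the description above, not the library's Verilog: the class, the step() method and
the per-input "end" flag (standing in for the end pattern) are all hypothetical, and
the slow/fast arbitration option is not modeled.

```python
def next_requester(requests, last):
    """Return the first requesting index after `last`, searching circularly."""
    n = len(requests)
    for offset in range(1, n + 1):
        idx = (last + offset) % n
        if requests[idx]:
            return idx
    return None


class RoundRobinMux:
    """Toy round-robin arbiter with word (0), burst (1) and packet (2) modes."""

    def __init__(self, num_inputs, mode=0):
        self.n = num_inputs
        self.mode = mode
        self.last = num_inputs - 1   # so that input 0 is considered first
        self.holder = None           # input currently holding the grant

    def step(self, srdy, end=None):
        """Advance one cycle; return the granted input index, or None."""
        end = end if end is not None else [False] * self.n
        holder = self.holder
        if holder is not None and srdy[holder]:
            grant = holder                      # holder keeps the grant
        elif holder is not None and self.mode == 2:
            return None                         # packet mode: stay locked through gaps
        else:
            self.holder = None
            grant = next_requester(srdy, self.last)
        if grant is None:
            return None
        self.last = grant
        if self.mode == 1:
            self.holder = grant                 # hold until srdy is deasserted
        elif self.mode == 2:
            self.holder = None if end[grant] else grant
        return grant


word_mux = RoundRobinMux(3, mode=0)
word_grants = [word_mux.step([True, True, True]) for _ in range(4)]

burst_mux = RoundRobinMux(3, mode=1)
burst_grants = [
    burst_mux.step([True, True, False]),
    burst_mux.step([True, True, False]),
    burst_mux.step([False, True, False]),   # input 0 drops srdy, input 1 takes over
]

packet_mux = RoundRobinMux(2, mode=2)
packet_grants = [
    packet_mux.step([True, True]),
    packet_mux.step([False, True]),                  # gap in the packet: no switch
    packet_mux.step([True, True], [True, False]),    # last word of the packet
    packet_mux.step([True, True]),
]
```

In this sketch mode 0 rotates on every word, mode 1 stays on an input until its srdy
drops, and mode 2 stays on an input until its end flag is seen, even across a gap.
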
4.0 Utility

This section is intended for blocks which do not fit into one of the above categories.
Utility blocks could be items like a switch fabric, packet ring, or a scoreboard.

4.1 sd_ring_node

This is a building block for a unidirectional ring. Data is placed on the ring
using the consumer interface and is removed on the producer interface. sd_ring_node
supports only point-to-point single-transaction processing (single transaction meaning
that subsequent requests from the same source are treated as independent, and other
requests from other nodes may be interleaved at the destination).

4.2 sd_scoreboard

This implements a "scoreboard", or centralized repository of information about a number
of items. The scoreboard has a single consumer and producer interface. The user
is expected to use a pipeline join block (such as sd_rrslow) to serialize requests.

The scoreboard has a transaction id that it carries with each read request that can be
used to steer the results back to the requestor. For example, the "p_grant" output from
sd_rrslow can be connected to the c_txid input, and the p_txid output can be connected to
the c_dst_vld input of sd_mirror, giving multi-read/multi-write capability.

The scoreboard supports both read and write, where write can also use a mask to implement
partial updates. If the mask is set to anything other than all 1's, the scoreboard performs
a read-modify-write to change only the unmasked portion of the data.

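The masked write and the transaction id can be sketched as follows. This is a
software illustration with hypothetical names, not the block's interface. It assumes
that a 1 in the mask selects a bit to be updated, which is consistent with an
all-1's mask being a full write; the actual polarity should be checked against the
source.

```python
class Scoreboard:
    """Toy model of a scoreboard with masked writes and a transaction id on reads."""

    def __init__(self, num_items, width):
        self.full_mask = (1 << width) - 1
        self.items = [0] * num_items

    def read(self, item, txid):
        """Return the transaction id with the data so the result can be steered back."""
        return txid, self.items[item]

    def write(self, item, data, mask=None):
        """Write an item; a partial mask triggers a read-modify-write."""
        mask = self.full_mask if mask is None else mask & self.full_mask
        if mask == self.full_mask:
            self.items[item] = data & self.full_mask          # plain write
        else:
            old = self.items[item]                            # read
            self.items[item] = (old & ~mask) | (data & mask)  # modify, then write


sb = Scoreboard(num_items=4, width=8)
sb.write(2, 0xAB)                  # full write
sb.write(2, 0x0F, mask=0x0F)       # update only the low nibble
txid, value = sb.read(2, txid=1)   # value is now 0xAF
```

Returning the transaction id alongside the data is what allows a downstream block
such as sd_mirror to route the result to the original requestor.
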
5.0 Memory

This section contains synthesizable memories implemented as flops. These correspond
to the commonly used registered-output memories available in most technologies.