OpenCores
URL https://opencores.org/ocsvn/an-fpga-implementation-of-low-latency-noc-based-mpsoc/an-fpga-implementation-of-low-latency-noc-based-mpsoc/trunk

Subversion Repositories an-fpga-implementation-of-low-latency-noc-based-mpsoc

[/] [an-fpga-implementation-of-low-latency-noc-based-mpsoc/] [trunk/] [mpsoc/] [src_processor/] [mor1kx-3.1/] [rtl/] [verilog/] [mor1kx_cache_lru.v] - Blame information for rev 42

Details | Compare with Previous | View Log

Line No. Rev Author Line
1 38 alirezamon
/******************************************************************************
2
 This Source Code Form is subject to the terms of the
3
 Open Hardware Description License, v. 1.0. If a copy
4
 of the OHDL was not distributed with this file, You
5
 can obtain one at http://juliusbaxter.net/ohdl/ohdl.txt
6
 
7
 Description: Data cache LRU implementation
8
 
9
 Copyright (C) 2012 Stefan Wallentowitz <stefan.wallentowitz@tum.de>
10
 
11
 ******************************************************************************/
12
 
13
// This is the least-recently-used (LRU) calculation module. It
14
// essentially has two types of input and output. First, the history
15
// information needs to be evaluated to calculate the LRU value.
16
// Second, the current access and the LRU are one hot values of the
17
// ways.
18
//
19
// This module is pure combinational. All registering is done outside
20
// this module. The following parameter exists:
21
//
22
//  * NUMWAYS: Number of ways (must be greater than 1)
23
//
24
// The following ports exist:
25
//
26
//  * current: The current LRU history
27
//  * update: The new LRU history after access
28
//
29
//  * access: 0 if no access or one-hot of the way that accesses
30
//  * lru_pre: LRU before the access (one hot of ways)
31
//  * lru_post: LRU after the access (one hot of ways)
32
//
33
// The latter three have the width of NUMWAYS apparently. The first
34
// three are more complicated as this is an optimized way of storing
35
// the history information, which will be shortly described in the
36
// following.
37
//
38
// A naive approach to store the history of the access is to store the
39
// relative "age" of each element in a vector, for example for four
40
// ways:
41
//
42
//   0: 1 1: 3 2: 1 3:0
43
//
44
// This needs 4x2 bits, but more important it also needs a set of
45
// comparators and adders. This can become increasingly complex when
46
// using a higher number of cache ways with an impact on area and
47
// timing.
48
//
49
// Similarly, it is possible to store a "stack" of the access and
50
// reorder this stack on an access. But the problems are similar, it
51
// needs comparators etc.
52
//
53
// A neat approach is to store the history efficiently coded, while
54
// also easing the calculation. This approach stores the information
55
// whether each entry is older than the others. For example for the
56
// four-way example (x<y means x is older than y):
57
//
58
// |0<1|0<2|0<3|1<0|1<2|1<3|2<0|2<1|2<3|3<0|3<1|3<2|
59
//
60
// This is redundant as two entries can never be equally old meaning
61
// x<y == !y<x, leading to a simpler version
62
//
63
// |0<1|0<2|0<3|1<2|1<3|2<3|
64
//
65
// The calculations on this vector are much simpler and it is
66
// therefore used by this module.
67
//
68
// The width of this vector is the triangular number of (NUMWAYS-1),
69
// specifically:
70
//  WIDTH=NUMWAYS*(NUMWAYS-1)/2.
71
//
72
// The details of the algorithms are described below. The designer
73
// just needs to apply current history vector and the access and gets
74
// the updated history and the LRU before and after the access.
75
//
76
// Instantiation example:
77
// mor1kx_dcache_lru
78
//  (.NUMWAYS(4))
79
// u_lru(.current  (current_history[((NUMWAYS*(NUMWAYS-1))>>1)-1:0])),
80
//       .update   (updated_history[((NUMWAYS*(NUMWAYS-1))>>1)-1:0])),
81
//       .access   (access[NUMWAYS-1:0]),
82
//       .lru_pre  (lru_pre[NUMWAYS-1:0]),
83
//       .lru_post (lru_post[NUMWAYS-1:0]));
84
 
85 42 alirezamon
`timescale       1ns/1ps
86 38 alirezamon
 
87
module mor1kx_cache_lru(/*AUTOARG*/
88
   // Outputs
89
   update, lru_pre, lru_post,
90
   // Inputs
91
   current, access
92
   );
93
   parameter NUMWAYS = 2;
94
 
95
   // Triangular number
96
   localparam WIDTH = NUMWAYS*(NUMWAYS-1) >> 1;
97
 
98
   input [WIDTH-1:0]        current;
99
   output reg [WIDTH-1:0]   update;
100
 
101
   input [NUMWAYS-1:0]      access;
102
   output reg [NUMWAYS-1:0] lru_pre;
103
   output reg [NUMWAYS-1:0] lru_post;
104
 
105
   reg [NUMWAYS-1:0]        expand [0:NUMWAYS-1];
106
 
107
   integer              i, j;
108
   integer              offset;
109
 
110
   //
111
   // <    0      1      2      3
112
   // 0    1    (0<1)  (0<2)  (0<3)
113
   // 1  (1<0)    1    (1<2)  (1<3)
114
   // 2  (2<0)  (2<1)    1    (2<3)
115
   // 3  (3<0)  (3<1)  (3<2)    1
116
   //
117
   // As two entries can never be equally old (needs to be avoided on
118
   // the outside) this is equivalent to:
119
   //
120
   // <    0      1      2      3
121
   // 0    1    (0<1)  (0<2)  (0<3)
122
   // 1 !(0<1)    1    (1<2)  (1<3)
123
   // 2 !(0<2) !(1<2)    1    (2<3)
124
   // 3 !(0<3) !(1<3) !(2<3)    1
125
   //
126
   // The lower half below the diagonal is the inverted mirror of the
127
   // upper half. The number of entries in each half is of course
128
   // equal to the width of our LRU vector and the upper half is
129
   // filled with the values from the vector.
130
   //
131
   // The algorithm works as follows:
132
   //
133
   //  1. Fill the matrix (expand) with the values. The entry (i,i) is
134
   //     statically one.
135
   //
136
   //  2. The LRU_pre vector is the vector of the ANDs of the each row.
137
   //
138
   //  3. Update the values with the access vector (if any) in the
139
   //     following way: If access[i] is set, the values in row i are
140
   //     set to 0. Similarly, the values in column i are set to 1.
141
   //
142
   //  4. The update vector of the lru history is then generated by
143
   //     copying the upper half of the matrix back.
144
   //
145
   //  5. The LRU_post vector is the vector of the ANDs of each row.
146
   //
147
   // In the following an example will be used to demonstrate the algorithm:
148
   //
149
   //  NUMWAYS = 4
150
   //  current = 6'b110100;
151
   //  access  = 4'b0010;
152
   //
153
   // This current history is:
154
   //
155
   //  0<1 0<2 0<3 1<2 1<3 2<3
156
   //   0   0   1   0   1   1
157
   //
158
   // and way 2 is accessed.
159
   //
160
   // The history of accesses is 3>0>1>2 and the expected result is an
161
   // update to 2>3>0>1 with LRU_pre=2 and LRU_post=1
162
 
163
 
164
   always @(*) begin : comb
165
      // The offset is used to transfer the flat history vector into
166
      // the upper half of the
167
      offset = 0;
168
 
169
      // 1. Fill the matrix (expand) with the values. The entry (i,i) is
170
      //    statically one.
171
      for (i = 0; i < NUMWAYS; i = i + 1) begin
172
         expand[i][i] = 1'b1;
173
 
174
         for (j = i + 1; j < NUMWAYS; j = j + 1) begin
175
            expand[i][j] = current[offset+j-i-1];
176
         end
177
         for (j = 0; j < i; j = j + 1) begin
178
            expand[i][j] = !expand[j][i];
179
         end
180
 
181
         offset = offset + NUMWAYS - i - 1;
182
      end // for (i = 0; i < NUMWAYS; i = i + 1)
183
 
184
      // For the example expand is now:
185
      // <    0      1      2      3        0 1 2 3
186
      // 0    1    (0<1)  (0<2)  (0<3)    0 1 0 0 1
187
      // 1  (1<0)    1    (1<2)  (1<3) => 1 1 1 0 1
188
      // 2  (2<0)  (2<1)    1    (2<3)    2 1 1 1 1
189
      // 3  (3<0)  (3<1)  (3<2)    1      3 0 0 0 1
190
 
191
 
192
      //  2. The LRU_pre vector is the vector of the ANDs of the each
193
      //     row.
194
      for (i = 0; i < NUMWAYS; i = i + 1) begin
195
         lru_pre[i] = &expand[i];
196
      end
197
 
198
      // We derive why this is the case for the example here:
199
      // lru_pre[2] is high when the following condition holds:
200
      //
201
      //  (2<0) & (2<1) & (2<3).
202
      //
203
      // Applying the negation transform we get:
204
      //
205
      //  !(0<2) & !(1<2) & (2<3)
206
      //
207
      // and this is exactly row [2], so that here
208
      //
209
      // lru_pre[2] = &expand[2] = 1'b1;
210
      //
211
      // At this point you can also see why we initialize the diagonal
212
      // with 1.
213
 
214
      //  3. Update the values with the access vector (if any) in the
215
      //     following way: If access[i] is set, the values in row i
216
      //     are set to 0. Similarly, the values in column i are set
217
      //     to 1.
218
      for (i = 0; i < NUMWAYS; i = i + 1) begin
219
         if (access[i]) begin
220
            for (j = 0; j < NUMWAYS; j = j + 1) begin
221
               if (i != j) begin
222
                  expand[i][j] = 1'b0;
223
               end
224
            end
225
            for (j = 0; j < NUMWAYS; j = j + 1) begin
226
               if (i != j) begin
227
                  expand[j][i] = 1'b1;
228
               end
229
            end
230
         end
231
      end // for (i = 0; i < NUMWAYS; i = i + 1)
232
 
233
      // Again this becomes obvious when you see what we do here.
234
      // Accessing way 2 leads means now
235
      //
236
      // (0<2) = (1<2) = (3<2) = 1, and
237
      // (2<0) = (2<1) = (2<3) = 0
238
      //
239
      // The matrix changes accordingly
240
      //
241
      //   0 1 2 3      0 1 2 3
242
      // 0 1 0 0 1    0 1 0 1 1
243
      // 1 1 1 0 1 => 1 1 1 1 1
244
      // 2 1 1 1 1    2 0 0 1 0
245
      // 3 0 0 0 1    3 0 0 1 1
246
 
247
      // 4. The update vector of the lru history is then generated by
248
      //    copying the upper half of the matrix back.
249
      offset = 0;
250
      for (i = 0; i < NUMWAYS; i = i + 1) begin
251
         for (j = i + 1; j < NUMWAYS; j = j + 1) begin
252
            update[offset+j-i-1] = expand[i][j];
253
         end
254
         offset = offset + NUMWAYS - i - 1;
255
      end
256
 
257
      // This is the opposite operation of step 1 and is clear now.
258
      // Update becomes:
259
      //
260
      //  update = 6'b011110
261
      //
262
      // This is translated to
263
      //
264
      //  0<1 0<2 0<3 1<2 1<3 2<3
265
      //   0   1   1   1   1   0
266
      //
267
      // which is: 2>3>0>1, which is what we expected.
268
 
269
      // 5. The LRU_post vector is the vector of the ANDs of each row.
270
      for (i = 0; i < NUMWAYS; i = i + 1) begin
271
         lru_post[i] = &expand[i];
272
      end
273
 
274
      // This final step is equal to step 2 and also clear now.
275
      //
276
      // lru_post[1] = &expand[1] = 1'b1;
277
      //
278
      // lru_post = 4'b0010 is what we expected.
279
   end
280
 
281
 
282
endmodule // mor1kx_dcache_lru

powered by: WebSVN 2.1.0

© copyright 1999-2024 OpenCores.org, equivalent to Oliscience, all rights reserved. OpenCores®, registered trademark.