1 |
2 |
tak.sugawa |
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
|
2 |
|
|
<HTML>
|
3 |
|
|
<HEAD>
|
4 |
|
|
<META http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
|
5 |
|
|
<META name="GENERATOR" content="IBM WebSphere Studio Homepage Builder Version 9.0.2.0 for Windows">
|
6 |
|
|
<META http-equiv="Content-Style-Type" content="text/css">
|
7 |
|
|
<TITLE></TITLE>
|
8 |
|
|
</HEAD>
|
9 |
|
|
<BODY>
|
10 |
|
|
<P><B>5 Analysis of Design</B> <BR>
|
11 |
|
|
<B>5.1 Problem</B><BR>
|
12 |
|
|
This section walks through an analysis of the source code. The example analysis was performed
using trace mode 2 of the Veritak simulator. (Of course, you can use
another tool.)<BR>
|
15 |
|
|
</P>
|
16 |
|
|
<P>Load the Veritak project "altera_rtl_trace_count.vtakprj".<BR>
Let's look at the instruction "ori $sp,$sp,#$3380". Its micro-operation
is:<BR>
|
19 |
|
|
</P>
|
20 |
|
|
<TABLE border="1">
|
21 |
|
|
<TBODY>
|
22 |
|
|
<TR>
|
23 |
|
|
<TH bgcolor="#97ffff">$sp | #$3380 => $sp</TH>
|
24 |
|
|
</TR>
|
25 |
|
|
</TBODY>
|
26 |
|
|
</TABLE>
|
27 |
|
|
<P>Note that $sp in the register file has still not been written even at Stage 5.<BR>
What happens if another instruction references $sp while this one is still in
Stage 2 through Stage 5? It would read a stale value from the register file. This is called a RAW (Read After Write) data hazard.<BR>
The <B>forwarding mechanism</B> overcomes this problem.</P>
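<P>The forwarding idea can be sketched in Python as follows. This is only an illustrative model, not the actual RTL: the function name select_operand and the (destination, value) tuples are assumptions made for this sketch. Only the register names AReg, NReg and RReg come from the timing table in this section.</P>
<PRE>
```python
# Illustrative model of operand forwarding (not the actual RTL).
# Each pipeline register holds (destination register, value) or None.

def select_operand(src, regfile, pipeline_regs):
    """Return the newest value of register `src`.

    pipeline_regs is ordered from newest to oldest result,
    e.g. [AReg, NReg, RReg]. If a result for `src` is still in
    flight, it is forwarded; otherwise the register file is read.
    """
    for entry in pipeline_regs:
        if entry is not None and entry[0] == src:
            return entry[1]          # forwarded value
    return regfile[src]              # no hazard: use the register file


regfile = {"$sp": 0x0, "$a0": 0x0}
areg = ("$sp", 0x3380)   # result of the ori, not yet written back
nreg = None
rreg = None

forwarded = select_operand("$sp", regfile, [areg, nreg, rreg])
stale = regfile["$sp"]
print(hex(forwarded), hex(stale))
```
</PRE>
<P>Without the forwarding path, a reader of $sp would see the stale register file value until the write-back in Stage 5 completes.</P>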
|
31 |
|
|
<TABLE border="1">
|
32 |
|
|
<TBODY>
|
33 |
|
|
<TR>
|
34 |
|
|
<TH valign="middle" align="center" height="32">Time Slot</TH>
|
35 |
|
|
<TH valign="middle" align="center" height="32">Stage1</TH>
|
36 |
|
|
<TH valign="middle" align="center" height="32">Stage2</TH>
|
37 |
|
|
<TH valign="middle" align="center" height="32" width="240">Stage3</TH>
|
38 |
|
|
<TH valign="middle" align="center" height="32" width="46">Stage4</TH>
|
39 |
|
|
<TH valign="middle" align="center" width="185" height="32">Stage5</TH>
|
40 |
|
|
</TR>
|
41 |
|
|
<TR>
|
42 |
|
|
<TD valign="middle" align="center" height="48"> </TD>
|
43 |
|
|
<TD valign="middle" align="center" height="48">Set Register File Address</TD>
|
44 |
|
|
<TD valign="middle" align="center" height="48">Read Register File<BR>
|
45 |
|
|
ALU_LEFT/Right Latch</TD>
|
46 |
|
|
<TD valign="middle" align="center" height="48" width="240">Mem Write<BR>
|
47 |
|
|
AReg<=ALU</TD>
|
48 |
|
|
<TD valign="middle" align="center" height="48" width="46">Mem Read<BR>
|
49 |
|
|
NReg<=AReg</TD>
|
50 |
|
|
<TD valign="middle" align="center" width="185" height="48">Write Register File<BR>
|
51 |
|
|
RReg<=NReg</TD>
|
52 |
|
|
</TR>
|
53 |
|
|
<TR>
|
54 |
|
|
<TD valign="middle" align="center">1</TD>
|
55 |
|
|
<TD valign="middle" align="center" bgcolor="#ffff00">Fetch & Decode<BR>
|
56 |
|
|
ori $sp,$sp,#$3380</TD>
|
57 |
|
|
<TD valign="middle" align="center"></TD>
|
58 |
|
|
<TD valign="middle" align="center" width="240"></TD>
|
59 |
|
|
<TD valign="middle" align="center" width="46"></TD>
|
60 |
|
|
<TD valign="middle" align="center" width="185"></TD>
|
61 |
|
|
</TR>
|
62 |
|
|
<TR>
|
63 |
|
|
<TD valign="middle" align="center">2</TD>
|
64 |
|
|
<TD valign="middle" align="center" bgcolor="#00cccc">Fetch & Decode<BR>
|
65 |
|
|
sw $z0,0($a0)</TD>
|
66 |
|
|
<TD valign="middle" align="center" bgcolor="#ffff00">ReadRegisterFile<BR>
|
67 |
|
|
ALU_LEFT<=0<BR>
|
68 |
|
|
ALU_RIGHT<=#$3380</TD>
|
69 |
|
|
<TD valign="middle" align="center" width="240"></TD>
|
70 |
|
|
<TD valign="middle" align="center" width="46"></TD>
|
71 |
|
|
<TD valign="middle" align="center" width="185"></TD>
|
72 |
|
|
</TR>
|
73 |
|
|
<TR>
|
74 |
|
|
<TD valign="middle" align="center">3</TD>
|
75 |
|
|
<TD valign="middle" align="center" bgcolor="#cccccc">Fetch & Decode<BR>
|
76 |
|
|
slt $v1,$a0,$a1</TD>
|
77 |
|
|
<TD valign="middle" align="center" bgcolor="#00cccc">ReadRegisterFile</TD>
|
78 |
|
|
<TD valign="middle" align="center" bgcolor="#ffff00" width="240">ALU=LEFT(0) or RIGHT(#$3380);<BR>
|
79 |
|
|
AReg<=ALU</TD>
|
80 |
|
|
<TD valign="middle" align="center" width="46"></TD>
|
81 |
|
|
<TD valign="middle" align="center" width="185"></TD>
|
82 |
|
|
</TR>
|
83 |
|
|
<TR>
|
84 |
|
|
<TD valign="middle" align="center">4</TD>
|
85 |
|
|
<TD valign="middle" align="center"></TD>
|
86 |
|
|
<TD valign="middle" align="center" bgcolor="#cccccc">ReadRegisterFile</TD>
|
87 |
|
|
<TD valign="middle" align="center" bgcolor="#00cccc" width="240">ALU</TD>
|
88 |
|
|
<TD valign="middle" align="center" bgcolor="#ffff00" width="46">NReg<=AReg</TD>
|
89 |
|
|
<TD valign="middle" align="center" width="185"></TD>
|
90 |
|
|
</TR>
|
91 |
|
|
<TR>
|
92 |
|
|
<TD valign="middle" align="center">5</TD>
|
93 |
|
|
<TD valign="middle" align="center"></TD>
|
94 |
|
|
<TD valign="middle" align="center"></TD>
|
95 |
|
|
<TD valign="middle" align="center" bgcolor="#cccccc" width="240">ALU</TD>
|
96 |
|
|
<TD valign="middle" align="center" bgcolor="#00cccc" width="46">MEM</TD>
|
97 |
|
|
<TD valign="middle" align="center" bgcolor="#ffff00" width="185">WB<BR>
|
98 |
|
|
Register File Address=$sp<BR>
|
99 |
|
|
Write Data=#$3380<BR>
|
100 |
|
|
</TD>
|
101 |
|
|
</TR>
|
102 |
|
|
<TR>
|
103 |
|
|
<TD valign="middle" align="center">6</TD>
|
104 |
|
|
<TD valign="middle" align="center"></TD>
|
105 |
|
|
<TD valign="middle" align="center"></TD>
|
106 |
|
|
<TD valign="middle" align="center" width="240"></TD>
|
107 |
|
|
<TD valign="middle" align="center" bgcolor="#cccccc" width="46">MEM</TD>
|
108 |
|
|
<TD valign="middle" align="center" bgcolor="#00cccc" width="185">WB</TD>
|
109 |
|
|
</TR>
|
110 |
|
|
<TR>
|
111 |
|
|
<TD valign="middle" align="center">7</TD>
|
112 |
|
|
<TD valign="middle" align="center"></TD>
|
113 |
|
|
<TD valign="middle" align="center"></TD>
|
114 |
|
|
<TD valign="middle" align="center" width="240"></TD>
|
115 |
|
|
<TD valign="middle" align="center" width="46"></TD>
|
116 |
|
|
<TD valign="middle" align="center" bgcolor="#cccccc" width="185">WB</TD>
|
117 |
|
|
</TR>
|
118 |
|
|
<TR>
|
119 |
|
|
<TD valign="middle" align="center">8</TD>
|
120 |
|
|
<TD valign="middle" align="center"></TD>
|
121 |
|
|
<TD valign="middle" align="center"></TD>
|
122 |
|
|
<TD valign="middle" align="center" width="240"></TD>
|
123 |
|
|
<TD valign="middle" align="center" width="46"></TD>
|
124 |
|
|
<TD valign="middle" align="center" width="185"></TD>
|
125 |
|
|
</TR>
|
126 |
|
|
</TBODY>
|
127 |
|
|
</TABLE>
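<P>The stage occupancy in the table above can be reproduced with a few lines of Python. This is a simplified sketch: it assumes one instruction is issued per time slot and that nothing stalls, and the names occupancy and program are made up for the example.</P>
<PRE>
```python
# Which instruction occupies which stage in each time slot,
# assuming one instruction is issued per slot and nothing stalls.
STAGES = 5
program = ["ori $sp,$sp,#$3380", "sw $z0,0($a0)", "slt $v1,$a0,$a1"]

def occupancy(program, stages=STAGES):
    """Map time slot -> list with one entry per stage (None if empty)."""
    last_slot = len(program) + stages - 1
    table = {}
    for slot in range(1, last_slot + 1):
        row = []
        for stage in range(1, stages + 1):
            index = slot - stage      # instruction that entered Stage 1 in slot index+1
            row.append(program[index] if index in range(len(program)) else None)
        table[slot] = row
    return table

table = occupancy(program)
for slot, row in table.items():
    print(slot, [instr or "-" for instr in row])
```
</PRE>
<P>In time slot 5 the ori is in Stage 5 (write-back) while sw and slt are already in Stage 4 and Stage 3, which is exactly the window in which forwarding is needed.</P>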
|
128 |
|
|
<P><BR>
|
129 |
|
|
<BR>
|
130 |
|
|
<IMG src="yacc12.jpg" width="1280" height="1024" border="0"></P>
|
131 |
|
|
<P><BR>
|
132 |
|
|
<BR>
|
133 |
|
|
<B>5.2 "Forwarding" Analysis</B><BR>
|
134 |
|
|
<BR>
|
135 |
|
|
Look at the point where the tool tip displays "sw $z0, 0($a0)".<BR>
This instruction causes a RAM write to address 0x928 two clocks
later. However, 0x928 (=$a0) was set by the instruction ori $a0,$a0,0x928
five cycles earlier. So this is a <B>forwarding</B> case.<BR>
|
139 |
|
|
<BR>
|
140 |
|
|
The pipeline registers carry 0x928 during <B>forwarding</B>.<BR>
Let's trace the situation back in the following analysis.</P>
|
142 |
|
|
<TABLE border="1">
|
143 |
|
|
<TBODY>
|
144 |
|
|
<TR>
|
145 |
|
|
<TH align="left" bgcolor="#d7ffff">
|
146 |
|
|
<P>0: 3c1c0000 lui $gp,0x0 <BR>
|
147 |
|
|
4: 379c0000 ori $gp,$gp,0x88a0 <BR>
|
148 |
|
|
8: 3c040000 lui $a0,0x0 <I><BR>
|
149 |
|
|
<B><U><FONT color="#ff0000">c: 34840000 ori $a0,$a0,0x928</FONT></U></B></I> <BR>
|
150 |
|
|
10: 3c050000 lui $a1,0x0 <BR>
|
151 |
|
|
14: 34a50000 ori $a1,$a1,0x934 <BR>
|
152 |
|
|
18: 3c1d0000 lui $sp,0x0 <BR>
|
153 |
|
|
1c: 37bdfff0 ori $sp,$sp,0x3f80 <FONT color="#ff0000"><U><B><BR>
|
154 |
|
|
20: ac800000 sw $zero,0($a0)</B></U></FONT> <BR>
|
155 |
|
|
24: 0085182a slt $v1,$a0,$a1 <BR>
|
156 |
|
|
28: 1460fffd bnez $v1,0x20 <BR>
|
157 |
|
|
2c: 24840004 addiu $a0,$a0,4 <BR>
|
158 |
|
|
30: 0c00019d jal 0x674 </P>
|
159 |
|
|
</TH>
|
160 |
|
|
</TR>
|
161 |
|
|
</TBODY>
|
162 |
|
|
</TABLE>
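<P>The hexadecimal words in the listing above can be checked by splitting them into the standard MIPS instruction fields. The following Python sketch does this for the sw and slt words. The function name decode is made up for the example, and the register numbers ($zero=0, $v1=3, $a0=4, $a1=5) follow the usual MIPS convention.</P>
<PRE>
```python
# Split a 32-bit MIPS instruction word into its fixed bit fields.
def decode(word):
    return {
        "opcode": (word >> 26) & 0x3F,   # bits 31..26
        "rs":     (word >> 21) & 0x1F,   # bits 25..21
        "rt":     (word >> 16) & 0x1F,   # bits 20..16
        "rd":     (word >> 11) & 0x1F,   # bits 15..11 (R-type only)
        "funct":  word & 0x3F,           # bits 5..0   (R-type only)
        "imm":    word & 0xFFFF,         # bits 15..0  (I-type only)
    }

sw = decode(0xAC800000)    # 20: sw $zero,0($a0)
slt = decode(0x0085182A)   # 24: slt $v1,$a0,$a1
print(sw)
print(slt)
```
</PRE>
<P>For the sw word, rs is 4 ($a0), so the store address is formed from $a0 plus the immediate. That is the value being forwarded in this section.</P>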
|
163 |
|
|
<P><IMG src="yacc2.png" width="1280" height="1024" border="0"></P>
|
164 |
|
|
<P><BR>
|
165 |
|
|
To investigate what is driving DAddress to 0x928, jump to the tag file.<BR>
|
166 |
|
|
<IMG src="yacc_trace1.png" width="1199" height="934" border="0"><BR>
|
167 |
|
|
<BR>
|
168 |
|
|
Then jump to the tag file, a text file that describes the entire structure of the design.<BR>
|
170 |
|
|
<BR>
|
171 |
|
|
<IMG src="yacc_trace2.png" width="1210" height="967" border="0"><BR>
|
172 |
|
|
<BR>
|
173 |
|
|
Move to <FONT color="#ff00ff">SourceDriver</FONT> =><FONT color="#ff00ff">Assigned</FONT>:<BR>
|
174 |
|
|
Select the signal and double-click it.<BR>
|
175 |
|
|
<IMG src="yacc_trace3.png" width="1210" height="967" border="0"><BR>
|
176 |
|
|
<BR>
|
177 |
|
|
This jumps to the source code where DAddress is assigned.<BR>
We see that DAddress is the result of a (zero-delay) add operation
between alu_source and the relevant part of IRD2.<BR>
|
180 |
|
|
<IMG src="yacc_trace4.png" width="1210" height="967" border="0"><BR>
|
181 |
|
|
<BR>
|
182 |
|
|
There is another way to reach the driver.<BR>
You can jump to the source code directly with "Jump to Driver".<BR>Set the T1 cursor at the write strobe time, select the signal, and choose Jump to Driver.<BR>
|
184 |
|
|
<IMG src="yacc_trace5.png" width="1210" height="967" border="0"><BR>
|
185 |
|
|
<BR>
|
186 |
|
|
The result is the same as with the tag jump.<BR>
|
187 |
|
|
<IMG src="yacc_trace6.png" width="1210" height="967" border="0"><BR>
|
188 |
|
|
<BR>
|
189 |
|
|
<BR>
|
190 |
|
|
Since this assignment is combinational logic, you can view the value with the tool tip.<BR>
|
192 |
|
|
<IMG src="yacc_trace7.png" width="1210" height="967" border="0"><BR>
|
193 |
|
|
<BR>
|
194 |
|
|
Now we see that "alu_source" is 32'h928, which is the next target
for further analysis.<BR>
|
196 |
|
|
<IMG src="yacc_trace8.png" width="1210" height="967" border="0"><BR>
|
197 |
|
|
<BR>
|
198 |
|
|
Add "alu_source" to the WaveformView for further analysis.<BR>
|
199 |
|
|
<IMG src="yacc_trace9.png" width="1210" height="967" border="0"><BR>
|
200 |
|
|
<BR>
|
201 |
|
|
<BR>
|
202 |
|
|
Jump to Driver for "alu_source"<BR>
|
203 |
|
|
<IMG src="yacc_trace10.png" width="1210" height="967" border="0"><BR>
|
204 |
|
|
<BR>
|
205 |
|
|
<BR>
|
206 |
|
|
A warning is displayed.<BR>
|
207 |
|
|
<IMG src="yacc_trace11.png" width="1210" height="967" border="0"><BR>
|
208 |
|
|
<BR>
|
209 |
|
|
<BR>
|
210 |
|
|
Expand the signal to bits.<BR>
|
211 |
|
|
<IMG src="yacc_trace12.png" width="1210" height="967" border="0"><BR>
|
212 |
|
|
<BR>
|
213 |
|
|
<BR>
|
214 |
|
|
Then jump to the driver from any bit that is active.<BR>
|
215 |
|
|
<IMG src="yacc_trace13.png" width="1210" height="967" border="0"><BR>
|
216 |
|
|
<BR>
|
217 |
|
|
This jumps to the assignment. It is a combinational circuit with no
delay, so in this case we can read the value with the tool tip.<BR>
|
219 |
|
|
<IMG src="yacc_trace14.png" width="1210" height="967" border="0"><BR>
|
220 |
|
|
<BR>
|
221 |
|
|
<BR>
|
222 |
|
|
Add "alu_left_latch" to the WaveformView for further analysis.<BR>
|
223 |
|
|
<IMG src="yacc_trace15.png" width="1210" height="967" border="0"><BR>
|
224 |
|
|
<BR>
|
225 |
|
|
<BR>
|
226 |
|
|
Jump to Driver.<BR>
|
227 |
|
|
<IMG src="yacc_trace16.png" width="1210" height="967" border="0"><BR>
|
228 |
|
|
<BR>
|
229 |
|
|
<BR>
|
230 |
|
|
This jumps to the position where "alu_left_latch" is assigned by a non-blocking
statement.<BR>
|
232 |
|
|
<IMG src="yacc_trace17.png" width="1210" height="967" border="0"><BR>
|
233 |
|
|
</P>
|
234 |
|
|
<P>Jump to Driver on "DReg" at the T1 cursor.<BR>
|
235 |
|
|
<IMG src="yacc_trace18.png" width="1210" height="967" border="0"><BR>
|
236 |
|
|
</P>
|
237 |
|
|
<P>We see that "DReg" is just a pipeline register.</P>
|
238 |
|
|
<P><IMG src="yacc_trace19.png" width="1210" height="967" border="0"></P>
|
239 |
|
|
<P><BR>
|
240 |
|
|
Let's investigate what is driving "RReg".</P>
|
241 |
|
|
<P><IMG src="yacc_trace20.png" width="1210" height="967" border="0"><BR>
|
242 |
|
|
<BR>
|
243 |
|
|
<BR>
|
244 |
|
|
"NReg" is driving.<BR>
|
245 |
|
|
<IMG src="yacc_trace21.png" width="1210" height="967" border="0"><BR>
|
246 |
|
|
<BR>
|
247 |
|
|
<BR>
|
248 |
|
|
What is driving "NReg" ?<BR>
|
249 |
|
|
<IMG src="yacc_trace22.png" width="1210" height="967" border="0"><BR>
|
250 |
|
|
<BR>
|
251 |
|
|
"AReg" is driving.<BR>
|
252 |
|
|
<IMG src="yacc_trace23.png" width="1210" height="967" border="0"><BR>
|
253 |
|
|
<BR>
|
254 |
|
|
<BR>
|
255 |
|
|
What is driving "AReg"?<BR>
|
256 |
|
|
<IMG src="yacc_trace24.png" width="1210" height="967" border="0"><BR>
|
257 |
|
|
<BR>
|
258 |
|
|
<BR>
|
259 |
|
|
"alu_out " is driving.<BR>
|
260 |
|
|
<IMG src="yacc_trace25.png" width="1210" height="967" border="0"><BR>
|
261 |
|
|
<BR>
|
262 |
|
|
Then investigate the driver of "alu_out".<BR>
|
263 |
|
|
<IMG src="yacc_trace27.png" width="1210" height="967" border="0"><BR>
|
264 |
|
|
<BR>
|
265 |
|
|
<BR>
|
266 |
|
|
We see that alu_out = a | b;<BR>
|
267 |
|
|
<IMG src="yacc_trace28.png" width="1210" height="967" border="0"><BR>
|
268 |
|
|
<BR>
|
269 |
|
|
<BR>
|
270 |
|
|
<BR>
|
271 |
|
|
Finally, we understand that the value 0x928 comes from the instruction
"ori $a0, $a0,#$928" through the pipeline registers, not from the register
file's output. See the <A href="pipelined_reg.gif">block diagram</A> I drew, which is a sketch from an early design stage.<BR>
|
274 |
|
|
<IMG src="yacc_trace29.png" width="1210" height="967" border="0"><BR>
|
275 |
|
|
<BR>
|
276 |
|
|
<BR>
|
277 |
|
|
<BR>
|
278 |
|
|
</P>
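<P>The trace-back above can be summarized as a small driver map in Python. This is my reading of the walkthrough, not something extracted from the design files: the links from alu_left_latch to DReg and from DReg to RReg are inferred from the order of the steps, and the labels "comb" and "reg" as well as the function name trace_back are made up for this sketch.</P>
<PRE>
```python
# Driver relationships as read from the walkthrough above.
# Each entry: driven signal -> (driving signal, kind of assignment),
# "comb" = combinational, "reg" = clocked (non-blocking) register.
DRIVERS = {
    "DAddress":       ("alu_source", "comb"),
    "alu_source":     ("alu_left_latch", "comb"),
    "alu_left_latch": ("DReg", "reg"),
    "DReg":           ("RReg", "reg"),
    "RReg":           ("NReg", "reg"),
    "NReg":           ("AReg", "reg"),
    "AReg":           ("alu_out", "reg"),
}

def trace_back(signal, drivers=DRIVERS):
    """Follow the drivers back from `signal`; return (path, clocked stages)."""
    path = [signal]
    clocks = 0
    while signal in drivers:
        source, kind = drivers[signal]
        if kind == "reg":
            clocks += 1               # one clock edge per registered hop
        path.append(source)
        signal = source
    return path, clocks

path, clocks = trace_back("DAddress")
print(", driven by ".join(path))
print("clocked stages:", clocks)
```
</PRE>
<P>Each "Jump to Driver" step in the simulator corresponds to one lookup in this map. The register file never appears on the path, which is the point of the forwarding analysis.</P>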
|
279 |
|
|
</BODY>
|
280 |
|
|
</HTML>
|