# miniMIPS Superscalar

FIR filter example code 0, with a dummy instruction. Does NOT work!

addi \$a1, \$t0, 35; 《 pipe 1 runs

addi \$a3, \$t0, 35; 《 pipe 2 runs, \$a3 = n

for1:

lw \$t2, 216(\$a0); 《 pipe 1 runs, \$t2 = x(k), \$a0 = k

lw \$t4, 180(\$a1); 《 pipe 2 runs, \$t4 = h(n-k), \$a1 = n-k

addi \$a0, \$a0, 1; 《 pipe 1 runs

addi \$a1, \$a1, -1; 《 pipe 2 runs

addi \$a2, \$a2, 1; 《 pipe 1 runs (dummy instruction)

mult \$t2, \$t4; 《 pipe 2 runs, waiting for data from the load instructions.

mflo \$t6; 《 pipe 1 runs; does not work because pipe 2 is still waiting for data, so nothing or wrong data will be moved!

add \$t7, \$t7, \$t6; 《 pipe 2 runs

bne \$a3, \$a2, for1; 《 pipe 1 runs

sw \$t7, 258(\$t8); 《 pipe 2 runs, \$t7 = y(n)

FIR filter example code 1, with a dummy instruction. Works!

addi \$a1, \$t0, 35; 《 pipe 1 runs

addi \$a3, \$t0, 35; 《 pipe 2 runs

for1:

lw \$t2, 216(\$a0); 《 pipe 1 runs

lw \$t4, 180(\$a1); 《 pipe 2 runs

addi \$a0, \$a0, 1; 《 pipe 1 runs

addi \$a1, \$a1, -1; 《 pipe 2 runs

mult \$t2, \$t4; 《 pipe 1 runs, waiting for data from the load instructions.

mflo \$t6; 《 pipe 2 runs; works because pipe 1 has already executed the mult instruction.

addi \$a2, \$a2, 1; 《 pipe 1 runs (dummy instruction)

add \$t7, \$t7, \$t6; 《 pipe 2 runs

bne \$a3, \$a2, for1; 《 pipe 1 runs

sw \$t7, 258(\$t8); 《 pipe 2 runs
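The only difference between example codes 0 and 1 is where the dummy instruction sits, and that position decides which pipe receives `mult` and which receives `mflo`. The sketch below is a static illustration of that effect, not a model of the core: it assumes instructions are handed to pipe 1 and pipe 2 strictly alternately in listed order, starting with pipe 1, which is what the annotations on the two listings suggest. The names `assign_pipes` and `pipe_of` are hypothetical helpers, and nothing here models stalls or timing.

```python
# Static sketch only: instructions alternate between the two pipes in listed
# order, starting with pipe 1 (an assumption drawn from the listing comments).

CODE_0 = ["addi", "addi", "lw", "lw", "addi", "addi",
          "addi",            # dummy instruction placed before mult
          "mult", "mflo", "add", "bne", "sw"]

CODE_1 = ["addi", "addi", "lw", "lw", "addi", "addi",
          "mult", "mflo",
          "addi",            # dummy instruction placed after mflo
          "add", "bne", "sw"]


def assign_pipes(mnemonics):
    """Pair each mnemonic with pipe 1 or 2, alternating in program order."""
    return [(1 if i % 2 == 0 else 2, m) for i, m in enumerate(mnemonics)]


def pipe_of(mnemonics, target):
    """Return the pipe that runs the first occurrence of `target`."""
    for pipe, m in assign_pipes(mnemonics):
        if m == target:
            return pipe
    raise ValueError(f"{target} not found")


for name, code in (("code 0", CODE_0), ("code 1", CODE_1)):
    print(name, "mult -> pipe", pipe_of(code, "mult"),
          "| mflo -> pipe", pipe_of(code, "mflo"))
```

Under this assumption, code 0 places `mult` on pipe 2 and `mflo` on pipe 1, while code 1 places `mult` on pipe 1 and `mflo` on pipe 2, matching the comments in the listings.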

FIR filter example code 2, without the dummy instruction. Faster, and works!

addi \$a1, \$t0, 35; 《 pipe 1 runs

addi \$a3, \$t0, 35; 《 pipe 2 runs

for1:

lw \$t2, 216(\$a0); 《 runs on pipe 1 or pipe 2, alternating each iteration.

lw \$t4, 180(\$a1); 《 runs on pipe 1 or pipe 2, alternating each iteration.

addi \$a0, \$a0, 1; 《 runs on pipe 1 or pipe 2, alternating each iteration.

addi \$a1, \$a1, -1; 《 runs on pipe 1 or pipe 2, alternating each iteration.

mult \$t2, \$t4; 《 runs on pipe 1 or pipe 2, alternating each iteration; waits for data from the load instructions.

mflo \$t6; 《 runs on pipe 1 or pipe 2, alternating each iteration; works because pipe 1 or 2 has already executed the mult instruction.

add \$t7, \$t7, \$t6; 《 runs on pipe 1 or pipe 2, alternating each iteration.

bne \$a3, \$a0, for1; 《 runs on pipe 1 or pipe 2, alternating each iteration.

sw \$t7, 258(\$t8); 《 runs on pipe 1 or pipe 2, depending on the number of coefficients (iterations).
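To check what value the loop should leave in \$t7, a sequential reference model can help. The sketch below mirrors example code 2 one instruction at a time, with no pipeline timing at all. It rests on several assumptions that the listing does not state: \$a0, \$t0, \$t7 and \$t8 start at 0, memory is word-addressed (the index registers step by 1), and results are not wrapped to 32 bits. The function name `run_fir_loop` and the dictionary used as memory are hypothetical.

```python
def run_fir_loop(mem, n=35, x_base=216, h_base=180, y_base=258):
    """Sequential model of the loop in example code 2 (no pipeline timing).

    `mem` maps word addresses to integers. Initial register values are
    assumptions; the listing does not show them.
    """
    a0 = 0   # $a0 = k, assumed to start at 0
    a1 = n   # $a1 = n - k  (addi $a1, $t0, 35 with $t0 assumed 0)
    a3 = n   # $a3 = n      (addi $a3, $t0, 35)
    t7 = 0   # $t7 = y(n) accumulator, assumed to start at 0
    t8 = 0   # $t8 = store index, assumed 0

    while True:
        t2 = mem[x_base + a0]   # lw   $t2, 216($a0)   -> x(k)
        t4 = mem[h_base + a1]   # lw   $t4, 180($a1)   -> h(n-k)
        a0 += 1                 # addi $a0, $a0, 1
        a1 -= 1                 # addi $a1, $a1, -1
        lo = t2 * t4            # mult $t2, $t4
        t6 = lo                 # mflo $t6
        t7 += t6                # add  $t7, $t7, $t6
        if a3 == a0:            # bne  $a3, $a0, for1 falls through
            break

    mem[y_base + t8] = t7       # sw   $t7, 258($t8)
    return t7


# Example data: x(0..34) = 1..35 and h(1..35) = 2 (arbitrary test values).
mem = {216 + k: k + 1 for k in range(35)}
mem.update({180 + j: 2 for j in range(1, 36)})
print(run_fir_loop(mem))
```

With these assumptions the loop runs 35 times, for k = 0 to 34, and accumulates x(k)·h(35-k). The value it returns is what a working pipeline should store at address 258.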