OpenCores
URL https://opencores.org/ocsvn/wb_vga/wb_vga/trunk

<!--# set var="title" value="Title" -->
<!--# include virtual="/ssi/ssi_start.shtml" -->
<link REL="stylesheet" TYPE="text/css" HREF="/people/tantos/styles.css">

<h1>Wishbone Monitor Controller Accelerator</h1>
<h2>Description</h2>
<strong>Wishbone Monitor Controller Accelerator</strong> is a small address generator
which can (when used properly) speed up various common video memory operations.
The core is 100% Wishbone compatible with the
<a href="/cores/wb_tk/wb_extensions.shtml">WishboneTK extensions</a>. It incorporates two
Wishbone interfaces: a slave interface for accessing its internal registers and for
generating access cycles to the pixel memory, and a master interface to the pixel memory.
The data width of the two interfaces should be the same, but the address widths may differ.

<h3>Accelerator functions</h3>
<p>
Many 8-bit systems have a very limited address space. Commonly they cannot access more
than 64k of memory, and the address space available for a video buffer is often much
smaller still. There are two ways to overcome this:
<ul>
        <li>Paging</li>
        <li>Indirect addressing</li>
</ul>
Both implementations are quite common. For this core I'm planning to use
the second approach with a little twist. You can imagine indirect addressing of a
frame buffer as having a cursor on it. Writes overwrite the value under the cursor and reads
return the value under the cursor. Writes to the index register alter the cursor
position, and reads return the cursor location. That fits seamlessly with the way many algorithms
think about the display. However, there is one drawback: on average you need two accesses to reach
a pixel. First you have to set the cursor position by writing the index register, and only then can you
access the requested memory position. The first approach (paging) excels at this: it supplies
both the location and the data in the same access.
<p>
The important thing is that while paging provides two types of information in one cycle (what
and where), indirect addressing provides only one.
<p>
Many algorithms use relative addressing, which is quite time-consuming to implement in software. They
say things like "write to the pixel to the left, or above" or "go two rows down". The
accelerator part of the design will do such post-modification of the cursor automatically in hardware.
It will have a set of post-modify registers (a small block of RAM). It will also have a separate address
space of (let's say) 256 locations, used for indirect frame-buffer access. A write operation to this
address space will select a post-modify register and supply the data to be written to the
frame buffer. After the write completes, the cursor location will be incremented or decremented by the
value in the addressed post-modify register. A read operation works the very same way, only
returning data instead of writing it to the frame buffer.
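<p>
The post-modify mechanism described above can be sketched as a small behavioural model. This is only an
illustration in Python, not the core's HDL: the class name <code>Accelerator</code>, its method names, and the
wrap-around of the cursor at the end of the memory are assumptions made for the sketch.

```python
class Accelerator:
    """Behavioural sketch of the cursor and its post-modify registers."""

    def __init__(self, memory, num_regs=256):
        self.memory = memory                 # the pixel memory (frame buffer)
        self.cursor = 0                      # current position in the pixel memory
        self.post_modify = [0] * num_regs    # signed cursor deltas

    def write(self, reg, value):
        """Store value under the cursor, then move the cursor."""
        self.memory[self.cursor] = value
        self._advance(reg)

    def read(self, reg):
        """Return the value under the cursor, then move the cursor."""
        value = self.memory[self.cursor]
        self._advance(reg)
        return value

    def _advance(self, reg):
        # Assumption: the cursor wraps like a fixed-width address register.
        self.cursor = (self.cursor + self.post_modify[reg]) % len(self.memory)


fb = bytearray(16)
acc = Accelerator(fb)
acc.post_modify[0] = 1      # register 0 means "then move right"
acc.post_modify[1] = -1     # register 1 means "then move left"
acc.cursor = 4
acc.write(0, 0xAA)          # writes fb[4], cursor becomes 5
acc.write(0, 0xBB)          # writes fb[5], cursor becomes 6
acc.write(1, 0xCC)          # writes fb[6], cursor becomes 5
first = acc.read(1)         # reads fb[5], cursor becomes 4
```

Each access carries both the data and, through the choice of register, the direction of the next move, so no
separate write to the cursor is needed between neighbouring pixels.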

<p>
Let's have an example! Say we would like to scroll the screen
up by one line. With direct access the code would be something like this:
<pre>
   mov  R1, frame_buffer_addr          ; destination: first byte of row 0
   mov  R2, frame_buffer_addr+xres/8   ; source: first byte of row 1
   mov  R3, frame_buffer_size-xres/8   ; number of bytes to copy
loop:
   mov  R4,[R2]                        ; read source byte
   mov  [R1],R4                        ; write it one row up
   inc  R1
   inc  R2
   dec  R3
   jnz  loop
</pre>
<p>
The same thing with indirect addressing:
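<pre>
   mov  R1, 0                          ; destination cursor: first byte of row 0
   mov  R2, xres/8                     ; source cursor: first byte of row 1
   mov  R3, frame_buffer_size-xres/8   ; number of bytes to copy
loop:
   mov  [cursor_reg],R2                ; point the cursor at the source
   mov  R4,[pixel_reg]                 ; read source byte
   mov  [cursor_reg],R1                ; point the cursor at the destination
   mov  [pixel_reg],R4                 ; write it one row up
   inc  R1
   inc  R2
   dec  R3
   jnz  loop
</pre>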
<p>
And finally the same thing with the accelerator functions:
<pre>
   mov  R1, -xres/8
   mov  [cursor_post_inc_reg1],R1      ; post-increment 1: one row up
   mov  R1, 1+xres/8
   mov  [cursor_post_inc_reg2],R1      ; post-increment 2: one row down, one byte right
   mov  R1, frame_buffer_addr+xres/8
   mov  [cursor_reg],R1                ; start at the first byte of row 1 (the source)
   mov  R2, frame_buffer_size-xres/8   ; number of bytes to copy
loop:
   mov  R1,[cursor_post_inc_reg1]      ; read source byte; cursor then moves one row up
   mov  [cursor_post_inc_reg2],R1      ; write it; cursor then moves to the next source byte
   dec  R2
   jnz  loop
</pre>
<p>
As you can see, for this particular algorithm this approach is faster than even direct access
to the frame buffer: the inner loop needs only two bus accesses and no pointer arithmetic.
Hence the pompous word, accelerator.
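<p>
To make the access count concrete, here is a small Python simulation of the accelerated scroll. It is a
sketch under assumptions, not a model of the real core: the <code>Accelerator</code> class, the
<code>accesses</code> counter, and the register numbers <code>UP</code> and <code>DOWN_RIGHT</code> are
invented for the illustration, and the cursor is assumed to start on the first byte of the second row.

```python
class Accelerator:
    """Cursor with post-modify registers that also counts CPU bus accesses."""

    def __init__(self, memory):
        self.memory = memory
        self.cursor = 0
        self.post_modify = {}
        self.accesses = 0

    def set_cursor(self, value):
        self.cursor = value
        self.accesses += 1

    def set_post_modify(self, reg, delta):
        self.post_modify[reg] = delta
        self.accesses += 1

    def read(self, reg):
        value = self.memory[self.cursor]
        self.cursor += self.post_modify[reg]
        self.accesses += 1
        return value

    def write(self, reg, value):
        self.memory[self.cursor] = value
        self.cursor += self.post_modify[reg]
        self.accesses += 1


def scroll_up(acc, row_bytes, size):
    """Copy every row onto the row above it; the last row is left as it was."""
    UP, DOWN_RIGHT = 1, 2                          # hypothetical register numbers
    acc.set_post_modify(UP, -row_bytes)            # after a read: one row up
    acc.set_post_modify(DOWN_RIGHT, row_bytes + 1) # after a write: next source byte
    acc.set_cursor(row_bytes)                      # first byte of the second row
    for _ in range(size - row_bytes):
        value = acc.read(UP)
        acc.write(DOWN_RIGHT, value)


ROW_BYTES, ROWS = 4, 3
fb = bytearray(range(ROW_BYTES * ROWS))            # rows: 0..3, 4..7, 8..11
acc = Accelerator(fb)
scroll_up(acc, ROW_BYTES, len(fb))
```

In this model the loop costs two bus accesses per copied byte plus three for set-up, whereas the plain
indirect version above costs four accesses per byte.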
<p>
If you think about the most common functions, you will find that this approach speeds up at least the
following operations:
<ul>
        <li>Copying one part of the screen to another location, especially for
        <ul>
                <li>scrolling</li>
                <li>the mouse cursor</li>
                <li>character rendering</li>
        </ul>
        </li>
        <li>Copying from local memory to the frame buffer or back</li>
        <li>Line drawing. Vertical, horizontal, and Bresenham algorithms can all benefit from it</li>
        <li>Drawing continuous curves of nearly any kind</li>
</ul>
Of course there is a downside:
<ul>
        <li>It takes time to set up the engine and to program the various post-increment values</li>
        <li>Generally, when arbitrary pixels need to be accessed, it is no faster than indirect addressing</li>
        <li>The required address range is comparable to that of paged access</li>
</ul>
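<p>
As an illustration of the line-drawing case listed above, the following Python sketch draws a Bresenham line
where every step is a single write through one of two post-modify registers. It assumes one pixel per memory
location and a line in the first octant only; the <code>Accelerator</code> class, <code>draw_line</code>, and
the register numbers are hypothetical names used for the example.

```python
class Accelerator:
    """Minimal cursor model: a write stores a value, then moves the cursor."""

    def __init__(self, memory):
        self.memory = memory
        self.cursor = 0
        self.post_modify = {}

    def write(self, reg, value):
        self.memory[self.cursor] = value
        self.cursor += self.post_modify[reg]


def draw_line(acc, x0, y0, x1, y1, stride, colour):
    """Bresenham line for the first octant (dy between 0 and dx)."""
    STEP_X, STEP_XY = 0, 1                     # hypothetical register numbers
    acc.post_modify[STEP_X] = 1                # one pixel right
    acc.post_modify[STEP_XY] = 1 + stride      # one pixel right and one row down
    dx, dy = x1 - x0, y1 - y0
    acc.cursor = y0 * stride + x0
    err = 2 * dy - dx
    for _ in range(dx + 1):
        if err > 0:
            acc.write(STEP_XY, colour)         # plot, then step diagonally
            err += 2 * (dy - dx)
        else:
            acc.write(STEP_X, colour)          # plot, then step horizontally
            err += 2 * dy


STRIDE, ROWS = 8, 4
fb = bytearray(STRIDE * ROWS)
acc = Accelerator(fb)
draw_line(acc, 0, 0, 4, 2, STRIDE, 1)
lit = [i for i, v in enumerate(fb) if v]       # addresses of the plotted pixels
```

The CPU still computes the error term, but it never recomputes a pixel address: the choice between the two
registers is the whole addressing step.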

<h3>Programming information</h3>
<p>
As discussed above, the accelerator maintains and updates a cursor register. That register is large enough to address all
possible locations in the pixel memory, that is, it is <code>video_addr_width</code> bits wide. This register
can be read or written by asserting the <code>CUR_STB_I</code> pin in a valid Wishbone cycle.
<p>
Each location in the accelerator memory functions as an increment value for the cursor. Thus each location has the same
size as the cursor register itself. If <code>data_width</code> is less than <code>video_addr_width</code>, such a
location cannot be updated in one access. In that case the lower <code>data_width</code> bits of the location are written
or read directly through the data bus of the Wishbone interface connected to the master when the <code>ACC_STB_I</code> signal is asserted.
The upper part of the location is transferred through the extension register. That register can be accessed by
asserting the <code>EXT_STB_I</code> signal. Pixel memory accesses are performed upon the assertion of the
<code>MEM_STB_I</code> signal. The accelerator location is selected by the address bits, the pixel memory location
is the current value of the cursor register, and the cursor register is then incremented by the value read from the
accelerator location.
<p>
The accelerator has <code>2**accel_size</code> locations.
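<p>
When <code>data_width</code> is smaller than <code>video_addr_width</code>, software has to split each
increment value into the part that goes to the extension register and the part that goes on the data bus.
The Python sketch below shows one way to do that arithmetic. It assumes that negative increments are
stored in two's complement over <code>video_addr_width</code> bits, which the text above does not state
explicitly; the helper names <code>split_increment</code> and <code>join_increment</code> are hypothetical.

```python
def split_increment(delta, data_width, video_addr_width):
    """Return (ext_value, acc_value) for a signed cursor increment.

    ext_value is the upper part, for the extension register;
    acc_value is the lower data_width bits, for the accelerator location.
    """
    modulus = 2 ** video_addr_width
    encoded = delta % modulus                  # assumed two's complement encoding
    ext_value, acc_value = divmod(encoded, 2 ** data_width)
    return ext_value, acc_value


def join_increment(ext_value, acc_value, data_width, video_addr_width):
    """Rebuild the signed increment from the two register values."""
    modulus = 2 ** video_addr_width
    encoded = ext_value * 2 ** data_width + acc_value
    if encoded >= modulus // 2:                # top bit set: negative value
        encoded -= modulus
    return encoded


# One row up on a 640-pixel-wide, 1 bit per pixel screen: 640/8 = 80 bytes.
up = split_increment(-80, data_width=8, video_addr_width=20)
down_right = split_increment(81, data_width=8, video_addr_width=20)
restored = join_increment(up[0], up[1], data_width=8, video_addr_width=20)
```

The order in which the extension register and the accelerator location must be written is not covered by
this sketch.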

<h3>Wishbone datasheet</h3>
<table border>
<tr><th>Description</th><th>Specification</th></tr>
<tr><td>General Description             </td><td>Monitor controller accelerator.</td></tr>
<tr><td>Supported cycles                </td><td>Slave read/write<br>Slave block read/write<br>Slave rmw<br>
                                                 Master read/write<br>Master block read/write<br>Master rmw<br></td></tr>
<tr><td>Data port size                  </td><td>Configurable on slave side, same on the master side</td></tr>
<tr><td>Data port granularity           </td><td>Bus size</td></tr>
<tr><td>Data port maximum operand size  </td><td>Bus size</td></tr>
<tr><td>Data transfer ordering          </td><td>Little endian</td></tr>
<tr><td>Data transfer sequencing        </td><td>n/a</td></tr>
<tr><td>Supported signal list and cross reference to equivalent Wishbone signals</td><td>
        <table>
        <tr><th>Signal name</th><th>Wishbone equiv.</th></tr>
        <tr><th colspan="2">Common signals for all ports</th></tr>
        <tr><td>CLK_I       </td><td>CLK_I</td></tr>
        <tr><td>RST_I       </td><td>RST_I</td></tr>
        <tr><th colspan="2">Signals to connect to master</th></tr>
        <tr><td>CYC_I       </td><td>CYC_I</td></tr>
        <tr><td>CUR_STB_I   </td><td>STB_I</td></tr>
        <tr><td>EXT_STB_I   </td><td>STB_I</td></tr>
        <tr><td>ACC_STB_I   </td><td>STB_I</td></tr>
        <tr><td>MEM_STB_I   </td><td>STB_I</td></tr>
        <tr><td>WE_I        </td><td>WE_I </td></tr>
        <tr><td>ACK_O       </td><td>ACK_O</td></tr>
        <tr><td>SEL_I(..)   </td><td>SEL_I()</td></tr>
        <tr><td>ADR_I(..)   </td><td>ADR_I()</td></tr>
        <tr><td>DAT_I(..)   </td><td>DAT_I()</td></tr>
        <tr><td>DAT_O(..)   </td><td>DAT_O()</td></tr>
        <tr><th colspan="2">Signals to connect to the pixel memory</th></tr>
        <tr><td>V_CYC_O       </td><td>CYC_O</td></tr>
        <tr><td>V_STB_O       </td><td>STB_O</td></tr>
        <tr><td>V_WE_O        </td><td>WE_O </td></tr>
        <tr><td>V_ACK_I       </td><td>ACK_I</td></tr>
        <tr><td>V_ADR_O(..)   </td><td>ADR_O()</td></tr>
        <tr><td>V_SEL_O(..)   </td><td>SEL_O()</td></tr>
        <tr><td>V_DAT_I(..)   </td><td>DAT_I()</td></tr>
        <tr><td>V_DAT_O(..)   </td><td>DAT_O()</td></tr>
        </table>
</td></tr>
</table>
<h3>Parameter description</h3>
<table border>
<tr><th>Parameter name</th><th>Description</th></tr>
<tr><td>accel_size       </td><td>Address size of the accelerator memory</td></tr>
<tr><td>video_addr_width </td><td>Address size of the pixel memory</td></tr>
<tr><td>data_width       </td><td>Data size of the interfaces</td></tr>
</table>
<h3>Signal description</h3>
<table border>
<tr><th>Signal name</th><th>Description</th></tr>
<tr><th colspan="2">Signals to connect to master</th></tr>
<tr><td>CYC_I       </td><td>Wishbone cycle signal. High value frames blocks of access</td></tr>
<tr><td>CUR_STB_I   </td><td>Wishbone strobe signal. High value indicates cycle to the cursor register</td></tr>
<tr><td>EXT_STB_I   </td><td>Wishbone strobe signal. High value indicates cycle to the extension register</td></tr>
<tr><td>ACC_STB_I   </td><td>Wishbone strobe signal. High value indicates cycle to the accelerator memory</td></tr>
<tr><td>MEM_STB_I   </td><td>Wishbone strobe signal. High value indicates cycle to the pixel memory</td></tr>
<tr><td>WE_I        </td><td>Wishbone write enable signal. High indicates data flowing from master to slave</td></tr>
<tr><td>ACK_O       </td><td>Wishbone acknowledge signal. High indicates that slave finished operation successfully</td></tr>
<tr><td>ACK_OI      </td><td>WishboneTK acknowledge chain input signal</td></tr>
<tr><td>ADR_I(accel_size-1..0)  </td><td>Wishbone address bus signals</td></tr>
<tr><td>SEL_I(data_width/8-1..0) </td><td>Wishbone byte-selection signals</td></tr>
<tr><td>DAT_I(data_width-1..0)   </td><td>Wishbone data bus input (to slave direction) signals</td></tr>
<tr><td>DAT_O(data_width-1..0)   </td><td>Wishbone data bus output (to master direction) signals</td></tr>
<tr><td>DAT_OI(cpu_dat_width-1..0)  </td><td>WishboneTK data bus chain input signal</td></tr>
<tr><th colspan="2">Signals to connect to the pixel memory</th></tr>
<tr><td>V_CYC_O       </td><td>Wishbone cycle signal. High value frames blocks of access</td></tr>
<tr><td>V_STB_O       </td><td>Wishbone strobe signal. High value indicates cycle to this particular device</td></tr>
<tr><td>V_WE_O        </td><td>Wishbone write enable signal. High indicates data flowing from master to slave</td></tr>
<tr><td>V_ACK_I       </td><td>Wishbone acknowledge signal. High indicates that slave finished operation successfully</td></tr>
<tr><td>V_ADR_O(video_addr_width-2..0)  </td><td>Wishbone address bus signals</td></tr>
<tr><td>V_SEL_O(data_width/8-1..0) </td><td>Wishbone byte-selection signals</td></tr>
<tr><td>V_DAT_I(data_width-1..0)   </td><td>Wishbone data bus input (to slave direction) signals</td></tr>
<tr><td>V_DAT_O(data_width-1..0)   </td><td>Wishbone data bus output (to master direction) signals</td></tr>
</table>


<h2>Author &amp; Maintainer</h2>
<p>
<a href="/people/tantos">Andras Tantos</a>
<!--# include virtual="/ssi/ssi_end.shtml" -->
