1 |
645 |
julius |
OpenRISC 1200 IP Core Specification (Preliminary Draft)
|
2 |
|
|
=======================================================
|
3 |
|
|
:doctype: book
|
4 |
|
|
|
5 |
|
|
////
|
6 |
|
|
Revision history
|
7 |
|
|
Note: When adding new entries, strictly follow the format of the existing ones.
|
8 |
|
|
|
9 |
|
|
Rev. | Date | Author | Description
|
10 |
|
|
__vstart__
|
11 |
|
|
v0.1 | 28/3/01 | Damjan Lampret | First Draft
|
12 |
|
|
|
13 |
|
|
v0.2 | 16/4/01 | Damjan Lampret | First time published
|
14 |
|
|
|
15 |
|
|
v0.3 | 29/4/01 | Damjan Lampret | All chapters almost
|
16 |
|
|
finished. Some bugs hidden waiting for an update. Awaiting feedback.
|
17 |
|
|
|
18 |
|
|
v0.4 | 16/5/01 | Damjan Lampret | Synchronization with
|
19 |
|
|
OR1K Arch Manual
|
20 |
|
|
|
21 |
|
|
v0.5 | 24/5/01 | Damjan Lampret | Fixed bugs
|
22 |
|
|
|
23 |
|
|
v0.6 | 28/5/01 | Damjan Lampret | Changed some SPR addresses.
|
24 |
|
|
|
25 |
|
|
v0.7 | 06/9/01 | Damjan Lampret | Simplified debug unit.
|
26 |
|
|
|
27 |
|
|
v0.8 | 30/08/10 | Julius Baxter | Adding information about FPU
|
28 |
|
|
implementation, data cache write-back capability. PIC behavior update.
|
29 |
|
|
Instruction list update. Update of bits in config registers, bringing into
|
30 |
|
|
line with latest OR1200 - not entirely complete.
|
31 |
|
|
|
32 |
|
|
v0.9 | 12/9/10 | Julius Baxter | Clarified supported parts of
|
33 |
|
|
OR1K instruction set. Updated core clock input information.
|
34 |
|
|
Fixed up reference to instruction execute stage cycle table.
|
35 |
|
|
Added divide cycles to execute stage cycle table.
|
36 |
|
|
|
37 |
|
|
v0.10 | 1/11/10 | Julius Baxter | Added FF1/FL1 instructions to
|
38 |
|
|
supported instructions table.
|
39 |
|
|
|
40 |
|
|
v0.11 | 19/1/11 | Julius Baxter | Cache information update.
|
41 |
|
|
Wishbone behavior clarification. Serial integer multiply/divide update.
|
42 |
|
|
Reset address clarification
|
43 |
647 |
julius |
|
44 |
|
|
v0.12 | 13/9/11 | Julius Baxter | Addition of extension instructions
|
45 |
|
|
l.extbs, l.extbz, l.exths, l.exthz, l.extws and l.extwz. Range exception
|
46 |
|
|
support, overflow bit in supervision register.
|
47 |
809 |
julius |
|
48 |
808 |
julius |
v0.13 | 27/5/12 | Julius Baxter | Addition of support for delay-slot
|
49 |
|
|
exception indicator bit in supervision register
|
50 |
645 |
julius |
__vend__
|
51 |
|
|
////
|
52 |
|
|
|
53 |
|
|
Introduction
|
54 |
|
|
------------
|
55 |
|
|
The purpose of this document is to define the specifications of the OpenRISC
1200 implementation. This specification defines all implementation-specific
variables that are not part of the general architecture specification. This
includes the type and size of the data and instruction caches, the type and
size of the data and instruction MMUs, details of all execution pipelines, and
the implementation of the exception unit, interrupt controller and other
supplemental units. This document does not cover general architecture topics
such as the instruction set, memory addressing modes and other architectural
definitions. See <> for more information about the architecture.
|
64 |
|
|
|
65 |
|
|
OpenRISC Family
|
66 |
|
|
~~~~~~~~~~~~~~~
|
67 |
|
|
(((OpenRISC,Family)))
|
68 |
|
|
OpenRISC 1000 is an architecture for a family of free, open source RISC
processor cores. As an architecture, OpenRISC 1000 allows for a spectrum of
chip and system implementations at a variety of price/performance points for a
range of applications. It is a 32/64-bit load and store RISC architecture
designed with emphasis on performance, simplicity, low power requirements,
scalability and versatility. The OpenRISC 1000 architecture targets medium and
high performance networking, embedded, automotive and portable computer
environments.
|
75 |
|
|
|
76 |
|
|
image::img/or_family.gif[scaledwidth="50%",align="center"]
|
77 |
|
|
|
78 |
|
|
All OpenRISC implementations whose identification number begins with the
digit 1 belong to the OpenRISC 1000 family. The second digit defines which
features of the OpenRISC 1000 architecture are implemented and in which way
they are implemented. The last two digits define how an implementation is
configured before it is used in a real application.
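
The numbering scheme described above can be illustrated with a small sketch.
The function name and the returned field names are hypothetical and chosen
only for this example; only the meaning of the digits comes from the text.

```python
def decode_openrisc_id(name):
    """Split an identifier such as 'OR1200' into the fields described above."""
    text = name.upper()
    digits = text[2:] if text.startswith("OR") else text
    if len(digits) != 4 or not digits.isdigit():
        raise ValueError(f"expected 'OR' followed by four digits, got {name!r}")
    return {
        "family": int(digits[0]) * 1000,  # first digit: 1 -> OpenRISC 1000 family
        "features": int(digits[1]),       # second digit: implemented feature set
        "configuration": digits[2:],      # last two digits: configuration
    }

print(decode_openrisc_id("OR1200"))
```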
|
83 |
|
|
|
84 |
|
|
At present, however, the OR1200 is the only major RTL implementation of the
OR1K architecture specification, and the OR1200 name has stuck, even though
the core's high degree of reconfigurability means that, strictly speaking, a
given configuration could be an OR1000, OR1300, etc. So, regardless of which
features are or are not implemented, the core is still referred to as the
OR1200.
|
89 |
|
|
|
90 |
|
|
OpenRISC 1200
|
91 |
|
|
~~~~~~~~~~~~~
|
92 |
|
|
(((OpenRISC,1200)))
|
93 |
|
|
The OR1200 is a 32-bit scalar RISC with Harvard microarchitecture, a 5-stage
integer pipeline, virtual memory support (MMU) and basic DSP capabilities.
The default caches are a 1-way direct-mapped 8KB data cache and a 1-way
direct-mapped 8KB instruction cache, each with a 16-byte line size. Both
caches are physically tagged. By default the MMUs are implemented, and they
consist of a 64-entry hash-based 1-way direct-mapped data TLB and a 64-entry
hash-based 1-way direct-mapped instruction TLB.
|
100 |
|
|
|
101 |
|
|
Supplemental facilities include a debug unit for real-time debugging, a
high-resolution tick timer, a programmable interrupt controller and power
management support. When implemented in a typical 0.18u 6LM process it should
provide over 300 Dhrystone 2.1 MIPS at 300MHz and 300 DSP MAC 32x32
operations, at least 20% more than any other competitor in this class. The
OR1200 in its default configuration has about 1M transistors.
|
107 |
|
|
|
108 |
|
|
The OR1200 is intended for embedded, portable and networking applications. It
can successfully compete with the latest scalar 32-bit RISC processors in its
class and can efficiently run any modern operating system. Competitors include
ARM10, ARC and Tensilica RISC processors.
|
112 |
|
|
|
113 |
|
|
Features
|
114 |
|
|
^^^^^^^^
|
115 |
|
|
The following lists the main features of OR1200 IP core:
|
116 |
|
|
|
117 |
|
|
- All major characteristics of the core can be set by the user
|
118 |
|
|
- High performance of 300 Dhrystone 2.1 MIPS at 300 MHz using 0.18u process
|
119 |
|
|
- High performance cache and MMU subsystems
|
120 |
|
|
- WISHBONE SoC Interconnection Rev. B3 compliant interface
|
121 |
|
|
|
122 |
|
|
Architecture
|
123 |
|
|
------------
|
124 |
|
|
<> below shows the general architecture of the OR1200 IP core. It
consists of several building blocks:
|
126 |
|
|
|
127 |
|
|
- CPU/FPU/DSP central block
|
128 |
|
|
- Direct-mapped data cache
|
129 |
|
|
- Direct-mapped instruction cache
|
130 |
|
|
- Data MMU based on hash based DTLB
|
131 |
|
|
- Instruction MMU based on hash based ITLB
|
132 |
|
|
- Power management unit and power management interface
|
133 |
|
|
- Tick timer
|
134 |
|
|
- Debug unit and development interface
|
135 |
|
|
- Interrupt controller and interrupt interface
|
136 |
|
|
- Instruction and Data WISHBONE host interfaces
|
137 |
|
|
|
138 |
|
|
[[core_arch_fig]]
|
139 |
|
|
.Core's Architecture
|
140 |
|
|
image::img/core_arch.gif[scaledwidth="50%",align="center"]
|
141 |
|
|
|
142 |
|
|
CPU/FPU/DSP
|
143 |
|
|
~~~~~~~~~~~
|
144 |
|
|
The ((CPU))/((FPU))/((DSP)) is the central part of the OR1200 RISC processor.
<> shows a basic block diagram of the CPU/DSP. The FPU components are not
pictured. The OR1200 CPU/FPU/DSP only implements parts of the ORBIS32 and
ORFPX32 instruction sets. No ((ORBIS64)), ((ORFPX64)) or ((ORVDX64))
instructions are implemented in the OR1200.
|
149 |
|
|
|
150 |
|
|
[[cpu_fpu_dsp_fig]]
|
151 |
|
|
.CPU/FPU/DSP Block Diagram
|
152 |
|
|
image::img/cpu_fpu_dsp.gif[scaledwidth="50%",align="center"]
|
153 |
|
|
|
154 |
|
|
Instruction unit
|
155 |
|
|
^^^^^^^^^^^^^^^^
|
156 |
|
|
The instruction unit implements the basic instruction pipeline, fetches
|
157 |
|
|
instructions from the memory subsystem, dispatches them to available execution
|
158 |
|
|
units, and maintains a state history to ensure a precise exception model
|
159 |
|
|
and that operations finish in order. It also executes conditional branch
|
160 |
|
|
and unconditional jump instructions.
|
161 |
|
|
|
162 |
|
|
The sequencer can dispatch a sequential instruction on each clock if the
appropriate execution unit is available. The execution unit must discern
whether source data is available and ensure that no other instruction is
targeting the same destination register.
|
166 |
|
|
|
167 |
|
|
The instruction unit handles only the ((ORBIS32)) and, optionally, a subset of
the ((ORFPX32)) instruction class. Some ((ORFPX32)) instructions and all
((ORFPX64)) and ((ORVDX64)) instructions are not supported by the OR1200 at
present.
|
170 |
|
|
|
171 |
|
|
General-Purpose Registers
|
172 |
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^
|
173 |
|
|
OpenRISC 1200 implements 32 general-purpose 32-bit ((registers)). The OpenRISC
1000 architecture also supports shadow copies of the register file to
implement fast switching between working contexts; however, this feature is
not implemented in the current OR1200 implementation.
|
177 |
|
|
|
178 |
|
|
The OR1200 implements the general-purpose register file as two synchronous
dual-port memories, each with a capacity of 32 words by 32 bits per word.
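
One common way to obtain two read operands per cycle from two dual-port
memories is to keep identical copies and write every result to both. The
sketch below models that idea in software. It is an assumption about the
arrangement, not a description of the OR1200 RTL, and the class and attribute
names are hypothetical.

```python
class RegisterFile:
    """Behavioral model: two mirrored 32 x 32-bit memories."""

    def __init__(self, words=32):
        self.bank_a = [0] * words  # serves read operand A
        self.bank_b = [0] * words  # serves read operand B

    def write(self, rd, value):
        # A single write updates both copies so they stay identical.
        value &= 0xFFFFFFFF
        self.bank_a[rd] = value
        self.bank_b[rd] = value

    def read(self, ra, rb):
        # Each copy supplies one operand, giving two reads per cycle.
        return self.bank_a[ra], self.bank_b[rb]

rf = RegisterFile()
rf.write(3, 0x12345678)
rf.write(4, 0x1FFFFFFFF)  # truncated to 32 bits
print([hex(v) for v in rf.read(3, 4)])
```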
|
180 |
|
|
|
181 |
|
|
Load/Store Unit
|
182 |
|
|
^^^^^^^^^^^^^^^
|
183 |
|
|
The ((load/store unit (LSU))) transfers all data between the GPRs and the
CPU's internal bus. It is implemented as an independent execution unit so that
stalls in the memory subsystem only affect the master pipeline if there is a
data dependency.
|
186 |
|
|
|
187 |
|
|
The following are LSU's main features:
|
188 |
|
|
|
189 |
|
|
- all load/store instructions implemented in hardware (atomic instructions
  included)
|
191 |
|
|
- address entry buffer
|
192 |
|
|
- pipelined operation
|
193 |
|
|
- aligned accesses for fast memory access
|
194 |
|
|
|
195 |
|
|
When load and store instructions are issued, the LSU determines if all
|
196 |
|
|
operands are available. These operands include the following:
|
197 |
|
|
|
198 |
|
|
- address register operand
|
199 |
|
|
- source data register operand (for store instructions)
|
200 |
|
|
- destination data register operand (for load instructions)
|
201 |
|
|
|
202 |
|
|
Integer Execution Pipeline
|
203 |
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^
|
204 |
|
|
(((Pipeline, Integer Execution)))
|
205 |
|
|
The core implements the following types of 32-bit integer instructions:
|
206 |
|
|
|
207 |
|
|
- Arithmetic instructions
|
208 |
|
|
- Compare instructions
|
209 |
|
|
- Logical instructions
|
210 |
|
|
- Rotate and shift instructions
|
211 |
|
|
|
212 |
|
|
Most integer instructions can execute in one cycle. For details about timing
|
213 |
|
|
see <>.
|
214 |
|
|
|
215 |
|
|
MAC Unit
|
216 |
|
|
^^^^^^^^
|
217 |
|
|
The ((MAC)) unit executes DSP MAC operations. MAC operations are 32x32 with a
48-bit accumulator. The MAC unit is fully pipelined and can accept a new MAC
operation in each clock cycle.
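
As a rough illustration of a 32x32 multiply-accumulate into a 48-bit
accumulator, consider the following sketch. It assumes signed operands and
wrap-around on accumulator overflow; neither is stated in this section, and
the function names are hypothetical.

```python
ACC_BITS = 48
ACC_MASK = (1 << ACC_BITS) - 1

def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value

def mac(acc, a, b):
    """Add the signed 32x32 product of a and b to a 48-bit accumulator."""
    product = to_signed(a, 32) * to_signed(b, 32)
    return (acc + product) & ACC_MASK  # assumed wrap-around at 48 bits

acc = 0
for a, b in [(3, 4), (0xFFFFFFFF, 5), (1000, 1000)]:
    acc = mac(acc, a, b)
print(to_signed(acc, ACC_BITS))  # 12 - 5 + 1000000 = 1000007
```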
|
220 |
|
|
|
221 |
|
|
Floating Point Unit
|
222 |
|
|
^^^^^^^^^^^^^^^^^^^
|
223 |
|
|
(((Floating Point Unit)))
|
224 |
|
|
The ((FPU)) implementation is based on two other FPUs available from
|
225 |
|
|
OpenCores.org. For the comparison and conversion functions, parts were taken
|
226 |
|
|
from the FPU project by Rudolf Usselmann, and for the arithmetic operations,
|
227 |
|
|
the fpu100 project by Jidan Al-Eryani was converted to Verilog HDL.
|
228 |
|
|
|
229 |
|
|
All ((ORFPX32)) instructions except for ((lf.madd.s)) and ((lf.rem.s)) are
|
230 |
|
|
supported when the FPU is enabled in the OR1200 configuration.
|
231 |
|
|
|
232 |
|
|
System Unit
|
233 |
|
|
^^^^^^^^^^^
|
234 |
|
|
The ((system unit)) connects all other signals of the CPU/FPU/DSP that are not
|
235 |
|
|
connected through instruction and data interfaces. It also implements all
|
236 |
|
|
system special-purpose registers (e.g. the supervision register).
|
237 |
|
|
|
238 |
|
|
Exceptions
|
239 |
|
|
^^^^^^^^^^
|
240 |
|
|
Core exceptions can be generated when an exception condition occurs.
|
241 |
|
|
((Exception sources)) in OR1200 include the following:
|
242 |
|
|
|
243 |
|
|
- External interrupt request
- Certain memory access conditions
- Internal errors, such as an attempt to execute an unimplemented opcode
- System call
- Internal exceptions, such as breakpoint exceptions
- Arithmetic overflow
|
249 |
645 |
julius |
|
250 |
|
|
((Exception handling)) is transparent to user software and uses the same
mechanism to handle all types of exceptions. When an exception is taken,
control is transferred to an exception handler at an offset defined for
the type of exception encountered. Exceptions are handled in supervisor mode.
|
254 |
|
|
|
255 |
808 |
julius |
Exceptions caused by instructions in a delay slot will set the supervision
|
256 |
|
|
register's DSX bit.
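
The idea of a per-type handler offset can be sketched as a table lookup. The
offsets and the 0xF0000000 high prefix below are taken from the OpenRISC 1000
architecture manual as best understood here, not from this document, and
should be checked against that manual. The names are hypothetical.

```python
# Assumed vector offsets (per the OR1K architecture manual; verify before use).
VECTOR_OFFSETS = {
    "bus_error": 0x200,
    "tick_timer": 0x500,
    "illegal_instruction": 0x700,
    "external_interrupt": 0x800,
    "range": 0xB00,
    "system_call": 0xC00,
    "trap": 0xE00,
}

def exception_vector(kind, eph=False):
    """Return the handler address for an exception type.

    eph models the exception prefix bit: when set, the vectors are assumed
    to move to the 0xF0000000 region.
    """
    prefix = 0xF0000000 if eph else 0x00000000
    return prefix + VECTOR_OFFSETS[kind]

print(hex(exception_vector("system_call")))
print(hex(exception_vector("external_interrupt", eph=True)))
```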
|
257 |
|
|
|
258 |
645 |
julius |
Data Cache
|
259 |
|
|
~~~~~~~~~~
|
260 |
|
|
The default configuration of the OR1200 data ((cache)) is an 8-Kbyte, 1-way
direct-mapped data cache, which allows rapid core access to data. However,
the data cache can be configured according to <>.
|
263 |
|
|
|
264 |
|
|
[[data_confs_or1200_table]]
.Possible Data Cache Configurations of OR1200
[width="60%",options="header"]
|======================================================
| | Direct mapped
| 16B/line, 256 lines, 1 way | 4KB
| 16B/line, 512 lines, 1 way | *8KB (default)*
| 16B/line, 1024 lines, 1 way | 16KB
| 32B/line, 1024 lines, 1 way | 32KB
|======================================================
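
The sizes in the table follow from line size times number of lines. The sketch
below reproduces them and also derives the offset and index widths of a
direct-mapped cache. The tag width assumes a full 32-bit physical address,
which is an assumption made for this example only, and the function name is
hypothetical.

```python
def cache_geometry(line_bytes, lines, ways=1, addr_bits=32):
    """Return (size_bytes, offset_bits, index_bits, tag_bits)."""
    size_bytes = line_bytes * lines * ways
    offset_bits = line_bytes.bit_length() - 1  # log2, powers of two only
    index_bits = lines.bit_length() - 1
    tag_bits = addr_bits - index_bits - offset_bits  # assumes 32-bit addresses
    return size_bytes, offset_bits, index_bits, tag_bits

for line_bytes, lines in [(16, 256), (16, 512), (16, 1024), (32, 1024)]:
    size, off, idx, tag = cache_geometry(line_bytes, lines)
    print(f"{line_bytes}B/line, {lines} lines: {size // 1024}KB, "
          f"offset={off} index={idx} tag={tag}")
```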
|
274 |
|
|
|
275 |
|
|
The data cache can operate with a write-through or a write-back strategy;
however, write-back is currently experimental.
|
277 |
|
|
|
278 |
|
|
Features:
|
279 |
|
|
|
280 |
|
|
- data cache is separate from instruction cache (Harvard architecture)
|
281 |
|
|
- data cache implements a least-recently used (LRU) replacement algorithm
|
282 |
|
|
within each set
|
283 |
|
|
- the cache directory is physically addressed. The physical address tag is
|
284 |
|
|
stored in the cache directory
|
285 |
|
|
- write-through or write-back operation
|
286 |
|
|
- entire cache can be disabled, lines invalidated, flushed or forced to be
|
287 |
|
|
written back, by writing to cache special purpose registers
|
288 |
|
|
|
289 |
|
|
On a miss, and under appropriate conditions, the cache line is filled or
emptied (written back) with 16-byte bursts. The burst fill is performed as a
critical-word-first operation; the critical word is simultaneously written
to the cache and forwarded to the requesting unit, thus minimizing stalls
due to cache fill latency. The data cache provides storage for cache tags and
performs the cache line replacement function.
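
A critical-word-first fill of a 16-byte line can be pictured as a four-beat
burst that starts at the requested word and wraps within the line. The sketch
below computes that address order. The wrapping behavior is an assumption
about how the fill is sequenced, and the function name is hypothetical.

```python
LINE_BYTES = 16
WORD_BYTES = 4

def burst_order(miss_addr, line_bytes=LINE_BYTES, word_bytes=WORD_BYTES):
    """Word addresses of a line fill, starting at the critical word."""
    line_base = miss_addr & ~(line_bytes - 1)
    words = line_bytes // word_bytes
    first = (miss_addr & (line_bytes - 1)) // word_bytes
    # Step through the line and wrap back to its start after the last word.
    return [line_base + ((first + i) % words) * word_bytes for i in range(words)]

print([hex(a) for a in burst_order(0x1008)])
# ['0x1008', '0x100c', '0x1000', '0x1004']
```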
|
295 |
|
|
|
296 |
|
|
The data cache is tightly coupled to the external interface to allow efficient
access to the system memory controller.
|
298 |
|
|
|
299 |
|
|
The data cache supplies data to the GPRs by means of a 32-bit interface
|
300 |
|
|
to the load/store unit. The LSU provides all logic required to calculate
|
301 |
|
|
effective addresses, handles data alignment to and from the data cache,
|
302 |
|
|
and provides sequencing for load and store operations. Write operations to
|
303 |
|
|
the data cache can be performed on a byte, half-word or word basis.
|
304 |
|
|
|
305 |
|
|
image::img/data_cache_diag.gif[scaledwidth="50%",align="center"]
|
306 |
|
|
|
307 |
|
|
Each line contains four contiguous words from memory that are loaded from
|
308 |
|
|
a cache line aligned boundary. As a result, cache lines are aligned with
|
309 |
|
|
page boundaries.
|
310 |
|
|
|
311 |
|
|
Instruction Cache
|
312 |
|
|
~~~~~~~~~~~~~~~~~
|
313 |
|
|
The default configuration of the OR1200 instruction ((cache)) is an 8-Kbyte,
1-way direct-mapped instruction cache, which allows rapid core access to
instructions. However, the instruction cache can be configured according to
<>.
|
317 |
|
|
|
318 |
|
|
[[inst_confs_or1200_table]]
|
319 |
|
|
.Possible Instruction Cache Configurations of OR1200
|
320 |
|
|
[width="60%",options="header"]
|
321 |
|
|
|==============================================
|
322 |
|
|
| | Direct mapped
|
323 |
|
|
| 16B/line, 32 lines, 1 way | 512B
|
324 |
|
|
| 16B/line, 256 lines, 1 way | 4KB
|
325 |
|
|
| 16B/line, 512 lines, 1 way | *8KB (Default)*
|
326 |
|
|
| 16B/line, 1024 lines, 1 way | 16KB
|
327 |
|
|
| 32B/line, 1024 lines, 1 way | 32KB
|
328 |
|
|
|==============================================
|
329 |
|
|
|
330 |
|
|
Features:
|
331 |
|
|
|
332 |
|
|
- instruction cache is separate from data cache (Harvard architecture)
|
333 |
|
|
(((Architecture,Harvard)))
|
334 |
|
|
- instruction cache implements a least-recently used (LRU) replacement
|
335 |
|
|
algorithm within each set
|
336 |
|
|
((LRU))
|
337 |
|
|
- the ((cache directory)) is physically addressed. The physical address tag is
|
338 |
|
|
stored in the cache directory
|
339 |
|
|
- it can be disabled or invalidated by writing to cache special purpose
|
340 |
|
|
registers
|
341 |
|
|
|
342 |
|
|
On a miss, the cache is filled with 16-byte bursts. The burst fill
is performed as a critical-word-first operation; the critical word is
simultaneously written to the cache and forwarded to the requesting unit,
thus minimizing stalls due to cache fill latency. The instruction cache
provides storage for cache tags and performs the cache line replacement
function.
|
347 |
|
|
|
348 |
|
|
The instruction cache is tightly coupled to the external interface to allow
efficient access to the system memory controller.
|
350 |
|
|
|
351 |
|
|
The instruction cache supplies instructions to the instruction sequencer by
|
352 |
|
|
means of a 32-bit interface to the instruction fetch subunit. The instruction
|
353 |
|
|
fetch subunit provides all logic required to calculate effective addresses.
|
354 |
|
|
|
355 |
|
|
image::img/inst_cache_diag.gif[scaledwidth="50%",align="center"]
|
356 |
|
|
|
357 |
|
|
Each line contains four contiguous words from memory that are loaded from
|
358 |
|
|
a line-size aligned boundary. As a result, cache lines are aligned with
|
359 |
|
|
page boundaries.
|
360 |
|
|
|
361 |
|
|
Data MMU
|
362 |
|
|
~~~~~~~~
|
363 |
|
|
(((MMU, Data)))
|
364 |
|
|
The OR1200 implements a ((virtual memory management)) scheme that
|
365 |
|
|
provides memory access protection and effective-to-physical address
|
366 |
|
|
translation. ((Protection)) granularity is as defined by OpenRISC 1000
|
367 |
|
|
architecture - 8-Kbyte and 16-Mbyte pages.
|
368 |
|
|
|
369 |
|
|
[[data_tlb_confs_or1200_table]]
|
370 |
|
|
.Possible Data TLB Configurations of OR1200
|
371 |
|
|
[width="60%",options="header"]
|
372 |
|
|
|======================================
|
373 |
|
|
| | Direct mapped
|
374 |
|
|
| 16 entries per way | 16 DTLB entries
|
375 |
|
|
| 32 entries per way | 32 DTLB entries
|
376 |
|
|
| 64 entries per way | *64 DTLB entries (default)*
|
377 |
|
|
| 128 entries per way | 128 DTLB entries
|
378 |
|
|
|======================================
|
379 |
|
|
|
380 |
|
|
Features:
|
381 |
|
|
|
382 |
|
|
* data MMU is separate from instruction MMU
* page size 8-Kbyte
* comprehensive page protection scheme
* direct-mapped hash-based translation lookaside buffer (DTLB) with the
  default of 1 way and the following features:
** miss and fault exceptions
** software tablewalk
** high performance because of hash-based design
** variable number of DTLB entries, with a default of 64 per way
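
The features listed above can be sketched as a small software model of a
direct-mapped TLB with 8-Kbyte pages and 64 entries. Using the low bits of the
virtual page number as the hash is an assumption made for this example; the
actual index function is not described here. All names are hypothetical.

```python
PAGE_BITS = 13     # 8-Kbyte pages
TLB_ENTRIES = 64   # default number of entries per way

class DirectMappedTLB:
    def __init__(self, entries=TLB_ENTRIES):
        self.entries = entries
        self.slots = [None] * entries  # each slot holds (vpn, ppn) or None

    def refill(self, vpn, ppn):
        # Software tablewalk result is written into the slot the VPN hashes to.
        self.slots[vpn % self.entries] = (vpn, ppn)

    def translate(self, ea):
        vpn = ea >> PAGE_BITS
        offset = ea & ((1 << PAGE_BITS) - 1)
        slot = self.slots[vpn % self.entries]
        if slot is None or slot[0] != vpn:
            return None  # miss: hardware would raise a TLB miss exception
        return (slot[1] << PAGE_BITS) | offset

tlb = DirectMappedTLB()
tlb.refill(vpn=3, ppn=0x80)
print(hex(tlb.translate(0x6010)))        # hit in page 3
print(tlb.translate((3 + 64) << 13))     # same slot, different page: miss
```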
|
391 |
|
|
|
392 |
|
|
image::img/tlb_diag.gif[scaledwidth="50%",align="center"]
|
393 |
|
|
|
394 |
|
|
The MMU hardware supports two-level software tablewalk.
|
395 |
|
|
|
396 |
|
|
Instruction MMU
|
397 |
|
|
~~~~~~~~~~~~~~~
|
398 |
|
|
(((MMU, Instruction)))
|
399 |
|
|
The OR1200 implements a virtual memory management scheme that provides memory
|
400 |
|
|
access protection and effective-to-physical address translation. Protection
|
401 |
|
|
granularity is as defined by OpenRISC 1000 architecture - 8-Kbyte and
|
402 |
|
|
16-Mbyte pages.
|
403 |
|
|
|
404 |
|
|
[[inst_tlb_confs_or1200_table]]
.Possible Instruction TLB Configurations of OR1200
[width="60%",options="header"]
|======================================
| | Direct mapped
| 16 entries per way | 16 ITLB entries
| 32 entries per way | 32 ITLB entries
| 64 entries per way | *64 ITLB entries (default)*
| 128 entries per way | 128 ITLB entries
|======================================
|
414 |
|
|
|
415 |
|
|
Features:
|
416 |
|
|
|
417 |
|
|
* instruction MMU is separate from data MMU
* page size 8-Kbyte
* comprehensive page protection scheme
* 1-way direct-mapped hash-based translation lookaside buffer (ITLB) with the
  following features:
** miss and fault exceptions
** software tablewalk
** high performance because of hash-based design
** variable number of ITLB entries, with a default of 64 per way
|
426 |
|
|
|
427 |
|
|
image::img/inst_mmu_diag.gif[scaledwidth="50%",align="center"]
|
428 |
|
|
|
429 |
|
|
The MMU hardware supports two-level software tablewalk.
|
430 |
|
|
|
431 |
|
|
Programmable Interrupt Controller
|
432 |
|
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
433 |
|
|
The ((interrupt)) controller receives interrupts from external sources and
forwards them as low- or high-priority interrupt exceptions to the CPU core.
|
435 |
|
|
|
436 |
|
|
[[interrupt_controller_fig]]
|
437 |
|
|
.Block Diagram of the Interrupt Controller
|
438 |
|
|
image::img/interrupt_controller.gif[scaledwidth="50%",align="center"]
|
439 |
|
|
|
440 |
|
|
The programmable interrupt controller has three special-purpose registers and
32 interrupt inputs. Interrupt inputs 0 and 1 are always enabled and connected
to the high- and low-priority interrupt inputs, respectively.

The other 30 interrupt inputs can be masked and assigned low or high priority
by programming the special-purpose registers.
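
The masking and priority rules above can be modeled with a few bit
operations. In this sketch the parameter names `mask` and `high_priority`
are hypothetical stand-ins for the special-purpose registers, and the bit
polarity (1 means enabled, 1 means high priority) is an assumption.

```python
WORD = 0xFFFFFFFF
ALWAYS_ENABLED = 0b11  # interrupt inputs 0 and 1 cannot be masked

def pending_interrupts(inputs, mask):
    """Inputs that pass the mask; inputs 0 and 1 always pass."""
    return inputs & (mask | ALWAYS_ENABLED) & WORD

def split_by_priority(pending, high_priority):
    """Return (high, low) pending sets. Input 0 is always high priority and
    input 1 is always low priority; the rest follow high_priority."""
    high_sel = (high_priority & ~0b10) | 0b01
    return pending & high_sel, pending & ~high_sel & WORD

pending = pending_interrupts(inputs=0b1011, mask=0b1000)
print(bin(pending), [bin(x) for x in split_by_priority(pending, 0b1000)])
```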
|
446 |
|
|
|
447 |
|
|
Tick Timer
|
448 |
|
|
~~~~~~~~~~
|
449 |
|
|
The OR1200 implements a tick ((timer)) facility. This is a timer that is
clocked by the RISC clock and is used by the operating system to precisely
measure time and schedule system tasks.
|
452 |
|
|
|
453 |
|
|
The OR1200 precisely follows the architectural definition of the tick timer facility:
|
454 |
|
|
|
455 |
|
|
* Maximum timer count of 2^32 clock cycles
|
456 |
|
|
* Maximum time period of 2^28 clock cycles between interrupts
|
457 |
|
|
* Maskable tick timer interrupt
|
458 |
|
|
* Single run, restartable or continuous timer
|
459 |
|
|
|
460 |
|
|
The tick timer operates from an independent clock source so that the doze
power management mode can be implemented.
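
The limits listed above translate into wall-clock time once a clock frequency
is chosen. The sketch below does that arithmetic, using the 2^28 bound as
stated in this section. The 300 MHz clock and the 100 Hz tick rate are example
values only, and the function names are hypothetical.

```python
MAX_PERIOD_CYCLES = 2 ** 28  # maximum period between interrupts, as stated above

def max_tick_interval_seconds(clock_hz):
    """Longest interval between tick interrupts at a given clock rate."""
    return MAX_PERIOD_CYCLES / clock_hz

def period_for_rate(clock_hz, ticks_per_second):
    """Number of clock cycles to program for a desired tick rate."""
    cycles = round(clock_hz / ticks_per_second)
    if not 0 < cycles <= MAX_PERIOD_CYCLES:
        raise ValueError("requested tick rate is outside the timer range")
    return cycles

print(f"{max_tick_interval_seconds(300e6):.3f} s")  # about 0.895 s at 300 MHz
print(period_for_rate(300_000_000, 100))            # 3000000 cycles for 100 Hz
```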
|
462 |
|
|
|
463 |
|
|
Power Management Support
|
464 |
|
|
~~~~~~~~~~~~~~~~~~~~~~~~
|
465 |
|
|
To optimize ((power consumption)), the OR1200 provides ((low-power)) modes that
|
466 |
|
|
can be used to dynamically activate and deactivate certain internal modules.
|
467 |
|
|
|
468 |
|
|
The OR1200 provides the following features to minimize power consumption:
|
469 |
|
|
|
470 |
|
|
* Slow and Idle Modes (SW controlled clock freq reduction)
|
471 |
|
|
* Doze and Sleep Modes (interrupt wake-up)
|
472 |
|
|
|
473 |
|
|
[[power_consumption_table]]
|
474 |
|
|
.Power Consumption
|
475 |
|
|
[width="60%",options="header"]
|
476 |
|
|
|===================================================================
|
477 |
|
|
| Power Minimization Feature | Approx Power Consumption Reduction
|
478 |
|
|
| Slow and Idle mode | 2x - 10x
|
479 |
|
|
| Doze mode | 100x
|
480 |
|
|
| Sleep mode | 200x
|
481 |
|
|
| Dynamic clock gating | N/A
|
482 |
|
|
|===================================================================
|
483 |
|
|
|
484 |
|
|
Slow down mode takes advantage of the low-power dividers in the external clock
generation circuitry to enable full functionality, but at a lower frequency,
so that power consumption is reduced. The 4 bits of PMR[SDF] are broadcast on
pm_clksd, and the external clock generation circuitry for the RISC should
adapt the RISC clock frequency according to the value on pm_clksd.
|
489 |
|
|
|
490 |
|
|
When software initiates the doze mode, software processing on the core
|
491 |
|
|
suspends. The clocks to the RISC internal modules are disabled except to
|
492 |
|
|
the tick timer. However any other on-chip blocks can continue to function
|
493 |
|
|
as normal. The OR1200 will leave doze mode and enter normal mode when a
|
494 |
|
|
pending interrupt occurs.
|
495 |
|
|
|
496 |
|
|
In sleep mode, all OR1200 internal units are disabled and their clocks are
gated. Optionally, an implementation may choose to lower the operating voltage
of the OR1200 core. The OR1200 should leave sleep mode and enter normal
mode when a pending interrupt occurs.
|
500 |
|
|
|
501 |
|
|
Dynamic ((Clock gating)) (unit clock gating on a clock-by-clock basis) is not
supported by the OR1200.
|
503 |
|
|
|
504 |
|
|
Debug unit
|
505 |
|
|
~~~~~~~~~~
|
506 |
|
|
The ((debug unit)) assists software developers in debugging their systems. It
provides support only for basic debugging and does not support the more
advanced debug features of the OpenRISC 1000 architecture, such as
watchpoints, breakpoints and program-flow control registers.
|
510 |
|
|
|
511 |
|
|
[[debug_unit_fig]]
|
512 |
|
|
.Block Diagram of Debug Unit
|
513 |
|
|
image::img/debug_unit_diag.gif[scaledwidth="50%",align="center"]
|
514 |
|
|
|
515 |
|
|
Watchpoints and breakpoints are events triggered by program or data flow
matching the conditions programmed in the debug registers. Breakpoints,
unlike watchpoints, also suspend execution of the current program flow and
start a breakpoint exception.
|
519 |
|
|
|
520 |
|
|
Clocks & Reset
|
521 |
|
|
~~~~~~~~~~~~~~
|
522 |
|
|
The OR1200 core has one ((clock)) input each for the instruction Wishbone
interface logic, the data Wishbone interface logic and the CPU core. Clock
input clk_cpu clocks all logic on the core side of the Wishbone interfaces.
The data Wishbone interface is clocked by dwb_clk_i, and the instruction
Wishbone interface is clocked by iwb_clk_i.
|
526 |
|
|
|
527 |
|
|
The OR1200 has an asynchronous ((reset)) signal. The reset signal rst, when
asserted high, immediately resets all flip-flops inside the OR1200. When it is
deasserted, the OR1200 will start the reset exception.
|
530 |
|
|
|
531 |
|
|
WISHBONE Interfaces
|
532 |
|
|
~~~~~~~~~~~~~~~~~~~
|
533 |
|
|
Two ((WISHBONE)) interfaces connect the OR1200 core to external peripherals
and the external memory subsystem. They are compliant with the WISHBONE SoC
Interconnection specification Rev. B3. The interfaces implement a 32-bit bus
width and do not support other bus widths.
|
537 |
|
|
|
538 |
|
|
When not disabled, Wishbone registered-feedback incrementing burst accesses
occur when cache lines are filled. The burst size (number of beats) is
determined by the cache line size.
|
541 |
|
|
|
542 |
|
|
image::img/wb_compatible.png[scaledwidth="30%",align="center"]
|
543 |
|
|
|
544 |
|
|
Operation
|
545 |
|
|
---------
|
546 |
|
|
This section describes the operation of the OR1200 core. For operations
|
547 |
|
|
that pertain to the architectural definitions, see <>.
|
548 |
|
|
|
549 |
|
|
Reset
|
550 |
|
|
~~~~~
|
551 |
|
|
The OR1200 has one asynchronous ((reset)) signal that can be used for both
soft and hard resets at higher levels of the system hierarchy.
|
553 |
|
|
|
554 |
|
|
[[powerup_sequence_fig]]
|
555 |
|
|
.Power-Up and Reset Sequence
|
556 |
|
|
image::img/powerup_seq.gif[scaledwidth="70%",align="center"]
|
557 |
|
|
|
558 |
|
|
<> shows how asynchronous reset is applied after
powering up the OR1200 core. Reset is connected to the asynchronous reset of
almost all flip-flops inside the RISC core. Special care must be taken to meet
the setup and hold times of all flip-flops relative to the main RISC clock.
|
562 |
|
|
|
563 |
|
|
If system implements gated clocks, then clock gating can be used to ensure
|
564 |
|
|
proper reset timing.
|
565 |
|
|
|
566 |
|
|
[[powerup_sequence_gatedclk_fig]]
|
567 |
|
|
.Power-Up and Reset Sequence w/ Gated Clock
|
568 |
|
|
image::img/powerup_seq_gatedclk.gif[scaledwidth="70%",align="center"]
|
569 |
|
|
|
570 |
|
|
The address the PC assumes at hard reset (assertion of external reset signal)
|
571 |
|
|
is definable at synthesis time, via the OR1200_BOOT_ADR define. This is not
|
572 |
|
|
to be confused with the ability to set the exception prefix address with
|
573 |
|
|
the EPH bit.
|
574 |
|
|
|
575 |
|
|
CPU/FPU/DSP
|
576 |
|
|
~~~~~~~~~~~
|
577 |
|
|
The ((CPU))/((FPU))/((DSP)) is an implementation of the 32-bit part of the
OpenRISC 1000 architecture, and only a subset of all features is implemented.
|
579 |
|
|
|
580 |
|
|
Instructions
|
581 |
|
|
^^^^^^^^^^^^
|
582 |
|
|
(((OpenRISC 1200, Instruction List)))
|
583 |
|
|
The following table lists the instructions implemented in the OR1200. Those
|
584 |
|
|
optionally implemented are indicated as such.
|
585 |
|
|
|
586 |
|
|
// The table below is split into several columns for readability by the
|
587 |
|
|
// preprocessing script. It is better to have this automated because
|
588 |
|
|
// given the pseudo-lexicographical ordering, adding a new instruction
|
589 |
|
|
// would require manual changes in all subsequent columns, which is
|
590 |
|
|
// tedious and error-prone.
|
591 |
|
|
//
|
592 |
|
|
// When changing the column headers, remember to change the script accordingly.
|
593 |
|
|
|
594 |
|
|
[[instructions_table]]
.Instructions implemented in OR1200
[width="95%",options="header"]
|=================================
| Instruction mnemonic | Optional
| ((l.add)) |
| ((l.addc)) | Yes
| ((l.addi)) |
| ((l.and)) |
| ((l.andi)) |
| ((l.bf)) |
| ((l.bnf)) |
| ((l.div)) | Yes
| ((l.extbs)) | Yes
| ((l.extbz)) | Yes
| ((l.exths)) | Yes
| ((l.exthz)) | Yes
| ((l.extws)) | Yes
| ((l.extwz)) | Yes
| ((l.ff1)) | Yes
| ((l.fl1)) | Yes
| ((l.j)) |
| ((l.jal)) |
| ((l.jalr)) |
| ((l.jr)) |
| ((l.lbs)) |
| ((l.lbz)) |
| ((l.lhs)) |
| ((l.lhz)) |
| ((l.lws)) |
| ((l.lwz)) |
| ((l.mac)) | Yes
| ((l.maci)) | Yes
| ((l.macrc)) | Yes
| ((l.mfspr)) |
| ((l.movhi)) |
| ((l.msb)) | Yes
| ((l.mtspr)) |
| ((l.mul)) | Yes
| ((l.muli)) | Yes
| ((l.nop)) |
| ((l.or)) |
| ((l.ori)) |
| ((l.rfe)) |
| ((l.rori)) |
| ((l.sb)) |
| ((l.sfeq)) |
| ((l.sfges)) |
| ((l.sfgeu)) |
| ((l.sfgts)) |
| ((l.sfgtu)) |
| ((l.sfleu)) |
| ((l.sflts)) |
| ((l.sfltu)) |
| ((l.sfne)) |
| ((l.sh)) |
| ((l.sll)) |
| ((l.slli)) |
| ((l.sra)) |
| ((l.srai)) |
| ((l.srl)) |
| ((l.srli)) |
| ((l.sub)) | Yes
| ((l.sw)) |
| ((l.sys)) |
| ((l.trap)) |
| ((l.xor)) |
| ((l.xori)) |
| ((lf.add.s)) | Yes
| ((lf.div.s)) | Yes
| ((lf.ftoi.s)) | Yes
| ((lf.itof.s)) | Yes
| ((lf.mul.s)) | Yes
| ((lf.sfeq.s)) | Yes
| ((lf.sfge.s)) | Yes
| ((lf.sfgt.s)) | Yes
| ((lf.sfle.s)) | Yes
| ((lf.sflt.s)) | Yes
| ((lf.sfne.s)) | Yes
| ((lf.sub.s)) | Yes
|=================================
|
675 |
|
|
|
676 |
|
|
For a complete description of each instruction's format refer to
|
677 |
|
|
<>.
|
678 |
|
|
|
679 |
|
|
Instruction Unit
|
680 |
|
|
^^^^^^^^^^^^^^^^
|
681 |
|
|
((Instruction unit)) generates instruction fetch effective address and fetches
|
682 |
|
|
instructions from instruction cache. Each clock cycle one instruction can
|
683 |
|
|
be fetched. Instruction fetch EA is further translated into physical address
|
684 |
|
|
by IMMU.
|
685 |
|
|
|
686 |
|
|
General-Purpose Registers
|
687 |
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^
|
688 |
|
|
((General-purpose register)) file can supply two read operands each clock cycle
|
689 |
|
|
and store one result in a destination register.
|
690 |
|
|
|
691 |
|
|
GPRs can be also read and written through development interface.
|
692 |
|
|
|
693 |
|
|
Load/Store Unit
|
694 |
|
|
^^^^^^^^^^^^^^^
|
695 |
|
|
((LSU)) can execute one load instruction every two clock cycles, assuming the
load instructions hit in the data cache. Execution of a store instruction
takes one clock cycle, assuming it hits in the data cache.
|
698 |
|
|
|
699 |
|
|
LSU performs calculation of the load/store effective address. EA is further
|
700 |
|
|
translated into physical address by DMMU.
|
701 |
|
|
|
702 |
|
|
Load/store effective address and load and store data can be also accessed
|
703 |
|
|
through development interface.
|
704 |
|
|
|
705 |
|
|
Integer Execution Pipeline
|
706 |
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^
|
707 |
|
|
(((Pipeline, Integer Execution)))
|
708 |
|
|
The core implements the following types of 32-bit integer instructions:
|
709 |
|
|
|
710 |
|
|
* Arithmetic instructions
|
711 |
|
|
* Compare instructions
|
712 |
|
|
* Logical instructions
|
713 |
|
|
* Rotate and shift instructions
|
714 |
|
|
|
715 |
|
|
[[exec_time_int_table]]
.Execution Time of Integer Instructions
[width="70%",options="header"]
|================================================
| Instruction Group | Clock Cycles to Execute
| Arithmetic except Multiply/Divide | 1
| Multiply | 3
| Divide | 32
| Compare | 1
| Logical | 1
| Rotate and Shift | 1
| Others | 1
|================================================
|
728 |
|
|
|
<<exec_time_int_table>> lists execution times for instructions executed by
the integer execution pipeline. Most instructions are executed in one clock cycle.

Integer multiply can be implemented either serially or in parallel. A serial
operation requires one clock cycle per operand bit, which is 32 cycles
on the OR1200. At present no synthesis tools support division operators,
so the serial option must be used for divide.
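
The per-group cycle counts in the integer execution time table can be folded into a rough estimate for an instruction mix. The following is a minimal sketch, not a cycle-accurate model: the group names and the `execute_cycles` helper are illustrative assumptions, and pipeline stalls, cache misses and the serial multiply option are ignored.

```python
# Cycle counts per instruction group, copied from the integer execution
# time table. The dictionary keys are names chosen for this sketch.
INT_CYCLES = {
    "arithmetic": 1,    # arithmetic except multiply/divide
    "multiply": 3,
    "divide": 32,
    "compare": 1,
    "logical": 1,
    "rotate_shift": 1,
    "other": 1,
}


def execute_cycles(mix):
    """Sum execute-stage cycles for a mapping of group name -> instruction count."""
    return sum(INT_CYCLES[group] * count for group, count in mix.items())


example_mix = {"arithmetic": 10, "multiply": 2, "divide": 1, "logical": 4}
print(execute_cycles(example_mix))  # 10*1 + 2*3 + 1*32 + 4*1 = 52
```

A single divide dominates this mix, which is why divide-heavy inner loops are usually the first place to look when estimating execution time.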
|
736 |
|
|
|
737 |
|
|
MAC Unit
|
738 |
|
|
^^^^^^^^
|
739 |
|
|
((MAC)) unit executes l.mac instructions. MAC unit implements 32x32 fully
|
740 |
|
|
pipelined multiplier and 48-bit accumulator. MAC unit can accept one new
|
741 |
|
|
l.mac instruction each clock cycle.
|
742 |
|
|
|
743 |
|
|
Care should be taken when executing l.macrc (MAC read and clear) too soon
after the final l.mac instruction, as the operation may still be underway
and the result will not be valid in time. It is recommended that at least 3 other
instructions (or just l.nops) be inserted between the final l.mac and l.macrc.
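
The spacing recommendation can be checked mechanically over a list of mnemonics. This is a hedged sketch of such a check: the `macrc_hazards` helper is hypothetical, and treating l.maci and l.msb the same way as l.mac is an assumption, since the text above only names l.mac.

```python
def macrc_hazards(program, min_gap=3, mac_ops=("l.mac", "l.maci", "l.msb")):
    """Return indices of l.macrc instructions that follow a MAC operation too closely.

    `program` is a list of mnemonic strings in program order. `min_gap` is the
    recommended number of other instructions between the final MAC operation
    and l.macrc.
    """
    hazards = []
    last_mac = None
    for index, mnemonic in enumerate(program):
        if mnemonic in mac_ops:
            last_mac = index
        elif mnemonic == "l.macrc" and last_mac is not None:
            gap = index - last_mac - 1  # instructions between the two
            if gap < min_gap:
                hazards.append(index)
    return hazards


too_close = ["l.mac", "l.nop", "l.macrc"]
padded = ["l.mac", "l.nop", "l.nop", "l.nop", "l.macrc"]
print(macrc_hazards(too_close))  # [2]
print(macrc_hazards(padded))     # []
```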
|
747 |
|
|
|
748 |
|
|
Floating Point Unit
|
749 |
|
|
^^^^^^^^^^^^^^^^^^^
|
750 |
|
|
The ((floating point unit)) has a mechanism to stall the processor pipeline
|
751 |
|
|
until processing has completed.
|
752 |
|
|
|
753 |
|
|
The following table indicates the number of cycles per operation.

[[exec_time_fp_table]]
.Execution time of floating point instructions
[width="60%",options="header"]
|=======================
| Operation | Cycles
| Add/subtract | 10
| Multiply | 38
| Divide | 37
| Compare | 2
| Convert | 7
|=======================
|
766 |
|
|
|
767 |
|
|
System Unit
|
768 |
|
|
^^^^^^^^^^^
|
769 |
|
|
((System unit)) implements system control and status special-purpose registers
|
770 |
|
|
and executes all l.mtspr/l.mfspr instructions.
|
771 |
|
|
|
772 |
|
|
Exceptions
|
773 |
|
|
^^^^^^^^^^
|
774 |
|
|
The core implements a precise ((exception model)). This means that when an
exception is taken, the following conditions are met:

* Subsequent instructions in program flow are discarded
* Previous instructions finish and write back their results
* The address of faulting instruction is saved in EPCR registers and the
machine state is saved to ESR registers
* If the exception occurred in a delay slot, the DSX bit of the SR is set
|
782 |
645 |
julius |
|
783 |
|
|
[[exceptions_table]]
.List of Implemented ((Exceptions))
[width="95%",options="header"]
|===========================================================
| Exception Type | Vector Offset | Causing Conditions
| Reset | 0x100 | Caused by reset.
| Bus Error | 0x200 | Caused by an attempt to access an invalid
physical address.
| Data Page Fault | 0x300 | Generated artificially by the DTLB miss exception
handler when no matching PTE is found in the page tables, or on a page protection
violation for load/store operations.
| Instruction Page Fault| 0x400 | Generated artificially by the ITLB miss exception
handler when no matching PTE is found in the page tables, or on a page protection
violation for instruction fetch.
| Low Priority External Interrupt | 0x500 | Low priority external
interrupt asserted.
| Alignment | 0x600 | Load/store access to a location that is not naturally aligned.
| Illegal Instruction | 0x700 | Illegal instruction in the instruction stream.
| High Priority External Interrupt | 0x800 | High priority external
interrupt asserted.
| D-TLB Miss | 0x900 | No matching entry in DTLB (DTLB miss).
| I-TLB Miss | 0xA00 | No matching entry in ITLB (ITLB miss).
| Range | 0xB00 | If programmed in the SR, the setting of SR[OV],
usually by an arithmetic instruction, causes a range exception.
| System Call | 0xC00 | System call initiated by software.
| Floating point exception | 0xD00 | FP operation caused flags in FPCSR to
become set.
| Trap | 0xE00 | Trap instruction was decoded.
|===========================================================
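
The vector offsets in the exceptions table map directly onto a lookup. The sketch below is illustrative only: the dictionary keys are names chosen here, and the `base` parameter is a hypothetical stand-in for wherever the vector area is placed, which this section does not specify.

```python
# Vector offsets copied from the exceptions table; key names are illustrative.
VECTOR_OFFSETS = {
    "reset": 0x100,
    "bus_error": 0x200,
    "data_page_fault": 0x300,
    "instruction_page_fault": 0x400,
    "low_priority_external_interrupt": 0x500,
    "alignment": 0x600,
    "illegal_instruction": 0x700,
    "high_priority_external_interrupt": 0x800,
    "dtlb_miss": 0x900,
    "itlb_miss": 0xA00,
    "range": 0xB00,
    "system_call": 0xC00,
    "floating_point": 0xD00,
    "trap": 0xE00,
}


def vector_address(exception, base=0x00000000):
    """Return the handler entry address for an exception type.

    `base` is an assumed parameter, not a value taken from this specification.
    """
    return base + VECTOR_OFFSETS[exception]


print(hex(vector_address("dtlb_miss")))  # 0x900
```

Each vector is 0x100 bytes apart, so a handler that does not fit in that window has to jump elsewhere.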
|
812 |
|
|
|
813 |
808 |
julius |
The OR1200 exception support does not include support for fast context
|
814 |
|
|
switching.
|
815 |
645 |
julius |
|
816 |
|
|
Data Cache Operation
|
817 |
|
|
~~~~~~~~~~~~~~~~~~~~
|
818 |
|
|
Data Cache Load/Store Access
|
819 |
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
820 |
|
|
Load/store unit requests data from the data ((cache)) and stores them into
|
821 |
|
|
the general-purpose register file and forwards them to integer execution
|
822 |
|
|
units. Therefore LSU is tightly coupled with the data cache.
|
823 |
|
|
|
824 |
|
|
If there is no data cache line miss nor ((DTLB)) miss, load operations take
|
825 |
|
|
two clock cycles to execute and store operations take one clock cycle to
|
826 |
|
|
execute. LSU does all the data alignment work.
|
827 |
|
|
|
828 |
|
|
Data can be written to the data cache on a word, half-word or byte basis. When
the data cache operates in write-through mode, all writes are immediately
written through to main memory or to the next level of caches.
|
831 |
|
|
|
832 |
|
|
[[wb_write_fig]]
|
833 |
|
|
.WISHBONE Write Cycle
|
834 |
|
|
image::img/wb_write.gif[scaledwidth="70%",align="center"]
|
835 |
|
|
|
<<wb_write_fig>> shows how a ((write-through)) cycle on the data WISHBONE interface
is performed when a store instruction hits in the data cache. If +dwb_ERR_I+
or +dwb_RTY_I+ is asserted instead of the usual +dwb_ACK_I+, a bus error exception
is invoked.
|
840 |
|
|
|
841 |
|
|
Data Cache Line Fill Operation
|
842 |
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
843 |
|
|
When a load instruction misses in the cache, and either the cache uses the
((write-through)) strategy or it uses the ((write-back)) strategy and the line
is clean or invalid, a 4-beat sequential read burst with critical word
first is performed. If the strategy is write-back and the line is dirty,
the line is first written back to memory. The critical word is forwarded to
the load/store unit to minimize performance loss because of the cache miss.
|
849 |
|
|
|
850 |
|
|
[[wb_read_fig]]
|
851 |
|
|
.WISHBONE Block Read Cycle
|
852 |
|
|
image::img/wb_read.gif[scaledwidth="70%",align="center"]
|
853 |
|
|
|
<<wb_read_fig>> shows how a cache line is read in a WISHBONE read block cycle
composed of four read transfers. If +dwb_ERR_I+ or +dwb_RTY_I+ is asserted
instead of the usual +dwb_ACK_I+, a bus error exception is invoked.
|
857 |
|
|
|
858 |
|
|
When executing a store instruction with the cache in write-through strategy,
and a cache miss occurs, the write is simply put on the bus and no caching
occurs. If it is a miss and the cache is in write-back strategy and the line
is valid and clean or invalid, a 4-beat sequential read burst to fill the
line is performed, and then the write to the cache occurs. If storing and a cache
miss occurs, and the desired line is valid and dirty, it is first written
back to memory before the desired line is read.
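
The load and store miss cases described in this section can be summarised as a small decision function. This is a sketch of the behaviour as described in the text, not of the RTL; the function name, argument names and step labels are assumptions made for illustration.

```python
def miss_bus_sequence(is_store, write_back, line_valid, line_dirty):
    """List the bus operations for a data cache miss, in order.

    `write_back` selects the write-back strategy; otherwise write-through
    is assumed. `line_valid` and `line_dirty` describe the line being replaced.
    """
    if is_store and not write_back:
        # Write-through store miss: the write goes to the bus, nothing is cached.
        return ["single write"]
    steps = []
    if write_back and line_valid and line_dirty:
        # The line being replaced holds modified data and must reach memory first.
        steps.append("write back dirty line")
    steps.append("4-beat read burst (line fill)")
    if is_store:
        steps.append("write to cache")
    return steps


print(miss_bus_sequence(is_store=False, write_back=True, line_valid=True, line_dirty=True))
# ['write back dirty line', '4-beat read burst (line fill)']
```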
|
865 |
|
|
|
866 |
|
|
[[wb_rw_fig]]
|
867 |
|
|
.WISHBONE Block Read/Write Cycle
|
868 |
|
|
image::img/wb_rw.gif[scaledwidth="70%",align="center"]
|
869 |
|
|
|
<<wb_rw_fig>> shows how a cache line is read in a WISHBONE read block cycle
followed by a write transfer. If +dwb_ERR_I+ or +dwb_RTY_I+ is asserted instead
of the usual +dwb_ACK_I+, a bus error exception is invoked.
|
873 |
|
|
|
874 |
|
|
Cache/Memory Coherency
|
875 |
|
|
^^^^^^^^^^^^^^^^^^^^^^
|
876 |
|
|
Data cache in OR1200 operates in either write-through or write-back mode,
|
877 |
|
|
definable at synthesis time, for default use, and runtime when DMMU is
|
878 |
|
|
used. There is currently no ((coherency)) support between local data cache and
|
879 |
|
|
caches of other processors.
|
880 |
|
|
|
881 |
|
|
Data Cache Enabling/Disabling
|
882 |
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
883 |
|
|
Data cache is disabled at power up. Entire data cache can be enabled by setting
|
884 |
|
|
bit SR[DCE] to one. Before data cache is enabled, it must be invalidated.
|
885 |
|
|
|
886 |
|
|
Data Cache Invalidation
|
887 |
|
|
^^^^^^^^^^^^^^^^^^^^^^^
|
888 |
|
|
Data cache in OR1200 does not support ((invalidation)) of entire data
|
889 |
|
|
cache. Normal procedure to invalidate entire data cache is to cycle through
|
890 |
|
|
all data cache lines and invalidate each line separately.
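
The line-by-line procedure amounts to a loop over line addresses. The following is a minimal sketch under stated assumptions: `write_dcbir` is a hypothetical callback standing in for an l.mtspr write to DCBIR, and the cache and line sizes are placeholder parameters that must be taken from the actual configuration.

```python
def invalidate_data_cache(write_dcbir, cache_bytes=8192, line_bytes=16):
    """Invalidate every data cache line by writing each line address to DCBIR.

    The default sizes are placeholders for this sketch, not values taken
    from this specification.
    """
    for address in range(0, cache_bytes, line_bytes):
        write_dcbir(address)


# Usage with a list standing in for the register write, on a tiny 64-byte cache.
written = []
invalidate_data_cache(written.append, cache_bytes=64, line_bytes=16)
print([hex(a) for a in written])  # ['0x0', '0x10', '0x20', '0x30']
```

On real hardware this loop would run before SR[DCE] is set, since the cache must be invalidated before it is enabled.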
|
891 |
|
|
|
892 |
|
|
Data Cache Locking
|
893 |
|
|
^^^^^^^^^^^^^^^^^^
|
894 |
|
|
Data cache implements way ((locking)) bits in data cache control register
|
895 |
|
|
DCCR. Bits LWx lock individual ways when they are set to one.
|
896 |
|
|
|
897 |
|
|
Data Cache Line Prefetch
|
898 |
|
|
^^^^^^^^^^^^^^^^^^^^^^^^
|
899 |
|
|
Data cache line ((prefetch)) is optional in the OpenRISC 1000 architecture and
|
900 |
|
|
is not implemented in OR1200.
|
901 |
|
|
|
902 |
|
|
Data Cache Line ((Flush))
|
903 |
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^
|
904 |
|
|
Operation is performed by writing effective address to the DCBFR register.
|
905 |
|
|
|
906 |
|
|
When a cache line is valid and clean, or the cache is in write-through
|
907 |
|
|
strategy, the line is invalidated and no write-back occurs.
|
908 |
|
|
|
909 |
|
|
Data Cache Line Invalidate
|
910 |
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^
|
911 |
|
|
Data cache line ((invalidate)) invalidates a single data cache line. Operation
|
912 |
|
|
is performed by writing effective address to the DCBIR register. If cache
|
913 |
|
|
is in write-back strategy, it is best to use the line flush function.
|
914 |
|
|
|
915 |
|
|
Data Cache Line ((Write-back))
|
916 |
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
917 |
|
|
Operation is performed by writing effective address to the DCBWR register.
|
918 |
|
|
|
919 |
|
|
If cache is in ((write-through)) strategy, this operation is ignored as no
|
920 |
|
|
lines will be cached and dirty, capable of being written back.
|
921 |
|
|
|
922 |
|
|
Data Cache Line ((Lock))
|
923 |
|
|
^^^^^^^^^^^^^^^^^^^^^^^^
|
924 |
|
|
Locking of individual data cache lines is not implemented in OR1200.
|
925 |
|
|
|
926 |
|
|
Data Cache ((inhibit)) with address bit 31 set
|
927 |
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
928 |
|
|
If DMMU is disabled, by default all addresses with bit 31 of the address
|
929 |
|
|
asserted high will cause the data cache to be inhibited, meaning no reads
|
930 |
|
|
or writes are cached.
|
931 |
|
|
|
932 |
|
|
If the ((DMMU)) is enabled, it is possible for any address to be inhibited
|
933 |
|
|
or not, and in these modes the cache behaves accordingly.
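
The default inhibit rule can be written as a one-line predicate. This sketch assumes a hypothetical `page_inhibited` flag to represent whatever decision applies when the DMMU is enabled; how that decision is derived is not modelled here.

```python
def data_cache_inhibited(address, dmmu_enabled=False, page_inhibited=False):
    """Return True if an access to `address` bypasses the data cache.

    With the DMMU disabled, bit 31 of the address decides. With the DMMU
    enabled, the caller supplies the per-address decision (an assumption
    of this sketch).
    """
    if dmmu_enabled:
        return page_inhibited
    return bool(address & 0x80000000)


print(data_cache_inhibited(0x90000000))  # True
print(data_cache_inhibited(0x00001000))  # False
```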
|
934 |
|
|
|
935 |
|
|
Instruction ((Cache)) Operation
|
936 |
|
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
937 |
|
|
Instruction Cache Instruction ((Fetch)) Access
|
938 |
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
939 |
|
|
Instruction unit requests instruction from the instruction cache and forwards
|
940 |
|
|
them to the instruction queue inside instruction unit. Therefore instruction
|
941 |
|
|
unit is tightly coupled with the instruction cache.
|
942 |
|
|
|
943 |
|
|
If there is no instruction cache line ((miss)) nor ITLB miss, instruction fetch
|
944 |
|
|
operation takes one clock cycle to execute.
|
945 |
|
|
|
946 |
|
|
Instruction cache cannot be explicitly modified like data cache can be with
|
947 |
|
|
store instructions.
|
948 |
|
|
|
949 |
|
|
Instruction Cache Line Fill Operation
|
950 |
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
951 |
|
|
On a cache miss, a 4 beat sequential read burst with critical word first is
|
952 |
|
|
performed. Critical word is forwarded to the instruction unit to minimize
|
953 |
|
|
performance loss because of the cache miss.
|
954 |
|
|
|
955 |
|
|
[[wb_block_read_fig]]
|
956 |
|
|
.WISHBONE Block Read Cycle
|
957 |
|
|
image::img/wb_block_read.gif[scaledwidth="70%",align="center"]
|
958 |
|
|
|
<<wb_block_read_fig>> shows how a cache line is read in a WISHBONE read block
cycle composed of four read transfers. If +iwb_ERR_I+ or +iwb_RTY_I+ is
asserted instead of the usual +iwb_ACK_I+, a bus error exception is invoked.
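
To make "critical word first" concrete, the sketch below lists the word addresses of a 4-beat line fill starting at the word that missed. It assumes 32-bit (4-byte) bus words and that the burst wraps around within the line; neither detail is stated in this section, so treat both as assumptions.

```python
def burst_word_addresses(miss_address, beats=4, word_bytes=4):
    """Return the word addresses of a line fill, critical word first.

    Assumes the burst wraps within the cache line after reaching its end.
    """
    line_bytes = beats * word_bytes
    line_base = miss_address & ~(line_bytes - 1)
    first = (miss_address - line_base) // word_bytes
    return [line_base + ((first + i) % beats) * word_bytes for i in range(beats)]


print([hex(a) for a in burst_word_addresses(0x1008)])
# ['0x1008', '0x100c', '0x1000', '0x1004']
```

The first address returned is the one whose data is forwarded to the requesting unit while the rest of the line is still being filled.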
|
962 |
|
|
|
963 |
|
|
Cache/Memory ((Coherency))
|
964 |
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^
|
965 |
|
|
OR1200 is not intended for use in multiprocessor environments. Therefore no
|
966 |
|
|
support for coherency between local instruction cache and caches of other
|
967 |
|
|
processors or main memory is implemented.
|
968 |
|
|
|
969 |
|
|
Instruction Cache Enabling/Disabling
|
970 |
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
971 |
|
|
Instruction cache is disabled at power up. Entire instruction cache can be
|
972 |
|
|
enabled by setting bit SR[ICE] to one. Before instruction cache is enabled,
|
973 |
|
|
it must be invalidated.
|
974 |
|
|
|
975 |
|
|
Instruction Cache ((Invalidation))
|
976 |
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
977 |
|
|
Instruction cache in OR1200 does not support invalidation of entire instruction
|
978 |
|
|
cache. Normal procedure to invalidate entire instruction cache is to cycle
|
979 |
|
|
through all instruction cache lines and invalidate each line separately.
|
980 |
|
|
|
981 |
|
|
Instruction Cache Locking
|
982 |
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^
|
983 |
|
|
Instruction cache implements way locking bits in instruction cache control
|
984 |
|
|
register ICCR. Bits LWx lock individual ways when they are set to one.
|
985 |
|
|
|
986 |
|
|
Instruction Cache Line ((Prefetch))
|
987 |
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
988 |
|
|
Instruction cache line prefetch is optional in the OpenRISC 1000 architecture
|
989 |
|
|
and is not implemented in OR1200.
|
990 |
|
|
|
991 |
|
|
Instruction Cache Line ((Invalidate))
|
992 |
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
993 |
|
|
Instruction cache line invalidate invalidates a single instruction cache
|
994 |
|
|
line. Operation is performed by writing effective address to the ICBIR
|
995 |
|
|
register.
|
996 |
|
|
|
997 |
|
|
Instruction ((Cache Line Lock))
|
998 |
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
999 |
|
|
Locking of individual instruction cache lines is not implemented in OR1200.
|
1000 |
|
|
|
1001 |
|
|
Data MMU
|
1002 |
|
|
~~~~~~~~
|
1003 |
|
|
Translation Disabled
|
1004 |
|
|
^^^^^^^^^^^^^^^^^^^^
|
1005 |
|
|
Load/store address translation can be disabled by clearing bit SR[DME]. If
|
1006 |
|
|
translation is disabled, then physical address used to access data cache
|
1007 |
|
|
and optionally provided on +dwb_ADDR_O+, is the same as load/store effective
|
1008 |
|
|
address.
|
1009 |
|
|
(((Address Translation,Data)))
|
1010 |
|
|
|
1011 |
|
|
Translation Enabled
|
1012 |
|
|
^^^^^^^^^^^^^^^^^^^
|
1013 |
|
|
Load/store address translation can be enabled by setting bit SR[DME]. If
|
1014 |
|
|
translation is enabled, it provides load/store effective address to physical
|
1015 |
|
|
address translation and page protection for memory accesses.
|
1016 |
|
|
(((Address Translation,Data)))
|
1017 |
|
|
|
1018 |
|
|
[[addr_translation_fig]]
|
1019 |
|
|
.32-bit Address Translation Mechanism using Two-Level Page Table
|
1020 |
|
|
image::img/addr_translation.gif[scaledwidth="70%",align="center"]
|
1021 |
|
|
|
1022 |
|
|
In the OR1200, ((page tables)) must be managed by the operating system's virtual
memory management subsystem. <<addr_translation_fig>> shows address translation
using a two-level page table. Refer to <> for one-level page
table address translation as well as for details about address translation
and page table content.
|
1027 |
|
|
|
1028 |
|
|
((DMMUCR)) and Flush of Entire ((DTLB))
|
1029 |
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
1030 |
|
|
DMMUCR is not implemented in OR1200. Therefore page table base pointer (PTBP)
|
1031 |
|
|
must be stored in software variable. Flush of entire DTLB must be performed
|
1032 |
|
|
by software flush of every DTLB entry separately. Software flush is performed
|
1033 |
|
|
by manually writing bits from the TLB entries back to PTEs.
|
1034 |
|
|
|
1035 |
|
|
Page Protection
|
1036 |
|
|
^^^^^^^^^^^^^^^
|
1037 |
|
|
After a virtual address is determined to be within a page covered by the
|
1038 |
|
|
valid PTE, the access is validated by the memory protection mechanism. If
|
1039 |
|
|
this protection mechanism prohibits the access, a data page fault exception
|
1040 |
|
|
is generated.
|
1041 |
|
|
(((Page Protection,Data)))
|
1042 |
|
|
|
1043 |
|
|
The memory protection mechanism allows selectively granting read access
|
1044 |
|
|
and write access for both supervisor and user modes. The page protection
|
1045 |
|
|
mechanism provides protection at all page level granularities.
|
1046 |
|
|
|
1047 |
|
|
[[protection_attrs_ldst_table]]
.Protection Attributes for Load/Store Accesses
[width="70%",options="header"]
|================================
| Protection attribute | Meaning
| DTLBWyTR[SREx] | Enable load operations in supervisor mode to the
page.
| DTLBWyTR[SWEx] | Enable store operations in supervisor mode to the
page.
| DTLBWyTR[UREx] | Enable load operations in user mode to the page.
| DTLBWyTR[UWEx] | Enable store operations in user mode to the page.
|================================
|
1059 |
|
|
|
<<protection_attrs_ldst_table>> lists the page protection attributes defined in
the DTLBWyTR register. For an individual page, the appropriate strategy out of
seven possible strategies is programmed with the PPI field of the PTE. Because
OR1200 does not implement DMMUPR, translation of PTE[PPI] into a suitable set
of protection bits must be performed by software and written into DTLBWyTR.
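
The four load/store protection attributes reduce to a simple check per access. The sketch below models the attributes as a set of short names (`"SRE"`, `"SWE"`, `"URE"`, `"UWE"`); the function and this representation are illustrative assumptions, and a failed check corresponds to the data page fault exception.

```python
def data_access_allowed(enabled_bits, supervisor, is_store):
    """Check a load/store against the page's enabled protection attributes.

    `enabled_bits` is a set drawn from {"SRE", "SWE", "URE", "UWE"}.
    """
    needed = ("S" if supervisor else "U") + ("WE" if is_store else "RE")
    return needed in enabled_bits


# A page that supervisor code may read and write, and user code may only read.
page_bits = {"SRE", "SWE", "URE"}
print(data_access_allowed(page_bits, supervisor=False, is_store=False))  # True
print(data_access_allowed(page_bits, supervisor=False, is_store=True))   # False
```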
|
1065 |
|
|
|
1066 |
|
|
((DTLB)) Entry Reload
|
1067 |
|
|
^^^^^^^^^^^^^^^^^^^^^
|
1068 |
|
|
OR1200 does not implement DTLB entry reloads in hardware. Instead software
|
1069 |
|
|
routine must be used to search page table for correct page table entry (PTE)
|
1070 |
|
|
and copy it into the DTLB. Software is responsible for maintaining accessed
|
1071 |
|
|
and dirty bits in the page tables.
|
1072 |
|
|
|
1073 |
|
|
When LSU computes load/store effective address whose physical address is
|
1074 |
|
|
not already cached by DTLB, a DTLB miss exception is invoked.
|
1075 |
|
|
|
1076 |
|
|
The DTLB reload routine must load the correct ((PTE)) into the correct ((DTLBWyMR))
and ((DTLBWyTR)) registers of one of the possible DTLB ways.
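
The miss-then-reload flow can be modelled in a few lines. This is a simplified software model, not the OR1200 handler: it assumes 8 KB pages, a single DTLB way with 64 sets, and a dictionary as the page table, and it collapses the match and translate registers into one entry. All names are hypothetical.

```python
PAGE_SHIFT = 13  # assumption: 8 KB pages
DTLB_SETS = 64   # assumption: sets in the single modelled DTLB way


class DtlbMiss(Exception):
    """Stands in for the DTLB miss exception."""


def translate(tlb, ea):
    """Translate an effective address, raising DtlbMiss if no entry matches."""
    vpn = ea >> PAGE_SHIFT
    entry = tlb[vpn % DTLB_SETS]
    if entry is None or entry["vpn"] != vpn:
        raise DtlbMiss(ea)
    offset = ea & ((1 << PAGE_SHIFT) - 1)
    return (entry["ppn"] << PAGE_SHIFT) | offset


def reload(tlb, page_table, ea):
    """Software miss handler: find the PTE and copy it into the DTLB."""
    vpn = ea >> PAGE_SHIFT
    pte = page_table[vpn]  # a missing PTE would become a data page fault
    tlb[vpn % DTLB_SETS] = {"vpn": vpn, "ppn": pte["ppn"]}
    pte["accessed"] = True  # software maintains the accessed bit


def load_address(tlb, page_table, ea):
    """Translate `ea`, running the reload routine once on a miss."""
    try:
        return translate(tlb, ea)
    except DtlbMiss:
        reload(tlb, page_table, ea)
        return translate(tlb, ea)


tlb = [None] * DTLB_SETS
page_table = {2: {"ppn": 5, "accessed": False}}
print(hex(load_address(tlb, page_table, (2 << PAGE_SHIFT) | 0x34)))  # 0xa034
```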
|
1078 |
|
|
|
1079 |
|
|
DTLB Entry Invalidation
|
1080 |
|
|
^^^^^^^^^^^^^^^^^^^^^^^
|
1081 |
|
|
Special-purpose register DTLBEIR must be written with the effective address
|
1082 |
|
|
and corresponding DTLB entry will be invalidated in the local DTLB.
|
1083 |
|
|
|
1084 |
|
|
Locking DTLB Entries
|
1085 |
|
|
^^^^^^^^^^^^^^^^^^^^
|
1086 |
|
|
Since all DTLB entry reloads are performed in software, there is no hardware
|
1087 |
|
|
locking of DTLB entries. Instead it is up to the software reload routine to
|
1088 |
|
|
avoid replacing some of the entries if so desired.
|
1089 |
|
|
|
1090 |
|
|
Page Attribute - Dirty (D)
|
1091 |
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^
|
1092 |
|
|
Dirty (D) attribute is not implemented in OR1200 DTLB. It is up to the
|
1093 |
|
|
operating system to generate dirty attribute bit with page protection
|
1094 |
|
|
mechanism.
|
1095 |
|
|
(((Page Attributes,Data)))
|
1096 |
|
|
|
1097 |
|
|
Page Attribute - Accessed (A)
|
1098 |
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
1099 |
|
|
Accessed (A) attribute is not implemented in OR1200 DTLB. It is up to the
|
1100 |
|
|
operating system to generate accessed attribute bit with page protection
|
1101 |
|
|
mechanism.
|
1102 |
|
|
(((Page Attributes,Data)))
|
1103 |
|
|
|
1104 |
|
|
Page Attribute - Weakly Ordered Memory (WOM)
|
1105 |
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
1106 |
|
|
Weakly ordered memory (WOM) attribute is not needed in OR1200 because all
|
1107 |
|
|
memory accesses are serialized and therefore this attribute is not implemented.
|
1108 |
|
|
(((Page Attributes,Data)))
|
1109 |
|
|
|
1110 |
|
|
Page Attribute - Write-Back Cache (WBC)
|
1111 |
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
1112 |
|
|
Write-back cache (WBC) attribute is not implemented as the data cache cannot
|
1113 |
|
|
be configured at run time to be write-back enabled if write-through strategy
|
1114 |
|
|
was selected at synthesis-time.
|
1115 |
|
|
(((Page Attributes,Data)))
|
1116 |
|
|
|
1117 |
|
|
Page Attribute - Caching-Inhibited (CI)
|
1118 |
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
1119 |
|
|
Caching-inhibited (CI) attribute is not implemented in OR1200 DTLB. Cached
|
1120 |
|
|
and uncached regions are divided by bit 30 of data effective address.
|
1121 |
|
|
(((Page Attributes,Data)))
|
1122 |
|
|
|
1123 |
|
|
[[data_cached_regions_table]]
.Cached and uncached regions
[width="70%",options="header"]
|===============================
| Effective Address | Region
| 0x00000000 - 0x3FFFFFFF | Cached
| 0x40000000 - 0x7FFFFFFF | Uncached
| 0x80000000 - 0xBFFFFFFF | Cached
| 0xC0000000 - 0xFFFFFFFF | Uncached
|===============================
|
1133 |
|
|
|
1134 |
|
|
Uncached accesses must be performed when I/O registers are memory mapped
|
1135 |
|
|
and all reads and writes must be always performed directly to the external
|
1136 |
|
|
interface and not to the data cache.
|
1137 |
|
|
|
1138 |
|
|
Page Attribute - Cache Coherency (CC)
|
1139 |
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
1140 |
|
|
Cache coherency (CC) attribute is not needed in OR1200 because it does not
|
1141 |
|
|
implement support for multiprocessor environments and because data cache
|
1142 |
|
|
operates only in write-through mode and therefore this attribute is not
|
1143 |
|
|
implemented.
|
1144 |
|
|
(((Page Attributes,Data)))
|
1145 |
|
|
|
1146 |
|
|
((Instruction MMU))
|
1147 |
|
|
~~~~~~~~~~~~~~~~~~~
|
1148 |
|
|
Translation Disabled
|
1149 |
|
|
^^^^^^^^^^^^^^^^^^^^
|
1150 |
|
|
Instruction fetch address translation can be disabled by clearing bit
|
1151 |
|
|
SR[IME]. If translation is disabled, then physical address used to access
|
1152 |
|
|
instruction cache and optionally provided on iwb_ADDR_O, is the same as
|
1153 |
|
|
instruction fetch effective address.
|
1154 |
|
|
(((Address Translation,Instruction)))
|
1155 |
|
|
|
1156 |
|
|
Translation Enabled
|
1157 |
|
|
^^^^^^^^^^^^^^^^^^^
|
1158 |
|
|
Instruction fetch address translation can be enabled by setting bit
|
1159 |
|
|
SR[IME]. If translation is enabled, it provides instruction fetch effective
|
1160 |
|
|
address to physical address translation and page protection for instruction
|
1161 |
|
|
fetch accesses.
|
1162 |
|
|
(((Address Translation,Instruction)))
|
1163 |
|
|
|
1164 |
|
|
[[addr_translation_rep_fig]]
|
1165 |
|
|
.32-bit Address Translation Mechanism using Two-Level Page Table
|
1166 |
|
|
image::img/addr_translation.gif[scaledwidth="70%",align="center"]
|
1167 |
|
|
|
1168 |
|
|
In the OR1200, page tables must be managed by the operating system's virtual
memory management subsystem. <<addr_translation_rep_fig>> shows address
translation using a two-level page table. Refer to <> for
one-level page table address translation as well as for details about address
translation and page table content.
|
1173 |
|
|
|
1174 |
|
|
((IMMUCR)) and ((Flush)) of Entire ITLB
|
1175 |
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
1176 |
|
|
IMMUCR is not implemented in OR1200. Therefore page table base pointer (PTBP)
|
1177 |
|
|
must be stored in software variable. Flush of entire ITLB must be performed
|
1178 |
|
|
by software flush of every ITLB entry separately. Software flush is performed
|
1179 |
|
|
by manually writing bits from the TLB entries back to PTEs.
|
1180 |
|
|
|
1181 |
|
|
Page Protection
|
1182 |
|
|
^^^^^^^^^^^^^^^
|
1183 |
|
|
After a virtual address is determined to be within a page covered by the
|
1184 |
|
|
valid PTE, the access is validated by the memory protection mechanism. If
|
1185 |
|
|
this protection mechanism prohibits the access, an instruction page fault
|
1186 |
|
|
exception is generated.
|
1187 |
|
|
(((Page Protection,Instruction)))
|
1188 |
|
|
|
1189 |
|
|
The memory protection mechanism allows selectively granting execute access
|
1190 |
|
|
for both supervisor and user modes. The page protection mechanism provides
|
1191 |
|
|
protection at all page level granularities.
|
1192 |
|
|
|
1193 |
|
|
[[protection_attrs_inst_table]]
.Protection Attributes for Instruction Fetch Accesses
[width="70%",options="header"]
|================================
| Protection attribute | Meaning
| ITLBWyTR[SXEx] | Enable execute operations in supervisor mode of the
page.
| ITLBWyTR[UXEx] | Enable execute operations in user mode of the page.
|================================
|
1202 |
|
|
|
<<protection_attrs_inst_table>> lists the page protection attributes defined
in the ITLBWyTR register. For an individual page, the appropriate strategy out
of seven possible strategies is programmed with the PPI field of the PTE. Because
OR1200 does not implement IMMUPR, translation of PTE[PPI] into a suitable set
of protection bits must be performed by software and written into ITLBWyTR.
|
1208 |
|
|
|
1209 |
|
|
((ITLB)) Entry Reload
|
1210 |
|
|
^^^^^^^^^^^^^^^^^^^^^
|
1211 |
|
|
OR1200 does not implement ITLB entry reloads in hardware. Instead software
|
1212 |
|
|
routine must be used to search page table for correct page table entry (PTE)
|
1213 |
|
|
and copy it into the ITLB. Software is responsible for maintaining accessed
|
1214 |
|
|
bit in the page tables.
|
1215 |
|
|
|
1216 |
|
|
When the instruction unit computes an instruction fetch effective address whose
physical address is not already cached by the ITLB, an ITLB miss exception is invoked.
|
1218 |
|
|
|
1219 |
|
|
ITLB reload routine must load the correct PTE to correct ITLBWyMR and ITLBWyTR
|
1220 |
|
|
register from one of possible ITLB ways.
|
1221 |
|
|
|
1222 |
|
|
ITLB Entry Invalidation
|
1223 |
|
|
^^^^^^^^^^^^^^^^^^^^^^^
|
1224 |
|
|
Special-purpose register ITLBEIR must be written with the effective address
|
1225 |
|
|
and corresponding ITLB entry will be invalidated in the local ITLB.
|
1226 |
|
|
|
1227 |
|
|
Locking ITLB Entries
|
1228 |
|
|
^^^^^^^^^^^^^^^^^^^^
|
1229 |
|
|
Since all ITLB entry reloads are performed in software, there is no hardware
|
1230 |
|
|
locking of ITLB entries. Instead it is up to the software reload routine to
|
1231 |
|
|
avoid replacing some of the entries if so desired.
|
1232 |
|
|
|
1233 |
|
|
Page Attribute - Dirty (D)
|
1234 |
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^
|
1235 |
|
|
Dirty (D) attribute resides in the PTE but it is not used by the IMMU.
|
1236 |
|
|
(((Page Attributes,Instruction)))
|
1237 |
|
|
|
1238 |
|
|
Page Attribute - Accessed (A)
|
1239 |
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
1240 |
|
|
Accessed (A) attribute is not implemented in OR1200 ITLB. It is up to the
|
1241 |
|
|
operating system to generate accessed attribute bit with page protection
|
1242 |
|
|
mechanism.
|
1243 |
|
|
(((Page Attributes,Instruction)))
|
1244 |
|
|
|
1245 |
|
|
Page Attribute - Weakly Ordered Memory (WOM)
|
1246 |
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
1247 |
|
|
Weakly ordered memory (WOM) attribute is not needed in OR1200 because all
|
1248 |
|
|
instruction fetch accesses are serialized and therefore this attribute is
|
1249 |
|
|
not implemented.
|
1250 |
|
|
(((Page Attributes,Instruction)))
|
1251 |
|
|
|
1252 |
|
|
Page Attribute - Write-Back Cache (WBC)
|
1253 |
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
1254 |
|
|
Write-back cache (WBC) attribute resides in the PTE but it is not used by
|
1255 |
|
|
the IMMU.
|
1256 |
|
|
(((Page Attributes,Instruction)))
|
1257 |
|
|
|
1258 |
|
|
Page Attribute - Caching-Inhibited (CI)
|
1259 |
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
1260 |
|
|
Caching-inhibited (CI) attribute is not implemented in OR1200 ITLB. Cached
|
1261 |
|
|
and uncached regions are divided by bit 30 of instruction effective address.
|
1262 |
|
|
(((Page Attributes,Instruction)))
|
1263 |
|
|
|
1264 |
|
|
[[inst_cached_regions_table]]
|
1265 |
|
|
.Cached and uncached regions
|
1266 |
|
|
[width="70%",options="header"]
|
1267 |
|
|
|===============================
|
1268 |
|
|
| Effective Address | Region
|
1269 |
|
|
| 0x00000000 - 0x3FFFFFFF | Cached
|
1270 |
|
|
| 0x40000000 - 0x7FFFFFFF | Uncached
|
1271 |
|
|
| 0x80000000 - 0xBFFFFFFF | Cached
|
1272 |
|
|
| 0xC0000000 - 0xFFFFFFFF | Uncached
|
1273 |
|
|
|===============================
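
The split in the table above can be expressed as a one-bit test. The following is a minimal sketch, assuming a 32-bit effective address; the function name `is_cached_region` is hypothetical and is not part of the OR1200 sources.

```python
def is_cached_region(effective_address: int) -> bool:
    """Return True when bit 30 of a 32-bit effective address is clear.

    Per the table, regions with bit 30 clear are cached and regions
    with bit 30 set are uncached.
    """
    return ((effective_address >> 30) & 1) == 0
```

For example, `is_cached_region(0x80000000)` is true while `is_cached_region(0xC0000000)` is false, matching the third and fourth rows of the table.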
|
1274 |
|
|
|
1275 |
|
|
Page Attribute - Cache Coherency (CC)
|
1276 |
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
1277 |
|
|
Cache coherency (CC) attribute resides in the PTE but it is not used by
|
1278 |
|
|
the IMMU.
|
1279 |
|
|
(((Page Attributes,Instruction)))
|
1280 |
|
|
|
1281 |
|
|
((Programmable Interrupt Controller))
|
1282 |
|
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
1283 |
|
|
PICMR special-purpose register is used to mask or unmask up to 30 programmable
|
1284 |
|
|
interrupt sources. PICPR special-purpose register is used to assign low or
|
1285 |
|
|
high priority to maximum of 30 interrupt sources.
|
1286 |
|
|
|
1287 |
|
|
PICSR special-purpose register is used to determine status of each interrupt
|
1288 |
|
|
input. Bits in PICSR represent status of the interrupt inputs and the
|
1289 |
|
|
actual interrupt must be cleared in the device that is the source of a
|
1290 |
|
|
pending interrupt.
|
1291 |
|
|
|
1292 |
|
|
The ((PIC)) implementation in the OR1200 differs from the architecture
|
1293 |
|
|
specification. The PIC instead offers a latched level-sensitive interrupt.
|
1294 |
|
|
|
1295 |
|
|
Once an interrupt line is latched (i.e. its value appears in PICSR), no
|
1296 |
|
|
new interrupts can be triggered for that line until its bit in PICSR is
|
1297 |
|
|
cleared. The usual sequence for an interrupt handler is then as follows.
|
1298 |
|
|
|
1299 |
|
|
. Peripheral asserts interrupt, which is latched and triggers handler.
|
1300 |
|
|
. Handler processes interrupt.
|
1301 |
|
|
. Handler notifies peripheral that the interrupt has been processed (typically
|
1302 |
|
|
via a memory mapped register).
|
1303 |
|
|
. Peripheral deasserts interrupt.
|
1304 |
|
|
. Handler clears corresponding bit in PICSR and returns.
|
1305 |
|
|
|
1306 |
|
|
It is assumed that the peripheral will de-assert its interrupt promptly
|
1307 |
|
|
(within 1-2 cycles). Otherwise on exiting the interrupt handler, having
|
1308 |
|
|
cleared PICSR, the level sensitive interrupt will immediately retrigger.
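
The retrigger behavior described above can be illustrated with a small behavioral model. This is only a sketch under stated assumptions: the class `LatchedPic` and its methods are hypothetical, masking is assumed to be applied before latching, and no attempt is made to model cycle timing.

```python
class LatchedPic:
    """Toy model of a latched level-sensitive interrupt controller."""

    def __init__(self):
        self.picmr = 0  # mask register: a set bit unmasks the source
        self.picsr = 0  # status register: latched interrupt lines
        self.lines = 0  # current level of the interrupt inputs

    def sample(self):
        """Latch every asserted, unmasked line into PICSR (assumed order)."""
        self.picsr |= self.lines & self.picmr

    def clear(self, bit):
        """Handler clears one PICSR bit; a still-asserted line re-latches."""
        self.picsr &= ~(1 << bit)
        self.sample()

    def pending(self):
        """True while any latched interrupt remains in PICSR."""
        return self.picsr != 0
```

If the handler calls `clear()` while the peripheral still drives its line high, the bit reappears in PICSR at once, which is the immediate retrigger the text warns about.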
|
1309 |
|
|
|
1310 |
|
|
((Tick Timer))
|
1311 |
|
|
~~~~~~~~~~~~~~
|
1312 |
|
|
Tick timer facility is enabled with TTMR[M]. TTCR is incremented with each
|
1313 |
|
|
clock cycle and a high priority interrupt can be asserted whenever lower 28
|
1314 |
|
|
bits of TTCR match TTMR[TP] and TTMR[IE] is set.
|
1315 |
|
|
|
1316 |
|
|
TTCR restarts counting from zero when match event happens and TTMR[M] is
|
1317 |
|
|
0x1. If TTMR[M] is 0x2, TTCR is stopped when match event happens and TTCR
|
1318 |
|
|
must be changed to start counting again. When TTMR[M] is 0x3, TTCR keeps
|
1319 |
|
|
counting even when match event happens.
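
The three counting modes can be summarized in a short behavioral sketch. The function `tick` is hypothetical, treating TTMR[M] equal to 0x0 as "timer disabled" is an assumption taken from the architecture, and the relative ordering of the compare and the increment within a clock cycle is a modelling choice rather than a statement about the RTL.

```python
TP_MASK = (1 << 28) - 1  # only the lower 28 bits of TTCR are compared


def tick(ttcr, mode, tp):
    """Advance the counter by one clock and return (new_ttcr, match)."""
    if mode == 0:
        # Assumed: mode 0x0 leaves the timer disabled.
        return ttcr, False
    match = (ttcr & TP_MASK) == (tp & TP_MASK)
    if match and mode == 1:
        return 0, True  # restart counting from zero
    if match and mode == 2:
        return ttcr, True  # stop until software changes TTCR
    # Mode 0x3, or no match yet: keep counting (32-bit wraparound).
    return (ttcr + 1) & 0xFFFFFFFF, match
```

An interrupt would be raised only when `match` is true and TTMR[IE] is set; that gating is left to the caller in this sketch.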
|
1320 |
|
|
|
1321 |
|
|
((Power Management))
|
1322 |
|
|
~~~~~~~~~~~~~~~~~~~~
|
1323 |
|
|
((Clock Gating)) and Frequency Changing Versus CPU Stalling
|
1324 |
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
1325 |
|
|
If system doesn't support clock gating and if changing clock frequency in
|
1326 |
|
|
slow down mode is not possible, CPU can be stalled for certain number of
|
1327 |
|
|
clock cycles. This yields a much smaller benefit; however, it
|
1328 |
|
|
still reduces power consumption.
|
1329 |
|
|
|
1330 |
|
|
Slow Down Mode
|
1331 |
|
|
^^^^^^^^^^^^^^
|
1332 |
|
|
Slow down mode is software controlled with the 4-bit value in PMR[SDF]. Lower
|
1333 |
|
|
value specifies higher expected performance from the processor core. Usually
|
1334 |
|
|
PMR[SDF] is dynamically set by the operating system's idle routine, which
|
1335 |
|
|
monitors the usage of the processor core.
|
1336 |
|
|
(((Mode,Slow Down)))
|
1337 |
|
|
|
1338 |
|
|
PMR[SDF] is broadcast on +pm_clksd+. External clock generator should adjust
|
1339 |
|
|
clock frequency according to the value of +pm_clksd+. Exact slow down factors
|
1340 |
|
|
are not defined but 0xF should go all the way down to 32.768 kHz.
|
1341 |
|
|
|
1342 |
|
|
With +pm_clksd+ equal to 0xF, +pm_lvolt+ is asserted. This is an indication for
|
1343 |
|
|
the external power supply to lower the voltage.
|
1344 |
|
|
|
1345 |
|
|
Doze Mode
|
1346 |
|
|
^^^^^^^^^
|
1347 |
|
|
To switch to doze mode, software should set the PMR[DME]. Once an interrupt
|
1348 |
|
|
is received by the programmable interrupt controller (PIC), +pm_wakeup+
|
1349 |
|
|
is asserted and external clock generation circuitry should enable all
|
1350 |
|
|
clocks. Once clocks are running RISC is switched back again to the normal
|
1351 |
|
|
mode and PMR[DME] is cleared.
|
1352 |
|
|
(((Mode,Doze)))
|
1353 |
|
|
|
1354 |
|
|
When doze mode is enabled, +pm_dc_gate+, +pm_ic_gate+, +pm_dmmu_gate+,
|
1355 |
|
|
+pm_immu_gate+ and +pm_cpu_gate+ are asserted. As a result all clocks except
|
1356 |
|
|
+clk_tt+ should be gated by external clock generation circuitry.
|
1357 |
|
|
|
1358 |
|
|
Sleep Mode
|
1359 |
|
|
^^^^^^^^^^
|
1360 |
|
|
To switch to sleep mode, software should set the PMR[SME]. Once an interrupt
|
1361 |
|
|
is received by the programmable interrupt controller (PIC), +pm_wakeup+ is
|
1362 |
|
|
asserted and external clock generation should enable all clocks. Once clocks
|
1363 |
|
|
are running, RISC is switched back again to the normal mode and PMR[SME]
|
1364 |
|
|
is cleared.
|
1365 |
|
|
(((Mode,Sleep)))
|
1366 |
|
|
|
1367 |
|
|
When sleep mode is enabled, +pm_dc_gate+, +pm_ic_gate+, +pm_dmmu_gate+,
|
1368 |
|
|
+pm_immu_gate+, +pm_cpu_gate+ and +pm_tt_gate+ are asserted. As a result
|
1369 |
|
|
all clocks including +clk_tt+ should be gated by external clock generation
|
1370 |
|
|
circuitry.
|
1371 |
|
|
|
1372 |
|
|
In sleep mode, +pm_lvolt+ is asserted. This is an indication for the external
|
1373 |
|
|
power supply to lower the voltage.
|
1374 |
|
|
|
1375 |
|
|
Clock Gating
|
1376 |
|
|
^^^^^^^^^^^^
|
1377 |
|
|
((Clock gating)) feature is not implemented in OR1200 power management.
|
1378 |
|
|
|
1379 |
|
|
Disabled Units Force Clock Gating
|
1380 |
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
1381 |
|
|
Units that are disabled in special-purpose register SR, have their clock
|
1382 |
|
|
gate signals asserted. Cleared bits SR[DCE], SR[ICE], SR[DME] and SR[IME]
|
1383 |
|
|
directly force assertion of +pm_dc_gate+, +pm_ic_gate+, +pm_dmmu_gate+
|
1384 |
|
|
and +pm_immu_gate+.
|
1385 |
|
|
|
1386 |
|
|
((Debug Unit))
|
1387 |
|
|
~~~~~~~~~~~~~~
|
1388 |
|
|
Debug unit can be controlled through development interface or it can operate
|
1389 |
|
|
independently programmed and handled by the RISC's resident debug software.
|
1390 |
|
|
|
1391 |
|
|
((Watchpoints))
|
1392 |
|
|
^^^^^^^^^^^^^^^
|
1393 |
|
|
OR1200 debug unit does not implement OpenRISC 1000 architecture watchpoints.
|
1394 |
|
|
|
1395 |
|
|
((Breakpoint)) Exception
|
1396 |
|
|
^^^^^^^^^^^^^^^^^^^^^^^^
|
1397 |
|
|
The DMR2[WGB] bits specify which watchpoints invoke breakpoint
|
1398 |
|
|
exception. By invoking breakpoint exception, target resident debugger can
|
1399 |
|
|
be built.
|
1400 |
|
|
|
1401 |
|
|
Breakpoint is broadcast on development interface on +dbg_bp_o+.
|
1402 |
|
|
|
1403 |
|
|
((Development Interface))
|
1404 |
|
|
~~~~~~~~~~~~~~~~~~~~~~~~~
|
1405 |
|
|
NOTE: The information in this section is to be reviewed. It is the author's
|
1406 |
|
|
opinion that the debug interface is now largely provided by the SPR mappings,
|
1407 |
|
|
and no special sideband functions exist aside from stalling and resetting
|
1408 |
|
|
the core.
|
1409 |
|
|
|
1410 |
|
|
An additional _development and debug interface IP_ core may be used to connect
|
1411 |
|
|
OpenRISC 1200 to standard debuggers using IEEE 1149.1 (JTAG) protocol.
|
1412 |
|
|
|
1413 |
|
|
((Debugging)) Through ((Development Interface))
|
1414 |
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
1415 |
|
|
The DSR special-purpose register specifies which exceptions cause the core
|
1416 |
|
|
to stop the execution of the exception handler and turn over control to
|
1417 |
|
|
development interface. It can be programmed by the resident debug software
|
1418 |
|
|
or by the development interface.
|
1419 |
|
|
|
1420 |
|
|
The DRR special-purpose register specifies which event caused the core to
|
1421 |
|
|
stop the execution of program flow and turn over control to the development
|
1422 |
|
|
interface. It should be cleared by the resident debug software or by the
|
1423 |
|
|
development interface.
|
1424 |
|
|
|
1425 |
|
|
The DIR special-purpose register is not implemented.
|
1426 |
|
|
|
1427 |
|
|
Reading PC, Load/Store EA, Load Data, Store Data, Instruction
|
1428 |
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
1429 |
|
|
Crucial information like ((program counter)) (PC), load/store effective
|
1430 |
|
|
address (LSEA), load data, store data and current instruction in execution
|
1431 |
|
|
pipeline can be asynchronously read through the development interface.
|
1432 |
|
|
|
1433 |
|
|
[[dev_commands_table]]
|
1434 |
|
|
.Development Interface Operation Commands
|
1435 |
|
|
[width="70%",options="header"]
|
1436 |
|
|
|========================
|
1437 |
|
|
| dbg_op_i[2:0] | Meaning
|
1438 |
|
|
| 0x0 | Reading Program Counter (PC)
|
1439 |
|
|
| 0x1 | Reading Load/Store Effective Address
|
1440 |
|
|
| 0x2 | Reading Load Data
|
1441 |
|
|
| 0x3 | Reading Store Data
|
1442 |
|
|
| 0x4 | Reading SPR
|
1443 |
|
|
| 0x5 | Writing SPR
|
1444 |
|
|
| 0x6 | Reading Instruction in Execution Pipeline
|
1445 |
|
|
| 0x7 | Reserved
|
1446 |
|
|
|========================
|
1447 |
|
|
|
1448 |
|
|
<<dev_commands_table>> lists operation commands that control what is read
|
1449 |
|
|
or written through development interface. All reads except reads and writes
|
1450 |
|
|
of SPRs are asynchronous.
|
1451 |
|
|
|
1452 |
|
|
Reading and Writing SPRs Through Development Interface
|
1453 |
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
1454 |
|
|
For reads and writes to SPRs, +dbg_op_i+ must be set to 0x4 and 0x5,
|
1455 |
|
|
respectively.
|
1456 |
|
|
|
1457 |
|
|
[[dev_interface_cycles_fig]]
|
1458 |
|
|
.Development Interface Cycles
|
1459 |
|
|
image::img/dev_interface_cycles.gif[scaledwidth="70%",align="center"]
|
1460 |
|
|
|
1461 |
|
|
<<dev_interface_cycles_fig>> shows development interface cycles. Writes must
|
1462 |
|
|
be synchronous to the main RISC clock positive edge and should take one clock
|
1463 |
|
|
cycle. Reads must take two clock cycles because access to synchronous cache
|
1464 |
|
|
lines or to TLB entries introduces one clock cycle of delay.
|
1465 |
|
|
|
1466 |
|
|
If required, external debugger can stop the CPU core by asserting
|
1467 |
|
|
+dbg_stall_i+. This way it can have enough time to read all interesting
|
1468 |
|
|
registers from the RISC or guarantee that writes into SPRs are performed
|
1469 |
|
|
without RISC writing to the same registers.
|
1470 |
|
|
|
1471 |
|
|
Tracking ((Data Flow))
|
1472 |
|
|
^^^^^^^^^^^^^^^^^^^^^^
|
1473 |
|
|
An external debugger can monitor and record data flow inside the RISC for
|
1474 |
|
|
debugging purposes and profiling analysis. This is accomplished by monitoring
|
1475 |
|
|
status of the load/store unit, load/store effective address and load/store
|
1476 |
|
|
data, all available at the development interface.
|
1477 |
|
|
|
1478 |
|
|
[[status_ldst_unit_table]]
|
1479 |
|
|
.Status of the Load/Store Unit
|
1480 |
|
|
[width="70%",options="header"]
|
1481 |
|
|
|============================================================
|
1482 |
|
|
| dbg_lss_o[3:0] | Load/Store Instruction in Execution
|
1483 |
|
|
| 0x0 | No load/store instruction in execution
|
1484 |
|
|
| 0x1 | Reserved for load doubleword
|
1485 |
|
|
| 0x2 | Load byte and zero extend
|
1486 |
|
|
| 0x3 | Load byte and sign extend
|
1487 |
|
|
| 0x4 | Load halfword and zero extend
|
1488 |
|
|
| 0x5 | Load halfword and sign extend
|
1489 |
|
|
| 0x6 | Load singleword and zero extend
|
1490 |
|
|
| 0x7 | Load singleword and sign extend
|
1491 |
|
|
| 0x8 | Reserved for store doubleword
|
1492 |
|
|
| 0x9 | Reserved
|
1493 |
|
|
| 0xA | Store byte
|
1494 |
|
|
| 0xB | Reserved
|
1495 |
|
|
| 0xC | Store halfword
|
1496 |
|
|
| 0xD | Reserved
|
1497 |
|
|
| 0xE | Store singleword
|
1498 |
|
|
| 0xF | Reserved
|
1499 |
|
|
|============================================================
|
1500 |
|
|
|
1501 |
|
|
External trace buffer can capture all interesting data flow
|
1502 |
|
|
events by analyzing status of the load/store unit available on
|
1503 |
|
|
+dbg_lss_o+. <<status_ldst_unit_table>> lists different status encoding for
|
1504 |
|
|
the load/store unit.
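
A trace tool could decode +dbg_lss_o+ with a simple lookup. The sketch below copies the encodings from the table above; the names `LSS_STATUS` and `describe_lss` are hypothetical and only illustrate the idea.

```python
# Status strings copied from the load/store unit status table.
LSS_STATUS = {
    0x0: "No load/store instruction in execution",
    0x1: "Reserved for load doubleword",
    0x2: "Load byte and zero extend",
    0x3: "Load byte and sign extend",
    0x4: "Load halfword and zero extend",
    0x5: "Load halfword and sign extend",
    0x6: "Load singleword and zero extend",
    0x7: "Load singleword and sign extend",
    0x8: "Reserved for store doubleword",
    0x9: "Reserved",
    0xA: "Store byte",
    0xB: "Reserved",
    0xC: "Store halfword",
    0xD: "Reserved",
    0xE: "Store singleword",
    0xF: "Reserved",
}


def describe_lss(dbg_lss_o):
    """Return the table description for a 4-bit dbg_lss_o sample."""
    return LSS_STATUS[dbg_lss_o & 0xF]


def is_store(dbg_lss_o):
    """True for the three implemented store encodings."""
    return (dbg_lss_o & 0xF) in (0xA, 0xC, 0xE)
```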
|
1505 |
|
|
|
1506 |
|
|
Tracking ((Program Flow))
|
1507 |
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^
|
1508 |
|
|
An external debugger can monitor and record program flow inside the RISC
|
1509 |
|
|
for debugging purposes and profiling analysis. This is accomplished by
|
1510 |
|
|
monitoring status of the instruction unit, PC and fetched instruction word,
|
1511 |
|
|
all available at the development interface.
|
1512 |
|
|
|
1513 |
|
|
[[status_inst_unit_table]]
|
1514 |
|
|
.Status of the Instruction Unit
|
1515 |
|
|
[width="70%",options="header"]
|
1516 |
|
|
|=========================================
|
1517 |
|
|
| dbg_is_o[1:0] | Instruction Fetch Status
|
1518 |
|
|
| 0x0 | No instruction fetch in progress
|
1519 |
|
|
| 0x1 | Normal instruction fetch
|
1520 |
|
|
| 0x2 | Executing branch instruction
|
1521 |
|
|
| 0x3 | Fetching instruction in delay slot
|
1522 |
|
|
|=========================================
|
1523 |
|
|
|
1524 |
|
|
External trace buffer can capture all interesting program flow
|
1525 |
|
|
events by analyzing status of the instruction unit available on
|
1526 |
|
|
+dbg_is_o+. <<status_inst_unit_table>> lists different status encoding for
|
1527 |
|
|
the instruction unit.
|
1528 |
|
|
|
1529 |
|
|
Triggering ((External Watchpoint Event))
|
1530 |
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
1531 |
|
|
<<watchpoint_trigger_fig>> shows how development interface can assert
|
1532 |
|
|
+dbg_ewt_i+ and cause watchpoint event. If programmed, external watchpoint
|
1533 |
|
|
event will cause a breakpoint exception.
|
1534 |
|
|
|
1535 |
|
|
[[watchpoint_trigger_fig]]
|
1536 |
|
|
.Assertion of External Watchpoint Trigger
|
1537 |
|
|
image::img/watchpoint_trigger.gif[scaledwidth="70%",align="center"]
|
1538 |
|
|
|
1539 |
|
|
((Registers))
|
1540 |
|
|
-------------
|
1541 |
|
|
This section describes all registers inside the OR1200 core. Shifting _GRP_
|
1542 |
|
|
number 11 bits left and adding _REG_ number computes the address of each
|
1543 |
|
|
special-purpose register. All registers are 32 bits wide from software
|
1544 |
|
|
perspective. _USER MODE_ and _SUPV MODE_ specify the valid access types for
|
1545 |
|
|
each register in user mode and supervisor mode of operation. R/W stands for
|
1546 |
|
|
read and write access and R stands for read only access.
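
The address computation described above is a shift and an add. The following is a minimal sketch; the function name `spr_address` is hypothetical, and the range checks assume a 5-bit group field and an 11-bit register field, which follows from a 16-bit SPR address with an 11-bit shift.

```python
def spr_address(group, reg):
    """Compute an SPR address from its group and register numbers."""
    if not 0 <= group < 32:
        raise ValueError("group number must fit in 5 bits")
    if not 0 <= reg < 2048:
        raise ValueError("register number must fit in 11 bits")
    return (group << 11) + reg
```

With the register list in this section, SR (group 0, register 17) is at `spr_address(0, 17)`, which is 0x0011, and PICMR (group 9, register 0) is at `spr_address(9, 0)`, which is 0x4800.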
|
1547 |
|
|
|
1548 |
|
|
((Registers list))
|
1549 |
|
|
~~~~~~~~~~~~~~~~~~
|
1550 |
|
|
[[regs_table]]
|
1551 |
|
|
.List of All Registers
|
1552 |
|
|
[width="95%",options="header"]
|
1553 |
|
|
|============================================================================
|
1554 |
|
|
| Grp # | Reg # | Reg Name | USER MODE | SUPV MODE | Description
|
1555 |
|
|
| 0 | 0 | ((VR)) | - | R | Version Register
|
1556 |
|
|
| 0 | 1 | ((UPR)) | - | R | Unit Present Register
|
1557 |
|
|
| 0 | 2 | ((CPUCFGR)) | - | R | CPU Configuration Register
|
1558 |
|
|
| 0 | 3 | ((DMMUCFGR)) | - | R | Data MMU Configuration Register
|
1559 |
|
|
| 0 | 4 | ((IMMUCFGR)) | - | R | Instruction MMU Configuration Register
|
1560 |
|
|
| 0 | 5 | ((DCCFGR)) | - | R | Data Cache Configuration Register
|
1561 |
|
|
| 0 | 6 | ((ICCFGR)) | - | R | Instruction Cache Configuration Register
|
1562 |
|
|
| 0 | 7 | ((DCFGR)) | - | R | Debug Configuration Register
|
1563 |
|
|
| 0 | 16 | ((PC)) | - | R/W | PC mapped to SPR space
|
1564 |
|
|
| 0 | 17 | ((SR)) | - | R/W | Supervision Register
|
1565 |
|
|
| 0 | 20 | ((FPCSR)) | - | R/W | FP Control Status Register
|
1566 |
|
|
| 0 | 32 | ((EPCR0)) | - | R/W | Exception PC Register
|
1567 |
|
|
| 0 | 48 | ((EEAR0)) | - | R/W | Exception EA Register
|
1568 |
|
|
| 0 | 64 | ((ESR0)) | - | R/W | Exception SR Register
|
1569 |
|
|
| 0 | 1024-1055 | ((GPR0-GPR31)) | - | R/W | GPRs mapped to SPR space
|
1570 |
|
|
| 1 | 2 | ((DTLBEIR)) | - | W | Data TLB Entry Invalidate Register
|
1571 |
|
|
| 1 | 1024-1151 | ((DTLBW0MR0-DTLBW0MR127)) | - | R/W | Data TLB Match Registers Way 0
|
1572 |
|
|
| 1 | 1536-1663 | ((DTLBW0TR0-DTLBW0TR127)) | - | R/W | Data TLB Translate Registers Way 0
|
1573 |
|
|
| 2 | 2 | ((ITLBEIR)) | - | W | Instruction TLB Entry Invalidate Register
|
1574 |
|
|
| 2 | 1024-1151 | ((ITLBW0MR0-ITLBW0MR127)) | - | R/W | Instruction TLB Match Registers Way 0
|
1575 |
|
|
| 2 | 1536-1663 | ((ITLBW0TR0-ITLBW0TR127)) | - | R/W | Instruction TLB Translate Registers Way 0
|
1576 |
|
|
| 3 | 0 | ((DCCR)) | - | R/W | DC Control Register
|
1577 |
|
|
| 3 | 2 | ((DCBFR)) | W | W | DC Block Flush Register
|
1578 |
|
|
| 3 | 3 | ((DCBIR)) | W | W | DC Block Invalidate Register
|
1579 |
|
|
| 3 | 4 | ((DCBWR)) | W | W | DC Block Write-back register
|
1580 |
|
|
| 4 | 0 | ((ICCR)) | - | R/W | IC Control Register
|
1581 |
|
|
| 4 | 256 | ((ICBIR)) | W | W | IC Block Invalidate Register
|
1582 |
|
|
| 5 | 256 | ((MACLO)) | R/W | R/W | MAC Low
|
1583 |
|
|
| 5 | 257 | ((MACHI)) | R/W | R/W | MAC High
|
1584 |
|
|
| 6 | 16 | ((DMR1)) | - | R/W | Debug Mode Register 1
|
1585 |
|
|
| 6 | 17 | ((DMR2)) | - | R/W | Debug Mode Register 2
|
1586 |
|
|
| 6 | 20 | ((DSR)) | - | R/W | Debug Stop Register
|
1587 |
|
|
| 6 | 21 | ((DRR)) | - | R/W | Debug Reason Register
|
1588 |
|
|
| 8 | 0 | ((PMR)) | - | R/W | Power Management Register
|
1589 |
|
|
| 9 | 0 | ((PICMR)) | - | R/W | PIC Mask Register
|
1590 |
|
|
| 9 | 2 | ((PICSR)) | - | R/W | PIC Status Register
|
1591 |
|
|
| 10 | 0 | ((TTMR)) | - | R/W | Tick Timer Mode Register
|
1592 |
|
|
| 10 | 1 | ((TTCR)) | R* | R/W | Tick Timer Count Register
|
1593 |
|
|
|============================================================================
|
1594 |
|
|
|
1595 |
|
|
<<regs_table>> lists all OpenRISC 1000 special-purpose registers implemented
|
1596 |
|
|
in OR1200. Registers VR and UPR are described below. For description of
|
1597 |
|
|
other registers refer to <>.
|
1598 |
|
|
|
1599 |
|
|
Register VR description
|
1600 |
|
|
~~~~~~~~~~~~~~~~~~~~~~~
|
1601 |
|
|
Special-purpose register VR identifies the version (model) and revision
|
1602 |
|
|
level of the OpenRISC 1000 processor. It also specifies possible standard
|
1603 |
|
|
template on which this implementation is based.
|
1604 |
|
|
(((Register,VR)))
|
1605 |
|
|
|
1606 |
|
|
[[vr_reg_table]]
|
1607 |
|
|
.VR Register
|
1608 |
|
|
[width="95%",options="header"]
|
1609 |
|
|
|============================================================
|
1610 |
|
|
| Bit # | Access | Reset | Short Name | Description
|
1611 |
|
|
| 5:0 | R | Revision | REV | Revision number of this document.
|
1612 |
|
|
| 15:6 | R | 0x0 | - | Reserved
|
1613 |
|
|
| 23:16 | R | 0x00 | CFG | Configuration should be read from UPR and configuration registers
|
1614 |
|
|
| 31:24 | R | 0x12 | VER | Version number for OR1200 is fixed at 0x12.
|
1615 |
|
|
|============================================================
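
Software that reads VR can split it into the fields listed above. This is a sketch only: `decode_vr` is a hypothetical helper, and the sample value used in the note below is made up for illustration.

```python
def decode_vr(vr):
    """Split a 32-bit VR value into the fields from the VR table."""
    return {
        "REV": vr & 0x3F,          # bits 5:0
        "CFG": (vr >> 16) & 0xFF,  # bits 23:16
        "VER": (vr >> 24) & 0xFF,  # bits 31:24
    }
```

For a hypothetical reading of 0x12000008, the helper returns VER equal to 0x12, CFG equal to 0x00 and REV equal to 8.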
|
1616 |
|
|
|
1617 |
|
|
Register UPR description
|
1618 |
|
|
~~~~~~~~~~~~~~~~~~~~~~~~
|
1619 |
|
|
Special-purpose register UPR identifies the units present in the processor. It
|
1620 |
|
|
has a bit for each implemented unit or functionality. Lower sixteen bits
|
1621 |
|
|
identify present units defined in the OpenRISC 1000 architecture. Upper
|
1622 |
|
|
sixteen bits define present custom units.
|
1623 |
|
|
(((Register,UPR)))
|
1624 |
|
|
|
1625 |
|
|
[[upr_reg_table]]
|
1626 |
|
|
.UPR Register
|
1627 |
|
|
[width="95%",options="header"]
|
1628 |
|
|
|============================================================
|
1629 |
|
|
| Bit # | Access | Reset | Short Name | Description
|
1630 |
|
|
| 0 | R | 1 | UP | UPR present
|
1631 |
|
|
| 1 | R | 1 | DCP | Data cache present[†]
|
1632 |
|
|
| 2 | R | 1 | ICP | Instruction cache present[†]
|
1633 |
|
|
| 3 | R | 1 | DMP | Data MMU present[†]
|
1634 |
|
|
| 4 | R | 1 | IMP | Instruction MMU present[†]
|
1635 |
|
|
| 5 | R | 1 | MP | MAC present[†]
|
1636 |
|
|
| 6 | R | 1 | DUP | Debug unit present[†]
|
1637 |
|
|
| 7 | R | 0 | PCUP | Performance counters unit not present[†]
|
1638 |
|
|
| 8 | R | 1 | PMP | Power Management Present[†]
|
1639 |
|
|
| 9 | R | 1 | PICP | Programmable interrupt controller present
|
1640 |
|
|
| 10 | R | 1 | TTP | Tick timer present
|
1641 |
|
|
| 11 | R | 1 | FPP | Floating point present[†]
|
1642 |
|
|
| 23:12 | R | X | - | Reserved
|
1643 |
|
|
| 31:24 | R | 0xXX | CUP | The user of the OR1200 core adds custom units.
|
1644 |
|
|
|============================================================
|
1645 |
|
|
[†]: if enabled at synthesis time
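
Because several units are optional at synthesis time, software typically probes UPR before touching a unit. A minimal sketch follows; the names `UPR_BITS` and `present_units` are hypothetical, and the bit order is taken from the table above.

```python
# Short names for UPR bits 0 through 11, in table order.
UPR_BITS = [
    "UP", "DCP", "ICP", "DMP", "IMP", "MP",
    "DUP", "PCUP", "PMP", "PICP", "TTP", "FPP",
]


def present_units(upr):
    """Return the short names of all units whose UPR bit is set."""
    return [name for bit, name in enumerate(UPR_BITS) if (upr >> bit) & 1]
```

A caller might check `"DCP" in present_units(upr)` before enabling the data cache in SR.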
|
1646 |
|
|
|
1647 |
|
|
Register CPUCFGR description
|
1648 |
|
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
1649 |
|
|
Special-purpose register CPUCFGR identifies the capabilities and configuration
|
1650 |
|
|
of the CPU.
|
1651 |
|
|
(((Register,CPUCFGR)))
|
1652 |
|
|
|
1653 |
|
|
[[cpucfgr_reg_table]]
|
1654 |
|
|
.CPUCFGR Register
|
1655 |
|
|
[width="95%",options="header"]
|
1656 |
|
|
|============================================================
|
1657 |
|
|
| Bit # | Access | Reset | Short Name | Description
|
1658 |
|
|
| 3:0 | R | 0x0 | NSGF | Zero number of shadow GPR files
|
1659 |
|
|
| 4 | R | 0 | HGF | No half GPR files[†]
|
1660 |
|
|
| 5 | R | 1 | OB32S | ORBIS32 supported
|
1661 |
|
|
| 6 | R | 0 | OB64S | ORBIS64 not supported
|
1662 |
|
|
| 7 | R | 1 | OF32S | ORFPX32 supported[‡]
|
1663 |
|
|
| 8 | R | 0 | OF64S | ORFPX64 not supported
|
1664 |
|
|
| 9 | R | 0 | OV64S | ORVDX64 not supported
|
1665 |
|
|
|============================================================
|
1666 |
|
|
[†]: If disabled at synthesis time
|
1667 |
|
|
|
1668 |
|
|
[‡]: If FPU enabled at synthesis time
|
1669 |
|
|
|
1670 |
|
|
Register DMMUCFGR description
|
1671 |
|
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
1672 |
|
|
Special-purpose register DMMUCFGR identifies the capabilities and configuration
|
1673 |
|
|
of the DMMU.
|
1674 |
|
|
(((Register,DMMUCFGR)))
|
1675 |
|
|
|
1676 |
|
|
[[dmmucfgr_reg_table]]
|
1677 |
|
|
.DMMUCFGR Register
|
1678 |
|
|
[width="95%",options="header"]
|
1679 |
|
|
|============================================================
|
1680 |
|
|
| Bit # | Access | Reset | Short Name | Description
|
1681 |
|
|
| 1:0 | R | 0x0 | NTW | One DTLB way
|
1682 |
|
|
| 4:2 | R | 0x4 - 0x7 | NTS | 16, 32, 64 or 128 DTLB sets
|
1683 |
|
|
| 7:5 | R | 0x0 | NAE | No ATB Entries
|
1684 |
|
|
| 8 | R | 0 | CRI | No DMMU control register implemented
|
1685 |
|
|
| 9 | R | 0 | PRI | No protection register implemented
|
1686 |
|
|
| 10 | R | 1 | TEIRI | DTLB entry invalidate register implemented
|
1687 |
|
|
| 11 | R | 0 | HTR | No hardware DTLB reload
|
1688 |
|
|
|============================================================
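
The NTW and NTS fields are encoded rather than stored as plain counts. The sketch below assumes, consistently with the table (NTW of 0x0 means one way, NTS of 0x4 to 0x7 means 16 to 128 sets), that the way count is NTW plus one and the set count is two to the power of NTS. The function name `decode_dmmucfgr` is hypothetical.

```python
def decode_dmmucfgr(value):
    """Decode selected DMMUCFGR fields into counts and flags."""
    ntw = value & 0x3         # bits 1:0
    nts = (value >> 2) & 0x7  # bits 4:2
    return {
        "ways": ntw + 1,      # assumed encoding: 0x0 means one way
        "sets": 1 << nts,     # assumed encoding: 0x4 means 16 sets
        "TEIRI": bool((value >> 10) & 1),
        "HTR": bool((value >> 11) & 1),
    }
```

The same decoding applies to IMMUCFGR, whose field layout is identical in this specification.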
|
1689 |
|
|
|
1690 |
|
|
Register IMMUCFGR description
|
1691 |
|
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
1692 |
|
|
Special-purpose register IMMUCFGR identifies the capabilities and configuration
|
1693 |
|
|
of the IMMU.
|
1694 |
|
|
(((Register,IMMUCFGR)))
|
1695 |
|
|
|
1696 |
|
|
[[immucfgr_reg_table]]
|
1697 |
|
|
.IMMUCFGR Register
|
1698 |
|
|
[width="95%",options="header"]
|
1699 |
|
|
|============================================================
|
1700 |
|
|
| Bit # | Access | Reset | Short Name | Description
|
1701 |
|
|
| 1:0 | R | 0x0 | NTW | One ITLB way
|
1702 |
|
|
| 4:2 | R | 0x4 - 0x7 | NTS | 16, 32, 64 or 128 ITLB sets
|
1703 |
|
|
| 7:5 | R | 0x0 | NAE | No ATB Entries
|
1704 |
|
|
| 8 | R | 0 | CRI | No IMMU control register implemented
|
1705 |
|
|
| 9 | R | 0 | PRI | No protection register implemented
|
1706 |
|
|
| 10 | R | 1 | TEIRI | ITLB entry invalidate register implemented
|
1707 |
|
|
| 11 | R | 0 | HTR | No hardware ITLB reload
|
1708 |
|
|
|============================================================
|
1709 |
|
|
|
1710 |
|
|
Register DCCFGR description
|
1711 |
|
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
1712 |
|
|
Special-purpose register DCCFGR identifies the capabilities and configuration
|
1713 |
|
|
of the data cache.
|
1714 |
|
|
(((Register,DCCFGR)))
|
1715 |
|
|
|
1716 |
|
|
[[dccfgr_reg_table]]
|
1717 |
|
|
.DCCFGR Register
|
1718 |
|
|
[width="95%",options="header"]
|
1719 |
|
|
|============================================================
|
1720 |
|
|
| Bit # | Access | Reset | Short Name | Description
|
1721 |
|
|
| 2:0 | R | 0x0 | NCW | One DC way
|
1722 |
|
|
| 6:3 | R | 0x4 - 0x7 | NCS | 16, 32, 64 or 128 DC sets
|
1723 |
|
|
| 7 | R | 0x0 | CBS | 16-byte cache block size
|
1724 |
|
|
| 8 | R | 0 | CWS | Cache write-through strategy[†]
|
1725 |
|
|
| 9 | R | 1 | CCRI | DC control register implemented
|
1726 |
|
|
| 10 | R | 1 | CBIRI | DC block invalidate register implemented
|
1727 |
|
|
| 11 | R | 0 | CBPRI | DC block prefetch register not implemented
|
1728 |
|
|
| 12 | R | 0 | CBLRI | DC block lock register not implemented
|
1729 |
|
|
| 13 | R | 1 | CBFRI | DC block flush register implemented
|
1730 |
|
|
| 14 | R | 1 | CBWBRI | DC block write-back register implemented[‡]
|
1731 |
|
|
|============================================================
|
1732 |
|
|
[†]: If disabled at synthesis time
|
1733 |
|
|
|
1734 |
|
|
[‡]: If FPU enabled at synthesis time
|
1735 |
|
|
|
1736 |
|
|
Register ICCFGR description
|
1737 |
|
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
1738 |
|
|
Special-purpose register ICCFGR identifies the capabilities and configuration
|
1739 |
|
|
of the instruction cache.
|
1740 |
|
|
(((Register,ICCFGR)))
|
1741 |
|
|
|
1742 |
|
|
[[iccfgr_reg_table]]
|
1743 |
|
|
.ICCFGR Register
|
1744 |
|
|
[width="95%",options="header"]
|
1745 |
|
|
|============================================================
|
1746 |
|
|
| Bit # | Access | Reset | Short Name | Description
|
1747 |
|
|
| 2:0 | R | 0x0 | NCW | One IC way
|
1748 |
|
|
| 6:3 | R | 0x4 - 0x7 | NCS | 16, 32, 64 or 128 IC sets
|
1749 |
|
|
| 7 | R | 0x0 | CBS | 16-byte cache block size
|
1750 |
|
|
| 8 | R | 0 | CWS | Cache write-through strategy
|
1751 |
|
|
| 9 | R | 1 | CCRI | IC control register implemented
|
1752 |
|
|
| 10 | R | 1 | CBIRI | IC block invalidate register implemented
|
1753 |
|
|
| 11 | R | 0 | CBPRI | IC block prefetch register not implemented
|
1754 |
|
|
| 12 | R | 0 | CBLRI | IC block lock register not implemented
|
1755 |
|
|
| 13 | R | 1 | CBFRI | IC block flush register implemented
|
1756 |
|
|
| 14 | R | 0 | CBWBRI | IC block write-back register not implemented
|
1757 |
|
|
|============================================================
|
1758 |
|
|
|
1759 |
|
|
Register DCFGR description
|
1760 |
|
|
~~~~~~~~~~~~~~~~~~~~~~~~~~
|
1761 |
|
|
Special-purpose register DCFGR identifies the capabilities and configuration
|
1762 |
|
|
of the debug unit.
|
1763 |
|
|
(((Register,DCFGR)))
|
1764 |
|
|
|
1765 |
|
|
[[dcfgr_reg_table]]
|
1766 |
|
|
.DCFGR Register
|
1767 |
|
|
[width="95%",options="header"]
|
1768 |
|
|
|============================================================
|
1769 |
|
|
| Bit # | Access | Reset | Short Name | Description
|
1770 |
|
|
| 3:0 | R | 0x0 | NDP | Zero DVR/DCR pairs[†]
|
1771 |
|
|
| 4 | R | 0 | WPCI | Watchpoint counters not implemented
|
1772 |
|
|
|============================================================
|
1773 |
|
|
[†]: If hardware breakpoints disabled at synthesis time
|
1774 |
|
|
|
1775 |
|
|
((IO ports))
|
1776 |
|
|
------------
|
1777 |
|
|
OR1200 IP core has several interfaces. <<core_interfaces_fig>> below shows
|
1778 |
|
|
all interfaces:
|
1779 |
|
|
|
1780 |
|
|
* Instruction and data WISHBONE host interfaces
|
1781 |
|
|
* Power management interface
|
1782 |
|
|
* Development interface
|
1783 |
|
|
* Interrupts interface
|
1784 |
|
|
|
1785 |
|
|
[[core_interfaces_fig]]
|
1786 |
|
|
.Core's Interfaces
|
1787 |
|
|
image::img/core_interfaces.gif[scaledwidth="50%",align="center"]
|
1788 |
|
|
|
1789 |
|
|
Instruction WISHBONE Master Interface
|
1790 |
|
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
1791 |
|
|
OR1200 has two master WISHBONE Rev B compliant interfaces. Instruction
|
1792 |
|
|
interface is used to connect OR1200 core to memory subsystem for purpose of
|
1793 |
|
|
fetching instructions or instruction cache lines.
|
1794 |
|
|
|
1795 |
|
|
[[inst_wb_master_table]]
|
1796 |
|
|
.Instruction WISHBONE Master Interface Signals
|
1797 |
|
|
[width="95%",options="header"]
|
1798 |
|
|
|====================================================
|
1799 |
|
|
| Port | Width | Direction | Description
|
1800 |
|
|
| ((iwb_CLK_I)) | 1 | Input | Clock input
|
1801 |
|
|
| ((iwb_RST_I)) | 1 | Input | Reset input
|
1802 |
|
|
| ((iwb_CYC_O)) | 1 | Output | Indicates valid bus cycle (core select)
|
1803 |
|
|
| ((iwb_ADR_O)) | 32 | Outputs | Address outputs
|
1804 |
|
|
| ((iwb_DAT_I)) | 32 | Inputs | Data inputs
|
1805 |
|
|
| ((iwb_DAT_O)) | 32 | Outputs | Data outputs
|
1806 |
|
|
| ((iwb_SEL_O)) | 4 | Outputs | Indicates valid bytes on data bus (during valid cycle it must be 0xf)
|
1807 |
|
|
| ((iwb_ACK_I)) | 1 | Input | Acknowledgment input (indicates normal transaction termination)
|
1808 |
|
|
| ((iwb_ERR_I)) | 1 | Input | Error acknowledgment input (indicates an abnormal transaction termination)
|
1809 |
|
|
| ((iwb_RTY_I)) | 1 | Input | In OR1200 treated same way as iwb_ERR_I.
|
1810 |
|
|
| ((iwb_WE_O)) | 1 | Output | Write transaction when asserted high
|
1811 |
|
|
| ((iwb_STB_O)) | 1 | Output | Indicates valid data transfer cycle
|
1812 |
|
|
|====================================================
|
1813 |
|
|
|
1814 |
|
|
Data WISHBONE Master Interface
|
1815 |
|
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
1816 |
|
|
OR1200 has two master WISHBONE Rev B compliant interfaces. Data interface
|
1817 |
|
|
is used to connect OR1200 core to external peripherals and memory subsystem
|
1818 |
|
|
for purpose of reading and writing data or data cache lines.
|
1819 |
|
|
|
1820 |
|
|
[[data_wb_master_table]]
|
1821 |
|
|
.Data WISHBONE Master Interface Signals
|
1822 |
|
|
[width="95%",options="header"]
|
1823 |
|
|
|====================================================
|
1824 |
|
|
| Port | Width | Direction | Description
|
1825 |
|
|
| ((dwb_CLK_I)) | 1 | Input | Clock input
|
1826 |
|
|
| ((dwb_RST_I)) | 1 | Input | Reset input
|
1827 |
|
|
| ((dwb_CYC_O)) | 1 | Output | Indicates valid bus cycle (core select)
|
1828 |
|
|
| ((dwb_ADR_O)) | 32 | Outputs | Address outputs
|
1829 |
|
|
| ((dwb_DAT_I)) | 32 | Inputs | Data inputs
|
1830 |
|
|
| ((dwb_DAT_O)) | 32 | Outputs | Data outputs
|
1831 |
|
|
| ((dwb_SEL_O)) | 4 | Outputs | Indicates valid bytes on data bus (during valid cycle it must be 0xf)
|
1832 |
|
|
| ((dwb_ACK_I)) | 1 | Input | Acknowledgment input (indicates normal transaction termination)
|
1833 |
|
|
| ((dwb_ERR_I)) | 1 | Input | Error acknowledgment input (indicates an abnormal transaction termination)
|
1834 |
|
|
| ((dwb_RTY_I)) | 1 | Input | In OR1200 treated same way as dwb_ERR_I.
|
1835 |
|
|
| ((dwb_WE_O)) | 1 | Output | Write transaction when asserted high
|
1836 |
|
|
| ((dwb_STB_O)) | 1 | Outputs | Indicates valid data transfer cycle
|
1837 |
|
|
|====================================================
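The termination inputs listed in the table above can be modelled in a few lines. The following Python sketch is illustrative only and is not derived from the OR1200 RTL: the function names `classify_termination` and `sel_is_full_word` and the returned strings are hypothetical. It relies only on what the table states, namely that dwb_ACK_I signals normal termination, dwb_ERR_I signals abnormal termination, and dwb_RTY_I is treated like dwb_ERR_I. Giving the error inputs priority when several inputs are asserted at once is an assumption of this sketch.

```python
def classify_termination(ack: int, err: int, rty: int) -> str:
    """Classify one bus cycle from the sampled termination inputs.

    Returns "normal", "abnormal", or "pending" (no termination seen yet).
    Hypothetical helper for illustration; not taken from the RTL.
    """
    # RTY is folded into ERR, as the signal table describes for OR1200.
    # Checking the error inputs first is an assumption of this sketch.
    if err or rty:
        return "abnormal"
    if ack:
        return "normal"
    return "pending"


def sel_is_full_word(sel: int) -> bool:
    """True when all four byte lanes of the 32-bit bus are marked valid (0xf)."""
    return (sel & 0xF) == 0xF
```

A testbench model of a slave could use `classify_termination` to check that a retry response is handled exactly like an error response.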

System Interface
~~~~~~~~~~~~~~~~
The system interface connects reset, clock and other system signals to the
OR1200 core.

[[sys_interface_table]]
.System Interface Signals
[width="95%",options="header"]
|====================================================
| Port | Width | Direction | Description
| ((Rst)) | 1 | Input | Asynchronous reset
| ((clk_cpu)) | 1 | Input | Main clock input to the RISC
| ((clk_dc)) | 1 | Input | Data cache clock
| ((clk_ic)) | 1 | Input | Instruction cache clock
| ((clk_dmmu)) | 1 | Input | Data MMU clock
| ((clk_immu)) | 1 | Input | Instruction MMU clock
| ((clk_tt)) | 1 | Input | Tick timer clock
|====================================================

Development Interface
~~~~~~~~~~~~~~~~~~~~~
The development interface connects an external development port to the RISC's
internal debug facility. The debug facility allows control over program
execution inside the RISC, setting of breakpoints and watchpoints, and tracing
of instruction and data flows.

[[dev_interface_table]]
.Development Interface
[width="95%",options="header"]
|====================================================
| Port | Width | Direction | Description
| ((dbg_dat_o)) | 32 | Output | Transfer of data from RISC to external development interface
| ((dbg_dat_i)) | 32 | Input | Transfer of data from external development interface to RISC
| ((dbg_adr_i)) | 32 | Input | Address of special-purpose register to be read or written
| ((dbg_op_I)) | 3 | Input | Operation select for development interface
| ((dbg_lss_o)) | 4 | Output | Status of load/store unit
| ((dbg_is_o)) | 2 | Output | Status of instruction fetch unit
| ((dbg_wp_o)) | 11 | Output | Status of watchpoints
| ((dbg_bp_o)) | 1 | Output | Status of the breakpoint
| ((dbg_stall_i)) | 1 | Input | Stalls RISC CPU core
| ((dbg_ewt_i)) | 1 | Input | External watchpoint trigger
|====================================================

Power Management Interface
~~~~~~~~~~~~~~~~~~~~~~~~~~
The power management interface provides signals for interfacing the RISC core
with external power management circuitry. External power management circuitry
is required to implement functions that are technology specific and cannot be
implemented inside the OR1200 core.

[[pow_mgmt_interface_table]]
.Power Management Interface
[width="95%",options="header"]
|============================================================================
| Port | Width | Direction | Generation | Description
| ((pm_clksd)) | 4 | Output | Static (in SW) | Slow down outputs that control reduction of RISC clock frequency
| ((pm_cpustall)) | 1 | Input | - | Synchronous stall of the RISC’s CPU core
| ((pm_dc_gate)) | 1 | Output | Dynamic (in HW) | Gating of data cache clock
| ((pm_ic_gate)) | 1 | Output | Dynamic (in HW) | Gating of instruction cache clock
| ((pm_dmmu_gate)) | 1 | Output | Dynamic (in HW) | Gating of data MMU clock
| ((pm_immu_gate)) | 1 | Output | Dynamic (in HW) | Gating of instruction MMU clock
| ((pm_tt_gate)) | 1 | Output | Dynamic (in HW) | Gating of tick timer clock
| ((pm_cpu_gate)) | 1 | Output | Static (in SW) | Gating of main CPU clock
| ((pm_wakeup)) | 1 | Output | Dynamic (in HW) | Activate all clocks
| ((pm_lvolt)) | 1 | Output | Static (in SW) | Lower voltage
|============================================================================

Interrupt Interface
~~~~~~~~~~~~~~~~~~~
The interrupt interface has interrupt inputs for connecting external
peripherals' interrupt outputs to the RISC core. All interrupt inputs are
evaluated on the positive edge of the main RISC clock.

[[interrupt_interface_table]]
.Interrupt Interface
[width="95%",options="header"]
|============================================================
| Port | Width | Direction | Description
| ((pic_ints)) | PIC_INTS | Input | External interrupts
|============================================================
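Because the interrupt inputs are evaluated on the positive edge of the main RISC clock, the core only sees the level present at each edge. The Python sketch below illustrates that point with a purely behavioural model. It assumes plain edge sampling with no latching ahead of the sampling point, and it ignores setup and hold requirements. The helper names `sample_on_rising_edges` and `pulse`, the time units and the edge positions are all hypothetical.

```python
def sample_on_rising_edges(level_at, period, n_edges):
    """Return the input level seen at each rising clock edge.

    level_at: function mapping a time to 0 or 1 (the interrupt line level).
    period:   clock period, in the same time unit as level_at.
    Rising edges are assumed to occur at t = 0, period, 2 * period, ...
    """
    return [level_at(k * period) for k in range(n_edges)]


def pulse(start, end):
    """Build a level function that is 1 for start <= t < end, else 0."""
    return lambda t: 1 if start <= t < end else 0


# With a 10-unit clock period, a pulse that covers the edge at t = 20 is
# observed, while a pulse that fits entirely between two edges is not.
seen = sample_on_rising_edges(pulse(18, 23), period=10, n_edges=4)
missed = sample_on_rising_edges(pulse(12, 17), period=10, n_edges=4)
print(seen)    # [0, 0, 1, 0]
print(missed)  # [0, 0, 0, 0]
```

Under these assumptions, a peripheral should hold its interrupt output long enough to span at least one rising edge of the main clock.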



[appendix]
Core HW Configuration
=====================
(((Hardware,Configuration)))
This section describes the parameters that define the configuration of the
core. The user must set these parameters before using the core in simulation
or synthesis.

[[core_hw_conf_table]]
.Core HW configuration table
[width="95%",options="header"]
|============================================================
| Variable Name | Range | Default | Description
| ((EADDR_WIDTH)) | 32 | 32 | Effective address width
| ((VADDR_WIDTH)) | 32 | 32 | Virtual address width
| ((PADDR_WIDTH)) | 24 - 36 | 32 | Physical address width
| ((DATA_WIDTH)) | 32 | 32 | Data width / Operation width
| ((DC_IMPL)) | 0 - 1 | 1 | Data cache implementation
| ((DC_SETS)) | 256 - 1024 | 512 | Data cache number of sets
| ((DC_WAYS)) | 1 | 1 | Data cache number of ways
| ((DC_LINE)) | 16 - 32 | 16 | Data cache line size in bytes
| ((IC_IMPL)) | 0 - 1 | 1 | Instruction cache implementation
| ((IC_SETS)) | 32 - 1024 | 512 | Instruction cache number of sets
| ((IC_WAYS)) | 1 | 1 | Instruction cache number of ways
| ((IC_LINE)) | 16 - 32 | 16 | Instruction cache line size in bytes
| ((DMMU_IMPL)) | 0 - 1 | 1 | Data MMU implementation
| ((DTLB_SETS)) | 64 | 64 | Data TLB number of sets
| ((DTLB_WAYS)) | 1 | 1 | Data TLB number of ways
| ((IMMU_IMPL)) | 0 - 1 | 1 | Instruction MMU implementation
| ((ITLB_SETS)) | 64 | 64 | Instruction TLB number of sets
| ((ITLB_WAYS)) | 1 | 1 | Instruction TLB number of ways
| ((PIC_INTS)) | 2 - 32 | 20 | Number of interrupt inputs
|============================================================
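As a worked example of how these parameters combine, the Python sketch below checks a configuration against the bounds in the table and derives the cache capacity as sets x ways x line size. It is a minimal illustration and not part of the OR1200 distribution: the names `CONFIG_TABLE`, `default_config`, `out_of_range` and `cache_bytes` are hypothetical, and both cache line sizes are assumed to be in bytes. Only the numeric bounds are checked. Any further constraint, such as power-of-two set counts, is not stated in the table and is not modelled.

```python
# Bounds copied from the Core HW configuration table: (minimum, maximum, default).
CONFIG_TABLE = {
    "PADDR_WIDTH": (24, 36, 32),
    "DC_SETS": (256, 1024, 512),
    "DC_WAYS": (1, 1, 1),
    "DC_LINE": (16, 32, 16),
    "IC_SETS": (32, 1024, 512),
    "IC_WAYS": (1, 1, 1),
    "IC_LINE": (16, 32, 16),
    "PIC_INTS": (2, 32, 20),
}


def default_config() -> dict:
    """Return a dict holding the default value of every listed parameter."""
    return {name: default for name, (_, _, default) in CONFIG_TABLE.items()}


def out_of_range(cfg: dict) -> list:
    """Return the names of parameters whose value falls outside the table bounds."""
    return [
        name
        for name, (low, high, _) in CONFIG_TABLE.items()
        if not low <= cfg[name] <= high
    ]


def cache_bytes(sets: int, ways: int, line_bytes: int) -> int:
    """Cache data capacity in bytes: sets x ways x bytes per line."""
    return sets * ways * line_bytes


cfg = default_config()
print(out_of_range(cfg))  # []
print(cache_bytes(cfg["DC_SETS"], cfg["DC_WAYS"], cfg["DC_LINE"]))  # 8192
```

With the default values, each cache therefore holds 512 x 1 x 16 = 8192 bytes of data under the stated assumption about line size units.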

:numbered!:

[bibliography]
((Bibliography))
================
[bibliography]
- [[[or1000_manual]]] Damjan Lampret et al. 'OpenRISC 1000 System Architecture
Manual'. 2004.

[index]
Index
=====
// The index is generated automatically by the DocBook toolchain.