OpenCores
URL https://opencores.org/ocsvn/openrisc/openrisc/trunk

Source file: openrisc/trunk/or1200/doc/openrisc1200_spec.txt (rev 808)

OpenRISC 1200 IP Core Specification (Preliminary Draft)
=======================================================
:doctype: book

////
Revision history
Note: When adding new entries, strictly follow the format of the existing ones.

Rev.    | Date          | Author        | Description
__vstart__
v0.1    | 28/3/01       | Damjan Lampret        | First Draft

v0.2    | 16/4/01       | Damjan Lampret        | First time published

v0.3    | 29/4/01       | Damjan Lampret        | All chapters almost
finished. Some bugs hidden waiting for an update. Awaiting feedback.

v0.4    | 16/5/01       | Damjan Lampret        | Synchronization with
OR1K Arch Manual

v0.5    | 24/5/01       | Damjan Lampret        | Fixed bugs

v0.6    | 28/5/01       | Damjan Lampret        | Changed some SPR addresses.

v0.7    | 06/9/01       | Damjan Lampret        | Simplified debug unit.

v0.8    | 30/08/10      | Julius Baxter         | Adding information about FPU
implementation, data cache write-back capability. PIC behavior update.
Instruction list update. Update of bits in config registers, bringing into
line with latest OR1200 - not entirely complete.

v0.9    | 12/9/10       | Julius Baxter         | Clarified supported parts of
OR1K instruction set. Updated core clock input information.
Fixed up reference to instruction execute stage cycle table.
Added divide cycles to execute stage cycle table.

v0.10   | 1/11/10       | Julius Baxter         | Added FF1/FL1 instructions to
supported instructions table.

v0.11   | 19/1/11       | Julius Baxter | Cache information update.
Wishbone behavior clarification. Serial integer multiply/divide update.
Reset address clarification

v0.12   | 13/9/11       | Julius Baxter | Addition of extension instructions
l.extbs, l.extbz, l.exths, l.exthz, l.extws and l.extwz. Range exception
support, overflow bit in supervision register.
v0.13   | 27/5/12       | Julius Baxter | Addition of support for delay-slot
exception indicator bit in supervision register
__vend__
////

Introduction
------------
The purpose of this document is to define the specification of the OpenRISC
1200 implementation. This specification defines all implementation-specific
variables that are not part of the general architecture specification. This
includes the type and size of the data and instruction caches, the type and
size of the data and instruction MMUs, details of all execution pipelines, and
the implementation of the exception unit, interrupt controller and other
supplemental units.
This document does not cover general architecture topics such as the
instruction set, memory addressing modes and other architectural definitions.
See <> for more information about the architecture.

OpenRISC Family
~~~~~~~~~~~~~~~
(((OpenRISC,Family)))
OpenRISC 1000 is an architecture for a family of free, open source RISC
processor cores. As an architecture, OpenRISC 1000 allows for a spectrum of
chip and system implementations at a variety of price/performance points for a
range of applications. It is a 32/64-bit load and store RISC architecture
designed with emphasis on performance, simplicity, low power requirements,
scalability and versatility. The OpenRISC 1000 architecture targets medium and
high performance networking, embedded, automotive and portable computer
environments.

image::img/or_family.gif[scaledwidth="50%",align="center"]

All OpenRISC implementations whose identification number begins with the
digit 1 belong to the OpenRISC 1000 family. The second digit defines which
features of the OpenRISC 1000 architecture are implemented and in which way
they are implemented. The last two digits define how an implementation is
configured before it is used in a real application.

However, at present the OR1200 is the only major RTL implementation of the
OR1K architecture spec, and the OR1200 name has stuck, despite the high level
of reconfigurability possible that would, strictly speaking, mean the core
is either an OR1000, OR1300, etc. So, despite the various features that may
or may not be implemented, the core is still only referred to as the OR1200.
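
The numbering scheme described above can be illustrated with a short Python
sketch. It only splits a four-digit identification number into the three
fields named in the text; the function name and the dictionary keys are
hypothetical and are not taken from any OpenRISC tool.

```python
def decode_openrisc_id(ident: str) -> dict:
    """Split a four-digit OpenRISC identification number into its fields."""
    if len(ident) != 4 or not ident.isdecimal():
        raise ValueError("expected a four-digit identification number")
    return {
        "family": int(ident[0]),       # 1 means the OpenRISC 1000 family
        "feature_set": int(ident[1]),  # which architecture features are implemented
        "configuration": ident[2:],    # how the implementation is configured
    }


info = decode_openrisc_id("1200")
print(info)
```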

OpenRISC 1200
~~~~~~~~~~~~~
(((OpenRISC,1200)))
The OR1200 is a 32-bit scalar RISC with Harvard microarchitecture, 5 stage
integer pipeline, virtual memory support (MMU) and basic DSP capabilities.
Default caches are a 1-way direct-mapped 8KB data cache and a 1-way
direct-mapped 8KB instruction cache, each with 16-byte line size. Both caches
are physically tagged. By default MMUs are implemented and they are
constructed of a 64-entry hash based 1-way direct-mapped data TLB and a
64-entry hash based 1-way direct-mapped instruction TLB.

Supplemental facilities include a debug unit for real-time debugging, a high
resolution tick timer, a programmable interrupt controller and power
management support. When implemented in a typical 0.18u 6LM process it should
provide over 300 Dhrystone 2.1 MIPS at 300MHz and 300 DSP MAC 32x32
operations, at least 20% more than any other competitor in this class. OR1200
in default configuration has about 1M transistors.

OR1200 is intended for embedded, portable and networking applications. It can
successfully compete with the latest scalar 32-bit RISC processors in its
class and can efficiently run any modern operating system. Competitors include
ARM10, ARC and Tensilica RISC processors.

Features
^^^^^^^^
The following lists the main features of the OR1200 IP core:

- All major characteristics of the core can be set by the user
- High performance of 300 Dhrystone 2.1 MIPS at 300 MHz using 0.18u process
- High performance cache and MMU subsystems
- WISHBONE SoC Interconnection Rev. B3 compliant interface

Architecture
------------
<<core_arch_fig>> below shows the general architecture of the OR1200 IP core.
It consists of several building blocks:

- CPU/FPU/DSP central block
- Direct-mapped data cache
- Direct-mapped instruction cache
- Data MMU based on hash based DTLB
- Instruction MMU based on hash based ITLB
- Power management unit and power management interface
- Tick timer
- Debug unit and development interface
- Interrupt controller and interrupt interface
- Instruction and Data WISHBONE host interfaces

[[core_arch_fig]]
.Core's Architecture
image::img/core_arch.gif[scaledwidth="50%",align="center"]

CPU/FPU/DSP
~~~~~~~~~~~
((CPU))/((FPU))/((DSP)) is the central part of the OR1200 RISC processor.
<<cpu_fpu_dsp_fig>> shows a basic block diagram of the CPU/DSP. Not pictured
are the FPU components. The OR1200 CPU/FPU/DSP only implements sections of
the ORBIS32 and ORFPX32 instruction sets. No ((ORBIS64)), ((ORFPX64)) or
((ORVDX64)) instructions are implemented in OR1200.

[[cpu_fpu_dsp_fig]]
.CPU/FPU/DSP Block Diagram
image::img/cpu_fpu_dsp.gif[scaledwidth="50%",align="center"]

Instruction unit
^^^^^^^^^^^^^^^^
The instruction unit implements the basic instruction pipeline, fetches
instructions from the memory subsystem, dispatches them to available execution
units, and maintains a state history to ensure a precise exception model
and that operations finish in order. It also executes conditional branch
and unconditional jump instructions.

The sequencer can dispatch a sequential instruction on each clock if the
appropriate execution unit is available. The execution unit must discern
whether source data is available and ensure that no other instruction is
targeting the same destination register.

The instruction unit handles only ((ORBIS32)) and, optionally, a subset of the
((ORFPX32)) instruction class. Some ((ORFPX32)) and all ((ORFPX64)) and
((ORVDX64)) instruction classes are not supported by the OR1200 at present.

General-Purpose Registers
^^^^^^^^^^^^^^^^^^^^^^^^^
OpenRISC 1200 implements 32 general-purpose 32-bit ((registers)). The OpenRISC
1000 architecture also supports shadow copies of the register file to
implement fast switching between working contexts; however, this feature is
not implemented in the current OR1200 implementation.

OR1200 implements the general-purpose register file as two synchronous
dual-port memories with a capacity of 32 words by 32 bits per word.

Load/Store Unit
^^^^^^^^^^^^^^^
The ((load/store unit (LSU))) transfers all data between the GPRs and the CPU's
internal bus. It is implemented as an independent execution unit so that stalls
in the memory subsystem only affect the master pipeline if there is a data
dependency.

The following are the LSU's main features:

- all load/store instructions implemented in hardware (atomic instructions
  included)
- address entry buffer
- pipelined operation
- aligned accesses for fast memory access

When load and store instructions are issued, the LSU determines if all
operands are available. These operands include the following:

- address register operand
- source data register operand (for store instructions)
- destination data register operand (for load instructions)

Integer Execution Pipeline
^^^^^^^^^^^^^^^^^^^^^^^^^^
(((Pipeline, Integer Execution)))
The core implements the following types of 32-bit integer instructions:

- Arithmetic instructions
- Compare instructions
- Logical instructions
- Rotate and shift instructions

Most integer instructions can execute in one cycle. For details about timing
see <<exec_time_int_table>>.

MAC Unit
^^^^^^^^
The ((MAC)) unit executes DSP MAC operations. MAC operations are 32x32 with
a 48-bit accumulator. The MAC unit is fully pipelined and can accept a new MAC
operation in each clock cycle.

Floating Point Unit
^^^^^^^^^^^^^^^^^^^
(((Floating Point Unit)))
The ((FPU)) implementation is based on two other FPUs available from
OpenCores.org. For the comparison and conversion functions, parts were taken
from the FPU project by Rudolf Usselmann, and for the arithmetic operations,
the fpu100 project by Jidan Al-Eryani was converted to Verilog HDL.

All ((ORFPX32)) instructions except for ((lf.madd.s)) and ((lf.rem.s)) are
supported when the FPU is enabled in the OR1200 configuration.

System Unit
^^^^^^^^^^^
The ((system unit)) connects all other signals of the CPU/FPU/DSP that are not
connected through instruction and data interfaces. It also implements all
system special-purpose registers (e.g. the supervision register).

Exceptions
^^^^^^^^^^
Core exceptions can be generated when an exception condition occurs.
((Exception sources)) in OR1200 include the following:

- External interrupt request
- Certain memory access conditions
- Internal errors, such as an attempt to execute an unimplemented opcode
- System call
- Internal exception, such as breakpoint exceptions
- Arithmetic overflow

((Exception handling)) is transparent to user software and uses the same
mechanism to handle all types of exceptions. When an exception is taken,
control is transferred to an exception handler at an offset defined by
the type of exception encountered. Exceptions are handled in supervisor mode.

Exceptions caused by instructions in a delay slot will set the supervision
register's DSX bit.

Data Cache
~~~~~~~~~~
The default configuration of the OR1200 data ((cache)) is an 8-Kbyte, 1-way
direct-mapped data cache, which allows rapid core access to data. However,
the data cache can be configured according to <<data_confs_or1200_table>>.

[[data_confs_or1200_table]]
.Possible Data Cache Configurations of OR1200
[width="60%",options="header"]
|======================================================
|                                       | Direct mapped
| 16B/line, 256 lines, 1 way            | 4KB
| 16B/line, 512 lines, 1 way            | *8KB (default)*
| 16B/line, 1024 lines, 1 way           | 16KB
| 32B/line, 1024 lines, 1 way           | 32KB
|======================================================

It is possible to operate the data cache with write-through or write-back
strategies; however, write-back is currently experimental.

Features:

- data cache is separate from instruction cache (Harvard architecture)
- data cache implements a least-recently used (LRU) replacement algorithm
  within each set
- the cache directory is physically addressed. The physical address tag is
  stored in the cache directory
- write-through or write-back operation
- entire cache can be disabled, lines invalidated, flushed or forced to be
  written back, by writing to cache special purpose registers

On a miss, and under appropriate conditions, the cache line is filled or
emptied (written back) with 16-byte bursts. The burst fill is performed as a
critical-word-first operation; the critical word is simultaneously written
to the cache and forwarded to the requesting unit, thus minimizing stalls
due to cache fill latency. The data cache provides storage for cache tags and
performs the cache line replacement function.

The data cache is tightly coupled to the external interface to allow efficient
access to the system memory controller.

The data cache supplies data to the GPRs by means of a 32-bit interface
to the load/store unit. The LSU provides all logic required to calculate
effective addresses, handles data alignment to and from the data cache,
and provides sequencing for load and store operations. Write operations to
the data cache can be performed on a byte, half-word or word basis.

image::img/data_cache_diag.gif[scaledwidth="50%",align="center"]

Each line contains four contiguous words from memory that are loaded from
a cache-line-aligned boundary. As a result, cache lines are aligned with
page boundaries.
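
To make the geometry in the configuration table concrete, the Python sketch
below computes the cache size from the line size and line count, and splits an
address into tag, index and offset. It assumes 32-bit physical addresses and
the conventional offset/index/tag split of a direct-mapped cache; the function
names are hypothetical and the exact bit assignment in the OR1200 RTL may
differ.

```python
ADDRESS_BITS = 32  # assumption: 32-bit physical addresses


def cache_geometry(line_bytes: int, lines: int) -> dict:
    """Derive size and field widths for a direct-mapped cache."""
    offset_bits = line_bytes.bit_length() - 1  # log2, valid for powers of two
    index_bits = lines.bit_length() - 1
    return {
        "size_bytes": line_bytes * lines,
        "offset_bits": offset_bits,
        "index_bits": index_bits,
        "tag_bits": ADDRESS_BITS - offset_bits - index_bits,
    }


def split_address(addr: int, line_bytes: int, lines: int) -> tuple:
    """Return (tag, index, offset) for a physical address."""
    offset = addr & (line_bytes - 1)             # byte within the line
    index = (addr // line_bytes) & (lines - 1)   # which line is selected
    tag = addr // (line_bytes * lines)           # stored in the cache directory
    return tag, index, offset


default = cache_geometry(line_bytes=16, lines=512)   # the 8KB default
fields = split_address(0x12345678, line_bytes=16, lines=512)
print(default, [hex(f) for f in fields])
```

Two addresses that share an index but differ in tag compete for the same line,
which is the usual cost of a direct-mapped organisation.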

Instruction Cache
~~~~~~~~~~~~~~~~~
The default configuration of the OR1200 instruction ((cache)) is an 8-Kbyte,
1-way direct-mapped instruction cache, which allows rapid core access to
instructions. However, the instruction cache can be configured according to
<<inst_confs_or1200_table>>.

[[inst_confs_or1200_table]]
.Possible Instruction Cache Configurations of OR1200
[width="60%",options="header"]
|==============================================
|                               | Direct mapped
| 16B/line, 32 lines, 1 way     | 512B
| 16B/line, 256 lines, 1 way    | 4KB
| 16B/line, 512 lines, 1 way    | *8KB (default)*
| 16B/line, 1024 lines, 1 way   | 16KB
| 32B/line, 1024 lines, 1 way   | 32KB
|==============================================

Features:

- instruction cache is separate from data cache (Harvard architecture)
  (((Architecture,Harvard)))
- instruction cache implements a least-recently used (LRU) replacement
  algorithm within each set
  ((LRU))
- the ((cache directory)) is physically addressed. The physical address tag is
  stored in the cache directory
- it can be disabled or invalidated by writing to cache special purpose
  registers

On a miss, the cache is filled with 16-byte bursts. The burst fill
is performed as a critical-word-first operation; the critical word is
simultaneously written to the cache and forwarded to the requesting unit,
thus minimizing stalls due to cache fill latency. The instruction cache
provides storage for cache tags and performs the cache line replacement
function.

The instruction cache is tightly coupled to the external interface to allow
efficient access to the system memory controller.

The instruction cache supplies instructions to the instruction sequencer by
means of a 32-bit interface to the instruction fetch subunit. The instruction
fetch subunit provides all logic required to calculate effective addresses.

image::img/inst_cache_diag.gif[scaledwidth="50%",align="center"]

Each line contains four contiguous words from memory that are loaded from
a line-size-aligned boundary. As a result, cache lines are aligned with
page boundaries.

Data MMU
~~~~~~~~
(((MMU, Data)))
The OR1200 implements a ((virtual memory management)) scheme that
provides memory access protection and effective-to-physical address
translation. ((Protection)) granularity is as defined by the OpenRISC 1000
architecture - 8-Kbyte and 16-Mbyte pages.

[[data_tlb_confs_or1200_table]]
.Possible Data TLB Configurations of OR1200
[width="60%",options="header"]
|======================================
|                       | Direct mapped
| 16 entries per way    | 16 DTLB entries
| 32 entries per way    | 32 DTLB entries
| 64 entries per way    | *64 DTLB entries (default)*
| 128 entries per way   | 128 DTLB entries
|======================================

Features:

* data MMU is separate from instruction MMU
* page size 8-Kbyte
* comprehensive page protection scheme
* direct-mapped hash based translation lookaside buffer (DTLB) with the
  default of 1 way and the following features:
** miss and fault exceptions
** software tablewalk
** high performance because of hash-based design
** variable number of DTLB entries, with a default of 64 per way

image::img/tlb_diag.gif[scaledwidth="50%",align="center"]

The MMU hardware supports two-level software tablewalk.
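
The behaviour of a direct-mapped TLB with software refill can be sketched in
Python as follows. This is a simplified model under stated assumptions: the
"hash" is taken to be the low-order bits of the virtual page number, page
protection bits are ignored, and the class and method names are hypothetical.
A return value of `None` stands for the TLB miss exception that would start
the software tablewalk.

```python
PAGE_SHIFT = 13  # 8-Kbyte pages


class DirectMappedTLB:
    """Simplified 1-way TLB model; each slot holds (vpn, ppn) or None."""

    def __init__(self, entries: int = 64):
        self.entries = entries
        self.slots = [None] * entries

    def lookup(self, ea: int):
        """Translate an effective address, or return None on a miss."""
        vpn = ea >> PAGE_SHIFT
        slot = self.slots[vpn % self.entries]  # assumed hash: low VPN bits
        if slot is None or slot[0] != vpn:
            return None                        # miss: software must refill
        page_offset = ea & ((1 << PAGE_SHIFT) - 1)
        return (slot[1] << PAGE_SHIFT) | page_offset

    def refill(self, vpn: int, ppn: int) -> None:
        """Install a translation, as a tablewalk handler would after a miss."""
        self.slots[vpn % self.entries] = (vpn, ppn)


tlb = DirectMappedTLB()
first = tlb.lookup(0x4010)      # miss, nothing installed yet
tlb.refill(vpn=2, ppn=0x80)
second = tlb.lookup(0x4010)     # hit after the refill
tlb.refill(vpn=66, ppn=0x90)    # 66 % 64 == 2, so this evicts vpn 2
third = tlb.lookup(0x4010)      # miss again because of the conflict
```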

Instruction MMU
~~~~~~~~~~~~~~~
(((MMU, Instruction)))
The OR1200 implements a virtual memory management scheme that provides memory
access protection and effective-to-physical address translation. Protection
granularity is as defined by the OpenRISC 1000 architecture - 8-Kbyte and
16-Mbyte pages.

[[inst_tlb_confs_or1200_table]]
.Possible Instruction TLB Configurations of OR1200
[width="60%",options="header"]
|======================================
|                       | Direct mapped
| 16 entries per way    | 16 ITLB entries
| 32 entries per way    | 32 ITLB entries
| 64 entries per way    | *64 ITLB entries (default)*
| 128 entries per way   | 128 ITLB entries
|======================================

Features:

* instruction MMU is separate from data MMU
* page size 8-Kbyte
* comprehensive page protection scheme
* 1-way direct-mapped hash based translation lookaside buffer (ITLB) with the
  following features:
** miss and fault exceptions
** software tablewalk
** high performance because of hash-based design
** variable number of ITLB entries, with a default of 64 entries per way

image::img/inst_mmu_diag.gif[scaledwidth="50%",align="center"]

The MMU hardware supports two-level software tablewalk.

Programmable Interrupt Controller
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The ((interrupt)) controller receives interrupts from external sources and
forwards them as low or high priority interrupt exceptions to the CPU core.

[[interrupt_controller_fig]]
.Block Diagram of the Interrupt Controller
image::img/interrupt_controller.gif[scaledwidth="50%",align="center"]

The programmable interrupt controller has three special-purpose registers and
32 interrupt inputs. Interrupt inputs 0 and 1 are always enabled and connected
to the high and low priority interrupt inputs, respectively.

The 30 other interrupt inputs can be masked and assigned low or high priority
through programming special-purpose registers.
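
The masking rule can be modelled with a few lines of Python. The sketch
assumes that a mask bit of 1 means "enabled" and that bit n corresponds to
interrupt input n; priority assignment is not modelled. The names are
hypothetical and do not refer to actual OR1200 register fields.

```python
ALWAYS_ENABLED = 0b11        # interrupt inputs 0 and 1 cannot be masked
ALL_INPUTS = 0xFFFFFFFF      # 32 interrupt inputs


def forwarded_interrupts(inputs: int, mask: int) -> int:
    """Return the asserted inputs that pass the (assumed) enable mask."""
    effective_mask = (mask | ALWAYS_ENABLED) & ALL_INPUTS
    return inputs & effective_mask


# Inputs 0, 1 and 3 asserted; software has enabled nothing.
only_fixed = forwarded_interrupts(0b1011, mask=0)
# Same inputs, with input 3 enabled by software.
with_input3 = forwarded_interrupts(0b1011, mask=0b1000)
```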

Tick Timer
~~~~~~~~~~
OR1200 implements a tick ((timer)) facility. Basically this is a timer that is
clocked by the RISC clock and is used by the operating system to precisely
measure time and schedule system tasks.

OR1200 precisely follows the architectural definition of the tick timer
facility:

* Maximum timer count of 2^32 clock cycles
* Maximum time period of 2^28 clock cycles between interrupts
* Maskable tick timer interrupt
* Single-run, restartable or continuous timer

The tick timer operates from an independent clock source so that doze power
management mode can be implemented.
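
As a worked example of the limits listed above, the Python sketch below
converts the 2^28-cycle maximum period into seconds and computes the cycle
count for a desired tick interval. The 300 MHz clock is only the frequency
quoted in the performance figures of this document, and the function names are
hypothetical.

```python
MAX_PERIOD_CYCLES = 2 ** 28  # maximum clock cycles between timer interrupts


def max_tick_period_seconds(clock_hz: float) -> float:
    """Longest interval between tick interrupts at a given clock rate."""
    return MAX_PERIOD_CYCLES / clock_hz


def cycles_for_interval(seconds: float, clock_hz: float) -> int:
    """Cycle count to program for a tick interval, checked against the limit."""
    cycles = round(seconds * clock_hz)
    if not 0 < cycles <= MAX_PERIOD_CYCLES:
        raise ValueError("interval not representable in 2^28 clock cycles")
    return cycles


longest = max_tick_period_seconds(300e6)      # a little under 0.9 s
ten_ms = cycles_for_interval(0.01, 300e6)     # cycles for a 10 ms tick
```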

Power Management Support
~~~~~~~~~~~~~~~~~~~~~~~~
To optimize ((power consumption)), the OR1200 provides ((low-power)) modes that
can be used to dynamically activate and deactivate certain internal modules.

OR1200 has the following major features to minimize power consumption:

* Slow and Idle Modes (SW controlled clock freq reduction)
* Doze and Sleep Modes (interrupt wake-up)

[[power_consumption_table]]
.Power Consumption
[width="60%",options="header"]
|===================================================================
| Power Minimization Feature    | Approx Power Consumption Reduction
| Slow and Idle mode            | 2x - 10x
| Doze mode                     | 100x
| Sleep mode                    | 200x
| Dynamic clock gating          | N/A
|===================================================================

Slow down mode takes advantage of the low-power dividers in external clock
generation circuitry to enable full functionality, but at a lower frequency
so that power consumption is reduced. The 4 bits of PMR[SDF] are broadcast on
pm_clksd, and the external clock generation for the RISC should adapt the RISC
clock frequency according to the value on pm_clksd.

When software initiates the doze mode, software processing on the core
suspends. The clocks to the RISC internal modules are disabled, except the
clock to the tick timer. However, any other on-chip blocks can continue to
function as normal. The OR1200 will leave doze mode and enter normal mode when
a pending interrupt occurs.

In sleep mode, all OR1200 internal units are disabled and clocks
gated. Optionally, an implementation may choose to lower the operating voltage
of the OR1200 core. The OR1200 should leave sleep mode and enter normal
mode when a pending interrupt occurs.

Dynamic ((Clock gating)) (unit clock gating on a clock-by-clock basis) is not
supported by OR1200.

Debug unit
~~~~~~~~~~
The ((debug unit)) assists software developers in debugging their systems. It
provides support only for basic debugging and does not have support for more
advanced debug features of the OpenRISC 1000 architecture such as watchpoints,
breakpoints and program-flow control registers.

[[debug_unit_fig]]
.Block Diagram of Debug Unit
image::img/debug_unit_diag.gif[scaledwidth="50%",align="center"]

Watchpoints and breakpoints are events triggered by program- or data-flow
matching the conditions programmed in the debug registers. Breakpoints,
unlike watchpoints, also suspend execution of the current program flow and
start a breakpoint exception.

Clocks & Reset
~~~~~~~~~~~~~~
The OR1200 core has one ((clock)) input each for the instruction Wishbone
interface logic, the data Wishbone interface logic, and the CPU core. Clock
input clk_cpu clocks all logic inside the core, behind the Wishbone
interfaces. The data Wishbone interface is clocked by dwb_clk_i, and the
instruction Wishbone interface is clocked by iwb_clk_i.

OR1200 has an asynchronous ((reset)) signal. Reset signal rst, when asserted
high, immediately resets all flip-flops inside OR1200. When it is deasserted,
OR1200 starts the reset exception.

WISHBONE Interfaces
~~~~~~~~~~~~~~~~~~~
Two ((WISHBONE)) interfaces connect the OR1200 core to external peripherals
and the external memory subsystem. They are WISHBONE SoC Interconnection
specification Rev. B3 compliant. The implementation uses a 32-bit bus width
and does not support other bus widths.

Wishbone registered-feedback incrementing burst accesses are used, unless
disabled, when cache lines are filled. The burst size (beats) is determined
by the cache line size.

image::img/wb_compatible.png[scaledwidth="30%",align="center"]
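
The relationship between cache line size and burst length can be sketched in
Python. With the 32-bit bus each beat carries 4 bytes, so a 16-byte line needs
4 beats and a 32-byte line needs 8. The second function lists word addresses
for a critical-word-first fill under the assumption that the burst wraps
around within the line; that wrap order is an illustration only and is not
confirmed by this document. All names are hypothetical.

```python
BUS_BYTES = 4  # 32-bit Wishbone data bus


def burst_beats(line_bytes: int) -> int:
    """Number of bus beats needed to transfer one cache line."""
    if line_bytes % BUS_BYTES:
        raise ValueError("line size must be a multiple of the bus width")
    return line_bytes // BUS_BYTES


def critical_word_first_order(miss_addr: int, line_bytes: int) -> list:
    """Word addresses of a line fill, starting at the missed word.

    Assumes the burst wraps at the end of the line back to its start.
    """
    base = miss_addr - (miss_addr % line_bytes)      # line-aligned address
    first = (miss_addr % line_bytes) // BUS_BYTES    # word index of the miss
    beats = burst_beats(line_bytes)
    return [base + ((first + i) % beats) * BUS_BYTES for i in range(beats)]


order = critical_word_first_order(0x1008, line_bytes=16)
```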

Operation
---------
This section describes the operation of the OR1200 core. For operations
that pertain to the architectural definitions, see <>.

Reset
~~~~~
OR1200 has one asynchronous ((reset)) signal that can be used for both soft and
hard reset at higher levels of the system hierarchy.

[[powerup_sequence_fig]]
.Power-Up and Reset Sequence
image::img/powerup_seq.gif[scaledwidth="70%",align="center"]

<<powerup_sequence_fig>> shows how asynchronous reset is applied after
powering up the OR1200 core. Reset is connected to the asynchronous reset of
almost all flip-flops inside the RISC core. Special care must be taken to meet
the hold and setup times of all flip-flops relative to the main RISC clock.

If the system implements gated clocks, then clock gating can be used to ensure
proper reset timing.

[[powerup_sequence_gatedclk_fig]]
.Power-Up and Reset Sequence w/ Gated Clock
image::img/powerup_seq_gatedclk.gif[scaledwidth="70%",align="center"]

The address the PC assumes at hard reset (assertion of the external reset
signal) is definable at synthesis time, via the OR1200_BOOT_ADR define. This
is not to be confused with the ability to set the exception prefix address
with the EPH bit.

CPU/FPU/DSP
~~~~~~~~~~~
((CPU))/((FPU))/((DSP)) is an implementation of the 32-bit part of the OpenRISC
1000 architecture, and only a subset of all features is implemented.

Instructions
^^^^^^^^^^^^
(((OpenRISC 1200, Instruction List)))
The following table lists the instructions implemented in the OR1200. Those
optionally implemented are indicated as such.

// The table below is split into several columns for readability by the
// preprocessing script. It is better to have this automated because
// given the pseudo-lexicographical ordering, adding a new instruction
// would require manual changes in all subsequent columns, which is
// tedious and error-prone.
//
// When changing the column headers, remember to change the script accordingly.

[[instructions_table]]
.Instructions implemented in OR1200
[width="95%",options="header"]
|=================================
| Instruction mnemonic  | Optional
| ((l.add))             |
| ((l.addc))            | Yes
| ((l.addi))            |
| ((l.and))             |
| ((l.andi))            |
| ((l.bf))              |
| ((l.bnf))             |
| ((l.div))             | Yes
| ((l.extbs))           | Yes
| ((l.extbz))           | Yes
| ((l.exths))           | Yes
| ((l.exthz))           | Yes
| ((l.extws))           | Yes
| ((l.extwz))           | Yes
| ((l.ff1))             | Yes
| ((l.fl1))             | Yes
| ((l.j))               |
| ((l.jal))             |
| ((l.jalr))            |
| ((l.jr))              |
| ((l.lbs))             |
| ((l.lbz))             |
| ((l.lhs))             |
| ((l.lhz))             |
| ((l.lws))             |
| ((l.lwz))             |
| ((l.mac))             | Yes
| ((l.maci))            | Yes
| ((l.macrc))           | Yes
| ((l.mfspr))           |
| ((l.movhi))           |
| ((l.msb))             | Yes
| ((l.mtspr))           |
| ((l.mul))             | Yes
| ((l.muli))            | Yes
| ((l.nop))             |
| ((l.or))              |
| ((l.ori))             |
| ((l.rfe))             |
| ((l.rori))            |
| ((l.sb))              |
| ((l.sfeq))            |
| ((l.sfges))           |
| ((l.sfgeu))           |
| ((l.sfgts))           |
| ((l.sfgtu))           |
| ((l.sfleu))           |
| ((l.sflts))           |
| ((l.sfltu))           |
| ((l.sfne))            |
| ((l.sh))              |
| ((l.sll))             |
| ((l.slli))            |
| ((l.sra))             |
| ((l.srai))            |
| ((l.srl))             |
| ((l.srli))            |
| ((l.sub))             | Yes
| ((l.sw))              |
| ((l.sys))             |
| ((l.trap))            |
| ((l.xor))             |
| ((l.xori))            |
| ((lf.add.s))          | Yes
| ((lf.div.s))          | Yes
| ((lf.ftoi.s))         | Yes
| ((lf.itof.s))         | Yes
| ((lf.mul.s))          | Yes
| ((lf.sfeq.s))         | Yes
| ((lf.sfge.s))         | Yes
| ((lf.sfgt.s))         | Yes
| ((lf.sfle.s))         | Yes
| ((lf.sflt.s))         | Yes
| ((lf.sfne.s))         | Yes
| ((lf.sub.s))          | Yes
|=================================

For a complete description of each instruction's format refer to
<>.

Instruction Unit
^^^^^^^^^^^^^^^^
The ((instruction unit)) generates the instruction fetch effective address and
fetches instructions from the instruction cache. One instruction can be
fetched each clock cycle. The instruction fetch EA is further translated into
a physical address by the IMMU.

General-Purpose Registers
^^^^^^^^^^^^^^^^^^^^^^^^^
The ((general-purpose register)) file can supply two read operands each clock
cycle and store one result in a destination register.

GPRs can also be read and written through the development interface.

Load/Store Unit
^^^^^^^^^^^^^^^
The ((LSU)) can execute one load instruction every two clock cycles, assuming
the load instructions hit in the data cache. Execution of store instructions
takes one clock cycle, assuming they hit in the data cache.

The LSU performs calculation of the load/store effective address. The EA is
further translated into a physical address by the DMMU.

The load/store effective address and load and store data can also be accessed
through the development interface.

704
Integer Execution Pipeline
705
^^^^^^^^^^^^^^^^^^^^^^^^^^
706
(((Pipeline, Integer Execution)))
707
The core implements the following types of 32-bit integer instructions:
708
 
709
* Arithmetic instructions
710
* Compare instructions
711
* Logical instructions
712
* Rotate and shift instructions
713
 
714
[[exec_time_int_table]]
715
.Execution Time of Integer Instructions
716
[width="70%",options="header"]
717
|================================================
718
| Instruction Group     | Clock Cycles to Execute
719
| Arithmetic except Multiply/Divide     | 1
720
| Multiply                              | 3
721
| Divide                                | 32
722
| Compare                               | 1
723
| Logical                               | 1
724
| Rotate and Shift                      | 1
725
| Others                                | 1
726
|================================================
727
 
728
<> lists execution times for instructions executed by
729
integer execution pipeline. Most instructions are executed in one clock cycle.
730
 
731
Integer multiply and divide can be either serial or parallel implementations.
Serial operations require one clock cycle per bit of operand, which is 32
cycles on the OR1200. At present no synthesis tools support division operators,
and so the serial option must be used for divide.
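
As a rough illustration of how the figures in the table combine, the sketch
below sums execute-stage cycles for an instruction mix. The group names and the
+estimate_cycles+ helper are hypothetical, and the model assumes no stalls,
cache misses or overlap between instructions.

```python
# Cycle counts per instruction group, copied from the execution time table.
INT_CYCLES = {
    "arithmetic": 1,   # arithmetic except multiply/divide
    "multiply": 3,
    "divide": 32,
    "compare": 1,
    "logical": 1,
    "shift": 1,        # rotate and shift
    "other": 1,
}


def estimate_cycles(mix):
    """Return the total execute-stage cycles for a {group: count} mix."""
    return sum(INT_CYCLES[group] * count for group, count in mix.items())
```

For example, ten arithmetic instructions, two multiplies and one divide come to
10 + 6 + 32 = 48 cycles under these assumptions.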
735
 
736
MAC Unit
737
^^^^^^^^
738
((MAC)) unit executes l.mac instructions. MAC unit implements 32x32 fully
739
pipelined multiplier and 48-bit accumulator. MAC unit can accept one new
740
l.mac instruction each clock cycle.
741
 
742
Care should be taken not to execute l.macrc (MAC read and clear) too soon
after the final l.mac instruction, as the operation may still be underway
and the result will not be valid in time. It is recommended that at least 3
other instructions (or just l.nops) be inserted between the final l.mac and
l.macrc.
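
The recommendation above can be checked mechanically. The following sketch is
a hypothetical helper, not part of any OR1200 tool: it scans a list of
instruction mnemonics and reports every l.macrc that has fewer than three
instructions between it and the most recent l.mac.

```python
MIN_GAP = 3  # recommended number of instructions between l.mac and l.macrc


def macrc_hazards(mnemonics):
    """Return indexes of l.macrc instructions that follow an l.mac too closely."""
    hazards = []
    last_mac = None  # index of the most recent l.mac seen so far
    for index, mnemonic in enumerate(mnemonics):
        if mnemonic == "l.mac":
            last_mac = index
        elif mnemonic == "l.macrc" and last_mac is not None:
            gap = index - last_mac - 1  # instructions strictly in between
            if gap < MIN_GAP:
                hazards.append(index)
    return hazards
```

The check only looks at static instruction order; it assumes straight-line
code with no branches between the l.mac and the l.macrc.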
746
 
747
Floating Point Unit
748
^^^^^^^^^^^^^^^^^^^
749
The ((floating point unit)) has a mechanism to stall the processor pipeline
750
until processing has completed.
751
 
752
The following table indicates the number of cycles per operation.
753
 
754
[[exec_time_fp_table]]
755
.Execution time of floating point instructions
756
[width="60%",options="header"]
757
|=======================
758
| Operation     | Cycles
759
| Add/subtract  | 10
760
| Multiply      | 38
761
| Divide        | 37
762
| Compare       | 2
763
| Convert       | 7
764
|=======================
765
 
766
System Unit
767
^^^^^^^^^^^
768
((System unit)) implements system control and status special-purpose registers
769
and executes all l.mtspr/l.mfspr instructions.
770
 
771
Exceptions
772
^^^^^^^^^^
773
The core implements a precise ((exception model)). This means that when an
774
exception is taken, the following conditions are met:
775
 
776
* Subsequent instructions in program flow are discarded
777
* Previous instructions finish and write back their results
778
* The address of faulting instruction is saved in EPCR registers and the
779
  machine state is saved to ESR registers
780 808 julius
* If the exception occurred in a delay slot, the DSX bit of the SR is set
781 645 julius
 
782
[[exceptions_table]]
783
.List of Implemented ((Exceptions))
784
[width="95%",options="header"]
785
|===========================================================
786
| Exception Type        | Vector Offset | Causing Conditions
787
| Reset                 | 0x100 | Caused by reset.
788
| Bus Error             | 0x200 | Caused by an attempt to access invalid
789
  physical address.
790
| Data Page Fault       | 0x300 | Generated artificially by DTLB miss exception
791
  handler when no matching PTE found in page tables or page protection
792
  violation for load/store operations.
793
| Instruction Page Fault| 0x400 | Generated artificially by ITLB miss exception
794
  handler when no matching PTE found in page tables or page protection violation
795
  for instruction fetch.
796
| Low Priority External Interrupt       | 0x500 | Low priority external
797
  interrupt asserted.
798
| Alignment     | 0x600 | Load/store access to a location that is not naturally aligned.
799
| Illegal Instruction   | 0x700 | Illegal instruction in the instruction stream.
800
| High Priority External Interrupt      | 0x800 | High priority external
801
  interrupt asserted.
802
| D-TLB Miss    | 0x900 | No matching entry in DTLB (DTLB miss).
803
| I-TLB Miss    | 0xA00 | No matching entry in ITLB (ITLB miss).
804 647 julius
| Range         | 0xB00 | If programmed in the SR, the setting of SR[OV],
  usually by an arithmetic instruction, causes a range exception.
806 645 julius
| System Call   | 0xC00 | System call initiated by software.
807
| Floating point exception      | 0xD00 | FP operation caused flags in FPCSR to
808
  become set.
809
| Trap  | 0xE00 | Trap instruction was decoded.
810
|===========================================================
811
 
812 808 julius
The OR1200 exception support does not include support for fast context
813
switching.
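
For reference, the vector offsets in the table can be captured in a small
lookup. This is only a sketch: the dictionary keys and the +vector_address+
helper are hypothetical names, and the +base+ parameter is an assumption that
lets the caller place the vector table somewhere other than address zero.

```python
# Vector offsets copied from the table of implemented exceptions.
VECTOR_OFFSETS = {
    "reset": 0x100,
    "bus_error": 0x200,
    "data_page_fault": 0x300,
    "instruction_page_fault": 0x400,
    "low_priority_interrupt": 0x500,
    "alignment": 0x600,
    "illegal_instruction": 0x700,
    "high_priority_interrupt": 0x800,
    "dtlb_miss": 0x900,
    "itlb_miss": 0xA00,
    "range": 0xB00,
    "system_call": 0xC00,
    "floating_point": 0xD00,
    "trap": 0xE00,
}


def vector_address(exception, base=0x00000000):
    """Return the handler entry address for a named exception."""
    return base + VECTOR_OFFSETS[exception]
```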
814 645 julius
 
815
Data Cache Operation
816
~~~~~~~~~~~~~~~~~~~~
817
Data Cache Load/Store Access
818
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
819
Load/store unit requests data from the data ((cache)) and stores them into
820
the general-purpose register file and forwards them to integer execution
821
units. Therefore LSU is tightly coupled with the data cache.
822
 
823
If there is no data cache line miss nor ((DTLB)) miss, load operations take
824
two clock cycles to execute and store operations take one clock cycle to
825
execute. LSU does all the data alignment work.
826
 
827
Data can be written to the data cache on a word, half-word or byte basis. When
the data cache operates in write-through mode, all writes are immediately
written through to main memory or to the next level of caches.
830
 
831
[[wb_write_fig]]
832
.WISHBONE Write Cycle
833
image::img/wb_write.gif[scaledwidth="70%",align="center"]
834
 
835
<<wb_write_fig>> shows how a ((write-through)) cycle on data WISHBONE interface
836
is performed when a store instruction hits in the data cache.  If +dwb_ERR_I+
837
or +dwb_RTY_I+ is asserted instead of usual +dwb_ACK_I+, bus error exception
838
is invoked.
839
 
840
Data Cache Line Fill Operation
841
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
842
When a load instruction misses in the cache and the selected line is clean
or invalid, or the cache uses ((write-through)) strategy, a 4 beat sequential
read burst with critical word first is performed. If the strategy is
((write-back)) and the line is dirty, the line is first written back to
memory. The critical word is forwarded to the load/store unit to minimize
performance loss because of the cache miss.
848
 
849
[[wb_read_fig]]
850
.WISHBONE Block Read Cycle
851
image::img/wb_read.gif[scaledwidth="70%",align="center"]
852
 
853
<<wb_read_fig>> shows how a cache line is read in WISHBONE read block cycle
854
composed out of four read transfers.  If +dwb_ERR_I+ or +dwb_RTY_I+ is asserted
855
instead of usual +dwb_ACK_I+, bus error exception is invoked.
856
 
857
When a store instruction misses in the cache and the cache uses write-through
strategy, the write is simply put on the bus and no caching occurs. If the
cache uses write-back strategy and the line is invalid, or valid and clean,
a 4 beat sequential read burst fills the line, and then the write to the
cache occurs. If the line is valid and dirty, it is first written back to
memory before the desired line is read.
864
 
865
[[wb_rw_fig]]
866
.WISHBONE Block Read/Write Cycle
867
image::img/wb_rw.gif[scaledwidth="70%",align="center"]
868
 
869
<<wb_rw_fig>> shows how a cache line is read in WISHBONE read block cycle
870
followed by a write transfer.  If +dwb_ERR_I+ or +dwb_RTY_I+ is asserted instead
871
of usual +dwb_ACK_I+, bus error exception is invoked.
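
The load and store miss cases described in this section can be summarized as a
small decision function. The sketch below is a behavioral illustration only;
the function name, argument values and action strings are hypothetical and do
not correspond to signals in the RTL.

```python
def miss_actions(op, strategy, line_state):
    """Return the ordered actions taken on a data cache miss.

    op:         "load" or "store"
    strategy:   "write-through" or "write-back"
    line_state: "invalid", "clean" or "dirty" (state of the line being replaced)
    """
    if op not in ("load", "store"):
        raise ValueError("unknown operation: " + op)
    if strategy not in ("write-through", "write-back"):
        raise ValueError("unknown strategy: " + strategy)

    # A store miss in write-through mode bypasses the cache entirely.
    if op == "store" and strategy == "write-through":
        return ["single write on bus"]

    actions = []
    # Only a write-back cache can hold a dirty line that must be saved first.
    if strategy == "write-back" and line_state == "dirty":
        actions.append("write back dirty line")
    actions.append("4-beat read burst to fill line")
    if op == "load":
        actions.append("forward critical word to LSU")
    else:
        actions.append("write store data into cache line")
    return actions
```

In write-through mode a line is never dirty, so the write-back step can only
appear when the write-back strategy is in use.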
872
 
873
Cache/Memory Coherency
874
^^^^^^^^^^^^^^^^^^^^^^
875
Data cache in OR1200 operates in either write-through or write-back mode. The
default mode is defined at synthesis time, and the mode can be selected at run
time when DMMU is used. There is currently no ((coherency)) support between
local data cache and caches of other processors.
879
 
880
Data Cache Enabling/Disabling
881
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
882
Data cache is disabled at power up. Entire data cache can be enabled by setting
883
bit SR[DCE] to one. Before data cache is enabled, it must be invalidated.
884
 
885
Data Cache Invalidation
886
^^^^^^^^^^^^^^^^^^^^^^^
887
Data cache in OR1200 does not support ((invalidation)) of entire data
888
cache. Normal procedure to invalidate entire data cache is to cycle through
889
all data cache lines and invalidate each line separately.
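
A minimal sketch of that loop follows. It assumes a direct-mapped cache, an
8 KB cache size and a 16-byte line (four 32-bit words, matching a 4 beat
burst); all three are assumptions to be replaced with the values of the actual
configuration. The +write_spr+ callback is hypothetical and stands in for an
l.mtspr to the DCBIR register described under Data Cache Line Invalidate.

```python
def line_addresses(cache_bytes=8192, line_bytes=16):
    """Yield one effective address per cache line, covering the whole cache."""
    for address in range(0, cache_bytes, line_bytes):
        yield address


def invalidate_data_cache(write_spr, cache_bytes=8192, line_bytes=16):
    """Invalidate every line by writing its address to DCBIR, one at a time."""
    for address in line_addresses(cache_bytes, line_bytes):
        write_spr("DCBIR", address)
```

With the default parameters the loop issues 512 writes, one per line.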
890
 
891
Data Cache Locking
892
^^^^^^^^^^^^^^^^^^
893
Data cache implements way ((locking)) bits in data cache control register
894
DCCR. Bits LWx lock individual ways when they are set to one.
895
 
896
Data Cache Line Prefetch
897
^^^^^^^^^^^^^^^^^^^^^^^^
898
Data cache line ((prefetch)) is optional in the OpenRISC 1000 architecture and
899
is not implemented in OR1200.
900
 
901
Data Cache Line ((Flush))
902
^^^^^^^^^^^^^^^^^^^^^^^^^
903
Operation is performed by writing effective address to the DCBFR register.
904
 
905
When a cache line is valid and clean, or the cache is in write-through
906
strategy, the line is invalidated and no write-back occurs.
907
 
908
Data Cache Line Invalidate
909
^^^^^^^^^^^^^^^^^^^^^^^^^^
910
Data cache line ((invalidate)) invalidates a single data cache line. Operation
911
is performed by writing effective address to the DCBIR register.  If cache
912
is in write-back strategy, it is best to use the line flush function.
913
 
914
Data Cache Line ((Write-back))
915
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
916
Operation is performed by writing effective address to the DCBWR register.
917
 
918
If cache is in ((write-through)) strategy, this operation is ignored, as no
cached line can be dirty and so nothing needs to be written back.
920
 
921
Data Cache Line ((Lock))
922
^^^^^^^^^^^^^^^^^^^^^^^^
923
Locking of individual data cache lines is not implemented in OR1200.
924
 
925
Data Cache ((inhibit)) with address bit 31 set
926
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
927
If DMMU is disabled, by default all addresses with bit 31 of the address
928
asserted high will cause the data cache to be inhibited, meaning no reads
929
or writes are cached.
930
 
931
If the ((DMMU)) is enabled, it is possible for any address to be inhibited
932
or not, and in these modes the cache behaves accordingly.
933
 
934
Instruction ((Cache)) Operation
935
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
936
Instruction Cache Instruction ((Fetch)) Access
937
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
938
Instruction unit requests instructions from the instruction cache and forwards
939
them to the instruction queue inside instruction unit. Therefore instruction
940
unit is tightly coupled with the instruction cache.
941
 
942
If there is no instruction cache line ((miss)) nor ITLB miss, instruction fetch
943
operation takes one clock cycle to execute.
944
 
945
Instruction cache cannot be explicitly modified like data cache can be with
946
store instructions.
947
 
948
Instruction Cache Line Fill Operation
949
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
950
On a cache miss, a 4 beat sequential read burst with critical word first is
951
performed. Critical word is forwarded to the instruction unit to minimize
952
performance loss because of the cache miss.
953
 
954
[[wb_block_read_fig]]
955
.WISHBONE Block Read Cycle
956
image::img/wb_block_read.gif[scaledwidth="70%",align="center"]
957
 
958
<<wb_block_read_fig>> shows how a cache line is read in WISHBONE read block
cycle composed out of four read transfers.  If +iwb_ERR_I+ or +iwb_RTY_I+ is
asserted instead of usual +iwb_ACK_I+, bus error exception is invoked.
961
 
962
Cache/Memory ((Coherency))
963
^^^^^^^^^^^^^^^^^^^^^^^^^^
964
OR1200 is not intended for use in multiprocessor environments. Therefore no
965
support for coherency between local instruction cache and caches of other
966
processors or main memory is implemented.
967
 
968
Instruction Cache Enabling/Disabling
969
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
970
Instruction cache is disabled at power up. Entire instruction cache can be
971
enabled by setting bit SR[ICE] to one. Before instruction cache is enabled,
972
it must be invalidated.
973
 
974
Instruction Cache ((Invalidation))
975
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
976
Instruction cache in OR1200 does not support invalidation of entire instruction
977
cache. Normal procedure to invalidate entire instruction cache is to cycle
978
through all instruction cache lines and invalidate each line separately.
979
 
980
Instruction Cache Locking
981
^^^^^^^^^^^^^^^^^^^^^^^^^
982
Instruction cache implements way locking bits in instruction cache control
983
register ICCR. Bits LWx lock individual ways when they are set to one.
984
 
985
Instruction Cache Line ((Prefetch))
986
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
987
Instruction cache line prefetch is optional in the OpenRISC 1000 architecture
988
and is not implemented in OR1200.
989
 
990
Instruction Cache Line ((Invalidate))
991
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
992
Instruction cache line invalidate invalidates a single instruction cache
993
line. Operation is performed by writing effective address to the ICBIR
994
register.
995
 
996
Instruction ((Cache Line Lock))
997
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
998
Locking of individual instruction cache lines is not implemented in OR1200.
999
 
1000
Data MMU
1001
~~~~~~~~
1002
Translation Disabled
1003
^^^^^^^^^^^^^^^^^^^^
1004
Load/store address translation can be disabled by clearing bit SR[DME]. If
1005
translation is disabled, then physical address used to access data cache
1006
and optionally provided on +dwb_ADDR_O+, is the same as load/store effective
1007
address.
1008
(((Address Translation,Data)))
1009
 
1010
Translation Enabled
1011
^^^^^^^^^^^^^^^^^^^
1012
Load/store address translation can be enabled by setting bit SR[DME]. If
1013
translation is enabled, it provides load/store effective address to physical
1014
address translation and page protection for memory accesses.
1015
(((Address Translation,Data)))
1016
 
1017
[[addr_translation_fig]]
1018
.32-bit Address Translation Mechanism using Two-Level Page Table
1019
image::img/addr_translation.gif[scaledwidth="70%",align="center"]
1020
 
1021
In OR1200 case, ((page tables)) must be managed by operating system's virtual
1022
memory management subsystem. <<addr_translation_fig>> shows address translation
1023
using two-level page table. Refer to <> for one-level page
1024
table address translation as well as for details about address translation
1025
and page table content.
1026
 
1027
((DMMUCR)) and Flush of Entire ((DTLB))
1028
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1029
DMMUCR is not implemented in OR1200. Therefore page table base pointer (PTBP)
1030
must be stored in software variable. Flush of entire DTLB must be performed
1031
by software flush of every DTLB entry separately. Software flush is performed
1032
by manually writing bits from the TLB entries back to PTEs.
1033
 
1034
Page Protection
1035
^^^^^^^^^^^^^^^
1036
After a virtual address is determined to be within a page covered by the
1037
valid PTE, the access is validated by the memory protection mechanism. If
1038
this protection mechanism prohibits the access, a data page fault exception
1039
is generated.
1040
(((Page Protection,Data)))
1041
 
1042
The memory protection mechanism allows selectively granting read access
1043
and write access for both supervisor and user modes. The page protection
1044
mechanism provides protection at all page level granularities.
1045
 
1046
[[protection_attrs_ldst_table]]
1047
.Protection Attributes for Load/Store Accesses
1048
[width="70%",options="header"]
1049
|================================
1050
| Protection attribute  | Meaning
1051
| DTLBWyTR[SREx]        | Enable load operations in supervisor mode to the
1052
  page.
1053
| DTLBWyTR[SWEx]        | Enable store operations in supervisor mode to the
1054
  page.
1055
| DTLBWyTR[UREx]        | Enable load operations in user mode to the page.
1056
| DTLBWyTR[UWEx]        | Enable store operations in user mode to the page.
1057
|================================
1058
 
1059
<<protection_attrs_ldst_table>> lists page protection attributes defined in
the DTLBWyTR register. For each individual page, the appropriate strategy out
of seven possible strategies is programmed with the PPI field of the PTE.
Because OR1200 does not implement DMMUPR, translation of PTE[PPI] into suitable
set of protection bits must be performed by software and written into DTLBWyTR.
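
The effect of the four protection attributes can be modeled as follows. This
is a behavioral sketch only: the class and function names are hypothetical,
and the attributes are held as booleans rather than as bit positions in
DTLBWyTR.

```python
from dataclasses import dataclass


class DataPageFault(Exception):
    """Models the data page fault exception raised on a protection violation."""


@dataclass(frozen=True)
class PageProtection:
    sre: bool = False  # supervisor read (load) enable
    swe: bool = False  # supervisor write (store) enable
    ure: bool = False  # user read (load) enable
    uwe: bool = False  # user write (store) enable


def access_allowed(prot, is_store, supervisor):
    """Return True when the load or store is permitted for the current mode."""
    if supervisor:
        return prot.swe if is_store else prot.sre
    return prot.uwe if is_store else prot.ure


def check_access(prot, is_store, supervisor):
    """Raise DataPageFault when the access is prohibited."""
    if not access_allowed(prot, is_store, supervisor):
        raise DataPageFault("store" if is_store else "load")
```

A page that is read-only for user code and read-write for the supervisor would
be described as +PageProtection(sre=True, swe=True, ure=True)+.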
1064
 
1065
((DTLB)) Entry Reload
1066
^^^^^^^^^^^^^^^^^^^^^
1067
OR1200 does not implement DTLB entry reloads in hardware. Instead software
1068
routine must be used to search page table for correct page table entry (PTE)
1069
and copy it into the DTLB. Software is responsible for maintaining accessed
1070
and dirty bits in the page tables.
1071
 
1072
When LSU computes load/store effective address whose physical address is
1073
not already cached by DTLB, a DTLB miss exception is invoked.
1074
 
1075
The DTLB reload routine must load the correct ((PTE)) into the ((DTLBWyMR))
and ((DTLBWyTR)) registers of one of the possible DTLB ways.
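
The software reload flow can be sketched as below. Everything concrete in it
is an assumption made for illustration: the 8 KB page size, the 64 sets per
way, the dictionary used as a page table, and the use of two small records to
stand in for the DTLBWyMR and DTLBWyTR registers. A real handler would walk the
page table in memory and write the registers with l.mtspr.

```python
PAGE_SHIFT = 13  # assumption: 8 KB pages
DTLB_SETS = 64   # assumption: number of entries in one DTLB way


class DataPageFault(Exception):
    """Raised by the miss handler when no matching PTE exists."""


def dtlb_miss_handler(effective_address, page_table, dtlb_way):
    """Find the PTE for an address and copy it into one DTLB way.

    page_table: dict mapping virtual page number -> (physical page number,
                protection), standing in for the OS page tables
    dtlb_way:   list of DTLB_SETS entries, each None or a (match, translate) pair
    Returns the index of the entry that was reloaded.
    """
    vpn = effective_address >> PAGE_SHIFT
    pte = page_table.get(vpn)
    if pte is None:
        # No matching PTE: the handler generates a data page fault.
        raise DataPageFault(hex(effective_address))
    ppn, protection = pte
    index = vpn % DTLB_SETS
    match = {"vpn": vpn, "valid": True}                 # models DTLBWyMR
    translate = {"ppn": ppn, "protection": protection}  # models DTLBWyTR
    dtlb_way[index] = (match, translate)
    return index
```

Because the reload is done in software, the choice of which way to overwrite
is also left to the handler; the sketch simply receives the way as an argument.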
1077
 
1078
DTLB Entry Invalidation
1079
^^^^^^^^^^^^^^^^^^^^^^^
1080
Special-purpose register DTLBEIR must be written with the effective address
1081
and corresponding DTLB entry will be invalidated in the local DTLB.
1082
 
1083
Locking DTLB Entries
1084
^^^^^^^^^^^^^^^^^^^^
1085
Since all DTLB entry reloads are performed in software, there is no hardware
1086
locking of DTLB entries. Instead it is up to the software reload routine to
1087
avoid replacing some of the entries if so desired.
1088
 
1089
Page Attribute - Dirty (D)
1090
^^^^^^^^^^^^^^^^^^^^^^^^^^
1091
Dirty (D) attribute is not implemented in OR1200 DTLB. It is up to the
1092
operating system to generate dirty attribute bit with page protection
1093
mechanism.
1094
(((Page Attributes,Data)))
1095
 
1096
Page Attribute - Accessed (A)
1097
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1098
Accessed (A) attribute is not implemented in OR1200 DTLB. It is up to the
1099
operating system to generate accessed attribute bit with page protection
1100
mechanism.
1101
(((Page Attributes,Data)))
1102
 
1103
Page Attribute - Weakly Ordered Memory (WOM)
1104
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1105
Weakly ordered memory (WOM) attribute is not needed in OR1200 because all
1106
memory accesses are serialized and therefore this attribute is not implemented.
1107
(((Page Attributes,Data)))
1108
 
1109
Page Attribute - Write-Back Cache (WBC)
1110
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1111
Write-back cache (WBC) attribute is not implemented as the data cache cannot
1112
be configured at run time to be write-back enabled if write-through strategy
1113
was selected at synthesis-time.
1114
(((Page Attributes,Data)))
1115
 
1116
Page Attribute - Caching-Inhibited (CI)
1117
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1118
Caching-inhibited (CI) attribute is not implemented in OR1200 DTLB. Cached
1119
and uncached regions are divided by bit 30 of data effective address.
1120
(((Page Attributes,Data)))
1121
 
1122
[[data_cached_regions_table]]
1123
.Cached and uncached regions
1124
[width="70%",options="header"]
1125
|===============================
1126
| Effective Address     | Region
1127
| 0x00000000 - 0x3FFFFFFF       | Cached
1128
| 0x40000000 - 0x7FFFFFFF       | Uncached
1129
| 0x80000000 - 0xBFFFFFFF       | Cached
1130
| 0xC0000000 - 0xFFFFFFFF       | Uncached
1131
|===============================
1132
 
1133
Uncached accesses must be performed when I/O registers are memory mapped
1134
and all reads and writes must be always performed directly to the external
1135
interface and not to the data cache.
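
The region table above reduces to a test of bit 30 of the effective address.
A minimal sketch, with a hypothetical function name:

```python
def is_cached(effective_address):
    """Return True when bit 30 of a 32-bit effective address is clear."""
    return ((effective_address >> 30) & 1) == 0
```

Memory-mapped I/O registers would therefore be placed in one of the ranges for
which this function returns False.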
1136
 
1137
Page Attribute - Cache Coherency (CC)
1138
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1139
Cache coherency (CC) attribute is not needed in OR1200 because it does not
1140
implement support for multiprocessor environments and because data cache
1141
operates only in write-through mode and therefore this attribute is not
1142
implemented.
1143
(((Page Attributes,Data)))
1144
 
1145
((Instruction MMU))
1146
~~~~~~~~~~~~~~~~~~~
1147
Translation Disabled
1148
^^^^^^^^^^^^^^^^^^^^
1149
Instruction fetch address translation can be disabled by clearing bit
1150
SR[IME]. If translation is disabled, then physical address used to access
1151
instruction cache and optionally provided on +iwb_ADDR_O+, is the same as
1152
instruction fetch effective address.
1153
(((Address Translation,Instruction)))
1154
 
1155
Translation Enabled
1156
^^^^^^^^^^^^^^^^^^^
1157
Instruction fetch address translation can be enabled by setting bit
1158
SR[IME]. If translation is enabled, it provides instruction fetch effective
1159
address to physical address translation and page protection for instruction
1160
fetch accesses.
1161
(((Address Translation,Instruction)))
1162
 
1163
[[addr_translation_rep_fig]]
1164
.32-bit Address Translation Mechanism using Two-Level Page Table
1165
image::img/addr_translation.gif[scaledwidth="70%",align="center"]
1166
 
1167
In OR1200 case, page tables must be managed by operating system's virtual
memory management subsystem. <<addr_translation_rep_fig>> shows address
1169
translation using two-level page table. Refer to <> for
1170
one-level page table address translation as well as for details about address
1171
translation and page table content.
1172
 
1173
((IMMUCR)) and ((Flush)) of Entire ITLB
1174
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1175
IMMUCR is not implemented in OR1200. Therefore page table base pointer (PTBP)
1176
must be stored in software variable. Flush of entire ITLB must be performed
1177
by software flush of every ITLB entry separately. Software flush is performed
1178
by manually writing bits from the TLB entries back to PTEs.
1179
 
1180
Page Protection
1181
^^^^^^^^^^^^^^^
1182
After a virtual address is determined to be within a page covered by the
1183
valid PTE, the access is validated by the memory protection mechanism. If
1184
this protection mechanism prohibits the access, an instruction page fault
1185
exception is generated.
1186
(((Page Protection,Instruction)))
1187
 
1188
The memory protection mechanism allows selectively granting execute access
1189
for both supervisor and user modes. The page protection mechanism provides
1190
protection at all page level granularities.
1191
 
1192
[[protection_attrs_inst_table]]
1193
.Protection Attributes for Instruction Fetch Accesses
1194
[width="70%",options="header"]
1195
|================================
1196
| Protection attribute  | Meaning
1197
| ITLBWyTR[SXEx]        | Enable execute operations in supervisor mode of the
1198
  page.
1199
| ITLBWyTR[UXEx]        | Enable execute operations in user mode of the page.
1200
|================================
1201
 
1202
<<protection_attrs_inst_table>> lists page protection attributes defined
in the ITLBWyTR register. For each individual page, the appropriate strategy
out of seven possible strategies is programmed with the PPI field of the PTE.
Because OR1200 does not implement IMMUPR, translation of PTE[PPI] into suitable
set of protection bits must be performed by software and written into ITLBWyTR.
1207
 
1208
((ITLB)) Entry Reload
1209
^^^^^^^^^^^^^^^^^^^^^
1210
OR1200 does not implement ITLB entry reloads in hardware. Instead software
1211
routine must be used to search page table for correct page table entry (PTE)
1212
and copy it into the ITLB. Software is responsible for maintaining accessed
1213
bit in the page tables.
1214
 
1215
When the instruction unit computes an instruction fetch effective address whose
physical address is not already cached by ITLB, an ITLB miss exception is
invoked.
1217
 
1218
The ITLB reload routine must load the correct PTE into the ITLBWyMR and
ITLBWyTR registers of one of the possible ITLB ways.
1220
 
1221
ITLB Entry Invalidation
1222
^^^^^^^^^^^^^^^^^^^^^^^
1223
Special-purpose register ITLBEIR must be written with the effective address
1224
and corresponding ITLB entry will be invalidated in the local ITLB.
1225
 
1226
Locking ITLB Entries
1227
^^^^^^^^^^^^^^^^^^^^
1228
Since all ITLB entry reloads are performed in software, there is no hardware
1229
locking of ITLB entries. Instead it is up to the software reload routine to
1230
avoid replacing some of the entries if so desired.
1231
 
1232
Page Attribute - Dirty (D)
1233
^^^^^^^^^^^^^^^^^^^^^^^^^^
1234
Dirty (D) attribute resides in the PTE but it is not used by the IMMU.
1235
(((Page Attributes,Instruction)))
1236
 
1237
Page Attribute - Accessed (A)
1238
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1239
Accessed (A) attribute is not implemented in OR1200 ITLB. It is up to the
1240
operating system to generate accessed attribute bit with page protection
1241
mechanism.
1242
(((Page Attributes,Instruction)))
1243
 
1244
Page Attribute - Weakly Ordered Memory (WOM)
1245
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1246
Weakly ordered memory (WOM) attribute is not needed in OR1200 because all
1247
instruction fetch accesses are serialized and therefore this attribute is
1248
not implemented.
1249
(((Page Attributes,Instruction)))
1250
 
1251
Page Attribute - Write-Back Cache (WBC)
1252
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1253
Write-back cache (WBC) attribute resides in the PTE but it is not used by
1254
the IMMU.
1255
(((Page Attributes,Instruction)))
1256
 
1257
Page Attribute - Caching-Inhibited (CI)
1258
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1259
Caching-inhibited (CI) attribute is not implemented in OR1200 ITLB. Cached
1260
and uncached regions are divided by bit 30 of instruction effective address.
1261
(((Page Attributes,Instruction)))
1262
 
1263
[[inst_cached_regions_table]]
1264
.Cached and uncached regions
1265
[width="70%",options="header"]
1266
|===============================
1267
| Effective Address     | Region
1268
| 0x00000000 - 0x3FFFFFFF       | Cached
1269
| 0x40000000 - 0x7FFFFFFF       | Uncached
1270
| 0x80000000 - 0xBFFFFFFF       | Cached
1271
| 0xC0000000 - 0xFFFFFFFF       | Uncached
1272
|===============================
1273
 
1274
Page Attribute - Cache Coherency (CC)
1275
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1276
Cache coherency (CC) attribute resides in the PTE but it is not used by
1277
the IMMU.
1278
(((Page Attributes,Instruction)))
1279
 
1280
((Programmable Interrupt Controller))
1281
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1282
PICMR special-purpose register is used to mask or unmask up to 30 programmable
1283
interrupt sources. PICPR special-purpose register is used to assign low or
1284
high priority to maximum of 30 interrupt sources.
1285
 
1286
PICSR special-purpose register is used to determine status of each interrupt
1287
input. Bits in PICSR represent status of the interrupt inputs and the
1288
actual interrupt must be cleared in the device that is the source of a
1289
pending interrupt.
1290
 
1291
The ((PIC)) implementation in the OR1200 differs from the architecture
1292
specification. The PIC instead offers a latched level-sensitive interrupt.
1293
 
1294
Once an interrupt line is latched (i.e. its value appears in PICSR), no
1295
new interrupts can be triggered for that line until its bit in PICSR is
1296
cleared. The usual sequence for an interrupt handler is then as follows.
1297
 
1298
. Peripheral asserts interrupt, which is latched and triggers handler.
1299
. Handler processes interrupt.
1300
. Handler notifies peripheral that the interrupt has been processed (typically
1301
  via a memory mapped register).
1302
. Peripheral deasserts interrupt.
1303
. Handler clears corresponding bit in PICSR and returns.
1304
 
1305
It is assumed that the peripheral will de-assert its interrupt promptly
1306
(within 1-2 cycles). Otherwise on exiting the interrupt handler, having
1307
cleared PICSR, the level sensitive interrupt will immediately retrigger.
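
The latching and retrigger behavior can be illustrated with a small model.
It is a sketch under stated assumptions, not a description of the RTL: a set
bit in PICMR is taken to mean the source is unmasked, masking is assumed to be
applied before latching, and the class and method names are hypothetical.

```python
class LatchedPic:
    """Behavioral model of a latched level-sensitive interrupt controller."""

    def __init__(self):
        self.picmr = 0  # mask register: 1 = interrupt source unmasked (assumed)
        self.picsr = 0  # status register: latched interrupt lines
        self.lines = 0  # current level of the interrupt inputs

    def _latch(self):
        # Any unmasked line that is currently high is captured in PICSR.
        self.picsr |= self.lines & self.picmr

    def set_line(self, irq, asserted):
        """Drive one interrupt input high or low."""
        bit = 1 << irq
        if asserted:
            self.lines |= bit
        else:
            self.lines &= ~bit
        self._latch()

    def clear_status(self, irq):
        """Handler clears the PICSR bit; a still-asserted line relatches."""
        self.picsr &= ~(1 << irq)
        self._latch()

    def pending(self):
        return self.picsr != 0
```

Calling +clear_status+ while the peripheral still drives its line high leaves
the interrupt pending, which is the immediate retrigger described above.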
1308
 
1309
((Tick Timer))
1310
~~~~~~~~~~~~~~
1311
Tick timer facility is enabled with TTMR[M]. TTCR is incremented with each
1312
clock cycle and a high priority interrupt can be asserted whenever lower 28
1313
bits of TTCR match TTMR[TP] and TTMR[IE] is set.
1314
 
1315
TTCR restarts counting from zero when match event happens and TTMR[M] is
1316
0x1. If TTMR[M] is 0x2, TTCR is stopped when match event happens and TTCR
1317
must be changed to start counting again. When TTMR[M] is 0x3, TTCR keeps
1318
counting even when match event happens.
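
The three modes can be compared with a short behavioral model. The sketch
treats TTMR[M] equal to zero as "timer disabled" and checks for a match
immediately after each increment; both points are assumptions, as are the
class and attribute names.

```python
TP_MASK = (1 << 28) - 1  # the lower 28 bits of TTCR are compared with TTMR[TP]


class TickTimer:
    def __init__(self, mode, period, interrupt_enable=True):
        self.mode = mode                          # TTMR[M]
        self.period = period & TP_MASK            # TTMR[TP]
        self.interrupt_enable = interrupt_enable  # TTMR[IE]
        self.ttcr = 0
        self.running = mode != 0                  # assumption: M = 0 disables
        self.interrupt = False

    def write_ttcr(self, value):
        """Software write to TTCR; restarts a timer stopped in mode 0x2."""
        self.ttcr = value & 0xFFFFFFFF
        self.running = self.mode != 0

    def tick(self):
        """Advance the model by one clock cycle."""
        if not self.running:
            return
        self.ttcr = (self.ttcr + 1) & 0xFFFFFFFF
        if (self.ttcr & TP_MASK) == self.period:
            if self.interrupt_enable:
                self.interrupt = True
            if self.mode == 1:
                self.ttcr = 0         # restart counting from zero
            elif self.mode == 2:
                self.running = False  # stop until TTCR is written
            # mode 3: keep counting
```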
1319
 
1320
((Power Management))
1321
~~~~~~~~~~~~~~~~~~~~
1322
((Clock Gating)) and Frequency Changing Versus CPU Stalling
1323
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1324
If the system doesn't support clock gating and if changing clock frequency in
slow down mode is not possible, the CPU can be stalled for a certain number of
clock cycles. This saves much less power than clock gating or frequency
changing, but it still reduces power consumption.
1328
 
1329
Slow Down Mode
1330
^^^^^^^^^^^^^^
1331
Slow down mode is software controlled with the 4-bit value in PMR[SDF]. Lower
1332
value specifies higher expected performance from the processor core. Usually
1333
PMR[SDF] is dynamically set by the operating system's idle routine, that
1334
monitors the usage of the processor core.
1335
(((Mode,Slow Down)))
1336
 
1337
PMR[SDF] is broadcast on +pm_clksd+. External clock generator should adjust
1338
clock frequency according to the value of +pm_clksd+. Exact slow down factors
1339
are not defined but 0xF should go all the way down to 32.768 kHz.
1340
 
1341
With +pm_clksd+ equal to 0xF, +pm_lvolt+ is asserted. This is an indication for
1342
the external power supply to lower the voltage.
1343
 
1344
Doze Mode
1345
^^^^^^^^^
1346
To switch to doze mode, software should set the PMR[DME]. Once an interrupt
1347
is received by the programmable interrupt controller (PIC), +pm_wakeup+
1348
is asserted and external clock generation circuitry should enable all
1349
clocks. Once clocks are running RISC is switched back again to the normal
1350
mode and PMR[DME] is cleared.
1351
(((Mode,Doze)))
1352
 
1353
When doze mode is enabled, +pm_dc_gate+, +pm_ic_gate+, +pm_dmmu_gate+,
1354
+pm_immu_gate+ and +pm_cpu_gate+ are asserted. As a result all clocks except
1355
+clk_tt+ should be gated by external clock generation circuitry.
1356
 
1357
Sleep Mode
1358
^^^^^^^^^^
1359
To switch to sleep mode, software should set the PMR[SME]. Once an interrupt
1360
is received by the programmable interrupt controller (PIC), +pm_wakeup+ is
1361
asserted and external clock generation should enable all clocks. Once clocks
1362
are running, RISC is switched back again to the normal mode and PMR[SME]
1363
is cleared.
1364
(((Mode,Sleep)))
1365
 
1366
When sleep mode is enabled, +pm_dc_gate+, +pm_ic_gate+, +pm_dmmu_gate+,
1367
+pm_immu_gate+, +pm_cpu_gate+ and +pm_tt_gate+ are asserted. As a result
1368
all clocks including +clk_tt+ should be gated by external clock generation
1369
circuitry.
1370
 
1371
In sleep mode, +pm_lvolt+ is asserted. This is an indication for the external
1372
power supply to lower the voltage.
1373
 
1374
Clock Gating
1375
^^^^^^^^^^^^
1376
((Clock gating)) feature is not implemented in OR1200 power management.
1377
 
1378
Disabled Units Force Clock Gating
1379
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1380
Units that are disabled in special-purpose register SR, have their clock
1381
gate signals asserted. Cleared bits SR[DCE], SR[ICE], SR[DME] and SR[IME]
1382
directly force assertion of +pm_dc_gate+, +pm_ic_gate+, +pm_dmmu_gate+
1383
and +pm_immu_gate+.
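
Taken together, the doze, sleep and disabled-unit rules determine which gate
signals are asserted. The sketch below collects them in one hypothetical
function; the SR bits are passed as booleans and the mode as a string, which
are modeling choices rather than anything defined by the core.

```python
def gate_signals(dce, ice, dme, ime, mode="normal"):
    """Return the set of asserted clock gate signals.

    dce, ice, dme, ime: values of SR[DCE], SR[ICE], SR[DME] and SR[IME]
    mode:               "normal", "doze" or "sleep"
    """
    if mode not in ("normal", "doze", "sleep"):
        raise ValueError("unknown mode: " + mode)
    asserted = set()
    if mode in ("doze", "sleep"):
        asserted |= {"pm_dc_gate", "pm_ic_gate", "pm_dmmu_gate",
                     "pm_immu_gate", "pm_cpu_gate"}
    if mode == "sleep":
        asserted.add("pm_tt_gate")  # sleep mode also gates the tick timer
    # A cleared enable bit in SR forces the matching gate signal.
    if not dce:
        asserted.add("pm_dc_gate")
    if not ice:
        asserted.add("pm_ic_gate")
    if not dme:
        asserted.add("pm_dmmu_gate")
    if not ime:
        asserted.add("pm_immu_gate")
    return asserted
```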
1384
 
1385
((Debug Unit))
1386
~~~~~~~~~~~~~~
1387
Debug unit can be controlled through development interface or it can operate
1388
independently programmed and handled by the RISC's resident debug software.
1389
 
1390
((Watchpoints))
1391
^^^^^^^^^^^^^^^
1392
OR1200 debug unit does not implement OR1000 architecture watchpoints.
1393
 
1394
((Breakpoint)) Exception
1395
^^^^^^^^^^^^^^^^^^^^^^^^
1396
DMR2[WGB] bits specify which watchpoints invoke breakpoint
1397
exception. By invoking breakpoint exception, target resident debugger can
1398
be built.
1399
 
1400
Breakpoint is broadcast on development interface on +dbg_bp_o+.
1401
 
1402
((Development Interface))
1403
~~~~~~~~~~~~~~~~~~~~~~~~~
1404
NOTE: The information in this section is to be reviewed. It is the author's
1405
opinion that the debug interface is now largely provided by the SPR mappings,
1406
and no special sideband functions exist aside from stalling and resetting
1407
the core.
1408
 
1409
An additional _development and debug interface IP_ core may be used to connect
1410
OpenRISC 1200 to standard debuggers using IEEE.1149.1 (JTAG) protocol.
1411
 
((Debugging)) Through ((Development Interface))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The DSR special-purpose register specifies which exceptions cause the core
to stop the execution of the exception handler and turn over control to
the development interface. It can be programmed by the resident debug software
or by the development interface.

The DRR special-purpose register specifies which event caused the core to
stop the execution of program flow and turn over control to the development
interface. It should be cleared by the resident debug software or by the
development interface.

The DIR special-purpose register is not implemented.
 
Reading PC, Load/Store EA, Load Data, Store Data, Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Crucial information like the ((program counter)) (PC), load/store effective
address (LSEA), load data, store data and the current instruction in the
execution pipeline can be asynchronously read through the development interface.

[[dev_commands_table]]
.Development Interface Operation Commands
[width="70%",options="header"]
|========================
| dbg_op_i[2:0] | Meaning
| 0x0           | Reading Program Counter (PC)
| 0x1           | Reading Load/Store Effective Address
| 0x2           | Reading Load Data
| 0x3           | Reading Store Data
| 0x4           | Reading SPR
| 0x5           | Writing SPR
| 0x6           | Reading Instruction in Execution Pipeline
| 0x7           | Reserved
|========================

<<dev_commands_table>> lists the operation commands that control what is read
or written through the development interface. All operations except reads and
writes of SPRs are asynchronous.
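
For host-side debugger tooling, the command encoding can be captured as an
enumeration. This is only a sketch: the names +DbgOp+ and +is_synchronous+ are
hypothetical, and only the numeric values come from the table.

```python
from enum import IntEnum


class DbgOp(IntEnum):
    """dbg_op_i[2:0] encodings; 0x7 is reserved and deliberately omitted."""
    READ_PC = 0x0
    READ_LSEA = 0x1
    READ_LOAD_DATA = 0x2
    READ_STORE_DATA = 0x3
    READ_SPR = 0x4
    WRITE_SPR = 0x5
    READ_INSN = 0x6


def is_synchronous(op):
    """SPR reads and writes are the only operations that are not asynchronous."""
    return DbgOp(op) in (DbgOp.READ_SPR, DbgOp.WRITE_SPR)


print(is_synchronous(DbgOp.READ_PC), is_synchronous(DbgOp.WRITE_SPR))
```

Passing the reserved value 0x7 to +DbgOp+ raises +ValueError+, which keeps
reserved encodings from being issued by accident.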
 
Reading and Writing SPRs Through Development Interface
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
For reads and writes of SPRs, +dbg_op_i+ must be set to 0x4 and 0x5,
respectively.

[[dev_interface_cycles_fig]]
.Development Interface Cycles
image::img/dev_interface_cycles.gif[scaledwidth="70%",align="center"]

<<dev_interface_cycles_fig>> shows development interface cycles. Writes must
be synchronous to the main RISC clock positive edge and should take one clock
cycle. Reads must take two clock cycles because access to synchronous cache
lines or to TLB entries introduces one clock cycle of delay.

If required, an external debugger can stop the CPU core by asserting
+dbg_stall_i+. This gives it enough time to read all registers of interest
from the RISC, or guarantees that writes into SPRs are performed without the
RISC writing to the same registers.
 
Tracking ((Data Flow))
^^^^^^^^^^^^^^^^^^^^^^
An external debugger can monitor and record data flow inside the RISC for
debugging purposes and profiling analysis. This is accomplished by monitoring
the status of the load/store unit, the load/store effective address and the
load/store data, all available at the development interface.

[[status_ldst_unit_table]]
.Status of the Load/Store Unit
[width="70%",options="header"]
|============================================================
| dbg_lss_o[3:0]        | Load/Store Instruction in Execution
| 0x0   | No load/store instruction in execution
| 0x1   | Reserved for load doubleword
| 0x2   | Load byte and zero extend
| 0x3   | Load byte and sign extend
| 0x4   | Load halfword and zero extend
| 0x5   | Load halfword and sign extend
| 0x6   | Load singleword and zero extend
| 0x7   | Load singleword and sign extend
| 0x8   | Reserved for store doubleword
| 0x9   | Reserved
| 0xA   | Store byte
| 0xB   | Reserved
| 0xC   | Store halfword
| 0xD   | Reserved
| 0xE   | Store singleword
| 0xF   | Reserved
|============================================================

An external trace buffer can capture all interesting data flow
events by analyzing the status of the load/store unit available on
+dbg_lss_o+. <<status_ldst_unit_table>> lists the status encodings for
the load/store unit.
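
A trace tool might decode +dbg_lss_o+ samples along the following lines. This
is a sketch only: the tuple layout and the name +decode_lss+ are assumptions
made for illustration, while the encodings are those of the table.

```python
# dbg_lss_o encodings from the table: (kind, size in bytes, extension).
LSS_STATUS = {
    0x2: ("load", 1, "zero"),
    0x3: ("load", 1, "sign"),
    0x4: ("load", 2, "zero"),
    0x5: ("load", 2, "sign"),
    0x6: ("load", 4, "zero"),
    0x7: ("load", 4, "sign"),
    0xA: ("store", 1, None),
    0xC: ("store", 2, None),
    0xE: ("store", 4, None),
}


def decode_lss(lss):
    """Decode dbg_lss_o[3:0]; None means no load/store is in execution."""
    lss &= 0xF
    if lss == 0x0:
        return None
    if lss not in LSS_STATUS:
        raise ValueError(f"reserved dbg_lss_o encoding: {lss:#x}")
    return LSS_STATUS[lss]


print(decode_lss(0x5))
```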
1504
 
Tracking ((Program Flow))
^^^^^^^^^^^^^^^^^^^^^^^^^
An external debugger can monitor and record program flow inside the RISC
for debugging purposes and profiling analysis. This is accomplished by
monitoring the status of the instruction unit, the PC and the fetched
instruction word, all available at the development interface.

[[status_inst_unit_table]]
.Status of the Instruction Unit
[width="70%",options="header"]
|=========================================
| dbg_is_o[1:0] | Instruction Fetch Status
| 0x0   | No instruction fetch in progress
| 0x1   | Normal instruction fetch
| 0x2   | Executing branch instruction
| 0x3   | Fetching instruction in delay slot
|=========================================

An external trace buffer can capture all interesting program flow
events by analyzing the status of the instruction unit available on
+dbg_is_o+. <<status_inst_unit_table>> lists the status encodings for
the instruction unit.
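
As a small illustration of profiling with this signal, the sketch below tallies
a list of sampled +dbg_is_o+ values. The short status labels and the function
name +summarize_fetch_trace+ are hypothetical; how samples are captured is
outside the scope of this example.

```python
from collections import Counter

# dbg_is_o[1:0] encodings from the table, with short labels chosen here.
IS_STATUS = {0x0: "idle", 0x1: "fetch", 0x2: "branch", 0x3: "delay_slot"}


def summarize_fetch_trace(samples):
    """Count how often each instruction fetch status appears in a trace."""
    return Counter(IS_STATUS[s & 0x3] for s in samples)


print(summarize_fetch_trace([1, 1, 2, 3, 1, 0]))
```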
1527
 
Triggering ((External Watchpoint Event))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
<<watchpoint_trigger_fig>> shows how the development interface can assert
+dbg_ewt_i+ and cause a watchpoint event. If programmed, an external watchpoint
event will cause a breakpoint exception.

[[watchpoint_trigger_fig]]
.Assertion of External Watchpoint Trigger
image::img/watchpoint_trigger.gif[scaledwidth="70%",align="center"]
 
((Registers))
-------------
This section describes all registers inside the OR1200 core. Shifting the _GRP_
number 11 bits left and adding the _REG_ number computes the address of each
special-purpose register. All registers are 32 bits wide from the software
perspective. _USER MODE_ and _SUPV MODE_ specify the valid access types for
each register in user mode and supervisor mode of operation. R/W stands for
read and write access and R stands for read-only access.
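
The address computation can be written out as a short sketch. The function name
+spr_address+ and the range checks are illustrative assumptions; the group and
register numbers in the example calls are those of SR (group 0, register 17)
and TTCR (group 10, register 1).

```python
def spr_address(grp, reg):
    """Compute an SPR address as (GRP << 11) + REG."""
    if grp < 0:
        raise ValueError("GRP must be non-negative")
    if not 0 <= reg < (1 << 11):
        raise ValueError("REG must fit in 11 bits")
    return (grp << 11) + reg


print(hex(spr_address(0, 17)))  # SR: 0x11
print(hex(spr_address(10, 1)))  # TTCR: 0x5001
```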
1546
 
((Registers list))
~~~~~~~~~~~~~~~~~~
[[regs_table]]
.List of All Registers
[width="95%",options="header"]
|============================================================================
| Grp # | Reg # | Reg Name      | USER MODE     | SUPV MODE     | Description
| 0     | 0     | ((VR))        | -             | R     | Version Register
| 0     | 1     | ((UPR))       | -             | R     | Unit Present Register
| 0     | 2     | ((CPUCFGR))   | -             | R     | CPU Configuration Register
| 0     | 3     | ((DMMUCFGR))  | -             | R     | Data MMU Configuration Register
| 0     | 4     | ((IMMUCFGR))  | -             | R     | Instruction MMU Configuration Register
| 0     | 5     | ((DCCFGR))    | -             | R     | Data Cache Configuration Register
| 0     | 6     | ((ICCFGR))    | -             | R     | Instruction Cache Configuration Register
| 0     | 7     | ((DCFGR))     | -             | R     | Debug Configuration Register
| 0     | 16    | ((PC))        | -             | R/W   | PC mapped to SPR space
| 0     | 17    | ((SR))        | -             | R/W   | Supervision Register
| 0     | 20    | ((FPCSR))     | -             | R/W   | FP Control Status Register
| 0     | 32    | ((EPCR0))     | -             | R/W   | Exception PC Register
| 0     | 48    | ((EEAR0))     | -             | R/W   | Exception EA Register
| 0     | 64    | ((ESR0))      | -             | R/W   | Exception SR Register
| 0     | 1024-1055     | ((GPR0-GPR31))        | -     | R/W   | GPRs mapped to SPR space
| 1     | 2             | ((DTLBEIR))   | -     | W     | Data TLB Entry Invalidate Register
| 1     | 1024-1151     | ((DTLBW0MR0-DTLBW0MR127))     | -     | R/W   | Data TLB Match Registers Way 0
| 1     | 1536-1663     | ((DTLBW0TR0-DTLBW0TR127))     | -     | R/W   | Data TLB Translate Registers Way 0
| 2     | 2             | ((ITLBEIR))   | -     | W     | Instruction TLB Entry Invalidate Register
| 2     | 1024-1151     | ((ITLBW0MR0-ITLBW0MR127))     | -     | R/W   | Instruction TLB Match Registers Way 0
| 2     | 1536-1663     | ((ITLBW0TR0-ITLBW0TR127))     | -     | R/W   | Instruction TLB Translate Registers Way 0
| 3     | 0     | ((DCCR))      | -             | R/W   | DC Control Register
| 3     | 2     | ((DCBFR))     | W             | W     | DC Block Flush Register
| 3     | 3     | ((DCBIR))     | W             | W     | DC Block Invalidate Register
| 3     | 4     | ((DCBWR))     | W             | W     | DC Block Write-back Register
| 4     | 0     | ((ICCR))      | -             | R/W   | IC Control Register
| 4     | 256   | ((ICBIR))     | W             | W     | IC Block Invalidate Register
| 5     | 256   | ((MACLO))     | R/W           | R/W   | MAC Low
| 5     | 257   | ((MACHI))     | R/W           | R/W   | MAC High
| 6     | 16    | ((DMR1))      | -             | R/W   | Debug Mode Register 1
| 6     | 17    | ((DMR2))      | -             | R/W   | Debug Mode Register 2
| 6     | 20    | ((DSR))       | -             | R/W   | Debug Stop Register
| 6     | 21    | ((DRR))       | -             | R/W   | Debug Reason Register
| 8     | 0     | ((PMR))       | -             | R/W   | Power Management Register
| 9     | 0     | ((PICMR))     | -             | R/W   | PIC Mask Register
| 9     | 2     | ((PICSR))     | -             | R/W   | PIC Status Register
| 10    | 0     | ((TTMR))      | -             | R/W   | Tick Timer Mode Register
| 10    | 1     | ((TTCR))      | R*            | R/W   | Tick Timer Count Register
|============================================================================

<<regs_table>> lists all OpenRISC 1000 special-purpose registers implemented
in OR1200. Registers VR and UPR and the configuration registers are described
below. For a description of the other registers refer to <<or1000_manual>>.
 
Register VR description
~~~~~~~~~~~~~~~~~~~~~~~
Special-purpose register VR identifies the version (model) and revision
level of the OpenRISC 1000 processor. It also specifies a possible standard
template on which this implementation is based.
(((Register,VR)))

[[vr_reg_table]]
.VR Register
[width="95%",options="header"]
|============================================================
| Bit # | Access        | Reset | Short Name    | Description
| 5:0   | R     | Revision      | REV           | Revision number of this document.
| 15:6  | R     | 0x0           | -             | Reserved
| 23:16 | R     | 0x00          | CFG           | Configuration should be read from UPR and configuration registers
| 31:24 | R     | 0x12          | VER           | Version number for OR1200 is fixed at 0x12.
|============================================================
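
Software reading VR could split it into its fields as sketched below. The
function name +decode_vr+ and the sample value 0x12000008 are hypothetical;
only the bit ranges come from the table.

```python
def decode_vr(vr):
    """Split a 32-bit VR value into the fields defined in the VR table."""
    return {
        "REV": vr & 0x3F,          # bits 5:0
        "CFG": (vr >> 16) & 0xFF,  # bits 23:16
        "VER": (vr >> 24) & 0xFF,  # bits 31:24
    }


fields = decode_vr(0x12000008)  # made-up sample value
print(hex(fields["VER"]), fields["REV"])
```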
1615
 
Register UPR description
~~~~~~~~~~~~~~~~~~~~~~~~
Special-purpose register UPR identifies the units present in the processor. It
has a bit for each implemented unit or functionality. Lower sixteen bits
identify present units defined in the OpenRISC 1000 architecture. Upper
sixteen bits define present custom units.
(((Register,UPR)))

[[upr_reg_table]]
.UPR Register
[width="95%",options="header"]
|============================================================
| Bit # | Access        | Reset | Short Name    | Description
| 0     | R             | 1     | UP            | UPR present
| 1     | R             | 1     | DCP           | Data cache present[†]
| 2     | R             | 1     | ICP           | Instruction cache present[†]
| 3     | R             | 1     | DMP           | Data MMU present[†]
| 4     | R             | 1     | IMP           | Instruction MMU present[†]
| 5     | R             | 1     | MP            | MAC present[†]
| 6     | R             | 1     | DUP           | Debug unit present[†]
| 7     | R             | 0     | PCUP          | Performance counters unit not present[†]
| 8     | R             | 1     | PMP           | Power Management Present[†]
| 9     | R             | 1     | PICP          | Programmable interrupt controller present
| 10    | R             | 1     | TTP           | Tick timer present
| 11    | R             | 1     | FPP           | Floating point present[†]
| 23:12 | R             | X     | -             | Reserved
| 31:24 | R             | 0xXXXX| CUP           | The user of the OR1200 core adds custom units.
|============================================================
[†]: if enabled at synthesis time
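
Start-up code can probe UPR to find out which optional units were synthesized.
The sketch below lists the set bits by short name; the function name
+units_present+ and the sample value are hypothetical, and the custom-unit
field (CUP) is not decoded.

```python
# Short names for UPR bits 0..11, in bit order, from the UPR table.
UPR_BITS = ["UP", "DCP", "ICP", "DMP", "IMP", "MP",
            "DUP", "PCUP", "PMP", "PICP", "TTP", "FPP"]


def units_present(upr):
    """Return the short names of all architecture-defined units flagged in UPR."""
    return [name for bit, name in enumerate(UPR_BITS) if upr & (1 << bit)]


print(units_present(0x77F))  # made-up value: no FPU, no performance counters
```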
1645
 
Register CPUCFGR description
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Special-purpose register CPUCFGR identifies the capabilities and configuration
of the CPU.
(((Register,CPUCFGR)))

[[cpucfgr_reg_table]]
.CPUCFGR Register
[width="95%",options="header"]
|============================================================
| Bit # | Access        | Reset | Short Name    | Description
| 3:0   | R             | 0x0   | NSGF          | Zero number of shadow GPR files
| 4     | R             | 0     | HGF           | No half GPR files[†]
| 5     | R             | 1     | OB32S         | ORBIS32 supported
| 6     | R             | 0     | OB64S         | ORBIS64 not supported
| 7     | R             | 1     | OF32S         | ORFPX32 supported[‡]
| 8     | R             | 0     | OF64S         | ORFPX64 not supported
| 9     | R             | 0     | OV64S         | ORVDX64 not supported
|============================================================
[†]: If disabled at synthesis time

[‡]: If FPU enabled at synthesis time
 
Register DMMUCFGR description
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Special-purpose register DMMUCFGR identifies the capabilities and configuration
of the DMMU.
(((Register,DMMUCFGR)))

[[dmmucfgr_reg_table]]
.DMMUCFGR Register
[width="95%",options="header"]
|============================================================
| Bit # | Access        | Reset | Short Name    | Description
| 1:0   | R             | 0x0   | NTW           | One DTLB way
| 4:2   | R             | 0x4 - 0x7     | NTS   | 16, 32, 64 or 128 DTLB sets
| 7:5   | R             | 0x0   | NAE           | No ATB Entries
| 8     | R             | 0     | CRI           | No DMMU control register implemented
| 9     | R             | 0     | PRI           | No protection register implemented
| 10    | R             | 1     | TEIRI         | DTLB entry invalidate register implemented
| 11    | R             | 0     | HTR           | No hardware DTLB reload
|============================================================
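
The NTS values 0x4 to 0x7 correspond to 16 to 128 sets, that is, the number of
sets is 2 raised to NTS. A decoding sketch under that reading follows. Treating
the number of ways as NTW + 1 is an assumption generalized from the single
documented value (0x0 means one way), and the name +decode_mmucfgr+ is
hypothetical. The same field layout is used by IMMUCFGR.

```python
def decode_mmucfgr(cfg):
    """Decode a DMMUCFGR (or IMMUCFGR) value into its documented fields."""
    return {
        "ways": (cfg & 0x3) + 1,          # NTW, bits 1:0 (assumed NTW + 1)
        "sets": 1 << ((cfg >> 2) & 0x7),  # NTS, bits 4:2 (2 ** NTS)
        "atb_entries": (cfg >> 5) & 0x7,  # NAE, bits 7:5
        "CRI": bool(cfg & (1 << 8)),
        "PRI": bool(cfg & (1 << 9)),
        "TEIRI": bool(cfg & (1 << 10)),
        "HTR": bool(cfg & (1 << 11)),
    }


# Made-up sample: NTS = 6 (64 sets) with TEIRI set.
print(decode_mmucfgr((6 << 2) | (1 << 10)))
```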
1688
 
Register IMMUCFGR description
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Special-purpose register IMMUCFGR identifies the capabilities and configuration
of the IMMU.
(((Register,IMMUCFGR)))

[[immucfgr_reg_table]]
.IMMUCFGR Register
[width="95%",options="header"]
|============================================================
| Bit # | Access        | Reset | Short Name    | Description
| 1:0   | R             | 0x0   | NTW           | One ITLB way
| 4:2   | R             | 0x4 - 0x7     | NTS   | 16, 32, 64 or 128 ITLB sets
| 7:5   | R             | 0x0   | NAE           | No ATB Entries
| 8     | R             | 0     | CRI           | No IMMU control register implemented
| 9     | R             | 0     | PRI           | No protection register implemented
| 10    | R             | 1     | TEIRI         | ITLB entry invalidate register implemented
| 11    | R             | 0     | HTR           | No hardware ITLB reload
|============================================================
 
Register DCCFGR description
~~~~~~~~~~~~~~~~~~~~~~~~~~~
Special-purpose register DCCFGR identifies the capabilities and configuration
of the data cache.
(((Register,DCCFGR)))

[[dccfgr_reg_table]]
.DCCFGR Register
[width="95%",options="header"]
|============================================================
| Bit # | Access        | Reset | Short Name    | Description
| 2:0   | R             | 0x0   | NCW           | One DC way
| 6:3   | R             | 0x4 - 0x7     | NCS   | 16, 32, 64 or 128 DC sets
| 7     | R             | 0x0   | CBS           | 16-byte cache block size
| 8     | R             | 0     | CWS           | Cache write-through strategy[†]
| 9     | R             | 1     | CCRI          | DC control register implemented
| 10    | R             | 1     | CBIRI         | DC block invalidate register implemented
| 11    | R             | 0     | CBPRI         | DC block prefetch register not implemented
| 12    | R             | 0     | CBLRI         | DC block lock register not implemented
| 13    | R             | 1     | CBFRI         | DC block flush register implemented
| 14    | R             | 1     | CBWBRI        | DC block write-back register implemented[‡]
|============================================================
[†]: If disabled at synthesis time

[‡]: If FPU enabled at synthesis time
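
From NCW, NCS and CBS, software can estimate the data cache capacity as
ways x sets x block size. The sketch below assumes that NCW and NCS are log2
encodings and that CBS = 1 would mean 32-byte blocks, following the OpenRISC
1000 architecture manual; the table itself only documents NCW = 0 (one way),
NCS = 0x4 to 0x7 and CBS = 0 (16 bytes). The function name +dc_geometry+ is
hypothetical.

```python
def dc_geometry(dccfgr):
    """Derive data cache geometry from the NCW, NCS and CBS fields of DCCFGR."""
    ways = 1 << (dccfgr & 0x7)                      # NCW, bits 2:0 (assumed log2)
    sets = 1 << ((dccfgr >> 3) & 0xF)               # NCS, bits 6:3 (log2)
    block_bytes = 32 if dccfgr & (1 << 7) else 16   # CBS, bit 7 (1 -> 32 assumed)
    return {
        "ways": ways,
        "sets": sets,
        "block_bytes": block_bytes,
        "size_bytes": ways * sets * block_bytes,
    }


# Made-up sample: NCS = 7 (128 sets), one way, 16-byte blocks.
print(dc_geometry(7 << 3))
```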
1734
 
Register ICCFGR description
~~~~~~~~~~~~~~~~~~~~~~~~~~~
Special-purpose register ICCFGR identifies the capabilities and configuration
of the instruction cache.
(((Register,ICCFGR)))

[[iccfgr_reg_table]]
.ICCFGR Register
[width="95%",options="header"]
|============================================================
| Bit # | Access        | Reset | Short Name    | Description
| 2:0   | R             | 0x0   | NCW           | One IC way
| 6:3   | R             | 0x4 - 0x7     | NCS   | 16, 32, 64 or 128 IC sets
| 7     | R             | 0x0   | CBS           | 16-byte cache block size
| 8     | R             | 0     | CWS           | Cache write-through strategy
| 9     | R             | 1     | CCRI          | IC control register implemented
| 10    | R             | 1     | CBIRI         | IC block invalidate register implemented
| 11    | R             | 0     | CBPRI         | IC block prefetch register not implemented
| 12    | R             | 0     | CBLRI         | IC block lock register not implemented
| 13    | R             | 1     | CBFRI         | IC block flush register implemented
| 14    | R             | 0     | CBWBRI        | IC block write-back register not implemented
|============================================================
 
Register DCFGR description
~~~~~~~~~~~~~~~~~~~~~~~~~~
Special-purpose register DCFGR identifies the capabilities and configuration
of the debug unit.
(((Register,DCFGR)))

[[dcfgr_reg_table]]
.DCFGR Register
[width="95%",options="header"]
|============================================================
| Bit # | Access        | Reset | Short Name    | Description
| 3:0   | R             | 0x0   | NDP           | Zero DVR/DCR pairs[†]
| 4     | R             | 0     | WPCI          | Watchpoint counters not implemented
|============================================================
[†]: If hardware breakpoints disabled at synthesis time
 
((IO ports))
------------
The OR1200 IP core has several interfaces. <<core_interfaces_fig>> below shows
all interfaces:

* Instruction and data WISHBONE host interfaces
* Power management interface
* Development interface
* Interrupts interface

[[core_interfaces_fig]]
.Core's Interfaces
image::img/core_interfaces.gif[scaledwidth="50%",align="center"]
 
Instruction WISHBONE Master Interface
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
OR1200 has two master WISHBONE Rev B compliant interfaces. The instruction
interface is used to connect the OR1200 core to the memory subsystem for the
purpose of fetching instructions or instruction cache lines.

[[inst_wb_master_table]]
.Instruction WISHBONE Master Interface Signals
[width="95%",options="header"]
|====================================================
| Port          | Width | Direction     | Description
| ((iwb_CLK_I)) | 1     | Input         | Clock input
| ((iwb_RST_I)) | 1     | Input         | Reset input
| ((iwb_CYC_O)) | 1     | Output        | Indicates valid bus cycle (core select)
| ((iwb_ADR_O)) | 32    | Outputs       | Address outputs
| ((iwb_DAT_I)) | 32    | Inputs        | Data inputs
| ((iwb_DAT_O)) | 32    | Outputs       | Data outputs
| ((iwb_SEL_O)) | 4     | Outputs       | Indicates valid bytes on data bus (during valid cycle it must be 0xf)
| ((iwb_ACK_I)) | 1     | Input         | Acknowledgment input (indicates normal transaction termination)
| ((iwb_ERR_I)) | 1     | Input         | Error acknowledgment input (indicates an abnormal transaction termination)
| ((iwb_RTY_I)) | 1     | Input         | In OR1200 treated same way as iwb_ERR_I.
| ((iwb_WE_O))  | 1     | Output        | Write transaction when asserted high
| ((iwb_STB_O)) | 1     | Output        | Indicates valid data transfer cycle
|====================================================
 
Data WISHBONE Master Interface
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
OR1200 has two master WISHBONE Rev B compliant interfaces. The data interface
is used to connect the OR1200 core to external peripherals and the memory
subsystem for the purpose of reading and writing data or data cache lines.

[[data_wb_master_table]]
.Data WISHBONE Master Interface Signals
[width="95%",options="header"]
|====================================================
| Port          | Width | Direction     | Description
| ((dwb_CLK_I)) | 1     | Input         | Clock input
| ((dwb_RST_I)) | 1     | Input         | Reset input
| ((dwb_CYC_O)) | 1     | Output        | Indicates valid bus cycle (core select)
| ((dwb_ADR_O)) | 32    | Outputs       | Address outputs
| ((dwb_DAT_I)) | 32    | Inputs        | Data inputs
| ((dwb_DAT_O)) | 32    | Outputs       | Data outputs
| ((dwb_SEL_O)) | 4     | Outputs       | Indicates valid bytes on data bus (during valid cycle it must be 0xf)
| ((dwb_ACK_I)) | 1     | Input         | Acknowledgment input (indicates normal transaction termination)
| ((dwb_ERR_I)) | 1     | Input         | Error acknowledgment input (indicates an abnormal transaction termination)
| ((dwb_RTY_I)) | 1     | Input         | In OR1200 treated same way as dwb_ERR_I.
| ((dwb_WE_O))  | 1     | Output        | Write transaction when asserted high
| ((dwb_STB_O)) | 1     | Output        | Indicates valid data transfer cycle
|====================================================
 
System Interface
~~~~~~~~~~~~~~~~
System interface connects reset, clock and other system signals to the
OR1200 core.

[[sys_interface_table]]
.System Interface Signals
[width="95%",options="header"]
|====================================================
| Port          | Width | Direction     | Description
| ((Rst))       | 1     | Input         | Asynchronous reset
| ((clk_cpu))   | 1     | Input         | Main clock input to the RISC
| ((clk_dc))    | 1     | Input         | Data cache clock
| ((clk_ic))    | 1     | Input         | Instruction cache clock
| ((clk_dmmu))  | 1     | Input         | Data MMU clock
| ((clk_immu))  | 1     | Input         | Instruction MMU clock
| ((clk_tt))    | 1     | Input         | Tick timer clock
|====================================================
 
Development Interface
~~~~~~~~~~~~~~~~~~~~~
The development interface connects an external development port to the RISC's
internal debug facility. The debug facility allows control over program
execution inside the RISC, setting of breakpoints and watchpoints, and tracing
of instruction and data flows.

[[dev_interface_table]]
.Development Interface
[width="95%",options="header"]
|====================================================
| Port          | Width | Direction     | Description
| ((dbg_dat_o)) | 32    | Output        | Transfer of data from RISC to external development interface
| ((dbg_dat_i)) | 32    | Input         | Transfer of data from external development interface to RISC
| ((dbg_adr_i)) | 32    | Input         | Address of special-purpose register to be read or written
| ((dbg_op_i))  | 3     | Input         | Operation select for development interface
| ((dbg_lss_o)) | 4     | Output        | Status of load/store unit
| ((dbg_is_o))  | 2     | Output        | Status of instruction fetch unit
| ((dbg_wp_o))  | 11    | Output        | Status of watchpoints
| ((dbg_bp_o))  | 1     | Output        | Status of the breakpoint
| ((dbg_stall_i))       | 1     | Input | Stalls RISC CPU core
| ((dbg_ewt_i)) | 1     | Input         | External watchpoint trigger
|====================================================
 
Power Management Interface
~~~~~~~~~~~~~~~~~~~~~~~~~~
The power management interface provides signals for interfacing the RISC core
with external power management circuitry. External power management circuitry
is required to implement functions that are technology specific and cannot be
implemented inside the OR1200 core.

[[pow_mgmt_interface_table]]
.Power Management Interface
[width="95%",options="header"]
|============================================================================
| Port                  | Width | Direction     | Generation            | Description
| ((pm_clksd))          | 4     | Output        | Static (in SW)        | Slow down outputs that control reduction of RISC clock frequency
| ((pm_cpustall))       | 1     | Input         | -                     | Synchronous stall of the RISC’s CPU core
| ((pm_dc_gate))        | 1     | Output        | Dynamic (in HW)       | Gating of data cache clock
| ((pm_ic_gate))        | 1     | Output        | Dynamic (in HW)       | Gating of instruction cache clock
| ((pm_dmmu_gate))      | 1     | Output        | Dynamic (in HW)       | Gating of data MMU clock
| ((pm_immu_gate))      | 1     | Output        | Dynamic (in HW)       | Gating of instruction MMU clock
| ((pm_tt_gate))        | 1     | Output        | Dynamic (in HW)       | Gating of tick timer clock
| ((pm_cpu_gate))       | 1     | Output        | Static (in SW)        | Gating of main CPU clock
| ((pm_wakeup))         | 1     | Output        | Dynamic (in HW)       | Activate all clocks
| ((pm_lvolt))          | 1     | Output        | Static (in SW)        | Lower voltage
|============================================================================
 
Interrupt Interface
~~~~~~~~~~~~~~~~~~~
The interrupt interface has interrupt inputs for interfacing external
peripherals' interrupt outputs to the RISC core. All interrupt inputs are
evaluated on the positive edge of the main RISC clock.

[[interrupt_interface_table]]
.Interrupt Interface
[width="95%",options="header"]
|============================================================
| Port          | Width         | Direction     | Description
| ((pic_ints))  | PIC_INTS      | Input         | External interrupts
|============================================================
 
 
[appendix]
Core HW Configuration
=====================
(((Hardware,Configuration)))
This section describes parameters that are set by the user of the core and
define configuration of the core. Parameters must be set by the user before
actual use of the core in simulation or synthesis.

[[core_hw_conf_table]]
.Core HW configuration table
[width="95%",options="header"]
|============================================================
| Variable Name | Range         | Default       | Description
| ((EADDR_WIDTH))       | 32    | 32    | Effective address width
| ((VADDR_WIDTH))       | 32    | 32    | Virtual address width
| ((PADDR_WIDTH))       | 24 - 36| 32   | Physical address width
| ((DATA_WIDTH))        | 32    | 32    | Data width / Operation width
| ((DC_IMPL))   | 0 - 1         | 1     | Data cache implementation
| ((DC_SETS))   | 256-1024      | 512   | Data cache number of sets
| ((DC_WAYS))   | 1             | 1     | Data cache number of ways
| ((DC_LINE))   | 16 - 32       | 16    | Data cache line size
| ((IC_IMPL))   | 0 - 1         | 1     | Instruction cache implementation
| ((IC_SETS))   | 32-1024       | 512   | Instruction cache number of sets
| ((IC_WAYS))   | 1             | 1     | Instruction cache number of ways
| ((IC_LINE))   | 16-32         | 16    | Instruction cache line size in bytes
| ((DMMU_IMPL)) | 0 - 1         | 1     | Data MMU implementation
| ((DTLB_SETS)) | 64            | 64    | Data TLB number of sets
| ((DTLB_WAYS)) | 1             | 1     | Data TLB number of ways
| ((IMMU_IMPL)) | 0 - 1         | 1     | Instruction MMU implementation
| ((ITLB_SETS)) | 64            | 64    | Instruction TLB number of sets
| ((ITLB_WAYS)) | 1             | 1     | Instruction TLB number of ways
| ((PIC_INTS))  | 2 - 32        | 20    | Number of interrupt inputs
|============================================================
 
:numbered!:

[bibliography]
((Bibliography))
================
[bibliography]
- [[[or1000_manual]]] Damjan Lampret et al. 'OpenRISC 1000 System Architecture
  Manual'. 2004.
 
[index]
Index
=====
// The index is generated automatically by the DocBook toolchain.
