OpenRISC 1200 IP Core Specification (Preliminary Draft)
=======================================================
:doctype: book

////
Revision history
Note: When adding new entries, strictly follow the format of the existing ones.

Rev.    | Date          | Author        | Description
__vstart__
v0.1    | 28/3/01       | Damjan Lampret        | First Draft

v0.2    | 16/4/01       | Damjan Lampret        | First time published

v0.3    | 29/4/01       | Damjan Lampret        | All chapters almost
finished. Some bugs hidden waiting for an update. Awaiting feedback.

v0.4    | 16/5/01       | Damjan Lampret        | Synchronization with
OR1K Arch Manual

v0.5    | 24/5/01       | Damjan Lampret        | Fixed bugs

v0.6    | 28/5/01       | Damjan Lampret        | Changed some SPR addresses.

v0.7    | 06/9/01       | Damjan Lampret        | Simplified debug unit.

v0.8    | 30/08/10      | Julius Baxter         | Adding information about FPU
implementation, data cache write-back capability. PIC behavior update.
Instruction list update. Update of bits in config registers, bringing into
line with latest OR1200 - not entirely complete.

v0.9    | 12/9/10       | Julius Baxter         | Clarified supported parts of
OR1K instruction set. Updated core clock input information.
Fixed up reference to instruction execute stage cycle table.
Added divide cycles to execute stage cycle table.

v0.10   | 1/11/10       | Julius Baxter         | Added FF1/FL1 instructions to
supported instructions table.

v0.11   | 19/1/11       | Julius Baxter | Cache information update.
Wishbone behavior clarification. Serial integer multiply/divide update.
Reset address clarification

v0.12   | 13/9/11       | Julius Baxter | Addition of extension instructions
l.extbs, l.extbz, l.exths, l.exthz, l.extws and l.extwz. Range exception
support, overflow bit in supervision register.
__vend__
////
 
Introduction
------------
The purpose of this document is to specify the OpenRISC 1200 implementation.
It defines all implementation-specific variables that are not part of the
general architecture specification. These include the type and size of the
data and instruction caches, the type and size of the data and instruction
MMUs, the details of all execution pipelines, and the implementation of the
exception unit, interrupt controller and other supplemental units.
This document does not cover general architecture topics such as the
instruction set, memory addressing modes and other architectural definitions.
See <> for more information about the architecture.
 
OpenRISC Family
~~~~~~~~~~~~~~~
(((OpenRISC,Family)))
OpenRISC 1000 is an architecture for a family of free, open source RISC
processor cores. As an architecture, OpenRISC 1000 allows for a spectrum of
chip and system implementations at a variety of price/performance points for a
range of applications. It is a 32/64-bit load and store RISC architecture
designed with emphasis on performance, simplicity, low power requirements,
scalability and versatility. The OpenRISC 1000 architecture targets medium and
high performance networking, embedded, automotive and portable computer
environments.

image::img/or_family.gif[scaledwidth="50%",align="center"]

All OpenRISC implementations whose identification number begins with the
digit 1 belong to the OpenRISC 1000 family. The second digit defines which
features of the OpenRISC 1000 architecture are implemented and in which way
they are implemented. The last two digits define how an implementation is
configured before it is used in a real application.
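
The numbering scheme above can be illustrated with a short Python sketch. The
helper `parse_openrisc_id` and its field names are hypothetical; they only
mirror the digit positions described in the text and are not part of any
OpenRISC tool.

```python
from typing import NamedTuple


class OpenRiscId(NamedTuple):
    family: int         # first digit: 1 means OpenRISC 1000 family
    feature_set: int    # second digit: which features are implemented, and how
    configuration: int  # last two digits: how the implementation is configured


def parse_openrisc_id(number: str) -> OpenRiscId:
    """Split a four-digit identification number such as "1200" into its parts."""
    if len(number) != 4 or not (number.isascii() and number.isdigit()):
        raise ValueError(f"expected a four-digit identification number, got {number!r}")
    return OpenRiscId(int(number[0]), int(number[1]), int(number[2:]))


ident = parse_openrisc_id("1200")
print(ident.family, ident.feature_set, ident.configuration)
```

For "1200" the sketch yields family 1, feature set 2 and configuration 0.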
 
However, at present the OR1200 is the only major RTL implementation of the
OR1K architecture spec, and the OR1200 name has stuck, even though the high
degree of reconfigurability means that, strictly speaking, a given
configuration of the core could be an OR1000, OR1300, etc. So, regardless of
which features are or are not implemented, the core is still referred to only
as the OR1200.

OpenRISC 1200
~~~~~~~~~~~~~
(((OpenRISC,1200)))
The OR1200 is a 32-bit scalar RISC with Harvard microarchitecture, 5 stage
integer pipeline, virtual memory support (MMU) and basic DSP capabilities.
Default caches are 1-way direct-mapped 8KB data cache and 1-way direct-mapped
8KB instruction cache, each with 16-byte line size. Both caches are
physically tagged.  By default MMUs are implemented and they are constructed of
64-entry hash based 1-way direct-mapped data TLB and 64-entry hash based 1-way
direct-mapped instruction TLB.

Supplemental facilities include debug unit for real-time debugging, high
resolution tick timer, programmable interrupt controller and power management
support.  When implemented in a typical 0.18u 6LM process it should provide
over 300 Dhrystone 2.1 MIPS at 300MHz and 300 DSP MAC 32x32 operations, at
least 20% more than any other competitor in this class. OR1200 in default
configuration has about 1M transistors.

OR1200 is intended for embedded, portable and networking applications. It can
successfully compete with the latest scalar 32-bit RISC processors in its class
and can efficiently run any modern operating system.  Competitors include
ARM10, ARC and Tensilica RISC processors.

Features
^^^^^^^^
The following lists the main features of the OR1200 IP core:

- All major characteristics of the core can be set by the user
- High performance of 300 Dhrystone 2.1 MIPS at 300 MHz using 0.18u process
- High performance cache and MMU subsystems
- WISHBONE SoC Interconnection Rev. B3 compliant interface
 
Architecture
------------
<> below shows the general architecture of the OR1200 IP core. It
consists of several building blocks:

- CPU/FPU/DSP central block
- Direct-mapped data cache
- Direct-mapped instruction cache
- Data MMU based on hash based DTLB
- Instruction MMU based on hash based ITLB
- Power management unit and power management interface
- Tick timer
- Debug unit and development interface
- Interrupt controller and interrupt interface
- Instruction and Data WISHBONE host interfaces

[[core_arch_fig]]
.Core's Architecture
image::img/core_arch.gif[scaledwidth="50%",align="center"]

CPU/FPU/DSP
~~~~~~~~~~~
((CPU))/((FPU))/((DSP)) is a central part of the OR1200 RISC processor.
<> shows a basic block diagram of the CPU/DSP. Not pictured
are the FPU components.  The OR1200 CPU/FPU/DSP implements only sections of
the ORBIS32 and ORFPX32 instruction sets. No ((ORBIS64)), ((ORFPX64)) or
((ORVDX64)) instructions are implemented in OR1200.

[[cpu_fpu_dsp_fig]]
.CPU/FPU/DSP Block Diagram
image::img/cpu_fpu_dsp.gif[scaledwidth="50%",align="center"]

Instruction unit
^^^^^^^^^^^^^^^^
The instruction unit implements the basic instruction pipeline, fetches
instructions from the memory subsystem, dispatches them to available execution
units, and maintains a state history to ensure a precise exception model
and that operations finish in order. It also executes conditional branch
and unconditional jump instructions.

The sequencer can dispatch a sequential instruction on each clock if the
appropriate execution unit is available. The execution unit must discern
whether source data is available and ensure that no other instruction is
targeting the same destination register.

The instruction unit handles only ((ORBIS32)) and, optionally, a subset of the
((ORFPX32)) instruction class. Some ((ORFPX32)) and all ((ORFPX64)) and
((ORVDX64)) instruction classes are not supported by the OR1200 at present.

General-Purpose Registers
^^^^^^^^^^^^^^^^^^^^^^^^^
OpenRISC 1200 implements 32 general-purpose 32-bit ((registers)). The OpenRISC
1000 architecture also supports shadow copies of the register file to implement
fast switching between working contexts; however, this feature is not
implemented in the current OR1200 implementation.
 
OR1200 implements the general-purpose register file as two synchronous
dual-port memories with a capacity of 32 words by 32 bits per word.

Load/Store Unit
^^^^^^^^^^^^^^^
The ((load/store unit (LSU))) transfers all data between the GPRs and the CPU's
internal bus. It is implemented as an independent execution unit so that stalls
in the memory subsystem only affect the master pipeline if there is a data
dependency.

The following are the LSU's main features:

- all load/store instructions implemented in hardware (atomic instructions
  included)
- address entry buffer
- pipelined operation
- aligned accesses for fast memory access

When load and store instructions are issued, the LSU determines if all
operands are available. These operands include the following:

- address register operand
- source data register operand (for store instructions)
- destination data register operand (for load instructions)

Integer Execution Pipeline
^^^^^^^^^^^^^^^^^^^^^^^^^^
(((Pipeline, Integer Execution)))
The core implements the following types of 32-bit integer instructions:

- Arithmetic instructions
- Compare instructions
- Logical instructions
- Rotate and shift instructions

Most integer instructions can execute in one cycle. For details about timing
see <>.

MAC Unit
^^^^^^^^
The ((MAC)) unit executes DSP MAC operations. MAC operations are 32x32 with a
48-bit accumulator. The MAC unit is fully pipelined and can accept a new MAC
operation in each clock cycle.
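
A behavioural sketch of a 32x32 multiply-accumulate with a 48-bit accumulator
may help make the arithmetic concrete. It is only a software model: it assumes
signed two's complement operands and an accumulator that wraps on overflow,
neither of which is stated in this section, and the function names are
illustrative.

```python
ACC_BITS = 48
ACC_MASK = (1 << ACC_BITS) - 1


def to_signed(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def mac(acc: int, a: int, b: int) -> int:
    """One MAC step: acc + a * b, kept to 48 bits (wrap-around is an assumption)."""
    product = to_signed(a, 32) * to_signed(b, 32)
    return (acc + product) & ACC_MASK


def dot_product(xs, ys) -> int:
    """Accumulate element-wise products, as a sequence of MAC operations would."""
    acc = 0
    for x, y in zip(xs, ys):
        acc = mac(acc, x, y)
    return to_signed(acc, ACC_BITS)


print(dot_product([1, 2, 3], [4, 5, 6]))
```

The masking step is where a real accumulator width matters: any sum that does
not fit in 48 bits is truncated in this model.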
 
Floating Point Unit
^^^^^^^^^^^^^^^^^^^
(((Floating Point Unit)))
The ((FPU)) implementation is based on two other FPUs available from
OpenCores.org. For the comparison and conversion functions, parts were taken
from the FPU project by Rudolf Usselmann, and for the arithmetic operations,
the fpu100 project by Jidan Al-Eryani was converted to Verilog HDL.

All ((ORFPX32)) instructions except for ((lf.madd.s)) and ((lf.rem.s)) are
supported when the FPU is enabled in the OR1200 configuration.

System Unit
^^^^^^^^^^^
The ((system unit)) connects all other signals of the CPU/FPU/DSP that are not
connected through instruction and data interfaces. It also implements all
system special-purpose registers (e.g. the supervision register).

Exceptions
^^^^^^^^^^
Core exceptions can be generated when an exception condition occurs.
((Exception sources)) in OR1200 include the following:

- External interrupt request
- Certain memory access conditions
- Internal errors, such as an attempt to execute an unimplemented opcode
- System call
- Internal exception, such as breakpoint exceptions
- Arithmetic overflow

((Exception handling)) is transparent to user software and uses the same
mechanism to handle all types of exceptions. When an exception is taken,
control is transferred to an exception handler at an offset defined for
the type of exception encountered. Exceptions are handled in supervisor mode.
 
Data Cache
~~~~~~~~~~
The default configuration of the OR1200 data ((cache)) is an 8-Kbyte, 1-way
direct-mapped data cache, which allows rapid core access to data. However, the
data cache can be configured according to <>.

[[data_confs_or1200_table]]
.Possible Data Cache Configurations of OR1200
[width="60%",options="header"]
|======================================================
|                                       | Direct mapped
| 16B/line, 256 lines, 1 way            | 4KB
| 16B/line, 512 lines, 1 way            | *8KB (default)*
| 16B/line, 1024 lines, 1 way           | 16KB
| 32B/line, 1024 lines, 1 way           | 32KB
|======================================================
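
As a quick cross-check of the table above, the capacity of each configuration
is the product of line size, number of lines and number of ways. The sketch
below recomputes the right-hand column; the function and variable names are
illustrative only.

```python
def cache_size_bytes(line_bytes: int, lines: int, ways: int = 1) -> int:
    """Total data capacity of a cache with the given geometry."""
    return line_bytes * lines * ways


# (bytes per line, lines, ways) for each row of the data cache table
DATA_CACHE_CONFIGS = [(16, 256, 1), (16, 512, 1), (16, 1024, 1), (32, 1024, 1)]

sizes_kb = [cache_size_bytes(*cfg) // 1024 for cfg in DATA_CACHE_CONFIGS]
print(sizes_kb)
```

The computed sizes are 4, 8, 16 and 32 KB, matching the table rows in order.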
 
It is possible to operate the data cache with write-through or write-back
strategies; however, write-back is currently experimental.

Features:

- data cache is separate from instruction cache (Harvard architecture)
- data cache implements a least-recently used (LRU) replacement algorithm
  within each set
- the cache directory is physically addressed. The physical address tag is
  stored in the cache directory
- write-through or write-back operation
- entire cache can be disabled, lines invalidated, flushed or forced to be
  written back, by writing to cache special purpose registers

On a miss, and under appropriate conditions, the cache line is filled or
emptied (written back) with 16-byte bursts. The burst fill is performed as a
critical-word-first operation; the critical word is simultaneously written
to the cache and forwarded to the requesting unit, thus minimizing stalls
due to cache fill latency. The data cache provides storage for cache tags and
performs the cache line replacement function.

The data cache is tightly coupled to the external interface to allow efficient
access to the system memory controller.

The data cache supplies data to the GPRs by means of a 32-bit interface
to the load/store unit. The LSU provides all logic required to calculate
effective addresses, handles data alignment to and from the data cache,
and provides sequencing for load and store operations. Write operations to
the data cache can be performed on a byte, half-word or word basis.

image::img/data_cache_diag.gif[scaledwidth="50%",align="center"]

Each line contains four contiguous words from memory that are loaded from
a cache-line-aligned boundary. As a result, cache lines are aligned with
page boundaries.
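
For a direct-mapped, physically tagged cache, a physical address splits into a
tag, a line index and a byte offset within the line. The following sketch shows
that split for the default geometry (16 bytes per line, 512 lines). It assumes
power-of-two sizes and the conventional tag/index/offset layout; the exact bit
assignment used by the RTL is not given in this section.

```python
def split_address(addr: int, line_bytes: int = 16, lines: int = 512):
    """Return (tag, index, offset) for a direct-mapped cache lookup."""
    offset_bits = line_bytes.bit_length() - 1  # 4 bits for a 16-byte line
    index_bits = lines.bit_length() - 1        # 9 bits for 512 lines
    offset = addr & (line_bytes - 1)
    index = (addr >> offset_bits) & (lines - 1)
    tag = addr >> (offset_bits + index_bits)
    return tag, index, offset


tag, index, offset = split_address(0x00012345)
print(tag, index, offset)
```

In this model, two addresses exactly 8 KB apart map to the same line index and
differ only in the tag, so they compete for the same cache line.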
 
Instruction Cache
~~~~~~~~~~~~~~~~~
The default configuration of the OR1200 instruction ((cache)) is an 8-Kbyte,
1-way direct-mapped instruction cache, which allows rapid core access to
instructions. However, the instruction cache can be configured according to
<>.

[[inst_confs_or1200_table]]
.Possible Instruction Cache Configurations of OR1200
[width="60%",options="header"]
|==============================================
|                               | Direct mapped
| 16B/line, 32 lines, 1 way     | 512B
| 16B/line, 256 lines, 1 way    | 4KB
| 16B/line, 512 lines, 1 way    | *8KB (default)*
| 16B/line, 1024 lines, 1 way   | 16KB
| 32B/line, 1024 lines, 1 way   | 32KB
|==============================================

Features:

- instruction cache is separate from data cache (Harvard architecture)
  (((Architecture,Harvard)))
- instruction cache implements a least-recently used (LRU) replacement
  algorithm within each set
  ((LRU))
- the ((cache directory)) is physically addressed. The physical address tag is
  stored in the cache directory
- it can be disabled or invalidated by writing to cache special purpose
  registers

On a miss, the cache is filled with 16-byte bursts. The burst fill
is performed as a critical-word-first operation; the critical word is
simultaneously written to the cache and forwarded to the requesting unit,
thus minimizing stalls due to cache fill latency. The instruction cache
provides storage for cache tags and performs the cache line replacement
function.

The instruction cache is tightly coupled to the external interface to allow
efficient access to the system memory controller.

The instruction cache supplies instructions to the instruction sequencer by
means of a 32-bit interface to the instruction fetch subunit. The instruction
fetch subunit provides all logic required to calculate effective addresses.

image::img/inst_cache_diag.gif[scaledwidth="50%",align="center"]

Each line contains four contiguous words from memory that are loaded from
a line-size aligned boundary. As a result, cache lines are aligned with
page boundaries.
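
The critical-word-first fill described in this section can be sketched as an
address sequence. The sketch assumes that the burst starts at the word that
missed and wraps around within the cache line; the wrap order is an assumption
made for illustration and is not spelled out in this section.

```python
def critical_word_first_order(miss_addr: int, line_bytes: int = 16,
                              word_bytes: int = 4):
    """Word addresses of a line fill, starting at the word that missed."""
    line_base = miss_addr & ~(line_bytes - 1)
    words = line_bytes // word_bytes
    first = (miss_addr - line_base) // word_bytes
    return [line_base + ((first + i) % words) * word_bytes for i in range(words)]


print([hex(a) for a in critical_word_first_order(0x1008)])
```

For a miss at 0x1008 the model fetches 0x1008, 0x100c, 0x1000 and 0x1004, so
the requesting unit receives its word on the first beat.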
 
Data MMU
~~~~~~~~
(((MMU, Data)))
The OR1200 implements a ((virtual memory management)) scheme that
provides memory access protection and effective-to-physical address
translation. ((Protection)) granularity is as defined by OpenRISC 1000
architecture - 8-Kbyte and 16-Mbyte pages.

[[data_tlb_confs_or1200_table]]
.Possible Data TLB Configurations of OR1200
[width="60%",options="header"]
|======================================
|                       | Direct mapped
| 16 entries per way    | 16 DTLB entries
| 32 entries per way    | 32 DTLB entries
| 64 entries per way    | *64 DTLB entries (default)*
| 128 entries per way   | 128 DTLB entries
|======================================

Features:

* data MMU is separate from instruction MMU
* page size 8-Kbyte
* comprehensive page protection scheme
* direct mapped hash based translation lookaside buffer (DTLB) with the
  default of 1 way and the following features:
** miss and fault exceptions
** software tablewalk
** high performance because of hash-based design
** variable number of DTLB entries with default of 64 per way

image::img/tlb_diag.gif[scaledwidth="50%",align="center"]

The MMU hardware supports two-level software tablewalk.
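
A direct-mapped TLB lookup with 8-Kbyte pages can be modelled in a few lines.
This section does not define the hash function, so the sketch assumes the
simplest choice, the low-order bits of the virtual page number; the
`(vpn, ppn)` entry layout and the function names are likewise illustrative.

```python
PAGE_SHIFT = 13  # 8-Kbyte pages


def tlb_index(effective_addr: int, entries: int = 64) -> int:
    """Assumed hash: virtual page number modulo the number of entries."""
    return (effective_addr >> PAGE_SHIFT) % entries


def tlb_lookup(tlb, effective_addr: int):
    """Return the physical address on a hit, or None on a miss.

    `tlb` is a list whose slots hold either None or a (vpn, ppn) pair.
    A miss is where software tablewalk would refill the entry.
    """
    vpn = effective_addr >> PAGE_SHIFT
    entry = tlb[vpn % len(tlb)]
    if entry is None or entry[0] != vpn:
        return None
    return (entry[1] << PAGE_SHIFT) | (effective_addr & ((1 << PAGE_SHIFT) - 1))


tlb = [None] * 64
tlb[tlb_index(0x4000)] = (0x4000 >> PAGE_SHIFT, 0x80)
print(hex(tlb_lookup(tlb, 0x4010)))
```

Two pages whose virtual page numbers differ by a multiple of the entry count
share a slot in this model, which is the usual cost of a 1-way design.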
 
Instruction MMU
~~~~~~~~~~~~~~~
(((MMU, Instruction)))
The OR1200 implements a virtual memory management scheme that provides memory
access protection and effective-to-physical address translation. Protection
granularity is as defined by OpenRISC 1000 architecture - 8-Kbyte and
16-Mbyte pages.

[[inst_tlb_confs_or1200_table]]
.Possible Instruction TLB Configurations of OR1200
[width="60%",options="header"]
|======================================
|                       | Direct mapped
| 16 entries per way    | 16 ITLB entries
| 32 entries per way    | 32 ITLB entries
| 64 entries per way    | *64 ITLB entries (default)*
| 128 entries per way   | 128 ITLB entries
|======================================

Features:

* instruction MMU is separate from data MMU
* page size 8-Kbyte
* comprehensive page protection scheme
* 1 way direct-mapped hash based translation lookaside buffer (ITLB) with the
  following features:
** miss and fault exceptions
** software tablewalk
** high performance because of hash-based design
** variable number of ITLB entries with default of 64 entries per way

image::img/inst_mmu_diag.gif[scaledwidth="50%",align="center"]

The MMU hardware supports two-level software tablewalk.
 
Programmable Interrupt Controller
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The ((interrupt)) controller receives interrupts from external sources and
forwards them as low or high priority interrupt exceptions to the CPU core.

[[interrupt_controller_fig]]
.Block Diagram of the Interrupt Controller
image::img/interrupt_controller.gif[scaledwidth="50%",align="center"]

The programmable interrupt controller has three special-purpose registers and
32 interrupt inputs. Interrupt inputs 0 and 1 are always enabled and connected
to the high and low priority interrupt inputs, respectively.

The 30 other interrupt inputs can be masked and assigned low or high priority
through programming special-purpose registers.
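
The masking and priority behaviour described above can be modelled as simple
bit operations. This is a sketch under stated assumptions: a set mask bit means
"enabled", a set priority bit means "high priority", and the parameter names
`mask` and `priority` are illustrative rather than the names of the actual
special-purpose registers.

```python
ALWAYS_ENABLED = 0b11  # inputs 0 and 1 cannot be masked
WORD_MASK = 0xFFFFFFFF


def pending_interrupts(inputs: int, mask: int, priority: int):
    """Return (high, low) bit vectors of interrupts forwarded to the CPU."""
    mask = (mask | ALWAYS_ENABLED) & WORD_MASK
    # Input 0 is fixed to high priority, input 1 to low priority.
    priority = ((priority & ~ALWAYS_ENABLED) | 0b01) & WORD_MASK
    active = inputs & mask
    high = active & priority
    low = active & ~priority
    return high, low


high, low = pending_interrupts(inputs=0b1111, mask=0b0100, priority=0b0100)
print(bin(high), bin(low))
```

In the example, input 3 is asserted but masked, so only inputs 0 and 2 are
reported as high priority and input 1 as low priority.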
 
Tick Timer
~~~~~~~~~~
OR1200 implements a tick ((timer)) facility. Basically, this is a timer that is
clocked by the RISC clock and is used by the operating system to precisely
measure time and schedule system tasks.

OR1200 precisely follows the architectural definition of the tick timer
facility:

* Maximum timer count of 2^32 clock cycles
* Maximum time period of 2^28 clock cycles between interrupts
* Maskable tick timer interrupt
* Single run, restartable or continuous timer

The tick timer operates from an independent clock source so that doze power
management mode can be implemented.
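
The limits listed above translate directly into time at a given clock
frequency. The sketch below takes the maximum period of 2^28 cycles from the
list and computes the longest interval between interrupts, plus the period
needed for a desired tick rate; the function names and example frequencies are
illustrative.

```python
MAX_PERIOD_CYCLES = 2 ** 28  # maximum period between interrupts, from the list above


def max_interval_seconds(clock_hz: int) -> float:
    """Longest possible time between tick interrupts at the given clock."""
    return MAX_PERIOD_CYCLES / clock_hz


def tick_period_cycles(clock_hz: int, ticks_per_second: int) -> int:
    """Timer period, in clock cycles, for the requested tick rate."""
    cycles = clock_hz // ticks_per_second
    if not 0 < cycles <= MAX_PERIOD_CYCLES:
        raise ValueError("requested tick rate is outside the timer's range")
    return cycles


print(tick_period_cycles(50_000_000, 100))
print(round(max_interval_seconds(300_000_000), 3))
```

At 300 MHz the longest interval is a little under 0.9 seconds, so slower
periodic events have to be counted in software across several ticks.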
 
Power Management Support
~~~~~~~~~~~~~~~~~~~~~~~~
To optimize ((power consumption)), the OR1200 provides ((low-power)) modes that
can be used to dynamically activate and deactivate certain internal modules.

OR1200 has two major features to minimize power consumption:

* Slow and Idle Modes (SW controlled clock freq reduction)
* Doze and Sleep Modes (interrupt wake-up)

[[power_consumption_table]]
.Power Consumption
[width="60%",options="header"]
|===================================================================
| Power Minimization Feature    | Approx Power Consumption Reduction
| Slow and Idle mode            | 2x - 10x
| Doze mode                     | 100x
| Sleep mode                    | 200x
| Dynamic clock gating          | N/A
|===================================================================

Slow down mode takes advantage of the low-power dividers in external clock
generation circuitry to enable full functionality, but at a lower frequency
so that power consumption is reduced.  The 4 PMR[SDF] bits are broadcast on
pm_clksd, and the external clock generation for the RISC should adapt the RISC
clock frequency according to the value on pm_clksd.

When software initiates the doze mode, software processing on the core
suspends. The clocks to the RISC internal modules are disabled except to
the tick timer. However, any other on-chip blocks can continue to function
as normal.  The OR1200 will leave doze mode and enter normal mode when a
pending interrupt occurs.

In sleep mode, all OR1200 internal units are disabled and clocks
gated. Optionally, an implementation may choose to lower the operating voltage
of the OR1200 core.  The OR1200 should leave sleep mode and enter normal
mode when a pending interrupt occurs.

Dynamic ((Clock gating)) (unit clock gating on a clock-by-clock basis) is not
supported by OR1200.
 
Debug unit
~~~~~~~~~~
((Debug unit)) assists software developers in debugging their systems. It
provides support only for basic debugging and does not have support for more
advanced debug features of the OpenRISC 1000 architecture such as watchpoints,
breakpoints and program-flow control registers.

[[debug_unit_fig]]
.Block Diagram of Debug Unit
image::img/debug_unit_diag.gif[scaledwidth="50%",align="center"]

Watchpoints and breakpoints are events triggered by program- or data-flow
matching the conditions programmed in the debug registers. Breakpoints,
unlike watchpoints, also suspend execution of the current program-flow and
start a breakpoint exception.
 
Clocks & Reset
~~~~~~~~~~~~~~
The OR1200 core has a ((clock)) input each for the instruction and data Wishbone
interface logic, and for the CPU core. Clock input clk_cpu clocks all logic on
the CPU side of the Wishbone interfaces. The data Wishbone interface is clocked
by dwb_clk_i, and the instruction Wishbone interface is clocked by iwb_clk_i.

OR1200 has an asynchronous ((reset)) signal. The reset signal rst, when
asserted high, immediately resets all flip-flops inside OR1200. When
deasserted, OR1200 will start the reset exception.
 
WISHBONE Interfaces
~~~~~~~~~~~~~~~~~~~
Two ((WISHBONE)) interfaces connect the OR1200 core to external peripherals and
the external memory subsystem. They are compliant with the WISHBONE SoC
Interconnection specification Rev. B3. The implementation uses a 32-bit bus
width and does not support other bus widths.

When bursts are not disabled, Wishbone registered-feedback incrementing burst
accesses are used to fill cache lines. The burst size (in beats) is determined
by the cache line size; for example, a 16-byte line is transferred in four
beats over the 32-bit bus.

image::img/wb_compatible.png[scaledwidth="30%",align="center"]
 
Operation
---------
This section describes the operation of the OR1200 core. For operations
that pertain to the architectural definitions, see <>.

Reset
~~~~~
OR1200 has one asynchronous ((reset)) signal that can be used by soft and hard
resets at higher levels of the system hierarchy.

[[powerup_sequence_fig]]
.Power-Up and Reset Sequence
image::img/powerup_seq.gif[scaledwidth="70%",align="center"]

<> shows how asynchronous reset is applied after
powering up the OR1200 core. Reset is connected to the asynchronous reset of
almost all flip-flops inside the RISC core. Special care must be taken to
ensure that the setup and hold times of all flip-flops relative to the main
RISC clock are met.

If the system implements gated clocks, then clock gating can be used to ensure
proper reset timing.

[[powerup_sequence_gatedclk_fig]]
.Power-Up and Reset Sequence w/ Gated Clock
image::img/powerup_seq_gatedclk.gif[scaledwidth="70%",align="center"]

The address the PC assumes at hard reset (assertion of external reset signal)
is definable at synthesis time, via the OR1200_BOOT_ADR define. This is not
to be confused with the ability to set the exception prefix address with
the EPH bit.
 
CPU/FPU/DSP
~~~~~~~~~~~
((CPU))/((FPU))/((DSP)) is an implementation of the 32-bit part of the OpenRISC
1000 architecture, and only a subset of all features is implemented.

Instructions
^^^^^^^^^^^^
(((OpenRISC 1200, Instruction List)))
The following table lists the instructions implemented in the OR1200. Those
optionally implemented are indicated as such.

// The table below is split into several columns for readability by the
// preprocessing script. It is better to have this automated because
// given the pseudo-lexicographical ordering, adding a new instruction
// would require manual changes in all subsequent columns, which is
// tedious and error-prone.
//
// When changing the column headers, remember to change the script accordingly.

[[instructions_table]]
.Instructions implemented in OR1200
[width="95%",options="header"]
|=================================
| Instruction mnemonic  | Optional
| ((l.add))             |
| ((l.addc))            | Yes
| ((l.addi))            |
| ((l.and))             |
| ((l.andi))            |
| ((l.bf))              |
| ((l.bnf))             |
| ((l.div))             | Yes
| ((l.extbs))           | Yes
| ((l.extbz))           | Yes
| ((l.exths))           | Yes
| ((l.exthz))           | Yes
| ((l.extws))           | Yes
| ((l.extwz))           | Yes
| ((l.ff1))             | Yes
| ((l.fl1))             | Yes
| ((l.j))               |
| ((l.jal))             |
| ((l.jalr))            |
| ((l.jr))              |
| ((l.lbs))             |
| ((l.lbz))             |
| ((l.lhs))             |
| ((l.lhz))             |
| ((l.lws))             |
| ((l.lwz))             |
| ((l.mac))             | Yes
| ((l.maci))            | Yes
| ((l.macrc))           | Yes
| ((l.mfspr))           |
| ((l.movhi))           |
| ((l.msb))             | Yes
| ((l.mtspr))           |
| ((l.mul))             | Yes
| ((l.muli))            | Yes
| ((l.nop))             |
| ((l.or))              |
| ((l.ori))             |
| ((l.rfe))             |
| ((l.rori))            |
| ((l.sb))              |
| ((l.sfeq))            |
| ((l.sfges))           |
| ((l.sfgeu))           |
| ((l.sfgts))           |
| ((l.sfgtu))           |
| ((l.sfleu))           |
| ((l.sflts))           |
| ((l.sfltu))           |
| ((l.sfne))            |
| ((l.sh))              |
| ((l.sll))             |
| ((l.slli))            |
| ((l.sra))             |
| ((l.srai))            |
| ((l.srl))             |
| ((l.srli))            |
| ((l.sub))             | Yes
| ((l.sw))              |
| ((l.sys))             |
| ((l.trap))            |
| ((l.xor))             |
| ((l.xori))            |
| ((lf.add.s))          | Yes
| ((lf.div.s))          | Yes
| ((lf.ftoi.s))         | Yes
| ((lf.itof.s))         | Yes
| ((lf.mul.s))          | Yes
| ((lf.sfeq.s))         | Yes
| ((lf.sfge.s))         | Yes
| ((lf.sfgt.s))         | Yes
| ((lf.sfle.s))         | Yes
| ((lf.sflt.s))         | Yes
| ((lf.sfne.s))         | Yes
| ((lf.sub.s))          | Yes
|=================================

For a complete description of each instruction's format refer to
<>.
 
Instruction Unit
^^^^^^^^^^^^^^^^
((Instruction unit)) generates the instruction fetch effective address (EA) and
fetches instructions from the instruction cache. One instruction can be fetched
each clock cycle. The instruction fetch EA is further translated into a
physical address by the IMMU.

General-Purpose Registers
^^^^^^^^^^^^^^^^^^^^^^^^^
The ((general-purpose register)) file can supply two read operands each clock
cycle and store one result in a destination register.

GPRs can also be read and written through the development interface.

Load/Store Unit
^^^^^^^^^^^^^^^
((LSU)) can execute one load instruction every two clock cycles, assuming the
load instructions hit in the data cache. Execution of store instructions
takes one clock cycle, assuming they hit in the data cache.

The LSU calculates the load/store effective address. The EA is further
translated into a physical address by the DMMU.

The load/store effective address and the load and store data can also be
accessed through the development interface.
 
Integer Execution Pipeline
^^^^^^^^^^^^^^^^^^^^^^^^^^
(((Pipeline, Integer Execution)))
The core implements the following types of 32-bit integer instructions:

* Arithmetic instructions
* Compare instructions
* Logical instructions
* Rotate and shift instructions

[[exec_time_int_table]]
.Execution Time of Integer Instructions
[width="70%",options="header"]
|================================================
| Instruction Group     | Clock Cycles to Execute
| Arithmetic except Multiply/Divide     | 1
| Multiply                              | 3
| Divide                                | 32
| Compare                               | 1
| Logical                               | 1
| Rotate and Shift                      | 1
| Others                                | 1
|================================================

<> lists execution times for instructions executed by the
integer execution pipeline. Most instructions are executed in one clock cycle.

Integer multiply and divide can be implemented either serially or in parallel.
Serial operations require one clock cycle per bit of operand, which is 32
cycles on the OR1200. At present no synthesis tools support division
operators, and so the serial option must be used for divide.
 
731
MAC Unit
732
^^^^^^^^
733
The ((MAC)) unit executes l.mac instructions. It implements a 32x32 fully
734
pipelined multiplier and a 48-bit accumulator. The MAC unit can accept one new
735
l.mac instruction each clock cycle.
736
 
737
Care should be taken when executing l.macrc (MAC read and clear) too soon
738
after the final l.mac instruction as the operation may still be underway
739
and the result will not be valid in time. It is recommended that at least 3 other
740
instructions (or just l.nops) are inserted between the final l.mac and l.macrc.
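
The recommendation above can be checked mechanically. The sketch below is a hypothetical assembler post-pass written in Python, not a tool shipped with the OR1200. It pads an instruction list with l.nop so that at least three instructions separate the last l.mac from a following l.macrc. Instructions are plain strings, and only the mnemonics l.mac and l.macrc are recognised.

```python
MIN_GAP = 3  # recommended number of instructions between l.mac and l.macrc


def pad_macrc(instrs, min_gap=MIN_GAP):
    """Return a copy of instrs with l.nop padding inserted before l.macrc."""
    out = []
    gap = None  # instructions seen since the last l.mac; None if none pending
    for ins in instrs:
        op = ins.split()[0]
        if op == "l.macrc" and gap is not None and gap < min_gap:
            out.extend(["l.nop"] * (min_gap - gap))
        out.append(ins)
        if op == "l.mac":
            gap = 0
        elif op == "l.macrc":
            gap = None  # accumulator has been read and cleared
        elif gap is not None:
            gap += 1
    return out


padded = pad_macrc(["l.mac r3,r4", "l.add r1,r2,r3", "l.macrc r5"])
```

Here one useful instruction already follows the l.mac, so only two l.nop instructions are added.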
741
 
742
Floating Point Unit
743
^^^^^^^^^^^^^^^^^^^
744
The ((floating point unit)) has a mechanism to stall the processor pipeline
745
until processing has completed.
746
 
747
The following table indicates the number of cycles per operation.
748
 
749
[[exec_time_fp_table]]
750
.Execution time of floating point instructions
751
[width="60%",options="header"]
752
|=======================
753
| Operation     | Cycles
754
| Add/subtract  | 10
755
| Multiply      | 38
756
| Divide        | 37
757
| Compare       | 2
758
| Convert       | 7
759
|=======================
760
 
761
System Unit
762
^^^^^^^^^^^
763
((System unit)) implements system control and status special-purpose registers
764
and executes all l.mtspr/l.mfspr instructions.
765
 
766
Exceptions
767
^^^^^^^^^^
768
The core implements a precise ((exception model)). This means that when an
769
exception is taken, the following conditions are met:
770
 
771
* Subsequent instructions in program flow are discarded
772
* Previous instructions finish and write back their results
773
* The address of the faulting instruction is saved in the EPCR registers and the
774
  machine state is saved to the ESR registers
775
 
776
[[exceptions_table]]
777
.List of Implemented ((Exceptions))
778
[width="95%",options="header"]
779
|===========================================================
780
| Exception Type        | Vector Offset | Causing Conditions
781
| Reset                 | 0x100 | Caused by reset.
782
| Bus Error             | 0x200 | Caused by an attempt to access an invalid
783
  physical address.
784
| Data Page Fault       | 0x300 | Generated artificially by the DTLB miss exception
785
  handler when no matching PTE is found in the page tables, or on a page protection
786
  violation for load/store operations.
787
| Instruction Page Fault| 0x400 | Generated artificially by the ITLB miss exception
788
  handler when no matching PTE is found in the page tables, or on a page protection
789
  violation for instruction fetch.
790
| Low Priority External Interrupt       | 0x500 | Low priority external
791
  interrupt asserted.
792
| Alignment     | 0x600 | Load/store access to a location that is not naturally aligned.
793
| Illegal Instruction   | 0x700 | Illegal instruction in the instruction stream.
794
| High Priority External Interrupt      | 0x800 | High priority external
795
  interrupt asserted.
796
| D-TLB Miss    | 0x900 | No matching entry in DTLB (DTLB miss).
797
| I-TLB Miss    | 0xA00 | No matching entry in ITLB (ITLB miss).
798 647 julius
| Range         | 0xB00 | If programmed in the SR, the setting of SR[OV],
799
  usually by an arithmetic instruction, causes a range exception.
800 645 julius
| System Call   | 0xC00 | System call initiated by software.
801
| Floating point exception      | 0xD00 | FP operation caused flags in FPCSR to
802
  become set.
803
| Trap  | 0xE00 | Trap instruction was decoded.
804
|===========================================================
805
 
806
The OR1200 exception support does not include support for fast context
807
switching.
808
 
809
Data Cache Operation
810
~~~~~~~~~~~~~~~~~~~~
811
Data Cache Load/Store Access
812
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
813
The load/store unit requests data from the data ((cache)), stores them into
814
the general-purpose register file and forwards them to the integer execution
815
units. The LSU is therefore tightly coupled with the data cache.
816
 
817
If there is neither a data cache line miss nor a ((DTLB)) miss, load operations take
818
two clock cycles to execute and store operations take one clock cycle to
819
execute. The LSU does all the data alignment work.
820
 
821
Data can be written to the data cache on a word, half-word or byte basis. When
822
the data cache operates in write-through mode, all writes are immediately
823
written through to main memory or to the next level of caches.
824
 
825
[[wb_write_fig]]
826
.WISHBONE Write Cycle
827
image::img/wb_write.gif[scaledwidth="70%",align="center"]
828
 
829
<<wb_write_fig>> shows how a ((write-through)) cycle on the data WISHBONE interface
830
is performed when a store instruction hits in the data cache.  If +dwb_ERR_I+
831
or +dwb_RTY_I+ is asserted instead of the usual +dwb_ACK_I+, a bus error exception
832
is invoked.
833
 
834
Data Cache Line Fill Operation
835
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
836
When a load instruction misses in the cache and either the cache uses the
837
((write-through)) strategy, or it uses the ((write-back)) strategy and the line
838
is clean or invalid, a 4-beat sequential read burst with critical word
839
first is performed. If the strategy is write-back and the line is dirty,
840
the line is first written back to memory. The critical word is forwarded to
841
the load/store unit to minimize performance loss because of the cache miss.
842
 
843
[[wb_read_fig]]
844
.WISHBONE Block Read Cycle
845
image::img/wb_read.gif[scaledwidth="70%",align="center"]
846
 
847
<<wb_read_fig>> shows how a cache line is read in a WISHBONE read block cycle
848
composed of four read transfers.  If +dwb_ERR_I+ or +dwb_RTY_I+ is asserted
849
instead of the usual +dwb_ACK_I+, a bus error exception is invoked.
850
 
851
When executing a store instruction with the cache in write-through strategy,
852
and a cache miss occurs, the write is simply put on the bus and no caching
853
occurs. If it is a miss and the cache is in write-back strategy and the line
854
is valid and clean or invalid, a 4-beat sequential read burst to fill the
855
line is performed, and then the write to the cache occurs. If storing and a cache
856
miss occurs, and the desired line is valid and dirty, it is first written
857
back to memory before the desired line is read.
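
The load and store miss cases described in this section can be summarised as a small decision function. The Python sketch below is a behavioural summary under the assumptions stated in the text, not a description of the RTL. The function name +miss_actions+ and the action strings are made up for illustration.

```python
def miss_actions(op, strategy, line_state):
    """List the bus and cache actions for a data cache miss.

    op:         "load" or "store"
    strategy:   "write-through" or "write-back"
    line_state: "invalid", "clean" or "dirty" (state of the line being replaced)
    """
    if op == "store" and strategy == "write-through":
        # The write is put on the bus and nothing is cached.
        return ["single write"]
    actions = []
    if strategy == "write-back" and line_state == "dirty":
        # Dirty data must reach memory before the line is reused.
        actions.append("write back line")
    actions.append("4-beat read burst")
    if op == "store":
        actions.append("write to cache")
    return actions


store_fill = miss_actions("store", "write-back", "invalid")
```

A dirty line can only exist under the write-back strategy, so the "dirty" state is never expected together with "write-through".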
858
 
859
[[wb_rw_fig]]
860
.WISHBONE Block Read/Write Cycle
861
image::img/wb_rw.gif[scaledwidth="70%",align="center"]
862
 
863
<<wb_rw_fig>> shows how a cache line is read in a WISHBONE read block cycle
864
followed by a write transfer.  If +dwb_ERR_I+ or +dwb_RTY_I+ is asserted instead
865
of the usual +dwb_ACK_I+, a bus error exception is invoked.
866
 
867
Cache/Memory Coherency
868
^^^^^^^^^^^^^^^^^^^^^^
869
The data cache in OR1200 operates in either write-through or write-back mode,
870
definable at synthesis time for default use, and at run time when the DMMU is
871
used. There is currently no ((coherency)) support between the local data cache and
872
caches of other processors.
873
 
874
Data Cache Enabling/Disabling
875
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
876
The data cache is disabled at power up. The entire data cache can be enabled by setting
877
bit SR[DCE] to one. Before the data cache is enabled, it must be invalidated.
878
 
879
Data Cache Invalidation
880
^^^^^^^^^^^^^^^^^^^^^^^
881
The data cache in OR1200 does not support ((invalidation)) of the entire data
882
cache. The normal procedure to invalidate the entire data cache is to cycle through
883
all data cache lines and invalidate each line separately.
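
The loop described above can be sketched as follows. This is a Python model, not target code. The 16-byte line size is an assumption derived from the 4-beat, 32-bit line fill, the cache is assumed to be direct mapped, and +write_dcbir+ is a caller-supplied stand-in for an l.mtspr to DCBIR.

```python
LINE_BYTES = 16  # assumption: 4 beats x 4 bytes per cache line


def invalidate_all(cache_bytes, write_dcbir, line_bytes=LINE_BYTES):
    """Invalidate every line by writing each line address to DCBIR."""
    for addr in range(0, cache_bytes, line_bytes):
        write_dcbir(addr)


# Record the addresses that would be written for a hypothetical 64-byte cache.
written = []
invalidate_all(64, written.append)
```

On real hardware the same loop would run before SR[DCE] is set, since the cache must be invalidated before it is enabled.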
884
 
885
Data Cache Locking
886
^^^^^^^^^^^^^^^^^^
887
The data cache implements way ((locking)) bits in the data cache control register
888
DCCR. Bits LWx lock individual ways when they are set to one.
889
 
890
Data Cache Line Prefetch
891
^^^^^^^^^^^^^^^^^^^^^^^^
892
Data cache line ((prefetch)) is optional in the OpenRISC 1000 architecture and
893
is not implemented in OR1200.
894
 
895
Data Cache Line ((Flush))
896
^^^^^^^^^^^^^^^^^^^^^^^^^
897
The operation is performed by writing the effective address to the DCBFR register.
898
 
899
When a cache line is valid and clean, or the cache is in write-through
900
strategy, the line is invalidated and no write-back occurs.
901
 
902
Data Cache Line Invalidate
903
^^^^^^^^^^^^^^^^^^^^^^^^^^
904
Data cache line ((invalidate)) invalidates a single data cache line. The operation
905
is performed by writing the effective address to the DCBIR register. If the cache
906
is in write-back strategy, it is best to use the line flush function.
907
 
908
Data Cache Line ((Write-back))
909
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
910
The operation is performed by writing the effective address to the DCBWR register.
911
 
912
If the cache is in ((write-through)) strategy, this operation is ignored because no
913
cached lines can be dirty and need writing back.
914
 
915
Data Cache Line ((Lock))
916
^^^^^^^^^^^^^^^^^^^^^^^^
917
Locking of individual data cache lines is not implemented in OR1200.
918
 
919
Data Cache ((inhibit)) with address bit 31 set
920
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
921
If the DMMU is disabled, by default all addresses with bit 31 of the address
922
asserted high will cause the data cache to be inhibited, meaning no reads
923
or writes are cached.
924
 
925
If the ((DMMU)) is enabled, it is possible for any address to be inhibited
926
or not, and in these modes the cache behaves accordingly.
927
 
928
Instruction ((Cache)) Operation
929
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
930
Instruction Cache Instruction ((Fetch)) Access
931
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
932
The instruction unit requests instructions from the instruction cache and forwards
933
them to the instruction queue inside the instruction unit. The instruction
934
unit is therefore tightly coupled with the instruction cache.
935
 
936
If there is neither an instruction cache line ((miss)) nor an ITLB miss, an instruction fetch
937
operation takes one clock cycle to execute.
938
 
939
The instruction cache cannot be explicitly modified the way the data cache can be with
940
store instructions.
941
 
942
Instruction Cache Line Fill Operation
943
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
944
On a cache miss, a 4-beat sequential read burst with critical word first is
945
performed. The critical word is forwarded to the instruction unit to minimize
946
performance loss because of the cache miss.
947
 
948
[[wb_block_read_fig]]
949
.WISHBONE Block Read Cycle
950
image::img/wb_block_read.gif[scaledwidth="70%",align="center"]
951
 
952
<<wb_block_read_fig>> shows how a cache line is read in a WISHBONE read block
953
cycle composed of four read transfers.  If +iwb_ERR_I+ or +iwb_RTY_I+ is
954
asserted instead of the usual +iwb_ACK_I+, a bus error exception is invoked.
955
 
956
Cache/Memory ((Coherency))
957
^^^^^^^^^^^^^^^^^^^^^^^^^^
958
OR1200 is not intended for use in multiprocessor environments. Therefore no
959
support for coherency between local instruction cache and caches of other
960
processors or main memory is implemented.
961
 
962
Instruction Cache Enabling/Disabling
963
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
964
The instruction cache is disabled at power up. The entire instruction cache can be
965
enabled by setting bit SR[ICE] to one. Before the instruction cache is enabled,
966
it must be invalidated.
967
 
968
Instruction Cache ((Invalidation))
969
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
970
The instruction cache in OR1200 does not support invalidation of the entire instruction
971
cache. The normal procedure to invalidate the entire instruction cache is to cycle
972
through all instruction cache lines and invalidate each line separately.
973
 
974
Instruction Cache Locking
975
^^^^^^^^^^^^^^^^^^^^^^^^^
976
The instruction cache implements way locking bits in the instruction cache control
977
register ICCR. Bits LWx lock individual ways when they are set to one.
978
 
979
Instruction Cache Line ((Prefetch))
980
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
981
Instruction cache line prefetch is optional in the OpenRISC 1000 architecture
982
and is not implemented in OR1200.
983
 
984
Instruction Cache Line ((Invalidate))
985
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
986
Instruction cache line invalidate invalidates a single instruction cache
987
line. The operation is performed by writing the effective address to the ICBIR
988
register.
989
 
990
Instruction ((Cache Line Lock))
991
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
992
Locking of individual instruction cache lines is not implemented in OR1200.
993
 
994
Data MMU
995
~~~~~~~~
996
Translation Disabled
997
^^^^^^^^^^^^^^^^^^^^
998
Load/store address translation can be disabled by clearing bit SR[DME]. If
999
translation is disabled, then the physical address used to access the data cache
1000
and optionally provided on +dwb_ADDR_O+ is the same as the load/store effective
1001
address.
1002
(((Address Translation,Data)))
1003
 
1004
Translation Enabled
1005
^^^^^^^^^^^^^^^^^^^
1006
Load/store address translation can be enabled by setting bit SR[DME]. If
1007
translation is enabled, it provides load/store effective address to physical
1008
address translation and page protection for memory accesses.
1009
(((Address Translation,Data)))
1010
 
1011
[[addr_translation_fig]]
1012
.32-bit Address Translation Mechanism using Two-Level Page Table
1013
image::img/addr_translation.gif[scaledwidth="70%",align="center"]
1014
 
1015
In the OR1200 case, ((page tables)) must be managed by the operating system's virtual
1016
memory management subsystem. <<addr_translation_fig>> shows address translation
1017
using a two-level page table. Refer to the OpenRISC 1000 architecture manual for one-level page
1018
table address translation as well as for details about address translation
1019
and page table content.
1020
 
1021
((DMMUCR)) and Flush of Entire ((DTLB))
1022
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1023
DMMUCR is not implemented in OR1200. Therefore the page table base pointer (PTBP)
1024
must be stored in a software variable. A flush of the entire DTLB must be performed
1025
by a software flush of every DTLB entry separately. A software flush is performed
1026
by manually writing bits from the TLB entries back to the PTEs.
1027
 
1028
Page Protection
1029
^^^^^^^^^^^^^^^
1030
After a virtual address is determined to be within a page covered by the
1031
valid PTE, the access is validated by the memory protection mechanism. If
1032
this protection mechanism prohibits the access, a data page fault exception
1033
is generated.
1034
(((Page Protection,Data)))
1035
 
1036
The memory protection mechanism allows selectively granting read access
1037
and write access for both supervisor and user modes. The page protection
1038
mechanism provides protection at all page level granularities.
1039
 
1040
[[protection_attrs_ldst_table]]
1041
.Protection Attributes for Load/Store Accesses
1042
[width="70%",options="header"]
1043
|================================
1044
| Protection attribute  | Meaning
1045
| DTLBWyTR[SREx]        | Enable load operations in supervisor mode to the
1046
  page.
1047
| DTLBWyTR[SWEx]        | Enable store operations in supervisor mode to the
1048
  page.
1049
| DTLBWyTR[UREx]        | Enable load operations in user mode to the page.
1050
| DTLBWyTR[UWEx]        | Enable store operations in user mode to the page.
1051
|================================
1052
 
1053
<<protection_attrs_ldst_table>> lists the page protection attributes defined in the
1054
DTLBWyTR register. For each individual page, the appropriate strategy out of
1055
seven possible strategies is programmed with the PPI field of the PTE. Because
1056
OR1200 does not implement DMMUPR, translation of PTE[PPI] into suitable set
1057
of protection bits must be performed by software and written into DTLBWyTR.
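
A minimal sketch of the software side is shown below, in Python for readability. The flag values are illustrative and are not the real bit positions in DTLBWyTR. The +PPI_TO_PROT+ table is a hypothetical operating-system policy, with only three of the seven strategies filled in. The function +access_allowed+ models the check the hardware applies once the bits have been written.

```python
from enum import IntFlag


class Prot(IntFlag):
    """Protection bits named after the DTLBWyTR fields (values illustrative)."""
    SRE = 1  # supervisor read enable
    SWE = 2  # supervisor write enable
    URE = 4  # user read enable
    UWE = 8  # user write enable


# Hypothetical OS policy: PTE[PPI] value -> protection bits for DTLBWyTR.
PPI_TO_PROT = {
    1: Prot.SRE,
    2: Prot.SRE | Prot.SWE,
    3: Prot.SRE | Prot.SWE | Prot.URE,
}


def access_allowed(prot, supervisor, is_store):
    """Return True if the access is permitted, False if it would page fault."""
    if supervisor:
        needed = Prot.SWE if is_store else Prot.SRE
    else:
        needed = Prot.UWE if is_store else Prot.URE
    return bool(prot & needed)


user_load_ok = access_allowed(PPI_TO_PROT[3], supervisor=False, is_store=False)
```

Keeping the policy in one table means the DTLB miss handler only needs a lookup when it builds the value for DTLBWyTR.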
1058
 
1059
((DTLB)) Entry Reload
1060
^^^^^^^^^^^^^^^^^^^^^
1061
OR1200 does not implement DTLB entry reloads in hardware. Instead a software
1062
routine must be used to search the page table for the correct page table entry (PTE)
1063
and copy it into the DTLB. Software is responsible for maintaining the accessed
1064
and dirty bits in the page tables.
1065
 
1066
When the LSU computes a load/store effective address whose physical address is
1067
not already cached by the DTLB, a DTLB miss exception is invoked.
1068
 
1069
The DTLB reload routine must load the correct ((PTE)) into the correct ((DTLBWyMR))
1070
and ((DTLBWyTR)) register from one of possible DTLB ways.
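
The reload routine can be modelled as a two-level table walk. The Python sketch below makes several assumptions that are not stated in this section: 8 KB pages, a first-level index taken from EA[31:24], a second-level index taken from EA[23:13], and a DTLB with 64 sets. Page tables are plain dictionaries, and the dictionary keys +DTLBWyMR+ and +DTLBWyTR+ stand in for the match and translate registers.

```python
PAGE_SHIFT = 13  # assumption: 8 KB pages
L2_BITS = 11     # assumption: EA[23:13] indexes the second-level table
DTLB_SETS = 64   # hypothetical number of DTLB sets


class PageFault(Exception):
    """Raised when no matching PTE exists (data page fault)."""


def dtlb_reload(page_dir, dtlb, ea):
    """Software DTLB miss handler: walk the tables and fill one DTLB entry."""
    vpn = ea >> PAGE_SHIFT
    l1_index = vpn >> L2_BITS
    l2_index = vpn & ((1 << L2_BITS) - 1)
    try:
        pte = page_dir[l1_index][l2_index]
    except KeyError:
        raise PageFault(hex(ea)) from None
    pte["accessed"] = True  # software maintains the accessed bit
    entry = {
        "DTLBWyMR": {"vpn": vpn, "valid": True},
        "DTLBWyTR": {"ppn": pte["ppn"], "prot": pte["prot"]},
    }
    dtlb[vpn % DTLB_SETS] = entry
    return entry


def translate(entry, ea):
    """Combine the physical page number with the page offset."""
    offset = ea & ((1 << PAGE_SHIFT) - 1)
    return (entry["DTLBWyTR"]["ppn"] << PAGE_SHIFT) | offset


page_dir = {0x12: {0x345: {"ppn": 0x7, "prot": "rw"}}}
dtlb = {}
ea = (0x12 << 24) | (0x345 << 13) | 0x10
entry = dtlb_reload(page_dir, dtlb, ea)
pa = translate(entry, ea)
```

A real handler would run in the DTLB miss exception and write the two registers with l.mtspr. The model only shows the data flow from PTE to TLB entry.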
1071
 
1072
DTLB Entry Invalidation
1073
^^^^^^^^^^^^^^^^^^^^^^^
1074
Special-purpose register DTLBEIR must be written with the effective address
1075
and the corresponding DTLB entry will be invalidated in the local DTLB.
1076
 
1077
Locking DTLB Entries
1078
^^^^^^^^^^^^^^^^^^^^
1079
Since all DTLB entry reloads are performed in software, there is no hardware
1080
locking of DTLB entries. Instead it is up to the software reload routine to
1081
avoid replacing some of the entries if so desired.
1082
 
1083
Page Attribute - Dirty (D)
1084
^^^^^^^^^^^^^^^^^^^^^^^^^^
1085
The dirty (D) attribute is not implemented in the OR1200 DTLB. It is up to the
1086
operating system to generate the dirty attribute bit with the page protection
1087
mechanism.
1088
(((Page Attributes,Data)))
1089
 
1090
Page Attribute - Accessed (A)
1091
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1092
The accessed (A) attribute is not implemented in the OR1200 DTLB. It is up to the
1093
operating system to generate the accessed attribute bit with the page protection
1094
mechanism.
1095
(((Page Attributes,Data)))
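
One common way for an operating system to emulate the dirty and accessed bits with page protection is to map a page with its enables cleared and to update the bits in the page fault handler on first use. The Python sketch below illustrates that idea only. It is an assumption about how software might do this, not OR1200-mandated behaviour, and all field names in the +pte+ dictionary are hypothetical.

```python
def on_data_page_fault(pte, is_store):
    """Emulate A and D bits in a page fault handler.

    Returns True if the access may be retried, False for a real violation.
    """
    if is_store:
        if not pte["writable"]:
            return False              # genuine protection violation
        pte["dirty"] = True           # first store to the page
        pte["write_enabled"] = True   # later stores no longer fault
    pte["accessed"] = True
    pte["read_enabled"] = True
    return True


pte = {"writable": True, "write_enabled": False, "read_enabled": False,
       "accessed": False, "dirty": False}
first_load_ok = on_data_page_fault(pte, is_store=False)
dirty_after_load = pte["dirty"]
first_store_ok = on_data_page_fault(pte, is_store=True)
```

After the handler returns, the updated enables would be written into DTLBWyTR so that the retried access succeeds.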
1096
 
1097
Page Attribute - Weakly Ordered Memory (WOM)
1098
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1099
Weakly ordered memory (WOM) attribute is not needed in OR1200 because all
1100
memory accesses are serialized and therefore this attribute is not implemented.
1101
(((Page Attributes,Data)))
1102
 
1103
Page Attribute - Write-Back Cache (WBC)
1104
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1105
Write-back cache (WBC) attribute is not implemented as the data cache cannot
1106
be configured at run time to be write-back enabled if write-through strategy
1107
was selected at synthesis-time.
1108
(((Page Attributes,Data)))
1109
 
1110
Page Attribute - Caching-Inhibited (CI)
1111
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1112
The caching-inhibited (CI) attribute is not implemented in the OR1200 DTLB. Cached
1113
and uncached regions are divided by bit 30 of the data effective address.
1114
(((Page Attributes,Data)))
1115
 
1116
[[data_cached_regions_table]]
1117
.Cached and uncached regions
1118
[width="70%",options="header"]
1119
|===============================
1120
| Effective Address     | Region
1121
| 0x00000000 - 0x3FFFFFFF       | Cached
1122
| 0x40000000 - 0x7FFFFFFF       | Uncached
1123
| 0x80000000 - 0xBFFFFFFF       | Cached
1124
| 0xC0000000 - 0xFFFFFFFF       | Uncached
1125
|===============================
1126
 
1127
Uncached accesses must be performed when I/O registers are memory mapped,
1128
so that all reads and writes are always performed directly to the external
1129
interface and not to the data cache.
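
The region table reduces to a test of bit 30 of the effective address. The following Python sketch shows the decode. The function name +is_cached+ is illustrative.

```python
CI_BIT = 30  # effective-address bit that selects the uncached regions


def is_cached(ea):
    """Return True if the 32-bit effective address lies in a cached region."""
    return ((ea >> CI_BIT) & 1) == 0


# Boundaries taken from the region table.
regions = [is_cached(a) for a in (0x00000000, 0x40000000, 0x80000000, 0xC0000000)]
```

Memory-mapped I/O registers would therefore be placed in one of the two ranges where bit 30 is set.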
1130
 
1131
Page Attribute - Cache Coherency (CC)
1132
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1133
Cache coherency (CC) attribute is not needed in OR1200 because it does not
1134
implement support for multiprocessor environments and because data cache
1135
operates only in write-through mode and therefore this attribute is not
1136
implemented.
1137
(((Page Attributes,Data)))
1138
 
1139
((Instruction MMU))
1140
~~~~~~~~~~~~~~~~~~~
1141
Translation Disabled
1142
^^^^^^^^^^^^^^^^^^^^
1143
Instruction fetch address translation can be disabled by clearing bit
1144
SR[IME]. If translation is disabled, then the physical address used to access the
1145
instruction cache and optionally provided on +iwb_ADDR_O+ is the same as the
1146
instruction fetch effective address.
1147
(((Address Translation,Instruction)))
1148
 
1149
Translation Enabled
1150
^^^^^^^^^^^^^^^^^^^
1151
Instruction fetch address translation can be enabled by setting bit
1152
SR[IME]. If translation is enabled, it provides instruction fetch effective
1153
address to physical address translation and page protection for instruction
1154
fetch accesses.
1155
(((Address Translation,Instruction)))
1156
 
1157
[[addr_translation_rep_fig]]
1158
.32-bit Address Translation Mechanism using Two-Level Page Table
1159
image::img/addr_translation.gif[scaledwidth="70%",align="center"]
1160
 
1161
In the OR1200 case, page tables must be managed by the operating system's virtual
1162
memory management subsystem. <<addr_translation_rep_fig>> shows address
1163
translation using a two-level page table. Refer to the OpenRISC 1000 architecture manual for
1164
one-level page table address translation as well as for details about address
1165
translation and page table content.
1166
 
1167
((IMMUCR)) and ((Flush)) of Entire ITLB
1168
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1169
IMMUCR is not implemented in OR1200. Therefore the page table base pointer (PTBP)
1170
must be stored in a software variable. A flush of the entire ITLB must be performed
1171
by a software flush of every ITLB entry separately. A software flush is performed
1172
by manually writing bits from the TLB entries back to the PTEs.
1173
 
1174
Page Protection
1175
^^^^^^^^^^^^^^^
1176
After a virtual address is determined to be within a page covered by the
1177
valid PTE, the access is validated by the memory protection mechanism. If
1178
this protection mechanism prohibits the access, an instruction page fault
1179
exception is generated.
1180
(((Page Protection,Instruction)))
1181
 
1182
The memory protection mechanism allows selectively granting execute access
1183
for both supervisor and user modes. The page protection mechanism provides
1184
protection at all page level granularities.
1185
 
1186
[[protection_attrs_inst_table]]
1187
.Protection Attributes for Instruction Fetch Accesses
1188
[width="70%",options="header"]
1189
|================================
1190
| Protection attribute  | Meaning
1191
| ITLBWyTR[SXEx]        | Enable execute operations in supervisor mode of the
1192
  page.
1193
| ITLBWyTR[UXEx]        | Enable execute operations in user mode of the page.
1194
|================================
1195
 
1196
<<protection_attrs_inst_table>> lists the page protection attributes defined
1197
in the ITLBWyTR register. For each individual page, the appropriate strategy out
1198
of seven possible strategies is programmed with the PPI field of the PTE. Because
1199
OR1200 does not implement IMMUPR, translation of PTE[PPI] into suitable set
1200
of protection bits must be performed by software and written into ITLBWyTR.
1201
 
1202
((ITLB)) Entry Reload
1203
^^^^^^^^^^^^^^^^^^^^^
1204
OR1200 does not implement ITLB entry reloads in hardware. Instead a software
1205
routine must be used to search the page table for the correct page table entry (PTE)
1206
and copy it into the ITLB. Software is responsible for maintaining the accessed
1207
bit in the page tables.
1208
 
1209
When the instruction unit computes an instruction fetch effective address whose physical address
1210
is not already cached by the ITLB, an ITLB miss exception is invoked.
1211
 
1212
The ITLB reload routine must load the correct PTE into the correct ITLBWyMR and ITLBWyTR
1213
registers of one of the possible ITLB ways.
1214
 
1215
ITLB Entry Invalidation
1216
^^^^^^^^^^^^^^^^^^^^^^^
1217
Special-purpose register ITLBEIR must be written with the effective address
1218
and the corresponding ITLB entry will be invalidated in the local ITLB.
1219
 
1220
Locking ITLB Entries
1221
^^^^^^^^^^^^^^^^^^^^
1222
Since all ITLB entry reloads are performed in software, there is no hardware
1223
locking of ITLB entries. Instead it is up to the software reload routine to
1224
avoid replacing some of the entries if so desired.
1225
 
1226
Page Attribute - Dirty (D)
1227
^^^^^^^^^^^^^^^^^^^^^^^^^^
1228
Dirty (D) attribute resides in the PTE but it is not used by the IMMU.
1229
(((Page Attributes,Instruction)))
1230
 
1231
Page Attribute - Accessed (A)
1232
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1233
The accessed (A) attribute is not implemented in the OR1200 ITLB. It is up to the
1234
operating system to generate the accessed attribute bit with the page protection
1235
mechanism.
1236
(((Page Attributes,Instruction)))
1237
 
1238
Page Attribute - Weakly Ordered Memory (WOM)
1239
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1240
Weakly ordered memory (WOM) attribute is not needed in OR1200 because all
1241
instruction fetch accesses are serialized and therefore this attribute is
1242
not implemented.
1243
(((Page Attributes,Instruction)))
1244
 
1245
Page Attribute - Write-Back Cache (WBC)
1246
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1247
Write-back cache (WBC) attribute resides in the PTE but it is not used by
1248
the IMMU.
1249
(((Page Attributes,Instruction)))
1250
 
1251
Page Attribute - Caching-Inhibited (CI)
1252
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1253
The caching-inhibited (CI) attribute is not implemented in the OR1200 ITLB. Cached
1254
and uncached regions are divided by bit 30 of the instruction effective address.
1255
(((Page Attributes,Instruction)))
1256
 
1257
[[inst_cached_regions_table]]
1258
.Cached and uncached regions
1259
[width="70%",options="header"]
1260
|===============================
1261
| Effective Address     | Region
1262
| 0x00000000 - 0x3FFFFFFF       | Cached
1263
| 0x40000000 - 0x7FFFFFFF       | Uncached
1264
| 0x80000000 - 0xBFFFFFFF       | Cached
1265
| 0xC0000000 - 0xFFFFFFFF       | Uncached
1266
|===============================
1267
 
1268
Page Attribute - Cache Coherency (CC)
1269
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1270
Cache coherency (CC) attribute resides in the PTE but it is not used by
1271
the IMMU.
1272
(((Page Attributes,Instruction)))
1273
 
1274
((Programmable Interrupt Controller))
1275
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1276
The PICMR special-purpose register is used to mask or unmask up to 30 programmable
1277
interrupt sources. The PICPR special-purpose register is used to assign low or
1278
high priority to a maximum of 30 interrupt sources.
1279
 
1280
The PICSR special-purpose register is used to determine the status of each interrupt
1281
input. Bits in PICSR represent the status of the interrupt inputs and the
1282
actual interrupt must be cleared in the device that is the source of a
1283
pending interrupt.
1284
 
1285
The ((PIC)) implementation in the OR1200 differs from the architecture
1286
specification. The PIC instead offers a latched level-sensitive interrupt.
1287
 
1288
Once an interrupt line is latched (i.e. its value appears in PICSR), no
1289
new interrupts can be triggered for that line until its bit in PICSR is
1290
cleared. The usual sequence for an interrupt handler is then as follows.
1291
 
1292
. Peripheral asserts interrupt, which is latched and triggers handler.
1293
. Handler processes interrupt.
1294
. Handler notifies peripheral that the interrupt has been processed (typically
1295
  via a memory mapped register).
1296
. Peripheral deasserts interrupt.
1297
. Handler clears corresponding bit in PICSR and returns.
1298
 
1299
It is assumed that the peripheral will de-assert its interrupt promptly
1300
(within 1-2 cycles). Otherwise on exiting the interrupt handler, having
1301
cleared PICSR, the level sensitive interrupt will immediately retrigger.
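
The latching behaviour and the retrigger case can be seen in a small behavioural model. The Python class below is a simplified sketch, not the RTL. It treats every line as maskable, applies the mask before latching, and represents the handler's write to PICSR as a +clear+ method. All three choices are assumptions made for illustration.

```python
class LatchedPic:
    """Toy model of a latched level-sensitive interrupt controller."""

    def __init__(self):
        self.picmr = 0  # mask register: 1 = interrupt unmasked
        self.picsr = 0  # status register: latched interrupt lines

    def sample(self, lines):
        """Latch asserted, unmasked lines; return True if any is pending."""
        self.picsr |= lines & self.picmr
        return self.picsr != 0

    def clear(self, line):
        """Handler clears one latched bit in PICSR."""
        self.picsr &= ~(1 << line)


pic = LatchedPic()
pic.picmr = 1 << 4

# Normal sequence: peripheral asserts, handler runs, peripheral deasserts,
# handler clears PICSR. Nothing is pending afterwards.
triggered = pic.sample(1 << 4)
pic.clear(4)
pending_after = pic.sample(0)

# Slow peripheral: the line is still high when PICSR is cleared,
# so the interrupt is latched again immediately.
pic.sample(1 << 4)
pic.clear(4)
retriggered = pic.sample(1 << 4)
```

The second half of the example is the situation the text warns about: clearing PICSR before the peripheral has deasserted its line causes an immediate retrigger.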
1302
 
1303
((Tick Timer))
1304
~~~~~~~~~~~~~~
1305
The tick timer facility is enabled with TTMR[M]. TTCR is incremented with each
1306
clock cycle and a high priority interrupt can be asserted whenever the lower 28
1307
bits of TTCR match TTMR[TP] and TTMR[IE] is set.
1308
 
1309
TTCR restarts counting from zero when a match event happens and TTMR[M] is
1310
0x1. If TTMR[M] is 0x2, TTCR is stopped when a match event happens and TTCR
1311
must be changed to start counting again. When TTMR[M] is 0x3, TTCR keeps
1312
counting even when match event happens.
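
The three counting modes can be compared with a cycle-level model. The Python class below is a simplified sketch. It assumes that mode 0x0 disables the timer, and it treats the match, the restart and the interrupt as happening in the same cycle, which may not reflect exact hardware timing.

```python
TP_MASK = (1 << 28) - 1  # TTMR[TP] is compared with the lower 28 bits of TTCR


class TickTimer:
    """Toy model of TTCR/TTMR behaviour for TTMR[M] = 0x0 to 0x3."""

    def __init__(self, mode, period, irq_enable=True):
        self.mode = mode                 # TTMR[M]
        self.period = period & TP_MASK   # TTMR[TP]
        self.irq_enable = irq_enable     # TTMR[IE]
        self.ttcr = 0

    def _match(self):
        return (self.ttcr & TP_MASK) == self.period

    def tick(self):
        """Advance one clock cycle; return True if an interrupt is raised."""
        if self.mode == 0:
            return False                 # assumption: 0x0 disables the timer
        if self.mode == 2 and self._match():
            return False                 # stopped until TTCR is changed
        self.ttcr = (self.ttcr + 1) & 0xFFFFFFFF
        if not self._match():
            return False
        if self.mode == 1:
            self.ttcr = 0                # restart counting from zero
        return self.irq_enable


restart = TickTimer(mode=1, period=3)
restart_irqs = [restart.tick() for _ in range(6)]

one_shot = TickTimer(mode=2, period=2)
one_shot_irqs = [one_shot.tick() for _ in range(4)]
```

In the one-shot case the counter stays at the match value until software writes a new value to +ttcr+, mirroring the requirement that TTCR must be changed to start counting again.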
1313
 
1314
((Power Management))
1315
~~~~~~~~~~~~~~~~~~~~
1316
((Clock Gating)) and Frequency Changing Versus CPU Stalling
1317
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1318
If the system doesn't support clock gating and if changing the clock frequency in
1319
slow down mode is not possible, the CPU can be stalled for a certain number of
1320
clock cycles. This gives a much smaller benefit, but it
1321
still reduces power consumption.
1322
 
1323
Slow Down Mode
1324
^^^^^^^^^^^^^^
1325
Slow down mode is software controlled with the 4-bit value in PMR[SDF]. A lower
1326
value specifies higher expected performance from the processor core. Usually
1327
PMR[SDF] is dynamically set by the operating system's idle routine, which
1328
monitors the usage of the processor core.
1329
(((Mode,Slow Down)))
1330
 
1331
PMR[SDF] is broadcast on +pm_clksd+. The external clock generator should adjust
1332
the clock frequency according to the value of +pm_clksd+. Exact slow down factors
1333
are not defined but 0xF should go all the way down to 32.768 kHz.
1334
 
1335
With +pm_clksd+ equal to 0xF, +pm_lvolt+ is asserted. This is an indication for
1336
the external power supply to lower the voltage.
1337
 
1338
Doze Mode
1339
^^^^^^^^^
1340
To switch to doze mode, software should set PMR[DME]. Once an interrupt
1341
is received by the programmable interrupt controller (PIC), +pm_wakeup+
1342
is asserted and external clock generation circuitry should enable all
1343
clocks. Once clocks are running, the RISC is switched back to normal
1344
mode and PMR[DME] is cleared.
1345
(((Mode,Doze)))
1346
 
1347
When doze mode is enabled, +pm_dc_gate+, +pm_ic_gate+, +pm_dmmu_gate+,
1348
+pm_immu_gate+ and +pm_cpu_gate+ are asserted. As a result all clocks except
1349
+clk_tt+ should be gated by external clock generation circuitry.
1350
 
1351
Sleep Mode
1352
^^^^^^^^^^
1353
To switch to sleep mode, software should set PMR[SME]. Once an interrupt
1354
is received by the programmable interrupt controller (PIC), +pm_wakeup+ is
1355
asserted and external clock generation should enable all clocks. Once clocks
1356
are running, the RISC is switched back to normal mode and PMR[SME]
1357
is cleared.
1358
(((Mode,Sleep)))
1359
 
1360
When sleep mode is enabled, +pm_dc_gate+, +pm_ic_gate+, +pm_dmmu_gate+,
1361
+pm_immu_gate+, +pm_cpu_gate+ and +pm_tt_gate+ are asserted. As a result
1362
all clocks including +clk_tt+ should be gated by external clock generation
1363
circuitry.
1364
 
1365
In sleep mode, +pm_lvolt+ is asserted. This is an indication for the external
1366
power supply to lower the voltage.
1367
 
1368
Clock Gating
1369
^^^^^^^^^^^^
1370
((Clock gating)) feature is not implemented in OR1200 power management.
1371
 
1372
Disabled Units Force Clock Gating
1373
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1374
Units that are disabled in special-purpose register SR have their clock
1375
gate signals asserted. Cleared bits SR[DCE], SR[ICE], SR[DME] and SR[IME]
1376
directly force assertion of +pm_dc_gate+, +pm_ic_gate+, +pm_dmmu_gate+
1377
and +pm_immu_gate+.
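
The relationship between the SR enable bits, the power modes and the gate signals can be summarised in a short function. The Python sketch below combines the rules from the doze, sleep and disabled-unit sections. The function name +asserted_gates+ and the mode strings are illustrative.

```python
# SR enable bit -> clock gate signal asserted when the bit is cleared.
SR_TO_GATE = {
    "DCE": "pm_dc_gate",
    "ICE": "pm_ic_gate",
    "DME": "pm_dmmu_gate",
    "IME": "pm_immu_gate",
}


def asserted_gates(sr_enabled, mode="normal"):
    """Return the set of gate signals asserted.

    sr_enabled: set of SR bit names that are currently set
    mode:       "normal", "doze" or "sleep"
    """
    gates = {sig for bit, sig in SR_TO_GATE.items() if bit not in sr_enabled}
    if mode in ("doze", "sleep"):
        gates |= set(SR_TO_GATE.values()) | {"pm_cpu_gate"}
    if mode == "sleep":
        gates.add("pm_tt_gate")  # sleep also gates the tick timer clock
    return gates


only_icache = asserted_gates({"ICE"})
```

With only the instruction cache enabled, the data cache and both MMUs have their gate signals asserted even in normal mode.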
1378
 
1379
((Debug Unit))
1380
~~~~~~~~~~~~~~
1381
The debug unit can be controlled through the development interface, or it can operate
1382
independently, programmed and handled by the RISC's resident debug software.
1383
 
1384
((Watchpoints))
1385
^^^^^^^^^^^^^^^
1386
The OR1200 debug unit does not implement OR1000 architecture watchpoints.
1387
 
1388
((Breakpoint)) Exception
1389
^^^^^^^^^^^^^^^^^^^^^^^^
1390
The DMR2[WGB] bits specify which watchpoints invoke a breakpoint
1391
exception. By invoking the breakpoint exception, a target-resident debugger can
1392
be built.
1393
 
1394
A breakpoint is broadcast on the development interface on +dbg_bp_o+.
1395
 
1396
((Development Interface))
1397
~~~~~~~~~~~~~~~~~~~~~~~~~
1398
NOTE: The information in this section is to be reviewed. It is the author's
1399
opinion that the debug interface is now largely provided by the SPR mappings,
1400
and no special sideband functions exist aside from stalling and resetting
1401
the core.
1402
 
1403
An additional _development and debug interface IP_ core may be used to connect
1404
OpenRISC 1200 to standard debuggers using the IEEE 1149.1 (JTAG) protocol.
1405
 
1406
((Debugging)) Through ((Development Interface))
1407
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1408
The DSR special-purpose register specifies which exceptions cause the core
1409
to stop the execution of the exception handler and turn over control to the
1410
development interface. It can be programmed by the resident debug software
1411
or by the development interface.
1412
 
1413
The DRR special-purpose register specifies which event caused the core to
1414
stop the execution of program flow and turned over control to the development
1415
interface. It should be cleared by the resident debug software or by the
1416
development interface.
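
As a rough illustration, DSR and DRR values can be built and decoded with a pair of helpers. The bit layout is not given in this document; the ordering below follows the DSR/DRR description in the architecture manual and should be treated as an assumption to verify. The helper names are hypothetical.

```python
# Exception bit names, least significant bit first. ASSUMPTION: this ordering
# is taken from the OpenRISC 1000 architecture manual, not from this document.
DEBUG_EVENT_BITS = [
    "RSTE", "BUSEE", "DPFE", "IPFE", "TTE", "AE", "IIE",
    "INTE", "DME", "IME", "RE", "SCE", "FPE", "TE",
]


def dsr_mask(*names: str) -> int:
    """Build a DSR value that stops the core on the named exceptions."""
    mask = 0
    for name in names:
        mask |= 1 << DEBUG_EVENT_BITS.index(name)
    return mask


def decode_debug_events(value: int) -> list[str]:
    """List the exception names whose bits are set in a DSR or DRR value."""
    return [name for i, name in enumerate(DEBUG_EVENT_BITS) if (value >> i) & 1]
```

A debugger that only wants to regain control on trap exceptions would write `dsr_mask("TE")` to DSR, and later pass the value read from DRR to `decode_debug_events` before clearing it.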
1417
 
1418
The DIR special-purpose register is not implemented.
1419
 
1420
Reading PC, Load/Store EA, Load Data, Store Data, Instruction
1421
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1422
Crucial information like ((program counter)) (PC), load/store effective
1423
address (LSEA), load data, store data and current instruction in execution
1424
pipeline can be asynchronously read through the development interface.
1425
 
1426
[[dev_commands_table]]
1427
.Development Interface Operation Commands
1428
[width="70%",options="header"]
1429
|========================
1430
| dbg_op_i[2:0] | Meaning
1431
| 0x0           | Reading Program Counter (PC)
1432
| 0x1           | Reading Load/Store Effective Address
1433
| 0x2           | Reading Load Data
1434
| 0x3           | Reading Store Data
1435
| 0x4           | Reading SPR
1436
| 0x5           | Writing SPR
1437
| 0x6           | Reading Instruction in Execution Pipeline
1438
| 0x7           | Reserved
1439
|========================
1440
 
1441
<<dev_commands_table>> lists the operation commands that control what is read
or written through the development interface. All accesses except reads and
writes of SPRs are asynchronous.
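
For host-side tooling, the command encoding can be captured in an enumeration. This is a minimal sketch; the class, member and function names are hypothetical, and only the numeric values come from the table above.

```python
from enum import IntEnum


class DbgOp(IntEnum):
    """Values driven on dbg_op_i[2:0]; 0x7 is reserved and deliberately absent."""
    READ_PC = 0x0
    READ_LSEA = 0x1
    READ_LOAD_DATA = 0x2
    READ_STORE_DATA = 0x3
    READ_SPR = 0x4
    WRITE_SPR = 0x5
    READ_INSN = 0x6


def is_spr_access(op: int) -> bool:
    """SPR reads and writes are the only operations that are not asynchronous."""
    return DbgOp(op) in (DbgOp.READ_SPR, DbgOp.WRITE_SPR)
```

Because the reserved value has no member, `DbgOp(0x7)` raises `ValueError`, which makes accidental use of the reserved encoding visible early.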
1444
 
1445
Reading and Writing SPRs Through Development Interface
1446
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1447
For reads and writes of SPRs, +dbg_op_i+ must be set to 0x4 and 0x5,
respectively.
1449
 
1450
[[dev_interface_cycles_fig]]
1451
.Development Interface Cycles
1452
image::img/dev_interface_cycles.gif[scaledwidth="70%",align="center"]
1453
 
1454
<<dev_interface_cycles_fig>> shows development interface cycles. Writes must
be synchronous to the positive edge of the main RISC clock and should take one
clock cycle. Reads must take two clock cycles because access to synchronous
cache lines or to TLB entries introduces one clock cycle of delay.
1458
 
1459
If required, an external debugger can stop the CPU core by asserting
+dbg_stall_i+. This gives it enough time to read all interesting
registers from the RISC, and guarantees that writes into SPRs are performed
without the RISC writing to the same registers.
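
The stall-then-access pattern described above might look as follows on the debugger side. `FakeDevInterface` is a purely hypothetical stand-in for a real debug transport; it only counts clock cycles according to the one-cycle write and two-cycle read rule and does not model signal timing.

```python
class FakeDevInterface:
    """Hypothetical stand-in for a debug transport; not part of the OR1200."""

    def __init__(self, sprs):
        self.sprs = dict(sprs)
        self.stalled = False
        self.cycles = 0

    def set_stall(self, value: bool) -> None:
        self.stalled = value          # models driving dbg_stall_i

    def read_spr(self, addr: int) -> int:
        self.cycles += 2              # SPR reads take two clock cycles
        return self.sprs.get(addr, 0)

    def write_spr(self, addr: int, data: int) -> None:
        self.cycles += 1              # SPR writes take one clock cycle
        self.sprs[addr] = data


def snapshot(dev, addrs):
    """Read several SPRs while the core is stalled, then release it."""
    dev.set_stall(True)
    try:
        return {addr: dev.read_spr(addr) for addr in addrs}
    finally:
        dev.set_stall(False)
```

Releasing the stall in a `finally` clause keeps the core from staying halted if a read fails partway through the snapshot.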
1463
 
1464
Tracking ((Data Flow))
1465
^^^^^^^^^^^^^^^^^^^^^^
1466
An external debugger can monitor and record data flow inside the RISC for
1467
debugging purposes and profiling analysis. This is accomplished by monitoring
1468
status of the load/store unit, load/store effective address and load/store
1469
data, all available at the development interface.
1470
 
1471
[[status_ldst_unit_table]]
1472
.Status of the Load/Store Unit
1473
[width="70%",options="header"]
1474
|============================================================
1475
| dbg_lss_o[3:0]        | Load/Store Instruction in Execution
1476
| 0x0   | No load/store instruction in execution
1477
| 0x1   | Reserved for load doubleword
1478
| 0x2   | Load byte and zero extend
1479
| 0x3   | Load byte and sign extend
1480
| 0x4   | Load halfword and zero extend
1481
| 0x5   | Load halfword and sign extend
1482
| 0x6   | Load singleword and zero extend
1483
| 0x7   | Load singleword and sign extend
1484
| 0x8   | Reserved for store doubleword
1485
| 0x9   | Reserved
1486
| 0xA   | Store byte
1487
| 0xB   | Reserved
1488
| 0xC   | Store halfword
1489
| 0xD   | Reserved
1490
| 0xE   | Store singleword
1491
| 0xF   | Reserved
1492
|============================================================
1493
 
1494
An external trace buffer can capture all interesting data flow
events by analyzing the status of the load/store unit available on
+dbg_lss_o+. <<status_ldst_unit_table>> lists the status encodings for
the load/store unit.
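
A trace tool could decode +dbg_lss_o+ with a lookup table like the one below. This is a sketch with hypothetical names; it assumes a halfword is 2 bytes and a singleword is 4 bytes, and it treats the reserved encodings the same as "no access".

```python
from typing import NamedTuple, Optional


class LsuStatus(NamedTuple):
    kind: str          # "load" or "store"
    size_bytes: int    # 1, 2 or 4
    sign_extend: bool  # only meaningful for loads


_LSS = {
    0x2: LsuStatus("load", 1, False),
    0x3: LsuStatus("load", 1, True),
    0x4: LsuStatus("load", 2, False),
    0x5: LsuStatus("load", 2, True),
    0x6: LsuStatus("load", 4, False),
    0x7: LsuStatus("load", 4, True),
    0xA: LsuStatus("store", 1, False),
    0xC: LsuStatus("store", 2, False),
    0xE: LsuStatus("store", 4, False),
}


def decode_lss(dbg_lss_o: int) -> Optional[LsuStatus]:
    """Return the access in flight, or None when idle (0x0) or reserved."""
    return _LSS.get(dbg_lss_o & 0xF)
```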
1498
 
1499
Tracking ((Program Flow))
1500
^^^^^^^^^^^^^^^^^^^^^^^^^
1501
An external debugger can monitor and record program flow inside the RISC
1502
for debugging purposes and profiling analysis. This is accomplished by
1503
monitoring status of the instruction unit, PC and fetched instruction word,
1504
all available at the development interface.
1505
 
1506
[[status_inst_unit_table]]
1507
.Status of the Instruction Unit
1508
[width="70%",options="header"]
1509
|=========================================
1510
| dbg_is_o[1:0] | Instruction Fetch Status
1511
| 0x0   | No instruction fetch in progress
1512
| 0x1   | Normal instruction fetch
1513
| 0x2   | Executing branch instruction
1514
| 0x3   | Fetching instruction in delay slot
1515
|=========================================
1516
 
1517
An external trace buffer can capture all interesting program flow
events by analyzing the status of the instruction unit available on
+dbg_is_o+. <<status_inst_unit_table>> lists the status encodings for
the instruction unit.
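
As an illustration of program flow tracing, the filter below keeps only the samples in which a branch is executing. The capture format, a sequence of `(dbg_is_o, pc)` pairs, is a hypothetical choice made for this sketch and is not defined by the core.

```python
# dbg_is_o[1:0] encodings from the instruction unit status table.
IS_IDLE, IS_FETCH, IS_BRANCH, IS_DELAY_SLOT = range(4)


def branch_pcs(samples):
    """Yield the PC of every sample whose status is 'executing branch'.

    `samples` is an iterable of (dbg_is_o, pc) pairs (assumed format).
    """
    for status, pc in samples:
        if (status & 0x3) == IS_BRANCH:
            yield pc
```

Recording only branch samples is one way to keep an external trace buffer small, since straight-line fetches between branches can be reconstructed from the program image.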
1521
 
1522
Triggering ((External Watchpoint Event))
1523
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1524
<<watchpoint_trigger_fig>> shows how the development interface can assert
+dbg_ewt_i+ and cause a watchpoint event. If programmed, an external watchpoint
event will cause a breakpoint exception.
1527
 
1528
[[watchpoint_trigger_fig]]
1529
.Assertion of External Watchpoint Trigger
1530
image::img/watchpoint_trigger.gif[scaledwidth="70%",align="center"]
1531
 
1532
((Registers))
1533
-------------
1534
This section describes all registers inside the OR1200 core. The address of
each special-purpose register is computed by shifting the _GRP_ number 11 bits
left and adding the _REG_ number. All registers are 32 bits wide from the
software perspective. _USER MODE_ and _SUPV MODE_ specify the valid access
types for each register in user mode and supervisor mode of operation. R/W
stands for read and write access and R stands for read only access.
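
The address computation can be written as a one-line helper. The function name is hypothetical; the formula is the one stated above.

```python
def spr_addr(group: int, reg: int) -> int:
    """Return the SPR address for a group number and register number."""
    if not 0 <= reg < (1 << 11):
        raise ValueError("register number must fit in 11 bits")
    return (group << 11) + reg
```

For example, SR (group 0, register 17) is at address `0x11`, DMR1 (group 6, register 16) is at `0x3010`, and TTCR (group 10, register 1) is at `0x5001`.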
1540
 
1541
((Registers list))
1542
~~~~~~~~~~~~~~~~~~
1543
[[regs_table]]
1544
.List of All Registers
1545
[width="95%",options="header"]
1546
|============================================================================
1547
| Grp # | Reg # | Reg Name      | USER MODE     | SUPV MODE     | Description
1548
| 0     | 0     | ((VR))        | -             | R     | Version Register
1549
| 0     | 1     | ((UPR))       | -             | R     | Unit Present Register
1550
| 0     | 2     | ((CPUCFGR))   | -             | R     | CPU Configuration Register
1551
| 0     | 3     | ((DMMUCFGR))  | -             | R     | Data MMU Configuration Register
1552
| 0     | 4     | ((IMMUCFGR))  | -             | R     | Instruction MMU Configuration Register
1553
| 0     | 5     | ((DCCFGR))    | -             | R     | Data Cache Configuration Register
1554
| 0     | 6     | ((ICCFGR))    | -             | R     | Instruction Cache Configuration Register
1555
| 0     | 7     | ((DCFGR))     | -             | R     | Debug Configuration Register
1556
| 0     | 16    | ((PC))        | -             | R/W   | PC mapped to SPR space
1557
| 0     | 17    | ((SR))        | -             | R/W   | Supervision Register
1558
| 0     | 20    | ((FPCSR))     | -             | R/W   | FP Control Status Register
1559
| 0     | 32    | ((EPCR0))     | -             | R/W   | Exception PC Register
1560
| 0     | 48    | ((EEAR0))     | -             | R/W   | Exception EA Register
1561
| 0     | 64    | ((ESR0))      | -             | R/W   | Exception SR Register
1562
| 0     | 1024-1055     | ((GPR0-GPR31))        | -     | R/W   | GPRs mapped to SPR space
1563
| 1     | 2             | ((DTLBEIR))   | -     | W     | Data TLB Entry Invalidate Register
1564
| 1     | 1024-1151     | ((DTLBW0MR0-DTLBW0MR127))     | -     | R/W   | Data TLB Match Registers Way 0
1565
| 1     | 1536-1663     | ((DTLBW0TR0-DTLBW0TR127))     | -     | R/W   | Data TLB Translate Registers Way 0
1566
| 2     | 2             | ((ITLBEIR))   | -     | W     | Instruction TLB Entry Invalidate Register
1567
| 2     | 1024-1151     | ((ITLBW0MR0-ITLBW0MR127))     | -     | R/W   | Instruction TLB Match Registers Way 0
1568
| 2     | 1536-1663     | ((ITLBW0TR0-ITLBW0TR127))     | -     | R/W   | Instruction TLB Translate Registers Way 0
1569
| 3     | 0     | ((DCCR))      | -             | R/W   | DC Control Register
1570
| 3     | 2     | ((DCBFR))     | W             | W     | DC Block Flush Register
1571
| 3     | 3     | ((DCBIR))     | W             | W     | DC Block Invalidate Register
1572
| 3     | 4     | ((DCBWR))     | W             | W     | DC Block Write-back Register
1573
| 4     | 0     | ((ICCR))      | -             | R/W   | IC Control Register
1574
| 4     | 256   | ((ICBIR))     | W             | W     | IC Block Invalidate Register
1575
| 5     | 256   | ((MACLO))     | R/W           | R/W   | MAC Low
1576
| 5     | 257   | ((MACHI))     | R/W           | R/W   | MAC High
1577
| 6     | 16    | ((DMR1))      | -             | R/W   | Debug Mode Register 1
1578
| 6     | 17    | ((DMR2))      | -             | R/W   | Debug Mode Register 2
1579
| 6     | 20    | ((DSR))       | -             | R/W   | Debug Stop Register
1580
| 6     | 21    | ((DRR))       | -             | R/W   | Debug Reason Register
1581
| 8     | 0     | ((PMR))       | -             | R/W   | Power Management Register
1582
| 9     | 0     | ((PICMR))     | -             | R/W   | PIC Mask Register
1583
| 9     | 2     | ((PICSR))     | -             | R/W   | PIC Status Register
1584
| 10    | 0     | ((TTMR))      | -             | R/W   | Tick Timer Mode Register
1585
| 10    | 1     | ((TTCR))      | R*            | R/W   | Tick Timer Count Register
1586
|============================================================================
1587
 
1588
<<regs_table>> lists all OpenRISC 1000 special-purpose registers implemented
in OR1200. Registers VR, UPR and the configuration registers are described
below. For description of other registers refer to <<or1000_manual>>.
1591
 
1592
Register VR description
1593
~~~~~~~~~~~~~~~~~~~~~~~
1594
Special-purpose register VR identifies the version (model) and revision
1595
level of the OpenRISC 1000 processor. It also specifies possible standard
1596
template on which this implementation is based.
1597
(((Register,VR)))
1598
 
1599
[[vr_reg_table]]
1600
.VR Register
1601
[width="95%",options="header"]
1602
|============================================================
1603
| Bit # | Access        | Reset | Short Name    | Description
1604
| 5:0   | R     | Revision      | REV           | Revision number of this document.
1605
| 15:6  | R     | 0x0           | -             | Reserved
1606
| 23:16 | R     | 0x00          | CFG           | Configuration should be read from UPR and configuration registers
1607
| 31:24 | R     | 0x12          | VER           | Version number for OR1200 is fixed at 0x12.
1608
|============================================================
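
The VR fields can be extracted with shifts and masks, as in this minimal sketch. The function name and the sample value in the note below are illustrative only.

```python
def decode_vr(vr: int) -> dict:
    """Split a 32-bit VR value into the fields listed in the VR table."""
    return {
        "REV": vr & 0x3F,          # bits 5:0
        "CFG": (vr >> 16) & 0xFF,  # bits 23:16
        "VER": (vr >> 24) & 0xFF,  # bits 31:24
    }
```

A value of `0x12000008` decodes to VER `0x12`, CFG `0x00` and REV `8`.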
1609
 
1610
Register UPR description
1611
~~~~~~~~~~~~~~~~~~~~~~~~
1612
Special-purpose register UPR identifies the units present in the processor. It
1613
has a bit for each implemented unit or functionality. Lower sixteen bits
1614
identify present units defined in the OpenRISC 1000 architecture. Upper
1615
sixteen bits define present custom units.
1616
(((Register,UPR)))
1617
 
1618
[[upr_reg_table]]
1619
.UPR Register
1620
[width="95%",options="header"]
1621
|============================================================
1622
| Bit # | Access        | Reset | Short Name    | Description
1623
| 0     | R             | 1     | UP            | UPR present
1624
| 1     | R             | 1     | DCP           | Data cache present[†]
1625
| 2     | R             | 1     | ICP           | Instruction cache present[†]
1626
| 3     | R             | 1     | DMP           | Data MMU present[†]
1627
| 4     | R             | 1     | IMP           | Instruction MMU present[†]
1628
| 5     | R             | 1     | MP            | MAC present[†]
1629
| 6     | R             | 1     | DUP           | Debug unit present[†]
1630
| 7     | R             | 0     | PCUP          | Performance counters unit not present[†]
1631
| 8     | R             | 1     | PMP           | Power Management Present[†]
1632
| 9     | R             | 1     | PICP          | Programmable interrupt controller present
1633
| 10    | R             | 1     | TTP           | Tick timer present
1634
| 11    | R             | 1     | FPP           | Floating point present[†]
1635
| 23:12 | R             | X     | -             | Reserved
1636
| 31:24 | R             | 0xXX  | CUP           | The user of the OR1200 core adds custom units.
1637
|============================================================
1638
[†]: if enabled at synthesis time
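
Software can probe which units were synthesized by testing the UPR bits. The sketch below uses the short names from the table; the function names are hypothetical.

```python
# Short names for UPR bits 0 through 11, in bit order.
UPR_BITS = [
    "UP", "DCP", "ICP", "DMP", "IMP", "MP",
    "DUP", "PCUP", "PMP", "PICP", "TTP", "FPP",
]


def units_present(upr: int) -> list[str]:
    """Return the short names of all architecture-defined units that are set."""
    return [name for i, name in enumerate(UPR_BITS) if (upr >> i) & 1]


def custom_units(upr: int) -> int:
    """Return the CUP field, bits 31:24."""
    return (upr >> 24) & 0xFF
```

A core built without an FPU and without performance counters would report `0x77F` in the low bits, which `units_present` turns into every name except `PCUP` and `FPP`.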
1639
 
1640
Register CPUCFGR description
1641
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1642
Special-purpose register CPUCFGR identifies the capabilities and configuration
1643
of the CPU.
1644
(((Register,CPUCFGR)))
1645
 
1646
[[cpucfgr_reg_table]]
1647
.CPUCFGR Register
1648
[width="95%",options="header"]
1649
|============================================================
1650
| Bit # | Access        | Reset | Short Name    | Description
1651
| 3:0   | R             | 0x0   | NSGF          | Zero number of shadow GPR files
1652
| 4     | R             | 0     | HGF           | No half GPR files[†]
1653
| 5     | R             | 1     | OB32S         | ORBIS32 supported
1654
| 6     | R             | 0     | OB64S         | ORBIS64 not supported
1655
| 7     | R             | 1     | OF32S         | ORFPX32 supported[‡]
1656
| 8     | R             | 0     | OF64S         | ORFPX64 not supported
1657
| 9     | R             | 0     | OV64S         | ORVDX64 not supported
1658
|============================================================
1659
[†]: If disabled at synthesis time
1660
 
1661
[‡]: If FPU enabled at synthesis time
1662
 
1663
Register DMMUCFGR description
1664
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1665
Special-purpose register DMMUCFGR identifies the capabilities and configuration
1666
of the DMMU.
1667
(((Register,DMMUCFGR)))
1668
 
1669
[[dmmucfgr_reg_table]]
1670
.DMMUCFGR Register
1671
[width="95%",options="header"]
1672
|============================================================
1673
| Bit # | Access        | Reset | Short Name    | Description
1674
| 1:0   | R             | 0x0   | NTW           | One DTLB way
1675
| 4:2   | R             | 0x4 - 0x7     | NTS   | 16, 32, 64 or 128 DTLB sets
1676
| 7:5   | R             | 0x0   | NAE           | No ATB Entries
1677
| 8     | R             | 0     | CRI           | No DMMU control register implemented
1678
| 9     | R             | 0     | PRI           | No protection register implemented
1679
| 10    | R             | 1     | TEIRI         | DTLB entry invalidate register implemented
1680
| 11    | R             | 0     | HTR           | No hardware DTLB reload
1681
|============================================================
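
A decoder for this register might look like the sketch below. It assumes the field encodings implied by the table: the number of sets is two to the power of NTS (0x4 to 0x7 giving 16 to 128), and the number of ways is NTW plus one. The function name is hypothetical. Since IMMUCFGR uses the same field layout, the same helper would serve for both.

```python
def decode_mmucfgr(value: int) -> dict:
    """Decode a DMMUCFGR or IMMUCFGR value (encodings assumed, see text)."""
    return {
        "ways": (value & 0x3) + 1,                    # NTW, bits 1:0
        "sets": 1 << ((value >> 2) & 0x7),            # NTS, bits 4:2
        "atb_entries": (value >> 5) & 0x7,            # NAE, bits 7:5
        "control_reg": bool((value >> 8) & 1),        # CRI
        "protection_reg": bool((value >> 9) & 1),     # PRI
        "entry_invalidate_reg": bool((value >> 10) & 1),  # TEIRI
        "hw_reload": bool((value >> 11) & 1),         # HTR
    }
```

For a 64-set, one-way TLB with only the entry invalidate register implemented, the register reads `0x418`.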
1682
 
1683
Register IMMUCFGR description
1684
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1685
Special-purpose register IMMUCFGR identifies the capabilities and configuration
1686
of the IMMU.
1687
(((Register,IMMUCFGR)))
1688
 
1689
[[immucfgr_reg_table]]
1690
.IMMUCFGR Register
1691
[width="95%",options="header"]
1692
|============================================================
1693
| Bit # | Access        | Reset | Short Name    | Description
1694
| 1:0   | R             | 0x0   | NTW           | One ITLB way
1695
| 4:2   | R             | 0x4 - 0x7     | NTS   | 16, 32, 64 or 128 ITLB sets
1696
| 7:5   | R             | 0x0   | NAE           | No ATB Entries
1697
| 8     | R             | 0     | CRI           | No IMMU control register implemented
1698
| 9     | R             | 0     | PRI           | No protection register implemented
1699
| 10    | R             | 1     | TEIRI         | ITLB entry invalidate register implemented
1700
| 11    | R             | 0     | HTR           | No hardware ITLB reload
1701
|============================================================
1702
 
1703
Register DCCFGR description
1704
~~~~~~~~~~~~~~~~~~~~~~~~~~~
1705
Special-purpose register DCCFGR identifies the capabilities and configuration
1706
of the data cache.
1707
(((Register,DCCFGR)))
1708
 
1709
[[dccfgr_reg_table]]
1710
.DCCFGR Register
1711
[width="95%",options="header"]
1712
|============================================================
1713
| Bit # | Access        | Reset | Short Name    | Description
1714
| 2:0   | R             | 0x0   | NCW           | One DC way
1715
| 6:3   | R             | 0x4 - 0x7     | NCS   | 16, 32, 64 or 128 DC sets
1716
| 7     | R             | 0x0   | CBS           | 16-byte cache block size
1717
| 8     | R             | 0     | CWS           | Cache write-through strategy[†]
1718
| 9     | R             | 1     | CCRI          | DC control register implemented
1719
| 10    | R             | 1     | CBIRI         | DC block invalidate register implemented
1720
| 11    | R             | 0     | CBPRI         | DC block prefetch register not implemented
1721
| 12    | R             | 0     | CBLRI         | DC block lock register not implemented
1722
| 13    | R             | 1     | CBFRI         | DC block flush register implemented
1723
| 14    | R             | 1     | CBWBRI        | DC block write-back register implemented[‡]
1724
|============================================================
1725
[†]: If disabled at synthesis time
1726
 
1727
[‡]: If FPU enabled at synthesis time
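
The cache geometry can be derived from the low bits of this register. The sketch below assumes power-of-two encodings for NCW and NCS (consistent with 0x0 meaning one way and 0x4 to 0x7 meaning 16 to 128 sets) and assumes that CBS set to 1 means a 32-byte block, as in the architecture manual. The function name is hypothetical.

```python
def decode_cache_cfgr(value: int) -> dict:
    """Derive cache geometry from DCCFGR or ICCFGR (encodings assumed)."""
    ways = 1 << (value & 0x7)                 # NCW, bits 2:0
    sets = 1 << ((value >> 3) & 0xF)          # NCS, bits 6:3
    block = 32 if (value >> 7) & 1 else 16    # CBS, bit 7
    return {
        "ways": ways,
        "sets": sets,
        "block_bytes": block,
        "size_bytes": ways * sets * block,
    }
```

With NCS at 0x7 and all other geometry bits zero, the result is one way of 128 sets with 16-byte blocks, or 2048 bytes in total.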
1728
 
1729
Register ICCFGR description
1730
~~~~~~~~~~~~~~~~~~~~~~~~~~~
1731
Special-purpose register ICCFGR identifies the capabilities and configuration
1732
of the instruction cache.
1733
(((Register,ICCFGR)))
1734
 
1735
[[iccfgr_reg_table]]
1736
.ICCFGR Register
1737
[width="95%",options="header"]
1738
|============================================================
1739
| Bit # | Access        | Reset | Short Name    | Description
1740
| 2:0   | R             | 0x0   | NCW           | One IC way
1741
| 6:3   | R             | 0x4 - 0x7     | NCS   | 16, 32, 64 or 128 IC sets
1742
| 7     | R             | 0x0   | CBS           | 16-byte cache block size
1743
| 8     | R             | 0     | CWS           | Cache write-through strategy
1744
| 9     | R             | 1     | CCRI          | IC control register implemented
1745
| 10    | R             | 1     | CBIRI         | IC block invalidate register implemented
1746
| 11    | R             | 0     | CBPRI         | IC block prefetch register not implemented
1747
| 12    | R             | 0     | CBLRI         | IC block lock register not implemented
1748
| 13    | R             | 1     | CBFRI         | IC block flush register implemented
1749
| 14    | R             | 0     | CBWBRI        | IC block write-back register not implemented
1750
|============================================================
1751
 
1752
Register DCFGR description
1753
~~~~~~~~~~~~~~~~~~~~~~~~~~
1754
Special-purpose register DCFGR identifies the capabilities and configuration
1755
of the debug unit.
1756
(((Register,DCFGR)))
1757
 
1758
[[dcfgr_reg_table]]
1759
.DCFGR Register
1760
[width="95%",options="header"]
1761
|============================================================
1762
| Bit # | Access        | Reset | Short Name    | Description
1763
| 3:0   | R             | 0x0   | NDP           | Zero DVR/DCR pairs[†]
1764
| 4     | R             | 0     | WPCI          | Watchpoint counters not implemented
1765
|============================================================
1766
[†]: If hardware breakpoints disabled at synthesis time
1767
 
1768
((IO ports))
1769
------------
1770
The OR1200 IP core has several interfaces. <<core_interfaces_fig>> below shows
all interfaces:
1772
 
1773
* Instruction and data WISHBONE host interfaces
1774
* Power management interface
1775
* Development interface
1776
* Interrupts interface
1777
 
1778
[[core_interfaces_fig]]
1779
.Core's Interfaces
1780
image::img/core_interfaces.gif[scaledwidth="50%",align="center"]
1781
 
1782
Instruction WISHBONE Master Interface
1783
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1784
OR1200 has two master WISHBONE Rev B compliant interfaces. The instruction
interface is used to connect the OR1200 core to the memory subsystem for the
purpose of fetching instructions or instruction cache lines.
1787
 
1788
[[inst_wb_master_table]]
1789
.Instruction WISHBONE Master Interface Signals
1790
[width="95%",options="header"]
1791
|====================================================
1792
| Port          | Width | Direction     | Description
1793
| ((iwb_CLK_I)) | 1     | Input         | Clock input
1794
| ((iwb_RST_I)) | 1     | Input         | Reset input
1795
| ((iwb_CYC_O)) | 1     | Output        | Indicates valid bus cycle (core select)
1796
| ((iwb_ADR_O)) | 32    | Outputs       | Address outputs
1797
| ((iwb_DAT_I)) | 32    | Inputs        | Data inputs
1798
| ((iwb_DAT_O)) | 32    | Outputs       | Data outputs
1799
| ((iwb_SEL_O)) | 4     | Outputs       | Indicates valid bytes on data bus (during valid cycle it must be 0xf)
1800
| ((iwb_ACK_I)) | 1     | Input         | Acknowledgment input (indicates normal transaction termination)
1801
| ((iwb_ERR_I)) | 1     | Input         | Error acknowledgment input (indicates an abnormal transaction termination)
1802
| ((iwb_RTY_I)) | 1     | Input         | In OR1200 treated same way as iwb_ERR_I.
1803
| ((iwb_WE_O))  | 1     | Output        | Write transaction when asserted high
1804
| ((iwb_STB_O)) | 1     | Output        | Indicates valid data transfer cycle
1805
|====================================================
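
The way the three termination inputs are interpreted can be sketched as a small classifier. This is a behavioral illustration with hypothetical names, not a description of the RTL; it assumes that an error indication takes priority if more than one input is asserted.

```python
def termination(ack: bool, err: bool, rty: bool) -> str:
    """Classify how a bus cycle ended from the ACK, ERR and RTY inputs."""
    if err or rty:
        return "error"    # OR1200 treats RTY the same way as ERR
    if ack:
        return "ok"       # normal transaction termination
    return "pending"      # no termination signal yet, cycle continues
```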
1806
 
1807
Data WISHBONE Master Interface
1808
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1809
OR1200 has two master WISHBONE Rev B compliant interfaces. The data interface
is used to connect the OR1200 core to external peripherals and the memory
subsystem for the purpose of reading and writing data or data cache lines.
1812
 
1813
[[data_wb_master_table]]
1814
.Data WISHBONE Master Interface Signals
1815
[width="95%",options="header"]
1816
|====================================================
1817
| Port          | Width | Direction     | Description
1818
| ((dwb_CLK_I)) | 1     | Input         | Clock input
1819
| ((dwb_RST_I)) | 1     | Input         | Reset input
1820
| ((dwb_CYC_O)) | 1     | Output        | Indicates valid bus cycle (core select)
1821
| ((dwb_ADR_O)) | 32    | Outputs       | Address outputs
1822
| ((dwb_DAT_I)) | 32    | Inputs        | Data inputs
1823
| ((dwb_DAT_O)) | 32    | Outputs       | Data outputs
1824
| ((dwb_SEL_O)) | 4     | Outputs       | Indicates valid bytes on data bus (during valid cycle it must be 0xf)
1825
| ((dwb_ACK_I)) | 1     | Input         | Acknowledgment input (indicates normal transaction termination)
1826
| ((dwb_ERR_I)) | 1     | Input         | Error acknowledgment input (indicates an abnormal transaction termination)
1827
| ((dwb_RTY_I)) | 1     | Input         | In OR1200 treated same way as dwb_ERR_I.
1828
| ((dwb_WE_O))  | 1     | Output        | Write transaction when asserted high
1829
| ((dwb_STB_O)) | 1     | Output        | Indicates valid data transfer cycle
1830
|====================================================
1831
 
1832
System Interface
1833
~~~~~~~~~~~~~~~~
1834
System interface connects reset, clock and other system signals to the
1835
OR1200 core.
1836
 
1837
[[sys_interface_table]]
1838
.System Interface Signals
1839
[width="95%",options="header"]
1840
|====================================================
1841
| Port          | Width | Direction     | Description
1842
| ((Rst))       | 1     | Input         | Asynchronous reset
1843
| ((clk_cpu))   | 1     | Input         | Main clock input to the RISC
1844
| ((clk_dc))    | 1     | Input         | Data cache clock
1845
| ((clk_ic))    | 1     | Input         | Instruction cache clock
1846
| ((clk_dmmu))  | 1     | Input         | Data MMU clock
1847
| ((clk_immu))  | 1     | Input         | Instruction MMU clock
1848
| ((clk_tt))    | 1     | Input         | Tick timer clock
1849
|====================================================
1850
 
1851
Development Interface
1852
~~~~~~~~~~~~~~~~~~~~~
1853
The development interface connects an external development port to the RISC's
internal debug facility. The debug facility allows control over program
execution inside the RISC, setting of breakpoints and watchpoints, and tracing
of instruction and data flows.
1857
 
1858
[[dev_interface_table]]
1859
.Development Interface
1860
[width="95%",options="header"]
1861
|====================================================
1862
| Port          | Width | Direction     | Description
1863
| ((dbg_dat_o)) | 32    | Output        | Transfer of data from RISC to external development interface
1864
| ((dbg_dat_i)) | 32    | Input         | Transfer of data from external development interface to RISC
1865
| ((dbg_adr_i)) | 32    | Input         | Address of special-purpose register to be read or written
1866
| ((dbg_op_i))  | 3     | Input         | Operation select for development interface
1867
| ((dbg_lss_o)) | 4     | Output        | Status of load/store unit
1868
| ((dbg_is_o))  | 2     | Output        | Status of instruction fetch unit
1869
| ((dbg_wp_o))  | 11    | Output        | Status of watchpoints
1870
| ((dbg_bp_o))  | 1     | Output        | Status of the breakpoint
1871
| ((dbg_stall_i))       | 1     | Input | Stalls RISC CPU core
1872
| ((dbg_ewt_i)) | 1     | Input         | External watchpoint trigger
1873
|====================================================
1874
 
1875
Power Management Interface
1876
~~~~~~~~~~~~~~~~~~~~~~~~~~
1877
Power management interface provides signals for interfacing RISC core with
1878
external power management circuitry. External power management circuitry is
1879
required to implement functions that are technology specific and cannot be
1880
implemented inside OR1200 core.
1881
 
1882
[[pow_mgmt_interface_table]]
1883
.Power Management Interface
1884
[width="95%",options="header"]
1885
|============================================================================
1886
| Port                  | Width | Direction     | Generation            | Description
1887
| ((pm_clksd))          | 4     | Output        | Static (in SW)        | Slow down outputs that control reduction of RISC clock frequency
1888
| ((pm_cpustall))       | 1     | Input         | -                     | Synchronous stall of the RISC’s CPU core
1889
| ((pm_dc_gate))        | 1     | Output        | Dynamic (in HW)       | Gating of data cache clock
1890
| ((pm_ic_gate))        | 1     | Output        | Dynamic (in HW)       | Gating of instruction cache clock
1891
| ((pm_dmmu_gate))      | 1     | Output        | Dynamic (in HW)       | Gating of data MMU clock
1892
| ((pm_immu_gate))      | 1     | Output        | Dynamic (in HW)       | Gating of instruction MMU clock
1893
| ((pm_tt_gate))        | 1     | Output        | Dynamic (in HW)       | Gating of tick timer clock
1894
| ((pm_cpu_gate))       | 1     | Output        | Static (in SW)        | Gating of main CPU clock
1895
| ((pm_wakeup))         | 1     | Output        | Dynamic (in HW)       | Activate all clocks
1896
| ((pm_lvolt))          | 1     | Output        | Static (in SW)        | Lower voltage
1897
|============================================================================
1898
 
1899
Interrupt Interface
1900
~~~~~~~~~~~~~~~~~~~
1901
The interrupt interface has interrupt inputs for interfacing external
peripherals' interrupt outputs to the RISC core. All interrupt inputs are
evaluated on the positive edge of the main RISC clock.
1904
 
1905
[[interrupt_interface_table]]
1906
.Interrupt Interface
1907
[width="95%",options="header"]
1908
|============================================================
1909
| Port          | Width         | Direction     | Description
1910
| ((pic_ints))  | PIC_INTS      | Input         | External interrupts
1911
|============================================================
1912
 
1913
 
1914
 
1915
[appendix]
1916
Core HW Configuration
1917
=====================
1918
(((Hardware,Configuration)))
1919
This section describes parameters that are set by the user of the core and
1920
define configuration of the core. Parameters must be set by the user before
1921
actual use of the core in simulation or synthesis.
1922
 
1923
[[core_hw_conf_table]]
1924
.Core HW configuration table
1925
[width="95%",options="header"]
1926
|============================================================
1927
| Variable Name | Range         | Default       | Description
1928
| ((EADDR_WIDTH))       | 32    | 32    | Effective address width
1929
| ((VADDR_WIDTH))       | 32    | 32    | Virtual address width
1930
| ((PADDR_WIDTH))       | 24 - 36| 32   | Physical address width
1931
| ((DATA_WIDTH))        | 32    | 32    | Data width / Operation width
1932
| ((DC_IMPL))   | 0 - 1         | 1     | Data cache implementation
1933
| ((DC_SETS))   | 256-1024      | 512   | Data cache number of sets
1934
| ((DC_WAYS))   | 1             | 1     | Data cache number of ways
1935
| ((DC_LINE))   | 16 - 32       | 16    | Data cache line size
1936
| ((IC_IMPL))   | 0 - 1         | 1     | Instruction cache implementation
1937
| ((IC_SETS))   | 32-1024       | 512   | Instruction cache number of sets
1938
| ((IC_WAYS))   | 1             | 1     | Instruction cache number of ways
1939
| ((IC_LINE))   | 16-32         | 16    | Instruction cache line size in bytes
1940
| ((DMMU_IMPL)) | 0 - 1         | 1     | Data MMU implementation
1941
| ((DTLB_SETS)) | 64            | 64    | Data TLB number of sets
1942
| ((DTLB_WAYS)) | 1             | 1     | Data TLB number of ways
1943
| ((IMMU_IMPL)) | 0 - 1         | 1     | Instruction MMU implementation
1944
| ((ITLB_SETS)) | 64            | 64    | Instruction TLB number of sets
1945
| ((ITLB_WAYS)) | 1             | 1     | Instruction TLB number of ways
1946
| ((PIC_INTS))  | 2 - 32        | 20    | Number of interrupt inputs
1947
|============================================================
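
A build script could sanity-check a configuration against the ranges in the table before simulation or synthesis. The sketch below covers only a few parameters; the dictionary and function names are hypothetical, and the cache size formula (sets times ways times line size) is an assumption based on the parameter descriptions.

```python
# (minimum, maximum) for a subset of the parameters in the table above.
RANGES = {
    "DC_SETS": (256, 1024), "DC_WAYS": (1, 1), "DC_LINE": (16, 32),
    "IC_SETS": (32, 1024), "IC_WAYS": (1, 1), "IC_LINE": (16, 32),
    "PIC_INTS": (2, 32),
}

# Default values from the table.
DEFAULTS = {
    "DC_SETS": 512, "DC_WAYS": 1, "DC_LINE": 16,
    "IC_SETS": 512, "IC_WAYS": 1, "IC_LINE": 16,
    "PIC_INTS": 20,
}


def out_of_range(cfg: dict) -> list[str]:
    """Return the names of parameters whose values fall outside the table range."""
    return sorted(
        name for name, value in cfg.items()
        if name in RANGES and not RANGES[name][0] <= value <= RANGES[name][1]
    )


def cache_bytes(cfg: dict, prefix: str) -> int:
    """Total cache size in bytes for prefix 'DC' or 'IC'."""
    return cfg[f"{prefix}_SETS"] * cfg[f"{prefix}_WAYS"] * cfg[f"{prefix}_LINE"]
```

With the default values, both caches come out at 8192 bytes.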
1948
 
1949
:numbered!:
1950
 
1951
[bibliography]
1952
((Bibliography))
1953
================
1954
[bibliography]
1955
- [[[or1000_manual]]] Damjan Lampret et al. 'OpenRISC 1000 System Architecture
1956
  Manual'. 2004.
1957
 
1958
[index]
1959
Index
1960
=====
1961
// The index is generated automatically by the DocBook toolchain.
