OpenRISC 1200 IP Core Specification (Preliminary Draft)
=======================================================
:doctype: book
 
////
Revision history
Note: When adding new entries, strictly follow the format of the existing ones.

Rev.    | Date          | Author        | Description
__vstart__
v0.1    | 28/3/01       | Damjan Lampret        | First Draft

v0.2    | 16/4/01       | Damjan Lampret        | First time published

v0.3    | 29/4/01       | Damjan Lampret        | All chapters almost
finished. Some bugs hidden waiting for an update. Awaiting feedback.

v0.4    | 16/5/01       | Damjan Lampret        | Synchronization with
OR1K Arch Manual

v0.5    | 24/5/01       | Damjan Lampret        | Fixed bugs

v0.6    | 28/5/01       | Damjan Lampret        | Changed some SPR addresses.

v0.7    | 06/9/01       | Damjan Lampret        | Simplified debug unit.

v0.8    | 30/08/10      | Julius Baxter         | Adding information about FPU
implementation, data cache write-back capability. PIC behavior update.
Instruction list update. Update of bits in config registers, bringing into
line with latest OR1200 - not entirely complete.

v0.9    | 12/9/10       | Julius Baxter         | Clarified supported parts of
OR1K instruction set. Updated core clock input information.
Fixed up reference to instruction execute stage cycle table.
Added divide cycles to execute stage cycle table.

v0.10   | 1/11/10       | Julius Baxter         | Added FF1/FL1 instructions to
supported instructions table.

v0.11   | 19/1/11       | Julius Baxter | Cache information update.
Wishbone behavior clarification. Serial integer multiply/divide update.
Reset address clarification
__vend__
////
 
Introduction
------------
The purpose of this document is to define the specifications of the OpenRISC 1200
implementation. This specification defines all implementation-specific
variables that are not part of the general architecture specification. This
includes the type and size of the data and instruction caches, the type and size
of the data and instruction MMUs, details of all execution pipelines, and the
implementation of the exception unit, interrupt controller and other
supplemental units.
This document does not cover general architecture topics such as the instruction
set, memory addressing modes and other architectural definitions. See
the OpenRISC 1000 Architecture Manual for more information about the architecture.
 
OpenRISC Family
~~~~~~~~~~~~~~~
(((OpenRISC,Family)))
OpenRISC 1000 is an architecture for a family of free, open source RISC processor
cores. As an architecture, OpenRISC 1000 allows for a spectrum of chip and
system implementations at a variety of price/performance points for a range of
applications. It is a 32/64-bit load and store RISC architecture designed with
an emphasis on performance, simplicity, low power requirements, scalability and
versatility. The OpenRISC 1000 architecture targets medium and high performance
networking, embedded, automotive and portable computer environments.

image::img/or_family.gif[scaledwidth="50%",align="center"]

All OpenRISC implementations whose identification number begins with the
digit 1 belong to the OpenRISC 1000 family. The second digit defines which
features of the OpenRISC 1000 architecture are implemented and in which way they
are implemented. The last two digits define how an implementation is configured
before it is used in a real application.

However, at present the OR1200 is the only major RTL implementation of the
OR1K architecture specification, and the OR1200 name has stuck, even though the
high level of reconfigurability means that, strictly speaking, a given
configuration could be an OR1000, an OR1300, and so on. So, regardless of which
features are or are not implemented, the core is always referred to as the OR1200.
 
OpenRISC 1200
~~~~~~~~~~~~~
(((OpenRISC,1200)))
The OR1200 is a 32-bit scalar RISC with Harvard microarchitecture, a 5-stage
integer pipeline, virtual memory support (MMU) and basic DSP capabilities.
The default caches are a 1-way direct-mapped 8KB data cache and a 1-way
direct-mapped 8KB instruction cache, each with a 16-byte line size. Both caches
are physically tagged. By default the MMUs are implemented, and they consist of
a 64-entry hash based 1-way direct-mapped data TLB and a 64-entry hash based
1-way direct-mapped instruction TLB.

Supplemental facilities include a debug unit for real-time debugging, a high
resolution tick timer, a programmable interrupt controller and power management
support. When implemented in a typical 0.18u 6LM process it should provide
over 300 Dhrystone 2.1 MIPS at 300MHz and 300 DSP MAC 32x32 operations, at
least 20% more than any other competitor in this class. The OR1200 in its default
configuration has about 1M transistors.

The OR1200 is intended for embedded, portable and networking applications. It can
successfully compete with the latest scalar 32-bit RISC processors in its class
and can efficiently run any modern operating system. Competitors include
ARM10, ARC and Tensilica RISC processors.
 
Features
^^^^^^^^
The following lists the main features of the OR1200 IP core:

- All major characteristics of the core can be set by the user
- High performance of 300 Dhrystone 2.1 MIPS at 300 MHz using 0.18u process
- High performance cache and MMU subsystems
- WISHBONE SoC Interconnection Rev. B3 compliant interface
 
Architecture
------------
<<core_arch_fig>> below shows the general architecture of the OR1200 IP core. It
consists of several building blocks:

- CPU/FPU/DSP central block
- Direct-mapped data cache
- Direct-mapped instruction cache
- Data MMU based on hash based DTLB
- Instruction MMU based on hash based ITLB
- Power management unit and power management interface
- Tick timer
- Debug unit and development interface
- Interrupt controller and interrupt interface
- Instruction and Data WISHBONE host interfaces

[[core_arch_fig]]
.Core's Architecture
image::img/core_arch.gif[scaledwidth="50%",align="center"]
 
CPU/FPU/DSP
~~~~~~~~~~~
((CPU))/((FPU))/((DSP)) is the central part of the OR1200 RISC processor.
<<cpu_fpu_dsp_fig>> shows the basic block diagram of the CPU/DSP. The FPU
components are not pictured. The OR1200 CPU/FPU/DSP only implements parts of
the ORBIS32 and ORFPX32 instruction sets. No ((ORBIS64)), ((ORFPX64)) or
((ORVDX64)) instructions are implemented in the OR1200.

[[cpu_fpu_dsp_fig]]
.CPU/FPU/DSP Block Diagram
image::img/cpu_fpu_dsp.gif[scaledwidth="50%",align="center"]
 
Instruction unit
^^^^^^^^^^^^^^^^
The instruction unit implements the basic instruction pipeline, fetches
instructions from the memory subsystem, dispatches them to available execution
units, and maintains a state history to ensure a precise exception model
and that operations finish in order. It also executes conditional branch
and unconditional jump instructions.

The sequencer can dispatch a sequential instruction on each clock if the
appropriate execution unit is available. The execution unit must determine
whether the source data is available and ensure that no other instruction is
targeting the same destination register.

The instruction unit handles only the ((ORBIS32)) and, optionally, a subset of
the ((ORFPX32)) instruction class. Some ((ORFPX32)) instructions and the entire
((ORFPX64)) and ((ORVDX64)) instruction classes are not supported by the OR1200
at present.
 
General-Purpose Registers
^^^^^^^^^^^^^^^^^^^^^^^^^
OpenRISC 1200 implements 32 general-purpose 32-bit ((registers)). The OpenRISC
1000 architecture also supports shadow copies of the register file to implement
fast switching between working contexts; however, this feature is not
implemented in the current OR1200 implementation.

OR1200 implements the general-purpose register file as two synchronous dual-port
memories, each with a capacity of 32 words by 32 bits per word.
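
To illustrate why two memories can be useful here, the following Python sketch models one possible arrangement: both memories hold identical contents, every write updates both, and each memory serves one of the two read operands. The class and method names are hypothetical, and the mirrored-copy arrangement is an assumption made for illustration, not a description of the OR1200 RTL.

```python
class RegisterFile:
    """Toy model of a 32 x 32-bit register file built from two memories.

    Assumption: both memories are kept identical so that each one can
    serve one read operand per cycle.
    """

    NUM_REGS = 32
    WORD_MASK = 0xFFFFFFFF

    def __init__(self):
        self.mem_a = [0] * self.NUM_REGS  # serves read operand A
        self.mem_b = [0] * self.NUM_REGS  # serves read operand B

    def write(self, rd, value):
        # A single write per cycle updates both copies.
        value &= self.WORD_MASK
        self.mem_a[rd] = value
        self.mem_b[rd] = value

    def read(self, ra, rb):
        # Two reads per cycle, one from each memory.
        return self.mem_a[ra], self.mem_b[rb]
```

With this arrangement a single write port per memory is enough to supply two independent read operands on every cycle.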
 
Load/Store Unit
^^^^^^^^^^^^^^^
The ((load/store unit (LSU))) transfers all data between the GPRs and the CPU's
internal bus. It is implemented as an independent execution unit so that stalls
in the memory subsystem only affect the master pipeline if there is a data
dependency.

The following are the LSU's main features:

- all load/store instructions implemented in hardware (atomic instructions
  included)
- address entry buffer
- pipelined operation
- aligned accesses for fast memory access

When load and store instructions are issued, the LSU determines if all
operands are available. These operands include the following:

- address register operand
- source data register operand (for store instructions)
- destination data register operand (for load instructions)
 
Integer Execution Pipeline
^^^^^^^^^^^^^^^^^^^^^^^^^^
(((Pipeline, Integer Execution)))
The core implements the following types of 32-bit integer instructions:

- Arithmetic instructions
- Compare instructions
- Logical instructions
- Rotate and shift instructions

Most integer instructions can execute in one cycle. For details about timing
see <<exec_time_int_table>>.
 
MAC Unit
^^^^^^^^
The ((MAC)) unit executes DSP MAC operations. MAC operations are 32x32 with
a 48-bit accumulator. The MAC unit is fully pipelined and can accept a new MAC
operation in each clock cycle.
 
Floating Point Unit
^^^^^^^^^^^^^^^^^^^
(((Floating Point Unit)))
The ((FPU)) implementation is based on two other FPUs available from
OpenCores.org. For the comparison and conversion functions, parts were taken
from the FPU project by Rudolf Usselmann, and for the arithmetic operations,
the fpu100 project by Jidan Al-Eryani was converted to Verilog HDL.

All ((ORFPX32)) instructions except for ((lf.madd.s)) and ((lf.rem.s)) are
supported when the FPU is enabled in the OR1200 configuration.
 
System Unit
^^^^^^^^^^^
The ((system unit)) connects all other signals of the CPU/FPU/DSP that are not
connected through instruction and data interfaces. It also implements all
system special-purpose registers (e.g. supervisor register).
 
Exceptions
^^^^^^^^^^
Core exceptions can be generated when an exception condition occurs.
((Exception sources)) in OR1200 include the following:

- External interrupt request
- Certain memory access conditions
- Internal errors, such as an attempt to execute an unimplemented opcode
- System call
- Internal exceptions, such as breakpoint exceptions

((Exception handling)) is transparent to user software and uses the same
mechanism to handle all types of exceptions. When an exception is taken,
control is transferred to an exception handler at an offset defined by
the type of exception encountered. Exceptions are handled in supervisor mode.
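
As a rough illustration of how a handler address is derived from an exception type, the following Python sketch adds a vector offset to an exception prefix. The offsets are the ones listed in the exceptions table later in this document (only a few are shown). The prefix values selected by the EPH bit are an assumption taken from the general OpenRISC 1000 architecture definition, and the function and dictionary names are hypothetical.

```python
# Vector offsets as listed in this document's table of implemented exceptions.
VECTOR_OFFSETS = {
    "reset": 0x100,
    "bus_error": 0x200,
    "data_page_fault": 0x300,
    "instruction_page_fault": 0x400,
    "low_priority_external_interrupt": 0x500,
}

# Assumption: exception prefix selected by the EPH bit, per the
# OpenRISC 1000 architecture (0 -> 0x00000000, 1 -> 0xF0000000).
EPH_PREFIX = {0: 0x00000000, 1: 0xF0000000}


def vector_address(exception, eph=0):
    """Return the handler address for the given exception type."""
    return EPH_PREFIX[eph] + VECTOR_OFFSETS[exception]
```

For example, `vector_address("bus_error")` evaluates to `0x200`, and the same exception with `eph=1` maps to `0xF0000200`.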
 
Data Cache
~~~~~~~~~~
The default configuration of the OR1200 data ((cache)) is an 8-Kbyte, 1-way
direct-mapped data cache, which allows rapid core access to data. However, the
data cache can be configured according to <<data_confs_or1200_table>>.
 
[[data_confs_or1200_table]]
.Possible Data Cache Configurations of OR1200
[width="60%",options="header"]
|======================================================
| Configuration                         | Direct mapped
| 16B/line, 256 lines, 1 way            | 4KB
| 16B/line, 512 lines, 1 way            | *8KB (default)*
| 16B/line, 1024 lines, 1 way           | 16KB
| 32B/line, 1024 lines, 1 way           | 32KB
|======================================================
 
It is possible to operate the data cache with write-through or write-back
strategies; however, write-back is currently experimental.

Features:

- data cache is separate from instruction cache (Harvard architecture)
- data cache implements a least-recently used (LRU) replacement algorithm
  within each set
- the cache directory is physically addressed. The physical address tag is
  stored in the cache directory
- write-through or write-back operation
- entire cache can be disabled, lines invalidated, flushed or forced to be
  written back, by writing to cache special purpose registers
 
On a miss, and under appropriate conditions, the cache line is filled or emptied
(written back) with 16-byte bursts. The burst fill is performed as a
critical-word-first operation; the critical word is simultaneously written
to the cache and forwarded to the requesting unit, thus minimizing stalls
due to cache fill latency. The data cache provides storage for cache tags and
performs the cache line replacement function.

The data cache is tightly coupled to the external interface to allow efficient
access to the system memory controller.

The data cache supplies data to the GPRs by means of a 32-bit interface
to the load/store unit. The LSU provides all logic required to calculate
effective addresses, handles data alignment to and from the data cache,
and provides sequencing for load and store operations. Write operations to
the data cache can be performed on a byte, half-word or word basis.

image::img/data_cache_diag.gif[scaledwidth="50%",align="center"]

Each line contains four contiguous words from memory that are loaded from
a cache line aligned boundary. As a result, cache lines are aligned with
page boundaries.
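
The cache configurations above follow from simple arithmetic: the cache size is the line size multiplied by the number of lines, and a physical address splits into a tag, a line index and a byte offset. The Python sketch below shows this textbook decomposition for a direct-mapped cache. The function names are hypothetical, and the exact bit assignment used inside the OR1200 RTL is not stated in this document, so treat the split as an assumption.

```python
def cache_geometry(line_bytes, lines):
    """Return (size in bytes, offset bits, index bits) for a direct-mapped cache."""
    size = line_bytes * lines
    offset_bits = line_bytes.bit_length() - 1  # log2 of a power of two
    index_bits = lines.bit_length() - 1
    return size, offset_bits, index_bits


def split_address(addr, line_bytes=16, lines=512):
    """Split a physical address into (tag, line index, byte offset).

    Defaults correspond to the default 8KB configuration:
    16 bytes per line, 512 lines, 1 way.
    """
    _, offset_bits, index_bits = cache_geometry(line_bytes, lines)
    offset = addr & (line_bytes - 1)
    index = (addr >> offset_bits) & (lines - 1)
    tag = addr >> (offset_bits + index_bits)
    return tag, index, offset
```

For the default configuration, `cache_geometry(16, 512)` gives 8192 bytes with 4 offset bits and 9 index bits, which matches the 8KB row of the configuration table.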
 
Instruction Cache
~~~~~~~~~~~~~~~~~
The default configuration of the OR1200 instruction ((cache)) is an 8-Kbyte,
1-way direct-mapped instruction cache, which allows rapid core access to
instructions. However, the instruction cache can be configured according to
<<inst_confs_or1200_table>>.
 
[[inst_confs_or1200_table]]
.Possible Instruction Cache Configurations of OR1200
[width="60%",options="header"]
|==============================================
| Configuration                 | Direct mapped
| 16B/line, 32 lines, 1 way     | 512B
| 16B/line, 256 lines, 1 way    | 4KB
| 16B/line, 512 lines, 1 way    | *8KB (default)*
| 16B/line, 1024 lines, 1 way   | 16KB
| 32B/line, 1024 lines, 1 way   | 32KB
|==============================================
 
Features:

- instruction cache is separate from data cache (Harvard architecture)
  (((Architecture,Harvard)))
- instruction cache implements a least-recently used (LRU) replacement
  algorithm within each set
  ((LRU))
- the ((cache directory)) is physically addressed. The physical address tag is
  stored in the cache directory
- it can be disabled or invalidated by writing to cache special purpose
  registers
 
On a miss, the cache is filled with 16-byte bursts. The burst fill
is performed as a critical-word-first operation; the critical word is
simultaneously written to the cache and forwarded to the requesting unit,
thus minimizing stalls due to cache fill latency. The instruction cache provides
storage for cache tags and performs the cache line replacement function.

The instruction cache is tightly coupled to the external interface to allow
efficient access to the system memory controller.

The instruction cache supplies instructions to the instruction sequencer by
means of a 32-bit interface to the instruction fetch subunit. The instruction
fetch subunit provides all logic required to calculate effective addresses.

image::img/inst_cache_diag.gif[scaledwidth="50%",align="center"]

Each line contains four contiguous words from memory that are loaded from
a cache line aligned boundary. As a result, cache lines are aligned with
page boundaries.
 
Data MMU
~~~~~~~~
(((MMU, Data)))
The OR1200 implements a ((virtual memory management)) scheme that
provides memory access protection and effective-to-physical address
translation. ((Protection)) granularity is as defined by the OpenRISC 1000
architecture: 8-Kbyte and 16-Mbyte pages.
 
[[data_tlb_confs_or1200_table]]
.Possible Data TLB Configurations of OR1200
[width="60%",options="header"]
|======================================
| Configuration         | Direct mapped
| 16 entries per way    | 16 DTLB entries
| 32 entries per way    | 32 DTLB entries
| 64 entries per way    | *64 DTLB entries (default)*
| 128 entries per way   | 128 DTLB entries
|======================================
 
Features:

* data MMU is separate from instruction MMU
* page size 8-Kbyte
* comprehensive page protection scheme
* direct-mapped hash based translation lookaside buffer (DTLB) with the
  default of 1 way and the following features:
** miss and fault exceptions
** software tablewalk
** high performance because of hash based design
** variable number of DTLB entries with default of 64 entries per way
 
image::img/tlb_diag.gif[scaledwidth="50%",align="center"]

The MMU hardware supports two-level software tablewalk.
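
To make the direct-mapped, hash based lookup more concrete, the Python sketch below splits an effective address into a TLB index, a tag and a page offset for 8-Kbyte pages. It assumes that the "hash" is simply the low-order bits of the virtual page number. That is a common choice for direct-mapped TLBs, but it is an assumption here rather than something this document states, and the function name is hypothetical.

```python
PAGE_SHIFT = 13  # 8-Kbyte pages: 2**13 bytes


def dtlb_lookup_fields(ea, entries=64):
    """Split an effective address into (TLB index, tag, page offset).

    Assumption: the TLB index is the virtual page number modulo the
    number of entries (64 by default, as in the default configuration).
    """
    vpn = ea >> PAGE_SHIFT               # virtual page number
    index = vpn % entries                # selects one of the TLB entries
    tag = vpn // entries                 # compared against the stored tag
    offset = ea & ((1 << PAGE_SHIFT) - 1)
    return index, tag, offset
```

Under this assumption, two pages whose page numbers differ by a multiple of the entry count compete for the same TLB entry, and a tag mismatch leads to the miss exception handled by the software tablewalk.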
 
Instruction MMU
~~~~~~~~~~~~~~~
(((MMU, Instruction)))
The OR1200 implements a virtual memory management scheme that provides memory
access protection and effective-to-physical address translation. Protection
granularity is as defined by the OpenRISC 1000 architecture: 8-Kbyte and
16-Mbyte pages.
 
[[inst_tlb_confs_or1200_table]]
.Possible Instruction TLB Configurations of OR1200
[width="60%",options="header"]
|======================================
| Configuration         | Direct mapped
| 16 entries per way    | 16 ITLB entries
| 32 entries per way    | 32 ITLB entries
| 64 entries per way    | *64 ITLB entries (default)*
| 128 entries per way   | 128 ITLB entries
|======================================
 
Features:

* instruction MMU is separate from data MMU
* page size 8-Kbyte
* comprehensive page protection scheme
* 1-way direct-mapped hash based translation lookaside buffer (ITLB) with the
  following features:
** miss and fault exceptions
** software tablewalk
** high performance because of hash based design
** variable number of ITLB entries with default of 64 entries per way
 
image::img/inst_mmu_diag.gif[scaledwidth="50%",align="center"]

The MMU hardware supports two-level software tablewalk.
 
Programmable Interrupt Controller
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The ((interrupt)) controller receives interrupts from external sources and
forwards them as low or high priority interrupt exceptions to the CPU core.

[[interrupt_controller_fig]]
.Block Diagram of the Interrupt Controller
image::img/interrupt_controller.gif[scaledwidth="50%",align="center"]

The programmable interrupt controller has three special-purpose registers and 32
interrupt inputs. Interrupt inputs 0 and 1 are always enabled and connected
to the high and low priority interrupt inputs, respectively.

The other 30 interrupt inputs can be masked and assigned low or high priority
by programming the special-purpose registers.
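
The masking rule described above can be sketched in a few lines of Python: inputs 0 and 1 always pass, and the remaining 30 inputs pass only when their mask bit is set. The function and constant names are hypothetical, and the sketch deliberately ignores priority assignment and the actual register layout, neither of which is detailed in this section.

```python
ALWAYS_ENABLED = 0b11   # interrupt inputs 0 and 1 cannot be masked
ALL_INPUTS = 0xFFFFFFFF  # 32 interrupt inputs


def pending_interrupts(inputs, mask):
    """Return the bit vector of interrupt inputs that reach the CPU.

    `inputs` holds the raw state of the 32 interrupt lines and `mask`
    holds one enable bit per line (1 = enabled).
    """
    effective_mask = (mask | ALWAYS_ENABLED) & ALL_INPUTS
    return inputs & effective_mask
```

For instance, with lines 0, 2 and 3 asserted and only line 2 enabled in the mask, lines 0 and 2 are reported as pending.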
 
Tick Timer
~~~~~~~~~~
OR1200 implements the tick ((timer)) facility. This is a timer that is
clocked by the RISC clock and is used by the operating system to precisely
measure time and schedule system tasks.

OR1200 precisely follows the architectural definition of the tick timer facility:

* Maximum timer count of 2^32 clock cycles
* Maximum time period of 2^28 clock cycles between interrupts
* Maskable tick timer interrupt
* Single run, restartable or continuous timer
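
The limits listed above translate directly into wall-clock time once a clock frequency is chosen. The following Python sketch, with hypothetical function names, converts between an interrupt interval and a number of timer cycles and rejects intervals beyond the 2^28 cycle limit. It is plain arithmetic on the figures quoted in this section and does not model the timer registers.

```python
MAX_PERIOD_CYCLES = 2**28  # maximum period between interrupts, per the list above


def max_interrupt_interval_s(clock_hz):
    """Longest time between tick interrupts, in seconds, at a given clock."""
    return MAX_PERIOD_CYCLES / clock_hz


def period_cycles_for(interval_s, clock_hz):
    """Number of timer clock cycles needed for the requested interval."""
    cycles = round(interval_s * clock_hz)
    if not 0 < cycles <= MAX_PERIOD_CYCLES:
        raise ValueError("interval not representable with a 2^28 cycle period")
    return cycles
```

At a 100 MHz timer clock, a 10 ms tick corresponds to 1,000,000 cycles, while a 10 s interval exceeds the limit and would have to be built from several shorter ticks in software.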
 
The tick timer operates from an independent clock source so that the doze power
management mode can be implemented.
 
Power Management Support
~~~~~~~~~~~~~~~~~~~~~~~~
To optimize ((power consumption)), the OR1200 provides ((low-power)) modes that
can be used to dynamically activate and deactivate certain internal modules.

OR1200 has two major features to minimize power consumption:

* Slow and Idle Modes (SW controlled clock freq reduction)
* Doze and Sleep Modes (interrupt wake-up)
 
[[power_consumption_table]]
.Power Consumption
[width="60%",options="header"]
|===================================================================
| Power Minimization Feature    | Approx Power Consumption Reduction
| Slow and Idle mode            | 2x - 10x
| Doze mode                     | 100x
| Sleep mode                    | 200x
| Dynamic clock gating          | N/A
|===================================================================
 
Slow down mode takes advantage of the low-power dividers in the external clock
generation circuitry to enable full functionality, but at a lower frequency
so that power consumption is reduced. The 4 bits of PMR[SDF] are broadcast on
pm_clksd, and the external clock generation circuitry for the RISC should adapt
the RISC clock frequency according to the value on pm_clksd.

When software initiates the doze mode, software processing on the core
suspends. The clocks to the RISC internal modules are disabled, except for
the tick timer. However, any other on-chip blocks can continue to function
as normal. The OR1200 will leave doze mode and enter normal mode when a
pending interrupt occurs.

In sleep mode, all OR1200 internal units are disabled and clocks
gated. Optionally, an implementation may choose to lower the operating voltage
of the OR1200 core. The OR1200 should leave sleep mode and enter normal
mode when a pending interrupt occurs.

Dynamic ((Clock gating)) (unit clock gating on a clock-by-clock basis) is not
supported by OR1200.
 
Debug unit
~~~~~~~~~~
((Debug unit)) assists software developers in debugging their systems. It
provides support only for basic debugging and does not support the more advanced
debug features of the OpenRISC 1000 architecture, such as watchpoints,
breakpoints and program-flow control registers.

[[debug_unit_fig]]
.Block Diagram of Debug Unit
image::img/debug_unit_diag.gif[scaledwidth="50%",align="center"]

Watchpoints and breakpoints are events triggered by program or data flow
matching the conditions programmed in the debug registers. Breakpoints,
unlike watchpoints, also suspend execution of the current program flow and
start a breakpoint exception.
 
Clocks & Reset
~~~~~~~~~~~~~~
The OR1200 core has one ((clock)) input each for the instruction and data
Wishbone interface logic, and one for the CPU core. Clock input clk_cpu clocks
everything inside the Wishbone interfaces, that is, the CPU core logic. The data
Wishbone interface is clocked by dwb_clk_i and the instruction Wishbone
interface is clocked by iwb_clk_i.

OR1200 has an asynchronous ((reset)) signal. The reset signal rst, when asserted
high, immediately resets all flip-flops inside the OR1200. When it is deasserted,
the OR1200 starts the reset exception.
 
WISHBONE Interfaces
~~~~~~~~~~~~~~~~~~~
Two ((WISHBONE)) interfaces connect the OR1200 core to external peripherals and
the external memory subsystem. They are compliant with the WISHBONE SoC
Interconnection specification Rev. B3. The implementation uses a 32-bit bus
width and does not support other bus widths.

When bursts are not disabled, cache line fills are performed as Wishbone
registered-feedback incrementing burst accesses. The burst size (beats) is
determined by the cache line size.
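
Since the bus is 32 bits wide, the number of beats in a burst follows directly from the cache line size. The short Python sketch below makes the relationship explicit; the function and constant names are hypothetical.

```python
BUS_BYTES = 4  # 32-bit WISHBONE data bus


def burst_beats(line_bytes):
    """Number of bus beats needed to transfer one cache line."""
    if line_bytes <= 0 or line_bytes % BUS_BYTES:
        raise ValueError("line size must be a positive multiple of the bus width")
    return line_bytes // BUS_BYTES
```

A 16-byte line therefore takes 4 beats and a 32-byte line takes 8 beats.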
 
image::img/wb_compatible.png[scaledwidth="30%",align="center"]
 
Operation
---------
This section describes the operation of the OR1200 core. For operations
that pertain to the architectural definitions, see the OpenRISC 1000
Architecture Manual.
 
Reset
~~~~~
OR1200 has one asynchronous ((reset)) signal that can be driven by soft and hard
reset sources at higher levels of the system hierarchy.

[[powerup_sequence_fig]]
.Power-Up and Reset Sequence
image::img/powerup_seq.gif[scaledwidth="70%",align="center"]

<<powerup_sequence_fig>> shows how asynchronous reset is applied after
powering up the OR1200 core. Reset is connected to the asynchronous reset of
almost all flip-flops inside the RISC core. Special care must be taken to meet
the hold and setup times of all flip-flops relative to the main RISC clock.

If the system implements gated clocks, then clock gating can be used to ensure
proper reset timing.

[[powerup_sequence_gatedclk_fig]]
.Power-Up and Reset Sequence w/ Gated Clock
image::img/powerup_seq_gatedclk.gif[scaledwidth="70%",align="center"]

The address the PC assumes at hard reset (assertion of the external reset signal)
is definable at synthesis time via the OR1200_BOOT_ADR define. This is not
to be confused with the ability to set the exception prefix address with
the EPH bit.
 
CPU/FPU/DSP
~~~~~~~~~~~
((CPU))/((FPU))/((DSP)) is an implementation of the 32-bit part of the OpenRISC
1000 architecture, and only a subset of all features is implemented.
 
Instructions
^^^^^^^^^^^^
(((OpenRISC 1200, Instruction List)))
The following table lists the instructions implemented in the OR1200. Those
optionally implemented are indicated as such.

// The table below is split into several columns for readability by the
// preprocessing script. It is better to have this automated because
// given the pseudo-lexicographical ordering, adding a new instruction
// would require manual changes in all subsequent columns, which is
// tedious and error-prone.
//
// When changing the column headers, remember to change the script accordingly.
 
[[instructions_table]]
.Instructions implemented in OR1200
[width="95%",options="header"]
|=================================
| Instruction mnemonic  | Optional
| ((l.add))             |
| ((l.addc))            | Yes
| ((l.addi))            |
| ((l.and))             |
| ((l.andi))            |
| ((l.bf))              |
| ((l.bnf))             |
| ((l.div))             | Yes
| ((l.ff1))             | Yes
| ((l.fl1))             | Yes
| ((l.j))               |
| ((l.jal))             |
| ((l.jalr))            |
| ((l.jr))              |
| ((l.lbs))             |
| ((l.lbz))             |
| ((l.lhs))             |
| ((l.lhz))             |
| ((l.lws))             |
| ((l.lwz))             |
| ((l.mac))             | Yes
| ((l.maci))            | Yes
| ((l.macrc))           | Yes
| ((l.mfspr))           |
| ((l.movhi))           |
| ((l.msb))             | Yes
| ((l.mtspr))           |
| ((l.mul))             | Yes
| ((l.muli))            | Yes
| ((l.nop))             |
| ((l.or))              |
| ((l.ori))             |
| ((l.rfe))             |
| ((l.rori))            |
| ((l.sb))              |
| ((l.sfeq))            |
| ((l.sfges))           |
| ((l.sfgeu))           |
| ((l.sfgts))           |
| ((l.sfgtu))           |
| ((l.sfleu))           |
| ((l.sflts))           |
| ((l.sfltu))           |
| ((l.sfne))            |
| ((l.sh))              |
| ((l.sll))             |
| ((l.slli))            |
| ((l.sra))             |
| ((l.srai))            |
| ((l.srl))             |
| ((l.srli))            |
| ((l.sub))             | Yes
| ((l.sw))              |
| ((l.sys))             |
| ((l.trap))            |
| ((l.xor))             |
| ((l.xori))            |
| ((lf.add.s))          | Yes
| ((lf.div.s))          | Yes
| ((lf.ftoi.s))         | Yes
| ((lf.itof.s))         | Yes
| ((lf.mul.s))          | Yes
| ((lf.sfeq.s))         | Yes
| ((lf.sfge.s))         | Yes
| ((lf.sfgt.s))         | Yes
| ((lf.sfle.s))         | Yes
| ((lf.sflt.s))         | Yes
| ((lf.sfne.s))         | Yes
| ((lf.sub.s))          | Yes
|=================================
 
For a complete description of each instruction's format refer to
the OpenRISC 1000 Architecture Manual.
 
Instruction Unit
^^^^^^^^^^^^^^^^
((Instruction unit)) generates the instruction fetch effective address (EA) and
fetches instructions from the instruction cache. One instruction can be fetched
each clock cycle. The instruction fetch EA is further translated into a physical
address by the IMMU.
 
General-Purpose Registers
^^^^^^^^^^^^^^^^^^^^^^^^^
((General-purpose register)) file can supply two read operands each clock cycle
and store one result in a destination register.

GPRs can also be read and written through the development interface.
 
Load/Store Unit
^^^^^^^^^^^^^^^
((LSU)) can execute one load instruction every two clock cycles, assuming the
load instruction hits in the data cache. Execution of a store instruction
takes one clock cycle, assuming it hits in the data cache.

The LSU calculates the load/store effective address. The EA is further
translated into a physical address by the DMMU.

The load/store effective address and the load and store data can also be
accessed through the development interface.
 
Integer Execution Pipeline
^^^^^^^^^^^^^^^^^^^^^^^^^^
(((Pipeline, Integer Execution)))
The core implements the following types of 32-bit integer instructions:

* Arithmetic instructions
* Compare instructions
* Logical instructions
* Rotate and shift instructions
 
[[exec_time_int_table]]
.Execution Time of Integer Instructions
[width="70%",options="header"]
|================================================
| Instruction Group     | Clock Cycles to Execute
| Arithmetic except Multiply/Divide     | 1
| Multiply                              | 3
| Divide                                | 32
| Compare                               | 1
| Logical                               | 1
| Rotate and Shift                      | 1
| Others                                | 1
|================================================
 
<<exec_time_int_table>> lists execution times for instructions executed by the
integer execution pipeline. Most instructions are executed in one clock cycle.

Integer multiply can be implemented either serially or in parallel. Serial
operations require one clock cycle per bit of operand, which is 32 cycles
on the OR1200. At present no synthesis tools support division operators,
so the serial option must be used for divide.
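
The cycle counts in the integer execution time table can be used for a rough, best-case estimate of how long an instruction mix takes to execute. The Python sketch below simply multiplies instruction counts by the per-group cycle counts from that table. The names are hypothetical, and the estimate ignores pipeline stalls, cache misses and the serial multiplier option, so it should be read as a lower bound rather than a performance model.

```python
# Cycles per instruction group, copied from the integer execution time table.
CYCLES = {
    "arithmetic": 1,    # arithmetic except multiply/divide
    "multiply": 3,
    "divide": 32,
    "compare": 1,
    "logical": 1,
    "rotate_shift": 1,
    "other": 1,
}


def estimate_cycles(mix):
    """Sum the execute cycles for a mapping of group name to instruction count."""
    return sum(CYCLES[group] * count for group, count in mix.items())
```

For example, ten arithmetic instructions, two multiplies and one divide add up to 10 + 6 + 32 = 48 execute cycles.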
 
720
MAC Unit
721
^^^^^^^^
722
((MAC)) unit executes l.mac instructions. MAC unit implements 32x32 fully
723
pipelined multiplier and 48-bit accumulator. MAC unit can accept one new
724
l.mac instruction each clock cycle.
725
 
726
Care should be taken when executing l.macrc (MAC read and clear) too soon
727
after the final l.mac instruction as the operation may still be underway
728
and the result will not be valid in time. It is recommended at least 3 other
729
instructions (or just l.nops) are inserted between the final l.mac and l.macrc.
 
Floating Point Unit
^^^^^^^^^^^^^^^^^^^
The ((floating point unit)) has a mechanism to stall the processor pipeline
until processing has completed.

The following table indicates the number of cycles per operation.

[[exec_time_fp_table]]
.Execution time of floating point instructions
[width="60%",options="header"]
|=======================
| Operation     | Cycles
| Add/subtract  | 10
| Multiply      | 38
| Divide        | 37
| Compare       | 2
| Convert       | 7
|=======================

System Unit
^^^^^^^^^^^
((System unit)) implements system control and status special-purpose registers
and executes all l.mtspr/l.mfspr instructions.

Exceptions
^^^^^^^^^^
The core implements a precise ((exception model)). This means that when an
exception is taken, the following conditions are met:

* Subsequent instructions in program flow are discarded
* Previous instructions finish and write back their results
* The address of the faulting instruction is saved in EPCR registers and the
  machine state is saved to ESR registers

[[exceptions_table]]
.List of Implemented ((Exceptions))
[width="95%",options="header"]
|===========================================================
| Exception Type        | Vector Offset | Causing Conditions
| Reset                 | 0x100 | Caused by reset.
| Bus Error             | 0x200 | Caused by an attempt to access an invalid
  physical address.
| Data Page Fault       | 0x300 | Generated artificially by the DTLB miss
  exception handler when no matching PTE is found in the page tables, or on a
  page protection violation for load/store operations.
| Instruction Page Fault| 0x400 | Generated artificially by the ITLB miss
  exception handler when no matching PTE is found in the page tables, or on a
  page protection violation for instruction fetch.
| Low Priority External Interrupt       | 0x500 | Low priority external
  interrupt asserted.
| Alignment     | 0x600 | Load/store access to a location that is not
  naturally aligned.
| Illegal Instruction   | 0x700 | Illegal instruction in the instruction stream.
| High Priority External Interrupt      | 0x800 | High priority external
  interrupt asserted.
| D-TLB Miss    | 0x900 | No matching entry in DTLB (DTLB miss).
| I-TLB Miss    | 0xA00 | No matching entry in ITLB (ITLB miss).
| System Call   | 0xC00 | System call initiated by software.
| Floating point exception      | 0xD00 | FP operation caused flags in FPCSR to
  become set.
| Trap  | 0xE00 | Trap instruction was decoded.
|===========================================================
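
For tooling such as trace decoders it can help to have the table above in
machine-readable form. The following sketch copies the vector offsets into a
Python dictionary. The names +EXCEPTION_VECTORS+ and +exception_name+ are
invented for this example; offsets not listed in the table return +None+.

```python
# Vector offsets copied from the "List of Implemented Exceptions" table.
EXCEPTION_VECTORS = {
    0x100: "Reset",
    0x200: "Bus Error",
    0x300: "Data Page Fault",
    0x400: "Instruction Page Fault",
    0x500: "Low Priority External Interrupt",
    0x600: "Alignment",
    0x700: "Illegal Instruction",
    0x800: "High Priority External Interrupt",
    0x900: "D-TLB Miss",
    0xA00: "I-TLB Miss",
    0xC00: "System Call",
    0xD00: "Floating point exception",
    0xE00: "Trap",
}


def exception_name(vector_offset):
    """Return the exception type for a vector offset, or None if unlisted."""
    return EXCEPTION_VECTORS.get(vector_offset)


name = exception_name(0x900)
```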
 
The OR1200 exception support does not include support for range exceptions
or fast context switching.

Data Cache Operation
~~~~~~~~~~~~~~~~~~~~
Data Cache Load/Store Access
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The load/store unit requests data from the data ((cache)), stores them into
the general-purpose register file and forwards them to integer execution
units. Therefore the LSU is tightly coupled with the data cache.

If there is neither a data cache line miss nor a ((DTLB)) miss, load
operations take two clock cycles to execute and store operations take one
clock cycle to execute. The LSU does all the data alignment work.

Data can be written to the data cache on a word, half-word or byte basis. When
the data cache operates in write-through mode, all writes are immediately
written back to main memory or to the next level of caches.

[[wb_write_fig]]
.WISHBONE Write Cycle
image::img/wb_write.gif[scaledwidth="70%",align="center"]

<<wb_write_fig>> shows how a ((write-through)) cycle on the data WISHBONE
interface is performed when a store instruction hits in the data cache.  If
+dwb_ERR_I+ or +dwb_RTY_I+ is asserted instead of the usual +dwb_ACK_I+, a bus
error exception is invoked.

Data Cache Line Fill Operation
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
When a load instruction misses in the cache, and either the cache uses the
((write-through)) strategy or it uses the ((write-back)) strategy and the line
is clean or invalid, a 4 beat sequential read burst with critical word
first is performed. If the strategy is write-back and the line is dirty,
the line is first written back to memory. The critical word is forwarded to
the load/store unit to minimize performance loss because of the cache miss.

[[wb_read_fig]]
.WISHBONE Block Read Cycle
image::img/wb_read.gif[scaledwidth="70%",align="center"]

<<wb_read_fig>> shows how a cache line is read in a WISHBONE read block cycle
composed of four read transfers.  If +dwb_ERR_I+ or +dwb_RTY_I+ is asserted
instead of the usual +dwb_ACK_I+, a bus error exception is invoked.

When a store instruction misses in the cache and the cache uses the
write-through strategy, the write is simply put on the bus and no caching
occurs. If the store misses, the cache uses the write-back strategy and the
line is valid and clean or invalid, a 4 beat sequential read burst to fill the
line is performed, and then the write to the cache occurs. If a store misses
and the line to be replaced is valid and dirty, that line is first written
back to memory before the desired line is read.
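
The miss handling described in this section can be summarised as a small
decision function. The sketch below returns, in order, the bus operations
that a data cache miss produces. The operation labels (+read_burst_4+,
+single_write+, +write_back_line+) and the function name are invented for
this example and are not signal or module names from the OR1200 sources.

```python
def miss_bus_ops(op, strategy, line_state):
    """List the bus operations for a data cache miss, in order.

    op:         "load" or "store"
    strategy:   "write-through" or "write-back"
    line_state: "invalid", "clean" or "dirty" (state of the line being replaced)
    """
    if op not in ("load", "store"):
        raise ValueError(op)
    if strategy == "write-through":
        # A store miss goes straight to the bus and is not cached;
        # a load miss fills the line with a 4 beat burst.
        return ["read_burst_4"] if op == "load" else ["single_write"]
    if strategy != "write-back":
        raise ValueError(strategy)
    ops = []
    if line_state == "dirty":
        ops.append("write_back_line")  # evict the dirty line first
    ops.append("read_burst_4")         # fill the line, critical word first
    return ops                         # a store then writes into the cache


store_wt = miss_bus_ops("store", "write-through", "invalid")
load_wb_dirty = miss_bus_ops("load", "write-back", "dirty")
store_wb_clean = miss_bus_ops("store", "write-back", "clean")
```

In the write-back case a store miss generates the same bus traffic as a load
miss, because the store data is written into the freshly filled line rather
than onto the bus.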
 
[[wb_rw_fig]]
.WISHBONE Block Read/Write Cycle
image::img/wb_rw.gif[scaledwidth="70%",align="center"]

<<wb_rw_fig>> shows how a cache line is read in a WISHBONE read block cycle
followed by a write transfer.  If +dwb_ERR_I+ or +dwb_RTY_I+ is asserted
instead of the usual +dwb_ACK_I+, a bus error exception is invoked.

Cache/Memory Coherency
^^^^^^^^^^^^^^^^^^^^^^
Data cache in OR1200 operates in either write-through or write-back mode.
The mode is definable at synthesis time for default use, and at run time when
the DMMU is used. There is currently no ((coherency)) support between the
local data cache and caches of other processors.

Data Cache Enabling/Disabling
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Data cache is disabled at power up. The entire data cache can be enabled by
setting bit SR[DCE] to one. Before the data cache is enabled, it must be
invalidated.

Data Cache Invalidation
^^^^^^^^^^^^^^^^^^^^^^^
Data cache in OR1200 does not support ((invalidation)) of the entire data
cache in a single operation. The normal procedure to invalidate the entire
data cache is to cycle through all data cache lines and invalidate each line
separately.
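
The line-by-line procedure amounts to a simple loop. The sketch below models
it in Python: +write_dcbir+ is a hypothetical callback standing in for an
l.mtspr to DCBIR, and both the cache size and the 16-byte line size are
assumptions that depend on how the core was configured. It also assumes that
stepping through one address per line index is enough to reach every line.

```python
def invalidate_data_cache(write_dcbir, cache_bytes, line_bytes=16):
    """Invalidate every data cache line by writing its address to DCBIR.

    write_dcbir: callable standing in for an l.mtspr to DCBIR (hypothetical)
    cache_bytes: total cache size in bytes (configuration dependent)
    line_bytes:  line size; 16 assumes four 32-bit words per line
    """
    for addr in range(0, cache_bytes, line_bytes):
        write_dcbir(addr)


written = []  # records the addresses that would be written to DCBIR
invalidate_data_cache(written.append, cache_bytes=8192)
```

With an 8 KB cache and 16-byte lines this performs 512 writes. Software would
run such a loop before setting SR[DCE].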
 
Data Cache Locking
^^^^^^^^^^^^^^^^^^
Data cache implements way ((locking)) bits in the data cache control register
DCCR. Bits LWx lock individual ways when they are set to one.

Data Cache Line Prefetch
^^^^^^^^^^^^^^^^^^^^^^^^
Data cache line ((prefetch)) is optional in the OpenRISC 1000 architecture and
is not implemented in OR1200.

Data Cache Line ((Flush))
^^^^^^^^^^^^^^^^^^^^^^^^^
The operation is performed by writing the effective address to the DCBFR
register.

When a cache line is valid and clean, or the cache is in write-through
strategy, the line is invalidated and no write-back occurs.

Data Cache Line Invalidate
^^^^^^^^^^^^^^^^^^^^^^^^^^
Data cache line ((invalidate)) invalidates a single data cache line. The
operation is performed by writing the effective address to the DCBIR
register.  If the cache is in write-back strategy, it is best to use the line
flush function.

Data Cache Line ((Write-back))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The operation is performed by writing the effective address to the DCBWR
register.

If the cache is in ((write-through)) strategy, this operation is ignored
because no cached line can be dirty, so there is nothing to write back.

Data Cache Line ((Lock))
^^^^^^^^^^^^^^^^^^^^^^^^
Locking of individual data cache lines is not implemented in OR1200.

Data Cache ((inhibit)) with address bit 31 set
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
If the DMMU is disabled, by default all addresses with bit 31 of the address
asserted high will cause the data cache to be inhibited, meaning no reads
or writes are cached.

If the ((DMMU)) is enabled, it is possible for any address to be inhibited
or not, and in these modes the cache behaves accordingly.
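
The rule can be written as a one-line predicate. In the sketch below the
function name and the +page_inhibited+ argument are invented for this
example; +page_inhibited+ stands in for whatever per-page decision the DMMU
supplies when translation is enabled.

```python
def data_cache_inhibited(addr, dmmu_enabled=False, page_inhibited=False):
    """Decide whether a data access bypasses the data cache.

    With the DMMU disabled, bit 31 of the address selects inhibition.
    With the DMMU enabled, the per-page decision is passed in by the caller.
    """
    if dmmu_enabled:
        return page_inhibited
    return bool((addr >> 31) & 1)


io_access = data_cache_inhibited(0x90000000)    # bit 31 set
ram_access = data_cache_inhibited(0x00001000)   # bit 31 clear
```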
 
Instruction ((Cache)) Operation
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Instruction Cache Instruction ((Fetch)) Access
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The instruction unit requests instructions from the instruction cache and
forwards them to the instruction queue inside the instruction unit. Therefore
the instruction unit is tightly coupled with the instruction cache.

If there is neither an instruction cache line ((miss)) nor an ITLB miss, an
instruction fetch operation takes one clock cycle to execute.

Unlike the data cache, which can be modified with store instructions, the
instruction cache cannot be explicitly modified.

Instruction Cache Line Fill Operation
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
On a cache miss, a 4 beat sequential read burst with critical word first is
performed. The critical word is forwarded to the instruction unit to minimize
performance loss because of the cache miss.

[[wb_block_read_fig]]
.WISHBONE Block Read Cycle
image::img/wb_block_read.gif[scaledwidth="70%",align="center"]

<<wb_block_read_fig>> shows how a cache line is read in a WISHBONE read block
cycle composed of four read transfers.  If +iwb_ERR_I+ or +iwb_RTY_I+ is
asserted instead of the usual +iwb_ACK_I+, a bus error exception is invoked.

Cache/Memory ((Coherency))
^^^^^^^^^^^^^^^^^^^^^^^^^^
OR1200 is not intended for use in multiprocessor environments. Therefore no
support for coherency between the local instruction cache and caches of other
processors or main memory is implemented.

Instruction Cache Enabling/Disabling
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Instruction cache is disabled at power up. The entire instruction cache can be
enabled by setting bit SR[ICE] to one. Before the instruction cache is enabled,
it must be invalidated.

Instruction Cache ((Invalidation))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Instruction cache in OR1200 does not support invalidation of the entire
instruction cache in a single operation. The normal procedure to invalidate
the entire instruction cache is to cycle through all instruction cache lines
and invalidate each line separately.

Instruction Cache Locking
^^^^^^^^^^^^^^^^^^^^^^^^^
Instruction cache implements way locking bits in the instruction cache control
register ICCR. Bits LWx lock individual ways when they are set to one.

Instruction Cache Line ((Prefetch))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Instruction cache line prefetch is optional in the OpenRISC 1000 architecture
and is not implemented in OR1200.

Instruction Cache Line ((Invalidate))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Instruction cache line invalidate invalidates a single instruction cache
line. The operation is performed by writing the effective address to the ICBIR
register.

Instruction ((Cache Line Lock))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Locking of individual instruction cache lines is not implemented in OR1200.
 
Data MMU
~~~~~~~~
Translation Disabled
^^^^^^^^^^^^^^^^^^^^
Load/store address translation can be disabled by clearing bit SR[DME]. If
translation is disabled, then the physical address used to access the data
cache, and optionally provided on +dwb_ADDR_O+, is the same as the load/store
effective address.
(((Address Translation,Data)))

Translation Enabled
^^^^^^^^^^^^^^^^^^^
Load/store address translation can be enabled by setting bit SR[DME]. If
translation is enabled, the DMMU provides load/store effective address to
physical address translation and page protection for memory accesses.
(((Address Translation,Data)))

[[addr_translation_fig]]
.32-bit Address Translation Mechanism using Two-Level Page Table
image::img/addr_translation.gif[scaledwidth="70%",align="center"]

In the OR1200, ((page tables)) must be managed by the operating system's
virtual memory management subsystem. <<addr_translation_fig>> shows address
translation using a two-level page table. Refer to <> for one-level page
table address translation as well as for details about address translation
and page table content.

((DMMUCR)) and Flush of Entire ((DTLB))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
DMMUCR is not implemented in OR1200. Therefore the page table base pointer
(PTBP) must be stored in a software variable. A flush of the entire DTLB must
be performed by a software flush of every DTLB entry separately. A software
flush is performed by manually writing bits from the TLB entries back to PTEs.

Page Protection
^^^^^^^^^^^^^^^
After a virtual address is determined to be within a page covered by the
valid PTE, the access is validated by the memory protection mechanism. If
this protection mechanism prohibits the access, a data page fault exception
is generated.
(((Page Protection,Data)))

The memory protection mechanism allows selectively granting read access
and write access for both supervisor and user modes. The page protection
mechanism provides protection at all page level granularities.

[[protection_attrs_ldst_table]]
.Protection Attributes for Load/Store Accesses
[width="70%",options="header"]
|================================
| Protection attribute  | Meaning
| DTLBWyTR[SREx]        | Enable load operations in supervisor mode to the
  page.
| DTLBWyTR[SWEx]        | Enable store operations in supervisor mode to the
  page.
| DTLBWyTR[UREx]        | Enable load operations in user mode to the page.
| DTLBWyTR[UWEx]        | Enable store operations in user mode to the page.
|================================

<<protection_attrs_ldst_table>> lists page protection attributes defined in
the DTLBWyTR register. For each individual page, the appropriate strategy out
of seven possible strategies is programmed with the PPI field of the PTE.
Because OR1200 does not implement DMMUPR, translation of PTE[PPI] into a
suitable set of protection bits must be performed by software and written into
DTLBWyTR.
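
To make the meaning of the four protection bits concrete, the sketch below
models the check applied to a load or store. The class and function names are
invented for this example, and the bits are held as plain booleans rather
than at their real positions in DTLBWyTR, which this section does not give.

```python
from dataclasses import dataclass


@dataclass
class DataPageProtection:
    """Protection bits of one DTLBWyTR entry, held as booleans."""
    sre: bool = False  # supervisor load enable
    swe: bool = False  # supervisor store enable
    ure: bool = False  # user load enable
    uwe: bool = False  # user store enable


def access_allowed(prot, is_store, supervisor):
    """Return True if the access is permitted, False if it should page-fault."""
    if supervisor:
        return prot.swe if is_store else prot.sre
    return prot.uwe if is_store else prot.ure


# A page the kernel may read and write, and user code may only read.
kernel_rw_user_ro = DataPageProtection(sre=True, swe=True, ure=True)
user_store_ok = access_allowed(kernel_rw_user_ro, is_store=True, supervisor=False)
```

A +False+ result corresponds to the data page fault exception described above.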
 
((DTLB)) Entry Reload
^^^^^^^^^^^^^^^^^^^^^
OR1200 does not implement DTLB entry reloads in hardware. Instead a software
routine must be used to search the page table for the correct page table entry
(PTE) and copy it into the DTLB. Software is responsible for maintaining
accessed and dirty bits in the page tables.

When the LSU computes a load/store effective address whose physical address is
not already cached by the DTLB, a DTLB miss exception is invoked.

The DTLB reload routine must load the correct ((PTE)) into the correct
((DTLBWyMR)) and ((DTLBWyTR)) registers of one of the possible DTLB ways.
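
The software reload can be pictured with a small model. The sketch below is
not the OR1200 handler: it uses nested dictionaries for a two-level page
table and a Python list for a single DTLB way. The 8 KB page size, the
11-bit second-level index and the 64-set DTLB are assumptions chosen for the
example, as are all names in the code.

```python
PAGE_SHIFT = 13   # assumed 8 KB pages
L2_BITS = 11      # assumed width of the second-level page table index
DTLB_SETS = 64    # assumed number of sets in a single DTLB way


class DataPageFault(Exception):
    """Raised by the reload routine when no matching PTE exists."""


def dtlb_reload(ea, page_table, dtlb):
    """Model a DTLB miss handler: walk the tables, then fill one DTLB set.

    page_table: dict of dicts, page_table[l1_index][l2_index] -> PTE dict
    dtlb:       list of DTLB_SETS entries, each None or (vpn, pte)
    """
    vpn = ea >> PAGE_SHIFT
    l1_index = vpn >> L2_BITS
    l2_index = vpn & ((1 << L2_BITS) - 1)
    pte = page_table.get(l1_index, {}).get(l2_index)
    if pte is None:
        raise DataPageFault(hex(ea))
    pte["accessed"] = True              # software maintains the accessed bit
    dtlb[vpn % DTLB_SETS] = (vpn, pte)  # stands in for DTLBWyMR/DTLBWyTR writes
    return pte


page_table = {0: {1: {"ppn": 0x42, "accessed": False}}}
dtlb = [None] * DTLB_SETS
pte = dtlb_reload(0x2004, page_table, dtlb)
```

Raising +DataPageFault+ here mirrors the data page fault that the exception
table describes as generated by the miss handler when no matching PTE is
found.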
 
DTLB Entry Invalidation
^^^^^^^^^^^^^^^^^^^^^^^
To invalidate a DTLB entry, special-purpose register DTLBEIR must be written
with the effective address. The corresponding entry in the local DTLB is then
invalidated.

Locking DTLB Entries
^^^^^^^^^^^^^^^^^^^^
Since all DTLB entry reloads are performed in software, there is no hardware
locking of DTLB entries. Instead it is up to the software reload routine to
avoid replacing some of the entries if so desired.

Page Attribute - Dirty (D)
^^^^^^^^^^^^^^^^^^^^^^^^^^
Dirty (D) attribute is not implemented in OR1200 DTLB. It is up to the
operating system to generate the dirty attribute bit with the page protection
mechanism.
(((Page Attributes,Data)))

Page Attribute - Accessed (A)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Accessed (A) attribute is not implemented in OR1200 DTLB. It is up to the
operating system to generate the accessed attribute bit with the page
protection mechanism.
(((Page Attributes,Data)))

Page Attribute - Weakly Ordered Memory (WOM)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Weakly ordered memory (WOM) attribute is not needed in OR1200 because all
memory accesses are serialized. Therefore this attribute is not implemented.
(((Page Attributes,Data)))

Page Attribute - Write-Back Cache (WBC)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Write-back cache (WBC) attribute is not implemented, as the data cache cannot
be configured at run time to be write-back enabled if the write-through
strategy was selected at synthesis time.
(((Page Attributes,Data)))

Page Attribute - Caching-Inhibited (CI)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Caching-inhibited (CI) attribute is not implemented in OR1200 DTLB. Cached
and uncached regions are divided by bit 30 of the data effective address.
(((Page Attributes,Data)))

[[data_cached_regions_table]]
.Cached and uncached regions
[width="70%",options="header"]
|===============================
| Effective Address     | Region
| 0x00000000 - 0x3FFFFFFF       | Cached
| 0x40000000 - 0x7FFFFFFF       | Uncached
| 0x80000000 - 0xBFFFFFFF       | Cached
| 0xC0000000 - 0xFFFFFFFF       | Uncached
|===============================
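
The four rows of the table follow from testing a single address bit. The
sketch below shows that test; the function name +region+ is invented for this
example.

```python
def region(ea):
    """Classify a 32-bit effective address using bit 30, as in the table."""
    return "Uncached" if (ea >> 30) & 1 else "Cached"


# First and last address of each row in the table.
rows = [
    (0x00000000, 0x3FFFFFFF),
    (0x40000000, 0x7FFFFFFF),
    (0x80000000, 0xBFFFFFFF),
    (0xC0000000, 0xFFFFFFFF),
]
classified = [(region(lo), region(hi)) for lo, hi in rows]
```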
 
Uncached accesses must be used for memory-mapped I/O registers, where all
reads and writes must always be performed directly on the external
interface and not in the data cache.

Page Attribute - Cache Coherency (CC)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Cache coherency (CC) attribute is not implemented. It is not needed in OR1200
because the core does not implement support for multiprocessor environments
and because the data cache operates only in write-through mode.
(((Page Attributes,Data)))
 
((Instruction MMU))
~~~~~~~~~~~~~~~~~~~
Translation Disabled
^^^^^^^^^^^^^^^^^^^^
Instruction fetch address translation can be disabled by clearing bit
SR[IME]. If translation is disabled, then the physical address used to access
the instruction cache, and optionally provided on +iwb_ADDR_O+, is the same as
the instruction fetch effective address.
(((Address Translation,Instruction)))

Translation Enabled
^^^^^^^^^^^^^^^^^^^
Instruction fetch address translation can be enabled by setting bit
SR[IME]. If translation is enabled, the IMMU provides instruction fetch
effective address to physical address translation and page protection for
instruction fetch accesses.
(((Address Translation,Instruction)))

[[addr_translation_rep_fig]]
.32-bit Address Translation Mechanism using Two-Level Page Table
image::img/addr_translation.gif[scaledwidth="70%",align="center"]

In the OR1200, page tables must be managed by the operating system's virtual
memory management subsystem. <<addr_translation_rep_fig>> shows address
translation using a two-level page table. Refer to <> for
one-level page table address translation as well as for details about address
translation and page table content.

((IMMUCR)) and ((Flush)) of Entire ITLB
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
IMMUCR is not implemented in OR1200. Therefore the page table base pointer
(PTBP) must be stored in a software variable. A flush of the entire ITLB must
be performed by a software flush of every ITLB entry separately. A software
flush is performed by manually writing bits from the TLB entries back to PTEs.

Page Protection
^^^^^^^^^^^^^^^
After a virtual address is determined to be within a page covered by the
valid PTE, the access is validated by the memory protection mechanism. If
this protection mechanism prohibits the access, an instruction page fault
exception is generated.
(((Page Protection,Instruction)))

The memory protection mechanism allows selectively granting execute access
for both supervisor and user modes. The page protection mechanism provides
protection at all page level granularities.

[[protection_attrs_inst_table]]
.Protection Attributes for Instruction Fetch Accesses
[width="70%",options="header"]
|================================
| Protection attribute  | Meaning
| ITLBWyTR[SXEx]        | Enable execute operations in supervisor mode of the
  page.
| ITLBWyTR[UXEx]        | Enable execute operations in user mode of the page.
|================================

<<protection_attrs_inst_table>> lists page protection attributes defined
in the ITLBWyTR register. For each individual page, the appropriate strategy
out of seven possible strategies is programmed with the PPI field of the PTE.
Because OR1200 does not implement IMMUPR, translation of PTE[PPI] into a
suitable set of protection bits must be performed by software and written into
ITLBWyTR.

((ITLB)) Entry Reload
^^^^^^^^^^^^^^^^^^^^^
OR1200 does not implement ITLB entry reloads in hardware. Instead a software
routine must be used to search the page table for the correct page table entry
(PTE) and copy it into the ITLB. Software is responsible for maintaining the
accessed bit in the page tables.

When the instruction unit computes an instruction fetch effective address
whose physical address is not already cached by the ITLB, an ITLB miss
exception is invoked.

The ITLB reload routine must load the correct PTE into the correct ITLBWyMR
and ITLBWyTR registers of one of the possible ITLB ways.
 
ITLB Entry Invalidation
^^^^^^^^^^^^^^^^^^^^^^^
To invalidate an ITLB entry, special-purpose register ITLBEIR must be written
with the effective address. The corresponding entry in the local ITLB is then
invalidated.

Locking ITLB Entries
^^^^^^^^^^^^^^^^^^^^
Since all ITLB entry reloads are performed in software, there is no hardware
locking of ITLB entries. Instead it is up to the software reload routine to
avoid replacing some of the entries if so desired.

Page Attribute - Dirty (D)
^^^^^^^^^^^^^^^^^^^^^^^^^^
Dirty (D) attribute resides in the PTE but it is not used by the IMMU.
(((Page Attributes,Instruction)))

Page Attribute - Accessed (A)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Accessed (A) attribute is not implemented in OR1200 ITLB. It is up to the
operating system to generate the accessed attribute bit with the page
protection mechanism.
(((Page Attributes,Instruction)))

Page Attribute - Weakly Ordered Memory (WOM)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Weakly ordered memory (WOM) attribute is not needed in OR1200 because all
instruction fetch accesses are serialized. Therefore this attribute is
not implemented.
(((Page Attributes,Instruction)))

Page Attribute - Write-Back Cache (WBC)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Write-back cache (WBC) attribute resides in the PTE but it is not used by
the IMMU.
(((Page Attributes,Instruction)))

Page Attribute - Caching-Inhibited (CI)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Caching-inhibited (CI) attribute is not implemented in OR1200 ITLB. Cached
and uncached regions are divided by bit 30 of the instruction effective
address.
(((Page Attributes,Instruction)))

[[inst_cached_regions_table]]
.Cached and uncached regions
[width="70%",options="header"]
|===============================
| Effective Address     | Region
| 0x00000000 - 0x3FFFFFFF       | Cached
| 0x40000000 - 0x7FFFFFFF       | Uncached
| 0x80000000 - 0xBFFFFFFF       | Cached
| 0xC0000000 - 0xFFFFFFFF       | Uncached
|===============================

Page Attribute - Cache Coherency (CC)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Cache coherency (CC) attribute resides in the PTE but it is not used by
the IMMU.
(((Page Attributes,Instruction)))
 
((Programmable Interrupt Controller))
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The PICMR special-purpose register is used to mask or unmask up to 30
programmable interrupt sources. The PICPR special-purpose register is used to
assign low or high priority to a maximum of 30 interrupt sources.

The PICSR special-purpose register is used to determine the status of each
interrupt input. Bits in PICSR represent the status of the interrupt inputs,
and the actual interrupt must be cleared in the device that is the source of a
pending interrupt.

The ((PIC)) implementation in the OR1200 differs from the architecture
specification. The PIC instead offers a latched level-sensitive interrupt.

Once an interrupt line is latched (i.e. its value appears in PICSR), no
new interrupts can be triggered for that line until its bit in PICSR is
cleared. The usual sequence for an interrupt handler is then as follows.

. Peripheral asserts interrupt, which is latched and triggers handler.
. Handler processes interrupt.
. Handler notifies peripheral that the interrupt has been processed (typically
  via a memory mapped register).
. Peripheral deasserts interrupt.
. Handler clears corresponding bit in PICSR and returns.

It is assumed that the peripheral will deassert its interrupt promptly
(within 1-2 cycles). Otherwise on exiting the interrupt handler, having
cleared PICSR, the level-sensitive interrupt will immediately retrigger.
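
The handler sequence and the retrigger hazard can be reproduced with a toy
model. The class below is a simplification written for this example, not a
description of the OR1200 RTL: it treats every line as maskable and latches
any asserted, unmasked line into +picsr+ each time +sample+ is called.

```python
class LatchedPic:
    """Toy model of a latched, level-sensitive interrupt controller."""

    def __init__(self):
        self.picmr = 0  # mask register: 1 = source unmasked
        self.picsr = 0  # status register: latched interrupt lines

    def sample(self, lines):
        """Latch every asserted, unmasked line; return True if any is pending."""
        self.picsr |= lines & self.picmr
        return self.picsr != 0

    def clear(self, bit):
        """What the handler does to one PICSR bit before returning."""
        self.picsr &= ~(1 << bit)


pic = LatchedPic()
pic.picmr = 1 << 2

pending = pic.sample(1 << 2)      # peripheral asserts line 2, PIC latches it
pic.clear(2)                      # handler clears PICSR after the deassert
quiet = pic.sample(0)             # line is low: nothing pending

retriggered = pic.sample(1 << 2)  # line still high after a clear: fires again
```

The last call shows why the peripheral has to deassert its line before the
handler clears PICSR and returns.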
 
((Tick Timer))
~~~~~~~~~~~~~~
Tick timer facility is enabled with TTMR[M]. TTCR is incremented with each
clock cycle and a high priority interrupt can be asserted whenever the lower
28 bits of TTCR match TTMR[TP] and TTMR[IE] is set.

TTCR restarts counting from zero when a match event happens and TTMR[M] is
0x1. If TTMR[M] is 0x2, TTCR is stopped when a match event happens and TTCR
must be changed to start counting again. When TTMR[M] is 0x3, TTCR keeps
counting even when a match event happens.
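
The three modes can be compared with a one-cycle step function. The sketch
below is a behavioural model written for this example. It assumes that
TTMR[M] equal to zero means the timer is disabled, and it does not try to
reproduce the exact cycle on which the hardware raises the interrupt.

```python
TP_MASK = (1 << 28) - 1  # the match compares the lower 28 bits of TTCR


def tick(ttcr, mode, tp, ie=True):
    """Advance the timer by one clock; return (new_ttcr, interrupt).

    mode is TTMR[M]: 0 disabled (assumed), 1 restart, 2 stop on match,
    3 keep counting.
    """
    if mode == 0:
        return ttcr, False
    match = (ttcr & TP_MASK) == (tp & TP_MASK)
    if match and mode == 1:
        return 0, ie       # restart counting from zero
    if match and mode == 2:
        return ttcr, ie    # stays stopped until software changes TTCR
    return (ttcr + 1) & 0xFFFFFFFF, (ie and match)


# Restart mode with TP = 3: record on which steps the interrupt is raised.
ttcr, fired = 0, []
for _ in range(8):
    ttcr, irq = tick(ttcr, mode=1, tp=3)
    fired.append(irq)
```

In this model the restart mode raises an interrupt every four steps when
TTMR[TP] is 3.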
 
((Power Management))
~~~~~~~~~~~~~~~~~~~~
((Clock Gating)) and Frequency Changing Versus CPU Stalling
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
If the system doesn't support clock gating and if changing clock frequency in
slow down mode is not possible, the CPU can be stalled for a certain number of
clock cycles. This gives a much smaller benefit, but it still reduces power
consumption.

Slow Down Mode
^^^^^^^^^^^^^^
Slow down mode is software controlled with the 4-bit value in PMR[SDF]. A
lower value specifies higher expected performance from the processor core.
Usually PMR[SDF] is dynamically set by the operating system's idle routine,
which monitors the usage of the processor core.
(((Mode,Slow Down)))

PMR[SDF] is broadcast on +pm_clksd+. The external clock generator should
adjust clock frequency according to the value of +pm_clksd+. Exact slow down
factors are not defined but 0xF should go all the way down to 32.768 kHz.

With +pm_clksd+ equal to 0xF, +pm_lvolt+ is asserted. This is an indication for
the external power supply to lower the voltage.

Doze Mode
^^^^^^^^^
To switch to doze mode, software should set PMR[DME]. Once an interrupt
is received by the programmable interrupt controller (PIC), +pm_wakeup+
is asserted and external clock generation circuitry should enable all
clocks. Once clocks are running, the RISC is switched back again to the normal
mode and PMR[DME] is cleared.
(((Mode,Doze)))

When doze mode is enabled, +pm_dc_gate+, +pm_ic_gate+, +pm_dmmu_gate+,
+pm_immu_gate+ and +pm_cpu_gate+ are asserted. As a result all clocks except
+clk_tt+ should be gated by external clock generation circuitry.

Sleep Mode
^^^^^^^^^^
To switch to sleep mode, software should set PMR[SME]. Once an interrupt
is received by the programmable interrupt controller (PIC), +pm_wakeup+ is
asserted and external clock generation circuitry should enable all clocks.
Once clocks are running, the RISC is switched back again to the normal mode
and PMR[SME] is cleared.
(((Mode,Sleep)))

When sleep mode is enabled, +pm_dc_gate+, +pm_ic_gate+, +pm_dmmu_gate+,
+pm_immu_gate+, +pm_cpu_gate+ and +pm_tt_gate+ are asserted. As a result
all clocks including +clk_tt+ should be gated by external clock generation
circuitry.

In sleep mode, +pm_lvolt+ is asserted. This is an indication for the external
power supply to lower the voltage.
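
The difference between the modes comes down to which power management
outputs are asserted. The sketch below collects the rules from the slow down,
doze and sleep descriptions into one function. The function name and the use
of a Python set are choices made for this example; the signal names are the
ones used in the text.

```python
UNIT_GATES = [
    "pm_dc_gate", "pm_ic_gate", "pm_dmmu_gate", "pm_immu_gate", "pm_cpu_gate",
]


def pm_outputs(doze=False, sleep=False, sdf=0):
    """Return the set of power management outputs asserted for a PMR state.

    doze, sleep: values of PMR[DME] and PMR[SME]
    sdf:         4-bit slow down factor PMR[SDF]
    """
    asserted = set()
    if doze or sleep:
        asserted.update(UNIT_GATES)  # all unit clocks gated
    if sleep:
        asserted.add("pm_tt_gate")   # sleep also gates the tick timer clock
        asserted.add("pm_lvolt")     # and asks the supply to lower the voltage
    if sdf == 0xF:
        asserted.add("pm_lvolt")     # slowest setting also requests low voltage
    return asserted


doze_signals = pm_outputs(doze=True)
sleep_signals = pm_outputs(sleep=True)
```

Doze mode leaves +pm_tt_gate+ deasserted, so the tick timer keeps running.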
 
Clock Gating
^^^^^^^^^^^^
((Clock gating)) feature is not implemented in OR1200 power management.

Disabled Units Force Clock Gating
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Units that are disabled in special-purpose register SR have their clock
gate signals asserted. Cleared bits SR[DCE], SR[ICE], SR[DME] and SR[IME]
directly force assertion of +pm_dc_gate+, +pm_ic_gate+, +pm_dmmu_gate+
and +pm_immu_gate+.

((Debug Unit))
~~~~~~~~~~~~~~
The debug unit can be controlled through the development interface, or it can
operate independently, programmed and handled by the RISC's resident debug
software.

((Watchpoints))
^^^^^^^^^^^^^^^
OR1200 debug unit does not implement OpenRISC 1000 architecture watchpoints.

((Breakpoint)) Exception
^^^^^^^^^^^^^^^^^^^^^^^^
DMR2[WGB] bits specify which watchpoints invoke a breakpoint exception. By
invoking a breakpoint exception, a target resident debugger can be built.

Breakpoint is broadcast on the development interface on +dbg_bp_o+.

((Development Interface))
~~~~~~~~~~~~~~~~~~~~~~~~~
NOTE: The information in this section is to be reviewed. It is the author's
opinion that the debug interface is now largely provided by the SPR mappings,
and no special sideband functions exist aside from stalling and resetting
the core.

An additional _development and debug interface IP_ core may be used to connect
OpenRISC 1200 to standard debuggers using the IEEE 1149.1 (JTAG) protocol.

((Debugging)) Through ((Development Interface))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The DSR special-purpose register specifies which exceptions cause the core
to stop the execution of the exception handler and turn over control to the
development interface. It can be programmed by the resident debug software
or by the development interface.

The DRR special-purpose register specifies which event caused the core to
stop the execution of program flow and turn over control to the development
interface. It should be cleared by the resident debug software or by the
development interface.

The DIR special-purpose register is not implemented.

Reading PC, Load/Store EA, Load Data, Store Data, Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Crucial information like ((program counter)) (PC), load/store effective
address (LSEA), load data, store data and the current instruction in the
execution pipeline can be asynchronously read through the development
interface.

[[dev_commands_table]]
.Development Interface Operation Commands
[width="70%",options="header"]
|========================
| dbg_op_i[2:0] | Meaning
| 0x0           | Reading Program Counter (PC)
| 0x1           | Reading Load/Store Effective Address
| 0x2           | Reading Load Data
| 0x3           | Reading Store Data
| 0x4           | Reading SPR
| 0x5           | Writing SPR
| 0x6           | Reading Instruction in Execution Pipeline
| 0x7           | Reserved
|========================

<<dev_commands_table>> lists operation commands that control what is read
or written through the development interface. All operations except reads and
writes of SPRs are asynchronous.
1431
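
As a minimal sketch, the command encoding in the table above can be captured
in a small Python lookup, as a debugger-side tool might do. The names +DbgOp+
and +is_synchronous+ are illustrative assumptions and do not come from the
OR1200 sources.

```python
from enum import IntEnum


class DbgOp(IntEnum):
    """Values driven on dbg_op_i[2:0], as listed in the table above."""
    READ_PC = 0x0
    READ_LSEA = 0x1
    READ_LOAD_DATA = 0x2
    READ_STORE_DATA = 0x3
    READ_SPR = 0x4
    WRITE_SPR = 0x5
    READ_INSN = 0x6


def is_synchronous(op):
    """Return True for SPR reads and writes, the only synchronous accesses."""
    op = DbgOp(op & 0x7)  # raises ValueError for the reserved code 0x7
    return op in (DbgOp.READ_SPR, DbgOp.WRITE_SPR)
```

Keeping the reserved code out of the enumeration means that an unexpected
value is rejected instead of being silently decoded.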
 
Reading and Writing SPRs Through Development Interface
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
For reads and writes of SPRs, +dbg_op_i+ must be set to 0x4 and 0x5,
respectively.

[[dev_interface_cycles_fig]]
.Development Interface Cycles
image::img/dev_interface_cycles.gif[scaledwidth="70%",align="center"]

<<dev_interface_cycles_fig>> shows development interface cycles. Writes must
be synchronous to the positive edge of the main RISC clock and should take one
clock cycle. Reads must take two clock cycles because access to synchronous
cache lines or to TLB entries introduces one clock cycle of delay.

If required, an external debugger can stop the CPU core by asserting
+dbg_stall_i+. This gives it enough time to read all registers of interest
from the RISC, or guarantees that writes into SPRs are performed without the
RISC writing to the same registers.
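
The timing rules above can be illustrated with a toy cycle-counting model.
This is only a sketch under stated assumptions: the class +DebugPortModel+ and
its methods are hypothetical, they mirror the one-cycle write and two-cycle
read described in this section, and they are not derived from the OR1200 RTL.

```python
class DebugPortModel:
    """Toy model of SPR access through the development interface."""

    OP_READ_SPR = 0x4   # dbg_op_i value for an SPR read
    OP_WRITE_SPR = 0x5  # dbg_op_i value for an SPR write

    def __init__(self):
        self.sprs = {}    # SPR address -> 32-bit value
        self.cycles = 0   # clock cycles consumed so far
        self.last_op = None

    def write_spr(self, addr, data):
        """Model an SPR write: one clock cycle."""
        self.last_op = self.OP_WRITE_SPR
        self.sprs[addr] = data & 0xFFFFFFFF
        self.cycles += 1

    def read_spr(self, addr):
        """Model an SPR read: two clock cycles; unwritten SPRs read as 0 here."""
        self.last_op = self.OP_READ_SPR
        self.cycles += 2
        return self.sprs.get(addr, 0)


port = DebugPortModel()
port.write_spr(0x11, 0x8001)   # 0x11 is the SR address (group 0, register 17)
value = port.read_spr(0x11)    # total cost so far: 1 + 2 = 3 cycles
```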
 
Tracking ((Data Flow))
^^^^^^^^^^^^^^^^^^^^^^
An external debugger can monitor and record data flow inside the RISC for
debugging purposes and profiling analysis. This is accomplished by monitoring
the status of the load/store unit, the load/store effective address and the
load/store data, all available at the development interface.

[[status_ldst_unit_table]]
.Status of the Load/Store Unit
[width="70%",options="header"]
|============================================================
| dbg_lss_o[3:0]        | Load/Store Instruction in Execution
| 0x0   | No load/store instruction in execution
| 0x1   | Reserved for load doubleword
| 0x2   | Load byte and zero extend
| 0x3   | Load byte and sign extend
| 0x4   | Load halfword and zero extend
| 0x5   | Load halfword and sign extend
| 0x6   | Load singleword and zero extend
| 0x7   | Load singleword and sign extend
| 0x8   | Reserved for store doubleword
| 0x9   | Reserved
| 0xA   | Store byte
| 0xB   | Reserved
| 0xC   | Store halfword
| 0xD   | Reserved
| 0xE   | Store singleword
| 0xF   | Reserved
|============================================================

An external trace buffer can capture all data flow events of interest by
analyzing the status of the load/store unit available on +dbg_lss_o+.
<<status_ldst_unit_table>> lists the status encodings for the load/store
unit.
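
A trace tool could decode +dbg_lss_o+ with a plain lookup built from the table
above. The following is a minimal sketch; the function names are assumptions
made for illustration, and every code not listed as a load or store is
reported as reserved.

```python
# Encodings copied from the "Status of the Load/Store Unit" table.
LSS_STATUS = {
    0x0: "No load/store instruction in execution",
    0x2: "Load byte and zero extend",
    0x3: "Load byte and sign extend",
    0x4: "Load halfword and zero extend",
    0x5: "Load halfword and sign extend",
    0x6: "Load singleword and zero extend",
    0x7: "Load singleword and sign extend",
    0xA: "Store byte",
    0xC: "Store halfword",
    0xE: "Store singleword",
}


def decode_lss(value):
    """Return the table description for a 4-bit dbg_lss_o sample."""
    return LSS_STATUS.get(value & 0xF, "Reserved")


def is_load(value):
    """True for the implemented load encodings 0x2 to 0x7."""
    return 0x2 <= (value & 0xF) <= 0x7


def is_store(value):
    """True for the implemented store encodings 0xA, 0xC and 0xE."""
    return (value & 0xF) in (0xA, 0xC, 0xE)
```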
 
Tracking ((Program Flow))
^^^^^^^^^^^^^^^^^^^^^^^^^
An external debugger can monitor and record program flow inside the RISC
for debugging purposes and profiling analysis. This is accomplished by
monitoring the status of the instruction unit, the PC and the fetched
instruction word, all available at the development interface.

[[status_inst_unit_table]]
.Status of the Instruction Unit
[width="70%",options="header"]
|=========================================
| dbg_is_o[1:0] | Instruction Fetch Status
| 0x0   | No instruction fetch in progress
| 0x1   | Normal instruction fetch
| 0x2   | Executing branch instruction
| 0x3   | Fetching instruction in delay slot
|=========================================

An external trace buffer can capture all program flow events of interest by
analyzing the status of the instruction unit available on +dbg_is_o+.
<<status_inst_unit_table>> lists the status encodings for the instruction
unit.
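
As a sketch of how a trace buffer might filter program flow events, the
following assumes a hypothetical capture format: a list of (+dbg_is_o+, PC)
pairs, one per sampled clock cycle. Neither the format nor the function name
is defined by this specification.

```python
# dbg_is_o[1:0] encodings from the "Status of the Instruction Unit" table.
IS_NONE = 0x0
IS_NORMAL = 0x1
IS_BRANCH = 0x2
IS_DELAY_SLOT = 0x3


def branch_events(samples):
    """Return the PCs sampled while a branch instruction was executing."""
    return [pc for status, pc in samples if (status & 0x3) == IS_BRANCH]


trace = [
    (IS_NORMAL, 0x100),
    (IS_BRANCH, 0x104),
    (IS_DELAY_SLOT, 0x108),
    (IS_NORMAL, 0x200),
    (IS_NONE, 0x200),
]
branches = branch_events(trace)
```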
 
Triggering ((External Watchpoint Event))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
<<watchpoint_trigger_fig>> shows how the development interface can assert
+dbg_ewt_i+ and cause a watchpoint event. If programmed, an external
watchpoint event will cause a breakpoint exception.

[[watchpoint_trigger_fig]]
.Assertion of External Watchpoint Trigger
image::img/watchpoint_trigger.gif[scaledwidth="70%",align="center"]
 
((Registers))
-------------
This section describes all registers inside the OR1200 core. The address of
each special-purpose register is computed by shifting the _GRP_ number 11 bits
left and adding the _REG_ number. All registers are 32 bits wide from the
software perspective. _USER MODE_ and _SUPV MODE_ specify the valid access
types for each register in user mode and supervisor mode of operation. R/W
stands for read and write access, R for read-only access and W for write-only
access.
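
The address rule above can be written as a one-line computation. This is a
minimal sketch; the function name +spr_address+ is an assumption, and the
example group and register numbers are taken from the register list in this
section.

```python
def spr_address(group, reg):
    """Compute an SPR address: group number shifted left by 11, plus register."""
    if group < 0:
        raise ValueError("group number must not be negative")
    if not 0 <= reg < (1 << 11):
        raise ValueError("register number must fit in 11 bits")
    return (group << 11) + reg


sr_addr = spr_address(0, 17)     # SR:   group 0,  register 17 -> 0x0011
dmr1_addr = spr_address(6, 16)   # DMR1: group 6,  register 16 -> 0x3010
ttcr_addr = spr_address(10, 1)   # TTCR: group 10, register 1  -> 0x5001
```

Because the register number occupies the low 11 bits, a register number of
2048 or more would spill into the group field, which is why the sketch
rejects it.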
 
((Registers list))
~~~~~~~~~~~~~~~~~~
[[regs_table]]
.List of All Registers
[width="95%",options="header"]
|============================================================================
| Grp # | Reg # | Reg Name      | USER MODE     | SUPV MODE     | Description
| 0     | 0     | ((VR))        | -             | R     | Version Register
| 0     | 1     | ((UPR))       | -             | R     | Unit Present Register
| 0     | 2     | ((CPUCFGR))   | -             | R     | CPU Configuration Register
| 0     | 3     | ((DMMUCFGR))  | -             | R     | Data MMU Configuration Register
| 0     | 4     | ((IMMUCFGR))  | -             | R     | Instruction MMU Configuration Register
| 0     | 5     | ((DCCFGR))    | -             | R     | Data Cache Configuration Register
| 0     | 6     | ((ICCFGR))    | -             | R     | Instruction Cache Configuration Register
| 0     | 7     | ((DCFGR))     | -             | R     | Debug Configuration Register
| 0     | 16    | ((PC))        | -             | R/W   | PC mapped to SPR space
| 0     | 17    | ((SR))        | -             | R/W   | Supervision Register
| 0     | 20    | ((FPCSR))     | -             | R/W   | FP Control Status Register
| 0     | 32    | ((EPCR0))     | -             | R/W   | Exception PC Register
| 0     | 48    | ((EEAR0))     | -             | R/W   | Exception EA Register
| 0     | 64    | ((ESR0))      | -             | R/W   | Exception SR Register
| 0     | 1024-1055     | ((GPR0-GPR31))        | -     | R/W   | GPRs mapped to SPR space
| 1     | 2             | ((DTLBEIR))   | -     | W     | Data TLB Entry Invalidate Register
| 1     | 1024-1151     | ((DTLBW0MR0-DTLBW0MR127))     | -     | R/W   | Data TLB Match Registers Way 0
| 1     | 1536-1663     | ((DTLBW0TR0-DTLBW0TR127))     | -     | R/W   | Data TLB Translate Registers Way 0
| 2     | 2             | ((ITLBEIR))   | -     | W     | Instruction TLB Entry Invalidate Register
| 2     | 1024-1151     | ((ITLBW0MR0-ITLBW0MR127))     | -     | R/W   | Instruction TLB Match Registers Way 0
| 2     | 1536-1663     | ((ITLBW0TR0-ITLBW0TR127))     | -     | R/W   | Instruction TLB Translate Registers Way 0
| 3     | 0     | ((DCCR))      | -             | R/W   | DC Control Register
| 3     | 2     | ((DCBFR))     | W             | W     | DC Block Flush Register
| 3     | 3     | ((DCBIR))     | W             | W     | DC Block Invalidate Register
| 3     | 4     | ((DCBWR))     | W             | W     | DC Block Write-back Register
| 4     | 0     | ((ICCR))      | -             | R/W   | IC Control Register
| 4     | 256   | ((ICBIR))     | W             | W     | IC Block Invalidate Register
| 5     | 256   | ((MACLO))     | R/W           | R/W   | MAC Low
| 5     | 257   | ((MACHI))     | R/W           | R/W   | MAC High
| 6     | 16    | ((DMR1))      | -             | R/W   | Debug Mode Register 1
| 6     | 17    | ((DMR2))      | -             | R/W   | Debug Mode Register 2
| 6     | 20    | ((DSR))       | -             | R/W   | Debug Stop Register
| 6     | 21    | ((DRR))       | -             | R/W   | Debug Reason Register
| 8     | 0     | ((PMR))       | -             | R/W   | Power Management Register
| 9     | 0     | ((PICMR))     | -             | R/W   | PIC Mask Register
| 9     | 2     | ((PICSR))     | -             | R/W   | PIC Status Register
| 10    | 0     | ((TTMR))      | -             | R/W   | Tick Timer Mode Register
| 10    | 1     | ((TTCR))      | R*            | R/W   | Tick Timer Count Register
|============================================================================

<<regs_table>> lists all OpenRISC 1000 special-purpose registers implemented
in OR1200. The VR, UPR and configuration registers are described below. For a
description of the other registers, refer to <<or1000_manual>>.
 
Register VR description
~~~~~~~~~~~~~~~~~~~~~~~
Special-purpose register VR identifies the version (model) and revision
level of the OpenRISC 1000 processor. It also specifies the standard
template, if any, on which this implementation is based.
(((Register,VR)))

[[vr_reg_table]]
.VR Register
[width="95%",options="header"]
|============================================================
| Bit # | Access        | Reset | Short Name    | Description
| 5:0   | R     | Revision      | REV           | Revision number of this document.
| 15:6  | R     | 0x0           | -             | Reserved
| 23:16 | R     | 0x00          | CFG           | Configuration should be read from UPR and configuration registers
| 31:24 | R     | 0x12          | VER           | Version number for OR1200 is fixed at 0x12.
|============================================================
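
The VR fields can be separated with shifts and masks that follow the bit
ranges in the table above. This is a minimal sketch: the function name
+decode_vr+ is an assumption, and the sample value 0x12000008 (revision 8) is
made up for illustration.

```python
def decode_vr(vr):
    """Split a 32-bit VR value into the fields listed in the VR table."""
    return {
        "REV": vr & 0x3F,           # bits 5:0
        "CFG": (vr >> 16) & 0xFF,   # bits 23:16
        "VER": (vr >> 24) & 0xFF,   # bits 31:24
    }


fields = decode_vr(0x12000008)  # illustrative value, not read from hardware
```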
 
Register UPR description
~~~~~~~~~~~~~~~~~~~~~~~~
Special-purpose register UPR identifies the units present in the processor. It
has a bit for each implemented unit or functionality. Lower sixteen bits
identify present units defined in the OpenRISC 1000 architecture. Upper
sixteen bits define present custom units.
(((Register,UPR)))

[[upr_reg_table]]
.UPR Register
[width="95%",options="header"]
|============================================================
| Bit # | Access        | Reset | Short Name    | Description
| 0     | R             | 1     | UP            | UPR present
| 1     | R             | 1     | DCP           | Data cache present[†]
| 2     | R             | 1     | ICP           | Instruction cache present[†]
| 3     | R             | 1     | DMP           | Data MMU present[†]
| 4     | R             | 1     | IMP           | Instruction MMU present[†]
| 5     | R             | 1     | MP            | MAC present[†]
| 6     | R             | 1     | DUP           | Debug unit present[†]
| 7     | R             | 0     | PCUP          | Performance counters unit not present[†]
| 8     | R             | 1     | PMP           | Power Management Present[†]
| 9     | R             | 1     | PICP          | Programmable interrupt controller present
| 10    | R             | 1     | TTP           | Tick timer present
| 11    | R             | 1     | FPP           | Floating point present[†]
| 23:12 | R             | X     | -             | Reserved
| 31:24 | R             | 0xXXXX| CUP           | The user of the OR1200 core adds custom units.
|============================================================
[†]: if enabled at synthesis time
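
Software that probes the core can test the UPR bits one by one. The sketch
below follows the bit positions in the table above; the helper names and the
sample value are assumptions made for illustration.

```python
# Short names for UPR bits 0 to 11, in the order given in the UPR table.
UPR_BITS = ["UP", "DCP", "ICP", "DMP", "IMP", "MP",
            "DUP", "PCUP", "PMP", "PICP", "TTP", "FPP"]


def units_present(upr):
    """Return the short names of all unit-present bits set in a UPR value."""
    return [name for bit, name in enumerate(UPR_BITS) if (upr >> bit) & 1]


def custom_units(upr):
    """Return the CUP field, bits 31:24 in the UPR table."""
    return (upr >> 24) & 0xFF


# Illustrative value: bits 0-6 and 8-10 set, i.e. no PCUP and no FPP.
present = units_present(0x0000077F)
```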
 
Register CPUCFGR description
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Special-purpose register CPUCFGR identifies the capabilities and configuration
of the CPU.
(((Register,CPUCFGR)))

[[cpucfgr_reg_table]]
.CPUCFGR Register
[width="95%",options="header"]
|============================================================
| Bit # | Access        | Reset | Short Name    | Description
| 3:0   | R             | 0x0   | NSGF          | Zero number of shadow GPR files
| 4     | R             | 0     | HGF           | No half GPR files[†]
| 5     | R             | 1     | OB32S         | ORBIS32 supported
| 6     | R             | 0     | OB64S         | ORBIS64 not supported
| 7     | R             | 1     | OF32S         | ORFPX32 supported[‡]
| 8     | R             | 0     | OF64S         | ORFPX64 not supported
| 9     | R             | 0     | OV64S         | ORVDX64 not supported
|============================================================
[†]: If disabled at synthesis time

[‡]: If FPU enabled at synthesis time
 
Register DMMUCFGR description
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Special-purpose register DMMUCFGR identifies the capabilities and configuration
of the DMMU.
(((Register,DMMUCFGR)))

[[dmmucfgr_reg_table]]
.DMMUCFGR Register
[width="95%",options="header"]
|============================================================
| Bit # | Access        | Reset | Short Name    | Description
| 1:0   | R             | 0x0   | NTW           | One DTLB way
| 4:2   | R             | 0x4 - 0x7     | NTS   | 16, 32, 64 or 128 DTLB sets
| 7:5   | R             | 0x0   | NAE           | No ATB Entries
| 8     | R             | 0     | CRI           | No DMMU control register implemented
| 9     | R             | 0     | PRI           | No protection register implemented
| 10    | R             | 1     | TEIRI         | DTLB entry invalidate register implemented
| 11    | R             | 0     | HTR           | No hardware DTLB reload
|============================================================
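
The NTS values 0x4 to 0x7 map to 16 to 128 sets, so the set count is two
raised to the power NTS. The sketch below decodes the fields of the table
above on that basis. The function name is an assumption, and treating NTW as
"ways minus one" is also an assumption, since the table only documents the
value 0 (one way).

```python
def decode_mmucfgr(value):
    """Decode a DMMUCFGR value (the IMMUCFGR layout is the same)."""
    ntw = value & 0x3
    nts = (value >> 2) & 0x7
    return {
        "ways": ntw + 1,             # assumption beyond the documented NTW = 0
        "sets": 1 << nts,            # 0x4..0x7 -> 16, 32, 64, 128 sets
        "NAE": (value >> 5) & 0x7,
        "CRI": bool((value >> 8) & 1),
        "PRI": bool((value >> 9) & 1),
        "TEIRI": bool((value >> 10) & 1),
        "HTR": bool((value >> 11) & 1),
    }


# Illustrative value: NTW = 0, NTS = 6 (64 sets), TEIRI = 1.
cfg = decode_mmucfgr((6 << 2) | (1 << 10))
```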
 
Register IMMUCFGR description
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Special-purpose register IMMUCFGR identifies the capabilities and configuration
of the IMMU.
(((Register,IMMUCFGR)))

[[immucfgr_reg_table]]
.IMMUCFGR Register
[width="95%",options="header"]
|============================================================
| Bit # | Access        | Reset | Short Name    | Description
| 1:0   | R             | 0x0   | NTW           | One ITLB way
| 4:2   | R             | 0x4 - 0x7     | NTS   | 16, 32, 64 or 128 ITLB sets
| 7:5   | R             | 0x0   | NAE           | No ATB Entries
| 8     | R             | 0     | CRI           | No IMMU control register implemented
| 9     | R             | 0     | PRI           | No protection register implemented
| 10    | R             | 1     | TEIRI         | ITLB entry invalidate register implemented
| 11    | R             | 0     | HTR           | No hardware ITLB reload
|============================================================
 
Register DCCFGR description
~~~~~~~~~~~~~~~~~~~~~~~~~~~
Special-purpose register DCCFGR identifies the capabilities and configuration
of the data cache.
(((Register,DCCFGR)))

[[dccfgr_reg_table]]
.DCCFGR Register
[width="95%",options="header"]
|============================================================
| Bit # | Access        | Reset | Short Name    | Description
| 2:0   | R             | 0x0   | NCW           | One DC way
| 6:3   | R             | 0x4 - 0x7     | NCS   | 16, 32, 64 or 128 DC sets
| 7     | R             | 0x0   | CBS           | 16-byte cache block size
| 8     | R             | 0     | CWS           | Cache write-through strategy[†]
| 9     | R             | 1     | CCRI          | DC control register implemented
| 10    | R             | 1     | CBIRI         | DC block invalidate register implemented
| 11    | R             | 0     | CBPRI         | DC block prefetch register not implemented
| 12    | R             | 0     | CBLRI         | DC block lock register not implemented
| 13    | R             | 1     | CBFRI         | DC block flush register implemented
| 14    | R             | 1     | CBWBRI        | DC block write-back register implemented[‡]
|============================================================
[†]: If disabled at synthesis time

[‡]: If FPU enabled at synthesis time
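
The cache geometry follows from the NCW, NCS and CBS fields of the table
above. The following sketch computes it under stated assumptions: the function
name is hypothetical, NCW is treated as the base-2 logarithm of the number of
ways (the table only documents 0, one way), and CBS = 1 is taken to mean a
32-byte block (the table only documents 0, 16 bytes).

```python
def cache_geometry(cfgr):
    """Return (ways, sets, block_bytes, total_bytes) for a DCCFGR/ICCFGR value."""
    ncw = cfgr & 0x7           # bits 2:0
    ncs = (cfgr >> 3) & 0xF    # bits 6:3
    cbs = (cfgr >> 7) & 0x1    # bit 7
    ways = 1 << ncw            # assumption beyond the documented NCW = 0
    sets = 1 << ncs            # 0x4..0x7 -> 16, 32, 64, 128 sets
    block = 32 if cbs else 16  # assumption for CBS = 1
    return ways, sets, block, ways * sets * block


# Illustrative value: NCW = 0, NCS = 7 (128 sets), CBS = 0 (16-byte blocks).
geometry = cache_geometry(7 << 3)
```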
 
Register ICCFGR description
~~~~~~~~~~~~~~~~~~~~~~~~~~~
Special-purpose register ICCFGR identifies the capabilities and configuration
of the instruction cache.
(((Register,ICCFGR)))

[[iccfgr_reg_table]]
.ICCFGR Register
[width="95%",options="header"]
|============================================================
| Bit # | Access        | Reset | Short Name    | Description
| 2:0   | R             | 0x0   | NCW           | One IC way
| 6:3   | R             | 0x4 - 0x7     | NCS   | 16, 32, 64 or 128 IC sets
| 7     | R             | 0x0   | CBS           | 16-byte cache block size
| 8     | R             | 0     | CWS           | Cache write-through strategy
| 9     | R             | 1     | CCRI          | IC control register implemented
| 10    | R             | 1     | CBIRI         | IC block invalidate register implemented
| 11    | R             | 0     | CBPRI         | IC block prefetch register not implemented
| 12    | R             | 0     | CBLRI         | IC block lock register not implemented
| 13    | R             | 1     | CBFRI         | IC block flush register implemented
| 14    | R             | 0     | CBWBRI        | IC block write-back register not implemented
|============================================================
 
Register DCFGR description
~~~~~~~~~~~~~~~~~~~~~~~~~~
Special-purpose register DCFGR identifies the capabilities and configuration
of the debug unit.
(((Register,DCFGR)))

[[dcfgr_reg_table]]
.DCFGR Register
[width="95%",options="header"]
|============================================================
| Bit # | Access        | Reset | Short Name    | Description
| 3:0   | R             | 0x0   | NDP           | Zero DVR/DCR pairs[†]
| 4     | R             | 0     | WPCI          | Watchpoint counters not implemented
|============================================================
[†]: If hardware breakpoints disabled at synthesis time
 
((IO ports))
------------
The OR1200 IP core has several interfaces. <<core_interfaces_fig>> below shows
all interfaces:

* Instruction and data WISHBONE host interfaces
* Power management interface
* Development interface
* Interrupts interface

[[core_interfaces_fig]]
.Core's Interfaces
image::img/core_interfaces.gif[scaledwidth="50%",align="center"]
 
Instruction WISHBONE Master Interface
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
OR1200 has two master WISHBONE Rev B compliant interfaces. The instruction
interface connects the OR1200 core to the memory subsystem for the purpose of
fetching instructions or instruction cache lines.

[[inst_wb_master_table]]
.Instruction WISHBONE Master Interface Signals
[width="95%",options="header"]
|====================================================
| Port          | Width | Direction     | Description
| ((iwb_CLK_I)) | 1     | Input         | Clock input
| ((iwb_RST_I)) | 1     | Input         | Reset input
| ((iwb_CYC_O)) | 1     | Output        | Indicates valid bus cycle (core select)
| ((iwb_ADR_O)) | 32    | Output        | Address outputs
| ((iwb_DAT_I)) | 32    | Input         | Data inputs
| ((iwb_DAT_O)) | 32    | Output        | Data outputs
| ((iwb_SEL_O)) | 4     | Output        | Indicates valid bytes on data bus (during valid cycle it must be 0xf)
| ((iwb_ACK_I)) | 1     | Input         | Acknowledgment input (indicates normal transaction termination)
| ((iwb_ERR_I)) | 1     | Input         | Error acknowledgment input (indicates an abnormal transaction termination)
| ((iwb_RTY_I)) | 1     | Input         | In OR1200 treated same way as iwb_ERR_I.
| ((iwb_WE_O))  | 1     | Output        | Write transaction when asserted high
| ((iwb_STB_O)) | 1     | Output        | Indicates valid data transfer cycle
|====================================================
 
Data WISHBONE Master Interface
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
OR1200 has two master WISHBONE Rev B compliant interfaces. The data interface
connects the OR1200 core to external peripherals and the memory subsystem
for the purpose of reading and writing data or data cache lines.

[[data_wb_master_table]]
.Data WISHBONE Master Interface Signals
[width="95%",options="header"]
|====================================================
| Port          | Width | Direction     | Description
| ((dwb_CLK_I)) | 1     | Input         | Clock input
| ((dwb_RST_I)) | 1     | Input         | Reset input
| ((dwb_CYC_O)) | 1     | Output        | Indicates valid bus cycle (core select)
| ((dwb_ADR_O)) | 32    | Output        | Address outputs
| ((dwb_DAT_I)) | 32    | Input         | Data inputs
| ((dwb_DAT_O)) | 32    | Output        | Data outputs
| ((dwb_SEL_O)) | 4     | Output        | Indicates valid bytes on data bus (during valid cycle it must be 0xf)
| ((dwb_ACK_I)) | 1     | Input         | Acknowledgment input (indicates normal transaction termination)
| ((dwb_ERR_I)) | 1     | Input         | Error acknowledgment input (indicates an abnormal transaction termination)
| ((dwb_RTY_I)) | 1     | Input         | In OR1200 treated same way as dwb_ERR_I.
| ((dwb_WE_O))  | 1     | Output        | Write transaction when asserted high
| ((dwb_STB_O)) | 1     | Output        | Indicates valid data transfer cycle
|====================================================
 
System Interface
~~~~~~~~~~~~~~~~
The system interface connects reset, clock and other system signals to the
OR1200 core.

[[sys_interface_table]]
.System Interface Signals
[width="95%",options="header"]
|====================================================
| Port          | Width | Direction     | Description
| ((Rst))       | 1     | Input         | Asynchronous reset
| ((clk_cpu))   | 1     | Input         | Main clock input to the RISC
| ((clk_dc))    | 1     | Input         | Data cache clock
| ((clk_ic))    | 1     | Input         | Instruction cache clock
| ((clk_dmmu))  | 1     | Input         | Data MMU clock
| ((clk_immu))  | 1     | Input         | Instruction MMU clock
| ((clk_tt))    | 1     | Input         | Tick timer clock
|====================================================
 
Development Interface
~~~~~~~~~~~~~~~~~~~~~
The development interface connects an external development port to the RISC's
internal debug facility. The debug facility allows control over program
execution inside the RISC, setting of breakpoints and watchpoints, and tracing
of instruction and data flows.

[[dev_interface_table]]
.Development Interface
[width="95%",options="header"]
|====================================================
| Port          | Width | Direction     | Description
| ((dbg_dat_o)) | 32    | Output        | Transfer of data from RISC to external development interface
| ((dbg_dat_i)) | 32    | Input         | Transfer of data from external development interface to RISC
| ((dbg_adr_i)) | 32    | Input         | Address of special-purpose register to be read or written
| ((dbg_op_i))  | 3     | Input         | Operation select for development interface
| ((dbg_lss_o)) | 4     | Output        | Status of load/store unit
| ((dbg_is_o))  | 2     | Output        | Status of instruction fetch unit
| ((dbg_wp_o))  | 11    | Output        | Status of watchpoints
| ((dbg_bp_o))  | 1     | Output        | Status of the breakpoint
| ((dbg_stall_i))       | 1     | Input | Stalls RISC CPU core
| ((dbg_ewt_i)) | 1     | Input         | External watchpoint trigger
|====================================================
 
Power Management Interface
~~~~~~~~~~~~~~~~~~~~~~~~~~
The power management interface provides signals for interfacing the RISC core
with external power management circuitry. External power management circuitry
is required to implement functions that are technology specific and cannot be
implemented inside the OR1200 core.

[[pow_mgmt_interface_table]]
.Power Management Interface
[width="95%",options="header"]
|============================================================================
| Port                  | Width | Direction     | Generation            | Description
| ((pm_clksd))          | 4     | Output        | Static (in SW)        | Slow down outputs that control reduction of RISC clock frequency
| ((pm_cpustall))       | 1     | Input         | -                     | Synchronous stall of the RISC’s CPU core
| ((pm_dc_gate))        | 1     | Output        | Dynamic (in HW)       | Gating of data cache clock
| ((pm_ic_gate))        | 1     | Output        | Dynamic (in HW)       | Gating of instruction cache clock
| ((pm_dmmu_gate))      | 1     | Output        | Dynamic (in HW)       | Gating of data MMU clock
| ((pm_immu_gate))      | 1     | Output        | Dynamic (in HW)       | Gating of instruction MMU clock
| ((pm_tt_gate))        | 1     | Output        | Dynamic (in HW)       | Gating of tick timer clock
| ((pm_cpu_gate))       | 1     | Output        | Static (in SW)        | Gating of main CPU clock
| ((pm_wakeup))         | 1     | Output        | Dynamic (in HW)       | Activate all clocks
| ((pm_lvolt))          | 1     | Output        | Static (in SW)        | Lower voltage
|============================================================================
 
Interrupt Interface
~~~~~~~~~~~~~~~~~~~
The interrupt interface has interrupt inputs for connecting the interrupt
outputs of external peripherals to the RISC core. All interrupt inputs are
evaluated on the positive edge of the main RISC clock.

[[interrupt_interface_table]]
.Interrupt Interface
[width="95%",options="header"]
|============================================================
| Port          | Width         | Direction     | Description
| ((pic_ints))  | PIC_INTS      | Input         | External interrupts
|============================================================
 
 
[appendix]
Core HW Configuration
=====================
(((Hardware,Configuration)))
This section describes the parameters that are set by the user of the core and
define the configuration of the core. The parameters must be set before the
core is used in simulation or synthesis.

[[core_hw_conf_table]]
.Core HW configuration table
[width="95%",options="header"]
|============================================================
| Variable Name | Range         | Default       | Description
| ((EADDR_WIDTH))       | 32    | 32    | Effective address width
| ((VADDR_WIDTH))       | 32    | 32    | Virtual address width
| ((PADDR_WIDTH))       | 24 - 36 | 32  | Physical address width
| ((DATA_WIDTH))        | 32    | 32    | Data width / Operation width
| ((DC_IMPL))   | 0 - 1         | 1     | Data cache implementation
| ((DC_SETS))   | 256 - 1024    | 512   | Data cache number of sets
| ((DC_WAYS))   | 1             | 1     | Data cache number of ways
| ((DC_LINE))   | 16 - 32       | 16    | Data cache line size in bytes
| ((IC_IMPL))   | 0 - 1         | 1     | Instruction cache implementation
| ((IC_SETS))   | 32 - 1024     | 512   | Instruction cache number of sets
| ((IC_WAYS))   | 1             | 1     | Instruction cache number of ways
| ((IC_LINE))   | 16 - 32       | 16    | Instruction cache line size in bytes
| ((DMMU_IMPL)) | 0 - 1         | 1     | Data MMU implementation
| ((DTLB_SETS)) | 64            | 64    | Data TLB number of sets
| ((DTLB_WAYS)) | 1             | 1     | Data TLB number of ways
| ((IMMU_IMPL)) | 0 - 1         | 1     | Instruction MMU implementation
| ((ITLB_SETS)) | 64            | 64    | Instruction TLB number of sets
| ((ITLB_WAYS)) | 1             | 1     | Instruction TLB number of ways
| ((PIC_INTS))  | 2 - 32        | 20    | Number of interrupt inputs
|============================================================
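
A build script could check user-supplied parameters against the ranges in the
table above before starting simulation or synthesis. The following is only a
sketch of that idea: the names +CONFIG_RANGES+ and +resolve_config+ are
hypothetical, only the parameters with a real range are listed, and the check
does not model any further constraints the core may impose.

```python
# name -> (minimum, maximum, default), copied from the configuration table.
CONFIG_RANGES = {
    "PADDR_WIDTH": (24, 36, 32),
    "DC_IMPL": (0, 1, 1),
    "DC_SETS": (256, 1024, 512),
    "DC_LINE": (16, 32, 16),
    "IC_IMPL": (0, 1, 1),
    "IC_SETS": (32, 1024, 512),
    "IC_LINE": (16, 32, 16),
    "DMMU_IMPL": (0, 1, 1),
    "IMMU_IMPL": (0, 1, 1),
    "PIC_INTS": (2, 32, 20),
}


def resolve_config(overrides=None):
    """Merge overrides into the defaults, rejecting out-of-range values."""
    config = {name: default for name, (_, _, default) in CONFIG_RANGES.items()}
    for name, value in (overrides or {}).items():
        if name not in CONFIG_RANGES:
            raise KeyError("unknown parameter: " + name)
        low, high, _ = CONFIG_RANGES[name]
        if not low <= value <= high:
            raise ValueError("%s=%r is outside %d..%d" % (name, value, low, high))
        config[name] = value
    return config


config = resolve_config({"PIC_INTS": 8})
```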
 
:numbered!:
 
[bibliography]
((Bibliography))
================
[bibliography]
- [[[or1000_manual]]] Damjan Lampret et al. 'OpenRISC 1000 System Architecture
  Manual'. 2004.
 
[index]
Index
=====
// The index is generated automatically by the DocBook toolchain.
