OpenCores
URL https://opencores.org/ocsvn/openrisc/openrisc/trunk

OpenRISC 1200 IP Core Specification (Preliminary Draft)
=======================================================
:doctype: book

////
Revision history
Note: When adding new entries, strictly follow the format of the existing ones.

Rev.    | Date          | Author        | Description
__vstart__
v0.1    | 28/3/01       | Damjan Lampret        | First Draft

v0.2    | 16/4/01       | Damjan Lampret        | First time published

v0.3    | 29/4/01       | Damjan Lampret        | All chapters almost
finished. Some bugs hidden waiting for an update. Awaiting feedback.

v0.4    | 16/5/01       | Damjan Lampret        | Synchronization with
OR1K Arch Manual

v0.5    | 24/5/01       | Damjan Lampret        | Fixed bugs

v0.6    | 28/5/01       | Damjan Lampret        | Changed some SPR addresses.

v0.7    | 06/9/01       | Damjan Lampret        | Simplified debug unit.

v0.8    | 30/08/10      | Julius Baxter         | Adding information about FPU
implementation, data cache write-back capability. PIC behavior update.
Instruction list update. Update of bits in config registers, bringing into
line with latest OR1200 - not entirely complete.

v0.9    | 12/9/10       | Julius Baxter         | Clarified supported parts of
OR1K instruction set. Updated core clock input information.
Fixed up reference to instruction execute stage cycle table.
Added divide cycles to execute stage cycle table.

0.10    | 1/11/10       | Julius Baxter         | Added FF1/FL1 instructions to
supported instructions table.

v0.11   | 19/1/11       | Julius Baxter | Cache information update.
Wishbone behavior clarification. Serial integer multiply/divide update.
Reset address clarification

v0.12   | 13/9/11       | Julius Baxter | Addition of extension instructions
l.extbs, l.extbz, l.exths, l.exthz, l.extws and l.extwz. Range exception
support, overflow bit in supervision register.

v0.13   | 27/5/12       | Julius Baxter | Addition of support for delay-slot
exception indicator bit in supervision register
__vend__
////
 
Introduction
------------
The purpose of this document is to define the specification of the OpenRISC
1200 implementation. It defines all implementation-specific variables that are
not part of the general architecture specification. This includes the type and
size of the data and instruction caches, the type and size of the data and
instruction MMUs, details of all execution pipelines, and the implementation
of the exception unit, interrupt controller and other supplemental units.
This document does not cover general architecture topics such as the
instruction set, memory addressing modes and other architectural definitions.
See the OpenRISC 1000 architecture manual for more information about the
architecture.
 
OpenRISC Family
~~~~~~~~~~~~~~~
(((OpenRISC,Family)))
OpenRISC 1000 is an architecture for a family of free, open source RISC
processor cores. As an architecture, OpenRISC 1000 allows for a spectrum of
chip and system implementations at a variety of price/performance points for a
range of applications. It is a 32/64-bit load and store RISC architecture
designed with emphasis on performance, simplicity, low power requirements,
scalability and versatility. The OpenRISC 1000 architecture targets medium and
high performance networking, embedded, automotive and portable computer
environments.

image::img/or_family.gif[scaledwidth="50%",align="center"]

All OpenRISC implementations whose identification number begins with the
digit 1 belong to the OpenRISC 1000 family. The second digit defines which
features of the OpenRISC 1000 architecture are implemented and in which way
they are implemented. The last two digits define how an implementation is
configured before it is used in a real application.
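
As a rough illustration of the numbering scheme, the sketch below splits an
identifier into the three fields just described. The function name
`parse_openrisc_id` and the field names are hypothetical, and deriving a family
number by multiplying the first digit by 1000 is an assumption generalized from
the single OpenRISC 1000 example given here.

```python
def parse_openrisc_id(ident):
    """Split an identifier such as 'OR1200' into family, feature and config fields."""
    text = ident.upper()
    digits = text[2:] if text.startswith("OR") else text
    if len(digits) != 4 or not digits.isdigit():
        raise ValueError("expected 'OR' followed by four digits, got %r" % ident)
    return {
        "family": int(digits[0]) * 1000,  # assumption: 1 -> OpenRISC 1000 family
        "feature_set": int(digits[1]),    # which features are implemented, and how
        "configuration": digits[2:],      # configuration applied before deployment
    }


fields = parse_openrisc_id("OR1200")
```

Under this reading, `OR1200` belongs to the 1000 family, with feature set 2 and
configuration code `00`.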
 
However, at present the OR1200 is the only major RTL implementation of the
OR1K architecture specification, and the OR1200 name has stuck, even though
the high level of reconfigurability means that, strictly speaking, a given
configuration could be an OR1000, OR1300, etc. So, regardless of which
features are implemented, the core is still referred to only as the OR1200.

OpenRISC 1200
~~~~~~~~~~~~~
(((OpenRISC,1200)))
The OR1200 is a 32-bit scalar RISC with Harvard microarchitecture, 5-stage
integer pipeline, virtual memory support (MMU) and basic DSP capabilities.
The default caches are a 1-way direct-mapped 8KB data cache and a 1-way
direct-mapped 8KB instruction cache, each with 16-byte line size. Both caches
are physically tagged. By default MMUs are implemented, and they consist of a
64-entry hash based 1-way direct-mapped data TLB and a 64-entry hash based
1-way direct-mapped instruction TLB.

Supplemental facilities include a debug unit for real-time debugging, a high
resolution tick timer, a programmable interrupt controller and power management
support. When implemented in a typical 0.18u 6LM process it should provide
over 300 Dhrystone 2.1 MIPS at 300MHz and 300 DSP MAC 32x32 operations, at
least 20% more than any other competitor in this class. The OR1200 in its
default configuration has about 1M transistors.

The OR1200 is intended for embedded, portable and networking applications. It
can successfully compete with the latest scalar 32-bit RISC processors in its
class and can efficiently run any modern operating system. Competitors include
ARM10, ARC and Tensilica RISC processors.

Features
^^^^^^^^
The following lists the main features of the OR1200 IP core:

- All major characteristics of the core can be set by the user
- High performance of 300 Dhrystone 2.1 MIPS at 300 MHz using 0.18u process
- High performance cache and MMU subsystems
- WISHBONE SoC Interconnection Rev. B3 compliant interface
 
Architecture
------------
<<core_arch_fig>> below shows the general architecture of the OR1200 IP core.
It consists of several building blocks:

- CPU/FPU/DSP central block
- Direct-mapped data cache
- Direct-mapped instruction cache
- Data MMU based on hash based DTLB
- Instruction MMU based on hash based ITLB
- Power management unit and power management interface
- Tick timer
- Debug unit and development interface
- Interrupt controller and interrupt interface
- Instruction and Data WISHBONE host interfaces

[[core_arch_fig]]
.Core's Architecture
image::img/core_arch.gif[scaledwidth="50%",align="center"]

CPU/FPU/DSP
~~~~~~~~~~~
((CPU))/((FPU))/((DSP)) is the central part of the OR1200 RISC processor.
<<cpu_fpu_dsp_fig>> shows a basic block diagram of the CPU/DSP. The FPU
components are not pictured. The OR1200 CPU/FPU/DSP only implements sections
of the ORBIS32 and ORFPX32 instruction sets. No ((ORBIS64)), ((ORFPX64)) or
((ORVDX64)) instructions are implemented in the OR1200.

[[cpu_fpu_dsp_fig]]
.CPU/FPU/DSP Block Diagram
image::img/cpu_fpu_dsp.gif[scaledwidth="50%",align="center"]
 
Instruction unit
^^^^^^^^^^^^^^^^
The instruction unit implements the basic instruction pipeline, fetches
instructions from the memory subsystem, dispatches them to available execution
units, and maintains a state history to ensure a precise exception model
and that operations finish in order. It also executes conditional branch
and unconditional jump instructions.

The sequencer can dispatch a sequential instruction on each clock if the
appropriate execution unit is available. The execution unit must discern
whether source data is available and ensure that no other instruction is
targeting the same destination register.

The instruction unit handles only ((ORBIS32)) and, optionally, a subset of the
((ORFPX32)) instruction class. Some ((ORFPX32)) and all ((ORFPX64)) and
((ORVDX64)) instruction classes are not supported by the OR1200 at present.

General-Purpose Registers
^^^^^^^^^^^^^^^^^^^^^^^^^
OpenRISC 1200 implements 32 general-purpose 32-bit ((registers)). The OpenRISC
1000 architecture also supports shadow copies of the register file to implement
fast switching between working contexts; however, this feature is not
implemented in the current OR1200 implementation.

The OR1200 implements the general-purpose register file as two synchronous
dual-port memories with a capacity of 32 words by 32 bits per word.

Load/Store Unit
^^^^^^^^^^^^^^^
The ((load/store unit (LSU))) transfers all data between the GPRs and the CPU's
internal bus. It is implemented as an independent execution unit so that stalls
in the memory subsystem only affect the master pipeline if there is a data
dependency.

The following are the LSU's main features:

- all load/store instructions implemented in hardware (atomic instructions
  included)
- address entry buffer
- pipelined operation
- aligned accesses for fast memory access

When load and store instructions are issued, the LSU determines if all
operands are available. These operands include the following:

- address register operand
- source data register operand (for store instructions)
- destination data register operand (for load instructions)
 
Integer Execution Pipeline
^^^^^^^^^^^^^^^^^^^^^^^^^^
(((Pipeline, Integer Execution)))
The core implements the following types of 32-bit integer instructions:

- Arithmetic instructions
- Compare instructions
- Logical instructions
- Rotate and shift instructions

Most integer instructions can execute in one cycle. For details about timing
see <<exec_time_int_table>>.

MAC Unit
^^^^^^^^
The ((MAC)) unit executes DSP MAC operations. MAC operations are 32x32 with
a 48-bit accumulator. The MAC unit is fully pipelined and can accept a new MAC
operation in each clock cycle.

Floating Point Unit
^^^^^^^^^^^^^^^^^^^
(((Floating Point Unit)))
The ((FPU)) implementation is based on two other FPUs available from
OpenCores.org. For the comparison and conversion functions, parts were taken
from the FPU project by Rudolf Usselmann, and for the arithmetic operations,
the fpu100 project by Jidan Al-Eryani was converted to Verilog HDL.

All ((ORFPX32)) instructions except for ((lf.madd.s)) and ((lf.rem.s)) are
supported when the FPU is enabled in the OR1200 configuration.

System Unit
^^^^^^^^^^^
The ((system unit)) connects all other signals of the CPU/FPU/DSP that are not
connected through instruction and data interfaces. It also implements all
system special-purpose registers (e.g. supervisor register).

Exceptions
^^^^^^^^^^
Core exceptions can be generated when an exception condition occurs.
((Exception sources)) in OR1200 include the following:

- External interrupt request
- Certain memory access conditions
- Internal errors, such as an attempt to execute an unimplemented opcode
- System call
- Internal exception, such as breakpoint exceptions
- Arithmetic overflow

((Exception handling)) is transparent to user software and uses the same
mechanism to handle all types of exceptions. When an exception is taken,
control is transferred to an exception handler at an offset defined for
the type of exception encountered. Exceptions are handled in supervisor mode.

Exceptions caused by instructions in a delay slot will set the supervision
register's DSX bit.
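
To make "an offset defined for the type of exception" concrete, the sketch
below computes a handler address as prefix plus offset. The vector offsets and
the 0xF0000000 prefix selected by the EPH bit are taken from the OpenRISC 1000
architecture manual rather than from this document, so treat them as
assumptions here; the names `VECTOR_OFFSETS` and `handler_address` are
illustrative only.

```python
# Offsets per the OpenRISC 1000 architecture manual (assumed, not defined here).
VECTOR_OFFSETS = {
    "bus_error": 0x200,
    "tick_timer": 0x500,
    "illegal_instruction": 0x700,
    "external_interrupt": 0x800,
    "range": 0xB00,
    "system_call": 0xC00,
    "trap": 0xE00,
}


def handler_address(exception, eph=False):
    """Return the handler entry address for an exception type."""
    # Assumption: EPH set moves the vectors to the high prefix 0xF0000000.
    prefix = 0xF0000000 if eph else 0x00000000
    return prefix + VECTOR_OFFSETS[exception]


irq_vector = handler_address("external_interrupt")
syscall_vector_high = handler_address("system_call", eph=True)
```

The reset case is left out on purpose, since the OR1200 reset address is set
separately at synthesis time.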
 
Data Cache
~~~~~~~~~~
The default configuration of the OR1200 data ((cache)) is an 8-Kbyte, 1-way
direct-mapped data cache, which allows rapid core access to data. However, the
data cache can be configured according to <<data_confs_or1200_table>>.

[[data_confs_or1200_table]]
.Possible Data Cache Configurations of OR1200
[width="60%",options="header"]
|======================================================
|                                       | Direct mapped
| 16B/line, 256 lines, 1 way            | 4KB
| 16B/line, 512 lines, 1 way            | *8KB (default)*
| 16B/line, 1024 lines, 1 way           | 16KB
| 32B/line, 1024 lines, 1 way           | 32KB
|======================================================
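
The capacities in the table follow from line size times number of lines. The
sketch below assumes textbook direct-mapped indexing on a 32-bit physical
address (the actual RTL field widths are not stated in this document); it
computes the capacity and splits an address into tag, index and offset. The
function names are illustrative only.

```python
def cache_geometry(line_bytes, lines, addr_bits=32):
    """Return capacity and address field widths for a 1-way direct-mapped cache."""
    offset_bits = line_bytes.bit_length() - 1  # log2, valid for power-of-two sizes
    index_bits = lines.bit_length() - 1
    return {
        "size_bytes": line_bytes * lines,
        "offset_bits": offset_bits,
        "index_bits": index_bits,
        "tag_bits": addr_bits - index_bits - offset_bits,
    }


def split_address(addr, line_bytes, lines):
    """Split a physical address into (tag, index, offset)."""
    offset = addr & (line_bytes - 1)            # byte within the line
    index = (addr // line_bytes) & (lines - 1)  # which line of the cache
    tag = addr // (line_bytes * lines)          # remaining upper address bits
    return tag, index, offset


default = cache_geometry(line_bytes=16, lines=512)  # the 8KB default row
fields = split_address(0x2A34, line_bytes=16, lines=512)
```

For the default row this gives 8192 bytes with 4 offset bits, 9 index bits and
19 tag bits under the stated assumption.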
 
It is possible to operate the data cache with a write-through or write-back
strategy; however, write-back is currently experimental.

Features:

- data cache is separate from instruction cache (Harvard architecture)
- data cache implements a least-recently used (LRU) replacement algorithm
  within each set
- the cache directory is physically addressed. The physical address tag is
  stored in the cache directory
- write-through or write-back operation
- entire cache can be disabled, lines invalidated, flushed or forced to be
  written back, by writing to cache special purpose registers

On a miss, and under appropriate conditions, the cache line is filled or
emptied (written back) with 16-byte bursts. The burst fill is performed as a
critical-word-first operation; the critical word is simultaneously written
to the cache and forwarded to the requesting unit, thus minimizing stalls
due to cache fill latency. The data cache provides storage for cache tags and
performs the cache line replacement function.
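
The critical-word-first fill can be sketched as follows. This is a minimal
illustration, assuming 32-bit words and that the remaining words follow the
critical word in wrap-around order within the line; the text above only states
that the critical word comes first, so the order of the rest is an assumption.
The name `burst_fill_order` is hypothetical.

```python
def burst_fill_order(miss_addr, line_bytes=16, word_bytes=4):
    """List the word addresses of one line fill, critical word first."""
    words = line_bytes // word_bytes               # beats in the burst
    line_base = miss_addr & ~(line_bytes - 1)      # line-aligned start address
    first = (miss_addr & (line_bytes - 1)) // word_bytes
    # Assumption: after the critical word, continue and wrap within the line.
    return [line_base + ((first + i) % words) * word_bytes for i in range(words)]


order = burst_fill_order(0x1008)
```

With a 16-byte line the burst has four beats; a miss at 0x1008 starts with the
third word of the line under this assumption.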
 
The data cache is tightly coupled to the external interface to allow efficient
access to the system memory controller.

The data cache supplies data to the GPRs by means of a 32-bit interface
to the load/store unit. The LSU provides all logic required to calculate
effective addresses, handles data alignment to and from the data cache,
and provides sequencing for load and store operations. Write operations to
the data cache can be performed on a byte, half-word or word basis.

image::img/data_cache_diag.gif[scaledwidth="50%",align="center"]

Each line contains four contiguous words from memory that are loaded from
a cache line aligned boundary. As a result, cache lines never cross page
boundaries.

Instruction Cache
~~~~~~~~~~~~~~~~~
The default configuration of the OR1200 instruction ((cache)) is an 8-Kbyte,
1-way direct-mapped instruction cache, which allows rapid core access to
instructions. However, the instruction cache can be configured according to
<<inst_confs_or1200_table>>.

[[inst_confs_or1200_table]]
.Possible Instruction Cache Configurations of OR1200
[width="60%",options="header"]
|==============================================
|                               | Direct mapped
| 16B/line, 32 lines, 1 way     | 512B
| 16B/line, 256 lines, 1 way    | 4KB
| 16B/line, 512 lines, 1 way    | *8KB (Default)*
| 16B/line, 1024 lines, 1 way   | 16KB
| 32B/line, 1024 lines, 1 way   | 32KB
|==============================================

Features:

- instruction cache is separate from data cache (Harvard architecture)
  (((Architecture,Harvard)))
- instruction cache implements a least-recently used (LRU) replacement
  algorithm within each set
  ((LRU))
- the ((cache directory)) is physically addressed. The physical address tag is
  stored in the cache directory
- it can be disabled or invalidated by writing to cache special purpose
  registers

On a miss, the cache is filled with 16-byte bursts. The burst fill
is performed as a critical-word-first operation; the critical word is
simultaneously written to the cache and forwarded to the requesting unit,
thus minimizing stalls due to cache fill latency. The instruction cache
provides storage for cache tags and performs the cache line replacement
function.

The instruction cache is tightly coupled to the external interface to allow
efficient access to the system memory controller.

The instruction cache supplies instructions to the instruction sequencer by
means of a 32-bit interface to the instruction fetch subunit. The instruction
fetch subunit provides all logic required to calculate effective addresses.

image::img/inst_cache_diag.gif[scaledwidth="50%",align="center"]

Each line contains four contiguous words from memory that are loaded from
a line-size aligned boundary. As a result, cache lines never cross page
boundaries.
 
Data MMU
~~~~~~~~
(((MMU, Data)))
The OR1200 implements a ((virtual memory management)) scheme that
provides memory access protection and effective-to-physical address
translation. ((Protection)) granularity is as defined by the OpenRISC 1000
architecture - 8-Kbyte and 16-Mbyte pages.

[[data_tlb_confs_or1200_table]]
.Possible Data TLB Configurations of OR1200
[width="60%",options="header"]
|======================================
|                       | Direct mapped
| 16 entries per way    | 16 DTLB entries
| 32 entries per way    | 32 DTLB entries
| 64 entries per way    | *64 DTLB entries (default)*
| 128 entries per way   | 128 DTLB entries
|======================================

Features:

* data MMU is separate from instruction MMU
* page size 8-Kbyte
* comprehensive page protection scheme
* direct-mapped hash based translation lookaside buffer (DTLB) with the
  default of 1 way and the following features:
** miss and fault exceptions
** software tablewalk
** high performance because of hash based design
** variable number of DTLB entries with default of 64 per way

image::img/tlb_diag.gif[scaledwidth="50%",align="center"]

The MMU hardware supports two-level software tablewalk.
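
A direct-mapped TLB lookup with software refill can be sketched as below. This
is a simplified model, assuming 8-Kbyte pages and that the "hash" is simply the
low-order bits of the virtual page number; the document does not specify the
hash function, and protection bits are omitted. The class and method names are
hypothetical.

```python
PAGE_BITS = 13  # 8-Kbyte pages


class DirectMappedTLB:
    """Simplified 1-way TLB: one (vpn, ppn) pair per slot, no protection bits."""

    def __init__(self, entries=64):
        self.entries = entries
        self.slots = [None] * entries

    def lookup(self, ea):
        """Translate an effective address; return None on a TLB miss."""
        vpn = ea >> PAGE_BITS
        offset = ea & ((1 << PAGE_BITS) - 1)
        slot = self.slots[vpn % self.entries]  # assumption: index = low VPN bits
        if slot is None or slot[0] != vpn:
            return None  # hardware would raise a miss exception here
        return (slot[1] << PAGE_BITS) | offset

    def refill(self, vpn, ppn):
        """Software tablewalk result written back by the miss handler."""
        self.slots[vpn % self.entries] = (vpn, ppn)


tlb = DirectMappedTLB()
ea = 0x40001234
first_try = tlb.lookup(ea)           # miss: nothing loaded yet
tlb.refill(ea >> PAGE_BITS, 0x123)   # handler installs the translation
second_try = tlb.lookup(ea)          # hit
```

Because each virtual page maps to exactly one slot, refilling a page whose
number differs by a multiple of the entry count evicts the earlier entry.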
 
Instruction MMU
~~~~~~~~~~~~~~~
(((MMU, Instruction)))
The OR1200 implements a virtual memory management scheme that provides memory
access protection and effective-to-physical address translation. Protection
granularity is as defined by the OpenRISC 1000 architecture - 8-Kbyte and
16-Mbyte pages.

[[inst_tlb_confs_or1200_table]]
.Possible Instruction TLB Configurations of OR1200
[width="60%",options="header"]
|======================================
|                       | Direct mapped
| 16 entries per way    | 16 ITLB entries
| 32 entries per way    | 32 ITLB entries
| 64 entries per way    | *64 ITLB entries (default)*
| 128 entries per way   | 128 ITLB entries
|======================================

Features:

* instruction MMU is separate from data MMU
* page size 8-Kbyte
* comprehensive page protection scheme
* 1-way direct-mapped hash based translation lookaside buffer (ITLB) with the
  following features:
** miss and fault exceptions
** software tablewalk
** high performance because of hash based design
** variable number of ITLB entries with default of 64 entries per way

image::img/inst_mmu_diag.gif[scaledwidth="50%",align="center"]

The MMU hardware supports two-level software tablewalk.
 
Programmable Interrupt Controller
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The ((interrupt)) controller receives interrupts from external sources and
forwards them as low or high priority interrupt exceptions to the CPU core.

[[interrupt_controller_fig]]
.Block Diagram of the Interrupt Controller
image::img/interrupt_controller.gif[scaledwidth="50%",align="center"]

The programmable interrupt controller has three special-purpose registers and
32 interrupt inputs. Interrupt inputs 0 and 1 are always enabled and connected
to the high and low priority interrupt inputs, respectively.

The 30 other interrupt inputs can be masked and assigned low or high priority
through programming special-purpose registers.
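
The masking and priority rules above can be sketched with plain bit
operations. This is only a behavioural illustration: the parameter names
`mask` and `high_priority` are hypothetical stand-ins for the special-purpose
register contents, whose actual layout is not given in this section.

```python
ALWAYS_ENABLED = 0b11  # inputs 0 and 1 cannot be masked
ALL_INPUTS = 0xFFFFFFFF


def pending_interrupts(lines, mask):
    """Return the asserted inputs that are allowed through the mask."""
    return lines & (mask | ALWAYS_ENABLED) & ALL_INPUTS


def split_by_priority(pending, high_priority):
    """Split pending inputs into (high, low) priority groups."""
    # Input 0 is always high priority and input 1 always low priority.
    high_select = (high_priority | 0b01) & ~0b10 & ALL_INPUTS
    return pending & high_select, pending & ~high_select & ALL_INPUTS


masked = pending_interrupts(0b1011, mask=0)
unmasked = pending_interrupts(0b1011, mask=0b1000)
high, low = split_by_priority(unmasked, high_priority=0b1000)
```

With everything masked, only inputs 0 and 1 get through; enabling input 3 and
marking it high priority routes it alongside input 0.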
 
Tick Timer
~~~~~~~~~~
The OR1200 implements a tick ((timer)) facility. This is a timer that is
clocked by the RISC clock and is used by the operating system to precisely
measure time and schedule system tasks.

The OR1200 precisely follows the architectural definition of the tick timer
facility:

* Maximum timer count of 2^32 clock cycles
* Maximum time period of 2^28 clock cycles between interrupts
* Maskable tick timer interrupt
* Single run, restartable or continuous timer

The tick timer operates from an independent clock source so that doze power
management mode can be implemented.
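
The limits above translate into wall-clock time once a clock frequency is
chosen. The sketch below is plain arithmetic on those two limits; the helper
name `tick_period_cycles` and the example frequencies are assumptions, not
values prescribed by this specification.

```python
MAX_PERIOD_CYCLES = 2 ** 28  # longest interval between tick interrupts
MAX_COUNT_CYCLES = 2 ** 32   # range of the timer count


def tick_period_cycles(clock_hz, ticks_per_second):
    """Return the period in clock cycles for a desired tick rate."""
    cycles = clock_hz // ticks_per_second
    if cycles < 1 or cycles > MAX_PERIOD_CYCLES:
        raise ValueError("tick rate not reachable with a 2^28 cycle period limit")
    return cycles


period = tick_period_cycles(clock_hz=50000000, ticks_per_second=100)
longest_interval_s = MAX_PERIOD_CYCLES / 300e6  # at a 300 MHz clock
count_wrap_s = MAX_COUNT_CYCLES / 300e6
```

At 300 MHz the longest interval between interrupts is a little under 0.9
seconds, so a one-second tick would need software to count several shorter
periods.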
 
Power Management Support
~~~~~~~~~~~~~~~~~~~~~~~~
To optimize ((power consumption)), the OR1200 provides ((low-power)) modes that
can be used to dynamically activate and deactivate certain internal modules.

The OR1200 has the following major features to minimize power consumption:

* Slow and Idle Modes (SW controlled clock freq reduction)
* Doze and Sleep Modes (interrupt wake-up)

[[power_consumption_table]]
.Power Consumption
[width="60%",options="header"]
|===================================================================
| Power Minimization Feature    | Approx Power Consumption Reduction
| Slow and Idle mode            | 2x - 10x
| Doze mode                     | 100x
| Sleep mode                    | 200x
| Dynamic clock gating          | N/A
|===================================================================

Slow down mode takes advantage of the low-power dividers in external clock
generation circuitry to enable full functionality, but at a lower frequency
so that power consumption is reduced. The 4 bits of PMR[SDF] are broadcast on
pm_clksd, and the external clock generation for the RISC should adapt the RISC
clock frequency according to the value on pm_clksd.

When software initiates doze mode, software processing on the core
suspends. The clocks to the RISC internal modules are disabled except to
the tick timer. However, any other on-chip blocks can continue to function
as normal. The OR1200 will leave doze mode and enter normal mode when a
pending interrupt occurs.

In sleep mode, all OR1200 internal units are disabled and clocks
gated. Optionally, an implementation may choose to lower the operating voltage
of the OR1200 core. The OR1200 should leave sleep mode and enter normal
mode when a pending interrupt occurs.

Dynamic ((Clock gating)) (unit clock gating on a clock-by-clock basis) is not
supported by the OR1200.
 
Debug unit
~~~~~~~~~~
The ((Debug unit)) assists software developers in debugging their systems. It
provides support only for basic debugging and does not support the more
advanced debug features of the OpenRISC 1000 architecture such as watchpoints,
breakpoints and program-flow control registers.

[[debug_unit_fig]]
.Block Diagram of Debug Unit
image::img/debug_unit_diag.gif[scaledwidth="50%",align="center"]

Watchpoints and breakpoints are events triggered by program- or data-flow
matching the conditions programmed in the debug registers. Breakpoints,
unlike watchpoints, also suspend execution of the current program flow and
start a breakpoint exception.

Clocks & Reset
~~~~~~~~~~~~~~
The OR1200 core has a ((clock)) input each for the instruction and data Wishbone
interface logic, and for the CPU core. Clock input clk_cpu clocks everything
behind the Wishbone interfaces, that is, the CPU core itself. The data Wishbone
interface is clocked by dwb_clk_i, and the instruction Wishbone interface is
clocked by iwb_clk_i.

The OR1200 has an asynchronous ((reset)) signal. The reset signal rst, when
asserted high, immediately resets all flip-flops inside the OR1200. When it is
deasserted, the OR1200 will start the reset exception.

WISHBONE Interfaces
~~~~~~~~~~~~~~~~~~~
Two ((WISHBONE)) interfaces connect the OR1200 core to external peripherals and
the external memory subsystem. They are compliant with the WISHBONE SoC
Interconnection specification Rev. B3. The implementation uses a 32-bit bus
width and does not support other bus widths.

Wishbone registered-feedback incrementing burst accesses occur, unless
disabled, when cache lines are filled. The burst size (in beats) is determined
by the cache line size.

image::img/wb_compatible.png[scaledwidth="30%",align="center"]
 
Operation
---------
This section describes the operation of the OR1200 core. For operations
that pertain to the architectural definitions, see the OpenRISC 1000
architecture manual.

Reset
~~~~~
The OR1200 has one asynchronous ((reset)) signal that can be used for soft and
hard resets at higher system hierarchy levels.

[[powerup_sequence_fig]]
.Power-Up and Reset Sequence
image::img/powerup_seq.gif[scaledwidth="70%",align="center"]

<<powerup_sequence_fig>> shows how the asynchronous reset is applied after
powering up the OR1200 core. Reset is connected to the asynchronous reset of
almost all flip-flops inside the RISC core. Special care must be taken to meet
the hold and setup times of all flip-flops relative to the main RISC clock.

If the system implements gated clocks, then clock gating can be used to ensure
proper reset timing.

[[powerup_sequence_gatedclk_fig]]
.Power-Up and Reset Sequence w/ Gated Clock
image::img/powerup_seq_gatedclk.gif[scaledwidth="70%",align="center"]

The address the PC assumes at hard reset (assertion of the external reset
signal) is definable at synthesis time via the OR1200_BOOT_ADR define. This is
not to be confused with the ability to set the exception prefix address with
the EPH bit.

CPU/FPU/DSP
~~~~~~~~~~~
((CPU))/((FPU))/((DSP)) is an implementation of the 32-bit part of the OpenRISC
1000 architecture, and only a subset of all features is implemented.
 
Instructions
^^^^^^^^^^^^
(((OpenRISC 1200, Instruction List)))
The following table lists the instructions implemented in the OR1200. Those
optionally implemented are indicated as such.

// The table below is split into several columns for readability by the
// preprocessing script. It is better to have this automated because
// given the pseudo-lexicographical ordering, adding a new instruction
// would require manual changes in all subsequent columns, which is
// tedious and error-prone.
//
// When changing the column headers, remember to change the script accordingly.

[[instructions_table]]
.Instructions implemented in OR1200
[width="95%",options="header"]
|=================================
| Instruction mnemonic  | Optional
| ((l.add))             |
| ((l.addc))            | Yes
| ((l.addi))            |
| ((l.and))             |
| ((l.andi))            |
| ((l.bf))              |
| ((l.bnf))             |
| ((l.div))             | Yes
| ((l.extbs))           | Yes
| ((l.extbz))           | Yes
| ((l.exths))           | Yes
| ((l.exthz))           | Yes
| ((l.extws))           | Yes
| ((l.extwz))           | Yes
| ((l.ff1))             | Yes
| ((l.fl1))             | Yes
| ((l.j))               |
| ((l.jal))             |
| ((l.jalr))            |
| ((l.jr))              |
| ((l.lbs))             |
| ((l.lbz))             |
| ((l.lhs))             |
| ((l.lhz))             |
| ((l.lws))             |
| ((l.lwz))             |
| ((l.mac))             | Yes
| ((l.maci))            | Yes
| ((l.macrc))           | Yes
| ((l.mfspr))           |
| ((l.movhi))           |
| ((l.msb))             | Yes
| ((l.mtspr))           |
| ((l.mul))             | Yes
| ((l.muli))            | Yes
| ((l.nop))             |
| ((l.or))              |
| ((l.ori))             |
| ((l.rfe))             |
| ((l.rori))            |
| ((l.sb))              |
| ((l.sfeq))            |
| ((l.sfges))           |
| ((l.sfgeu))           |
| ((l.sfgts))           |
| ((l.sfgtu))           |
| ((l.sfleu))           |
| ((l.sflts))           |
| ((l.sfltu))           |
| ((l.sfne))            |
| ((l.sh))              |
| ((l.sll))             |
| ((l.slli))            |
| ((l.sra))             |
| ((l.srai))            |
| ((l.srl))             |
| ((l.srli))            |
| ((l.sub))             | Yes
| ((l.sw))              |
| ((l.sys))             |
| ((l.trap))            |
| ((l.xor))             |
| ((l.xori))            |
| ((lf.add.s))          | Yes
| ((lf.div.s))          | Yes
| ((lf.ftoi.s))         | Yes
| ((lf.itof.s))         | Yes
| ((lf.mul.s))          | Yes
| ((lf.sfeq.s))         | Yes
| ((lf.sfge.s))         | Yes
| ((lf.sfgt.s))         | Yes
| ((lf.sfle.s))         | Yes
| ((lf.sflt.s))         | Yes
| ((lf.sfne.s))         | Yes
| ((lf.sub.s))          | Yes
|=================================
 
For a complete description of each instruction's format refer to
the OpenRISC 1000 architecture manual.
 
Instruction Unit
^^^^^^^^^^^^^^^^
The ((Instruction unit)) generates the instruction fetch effective address (EA)
and fetches instructions from the instruction cache. One instruction can be
fetched each clock cycle. The instruction fetch EA is then translated into a
physical address by the IMMU.

General-Purpose Registers
^^^^^^^^^^^^^^^^^^^^^^^^^
The ((General-purpose register)) file can supply two read operands each clock
cycle and store one result in a destination register.

GPRs can also be read and written through the development interface.
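
One way to picture two reads and one write per cycle is a pair of mirrored
memories, each feeding one read port while every write goes to both. The model
below is a behavioural sketch under that assumption; the document does not
describe how the two memories are wired, and reading the old value when the
same register is written in the same cycle is likewise assumed. The class name
is hypothetical.

```python
class RegisterFile:
    """Behavioural model: two mirrored 32 x 32-bit memories."""

    def __init__(self):
        self.mem_a = [0] * 32  # serves read port A
        self.mem_b = [0] * 32  # serves read port B

    def cycle(self, ra, rb, write=None):
        """Perform one clock cycle: two reads and an optional write."""
        # Assumption: reads see the contents from before this cycle's write.
        operand_a = self.mem_a[ra]
        operand_b = self.mem_b[rb]
        if write is not None:
            rd, value = write
            # The write is mirrored so that both copies stay identical.
            self.mem_a[rd] = self.mem_b[rd] = value & 0xFFFFFFFF
        return operand_a, operand_b


rf = RegisterFile()
before = rf.cycle(3, 3, write=(3, 0x100000005))  # value is truncated to 32 bits
after = rf.cycle(3, 3)
```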
 
Load/Store Unit
^^^^^^^^^^^^^^^
The ((LSU)) can execute one load instruction every two clock cycles, assuming
the load instructions hit in the data cache. Execution of store instructions
takes one clock cycle, assuming they hit in the data cache.

The LSU calculates the load/store effective address. The EA is then
translated into a physical address by the DMMU.

The load/store effective address and the load and store data can also be
accessed through the development interface.

Integer Execution Pipeline
^^^^^^^^^^^^^^^^^^^^^^^^^^
(((Pipeline, Integer Execution)))
The core implements the following types of 32-bit integer instructions:

* Arithmetic instructions
* Compare instructions
* Logical instructions
* Rotate and shift instructions

[[exec_time_int_table]]
.Execution Time of Integer Instructions
[width="70%",options="header"]
|================================================
| Instruction Group     | Clock Cycles to Execute
| Arithmetic except Multiply/Divide     | 1
| Multiply                              | 3
| Divide                                | 32
| Compare                               | 1
| Logical                               | 1
| Rotate and Shift                      | 1
| Others                                | 1
|================================================

<<exec_time_int_table>> lists execution times for instructions executed by
the integer execution pipeline. Most instructions are executed in one clock
cycle.

Integer multiply can be implemented either serially or in parallel. Serial
operations require one clock cycle per bit of operand, which is 32 cycles
on the OR1200. At present no synthesis tools support division operators,
and so divide must use the serial option.
736
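
As a rough illustration of how the cycle counts in <<exec_time_int_table>> add
up, the following Python sketch sums execute-stage cycles for an instruction
mix. The names +INTEGER_CYCLES+ and +estimate_execute_cycles+ and the sample
mix are hypothetical. The model ignores cache misses, stalls and dependencies,
so it should be read as a lower bound only.

```python
# Cycle counts copied from the "Execution Time of Integer Instructions" table.
INTEGER_CYCLES = {
    "arithmetic": 1,    # arithmetic except multiply/divide
    "multiply": 3,
    "divide": 32,
    "compare": 1,
    "logical": 1,
    "rotate_shift": 1,
    "other": 1,
}


def estimate_execute_cycles(instruction_mix):
    """Sum execute-stage cycles for a {group: count} mix (no stalls modelled)."""
    return sum(INTEGER_CYCLES[group] * count
               for group, count in instruction_mix.items())


# Hypothetical mix: 10 arithmetic, 2 multiplies, 1 divide, 3 compares.
mix = {"arithmetic": 10, "multiply": 2, "divide": 1, "compare": 3}
total_cycles = estimate_execute_cycles(mix)
```

A single divide dominates this small mix, which is why divide-heavy inner loops
are worth restructuring where possible.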
 
MAC Unit
^^^^^^^^
The ((MAC)) unit executes l.mac instructions. It implements a fully pipelined
32x32 multiplier and a 48-bit accumulator, and can accept one new l.mac
instruction each clock cycle.

Care should be taken not to execute l.macrc (MAC read and clear) too soon
after the final l.mac instruction, as the operation may still be underway
and the result will not be valid in time. It is recommended that at least
three other instructions (or just l.nops) are inserted between the final l.mac
and l.macrc.
 
Floating Point Unit
^^^^^^^^^^^^^^^^^^^
The ((floating point unit)) has a mechanism to stall the processor pipeline
until processing has completed.

The following table indicates the number of cycles per operation.

[[exec_time_fp_table]]
.Execution time of floating point instructions
[width="60%",options="header"]
|=======================
| Operation     | Cycles
| Add/subtract  | 10
| Multiply      | 38
| Divide        | 37
| Compare       | 2
| Convert       | 7
|=======================
 
System Unit
^^^^^^^^^^^
The ((system unit)) implements the system control and status special-purpose
registers and executes all l.mtspr/l.mfspr instructions.
 
Exceptions
^^^^^^^^^^
The core implements a precise ((exception model)). This means that when an
exception is taken, the following conditions are met:

* Subsequent instructions in program flow are discarded
* Previous instructions finish and write back their results
* The address of the faulting instruction is saved in the EPCR registers and
  the machine state is saved to the ESR registers
* If the exception occurred in a delay slot, the DSX bit of the SR is set

[[exceptions_table]]
.List of Implemented ((Exceptions))
[width="95%",options="header"]
|===========================================================
| Exception Type        | Vector Offset | Causing Conditions
| Reset                 | 0x100 | Caused by reset.
| Bus Error             | 0x200 | Caused by an attempt to access an invalid
  physical address.
| Data Page Fault       | 0x300 | Generated artificially by the DTLB miss
  exception handler when no matching PTE is found in the page tables, or by a
  page protection violation for load/store operations.
| Instruction Page Fault| 0x400 | Generated artificially by the ITLB miss
  exception handler when no matching PTE is found in the page tables, or by a
  page protection violation for instruction fetch.
| Low Priority External Interrupt       | 0x500 | Low priority external
  interrupt asserted.
| Alignment     | 0x600 | Load/store access to a location that is not
  naturally aligned.
| Illegal Instruction   | 0x700 | Illegal instruction in the instruction stream.
| High Priority External Interrupt      | 0x800 | High priority external
  interrupt asserted.
| D-TLB Miss    | 0x900 | No matching entry in DTLB (DTLB miss).
| I-TLB Miss    | 0xA00 | No matching entry in ITLB (ITLB miss).
| Range         | 0xB00 | If programmed in the SR, the setting of SR[OV],
  usually by an arithmetic instruction, causes a range exception.
| System Call   | 0xC00 | System call initiated by software.
| Floating point exception      | 0xD00 | FP operation caused flags in FPCSR to
  become set.
| Trap  | 0xE00 | Trap instruction was decoded.
|===========================================================

The OR1200 exception support does not include support for fast context
switching.
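
For tools that need the vector offsets listed in <<exceptions_table>>, a lookup
such as the following may be convenient. This is a minimal sketch: the
dictionary keys, the function +vector_address+ and its +base+ parameter are
hypothetical names, and a vector base of zero is assumed.

```python
# Vector offsets copied from the "List of Implemented Exceptions" table.
EXCEPTION_VECTORS = {
    "reset": 0x100,
    "bus_error": 0x200,
    "data_page_fault": 0x300,
    "instruction_page_fault": 0x400,
    "low_priority_external_interrupt": 0x500,
    "alignment": 0x600,
    "illegal_instruction": 0x700,
    "high_priority_external_interrupt": 0x800,
    "dtlb_miss": 0x900,
    "itlb_miss": 0xA00,
    "range": 0xB00,
    "system_call": 0xC00,
    "floating_point": 0xD00,
    "trap": 0xE00,
}


def vector_address(exception, base=0x0):
    """Return the handler entry address; 'base' is an assumed vector base."""
    return base + EXCEPTION_VECTORS[exception]


dtlb_miss_entry = vector_address("dtlb_miss")
```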
 
Data Cache Operation
~~~~~~~~~~~~~~~~~~~~
Data Cache Load/Store Access
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The load/store unit requests data from the data ((cache)), stores them into
the general-purpose register file and forwards them to the integer execution
units. The LSU is therefore tightly coupled with the data cache.

If there is neither a data cache line miss nor a ((DTLB)) miss, load
operations take two clock cycles to execute and store operations take one
clock cycle. The LSU does all the data alignment work.

Data can be written to the data cache on a word, half-word or byte basis. When
the data cache operates in write-through mode, all writes are immediately
written through to main memory or to the next level of caches.

[[wb_write_fig]]
.WISHBONE Write Cycle
image::img/wb_write.gif[scaledwidth="70%",align="center"]

<<wb_write_fig>> shows how a ((write-through)) cycle on the data WISHBONE
interface is performed when a store instruction hits in the data cache. If
+dwb_ERR_I+ or +dwb_RTY_I+ is asserted instead of the usual +dwb_ACK_I+, a
bus error exception is invoked.
 
Data Cache Line Fill Operation
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
When a load instruction misses in the cache and the line to be replaced is
clean or invalid, a 4-beat sequential read burst with critical word first
is performed, whether the cache uses the ((write-through)) or ((write-back))
strategy. If the strategy is write-back and the line is dirty, the line is
first written back to memory. The critical word is forwarded to the
load/store unit to minimize the performance loss caused by the cache miss.
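
The beat order of a critical-word-first burst can be sketched as follows. The
sketch assumes a 16-byte line (four 32-bit words) and assumes that the burst
wraps around within the line after the critical word. Neither detail is stated
in this section, and +burst_addresses+ is a hypothetical helper.

```python
WORD_BYTES = 4   # 32-bit words
LINE_WORDS = 4   # one word per beat of the 4-beat burst (assumed line size)


def burst_addresses(miss_address):
    """Return the four word addresses of a line fill, critical word first.

    Assumes the burst wraps within the 16-byte line.
    """
    line_base = miss_address & ~(WORD_BYTES * LINE_WORDS - 1)
    first_word = (miss_address // WORD_BYTES) % LINE_WORDS
    return [line_base + ((first_word + beat) % LINE_WORDS) * WORD_BYTES
            for beat in range(LINE_WORDS)]


# A miss on word 2 of the line at 0x1000 fetches word 2 first.
order = burst_addresses(0x1008)
```

Under these assumptions the missing word arrives on the first beat, so the
load/store unit can continue while the remaining three beats complete.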
 
[[wb_read_fig]]
.WISHBONE Block Read Cycle
image::img/wb_read.gif[scaledwidth="70%",align="center"]

<<wb_read_fig>> shows how a cache line is read in a WISHBONE block read cycle
composed of four read transfers. If +dwb_ERR_I+ or +dwb_RTY_I+ is asserted
instead of the usual +dwb_ACK_I+, a bus error exception is invoked.

When a store instruction misses in the cache and the cache uses the
write-through strategy, the write is simply put on the bus and no caching
occurs. If the cache uses the write-back strategy and the line is valid and
clean, or invalid, a 4-beat sequential read burst is performed to fill the
line, and then the write to the cache occurs. If the store misses and the
line to be replaced is valid and dirty, that line is first written back to
memory before the desired line is read.
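
The load and store miss cases described above can be summarized as a small
decision function. This is only a sketch of the prose: the function name
+miss_bus_operations+ and the step strings are hypothetical, and it models the
order of operations, not their timing.

```python
def miss_bus_operations(access, strategy, line_state):
    """Return the ordered steps for a data cache miss.

    access:     "load" or "store"
    strategy:   "write-through" or "write-back"
    line_state: state of the line being replaced:
                "invalid", "clean" or "dirty"
    """
    if access not in ("load", "store"):
        raise ValueError("unknown access: %r" % (access,))
    if strategy not in ("write-through", "write-back"):
        raise ValueError("unknown strategy: %r" % (strategy,))
    if line_state not in ("invalid", "clean", "dirty"):
        raise ValueError("unknown line state: %r" % (line_state,))

    # Write-through store miss: the write goes to the bus, nothing is cached.
    if access == "store" and strategy == "write-through":
        return ["single write on bus (not cached)"]

    steps = []
    # A dirty line can only exist under the write-back strategy.
    if strategy == "write-back" and line_state == "dirty":
        steps.append("write back dirty line")
    steps.append("4-beat read burst (line fill)")
    if access == "store":
        steps.append("write to cache")
    return steps


store_miss_dirty = miss_bus_operations("store", "write-back", "dirty")
```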
 
[[wb_rw_fig]]
.WISHBONE Block Read/Write Cycle
image::img/wb_rw.gif[scaledwidth="70%",align="center"]

<<wb_rw_fig>> shows how a cache line is read in a WISHBONE block read cycle
followed by a write transfer. If +dwb_ERR_I+ or +dwb_RTY_I+ is asserted
instead of the usual +dwb_ACK_I+, a bus error exception is invoked.
 
Cache/Memory Coherency
^^^^^^^^^^^^^^^^^^^^^^
The data cache in OR1200 operates in either write-through or write-back mode.
The default mode is defined at synthesis time, and the mode can be selected at
run time when the DMMU is used. There is currently no ((coherency)) support
between the local data cache and the caches of other processors.

Data Cache Enabling/Disabling
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The data cache is disabled at power up. The entire data cache can be enabled
by setting bit SR[DCE] to one. Before the data cache is enabled, it must be
invalidated.

Data Cache Invalidation
^^^^^^^^^^^^^^^^^^^^^^^
The data cache in OR1200 does not support ((invalidation)) of the entire data
cache in a single operation. The normal procedure to invalidate the entire
data cache is to cycle through all data cache lines and invalidate each line
separately.
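
The line-by-line procedure can be sketched as a simple loop. The example below
is a host-side Python model, not target code: +write_spr+ is a hypothetical
callback standing in for an l.mtspr write, and the 8 KB cache size and 16-byte
line size are placeholder values for a direct-mapped configuration that should
be replaced with those of the actual core.

```python
def invalidate_data_cache(write_spr, cache_bytes=8192, line_bytes=16):
    """Invalidate every data cache line by writing its address to DCBIR.

    write_spr(name, value) is a hypothetical stand-in for l.mtspr.
    Returns the number of lines invalidated.
    """
    for address in range(0, cache_bytes, line_bytes):
        write_spr("DCBIR", address)
    return cache_bytes // line_bytes


# Record the SPR writes instead of touching hardware.
spr_writes = []
lines_invalidated = invalidate_data_cache(
    lambda name, value: spr_writes.append((name, value)))
```

On a real system this loop would run before SR[DCE] is set, as required by the
enabling procedure above.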
 
Data Cache Locking
^^^^^^^^^^^^^^^^^^
The data cache implements way ((locking)) bits in the data cache control
register DCCR. Bits LWx lock individual ways when they are set to one.

Data Cache Line Prefetch
^^^^^^^^^^^^^^^^^^^^^^^^
Data cache line ((prefetch)) is optional in the OpenRISC 1000 architecture and
is not implemented in OR1200.

Data Cache Line ((Flush))
^^^^^^^^^^^^^^^^^^^^^^^^^
The operation is performed by writing the effective address to the DCBFR
register.

When a cache line is valid and clean, or the cache uses the write-through
strategy, the line is invalidated and no write-back occurs.

Data Cache Line Invalidate
^^^^^^^^^^^^^^^^^^^^^^^^^^
Data cache line ((invalidate)) invalidates a single data cache line. The
operation is performed by writing the effective address to the DCBIR
register. If the cache uses the write-back strategy, it is best to use the
line flush function.

Data Cache Line ((Write-back))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The operation is performed by writing the effective address to the DCBWR
register.

If the cache uses the ((write-through)) strategy, this operation is ignored,
as no cached line can be dirty and therefore need writing back.

Data Cache Line ((Lock))
^^^^^^^^^^^^^^^^^^^^^^^^
Locking of individual data cache lines is not implemented in OR1200.

Data Cache ((inhibit)) with address bit 31 set
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
If the DMMU is disabled, by default all addresses with bit 31 set cause the
data cache to be inhibited, meaning that no reads or writes are cached.

If the ((DMMU)) is enabled, any address can be marked as inhibited or not,
and the cache behaves accordingly.
 
Instruction ((Cache)) Operation
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Instruction Cache Instruction ((Fetch)) Access
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The instruction unit requests instructions from the instruction cache and
forwards them to the instruction queue inside the instruction unit. The
instruction unit is therefore tightly coupled with the instruction cache.

If there is neither an instruction cache line ((miss)) nor an ITLB miss, an
instruction fetch operation takes one clock cycle to execute.

Unlike the data cache, which is modified by store instructions, the
instruction cache cannot be explicitly modified.

Instruction Cache Line Fill Operation
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
On a cache miss, a 4-beat sequential read burst with critical word first is
performed. The critical word is forwarded to the instruction unit to minimize
the performance loss caused by the cache miss.

[[wb_block_read_fig]]
.WISHBONE Block Read Cycle
image::img/wb_block_read.gif[scaledwidth="70%",align="center"]

<<wb_block_read_fig>> shows how a cache line is read in a WISHBONE block read
cycle composed of four read transfers. If +iwb_ERR_I+ or +iwb_RTY_I+ is
asserted instead of the usual +iwb_ACK_I+, a bus error exception is invoked.
 
Cache/Memory ((Coherency))
^^^^^^^^^^^^^^^^^^^^^^^^^^
OR1200 is not intended for use in multiprocessor environments. Therefore no
support is implemented for coherency between the local instruction cache and
the caches of other processors or main memory.

Instruction Cache Enabling/Disabling
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The instruction cache is disabled at power up. The entire instruction cache
can be enabled by setting bit SR[ICE] to one. Before the instruction cache is
enabled, it must be invalidated.

Instruction Cache ((Invalidation))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The instruction cache in OR1200 does not support invalidation of the entire
instruction cache in a single operation. The normal procedure to invalidate
the entire instruction cache is to cycle through all instruction cache lines
and invalidate each line separately.

Instruction Cache Locking
^^^^^^^^^^^^^^^^^^^^^^^^^
The instruction cache implements way locking bits in the instruction cache
control register ICCR. Bits LWx lock individual ways when they are set to one.

Instruction Cache Line ((Prefetch))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Instruction cache line prefetch is optional in the OpenRISC 1000 architecture
and is not implemented in OR1200.

Instruction Cache Line ((Invalidate))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Instruction cache line invalidate invalidates a single instruction cache
line. The operation is performed by writing the effective address to the ICBIR
register.

Instruction ((Cache Line Lock))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Locking of individual instruction cache lines is not implemented in OR1200.
 
Data MMU
~~~~~~~~
Translation Disabled
^^^^^^^^^^^^^^^^^^^^
Load/store address translation can be disabled by clearing bit SR[DME]. If
translation is disabled, the physical address used to access the data cache,
and optionally provided on +dwb_ADDR_O+, is the same as the load/store
effective address.
(((Address Translation,Data)))

Translation Enabled
^^^^^^^^^^^^^^^^^^^
Load/store address translation can be enabled by setting bit SR[DME]. If
translation is enabled, the DMMU provides load/store effective address to
physical address translation and page protection for memory accesses.
(((Address Translation,Data)))

[[addr_translation_fig]]
.32-bit Address Translation Mechanism using Two-Level Page Table
image::img/addr_translation.gif[scaledwidth="70%",align="center"]

In the OR1200 case, ((page tables)) must be managed by the operating system's
virtual memory management subsystem. <<addr_translation_fig>> shows address
translation using a two-level page table. Refer to <> for one-level page
table address translation as well as for details about address translation
and page table content.

((DMMUCR)) and Flush of Entire ((DTLB))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
DMMUCR is not implemented in OR1200. Therefore the page table base pointer
(PTBP) must be stored in a software variable. A flush of the entire DTLB must
be performed by a software flush of every DTLB entry separately. A software
flush is performed by manually writing bits from the TLB entries back to the
PTEs.
 
Page Protection
^^^^^^^^^^^^^^^
After a virtual address is determined to be within a page covered by a
valid PTE, the access is validated by the memory protection mechanism. If
this protection mechanism prohibits the access, a data page fault exception
is generated.
(((Page Protection,Data)))

The memory protection mechanism allows selectively granting read access
and write access for both supervisor and user modes. The page protection
mechanism provides protection at all page level granularities.

[[protection_attrs_ldst_table]]
.Protection Attributes for Load/Store Accesses
[width="70%",options="header"]
|================================
| Protection attribute  | Meaning
| DTLBWyTR[SREx]        | Enable load operations in supervisor mode to the
  page.
| DTLBWyTR[SWEx]        | Enable store operations in supervisor mode to the
  page.
| DTLBWyTR[UREx]        | Enable load operations in user mode to the page.
| DTLBWyTR[UWEx]        | Enable store operations in user mode to the page.
|================================

<<protection_attrs_ldst_table>> lists the page protection attributes defined
in the DTLBWyTR register. For each individual page, the appropriate strategy
out of seven possible strategies is programmed with the PPI field of the PTE.
Because OR1200 does not implement DMMUPR, translation of PTE[PPI] into a
suitable set of protection bits must be performed by software and written
into DTLBWyTR.
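
The effect of the four attributes in <<protection_attrs_ldst_table>> can be
illustrated with a small check. This is a behavioural sketch only: the
function +load_store_allowed+ and the dictionary representation of the
attribute bits are hypothetical, and the bit positions inside DTLBWyTR are
deliberately not modelled.

```python
def load_store_allowed(attrs, supervisor, is_store):
    """Return True if the access is permitted by the page attributes.

    attrs is a dict with the keys "SRE", "SWE", "URE" and "UWE" (hypothetical
    representation of the DTLBWyTR protection bits). A False result
    corresponds to a data page fault exception.
    """
    if supervisor:
        key = "SWE" if is_store else "SRE"
    else:
        key = "UWE" if is_store else "URE"
    return bool(attrs.get(key, False))


# Example page: read/write for supervisor, read-only for user.
page = {"SRE": 1, "SWE": 1, "URE": 1, "UWE": 0}
user_store_ok = load_store_allowed(page, supervisor=False, is_store=True)
```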
 
((DTLB)) Entry Reload
^^^^^^^^^^^^^^^^^^^^^
OR1200 does not implement DTLB entry reloads in hardware. Instead, a software
routine must be used to search the page table for the correct page table
entry (PTE) and copy it into the DTLB. Software is responsible for maintaining
the accessed and dirty bits in the page tables.

When the LSU computes a load/store effective address whose physical address
is not already cached by the DTLB, a DTLB miss exception is invoked.

The DTLB reload routine must load the correct ((PTE)) into the correct
((DTLBWyMR)) and ((DTLBWyTR)) registers of one of the possible DTLB ways.

DTLB Entry Invalidation
^^^^^^^^^^^^^^^^^^^^^^^
Special-purpose register DTLBEIR must be written with the effective address,
and the corresponding DTLB entry will be invalidated in the local DTLB.

Locking DTLB Entries
^^^^^^^^^^^^^^^^^^^^
Since all DTLB entry reloads are performed in software, there is no hardware
locking of DTLB entries. Instead, it is up to the software reload routine to
avoid replacing some of the entries if so desired.
 
Page Attribute - Dirty (D)
^^^^^^^^^^^^^^^^^^^^^^^^^^
The dirty (D) attribute is not implemented in the OR1200 DTLB. It is up to
the operating system to generate the dirty attribute bit with the page
protection mechanism.
(((Page Attributes,Data)))

Page Attribute - Accessed (A)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The accessed (A) attribute is not implemented in the OR1200 DTLB. It is up to
the operating system to generate the accessed attribute bit with the page
protection mechanism.
(((Page Attributes,Data)))

Page Attribute - Weakly Ordered Memory (WOM)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The weakly ordered memory (WOM) attribute is not needed in OR1200 because all
memory accesses are serialized. Therefore this attribute is not implemented.
(((Page Attributes,Data)))

Page Attribute - Write-Back Cache (WBC)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The write-back cache (WBC) attribute is not implemented, as the data cache
cannot be configured at run time to be write-back enabled if the
write-through strategy was selected at synthesis time.
(((Page Attributes,Data)))

Page Attribute - Caching-Inhibited (CI)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The caching-inhibited (CI) attribute is not implemented in the OR1200 DTLB.
Cached and uncached regions are divided by bit 30 of the data effective
address.
(((Page Attributes,Data)))

[[data_cached_regions_table]]
.Cached and uncached regions
[width="70%",options="header"]
|===============================
| Effective Address     | Region
| 0x00000000 - 0x3FFFFFFF       | Cached
| 0x40000000 - 0x7FFFFFFF       | Uncached
| 0x80000000 - 0xBFFFFFFF       | Cached
| 0xC0000000 - 0xFFFFFFFF       | Uncached
|===============================

Uncached accesses must be used for memory-mapped I/O registers, where all
reads and writes must always be performed directly on the external interface
and not on the data cache.
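
The region split in <<data_cached_regions_table>> reduces to a test of a
single address bit, as the following sketch shows. The function name
+is_cached+ is hypothetical, and 32-bit effective addresses are assumed.

```python
def is_cached(effective_address):
    """Return True if the address falls in a cached region (bit 30 clear)."""
    return (effective_address >> 30) & 1 == 0


# Boundaries of the four regions in the table.
regions = [(hex(address), is_cached(address))
           for address in (0x00000000, 0x40000000, 0x80000000, 0xC0000000)]
```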
 
Page Attribute - Cache Coherency (CC)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The cache coherency (CC) attribute is not needed in OR1200 because OR1200
does not implement support for multiprocessor environments and because the
data cache operates only in write-through mode. Therefore this attribute is
not implemented.
(((Page Attributes,Data)))
 
((Instruction MMU))
~~~~~~~~~~~~~~~~~~~
Translation Disabled
^^^^^^^^^^^^^^^^^^^^
Instruction fetch address translation can be disabled by clearing bit
SR[IME]. If translation is disabled, the physical address used to access the
instruction cache, and optionally provided on +iwb_ADDR_O+, is the same as
the instruction fetch effective address.
(((Address Translation,Instruction)))

Translation Enabled
^^^^^^^^^^^^^^^^^^^
Instruction fetch address translation can be enabled by setting bit
SR[IME]. If translation is enabled, the IMMU provides instruction fetch
effective address to physical address translation and page protection for
instruction fetch accesses.
(((Address Translation,Instruction)))

[[addr_translation_rep_fig]]
.32-bit Address Translation Mechanism using Two-Level Page Table
image::img/addr_translation.gif[scaledwidth="70%",align="center"]

In the OR1200 case, page tables must be managed by the operating system's
virtual memory management subsystem. <<addr_translation_rep_fig>> shows
address translation using a two-level page table. Refer to <> for
one-level page table address translation as well as for details about address
translation and page table content.

((IMMUCR)) and ((Flush)) of Entire ITLB
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
IMMUCR is not implemented in OR1200. Therefore the page table base pointer
(PTBP) must be stored in a software variable. A flush of the entire ITLB must
be performed by a software flush of every ITLB entry separately. A software
flush is performed by manually writing bits from the TLB entries back to the
PTEs.
 
Page Protection
^^^^^^^^^^^^^^^
After a virtual address is determined to be within a page covered by a
valid PTE, the access is validated by the memory protection mechanism. If
this protection mechanism prohibits the access, an instruction page fault
exception is generated.
(((Page Protection,Instruction)))

The memory protection mechanism allows selectively granting execute access
for both supervisor and user modes. The page protection mechanism provides
protection at all page level granularities.

[[protection_attrs_inst_table]]
.Protection Attributes for Instruction Fetch Accesses
[width="70%",options="header"]
|================================
| Protection attribute  | Meaning
| ITLBWyTR[SXEx]        | Enable execute operations in supervisor mode of the
  page.
| ITLBWyTR[UXEx]        | Enable execute operations in user mode of the page.
|================================

<<protection_attrs_inst_table>> lists the page protection attributes defined
in the ITLBWyTR register. For each individual page, the appropriate strategy
out of seven possible strategies is programmed with the PPI field of the PTE.
Because OR1200 does not implement IMMUPR, translation of PTE[PPI] into a
suitable set of protection bits must be performed by software and written
into ITLBWyTR.
 
((ITLB)) Entry Reload
^^^^^^^^^^^^^^^^^^^^^
OR1200 does not implement ITLB entry reloads in hardware. Instead, a software
routine must be used to search the page table for the correct page table
entry (PTE) and copy it into the ITLB. Software is responsible for maintaining
the accessed bit in the page tables.

When the instruction unit computes an instruction fetch effective address
whose physical address is not already cached by the ITLB, an ITLB miss
exception is invoked.

The ITLB reload routine must load the correct PTE into the correct ITLBWyMR
and ITLBWyTR registers of one of the possible ITLB ways.

ITLB Entry Invalidation
^^^^^^^^^^^^^^^^^^^^^^^
Special-purpose register ITLBEIR must be written with the effective address,
and the corresponding ITLB entry will be invalidated in the local ITLB.

Locking ITLB Entries
^^^^^^^^^^^^^^^^^^^^
Since all ITLB entry reloads are performed in software, there is no hardware
locking of ITLB entries. Instead, it is up to the software reload routine to
avoid replacing some of the entries if so desired.
 
Page Attribute - Dirty (D)
^^^^^^^^^^^^^^^^^^^^^^^^^^
The dirty (D) attribute resides in the PTE but is not used by the IMMU.
(((Page Attributes,Instruction)))

Page Attribute - Accessed (A)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The accessed (A) attribute is not implemented in the OR1200 ITLB. It is up to
the operating system to generate the accessed attribute bit with the page
protection mechanism.
(((Page Attributes,Instruction)))

Page Attribute - Weakly Ordered Memory (WOM)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The weakly ordered memory (WOM) attribute is not needed in OR1200 because all
instruction fetch accesses are serialized. Therefore this attribute is not
implemented.
(((Page Attributes,Instruction)))

Page Attribute - Write-Back Cache (WBC)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The write-back cache (WBC) attribute resides in the PTE but is not used by
the IMMU.
(((Page Attributes,Instruction)))

Page Attribute - Caching-Inhibited (CI)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The caching-inhibited (CI) attribute is not implemented in the OR1200 ITLB.
Cached and uncached regions are divided by bit 30 of the instruction
effective address.
(((Page Attributes,Instruction)))

[[inst_cached_regions_table]]
.Cached and uncached regions
[width="70%",options="header"]
|===============================
| Effective Address     | Region
| 0x00000000 - 0x3FFFFFFF       | Cached
| 0x40000000 - 0x7FFFFFFF       | Uncached
| 0x80000000 - 0xBFFFFFFF       | Cached
| 0xC0000000 - 0xFFFFFFFF       | Uncached
|===============================

Page Attribute - Cache Coherency (CC)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The cache coherency (CC) attribute resides in the PTE but is not used by
the IMMU.
(((Page Attributes,Instruction)))
 
((Programmable Interrupt Controller))
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The PICMR special-purpose register is used to mask or unmask up to 30
programmable interrupt sources. The PICPR special-purpose register is used to
assign low or high priority to a maximum of 30 interrupt sources.

The PICSR special-purpose register is used to determine the status of each
interrupt input. Bits in PICSR represent the status of the interrupt inputs,
and the actual interrupt must be cleared in the device that is the source of
the pending interrupt.

The ((PIC)) implementation in the OR1200 differs from the architecture
specification. The PIC instead offers a latched level-sensitive interrupt.

Once an interrupt line is latched (i.e. its value appears in PICSR), no
new interrupts can be triggered for that line until its bit in PICSR is
cleared. The usual sequence for an interrupt handler is then as follows.

. Peripheral asserts interrupt, which is latched and triggers handler.
. Handler processes interrupt.
. Handler notifies peripheral that the interrupt has been processed (typically
  via a memory mapped register).
. Peripheral deasserts interrupt.
. Handler clears corresponding bit in PICSR and returns.

It is assumed that the peripheral will deassert its interrupt promptly
(within 1-2 cycles). Otherwise, on exiting the interrupt handler having
cleared PICSR, the level-sensitive interrupt will immediately retrigger.
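
The handler sequence and the retrigger hazard can be illustrated with a toy
model of the latch. This is a sketch under stated assumptions, not a
description of the RTL: the class +LatchedPic+ and the function
+handle_interrupt+ are hypothetical, a set PICMR bit is assumed to mean
"unmasked", and only unmasked inputs are assumed to be latched.

```python
class LatchedPic:
    """Toy model of a latched level-sensitive interrupt controller."""

    def __init__(self, picmr=0):
        self.picmr = picmr   # 1 = source unmasked (assumed polarity)
        self.picsr = 0       # latched status bits
        self.lines = 0       # current level of the interrupt inputs

    def set_line(self, line, asserted):
        """Drive one interrupt input high or low."""
        if asserted:
            self.lines |= 1 << line
        else:
            self.lines &= ~(1 << line)
        self._latch()

    def clear_status(self, line):
        """Clear a PICSR bit; it relatches if the input is still high."""
        self.picsr &= ~(1 << line)
        self._latch()

    def _latch(self):
        # Assumption: only unmasked inputs are latched into PICSR.
        self.picsr |= self.lines & self.picmr


def handle_interrupt(pic, line, peripheral_deasserts):
    """Run the handler sequence; return True if the interrupt retriggers."""
    # Steps 1-2: the interrupt is latched and processed (processing omitted).
    if peripheral_deasserts:
        pic.set_line(line, False)   # steps 3-4: notify, peripheral deasserts
    pic.clear_status(line)          # step 5: handler clears the PICSR bit
    return bool(pic.picsr & (1 << line))


pic = LatchedPic(picmr=1 << 4)
pic.set_line(4, True)
clean_exit = not handle_interrupt(pic, 4, peripheral_deasserts=True)
pic.set_line(4, True)
retriggered = handle_interrupt(pic, 4, peripheral_deasserts=False)
```

In the second call the peripheral never deasserts its line, so clearing PICSR
has no lasting effect and the interrupt is immediately pending again.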
 
((Tick Timer))
~~~~~~~~~~~~~~
The tick timer facility is enabled with TTMR[M]. TTCR is incremented with each
clock cycle, and a high priority interrupt can be asserted whenever the lower
28 bits of TTCR match TTMR[TP] and TTMR[IE] is set.

TTCR restarts counting from zero when a match event happens and TTMR[M] is
0x1. If TTMR[M] is 0x2, TTCR is stopped when a match event happens and TTCR
must be changed to start counting again. When TTMR[M] is 0x3, TTCR keeps
counting even when a match event happens.
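
The three counting modes can be compared with a small cycle-by-cycle model.
This is an illustrative sketch: the functions +tick+ and +run+ are
hypothetical, a 32-bit TTCR is assumed, TTMR[M] equal to zero is assumed to
mean "disabled", and interrupt generation via TTMR[IE] is not modelled.

```python
TP_MASK = (1 << 28) - 1      # the lower 28 bits of TTCR are compared
TTCR_MASK = 0xFFFFFFFF       # assumed 32-bit counter


def tick(ttcr, mode, tp):
    """Advance the model by one clock; return (new_ttcr, match)."""
    if mode == 0:                       # assumed: timer disabled
        return ttcr, False
    match = (ttcr & TP_MASK) == (tp & TP_MASK)
    if match and mode == 1:             # restart from zero
        return 0, True
    if match and mode == 2:             # stop until TTCR is changed
        return ttcr, True
    return (ttcr + 1) & TTCR_MASK, match   # mode 3, or no match yet


def run(mode, tp, cycles):
    """Return the TTCR value after each of 'cycles' clocks, starting at 0."""
    ttcr, values = 0, []
    for _ in range(cycles):
        ttcr, _match = tick(ttcr, mode, tp)
        values.append(ttcr)
    return values


restart = run(mode=1, tp=3, cycles=6)
one_shot = run(mode=2, tp=3, cycles=6)
continuous = run(mode=3, tp=3, cycles=6)
```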
 
((Power Management))
~~~~~~~~~~~~~~~~~~~~
((Clock Gating)) and Frequency Changing Versus CPU Stalling
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
If the system doesn't support clock gating and if changing the clock frequency
in slow down mode is not possible, the CPU can be stalled for a certain number
of clock cycles. This gives a much smaller benefit, but it still reduces
power consumption.

Slow Down Mode
^^^^^^^^^^^^^^
Slow down mode is software controlled with the 4-bit value in PMR[SDF]. A
lower value specifies higher expected performance from the processor core.
Usually PMR[SDF] is set dynamically by the operating system's idle routine,
which monitors the usage of the processor core.
(((Mode,Slow Down)))

PMR[SDF] is broadcast on +pm_clksd+. The external clock generator should
adjust the clock frequency according to the value of +pm_clksd+. Exact slow
down factors are not defined, but 0xF should go all the way down to
32.768 kHz.

With +pm_clksd+ equal to 0xF, +pm_lvolt+ is asserted. This is an indication for
the external power supply to lower the voltage.
 
Doze Mode
^^^^^^^^^
To switch to doze mode, software should set PMR[DME]. Once an interrupt
is received by the programmable interrupt controller (PIC), +pm_wakeup+
is asserted and the external clock generation circuitry should enable all
clocks. Once the clocks are running, the RISC is switched back to normal
mode and PMR[DME] is cleared.
(((Mode,Doze)))

When doze mode is enabled, +pm_dc_gate+, +pm_ic_gate+, +pm_dmmu_gate+,
+pm_immu_gate+ and +pm_cpu_gate+ are asserted. As a result, all clocks except
+clk_tt+ should be gated by the external clock generation circuitry.

Sleep Mode
^^^^^^^^^^
To switch to sleep mode, software should set PMR[SME]. Once an interrupt
is received by the programmable interrupt controller (PIC), +pm_wakeup+ is
asserted and the external clock generation circuitry should enable all
clocks. Once the clocks are running, the RISC is switched back to normal mode
and PMR[SME] is cleared.
(((Mode,Sleep)))

When sleep mode is enabled, +pm_dc_gate+, +pm_ic_gate+, +pm_dmmu_gate+,
+pm_immu_gate+, +pm_cpu_gate+ and +pm_tt_gate+ are asserted. As a result,
all clocks including +clk_tt+ should be gated by the external clock
generation circuitry.

In sleep mode, +pm_lvolt+ is asserted. This is an indication for the external
power supply to lower the voltage.

Clock Gating
^^^^^^^^^^^^
The ((clock gating)) feature is not implemented in OR1200 power management.

Disabled Units Force Clock Gating
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Units that are disabled in special-purpose register SR have their clock
gate signals asserted. Cleared bits SR[DCE], SR[ICE], SR[DME] and SR[IME]
directly force assertion of +pm_dc_gate+, +pm_ic_gate+, +pm_dmmu_gate+
and +pm_immu_gate+, respectively.
 
((Debug Unit))
~~~~~~~~~~~~~~
The debug unit can be controlled through the development interface, or it can
operate independently, programmed and handled by the RISC's resident debug
software.

((Watchpoints))
^^^^^^^^^^^^^^^
The OR1200 debug unit does not implement OpenRISC 1000 architecture
watchpoints.

((Breakpoint)) Exception
^^^^^^^^^^^^^^^^^^^^^^^^
The DMR2[WGB] bits specify which watchpoints invoke a breakpoint exception. By
invoking the breakpoint exception, a target resident debugger can be built.

A breakpoint is broadcast on the development interface on +dbg_bp_o+.
 
((Development Interface))
~~~~~~~~~~~~~~~~~~~~~~~~~
NOTE: The information in this section is to be reviewed. It is the author's
opinion that the debug interface is now largely provided by the SPR mappings,
and no special sideband functions exist aside from stalling and resetting
the core.

An additional _development and debug interface IP_ core may be used to connect
the OpenRISC 1200 to standard debuggers using the IEEE 1149.1 (JTAG) protocol.

((Debugging)) Through ((Development Interface))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The DSR special-purpose register specifies which exceptions cause the core
to stop the execution of the exception handler and turn over control to the
development interface. It can be programmed by the resident debug software
or by the development interface.

The DRR special-purpose register specifies which event caused the core to
stop the execution of program flow and turn over control to the development
interface. It should be cleared by the resident debug software or by the
development interface.

The DIR special-purpose register is not implemented.
 
Reading PC, Load/Store EA, Load Data, Store Data, Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Crucial information like ((program counter)) (PC), load/store effective
address (LSEA), load data, store data and current instruction in execution
pipeline can be asynchronously read through the development interface.

[[dev_commands_table]]
.Development Interface Operation Commands
[width="70%",options="header"]
|========================
| dbg_op_i[2:0] | Meaning
| 0x0           | Reading Program Counter (PC)
| 0x1           | Reading Load/Store Effective Address
| 0x2           | Reading Load Data
| 0x3           | Reading Store Data
| 0x4           | Reading SPR
| 0x5           | Writing SPR
| 0x6           | Reading Instruction in Execution Pipeline
| 0x7           | Reserved
|========================

<<dev_commands_table>> lists operation commands that control what is read
or written through development interface. All operations except reads and
writes of SPRs are asynchronous.
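
The operation encodings can be captured in a small lookup, sketched here in
Python. The enumeration member names and the helper `is_asynchronous` are
hypothetical. Only the numeric values and the rule that SPR accesses are the
synchronous ones come from this section.

```python
from enum import IntEnum


class DbgOp(IntEnum):
    """Encodings of dbg_op_i[2:0] as listed in this section."""
    READ_PC = 0x0
    READ_LSEA = 0x1
    READ_LOAD_DATA = 0x2
    READ_STORE_DATA = 0x3
    READ_SPR = 0x4
    WRITE_SPR = 0x5
    READ_INSN = 0x6
    # 0x7 is reserved, so it is deliberately left out of the enumeration.


def is_asynchronous(op):
    """SPR reads and writes are the only synchronous operations."""
    return DbgOp(op) not in (DbgOp.READ_SPR, DbgOp.WRITE_SPR)


print(is_asynchronous(0x0), is_asynchronous(0x5))
```

Passing the reserved value 0x7 raises `ValueError`, which keeps an undefined
command from being treated as valid.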
 
Reading and Writing SPRs Through Development Interface
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
For reads and writes to SPRs, +dbg_op_i+ must be set to 0x4 and 0x5,
respectively.

[[dev_interface_cycles_fig]]
.Development Interface Cycles
image::img/dev_interface_cycles.gif[scaledwidth="70%",align="center"]

<<dev_interface_cycles_fig>> shows development interface cycles. Writes must
be synchronous to the main RISC clock positive edge and should take one clock
cycle. Reads must take two clock cycles because access to synchronous cache
lines or to TLB entries introduces one clock cycle of delay.

If required, external debugger can stop the CPU core by asserting
+dbg_stall_i+. This way it can have enough time to read all interesting
registers from the RISC or guarantee that writes into SPRs are performed
without RISC writing to the same registers.
 
Tracking ((Data Flow))
^^^^^^^^^^^^^^^^^^^^^^
An external debugger can monitor and record data flow inside the RISC for
debugging purposes and profiling analysis. This is accomplished by monitoring
status of the load/store unit, load/store effective address and load/store
data, all available at the development interface.

[[status_ldst_unit_table]]
.Status of the Load/Store Unit
[width="70%",options="header"]
|============================================================
| dbg_lss_o[3:0] | Load/Store Instruction in Execution
| 0x0   | No load/store instruction in execution
| 0x1   | Reserved for load doubleword
| 0x2   | Load byte and zero extend
| 0x3   | Load byte and sign extend
| 0x4   | Load halfword and zero extend
| 0x5   | Load halfword and sign extend
| 0x6   | Load singleword and zero extend
| 0x7   | Load singleword and sign extend
| 0x8   | Reserved for store doubleword
| 0x9   | Reserved
| 0xA   | Store byte
| 0xB   | Reserved
| 0xC   | Store halfword
| 0xD   | Reserved
| 0xE   | Store singleword
| 0xF   | Reserved
|============================================================

External trace buffer can capture all interesting data flow
events by analyzing status of the load/store unit available on
+dbg_lss_o+. <<status_ldst_unit_table>> lists the status encodings for
the load/store unit.
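
As a rough illustration of what an external trace buffer might do with
+dbg_lss_o+, the Python sketch below decodes sampled status values and keeps
only the cycles that carry a load or store. The names `LSS`, `decode_lss` and
`data_flow_events` are hypothetical, as is the idea of sampling once per
cycle. The byte sizes assume 16-bit halfwords and 32-bit singlewords.

```python
# dbg_lss_o[3:0] encodings copied from the load/store status table.
# Each entry is (operation, size in bytes, sign-extended).
LSS = {
    0x2: ("load", 1, False),
    0x3: ("load", 1, True),
    0x4: ("load", 2, False),
    0x5: ("load", 2, True),
    0x6: ("load", 4, False),
    0x7: ("load", 4, True),
    0xA: ("store", 1, False),
    0xC: ("store", 2, False),
    0xE: ("store", 4, False),
}


def decode_lss(lss):
    """Decode a 4-bit load/store status; None means idle or reserved."""
    return LSS.get(lss & 0xF)


def data_flow_events(samples):
    """Keep only the (sample index, decoded status) pairs with a load or store."""
    return [(i, decode_lss(s)) for i, s in enumerate(samples) if decode_lss(s)]


print(data_flow_events([0x0, 0x6, 0x0, 0xA]))
```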
 
Tracking ((Program Flow))
^^^^^^^^^^^^^^^^^^^^^^^^^
An external debugger can monitor and record program flow inside the RISC
for debugging purposes and profiling analysis. This is accomplished by
monitoring status of the instruction unit, PC and fetched instruction word,
all available at the development interface.

[[status_inst_unit_table]]
.Status of the Instruction Unit
[width="70%",options="header"]
|=========================================
| dbg_is_o[1:0] | Instruction Fetch Status
| 0x0   | No instruction fetch in progress
| 0x1   | Normal instruction fetch
| 0x2   | Executing branch instruction
| 0x3   | Fetching instruction in delay slot
|=========================================

External trace buffer can capture all interesting program flow
events by analyzing status of the instruction unit available on
+dbg_is_o+. <<status_inst_unit_table>> lists the status encodings for
the instruction unit.
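
For profiling, a trace tool could tally how often each instruction unit status
occurs. The Python sketch below assumes one +dbg_is_o+ sample per clock cycle.
The short labels and the function `summarize` are hypothetical and only mirror
the status table.

```python
from collections import Counter

# dbg_is_o[1:0] encodings from the instruction unit status table,
# with shortened labels chosen for this example.
IS_STATUS = {
    0x0: "idle",
    0x1: "normal fetch",
    0x2: "branch",
    0x3: "delay slot",
}


def summarize(samples):
    """Count how many samples fall into each instruction fetch status."""
    return Counter(IS_STATUS[s & 0x3] for s in samples)


print(summarize([0x1, 0x1, 0x2, 0x3, 0x1, 0x0]))
```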
 
Triggering ((External Watchpoint Event))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
<<watchpoint_trigger_fig>> shows how development interface can assert
+dbg_ewt_i+ and cause watchpoint event. If programmed, external watchpoint
event will cause a breakpoint exception.

[[watchpoint_trigger_fig]]
.Assertion of External Watchpoint Trigger
image::img/watchpoint_trigger.gif[scaledwidth="70%",align="center"]
 
((Registers))
-------------
This section describes all registers inside the OR1200 core. The address of
each special-purpose register is computed by shifting its _GRP_ number 11 bits
left and adding its _REG_ number. All registers are 32 bits wide from the
software perspective. _USER MODE_ and _SUPV MODE_ specify the valid access
types for each register in user mode and supervisor mode of operation. R/W
stands for read and write access and R stands for read only access.
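
The address computation can be written out as a short Python sketch. The
function names `spr_address` and `split_spr_address` are hypothetical. The
group and register numbers in the example calls are taken from the register
list in this section.

```python
REG_BITS = 11  # the group number is shifted left by this many bits


def spr_address(grp, reg):
    """Compute an SPR address from its group and register numbers."""
    if grp < 0 or not 0 <= reg < (1 << REG_BITS):
        raise ValueError("group or register number out of range")
    return (grp << REG_BITS) + reg


def split_spr_address(addr):
    """Recover (group, register) from an SPR address."""
    return addr >> REG_BITS, addr & ((1 << REG_BITS) - 1)


print(hex(spr_address(0, 17)))   # SR:   group 0,  register 17
print(hex(spr_address(6, 20)))   # DSR:  group 6,  register 20
print(hex(spr_address(10, 1)))   # TTCR: group 10, register 1
```

Because the register number occupies the low 11 bits, adding it is equivalent
to a bitwise OR as long as it stays below 2048.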
 
((Registers list))
~~~~~~~~~~~~~~~~~~
[[regs_table]]
.List of All Registers
[width="95%",options="header"]
|============================================================================
| Grp # | Reg # | Reg Name | USER MODE | SUPV MODE | Description
| 0     | 0     | ((VR))        | -     | R     | Version Register
| 0     | 1     | ((UPR))       | -     | R     | Unit Present Register
| 0     | 2     | ((CPUCFGR))   | -     | R     | CPU Configuration Register
| 0     | 3     | ((DMMUCFGR))  | -     | R     | Data MMU Configuration Register
| 0     | 4     | ((IMMUCFGR))  | -     | R     | Instruction MMU Configuration Register
| 0     | 5     | ((DCCFGR))    | -     | R     | Data Cache Configuration Register
| 0     | 6     | ((ICCFGR))    | -     | R     | Instruction Cache Configuration Register
| 0     | 7     | ((DCFGR))     | -     | R     | Debug Configuration Register
| 0     | 16    | ((PC))        | -     | R/W   | PC mapped to SPR space
| 0     | 17    | ((SR))        | -     | R/W   | Supervision Register
| 0     | 20    | ((FPCSR))     | -     | R/W   | FP Control Status Register
| 0     | 32    | ((EPCR0))     | -     | R/W   | Exception PC Register
| 0     | 48    | ((EEAR0))     | -     | R/W   | Exception EA Register
| 0     | 64    | ((ESR0))      | -     | R/W   | Exception SR Register
| 0     | 1024-1055 | ((GPR0-GPR31))            | - | R/W | GPRs mapped to SPR space
| 1     | 2         | ((DTLBEIR))               | - | W   | Data TLB Entry Invalidate Register
| 1     | 1024-1151 | ((DTLBW0MR0-DTLBW0MR127)) | - | R/W | Data TLB Match Registers Way 0
| 1     | 1536-1663 | ((DTLBW0TR0-DTLBW0TR127)) | - | R/W | Data TLB Translate Registers Way 0
| 2     | 2         | ((ITLBEIR))               | - | W   | Instruction TLB Entry Invalidate Register
| 2     | 1024-1151 | ((ITLBW0MR0-ITLBW0MR127)) | - | R/W | Instruction TLB Match Registers Way 0
| 2     | 1536-1663 | ((ITLBW0TR0-ITLBW0TR127)) | - | R/W | Instruction TLB Translate Registers Way 0
| 3     | 0     | ((DCCR))      | -     | R/W   | DC Control Register
| 3     | 2     | ((DCBFR))     | W     | W     | DC Block Flush Register
| 3     | 3     | ((DCBIR))     | W     | W     | DC Block Invalidate Register
| 3     | 4     | ((DCBWR))     | W     | W     | DC Block Write-back Register
| 4     | 0     | ((ICCR))      | -     | R/W   | IC Control Register
| 4     | 256   | ((ICBIR))     | W     | W     | IC Block Invalidate Register
| 5     | 256   | ((MACLO))     | R/W   | R/W   | MAC Low
| 5     | 257   | ((MACHI))     | R/W   | R/W   | MAC High
| 6     | 16    | ((DMR1))      | -     | R/W   | Debug Mode Register 1
| 6     | 17    | ((DMR2))      | -     | R/W   | Debug Mode Register 2
| 6     | 20    | ((DSR))       | -     | R/W   | Debug Stop Register
| 6     | 21    | ((DRR))       | -     | R/W   | Debug Reason Register
| 8     | 0     | ((PMR))       | -     | R/W   | Power Management Register
| 9     | 0     | ((PICMR))     | -     | R/W   | PIC Mask Register
| 9     | 2     | ((PICSR))     | -     | R/W   | PIC Status Register
| 10    | 0     | ((TTMR))      | -     | R/W   | Tick Timer Mode Register
| 10    | 1     | ((TTCR))      | R*    | R/W   | Tick Timer Count Register
|============================================================================

<<regs_table>> lists all OpenRISC 1000 special-purpose registers implemented
in OR1200. Registers VR and UPR are described below. For description of
other registers refer to <<or1000_manual>>.
 
Register VR description
~~~~~~~~~~~~~~~~~~~~~~~
Special-purpose register VR identifies the version (model) and revision
level of the OpenRISC 1000 processor. It also specifies possible standard
template on which this implementation is based.
(((Register,VR)))

[[vr_reg_table]]
.VR Register
[width="95%",options="header"]
|============================================================
| Bit # | Access | Reset    | Short Name | Description
| 5:0   | R      | Revision | REV        | Revision number of this document.
| 15:6  | R      | 0x0      | -          | Reserved
| 23:16 | R      | 0x00     | CFG        | Configuration should be read from UPR and configuration registers
| 31:24 | R      | 0x12     | VER        | Version number for OR1200 is fixed at 0x12.
|============================================================
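
The field layout in the VR table translates into simple shifts and masks. The
Python sketch below is illustrative only. The function name `decode_vr` and
the sample value `0x12000008` are hypothetical and not read from real
hardware.

```python
def decode_vr(vr):
    """Split a 32-bit VR value into the fields listed in the VR table."""
    return {
        "REV": vr & 0x3F,          # bits 5:0
        "CFG": (vr >> 16) & 0xFF,  # bits 23:16
        "VER": (vr >> 24) & 0xFF,  # bits 31:24
    }


# Hypothetical value: VER = 0x12, CFG = 0x00, REV = 8.
fields = decode_vr(0x12000008)
print({name: hex(value) for name, value in fields.items()})
```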
 
Register UPR description
~~~~~~~~~~~~~~~~~~~~~~~~
Special-purpose register UPR identifies the units present in the processor. It
has a bit for each implemented unit or functionality. Lower sixteen bits
identify present units defined in the OpenRISC 1000 architecture. Upper
sixteen bits define present custom units.
(((Register,UPR)))

[[upr_reg_table]]
.UPR Register
[width="95%",options="header"]
|============================================================
| Bit # | Access | Reset | Short Name | Description
| 0     | R      | 1     | UP         | UPR present
| 1     | R      | 1     | DCP        | Data cache present[†]
| 2     | R      | 1     | ICP        | Instruction cache present[†]
| 3     | R      | 1     | DMP        | Data MMU present[†]
| 4     | R      | 1     | IMP        | Instruction MMU present[†]
| 5     | R      | 1     | MP         | MAC present[†]
| 6     | R      | 1     | DUP        | Debug unit present[†]
| 7     | R      | 0     | PCUP       | Performance counters unit not present[†]
| 8     | R      | 1     | PMP        | Power management present[†]
| 9     | R      | 1     | PICP       | Programmable interrupt controller present
| 10    | R      | 1     | TTP        | Tick timer present
| 11    | R      | 1     | FPP        | Floating point present[†]
| 23:12 | R      | X     | -          | Reserved
| 31:24 | R      | 0xXX  | CUP        | The user of the OR1200 core adds custom units.
|============================================================
[†]: If enabled at synthesis time
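
Software can test the presence bits before touching an optional unit. The
following Python sketch lists the units flagged in a UPR value. The helper
names and the sample value `0x77F` are hypothetical. The bit order follows the
UPR table.

```python
# Short names for UPR bits 0 to 11, in bit order, from the UPR table.
UPR_BITS = [
    "UP", "DCP", "ICP", "DMP", "IMP", "MP",
    "DUP", "PCUP", "PMP", "PICP", "TTP", "FPP",
]


def present_units(upr):
    """Return the short names of all units whose presence bit is set."""
    return [name for bit, name in enumerate(UPR_BITS) if (upr >> bit) & 1]


def custom_units(upr):
    """Return the CUP field, bits 31:24."""
    return (upr >> 24) & 0xFF


# Hypothetical value: everything present except PCUP and FPP.
print(present_units(0x77F))
```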
 
Register CPUCFGR description
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Special-purpose register CPUCFGR identifies the capabilities and configuration
of the CPU.
(((Register,CPUCFGR)))

[[cpucfgr_reg_table]]
.CPUCFGR Register
[width="95%",options="header"]
|============================================================
| Bit # | Access | Reset | Short Name | Description
| 3:0   | R      | 0x0   | NSGF       | Zero number of shadow GPR files
| 4     | R      | 0     | HGF        | No half GPR files[†]
| 5     | R      | 1     | OB32S      | ORBIS32 supported
| 6     | R      | 0     | OB64S      | ORBIS64 not supported
| 7     | R      | 1     | OF32S      | ORFPX32 supported[‡]
| 8     | R      | 0     | OF64S      | ORFPX64 not supported
| 9     | R      | 0     | OV64S      | ORVDX64 not supported
|============================================================
[†]: If disabled at synthesis time

[‡]: If FPU enabled at synthesis time
 
Register DMMUCFGR description
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Special-purpose register DMMUCFGR identifies the capabilities and configuration
of the DMMU.
(((Register,DMMUCFGR)))

[[dmmucfgr_reg_table]]
.DMMUCFGR Register
[width="95%",options="header"]
|============================================================
| Bit # | Access | Reset     | Short Name | Description
| 1:0   | R      | 0x0       | NTW        | One DTLB way
| 4:2   | R      | 0x4 - 0x7 | NTS        | 16, 32, 64 or 128 DTLB sets
| 7:5   | R      | 0x0       | NAE        | No ATB Entries
| 8     | R      | 0         | CRI        | No DMMU control register implemented
| 9     | R      | 0         | PRI        | No protection register implemented
| 10    | R      | 1         | TEIRI      | DTLB entry invalidate register implemented
| 11    | R      | 0         | HTR        | No hardware DTLB reload
|============================================================
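
The NTS reset values 0x4 to 0x7 correspond to 16, 32, 64 and 128 sets, so the
field reads as a base-2 logarithm of the set count. The Python sketch below
decodes a DMMUCFGR value on that basis. The function name and the sample value
`0x418` are hypothetical. Since the IMMUCFGR layout is the same, the decoder
would apply to it unchanged.

```python
def decode_dmmucfgr(value):
    """Decode the DMMUCFGR fields listed in the table above."""
    nts = (value >> 2) & 0x7  # bits 4:2
    return {
        "NTW": value & 0x3,                # bits 1:0, 0x0 means one way
        "sets": 1 << nts,                  # NTS 0x4..0x7 gives 16..128 sets
        "NAE": (value >> 5) & 0x7,         # bits 7:5, number of ATB entries
        "CRI": bool((value >> 8) & 1),     # control register implemented
        "PRI": bool((value >> 9) & 1),     # protection register implemented
        "TEIRI": bool((value >> 10) & 1),  # TLB entry invalidate register
        "HTR": bool((value >> 11) & 1),    # hardware TLB reload
    }


# Hypothetical value: NTS = 0x6 (64 sets) with TEIRI set.
print(decode_dmmucfgr(0x418))
```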
 
Register IMMUCFGR description
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Special-purpose register IMMUCFGR identifies the capabilities and configuration
of the IMMU.
(((Register,IMMUCFGR)))

[[immucfgr_reg_table]]
.IMMUCFGR Register
[width="95%",options="header"]
|============================================================
| Bit # | Access | Reset     | Short Name | Description
| 1:0   | R      | 0x0       | NTW        | One ITLB way
| 4:2   | R      | 0x4 - 0x7 | NTS        | 16, 32, 64 or 128 ITLB sets
| 7:5   | R      | 0x0       | NAE        | No ATB Entries
| 8     | R      | 0         | CRI        | No IMMU control register implemented
| 9     | R      | 0         | PRI        | No protection register implemented
| 10    | R      | 1         | TEIRI      | ITLB entry invalidate register implemented
| 11    | R      | 0         | HTR        | No hardware ITLB reload
|============================================================
 
Register DCCFGR description
~~~~~~~~~~~~~~~~~~~~~~~~~~~
Special-purpose register DCCFGR identifies the capabilities and configuration
of the data cache.
(((Register,DCCFGR)))

[[dccfgr_reg_table]]
.DCCFGR Register
[width="95%",options="header"]
|============================================================
| Bit # | Access | Reset     | Short Name | Description
| 2:0   | R      | 0x0       | NCW        | One DC way
| 6:3   | R      | 0x4 - 0x7 | NCS        | 16, 32, 64 or 128 DC sets
| 7     | R      | 0x0       | CBS        | 16-byte cache block size
| 8     | R      | 0         | CWS        | Cache write-through strategy[†]
| 9     | R      | 1         | CCRI       | DC control register implemented
| 10    | R      | 1         | CBIRI      | DC block invalidate register implemented
| 11    | R      | 0         | CBPRI      | DC block prefetch register not implemented
| 12    | R      | 0         | CBLRI      | DC block lock register not implemented
| 13    | R      | 1         | CBFRI      | DC block flush register implemented
| 14    | R      | 1         | CBWBRI     | DC block write-back register implemented[‡]
|============================================================
[†]: If disabled at synthesis time

[‡]: If FPU enabled at synthesis time
 
Register ICCFGR description
~~~~~~~~~~~~~~~~~~~~~~~~~~~
Special-purpose register ICCFGR identifies the capabilities and configuration
of the instruction cache.
(((Register,ICCFGR)))

[[iccfgr_reg_table]]
.ICCFGR Register
[width="95%",options="header"]
|============================================================
| Bit # | Access | Reset     | Short Name | Description
| 2:0   | R      | 0x0       | NCW        | One IC way
| 6:3   | R      | 0x4 - 0x7 | NCS        | 16, 32, 64 or 128 IC sets
| 7     | R      | 0x0       | CBS        | 16-byte cache block size
| 8     | R      | 0         | CWS        | Cache write-through strategy
| 9     | R      | 1         | CCRI       | IC control register implemented
| 10    | R      | 1         | CBIRI      | IC block invalidate register implemented
| 11    | R      | 0         | CBPRI      | IC block prefetch register not implemented
| 12    | R      | 0         | CBLRI      | IC block lock register not implemented
| 13    | R      | 1         | CBFRI      | IC block flush register implemented
| 14    | R      | 0         | CBWBRI     | IC block write-back register not implemented
|============================================================
 
Register DCFGR description
~~~~~~~~~~~~~~~~~~~~~~~~~~
Special-purpose register DCFGR identifies the capabilities and configuration
of the debug unit.
(((Register,DCFGR)))

[[dcfgr_reg_table]]
.DCFGR Register
[width="95%",options="header"]
|============================================================
| Bit # | Access | Reset | Short Name | Description
| 3:0   | R      | 0x0   | NDP        | Zero DVR/DCR pairs[†]
| 4     | R      | 0     | WPCI       | Watchpoint counters not implemented
|============================================================
[†]: If hardware breakpoints disabled at synthesis time
 
((IO ports))
------------
OR1200 IP core has several interfaces. <<core_interfaces_fig>> below shows
all interfaces:

* Instruction and data WISHBONE host interfaces
* Power management interface
* Development interface
* Interrupts interface

[[core_interfaces_fig]]
.Core's Interfaces
image::img/core_interfaces.gif[scaledwidth="50%",align="center"]
 
Instruction WISHBONE Master Interface
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
OR1200 has two master WISHBONE Rev B compliant interfaces. Instruction
interface is used to connect OR1200 core to memory subsystem for purpose of
fetching instructions or instruction cache lines.

[[inst_wb_master_table]]
.Instruction WISHBONE Master Interface Signals
[width="95%",options="header"]
|====================================================
| Port          | Width | Direction | Description
| ((iwb_CLK_I)) | 1     | Input     | Clock input
| ((iwb_RST_I)) | 1     | Input     | Reset input
| ((iwb_CYC_O)) | 1     | Output    | Indicates valid bus cycle (core select)
| ((iwb_ADR_O)) | 32    | Output    | Address outputs
| ((iwb_DAT_I)) | 32    | Input     | Data inputs
| ((iwb_DAT_O)) | 32    | Output    | Data outputs
| ((iwb_SEL_O)) | 4     | Output    | Indicates valid bytes on data bus (during valid cycle it must be 0xf)
| ((iwb_ACK_I)) | 1     | Input     | Acknowledgment input (indicates normal transaction termination)
| ((iwb_ERR_I)) | 1     | Input     | Error acknowledgment input (indicates an abnormal transaction termination)
| ((iwb_RTY_I)) | 1     | Input     | In OR1200 treated same way as iwb_ERR_I.
| ((iwb_WE_O))  | 1     | Output    | Write transaction when asserted high
| ((iwb_STB_O)) | 1     | Output    | Indicates valid data transfer cycle
|====================================================
 
Data WISHBONE Master Interface
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
OR1200 has two master WISHBONE Rev B compliant interfaces. Data interface
is used to connect OR1200 core to external peripherals and memory subsystem
for purpose of reading and writing data or data cache lines.

[[data_wb_master_table]]
.Data WISHBONE Master Interface Signals
[width="95%",options="header"]
|====================================================
| Port          | Width | Direction | Description
| ((dwb_CLK_I)) | 1     | Input     | Clock input
| ((dwb_RST_I)) | 1     | Input     | Reset input
| ((dwb_CYC_O)) | 1     | Output    | Indicates valid bus cycle (core select)
| ((dwb_ADR_O)) | 32    | Output    | Address outputs
| ((dwb_DAT_I)) | 32    | Input     | Data inputs
| ((dwb_DAT_O)) | 32    | Output    | Data outputs
| ((dwb_SEL_O)) | 4     | Output    | Indicates valid bytes on data bus (during valid cycle it must be 0xf)
| ((dwb_ACK_I)) | 1     | Input     | Acknowledgment input (indicates normal transaction termination)
| ((dwb_ERR_I)) | 1     | Input     | Error acknowledgment input (indicates an abnormal transaction termination)
| ((dwb_RTY_I)) | 1     | Input     | In OR1200 treated same way as dwb_ERR_I.
| ((dwb_WE_O))  | 1     | Output    | Write transaction when asserted high
| ((dwb_STB_O)) | 1     | Output    | Indicates valid data transfer cycle
|====================================================
 
System Interface
~~~~~~~~~~~~~~~~
System interface connects reset, clock and other system signals to the
OR1200 core.

[[sys_interface_table]]
.System Interface Signals
[width="95%",options="header"]
|====================================================
| Port         | Width | Direction | Description
| ((Rst))      | 1     | Input     | Asynchronous reset
| ((clk_cpu))  | 1     | Input     | Main clock input to the RISC
| ((clk_dc))   | 1     | Input     | Data cache clock
| ((clk_ic))   | 1     | Input     | Instruction cache clock
| ((clk_dmmu)) | 1     | Input     | Data MMU clock
| ((clk_immu)) | 1     | Input     | Instruction MMU clock
| ((clk_tt))   | 1     | Input     | Tick timer clock
|====================================================
 
Development Interface
~~~~~~~~~~~~~~~~~~~~~
Development interface connects external development port to the RISC's internal
debug facility. Debug facility allows control over program execution inside
RISC, setting of breakpoints and watchpoints, and tracing of instruction
and data flows.

[[dev_interface_table]]
.Development Interface
[width="95%",options="header"]
|====================================================
| Port            | Width | Direction | Description
| ((dbg_dat_o))   | 32    | Output    | Transfer of data from RISC to external development interface
| ((dbg_dat_i))   | 32    | Input     | Transfer of data from external development interface to RISC
| ((dbg_adr_i))   | 32    | Input     | Address of special-purpose register to be read or written
| ((dbg_op_i))    | 3     | Input     | Operation select for development interface
| ((dbg_lss_o))   | 4     | Output    | Status of load/store unit
| ((dbg_is_o))    | 2     | Output    | Status of instruction fetch unit
| ((dbg_wp_o))    | 11    | Output    | Status of watchpoints
| ((dbg_bp_o))    | 1     | Output    | Status of the breakpoint
| ((dbg_stall_i)) | 1     | Input     | Stalls RISC CPU core
| ((dbg_ewt_i))   | 1     | Input     | External watchpoint trigger
|====================================================
 
Power Management Interface
~~~~~~~~~~~~~~~~~~~~~~~~~~
Power management interface provides signals for interfacing RISC core with
external power management circuitry. External power management circuitry is
required to implement functions that are technology specific and cannot be
implemented inside OR1200 core.

[[pow_mgmt_interface_table]]
.Power Management Interface
[width="95%",options="header"]
|============================================================================
| Port             | Width | Direction | Generation      | Description
| ((pm_clksd))     | 4     | Output    | Static (in SW)  | Slow down outputs that control reduction of RISC clock frequency
| ((pm_cpustall))  | 1     | Input     | -               | Synchronous stall of the RISC’s CPU core
| ((pm_dc_gate))   | 1     | Output    | Dynamic (in HW) | Gating of data cache clock
| ((pm_ic_gate))   | 1     | Output    | Dynamic (in HW) | Gating of instruction cache clock
| ((pm_dmmu_gate)) | 1     | Output    | Dynamic (in HW) | Gating of data MMU clock
| ((pm_immu_gate)) | 1     | Output    | Dynamic (in HW) | Gating of instruction MMU clock
| ((pm_tt_gate))   | 1     | Output    | Dynamic (in HW) | Gating of tick timer clock
| ((pm_cpu_gate))  | 1     | Output    | Static (in SW)  | Gating of main CPU clock
| ((pm_wakeup))    | 1     | Output    | Dynamic (in HW) | Activate all clocks
| ((pm_lvolt))     | 1     | Output    | Static (in SW)  | Lower voltage
|============================================================================
 
Interrupt Interface
~~~~~~~~~~~~~~~~~~~
Interrupt interface has interrupt inputs for interfacing external peripherals'
interrupt outputs to the RISC core. All interrupt inputs are evaluated on
positive edge of main RISC clock.

[[interrupt_interface_table]]
.Interrupt Interface
[width="95%",options="header"]
|============================================================
| Port         | Width    | Direction | Description
| ((pic_ints)) | PIC_INTS | Input     | External interrupts
|============================================================
 
[appendix]
Core HW Configuration
=====================
(((Hardware,Configuration)))
This section describes parameters that are set by the user of the core and
define configuration of the core. Parameters must be set by the user before
actual use of the core in simulation or synthesis.

[[core_hw_conf_table]]
.Core HW configuration table
[width="95%",options="header"]
|============================================================
| Variable Name   | Range      | Default | Description
| ((EADDR_WIDTH)) | 32         | 32      | Effective address width
| ((VADDR_WIDTH)) | 32         | 32      | Virtual address width
| ((PADDR_WIDTH)) | 24 - 36    | 32      | Physical address width
| ((DATA_WIDTH))  | 32         | 32      | Data width / Operation width
| ((DC_IMPL))     | 0 - 1      | 1       | Data cache implementation
| ((DC_SETS))     | 256 - 1024 | 512     | Data cache number of sets
| ((DC_WAYS))     | 1          | 1       | Data cache number of ways
| ((DC_LINE))     | 16 - 32    | 16      | Data cache line size
| ((IC_IMPL))     | 0 - 1      | 1       | Instruction cache implementation
| ((IC_SETS))     | 32 - 1024  | 512     | Instruction cache number of sets
| ((IC_WAYS))     | 1          | 1       | Instruction cache number of ways
| ((IC_LINE))     | 16 - 32    | 16      | Instruction cache line size in bytes
| ((DMMU_IMPL))   | 0 - 1      | 1       | Data MMU implementation
| ((DTLB_SETS))   | 64         | 64      | Data TLB number of sets
| ((DTLB_WAYS))   | 1          | 1       | Data TLB number of ways
| ((IMMU_IMPL))   | 0 - 1      | 1       | Instruction MMU implementation
| ((ITLB_SETS))   | 64         | 64      | Instruction TLB number of sets
| ((ITLB_WAYS))   | 1          | 1       | Instruction TLB number of ways
| ((PIC_INTS))    | 2 - 32     | 20      | Number of interrupt inputs
|============================================================
 
:numbered!:

[bibliography]
((Bibliography))
================
[bibliography]
- [[[or1000_manual]]] Damjan Lampret et al. 'OpenRISC 1000 System Architecture
  Manual'. 2004.

[index]
Index
=====
// The index is generated automatically by the DocBook toolchain.
