@c
@c  COPYRIGHT (c) 1988-2002.
@c  On-Line Applications Research Corporation (OAR).
@c  All rights reserved.
@c
@c  codetuning.t,v 1.5 2002/01/17 21:47:45 joel Exp
@c

@chapter Code Tuning Parameters

@section Inline Thread_Enable_dispatch

Should the calls to _Thread_Enable_dispatch be inlined?

If TRUE, then they are inlined.

If FALSE, then a subroutine call is made.

This is an example of the classic trade-off of size versus speed.
Inlining the call (TRUE) typically increases the size of RTEMS while
speeding up the enabling of dispatching.

[NOTE: In general, the _Thread_Dispatch_disable_level will only be 0 or 1
unless you are in an interrupt handler and that interrupt handler invokes
the executive.]  When not inlined, the caller invokes
_Thread_Enable_dispatch, which in turn calls _Thread_Dispatch.  If the
enable dispatch is inlined, then one subroutine call is avoided entirely.

@example
#define CPU_INLINE_ENABLE_DISPATCH       FALSE
@end example
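
The following is a minimal sketch of this trade-off, not the RTEMS source.
It assumes that enabling dispatching decrements the disable level and
invokes the dispatcher when the level reaches zero.  The names
dispatch_disable_level, dispatch_count, thread_dispatch, disable_dispatch,
and enable_dispatch are hypothetical stand-ins for the executive's own
symbols.

```c
#ifndef TRUE
#define TRUE  1
#endif
#ifndef FALSE
#define FALSE 0
#endif

/* Mirrors the port setting discussed in this section. */
#define CPU_INLINE_ENABLE_DISPATCH FALSE

/* Hypothetical stand-ins for executive state; not the RTEMS definitions. */
static unsigned dispatch_disable_level = 0;
static unsigned dispatch_count = 0;

/* Stand-in for _Thread_Dispatch: only records that it was invoked. */
static void thread_dispatch(void)
{
    dispatch_count++;
}

/* Each disable must later be balanced by one enable. */
static void disable_dispatch(void)
{
    dispatch_disable_level++;
}

#if CPU_INLINE_ENABLE_DISPATCH == TRUE
/* Inlined form: the body is expanded at every call site (larger, faster). */
static inline void enable_dispatch(void)
{
    if (--dispatch_disable_level == 0)
        thread_dispatch();
}
#else
/* Subroutine form: one shared body reached through a call (smaller). */
void enable_dispatch(void);

void enable_dispatch(void)
{
    if (--dispatch_disable_level == 0)
        thread_dispatch();
}
#endif
```

In this sketch both forms behave identically.  Only the code size and the
number of subroutine calls on the enable path differ.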

@section Inline Thread_queue_Enqueue_priority

Should the body of the search loops in _Thread_queue_Enqueue_priority be
unrolled one time?  If unrolled, each iteration of the loop examines two
"nodes" on the chain being searched.  Otherwise, only one node is examined
per iteration.

If TRUE, then the loops are unrolled.

If FALSE, then the loops are not unrolled.

The primary factor in making this decision is the cost of disabling and
enabling interrupts (_ISR_Flash) versus the cost of the rest of the body of
the loop.  On some CPUs, the flash is more expensive than one iteration of
the loop body.  In this case, it might be desirable to unroll the loop.
It is important to note that on some CPUs, this code is the longest
interrupt disable period in RTEMS.  So it is necessary to strike a balance
when setting this parameter.

@example
#define CPU_UNROLL_ENQUEUE_PRIORITY      TRUE
@end example
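
The effect of unrolling can be sketched with a simplified search over a
priority-ordered list.  This is an illustration only, not the body of
_Thread_queue_Enqueue_priority.  The node type, the flash_count counter,
and the functions isr_flash, search_simple, and search_unrolled are
hypothetical, and the list is singly linked to keep the example short.

```c
#include <stddef.h>

/* Hypothetical node type for illustration. */
struct node {
    struct node *next;
    unsigned     priority;
};

/* Counts how many times interrupts would be briefly re-enabled. */
static unsigned flash_count = 0;

/* Stand-in for _ISR_Flash: only records that a flash occurred. */
static void isr_flash(void)
{
    flash_count++;
}

/* Not unrolled: one node is examined per flash. */
static struct node *search_simple(struct node *head, unsigned priority)
{
    struct node *n = head;

    while (n != NULL && n->priority < priority) {
        isr_flash();
        n = n->next;
    }
    return n;
}

/* Unrolled once: two nodes are examined per flash. */
static struct node *search_unrolled(struct node *head, unsigned priority)
{
    struct node *n = head;

    while (n != NULL && n->priority < priority) {
        n = n->next;
        if (n == NULL || n->priority >= priority)
            break;
        isr_flash();
        n = n->next;
    }
    return n;
}
```

Both functions return the first node whose priority is not less than the
requested one.  The unrolled form flashes about half as often, so it spends
less time on flashes but keeps interrupts disabled longer between them.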
56
 
57
 
58
@section Structure Alignment Optimization
59
 
60
The following macro may be defined to the attribute setting used to force
61
alignment of critical RTEMS structures.  On some processors it may make
62
sense to have these aligned on tighter boundaries than the minimum
63
requirements of the compiler in order to have as much of the critical data
64
area as possible in a cache line.  This ensures that the first access of
65
an element in that structure fetches most, if not all, of the data
66
structure and places it in the data cache.  Modern CPUs often have cache
67
lines of at least 16 bytes and thus a single access implicitly fetches
68
some surrounding data and places that unreferenced data in the cache.
69
Taking advantage of this allows RTEMS to essentially prefetch critical
70
data elements.
71
 
72
The placement of this macro in the declaration of the variables is based
73
on the syntactically requirements of the GNU C "__attribute__" extension.
74
For another toolset, the placement of this macro could be incorrect.  For
75
example with GNU C, use the following definition of
76
CPU_STRUCTURE_ALIGNMENT to force a structures to a 32 byte boundary.
77
 
78
#define CPU_STRUCTURE_ALIGNMENT __attribute__ ((aligned (32)))
79
 
80
To benefit from using this, the data must be heavily used so it will stay
81
in the cache and used frequently enough in the executive to justify
82
turning this on.  NOTE:  Because of this, only the Priority Bit Map table
83
currently uses this feature.
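
As an illustration of the macro placement, the sketch below applies the
32 byte definition to a small table and checks the resulting address.  It
assumes a GNU C compatible compiler.  The table example_bit_map and the
function example_bit_map_is_aligned are hypothetical and are not the RTEMS
declarations.

```c
#include <stdint.h>

/* Assumes a GNU C compatible compiler (for example GCC or Clang). */
#define CPU_STRUCTURE_ALIGNMENT __attribute__ ((aligned (32)))

/*
 * Hypothetical 32 byte table.  The macro follows the declarator, which is
 * where GNU C accepts a variable attribute.
 */
static uint16_t example_bit_map[16] CPU_STRUCTURE_ALIGNMENT;

/* Returns 1 when the table starts on a 32 byte boundary. */
static int example_bit_map_is_aligned(void)
{
    return ((uintptr_t) example_bit_map % 32) == 0;
}
```

Because the table is 32 bytes long and starts on a 32 byte boundary, it
never straddles two cache lines on a CPU whose line size is 32 bytes or a
larger power of two.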

The following illustrates how the CPU_STRUCTURE_ALIGNMENT is defined on
ports which require no special alignment for optimized access to data
structures:

@example
#define CPU_STRUCTURE_ALIGNMENT
@end example

@section Data Alignment Requirements

@subsection Data Element Alignment

The CPU_ALIGNMENT macro should be set to the CPU's strictest alignment
requirement for data types, expressed in bytes.  This is typically the
alignment requirement for a C double.  This alignment does not take into
account the requirements for the stack.

The following sets the CPU_ALIGNMENT macro to 8, which indicates that there
is a basic C data type for this port which must be aligned to an 8 byte
boundary.

@example
#define CPU_ALIGNMENT              8
@end example
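
One way to sanity check a chosen value is to compare it against the
alignment the compiler reports for a few basic types.  The sketch below
assumes a C11 compiler, which is newer than the toolsets this chapter was
written for.  The helper is_cpu_aligned is hypothetical and is not part of
RTEMS.

```c
#include <stdalign.h>
#include <stdint.h>

#define CPU_ALIGNMENT 8

/* Compile-time checks (C11): these types must not need more than
 * CPU_ALIGNMENT bytes of alignment on the target. */
_Static_assert(alignof(double) <= CPU_ALIGNMENT,
               "double needs stricter alignment than CPU_ALIGNMENT");
_Static_assert(alignof(long long) <= CPU_ALIGNMENT,
               "long long needs stricter alignment than CPU_ALIGNMENT");
_Static_assert(alignof(void *) <= CPU_ALIGNMENT,
               "pointers need stricter alignment than CPU_ALIGNMENT");

/* Hypothetical helper: returns 1 when the address is a multiple of
 * CPU_ALIGNMENT. */
static int is_cpu_aligned(const void *address)
{
    return ((uintptr_t) address % CPU_ALIGNMENT) == 0;
}
```

If one of the assertions fails to compile on a given target, the value of
CPU_ALIGNMENT is too small for that type.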
109
 
110
@subsection Heap Element Alignment
111
 
112
The CPU_HEAP_ALIGNMENT macro is set to indicate the byte alignment
113
requirement for data allocated by the RTEMS Code Heap Handler.  This
114
alignment requirement may be stricter than that for the data types
115
alignment specified by CPU_ALIGNMENT.  It is common for the heap to follow
116
the same alignment requirement as CPU_ALIGNMENT.  If the CPU_ALIGNMENT is
117
strict enough for the heap, then this should be set to CPU_ALIGNMENT. This
118
macro is necessary to ensure that allocated memory is properly aligned for
119
use by high level language routines.
120
 
121
The following example illustrates how the CPU_HEAP_ALIGNMENT macro is set
122
when the required alignment for elements from the heap is the same as the
123
basic CPU alignment requirements.
124
 
125
@example
126
#define CPU_HEAP_ALIGNMENT         CPU_ALIGNMENT
127
@end example
128
 
129
NOTE:  This does not have to be a power of 2.  It does have to be greater
130
or equal to than CPU_ALIGNMENT.
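
Since the value need not be a power of 2, code that rounds a size up to
this alignment cannot rely on bit masking.  The following sketch shows one
possible approach using the remainder operator, together with a
preprocessor check of the lower bound.  The helper heap_align_up is
hypothetical and is not the RTEMS heap implementation.

```c
#include <stddef.h>

#define CPU_ALIGNMENT      8
#define CPU_HEAP_ALIGNMENT CPU_ALIGNMENT

/* Enforce the lower bound stated in the note above. */
#if CPU_HEAP_ALIGNMENT < CPU_ALIGNMENT
#error "CPU_HEAP_ALIGNMENT must be greater than or equal to CPU_ALIGNMENT"
#endif

/*
 * Hypothetical helper: round size up to the next multiple of
 * CPU_HEAP_ALIGNMENT.  The remainder operator is used instead of a bit
 * mask so that the result is correct for alignments that are not powers
 * of 2.
 */
static size_t heap_align_up(size_t size)
{
    size_t remainder = size % CPU_HEAP_ALIGNMENT;

    return (remainder == 0) ? size : size + (CPU_HEAP_ALIGNMENT - remainder);
}
```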
131
 
132
@subsection Partition Element Alignment
133
 
134
The CPU_PARTITION_ALIGNMENT macro is set to indicate the byte alignment
135
requirement for memory buffers allocated by the RTEMS Partition Manager
136
that is part of the Classic API.  This alignment requirement may be
137
stricter than that for the data types alignment specified by
138
CPU_ALIGNMENT.  It is common for the partition to follow the same
139
alignment requirement as CPU_ALIGNMENT.  If the CPU_ALIGNMENT is strict
140
enough for the partition, then this should be set to CPU_ALIGNMENT.  This
141
macro is necessary to ensure that allocated memory is properly aligned for
142
use by high level language routines.
143
 
144
The following example illustrates how the CPU_PARTITION_ALIGNMENT macro is
145
set when the required alignment for elements from the RTEMS Partition
146
Manager is the same as the basic CPU alignment requirements.
147
 
148
 
149
@example
150
#define CPU_PARTITION_ALIGNMENT    CPU_ALIGNMENT
151
@end example
152
 
153
NOTE:  This does not have to be a power of 2.  It does have to be greater
154
or equal to than CPU_ALIGNMENT.
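
To see why this alignment matters for fixed-size buffers, consider buffers
laid out back to back from a starting address.  If both the starting
address and the buffer size are multiples of the alignment, then every
buffer is aligned.  The sketch below illustrates that reasoning.  The
functions partition_layout_is_aligned and partition_buffer_address are
hypothetical, and the back-to-back layout is an assumption of the example
rather than a description of the Partition Manager's code.

```c
#include <stddef.h>
#include <stdint.h>

#define CPU_ALIGNMENT           8
#define CPU_PARTITION_ALIGNMENT CPU_ALIGNMENT

/*
 * Hypothetical check: returns 1 when the starting address and the buffer
 * size are both multiples of CPU_PARTITION_ALIGNMENT.
 */
static int partition_layout_is_aligned(uintptr_t start, size_t buffer_size)
{
    return (start % CPU_PARTITION_ALIGNMENT) == 0
        && (buffer_size % CPU_PARTITION_ALIGNMENT) == 0;
}

/* Address of the buffer at the given index, assuming buffers are laid out
 * back to back from start. */
static uintptr_t partition_buffer_address(uintptr_t start,
                                          size_t buffer_size,
                                          size_t index)
{
    return start + index * buffer_size;
}
```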