@c
@c COPYRIGHT (c) 1988-2002.
@c On-Line Applications Research Corporation (OAR).
@c All rights reserved.
@c
@c timing.t,v 1.3 2002/01/17 21:47:44 joel Exp
@c

@chapter Timing Specification

@section Introduction

This chapter provides information pertaining to the
measurement of the performance of RTEMS, the methods of
gathering the timing data, and the usefulness of the data. Also
discussed are other time-critical aspects of RTEMS that affect
an application's design and ultimate throughput. These aspects
include determinancy, interrupt latency, and context switch times.

@section Philosophy

Benchmarks are commonly used to evaluate the
performance of software and hardware. Benchmarks can be an
effective tool when comparing systems. Unfortunately,
benchmarks can also be manipulated to justify virtually any
claim. Benchmarks of real-time executives are difficult to
evaluate for a variety of reasons. Executives vary in the
robustness of features and options provided. Even when
executives compare favorably in functionality, it is quite
likely that different methodologies were used to obtain the
timing data. Another problem is that some executives provide
times for only a small subset of directives. This is typically
justified by claiming that these are the only time-critical
directives. The performance of some executives is also very
sensitive to the number of objects in the system. To obtain any
measure of usefulness, the performance information provided for
an executive should address each of these issues.

When evaluating the performance of a real-time
executive, one typically considers the following areas:
determinancy, directive times, worst case interrupt latency, and
context switch time. Unfortunately, these areas do not have
standard measurement methodologies. This allows vendors to
manipulate the results such that their product is favorably
represented. We have attempted to provide useful and meaningful
timing information for RTEMS. To ensure the usefulness of our
data, the methodology and definitions used to obtain and
describe the data are also documented.

@subsection Determinancy

The correctness of data in a real-time system must
always be judged by its timeliness. In many real-time systems,
obtaining the correct answer does not necessarily solve the
problem. For example, in a nuclear reactor it is not enough to
determine that the core is overheating. This situation must be
detected and acknowledged early enough that corrective action
can be taken and a meltdown avoided.

Consequently, a system designer must be able to
predict the worst-case behavior of the application running under
the selected executive. In this light, it is important that a
real-time system perform consistently regardless of the number
of tasks, semaphores, or other resources allocated. An
important design goal of a real-time executive is that all
internal algorithms be fixed-cost. Unfortunately, this goal is
difficult to completely meet without sacrificing the robustness
of the executive's feature set.

Many executives use the term deterministic to mean
that the execution times of their services can be predicted.
However, they often provide formulas to modify execution times
based upon the number of objects in the system. This usage is
in sharp contrast to the notion of deterministic meaning fixed
cost.

Almost all RTEMS directives execute in a fixed amount
of time regardless of the number of objects present in the
system. The primary exception occurs when a task blocks while
acquiring a resource and specifies a non-zero timeout interval.

Other exceptions are message queue broadcast,
obtaining a variable length memory block, object name to ID
translation, and deleting a resource upon which tasks are
waiting. In addition, the time required to service a clock tick
interrupt is based upon the number of timeouts and other
"events" which must be processed at that tick. This second
group is composed primarily of capabilities which are inherently
non-deterministic but are infrequently used in time critical
situations. The major exception is that of servicing a clock
tick. However, most applications have a very small number of
timeouts which expire at exactly the same millisecond (usually
none, but occasionally two or three).

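The difference between a fixed-cost operation and one whose cost
depends on the number of objects can be illustrated with a minimal
C sketch. This is not RTEMS source code: the table layout and the
names object_t, object_create, object_from_id, and
object_id_from_name are hypothetical, chosen only to contrast a
direct lookup with a linear name search.

@verbatim
```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_OBJECTS 8  /* hypothetical table size */

/* Hypothetical object table entry; not the RTEMS object manager. */
typedef struct {
    char name[5];  /* four-character name plus terminator */
    int  in_use;
} object_t;

static object_t table[MAX_OBJECTS];

/* Register an object in a given slot (illustration only). */
void object_create(size_t id, const char *name)
{
    assert(id < MAX_OBJECTS && strlen(name) < sizeof table[id].name);
    strcpy(table[id].name, name);
    table[id].in_use = 1;
}

/* Fixed cost: the ID indexes the table directly, so the work done
 * does not depend on how many objects exist. */
object_t *object_from_id(size_t id)
{
    assert(id < MAX_OBJECTS);
    return table[id].in_use ? &table[id] : NULL;
}

/* Variable cost: the table is searched linearly, so the number of
 * comparisons depends on where the name is found. Returns the ID,
 * or -1 if the name is absent; *examined receives the number of
 * entries inspected. */
int object_id_from_name(const char *name, size_t *examined)
{
    *examined = 0;
    for (size_t i = 0; i < MAX_OBJECTS; i++) {
        (*examined)++;
        if (table[i].in_use && strcmp(table[i].name, name) == 0)
            return (int)i;
    }
    return -1;
}
```
@end verbatim

In this sketch, object_from_id performs the same work however many
entries are in use, while object_id_from_name inspects more entries
the later a name appears in the table. A designer predicting
worst-case behavior must budget for the full search in the second
case.
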
@subsection Interrupt Latency

Interrupt latency is the delay between the CPU's
receipt of an interrupt request and the execution of the first
application-specific instruction in an interrupt service
routine. Interrupts are a critical component of most real-time
applications and it is critical that they be acted upon as
quickly as possible.

Knowledge of the worst case interrupt latency of an
executive aids the application designer in determining the
maximum period of time between the generation of an interrupt
and an interrupt handler responding to that interrupt. The
interrupt latency of a system is the greater of the executive's
and the application's interrupt latency. If the application
disables interrupts longer than the executive, then the
application's interrupt latency is the system's worst case
interrupt disable period.

The worst case interrupt latency for a real-time
executive is based upon the following components:

@itemize @bullet
@item the longest period of time interrupts are disabled
by the executive,

@item the overhead required by the executive at the
beginning of each ISR,

@item the time required for the CPU to vector the
interrupt, and

@item for some microprocessors, the length of the longest
instruction.
@end itemize

The first component is irrelevant if an interrupt
occurs when interrupts are enabled, although it must be included
in a worst case analysis. The third and fourth components are
particular to a CPU implementation and are not dependent on the
executive. The fourth component is ignored by this document
because most applications use only a subset of a
microprocessor's instruction set. Because of this, the longest
instruction actually executed is application dependent. The
worst case interrupt latency of an executive is typically
defined as the sum of components (1) and (2). The second
component includes the time necessary for RTEMS to save registers
and vector to the user-defined handler. RTEMS includes the
third component, the time required for the CPU to vector the
interrupt, because it is a required part of any interrupt.

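The relationship between the components listed above can be
expressed as simple arithmetic. The following C sketch assumes
that all values share one unit (for example, microseconds); the
structure and function names are illustrative and are not RTEMS
identifiers. Adding the ISR entry overhead and the vectoring time
to the longer of the two interrupt disable periods is one reading
of the system-level definition given earlier, not a formula stated
in this chapter.

@verbatim
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Components of worst case interrupt latency, all in one unit.
 * Field names are hypothetical. */
typedef struct {
    uint32_t executive_disable_max; /* component 1 */
    uint32_t isr_entry_overhead;    /* component 2 */
    uint32_t cpu_vectoring;         /* component 3 */
} latency_components_t;

/* Executive figure as this chapter describes it for RTEMS:
 * components 1 and 2 plus the CPU vectoring time. */
uint32_t executive_worst_case_latency(const latency_components_t *c)
{
    assert(c != NULL);
    return c->executive_disable_max + c->isr_entry_overhead
         + c->cpu_vectoring;
}

/* System figure (assumption): the longer of the executive's and the
 * application's interrupt disable periods takes the place of
 * component 1. */
uint32_t system_worst_case_latency(const latency_components_t *c,
                                   uint32_t application_disable_max)
{
    assert(c != NULL);
    uint32_t disable_max = c->executive_disable_max;
    if (application_disable_max > disable_max)
        disable_max = application_disable_max;
    return disable_max + c->isr_entry_overhead + c->cpu_vectoring;
}
```
@end verbatim

With made-up component values of 10, 5, and 2, the executive figure
is 17. An application that disables interrupts for 25 raises the
system figure to 32, while one that disables them for only 4 leaves
it at 17.
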
Many executives report the maximum interrupt disable
period as their interrupt latency and ignore the other
components. This results in very low worst-case interrupt
latency times which are not indicative of actual application
performance. The definition used by RTEMS results in a higher
interrupt latency being reported, but accurately reflects the
longest delay between the CPU's receipt of an interrupt request
and the execution of the first application-specific instruction
in an interrupt service routine.

The actual interrupt latency times are reported in
the Timing Data chapter of this supplement.

@subsection Context Switch Time

An RTEMS context switch is defined as the act of
taking the CPU from the currently executing task and giving it
to another task. This process involves the following components:

@itemize @bullet
@item Saving the hardware state of the current task.

@item Optionally, invoking the TASK_SWITCH user extension.

@item Restoring the hardware state of the new task.
@end itemize

RTEMS defines the hardware state of a task to include
the CPU's data registers, address registers, and, optionally,
floating point registers.

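The three steps above can be sketched in portable C against a
simulated register file. This is only an illustration of the
sequence: a real context switch is written in processor-specific
assembly language, and the types cpu_state_t and task_t, the
register counts, and the functions context_switch and
count_switches are hypothetical names, not RTEMS interfaces. The
sketch also assumes that floating point registers are saved and
restored only for tasks marked as floating point.

@verbatim
```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NUM_REGS 8  /* register counts are illustrative */

/* Simulated CPU register file. */
typedef struct {
    uint32_t data[NUM_REGS];
    uint32_t address[NUM_REGS];
    double   fp[NUM_REGS];
} cpu_state_t;

typedef struct {
    cpu_state_t saved;             /* state kept while not running */
    bool        is_floating_point; /* task uses the FP registers */
} task_t;

/* Hypothetical stand-in for a task switch user extension. */
typedef void (*task_switch_extension_t)(task_t *executing, task_t *heir);

static unsigned switch_count;

/* Example extension that only counts the switches it observes. */
void count_switches(task_t *executing, task_t *heir)
{
    (void)executing;
    (void)heir;
    switch_count++;
}

void context_switch(cpu_state_t *cpu, task_t *executing, task_t *heir,
                    task_switch_extension_t extension)
{
    assert(cpu != NULL && executing != NULL && heir != NULL);

    /* 1. Save the hardware state of the current task. */
    memcpy(executing->saved.data, cpu->data, sizeof cpu->data);
    memcpy(executing->saved.address, cpu->address, sizeof cpu->address);
    if (executing->is_floating_point)
        memcpy(executing->saved.fp, cpu->fp, sizeof cpu->fp);

    /* 2. Optionally invoke the task switch extension. */
    if (extension != NULL)
        extension(executing, heir);

    /* 3. Restore the hardware state of the new task. */
    memcpy(cpu->data, heir->saved.data, sizeof cpu->data);
    memcpy(cpu->address, heir->saved.address, sizeof cpu->address);
    if (heir->is_floating_point)
        memcpy(cpu->fp, heir->saved.fp, sizeof cpu->fp);
}
```
@end verbatim

The extra copies made for a floating point task show why a context
switch figure measured only with non-floating point tasks understates
the cost seen by an application that uses floating point.
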
Context switch time is often touted as a performance
measure of real-time executives. However, a context switch is
performed as part of a directive's actions and should be viewed
as such when designing an application. For example, if a task
is unable to acquire a semaphore and blocks, a context switch is
required to transfer control from the blocking task to a new
task. From the application's perspective, the context switch is
a direct result of not acquiring the semaphore. In this light,
the context switch time is no more relevant than the performance
of any other of the executive's subroutines which are not
directly accessible by the application.

In spite of the inappropriateness of using the
context switch time as a performance metric, RTEMS context
switch times for floating point and non-floating point tasks
are provided for comparison purposes. Of the executives which
actually support floating point operations, many do not report
context switch times for floating point tasks.
This results in a reported context switch time which is
meaningless for an application with floating point tasks.

The actual context switch times are reported in the
Timing Data chapter of this supplement.

@subsection Directive Times

Directives are the application's interface to the
executive, and as such their execution times are critical in
determining the performance of the application. For example, an
application using a semaphore to protect a critical data
structure should be aware of the time required to acquire and
release a semaphore. In addition, the application designer can
utilize the directive execution times to evaluate the
performance of different synchronization and communication
mechanisms.

The actual directive execution times are reported in
the Timing Data chapter of this supplement.

@section Methodology

@subsection Software Platform

The RTEMS timing suite is written in C. The overhead
of passing arguments to RTEMS by C is not timed. The times
reported represent the amount of time from entering to exiting
RTEMS.

The tests are based upon one of two execution models:
(1) single invocation times, and (2) average times of repeated
invocations. Single invocation times are provided for
directives which cannot easily be invoked multiple times in the
same scenario. For example, the times reported for entering and
exiting an interrupt service routine are single invocation
times. The second model is used for directives which can easily
be invoked multiple times in the same scenario. For example,
the times reported for semaphore obtain and semaphore release
are averages of multiple invocations. At least 100 invocations
are used to obtain the average.

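The second execution model can be sketched as a timed loop. The
code below is a minimal illustration and not the RTEMS timing suite
itself: the function names, the function-pointer timer interface,
and the simulated timer are all hypothetical. Subtracting a
separately measured baseline loop is an assumption about how
overhead that should not be timed might be excluded.

@verbatim
```c
#include <assert.h>
#include <stdint.h>

#define MIN_INVOCATIONS 100u  /* the text states at least 100 */

typedef uint32_t (*read_timer_fn)(void); /* current tick count */
typedef void (*operation_fn)(void);      /* one invocation under test */

/* Ticks elapsed while invoking op() count times. */
static uint32_t time_loop(read_timer_fn read_timer, operation_fn op,
                          uint32_t count)
{
    uint32_t start = read_timer();
    for (uint32_t i = 0; i < count; i++)
        op();
    return read_timer() - start;
}

/* Average ticks per invocation of op(), net of the cost measured
 * for the baseline() loop. */
uint32_t average_time(read_timer_fn read_timer, operation_fn baseline,
                      operation_fn op, uint32_t count)
{
    assert(count >= MIN_INVOCATIONS);
    uint32_t overhead = time_loop(read_timer, baseline, count);
    uint32_t total    = time_loop(read_timer, op, count);
    return (total > overhead ? total - overhead : 0u) / count;
}

/* Simulated timer and operations so the sketch runs on a host. */
static uint32_t sim_ticks;
uint32_t sim_read_timer(void) { return sim_ticks; }
void sim_baseline(void)  { sim_ticks += 2u; } /* call overhead only */
void sim_directive(void) { sim_ticks += 9u; } /* overhead plus 7 ticks */
```
@end verbatim

With the simulated functions, calling average_time(sim_read_timer,
sim_baseline, sim_directive, 100) yields 7 ticks per invocation. On
real hardware the timer would be a free-running counter supplied by
the board support code.
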
@subsection Hardware Platform

Since RTEMS supports a variety of processors, the
hardware platform used to gather the benchmark times must also
vary. Therefore, for each processor supported the hardware
platform must be defined. Each definition will include a brief
description of the target hardware platform including the clock
speed, memory wait states encountered, and any other pertinent
information. This definition may be found in the processor
dependent timing data chapter within this supplement.

@subsection What is measured?

An effort was made to provide execution times for a
large portion of RTEMS. Times were provided for most directives
regardless of whether or not they are typically used in time
critical code. For example, execution times are provided for
all object create and delete directives, even though these are
typically part of application initialization.

The times include all RTEMS actions necessary in a
particular scenario. For example, all times for blocking
directives include the context switch necessary to transfer
control to a new task. Under no circumstances is it necessary
to add context switch time to the reported times.

The following list describes the objects created by
the timing suite:

@itemize @bullet
@item All tasks are non-floating point.

@item All tasks are created as local objects.

@item No timeouts are used on blocking directives.

@item All tasks wait for objects in FIFO order.

@end itemize

In addition, no user extensions are configured.

@subsection What is not measured?

The times presented in this document are not intended
to represent best or worst case times, nor are all directives
included. For example, no times are provided for the initialize
executive and fatal_error_occurred directives. Other than the
exceptions detailed in the Determinancy section, all directives
will execute in the fixed length of time given.

Other than entering and exiting an interrupt service
routine, all directives were executed from tasks and not from
interrupt service routines. Directives invoked from ISRs, when
allowable, will execute in slightly less time than when invoked
from a task because rescheduling is delayed until the interrupt
exits.

@subsection Terminology

The following is a list of phrases which are used to
distinguish individual execution paths of the directives taken
during the RTEMS performance analysis:

@table @b
@item another task
The directive was performed
on a task other than the calling task.

@item available
A task attempted to obtain a resource and
immediately acquired it.

@item blocked task
The task operated upon by the
directive was blocked waiting for a resource.

@item caller blocks
The requested resource was not
immediately available and the calling task chose to wait.

@item calling task
The task invoking the directive.

@item messages flushed
One or more messages were flushed
from the message queue.

@item no messages flushed
No messages were flushed from
the message queue.

@item not available
A task attempted to obtain a resource
and could not immediately acquire it.

@item no reschedule
The directive did not require a
rescheduling operation.

@item NO_WAIT
A resource was not available and the
calling task chose to return immediately via the NO_WAIT option
with an error.

@item obtain current
The current value of something was
requested by the calling task.

@item preempts caller
The release of a resource caused a
task of higher priority than the calling task to be readied and it
became the executing task.

@item ready task
The task operated upon by the directive
was in the ready state.

@item reschedule
The actions of the directive
necessitated a rescheduling operation.

@item returns to caller
The directive succeeded and
immediately returned to the calling task.

@item returns to interrupted task
The instructions
executed immediately following this interrupt will be in the
interrupted task.

@item returns to nested interrupt
The instructions
executed immediately following this interrupt will be in a
previously interrupted ISR.

@item returns to preempting task
The instructions
executed immediately following this interrupt or signal handler
will be in a task other than the interrupted task.

@item signal to self
The signal set was sent to the
calling task and signal processing was enabled.

@item suspended task
The task operated upon by the
directive was in the suspended state.

@item task readied
The release of a resource caused a
task of lower or equal priority to be readied and the calling
task remained the executing task.

@item yield
The act of attempting to voluntarily release
the CPU.

@end table
