    Dhrystone Benchmark: Rationale for Version 2 and Measurement Rules

        [published in SIGPLAN Notices 23,8 (Aug. 1988), 49-62]


                 Reinhold P. Weicker
                 Siemens AG, E STE 35
                 [now: Siemens AG, AUT E 51]
                 Postfach 3220
                 D-8520 Erlangen
                 Germany (West)
 
 
1.  Why a Version 2 of Dhrystone?

The Dhrystone benchmark program [1] has become a popular benchmark for
CPU/compiler performance measurement, in particular in the area of
minicomputers, workstations, PC's and microprocessors.  It apparently satisfies
a need for an easy-to-use integer benchmark; it gives a first performance
indication which is more meaningful than MIPS numbers which, in their literal
meaning (million instructions per second), cannot be used across different
instruction sets (e.g. RISC vs. CISC).  With the increasing use of the
benchmark, it seems necessary to reconsider the benchmark and to check whether
it can still fulfill this function.  Version 2 of Dhrystone is the result of
such a re-evaluation; it has been made for two reasons:
 
o Dhrystone has been published in Ada [1], and versions in Ada, Pascal and C
  have been distributed by Reinhold Weicker via floppy disk.  However, the
  version that was used most often for benchmarking has been the version made
  by Rick Richardson by another translation from the Ada version into the C
  programming language; this has been the version distributed via the UNIX
  network Usenet [2].

  There is an obvious need for a common C version of Dhrystone, since C is at
  present the most popular system programming language for the class of
  systems (microcomputers, minicomputers, workstations) where Dhrystone is
  used most.  There should be, as far as possible, only one C version of
  Dhrystone such that results can be compared without restrictions.  In the
  past, the C versions distributed by Rick Richardson (Version 1.1) and by
  Reinhold Weicker had small (though not significant) differences.

  Together with the new C version, the Ada and Pascal versions have been
  updated as well.
 
o As far as it is possible without changes to the Dhrystone statistics,
  optimizing compilers should be prevented from removing significant
  statements.  It has turned out in the past that optimizing compilers
  suppressed code generation for too many statements (by "dead code removal"
  or "dead variable elimination").  This has led to the danger that
  benchmarking results obtained by a naive application of Dhrystone - without
  inspection of the code that was generated - could become meaningless.
 
The overall policy for version 2 has been that the distribution of
statements, operand types and operand locality described in [1] should remain
unchanged as much as possible.  (Very few changes were necessary; their impact
should be negligible.)  Also, the order of statements should remain unchanged.
Although I am aware of some critical remarks on the benchmark - I agree with
several of them - and know some suggestions for improvement, I didn't want to
change the benchmark into something different from what has become known as
"Dhrystone"; the confusion generated by such a change would probably outweigh
the benefits.  If I were to write a new benchmark program, I wouldn't give it
the name "Dhrystone" since this denotes the program published in [1].
However, I do recognize the need for a larger number of representative
programs that can be used as benchmarks; users should always be encouraged to
use more than just one benchmark.
 
The new versions (version 2.1 for C, Pascal and Ada) will be distributed as
widely as possible.  (Version 2.1 differs from version 2.0 distributed via the
UNIX network Usenet in March 1988 only in a few corrections for minor
deficiencies found by users of version 2.0.)  Readers who want to use the
benchmark for their own measurements can obtain a copy in machine-readable
form on floppy disk (MS-DOS or XENIX format) from the author.
 
 
2.  Overall Characteristics of Version 2

In general, version 2 follows - in the parts that are significant for
performance measurement, i.e. within the measurement loop - the published
(Ada) version and the C versions previously distributed.  Where the versions
distributed by Rick Richardson [2] and Reinhold Weicker have been different,
it follows the version distributed by Reinhold Weicker.  (However, the
differences have been so small that their impact on execution time in all
likelihood has been negligible.)  The initialization and UNIX instrumentation
part - which had been omitted in [1] - follows mostly the ideas of Rick
Richardson [2].  However, any changes in the initialization part and in the
printing of the result have no impact on performance measurement since they
are outside the measurement loop.  As a concession to older compilers, names
have been made unique within the first 8 characters for the C version.
 
The original publication of Dhrystone did not contain any statements for time
measurement since they are necessarily system-dependent.  However, it turned
out that it is not enough just to enclose the main procedure of Dhrystone in a
loop and to measure the execution time.  If the variables that are computed
are not used somehow, there is the danger that the compiler considers them as
"dead variables" and suppresses code generation for a part of the statements.
Therefore in version 2 all variables of "main" are printed at the end of the
program.  This also permits some plausibility control for correct execution of
the benchmark.
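The effect of printing the computed variables can be sketched in C as
follows.  This is only an illustration under stated assumptions, not the
Dhrystone source: the names "Results", "run_loop" and "print_results" and the
arithmetic inside the loop are hypothetical.  A sufficiently aggressive
compiler may still simplify a toy loop like this one; the sketch only shows
why results that are never used are at risk of removal.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for the benchmark body; NOT the Dhrystone code. */
typedef struct {
    int int_1;
    int int_2;
    int int_3;
} Results;

/* Computes a few values inside a measurement loop and returns them so
   that the caller can make use of them after the loop. */
Results run_loop(int number_of_runs)
{
    Results r = {0, 0, 0};
    int run_index;

    assert(number_of_runs >= 1);
    for (run_index = 1; run_index <= number_of_runs; ++run_index) {
        r.int_1 = 2;
        r.int_2 = 3;
        r.int_3 = r.int_2 * r.int_1 + run_index;  /* depends on the counter */
    }
    return r;
}

/* Printing every computed variable after the loop makes each value
   observable, so it cannot be discarded as a "dead variable".  The
   printed values also allow a plausibility check against the values
   expected for a correct run. */
void print_results(const Results *r)
{
    printf("Int_1: %d\n", r->int_1);
    printf("Int_2: %d\n", r->int_2);
    printf("Int_3: %d\n", r->int_3);
}
```

A caller would invoke run_loop with the desired number of runs and pass the
result to print_results once the timing has been taken.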
 
At several places in the benchmark, code has been added, but only in branches
that are not executed.  The intention is that optimizing compilers should be
prevented from moving code out of the measurement loop, or from removing code
altogether.  Statements that are executed have been changed in very few places
only.  In these cases, only the role of some operands has been changed, and it
was made sure that the numbers defining the "Dhrystone distribution"
(distribution of statements, operand types and locality) still hold as much as
possible.  Except for sophisticated optimizing compilers, execution times for
version 2.1 should be the same as for previous versions.
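The idea of a non-executed branch can be sketched as follows.  The function
"copy_in_loop" and its parameter "never_true" are hypothetical, and the
2'ND string literal is an assumption made for this illustration.  In the
benchmark itself the branch condition involves a function call, which is
harder for a compiler to see through than the plain parameter used here, so
this sketch is weaker than the real construction.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical sketch, not the Dhrystone source.  The inner branch is not
   taken when never_true is 0.  As long as the compiler cannot rule the
   branch out, buf may hold a different string at the end of an iteration,
   which discourages treating the first strcpy as loop-invariant code.
   Returns how often the branch was actually executed. */
int copy_in_loop(char *buf, size_t buf_size, int runs, int never_true)
{
    int run_index;
    int executed_branch = 0;

    assert(buf_size >= 31);  /* 30 characters plus the null byte */
    for (run_index = 1; run_index <= runs; ++run_index) {
        strcpy(buf, "DHRYSTONE PROGRAM, 2'ND STRING");
        if (never_true) {
            /* not executed in a measured run */
            strcpy(buf, "DHRYSTONE PROGRAM, 3'RD STRING");
            ++executed_branch;
        }
    }
    return executed_branch;
}
```

In a measured run the branch count stays at zero, so the added statement
contributes nothing to the distribution of executed statements.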
 
Because of the self-imposed limitation that the order and distribution of the
executed statements should not be changed, there are still cases where
optimizing compilers may not generate code for some statements.  To a certain
degree, this is unavoidable for small synthetic benchmarks.  Users of the
benchmark are advised to check code listings to see whether code is generated
for all statements of Dhrystone.
 
Contrary to the suggestion in the published paper and its realization in the
versions previously distributed, no attempt has been made to subtract the time
for the measurement loop overhead.  (This calculation has proven difficult to
implement in a correct way, and its omission makes the program simpler.)
However, since the loop check is now part of the benchmark, this does have an
impact - though a very minor one - on the distribution statistics which have
been updated for this version.
 
 
3.  Discussion of Individual Changes

In this section, all changes are described that affect the measurement loop
and that are not just renamings of variables.  All remarks refer to the C
version; the other language versions have been updated similarly.

In addition to adding the measurement loop and the printout statements,
changes have been made at the following places:
 
o In procedure "main", three statements have been added in the non-executed
  "then" part of the statement

        if (Enum_Loc == Func_1 (Ch_Index, 'C'))

  they are

        strcpy (Str_2_Loc, "DHRYSTONE PROGRAM, 3'RD STRING");
        Int_2_Loc = Run_Index;
        Int_Glob = Run_Index;

  The string assignment prevents movement of the preceding assignment to
  Str_2_Loc (5'th statement of "main") out of the measurement loop.  (This
  probably will not happen for the C version, but it did happen with another
  language and compiler.)  The assignment to Int_2_Loc prevents value
  propagation for Int_2_Loc, and the assignment to Int_Glob makes the value of
  Int_Glob possibly dependent on the value of Run_Index.
 
o In the three arithmetic computations at the end of the measurement loop in
  "main", the role of some variables has been exchanged, to prevent the
  division from just cancelling out the multiplication as it was in [1].  A
  very smart compiler might have recognized this and suppressed code
  generation for the division.

o For Proc_2, no code has been changed, but the values of the actual parameter
  have changed due to changes in "main".
 
o In Proc_4, the second assignment has been changed from

        Bool_Loc = Bool_Loc | Bool_Glob;

  to

        Bool_Glob = Bool_Loc | Bool_Glob;

  It now assigns a value to a global variable instead of a local variable
  (Bool_Loc); Bool_Loc would be a "dead variable" which is not used
  afterwards.
 
o In Func_1, the statement

        Ch_1_Glob = Ch_1_Loc;

  was added in the non-executed "else" part of the "if" statement, to prevent
  the suppression of code generation for the assignment to Ch_1_Loc.

o In Func_2, the second character comparison statement has been changed to

        if (Ch_Loc == 'R')

  ('R' instead of 'X') because a comparison with 'X' is implied in the
  preceding "if" statement.

  Also in Func_2, the statement

        Int_Glob = Int_Loc;

  has been added in the non-executed part of the last "if" statement, in order
  to prevent Int_Loc from becoming a dead variable.
 
o In Func_3, a non-executed "else" part has been added to the "if" statement.
  While the program would not be incorrect without this "else" part, it is
  considered bad programming practice if a function can be left without a
  return value.

  To compensate for this change, the (non-executed) "else" part in the "if"
  statement of Proc_3 was removed.
 
The distribution statistics have been changed only by the addition of the
measurement loop iteration (1 additional statement, 4 additional local integer
operands) and by the change in Proc_4 (one operand changed from local to
global).  The distribution statistics in the comment headers have been updated
accordingly.
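The Proc_4 change described in this section can be sketched as follows.  Only
the two assignments shown in the text are taken from the benchmark; the
function names "proc_4_old" and "proc_4_new", the first assignment to
Bool_Loc and the initial values of the globals are assumptions made to obtain
a complete example.

```c
typedef int Boolean;

Boolean Bool_Glob = 0;
char Ch_1_Glob = 'A';

/* Old form: the result is stored only in the local Bool_Loc, which is
   never needed again, so the whole computation is a candidate for
   removal as a "dead variable". */
void proc_4_old(void)
{
    Boolean Bool_Loc;

    Bool_Loc = (Ch_1_Glob == 'A');    /* assumed first assignment */
    Bool_Loc = Bool_Loc | Bool_Glob;  /* result is not used afterwards */
    (void) Bool_Loc;                  /* only silences unused warnings */
}

/* Version 2 form: the result is stored in a global variable that is
   visible outside the procedure, so the assignment has an observable
   effect and code must be generated for it. */
void proc_4_new(void)
{
    Boolean Bool_Loc;

    Bool_Loc = (Ch_1_Glob == 'A');
    Bool_Glob = Bool_Loc | Bool_Glob;
}
```

After a call of the old form the global is unchanged; after a call of the new
form it carries the result, which is also why one operand moves from "local"
to "global" in the distribution statistics.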
 
 
4.  String Operations

The string operations (string assignment and string comparison) have not been
changed, to keep the program consistent with the original version.

There has been some concern that the string operations are over-represented in
the program, and that execution time is dominated by these operations.  This
was true in particular when optimizing compilers removed too much code in the
main part of the program; this should have been mitigated in version 2.
 
It should be noted that this is a language-dependent issue:  Dhrystone was
first published in Ada, and with Ada or Pascal semantics, the time spent in
the string operations is, at least in all implementations known to me,
considerably smaller.  In Ada and Pascal, assignment and comparison of strings
are operators defined in the language, and the upper bounds of the strings
occurring in Dhrystone are part of the type information known at compilation
time.  The compilers can therefore generate efficient inline code.  In C,
string assignment and comparisons are not part of the language, so the string
operations must be expressed in terms of the C library functions "strcpy" and
"strcmp".  (ANSI C allows an implementation to use inline code for these
functions.)  In addition to the overhead caused by additional function calls,
these functions are defined for null-terminated strings where the length of
the strings is not known at compilation time; the function has to check every
byte for the termination condition (the null byte).
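The difference between the two styles of comparison can be sketched in C.
The constant STR_LEN, the type Str_30 and both function names are
hypothetical; the fixed length of 30 is an assumption for this illustration,
and "memcmp" is used here only to imitate a comparison whose length is known
at compilation time.

```c
#include <string.h>

#define STR_LEN 30  /* assumed fixed string length, illustration only */

typedef char Str_30[STR_LEN + 1];

/* C library style: the length is not known in advance, so the routine
   must test for the terminating null byte while it compares. */
int compare_c_style(const char *a, const char *b)
{
    return strcmp(a, b);
}

/* Ada/Pascal style: the bound is part of the type, so a comparison over
   a fixed count of characters is possible without a terminator test. */
int compare_fixed_length(const char *a, const char *b)
{
    return memcmp(a, b, STR_LEN);
}
```

Both functions report the same ordering for two strings of the full length;
they differ only in how much work has to be done per character.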
 
Obviously, a C library which includes efficiently coded "strcpy" and "strcmp"
functions helps to obtain good Dhrystone results.  However, I don't think that
this is unfair since string functions do occur quite frequently in real
programs (editors, command interpreters, etc.).  If the string functions are
implemented efficiently, this helps real programs as well as benchmark
programs.
 
I admit that the string comparison in Dhrystone terminates later (after
scanning 20 characters) than most string comparisons in real programs.  For
consistency with the original benchmark, I didn't change the program despite
this weakness.
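The figure of 20 characters can be checked with a small helper.  The function
"first_difference" is hypothetical, and the two string literals used in the
usage note are assumed to be the compared strings, by analogy with the 3'RD
string quoted in section 3.

```c
#include <stddef.h>

/* Returns the 1-based position of the first character at which the two
   null-terminated strings differ, or 0 if the strings are equal. */
size_t first_difference(const char *a, const char *b)
{
    size_t i = 0;

    /* advance while both strings agree and a has not ended */
    while (a[i] != '\0' && a[i] == b[i]) {
        ++i;
    }
    return (a[i] == b[i]) ? 0 : i + 1;
}
```

Called with "DHRYSTONE PROGRAM, 1'ST STRING" and "DHRYSTONE PROGRAM, 2'ND
STRING", the helper returns 20: the first 19 characters are identical and the
comparison is decided only at the 20th.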
 
 
5.  Intended Use of Dhrystone

When Dhrystone is used, the following "ground rules" apply:
 
o Separate compilation (Ada and C versions)

  As mentioned in [1], Dhrystone was written to reflect actual programming
  practice in systems programming.  The division into several compilation
  units (5 in the Ada version, 2 in the C version) is intended, as is the
  distribution of inter-module and intra-module subprogram calls.  Although on
  many systems there will be no difference in execution time to a Dhrystone
  version where all compilation units are merged into one file, the rule is
  that separate compilation should be used.  The intention is that real
  programming practice, where programs consist of several independently
  compiled units, should be reflected.  This also implies that the
  compiler, while compiling one unit, has no information about the use of
  variables, register allocation etc. occurring in other compilation units.
  Although in real life compilation units will probably be larger, the
  intention is that these effects of separate compilation are modeled in
  Dhrystone.

  A few language systems have post-linkage optimization available (e.g., final
  register allocation is performed after linkage).  This is a borderline case:
  Post-linkage optimization involves additional program preparation time
  (although not as much as compilation in one unit) which may prevent its
  general use in practical programming.  I think that since it defeats the
  intentions given above, it should not be used for Dhrystone.

  Unfortunately, ISO/ANSI Pascal does not contain language features for
  separate compilation.  Although most commercial Pascal compilers provide
  separate compilation in some way, we cannot use it for Dhrystone since such
  a version would not be portable.  Therefore, no attempt has been made to
  provide a Pascal version with several compilation units.
 
o No procedure merging

  Although Dhrystone contains some very short procedures where execution would
  benefit from procedure merging (inlining, macro expansion of procedures),
  procedure merging is not to be used.  The reason is that the percentage of
  procedure and function calls is part of the "Dhrystone distribution" of
  statements contained in [1].  This restriction does not hold for the string
  functions of the C version since ANSI C allows an implementation to use
  inline code for these functions.
 
o Other optimizations are allowed, but they should be indicated

  It is often hard to draw an exact line between "normal code generation" and
  "optimization" in compilers:  Some compilers perform operations by default
  that are invoked in other compilers only when optimization is explicitly
  requested.  Also, we cannot avoid that in benchmarking people try to achieve
  results that look as good as possible.  Therefore, optimizations performed
  by compilers - other than those listed above - are not forbidden when
  Dhrystone execution times are measured.  Dhrystone is not intended to be
  non-optimizable but is intended to be similarly optimizable as normal
  programs.  For example, there are several places in Dhrystone where
  performance benefits from optimizations like common subexpression
  elimination, value propagation etc., but normal programs usually also
  benefit from these optimizations.  Therefore, no effort was made to
  artificially prevent such optimizations.  However, measurement reports
  should indicate which compiler optimization levels have been used, and
  reporting results with different levels of compiler optimization for the
  same hardware is encouraged.
 
o Default results are those without "register" declarations (C version)

  When Dhrystone results are quoted without additional qualification, they
  should be understood as results obtained without use of the "register"
  attribute.  Good compilers should be able to make good use of registers even
  without explicit register declarations ([3], p. 193).
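One way to keep the "register" attribute optional, so that the default build
follows the last ground rule, is a macro that expands to nothing unless it is
requested explicitly.  This is a sketch under assumptions: the macro names
USE_REGISTER and REG and the function "sum_to" are hypothetical and are not
taken from the benchmark text.

```c
#include <assert.h>

/* Hypothetical switch: REG expands to "register" only if the build
   defines USE_REGISTER (for example with a -DUSE_REGISTER compiler
   option); by default it expands to nothing. */
#ifdef USE_REGISTER
#define REG register
#else
#define REG
#endif

/* Small example routine whose locals carry the optional attribute.
   Returns the sum 1 + 2 + ... + n for n >= 0. */
int sum_to(int n)
{
    REG int i;
    REG int total = 0;

    assert(n >= 0);
    for (i = 1; i <= n; ++i) {
        total += i;
    }
    return total;
}
```

With such a switch, results from the default build can be quoted without
qualification, while results from a build with the attribute enabled would
have to be marked as such.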
 
Of course, for experimental purposes, post-linkage optimization, procedure
merging and/or compilation in one unit can be done to determine their effects.
However, Dhrystone numbers obtained under these conditions should be
explicitly marked as such; "normal" Dhrystone results should be understood as
results obtained following the ground rules listed above.
 
In any case, for serious performance evaluation, users are advised to ask for
code listings and to check them carefully.  In this way, when results for
different systems are compared, the reader can get a feeling for how much
performance difference is due to compiler optimization and how much is due to
hardware speed.
 
 
6.  Acknowledgements

The C version 2.1 of Dhrystone has been developed in cooperation with Rick
Richardson (Tinton Falls, NJ); it incorporates many ideas from the "Version
1.1" distributed previously by him over the UNIX network Usenet.  Through his
activity with Usenet, Rick Richardson has made a very valuable contribution to
the dissemination of the benchmark.  I also thank Chaim Benedelac (National
Semiconductor), David Ditzel (SUN), Earl Killian and John Mashey (MIPS), Alan
Smith and Rafael Saavedra-Barrera (UC at Berkeley) for their help with
comments on earlier versions of the benchmark.
 
 
7.  Bibliography

[1]
   Reinhold P. Weicker: Dhrystone: A Synthetic Systems Programming Benchmark.
   Communications of the ACM 27, 10 (Oct. 1984), 1013-1030

[2]
   Rick Richardson: Dhrystone 1.1 Benchmark Summary (and Program Text)
   Informal Distribution via "Usenet", Last Version Known to me: Sept. 21,
   1987

[3]
   Brian W. Kernighan and Dennis M. Ritchie: The C Programming Language.
   Prentice-Hall, Englewood Cliffs (NJ) 1978
 
