            Understanding Variations in Dhrystone Performance

          By Reinhold P. Weicker, Siemens AG, AUT E 51, Erlangen

                                April 1989

                      This article has appeared in:

        Microprocessor Report, May 1989 (Editor: M. Slater), pp. 16-17
 
Microprocessor manufacturers tend to credit all the performance measured by
benchmarks to the speed of their processors; they often don't even mention the
programming language and compiler used. Their detailed documents, usually
called "performance briefs" or "performance reports," do give more details.
However, these details are often lost in the press releases and other
marketing statements. For serious performance evaluation, it is necessary to
study the code generated by the various compilers.
 
Dhrystone was originally published in Ada (Communications of the ACM, Oct.
1984). However, since good Ada compilers were rare at that time and C,
together with UNIX, became more and more popular, the C version of Dhrystone
is the one now mainly used in industry. There are "official" versions 2.1 for
Ada, Pascal, and C, which are as close together as the languages' semantic
differences permit.
 
Dhrystone contains two statements where the programming language and its
translation play a major part in the execution time measured by the benchmark:

  o   String assignment (in procedure Proc_0 / main)
  o   String comparison (in function Func_2)
 
In Ada and Pascal, strings are arrays of characters where the length of the
string is part of the type information known at compile time. In C, strings
are also arrays of characters, but there are no operators defined in the
language for assignment and comparison of strings. Instead, the functions
"strcpy" and "strcmp" are used. These functions are defined for strings of
arbitrary length, and make use of the fact that strings in C have to end with
a terminating null byte. For general-purpose calls to these functions, the
implementor can assume nothing about the length and the alignment of the
strings involved.
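
As a minimal sketch of what "general-purpose" means here, the two functions
below copy and compare one byte at a time and rely only on the terminating
null byte. The names gp_strcpy and gp_strcmp are hypothetical, chosen so as
not to collide with the library; real library versions are usually hand-tuned
and will look different.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names (gp_ prefix); not the library's own implementation. */

/* Copy src, including its terminating null byte, to dst one byte at a time. */
char *gp_strcpy(char *dst, const char *src)
{
    char *d = dst;

    assert(dst != NULL && src != NULL);
    while ((*d++ = *src++) != '\0') {
        /* empty body: the copy happens in the loop condition */
    }
    return dst;
}

/* Compare byte by byte; the result is <0, 0, or >0, as for strcmp. */
int gp_strcmp(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    while (*a != '\0' && *a == *b) {
        a++;
        b++;
    }
    /* The differing bytes are compared as unsigned char. */
    return (int)(unsigned char)*a - (int)(unsigned char)*b;
}
```

Because nothing is assumed about length or alignment, a call such as
gp_strcmp(buf + 1, other) with an odd address is as valid as one with
word-aligned arguments.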
 
The C version of Dhrystone spends a relatively large amount of time in these
two functions. Some time ago, I made measurements on a VAX 11/785 with the
Berkeley UNIX (4.2) compilers (often-used compilers, but certainly not the
most advanced). In the C version, 23% of the time was spent in the string
functions; in the Pascal version, only 10%. On good RISC machines (where less
time is spent in the procedure calling sequence than on a VAX) and with better
optimizing compilers, the percentage is higher; MIPS has reported 34% for an
R3000. Because of this effect, Pascal and Ada Dhrystone results are usually
better than C results (except when the optimization quality of the C compiler
is considerably better than that of the other compilers).
 
Several people have noted that the string operations are over-represented in
Dhrystone, mainly because the strings occurring in Dhrystone are longer than
average strings. I admit that this is true, and have said so in my SIGPLAN
Notices paper (Aug. 1988); however, I didn't want to generate confusion by
changing the string lengths from version 1 to version 2.
 
Even if they are somewhat over-represented in Dhrystone, string operations are
frequent enough that it makes sense to implement them in the most efficient
way possible, not only for benchmarking purposes. This means that they can
and should be written in assembly language. ANSI C also explicitly allows
the string functions to be implemented as macros, i.e. by inline code.
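
The following sketch illustrates what this permission means for a C program;
the two wrapper names are hypothetical. A call written as strcpy(...) may be
expanded inline if <string.h> defines a function-like macro of that name,
while the parenthesized form (strcpy)(...) suppresses any such macro and
names the library function itself. Whether a given compiler still expands
either call inline by other means is implementation-specific.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical wrapper: uses whatever form <string.h> provides, which may
 * be a function-like macro (inline code) or an ordinary library call. */
char *copy_possibly_inline(char *dst, const char *src)
{
    assert(dst != NULL && src != NULL);
    return strcpy(dst, src);
}

/* Hypothetical wrapper: the parentheses around the name prevent expansion
 * of a function-like macro, so this refers to the function declaration. */
char *copy_via_function(char *dst, const char *src)
{
    assert(dst != NULL && src != NULL);
    return (strcpy)(dst, src);
}
```

Both wrappers must produce the same result; only the generated code, and
therefore the measured time, may differ.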
 
There is also a third way to speed up the "strcpy" statement in Dhrystone: For
this particular "strcpy" statement, the source of the assignment is a string
constant. Therefore, in contrast to calls to "strcpy" in the general case, the
compiler knows the length and alignment of the strings involved at compile
time and can generate code in the same efficient way as a Pascal compiler
(word instructions instead of byte instructions).
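
A rough C-level picture of this optimization, under stated assumptions: the
30-character constant, the length STR_LEN, and both function names below are
illustrative and not taken from the benchmark. When the source is a literal,
its size is a compile-time constant, so the copy can be expressed as a
fixed-size block move that a compiler is free to turn into word instructions.

```c
#include <assert.h>
#include <string.h>

#define STR_LEN 30  /* hypothetical fixed string length for this sketch */

/* Illustrative 30-character constant; not a string from the article. */
#define SRC_TEXT "ILLUSTRATIVE 30-CHAR CONSTANT!"

/* General-purpose form: strcpy has to look for the null byte at run time. */
void assign_general(char dst[STR_LEN + 1])
{
    strcpy(dst, SRC_TEXT);
}

/* Fixed-length form: the size of the source (30 characters plus the null
 * byte) is known at compile time, so no scan for the terminator is needed. */
void assign_fixed(char dst[STR_LEN + 1])
{
    static const char src[] = SRC_TEXT;

    assert(sizeof src == (size_t)(STR_LEN + 1));
    memcpy(dst, src, sizeof src);
}
```

Both functions leave the same 31 bytes in the destination; the second merely
makes explicit the information the compiler already has in the first.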
 
This is not allowed in the case of the "strcmp" call: Here, the addresses are
formal procedure parameters, and no assumptions can be made about the length
or alignment of the strings. Any such assumptions would indicate an incorrect
implementation. They might work for Dhrystone, where the strings are in fact
word-aligned with typical compilers, but other programs would deliver
incorrect results.
 
So, for an apples-to-apples comparison between processors, and not between
several possible (legal or illegal) degrees of compiler optimization, one
should check that the systems are comparable with respect to the following
three points:
 
  (1) String functions in assembly language vs. in C

      Frequently used functions such as the string functions can and should be
      written in assembly language, and all serious C language systems known
      to me do this. (I list this point for completeness only.) Note that
      processors with an instruction that checks a word for a null byte (such
      as AMD's 29000 and Intel's 80960) have an advantage here. (The relative
      advantage decreases if optimization (3) is applied.) Due to the length
      of the strings involved in Dhrystone, this advantage may be considered
      disproportionately high, but it is certainly legal to use such
      instructions - after all, these situations are what they were invented
      for.
 
  (2) String function code inline vs. as library functions

      ANSI C has created a new situation, compared with the older
      Kernighan/Ritchie C. In the original C, the definition of the string
      functions was not part of the language. Now it is, and inlining is
      explicitly allowed. I probably should have stated more clearly in my
      SIGPLAN Notices paper that the rule "No procedure inlining for
      Dhrystone" referred to the user level procedures only and not to the
      library routines.
 
  (3) Fixed-length and alignment assumptions for the strings

      Compilers should be allowed to optimize in these cases if (and only if)
      it is safe to do so. For Dhrystone, this is the "strcpy" statement, but
      not the "strcmp" statement (unless, of course, the "strcmp" code
      explicitly checks the alignment at execution time and branches
      accordingly). A "Dhrystone switch" for the compiler that causes the
      generation of code that may not work under certain circumstances is
      certainly inappropriate for comparisons. It has been reported on Usenet
      that some C compilers provide such a compiler option; since I don't have
      access to all C compilers involved, I cannot verify this.

      If the fixed-length and word-alignment assumption can be used, a wide
      bus that permits fast multi-word load instructions certainly does help;
      however, this fact by itself should not make a really big difference.
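
Points (1) and (3) can be pictured together in a short sketch, assuming
32-bit words; the names has_zero_byte and checked_strcmp are hypothetical.
The comparison checks the alignment of both arguments at execution time and
takes a word-at-a-time path only when both are word-aligned; otherwise it
falls back to bytes. The helper is a software stand-in for a "word contains a
null byte" instruction. Note that the word path may read a few bytes past the
terminator within the last aligned word, which is something a library
implementor can arrange to be safe on a given machine but portable C code
cannot assume.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Nonzero if any of the four bytes of w is zero. */
static int has_zero_byte(uint32_t w)
{
    return ((w - 0x01010101u) & ~w & 0x80808080u) != 0;
}

/* Hypothetical strcmp variant with a run-time alignment check. */
int checked_strcmp(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);

    /* Word path only if both addresses are multiples of the word size. */
    if ((((uintptr_t)a | (uintptr_t)b) & (sizeof(uint32_t) - 1)) == 0) {
        for (;;) {
            uint32_t wa, wb;

            memcpy(&wa, a, sizeof wa);  /* word load without aliasing issues */
            memcpy(&wb, b, sizeof wb);
            if (wa != wb || has_zero_byte(wa))
                break;                  /* let the byte loop finish the job */
            a += sizeof wa;
            b += sizeof wb;
        }
    }

    /* Byte path: always correct, whatever the alignment. */
    while (*a != '\0' && *a == *b) {
        a++;
        b++;
    }
    return (int)(unsigned char)*a - (int)(unsigned char)*b;
}
```

A version that skipped the alignment check and always used the word path
would correspond to the unsafe "Dhrystone switch" described in point (3).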
 
A check of these points - something that is necessary for a thorough
evaluation and comparison of the Dhrystone performance claims - requires
object code listings as well as listings for the string functions (strcpy,
strcmp) that are possibly called by the program.
 
I don't pretend that Dhrystone is a perfect tool to measure the integer
performance of microprocessors. The more it is used and discussed, the more I
myself learn about aspects that I hadn't noticed yet when I wrote the program.
And of course, the very success of a benchmark program is a danger in that
people may tune their compilers and/or hardware to it, and thereby make it
less useful.
 
Whetstone and Linpack have their critical points also: The Whetstone rating
depends heavily on the speed of the mathematical functions (sine, sqrt, ...),
and Linpack is sensitive to data alignment for some cache configurations.
 
Introduction of a standard set of public domain benchmark software (something
the SPEC effort attempts) is certainly a worthwhile thing. In the meantime,
people will continue to use whatever is available and widely distributed, and
Dhrystone ratings are probably still better than MIPS ratings if these are -
as often in industry - based on no reproducible derivation. However, any
serious performance evaluation requires more than just a comparison of raw
numbers; one has to make sure that the numbers have been obtained in a
comparable way.