<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /><title>Design</title><meta name="generator" content="DocBook XSL Stylesheets V1.75.2" /><meta name="keywords" content=" C++ , library , profile " /><meta name="keywords" content=" ISO C++ , library " /><link rel="home" href="../spine.html" title="The GNU C++ Library Documentation" /><link rel="up" href="profile_mode.html" title="Chapter 32. Profile Mode" /><link rel="prev" href="profile_mode.html" title="Chapter 32. Profile Mode" /><link rel="next" href="bk01pt12ch32s03.html" title="Extensions for Custom Containers" /></head><body><div class="navheader"><table width="100%" summary="Navigation header"><tr><th colspan="3" align="center">Design</th></tr><tr><td width="20%" align="left"><a accesskey="p" href="profile_mode.html">Prev</a> </td><th width="60%" align="center">Chapter 32. Profile Mode</th><td width="20%" align="right"> <a accesskey="n" href="bk01pt12ch32s03.html">Next</a></td></tr></table><hr /></div><div class="sect1" title="Design"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a id="manual.ext.profile_mode.design"></a>Design</h2></div></div></div><p>
</p><div class="table"><a id="id594983"></a><p class="title"><b>Table 32.1. Code Location</b></p><div class="table-contents"><table summary="Code Location" border="1"><colgroup><col align="left" /><col align="left" /></colgroup><thead><tr><th align="left">Code Location</th><th align="left">Use</th></tr></thead><tbody><tr><td align="left"><code class="code">libstdc++-v3/include/std/*</code></td><td align="left">Preprocessor code to redirect to profile extension headers.</td></tr><tr><td align="left"><code class="code">libstdc++-v3/include/profile/*</code></td><td align="left">Profile extension public headers (map, vector, ...).</td></tr><tr><td align="left"><code class="code">libstdc++-v3/include/profile/impl/*</code></td><td align="left">Profile extension internals. Implementation files are
only included from <code class="code">impl/profiler.h</code>, which is the only
file included from the public headers.</td></tr></tbody></table></div></div><br class="table-break" /><p>
</p><div class="sect2" title="Wrapper Model"><div class="titlepage"><div><div><h3 class="title"><a id="manual.ext.profile_mode.design.wrapper"></a>Wrapper Model</h3></div></div></div><p>
    To have our instrumented library version included instead of the
    release one, we use the same wrapper model as the debug mode:
    we subclass entities from the release version. Wherever
    <code class="code">_GLIBCXX_PROFILE</code> is defined, the release namespace is
    <code class="code">std::__norm</code>, whereas the profile namespace is
    <code class="code">std::__profile</code>. Using plain <code class="code">std</code> then
    resolves to <code class="code">std::__profile</code>.
</p><p>
    Whenever possible, we try to wrap at the public interface level, e.g.,
    in <code class="code">unordered_set</code> rather than in <code class="code">hashtable</code>,
    so as not to depend on implementation details.
</p><p>
    Mixing object files built with and without the profile mode must
    not affect program execution. However, there are no guarantees about
    the accuracy of diagnostics when even a single object is not built with
    <code class="code">-D_GLIBCXX_PROFILE</code>.
    Currently, mixing the profile mode with the debug and parallel extensions is
    not allowed. Mixing them at compile time results in preprocessor errors.
    Mixing them at link time is undefined.
</p></div><div class="sect2" title="Instrumentation"><div class="titlepage"><div><div><h3 class="title"><a id="manual.ext.profile_mode.design.instrumentation"></a>Instrumentation</h3></div></div></div><p>
    Instead of instrumenting every public entry and exit point,
    we chose to add instrumentation on demand, as needed
    by individual diagnostics.
    The main reason is that some diagnostics require us to extract bits of
    internal state that are specific to that diagnostic.
    We plan to formalize this later, after we learn more about the requirements
    of several diagnostics.
</p><p>
    All the instrumentation points can be switched on and off using
    <code class="code">-D[_NO]_GLIBCXX_PROFILE_&lt;diagnostic&gt;</code> options.
    With all the instrumentation calls off, there should be negligible
    overhead over the release version. This property is needed to support
    diagnostics based on timing of internal operations. For such diagnostics,
    we anticipate turning most of the instrumentation off in order to prevent
    profiling overhead from polluting time measurements, and thus diagnostics.
</p><p>
    All the compile-time switches that turn instrumentation on and off live in
    <code class="code">include/profile/profiler.h</code>.
</p></div><div class="sect2" title="Run Time Behavior"><div class="titlepage"><div><div><h3 class="title"><a id="manual.ext.profile_mode.design.rtlib"></a>Run Time Behavior</h3></div></div></div><p>
    For practical reasons, the instrumentation library processes the trace
    partially
    rather than dumping it to disk in raw form. Each event is processed when
    it occurs. It is usually assigned a cost and aggregated into
    the database of a specific diagnostic class. The cost model
    is based largely on the standard performance guarantees, but in some
    cases we use knowledge about GCC's standard library implementation.
</p><p>
    Information is indexed by (1) call stack and (2) instance id or address,
    so that we can understand and summarize precise creation-use-destruction
    dynamic chains. Although the analysis is sensitive to dynamic instances,
    the reports are only sensitive to call context. Whenever a dynamic instance
    is destroyed, we accumulate its effect into the corresponding entry for the
    call stack of its constructor location.
</p><p>
    For details, see the
    <a class="ulink" href="http://dx.doi.org/10.1109/CGO.2009.36" target="_top">paper presented at
    CGO 2009</a>.
</p></div><div class="sect2" title="Analysis and Diagnostics"><div class="titlepage"><div><div><h3 class="title"><a id="manual.ext.profile_mode.design.analysis"></a>Analysis and Diagnostics</h3></div></div></div><p>
    Final analysis takes place offline and is based entirely on the
    generated trace and the debugging info in the application binary.
    See the Diagnostics section for a list of analysis types that we plan to support.
</p><p>
    The input to the analysis is a table indexed by profile type and call stack.
    The data type for each entry depends on the profile type.
</p></div><div class="sect2" title="Cost Model"><div class="titlepage"><div><div><h3 class="title"><a id="manual.ext.profile_mode.design.cost-model"></a>Cost Model</h3></div></div></div><p>
    Although cost models are likely to become complex as we move to
    more sophisticated analyses, we will try to follow a simple set of rules
    at the beginning.
</p><div class="itemizedlist"><ul class="itemizedlist" type="disc"><li class="listitem"><p><span class="emphasis"><em>Relative benefit estimation:</em></span>
    The idea is to estimate or measure the cost of all operations
    in the original scenario versus the scenario we advise switching to.
    For instance, when advising a change from a vector to a list, an occurrence
    of the <code class="code">insert</code> method will generally count as a benefit.
    Its magnitude depends on (1) the number of elements that get shifted
    and (2) whether it triggers a reallocation.
</p></li><li class="listitem"><p><span class="emphasis"><em>Synthetic measurements:</em></span>
    We will measure the relative difference between similar operations on
    different containers. We plan to write a battery of small tests that
    compare the execution times of similar methods on different
    containers. The idea is to run these tests on the target machine.
    If this training phase is very quick, we may decide to perform it at
    library initialization time. The results can be cached on disk and reused
    across runs.
</p></li><li class="listitem"><p><span class="emphasis"><em>Timers:</em></span>
    We plan to use timers for operations of larger granularity, such as sort.
    For instance, we can switch between different sort methods on the fly
    and report the one that performs best for each call context.
</p></li><li class="listitem"><p><span class="emphasis"><em>Show stoppers:</em></span>
    We may decide that the presence of an operation nullifies the advice.
    For instance, when considering switching from <code class="code">set</code> to
    <code class="code">unordered_set</code>, if we detect use of operator <code class="code">++</code>,
    we will simply not issue the advice, since this could signal that the use
    case requires a sorted container.</p></li></ul></div></div><div class="sect2" title="Reports"><div class="titlepage"><div><div><h3 class="title"><a id="manual.ext.profile_mode.design.reports"></a>Reports</h3></div></div></div><p>
There are two types of reports. First, if we recognize a pattern for which
we have a substitute that is likely to give better performance, we print
the advice and estimated performance gain. The advice is usually associated
with a code position and possibly a call stack.
</p><p>
Second, we report performance characteristics for which we do not have
a clear solution for improvement. For instance, we can point the user to
the top 10 <code class="code">multimap</code> locations
which have the worst data locality in actual traversals.
Although this does not offer a solution,
it helps the user focus on the key problems and ignore the uninteresting ones.
</p></div><div class="sect2" title="Testing"><div class="titlepage"><div><div><h3 class="title"><a id="manual.ext.profile_mode.design.testing"></a>Testing</h3></div></div></div><p>
    First, we want to make sure we preserve the behavior of the release mode.
    Running <code class="code">make check-profile</code>
    builds and runs the whole test suite in profile mode.
</p><p>
    Second, we want to test the correctness of each diagnostic.
    We created a <code class="code">profile</code> directory in the test suite.
    Each diagnostic must come with at least two tests, one for false positives
    and one for false negatives.
</p></div></div><div class="navfooter"><hr /><table width="100%" summary="Navigation footer"><tr><td width="40%" align="left"><a accesskey="p" href="profile_mode.html">Prev</a> </td><td width="20%" align="center"><a accesskey="u" href="profile_mode.html">Up</a></td><td width="40%" align="right"> <a accesskey="n" href="bk01pt12ch32s03.html">Next</a></td></tr><tr><td width="40%" align="left" valign="top">Chapter 32. Profile Mode </td><td width="20%" align="center"><a accesskey="h" href="../spine.html">Home</a></td><td width="40%" align="right" valign="top"> Extensions for Custom Containers</td></tr></table></div></body></html>