Instruction analysis program

This application reads in a binary list of instructions and analyses it with a
set of functions that examine various parameters of each instruction.

Right now it is not very user friendly. Everything is hardcoded, and only the
OR1K instruction set is supported.

It has been written in a way that should allow other instruction sets to be
added easily. It remains to be seen how much would be reusable between the sets,
but for now, at least, it would be easy enough to take the OR1K instruction
analysis functions and drop in a different instruction set.

The types of information given for OR1K instruction analysis are instruction
frequency, immediate frequency for each instruction, branch distance value
frequency, and register usage frequency. For each instruction, the most common
n-tuples of instructions ending with that instruction are presented, for pairs,
triples and quadruples. The most common n-tuples overall are also output.
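
The statistics described above can be illustrated with a short Python sketch.
It is not the program's own code (the program itself is not shown here), and
every function and variable name in it is hypothetical. It assumes the input is
a flat sequence of 32-bit big-endian OR1K words, with the major opcode in the
top 6 bits and a signed 26-bit immediate on jump and branch instructions.

```python
import struct
from collections import Counter

OPCODE_SHIFT = 26  # OR1K keeps the major opcode in the top 6 bits of each word


def decode_words(raw):
    """Split a raw image into 32-bit big-endian words, ignoring a partial tail."""
    usable = len(raw) - (len(raw) % 4)
    return [word for (word,) in struct.iter_unpack(">I", raw[:usable])]


def branch_offset(word):
    """Sign-extend the 26-bit immediate used by l.j, l.jal, l.bnf and l.bf."""
    imm = word & 0x03FFFFFF
    return imm - (1 << 26) if imm & (1 << 25) else imm


def ntuple_counts(opcodes, n):
    """Count every run of n consecutive opcodes, keyed by the whole tuple."""
    return Counter(tuple(opcodes[i:i + n]) for i in range(len(opcodes) - n + 1))


# Hand-assembled sample: l.addi r1,r1,-4 ; l.nop ; l.addi r1,r1,-4 ; l.bf -2
raw = bytes.fromhex("9c21fffc" "15000000" "9c21fffc" "13fffffe")
words = decode_words(raw)
opcodes = [w >> OPCODE_SHIFT for w in words]

opcode_freq = Counter(opcodes)          # instruction frequency
pair_freq = ntuple_counts(opcodes, 2)   # most common pairs overall
# Pairs that finish with l.addi (opcode 0x27), as in the per-instruction report
pairs_ending_addi = {p: c for p, c in pair_freq.items() if p[-1] == 0x27}
```

Calling ntuple_counts with n set to 3 or 4 gives the triples and quadruples.
The branch offset is counted in instruction words, not bytes.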

Compile the program with:

$ make all

And run a test (it needs the or32-elf- toolchain) with:

$ make test

Static analysis:

To generate a raw binary representation of the instructions that end up in
something like the Linux kernel, take the ELF file that results from compilation
and pass it to or32-elf-objcopy as follows:

$ or32-elf-objcopy -O binary -j .text -S vmlinux vmlinux.text.bin

Use the -f flag to indicate the input file, and -o to indicate the output file.

$ ./insnanalysis -f vmlinux.text.bin -o vmlinux.insnanalysis

Dynamic analysis with binary execution log from or1ksim:

As of revision 202 of the OpenRISC repository, or1ksim is capable of generating
an execution trace log in binary format, logging each instruction executed.
This log file can be given to insnanalysis.

In the or1ksim config file ensure the line "exe_bin_insn_log = 1" is in the
sim section. This will enable the binary instruction logging. The resulting
output file is then given to insnanalysis in the same manner as above.

Output:

Currently there are only two output formats, human readable string and CSV.

The output can be switched between human readable strings and CSV format (ready
to be imported into a spreadsheet application) by uncommenting one of the
"#define DISPLAY_" defines in the instruction set header. The program must be
recompiled if this is changed.
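
To make the difference between the two formats concrete, here is a minimal
Python sketch that renders one frequency table either way. It is only an
illustration: the program selects its format at compile time, whereas this
sketch uses a run-time argument, and the render function, the column names and
the layout are assumptions rather than the program's actual output.

```python
import csv
import io


def render(freq, as_csv):
    """Render {instruction name: count} as CSV or as aligned text lines."""
    # Most frequent first; ties broken alphabetically for stable output
    rows = sorted(freq.items(), key=lambda kv: (-kv[1], kv[0]))
    if as_csv:
        buf = io.StringIO()
        writer = csv.writer(buf, lineterminator="\n")
        writer.writerow(["instruction", "count"])  # hypothetical header row
        writer.writerows(rows)
        return buf.getvalue()
    total = sum(freq.values())
    return "".join(
        f"{name:<8} {count:>6} ({100 * count / total:.1f}%)\n"
        for name, count in rows
    )


sample = {"l.addi": 3, "l.nop": 1}
csv_text = render(sample, as_csv=True)
human_text = render(sample, as_csv=False)
```

The CSV form carries only raw values and leaves percentages to the
spreadsheet, while the human readable form computes them up front.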

Individual instruction analysis:

Instead of only breaking each instruction up and recording statistics on an
opcode basis, the instructions can be tracked in their entirety and statistics
on the most frequently seen entire instructions presented. Use the -u flag when
running the program. Note that this will make execution time longer. For a
binary trace of Linux booting 1.7GB in size, a 2.5GHz Intel Core 2 Duo machine
took 30 minutes to parse with the -u option.
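
The idea of counting whole instructions can be sketched in Python as follows.
This is not how the -u option is implemented; it is a hypothetical
illustration that assumes the input is a plain stream of 32-bit big-endian
words, which may not match the layout of the or1ksim binary log. It reads the
input in chunks so that a multi-gigabyte trace never has to fit in memory.

```python
import io
import struct
from collections import Counter


def count_whole_instructions(stream, chunk_bytes=1 << 20):
    """Count occurrences of each complete 32-bit word read from stream."""
    counts = Counter()
    leftover = b""  # bytes of a word split across two chunks
    while True:
        chunk = stream.read(chunk_bytes)
        if not chunk:
            break
        data = leftover + chunk
        usable = len(data) - (len(data) % 4)
        counts.update(w for (w,) in struct.iter_unpack(">I", data[:usable]))
        leftover = data[usable:]
    return counts


# Three words: l.addi r1,r1,-4 ; l.nop ; l.addi r1,r1,-4
trace = io.BytesIO(bytes.fromhex("9c21fffc" "15000000" "9c21fffc"))
counts = count_whole_instructions(trace)
most_common_word, hits = counts.most_common(1)[0]
```

Keying on the whole word rather than the 6-bit opcode means there can be far
more distinct entries to track, which is one plausible reason for the longer
run time noted above.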

TODO:
 o Collect and display information about l.j and l.jal instruction immediates
 o Add an easy way to switch between human readable and CSV output
 o Figure out how to tack this thing onto a simulator (or1ksim for now) to give
   results of execution when that finishes executing, or just how to get the
   simulator to output a binary dump of executed instructions to be fed through
   this
 o Add support for a list of binary files to be specified at the command line
 o Allow statistics to be collated over different files - this would allow each
   function to be broken out of a library, or application, and in that regard
   the instruction sequence data would then be accurate for static analysis.

July 25, 2010 - Julius Baxter