Instruction analysis program

This application reads in a binary list of instructions, and analyses it with a
set of functions looking at various parameters in each instruction.

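As a rough illustration of this kind of analysis, the C sketch below counts
how often each major opcode occurs in a buffer of OR1K instructions. It assumes
32-bit big-endian instruction words with the major opcode in the top 6 bits.
The function names are made up for this example and are not taken from the
insnanalysis source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define N_OPCODES 64 /* 6-bit major opcode field */

/* Assemble one 32-bit big-endian instruction word from 4 bytes. */
static uint32_t insn_from_bytes(const unsigned char *b)
{
    return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
           ((uint32_t)b[2] << 8) | (uint32_t)b[3];
}

/* Extract the major opcode (bits 31..26) from an instruction word. */
static unsigned insn_opcode(uint32_t insn)
{
    return (insn >> 26) & 0x3f;
}

/* Count opcode occurrences in a raw buffer. Trailing bytes that do not
   form a whole word are ignored. Returns the number of instructions. */
static size_t count_opcodes(const unsigned char *buf, size_t len,
                            unsigned long counts[N_OPCODES])
{
    size_t n = len / 4;

    assert(buf != NULL && counts != NULL);
    for (size_t i = 0; i < n; i++)
        counts[insn_opcode(insn_from_bytes(buf + 4 * i))]++;
    return n;
}
```

In a real run the buffer would be filled from the binary file named on the
command line, for example with fread().
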
Right now it is not very user friendly. Everything is hardcoded, and only the
OR1K instruction set is supported.

It has been written in a way that should allow other instruction sets to be
added easily. It remains to be seen how much would be reusable between the
sets, but for now at least it would be easy enough to swap the OR1K instruction
analysis functions for those of a different instruction set.

The types of information given for OR1K instruction analysis are instruction
frequency, immediate frequency for each instruction, branch distance value
frequency, and register usage frequency. For each instruction, the most common
n-tuple sets of instructions finishing with that instruction are presented,
for pairs, triples and quadruples. The most common n-tuples overall are also
output.

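The pair case of the n-tuple statistics can be sketched as follows. This is
only an illustration, assuming the instruction stream has already been reduced
to a sequence of 6-bit major opcodes. The names count_pairs and
most_common_predecessor are hypothetical and do not come from the program
itself.

```c
#include <assert.h>
#include <stddef.h>

#define N_OPCODES 64 /* assumption: 6-bit major opcode */

/* pairs[prev][cur] counts how often opcode cur directly follows prev. */
static void count_pairs(const unsigned char *ops, size_t n,
                        unsigned long pairs[N_OPCODES][N_OPCODES])
{
    for (size_t i = 1; i < n; i++) {
        assert(ops[i - 1] < N_OPCODES && ops[i] < N_OPCODES);
        pairs[ops[i - 1]][ops[i]]++;
    }
}

/* Return the opcode that most often precedes cur, or -1 if cur was
   never seen as the second instruction of a pair. */
static int most_common_predecessor(unsigned long pairs[N_OPCODES][N_OPCODES],
                                   unsigned cur)
{
    int best = -1;
    unsigned long best_count = 0;

    assert(cur < N_OPCODES);
    for (int prev = 0; prev < N_OPCODES; prev++) {
        if (pairs[prev][cur] > best_count) {
            best_count = pairs[prev][cur];
            best = prev;
        }
    }
    return best;
}
```

Triples and quadruples follow the same idea with a longer history, although a
dense table grows quickly and a sparse structure may be a better fit there.
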
Compile the program with:

$ make all

And run a test (it needs the or32-elf- toolchain) with:

$ make test

To run the program itself, just give it a binary blob of instructions (usually
the output of objcopy -O binary).

Static analysis:

For instance the Linux kernel ELF for OR1K can be prepared with the following
command:

$ or32-elf-objcopy -O binary -j .text -S vmlinux vmlinux.text.bin

It is passed to the program like so, and the output is captured by redirecting
stdout.

$ ./insnanalysis vmlinux.text.bin > vmlinux.insnanalysis

Dynamic analysis with binary execution log from or1ksim:

As of revision 202 of the OpenRISC repository, or1ksim is capable of generating
an execution trace log in binary format, logging each instruction executed.
This log file can be given to insnanalysis.

In the or1ksim config file ensure the line "exe_bin_insn_log = 1" is in the
sim section. This will enable the binary instruction logging. The resulting
output file is then given to insnanalysis in the same manner as above.

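A minimal sketch of the relevant part of the config file, assuming the usual
or1ksim "section ... end" layout. Any other settings already present in the
sim section stay as they are:

```
section sim
  exe_bin_insn_log = 1
end
```
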
Output:

Currently there are only two output formats: human readable strings and CSV
(ready to be imported into a spreadsheet application).

The output format is selected by uncommenting one of the "#define DISPLAY_"
defines in the instruction set header. The program must be recompiled if this
is changed.

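A compile-time switch of this kind might look like the sketch below. The macro
names DISPLAY_STRING and DISPLAY_CSV, the function format_count and the exact
output text are assumptions for illustration only. The real names are in the
instruction set header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Enable exactly one of these (hypothetical names), then recompile. */
/* #define DISPLAY_STRING */
#define DISPLAY_CSV

/* Format one "instruction name, count" result into out.
   Returns the snprintf() result, i.e. the length the full line needs. */
static int format_count(char *out, size_t size, const char *name,
                        unsigned long count)
{
    assert(out != NULL && size > 0 && name != NULL);
#if defined(DISPLAY_CSV)
    /* One comma-separated record per line, ready for a spreadsheet. */
    return snprintf(out, size, "%s,%lu\n", name, count);
#else
    /* Human readable string. */
    return snprintf(out, size, "Instruction %s: %lu occurrences\n",
                    name, count);
#endif
}
```

Because the choice is made by the preprocessor, only one of the two formats is
compiled in, which is why a rebuild is needed after changing it.
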
TODO:
o Collect and display information about l.j and l.jal instruction immediates
o Add an easy way to switch between human readable and CSV output
o Figure out how to tack this thing onto a simulator (or1ksim maybe) to give
  results of execution when that finishes executing, or just how to get the
  simulator to output a binary dump of executed instructions to be fed through
  this
o Add support for a list of binary files to be specified at the command line
o Allow statistics to be collated over different files - this would allow each
  function to be broken out of a library, or application, and in that regard
  the instruction sequence data would then be accurate for static analysis.


July 24, 2010 - Julius Baxter