Instruction analysis program

This application reads in a binary list of instructions and analyses it with a
set of functions that look at various parameters in each instruction.

Right now it is not very user friendly. Everything is hardcoded, and only the
OR1K instruction set is supported.

It has been written in a way that should allow other instruction sets to be
added easily. It remains to be seen how much would be reusable between the
sets, but for now it would at least be easy enough to take the OR1K instruction
analysis functions and drop in a different instruction set.

The types of information given for OR1K instruction analysis are instruction
frequency, immediate frequency for each instruction, branch distance value
frequency, and register usage frequency. For each instruction, the most common
n-tuples of instructions finishing with that instruction are presented, for
pairs, triples and quadruples. The most common n-tuples overall are also
output.

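As a rough illustration of the statistics described above (not the program's
actual code), the sketch below counts how often each opcode occurs and how
often each pair, triple and quadruple occurs, keyed by the instruction that
finishes the tuple. It is written in Python for brevity. It assumes 32-bit
instruction words with the major opcode in the top six bits, as in the OR1K
encoding; the function names and the sample input are hypothetical.

```python
from collections import Counter, defaultdict


def opcode(word):
    """Return the 6-bit major opcode held in bits 31..26 of a 32-bit word."""
    return (word >> 26) & 0x3F


def analyse(words, max_n=4):
    """Count opcode frequency and n-tuples (n = 2..max_n) of opcodes.

    Returns (freq, tuples) where tuples[last_opcode][n] is a Counter of the
    n-long opcode sequences that finish with last_opcode.
    """
    ops = [opcode(w) for w in words]
    freq = Counter(ops)
    tuples = defaultdict(lambda: defaultdict(Counter))
    for i, last in enumerate(ops):
        for n in range(2, max_n + 1):
            if i + 1 >= n:  # enough preceding instructions for an n-tuple
                tuples[last][n][tuple(ops[i + 1 - n:i + 1])] += 1
    return freq, tuples


# Hypothetical input: l.addi, l.sw, l.addi, l.sw (only the opcode bits are set)
words = [0x27 << 26, 0x35 << 26, 0x27 << 26, 0x35 << 26]
freq, tuples = analyse(words)
print(freq.most_common())
print(tuples[0x35][2].most_common(1))
```

Keying the tuples by their final instruction mirrors the per-instruction
presentation described above; the overall ranking can be obtained by summing
the per-instruction counters.
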
Compile the program with:

$ make all

And run a test (it needs the or32-elf- toolchain) with:

$ make test

To run the program itself, just give it a binary blob of instructions (usually
the output of objcopy -O binary).

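To make the input format concrete, here is a minimal Python sketch of how such
a blob could be split into instruction words. It assumes 32-bit big-endian
words, which matches OR1K code produced by objcopy -O binary; the function name
read_words is hypothetical and this is not how the program itself is
implemented.

```python
import struct


def read_words(path):
    """Read a raw binary file and return its 32-bit big-endian words."""
    with open(path, "rb") as f:
        data = f.read()
    usable = len(data) - (len(data) % 4)  # ignore a trailing partial word
    return [w for (w,) in struct.iter_unpack(">I", data[:usable])]
```
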
Static analysis:

For instance, the Linux kernel ELF for OR1K can be prepared with the following
command:

$ or32-elf-objcopy -O binary -j .text -S vmlinux vmlinux.text.bin

It is passed to the program like so, and the output is captured by redirecting
stdout:

$ ./insnanalysis vmlinux.text.bin > vmlinux.insnanalysis

Dynamic analysis with binary execution log from or1ksim:

As of revision 202 of the OpenRISC repository, or1ksim is capable of generating
an execution trace log in binary format, logging each instruction executed.
This log file can be given to insnanalysis.

In the or1ksim config file ensure the line "exe_bin_insn_log = 1" is in the
sim section. This will enable the binary instruction logging. The resulting
output file is then given to insnanalysis in the same manner as above.

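A minimal sketch of the relevant part of the config file, assuming the usual
or1ksim "section ... end" layout. Any other settings already present in the sim
section stay as they are:

```
section sim
  exe_bin_insn_log = 1
end
```
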
Output:

Currently there are only two output formats: human-readable strings and CSV.

The output can be switched between human-readable strings and CSV format (ready
to be imported into a spreadsheet application) by uncommenting one of the
"#define DISPLAY_" defines in the instruction set header. The program must be
recompiled if this is changed.

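The Python sketch below only illustrates the two output shapes. It selects the
format with a function argument rather than a compile-time define, and the
function name, column names and figures are made up for the example; the
program's real output layout may differ.

```python
import csv
import io


def format_counts(counts, as_csv=False):
    """Render (name, count) pairs as aligned text or as CSV rows."""
    out = io.StringIO()
    if as_csv:
        writer = csv.writer(out, lineterminator="\n")
        writer.writerow(["instruction", "count"])
        writer.writerows(counts)
    else:
        for name, count in counts:
            out.write(f"{name:<10}{count:>8}\n")
    return out.getvalue()


counts = [("l.addi", 120), ("l.sw", 45)]  # made-up figures for illustration
print(format_counts(counts))
print(format_counts(counts, as_csv=True))
```
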
TODO:
o Collect and display information about l.j and l.jal instruction immediates
o Add an easy way to switch between human-readable and CSV output
o Figure out how to tack this thing onto a simulator (or1ksim maybe) to give
  results of execution when that finishes executing, or just how to get the
  simulator to output a binary dump of executed instructions to be fed through
  this
o Add support for a list of binary files to be specified at the command line
o Allow statistics to be collated over different files - this would allow each
  function to be broken out of a library, or application, and in that regard
  the instruction sequence data would then be accurate for static analysis.

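Two of the items above (l.j/l.jal immediates and collating statistics over
several files) could be approached roughly as in the Python sketch below. It
assumes the OR1K encoding, where l.j and l.jal have major opcodes 0x00 and 0x01
and carry a signed 26-bit immediate counted in instruction words. The function
names and sample words are hypothetical, and nothing here is taken from the
program's source.

```python
from collections import Counter

OPC_J, OPC_JAL = 0x00, 0x01  # major opcodes of l.j and l.jal


def jump_offset(word):
    """Return the signed 26-bit immediate of an l.j/l.jal word, else None.

    The immediate counts instruction words, not bytes.
    """
    if (word >> 26) & 0x3F not in (OPC_J, OPC_JAL):
        return None
    imm = word & 0x03FFFFFF
    return imm - (1 << 26) if imm & (1 << 25) else imm


def collate(word_lists):
    """Sum jump-offset counts over several instruction lists."""
    total = Counter()
    for words in word_lists:
        total.update(o for o in map(jump_offset, words) if o is not None)
    return total


# Two hypothetical inputs: l.j +2 and l.jal -1, then l.j +2 again and an l.nop
file_a = [0x00000002, 0x07FFFFFF]
file_b = [0x00000002, 0x15000000]
print(collate([file_a, file_b]))
```

Keeping one counter per input and summing them at the end means sequences are
never counted across a file boundary, which is the point of the last item.
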
July 24, 2010 - Julius Baxter