Arm / Thumb Interworking
========================

The Cygnus GNU Pro Toolkit for the ARM7T processor supports function
calls between code compiled for the ARM instruction set and code
compiled for the Thumb instruction set and vice versa. This document
describes how that interworking support operates and explains the
command line switches that should be used in order to produce working
programs.

Note: The Cygnus GNU Pro Toolkit does not support switching between
compiling for the ARM instruction set and the Thumb instruction set
on anything other than a per file basis. There are in fact two
completely separate compilers, one that produces ARM assembler
instructions and one that produces Thumb assembler instructions. The
two compilers share the same assembler, linker and so on.


1. Explicit interworking support for C and C++ files
====================================================

By default if a file is compiled without any special command line
switches then the code produced will not support interworking.
Provided that a program is made up entirely from object files and
libraries produced in this way and which contain either exclusively
ARM instructions or exclusively Thumb instructions then this will not
matter and a working executable will be created. If an attempt is
made to link together mixed ARM and Thumb object files and libraries,
then warning messages will be produced by the linker and a non-working
executable will be created.

In order to produce code which does support interworking it should be
compiled with the

        -mthumb-interwork

command line option. Provided that a program is made up entirely from
object files and libraries built with this command line switch a
working executable will be produced, even if both ARM and Thumb
instructions are used by the various components of the program. (No
warning messages will be produced by the linker either).

Note that specifying -mthumb-interwork does result in slightly larger,
slower code being produced. This is why interworking support must be
specifically enabled by a switch.
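
The two guarantees given in this section can be summarised as a small
decision rule. The C sketch below is purely illustrative: the
object_file structure and the check_link function are hypothetical
names, not part of the toolchain, and the sketch answers "not
guaranteed" for every combination that this section does not
explicitly promise will work.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of one object file; not a real linker type. */
enum isa { ISA_ARM, ISA_THUMB };

struct object_file {
    enum isa isa;    /* instruction set the file was compiled for */
    bool interwork;  /* true if built with -mthumb-interwork */
};

enum link_result {
    LINK_OK,             /* this section promises a working executable */
    LINK_NOT_GUARANTEED  /* expect linker warnings; result may not work */
};

enum link_result check_link(const struct object_file *objs, size_t n)
{
    bool all_interwork = true;   /* every file built with the switch */
    bool none_interwork = true;  /* no file built with the switch */
    bool single_isa = true;      /* only one instruction set in use */

    assert(objs != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        if (objs[i].interwork)
            none_interwork = false;
        else
            all_interwork = false;
        if (objs[i].isa != objs[0].isa)
            single_isa = false;
    }

    /* Everything built with -mthumb-interwork: ARM and Thumb may mix. */
    if (all_interwork)
        return LINK_OK;
    /* Nothing built with the switch, but only one instruction set. */
    if (none_interwork && single_isa)
        return LINK_OK;
    return LINK_NOT_GUARANTEED;
}
```

Mixtures of interworking and non-interworking files are deliberately
left as "not guaranteed" here; section 3 describes which of those
cases can be made to work.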


2. Explicit interworking support for assembler files
====================================================

If assembler files are to be included into an interworking program
then the following rules must be obeyed:

  * Any externally visible functions must return by using the BX
    instruction.

  * Normal function calls can just use the BL instruction. The
    linker will automatically insert code to switch between ARM
    and Thumb modes as necessary.

  * Calls via function pointers should use the BX instruction if
    the call is made in ARM mode:

                .code 32
                mov lr, pc
                bx rX

    This code sequence will not work in Thumb mode however, since
    the mov instruction will not set the bottom bit of the lr
    register. Instead a branch-and-link to the _call_via_rX
    functions should be used:

                .code 16
                bl _call_via_rX

    where rX is replaced by the name of the register containing
    the function address.

  * All externally visible functions which should be entered in
    Thumb mode must have the .thumb_func pseudo op specified just
    before their entry point, e.g.:

                .code 16
                .global function
                .thumb_func
        function:
                ... start of function ...

  * All assembler files must be assembled with the switch
    -mthumb-interwork specified on the command line. (If the file
    is assembled by calling gcc it will automatically pass on the
    -mthumb-interwork switch to the assembler, provided that it
    was specified on the gcc command line in the first place.)


3. Support for old, non-interworking aware code.
================================================

If it is necessary to link together code produced by an older,
non-interworking aware compiler, or code produced by the new compiler
but without the -mthumb-interwork command line switch specified, then
there are two command line switches that can be used to support this.

The switch

        -mcaller-super-interworking

will allow calls via function pointers in Thumb mode to work,
regardless of whether the function pointer points to old,
non-interworking aware code or not. Specifying this switch does
produce slightly slower code however.

Note: There is no switch to allow calls via function pointers in ARM
mode to be handled specially. Calls via function pointers from
interworking aware ARM code to non-interworking aware ARM code work
without any special considerations by the compiler. Calls via
function pointers from interworking aware ARM code to non-interworking
aware Thumb code however will not work. (Actually under some
circumstances they may work, but there are no guarantees). This is
because only the new compiler is able to produce Thumb code, and this
compiler already has a command line switch to produce interworking
aware code.


The switch

        -mcallee-super-interworking

will allow non-interworking aware ARM or Thumb code to call Thumb
functions, either directly or via function pointers. Specifying this
switch does produce slightly larger, slower code however.

Note: There is no switch to allow non-interworking aware ARM or Thumb
code to call ARM functions. There is no need for any special handling
of calls from non-interworking aware ARM code to interworking aware
ARM functions; they just work normally. Calls from non-interworking
aware Thumb functions to ARM code however, will not work. There is no
option to support this, since it is always possible to recompile the
Thumb code to be interworking aware.

As an alternative to the command line switch
-mcallee-super-interworking, which affects all externally visible
functions in a file, it is possible to specify an attribute or
declspec for individual functions, indicating that that particular
function should support being called by non-interworking aware code.
The function should be defined like this:

        int __attribute__((interfacearm)) function (void)
        {
                ... body of function ...
        }

or

        int __declspec(interfacearm) function (void)
        {
                ... body of function ...
        }



4. Interworking support in dlltool
==================================

It is possible to create DLLs containing mixed ARM and Thumb code. It
is also possible to call Thumb code in a DLL from an ARM program and
vice versa. It is even possible to call ARM DLLs that have been compiled
without interworking support (say by an older version of the compiler),
from Thumb programs and still have things work properly.

A version of the `dlltool' program which supports the `--interwork'
command line switch is needed, as well as the following special
considerations when building programs and DLLs:

*Use `-mthumb-interwork'*
     When compiling files for a DLL or a program the `-mthumb-interwork'
     command line switch should be specified if calling between ARM and
     Thumb code can happen. If a program is being compiled and the
     mode of the DLLs that it uses is not known, then it should be
     assumed that interworking might occur and the switch used.

*Use `-m thumb'*
     If the exported functions from a DLL are all Thumb encoded then the
     `-m thumb' command line switch should be given to dlltool when
     building the stubs. This will make dlltool create Thumb encoded
     stubs, rather than its default of ARM encoded stubs.

     If the DLL consists of both exported Thumb functions and exported
     ARM functions then the `-m thumb' switch should not be used.
     Instead the Thumb functions in the DLL should be compiled with the
     `-mcallee-super-interworking' switch, or with the `interfacearm'
     attribute specified on their prototypes. In this way they will be
     given ARM encoded prologues, which will work with the ARM encoded
     stubs produced by dlltool.

*Use `-mcaller-super-interworking'*
     If it is possible for Thumb functions in a DLL to call
     non-interworking aware code via a function pointer, then the Thumb
     code must be compiled with the `-mcaller-super-interworking'
     command line switch. This will force the function pointer calls
     to use the _interwork_call_via_rX stub functions which will
     correctly restore Thumb mode upon return from the called function.

*Link with `libgcc.a'*
     When the DLL is built it may have to be linked with the GCC
     library (`libgcc.a') in order to extract the _call_via_rX functions
     or the _interwork_call_via_rX functions. This represents a partial
     redundancy since the same functions *may* be present in the
     application itself, but since they only take up 372 bytes this
     should not be too much of a consideration.

*Use `--support-old-code'*
     When linking a program with an old DLL which does not support
     interworking, the `--support-old-code' command line switch to the
     linker should be used. This causes the linker to generate special
     interworking stubs which can cope with old, non-interworking aware
     ARM code, at the cost of generating bulkier code. The linker will
     still generate a warning message along the lines of:
       "Warning: input file XXX does not support interworking, whereas YYY does."
     but this can now be ignored because the --support-old-code switch
     has been used.



5. How interworking support works
=================================

Switching between the ARM and Thumb instruction sets is accomplished
via the BX instruction which takes as an argument a register name.
Control is transferred to the address held in this register (with the
bottom bit masked out), and if the bottom bit is set, then Thumb
instruction processing is enabled, otherwise ARM instruction
processing is enabled.
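
As an illustration only, the effect of BX on a register value can be
modelled in C. The names bx_target and bx are invented for this
sketch, and the alignment check on ARM targets is an assumption about
well-formed input rather than something stated above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct bx_target {
    uint32_t address;  /* where execution continues */
    bool thumb;        /* true: Thumb instructions, false: ARM */
};

/* Model of "bx rX", where reg is the value held in rX. */
struct bx_target bx(uint32_t reg)
{
    struct bx_target t;

    t.thumb = (reg & 1u) != 0;       /* bottom bit selects the mode */
    t.address = reg & ~UINT32_C(1);  /* bottom bit is masked out */

    /* Assumption: an ARM mode target is expected to be word aligned. */
    assert(t.thumb || (t.address & 3u) == 0);
    return t;
}
```

For example, a register holding 0x8001 branches to address 0x8000 in
Thumb mode, while a register holding 0x8000 branches to the same
address in ARM mode.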

When the -mthumb-interwork command line switch is specified, gcc
arranges for all functions to return to their caller by using the BX
instruction. Thus provided that the return address has the bottom bit
correctly initialized to indicate the instruction set of the caller,
correct operation will ensue.

When a function is called explicitly (rather than via a function
pointer), the compiler generates a BL instruction to do this. The
Thumb version of the BL instruction has the special property of
setting the bottom bit of the LR register after it has stored the
return address into it, so that a future BX instruction will correctly
return to the instruction after the BL instruction, in Thumb mode.
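
The link values involved can be written out as plain arithmetic. This
sketch assumes that BL occupies four bytes in both instruction sets
(the Thumb form being a pair of 16-bit halfwords); the function names
are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* LR after an ARM BL located at bl_addr: the next instruction. */
uint32_t lr_after_arm_bl(uint32_t bl_addr)
{
    assert((bl_addr & 3u) == 0);  /* ARM instructions are word aligned */
    return bl_addr + 4;
}

/* LR after a Thumb BL located at bl_addr: the next instruction,
   with the bottom bit set so that a later BX returns in Thumb mode. */
uint32_t lr_after_thumb_bl(uint32_t bl_addr)
{
    assert((bl_addr & 1u) == 0);  /* Thumb instructions are halfword aligned */
    return (bl_addr + 4) | 1u;
}
```

A Thumb BL at 0x1000 therefore leaves 0x1005 in LR, and a BX to that
value resumes at 0x1004 in Thumb mode.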

The BL instruction does not change modes itself however, so if an ARM
function is calling a Thumb function, or vice versa, it is necessary
to generate some extra instructions to handle this. This is done in
the linker when it is storing the address of the referenced function
into the BL instruction. If the BL instruction is an ARM style BL
instruction, but the referenced function is a Thumb function, then the
linker automatically generates a calling stub that converts from ARM
mode to Thumb mode, puts the address of this stub into the BL
instruction, and puts the address of the referenced function into the
stub. Similarly if the BL instruction is a Thumb BL instruction, and
the referenced function is an ARM function, the linker generates a
stub which converts from Thumb to ARM mode, puts the address of this
stub into the BL instruction, and the address of the referenced
function into the stub.

This is why it is necessary to mark Thumb functions with the
.thumb_func pseudo op when creating assembler files. This pseudo op
allows the assembler to distinguish between ARM functions and Thumb
functions. (The Thumb version of GCC automatically generates these
pseudo ops for any Thumb functions that it generates).

Calls via function pointers work differently. Whenever the address of
a function is taken, the linker examines the type of the function
being referenced. If the function is a Thumb function, then it sets
the bottom bit of the address. Technically this makes the address
incorrect, since it is now one byte into the start of the function,
but this is never a problem because:

   a. with interworking enabled all calls via function pointer
      are done using the BX instruction and this ignores the
      bottom bit when computing where to go to.

   b. the linker will always set the bottom bit when the address
      of the function is taken, so it is never possible to take
      the address of the function in two different places and
      then compare them and find that they are not equal.
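
Points (a) and (b) can be checked with a few lines of C. The helper
names are made up for this sketch, which models addresses as plain
32-bit integers and does not use any real linker interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Value stored by the linker when the address of a function is taken. */
uint32_t function_pointer_value(uint32_t entry, bool is_thumb)
{
    assert((entry & 1u) == 0);  /* entry points are at least halfword aligned */
    return is_thumb ? (entry | 1u) : entry;
}

/* (a) Address actually reached when BX is given this pointer. */
uint32_t bx_destination(uint32_t pointer)
{
    return pointer & ~UINT32_C(1);
}

/* (b) Two pointers to the same function always compare equal,
   because the bottom bit is set every time the address is taken. */
bool same_function(uint32_t entry, bool is_thumb)
{
    uint32_t first = function_pointer_value(entry, is_thumb);
    uint32_t second = function_pointer_value(entry, is_thumb);
    return first == second;
}
```

For a Thumb function at 0x2000 the pointer value is 0x2001 wherever
it is taken, and BX still lands on 0x2000.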

As already mentioned any call via a function pointer will use the BX
instruction (provided that interworking is enabled). The only problem
with this is computing the return address for the return from the
called function. For ARM code this can easily be done by the code
sequence:

        mov lr, pc
        bx rX

(where rX is the name of the register containing the function
pointer). This code does not work for the Thumb instruction set,
since the MOV instruction will not set the bottom bit of the LR
register, so that when the called function returns, it will return in
ARM mode not Thumb mode. Instead the compiler generates this
sequence:

        bl _call_via_rX

(again where rX is the name of the register containing the function
pointer). The special _call_via_rX functions look like this:

                .thumb_func
        _call_via_r0:
                bx r0
                nop

The BL instruction ensures that the correct return address is stored
in the LR register and then the BX instruction jumps to the address
stored in the function pointer, switching modes if necessary.


6. How caller-super-interworking support works
==============================================

When the -mcaller-super-interworking command line switch is specified
it changes the code produced by the Thumb compiler so that all calls
via function pointers (including virtual function calls) now go via a
different stub function. The code to call via a function pointer now
looks like this:

        bl _interwork_call_via_r0

Note: The compiler does not insist that r0 be used to hold the
function address. Any register will do, and there is a suite of stub
functions, one for each possible register. The stub functions look
like this:

                .code 16
                .thumb_func
        _interwork_call_via_r0:
                bx pc
                nop

                .code 32
                tst r0, #1
                stmeqdb r13!, {lr}
                adreq lr, _arm_return
                bx r0

The stub first switches to ARM mode, since it is a lot easier to
perform the necessary operations using ARM instructions. It then
tests the bottom bit of the register containing the address of the
function to be called. If this bottom bit is set then the function
being called uses Thumb instructions and the BX instruction to come
will switch back into Thumb mode before calling this function. (Note
that it does not matter how this called function chooses to return to
its caller, since both the caller and callee are Thumb functions,
and no mode switching is necessary). If the function being called is an
ARM mode function however, the stub pushes the return address (with
its bottom bit set) onto the stack, replaces the return address with
the address of a piece of code called '_arm_return' and then
performs a BX instruction to call the function.

The '_arm_return' code looks like this:

                .code 32
        _arm_return:
                ldmia r13!, {r12}
                bx r12
                .code 16


It simply retrieves the return address from the stack, and then
performs a BX operation to return to the caller and switch back into
Thumb mode.
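
The control flow through the stub and through _arm_return may be
easier to follow as a small simulation. The C model below is a sketch
under simplifying assumptions: struct cpu, the fixed ARM_RETURN_ADDR
value and the function names are all hypothetical, and only LR, PC,
the current instruction set and a tiny stack are tracked.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define ARM_RETURN_ADDR UINT32_C(0x9000)  /* hypothetical address of _arm_return */

struct cpu {
    uint32_t lr;        /* link register */
    uint32_t pc;        /* next instruction to execute */
    bool thumb;         /* current instruction set */
    uint32_t stack[4];  /* tiny stack model */
    int sp;             /* number of words currently pushed */
};

/* Model of "bx reg". */
static void bx(struct cpu *c, uint32_t reg)
{
    c->pc = reg & ~UINT32_C(1);
    c->thumb = (reg & 1u) != 0;
}

/* Model of the ARM mode part of _interwork_call_via_rX. */
void interwork_call_via(struct cpu *c, uint32_t target)
{
    if ((target & 1u) == 0) {        /* tst rX, #1: bit clear, ARM callee */
        assert(c->sp < 4);
        c->stack[c->sp++] = c->lr;   /* stmeqdb r13!, {lr} */
        c->lr = ARM_RETURN_ADDR;     /* adreq lr, _arm_return */
    }
    bx(c, target);                   /* bx rX */
}

/* Model of _arm_return: pop the saved return address and BX to it. */
void arm_return(struct cpu *c)
{
    assert(c->sp > 0);
    uint32_t r12 = c->stack[--c->sp];  /* ldmia r13!, {r12} */
    bx(c, r12);                        /* bx r12 */
}

/* An old ARM callee returning with "mov pc, lr": no mode change. */
void old_arm_callee_return(struct cpu *c)
{
    c->pc = c->lr;
}
```

Starting from Thumb code with LR holding 0x1005, a call to an ARM
function at 0x4000 runs the callee in ARM mode, returns to
_arm_return still in ARM mode, and from there resumes at 0x1004 in
Thumb mode.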


7. How callee-super-interworking support works
==============================================

When -mcallee-super-interworking is specified on the command line the
Thumb compiler behaves as if every externally visible function that it
compiles has had the (interfacearm) attribute specified for it. What
this attribute does is to put a special, ARM mode header onto the
function which forces a switch into Thumb mode:

  without __attribute__((interfacearm)):

                .code 16
                .thumb_func
        function:
                ... start of function ...

  with __attribute__((interfacearm)):

                .code 32
        function:
                orr r12, pc, #1
                bx r12

                .code 16
                .thumb_func
        .real_start_of_function:

                ... start of function ...

Note that since the function now expects to be entered in ARM mode, it
no longer has the .thumb_func pseudo op specified for its name.
Instead the pseudo op is attached to a new label .real_start_of_<name>
(where <name> is the name of the function) which indicates the start
of the Thumb code. This does have the interesting side effect that
if this function is now called from a Thumb mode piece of code
outside of the current file, the linker will generate a calling stub
to switch from Thumb mode into ARM mode, and then this is immediately
overridden by the function's header which switches back into Thumb
mode.

In addition the (interfacearm) attribute also forces the function to
return by using the BX instruction, even if it has not been compiled with
the -mthumb-interwork command line flag, so that the correct mode will
be restored upon exit from the function.
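
The two-instruction ARM header relies on the fact that, in ARM mode,
reading pc yields the address of the current instruction plus 8. For
the ORR at the function's entry point that is exactly the address of
the first Thumb instruction after the header. A short C sketch of the
arithmetic, with hypothetical function names:

```c
#include <assert.h>
#include <stdint.h>

/* In ARM mode, pc reads as the current instruction's address + 8. */
uint32_t arm_pc_read(uint32_t insn_addr)
{
    assert((insn_addr & 3u) == 0);  /* ARM instructions are word aligned */
    return insn_addr + 8;
}

/* Value placed in r12 by "orr r12, pc, #1" at the ARM entry point. */
uint32_t header_branch_target(uint32_t function_addr)
{
    return arm_pc_read(function_addr) | 1u;
}

/* Address of the Thumb code: just past the two 4-byte ARM instructions. */
uint32_t real_start_of(uint32_t function_addr)
{
    return function_addr + 2 * 4;
}
```

For a function at 0x8000, r12 becomes 0x8009, so the following
"bx r12" continues at 0x8008 in Thumb mode.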


8. Some examples
================

Given these two test files:

        extern int thumb (void);
        int arm (void) { return 1 + thumb (); }

        extern int arm (void);
        int thumb (void) { return 2 + arm (); }

The following pieces of assembler are produced by the ARM and Thumb
versions of GCC depending upon the command line options used:

`-O2':

        .code 32                        .code 16
        .global _arm                    .global _thumb
                                        .thumb_func
        _arm:                           _thumb:
        mov ip, sp
        stmfd sp!, {fp, ip, lr, pc}     push {lr}
        sub fp, ip, #4
        bl _thumb                       bl _arm
        add r0, r0, #1                  add r0, r0, #2
        ldmea fp, {fp, sp, pc}          pop {pc}

Note how the functions return without using the BX instruction. If
these files were assembled and linked together they would fail to work
because they do not change mode when returning to their caller.

`-O2 -mthumb-interwork':

        .code 32                        .code 16
        .global _arm                    .global _thumb
                                        .thumb_func
        _arm:                           _thumb:
        mov ip, sp
        stmfd sp!, {fp, ip, lr, pc}     push {lr}
        sub fp, ip, #4
        bl _thumb                       bl _arm
        add r0, r0, #1                  add r0, r0, #2
        ldmea fp, {fp, sp, lr}          pop {r1}
        bx lr                           bx r1

Now the functions use BX to return to their caller. They have grown by
4 and 2 bytes respectively, but they can now successfully be linked
together and be expected to work. The linker will replace the
destinations of the two BL instructions with the addresses of calling
stubs which convert to the correct mode before jumping to the called
function.

`-O2 -mcallee-super-interworking':

        .code 32                        .code 32
        .global _arm                    .global _thumb
        _arm:                           _thumb:
                                        orr r12, pc, #1
                                        bx r12
        mov ip, sp                      .code 16
        stmfd sp!, {fp, ip, lr, pc}     push {lr}
        sub fp, ip, #4
        bl _thumb                       bl _arm
        add r0, r0, #1                  add r0, r0, #2
        ldmea fp, {fp, sp, lr}          pop {r1}
        bx lr                           bx r1

The Thumb function now has an ARM encoded prologue, and it no longer
has the `.thumb_func' pseudo op attached to it. The linker will not
generate a calling stub for the call from arm() to thumb(), but it will
still have to generate a stub for the call from thumb() to arm(). Also
note how specifying `-mcallee-super-interworking' automatically
implies `-mthumb-interwork'.


9. Some Function Pointer Examples
=================================

Given this test file:

        int func (void) { return 1; }

        int call (int (* ptr)(void)) { return ptr (); }

The following varying pieces of assembler are produced by the Thumb
version of GCC depending upon the command line options used:

`-O2':

                .code 16
                .globl _func
                .thumb_func
        _func:
                mov r0, #1
                bx lr

                .globl _call
                .thumb_func
        _call:
                push {lr}
                bl __call_via_r0
                pop {pc}

Note how the two functions have different exit sequences. In
particular call() uses pop {pc} to return, which would not work if the
caller was in ARM mode. func() however, uses the BX instruction, even
though `-mthumb-interwork' has not been specified, as this is the most
efficient way to exit a function when the return address is held in the
link register.

`-O2 -mthumb-interwork':

                .code 16
                .globl _func
                .thumb_func
        _func:
                mov r0, #1
                bx lr

                .globl _call
                .thumb_func
        _call:
                push {lr}
                bl __call_via_r0
                pop {r1}
                bx r1

This time both functions return by using the BX instruction. This
means that call() is now two bytes longer and several cycles slower
than the previous version.

`-O2 -mcaller-super-interworking':

                .code 16
                .globl _func
                .thumb_func
        _func:
                mov r0, #1
                bx lr

                .globl _call
                .thumb_func
        _call:
                push {lr}
                bl __interwork_call_via_r0
                pop {pc}

Very similar to the first (non-interworking) version, except that a
different stub is used to call via the function pointer. This new stub
will work even if the called function is not interworking aware, and
tries to return to call() in ARM mode. Note that the assembly code for
call() is still not interworking aware itself, and so should not be
called from ARM code.

`-O2 -mcallee-super-interworking':

                .code 32
                .globl _func
        _func:
                orr r12, pc, #1
                bx r12

                .code 16
                .globl .real_start_of_func
                .thumb_func
        .real_start_of_func:
                mov r0, #1
                bx lr

                .code 32
                .globl _call
        _call:
                orr r12, pc, #1
                bx r12

                .code 16
                .globl .real_start_of_call
                .thumb_func
        .real_start_of_call:
                push {lr}
                bl __call_via_r0
                pop {r1}
                bx r1

Now both functions have an ARM coded prologue, and both functions
return by using the BX instruction. These functions are interworking
aware therefore and can safely be called from ARM code. The code for
the call() function is now 10 bytes longer than the original,
non-interworking aware version (18 bytes instead of 8), so it has more
than doubled in size.
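
The sizes of call() in the listings above can be tallied by hand. The
C sketch below does the sum, assuming 2 bytes for each ordinary Thumb
instruction, 4 bytes for the Thumb BL pair and 4 bytes for each ARM
instruction; the array and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum { THUMB_INSN = 2, THUMB_BL = 4, ARM_INSN = 4 };  /* sizes in bytes */

/* -O2: push {lr}; bl __call_via_r0; pop {pc} */
const int call_plain[] = { THUMB_INSN, THUMB_BL, THUMB_INSN };

/* -O2 -mthumb-interwork: push {lr}; bl; pop {r1}; bx r1 */
const int call_interwork[] = { THUMB_INSN, THUMB_BL, THUMB_INSN, THUMB_INSN };

/* -O2 -mcallee-super-interworking: orr; bx (ARM header), then
   push {lr}; bl; pop {r1}; bx r1 */
const int call_callee_super[] = { ARM_INSN, ARM_INSN, THUMB_INSN,
                                  THUMB_BL, THUMB_INSN, THUMB_INSN };

/* Sum the encoded sizes of a sequence of instructions. */
size_t total_size(const int *sizes, size_t count)
{
    size_t total = 0;

    assert(sizes != NULL || count == 0);
    for (size_t i = 0; i < count; i++)
        total += (size_t)sizes[i];
    return total;
}
```

This gives 8 bytes for the plain version, 10 bytes with
-mthumb-interwork and 18 bytes with -mcallee-super-interworking,
which matches the growth of 2 and 10 bytes quoted in this section.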

If a prototype for call() is added to the source code, and this
prototype includes the `interfacearm' attribute:

        int __attribute__((interfacearm)) call (int (* ptr)(void));

then this code is produced (with only -O2 specified on the command
line):

                .code 16
                .globl _func
                .thumb_func
        _func:
                mov r0, #1
                bx lr

                .globl _call
                .code 32
        _call:
                orr r12, pc, #1
                bx r12

                .code 16
                .globl .real_start_of_call
                .thumb_func
        .real_start_of_call:
                push {lr}
                bl __call_via_r0
                pop {r1}
                bx r1

So now both call() and func() can be safely called via
non-interworking aware ARM code. If, when such a file is assembled,
the assembler detects the fact that call() is being called by another
function in the same file, it will automatically adjust the target of
the BL instruction to point to .real_start_of_call. In this way there
is no need for the linker to generate a Thumb-to-ARM calling stub so
that call can be entered in ARM mode.


10. How to use dlltool to build ARM/Thumb DLLs
==============================================

Given a program (`prog.c') like this:

        extern int func_in_dll (void);

        int main (void) { return func_in_dll(); }

And a DLL source file (`dll.c') like this:

        int func_in_dll (void) { return 1; }

Here is how to build the DLL and the program for a purely ARM based
environment:

*Step One*
     Build a `.def' file describing the DLL:

        ; example.def
        ; This file describes the contents of the DLL
        LIBRARY example
        HEAPSIZE 0x40000, 0x2000
        EXPORTS
                func_in_dll 1

*Step Two*
     Compile the DLL source code:

        arm-pe-gcc -O2 -c dll.c

*Step Three*
     Use `dlltool' to create an exports file and a library file:

        dlltool --def example.def --output-exp example.o --output-lib example.a

*Step Four*
     Link together the complete DLL:

        arm-pe-ld dll.o example.o -o example.dll

*Step Five*
     Compile the program's source code:

        arm-pe-gcc -O2 -c prog.c

*Step Six*
     Link together the program and the DLL's library file:

        arm-pe-gcc prog.o example.a -o prog

If instead this was a Thumb DLL being called from an ARM program, the
steps would look like this. (To save space only those steps that are
different from the previous version are shown):

*Step Two*
     Compile the DLL source code (using the Thumb compiler):

        thumb-pe-gcc -O2 -c dll.c -mthumb-interwork

*Step Three*
     Build the exports and library files (and support interworking):

        dlltool -d example.def -z example.o -l example.a --interwork -m thumb

*Step Five*
     Compile the program's source code (and support interworking):

        arm-pe-gcc -O2 -c prog.c -mthumb-interwork

If instead the DLL was an old ARM DLL which does not support
interworking, and which cannot be rebuilt, then these steps would be
used.

*Step One*
     Skip. If you do not have access to the sources of a DLL, there is
     no point in building a `.def' file for it.

*Step Two*
     Skip. With no DLL sources there is nothing to compile.

*Step Three*
     Skip. Without a `.def' file you cannot use dlltool to build an
     exports file or a library file.

*Step Four*
     Skip. Without a set of DLL object files you cannot build the DLL.
     Besides it has already been built for you by somebody else.

*Step Five*
     Compile the program's source code; this is the same as before:

        arm-pe-gcc -O2 -c prog.c

*Step Six*
     Link together the program and the DLL's library file, passing the
     `--support-old-code' option to the linker:

        arm-pe-gcc prog.o example.a -Wl,--support-old-code -o prog

     Ignore the warning message about the input file not supporting
     interworking as the --support-old-code switch has taken care of this.


Copyright (C) 1998, 2002, 2003, 2004 Free Software Foundation, Inc.

Copying and distribution of this file, with or without modification,
are permitted in any medium without royalty provided the copyright
notice and this notice are preserved.