Arm / Thumb Interworking
========================

The Cygnus GNU Pro Toolkit for the ARM7T processor supports function
calls between code compiled for the ARM instruction set and code
compiled for the Thumb instruction set and vice versa. This document
describes how that interworking support operates and explains the
command line switches that should be used in order to produce working
programs.

Note: The Cygnus GNU Pro Toolkit does not support switching between
compiling for the ARM instruction set and the Thumb instruction set
on anything other than a per file basis. There are in fact two
completely separate compilers, one that produces ARM assembler
instructions and one that produces Thumb assembler instructions. The
two compilers share the same assembler, linker and so on.


1. Explicit interworking support for C and C++ files
====================================================

By default if a file is compiled without any special command line
switches then the code produced will not support interworking.
Provided that a program is made up entirely from object files and
libraries produced in this way and which contain either exclusively
ARM instructions or exclusively Thumb instructions then this will not
matter and a working executable will be created. If an attempt is
made to link together mixed ARM and Thumb object files and libraries,
then warning messages will be produced by the linker and a non-working
executable will be created.

In order to produce code which does support interworking it should be
compiled with the

        -mthumb-interwork

command line option. Provided that a program is made up entirely from
object files and libraries built with this command line switch a
working executable will be produced, even if both ARM and Thumb
instructions are used by the various components of the program. (No
warning messages will be produced by the linker either).

Note that specifying -mthumb-interwork does result in slightly larger,
slower code being produced. This is why interworking support must be
specifically enabled by a switch.


2. Explicit interworking support for assembler files
====================================================

If assembler files are to be included in an interworking program
then the following rules must be obeyed:

  * Any externally visible functions must return by using the BX
    instruction.

  * Normal function calls can just use the BL instruction. The
    linker will automatically insert code to switch between ARM
    and Thumb modes as necessary.

  * Calls via function pointers should use the BX instruction if
    the call is made in ARM mode:

        .code 32
        mov lr, pc
        bx  rX

    This code sequence will not work in Thumb mode however, since
    the mov instruction will not set the bottom bit of the lr
    register. Instead a branch-and-link to the _call_via_rX
    functions should be used:

        .code 16
        bl  _call_via_rX

    where rX is replaced by the name of the register containing
    the function address.

  * All externally visible functions which should be entered in
    Thumb mode must have the .thumb_func pseudo op specified just
    before their entry point, e.g.:

        .code 16
        .global function
        .thumb_func
    function:
        ...start of function....

  * All assembler files must be assembled with the switch
    -mthumb-interwork specified on the command line. (If the file
    is assembled by calling gcc it will automatically pass on the
    -mthumb-interwork switch to the assembler, provided that it
    was specified on the gcc command line in the first place.)


3. Support for old, non-interworking aware code.
================================================

If it is necessary to link together code produced by an older,
non-interworking aware compiler, or code produced by the new compiler
but without the -mthumb-interwork command line switch specified, then
there are two command line switches that can be used to support this.

The switch

        -mcaller-super-interworking

will allow calls via function pointers in Thumb mode to work,
regardless of whether the function pointer points to old,
non-interworking aware code or not. Specifying this switch does
produce slightly slower code however.

Note: There is no switch to allow calls via function pointers in ARM
mode to be handled specially. Calls via function pointers from
interworking aware ARM code to non-interworking aware ARM code work
without any special considerations by the compiler. Calls via
function pointers from interworking aware ARM code to non-interworking
aware Thumb code however will not work. (Actually under some
circumstances they may work, but there are no guarantees). This is
because only the new compiler is able to produce Thumb code, and this
compiler already has a command line switch to produce interworking
aware code.


The switch

        -mcallee-super-interworking

will allow non-interworking aware ARM or Thumb code to call Thumb
functions, either directly or via function pointers. Specifying this
switch does produce slightly larger, slower code however.

Note: There is no switch to allow non-interworking aware ARM or Thumb
code to call ARM functions. There is no need for any special handling
of calls from non-interworking aware ARM code to interworking aware
ARM functions, they just work normally. Calls from non-interworking
aware Thumb functions to ARM code however, will not work. There is no
option to support this, since it is always possible to recompile the
Thumb code to be interworking aware.

As an alternative to the command line switch
-mcallee-super-interworking, which affects all externally visible
functions in a file, it is possible to specify an attribute or
declspec for individual functions, indicating that that particular
function should support being called by non-interworking aware code.
The function should be defined like this:

        int __attribute__((interfacearm)) function (void)
        {
            ... body of function ...
        }

or

        int __declspec(interfacearm) function (void)
        {
            ... body of function ...
        }

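As an illustration only, the attribute can be hidden behind a macro so
that the same source still builds with a compiler that does not know
it. The macro names INTERFACEARM and USE_INTERFACEARM and the function
callable_from_old_code are invented for this sketch; it is assumed
that USE_INTERFACEARM would be defined on the command line of the
Thumb compiler.

```c
/* INTERFACEARM is a hypothetical helper macro.  It expands to the
   attribute only when USE_INTERFACEARM is defined, which is assumed
   to be done when building with the Thumb compiler.  */
#ifdef USE_INTERFACEARM
#define INTERFACEARM __attribute__ ((interfacearm))
#else
#define INTERFACEARM
#endif

/* Both the prototype and the definition carry the macro.  */
int INTERFACEARM callable_from_old_code (int value);

int INTERFACEARM
callable_from_old_code (int value)
{
  return value + 1;
}
```

Only the functions marked this way are given the ARM mode header, so
the size cost is limited to the entry points that old code really
calls.

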
4. Interworking support in dlltool
==================================

It is possible to create DLLs containing mixed ARM and Thumb code. It
is also possible to call Thumb code in a DLL from an ARM program and
vice versa. It is even possible to call ARM DLLs that have been compiled
without interworking support (say by an older version of the compiler),
from Thumb programs and still have things work properly.

A version of the `dlltool' program which supports the `--interwork'
command line switch is needed, and the following special
considerations apply when building programs and DLLs:

*Use `-mthumb-interwork'*
    When compiling files for a DLL or a program the `-mthumb-interwork'
    command line switch should be specified if calling between ARM and
    Thumb code can happen. If a program is being compiled and the
    mode of the DLLs that it uses is not known, then it should be
    assumed that interworking might occur and the switch used.

*Use `-m thumb'*
    If the exported functions from a DLL are all Thumb encoded then the
    `-m thumb' command line switch should be given to dlltool when
    building the stubs. This will make dlltool create Thumb encoded
    stubs, rather than its default of ARM encoded stubs.

    If the DLL consists of both exported Thumb functions and exported
    ARM functions then the `-m thumb' switch should not be used.
    Instead the Thumb functions in the DLL should be compiled with the
    `-mcallee-super-interworking' switch, or with the `interfacearm'
    attribute specified on their prototypes. In this way they will be
    given ARM encoded prologues, which will work with the ARM encoded
    stubs produced by dlltool.

*Use `-mcaller-super-interworking'*
    If it is possible for Thumb functions in a DLL to call
    non-interworking aware code via a function pointer, then the Thumb
    code must be compiled with the `-mcaller-super-interworking'
    command line switch. This will force the function pointer calls
    to use the _interwork_call_via_rX stub functions which will
    correctly restore Thumb mode upon return from the called function.

*Link with `libgcc.a'*
    When the DLL is built it may have to be linked with the GCC
    library (`libgcc.a') in order to extract the _call_via_rX functions
    or the _interwork_call_via_rX functions. This represents a partial
    redundancy since the same functions *may* be present in the
    application itself, but since they only take up 372 bytes this
    should not be too much of a consideration.

*Use `--support-old-code'*
    When linking a program with an old DLL which does not support
    interworking, the `--support-old-code' command line switch to the
    linker should be used. This causes the linker to generate special
    interworking stubs which can cope with old, non-interworking aware
    ARM code, at the cost of generating bulkier code. The linker will
    still generate a warning message along the lines of:
      "Warning: input file XXX does not support interworking, whereas YYY does."
    but this can now be ignored because the --support-old-code switch
    has been used.

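The choice of stub encoding described under `-m thumb' above can be
summarised as a small decision function. This is only a sketch of the
rule as stated in this section; the enumeration and the function
advise_stubs are invented names and are not part of dlltool.

```c
#include <assert.h>

/* Hypothetical names for the three situations described above.  */
enum stub_advice
{
  DEFAULT_ARM_STUBS,          /* no Thumb exports: dlltool's default */
  PASS_M_THUMB,               /* all exports are Thumb: use -m thumb */
  ARM_STUBS_WITH_ARM_HEADERS  /* mixed: no -m thumb, Thumb exports need
                                 -mcallee-super-interworking or the
                                 interfacearm attribute */
};

/* Pick the advice from the number of exported functions of each kind.  */
enum stub_advice
advise_stubs (int arm_exports, int thumb_exports)
{
  assert (arm_exports >= 0 && thumb_exports >= 0);

  if (thumb_exports == 0)
    return DEFAULT_ARM_STUBS;
  if (arm_exports == 0)
    return PASS_M_THUMB;
  return ARM_STUBS_WITH_ARM_HEADERS;
}
```

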
5. How interworking support works
=================================

Switching between the ARM and Thumb instruction sets is accomplished
via the BX instruction which takes as an argument a register name.
Control is transferred to the address held in this register (with the
bottom bit masked out), and if the bottom bit is set, then Thumb
instruction processing is enabled, otherwise ARM instruction
processing is enabled.

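The effect of BX on a register value can be modelled in a few lines of
C. This is an illustrative sketch of the rule just described, not
toolkit code; the names bx_result and model_bx are invented for the
example.

```c
#include <stdint.h>

/* Hypothetical model of what BX does with a register value.  */
struct bx_result
{
  uint32_t target;   /* address that control is transferred to */
  int thumb;         /* 1 = Thumb processing enabled, 0 = ARM */
};

/* The bottom bit selects the instruction set and is masked out of
   the address that is actually branched to.  */
struct bx_result
model_bx (uint32_t reg)
{
  struct bx_result r;

  r.target = reg & ~(uint32_t) 1;
  r.thumb = (int) (reg & 1);
  return r;
}
```

For example a register holding 0x8001 leads to a branch to 0x8000 in
Thumb mode, while 0x8000 leads to the same address in ARM mode.
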
When the -mthumb-interwork command line switch is specified, gcc
arranges for all functions to return to their caller by using the BX
instruction. Thus provided that the return address has the bottom bit
correctly initialized to indicate the instruction set of the caller,
correct operation will ensue.

When a function is called explicitly (rather than via a function
pointer), the compiler generates a BL instruction to do this. The
Thumb version of the BL instruction has the special property of
setting the bottom bit of the LR register after it has stored the
return address into it, so that a future BX instruction will correctly
return to the instruction after the BL instruction, in Thumb mode.

The BL instruction does not change modes itself however, so if an ARM
function is calling a Thumb function, or vice versa, it is necessary
to generate some extra instructions to handle this. This is done in
the linker when it is storing the address of the referenced function
into the BL instruction. If the BL instruction is an ARM style BL
instruction, but the referenced function is a Thumb function, then the
linker automatically generates a calling stub that converts from ARM
mode to Thumb mode, puts the address of this stub into the BL
instruction, and puts the address of the referenced function into the
stub. Similarly if the BL instruction is a Thumb BL instruction, and
the referenced function is an ARM function, the linker generates a
stub which converts from Thumb to ARM mode, puts the address of this
stub into the BL instruction, and the address of the referenced
function into the stub.

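The linker's decision described above amounts to comparing the
encoding of the BL instruction with the encoding of the function it
refers to. A minimal sketch in C follows; the type and function names
are invented here and do not correspond to anything in the linker
sources.

```c
/* Hypothetical names, used only for this illustration.  */
enum insn_set { ARM_CODE, THUMB_CODE };
enum stub_kind { NO_STUB, ARM_TO_THUMB_STUB, THUMB_TO_ARM_STUB };

/* Decide which calling stub, if any, has to sit between a BL
   instruction and the function it refers to.  */
enum stub_kind
stub_for_call (enum insn_set bl_style, enum insn_set callee)
{
  if (bl_style == callee)
    return NO_STUB;               /* BL alone is enough */
  return bl_style == ARM_CODE ? ARM_TO_THUMB_STUB : THUMB_TO_ARM_STUB;
}
```
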
This is why it is necessary to mark Thumb functions with the
.thumb_func pseudo op when creating assembler files. This pseudo op
allows the assembler to distinguish between ARM functions and Thumb
functions. (The Thumb version of GCC automatically generates these
pseudo ops for any Thumb functions that it generates).

Calls via function pointers work differently. Whenever the address of
a function is taken, the linker examines the type of the function
being referenced. If the function is a Thumb function, then it sets
the bottom bit of the address. Technically this makes the address
incorrect, since it is now one byte into the start of the function,
but this is never a problem because:

   a. with interworking enabled all calls via function pointer
      are done using the BX instruction and this ignores the
      bottom bit when computing where to go to.

   b. the linker will always set the bottom bit when the address
      of the function is taken, so it is never possible to take
      the address of the function in two different places and
      then compare them and find that they are not equal.

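Points a. and b. can be checked with a small model. The sketch below
assumes 32 bit addresses and uses the invented names address_taken and
bx_destination; it illustrates the rule, it is not linker code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: the value stored when the address of a function
   is taken.  Thumb functions get the bottom bit set.  */
uint32_t
address_taken (uint32_t entry, int is_thumb)
{
  assert ((entry & 1) == 0);    /* entry points are halfword aligned */
  return is_thumb ? (entry | 1u) : entry;
}

/* The address BX actually branches to: the bottom bit is ignored.  */
uint32_t
bx_destination (uint32_t pointer)
{
  return pointer & ~(uint32_t) 1;
}
```

Taking the address of the same Thumb function twice gives the same
value both times, and a BX through either copy still arrives at the
real entry point.
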
As already mentioned any call via a function pointer will use the BX
instruction (provided that interworking is enabled). The only problem
with this is computing the return address for the return from the
called function. For ARM code this can easily be done by the code
sequence:

        mov lr, pc
        bx  rX

(where rX is the name of the register containing the function
pointer). This code does not work for the Thumb instruction set,
since the MOV instruction will not set the bottom bit of the LR
register, so that when the called function returns, it will return in
ARM mode not Thumb mode. Instead the compiler generates this
sequence:

        bl  _call_via_rX

(again where rX is the name of the register containing the function
pointer). The special _call_via_rX functions look like this:

        .thumb_func
_call_via_r0:
        bx  r0
        nop

The BL instruction ensures that the correct return address is stored
in the LR register and then the BX instruction jumps to the address
stored in the function pointer, switching modes if necessary.

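The return address values discussed in this section can be worked
through with a small C model. It relies on the architectural rule
that reading pc yields the address of the current instruction plus 8
in ARM mode and plus 4 in Thumb mode. The function names are invented
for this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* LR after "mov lr, pc" at ADDRESS in ARM mode.  pc reads as the
   current instruction plus 8, which is the instruction following the
   4 byte "bx rX" that comes next.  */
uint32_t
lr_after_arm_mov (uint32_t address)
{
  assert ((address & 3) == 0);
  return address + 8;
}

/* LR after "mov lr, pc" at ADDRESS in Thumb mode.  pc reads as the
   current instruction plus 4 and the bottom bit is left clear, so a
   later BX to this value would select ARM mode.  */
uint32_t
lr_after_thumb_mov (uint32_t address)
{
  assert ((address & 1) == 0);
  return address + 4;
}

/* LR after a Thumb BL at ADDRESS.  The BL occupies 4 bytes and sets
   the bottom bit of the return address.  */
uint32_t
lr_after_thumb_bl (uint32_t address)
{
  assert ((address & 1) == 0);
  return (address + 4) | 1u;
}
```

In the Thumb MOV case the address is right but the mode bit is wrong,
which is exactly why the call has to go through BL and a _call_via_rX
function instead.

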
6. How caller-super-interworking support works
==============================================

When the -mcaller-super-interworking command line switch is specified
it changes the code produced by the Thumb compiler so that all calls
via function pointers (including virtual function calls) now go via a
different stub function. The code to call via a function pointer now
looks like this:

        bl _interwork_call_via_r0

Note: The compiler does not insist that r0 be used to hold the
function address. Any register will do, and there is a suite of stub
functions, one for each possible register. The stub functions look
like this:

        .code 16
        .thumb_func
_interwork_call_via_r0:
        bx      pc
        nop

        .code 32
        tst     r0, #1
        stmeqdb r13!, {lr}
        adreq   lr, _arm_return
        bx      r0

The stub first switches to ARM mode, since it is a lot easier to
perform the necessary operations using ARM instructions. It then
tests the bottom bit of the register containing the address of the
function to be called. If this bottom bit is set then the function
being called uses Thumb instructions and the BX instruction to come
will switch back into Thumb mode before calling this function. (Note
that it does not matter how this called function chooses to return to
its caller, since both the caller and callee are Thumb functions,
and no mode switching is necessary). If the function being called is
an ARM mode function however, the stub pushes the return address (with
its bottom bit set) onto the stack, replaces the return address with
the address of a piece of code called '_arm_return' and then
performs a BX instruction to call the function.

The '_arm_return' code looks like this:

        .code 32
_arm_return:
        ldmia   r13!, {r12}
        bx      r12
        .code 16


It simply retrieves the return address from the stack, and then
performs a BX operation to return to the caller and switch back into
Thumb mode.

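The bookkeeping done by the stub and by '_arm_return' can be followed
in a C model. This is a sketch under stated assumptions: the address
used for _arm_return is made up, the stack is a four entry array, and
the names call_state, model_interwork_call and model_arm_return are
invented for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Made-up address standing in for the _arm_return label.  */
#define ARM_RETURN_ADDRESS 0x9000u

/* Hypothetical snapshot of the state the stub manipulates.  */
struct call_state
{
  uint32_t lr;          /* link register */
  uint32_t stack[4];    /* tiny stand-in for the real stack */
  int depth;            /* number of words currently pushed */
};

/* Mirror the ARM part of _interwork_call_via_rX for the function
   address TARGET and return the value handed to the final BX.  */
uint32_t
model_interwork_call (struct call_state *s, uint32_t target)
{
  if ((target & 1) == 0)                /* tst rX, #1: ARM callee */
    {
      assert (s->depth < 4);
      s->stack[s->depth++] = s->lr;     /* stmeqdb r13!, {lr} */
      s->lr = ARM_RETURN_ADDRESS;       /* adreq lr, _arm_return */
    }
  return target;                        /* bx rX */
}

/* Mirror _arm_return: pop the saved return address for the final BX.  */
uint32_t
model_arm_return (struct call_state *s)
{
  assert (s->depth > 0);
  return s->stack[--s->depth];          /* ldmia r13!, {r12}; bx r12 */
}
```

Because the saved value still has its bottom bit set, the BX in
'_arm_return' lands back in the Thumb caller however the ARM function
chose to return.

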
7. How callee-super-interworking support works
==============================================

When -mcallee-super-interworking is specified on the command line the
Thumb compiler behaves as if every externally visible function that it
compiles has had the (interfacearm) attribute specified for it. What
this attribute does is to put a special, ARM mode header onto the
function which forces a switch into Thumb mode:

  without __attribute__((interfacearm)):

                .code 16
                .thumb_func
        function:
                ... start of function ...

  with __attribute__((interfacearm)):

                .code 32
        function:
                orr     r12, pc, #1
                bx      r12

                .code 16
                .thumb_func
        .real_start_of_function:

                ... start of function ...

Note that since the function now expects to be entered in ARM mode, it
no longer has the .thumb_func pseudo op specified for its name.
Instead the pseudo op is attached to a new label .real_start_of_<name>
(where <name> is the name of the function) which indicates the start
of the Thumb code. This does have the interesting side effect in that
if this function is now called from a Thumb mode piece of code
outside of the current file, the linker will generate a calling stub
to switch from Thumb mode into ARM mode, and then this is immediately
overridden by the function's header which switches back into Thumb
mode.

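The two instruction ARM header works because pc reads as the address
of the current instruction plus 8 in ARM mode, which is exactly the
first byte after the header. The arithmetic can be written out as
follows; the function names are invented for this sketch and the
entry address is assumed to be word aligned.

```c
#include <assert.h>
#include <stdint.h>

/* Value placed in r12 by "orr r12, pc, #1" when that instruction is
   the first one of a function entered at ENTRY in ARM mode.  pc reads
   as ENTRY + 8, i.e. just past the two 4 byte ARM instructions, and
   the bottom bit is set so that BX selects Thumb mode.  */
uint32_t
header_bx_value (uint32_t entry)
{
  assert ((entry & 3) == 0);
  return (entry + 8) | 1u;
}

/* Address of the first Thumb instruction, as reached by the BX.  */
uint32_t
real_start (uint32_t entry)
{
  return header_bx_value (entry) & ~(uint32_t) 1;
}
```
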
In addition the (interfacearm) attribute also forces the function to
return by using the BX instruction, even if it has not been compiled
with the -mthumb-interwork command line flag, so that the correct mode
will be restored upon exit from the function.


8. Some examples
================

Given these two test files:

        int arm (void) { return 1 + thumb (); }

        int thumb (void) { return 2 + arm (); }

The following pieces of assembler are produced by the ARM and Thumb
versions of GCC depending upon the command line options used:

`-O2':

        .code 32                               .code 16
        .global _arm                           .global _thumb
                                               .thumb_func
_arm:                                  _thumb:
        mov     ip, sp
        stmfd   sp!, {fp, ip, lr, pc}          push    {lr}
        sub     fp, ip, #4
        bl      _thumb                         bl      _arm
        add     r0, r0, #1                     add     r0, r0, #2
        ldmea   fp, {fp, sp, pc}               pop     {pc}

Note how the functions return without using the BX instruction. If
these files were assembled and linked together they would fail to work
because they do not change mode when returning to their caller.

`-O2 -mthumb-interwork':

        .code 32                               .code 16
        .global _arm                           .global _thumb
                                               .thumb_func
_arm:                                  _thumb:
        mov     ip, sp
        stmfd   sp!, {fp, ip, lr, pc}          push    {lr}
        sub     fp, ip, #4
        bl      _thumb                         bl      _arm
        add     r0, r0, #1                     add     r0, r0, #2
        ldmea   fp, {fp, sp, lr}               pop     {r1}
        bx      lr                             bx      r1

Now the functions use BX to return to their caller. They have grown by
4 and 2 bytes respectively, but they can now successfully be linked
together and be expected to work. The linker will replace the
destinations of the two BL instructions with the addresses of calling
stubs which convert to the correct mode before jumping to the called
function.

`-O2 -mcallee-super-interworking':

        .code 32                               .code 32
        .global _arm                           .global _thumb
_arm:                                  _thumb:
                                               orr     r12, pc, #1
                                               bx      r12
        mov     ip, sp                         .code 16
        stmfd   sp!, {fp, ip, lr, pc}          push    {lr}
        sub     fp, ip, #4
        bl      _thumb                         bl      _arm
        add     r0, r0, #1                     add     r0, r0, #2
        ldmea   fp, {fp, sp, lr}               pop     {r1}
        bx      lr                             bx      r1

The thumb function now has an ARM encoded prologue, and it no longer
has the `.thumb_func' pseudo op attached to it. The linker will not
generate a calling stub for the call from arm() to thumb(), but it will
still have to generate a stub for the call from thumb() to arm(). Also
note how specifying `-mcallee-super-interworking' automatically
implies `-mthumb-interwork'.


9. Some Function Pointer Examples
=================================

Given this test file:

        int func (void) { return 1; }

        int call (int (* ptr)(void)) { return ptr (); }

The following varying pieces of assembler are produced by the Thumb
version of GCC depending upon the command line options used:

`-O2':

        .code 16
        .globl  _func
        .thumb_func
_func:
        mov     r0, #1
        bx      lr

        .globl  _call
        .thumb_func
_call:
        push    {lr}
        bl      __call_via_r0
        pop     {pc}

Note how the two functions have different exit sequences. In
particular call() uses pop {pc} to return, which would not work if the
caller was in ARM mode. func() however, uses the BX instruction, even
though `-mthumb-interwork' has not been specified, as this is the most
efficient way to exit a function when the return address is held in the
link register.

`-O2 -mthumb-interwork':

        .code 16
        .globl  _func
        .thumb_func
_func:
        mov     r0, #1
        bx      lr

        .globl  _call
        .thumb_func
_call:
        push    {lr}
        bl      __call_via_r0
        pop     {r1}
        bx      r1

This time both functions return by using the BX instruction. This
means that call() is now two bytes longer and several cycles slower
than the previous version.

`-O2 -mcaller-super-interworking':

        .code 16
        .globl  _func
        .thumb_func
_func:
        mov     r0, #1
        bx      lr

        .globl  _call
        .thumb_func
_call:
        push    {lr}
        bl      __interwork_call_via_r0
        pop     {pc}

This is very similar to the first (non-interworking) version, except
that a different stub is used to call via the function pointer. This
new stub will work even if the called function is not interworking
aware, and tries to return to call() in ARM mode. Note that the
assembly code for call() is still not interworking aware itself, and
so should not be called from ARM code.

`-O2 -mcallee-super-interworking':

        .code 32
        .globl  _func
_func:
        orr     r12, pc, #1
        bx      r12

        .code 16
        .globl  .real_start_of_func
        .thumb_func
.real_start_of_func:
        mov     r0, #1
        bx      lr

        .code 32
        .globl  _call
_call:
        orr     r12, pc, #1
        bx      r12

        .code 16
        .globl  .real_start_of_call
        .thumb_func
.real_start_of_call:
        push    {lr}
        bl      __call_via_r0
        pop     {r1}
        bx      r1

Now both functions have an ARM coded prologue, and both functions
return by using the BX instruction. These functions are interworking
aware therefore and can safely be called from ARM code. The code for
the call() function is now 10 bytes longer than the original,
non-interworking aware version (18 bytes instead of 8), more than
doubling its size.

If a prototype for call() is added to the source code, and this
prototype includes the `interfacearm' attribute:

        int __attribute__((interfacearm)) call (int (* ptr)(void));

then this code is produced (with only -O2 specified on the command
line):

        .code 16
        .globl  _func
        .thumb_func
_func:
        mov     r0, #1
        bx      lr

        .globl  _call
        .code 32
_call:
        orr     r12, pc, #1
        bx      r12

        .code 16
        .globl  .real_start_of_call
        .thumb_func
.real_start_of_call:
        push    {lr}
        bl      __call_via_r0
        pop     {r1}
        bx      r1

So now both call() and func() can be safely called via
non-interworking aware ARM code. If, when such a file is assembled,
the assembler detects the fact that call() is being called by another
function in the same file, it will automatically adjust the target of
the BL instruction to point to .real_start_of_call. In this way there
is no need for the linker to generate a Thumb-to-ARM calling stub so
that call can be entered in ARM mode.

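Whichever set of options is used, the observable behaviour of the test
file should not change. A small checker such as the one below could be
added when experimenting with the switches; call_matches_func is an
invented helper and is not part of the original test file.

```c
/* The test file from this section.  */
int func (void) { return 1; }

int call (int (* ptr)(void)) { return ptr (); }

/* Hypothetical checker: 1 if the call through the pointer produced
   the value that func() returns directly.  */
int
call_matches_func (void)
{
  return call (func) == func ();
}
```

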
10. How to use dlltool to build ARM/Thumb DLLs
==============================================

Given a program (`prog.c') like this:

        extern int func_in_dll (void);

        int main (void) { return func_in_dll(); }

And a DLL source file (`dll.c') like this:

        int func_in_dll (void) { return 1; }

Here is how to build the DLL and the program for a purely ARM based
environment:

*Step One
  Build a `.def' file describing the DLL:

        ; example.def
        ; This file describes the contents of the DLL
        LIBRARY     example
        HEAPSIZE    0x40000, 0x2000
        EXPORTS
            func_in_dll  1

*Step Two
  Compile the DLL source code:

        arm-pe-gcc -O2 -c dll.c

*Step Three
  Use `dlltool' to create an exports file and a library file:

        dlltool --def example.def --output-exp example.o --output-lib example.a

*Step Four
  Link together the complete DLL:

        arm-pe-ld dll.o example.o -o example.dll

*Step Five
  Compile the program's source code:

        arm-pe-gcc -O2 -c prog.c

*Step Six
  Link together the program and the DLL's library file:

        arm-pe-gcc prog.o example.a -o prog

If instead this was a Thumb DLL being called from an ARM program, the
steps would look like this. (To save space only those steps that are
different from the previous version are shown):

*Step Two
  Compile the DLL source code (using the Thumb compiler):

        thumb-pe-gcc -O2 -c dll.c -mthumb-interwork

*Step Three
  Build the exports and library files (and support interworking):

        dlltool -d example.def -e example.o -l example.a --interwork -m thumb

*Step Five
  Compile the program's source code (and support interworking):

        arm-pe-gcc -O2 -c prog.c -mthumb-interwork

If instead, the DLL was an old, ARM DLL which does not support
interworking, and which cannot be rebuilt, then these steps would be
used.

*Step One
  Skip. If you do not have access to the sources of a DLL, there is
  no point in building a `.def' file for it.

*Step Two
  Skip. With no DLL sources there is nothing to compile.

*Step Three
  Skip. Without a `.def' file you cannot use dlltool to build an
  exports file or a library file.

*Step Four
  Skip. Without a set of DLL object files you cannot build the DLL.
  Besides it has already been built for you by somebody else.

*Step Five
  Compile the program's source code, this is the same as before:

        arm-pe-gcc -O2 -c prog.c

*Step Six
  Link together the program and the DLL's library file, passing the
  `--support-old-code' option to the linker:

        arm-pe-gcc prog.o example.a -Wl,--support-old-code -o prog

  Ignore the warning message about the input file not supporting
  interworking as the --support-old-code switch has taken care of this.