1 |
\input texinfo @c -*-texinfo-*-
|
2 |
|
|
|
3 |
|
|
@c NOTE THIS IS NOT A GOOD EXAMPLE OF HOW TO DO A MANUAL. FIXME!!!
|
4 |
|
|
|
5 |
|
|
|
6 |
|
|
|
7 |
|
|
@c %**start of header
|
8 |
|
|
@setfilename treelang.info
|
9 |
|
|
|
10 |
|
|
@include gcc-common.texi
|
11 |
|
|
|
12 |
|
|
@set copyrights-treelang 1995,1996,1997,1998,1999,2000,2001,2002,2003,2004,2005
|
13 |
|
|
|
14 |
|
|
@set email-general gcc@@gcc.gnu.org
|
15 |
|
|
@set email-bugs gcc-bugs@@gcc.gnu.org or bug-gcc@@gnu.org
|
16 |
|
|
@set email-patches gcc-patches@@gcc.gnu.org
|
17 |
|
|
@set path-treelang gcc/gcc/treelang
|
18 |
|
|
|
19 |
|
|
@set which-treelang GCC-@value{version-GCC}
|
20 |
|
|
@set which-GCC GCC
|
21 |
|
|
|
22 |
|
|
@set email-josling tej@@melbpc.org.au
|
23 |
|
|
@set www-josling http://www.geocities.com/timjosling
|
24 |
|
|
|
25 |
|
|
@c This tells @include'd files that they're part of the overall TREELANG doc
|
26 |
|
|
@c set. (They might be part of a higher-level doc set too.)
|
27 |
|
|
@set DOC-TREELANG
|
28 |
|
|
|
29 |
|
|
@c @setfilename usetreelang.info
|
30 |
|
|
@c @setfilename maintaintreelang.info
|
31 |
|
|
@c To produce the full manual, use the "treelang.info" setfilename, and
|
32 |
|
|
@c make sure the following do NOT begin with '@c' (and the @clear lines DO)
|
33 |
|
|
@set INTERNALS
|
34 |
|
|
@set USING
|
35 |
|
|
@c To produce a user-only manual, use the "usetreelang.info" setfilename, and
|
36 |
|
|
@c make sure the following does NOT begin with '@c':
|
37 |
|
|
@c @clear INTERNALS
|
38 |
|
|
@c To produce a maintainer-only manual, use the "maintaintreelang.info" setfilename,
|
39 |
|
|
@c and make sure the following does NOT begin with '@c':
|
40 |
|
|
@c @clear USING
|
41 |
|
|
|
42 |
|
|
@ifset INTERNALS
|
43 |
|
|
@ifset USING
|
44 |
|
|
@settitle Using and Maintaining GNU Treelang
|
45 |
|
|
@end ifset
|
46 |
|
|
@end ifset
|
47 |
|
|
@c seems reasonable to assume at least one of INTERNALS or USING is set...
|
48 |
|
|
@ifclear INTERNALS
|
49 |
|
|
@settitle Using GNU Treelang
|
50 |
|
|
@end ifclear
|
51 |
|
|
@ifclear USING
|
52 |
|
|
@settitle Maintaining GNU Treelang
|
53 |
|
|
@end ifclear
|
54 |
|
|
@c then again, have some fun
|
55 |
|
|
@ifclear INTERNALS
|
56 |
|
|
@ifclear USING
|
57 |
|
|
@settitle Doing Very Little at all with GNU Treelang
|
58 |
|
|
@end ifclear
|
59 |
|
|
@end ifclear
|
60 |
|
|
|
61 |
|
|
@syncodeindex fn cp
|
62 |
|
|
@syncodeindex vr cp
|
63 |
|
|
@c %**end of header
|
64 |
|
|
|
65 |
|
|
@c Cause even numbered pages to be printed on the left hand side of
|
66 |
|
|
@c the page and odd numbered pages to be printed on the right hand
|
67 |
|
|
@c side of the page. Using this, you can print on both sides of a
|
68 |
|
|
@c sheet of paper and have the text on the same part of the sheet.
|
69 |
|
|
|
70 |
|
|
@c The text on right hand pages is pushed towards the right hand
|
71 |
|
|
@c margin and the text on left hand pages is pushed toward the left
|
72 |
|
|
@c hand margin.
|
73 |
|
|
@c (To provide the reverse effect, set bindingoffset to -0.75in.)
|
74 |
|
|
|
75 |
|
|
@c @tex
|
76 |
|
|
@c \global\bindingoffset=0.75in
|
77 |
|
|
@c \global\normaloffset =0.75in
|
78 |
|
|
@c @end tex
|
79 |
|
|
|
80 |
|
|
@copying
|
81 |
|
|
Copyright @copyright{} @value{copyrights-treelang} Free Software Foundation, Inc.
|
82 |
|
|
|
83 |
|
|
Permission is granted to copy, distribute and/or modify this document
|
84 |
|
|
under the terms of the GNU Free Documentation License, Version 1.2 or
|
85 |
|
|
any later version published by the Free Software Foundation; with the
|
86 |
|
|
Invariant Sections being ``GNU General Public License'', the Front-Cover
|
87 |
|
|
texts being (a) (see below), and with the Back-Cover Texts being (b)
|
88 |
|
|
(see below). A copy of the license is included in the section entitled
|
89 |
|
|
``GNU Free Documentation License''.
|
90 |
|
|
|
91 |
|
|
(a) The FSF's Front-Cover Text is:
|
92 |
|
|
|
93 |
|
|
A GNU Manual
|
94 |
|
|
|
95 |
|
|
(b) The FSF's Back-Cover Text is:
|
96 |
|
|
|
97 |
|
|
You have freedom to copy and modify this GNU Manual, like GNU
|
98 |
|
|
software. Copies published by the Free Software Foundation raise
|
99 |
|
|
funds for GNU development.
|
100 |
|
|
@end copying
|
101 |
|
|
|
102 |
|
|
@ifnottex
|
103 |
|
|
@dircategory Software development
|
104 |
|
|
@direntry
|
105 |
|
|
* treelang: (treelang). The GNU Treelang compiler.
|
106 |
|
|
@end direntry
|
107 |
|
|
@ifset INTERNALS
|
108 |
|
|
@ifset USING
|
109 |
|
|
This file documents the use and the internals of the GNU Treelang
|
110 |
|
|
(@code{treelang}) compiler. At the moment this manual is not
|
111 |
|
|
incorporated into the main GCC manual as it is incomplete. It
|
112 |
|
|
corresponds to the @value{which-treelang} version of @code{treelang}.
|
113 |
|
|
@end ifset
|
114 |
|
|
@end ifset
|
115 |
|
|
@ifclear USING
|
116 |
|
|
This file documents the internals of the GNU Treelang (@code{treelang}) compiler.
|
117 |
|
|
It corresponds to the @value{which-treelang} version of @code{treelang}.
|
118 |
|
|
@end ifclear
|
119 |
|
|
@ifclear INTERNALS
|
120 |
|
|
This file documents the use of the GNU Treelang (@code{treelang}) compiler.
|
121 |
|
|
It corresponds to the @value{which-treelang} version of @code{treelang}.
|
122 |
|
|
@end ifclear
|
123 |
|
|
|
124 |
|
|
Published by the Free Software Foundation
|
125 |
|
|
51 Franklin Street, Fifth Floor
|
126 |
|
|
Boston, MA 02110-1301 USA
|
127 |
|
|
|
128 |
|
|
@insertcopying
|
129 |
|
|
@end ifnottex
|
130 |
|
|
|
131 |
|
|
@setchapternewpage odd
|
132 |
|
|
@c @finalout
|
133 |
|
|
@titlepage
|
134 |
|
|
@ifset INTERNALS
|
135 |
|
|
@ifset USING
|
136 |
|
|
@title Using and Maintaining GNU Treelang
|
137 |
|
|
@end ifset
|
138 |
|
|
@end ifset
|
139 |
|
|
@ifclear INTERNALS
|
140 |
|
|
@title Using GNU Treelang
|
141 |
|
|
@end ifclear
|
142 |
|
|
@ifclear USING
|
143 |
|
|
@title Maintaining GNU Treelang
|
144 |
|
|
@end ifclear
|
145 |
|
|
@versionsubtitle
|
146 |
|
|
@author Tim Josling
|
147 |
|
|
@page
|
148 |
|
|
@vskip 0pt plus 1filll
|
149 |
|
|
Published by the Free Software Foundation @*
|
150 |
|
|
51 Franklin Street, Fifth Floor@*
|
151 |
|
|
Boston, MA 02110-1301, USA@*
|
152 |
|
|
@c Last printed ??ber, 19??.@*
|
153 |
|
|
@c Printed copies are available for $? each.@*
|
154 |
|
|
@c ISBN ???
|
155 |
|
|
@sp 1
|
156 |
|
|
@insertcopying
|
157 |
|
|
@end titlepage
|
158 |
|
|
@page
|
159 |
|
|
|
160 |
|
|
@ifnottex
|
161 |
|
|
|
162 |
|
|
@node Top, Copying,, (dir)
|
163 |
|
|
@top Introduction
|
164 |
|
|
@cindex Introduction
|
165 |
|
|
|
166 |
|
|
@ifset INTERNALS
|
167 |
|
|
@ifset USING
|
168 |
|
|
This manual documents how to run, install and maintain @code{treelang}.
|
169 |
|
|
It also documents the features and incompatibilities in the @value{which-treelang}
|
170 |
|
|
version of @code{treelang}.
|
171 |
|
|
@end ifset
|
172 |
|
|
@end ifset
|
173 |
|
|
|
174 |
|
|
@ifclear INTERNALS
|
175 |
|
|
This manual documents how to run and install @code{treelang}.
|
176 |
|
|
It also documents the features and incompatibilities in the @value{which-treelang}
|
177 |
|
|
version of @code{treelang}.
|
178 |
|
|
@end ifclear
|
179 |
|
|
@ifclear USING
|
180 |
|
|
This manual documents how to maintain @code{treelang}.
|
181 |
|
|
It also documents the features and incompatibilities in the @value{which-treelang}
|
182 |
|
|
version of @code{treelang}.
|
183 |
|
|
@end ifclear
|
184 |
|
|
|
185 |
|
|
@end ifnottex
|
186 |
|
|
|
187 |
|
|
@menu
|
188 |
|
|
* Copying::
|
189 |
|
|
* Contributors::
|
190 |
|
|
* GNU Free Documentation License::
|
191 |
|
|
* Funding::
|
192 |
|
|
* Getting Started::
|
193 |
|
|
* What is GNU Treelang?::
|
194 |
|
|
* Lexical Syntax::
|
195 |
|
|
* Parsing Syntax::
|
196 |
|
|
* Compiler Overview::
|
197 |
|
|
* TREELANG and GCC::
|
198 |
|
|
* Compiler::
|
199 |
|
|
* Other Languages::
|
200 |
|
|
* treelang internals::
|
201 |
|
|
* Open Questions::
|
202 |
|
|
* Bugs::
|
203 |
|
|
* Service::
|
204 |
|
|
* Projects::
|
205 |
|
|
* Index::
|
206 |
|
|
|
207 |
|
|
@detailmenu
|
208 |
|
|
--- The Detailed Node Listing ---
|
209 |
|
|
|
210 |
|
|
Other Languages
|
211 |
|
|
|
212 |
|
|
* Interoperating with C and C++::
|
213 |
|
|
|
214 |
|
|
treelang internals
|
215 |
|
|
|
216 |
|
|
* treelang files::
|
217 |
|
|
* treelang compiler interfaces::
|
218 |
|
|
* Hints and tips::
|
219 |
|
|
|
220 |
|
|
treelang compiler interfaces
|
221 |
|
|
|
222 |
|
|
* treelang driver::
|
223 |
|
|
* treelang main compiler::
|
224 |
|
|
|
225 |
|
|
treelang main compiler
|
226 |
|
|
|
227 |
|
|
* Interfacing to toplev.c::
|
228 |
|
|
* Interfacing to the garbage collection::
|
229 |
|
|
* Interfacing to the code generation code. ::
|
230 |
|
|
|
231 |
|
|
Reporting Bugs
|
232 |
|
|
|
233 |
|
|
* Sending Patches::
|
234 |
|
|
|
235 |
|
|
@end detailmenu
|
236 |
|
|
@end menu
|
237 |
|
|
|
238 |
|
|
@include gpl.texi
|
239 |
|
|
|
240 |
|
|
@include fdl.texi
|
241 |
|
|
|
242 |
|
|
@node Contributors
|
243 |
|
|
|
244 |
|
|
@unnumbered Contributors to GNU Treelang
|
245 |
|
|
@cindex contributors
|
246 |
|
|
@cindex credits
|
247 |
|
|
|
248 |
|
|
Treelang was based on 'toy' by Richard Kenner, and also uses code from
|
249 |
|
|
the GCC core code tree. Tim Josling first created the language and
|
250 |
|
|
documentation, based on the GCC Fortran compiler's documentation
|
251 |
|
|
framework. Treelang was updated to use the TreeSSA infrastructure by
|
252 |
|
|
James A. Morrison.
|
253 |
|
|
|
254 |
|
|
@itemize @bullet
|
255 |
|
|
@item
|
256 |
|
|
The packaging and compiler portions of GNU Treelang are based largely
|
257 |
|
|
on the GCC compiler.
|
258 |
|
|
@xref{Contributors,,Contributors to GCC,GCC,Using and Maintaining GCC},
|
259 |
|
|
for more information.
|
260 |
|
|
|
261 |
|
|
@item
|
262 |
|
|
There is no specific run-time library for treelang, other than the
|
263 |
|
|
standard C runtime.
|
264 |
|
|
|
265 |
|
|
@item
|
266 |
|
|
It would have been difficult to build treelang without access to Joachim
|
267 |
|
|
Nadler's guide to writing a front end to GCC (written in German). A
|
268 |
|
|
translation of this document into English is available via the
|
269 |
|
|
CobolForGCC project or via the documentation links from the GCC home
|
270 |
|
|
page @uref{http://gcc.gnu.org}.
|
271 |
|
|
@end itemize
|
272 |
|
|
|
273 |
|
|
@include funding.texi
|
274 |
|
|
|
275 |
|
|
@node Getting Started
|
276 |
|
|
@chapter Getting Started
|
277 |
|
|
@cindex getting started
|
278 |
|
|
@cindex new users
|
279 |
|
|
@cindex newbies
|
280 |
|
|
@cindex beginners
|
281 |
|
|
|
282 |
|
|
Treelang is a sample language, useful only to help people understand how
|
283 |
|
|
to implement a new language front end to GCC. It is not a useful
|
284 |
|
|
language in itself other than as an example or basis for building a new
|
285 |
|
|
language. Therefore only language developers are likely to have an
|
286 |
|
|
interest in it.
|
287 |
|
|
|
288 |
|
|
This manual assumes familiarity with GCC, which you can obtain by using
|
289 |
|
|
it and by reading the manuals @samp{Using the GNU Compiler Collection (GCC)}
|
290 |
|
|
and @samp{GNU Compiler Collection (GCC) Internals}.
|
291 |
|
|
|
292 |
|
|
To install treelang, follow the GCC installation instructions,
|
293 |
|
|
taking care to ensure you specify treelang in the configure step by adding
|
294 |
|
|
treelang to the list of languages specified by @option{--enable-languages},
|
295 |
|
|
e.g.@: @samp{--enable-languages=all,treelang}.
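
As a rough sketch, a build that includes treelang might be configured as
follows. The directory names @file{gcc} and @file{objdir} and the exact
language list are assumptions made for illustration only; the GCC
installation instructions remain the authority on the options that suit
your system.

@smallexample
mkdir objdir
cd objdir
../gcc/configure --enable-languages=c,treelang
make
@end smallexample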
|
296 |
|
|
|
297 |
|
|
If you're generally curious about the future of
|
298 |
|
|
@code{treelang}, see @ref{Projects}.
|
299 |
|
|
If you're curious about its past,
|
300 |
|
|
see @ref{Contributors}.
|
301 |
|
|
|
302 |
|
|
To see a few of the questions maintainers of @code{treelang} have,
|
303 |
|
|
and that you might be able to answer,
|
304 |
|
|
see @ref{Open Questions}.
|
305 |
|
|
|
306 |
|
|
@ifset USING
|
307 |
|
|
@node What is GNU Treelang?, Lexical Syntax, Getting Started, Top
|
308 |
|
|
@chapter What is GNU Treelang?
|
309 |
|
|
@cindex concepts, basic
|
310 |
|
|
@cindex basic concepts
|
311 |
|
|
|
312 |
|
|
GNU Treelang, or @code{treelang}, is designed initially as a free
|
313 |
|
|
replacement for, or alternative to, the 'toy' language, one that is
|
314 |
|
|
amenable to inclusion within the GCC source tree.
|
315 |
|
|
|
316 |
|
|
@code{treelang} is largely a cut down version of C, designed to showcase
|
317 |
|
|
the features of the GCC code generation back end. Only those features
|
318 |
|
|
that are directly supported by the GCC code generation back end are
|
319 |
|
|
implemented. Features are implemented in whichever manner is easiest and
|
320 |
|
|
clearest to implement. Not all or even most code generation back end
|
321 |
|
|
features are implemented. The intention is to add features incrementally
|
322 |
|
|
until most features of the GCC back end are implemented in treelang.
|
323 |
|
|
|
324 |
|
|
The main features missing are structures, arrays and pointers.
|
325 |
|
|
|
326 |
|
|
A sample program follows:
|
327 |
|
|
|
328 |
|
|
@smallexample
|
329 |
|
|
// @r{function prototypes}
|
330 |
|
|
// @r{function 'add' taking two ints and returning an int}
|
331 |
|
|
external_definition int add(int arg1, int arg2);
|
332 |
|
|
external_definition int subtract(int arg3, int arg4);
|
333 |
|
|
external_definition int first_nonzero(int arg5, int arg6);
|
334 |
|
|
external_definition int double_plus_one(int arg7);
|
335 |
|
|
|
336 |
|
|
// @r{function definition}
|
337 |
|
|
add
|
338 |
|
|
@{
|
339 |
|
|
// @r{return the sum of arg1 and arg2}
|
340 |
|
|
return arg1 + arg2;
|
341 |
|
|
@}
|
342 |
|
|
|
343 |
|
|
|
344 |
|
|
subtract
|
345 |
|
|
@{
|
346 |
|
|
return arg3 - arg4;
|
347 |
|
|
@}
|
348 |
|
|
|
349 |
|
|
double_plus_one
|
350 |
|
|
@{
|
351 |
|
|
// @r{aaa is a variable, of type integer and allocated at the start of}
|
352 |
|
|
// @r{the function}
|
353 |
|
|
automatic int aaa;
|
354 |
|
|
// @r{set aaa to the value returned from add, when passed arg7 and arg7 as}
|
355 |
|
|
// @r{the two parameters}
|
356 |
|
|
aaa=add(arg7, arg7);
|
357 |
|
|
aaa=add(aaa, aaa);
|
358 |
|
|
aaa=subtract(subtract(aaa, arg7), arg7) + 1;
|
359 |
|
|
return aaa;
|
360 |
|
|
@}
|
361 |
|
|
|
362 |
|
|
first_nonzero
|
363 |
|
|
@{
|
364 |
|
|
// @r{C-like if statement}
|
365 |
|
|
if (arg5)
|
366 |
|
|
@{
|
367 |
|
|
return arg5;
|
368 |
|
|
@}
|
369 |
|
|
else
|
370 |
|
|
@{
|
371 |
|
|
@}
|
372 |
|
|
return arg6;
|
373 |
|
|
@}
|
374 |
|
|
@end smallexample
|
375 |
|
|
|
376 |
|
|
@node Lexical Syntax, Parsing Syntax, What is GNU Treelang?, Top
|
377 |
|
|
@chapter Lexical Syntax
|
378 |
|
|
@cindex Lexical Syntax
|
379 |
|
|
|
380 |
|
|
Treelang programs consist of whitespace, comments, keywords and names.
|
381 |
|
|
@itemize @bullet
|
382 |
|
|
|
383 |
|
|
@item
|
384 |
|
|
Whitespace consists of the space character, a tab, and the end of line
|
385 |
|
|
character. Line terminations are as defined by the
|
386 |
|
|
standard C library. Whitespace is ignored except within comments,
|
387 |
|
|
and where it separates parts of the program. In the example below, A and
|
388 |
|
|
B are two separate names separated by whitespace.
|
389 |
|
|
|
390 |
|
|
@smallexample
|
391 |
|
|
A B
|
392 |
|
|
@end smallexample
|
393 |
|
|
|
394 |
|
|
@item
|
395 |
|
|
Comments consist of @samp{//} followed by any characters up to the end
|
396 |
|
|
of the line. C-style comments (@samp{/* */}) are not supported. For example,
|
397 |
|
|
the assignment below is followed by a not very helpful comment.
|
398 |
|
|
|
399 |
|
|
@smallexample
|
400 |
|
|
x = 1; // @r{Set X to 1}
|
401 |
|
|
@end smallexample
|
402 |
|
|
|
403 |
|
|
@item
|
404 |
|
|
Keywords consist of any of the following reserved words or symbols:
|
405 |
|
|
|
406 |
|
|
@itemize @bullet
|
407 |
|
|
@item @{
|
408 |
|
|
used to start the statements in a function
|
409 |
|
|
@item @}
|
410 |
|
|
used to end the statements in a function
|
411 |
|
|
@item (
|
412 |
|
|
start a list of function arguments, or change the precedence of operators in
|
413 |
|
|
an expression
|
414 |
|
|
@item )
|
415 |
|
|
end a list of function arguments, or a parenthesized part of an expression
|
416 |
|
|
@item ,
|
417 |
|
|
used to separate parameters in a function prototype or in a function call
|
418 |
|
|
@item ;
|
419 |
|
|
used to end a statement
|
420 |
|
|
@item +
|
421 |
|
|
addition, or unary plus for signed literals
|
422 |
|
|
@item -
|
423 |
|
|
subtraction, or unary minus for signed literals
|
424 |
|
|
@item =
|
425 |
|
|
assignment
|
426 |
|
|
@item ==
|
427 |
|
|
equality test
|
428 |
|
|
@item if
|
429 |
|
|
begin IF statement
|
430 |
|
|
@item else
|
431 |
|
|
begin 'else' portion of IF statement
|
432 |
|
|
@item static
|
433 |
|
|
indicate variable is permanent, or function has file scope only
|
434 |
|
|
@item automatic
|
435 |
|
|
indicate that variable is allocated for the life of the current scope
|
436 |
|
|
@item external_reference
|
437 |
|
|
indicate that variable or function is defined in another file
|
438 |
|
|
@item external_definition
|
439 |
|
|
indicate that variable or function is to be accessible from other files
|
440 |
|
|
@item int
|
441 |
|
|
variable is an integer (same as C int)
|
442 |
|
|
@item char
|
443 |
|
|
variable is a character (same as C char)
|
444 |
|
|
@item unsigned
|
445 |
|
|
variable is unsigned. If this is not present, the variable is signed
|
446 |
|
|
@item return
|
447 |
|
|
start function return statement
|
448 |
|
|
@item void
|
449 |
|
|
used as function type to indicate function returns nothing
|
450 |
|
|
@end itemize
|
451 |
|
|
|
452 |
|
|
|
453 |
|
|
@item
|
454 |
|
|
Names consist of any letter or "_" followed by any number of letters,
|
455 |
|
|
numbers, or "_". "$" is not allowed in a name. All names must be globally
|
456 |
|
|
unique, i.e.@: may not be used twice in any context, and must
|
457 |
|
|
not be a keyword. Names and keywords are case sensitive. For example:
|
458 |
|
|
|
459 |
|
|
@smallexample
|
460 |
|
|
a A _a a_ IF_X
|
461 |
|
|
@end smallexample
|
462 |
|
|
|
463 |
|
|
are all different names.
|
464 |
|
|
|
465 |
|
|
@end itemize
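
The lexical rules above can be seen together in a short fragment. This is
only a sketch written from the rules in this chapter; the variable names
are invented for illustration and the fragment has not been checked
against the compiler.

@smallexample
// @r{a comment runs to the end of the line}
static int count = 0;
// @r{names are case sensitive, so Count is a different name from count}
static int Count = +1;
// @r{a name may begin with an underscore}
static int _count = -1;
@end smallexample

Each name appears only once, as required by the rule that names be
globally unique.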
|
466 |
|
|
|
467 |
|
|
@node Parsing Syntax, Compiler Overview, Lexical Syntax, Top
|
468 |
|
|
@chapter Parsing Syntax
|
469 |
|
|
@cindex Parsing Syntax
|
470 |
|
|
|
471 |
|
|
Declarations are built up from the lexical elements described above. A
|
472 |
|
|
file may contain one or more declarations.
|
473 |
|
|
|
474 |
|
|
@itemize @bullet
|
475 |
|
|
|
476 |
|
|
@item
|
477 |
|
|
declaration: variable_declaration OR function_prototype OR function_declaration
|
478 |
|
|
|
479 |
|
|
@item
|
480 |
|
|
function_prototype: storage type NAME ( optional_parameter_list );
|
481 |
|
|
|
482 |
|
|
@smallexample
|
483 |
|
|
static int add (int a, int b);
|
484 |
|
|
@end smallexample
|
485 |
|
|
|
486 |
|
|
@item
|
487 |
|
|
variable_declaration: storage type NAME initial;
|
488 |
|
|
|
489 |
|
|
Example:
|
490 |
|
|
|
491 |
|
|
@smallexample
|
492 |
|
|
static int temp1 = 1;
|
493 |
|
|
@end smallexample
|
494 |
|
|
|
495 |
|
|
A variable declaration can be outside a function, or at the start of a
|
496 |
|
|
function.
|
497 |
|
|
|
498 |
|
|
@item
|
499 |
|
|
storage: automatic OR static OR external_reference OR external_definition
|
500 |
|
|
|
501 |
|
|
This defines the scope, duration and visibility of a function or variable.
|
502 |
|
|
|
503 |
|
|
@enumerate 1
|
504 |
|
|
|
505 |
|
|
@item
|
506 |
|
|
automatic: This means a variable is allocated at the start of the current scope and
|
507 |
|
|
released when the current scope is exited. This can only be used for variables
|
508 |
|
|
within functions. It cannot be used for functions.
|
509 |
|
|
|
510 |
|
|
@item
|
511 |
|
|
static: This means a variable is allocated at the start of the program and
|
512 |
|
|
remains allocated until the program as a whole ends. For a function, it
|
513 |
|
|
means that the function is only visible within the current file.
|
514 |
|
|
|
515 |
|
|
@item
|
516 |
|
|
external_definition: For a variable, which must be defined outside a
|
517 |
|
|
function, it means that the variable is visible from other files. For a
|
518 |
|
|
function, it means that the function is visible from other files.
|
519 |
|
|
|
520 |
|
|
@item
|
521 |
|
|
external_reference: For a variable, which must be defined outside a
|
522 |
|
|
function, it means that the variable is defined in another file. For a
|
523 |
|
|
function, it means that the function is defined in another file.
|
524 |
|
|
|
525 |
|
|
@end enumerate
|
526 |
|
|
|
527 |
|
|
@item
|
528 |
|
|
type: int OR unsigned int OR char OR unsigned char OR void
|
529 |
|
|
|
530 |
|
|
This defines the data type of a variable or the return type of a function.
|
531 |
|
|
|
532 |
|
|
@enumerate a
|
533 |
|
|
|
534 |
|
|
@item
|
535 |
|
|
int: The variable is a signed integer. The function returns a signed integer.
|
536 |
|
|
|
537 |
|
|
@item
|
538 |
|
|
unsigned int: The variable is an unsigned integer. The function returns an unsigned integer.
|
539 |
|
|
|
540 |
|
|
@item
|
541 |
|
|
char: The variable is a signed character. The function returns a signed character.
|
542 |
|
|
|
543 |
|
|
@item
|
544 |
|
|
unsigned char: The variable is an unsigned character. The function returns an unsigned character.
|
545 |
|
|
|
546 |
|
|
@end enumerate
|
547 |
|
|
|
548 |
|
|
@item
|
549 |
|
|
parameter_list: parameter [, parameter]...
|
550 |
|
|
|
551 |
|
|
@item
|
552 |
|
|
parameter: variable_declaration
|
553 |
|
|
|
554 |
|
|
The variable declarations must not have initializations.
|
555 |
|
|
|
556 |
|
|
@item
|
557 |
|
|
initial: = value
|
558 |
|
|
|
559 |
|
|
@item
|
560 |
|
|
value: integer_constant
|
561 |
|
|
|
562 |
|
|
Values without a unary plus or minus are considered to be unsigned.
|
563 |
|
|
@smallexample
|
564 |
|
|
e.g.@: 1 +2 -3
|
565 |
|
|
@end smallexample
|
566 |
|
|
|
567 |
|
|
@item
|
568 |
|
|
function_declaration: name @{ variable_declarations statements @}
|
569 |
|
|
|
570 |
|
|
A function consists of the function name then the declarations (if any)
|
571 |
|
|
and statements (if any) within one pair of braces.
|
572 |
|
|
|
573 |
|
|
The details of the function arguments come from the function
|
574 |
|
|
prototype. The function prototype must precede the function declaration
|
575 |
|
|
in the file.
|
576 |
|
|
|
577 |
|
|
@item
|
578 |
|
|
statement: if_statement OR expression_statement OR return_statement
|
579 |
|
|
|
580 |
|
|
@item
|
581 |
|
|
if_statement: if ( expression ) @{ variable_declarations statements @}
|
582 |
|
|
else @{ variable_declarations statements @}
|
583 |
|
|
|
584 |
|
|
The first lot of statements is executed if the expression is
|
585 |
|
|
nonzero. Otherwise the second lot of statements is executed. Either
|
586 |
|
|
list of statements may be empty, but both sets of braces and the else must be present.
|
587 |
|
|
|
588 |
|
|
@smallexample
|
589 |
|
|
if (a==b)
|
590 |
|
|
@{
|
591 |
|
|
// @r{nothing}
|
592 |
|
|
@}
|
593 |
|
|
else
|
594 |
|
|
@{
|
595 |
|
|
a=b;
|
596 |
|
|
@}
|
597 |
|
|
@end smallexample
|
598 |
|
|
|
599 |
|
|
@item
|
600 |
|
|
expression_statement: expression;
|
601 |
|
|
|
602 |
|
|
The expression is executed, including any side effects.
|
603 |
|
|
|
604 |
|
|
@item
|
605 |
|
|
return_statement: return expression_opt;
|
606 |
|
|
|
607 |
|
|
Returns from the function. If the function is void, the expression must
|
608 |
|
|
be absent, and if the function is not void the expression must be
|
609 |
|
|
present.
|
610 |
|
|
|
611 |
|
|
@item
|
612 |
|
|
expression: variable OR integer_constant OR expression + expression
|
613 |
|
|
OR expression - expression OR expression == expression OR ( expression )
|
614 |
|
|
OR variable = expression OR function_call
|
615 |
|
|
|
616 |
|
|
An expression can be a constant or a variable reference or a
|
617 |
|
|
function_call. Expressions can be combined as a sum of two expressions
|
618 |
|
|
or the difference of two expressions, or an equality test of two
|
619 |
|
|
expressions. An assignment is also an expression. Expressions and operator
|
620 |
|
|
precedence work as in C.
|
621 |
|
|
|
622 |
|
|
@item
|
623 |
|
|
function_call: function_name ( optional_comma_separated_expressions )
|
624 |
|
|
|
625 |
|
|
This invokes the function, passing to it the values of the expressions
|
626 |
|
|
as actual parameters.
|
627 |
|
|
|
628 |
|
|
@end itemize
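
The rules above can be put together in a small file. The following is a
sketch written from the grammar in this chapter, not a tested program;
the function and variable names are invented for illustration.

@smallexample
// @r{prototype: storage, type, name and parameter list}
external_definition int same_or_zero(int left, int right);

// @r{file-scope variable, visible only within this file}
static int calls = 0;

// @r{function declaration: name, then declarations and statements}
same_or_zero
@{
  automatic int result;
  calls = calls + 1;
  if (left == right)
    @{
      result = left;
    @}
  else
    @{
      result = 0;
    @}
  return result;
@}
@end smallexample

Note that the prototype precedes the function declaration, that the
@code{else} branch and both pairs of braces are present, and that no name
is used twice.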
|
629 |
|
|
|
630 |
|
|
@cindex compilers
|
631 |
|
|
@node Compiler Overview, TREELANG and GCC, Parsing Syntax, Top
|
632 |
|
|
@chapter Compiler Overview
|
633 |
|
|
treelang is run as part of the GCC compiler.
|
634 |
|
|
|
635 |
|
|
@itemize @bullet
|
636 |
|
|
@cindex source code
|
637 |
|
|
@cindex file, source
|
638 |
|
|
@cindex code, source
|
639 |
|
|
@cindex source file
|
640 |
|
|
@item
|
641 |
|
|
It reads a user's program, stored in a file and containing instructions
|
642 |
|
|
written in the appropriate language (Treelang, C, and so on). This file
|
643 |
|
|
contains @dfn{source code}.
|
644 |
|
|
|
645 |
|
|
@cindex translation of user programs
|
646 |
|
|
@cindex machine code
|
647 |
|
|
@cindex code, machine
|
648 |
|
|
@cindex mistakes
|
649 |
|
|
@item
|
650 |
|
|
It translates the user's program into instructions a computer can carry
|
651 |
|
|
out more quickly than it takes to translate the instructions in the
|
652 |
|
|
first place. These instructions are called @dfn{machine code}---code
|
653 |
|
|
designed to be efficiently translated and processed by a machine such as
|
654 |
|
|
a computer. Humans usually aren't as good at writing machine code as they
|
655 |
|
|
are at writing Treelang or C, because it is easy to make tiny mistakes
|
656 |
|
|
writing machine code. When writing Treelang or C, it is easy to make
|
657 |
|
|
big mistakes. But you can only make one mistake, because the compiler
|
658 |
|
|
stops after it finds any problem.
|
659 |
|
|
|
660 |
|
|
@cindex debugger
|
661 |
|
|
@cindex bugs, finding
|
662 |
|
|
@cindex @code{gdb}, command
|
663 |
|
|
@cindex commands, @code{gdb}
|
664 |
|
|
@item
|
665 |
|
|
It provides information in the generated machine code
|
666 |
|
|
that can make it easier to find bugs in the program
|
667 |
|
|
(using a debugging tool, called a @dfn{debugger},
|
668 |
|
|
such as @code{gdb}).
|
669 |
|
|
|
670 |
|
|
@cindex libraries
|
671 |
|
|
@cindex linking
|
672 |
|
|
@cindex @code{ld} command
|
673 |
|
|
@cindex commands, @code{ld}
|
674 |
|
|
@item
|
675 |
|
|
It locates and gathers machine code already generated to perform actions
|
676 |
|
|
requested by statements in the user's program. This machine code is
|
677 |
|
|
organized into @dfn{libraries} and is located and gathered during the
|
678 |
|
|
@dfn{link} phase of the compilation process. (Linking often is thought
|
679 |
|
|
of as a separate step, because it can be directly invoked via the
|
680 |
|
|
@code{ld} command. However, the @code{gcc} command, as with most
|
681 |
|
|
compiler commands, automatically performs the linking step by calling on
|
682 |
|
|
@code{ld} directly, unless asked to not do so by the user.)
|
683 |
|
|
|
684 |
|
|
@cindex language, incorrect use of
|
685 |
|
|
@cindex incorrect use of language
|
686 |
|
|
@item
|
687 |
|
|
It attempts to diagnose cases where the user's program contains
|
688 |
|
|
incorrect usages of the language. The @dfn{diagnostics} produced by the
|
689 |
|
|
compiler indicate the problem and the location in the user's source file
|
690 |
|
|
where the problem was first noticed. The user can use this information
|
691 |
|
|
to locate and fix the problem.
|
692 |
|
|
|
693 |
|
|
The compiler stops after the first error. There are no plans to fix
|
694 |
|
|
this, ever, as it would vastly complicate the implementation of treelang
|
695 |
|
|
to little or no benefit.
|
696 |
|
|
|
697 |
|
|
@cindex diagnostics, incorrect
|
698 |
|
|
@cindex incorrect diagnostics
|
699 |
|
|
@cindex error messages, incorrect
|
700 |
|
|
@cindex incorrect error messages
|
701 |
|
|
(Sometimes an incorrect usage of the language leads to a situation where
|
702 |
|
|
the compiler cannot make any sense of what it reads---while a human
|
703 |
|
|
might be able to---and thus ends up complaining about an incorrect
|
704 |
|
|
``problem'' it encounters that, in fact, reflects a misunderstanding of
|
705 |
|
|
the programmer's intention.)
|
706 |
|
|
|
707 |
|
|
@cindex warnings
|
708 |
|
|
@cindex questionable instructions
|
709 |
|
|
@item
|
710 |
|
|
There are a few warnings in treelang. For example, an unused static function
|
711 |
|
|
generates a warning when @option{-Wunused-function} is specified; similarly, an unused
|
712 |
|
|
static variable generates a warning when @option{-Wunused-variable} is specified.
|
713 |
|
|
The only treelang-specific warning is a warning when an expression is in a
|
714 |
|
|
return statement for functions that return void.
|
715 |
|
|
@end itemize
|
716 |
|
|
|
717 |
|
|
@cindex components of treelang
|
718 |
|
|
@cindex @code{treelang}, components of
|
719 |
|
|
@code{treelang} consists of several components:
|
720 |
|
|
|
721 |
|
|
@cindex @code{gcc}, command
|
722 |
|
|
@cindex commands, @code{gcc}
|
723 |
|
|
@itemize @bullet
|
724 |
|
|
@item
|
725 |
|
|
A modified version of the @code{gcc} command, which also might be
|
726 |
|
|
installed as the system's @code{cc} command.
|
727 |
|
|
(In many cases, @code{cc} refers to the
|
728 |
|
|
system's ``native'' C compiler, which
|
729 |
|
|
might be a non-GNU compiler, or an older version
|
730 |
|
|
of @code{GCC} considered more stable or that is
|
731 |
|
|
used to build the operating system kernel.)
|
732 |
|
|
|
733 |
|
|
@cindex @code{treelang}, command
|
734 |
|
|
@cindex commands, @code{treelang}
|
735 |
|
|
@item
|
736 |
|
|
The @code{treelang} command itself.
|
737 |
|
|
|
738 |
|
|
@item
|
739 |
|
|
The @code{libc} run-time library. This library contains the machine
|
740 |
|
|
code needed to support capabilities of the Treelang language that are
|
741 |
|
|
not directly provided by the machine code generated by the
|
742 |
|
|
@code{treelang} compilation phase. This is the same library that the
|
743 |
|
|
main C compiler uses (libc).
|
744 |
|
|
|
745 |
|
|
@cindex @code{tree1}, program
|
746 |
|
|
@cindex programs, @code{tree1}
|
747 |
|
|
@cindex assembler
|
748 |
|
|
@cindex @code{as} command
|
749 |
|
|
@cindex commands, @code{as}
|
750 |
|
|
@cindex assembly code
|
751 |
|
|
@cindex code, assembly
|
752 |
|
|
@item
|
753 |
|
|
The compiler itself, internally named @code{tree1}.
|
754 |
|
|
|
755 |
|
|
Note that @code{tree1} does not generate machine code directly---it
|
756 |
|
|
generates @dfn{assembly code} that is a more readable form
|
757 |
|
|
of machine code, leaving the conversion to actual machine code
|
758 |
|
|
to an @dfn{assembler}, usually named @code{as}.
|
759 |
|
|
@end itemize
|
760 |
|
|
|
761 |
|
|
@code{GCC} is often thought of as ``the C compiler'' only,
|
762 |
|
|
but it does more than that.
|
763 |
|
|
Based on command-line options and the names given for files
|
764 |
|
|
on the command line, @code{gcc} determines which actions to perform, including
|
765 |
|
|
preprocessing, compiling (in a variety of possible languages), assembling,
|
766 |
|
|
and linking.
|
767 |
|
|
|
768 |
|
|
@cindex driver, gcc command as
|
769 |
|
|
@cindex @code{gcc}, command as driver
|
770 |
|
|
@cindex executable file
|
771 |
|
|
@cindex files, executable
|
772 |
|
|
@cindex cc1 program
|
773 |
|
|
@cindex programs, cc1
|
774 |
|
|
@cindex preprocessor
|
775 |
|
|
@cindex cpp program
|
776 |
|
|
@cindex programs, cpp
|
777 |
|
|
For example, the command @samp{gcc foo.c} @dfn{drives} the file
@file{foo.c} through the preprocessor @code{cpp}, then
the C compiler (internally named
@code{cc1}), then the assembler (usually @code{as}), then the linker
(@code{ld}), producing an executable program named @file{a.out} (on
UNIX systems).

@cindex treelang program
@cindex programs, treelang
As another example, the command @samp{gcc foo.tree} would do much the
same as @samp{gcc foo.c}, but instead of using the C compiler named
@code{cc1}, @code{gcc} would use the treelang compiler (named
@code{tree1}). However, there is no preprocessor for treelang.

@cindex @code{tree1}, program
@cindex programs, @code{tree1}
In a GNU Treelang installation, @code{gcc} recognizes Treelang source
files by name just like it does C and C++ source files. It knows to use
the Treelang compiler named @code{tree1}, instead of @code{cc1} or
@code{cc1plus}, to compile Treelang files. If a file's name ends in
@samp{.tree} then GCC knows that the program is written in treelang. You
can also manually override the language.

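
As a rough illustration of recognizing source files by name, the
following C sketch maps a file suffix to the name of the compiler
proper. The table and the function @code{compiler_for_file} are
hypothetical and are not taken from @file{gcc.c}; the real driver makes
this decision from its specs, not from a table like this one.

@verbatim
```c
#include <stddef.h>
#include <string.h>

/* Hypothetical rule: a file suffix and the compiler proper that handles it. */
struct suffix_rule {
    const char *suffix;
    const char *compiler;
};

static const struct suffix_rule rules[] = {
    { ".c",    "cc1"     },
    { ".cc",   "cc1plus" },
    { ".tree", "tree1"   },
};

/* Return the compiler for FILENAME, or NULL if the suffix is unknown. */
const char *compiler_for_file(const char *filename)
{
    const char *dot = strrchr(filename, '.');
    size_t i;

    if (dot == NULL)
        return NULL;
    for (i = 0; i < sizeof rules / sizeof rules[0]; i++)
        if (strcmp(dot, rules[i].suffix) == 0)
            return rules[i].compiler;
    return NULL;
}
```
@end verbatim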
@cindex @code{gcc}, not recognizing Treelang source
@cindex unrecognized file format
@cindex file format not recognized
Non-Treelang-related operation of @code{gcc} is generally
unaffected by installing the GNU Treelang version of @code{gcc}.
However, unless the installed version of @code{gcc} is the
GNU Treelang version, @code{gcc} will not be able to compile
and link Treelang programs.

@cindex printing version information
@cindex version information, printing
The command @samp{gcc -v x.tree}, where @samp{x.tree} is a file which
must exist but whose contents are ignored, is a quick way to display
version information for the various programs used to compile a typical
Treelang source file.

The @code{tree1} program represents most of what is unique to GNU
Treelang; @code{tree1} is a combination of two rather large chunks of
code.

@cindex GCC Back End (GBE)
@cindex GBE
@cindex @code{GCC}, back end
@cindex back end, GCC
@cindex code generator
One chunk is the so-called @dfn{GNU Back End}, or GBE,
which knows how to generate fast code for a wide variety of processors.
The same GBE is used by the C, C++, and Treelang compiler programs @code{cc1},
@code{cc1plus}, and @code{tree1}, plus others.
Often the GBE is referred to as the ``GCC back end'' or
even just ``GCC''; in this manual, the term GBE is used
whenever the distinction is important.

@cindex GNU Treelang Front End (TFE)
@cindex tree1
@cindex @code{treelang}, front end
@cindex front end, @code{treelang}
The other chunk of @code{tree1} is the majority of what is unique about
GNU Treelang: the code that knows how to interpret Treelang programs to
determine what they are intending to do, and then communicate that
knowledge to the GBE for actual compilation of those programs. This
chunk is called the @dfn{Treelang Front End} (TFE). The @code{cc1} and
@code{cc1plus} programs have their own front ends, for the C and C++
languages, respectively. These front ends are responsible for
diagnosing incorrect usage of their respective languages by the programs
they process, and are responsible for most of the warnings about
questionable constructs as well. (The GBE in principle handles
producing some warnings, like those concerning possible references to
undefined variables, but these warnings should not occur in treelang
programs as the front end is meant to pick them up first.)

Because so much is shared among the compilers for various languages,
much of the behavior and many of the user-selectable options for these
compilers are similar.
For example, diagnostics (error messages and
warnings) are similar in appearance; command-line
options like @samp{-Wall} have generally similar effects; and the quality
of generated code (in terms of speed and size) is roughly similar
(since that work is done by the shared GBE).

@node TREELANG and GCC, Compiler, Compiler Overview, Top
@chapter Compile Treelang, C, or Other Programs
@cindex compiling programs
@cindex programs, compiling

@cindex @code{gcc}, command
@cindex commands, @code{gcc}
A GNU Treelang installation includes a modified version of the @code{gcc}
command.

In a non-Treelang installation, @code{gcc} recognizes C, C++,
and Objective-C source files.

In a GNU Treelang installation, @code{gcc} also recognizes Treelang source
files and accepts Treelang-specific command-line options, plus some
command-line options that are designed to cater to Treelang users
but apply to other languages as well.

@xref{G++ and GCC,,Programming Languages Supported by GCC,GCC,Using
the GNU Compiler Collection (GCC)},
for information on the way different languages are handled
by the GCC compiler (@code{gcc}).

You can use this, combined with the output of the @samp{gcc -v x.tree}
command, to get the options applicable to treelang. Treelang source
file names must end with the suffix @samp{.tree}.

@cindex preprocessor

Treelang programs are not by default run through the C
preprocessor by @code{gcc}. There is no reason why they cannot be run through the
preprocessor manually, but you would need to prevent the preprocessor
from generating @code{#line} directives, using the @samp{-P} option, otherwise
@code{tree1} will not accept the input.

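
To show what such a directive does, here is a minimal C sketch of the
standard @code{#line} directive, which resets the line number and file
name that the compiler attributes to the code that follows it. The file
name @file{demo.tree} and both function names are made up for the
illustration.

@verbatim
```c
#include <string.h>

/* Attribute the code below to line 42 of a hypothetical file named
   "demo.tree" instead of to this file. */
#line 42 "demo.tree"
int reported_line(void) { return __LINE__; }

/* Return nonzero if the compiler attributes this code to file NAME. */
int reported_from(const char *name) { return strcmp(__FILE__, name) == 0; }
```
@end verbatim

Here @code{reported_line} returns 42 and @code{reported_from} accepts
only the name @file{demo.tree}, whatever the real name of the file being
compiled.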
@node Compiler, Other Languages, TREELANG and GCC, Top
@chapter The GNU Treelang Compiler

The GNU Treelang compiler, @code{treelang}, supports programs written
in the GNU Treelang language.

@node Other Languages, treelang internals, Compiler, Top
@chapter Other Languages

@menu
* Interoperating with C and C++::
@end menu

@node Interoperating with C and C++, , Other Languages, Other Languages
@section Tools and advice for interoperating with C and C++

The object code generated from treelang programs looks like compiled C code to the linker
and everybody else, so you should be able to freely mix treelang and C
(and C++) code, with one proviso.

C promotes small integer types to @code{int} when used as function parameters and
return values in non-prototyped functions. Since treelang has no
non-prototyped functions, the treelang compiler does not do this.

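
The promotion rule on the C side can be observed in standard C without
writing a non-prototyped function, because the same default argument
promotions apply to the variable arguments of a variadic function. The
sketch below only illustrates that C behavior; the function name
@code{sum_promoted} is invented here.

@verbatim
```c
#include <stdarg.h>

/* Sum COUNT small integers passed through a variable argument list.
   Each argument may have been a short or a char at the call site, but
   the default argument promotions widened it to int, so it has to be
   read back as int. */
int sum_promoted(int count, ...)
{
    va_list ap;
    int total = 0;
    int i;

    va_start(ap, count);
    for (i = 0; i < count; i++)
        total += va_arg(ap, int);   /* not va_arg(ap, short) */
    va_end(ap);
    return total;
}
```
@end verbatim

Calling @code{sum_promoted (2, s, c)} with a @code{short} holding 1000
and a @code{signed char} holding -3 returns 997; both arguments arrive
as @code{int}.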
@ifset INTERNALS
@node treelang internals, Open Questions, Other Languages, Top
@chapter treelang internals

@menu
* treelang files::
* treelang compiler interfaces::
* Hints and tips::
@end menu

@node treelang files, treelang compiler interfaces, treelang internals, treelang internals
@section treelang files

To create a compiler that integrates into GCC, you need to create many
files. Some of the files are integrated into the main GCC makefile, to
build the various parts of the compiler and to run the test
suite. Others are incorporated into various GCC programs such as
@file{gcc.c}. Finally, you must provide the actual programs comprising your
compiler.

@cindex files

The files are:

@enumerate 1

@item
COPYING. This is the copyright file, assuming you are going to use the
GNU General Public License. You probably need to use the GPL because if
you use the GCC back end, your program and the back end are one program,
and the back end is GPLed.

This need not be present if the language is incorporated into the main
GCC tree, as the main GCC directory has this file.

@item
COPYING.LIB. This is the copyright file for those parts of your program
that are not to be covered by the GPL, but are instead to be covered by
the LGPL (Library or Lesser GPL). This license may be appropriate for
the library routines associated with your compiler. These are the
routines that are linked with the @emph{output} of the compiler. Using
the LGPL for these routines allows programs written using your compiler
to be closed source. For example, LIBC is under the LGPL.

This need not be present if the language is incorporated into the main
GCC tree, as the main GCC directory has this file.

@item
ChangeLog. Record all the changes to your compiler. Use the same format
as used in treelang, as it is supported by an emacs editing mode and is
part of the FSF coding standard. Normally each directory has its own
changelog. The FSF standard allows but does not require a meaningful
comment on @emph{why} the changes were made, above and beyond @emph{what}
was changed. In the author's opinion it is useful to provide this
information.

@item
treelang.texi. The manual, written in texinfo. Your manual would have a
different file name. You need not write it in texinfo if you don't want
to, but a lot of GNU software does use texinfo.

@cindex Make-lang.in
@item
Make-lang.in. This file is part of the make file which is incorporated
with the GCC make file skeleton (Makefile.in in the GCC directory) to
make Makefile, as part of the configuration process.

Makefile in turn holds the main instructions to actually build
everything. The build instructions are held in the main GCC manual and
web site so they are not repeated here.

There are some comments at the top of Make-lang.in which will help you understand what
you need to do.

There are make commands to build things, remove generated files with
various degrees of thoroughness, count the lines of code (so you know
how much progress you are making), build info and html files from the
texinfo source, run the tests etc.

@item
README. Just a brief informative text file saying what is in this
directory.

@cindex config-lang.in
@item
config-lang.in. This file is read by the configuration process and must
be present. You specify the name of your language, the name(s) of the
compiler(s) including preprocessors you are going to build, and whether any
files (usually generated ones) should be excluded from diffs (i.e., when making
diff files to send in patches). Whether the equate 'stagestuff' is used
is unknown (???).

@cindex lang.opt
@item
lang.opt. This file is included into @file{gcc.c}, the main GCC driver, and
tells it what options your language supports. This is also used to
display help.

@cindex lang-specs.h
@item
lang-specs.h. This file is also included in @file{gcc.c}. It tells
@file{gcc.c} when to call your programs and what options to send them. The
mini-language 'specs' is documented in the source of @file{gcc.c}. Do not
attempt to write a specs file from scratch; use an existing one as the base
and enhance it.

@item
Your texi files. Texinfo can be used to build documentation in HTML,
info, dvi and postscript formats. It is a tagged language, is documented
in its own manual, and has its own emacs mode.

@item
Your programs. The relationships between all the programs are explained
in the next section. You need to write or use the following programs:

@itemize @bullet

@item
lexer. This breaks the input into words and passes these to the
parser. This is @file{lex.l} in treelang, which is passed through flex, a lex
variant, to produce C code @file{lex.c}. Note there is a school of thought
that says lexers should be coded by hand. However, you may prefer to
write far less code and use flex, as was done with treelang.

@item
parser. This breaks the program into recognizable constructs such as
expressions, statements etc. This is @file{parse.y} in treelang, which is
passed through bison, which is a yacc variant, to produce C code
@file{parse.c}.

@item
back end interface. This interfaces to the code generation back end. In
treelang, this is @file{tree1.c}, which mainly interfaces to @file{toplev.c}, and
@file{treetree.c}, which mainly interfaces to everything else. Many languages
mix up the back end interface with the parser, as in the C compiler for
example. It is a matter of taste which way to do it, but with treelang
it is separated out to make the back end interface cleaner and easier to
understand.

@item
header files. For function prototypes and common data items. One point
to note here is that bison can generate a header file with all the
numbers it has assigned to the keywords and symbols, and you can include
the same header in your lexer. This technique is demonstrated in
treelang.

@item
compiler main file. GCC comes with a file @file{toplev.c} which is a
perfectly serviceable main program for your compiler. GNU Treelang uses
@file{toplev.c} but other languages have been known to replace it with their
own main program. Again this is a matter of taste and how much code you
want to write.

@end itemize

@end enumerate

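
The division of labor between lexer and parser, and the idea of a shared
set of token numbers, can be sketched in plain C as follows. This is a
hand-written toy, not the flex-generated @file{lex.c} from treelang; the
token names and the function @code{next_token} are assumptions made for
the example.

@verbatim
```c
#include <ctype.h>
#include <stddef.h>

/* Token codes shared by lexer and parser.  With bison these would come
   from the generated header; the names here are made up. */
enum token { TOK_EOF, TOK_NAME, TOK_NUMBER, TOK_PUNCT };

/* Scan one token starting at *SRC, advance *SRC past it, and report the
   token's start and length.  Whitespace before the token is skipped. */
enum token next_token(const char **src, const char **start, size_t *len)
{
    const char *p = *src;
    enum token kind;

    while (isspace((unsigned char)*p))
        p++;
    *start = p;
    if (*p == '\0') {
        kind = TOK_EOF;
    } else if (isalpha((unsigned char)*p) || *p == '_') {
        while (isalnum((unsigned char)*p) || *p == '_')
            p++;
        kind = TOK_NAME;
    } else if (isdigit((unsigned char)*p)) {
        while (isdigit((unsigned char)*p))
            p++;
        kind = TOK_NUMBER;
    } else {
        p++;                    /* any other single character */
        kind = TOK_PUNCT;
    }
    *len = (size_t)(p - *start);
    *src = p;
    return kind;
}
```
@end verbatim

A parser would call @code{next_token} repeatedly and decide what to do
based on the returned code, which is why both sides need to agree on the
same numbering.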
@node treelang compiler interfaces, Hints and tips, treelang files, treelang internals
@section treelang compiler interfaces

@cindex driver
@cindex toplev.c

@menu
* treelang driver::
* treelang main compiler::
@end menu

@node treelang driver, treelang main compiler, treelang compiler interfaces, treelang compiler interfaces
@subsection treelang driver

The GCC compiler consists of a driver, which then executes the various
compiler phases based on the instructions in the specs files.

Typically a program's language will be identified from its suffix
(e.g., @file{.tree} for treelang programs).

The driver (@file{gcc.c}) will then drive (exec) in turn a preprocessor,
the main compiler, the assembler and the link editor. Options to GCC allow you
to override all of this. In the case of treelang programs there is no
preprocessor, and mostly these days the C preprocessor is run within the
main C compiler rather than as a separate process, apparently for reasons of speed.

You will be using the standard assembler and link editor so these are
ignored from now on.

You have to write your own preprocessor if you want one. This is usually
totally language specific. The main point to be aware of is to ensure
that you find some way to pass file name and line number information
through to the main compiler so that it can tell the back end this
information and so the debugger can find the right source line for each
piece of code. That is all there is to say about the preprocessor except
that the preprocessor will probably not be the slowest part of the
compiler and will probably not use the most memory, so don't waste too
much time tuning it until you know you need to do so.

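
The sequence of phases described in this subsection can be sketched as a
small C function. It is a conceptual model only: the function
@code{plan_phases} is hypothetical, the real driver works from the specs
and not from hard-coded suffixes, and, as noted above, the C
preprocessor is nowadays mostly run inside the main C compiler.

@verbatim
```c
#include <stddef.h>
#include <string.h>

#define MAX_PHASES 4

/* Fill PHASES with the programs run, in order, for a file with the given
   SUFFIX.  Return how many there are, or 0 for an unknown suffix. */
size_t plan_phases(const char *suffix, const char *phases[MAX_PHASES])
{
    size_t n = 0;

    if (strcmp(suffix, ".c") == 0) {
        phases[n++] = "cpp";    /* C is preprocessed first */
        phases[n++] = "cc1";
    } else if (strcmp(suffix, ".tree") == 0) {
        phases[n++] = "tree1";  /* treelang has no preprocessor */
    } else {
        return 0;
    }
    phases[n++] = "as";         /* assemble the generated assembly code */
    phases[n++] = "ld";         /* link into an executable */
    return n;
}
```
@end verbatim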
@node treelang main compiler, , treelang driver, treelang compiler interfaces
@subsection treelang main compiler

The main compiler for treelang consists of @file{toplev.c} from the main GCC
compiler, the parser, lexer and back end interface routines, and the
back end routines themselves, of which there are many.

@file{toplev.c} does a lot of work for you and you should almost certainly
use it.

Writing the back end interface code is the hard part of creating a compiler using GCC. The
back end interface documentation is incomplete and the interface is
complex.

There are three main aspects to interfacing to the other GCC code.

@menu
* Interfacing to toplev.c::
* Interfacing to the garbage collection::
* Interfacing to the code generation code. ::
@end menu

@node Interfacing to toplev.c, Interfacing to the garbage collection, treelang main compiler, treelang main compiler
@subsubsection Interfacing to toplev.c

In treelang this is handled mainly in @file{tree1.c}
and partly in @file{treetree.c}. Peruse @file{toplev.c} for details of what you need
to do.

@node Interfacing to the garbage collection, Interfacing to the code generation code. , Interfacing to toplev.c, treelang main compiler
@subsubsection Interfacing to the garbage collection

In treelang this is handled mainly in
@file{tree1.c}.

Memory allocation in the compiler should be done using the @code{ggc_alloc} and
kindred routines in @file{ggc*.*}. At the end of every 'function' in your language, @file{toplev.c} calls
the garbage collection several times. The garbage collection calls mark
routines which go through the memory which is still used, telling the
garbage collection not to free it. Then all the memory not used is
freed.

What this means is that you need a way to hook into this marking
process. This is done by calling @code{ggc_add_root}. This provides the address
of a callback routine which will be called during garbage collection and
which can call @code{ggc_mark} to save the storage. If storage is only
used within the parsing of a function, you do not need to provide a way
to mark it.

Note that you can also call @code{ggc_mark_tree} to mark any of the back end
internal 'tree' nodes. This routine will follow the branches of the
trees and mark all the subordinate structures. This is useful for
example when you have created a variable declaration that will be used
across multiple functions, or for a function declaration (from a
prototype) that may be used later on. See the next section for more on the
tree nodes.

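
The mark-and-free cycle described above can be modeled with a toy
collector. The sketch below is not the GCC garbage collector and does
not use its interface; every name in it (@code{toy_alloc},
@code{toy_add_root}, @code{toy_mark}, @code{toy_collect},
@code{global_decl}) is invented for the illustration, and the limit of
eight roots is arbitrary.

@verbatim
```c
#include <stddef.h>
#include <stdlib.h>

/* Toy collector, NOT the GCC ggc API: every name here is invented. */
struct object {
    struct object *next;    /* chain of all allocated objects */
    int marked;
};

typedef void (*root_marker)(void);

static struct object *all_objects;
static root_marker markers[8];
static size_t n_markers;

/* Allocate an object and remember it so the collector can find it. */
struct object *toy_alloc(void)
{
    struct object *o = calloc(1, sizeof *o);
    if (o != NULL) {
        o->next = all_objects;
        all_objects = o;
    }
    return o;
}

/* Register a callback that marks whatever its owner still needs. */
int toy_add_root(root_marker fn)
{
    if (n_markers == sizeof markers / sizeof markers[0])
        return -1;
    markers[n_markers++] = fn;
    return 0;
}

void toy_mark(struct object *o)
{
    if (o != NULL)
        o->marked = 1;
}

/* Run every marker, free the unmarked objects, return how many were freed. */
size_t toy_collect(void)
{
    struct object **link = &all_objects;
    size_t freed = 0, i;

    for (i = 0; i < n_markers; i++)
        markers[i]();
    while (*link != NULL) {
        struct object *o = *link;
        if (o->marked) {
            o->marked = 0;      /* reset for the next collection */
            link = &o->next;
        } else {
            *link = o->next;
            free(o);
            freed++;
        }
    }
    return freed;
}

/* Example client: one long-lived object, such as a declaration that is
   used across several functions, and the callback that keeps it alive. */
struct object *global_decl;
void mark_global_decl(void) { toy_mark(global_decl); }
```
@end verbatim

An object reachable only from a local variable is freed at the next
collection, which matches the remark above that storage used only while
parsing one function needs no marking routine.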
@node Interfacing to the code generation code. , , Interfacing to the garbage collection, treelang main compiler
@subsubsection Interfacing to the code generation code.

In treelang this is done in @file{treetree.c}. A typedef called 'tree', which is
defined in @file{tree.h} and @file{tree.def} in the GCC directory and largely
implemented in @file{tree.c} and @file{stmt.c}, forms the basic interface to the
compiler back end.

In general you call various tree routines to generate code, either
directly or through @file{toplev.c}. You build up data structures and
expressions in similar ways.

You can read some documentation on this which can be found via the GCC
main web page. In particular, the documentation produced by Joachim
Nadler and translated by Tim Josling can be quite useful. The C compiler
also has documentation in the main GCC manual (particularly the current
CVS version) which is useful on a lot of the details.

In time it is hoped to enhance this document to provide a more
comprehensive overview of this topic. The main gap is in explaining how
it all works together.

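
As a very rough picture of building expressions out of nodes and then
walking them, consider the following toy in C. The node codes are named
after real tree codes such as @code{PLUS_EXPR}, but the structure
@code{toy_node} and the functions @code{toy_build_int},
@code{toy_build_binary} and @code{toy_eval} are inventions for this
sketch; the real interface lives in @file{tree.h} and @file{tree.def}
and is far richer.

@verbatim
```c
#include <stdlib.h>

/* Toy stand-in for a tree node; this is not GCC's 'tree' type. */
enum toy_code { TOY_INTEGER_CST, TOY_PLUS_EXPR, TOY_MULT_EXPR };

struct toy_node {
    enum toy_code code;
    long value;                     /* used by TOY_INTEGER_CST */
    struct toy_node *op0, *op1;     /* used by the binary codes */
};

/* Build a constant node; returns NULL if allocation fails. */
struct toy_node *toy_build_int(long value)
{
    struct toy_node *n = calloc(1, sizeof *n);
    if (n != NULL) {
        n->code = TOY_INTEGER_CST;
        n->value = value;
    }
    return n;
}

/* Build a binary expression node over two existing operand nodes. */
struct toy_node *toy_build_binary(enum toy_code code,
                                  struct toy_node *op0,
                                  struct toy_node *op1)
{
    struct toy_node *n = calloc(1, sizeof *n);
    if (n != NULL) {
        n->code = code;
        n->op0 = op0;
        n->op1 = op1;
    }
    return n;
}

/* Walk the tree the way a consumer would, here simply by evaluating it. */
long toy_eval(const struct toy_node *n)
{
    switch (n->code) {
    case TOY_PLUS_EXPR: return toy_eval(n->op0) + toy_eval(n->op1);
    case TOY_MULT_EXPR: return toy_eval(n->op0) * toy_eval(n->op1);
    default:            return n->value;
    }
}
```
@end verbatim

Expressions are built bottom-up: the operands are created first and then
handed to the routine that builds the enclosing node, which is the
general shape of the calls a front end makes.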
@node Hints and tips, , treelang compiler interfaces, treelang internals
@section Hints and tips

@itemize @bullet

@item
TAGS: Use the make ETAGS commands to create TAGS files which can be used in
emacs to jump to any symbol quickly.

@item
GREP: grep is also a useful way to find all uses of a symbol.

@item
TREE: The main files to look at are @file{tree.h} and @file{tree.def}. You will
probably want a hardcopy of these.

@item
SAMPLE: look at the sample interfacing code in @file{treetree.c}. You can use
gdb to trace through the code and learn about how it all works.

@item
GDB: the GCC back end works well with gdb. It traps abort() and allows
you to trace back what went wrong.

@item
Error Checking: The compiler back end does some error and consistency
checking. Often the result of an error is just no code being
generated. You will then need to trace through and find out what is
going wrong. The rtl dump files can help here also.

@item
rtl dump files: The main GCC manual documents these files, which are dumps
of the rtl (intermediate code) which is manipulated during the code
generation process. This can provide useful clues about what is going
wrong. The rtl 'language' is documented in the main GCC manual.

@end itemize

@end ifset

@node Open Questions, Bugs, treelang internals, Top
@chapter Open Questions

If you know GCC well, please consider looking at the file @file{treetree.c} and
resolving any questions marked "???".

@node Bugs, Service, Open Questions, Top
@chapter Reporting Bugs
@cindex bugs
@cindex reporting bugs

You can report bugs to @email{@value{email-bugs}}. Please make
sure bugs are real before reporting them. Follow the guidelines in the
main GCC manual for submitting bug reports.

@menu
* Sending Patches::
@end menu

@node Sending Patches, , Bugs, Bugs
@section Sending Patches for GNU Treelang

If you would like to write bug fixes or improvements for the GNU
Treelang compiler, that is very helpful. Send suggested fixes to
@email{@value{email-patches}}.

@node Service, Projects, Bugs, Top
@chapter How To Get Help with GNU Treelang

If you need help installing, using or changing GNU Treelang, there are two
ways to find it:

@itemize @bullet

@item
Look in the service directory for someone who might help you for a fee.
The service directory is found in the file named @file{SERVICE} in the
GCC distribution.

@item
Send a message to @email{@value{email-general}}.

@end itemize

@end ifset
@ifset INTERNALS

@node Projects, Index, Service, Top
@chapter Projects
@cindex projects

If you want to contribute to @code{treelang} by doing research,
design, specification, documentation, coding, or testing,
the following information should give you some ideas.

Send a message to @email{@value{email-general}} if you plan to add a
feature.

The main requirement for treelang is to add features and to add
documentation. Features are things that the GCC back end can do but
which are not reflected in treelang. Examples include structures,
unions, pointers, arrays.

@end ifset

@node Index, , Projects, Top
@unnumbered Index

@printindex cp
@summarycontents
@contents
@bye