OpenCores
URL https://opencores.org/ocsvn/scarts/scarts/trunk

Subversion Repositories scarts

[/] [scarts/] [trunk/] [toolchain/] [scarts-gcc/] [gcc-4.1.1/] [gcc/] [treelang/] [treelang.texi] - Blame information for rev 16

Line No. Rev Author Line
1 12 jlechner
\input texinfo  @c -*-texinfo-*-
2
 
3
@c NOTE THIS IS NOT A GOOD EXAMPLE OF HOW TO DO A MANUAL. FIXME!!!
4
@c NOTE THIS IS NOT A GOOD EXAMPLE OF HOW TO DO A MANUAL. FIXME!!!
5
 
6
 
7
@c %**start of header
8
@setfilename treelang.info
9
 
10
@include gcc-common.texi
11
 
12
@set copyrights-treelang 1995,1996,1997,1998,1999,2000,2001,2002,2003,2004,2005
13
 
14
@set email-general gcc@@gcc.gnu.org
15
@set email-bugs gcc-bugs@@gcc.gnu.org or bug-gcc@@gnu.org
16
@set email-patches gcc-patches@@gcc.gnu.org
17
@set path-treelang gcc/gcc/treelang
18
 
19
@set which-treelang GCC-@value{version-GCC}
20
@set which-GCC GCC
21
 
22
@set email-josling tej@@melbpc.org.au
23
@set www-josling http://www.geocities.com/timjosling
24
 
25
@c This tells @include'd files that they're part of the overall TREELANG doc
26
@c set.  (They might be part of a higher-level doc set too.)
27
@set DOC-TREELANG
28
 
29
@c @setfilename usetreelang.info
30
@c @setfilename maintaintreelang.info
31
@c To produce the full manual, use the "treelang.info" setfilename, and
32
@c make sure the following do NOT begin with '@c' (and the @clear lines DO)
33
@set INTERNALS
34
@set USING
35
@c To produce a user-only manual, use the "usetreelang.info" setfilename, and
36
@c make sure the following does NOT begin with '@c':
37
@c @clear INTERNALS
38
@c To produce a maintainer-only manual, use the "maintaintreelang.info" setfilename,
39
@c and make sure the following does NOT begin with '@c':
40
@c @clear USING
41
 
42
@ifset INTERNALS
43
@ifset USING
44
@settitle Using and Maintaining GNU Treelang
45
@end ifset
46
@end ifset
47
@c seems reasonable to assume at least one of INTERNALS or USING is set...
48
@ifclear INTERNALS
49
@settitle Using GNU Treelang
50
@end ifclear
51
@ifclear USING
52
@settitle Maintaining GNU Treelang
53
@end ifclear
54
@c then again, have some fun
55
@ifclear INTERNALS
56
@ifclear USING
57
@settitle Doing Very Little at all with GNU Treelang
58
@end ifclear
59
@end ifclear
60
 
61
@syncodeindex fn cp
62
@syncodeindex vr cp
63
@c %**end of header
64
 
65
@c Cause even numbered pages to be printed on the left hand side of
66
@c the page and odd numbered pages to be printed on the right hand
67
@c side of the page.  Using this, you can print on both sides of a
68
@c sheet of paper and have the text on the same part of the sheet.
69
 
70
@c The text on right hand pages is pushed towards the right hand
71
@c margin and the text on left hand pages is pushed toward the left
72
@c hand margin.
73
@c (To provide the reverse effect, set bindingoffset to -0.75in.)
74
 
75
@c @tex
76
@c \global\bindingoffset=0.75in
77
@c \global\normaloffset =0.75in
78
@c @end tex
79
 
80
@copying
81
Copyright @copyright{} @value{copyrights-treelang} Free Software Foundation, Inc.
82
 
83
Permission is granted to copy, distribute and/or modify this document
84
under the terms of the GNU Free Documentation License, Version 1.2 or
85
any later version published by the Free Software Foundation; with the
86
Invariant Sections being ``GNU General Public License'', the Front-Cover
87
texts being (a) (see below), and with the Back-Cover Texts being (b)
88
(see below).  A copy of the license is included in the section entitled
89
``GNU Free Documentation License''.
90
 
91
(a) The FSF's Front-Cover Text is:
92
 
93
     A GNU Manual
94
 
95
(b) The FSF's Back-Cover Text is:
96
 
97
     You have freedom to copy and modify this GNU Manual, like GNU
98
     software.  Copies published by the Free Software Foundation raise
99
     funds for GNU development.
100
@end copying
101
 
102
@ifnottex
103
@dircategory Programming
104
@direntry
105
* treelang: (treelang).                  The GNU Treelang compiler.
106
@end direntry
107
@ifset INTERNALS
108
@ifset USING
109
This file documents the use and the internals of the GNU Treelang
110
(@code{treelang}) compiler.  At the moment this manual is not
111
incorporated into the main GCC manual as it is incomplete.  It
112
corresponds to the @value{which-treelang} version of @code{treelang}.
113
@end ifset
114
@end ifset
115
@ifclear USING
116
This file documents the internals of the GNU Treelang (@code{treelang}) compiler.
117
It corresponds to the @value{which-treelang} version of @code{treelang}.
118
@end ifclear
119
@ifclear INTERNALS
120
This file documents the use of the GNU Treelang (@code{treelang}) compiler.
121
It corresponds to the @value{which-treelang} version of @code{treelang}.
122
@end ifclear
123
 
124
Published by the Free Software Foundation
125
51 Franklin Street, Fifth Floor
126
Boston, MA 02110-1301 USA
127
 
128
@insertcopying
129
@end ifnottex
130
 
131
@setchapternewpage odd
132
@c @finalout
133
@titlepage
134
@ifset INTERNALS
135
@ifset USING
136
@center @titlefont{Using and Maintaining GNU Treelang}
137
 
138
@end ifset
139
@end ifset
140
@ifclear INTERNALS
141
@title Using GNU Treelang
142
@end ifclear
143
@ifclear USING
144
@title Maintaining GNU Treelang
145
@end ifclear
146
@sp 2
147
@center Tim Josling
148
@page
149
@vskip 0pt plus 1filll
150
For the @value{which-treelang} Version*
151
@sp 1
152
Published by the Free Software Foundation @*
153
51 Franklin Street, Fifth Floor@*
154
Boston, MA 02110-1301, USA@*
155
@c Last printed ??ber, 19??.@*
156
@c Printed copies are available for $? each.@*
157
@c ISBN ???
158
@sp 1
159
@insertcopying
160
@end titlepage
161
@page
162
 
163
@ifnottex
164
 
165
@node Top, Copying,, (dir)
166
@top Introduction
167
@cindex Introduction
168
 
169
@ifset INTERNALS
170
@ifset USING
171
This manual documents how to run, install and maintain @code{treelang}.
172
It also documents the features and incompatibilities in the @value{which-treelang}
173
version of @code{treelang}.
174
@end ifset
175
@end ifset
176
 
177
@ifclear INTERNALS
178
This manual documents how to run and install @code{treelang}.
179
It also documents the features and incompatibilities in the @value{which-treelang}
180
version of @code{treelang}.
181
@end ifclear
182
@ifclear USING
183
This manual documents how to maintain @code{treelang}.
184
It also documents the features and incompatibilities in the @value{which-treelang}
185
version of @code{treelang}.
186
@end ifclear
187
 
188
@end ifnottex
189
 
190
@menu
191
* Copying::
192
* Contributors::
193
* GNU Free Documentation License::
194
* Funding::
195
* Getting Started::
196
* What is GNU Treelang?::
197
* Lexical Syntax::
198
* Parsing Syntax::
199
* Compiler Overview::
200
* TREELANG and GCC::
201
* Compiler::
202
* Other Languages::
203
* treelang internals::
204
* Open Questions::
205
* Bugs::
206
* Service::
207
* Projects::
208
* Index::
209
 
210
@detailmenu
211
 --- The Detailed Node Listing ---
212
 
213
Other Languages
214
 
215
* Interoperating with C and C++::
216
 
217
treelang internals
218
 
219
* treelang files::
220
* treelang compiler interfaces::
221
* Hints and tips::
222
 
223
treelang compiler interfaces
224
 
225
* treelang driver::
226
* treelang main compiler::
227
 
228
treelang main compiler
229
 
230
* Interfacing to toplev.c::
231
* Interfacing to the garbage collection::
232
* Interfacing to the code generation code. ::
233
 
234
Reporting Bugs
235
 
236
* Sending Patches::
237
 
238
@end detailmenu
239
@end menu
240
 
241
@include gpl.texi
242
 
243
@include fdl.texi
244
 
245
@node Contributors
246
 
247
@unnumbered Contributors to GNU Treelang
248
@cindex contributors
249
@cindex credits
250
 
251
Treelang was based on 'toy' by Richard Kenner, and also uses code from
252
the GCC core code tree.  Tim Josling first created the language and
253
documentation, based on the GCC Fortran compiler's documentation
254
framework.  Treelang was updated to use the TreeSSA infrastructure by
255
James A. Morrison.
256
 
257
@itemize @bullet
258
@item
259
The packaging and compiler portions of GNU Treelang are based largely
260
on the GCC compiler.
261
@xref{Contributors,,Contributors to GCC,GCC,Using and Maintaining GCC},
262
for more information.
263
 
264
@item
265
There is no specific run-time library for treelang, other than the
266
standard C runtime.
267
 
268
@item
269
It would have been difficult to build treelang without access to Joachim
270
Nadler's guide to writing a front end to GCC (written in German).  A
271
translation of this document into English is available via the
272
CobolForGCC project or via the documentation links from the GCC home
273
page @uref{http://gcc.gnu.org}.
274
@end itemize
275
 
276
@include funding.texi
277
 
278
@node Getting Started
279
@chapter Getting Started
280
@cindex getting started
281
@cindex new users
282
@cindex newbies
283
@cindex beginners
284
 
285
Treelang is a sample language, useful only to help people understand how
286
to implement a new language front end to GCC.  It is not a useful
287
language in itself other than as an example or basis for building a new
288
language.  Therefore only language developers are likely to have an
289
interest in it.
290
 
291
This manual assumes familiarity with GCC, which you can obtain by using
292
it and by reading the manuals @samp{Using the GNU Compiler Collection (GCC)}
293
and @samp{GNU Compiler Collection (GCC) Internals}.
294
 
295
To install treelang, follow the GCC installation instructions,
296
taking care to ensure you specify treelang in the configure step by adding
297
treelang to the list of languages specified by @option{--enable-languages},
298
e.g.@: @samp{--enable-languages=all,treelang}.
299
 
300
If you're generally curious about the future of
301
@code{treelang}, see @ref{Projects}.
302
If you're curious about its past,
303
see @ref{Contributors}.
304
 
305
To see a few of the questions maintainers of @code{treelang} have,
306
and that you might be able to answer,
307
see @ref{Open Questions}.
308
 
309
@ifset USING
310
@node What is GNU Treelang?, Lexical Syntax, Getting Started, Top
311
@chapter What is GNU Treelang?
312
@cindex concepts, basic
313
@cindex basic concepts
314
 
315
GNU Treelang, or @code{treelang}, was designed initially as a free
replacement for, or alternative to, the 'toy' language, in a form that is
amenable to inclusion within the GCC source tree.
318
 
319
@code{treelang} is largely a cut down version of C, designed to showcase
320
the features of the GCC code generation back end.  Only those features
321
that are directly supported by the GCC code generation back end are
322
implemented.  Features are implemented in whichever manner is easiest and
clearest.  Not all or even most code generation back end
324
features are implemented.  The intention is to add features incrementally
325
until most features of the GCC back end are implemented in treelang.
326
 
327
The main features missing are structures, arrays and pointers.
328
 
329
A sample program follows:
330
 
331
@smallexample
332
// @r{function prototypes}
333
// @r{function 'add' taking two ints and returning an int}
334
external_definition int add(int arg1, int arg2);
335
external_definition int subtract(int arg3, int arg4);
336
external_definition int first_nonzero(int arg5, int arg6);
337
external_definition int double_plus_one(int arg7);
338
 
339
// @r{function definition}
340
add
341
@{
342
  // @r{return the sum of arg1 and arg2}
343
  return arg1 + arg2;
344
@}
345
 
346
 
347
subtract
348
@{
349
  return arg3 - arg4;
350
@}
351
 
352
double_plus_one
353
@{
354
  // @r{aaa is a variable, of type integer and allocated at the start of}
355
  // @r{the function}
356
  automatic int aaa;
357
  // @r{set aaa to the value returned from add, when passed arg7 and arg7 as}
358
  // @r{the two parameters}
359
  aaa=add(arg7, arg7);
360
  aaa=add(aaa, aaa);
361
  aaa=subtract(subtract(aaa, arg7), arg7) + 1;
362
  return aaa;
363
@}
364
 
365
first_nonzero
366
@{
367
  // @r{C-like if statement}
368
  if (arg5)
369
    @{
370
      return arg5;
371
    @}
372
  else
373
    @{
374
    @}
375
  return arg6;
376
@}
377
@end smallexample
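As a check on the sample, the comments below trace the body of
@code{double_plus_one} by hand.  The argument value 5 is an assumption
chosen purely for illustration; the trace has not been produced by
running the compiler.

@smallexample
// @r{hand trace of double_plus_one, assuming arg7 is 5}
aaa=add(arg7, arg7);                          // @r{aaa is 5 + 5 = 10}
aaa=add(aaa, aaa);                            // @r{aaa is 10 + 10 = 20}
aaa=subtract(subtract(aaa, arg7), arg7) + 1;  // @r{(20 - 5) - 5 + 1 = 11}
return aaa;                                   // @r{11, that is 2 * 5 + 1}
@end smallexample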
378
 
379
@node Lexical Syntax, Parsing Syntax, What is GNU Treelang?, Top
380
@chapter Lexical Syntax
381
@cindex Lexical Syntax
382
 
383
Treelang programs consist of whitespace, comments, keywords and names.
384
@itemize @bullet
385
 
386
@item
387
Whitespace consists of the space character, a tab, and the end of line
388
character.  Line terminations are as defined by the
389
standard C library.  Whitespace is ignored except within comments,
390
and where it separates parts of the program.  In the example below, A and
391
B are two separate names separated by whitespace.
392
 
393
@smallexample
394
A B
395
@end smallexample
396
 
397
@item
398
Comments consist of @samp{//} followed by any characters up to the end
399
of the line.  C style comments (/* */) are not supported.  For example,
400
the assignment below is followed by a not very helpful comment.
401
 
402
@smallexample
403
x = 1; // @r{Set X to 1}
404
@end smallexample
405
 
406
@item
407
Keywords consist of any of the following reserved words or symbols:
408
 
409
@itemize @bullet
410
@item @{
411
used to start the statements in a function
412
@item @}
413
used to end the statements in a function
414
@item (
415
start list of function arguments, or to change the precedence of operators in
416
an expression
417
@item )
418
end a list of function arguments, or end a parenthesized part of an expression
419
@item ,
420
used to separate parameters in a function prototype or in a function call
421
@item ;
422
used to end a statement
423
@item +
424
addition, or unary plus for signed literals
425
@item -
426
subtraction, or unary minus for signed literals
427
@item =
428
assignment
429
@item ==
430
equality test
431
@item if
432
begin IF statement
433
@item else
434
begin 'else' portion of IF statement
435
@item static
436
indicate variable is permanent, or function has file scope only
437
@item automatic
438
indicate that variable is allocated for the life of the current scope
439
@item external_reference
440
indicate that variable or function is defined in another file
441
@item external_definition
442
indicate that variable or function is to be accessible from other files
443
@item int
444
variable is an integer (same as C int)
445
@item char
446
variable is a character (same as C char)
447
@item unsigned
448
variable is unsigned. If this is not present, the variable is signed
449
@item return
450
start function return statement
451
@item void
452
used as function type to indicate function returns nothing
453
@end itemize
454
 
455
 
456
@item
457
Names consist of any letter or "_" followed by any number of letters,
458
numbers, or "_".  "$" is not allowed in a name.  All names must be globally
459
unique, i.e. may not be used twice in any context, and must
460
not be a keyword.  Names and keywords are case sensitive.  For example:
461
 
462
@smallexample
463
a A _a a_ IF_X
464
@end smallexample
465
 
466
are all different names.
467
 
468
@end itemize
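The lexical rules above can be seen together in the short fragment
below.  It is a sketch only: the variable names are hypothetical, and
the fragment has not been checked against @code{tree1}.

@smallexample
// @r{a comment runs to the end of the line}
static int total = 0;            // @r{static and int are keywords, total is a name}
static int Total = 1;            // @r{a different name, since case matters}
static unsigned char _flag = 1;  // @r{a name may begin with an underscore}
static int IF_X = +2;            // @r{IF_X is a name, only lower-case if is a keyword}
@end smallexample

Each name appears only once, in keeping with the rule that names must be
globally unique.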
469
 
470
@node Parsing Syntax, Compiler Overview, Lexical Syntax, Top
471
@chapter Parsing Syntax
472
@cindex Parsing Syntax
473
 
474
Declarations are built up from the lexical elements described above.  A
475
file may contain one or more declarations.
476
 
477
@itemize @bullet
478
 
479
@item
480
declaration: variable declaration OR function prototype OR function declaration
481
 
482
@item
483
Function Prototype: storage type NAME ( optional_parameter_list ) ;
484
 
485
@smallexample
486
static int add (int a, int b);
487
@end smallexample
488
 
489
@item
490
variable_declaration: storage type NAME initial;
491
 
492
Example:
493
 
494
@smallexample
495
static int temp1 = 1;
496
@end smallexample
497
 
498
A variable declaration can be outside a function, or at the start of a
499
function.
500
 
501
@item
502
storage: automatic OR static OR external_reference OR external_definition
503
 
504
This defines the scope, duration and visibility of a function or variable.
505
 
506
@enumerate 1
507
 
508
@item
509
automatic: This means a variable is allocated at start of the current scope and
510
released when the current scope is exited.  This can only be used for variables
511
within functions.  It cannot be used for functions.
512
 
513
@item
514
static: This means a variable is allocated at start of program and
515
remains allocated until the program as a whole ends.  For a function, it
516
means that the function is only visible within the current file.
517
 
518
@item
519
external_definition: For a variable, which must be defined outside a
520
function, it means that the variable is visible from other files.  For a
521
function, it means that the function is visible from another file.
522
 
523
@item
524
external_reference: For a variable, which must be defined outside a
525
function, it means that the variable is defined in another file.  For a
526
function, it means that the function is defined in another file.
527
 
528
@end enumerate
529
 
530
@item
531
type: int OR unsigned int OR char OR unsigned char OR void
532
 
533
This defines the data type of a variable or the return type of a function.
534
 
535
@enumerate a
536
 
537
@item
538
int: The variable is a signed integer.  The function returns a signed integer.
539
 
540
@item
541
unsigned int: The variable is an unsigned integer.  The function returns an unsigned integer.
542
 
543
@item
544
char: The variable is a signed character.  The function returns a signed character.
545
 
546
@item
547
unsigned char: The variable is an unsigned character.  The function returns an unsigned character.
548
 
549
@end enumerate
550
 
551
@item
552
parameter_list: parameter [, parameter]...
553
 
554
@item
555
parameter: type NAME

A parameter is written like a variable declaration, but without a storage
class, an initialization or a terminating semicolon.
558
 
559
@item
560
initial: = value
561
 
562
@item
563
value: integer_constant
564
 
565
Values without a unary plus or minus are considered to be unsigned.
566
@smallexample
567
e.g.@: 1 +2 -3
568
@end smallexample
569
 
570
@item
571
function_declaration: name @{ variable_declarations statements @}
572
 
573
A function consists of the function name then the declarations (if any)
574
and statements (if any) within one pair of braces.
575
 
576
The details of the function arguments come from the function
577
prototype.  The function prototype must precede the function declaration
578
in the file.
579
 
580
@item
581
statement: if_statement OR expression_statement OR return_statement
582
 
583
@item
584
if_statement: if ( expression ) @{ variable_declarations statements @}
585
else @{ variable_declarations statements @}
586
 
587
The first lot of statements is executed if the expression is
588
nonzero.  Otherwise the second lot of statements is executed.  Either
589
list of statements may be empty, but both sets of braces and the else must be present.
590
 
591
@smallexample
592
if (a==b)
593
@{
594
// @r{nothing}
595
@}
596
else
597
@{
598
a=b;
599
@}
600
@end smallexample
601
 
602
@item
603
expression_statement: expression;
604
 
605
The expression is executed, including any side effects.
606
 
607
@item
608
return_statement: return expression_opt;
609
 
610
Returns from the function. If the function is void, the expression must
611
be absent, and if the function is not void the expression must be
612
present.
613
 
614
@item
615
expression: variable OR integer_constant OR expression + expression
616
OR expression - expression OR expression == expression OR ( expression )
617
OR variable = expression OR function_call
618
 
619
An expression can be a constant or a variable reference or a
620
function_call.  Expressions can be combined as a sum of two expressions
621
or the difference of two expressions, or an equality test of two
622
expressions.  An assignment is also an expression.  Expressions and operator
623
precedence work as in C.
624
 
625
@item
626
function_call: function_name ( optional_comma_separated_expressions )
627
 
628
This invokes the function, passing to it the values of the expressions
629
as actual parameters.
630
 
631
@end itemize
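The rules above can be combined into a small complete file.  The
following is a sketch only, modelled on the sample program earlier in
this manual: the names @code{same}, @code{pick} and @code{fallback} are
hypothetical, and the file has not been compiled with @code{tree1}.

@smallexample
// @r{prototypes come first and name every parameter}
static int same(int left, int right);
external_definition int pick(int first, int second);

// @r{file-level variable with an initial value}
static int fallback = 1;

same
@{
  // @r{both sets of braces and the else are required}
  if (left == right)
    @{
      return 1;
    @}
  else
    @{
    @}
  return 0;
@}

pick
@{
  // @r{declarations precede statements inside the braces}
  automatic int matched;
  matched = same(first, second);
  if (matched)
    @{
      return first;
    @}
  else
    @{
    @}
  return fallback;
@}
@end smallexample

Every parameter and variable name is distinct, because names must be
globally unique.  The function @code{same} is static and so visible only
within this file, while @code{pick} is meant to be callable from other
files.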
632
 
633
@cindex compilers
634
@node Compiler Overview, TREELANG and GCC, Parsing Syntax, Top
635
@chapter Compiler Overview
636
treelang is run as part of the GCC compiler.
637
 
638
@itemize @bullet
639
@cindex source code
640
@cindex file, source
641
@cindex code, source
642
@cindex source file
643
@item
644
It reads a user's program, stored in a file and containing instructions
645
written in the appropriate language (Treelang, C, and so on).  This file
646
contains @dfn{source code}.
647
 
648
@cindex translation of user programs
649
@cindex machine code
650
@cindex code, machine
651
@cindex mistakes
652
@item
653
It translates the user's program into instructions a computer can carry
654
out more quickly than it takes to translate the instructions in the
655
first place.  These instructions are called @dfn{machine code}---code
656
designed to be efficiently translated and processed by a machine such as
657
a computer.  Humans usually aren't as good at writing machine code as they
658
are at writing Treelang or C, because it is easy to make tiny mistakes
659
writing machine code.  When writing Treelang or C, it is easy to make
660
big mistakes. But you can only make one mistake, because the compiler
661
stops after it finds any problem.
662
 
663
@cindex debugger
664
@cindex bugs, finding
665
@cindex @code{gdb}, command
666
@cindex commands, @code{gdb}
667
@item
668
It provides information in the generated machine code
669
that can make it easier to find bugs in the program
670
(using a debugging tool, called a @dfn{debugger},
671
such as @code{gdb}).
672
 
673
@cindex libraries
674
@cindex linking
675
@cindex @code{ld} command
676
@cindex commands, @code{ld}
677
@item
678
It locates and gathers machine code already generated to perform actions
679
requested by statements in the user's program.  This machine code is
680
organized into @dfn{libraries} and is located and gathered during the
681
@dfn{link} phase of the compilation process.  (Linking often is thought
682
of as a separate step, because it can be directly invoked via the
683
@code{ld} command.  However, the @code{gcc} command, as with most
684
compiler commands, automatically performs the linking step by calling on
685
@code{ld} directly, unless asked to not do so by the user.)
686
 
687
@cindex language, incorrect use of
688
@cindex incorrect use of language
689
@item
690
It attempts to diagnose cases where the user's program contains
691
incorrect usages of the language.  The @dfn{diagnostics} produced by the
692
compiler indicate the problem and the location in the user's source file
693
where the problem was first noticed.  The user can use this information
694
to locate and fix the problem.
695
 
696
The compiler stops after the first error.  There are no plans to fix
697
this, ever, as it would vastly complicate the implementation of treelang
698
to little or no benefit.
699
 
700
@cindex diagnostics, incorrect
701
@cindex incorrect diagnostics
702
@cindex error messages, incorrect
703
@cindex incorrect error messages
704
(Sometimes an incorrect usage of the language leads to a situation where
705
the compiler can not make any sense of what it reads---while a human
706
might be able to---and thus ends up complaining about an incorrect
707
``problem'' it encounters that, in fact, reflects a misunderstanding of
708
the programmer's intention.)
709
 
710
@cindex warnings
711
@cindex questionable instructions
712
@item
713
There are a few warnings in treelang.  For example, an unused static function
generates a warning when @option{-Wunused-function} is specified; similarly, an
unused static variable generates a warning when @option{-Wunused-variable} is specified.
716
The only treelang specific warning is a warning when an expression is in a
717
return statement for functions that return void.
718
@end itemize
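The warnings described in the last item might be provoked by a file such
as the following.  This is an illustrative sketch: the function names
are hypothetical, and the exact wording of the messages is not shown
because it has not been checked against the compiler.

@smallexample
static int never_called(int ignored);
external_definition void finish(int status);

// @r{static and never called, so -Wunused-function should report it}
never_called
@{
  return ignored;
@}

finish
@{
  // @r{finish returns void, so this expression should draw the treelang warning}
  return status;
@}
@end smallexample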
719
 
720
@cindex components of treelang
721
@cindex @code{treelang}, components of
722
@code{treelang} consists of several components:
723
 
724
@cindex @code{gcc}, command
725
@cindex commands, @code{gcc}
726
@itemize @bullet
727
@item
728
A modified version of the @code{gcc} command, which also might be
729
installed as the system's @code{cc} command.
730
(In many cases, @code{cc} refers to the
731
system's ``native'' C compiler, which
732
might be a non-GNU compiler, or an older version
733
of @code{GCC} considered more stable or that is
734
used to build the operating system kernel.)
735
 
736
@cindex @code{treelang}, command
737
@cindex commands, @code{treelang}
738
@item
739
The @code{treelang} command itself.
740
 
741
@item
742
The @code{libc} run-time library.  This library contains the machine
743
code needed to support capabilities of the Treelang language that are
744
not directly provided by the machine code generated by the
745
@code{treelang} compilation phase.  This is the same library that the
746
main C compiler uses (libc).
747
 
748
@cindex @code{tree1}, program
749
@cindex programs, @code{tree1}
750
@cindex assembler
751
@cindex @code{as} command
752
@cindex commands, @code{as}
753
@cindex assembly code
754
@cindex code, assembly
755
@item
756
The compiler itself is internally named @code{tree1}.
757
 
758
Note that @code{tree1} does not generate machine code directly---it
759
generates @dfn{assembly code} that is a more readable form
760
of machine code, leaving the conversion to actual machine code
761
to an @dfn{assembler}, usually named @code{as}.
762
@end itemize
763
 
764
@code{GCC} is often thought of as ``the C compiler'' only,
765
but it does more than that.
766
Based on command-line options and the names given for files
767
on the command line, @code{gcc} determines which actions to perform, including
768
preprocessing, compiling (in a variety of possible languages), assembling,
769
and linking.
770
 
771
@cindex driver, gcc command as
772
@cindex @code{gcc}, command as driver
773
@cindex executable file
774
@cindex files, executable
775
@cindex cc1 program
776
@cindex programs, cc1
777
@cindex preprocessor
778
@cindex cpp program
779
@cindex programs, cpp
780
For example, the command @samp{gcc foo.c} @dfn{drives} the file
781
@file{foo.c} through the preprocessor @code{cpp}, then
782
the C compiler (internally named
783
@code{cc1}), then the assembler (usually @code{as}), then the linker
784
(@code{ld}), producing an executable program named @file{a.out} (on
785
UNIX systems).
786
 
787
@cindex treelang program
788
@cindex programs, treelang
789
As another example, the command @samp{gcc foo.tree} would do much the
790
same as @samp{gcc foo.c}, but instead of using the C compiler named
791
@code{cc1}, @code{gcc} would use the treelang compiler (named
792
@code{tree1}). However there is no preprocessor for treelang.
793
 
794
@cindex @code{tree1}, program
795
@cindex programs, @code{tree1}
796
In a GNU Treelang installation, @code{gcc} recognizes Treelang source
797
files by name just like it does C and C++ source files.  It knows to use
798
the Treelang compiler named @code{tree1}, instead of @code{cc1} or
799
@code{cc1plus}, to compile Treelang files.  If a file's name ends in
800
@code{.tree} then GCC knows that the program is written in treelang.  You
801
can also manually override the language.
802
 
803
@cindex @code{gcc}, not recognizing Treelang source
804
@cindex unrecognized file format
805
@cindex file format not recognized
806
Non-Treelang-related operation of @code{gcc} is generally
807
unaffected by installing the GNU Treelang version of @code{gcc}.
808
However, without the installed version of @code{gcc} being the
809
GNU Treelang version, @code{gcc} will not be able to compile
810
and link Treelang programs.
811
 
812
@cindex printing version information
813
@cindex version information, printing
814
The command @samp{gcc -v x.tree} where @samp{x.tree} is a file which
815
must exist but whose contents are ignored, is a quick way to display
816
version information for the various programs used to compile a typical
817
Treelang source file.
818
 
819
The @code{tree1} program represents most of what is unique to GNU
820
Treelang; @code{tree1} is a combination of two rather large chunks of
821
code.
822
 
823
@cindex GCC Back End (GBE)
824
@cindex GBE
825
@cindex @code{GCC}, back end
826
@cindex back end, GCC
827
@cindex code generator
828
One chunk is the so-called @dfn{GNU Back End}, or GBE,
829
which knows how to generate fast code for a wide variety of processors.
830
The same GBE is used by the C, C++, and Treelang compiler programs @code{cc1},
831
@code{cc1plus}, and @code{tree1}, plus others.
832
Often the GBE is referred to as the ``GCC back end'' or
833
even just ``GCC''---in this manual, the term GBE is used
834
whenever the distinction is important.
835
 
836
@cindex GNU Treelang Front End (TFE)
837
@cindex tree1
838
@cindex @code{treelang}, front end
839
@cindex front end, @code{treelang}
840
The other chunk of @code{tree1} is the majority of what is unique about
841
GNU Treelang---the code that knows how to interpret Treelang programs to
842
determine what they are intending to do, and then communicate that
843
knowledge to the GBE for actual compilation of those programs.  This
844
chunk is called the @dfn{Treelang Front End} (TFE).  The @code{cc1} and
845
@code{cc1plus} programs have their own front ends, for the C and C++
846
languages, respectively.  These front ends are responsible for
847
diagnosing incorrect usage of their respective languages by the programs
848
they process, and are responsible for most of the warnings about
849
questionable constructs as well.  (The GBE in principle handles
850
producing some warnings, like those concerning possible references to
851
undefined variables, but these warnings should not occur in treelang
852
programs as the front end is meant to pick them up first).
853
 
854
Because so much is shared among the compilers for various languages,
855
much of the behavior and many of the user-selectable options for these
856
compilers are similar.
857
For example, diagnostics (error messages and
858
warnings) are similar in appearance; command-line
859
options like @samp{-Wall} have generally similar effects; and the quality
860
of generated code (in terms of speed and size) is roughly similar
861
(since that work is done by the shared GBE).
862
 
863
@node TREELANG and GCC, Compiler, Compiler Overview, Top
864
@chapter Compile Treelang, C, or Other Programs
865
@cindex compiling programs
866
@cindex programs, compiling
867
 
868
@cindex @code{gcc}, command
869
@cindex commands, @code{gcc}
870
A GNU Treelang installation includes a modified version of the @code{gcc}
871
command.
872
 
873
In a non-Treelang installation, @code{gcc} recognizes C, C++,
874
and Objective-C source files.
875
 
876
In a GNU Treelang installation, @code{gcc} also recognizes Treelang source
877
files and accepts Treelang-specific command-line options, plus some
878
command-line options that are designed to cater to Treelang users
879
but apply to other languages as well.
880
 
881
@xref{G++ and GCC,,Programming Languages Supported by GCC,GCC,Using
882
the GNU Compiler Collection (GCC)},
883
for information on the way different languages are handled
884
by the GCC compiler (@code{gcc}).
885
 
886
You can use this, combined with the output of the @samp{gcc -v x.tree}
command, to get the options applicable to treelang.  The names of Treelang
source files must end with the suffix @samp{.tree}.
889
 
890
@cindex preprocessor
891
 
892
Treelang programs are not run through the C preprocessor by @code{gcc}
by default.  Nothing stops you from running them through the
preprocessor manually, but you must pass the @samp{-P} option to prevent
the preprocessor from generating @code{#line} directives; otherwise
@code{tree1} will not accept the input.
897
 
898
@node Compiler, Other Languages, TREELANG and GCC, Top
899
@chapter The GNU Treelang Compiler
900
 
901
The GNU Treelang compiler, @code{treelang}, supports programs written
902
in the GNU Treelang language.
903
 
904
@node Other Languages, treelang internals, Compiler, Top
905
@chapter Other Languages
906
 
907
@menu
908
* Interoperating with C and C++::
909
@end menu
910
 
911
@node Interoperating with C and C++,  , Other Languages, Other Languages
912
@section Tools and advice for interoperating with C and C++
913
 
914
The output of treelang programs looks like C program code to the linker
915
and everybody else, so you should be able to freely mix treelang and C
916
(and C++) code, with one proviso.
917
 
918
C promotes small integer types to 'int' when used as function parameters and
919
return values in non-prototyped functions.  Since treelang has no
920
non-prototyped functions, the treelang compiler does not do this.
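
The proviso can be seen from the C side.  The following is a minimal
sketch, not code from treelang: @code{add_small} is a hypothetical
stand-in, written in C, for a routine that would be defined in treelang
with narrow integer parameters.  With a full prototype in scope, a C
caller converts its arguments to the declared parameter types instead of
applying the default argument promotions, which is what a treelang
routine expects.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical stand-in for a routine written in treelang that takes and
   returns 'signed char'.  A C caller should see a full prototype like this
   one, so that arguments are converted to the declared parameter types. */
signed char add_small(signed char a, signed char b);

signed char add_small(signed char a, signed char b)
{
    int sum = a + b;              /* operands are promoted to int here */
    assert(sum >= SCHAR_MIN && sum <= SCHAR_MAX);
    return (signed char)sum;
}

/* Returns 1 when a 'signed char' operand is promoted to int in an
   expression, which the C standard requires. */
int char_operand_is_promoted(void)
{
    signed char c = 1;
    return sizeof(c + c) == sizeof(int);
}
```

When calling treelang code from C, the practical consequence is to
always declare a prototype for each treelang function and never rely on
an old-style, non-prototyped declaration.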
921
 
922
@ifset INTERNALS
923
@node treelang internals, Open Questions, Other Languages, Top
924
@chapter treelang internals
925
 
926
@menu
927
* treelang files::
928
* treelang compiler interfaces::
929
* Hints and tips::
930
@end menu
931
 
932
@node treelang files, treelang compiler interfaces, treelang internals, treelang internals
933
@section treelang files
934
 
935
To create a compiler that integrates into GCC, you need to create many
936
files.  Some of the files are integrated into the main GCC makefile, to
937
build the various parts of the compiler and to run the test
938
suite.  Others are incorporated into various GCC programs such as
939
@file{gcc.c}.  Finally you must provide the actual programs comprising your
940
compiler.
941
 
942
@cindex files
943
 
944
The files are:
945
 
946
@enumerate 1
947
 
948
@item
949
COPYING.  This is the copyright file, assuming you are going to use the
950
GNU General Public License.  You probably need to use the GPL because if
951
you use the GCC back end your program and the back end are one program,
952
and the back end is GPLed.
953
 
954
This need not be present if the language is incorporated into the main
955
GCC tree, as the main GCC directory has this file.
956
 
957
@item
958
COPYING.LIB.  This is the copyright file for those parts of your program
959
that are not to be covered by the GPL, but are instead to be covered by
960
the LGPL (Library or Lesser GPL).  This license may be appropriate for
961
the library routines associated with your compiler. These are the
962
routines that are linked with the @emph{output} of the compiler.  Using
963
the LGPL for these programs allows programs written using your compiler
964
to be closed source. For example LIBC is under the LGPL.
965
 
966
This need not be present if the language is incorporated into the main
967
GCC tree, as the main GCC directory has this file.
968
 
969
@item
970
ChangeLog.  Record all the changes to your compiler.  Use the same format
971
as used in treelang as it is supported by an emacs editing mode and is
972
part of the FSF coding standard.  Normally each directory has its own
973
changelog.  The FSF standard allows but does not require a meaningful
974
comment on why the changes were made, above and beyond @emph{what}
was changed.  In the author's opinion it is useful to provide this
976
information.
977
 
978
@item
979
treelang.texi.  The manual, written in texinfo. Your manual would have a
980
different file name.  You need not write it in texinfo if you don't want
981
to, but a lot of GNU software does use texinfo.
982
 
983
@cindex Make-lang.in
984
@item
985
Make-lang.in.  This file is part of the make file which is incorporated
986
with the GCC make file skeleton (Makefile.in in the GCC directory) to
987
make Makefile, as part of the configuration process.
988
 
989
Makefile in turn is the main instruction to actually build
990
everything.  The build instructions are held in the main GCC manual and
991
web site so they are not repeated here.
992
 
993
There are some comments at the top which will help you understand what
994
you need to do.
995
 
996
There are make commands to build things, remove generated files with
997
various degrees of thoroughness, count the lines of code (so you know
998
how much progress you are making), build info and html files from the
999
texinfo source, run the tests etc.
1000
 
1001
@item
1002
README.  Just a brief informative text file saying what is in this
1003
directory.
1004
 
1005
@cindex config-lang.in
1006
@item
1007
config-lang.in.  This file is read by the configuration process and must
be present.  In it you specify the name of your language, the name(s) of
the compiler(s), including preprocessors, that you are going to build,
and whether any files (usually generated ones) should be excluded from
diffs (i.e., when making diff files to send in patches).  Whether the
equate 'stagestuff' is used is unknown (???).
1013
 
1014
@cindex lang.opt
1015
@item
1016
lang.opt.  This file is included into @file{gcc.c}, the main GCC driver, and
1017
tells it what options your language supports.  This is also used to
1018
display help.
1019
 
1020
@cindex lang-specs.h
1021
@item
1022
lang-specs.h.  This file is also included in @file{gcc.c}. It tells
1023
@file{gcc.c} when to call your programs and what options to send them.  The
1024
mini-language 'specs' is documented in the source of @file{gcc.c}.  Do not
1025
attempt to write a specs file from scratch - use an existing one as the base
1026
and enhance it.
1027
 
1028
@item
1029
Your texi files.  Texinfo can be used to build documentation in HTML,
1030
info, dvi and postscript formats. It is a tagged language, is documented
1031
in its own manual, and has its own emacs mode.
1032
 
1033
@item
1034
Your programs.  The relationships between all the programs are explained
1035
in the next section.  You need to write or use the following programs:
1036
 
1037
@itemize @bullet
1038
 
1039
@item
1040
lexer.  This breaks the input into words and passes these to the
1041
parser.  This is @file{lex.l} in treelang, which is passed through flex, a lex
1042
variant, to produce C code @file{lex.c}.  Note there is a school of thought
1043
that says real men hand code their own lexers.  However, you may prefer to
1044
write far less code and use flex, as was done with treelang.
1045
 
1046
@item
1047
parser.  This breaks the program into recognizable constructs such as
1048
expressions, statements etc.  This is @file{parse.y} in treelang, which is
1049
passed through bison, which is a yacc variant, to produce C code
1050
@file{parse.c}.
1051
 
1052
@item
1053
back end interface.  This interfaces to the code generation back end.  In
1054
treelang, this is @file{tree1.c}, which mainly interfaces to @file{toplev.c}, and
@file{treetree.c}, which mainly interfaces to everything else. Many languages
1056
mix up the back end interface with the parser, as in the C compiler for
1057
example.  It is a matter of taste which way to do it, but with treelang
1058
it is separated out to make the back end interface cleaner and easier to
1059
understand.
1060
 
1061
@item
1062
header files.  For function prototypes and common data items.  One point
1063
to note here is that bison can generate a header file with all the
numbers it has assigned to the keywords and symbols, and you can include
1065
the same header in your lexer.  This technique is demonstrated in
1066
treelang.
1067
 
1068
@item
1069
compiler main file.  GCC comes with a file @file{toplev.c} which is a
1070
perfectly serviceable main program for your compiler.  GNU Treelang uses
1071
@file{toplev.c} but other languages have been known to replace it with their
1072
own main program.  Again this is a matter of taste and how much code you
1073
want to write.
1074
 
1075
@end itemize
1076
 
1077
@end enumerate
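
The division of labour between the lexer and the shared token numbers
described in the list above can be sketched in C.  This is an
illustrative, hand-written scanner, not the flex-generated lexer that
treelang uses; the names @code{token_code}, @code{struct token} and
@code{next_token} are assumptions made for the example.  In a bison-based
front end the token codes would come from the header that bison
generates rather than from a hand-written @code{enum}.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Token codes.  With bison these would come from the generated header;
   the names here are hypothetical. */
enum token_code { TOK_EOF = 0, TOK_NAME, TOK_INTEGER, TOK_PUNCT };

struct token {
    enum token_code code;
    const char *start;   /* first character of the token text */
    size_t length;       /* number of characters in the token */
};

/* Scan one token starting at *cursor and advance *cursor past it. */
struct token next_token(const char **cursor)
{
    const char *p;
    struct token tok;

    assert(cursor != NULL && *cursor != NULL);
    p = *cursor;

    /* Skip white space between tokens. */
    while (isspace((unsigned char)*p))
        p++;

    tok.start = p;
    if (*p == '\0') {
        tok.code = TOK_EOF;
    } else if (isalpha((unsigned char)*p) || *p == '_') {
        tok.code = TOK_NAME;
        while (isalnum((unsigned char)*p) || *p == '_')
            p++;
    } else if (isdigit((unsigned char)*p)) {
        tok.code = TOK_INTEGER;
        while (isdigit((unsigned char)*p))
            p++;
    } else {
        tok.code = TOK_PUNCT;   /* any other single character */
        p++;
    }

    tok.length = (size_t)(p - tok.start);
    *cursor = p;
    return tok;
}
```

A parser would call @code{next_token} repeatedly and switch on the
@code{code} field; because both sides include the same token
definitions, the numbers cannot drift apart.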
1078
 
1079
@node treelang compiler interfaces, Hints and tips, treelang files, treelang internals
1080
@section treelang compiler interfaces
1081
 
1082
@cindex driver
1083
@cindex toplev.c
1084
 
1085
@menu
1086
* treelang driver::
1087
* treelang main compiler::
1088
@end menu
1089
 
1090
@node treelang driver, treelang main compiler, treelang compiler interfaces, treelang compiler interfaces
1091
@subsection treelang driver
1092
 
1093
GCC consists of a driver, which executes the various compiler phases
based on the instructions in the specs files.
1095
 
1096
Typically a program's language is identified from its suffix
(e.g., @file{.tree} for treelang programs).
1098
 
1099
The driver (@file{gcc.c}) then runs (execs) in turn a preprocessor,
the main compiler, the assembler and the link editor.  Options to GCC
allow you to override all of this.  Treelang programs have no
preprocessor, and these days the C preprocessor mostly runs within the
main C compiler rather than as a separate process, apparently for
reasons of speed.
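
The suffix-based language selection can be sketched as a table lookup.
This is only an illustration of the idea: the real driver takes its
rules from the specs, and the table contents and the name
@code{language_for_file} are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct suffix_entry {
    const char *suffix;     /* file name suffix, including the dot */
    const char *language;   /* language selected for that suffix */
};

/* Illustrative table only; the real driver reads this from the specs. */
static const struct suffix_entry suffix_table[] = {
    { ".c",    "c" },
    { ".tree", "treelang" },
};

/* Return the language for a file name, or NULL if the suffix is unknown. */
const char *language_for_file(const char *filename)
{
    const char *dot;
    size_t i;

    assert(filename != NULL);
    dot = strrchr(filename, '.');
    if (dot == NULL)
        return NULL;

    for (i = 0; i < sizeof suffix_table / sizeof suffix_table[0]; i++)
        if (strcmp(dot, suffix_table[i].suffix) == 0)
            return suffix_table[i].language;
    return NULL;
}
```

Once the language is known, the driver decides which programs to run
for it and with which options.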
1104
 
1105
You will be using the standard assembler and linkage editor so these are
1106
ignored from now on.
1107
 
1108
You have to write your own preprocessor if you want one.  This is usually
totally language specific.  The main point is to find some way to pass
file name and line number information through to the main compiler, so
that it can pass this information to the back end and the debugger can
find the right source line for each piece of code.  Beyond that, the
preprocessor will probably be neither the slowest part of the compiler
nor the heaviest user of memory, so don't spend much time tuning it
until you know you need to.
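
C's own answer to the file name and line number problem is the
@code{#line} directive, which a preprocessor writes into its output so
that the compiler reports positions in the original source.  The sketch
below shows the effect in plain C; the file name @file{orig.src} and the
function names are hypothetical.  A preprocessor for another language
would need an equivalent convention that its own compiler understands.

```c
#include <assert.h>
#include <string.h>

/* Pretend the text below was produced by a preprocessor from line 120 of a
   hypothetical original source file "orig.src". */
#line 120 "orig.src"
int reported_line(void) { return __LINE__; }
const char *reported_file(void) { return __FILE__; }

/* Self-check: the compiler now reports positions in the original file. */
void check_reported_position(void)
{
    assert(reported_line() == 120);
    assert(strcmp(reported_file(), "orig.src") == 0);
}
```

Diagnostics and debug information for code after the directive refer to
the original file, which is what lets a debugger find the right source
line.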
1117
 
1118
@node treelang main compiler,  , treelang driver, treelang compiler interfaces
1119
@subsection treelang main compiler
1120
 
1121
The main compiler for treelang consists of @file{toplev.c} from the main GCC
1122
compiler, the parser, lexer and back end interface routines, and the
1123
back end routines themselves, of which there are many.
1124
 
1125
@file{toplev.c} does a lot of work for you and you should almost certainly
1126
use it.
1127
 
1128
Writing this code is the hard part of creating a compiler using GCC.  The
1129
back end interface documentation is incomplete and the interface is
1130
complex.
1131
 
1132
There are three main aspects to interfacing to the other GCC code.
1133
 
1134
@menu
1135
* Interfacing to toplev.c::
1136
* Interfacing to the garbage collection::
1137
* Interfacing to the code generation code. ::
1138
@end menu
1139
 
1140
@node Interfacing to toplev.c, Interfacing to the garbage collection, treelang main compiler, treelang main compiler
1141
@subsubsection Interfacing to toplev.c
1142
 
1143
In treelang this is handled mainly in @file{tree1.c}
and partly in @file{treetree.c}.  Read @file{toplev.c} for details of what you need
to do.
1146
 
1147
@node Interfacing to the garbage collection, Interfacing to the code generation code. , Interfacing to toplev.c, treelang main compiler
1148
@subsubsection Interfacing to the garbage collection
1149
 
1150
In treelang, interfacing to the garbage collector is done mainly in
@file{tree1.c}.
1152
 
1153
Memory allocation in the compiler should be done using @code{ggc_alloc} and
kindred routines in @file{ggc*.*}.  At the end of every 'function' in your
language, @file{toplev.c} calls the garbage collector several times.  The
garbage collector calls mark routines, which go through the memory that is
still in use and tell the collector not to free it.  All memory that has
not been marked is then freed.
1159
 
1160
What this means is that you need a way to hook into this marking
process.  This is done by calling @code{ggc_add_root}, which provides the
address of a callback routine that will be called during garbage
collection and that can call @code{ggc_mark} to save the storage.  If
storage is only used within the parsing of a function, you do not need to
provide a way to mark it.
1166
 
1167
Note that you can also call ggc_mark_tree to mark any of the back end
1168
internal 'tree' nodes. This routine will follow the branches of the
1169
trees and mark all the subordinate structures. This is useful for
1170
example when you have created a variable declaration that will be used
1171
across multiple functions, or for a function declaration (from a
1172
prototype) that may be used later on.  See the next section for more on the
tree nodes.
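
The root-and-mark scheme can be illustrated with a toy collector.  This
is a sketch of the general mark-and-sweep technique only; it is not the
GGC implementation, and every name in it (@code{toy_alloc},
@code{toy_add_root}, @code{toy_mark}, @code{toy_collect},
@code{global_decl}) is invented for the example.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_SIZE 8
#define MAX_ROOTS 4

struct object {
    int in_use;             /* slot currently allocated */
    int marked;             /* reached from a root during this collection */
    struct object *child;   /* one outgoing reference, or NULL */
};

typedef void (*mark_callback)(void);

static struct object pool[POOL_SIZE];
static mark_callback roots[MAX_ROOTS];
static size_t root_count;

/* Allocate a slot from the pool; returns NULL when the pool is full. */
struct object *toy_alloc(void)
{
    size_t i;
    for (i = 0; i < POOL_SIZE; i++)
        if (!pool[i].in_use) {
            pool[i].in_use = 1;
            pool[i].marked = 0;
            pool[i].child = NULL;
            return &pool[i];
        }
    return NULL;
}

/* Register a callback that marks everything its owner still needs. */
void toy_add_root(mark_callback cb)
{
    assert(cb != NULL && root_count < MAX_ROOTS);
    roots[root_count++] = cb;
}

/* Mark an object and everything reachable through its child links. */
void toy_mark(struct object *obj)
{
    while (obj != NULL && !obj->marked) {
        obj->marked = 1;
        obj = obj->child;
    }
}

/* Run every root callback, then free whatever was left unmarked.
   Returns the number of objects freed. */
size_t toy_collect(void)
{
    size_t i, freed = 0;
    for (i = 0; i < root_count; i++)
        roots[i]();
    for (i = 0; i < POOL_SIZE; i++) {
        if (pool[i].in_use && !pool[i].marked) {
            pool[i].in_use = 0;
            freed++;
        }
        pool[i].marked = 0;   /* reset for the next collection */
    }
    return freed;
}

/* Long-lived front end data, for example a declaration used by several
   functions, is kept alive by a callback that marks it. */
struct object *global_decl;

void mark_global_decl(void)
{
    toy_mark(global_decl);
}
```

Objects that are only needed while one function is being parsed have no
root, so the first collection after that function reclaims them.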
1174
 
1175
@node Interfacing to the code generation code. ,  , Interfacing to the garbage collection, treelang main compiler
1176
@subsubsection Interfacing to the code generation code.
1177
 
1178
In treelang this is done in @file{treetree.c}.  A typedef called 'tree',
defined in @file{tree.h} and @file{tree.def} in the GCC directory and largely
implemented in @file{tree.c} and @file{stmt.c}, forms the basic interface to
the compiler back end.
1182
 
1183
In general you call various tree routines to generate code, either
1184
directly or through toplev.c. You build up data structures and
1185
expressions in similar ways.
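
The general shape of building expressions out of tree nodes can be
sketched with a toy node type.  This is not the GCC 'tree' type or its
API: @code{struct node}, @code{build_const}, @code{build_binary} and
@code{fold} are hypothetical names chosen for the example, and the real
interface in @file{tree.h} is far richer.

```c
#include <assert.h>
#include <stdlib.h>

enum node_code { NODE_CONST, NODE_PLUS, NODE_MULT };

struct node {
    enum node_code code;
    int value;                   /* used by NODE_CONST */
    struct node *left, *right;   /* used by the binary codes */
};

/* Allocate a node, aborting if memory runs out. */
static struct node *new_node(enum node_code code)
{
    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        abort();
    n->code = code;
    n->value = 0;
    n->left = n->right = NULL;
    return n;
}

/* Build a leaf node holding an integer constant. */
struct node *build_const(int value)
{
    struct node *n = new_node(NODE_CONST);
    n->value = value;
    return n;
}

/* Build an interior node from two operand subtrees. */
struct node *build_binary(enum node_code code, struct node *l, struct node *r)
{
    struct node *n;
    assert(code != NODE_CONST && l != NULL && r != NULL);
    n = new_node(code);
    n->left = l;
    n->right = r;
    return n;
}

/* Walk the tree and compute its value, standing in for code generation. */
int fold(const struct node *n)
{
    assert(n != NULL);
    switch (n->code) {
    case NODE_CONST: return n->value;
    case NODE_PLUS:  return fold(n->left) + fold(n->right);
    case NODE_MULT:  return fold(n->left) * fold(n->right);
    }
    return 0;
}

/* Release a tree built with the functions above. */
void free_tree(struct node *n)
{
    if (n == NULL)
        return;
    free_tree(n->left);
    free_tree(n->right);
    free(n);
}
```

A front end's parser actions would call builders of this kind as it
recognizes each construct, then hand the finished tree to the back end.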
1186
 
1187
You can read some documentation on this which can be found via the GCC
1188
main web page. In particular, the documentation produced by Joachim
1189
Nadler and translated by Tim Josling can be quite useful. The C compiler
1190
also has documentation in the main GCC manual (particularly the current
1191
CVS version) which is useful on a lot of the details.
1192
 
1193
In time it is hoped to enhance this document to provide a more
1194
comprehensive overview of this topic. The main gap is in explaining how
1195
it all works together.
1196
 
1197
@node Hints and tips,  , treelang compiler interfaces, treelang internals
1198
@section Hints and tips
1199
 
1200
@itemize @bullet
1201
 
1202
@item
1203
TAGS: Use the make ETAGS commands to create TAGS files which can be used in
1204
emacs to jump to any symbol quickly.
1205
 
1206
@item
1207
GREP: grep is also a useful way to find all uses of a symbol.
1208
 
1209
@item
1210
TREE: The main files to look at are tree.h and tree.def. You will
1211
probably want a hardcopy of these.
1212
 
1213
@item
1214
SAMPLE: look at the sample interfacing code in treetree.c. You can use
1215
gdb to trace through the code and learn about how it all works.
1216
 
1217
@item
1218
GDB: the GCC back end works well with gdb. It traps abort() and allows
1219
you to trace back what went wrong.
1220
 
1221
@item
1222
Error Checking: The compiler back end does some error and consistency
1223
checking. Often the result of an error is just no code being
1224
generated. You will then need to trace through and find out what is
1225
going wrong. The rtl dump files can help here also.
1226
 
1227
@item
1228
rtl dump files: The main compiler documents these files which are dumps
1229
of the rtl (intermediate code) which is manipulated during the code
1230
generation process. This can provide useful clues about what is going
1231
wrong. The rtl 'language' is documented in the main GCC manual.
1232
 
1233
@end itemize
1234
 
1235
@end ifset
1236
 
1237
@node Open Questions, Bugs, treelang internals, Top
1238
@chapter Open Questions
1239
 
1240
If you know GCC well, please consider looking at the file treetree.c and
1241
resolving any questions marked "???".
1242
 
1243
@node Bugs, Service, Open Questions, Top
1244
@chapter Reporting Bugs
1245
@cindex bugs
1246
@cindex reporting bugs
1247
 
1248
You can report bugs to @email{@value{email-bugs}}. Please make
1249
sure bugs are real before reporting them. Follow the guidelines in the
1250
main GCC manual for submitting bug reports.
1251
 
1252
@menu
1253
* Sending Patches::
1254
@end menu
1255
 
1256
@node Sending Patches,  , Bugs, Bugs
1257
@section Sending Patches for GNU Treelang
1258
 
1259
If you would like to write bug fixes or improvements for the GNU
1260
Treelang compiler, that is very helpful.  Send suggested fixes to
1261
@email{@value{email-patches}}.
1262
 
1263
@node Service, Projects, Bugs, Top
1264
@chapter How To Get Help with GNU Treelang
1265
 
1266
If you need help installing, using or changing GNU Treelang, there are two
1267
ways to find it:
1268
 
1269
@itemize @bullet
1270
 
1271
@item
1272
Look in the service directory for someone who might help you for a fee.
1273
The service directory is found in the file named @file{SERVICE} in the
1274
GCC distribution.
1275
 
1276
@item
1277
Send a message to @email{@value{email-general}}.
1278
 
1279
@end itemize
1280
 
1281
@end ifset
1282
@ifset INTERNALS
1283
 
1284
@node Projects, Index, Service, Top
1285
@chapter Projects
1286
@cindex projects
1287
 
1288
If you want to contribute to @code{treelang} by doing research,
1289
design, specification, documentation, coding, or testing,
1290
the following information should give you some ideas.
1291
 
1292
Send a message to @email{@value{email-general}} if you plan to add a
1293
feature.
1294
 
1295
The main requirement for treelang is to add features and to add
1296
documentation. Features are things that the GCC back end can do but
1297
which are not reflected in treelang. Examples include structures,
1298
unions, pointers, arrays.
1299
 
1300
@end ifset
1301
 
1302
@node Index,  , Projects, Top
1303
@unnumbered Index
1304
 
1305
@printindex cp
1306
@summarycontents
1307
@contents
1308
@bye
