\input texinfo  @c -*-texinfo-*-

@c NOTE THIS IS NOT A GOOD EXAMPLE OF HOW TO DO A MANUAL. FIXME!!!
@c NOTE THIS IS NOT A GOOD EXAMPLE OF HOW TO DO A MANUAL. FIXME!!!


@c %**start of header
@setfilename treelang.info

@include gcc-common.texi

@set copyrights-treelang 1995,1996,1997,1998,1999,2000,2001,2002,2003,2004,2005

@set email-general gcc@@gcc.gnu.org
@set email-bugs gcc-bugs@@gcc.gnu.org or bug-gcc@@gnu.org
@set email-patches gcc-patches@@gcc.gnu.org
@set path-treelang gcc/gcc/treelang

@set which-treelang GCC-@value{version-GCC}
@set which-GCC GCC

@set email-josling tej@@melbpc.org.au
@set www-josling http://www.geocities.com/timjosling

@c This tells @include'd files that they're part of the overall TREELANG doc
@c set.  (They might be part of a higher-level doc set too.)
@set DOC-TREELANG

@c @setfilename usetreelang.info
@c @setfilename maintaintreelang.info
@c To produce the full manual, use the "treelang.info" setfilename, and
@c make sure the following do NOT begin with '@c' (and the @clear lines DO)
@set INTERNALS
@set USING
@c To produce a user-only manual, use the "usetreelang.info" setfilename, and
@c make sure the following does NOT begin with '@c':
@c @clear INTERNALS
@c To produce a maintainer-only manual, use the "maintaintreelang.info" setfilename,
@c and make sure the following does NOT begin with '@c':
@c @clear USING

@ifset INTERNALS
@ifset USING
@settitle Using and Maintaining GNU Treelang
@end ifset
@end ifset
@c seems reasonable to assume at least one of INTERNALS or USING is set...
@ifclear INTERNALS
@settitle Using GNU Treelang
@end ifclear
@ifclear USING
@settitle Maintaining GNU Treelang
@end ifclear
@c then again, have some fun
@ifclear INTERNALS
@ifclear USING
@settitle Doing Very Little at all with GNU Treelang
@end ifclear
@end ifclear

@syncodeindex fn cp
@syncodeindex vr cp
@c %**end of header

@c Cause even numbered pages to be printed on the left hand side of
@c the page and odd numbered pages to be printed on the right hand
@c side of the page.  Using this, you can print on both sides of a
@c sheet of paper and have the text on the same part of the sheet.

@c The text on right hand pages is pushed towards the right hand
@c margin and the text on left hand pages is pushed toward the left
@c hand margin.
@c (To provide the reverse effect, set bindingoffset to -0.75in.)

@c @tex
@c \global\bindingoffset=0.75in
@c \global\normaloffset =0.75in
@c @end tex

@copying
Copyright @copyright{} @value{copyrights-treelang} Free Software Foundation, Inc.

Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.2 or
any later version published by the Free Software Foundation; with the
Invariant Sections being ``GNU General Public License'', the Front-Cover
texts being (a) (see below), and with the Back-Cover Texts being (b)
(see below).  A copy of the license is included in the section entitled
``GNU Free Documentation License''.

(a) The FSF's Front-Cover Text is:

     A GNU Manual

(b) The FSF's Back-Cover Text is:

     You have freedom to copy and modify this GNU Manual, like GNU
     software.  Copies published by the Free Software Foundation raise
     funds for GNU development.
@end copying

@ifnottex
@dircategory Software development
@direntry
* treelang: (treelang).                  The GNU Treelang compiler.
@end direntry
@ifset INTERNALS
@ifset USING
This file documents the use and the internals of the GNU Treelang
(@code{treelang}) compiler.  At the moment this manual is not
incorporated into the main GCC manual as it is incomplete.  It
corresponds to the @value{which-treelang} version of @code{treelang}.
@end ifset
@end ifset
@ifclear USING
This file documents the internals of the GNU Treelang (@code{treelang}) compiler.
It corresponds to the @value{which-treelang} version of @code{treelang}.
@end ifclear
@ifclear INTERNALS
This file documents the use of the GNU Treelang (@code{treelang}) compiler.
It corresponds to the @value{which-treelang} version of @code{treelang}.
@end ifclear

Published by the Free Software Foundation
51 Franklin Street, Fifth Floor
Boston, MA 02110-1301 USA

@insertcopying
@end ifnottex

@setchapternewpage odd
@c @finalout
@titlepage
@ifset INTERNALS
@ifset USING
@title Using and Maintaining GNU Treelang
@end ifset
@end ifset
@ifclear INTERNALS
@title Using GNU Treelang
@end ifclear
@ifclear USING
@title Maintaining GNU Treelang
@end ifclear
@versionsubtitle
@author Tim Josling
@page
@vskip 0pt plus 1filll
Published by the Free Software Foundation @*
51 Franklin Street, Fifth Floor@*
Boston, MA 02110-1301, USA@*
@c Last printed ??ber, 19??.@*
@c Printed copies are available for $? each.@*
@c ISBN ???
@sp 1
@insertcopying
@end titlepage
@page

@ifnottex

@node Top, Copying,, (dir)
@top Introduction
@cindex Introduction

@ifset INTERNALS
@ifset USING
This manual documents how to run, install and maintain @code{treelang}.
It also documents the features and incompatibilities in the @value{which-treelang}
version of @code{treelang}.
@end ifset
@end ifset

@ifclear INTERNALS
This manual documents how to run and install @code{treelang}.
It also documents the features and incompatibilities in the @value{which-treelang}
version of @code{treelang}.
@end ifclear
@ifclear USING
This manual documents how to maintain @code{treelang}.
It also documents the features and incompatibilities in the @value{which-treelang}
version of @code{treelang}.
@end ifclear

@end ifnottex

@menu
* Copying::
* Contributors::
* GNU Free Documentation License::
* Funding::
* Getting Started::
* What is GNU Treelang?::
* Lexical Syntax::
* Parsing Syntax::
* Compiler Overview::
* TREELANG and GCC::
* Compiler::
* Other Languages::
* treelang internals::
* Open Questions::
* Bugs::
* Service::
* Projects::
* Index::

@detailmenu
 --- The Detailed Node Listing ---

Other Languages

* Interoperating with C and C++::

treelang internals

* treelang files::
* treelang compiler interfaces::
* Hints and tips::

treelang compiler interfaces

* treelang driver::
* treelang main compiler::

treelang main compiler

* Interfacing to toplev.c::
* Interfacing to the garbage collection::
* Interfacing to the code generation code. ::

Reporting Bugs

* Sending Patches::

@end detailmenu
@end menu

@include gpl.texi

@include fdl.texi

@node Contributors

@unnumbered Contributors to GNU Treelang
@cindex contributors
@cindex credits

Treelang was based on 'toy' by Richard Kenner, and also uses code from
the GCC core code tree.  Tim Josling first created the language and
documentation, based on the GCC Fortran compiler's documentation
framework.  Treelang was updated to use the TreeSSA infrastructure by
James A. Morrison.

@itemize @bullet
@item
The packaging and compiler portions of GNU Treelang are based largely
on the GCC compiler.
@xref{Contributors,,Contributors to GCC,GCC,Using and Maintaining GCC},
for more information.

@item
There is no specific run-time library for treelang, other than the
standard C runtime.

@item
It would have been difficult to build treelang without access to Joachim
Nadler's guide to writing a front end to GCC (written in German).  A
translation of this document into English is available via the
CobolForGCC project or via the documentation links from the GCC home
page @uref{http://gcc.gnu.org}.
@end itemize

@include funding.texi

@node Getting Started
@chapter Getting Started
@cindex getting started
@cindex new users
@cindex newbies
@cindex beginners

Treelang is a sample language, useful only to help people understand how
to implement a new language front end to GCC.  It is not a useful
language in itself other than as an example or basis for building a new
language.  Therefore only language developers are likely to have an
interest in it.

This manual assumes familiarity with GCC, which you can obtain by using
it and by reading the manuals @samp{Using the GNU Compiler Collection (GCC)}
and @samp{GNU Compiler Collection (GCC) Internals}.

To install treelang, follow the GCC installation instructions,
taking care to ensure you specify treelang in the configure step by adding
treelang to the list of languages specified by @option{--enable-languages},
e.g.@: @samp{--enable-languages=all,treelang}.

If you're generally curious about the future of
@code{treelang}, see @ref{Projects}.
If you're curious about its past,
see @ref{Contributors}.

To see a few of the questions maintainers of @code{treelang} have,
and that you might be able to answer,
see @ref{Open Questions}.

@ifset USING
@node What is GNU Treelang?, Lexical Syntax, Getting Started, Top
@chapter What is GNU Treelang?
@cindex concepts, basic
@cindex basic concepts

GNU Treelang, or @code{treelang}, is designed initially as a free
replacement for, or alternative to, the 'toy' language, but which is
amenable to inclusion within the GCC source tree.

@code{treelang} is largely a cut down version of C, designed to showcase
the features of the GCC code generation back end.  Only those features
that are directly supported by the GCC code generation back end are
implemented.  Features are implemented in a manner which is easiest and
clearest to implement.  Not all or even most code generation back end
features are implemented.  The intention is to add features incrementally
until most features of the GCC back end are implemented in treelang.

The main features missing are structures, arrays and pointers.

A sample program follows:

@smallexample
// @r{function prototypes}
// @r{function 'add' taking two ints and returning an int}
external_definition int add(int arg1, int arg2);
external_definition int subtract(int arg3, int arg4);
external_definition int first_nonzero(int arg5, int arg6);
external_definition int double_plus_one(int arg7);

// @r{function definition}
add
@{
  // @r{return the sum of arg1 and arg2}
  return arg1 + arg2;
@}


subtract
@{
  return arg3 - arg4;
@}

double_plus_one
@{
  // @r{aaa is a variable, of type integer and allocated at the start of}
  // @r{the function}
  automatic int aaa;
  // @r{set aaa to the value returned from add, when passed arg7 and arg7 as}
  // @r{the two parameters}
  aaa=add(arg7, arg7);
  aaa=add(aaa, aaa);
  aaa=subtract(subtract(aaa, arg7), arg7) + 1;
  return aaa;
@}

first_nonzero
@{
  // @r{C-like if statement}
  if (arg5)
    @{
      return arg5;
    @}
  else
    @{
    @}
  return arg6;
@}
@end smallexample

@node Lexical Syntax, Parsing Syntax, What is GNU Treelang?, Top
@chapter Lexical Syntax
@cindex Lexical Syntax

Treelang programs consist of whitespace, comments, keywords and names.
@itemize @bullet

@item
Whitespace consists of the space character, a tab, and the end of line
character.  Line terminations are as defined by the
standard C library.  Whitespace is ignored except within comments,
and where it separates parts of the program.  In the example below, A and
B are two separate names separated by whitespace.

@smallexample
A B
@end smallexample

@item
Comments consist of @samp{//} followed by any characters up to the end
of the line.  C style comments (/* */) are not supported.  For example,
the assignment below is followed by a not very helpful comment.

@smallexample
x = 1; // @r{Set X to 1}
@end smallexample

@item
Keywords consist of any of the following reserved words or symbols:

@itemize @bullet
@item @{
used to start the statements in a function
@item @}
used to end the statements in a function
@item (
start list of function arguments, or to change the precedence of operators in
an expression
@item )
end list or prioritized operators in expression
@item ,
used to separate parameters in a function prototype or in a function call
@item ;
used to end a statement
@item +
addition, or unary plus for signed literals
@item -
subtraction, or unary minus for signed literals
@item =
assignment
@item ==
equality test
@item if
begin IF statement
@item else
begin 'else' portion of IF statement
@item static
indicate variable is permanent, or function has file scope only
@item automatic
indicate that variable is allocated for the life of the current scope
@item external_reference
indicate that variable or function is defined in another file
@item external_definition
indicate that variable or function is to be accessible from other files
@item int
variable is an integer (same as C int)
@item char
variable is a character (same as C char)
@item unsigned
variable is unsigned. If this is not present, the variable is signed
@item return
start function return statement
@item void
used as function type to indicate function returns nothing
@end itemize


@item
Names consist of any letter or "_" followed by any number of letters,
numbers, or "_".  "$" is not allowed in a name.  All names must be globally
unique, i.e. may not be used twice in any context, and must
not be a keyword.  Names and keywords are case sensitive.  For example:

@smallexample
a A _a a_ IF_X
@end smallexample

are all different names.

@end itemize
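
As an illustrative sketch only (it has not been checked against
@code{tree1}, and the names @code{same_or_second}, @code{left_val},
@code{right_val}, @code{tmp} and @code{TMP} are invented for this
example), the fragment below combines the lexical elements described
above: a comment, several keywords, and two names that differ only in
case.

@smallexample
// @r{a comment runs from the two slashes to the end of the line}
external_definition int same_or_second (int left_val, int right_val);

same_or_second
@{
  // @r{tmp and TMP are different names because names are case sensitive}
  automatic int tmp;
  automatic int TMP;
  tmp = left_val;
  TMP = right_val;
  if (tmp == TMP)
    @{
      return tmp;
    @}
  else
    @{
    @}
  return TMP;
@}
@end smallexample

Every name in the fragment is used for one purpose only, in keeping with
the rule that names must be globally unique.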

@node Parsing Syntax, Compiler Overview, Lexical Syntax, Top
@chapter Parsing Syntax
@cindex Parsing Syntax

Declarations are built up from the lexical elements described above.  A
file may contain one or more declarations.

@itemize @bullet

@item
declaration: variable declaration OR function prototype OR function declaration

@item
Function Prototype: storage type NAME ( optional_parameter_list );

@smallexample
static int add (int a, int b);
@end smallexample

@item
variable_declaration: storage type NAME initial;

Example:

@smallexample
static int temp1 = 1;
@end smallexample

A variable declaration can be outside a function, or at the start of a
function.

@item
storage: automatic OR static OR external_reference OR external_definition

This defines the scope, duration and visibility of a function or variable.

@enumerate 1

@item
automatic: This means a variable is allocated at start of the current scope and
released when the current scope is exited.  This can only be used for variables
within functions.  It cannot be used for functions.

@item
static: This means a variable is allocated at start of program and
remains allocated until the program as a whole ends.  For a function, it
means that the function is only visible within the current file.

@item
external_definition: For a variable, which must be defined outside a
function, it means that the variable is visible from other files.  For a
function, it means that the function is visible from another file.

@item
external_reference: For a variable, which must be defined outside a
function, it means that the variable is defined in another file.  For a
function, it means that the function is defined in another file.

@end enumerate

@item
type: int OR unsigned int OR char OR unsigned char OR void

This defines the data type of a variable or the return type of a function.

@enumerate a

@item
int: The variable is a signed integer.  The function returns a signed integer.

@item
unsigned int: The variable is an unsigned integer.  The function returns an unsigned integer.

@item
char: The variable is a signed character.  The function returns a signed character.

@item
unsigned char: The variable is an unsigned character.  The function returns an unsigned character.

@end enumerate

@item
parameter_list: parameter [, parameter]...

@item
parameter: variable_declaration

The variable declarations must not have initializations.

@item
initial: = value

@item
value: integer_constant

Values without a unary plus or minus are considered to be unsigned.
@smallexample
e.g.@: 1 +2 -3
@end smallexample

@item
function_declaration: name @{ variable_declarations statements @}

A function consists of the function name then the declarations (if any)
and statements (if any) within one pair of braces.

The details of the function arguments come from the function
prototype.  The function prototype must precede the function declaration
in the file.

@item
statement: if_statement OR expression_statement OR return_statement

@item
if_statement: if ( expression ) @{ variable_declarations statements @}
else @{ variable_declarations statements @}

The first lot of statements is executed if the expression is
nonzero.  Otherwise the second lot of statements is executed.  Either
list of statements may be empty, but both sets of braces and the else must be present.

@smallexample
if (a==b)
@{
// @r{nothing}
@}
else
@{
a=b;
@}
@end smallexample

@item
expression_statement: expression;

The expression is executed, including any side effects.

@item
return_statement: return expression_opt;

Returns from the function. If the function is void, the expression must
be absent, and if the function is not void the expression must be
present.

@item
expression: variable OR integer_constant OR expression + expression
OR expression - expression OR expression == expression OR ( expression )
OR variable = expression OR function_call

An expression can be a constant or a variable reference or a
function_call.  Expressions can be combined as a sum of two expressions
or the difference of two expressions, or an equality test of two
expressions.  An assignment is also an expression.  Expressions and operator
precedence work as in C.

@item
function_call: function_name ( optional_comma_separated_expressions )

This invokes the function, passing to it the values of the expressions
as actual parameters.

@end itemize
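
The productions above can be put together into one small file.  This is
a sketch written from the rules stated in this chapter, not a program
that has been compiled with @code{tree1}.  All of the names in it
(@code{call_count}, @code{helper}, @code{note_call}, @code{checked} and
their parameters) are hypothetical, and @code{helper} is assumed to be
defined in some other file.

@smallexample
// @r{variable_declaration: storage, type, name and initial value}
static int call_count = 0;

// @r{function prototypes; each must precede its function declaration}
external_reference int helper (int helper_arg);
static void note_call (int note_arg);
external_definition int checked (int check_val, int check_limit);

// @r{function_declaration: the name, then statements within braces}
note_call
@{
  call_count = call_count + 1;
  // @r{a void function returns without an expression}
  return;
@}

checked
@{
  // @r{variable declarations come at the start of the function}
  automatic int result;
  // @r{expression_statement consisting of a function_call}
  note_call (check_val);
  // @r{if_statement: both sets of braces and the else are required}
  if (check_val == check_limit)
    @{
      result = helper (check_val);
    @}
  else
    @{
      result = check_val - check_limit;
    @}
  return result;
@}
@end smallexample

Each parameter has its own name even across functions, because names
must be globally unique.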

@cindex compilers
@node Compiler Overview, TREELANG and GCC, Parsing Syntax, Top
@chapter Compiler Overview
treelang is run as part of the GCC compiler.

@itemize @bullet
@cindex source code
@cindex file, source
@cindex code, source
@cindex source file
@item
It reads a user's program, stored in a file and containing instructions
written in the appropriate language (Treelang, C, and so on).  This file
contains @dfn{source code}.

@cindex translation of user programs
@cindex machine code
@cindex code, machine
@cindex mistakes
@item
It translates the user's program into instructions a computer can carry
out more quickly than it takes to translate the instructions in the
first place.  These instructions are called @dfn{machine code}---code
designed to be efficiently translated and processed by a machine such as
a computer.  Humans usually aren't as good at writing machine code as they
are at writing Treelang or C, because it is easy to make tiny mistakes
writing machine code.  When writing Treelang or C, it is easy to make
big mistakes. But you can only make one mistake, because the compiler
stops after it finds any problem.

@cindex debugger
@cindex bugs, finding
@cindex @code{gdb}, command
@cindex commands, @code{gdb}
@item
It provides information in the generated machine code
that can make it easier to find bugs in the program
(using a debugging tool, called a @dfn{debugger},
such as @code{gdb}).

@cindex libraries
@cindex linking
@cindex @code{ld} command
@cindex commands, @code{ld}
@item
It locates and gathers machine code already generated to perform actions
requested by statements in the user's program.  This machine code is
organized into @dfn{libraries} and is located and gathered during the
@dfn{link} phase of the compilation process.  (Linking often is thought
of as a separate step, because it can be directly invoked via the
@code{ld} command.  However, the @code{gcc} command, as with most
compiler commands, automatically performs the linking step by calling on
@code{ld} directly, unless asked to not do so by the user.)

@cindex language, incorrect use of
@cindex incorrect use of language
@item
It attempts to diagnose cases where the user's program contains
incorrect usages of the language.  The @dfn{diagnostics} produced by the
compiler indicate the problem and the location in the user's source file
where the problem was first noticed.  The user can use this information
to locate and fix the problem.

The compiler stops after the first error.  There are no plans to fix
this, ever, as it would vastly complicate the implementation of treelang
to little or no benefit.

@cindex diagnostics, incorrect
@cindex incorrect diagnostics
@cindex error messages, incorrect
@cindex incorrect error messages
(Sometimes an incorrect usage of the language leads to a situation where
the compiler cannot make any sense of what it reads---while a human
might be able to---and thus ends up complaining about an incorrect
``problem'' it encounters that, in fact, reflects a misunderstanding of
the programmer's intention.)

@cindex warnings
@cindex questionable instructions
@item
There are a few warnings in treelang.  For example, an unused static function
generates a warning when @option{-Wunused-function} is specified; similarly, an
unused static variable generates a warning when @option{-Wunused-variable} is
specified.  The only treelang-specific warning is issued when an expression
appears in a return statement of a function that returns void.
@end itemize
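
As a hedged illustration of the generic warnings mentioned in the last
item above, the following file declares a static function and a static
variable that are never used.  The names @code{spare}, @code{spare_arg}
and @code{spare_var} are invented for this sketch, and the file has not
been compiled here.  With @option{-Wunused-function} and
@option{-Wunused-variable} specified, each declaration would be expected
to draw the corresponding warning; the message text is not reproduced
because it depends on the GCC version.

@smallexample
// @r{never called in this file: candidate for -Wunused-function}
static int spare (int spare_arg);

// @r{never referenced in this file: candidate for -Wunused-variable}
static int spare_var = 1;

spare
@{
  return spare_arg;
@}
@end smallexample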

@cindex components of treelang
@cindex @code{treelang}, components of
@code{treelang} consists of several components:

@cindex @code{gcc}, command
@cindex commands, @code{gcc}
@itemize @bullet
@item
A modified version of the @code{gcc} command, which also might be
installed as the system's @code{cc} command.
(In many cases, @code{cc} refers to the
system's ``native'' C compiler, which
might be a non-GNU compiler, or an older version
of @code{GCC} considered more stable or that is
used to build the operating system kernel.)

@cindex @code{treelang}, command
@cindex commands, @code{treelang}
@item
The @code{treelang} command itself.

@item
The @code{libc} run-time library.  This library contains the machine
code needed to support capabilities of the Treelang language that are
not directly provided by the machine code generated by the
@code{treelang} compilation phase.  This is the same library that the
main C compiler uses (libc).

@cindex @code{tree1}, program
@cindex programs, @code{tree1}
@cindex assembler
@cindex @code{as} command
@cindex commands, @code{as}
@cindex assembly code
@cindex code, assembly
@item
The compiler itself is internally named @code{tree1}.

Note that @code{tree1} does not generate machine code directly---it
generates @dfn{assembly code} that is a more readable form
of machine code, leaving the conversion to actual machine code
to an @dfn{assembler}, usually named @code{as}.
@end itemize

@code{GCC} is often thought of as ``the C compiler'' only,
but it does more than that.
Based on command-line options and the names given for files
on the command line, @code{gcc} determines which actions to perform, including
preprocessing, compiling (in a variety of possible languages), assembling,
and linking.

@cindex driver, gcc command as
@cindex @code{gcc}, command as driver
@cindex executable file
@cindex files, executable
@cindex cc1 program
@cindex programs, cc1
@cindex preprocessor
@cindex cpp program
@cindex programs, cpp
For example, the command @samp{gcc foo.c} @dfn{drives} the file
@file{foo.c} through the preprocessor @code{cpp}, then
the C compiler (internally named
@code{cc1}), then the assembler (usually @code{as}), then the linker
(@code{ld}), producing an executable program named @file{a.out} (on
UNIX systems).

@cindex treelang program
@cindex programs, treelang
As another example, the command @samp{gcc foo.tree} would do much the
same as @samp{gcc foo.c}, but instead of using the C compiler named
@code{cc1}, @code{gcc} would use the treelang compiler (named
@code{tree1}). However there is no preprocessor for treelang.

@cindex @code{tree1}, program
@cindex programs, @code{tree1}
In a GNU Treelang installation, @code{gcc} recognizes Treelang source
files by name just like it does C and C++ source files.  It knows to use
the Treelang compiler named @code{tree1}, instead of @code{cc1} or
@code{cc1plus}, to compile Treelang files.  If a file's name ends in
@code{.tree} then GCC knows that the program is written in treelang.  You
can also manually override the language.

@cindex @code{gcc}, not recognizing Treelang source
@cindex unrecognized file format
@cindex file format not recognized
Non-Treelang-related operation of @code{gcc} is generally
unaffected by installing the GNU Treelang version of @code{gcc}.
However, without the installed version of @code{gcc} being the
GNU Treelang version, @code{gcc} will not be able to compile
and link Treelang programs.

@cindex printing version information
@cindex version information, printing
The command @samp{gcc -v x.tree} where @samp{x.tree} is a file which
must exist but whose contents are ignored, is a quick way to display
version information for the various programs used to compile a typical
Treelang source file.

The @code{tree1} program represents most of what is unique to GNU
Treelang; @code{tree1} is a combination of two rather large chunks of
code.

@cindex GCC Back End (GBE)
@cindex GBE
@cindex @code{GCC}, back end
@cindex back end, GCC
@cindex code generator
One chunk is the so-called @dfn{GNU Back End}, or GBE,
which knows how to generate fast code for a wide variety of processors.
The same GBE is used by the C, C++, and Treelang compiler programs @code{cc1},
@code{cc1plus}, and @code{tree1}, plus others.
Often the GBE is referred to as the ``GCC back end'' or
even just ``GCC''---in this manual, the term GBE is used
whenever the distinction is important.

@cindex GNU Treelang Front End (TFE)
@cindex tree1
@cindex @code{treelang}, front end
@cindex front end, @code{treelang}
The other chunk of @code{tree1} is the majority of what is unique about
GNU Treelang---the code that knows how to interpret Treelang programs to
determine what they are intending to do, and then communicate that
knowledge to the GBE for actual compilation of those programs.  This
chunk is called the @dfn{Treelang Front End} (TFE).  The @code{cc1} and
@code{cc1plus} programs have their own front ends, for the C and C++
languages, respectively.  These front ends are responsible for
diagnosing incorrect usage of their respective languages by the programs
they process, and are responsible for most of the warnings about
questionable constructs as well.  (The GBE in principle handles
producing some warnings, like those concerning possible references to
undefined variables, but these warnings should not occur in treelang
programs as the front end is meant to pick them up first.)

Because so much is shared among the compilers for various languages,
much of the behavior and many of the user-selectable options for these
compilers are similar.
For example, diagnostics (error messages and
warnings) are similar in appearance; command-line
options like @samp{-Wall} have generally similar effects; and the quality
of generated code (in terms of speed and size) is roughly similar
(since that work is done by the shared GBE).
859
 
860
@node TREELANG and GCC, Compiler, Compiler Overview, Top
861
@chapter Compile Treelang, C, or Other Programs
862
@cindex compiling programs
863
@cindex programs, compiling
864
 
865
@cindex @code{gcc}, command
866
@cindex commands, @code{gcc}
867
A GNU Treelang installation includes a modified version of the @code{gcc}
868
command.
869
 
870
In a non-Treelang installation, @code{gcc} recognizes C, C++,
871
and Objective-C source files.
872
 
873
In a GNU Treelang installation, @code{gcc} also recognizes Treelang source
874
files and accepts Treelang-specific command-line options, plus some
875
command-line options that are designed to cater to Treelang users
876
but apply to other languages as well.
877
 
878
@xref{G++ and GCC,,Programming Languages Supported by GCC,GCC,Using
879
the GNU Compiler Collection (GCC)},
880
for information on the way different languages are handled
881
by the GCC compiler (@code{gcc}).
882
 
883
You can use this, combined with the output of the @samp{gcc -v x.tree}
884
command to get the options applicable to treelang.  Treelang programs
885
must end with the suffix @samp{.tree}.
886
 
887
@cindex preprocessor

Treelang programs are not run through the C preprocessor by @code{gcc}
by default.  Nothing stops you from running them through the
preprocessor manually, but you must use the @samp{-P} option to prevent
the preprocessor from generating @code{#line} directives; otherwise
@code{tree1} will not accept the input.

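
The effect of @samp{-P} can be checked on a throwaway file before you
rely on it.  The sketch below is not a treelang program: the file name
@file{demo.in} and its contents are made up purely to exercise the
preprocessor, and it assumes that the @code{cpp} command is on your path.

@example
# A throwaway input file; its contents are only there to exercise cpp.
printf '#define LIMIT 10\nLIMIT\n' > demo.in

# Without -P the output carries line markers (lines starting with '#').
cpp < demo.in | grep -c '^#'

# With -P there are none, so this prints 0.
cpp -P < demo.in | grep -c '^#'
@end example

For a real program you would redirect the @samp{-P} output to a file
whose name ends in @file{.tree} and compile that file with @code{gcc}.
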
@node Compiler, Other Languages, TREELANG and GCC, Top
@chapter The GNU Treelang Compiler

The GNU Treelang compiler, @code{treelang}, supports programs written
in the GNU Treelang language.

@node Other Languages, treelang internals, Compiler, Top
@chapter Other Languages

@menu
* Interoperating with C and C++::
@end menu

@node Interoperating with C and C++,  , Other Languages, Other Languages
@section Tools and advice for interoperating with C and C++

The output of treelang programs looks like C program code to the linker
and everybody else, so you should be able to freely mix treelang and C
(and C++) code, with one proviso.

C promotes small integer types to @code{int} when they are used as
function parameters and return values in non-prototyped functions.
Since treelang has no non-prototyped functions, the treelang compiler
does not do this.  C code that shares such functions with treelang
should therefore declare them with full prototypes.

@ifset INTERNALS
@node treelang internals, Open Questions, Other Languages, Top
@chapter treelang internals

@menu
* treelang files::
* treelang compiler interfaces::
* Hints and tips::
@end menu

@node treelang files, treelang compiler interfaces, treelang internals, treelang internals
@section treelang files

To create a compiler that integrates into GCC, you need to create many
files.  Some of the files are integrated into the main GCC makefile, to
build the various parts of the compiler and to run the test
suite.  Others are incorporated into various GCC programs such as
@file{gcc.c}.  Finally you must provide the actual programs comprising your
compiler.

@cindex files

The files are:

@enumerate 1

@item
COPYING.  This is the copyright file, assuming you are going to use the
GNU General Public License.  You probably need to use the GPL because if
you use the GCC back end your program and the back end are one program,
and the back end is GPLed.

This need not be present if the language is incorporated into the main
GCC tree, as the main GCC directory has this file.

@item
COPYING.LIB.  This is the copyright file for those parts of your program
that are not to be covered by the GPL, but are instead to be covered by
the LGPL (Library or Lesser GPL).  This license may be appropriate for
the library routines associated with your compiler. These are the
routines that are linked with the @emph{output} of the compiler.  Using
the LGPL for these programs allows programs written using your compiler
to be closed source. For example LIBC is under the LGPL.

This need not be present if the language is incorporated into the main
GCC tree, as the main GCC directory has this file.

@item
ChangeLog.  Record all the changes to your compiler.  Use the same format
as used in treelang as it is supported by an emacs editing mode and is
part of the FSF coding standard.  Normally each directory has its own
changelog.  The FSF standard allows but does not require a meaningful
comment on why the changes were made, above and beyond @emph{what}
was changed.  In the author's opinion it is useful to provide this
information.

@item
treelang.texi.  The manual, written in texinfo.  Your manual would have a
different file name.  You need not write it in texinfo if you don't want
to, but a lot of GNU software does use texinfo.

@cindex Make-lang.in
@item
Make-lang.in.  This file is the part of the makefile which is incorporated
into the GCC makefile skeleton (@file{Makefile.in} in the GCC directory) to
make @file{Makefile}, as part of the configuration process.

@file{Makefile} in turn is the main instruction to actually build
everything.  The build instructions are held in the main GCC manual and
web site so they are not repeated here.

There are some comments at the top of @file{Make-lang.in} which will help
you understand what you need to do.

There are make commands to build things, remove generated files with
various degrees of thoroughness, count the lines of code (so you know
how much progress you are making), build info and html files from the
texinfo source, run the tests etc.

@item
README.  Just a brief informative text file saying what is in this
directory.

@cindex config-lang.in
@item
config-lang.in.  This file is read by the configuration process and must
be present.  You specify the name of your language, the name(s) of the
compiler(s), including preprocessors, that you are going to build, and
whether any files (usually generated ones) should be excluded from diffs
(i.e., when making diff files to send in patches).  Whether the equate
'stagestuff' is used is unknown (???).

@cindex lang.opt
@item
lang.opt.  This file is included into @file{gcc.c}, the main GCC driver, and
tells it what options your language supports.  This is also used to
display help.

@cindex lang-specs.h
@item
lang-specs.h.  This file is also included in @file{gcc.c}.  It tells
@file{gcc.c} when to call your programs and what options to send them.  The
mini-language 'specs' is documented in the source of @file{gcc.c}.  Do not
attempt to write a specs file from scratch - use an existing one as the base
and enhance it.

@item
Your texi files.  Texinfo can be used to build documentation in HTML,
info, dvi and postscript formats.  It is a tagged language, is documented
in its own manual, and has its own emacs mode.

@item
Your programs.  The relationships between all the programs are explained
in the next section.  You need to write or use the following programs:

@itemize @bullet

@item
lexer.  This breaks the input into words and passes these to the
parser.  This is @file{lex.l} in treelang, which is passed through flex, a lex
variant, to produce C code @file{lex.c}.  Note there is a school of thought
that says real men hand code their own lexers.  However, you may prefer to
write far less code and use flex, as was done with treelang.

@item
parser.  This breaks the program into recognizable constructs such as
expressions, statements etc.  This is @file{parse.y} in treelang, which is
passed through bison, which is a yacc variant, to produce C code
@file{parse.c}.

@item
back end interface.  This interfaces to the code generation back end.  In
treelang, this is @file{tree1.c}, which mainly interfaces to @file{toplev.c}, and
@file{treetree.c}, which mainly interfaces to everything else.  Many languages
mix up the back end interface with the parser, as in the C compiler for
example.  It is a matter of taste which way to do it, but with treelang
it is separated out to make the back end interface cleaner and easier to
understand.

@item
header files.  For function prototypes and common data items.  One point
to note here is that bison can generate a header file with all the
numbers it has assigned to the keywords and symbols, and you can include
the same header in your lexer.  This technique is demonstrated in
treelang.

@item
compiler main file.  GCC comes with a file @file{toplev.c} which is a
perfectly serviceable main program for your compiler.  GNU Treelang uses
@file{toplev.c} but other languages have been known to replace it with their
own main program.  Again this is a matter of taste and how much code you
want to write.

@end itemize

@end enumerate

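
The shared-header technique mentioned under ``header files'' can be
sketched with the commands below.  They are not copied from treelang's
@file{Make-lang.in}; the output names are chosen here only to match the
file names used in this section, and the real build may pass different
options.

@example
# Generate parse.c and, because of --defines, a header parse.h that
# holds the token numbers bison assigned.
bison --defines --output=parse.c parse.y

# Generate the lexer; lex.l can then #include "parse.h" so that it
# returns the same token numbers the parser expects.
flex --outfile=lex.c lex.l
@end example
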
@node treelang compiler interfaces, Hints and tips, treelang files, treelang internals
@section treelang compiler interfaces

@cindex driver
@cindex toplev.c

@menu
* treelang driver::
* treelang main compiler::
@end menu

@node treelang driver, treelang main compiler, treelang compiler interfaces, treelang compiler interfaces
@subsection treelang driver

The GCC compiler consists of a driver, which then executes the various
compiler phases based on the instructions in the specs files.

Typically a program's language will be identified from its suffix
(e.g., @file{.tree} for treelang programs).

The driver (@file{gcc.c}) will then drive (exec) in turn a preprocessor,
the main compiler, the assembler and the link editor.  Options to GCC allow you
to override all of this.  In the case of treelang programs there is no
preprocessor.  These days the C preprocessor mostly runs within the
main C compiler rather than as a separate process, apparently for reasons of speed.

You will be using the standard assembler and link editor so these are
ignored from now on.

You have to write your own preprocessor if you want one.  This is usually
totally language specific.  The main point to be aware of is to ensure
that you find some way to pass file name and line number information
through to the main compiler so that it can tell the back end this
information and so the debugger can find the right source line for each
piece of code.  That is all there is to say about the preprocessor except
that the preprocessor will probably not be the slowest part of the
compiler and will probably not use the most memory so don't waste too
much time tuning it until you know you need to do so.

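
The driver behaviour described in this subsection can be observed from
the command line.  The options below are standard @code{gcc} driver
options; @file{x.tree} is a placeholder file name, and the last two
commands only work with a @code{gcc} that was built with treelang
enabled.

@example
# Print the driver's built-in named specs.
gcc -dumpspecs

# Show each command the driver would run for x.tree, without running it.
gcc -### -c x.tree

# Run the compilation and echo each command as it is executed.
gcc -v -c x.tree
@end example
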
@node treelang main compiler,  , treelang driver, treelang compiler interfaces
@subsection treelang main compiler

The main compiler for treelang consists of @file{toplev.c} from the main GCC
compiler, the parser, lexer and back end interface routines, and the
back end routines themselves, of which there are many.

@file{toplev.c} does a lot of work for you and you should almost certainly
use it.

Writing this code is the hard part of creating a compiler using GCC.  The
back end interface documentation is incomplete and the interface is
complex.

There are three main aspects to interfacing to the other GCC code.

@menu
* Interfacing to toplev.c::
* Interfacing to the garbage collection::
* Interfacing to the code generation code. ::
@end menu

@node Interfacing to toplev.c, Interfacing to the garbage collection, treelang main compiler, treelang main compiler
@subsubsection Interfacing to toplev.c

In treelang this is handled mainly in @file{tree1.c}
and partly in @file{treetree.c}.  Peruse @file{toplev.c} for details of what you need
to do.

@node Interfacing to the garbage collection, Interfacing to the code generation code. , Interfacing to toplev.c, treelang main compiler
@subsubsection Interfacing to the garbage collection

In treelang this is mainly in @file{tree1.c}.

Memory allocation in the compiler should be done using the @code{ggc_alloc} and
kindred routines in @file{ggc*.*}.  At the end of every 'function' in your language, @file{toplev.c} calls
the garbage collector several times.  The garbage collector calls mark
routines which go through the memory which is still used, telling the
garbage collector not to free it.  Then all the memory not used is
freed.

What this means is that you need a way to hook into this marking
process.  This is done by calling @code{ggc_add_root}.  This provides the address
of a callback routine which will be called during garbage collection and
which can call @code{ggc_mark} to save the storage.  If storage is only
used within the parsing of a function, you do not need to provide a way
to mark it.

Note that you can also call @code{ggc_mark_tree} to mark any of the back end
internal 'tree' nodes.  This routine will follow the branches of the
trees and mark all the subordinate structures.  This is useful for
example when you have created a variable declaration that will be used
across multiple functions, or for a function declaration (from a
prototype) that may be used later on.  See the next section for more on the
tree nodes.

@node Interfacing to the code generation code. ,  , Interfacing to the garbage collection, treelang main compiler
@subsubsection Interfacing to the code generation code.

In treelang this is done in @file{treetree.c}.  A typedef called 'tree', which is
defined in @file{tree.h} and @file{tree.def} in the GCC directory and largely
implemented in @file{tree.c} and @file{stmt.c}, forms the basic interface to the
compiler back end.

In general you call various tree routines to generate code, either
directly or through @file{toplev.c}.  You build up data structures and
expressions in similar ways.

You can read some documentation on this which can be found via the GCC
main web page.  In particular, the documentation produced by Joachim
Nadler and translated by Tim Josling can be quite useful.  The C compiler
also has documentation in the main GCC manual (particularly the current
CVS version) which is useful on a lot of the details.

In time it is hoped to enhance this document to provide a more
comprehensive overview of this topic.  The main gap is in explaining how
it all works together.

@node Hints and tips,  , treelang compiler interfaces, treelang internals
@section Hints and tips

@itemize @bullet

@item
TAGS: Use the make ETAGS commands to create TAGS files which can be used in
emacs to jump to any symbol quickly.

@item
GREP: grep is also a useful way to find all uses of a symbol.

@item
TREE: The main files to look at are @file{tree.h} and @file{tree.def}.  You will
probably want a hardcopy of these.

@item
SAMPLE: look at the sample interfacing code in @file{treetree.c}.  You can use
gdb to trace through the code and learn about how it all works.

@item
GDB: the GCC back end works well with gdb.  It traps @code{abort()} and allows
you to trace back what went wrong.

@item
Error Checking: The compiler back end does some error and consistency
checking.  Often the result of an error is just no code being
generated.  You will then need to trace through and find out what is
going wrong.  The rtl dump files can help here also.

@item
rtl dump files: The main compiler manual documents these files, which are
dumps of the rtl (intermediate code) that is manipulated during the code
generation process.  This can provide useful clues about what is going
wrong.  The rtl 'language' is documented in the main GCC manual.

@end itemize

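
A few of the hints above, written out as commands.  This is only a
sketch: it assumes you are in the @file{gcc} source directory, that the
treelang compiler proper has been built as @file{tree1} in the current
directory, and that @file{x.tree} is a test program of your own.  How
@file{tree1} expects its arguments is not covered here, so treat the
@code{gdb} line as a starting point rather than a recipe.

@example
# TAGS: index the C sources so that emacs can jump to definitions.
etags *.c *.h treelang/*.c treelang/*.h

# GREP: list every use of a symbol, with line numbers.
grep -n ggc_mark_tree *.c treelang/*.c

# GDB: stop in abort, run the compiler, and print a backtrace.
gdb -ex 'break abort' -ex run -ex backtrace --args ./tree1 x.tree

# rtl dump files: ask for all the RTL dumps while compiling.
gcc -da -c x.tree
@end example
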
@end ifset

@node Open Questions, Bugs, treelang internals, Top
@chapter Open Questions

If you know GCC well, please consider looking at the file treetree.c and
resolving any questions marked "???".

@node Bugs, Service, Open Questions, Top
@chapter Reporting Bugs
@cindex bugs
@cindex reporting bugs

You can report bugs to @email{@value{email-bugs}}. Please make
sure bugs are real before reporting them. Follow the guidelines in the
main GCC manual for submitting bug reports.

@menu
* Sending Patches::
@end menu

@node Sending Patches,  , Bugs, Bugs
@section Sending Patches for GNU Treelang

If you would like to write bug fixes or improvements for the GNU
Treelang compiler, that is very helpful.  Send suggested fixes to
@email{@value{email-patches}}.

@node Service, Projects, Bugs, Top
@chapter How To Get Help with GNU Treelang

If you need help installing, using or changing GNU Treelang, there are two
ways to find it:

@itemize @bullet

@item
Look in the service directory for someone who might help you for a fee.
The service directory is found in the file named @file{SERVICE} in the
GCC distribution.

@item
Send a message to @email{@value{email-general}}.

@end itemize

@end ifset
@ifset INTERNALS

@node Projects, Index, Service, Top
@chapter Projects
@cindex projects

If you want to contribute to @code{treelang} by doing research,
design, specification, documentation, coding, or testing,
the following information should give you some ideas.

Send a message to @email{@value{email-general}} if you plan to add a
feature.

The main requirement for treelang is to add features and to add
documentation. Features are things that the GCC back end can do but
which are not reflected in treelang. Examples include structures,
unions, pointers, arrays.

@end ifset

@node Index,  , Projects, Top
@unnumbered Index

@printindex cp
@summarycontents
@contents
@bye
