\chapter{Tool Chain}
\label{tool_chain}

The \eco comes with tool programs that allow the development of software for it. The software package currently includes an assembler, C compiler, instruction-level simulator, and various support tools.

\section{Assembler ({\tt asld})}

The {\tt asld} tool assembles and links a set of files written in a custom assembler format to produce an executable binary. The binary uses either a custom segmented binary format, or a raw dump of the code and data sections. It is currently impossible to separate the assembler and linker stages.

\subsection{Command Line Interface}

{\bf Synopsis:}\\
{\tt asld [options] file [files ...]}

The {\tt asld} tool reads all files and interprets them according to a custom assembler format described below. The files are then assembled in the order specified on the command line to produce an executable. Various options control this process:

\begin{itemize}
\item {\tt \bf -h}: Generates a {\it headerless} binary that contains only a raw dump of the code and data sections in direct sequence, without any header.
\item {\tt \bf -o \it objfile}: Specifies the name of the generated binary.
\item {\tt \bf -m \it mapfile}: Specifies the name of a {\it map file} that is created in addition to the output binary. This map file contains a listing of the final global symbol table.
\item {\tt \bf -rc \it Address}: Specifies the (virtual) start address of the code section. This affects the target location of symbols in that section. It does not affect the position of the code section within the generated binary file. If this option is not specified, the start address of the code section defaults to 0.
\item {\tt \bf -rd \it Address}: Specifies the (virtual) start address of the data section. This affects the target location of symbols in that section. It does not affect the position of the data section within the generated binary file.
If this option is not specified, the start address of the data section defaults to the end of the code section, rounded up to the next 4~KB page boundary.
\item {\tt \bf -rb \it Address}: Specifies the (virtual) start address of the BSS section. This affects the target location of symbols in that section. It does not affect the position of the BSS section within the generated binary file. If this option is not specified, the start address of the BSS section defaults to the end of the data section, without any rounding.
\end{itemize}

\subsection{Assembling Model}

The assembler maintains the following state variables:

\begin{itemize}
\item Three sections, called {\it code}, {\it data}, and {\it BSS}. Each section consists of a byte array starting at index 0. The number of bytes in this array is the {\it size} of the section. The only way to modify a section is to append bytes at the end. Note that while the BSS is treated like the other sections, its contents are not written to the output file.
\item A symbol table. Each entry of this table maps an identifier to a (section, index) pair and thus points to a specific location in a specific section. As a special rule, the section of a symbol can be the special {\it absolute} section, meaning that the symbol is not relative to any defined section and is thus not relocated. The symbol table is split into a {\it local} and a {\it global} part for file-local and cross-file symbols (see below).
\item A {\it current section}, which is one of the three sections defined above. The special {\it absolute} section cannot be the current section.
\item Various control parameters.
\end{itemize}

At the beginning of the assembly process, all three sections are empty, the global symbol table is empty, the current section is the code section, and the control parameters are set to their default values. The assembler then begins to consume the input files one by one.
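
The state described above, together with the default section start addresses given for the {\tt -rc}, {\tt -rd}, and {\tt -rb} options, can be sketched in Python as follows. This is an illustrative model only and is not taken from the {\tt asld} sources: the names {\tt AssemblerState}, {\tt emit}, {\tt define}, and {\tt section\_starts} are hypothetical, and the sketch assumes that the ``end'' of a section is its start address plus its size and that a 4k page is 4096 bytes.

```python
PAGE_SIZE = 4096  # assumption: "4k page" means 4096 bytes


class AssemblerState:
    """Hypothetical model of the assembler state described in the text."""

    def __init__(self):
        # Three sections, each a byte array that only grows by appending.
        self.sections = {"code": bytearray(), "data": bytearray(), "bss": bytearray()}
        self.global_symbols = {}  # identifier -> (section, index)
        self.local_symbols = {}   # identifier -> (section, index)
        self.current = "code"     # the absolute section is never current

    def emit(self, data):
        """Append bytes at the end of the current section."""
        self.sections[self.current] += data

    def define(self, name):
        """Point a local symbol at the current end of the current section."""
        self.local_symbols[name] = (self.current, len(self.sections[self.current]))


def round_up(value, boundary):
    """Round value up to a multiple of boundary (boundary is a power of 2)."""
    return (value + boundary - 1) & ~(boundary - 1)


def section_starts(state, rc=None, rd=None, rb=None):
    """Start addresses of the sections; None models an option that was not given."""
    code = 0 if rc is None else rc
    code_end = code + len(state.sections["code"])
    data = round_up(code_end, PAGE_SIZE) if rd is None else rd
    bss = data + len(state.sections["data"]) if rb is None else rb
    return {"code": code, "data": data, "bss": bss}
```

For example, with 0x1234 bytes of code, 0x10 bytes of data, and no options given, this model places the data section at 0x2000 and the BSS section at 0x2010.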
For each file, the following steps are performed:

\begin{itemize}
\item Clear the local symbol table.
\item Set the current section to the code section. (This step is assumed rather than verified, but it would be the natural behavior.)
\item Reset some of the control parameters to their default values.
\item Process the file as described in the next section.
\end{itemize}

After all files have been consumed, symbols are relocated and back-patched. First, the start location of each section is determined either automatically or by the {\tt -rc}, {\tt -rd}, and {\tt -rb} command-line switches. The {\it relocated position} of a symbol is obtained by adding the start address of the symbol's section to the location of the symbol within its section. Symbols in the special {\it absolute} section use their section-local position as the relocated position, which is equivalent to saying that the start address of the {\it absolute} section is 0. The assembler then scans through all references to symbols in the assembled code and inserts the relocated address. Finally, the output binary is generated by writing the header (containing the section sizes; only if {\tt \bf -h} has not been specified) and the contents of the code and data sections.

\subsection{Input Format}

An assembler input file is a sequence of {\it labels}, {\it instructions}, and {\it processing directives}. Each of them modifies the assembler state defined in the previous section:

\begin{itemize}
\item A {\it label} creates an entry in the local symbol table. It is specified as an identifier, followed by a colon character. This identifier names the entry that is created in the local symbol table. The target location of the symbol is the current section and the current location in that section.
\item An instruction is a simple identifier that is one of the instruction mnemonics of the \ecox, followed by the operands of that instruction.
For convenience, the non-immediate mnemonic may be used with an immediate operand to specify the immediate instruction, such as ADD for ADDI. Register operands are specified by a dollar sign, followed by the number of the register. Immediate operands are specified as a simple number. Jump targets are specified by a label identifier. Operands must be comma-separated. The specified instruction is assembled at the current location in the current section (usually, but not necessarily, the code section). The control parameters may be set up to allow {\it synthesized instructions}. These look like single instructions in the input file, but are actually assembled as short instruction sequences. Synthesized instructions exist purely for convenience when writing assembler code manually.
\item A processing directive starts with a dot, followed by the name of the directive. The following directives exist:
\begin{itemize}
\item {\tt \bf .syn}: Enables synthesized instructions.
\item {\tt \bf .nosyn}: Disables synthesized instructions.
\item {\tt \bf .code}: Makes the code section the current section.
\item {\tt \bf .data}: Makes the data section the current section.
\item {\tt \bf .bss}: Makes the BSS section the current section.
\item {\tt \bf .export}: Creates a global symbol table entry from a local one. This directive expects a list of symbol names, all of which are exported.
\item {\tt \bf .import}: Creates a local symbol table entry by importing a global symbol. The corresponding global symbol must be defined in past or future assembler input within the same assembler run, otherwise an error occurs. This directive expects a list of symbol names, all of which are imported.
\item {\tt \bf .align}: Inserts padding bytes for half-word or word alignment. Formally, this directive expects a number argument which must be a power of 2, and inserts zeroed bytes into the current section until the current position in the current section is a multiple of that number.
The result is undefined if the specified number is not a power of 2. This directive is typically used directly before half-word or word sized variables are emitted, because access to these variables must be aligned to the access size. As an example, ``{\tt \bf .align 2}'' inserts a zeroed byte if the current position is an odd position, and thus aligns the current position to generate a half-word variable. Similarly, ``{\tt \bf .align 4}'' aligns the current position for word-sized variables.
\item {\tt \bf .space}: This directive expects a number argument and emits that number of zeroed bytes to the current section.
\item {\tt \bf .locate}: This directive expects a number argument and emits zeroed bytes to the current section until the current position in the current section is equal to that number. The specified number {\it must not} be less than the current position in the current section, otherwise {\tt asld} will crash.
\item {\tt \bf .byte}: Emits a single byte to the current section whose value is the argument to this directive.
\item {\tt \bf .half}: Emits two bytes to the current section whose value is the argument to this directive in big-endian representation. The {\tt .half} directive can emit half-words at unaligned memory locations; however, the \eco will not be able to read them as half-word units.
\item {\tt \bf .word}: Emits four bytes to the current section whose value is the argument to this directive in big-endian representation. The {\tt .word} directive can emit words at unaligned memory locations; however, the \eco will not be able to read them as word units.
\item {\tt \bf .set}: This directive expects an identifier and numeric value as its arguments, and creates a symbol in the special {\it absolute} section with that identifier and value.
\end{itemize}
\end{itemize}

\subsection{Output Format}

The output format of {\tt asld} is a single file that consists of a {\it header} and a {\it body}.
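
The effect of the data-emitting directives from the Input Format subsection ({\tt .align}, {\tt .space}, {\tt .locate}, {\tt .byte}, {\tt .half}, {\tt .word}) on the current section can be illustrated with a small Python sketch. The function names are hypothetical and the code is not derived from {\tt asld}. Where the text leaves behavior undefined (a non-power-of-2 alignment) or says the tool crashes (a {\tt .locate} target behind the current position), the sketch raises an error instead, and it truncates values to the emitted width, which is an assumption.

```python
def align(section, boundary):
    """.align: pad with zero bytes until len(section) is a multiple of boundary."""
    if boundary <= 0 or boundary & (boundary - 1):
        # The manual calls this case undefined; the sketch rejects it.
        raise ValueError("alignment must be a power of 2")
    while len(section) % boundary:
        section.append(0)


def space(section, count):
    """.space: emit count zero bytes."""
    section.extend(bytes(count))


def locate(section, position):
    """.locate: emit zero bytes until the current position equals position."""
    if position < len(section):
        # The manual says asld crashes here; the sketch raises instead.
        raise ValueError("cannot locate behind the current position")
    section.extend(bytes(position - len(section)))


def byte(section, value):
    """.byte: emit one byte."""
    section.append(value & 0xFF)


def half(section, value):
    """.half: emit two bytes, big-endian (no alignment is enforced)."""
    section.extend((value & 0xFFFF).to_bytes(2, "big"))


def word(section, value):
    """.word: emit four bytes, big-endian (no alignment is enforced)."""
    section.extend((value & 0xFFFFFFFF).to_bytes(4, "big"))
```

Starting from an empty {\tt bytearray}, the sequence {\tt byte 0x11}, {\tt align 2}, {\tt half 0x2233}, {\tt align 4}, {\tt word 0x44556677} yields the bytes {\tt 11 00 22 33 44 55 66 77}: one padding byte is inserted before the half-word, and none before the word because the position is already a multiple of 4.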
If the {\tt \bf -h} option is specified, the header is omitted. The header contains the following fields in big-endian byte order:

\begin{itemize}
\item Magic Number (4 bytes): Must be 3AE82DD4$_h$.
\item Code Section Size (4 bytes)
\item Data Section Size (4 bytes)
\item BSS Section Size (4 bytes)
\end{itemize}

The body contains the contents of the code and data sections in direct sequence, without any padding in between. It is the responsibility of the loader to ensure that these sections are loaded to the (virtual) section start addresses determined at assembly time. The BSS conceptually contains only zeroed bytes, and thus is not stored in the binary file. It is the responsibility of the loader to ensure that the contents of the BSS are actually zeroed.

\section{C compiler ({\tt lcc})}

The {\tt lcc} tool is a C compiler, based on the LCC source code, for ANSI C (C89). Currently, it must be used in conjunction with the {\tt asld} tool to compile a whole project at once, because there is no object format for individual compiled C sources. Assembler and C sources can be mixed in a compiler run, and will be assembled in exactly the order specified at the command line. Unless overridden, the generated object file uses a simple segmented format.

\subsection{Command Line Interface}

LCC supports various switches on the command line that can be viewed by running it without arguments. The general synopsis is:

{\tt lcc [option | file] ...}

Each file is either a C or assembler input file. The input files are assembled in the specified order. The {\tt \bf -W} argument is a generic extension mechanism for command-line arguments. Only the most important uses of this mechanism will be explained here:

\begin{itemize}
\item {\tt \bf -Wo-kernel}: Sets the start address of the code section to C0010000$_h$ as if {\tt \bf -Wl-rc -Wl0xC0010000} had been specified, and prevents linking to the standard library. Since there is no useful standard library yet, this switch must be specified.
Alternatively, compilation can be done using {\tt \bf -s}, and assembly and linking can be done in a separate step, which has the same effect.
\item {\tt \bf -Wl-m -Wl}{\it mapfile}: Generates a {\it map} file that lists the entities assembled to the output file.
\item {\tt \bf -Wl-h}: Generates a {\it headerless} output file. The output file does not contain the simple segmented output format. Instead, it only contains the contents of the code and data sections in direct sequence.
\item {\tt \bf -Wl-rc -Wl0x}{\it Address}: Specifies the start address of the code section. This affects the (jump) addresses within the code that is ultimately written to the output file.
\item {\tt \bf -Wl-rd -Wl0x}{\it Address}: Specifies the start address of the data section. This affects the (load/store) addresses within the code that is ultimately written to the output file.
\item {\tt \bf -Wl-rb -Wl0x}{\it Address}: Specifies the start address of the BSS section. This affects the (load/store) addresses within the code that is ultimately written to the output file.
\end{itemize}

\subsection{Data Types}

The C compiler uses the following bit sizes for the C data types:

\begin{tabular}{|c|c|}
\hline
Type & Size (bits)\\
\hline
long & 32\\
int & 32\\
short & 16\\
char & 8\\
pointer & 32\\
\hline
\end{tabular}

\subsection{Register Allocation}

The C compiler assigns a fixed purpose to each register index:

\begin{tabular}{|c|l|}
\hline
Index & Meaning\\
\hline
0 & tied to value 0 by the hardware\\
\hline
1 & reserved as an auxiliary register for use by the assembler\\
 & (not used by the compiler)\\
\hline
2,3 & function return value\\
\hline
4..7 & function arguments\\
\hline
8..15, 24, 25 & caller-save local value, to be used for temporary results\\
\hline
16..23 & callee-save local value, to be used for local variables\\
\hline
26..28 & reserved for OS kernel\\
 & (not used by the compiler)\\
\hline
29 & stack pointer\\
\hline
30 & reserved for interrupt return address \\
 & (not used by the compiler)\\
\hline
31 & function return address\\
\hline
\end{tabular}

\section{Simulator}

...

\section{\tt bin2exo}

The {\tt bin2exo} tool converts a binary file to a {\tt .exo} file to be loaded into the flash ROM. The {\tt .exo} file contains exactly the byte sequence stored in the binary file, converted to Motorola S-Records, without any headers, stripping, or byte swapping. The start address at which the data is placed in ROM can be specified via the command line.

\subsection{Command-Line Options}

Synopsis:
\begin{itemize}
\item[] {\tt bin2exo <load address, hex> <input file> <output file>}
\end{itemize}

The {\it load address} specifies the first address in the flash ROM, specified as a hexadecimal number, that is occupied by the contents of the {\it input file}. This file is converted to Motorola S-Records, which are stored in the {\it output file}, which presumably is a {\tt .exo} file.

\subsection{Generating a Boot ROM}

The {\tt bin2exo} tool can be used to convert a binary file to a boot ROM for the \ecox.
To do so, the binary file must be converted to a {\tt .exo} file with start address 0 and loaded into the flash ROM using the {\tt GXSLOAD} tool. This maps the contents of the binary file to physical address 20000000$_h$ (virtual address E0000000$_h$) upwards, and thus causes the \eco to interpret the contents of the file as raw instructions after reset.

Note that while neither the binary file, the {\tt bin2exo} tool, nor the {\tt .exo} file has a notion of a byte order, using the file as a boot image causes the \eco to access its contents in a big-endian fashion, just as expected. This is the result of various intermediate steps, such as the {\tt GXSLOAD} program, the parallel interface to the XSA board, the CPLD configuration, the byte order of the flash ROM itself, and the ROM interface that connects the flash ROM to the SoC bus. Not all of these steps are well-documented, and no assumptions should be made about the intermediate byte order if this chain is broken.

\section{\tt bit2exo}

The {\tt bit2exo} tool converts a Xilinx {\tt .bit} file to a {\tt .exo} file that can be loaded into the flash ROM to configure the FPGA on startup. It is important to use {\tt bit2exo}, and not {\tt bin2exo}, for this job, because the {\tt .bit} file contains headers that must be stripped from the actual bit stream. The start address at which the bit stream is placed in ROM can be specified in hexadecimal via the command line, and should be the start address of one of the four ROM quadrants:

\begin{tabular}{|c|c|}
\hline
ROM Quadrant & Start Address\\
\hline
0 & 000000$_h$ \\
\hline
1 & 080000$_h$ \\
\hline
2 & 100000$_h$ \\
\hline
3 & 180000$_h$ \\
\hline
\end{tabular}

Due to the architecture of the \ecox, quadrant 0 usually contains the boot loader code and therefore cannot be used for the FPGA configuration. By placing the configuration in quadrant 3, quadrants 0 through 2 can be used as a contiguous program ROM.
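
The address arithmetic behind the quadrant table and the boot ROM mapping can be written down as a short Python sketch. The function names are hypothetical. The quadrant size of 080000$_h$ is simply the spacing of the start addresses in the table, and the sketch assumes that the mapping from flash ROM addresses to physical and virtual addresses is linear beyond ROM address 0, which the text only implies with ``upwards''.

```python
QUADRANT_SIZE = 0x080000        # spacing of the quadrant start addresses
ROM_PHYSICAL_BASE = 0x20000000  # physical address of flash ROM address 0
ROM_VIRTUAL_BASE = 0xE0000000   # virtual address of flash ROM address 0


def quadrant_start(quadrant):
    """Flash ROM start address of quadrant 0..3."""
    if quadrant not in range(4):
        raise ValueError("the flash ROM has quadrants 0 through 3")
    return quadrant * QUADRANT_SIZE


def quadrant_of(rom_address):
    """Quadrant that contains a given flash ROM address."""
    quadrant = rom_address // QUADRANT_SIZE
    if quadrant not in range(4):
        raise ValueError("address outside the four quadrants")
    return quadrant


def cpu_addresses(rom_address):
    """(physical, virtual) address of a flash ROM address, assuming a linear map."""
    return ROM_PHYSICAL_BASE + rom_address, ROM_VIRTUAL_BASE + rom_address
```

With these definitions, {\tt format(quadrant\_start(3), "06X")} gives {\tt 180000}, the load address to pass to {\tt bit2exo} when the FPGA configuration is placed in quadrant 3.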

{\bf Note:} Do not forget to place the jumpers on the FPGA board to tell the FPGA from which quadrant to load its configuration.

\subsection{Command-Line Options}

Synopsis:
\begin{itemize}
\item[] {\tt bit2exo <load address, hex> <input file> <output file>}
\end{itemize}

The {\it load address} specifies the first address in the flash ROM, specified as a hexadecimal number, that is occupied by the contents of the bit stream found in the {\it input file} after stripping all headers. The bit stream is converted to Motorola S-Records, which are stored in the {\it output file}, which presumably is a {\tt .exo} file.
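
Both {\tt bin2exo} and {\tt bit2exo} store their output as Motorola S-Records. As a rough illustration of that encoding, the following Python sketch turns a byte sequence and a load address into S-Record data lines. It is not derived from the sources of either tool: the function names are hypothetical, and the record type (S2, with a 24-bit address) and the 16 data bytes per record are assumptions, so the actual tools may choose differently. The sketch emits data records only and makes no claim about whether the tools write a termination record.

```python
def srecord(address, data, addr_bytes=3):
    """Format one S-Record data line (S1, S2 or S3 for 2, 3 or 4 address bytes)."""
    record_type = {2: "S1", 3: "S2", 4: "S3"}[addr_bytes]
    payload = address.to_bytes(addr_bytes, "big") + bytes(data)
    count = len(payload) + 1  # address bytes + data bytes + checksum byte
    # Checksum: ones' complement of the low byte of the sum of all
    # preceding bytes in the record (count, address, data).
    checksum = ~(count + sum(payload)) & 0xFF
    return f"{record_type}{count:02X}{payload.hex().upper()}{checksum:02X}"


def to_srecords(image, load_address, chunk=16, addr_bytes=3):
    """Split image into chunks and format each one as a data record."""
    return [
        srecord(load_address + offset, image[offset:offset + chunk], addr_bytes)
        for offset in range(0, len(image), chunk)
    ]
```

For instance, the two bytes {\tt 12 34} at load address 0 become the single record {\tt S2060000001234B3} under these assumptions.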