URL https://opencores.org/ocsvn/cpu_lecture/cpu_lecture/trunk
Subversion Repositories cpu_lecture

[/] [cpu_lecture/] [trunk/] [html/] [09_Toolchain_Setup.html] - Rev 10

Go to most recent revision | Compare with Previous | Blame | View Log
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
<HTML>
<HEAD>
<TITLE>html/Toolchain_Setup</TITLE>
<META NAME="generator" CONTENT="HTML::TextToHTML v2.46">
<LINK REL="stylesheet" TYPE="text/css" HREF="lecture.css">
</HEAD>
<BODY>
<P><table class="ttop"><th class="tpre"><a href="08_IO.html">Previous Lesson</a></th><th class="ttop"><a href="toc.html">Table of Content</a></th><th class="tnxt"><a href="10_Listing_of_alu_vhd.html">Next Lesson</a></th></table>
<hr>
 
<H1><A NAME="section_1">9 TOOLCHAIN SETUP</A></H1>
 
<P>In this lesson we will learn how to set up a toolchain on a Linux box.
We will not describe, however, how the tools are downloaded and installed.
The installation of a tools is normally described in the documentation
that comes with the tool.
 
<P>Places from where tools can be downloaded were already presented in the
first lecture.
 
<P>The following figure gives an overview of the entire flow. We show source files
in yellow, temporary files in white and tools in green.
 
<P><br>
 
<P><img src="toolchain_1.png">
 
<P><br>
 
<P>We start with a C source file <STRONG>hello.c</STRONG>. This file is compiled with <STRONG>avr-gcc</STRONG>,
a <STRONG>gcc</STRONG> variant that generates opcodes for the AVR CPU. The compilation
produces 2 output files: <STRONG>hello.lss</STRONG> and <STRONG>hello.hex</STRONG>. 
 
<P><STRONG>hello.lss</STRONG> is a listing file and may optionally be post-processed by the
tool <STRONG>end_conv</STRONG> which converts the little-endian format of <STRONG>hello.lss</STRONG> into
a slightly different format that is more in line with the way gtkwave shows
the hex values of signals.
 
<P>The main purpose of the compilation is to produce <STRONG>hello.hex</STRONG>. <STRONG>hello.hex</STRONG>
contains the opcodes produced from <STRONG>hello.c</STRONG> in Intel-Hex format.
 
<P><STRONG>hello.hex</STRONG> is then fed into <STRONG>make_mem</STRONG>. <STRONG>make_mem</STRONG> is a tool that converts
the Intel-Hex format into VHDL constants. These constants are used to
initialize the block RAM modules of the program memory. The output of
<STRONG>make_mem</STRONG> is <STRONG>memory_content.vhd</STRONG> (which, as you certainly remember,
was included by <STRONG>prog_mem.vhd</STRONG>).
 
<P>At this point, there are two possible ways to proceed. You could do a
functional simulation or a timing simulation.
 
<H2><A NAME="section_1_1">8.1 Functional Simulation</A></H2>
 
<P>Initially you will be concerned mostly with the functional simulation.
On this branch you debug the VHDL code until it looks functionally OK.
In order to perform the functional simulation, you need 3 sorts of VHDL files:
 
<OL>
  <LI>The VHLD source files that were discussed in the previous lessons,
  <LI>the <STRONG>memory_content.vhd</STRONG> just described, and
  <LI>a testbench that mimics the board containing the FPGA to be (<STRONG>test_tb.vhd</STRONG>,
   and a VHDL implementation of device specific components used (in our case
   this is only <STRONG>RAMB4_S4_S4.vhd</STRONG>). Both files are provided in the directory
   called <STRONG>test</STRONG>.
</OL>
<P>All these VHDL files are then processed by <STRONG>ghdl</STRONG>. <STRONG>ghdl</STRONG> produces a single
output file <STRONG>testbench</STRONG> in directory <STRONG>simu</STRONG>. <STRONG>testbench</STRONG> is an executable
file. <STRONG>testbench</STRONG> is then run in order to produces a gzip'ed <STRONG>vcd</STRONG> (value change
dump) file called <STRONG>testbench.vcdgz</STRONG>.
 
<P>The last step is visualize <STRONG>testbench.vcdgz</STRONG> by means of the tool <STRONG>gtkwave</STRONG>.
<STRONG>gtkwave</STRONG> is similar to the <STRONG>ModelSim</STRONG> provided by Xilinx, but it has
two advantages: it does not bother the user with licence installations
(even in the "free" versions provided by Xilinx) and it runs under Linux.
There are actually more advantages of the <STRONG>ghdl</STRONG>/<STRONG>gtkwave</STRONG> combination;
after having used both tools in the past the author definitely prefers
<STRONG>ghdl</STRONG>/<STRONG>gtkwave</STRONG>.
 
<P>An example output of the functional simulation that shows the  operation
our CPU:
 
<P><br>
 
<P><img src="GTKWave.PNG">
 
<P><br>
 
<P>We can compare the CPU signals shown with the assembler code being executed.
The CPU is executing inside the C function <STRONG>uart_puts()</STRONG>:
 
<P><br>
 
<pre class="vhdl">
 
<pre class="filename">
app/hello.lss1
</pre></pre>
<P>
 
<P><br>
 
<H2><A NAME="section_1_2">8.2 Timing Simulation and FPGA Configuration</A></H2>
 
<P>After the CPU functions correctly, the design can be fed into the Xilinx
toolchain. This toolchain is better described in the documentation that
comes with it, so we don't go to too much detail here.
 
<P>We used Webpack 10.1, which can be downloaded from Xilinx.
 
<P>The first step is to set up a project in the ISE project navigator with
the proper target device. Then the VHDL files in the <STRONG>src</STRONG> directory are
added to the project. Next the <STRONG>Synthesize</STRONG> and <STRONG>Implementation</STRONG> steps
of the design flow are run.
 
<P>If this is successful, then we can generate a programming file. There are
a number of ways to configure Xilinx FPGAs, and the type of programming file
needed depends on the particular way of configuring the device. The board
we used for testing the CPU had a serial PROM and therefore we generated
a programming file for the serial PROM on the board. The FPGA would then
load from the PROM on start-up. Other ways of configuring the device are
via JTAG, which is also quite handy during debugging.
 
<P>The entire build process is a little lengthy (and the devil is known to
hide in the details). We therefore go through the entire design flow in
a step-by-step fashion.
 
<H2><A NAME="section_1_3">8.3 Downloading and Building the Tools </A></H2>
 
<UL>
  <LI>Download and install <STRONG>ghdl</STRONG>.
  <LI>Download and install <STRONG>gtkwave</STRONG>.
  <LI>Download and install the Xilinx toolchain.
  <LI>Build the <STRONG>make_mem</STRONG> tool. The source is this:
</UL>
<P><br>
 
<pre class="vhdl">
 
  1	#include "assert.h"
  2	#include "stdio.h"
  3	#include "stdint.h"
  4	#include "string.h"
  5	
  6	uint8_t buffer[0x10000];    // 64 k is max. for Intel hex.
  7	uint8_t slice [0x10000];    // 16 k is max. for Xilinx bram
  8	
  9	//-----------------------------------------------------------------------------
 10	//
 11	// get a byte (from cp pointing into Intel hex file).
 12	//
 13	uint32_t
 14	get_byte(const char *  cp)
 15	{
 16	uint32_t value;
 17	const char cc[3] = { cp[0], cp[1], 0 };
 18	const int cnt = sscanf(cc, "%X", &value);
 19	   assert(cnt == 1);
 20	   return value;
 21	}
 22	//-----------------------------------------------------------------------------
 23	//
 24	// read an Intel hex file into buffer
 25	void
 26	read_file(FILE * in)
 27	{
 28	   memset(buffer, 0xFF, sizeof(buffer));
 29	char line[200];
 30	   for (;;)
 31	       {
 32	         const char * s = fgets(line, sizeof(line) - 2, in);
 33	         if (s == 0)   return;
 34	         assert(*s++ == ':');
 35	         const uint32_t len     = get_byte(s);
 36	         const uint32_t ah      = get_byte(s + 2);
 37	         const uint32_t al      = get_byte(s + 4);
 38	         const uint32_t rectype = get_byte(s + 6);
 39	         const char * d = s + 8;
 40	         const uint32_t addr = ah << 8 | al;
 41	
 42	         uint32_t csum = len + ah + al + rectype;
 43	         assert((addr + len) <= 0x10000);
 44	         for (uint32_t l = 0; l < len; ++l)
 45	             {
 46	               const uint32_t byte = get_byte(d);
 47	               d += 2;
 48	               buffer[addr + l] = byte;
 49	               csum += byte;
 50	             }
 51	
 52	         csum = 0xFF & -csum;
 53	         const uint32_t sum = get_byte(d);
 54	         assert(sum == csum);
 55	       }
 56	}
 57	//-----------------------------------------------------------------------------
 58	//
 59	// copy a slice from buffer into slice.
 60	// buffer is organized as 32-bit x items.
 61	// slice is organized as bits x items.
 62	//
 63	void copy_slice(uint32_t slice_num, uint32_t port_bits, uint32_t mem_bits)
 64	{
 65	   assert(mem_bits == 0x1000 || mem_bits == 0x4000);
 66	
 67	const uint32_t items = mem_bits/port_bits;
 68	const uint32_t mask = (1 << port_bits) - 1;
 69	const uint8_t * src = buffer;
 70	
 71	    memset(slice, 0, sizeof(slice));
 72	
 73	    for (uint32_t i = 0; i < items; ++i)
 74	        {
 75	          // read one 32-bit value;
 76	          const uint32_t v0 = *src++;
 77	          const uint32_t v1 = *src++;
 78	          const uint32_t v2 = *src++;
 79	          const uint32_t v3 = *src++;
 80	          const uint32_t v = (v3 << 24 |
 81	                              v2 << 16 |
 82	                              v1 <<  8 |
 83	                              v0       ) >> (slice_num*port_bits) & mask;
 84	
 85	          if (port_bits == 16)
 86	             {
 87	               assert(v < 0x10000);
 88	               slice[2*i]     = v;
 89	               slice[2*i + 1] = v >> 8;
 90	             }
 91	          else if (port_bits == 8)
 92	             {
 93	               assert(v < 0x100);
 94	               slice[i] = v;
 95	             }
 96	          else if (port_bits == 4)
 97	             {
 98	               assert(v < 0x10);
 99	               slice[i >> 1] |= v << (4*(i & 1));
100	             }
101	          else if (port_bits == 2)
102	             {
103	               assert(v < 0x04);
104	               slice[i >> 2] |= v << (2*(i & 3));
105	             }
106	          else if (port_bits == 1)
107	             {
108	               assert(v < 0x02);
109	               slice[i >> 3] |= v << ((i & 7));
110	             }
111	          else assert(0 && "Bad aspect ratio.");
112	        }
113	}
114	//-----------------------------------------------------------------------------
115	//
116	// write one initialization vector
117	//
118	void
119	write_vector(FILE * out, uint32_t mem, uint32_t vec, const uint8_t * data)
120	{
121	   fprintf(out, "constant p%u_%2.2X : BIT_VECTOR := X\"", mem, vec);
122	   for (int32_t d = 31; d >= 0; --d)
123	       fprintf(out, "%2.2X", data[d]);
124	
125	   fprintf(out, "\";\r\n");
126	}
127	//-----------------------------------------------------------------------------
128	//
129	// write one memory
130	//
131	void
132	write_mem(FILE * out, uint32_t mem, uint32_t bytes)
133	{
134	   fprintf(out, "-- content of p_%u --------------------------------------"
135	                "--------------------------------------------\r\n", mem);
136	
137	const uint8_t * src = slice;
138	   for (uint32_t v = 0; v < bytes/32; ++v)
139	       write_vector(out, mem, v, src + 32*v);
140	
141	   fprintf(out, "\r\n");
142	}
143	//-----------------------------------------------------------------------------
144	//
145	// write the entire memory_contents file.
146	//
147	void
148	write_file(FILE * out, uint32_t bits)
149	{
150	   fprintf(out,
151	"\r\n"
152	"library IEEE;\r\n"
153	"use IEEE.STD_LOGIC_1164.all;\r\n"
154	"\r\n"
155	"package prog_mem_content is\r\n"
156	"\r\n");
157	
158	const uint32_t mems = 16/bits;
159	
160	   for (uint32_t m = 0; m < 2*mems; ++m)
161	       {
162	         copy_slice(m, bits, 0x1000);
163	         write_mem(out, m, 0x200);
164	       }
165	
166	   fprintf(out,
167	"end prog_mem_content;\r\n"
168	"\r\n");
169	}
170	//-----------------------------------------------------------------------------
171	int
172	main(int argc, char * argv[])
173	{
174	uint32_t bits = 4;
175	const char * prog = *argv++;   --argc;
176	
177	   if      (argc && !strcmp(*argv, "-1"))    { bits =  1;   ++argv;   --argc; }
178	   else if (argc && !strcmp(*argv, "-2"))    { bits =  2;   ++argv;   --argc; }
179	   else if (argc && !strcmp(*argv, "-4"))    { bits =  4;   ++argv;   --argc; }
180	   else if (argc && !strcmp(*argv, "-8"))    { bits =  8;   ++argv;   --argc; }
181	   else if (argc && !strcmp(*argv, "-16"))   { bits = 16;   ++argv;   --argc; }
182	
183	const char * hex_file = 0;
184	const char * vhdl_file = 0;
185	
186	   if (argc)   { hex_file  = *argv++;   --argc; }
187	   if (argc)   { vhdl_file = *argv++;   --argc; }
188	   assert(argc == 0);
189	
190	FILE * in = stdin;
191	   if (hex_file)   in = fopen(hex_file, "r");
192	   assert(in);
193	   read_file(in);
194	   fclose(in);
195	
196	FILE * out = stdout;
197	   if (vhdl_file)   out = fopen(vhdl_file, "w");
198	   write_file(out, bits);
199	   assert(out);
200	}
201	//-----------------------------------------------------------------------------
<pre class="filename">
tools/make_mem.cc
</pre></pre>
<P>
 
<P><br>
 
<P>The command to build the tool is:
 
<P><br>
 
<pre class="cmd">
 
# Build makemem.
g++ -o make_mem make_mem.cc
 
</pre>
<P>
 
<P><br>
 
<UL>
  <LI>Build the <STRONG>end_conv</STRONG> tool. The source is this:
</UL>
<P><br>
 
<pre class="vhdl">
 
  1	#include "assert.h"
  2	#include "ctype.h"
  3	#include "stdio.h"
  4	#include "string.h"
  5	
  6	//-----------------------------------------------------------------------------
  7	int
  8	main(int argc, const char * argv)
  9	{
 10	char buffer[2000];
 11	int pc, val, val2;
 12	
 13	   for (;;)
 14	       {
 15	         char * s = fgets(buffer, sizeof(buffer) - 2, stdin);
 16	         if (s == 0)   return 0;
 17	
 18	         // map lines '  xx:' and 'xxxxxxxx; to 2* the hex value.
 19	         //
 20	         if (
 21	             (isxdigit(s[0]) || s[0] == ' ') &&
 22	             (isxdigit(s[1]) || s[1] == ' ') &&
 23	             (isxdigit(s[2]) || s[2] == ' ') &&
 24	              isxdigit(s[3]) && s[4] == ':')   // '  xx:'
 25	            {
 26	              assert(1 == sscanf(s, " %x:", &pc));
 27	              if (pc & 1)       printf("%4X+:", pc/2);
 28	              else              printf("%4X:", pc/2);
 29	              s += 5;
 30	            }
 31	         else if (isxdigit(s[0]) && isxdigit(s[1]) && isxdigit(s[2]) &&
 32	                  isxdigit(s[3]) && isxdigit(s[4]) && isxdigit(s[5]) &&
 33	                  isxdigit(s[6]) && isxdigit(s[7]))             // 'xxxxxxxx'
 34	            {
 35	              assert(1 == sscanf(s, "%x", &pc));
 36	              if (pc & 1)   printf("%8.8X+:", pc/2);
 37	              else          printf("%8.8X:", pc/2);
 38	              s += 8;
 39	            }
 40	         else                             // other: copy verbatim
 41	            {
 42	              printf("%s", s);
 43	              continue;
 44	            }
 45	
 46	          while (isblank(*s))   printf("%c", *s++);
 47	
 48	          // endian swap.
 49	          //
 50	          while (isxdigit(s[0]) &&
 51	                 isxdigit(s[1]) &&
 52	                          s[2] == ' ' &&
 53	                 isxdigit(s[3]) &&
 54	                 isxdigit(s[4]) &&
 55	                          s[5] == ' ')
 56	             {
 57	              assert(2 == sscanf(s, "%x %x ", &val, &val2));
 58	              printf("%2.2X%2.2X  ", val2, val);
 59	              s += 6;
 60	             }
 61	
 62	         char * s1 = strstr(s, ".+");
 63	         char * s2 = strstr(s, ".-");
 64	         if (s1)
 65	            {
 66	              assert(1 == sscanf(s1 + 2, "%d", &val));
 67	              assert((val & 1) == 0);
 68	              sprintf(s1, " 0x%X", (pc + val)/2 + 1);
 69	              printf(s);
 70	              s = s1 + strlen(s1) + 1;
 71	            }
 72	         else if (s2)
 73	            {
 74	              assert(1 == sscanf(s2 + 2, "%d", &val));
 75	              assert((val & 1) == 0);
 76	              sprintf(s2, " 0x%X", (pc - val)/2 + 1);
 77	              printf(s);
 78	              s = s2 + strlen(s2) + 1;
 79	            }
 80	
 81	         printf("%s", s);
 82	       }
 83	}
 84	//-----------------------------------------------------------------------------
<pre class="filename">
tools/end_conv.cc
</pre></pre>
<P>
 
<P><br>
 
<P>The command to build the tool is:
 
<P><br>
 
<pre class="cmd">
 
# Build end_conv.
g++ -o end_conv end_conv.cc
 
</pre>
<P>
 
<P><br>
 
<H2><A NAME="section_1_4">8.4 Preparing the Memory Content</A></H2>
 
<P>We write a program <STRONG>hello.c</STRONG> that prints "Hello World" to the serial line.
 
<P>The source is this:
 
<P><br>
 
<pre class="vhdl">
 
  1	#include "stdint.h"
  2	#include "avr/io.h"
  3	#include "avr/pgmspace.h"
  4	
  5	#undef F_CPU
  6	#define F_CPU 25000000UL
  7	#include "util/delay.h"
  8	
  9	
 10	     //----------------------------------------------------------------------//
 11	    //                                                                      //
 12	   //   print char cc on UART.                                             //
 13	  //    return number of chars printed (i.e. 1).                          //
 14	 //                                                                      //
 15	//----------------------------------------------------------------------//
 16	uint8_t
 17	uart_putc(uint8_t cc)
 18	{
 19	    while ((UCSRA & (1 << UDRE)) == 0)      ;
 20	    UDR = cc;
 21	    return 1;
 22	}
 23	
 24	     //----------------------------------------------------------------------//
 25	    //                                                                      //
 26	   //   print char cc on 7 segment display.                                //
 27	  //    return number of chars printed (i.e. 1).                          //
 28	 //                                                                      //
 29	//----------------------------------------------------------------------//
 30	// The segments of the display are encoded like this:
 31	//
 32	//
 33	//      segment     PORT B
 34	//      name        Bit number
 35	//      ----A----   ----0----
 36	//      |       |   |       |
 37	//      F       B   5       1
 38	//      |       |   |       |
 39	//      ----G----   ----6----
 40	//      |       |   |       |
 41	//      E       C   4       2
 42	//      |       |   |       |
 43	//      ----D----   ----3----
 44	//
 45	//-----------------------------------------------------------------------------
 46	
 47	#define SEG7(G, F, E, D, C, B, A)   (~(G<<6|F<<5|E<<4|D<<3|C<<2|B<<1|A))
 48	
 49	uint8_t
 50	seg7_putc(uint8_t cc)
 51	{
 52	uint16_t t;
 53	
 54	    switch(cc)
 55	    {                   //   G F E D C B A
 56	    case ' ':   PORTB = SEG7(0,0,0,0,0,0,0);        break;
 57	    case 'E':   PORTB = SEG7(1,1,1,1,0,0,1);        break;
 58	    case 'H':   PORTB = SEG7(1,1,1,0,1,1,0);        break;
 59	    case 'L':   PORTB = SEG7(0,1,1,1,0,0,0);        break;
 60	    case 'O':   PORTB = SEG7(0,1,1,1,1,1,1);        break;
 61	    default:    PORTB = SEG7(1,0,0,1,0,0,1);        break;
 62	    }
 63	
 64	    // wait 800 + 200 ms. This can be quite boring in simulations,
 65	    // so we wait only if DIP switch 6 is closed.
 66	    //
 67	    if (!(PINB & 0x20))     for (t = 0; t < 800; ++t)   _delay_ms(1);
 68	    PORTB = SEG7(0,0,0,0,0,0,0);
 69	    if (!(PINB & 0x20))     for (t = 0; t < 200; ++t)   _delay_ms(1);
 70	
 71	    return 1;
 72	}
 73	
 74	     //----------------------------------------------------------------------//
 75	    //                                                                      //
 76	   //   print string s on UART.                                            //
 77	  //    return number of chars printed.                                   //
 78	 //                                                                      //
 79	//----------------------------------------------------------------------//
 80	uint16_t
 81	uart_puts(const char * s)
 82	{
 83	const char * from = s;
 84	uint8_t cc;
 85	    while ((cc = pgm_read_byte(s++)))   uart_putc(cc);
 86	    return s - from - 1;
 87	}
 88	
 89	     //----------------------------------------------------------------------//
 90	    //                                                                      //
 91	   //   print string s on 7 segment display.                               //
 92	  //    return number of chars printed.                                   //
 93	 //                                                                      //
 94	//----------------------------------------------------------------------//
 95	uint16_t
 96	seg7_puts(const char * s)
 97	{
 98	const char * from = s;
 99	uint8_t cc;
100	    while ((cc = pgm_read_byte(s++)))   seg7_putc(cc);
101	    return s - from - 1;
102	}
103	
104	//-----------------------------------------------------------------------------
105	int
106	main(int argc, char * argv[])
107	{
108	    for (;;)
109	    {
110	        if (PINB & 0x40)    // DIP switch 7 open.
111	            {
112	                // print 'Hello world' on UART.
113	                uart_puts(PSTR("Hello, World!\r\n"));
114	            }
115	        else                // DIP switch 7 closed.
116	            {
117	                // print 'HELLO' on 7-segment display
118	                seg7_puts(PSTR("HELLO "));
119	            }
120	    }
121	}
122	//-----------------------------------------------------------------------------
<pre class="filename">
app/hello.c
</pre></pre>
<P>
 
<P><br>
 
<P>The commands to create <STRONG>hello.hex</STRONG> and <STRONG>hello.css</STRONG> are:
 
<P><br>
 
<pre class="cmd">
 
# Compile and link hello.c.
avr-gcc -Wall -Os -fpack-struct -fshort-enums -funsigned-char -funsigned-bitfields -mmcu=atmega8 \
    -DF_CPU=25000000UL -c -o"hello.o" "hello.c"
avr-gcc -Wl,-Map,hello.map -mmcu=atmega8 -o"hello.elf"  ./hello.o   
 
# Create an opcode listing.
avr-objdump -h -S hello.elf  >"hello.lss"
 
# Create intel hex file.
avr-objcopy -R .eeprom -O ihex hello.elf  "hello.hex"
 
</pre>
<P>
 
<P><br>
 
<P>Create  <STRONG>hello.css1</STRONG>, a better readable from of  <STRONG>hello.css</STRONG>:
 
<P><br>
 
<pre class="cmd">
 
# Create hello.css1.
./end_conv < hello.css > hello.css1
 
</pre>
<P>
 
<P><br>
 
<P>Create <STRONG>prog_mem_content.vhd</STRONG>.
 
<P><br>
 
<pre class="cmd">
 
# Create prog_mem_content.vhd.
./make_mem < hello.hex > src/prog_mem_content.vhd
 
</pre>
<P>
 
<P><br>
 
<H2><A NAME="section_1_5">8.5 Performing the Functional Simulation</A></H2>
 
<H3><A NAME="section_1_5_1">8.5.1 Preparing a Testbench</A></H3>
 
<P>We prepare a testbench in which we instantiate the top-level FPGA design
of the CPU. The test bench provides a clock signal and a reset signal
for the CPU:
 
<P><br>
 
<pre class="vhdl">
 
  1	-------------------------------------------------------------------------------
  2	-- 
  3	-- Copyright (C) 2009, 2010 Dr. Juergen Sauermann
  4	-- 
  5	--  This code is free software: you can redistribute it and/or modify
  6	--  it under the terms of the GNU General Public License as published by
  7	--  the Free Software Foundation, either version 3 of the License, or
  8	--  (at your option) any later version.
  9	--
 10	--  This code is distributed in the hope that it will be useful,
 11	--  but WITHOUT ANY WARRANTY; without even the implied warranty of
 12	--  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 13	--  GNU General Public License for more details.
 14	--
 15	--  You should have received a copy of the GNU General Public License
 16	--  along with this code (see the file named COPYING).
 17	--  If not, see http://www.gnu.org/licenses/.
 18	--
 19	-------------------------------------------------------------------------------
 20	-------------------------------------------------------------------------------
 21	--
 22	-- Module Name:    alu - Behavioral 
 23	-- Create Date:    16:47:24 12/29/2009 
 24	-- Description:    arithmetic logic unit of a CPU
 25	--
 26	-------------------------------------------------------------------------------
 27	--
 28	library IEEE;
 29	use IEEE.STD_LOGIC_1164.ALL;
 30	use IEEE.STD_LOGIC_ARITH.ALL;
 31	use IEEE.STD_LOGIC_UNSIGNED.ALL;
 32	
 33	entity testbench is
 34	end testbench;
 35	 
 36	architecture Behavioral of testbench is
 37	
 38	component avr_fpga
 39	    port (  I_CLK_100   : in  std_logic;
 40	            I_SWITCH    : in  std_logic_vector(9 downto 0);
 41	            I_RX        : in  std_logic;
 42	
 43	            Q_7_SEGMENT : out std_logic_vector(6 downto 0);
 44	            Q_LEDS      : out std_logic_vector(3 downto 0);
 45	            Q_TX        : out std_logic);
 46	end component;
 47	
 48	signal L_CLK_100            : std_logic;
 49	signal L_LEDS               : std_logic_vector(3 downto 0);
 50	signal L_7_SEGMENT          : std_logic_vector(6 downto 0);
 51	signal L_RX                 : std_logic;
 52	signal L_SWITCH             : std_logic_vector(9 downto 0);
 53	signal L_TX                 : std_logic;
 54	
 55	signal	L_CLK_COUNT         : integer := 0;
 56	
 57	begin
 58	
 59	    fpga: avr_fpga
 60	    port map(   I_CLK_100   => L_CLK_100,
 61	                I_SWITCH    => L_SWITCH,
 62	                I_RX        => L_RX,
 63	
 64	                Q_LEDS      => L_LEDS,
 65	                Q_7_SEGMENT => L_7_SEGMENT,
 66	                Q_TX        => L_TX);
 67	
 68	    process -- clock process for CLK_100,
 69	    begin
 70	        clock_loop : loop
 71	            L_CLK_100 <= transport '0';
 72	            wait for 5 ns;
 73	
 74	            L_CLK_100 <= transport '1';
 75	            wait for 5 ns;
 76	        end loop clock_loop;
 77	    end process;
 78	
 79	    process(L_CLK_100)
 80	    begin
 81	        if (rising_edge(L_CLK_100)) then
 82	            case L_CLK_COUNT is
 83	                when 0 => L_SWITCH <= "0011100000";   L_RX <= '0';
 84	                when 2 => L_SWITCH(9 downto 8) <= "11";
 85	                when others =>
 86	            end case;
 87	            L_CLK_COUNT <= L_CLK_COUNT + 1;
 88	        end if;
 89	    end process;
 90	end Behavioral;
 91	
<pre class="filename">
test/test_tb.vhd
</pre></pre>
<P>
 
<P><br>
 
<H3><A NAME="section_1_5_2">8.5.2 Defining Memory Modules</A></H3>
 
<P>We also need a VHDL file that implements the Xilinx primitives that
we use. This is only one: the memory module RAMB4_S4_S4:
 
<P><br>
 
<pre class="vhdl">
 
  1	-------------------------------------------------------------------------------
  2	-- 
  3	-- Copyright (C) 2009, 2010 Dr. Juergen Sauermann
  4	-- 
  5	--  This code is free software: you can redistribute it and/or modify
  6	--  it under the terms of the GNU General Public License as published by
  7	--  the Free Software Foundation, either version 3 of the License, or
  8	--  (at your option) any later version.
  9	--
 10	--  This code is distributed in the hope that it will be useful,
 11	--  but WITHOUT ANY WARRANTY; without even the implied warranty of
 12	--  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 13	--  GNU General Public License for more details.
 14	--
 15	--  You should have received a copy of the GNU General Public License
 16	--  along with this code (see the file named COPYING).
 17	--  If not, see http://www.gnu.org/licenses/.
 18	--
 19	-------------------------------------------------------------------------------
 20	-------------------------------------------------------------------------------
 21	--
 22	-- Module Name:    prog_mem - Behavioral 
 23	-- Create Date:    14:09:04 10/30/2009 
 24	-- Description:    a block memory module
 25	--
 26	-------------------------------------------------------------------------------
 27	
 28	library IEEE;
 29	use IEEE.STD_LOGIC_1164.ALL;
 30	use IEEE.STD_LOGIC_ARITH.ALL;
 31	use IEEE.STD_LOGIC_UNSIGNED.ALL;
 32	
 33	entity RAMB4_S4_S4 is
 34	    generic(INIT_00 : bit_vector := X"00000000000000000000000000000000"
 35	                                  &  "00000000000000000000000000000000";
 36	            INIT_01 : bit_vector := X"00000000000000000000000000000000"
 37	                                  & X"00000000000000000000000000000000";
 38	            INIT_02 : bit_vector := X"00000000000000000000000000000000"
 39	                                  & X"00000000000000000000000000000000";
 40	            INIT_03 : bit_vector := X"00000000000000000000000000000000"
 41	                                  & X"00000000000000000000000000000000";
 42	            INIT_04 : bit_vector := X"00000000000000000000000000000000"
 43	                                  & X"00000000000000000000000000000000";
 44	            INIT_05 : bit_vector := X"00000000000000000000000000000000"
 45	                                  & X"00000000000000000000000000000000";
 46	            INIT_06 : bit_vector := X"00000000000000000000000000000000"
 47	                                  & X"00000000000000000000000000000000";
 48	            INIT_07 : bit_vector := X"00000000000000000000000000000000"
 49	                                  & X"00000000000000000000000000000000";
 50	            INIT_08 : bit_vector := X"00000000000000000000000000000000"
 51	                                  & X"00000000000000000000000000000000";
 52	            INIT_09 : bit_vector := X"00000000000000000000000000000000"
 53	                                  & X"00000000000000000000000000000000";
 54	            INIT_0A : bit_vector := X"00000000000000000000000000000000"
 55	                                  & X"00000000000000000000000000000000";
 56	            INIT_0B : bit_vector := X"00000000000000000000000000000000"
 57	                                  & X"00000000000000000000000000000000";
 58	            INIT_0C : bit_vector := X"00000000000000000000000000000000"
 59	                                  & X"00000000000000000000000000000000";
 60	            INIT_0D : bit_vector := X"00000000000000000000000000000000"
 61	                                  & X"00000000000000000000000000000000";
 62	            INIT_0E : bit_vector := X"00000000000000000000000000000000"
 63	                                  & X"00000000000000000000000000000000";
 64	            INIT_0F : bit_vector := X"00000000000000000000000000000000"
 65	                                  & X"00000000000000000000000000000000");
 66	
 67	    port(   ADDRA   : in  std_logic_vector(9 downto 0);
 68	            ADDRB   : in  std_logic_vector(9 downto 0);
 69	            CLKA    : in  std_ulogic;
 70	            CLKB    : in  std_ulogic;
 71	            DIA     : in  std_logic_vector(3 downto 0);
 72	            DIB     : in  std_logic_vector(3 downto 0);
 73	            ENA     : in  std_ulogic;
 74	            ENB     : in  std_ulogic;
 75	            RSTA    : in  std_ulogic;
 76	            RSTB    : in  std_ulogic;
 77	            WEA     : in  std_ulogic;
 78	            WEB     : in  std_ulogic;
 79	
 80	            DOA     : out std_logic_vector(3 downto 0);
 81	            DOB     : out std_logic_vector(3 downto 0));
 82	end RAMB4_S4_S4;
 83	
 84	architecture Behavioral of RAMB4_S4_S4 is
 85	
 86	function cv(A : bit) return std_logic is
 87	begin
 88	   if (A = '1') then return '1';
 89	   else              return '0';
 90	   end if;
 91	end;
 92	
 93	function cv1(A : std_logic) return bit is
 94	begin
 95	   if (A = '1') then return '1';
 96	   else              return '0';
 97	   end if;
 98	end;
 99	
100	signal DATA : bit_vector(4095 downto 0) :=
101	    INIT_0F & INIT_0E & INIT_0D & INIT_0C & INIT_0B & INIT_0A & INIT_09 & INIT_08 & 
102	    INIT_07 & INIT_06 & INIT_05 & INIT_04 & INIT_03 & INIT_02 & INIT_01 & INIT_00;
103	
104	begin
105	
106	    process(CLKA, CLKB)
107	    begin
108	        if (rising_edge(CLKA)) then
109	            if (ENA = '1') then
110	                DOA(3) <= cv(DATA(conv_integer(ADDRA & "11")));
111	                DOA(2) <= cv(DATA(conv_integer(ADDRA & "10")));
112	                DOA(1) <= cv(DATA(conv_integer(ADDRA & "01")));
113	                DOA(0) <= cv(DATA(conv_integer(ADDRA & "00")));
114	                if (WEA = '1') then
115	                    DATA(conv_integer(ADDRA & "11")) <= cv1(DIA(3));
116	                    DATA(conv_integer(ADDRA & "10")) <= cv1(DIA(2));
117	                    DATA(conv_integer(ADDRA & "01")) <= cv1(DIA(1));
118	                    DATA(conv_integer(ADDRA & "00")) <= cv1(DIA(0));
119	                end if;
120	           end if;
121	        end if;
122	
123	        if (rising_edge(CLKB)) then
124	            if (ENB = '1') then
125	                DOB(3) <= cv(DATA(conv_integer(ADDRB & "11")));
126	                DOB(2) <= cv(DATA(conv_integer(ADDRB & "10")));
127	                DOB(1) <= cv(DATA(conv_integer(ADDRB & "01")));
128	                DOB(0) <= cv(DATA(conv_integer(ADDRB & "00")));
129	                if (WEB = '1') then
130	                    DATA(conv_integer(ADDRB & "11")) <= cv1(DIB(3));
131	                    DATA(conv_integer(ADDRB & "10")) <= cv1(DIB(2));
132	                    DATA(conv_integer(ADDRB & "01")) <= cv1(DIB(1));
133	                    DATA(conv_integer(ADDRB & "00")) <= cv1(DIB(0));
134	                end if;
135	            end if;
136	        end if;
137	    end process;
138	
139	end Behavioral;
140	
<pre class="filename">
test/RAMB4_S4_S4.vhd
</pre></pre>
<P>
 
<P><br>
 
<H3><A NAME="section_1_5_3">8.5.3 Creating the testbench executable</A></H3>
 
<P>We assume the following file structure:
 
<UL>
  <LI>a <STRONG>test</STRONG> directory that contains the testbench (<STRONG>test_tb.vhd</STRONG>) and the
  memory module (<STRONG>RAMB4_S4_S4.vhd</STRONG>).
  <LI>a <STRONG>src</STRONG> directory that contains all other VHDL files.
  <LI>a <STRONG>simu</STRONG> directory (empty).
  <LI>A <STRONG>Makefile</STRONG> like this:
</UL>
<P><br>
 
<pre class="vhdl">
 
  1	PROJECT=avr_core
  2	
  3	# the vhdl source files (except testbench)
  4	#
  5	FILES		+= src/*.vhd
  6	
  7	# the testbench sources and binary.
  8	#
  9	SIMFILES	=  test/test_tb.vhd
 10	SIMFILES	+= test/RAMB4_S4_S4.vhd
 11	SIMFILES	+= test/RAM32X1S.vhd
 12	SIMTOP		= testbench
 13	
 14	# When to stop the simulation
 15	#
 16	# GHDL_SIM_OPT	= --assert-level=error
 17	GHDL_SIM_OPT	= --stop-time=40us
 18	
 19	SIMDIR		= simu
 20	
 21	FLAGS		= --ieee=synopsys --warn-no-vital-generic -fexplicit --std=93c
 22	
 23	all:
 24		make compile
 25		make run 2>& 1 | grep -v std_logic_arith
 26		make view
 27	
 28	compile:
 29		@mkdir -p simu
 30		@echo -----------------------------------------------------------------
 31		ghdl -i $(FLAGS) --workdir=simu --work=work $(SIMFILES) $(FILES)
 32		@echo
 33		@echo -----------------------------------------------------------------
 34		ghdl -m $(FLAGS) --workdir=simu --work=work $(SIMTOP)
 35		@echo
 36		@mv $(SIMTOP) simu/$(SIMTOP)
 37	
 38	run:
 39		@$(SIMDIR)/$(SIMTOP) $(GHDL_SIM_OPT) --vcdgz=$(SIMDIR)/$(SIMTOP).vcdgz
 40	
 41	view:
 42		gunzip --stdout $(SIMDIR)/$(SIMTOP).vcdgz | gtkwave --vcd gtkwave.save
 43	
 44	clean:
 45		ghdl --clean --workdir=simu
 46	
<pre class="filename">
Makefile
</pre></pre>
<P>
 
<P><br>
 
<DL>
  <DT>Then</DT>
<DD>
</DL>
<P><br>
 
<pre class="cmd">
 
# Run the functional simulation.
make
 
</pre>
<P>
 
<P><br>
 
<P>It will take a moment, but then a <STRONG>gtkwave</STRONG> window like the one shown
earlier in this lesson will appear. It my look a little different due
due to different default settings (like background color). In that
window you can add new signals from the design that you would like
to investigate, remove signals you are not interested in, and so on.
At the first time, no signals will be shown; you can add some by selecting
a component instance at the right, selecting a signal in that component,
and then pushing the <STRONG>append</STRONG> button on the right.
 
<P>The <STRONG>make</STRONG> command has actually made 3 things:
 
<UL>
  <LI>make compile (compile the VHLD files)
  <LI>make run (run the simulation), and
  <LI>make view
</UL>
<P>The first two steps (which took most of the total time) need only be run
after changes to the VHDL files.
 
<H2><A NAME="section_1_6">8.6 Building the Design</A></H2>
 
<P>When the functional simulation looks OK, it is time to implement the design
and check the timing. We describe this only briefly, since the Xilinx
documentation of the Xilinx toolchain is a much better source of
information.
 
<H3><A NAME="section_1_6_1">8.6.1 Creating an UCF file</A></H3>
 
<P>Before implementing the design, we need an <STRONG>UCF</STRONG> file. That file
describes timing requirements, pin properties (like pull-ups for our
DIP switch), and pin-to-signal mappings:
 
<P><br>
 
<pre class="vhdl">
 
  1	NET     I_CLK_100       PERIOD = 10 ns;
  2	NET     L_CLK           PERIOD = 45 ns;
  3	
  4	NET     I_CLK_100       TNM_NET = I_CLK_100;
  5	NET     L_CLK           TNM_NET = L_CLK;
  6	
  7	NET     I_CLK_100       LOC = AA12;
  8	NET     I_RX            LOC = M3;
  9	NET     Q_TX            LOC = M4;
 10	
 11	# 7 segment LED display
 12	#
 13	NET     Q_7_SEGMENT<0>  LOC = V3;
 14	NET     Q_7_SEGMENT<1>  LOC = V4;
 15	NET     Q_7_SEGMENT<2>  LOC = W3;
 16	NET     Q_7_SEGMENT<3>  LOC = T4;
 17	NET     Q_7_SEGMENT<4>  LOC = T3;
 18	NET     Q_7_SEGMENT<5>  LOC = U3;
 19	NET     Q_7_SEGMENT<6>  LOC = U4;
 20	
 21	# single LEDs
 22	#
 23	NET     Q_LEDS<0>       LOC = N1;
 24	NET     Q_LEDS<1>       LOC = N2;
 25	NET     Q_LEDS<2>       LOC = P1;
 26	NET     Q_LEDS<3>       LOC = P2;
 27	
 28	# DIP switch(0 ... 7) and two pushbuttons (8, 9)
 29	#
 30	NET     I_SWITCH<0>     LOC = H2;
 31	NET     I_SWITCH<1>     LOC = H1;
 32	NET     I_SWITCH<2>     LOC = J2;
 33	NET     I_SWITCH<3>     LOC = J1;
 34	NET     I_SWITCH<4>     LOC = K2;
 35	NET     I_SWITCH<5>     LOC = K1;
 36	NET     I_SWITCH<6>     LOC = L2;
 37	NET     I_SWITCH<7>     LOC = L1;
 38	NET     I_SWITCH<8>     LOC = R1;
 39	NET     I_SWITCH<9>     LOC = R2;
 40	
 41	NET     I_SWITCH<*>     PULLUP;
 42	
<pre class="filename">
src/avr_fpga.ucf
</pre></pre>
<P>
 
<P><br>
 
<H3><A NAME="section_1_6_2">8.6.2 Synthesis and Implementation</A></H3>
 
<UL>
  <LI>Start the ISE project manager and open a new project with the desired
  FPGA device.
  <LI>Add the VHDL files and the <STRONG>UCF</STRONG> file in the <STRONG>src</STRONG> directory to the
  project (Project-&gt;Add Source).
  <LI>Synthesize and implement the design (Process-&gt;Implement top Module).
</UL>
<P>This generates a number of reports, netlists, and other files.
There should be no errors. There will be warnings though, including
timing constraints that are not met.
 
<P>It is important to understand the reason for each warning. Warnings often
point to faults in the design.
 
<P>The next thing to check is the timing reports. We were lucky:
 
<P><br>
 
<pre class="cmd">
 
#Timing report fragment:
================================================================================
Timing constraint: NET "L_CLK" PERIOD = 35 ns HIGH 50%;
 
 676756190 paths analyzed, 2342 endpoints analyzed, 0 failing endpoints
 0 timing errors detected. (0 setup errors, 0 hold errors)
 Minimum period is  34.981ns.
--------------------------------------------------------------------------------
 
================================================================================
Timing constraint: NET "I_CLK_100_BUFGP/IBUFG" PERIOD = 10 ns HIGH 50%;
 
 19 paths analyzed, 11 endpoints analyzed, 0 failing endpoints
 0 timing errors detected. (0 setup errors, 0 hold errors)
 Minimum period is   3.751ns.
--------------------------------------------------------------------------------
 
 
All constraints were met.
 
</pre>
<P>
 
<P><br>
 
<P>This tells us that we have enough slack on the crystal CLK_100 signal
(8.048ns would allow for up to 124 MHz). We had specified a period
of 35 ns irequirement for the CPU clock:
 
<P><br>
 
<pre class="vhdl">
 
  2	NET     L_CLK           PERIOD = 45 ns;
<pre class="filename">
src/avr_fpga.ucf
</pre></pre>
<P>
 
<P><br>
 
<P>The CPU runs at 25 MHz, or 40 ns. The 35 ns come from the 40 ms minus
a slack of 5 ns. With some tweaking of optimization options, we could
have reached 33 MHz, but then the slack would have been pretty small.
 
<P>However, we rather stay on th safe side.
 
<H2><A NAME="section_1_7">8.7 Creating a Programming File</A></H2>
 
<P>Next we double-click "Generate Programming file" in the ISE project navigator.
This generates a file <STRONG>avr_fpga.bit</STRONG> in the project directory. This can also
be run from a Makefile or from the command line (the command is <STRONG>bitgen</STRONG>).
 
<H2><A NAME="section_1_8">8.8 Configuring the FPGA</A></H2>
 
<P>At this point, we have the choice between configuring the FPGA directly
via JTAG, or flashing an EEPROM and then loading the FPGA from the EEPROM.
 
<H3><A NAME="section_1_8_1">8.8.1 Configuring the FPGA via JTAG Boundary Scan</A></H3>
 
<P>Configuring the FPGA can be done with the Xilinx tool called <STRONG>impact</STRONG>.
The file needed by <STRONG>impact</STRONG> is <STRONG>avr_fpga.bit</STRONG> from above. The configuration
loaded via JTAG will be lost when the FPGA looses power.
 
<P>Choose "Boundary Scan" in <STRONG>impact</STRONG>, select the FPGA and follow the instructions.
 
<H3><A NAME="section_1_8_2">8.8.2 Flashing PROMs</A></H3>
 
<P>In theory this can also be done from ISE. In practice it could (and actually
did) happen that the programming cable (I use an old parallel 3 cable)
is not detected by impact.
 
<P>Before flashing the PROM, the <STRONG>avr_fpga.bit</STRONG> from the previous step needs to
translated into a format suitable for the PROM. My PROM is of the serial
variety, so I start <STRONG>impact</STRONG>, choose "PROM File Formatter" and follow the
instructions.
 
<P>After converting <STRONG>avr_fpga.bit</STRONG> into, for example, <STRONG>avr_fpga.mcs</STRONG>, the
PROM can be flashed. Like before choose "Boundary Scan" in #impact. This
time, however, you select the PROM and not the FPGA, and follow the
instructions.
 
<P>This concludes the description of the design flow and also of the CPU.
The remaining lessons contain the complete listings of all sources files
discussed in this lectures.
 
<P>Thank you very much for your attention.
 
<P><hr><BR>
<table class="ttop"><th class="tpre"><a href="08_IO.html">Previous Lesson</a></th><th class="ttop"><a href="toc.html">Table of Content</a></th><th class="tnxt"><a href="10_Listing_of_alu_vhd.html">Next Lesson</a></th></table>
</BODY>
</HTML>
Go to most recent revision | Compare with Previous | Blame | View Log
Browse

Tools

Subversion Repositories cpu_lecture

[/] [cpu_lecture/] [trunk/] [html/] [09_Toolchain_Setup.html] - Rev 10