This is an Atmel AVR ATtiny261/461/861 compatible core.
It should be (more or less) fully code compliant, but it is not clock-cycle compliant.
The reason it was developed was to have a simple core to develop C code for.
The implementation is rather straightforward, without any pipelining.
Another reason was to see how hard it was to implement a standard MCU in VHDL and make it run gcc-generated code.
The implementation is a bit quick-n-dirty; I spent about 15h coding the core and about 15h writing the test bench and simulating the core.
I must also say that I'm an AVR fan. If you are designing hardware where you don't need a complete FPGA, just buy the AVR MCU directly from Atmel! http://www.atmel.com/avr
Please note that you should not use AVRStudio for development for this core. The files generated by AVRStudio are compatible with the core, BUT if you don't buy the MCU from Atmel, you shouldn't use their software!
The test bench uses code (C and Asm) compiled only with the help of gnu-tools.
Atmel, AVR, AVRStudio and other names above may be trademarks of Atmel corporation.
- Fully instruction compliant
- Can address up to 64kword of instructions and 64kbyte of sram
- A hex file from a standard avr-gcc build will work in this core
- Differences from the standard core:
- The core is slow compared to the original MCU; an instruction takes 3-6 cycles to execute
- Tested:
- With simple asm-code that tests all instructions
- With a XTEA encryption/decryption algorithm to test math
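As an illustration of the kind of 32-bit arithmetic the XTEA test exercises, here is a minimal sketch of the publicly known XTEA block cipher in C. It is not the test-bench source: the function names are assumptions, and so is the round count used in the note below.

```c
#include <assert.h>
#include <stdint.h>

#define XTEA_DELTA 0x9E3779B9u

/* Encrypt one 64-bit block (v[0], v[1]) in place with a 128-bit key.
   Names and signature are illustrative, not taken from the core's test code. */
void xtea_encrypt(unsigned int rounds, uint32_t v[2], const uint32_t key[4])
{
    assert(v && key);
    uint32_t v0 = v[0], v1 = v[1], sum = 0;
    for (unsigned int i = 0; i < rounds; i++) {
        v0 += (((v1 << 4) ^ (v1 >> 5)) + v1) ^ (sum + key[sum & 3]);
        sum += XTEA_DELTA;
        v1 += (((v0 << 4) ^ (v0 >> 5)) + v0) ^ (sum + key[(sum >> 11) & 3]);
    }
    v[0] = v0;
    v[1] = v1;
}

/* Inverse of xtea_encrypt: runs the same steps backwards. */
void xtea_decrypt(unsigned int rounds, uint32_t v[2], const uint32_t key[4])
{
    assert(v && key);
    uint32_t v0 = v[0], v1 = v[1];
    uint32_t sum = XTEA_DELTA * rounds;   /* 32-bit wrap-around is intended */
    for (unsigned int i = 0; i < rounds; i++) {
        v1 -= (((v0 << 4) ^ (v0 >> 5)) + v0) ^ (sum + key[(sum >> 11) & 3]);
        sum -= XTEA_DELTA;
        v0 -= (((v1 << 4) ^ (v1 >> 5)) + v1) ^ (sum + key[sum & 3]);
    }
    v[0] = v0;
    v[1] = v1;
}
```

Encrypting a block and then decrypting it with the same key and round count (32 is a common choice) should return the original data. That makes the algorithm a convenient self-checking test for 32-bit shifts, adds and XORs on an 8-bit core.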
- Not tested:
- Not all combinations of instructions, registers and constants are tested
- hex2vhdl converter (may give wrong addresses)
- Not implemented:
- Writing of registers with ST, STS and STD (writing of sreg and sp might work)
- Reading of registers with LD, LDS and LDD
- SLEEP, WDR, SPM
- If you find something missing in the core, please just send me an email: firstname.lastname@example.org
- My future plans are to optimize the core, mostly for size but also for speed
- Add a generic to control the number of address lines for PM and DM.
- Add a generic to skip the SPH register
- IRQ may be implemented in the future as a generic option if anybody requests it
- Some simple io-units e.g. io-port, spi, uart, pwm and input-capture units
- Add a Wishbone bridge for those who want to use Wishbone components (if possible due to core clocking restrictions)
- Some documentation of the core implementation
- Add an example project that is ready to compile to an FPGA
- Fixed in 2008-10-08 release: Add detection of unimplemented usage of LD/LDS/LDD/ST/STS/STD to indirectly read or write the register map (when X, Y or Z points to 0-0x1F). (Does gcc really use this anywhere?)
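The condition that this detection looks for can be pictured with a small C model. It is only a sketch, assuming the core follows the standard ATtiny261/461/861 data memory map (register file at 0x00-0x1F, I/O registers at 0x20-0x5F, SRAM from 0x60). All names are hypothetical and the core's own check is not shown here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Regions of the AVR data address space (assumed standard ATtiny map). */
enum data_region { REGION_REGISTER_FILE, REGION_IO, REGION_SRAM };

/* Classify a data address as seen by LD/LDS/LDD/ST/STS/STD. */
enum data_region classify_data_address(uint16_t addr)
{
    if (addr <= 0x001F) return REGION_REGISTER_FILE; /* r0..r31 */
    if (addr <= 0x005F) return REGION_IO;            /* includes SPL, SPH, SREG */
    return REGION_SRAM;
}

/* True when a load/store would hit the memory-mapped register file,
   which is the case the core does not implement. */
bool is_unimplemented_access(uint16_t addr)
{
    return classify_data_address(addr) == REGION_REGISTER_FILE;
}

/* Simulation-style guard: abort as soon as such an access is attempted. */
void check_indirect_access(uint16_t addr)
{
    assert(!is_unimplemented_access(addr) &&
           "LD/ST to the register file (0x00-0x1F) is not implemented");
}
```

A pointer that ends up in the range 0-0x1F, for example a stray null pointer dereference, would trip the guard, while ordinary SRAM accesses at 0x60 and above pass through.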
- Wishes from me:
- Please send me an email if you intend to test or use this core, with some information about your project or product. It would be nice to know if it will be used in any product.
- Please don't forget to send me bug reports. Any bug fixes you make in the core must be released on the web anyway due to the LGPL license.
- Comment about the testbench:
- Fixed in 2008-10-08 release: tb_pm_hex is by default configured for max 2048 words (4096 bytes) of pm space; change the generic g_pm_size to be able to use larger programs.
- Fixed in 2008-10-08 release: rjmp/rcall directly address the whole memory of up to 8 kBytes by using "negative" addresses. This could give an error in tb_pm_hex but does not affect real-world applications. To fix the simulation, just use the lower 12 bits of PM_A in tb_pm_hex/pm: "a_int := CONV_INTEGER(PM_A(11 downto 0));". If you use less than 4 kBytes of pm code, this problem will not be visible.
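The wrap-around described above can be modelled in a few lines of C. This is a sketch, not code from the core or the test bench: the function names are hypothetical, and it assumes the standard RJMP/RCALL encoding (a 12-bit signed word offset k, target = PC + k + 1) evaluated on a 16-bit program counter.

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend the 12-bit offset field k of an RJMP (0xCkkk) or RCALL (0xDkkk). */
int16_t rjmp_offset(uint16_t opcode)
{
    assert((opcode & 0xE000) == 0xC000);  /* only RJMP/RCALL opcodes */
    uint16_t k = opcode & 0x0FFF;
    return (k & 0x0800) ? (int16_t)(k - 0x1000) : (int16_t)k;
}

/* Jump target as a 16-bit program counter computes it: PC + k + 1. */
uint16_t rjmp_target16(uint16_t pc, uint16_t opcode)
{
    return (uint16_t)(pc + rjmp_offset(opcode) + 1);
}

/* Program-memory index after keeping only the lower 12 address bits,
   mirroring PM_A(11 downto 0) in the test-bench fix. */
uint16_t pm_index(uint16_t pm_a)
{
    return pm_a & 0x0FFF;
}
```

For example, a jump at word address 0x0005 with k = -16 yields 0xFFF6 on a 16-bit program counter. An array index built from all 16 bits would be out of range, while the lower 12 bits give 0x0FF6, the intended location inside the 4 kword (8 kByte) space.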
- Other comments:
- If you use this core in a project, please send me an email with a link to the project's home page so I can add a link to your project on this page.
- If you modify the core, you must make the changes available according to the LGPL. Just send me the link and I will add it to this page, or send me a zip file that I can add to this page.
- Mcu is ready and tested
- Tb (with tb-source for core) is ready and tested
- Extensive testing of core in real world not done
- Synthesized (and placed and routed) for Xilinx Spartan3-400 with ISE 8.1, final results after place-and-route:
- Optimized for speed: about 83 MHz, 150 ff, 1700 lut
- Optimized for area: about 70 MHz, 130 ff, 1600 lut
- Reported: Synthesized (and placed and routed) with Quartus, about 80% of an Altera EP1C3.
- Reported: One person has implemented a complete function with advanced mathematics in a Virtex-E; no issues found so far with the core. In his application this core seems to run about 3.7-4 times slower (in cycle count) than a standard AVR.
- The new release (2008-10-08) might be slightly larger (0.x%) but works better in simulation. There is no change to the core's instruction handling between the first release and this one.