Newsletter August 2009
A new H.264 multimedia project has started at OpenCores. Back in June we defined and started a new multimedia project to be undertaken within the OpenCores community. The long term goal is to be able to support various multimedia standards and products by expanding our collection of media IP cores. The first goal for the multimedia project is a "complete" H.264 Baseline (progressive) encoder that supports both intra- and inter-prediction (I- and P-slices).
The complete H.264 encoder SoC will be based only on open-source IP cores. That is really cool!
In the last newsletter there was a brief presentation of the project and a request for OpenCores members with knowledge and experience within the multimedia area to register their interest.
The response from the OpenCores community was overwhelming. We are now working with at least 30 engineers who have extensive competence in this field. Many of the people engaged in this project have worked on similar things before, and this collaboration brings together one of the most impressive teams of engineers at work on such a program. The knowledge and experience of the group spans everything from architecture, design, and software to verification.
Hopefully this project will be a good example of the capabilities of the members of the OpenCores community, and we will surpass any and all expectations. Not only are we using open source technologies that benefit from the many persons and companies assisting and contributing to development, testing and verification, we will also be making the most of those in the community who have some time and effort to spare, with a guided and organised effort towards an impressive but achievable goal.
We will keep the community updated on the progress of this exciting new project here at OpenCores.
Anyone who has had the task of assembling more than a few Verilog modules into a SoC-like design understands the tedium of individually declaring and connecting all necessary ports and wires. Groups of signals traversing several modules, such as control and data bus signals to an external memory, are common in large complex digital projects. The rigmarole of declaring and connecting such interfaces through many and varied submodules is tedious, even more so when small modifications to the bus require altering the ports at every level. In cases such as this, features like VHDL record types, or C data structures "structs" save much time and energy.
With exactly these issues in mind, a preprocessor implementing C-style structs in Verilog has been developed. As previously mentioned, VHDL has a comparable feature in its record types. And, of course, SystemVerilog contains full struct support by specification, but unfortunately no Verilog specification ever outlined such a feature. Given Verilog's widespread use as the HDL of choice in large complex digital designs, it is unfortunate this feature was never introduced. The workarounds, such as VHDL wrappers or other automated scripts, are not pretty, and most are impractical at each stage of design.
Veristruct was originally developed during the porting of Gaisler Research's LEON2 processor to Verilog. The extensive use of the VHDL record type in the LEON2 made the porting a tricky task to complete without an equivalent feature in Verilog. And so the Veristruct preprocessor was created. The resulting preprocessor implementing these C-style structs is now required by the LEON2 Verilog port: the Rachael SPARC processor. Although Rachael is not exactly the same as the LEON2, the processor's IU and peripheral interfaces are the same. This demonstrates Veristruct's worth as a tool for simplifying the design of individual modules, or inter-module signal connections.
Despite the author's aversion to using preprocessors in general, the Veristruct preprocessor is simple, makes sense, and certainly makes managing large sets of signals in a design much easier.
See the Veristruct project page here at OpenCores for more information.
This article is written by Julius Baxter at ORSoC (www.orsoc.se)
This topic gives you an update of what has been "cooking" at the OpenCores community during the last month.
This month's activities:
- Modified the "project-list"-page so that it "shows" information about each project. Moved the actual "download-link" and "svn-browse-function" to the project page.
- Addressed "bottlenecks" on the website to improve the user experience and decrease the overall load on the servers.
- Added a new Forum targeted for the new OC-H.264 project.
- Improved the backup-system and scripts to shift all the traffic to the backup-server.
Our message to the community:
Update your project with a "description"-text
There are still projects that do not have a "description"-text. In order to make your project more visible on Google and other search-engines, we have made the "Description"-block on the "Overview"-page a "meta description"-tag. The first 160 characters will be visible in Google’s search-list, so it’s important that you describe your project in a short and understandable way in the "Description"-block.
Help us find incomplete or obsolete projects
We have now started a task to "cleanup" the project-list on OpenCores, meaning that we want to "mark" incomplete or obsolete projects with a "not ready"-sign. This will make it easier for all users to find suitable projects and hopefully also motivate the project maintainer to complete the project.
View a list of some of the projects that have been updated during the last month. Here you will also see interesting new projects that have reached the first stage of development.
Cellular Automata PRNG
A cellular automaton (CA) is a discrete model that consists of a grid (1D, 2D, 3D) with objects called cells. Each cell can be in one of a given set of states (on and off, different colours etc). Each cell has a set of cells in close proximity (neighbours). Given the current internal state of a cell, the states of the neighbour cells and a given set of update rules, the next state of a cell can be determined.
Additional info: Design done, FPGA proven, Specification done
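The update rule described above can be illustrated with a small software model. This is only a sketch under assumptions: the project text does not say which rule or grid size the core uses, so the one-dimensional Wolfram Rule 30, the wrap-around edges and the names ca_step and ca_random_bits are chosen here purely for illustration.

```python
def ca_step(cells, rule=30):
    """Advance a 1D binary cellular automaton by one generation.

    cells is a list of 0/1 values; the grid wraps around at the edges.
    rule is a Wolfram rule number (0-255) used as the update table
    (Rule 30 is an assumption, not taken from the project).
    """
    n = len(cells)
    nxt = []
    for i in range(n):
        left, centre, right = cells[i - 1], cells[i], cells[(i + 1) % n]
        index = (left << 2) | (centre << 1) | right  # neighbourhood as a 3-bit number
        nxt.append((rule >> index) & 1)              # look up the next state
    return nxt


def ca_random_bits(seed_cells, count, rule=30):
    """Collect `count` bits by sampling the centre cell once per generation."""
    cells = list(seed_cells)
    mid = len(cells) // 2
    bits = []
    for _ in range(count):
        bits.append(cells[mid])
        cells = ca_step(cells, rule)
    return bits
```

Calling ca_random_bits with a seed that has a single cell switched on yields a bit stream from the centre column; a hardware PRNG would typically use a much wider grid than this toy example.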
The Xgate Co-processor Module, Xgate, is a 16 bit programmable RISC processor that is managed by a host CPU to reduce the host load in handling interrupts. Because the Xgate is user programmable there is a great deal of user control in how to preprocess data from peripheral modules. The module may be configured as a simple DMA controller to organize data such that the host only deals with whole messages and not individual words or bytes. The Xgate may also deal with higher levels of messaging protocols than the peripheral hardware recognizes. Encryption algorithms are also supported by the instruction set.
Development status: Planning
The Memory Stealer solves a common problem found with modern highly integrated SoCs and controllers. The problem is that there is no external bus connection for peripherals: no ISA-like bus, no LPC and no PCI. So how can we add some nice additional bells and whistles? We put the bells and whistles into an FPGA, using the vast amount of open source cores, and interconnect them with the Wishbone bus. Then we connect the FPGA to the SoC using the Memory Stealer. The Memory Stealer mimics memory of the needed type (SRAM, SDRAM, DDR-SDRAM, DDR2-SDRAM) and adds some address decoding and memory-window technology.
Development status: Planning
DNA Sequence Alignment Accelerator
This is an easily configurable systolic array of processors to compute the optimal alignment between two DNA sequences. It supports affine gap penalties, and is configurable between local (Smith-Waterman) and global (Needleman-Wunsch) alignment algorithms by setting an internal register.
Development status: Alpha
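As a software reference for what such an accelerator computes, the following sketch scores an alignment with affine gap penalties and switches between local and global mode with a flag, loosely mirroring the register setting mentioned above. It is not the core's implementation: the function name align_score, the scoring values and the gap model (a gap of length k costs gap_open + (k - 1) * gap_extend) are assumptions made for the example.

```python
NEG_INF = float("-inf")


def align_score(a, b, local=False, match=2, mismatch=-1, gap_open=-3, gap_extend=-1):
    """Optimal alignment score of strings a and b with affine gaps.

    local=True gives a Smith-Waterman style local score,
    local=False a Needleman-Wunsch style global score.
    All scoring parameters are illustrative assumptions.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # M: alignment ends in a match/mismatch
    # X: alignment ends with a[i-1] aligned to a gap
    # Y: alignment ends with b[j-1] aligned to a gap
    M = [[NEG_INF] * cols for _ in range(rows)]
    X = [[NEG_INF] * cols for _ in range(rows)]
    Y = [[NEG_INF] * cols for _ in range(rows)]
    M[0][0] = 0
    for i in range(1, rows):
        if local:
            M[i][0] = 0  # a local alignment may start anywhere
        else:
            X[i][0] = gap_open + (i - 1) * gap_extend
    for j in range(1, cols):
        if local:
            M[0][j] = 0
        else:
            Y[0][j] = gap_open + (j - 1) * gap_extend
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            s = match if a[i - 1] == b[j - 1] else mismatch
            M[i][j] = max(M[i - 1][j - 1], X[i - 1][j - 1], Y[i - 1][j - 1]) + s
            X[i][j] = max(M[i - 1][j] + gap_open, X[i - 1][j] + gap_extend,
                          Y[i - 1][j] + gap_open)
            Y[i][j] = max(M[i][j - 1] + gap_open, Y[i][j - 1] + gap_extend,
                          X[i][j - 1] + gap_open)
            if local:
                M[i][j] = max(M[i][j], 0)  # never carry a negative prefix
                best = max(best, M[i][j])
    if local:
        return best
    return max(M[-1][-1], X[-1][-1], Y[-1][-1])
```

Each cell depends only on its upper, left and upper-left neighbours, so all cells on one anti-diagonal can be computed at the same time; that independence is what generally makes this kind of recurrence a good fit for a systolic array.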
Assembler with VHDL User-defined Commands (AVUC)
This project proposes a method to implement short structured programs inside an FPGA. The novelty of the proposed method is that the commands that constitute the executable program are defined directly by the user in VHDL code. Applying this method, the resolution of a problem can be partitioned in two: on the one hand, the complex hardware functions can be implemented by the VHDL definitions, while, on the other hand, the higher-level decision making, loops, iterations and conditional branching or testing can be handled by the executable program.
Development status: Alpha
Just Another Ray Tracer
This is a hardware sphereflake ray tracer.
Development status: Alpha
Parallel CRC Generator
CRC Generator is a command-line application that generates Verilog or VHDL code for a parallel CRC of any data width between 1 and 1024 and polynomial width between 1 and 1024. The CRC can be custom or protocol specific, for example PCI Express, USB5, USB16, 802.3, SATA. The code is written in C and is cross-platform compatible.
Development status: Alpha
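To show what "parallel CRC" means, the sketch below derives the XOR equations that update a CRC register for several data bits per clock by running the bit-serial shift register symbolically. This is a common textbook approach and is not claimed to be how the tool itself works; the function names are hypothetical, the bit order is MSB-first, and the bit-reflection and final-XOR conventions of real protocols are ignored.

```python
def serial_crc_step(state, bit, poly, width):
    """One clock of the bit-serial CRC shift register (MSB-first).

    poly holds the polynomial without its top x^width term.
    """
    feedback = ((state >> (width - 1)) & 1) ^ bit
    state = (state << 1) & ((1 << width) - 1)
    if feedback:
        state ^= poly
    return state


def parallel_crc_equations(poly, width, data_width):
    """Derive next-state XOR equations for data_width bits per clock.

    Each register bit is tracked as a set of symbols such as 'c3'
    (current CRC bit 3) or 'd0' (data bit 0). XOR of two terms is the
    symmetric difference of their sets. Entry i of the result lists
    the inputs that are XORed together to form next-state bit i.
    """
    state = [frozenset({f"c{i}"}) for i in range(width)]
    for k in reversed(range(data_width)):  # highest data bit enters first
        feedback = state[width - 1] ^ frozenset({f"d{k}"})
        new = [frozenset()] + state[:-1]   # shift left by one position
        for i in range(width):
            if (poly >> i) & 1:
                new[i] = new[i] ^ feedback
        state = new
    return state


def evaluate_equations(equations, state, data):
    """Apply derived equations to concrete register and data values."""
    result = 0
    for i, terms in enumerate(equations):
        bit = 0
        for name in terms:
            source = state if name[0] == "c" else data
            bit ^= (source >> int(name[1:])) & 1
        result |= bit << i
    return result
```

For example, parallel_crc_equations(0x05, 5, 4) gives the equations for the polynomial x^5 + x^2 + 1 with four data bits per clock; each resulting set maps directly onto one XOR tree in generated HDL.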
This project contains a collection of cores that interface with various gamepads. Each gamepad type has a dedicated controller core which handles the communication with one or more pads of the same type. The status of the buttons together with information about the pad axes is provided by a simple interface. Systems integrating such a core can monitor this interface via general purpose IO lines.
Development status: Beta
The openMSP430 is a synthesizable 16bit microcontroller core written in Verilog. It is compatible with Texas Instruments' MSP430 microcontroller family and can execute the code generated by an MSP430 toolchain in a cycle accurate way. The core comes with some peripherals (GPIO, TimerA, generic templates) and a Serial Debug Interface for in-system software development.
Development status: Stable
The VHDL Test Bench
The VHDL test bench is a collection of VHDL procedures and functions which allow the user to create their own scripting instructions for test stimulus. The stimulus script or test case contains the instructions in a regular ASCII text file. The function of the instructions is coded in VHDL as part of the test bench. The test bench VHDL package contains procedures to create instructions, read, parse and execute the test script (stimulus file, test case, script).
Development status: Stable
Computer Operating Properly
The Computer Operating Properly Module, COP, is a watchdog timer module that triggers a system reset if it is not regularly serviced by writing two specific words to its control registers. The intention of the module is to bring an embedded system back to a “good” state after the software program has lost control of the system.
Development status: Beta
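The servicing scheme described above can be modelled in a few lines. This is a behavioural sketch only: the service words 0x55 and 0xAA, the timeout handling and the class name Watchdog are assumptions for illustration and are not taken from the COP module's specification.

```python
class Watchdog:
    """Toy model of a watchdog that must be serviced by two words in order."""

    def __init__(self, timeout, first_word=0x55, second_word=0xAA):
        self.timeout = timeout
        self.first_word = first_word      # hypothetical service word 1
        self.second_word = second_word    # hypothetical service word 2
        self.counter = timeout
        self.armed = False                # True once the first word was written
        self.reset_count = 0

    def write(self, word):
        """Model a write to the service register."""
        if self.armed and word == self.second_word:
            self.counter = self.timeout   # correct sequence: restart the countdown
            self.armed = False
        else:
            self.armed = (word == self.first_word)

    def tick(self):
        """Advance one clock; return True if a system reset is triggered."""
        self.counter -= 1
        if self.counter == 0:
            self.reset_count += 1
            self.counter = self.timeout
            self.armed = False
            return True
        return False
```

Requiring two different words in a fixed order makes it less likely that runaway software services the watchdog by accident, which is the usual motivation for this kind of scheme.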
Interview with Adam Dunkels, the creator of the minimal IP stack uIP and the Contiki operating system.
Adam Dunkels' interest in the minimal started with his degree thesis at Luleå University of Technology at the turn of the millennium. The project involved equipping players of the Luleå hockey team with various sensors and a camera. Data was to be transmitted wirelessly via Bluetooth; the idea was to give the spectators something extra. Anyone equipped with an iPaq handheld computer or a laptop could see how the players felt and receive images from cameras that were attached to the helmets.
- We ran a test and the system worked, but it was probably used only once. It was physically exhausting for the players and Luleå lost the game, he says and laughs.
Adam Dunkels' role in the project was to develop an IP stack. After his thesis work, he continued to develop the stack and eventually released it as open source under the name lwIP.
- Many were interested in using lwIP but wanted it to become even smaller. The general perception was that an IP stack is too big for an eight-bit processor.
Continued reduction of the IP stack
The interest generated by the project acted as an eye opener for Adam Dunkels, who in 2000 had started at the Swedish Institute of Computer Science (SICS) in Kista, Stockholm. He further developed lwIP and the result was uIP. uIP performs only the minimum necessary for the IP stack to meet the standard. It does not need more than 4 to 5 kB of code and about 1.5 kB of working memory.
- You can clip it down even more if you remove some features.
The IP stack was developed for small devices that could be linked up to a fixed network. It found its way into some commercial products, although many detractors at the time claimed that IP was too difficult and large for these devices.
- The funny thing is that we still hear the same arguments today for IP on wireless devices: that IP is overkill. I still maintain that it is feasible.
Methods to reduce the size of the IP stack can include compressing the header in an intelligent way. Adam Dunkels published the first articles on the theme in 2004, and they were picked up by people in the industry.
- In 2005 the Internet Engineering Task Force started a standardization group called 6lowpan, and in 2006 the first standard was published.
IP even in the smallest systems
Today there are products that use 6lowpan (IPv6 over Low power WPAN), including systems for home automation and industrial monitoring.
- IP is also being transmitted over Zigbee.
Today there are three major wireless standards for automation applications. They are Zigbee, Wireless Hart and ISA 100A.
The latter two are based on the Zigbee 6lowpan and discussions are underway to incorporate IPv6 via 6lowpan in the standard.
- I think it will be adopted, even if users won't know about it. Sure, it gets a little larger but sometimes this is necessary.
In 6lowpan some features, such as VPN (Virtual Private Network), have been stripped away. But it is possible to squeeze the size of the code even more.
His work at SICS has moved on to operating systems and programming abstractions, two other areas where he has also taken his minimalist approach.
The programming abstraction layer, known as Protothreads, is a method to reduce memory consumption. In normal operating systems a memory area is reserved for each thread whether it needs it or not.
- 30 percent of working memory can be reserved but still unused.
Adam Dunkels' solution can, a bit loosely, be described as putting all the threads on the same stack, which then gets a linear flow.
- The idea is incredibly simple, but new. You sit and think that this must be done and someone has made some similar things but not like this.
The advantage of Protothreads is that it is easier to write, monitor and debug the programs than with threaded programs. The latter normally lead to spaghetti-like flows where errors easily slip in.
- We have looked at a number of programs that have complex state machines. In principle it is possible to remove all state and state transitions.
Use in industry
With Protothreads, programs are up to 30 percent smaller and one can also find errors not detected earlier. Protothreads is currently used in a range of commercial products.
- ABB uses it in a vibration sensor, says Adam Dunkels and takes out his cell phone to show a picture of what looks like a small cylindrical USB stick.
Another example is a manufacturer of digital TV boxes. In this case Protothreads was adopted because its ease of use and portability allow quick porting to different target systems.
- Normally, it is difficult to port because there are tiny differences between the target systems. Programs written in Protothreads are exactly alike because they are based on C code, not assembler.
This works because Protothreads converts the threads directly into a state machine at compile time.
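The idea of thread-like code that is really a state machine can be sketched by analogy. Protothreads itself is a C library and the example below does not reproduce its macros; it uses Python generators as a stand-in, next to a hand-written state machine that does the same job. The names sender, wait_until, SenderStateMachine and the channel dictionary are all hypothetical.

```python
def wait_until(condition):
    """Give up the CPU until condition() becomes true."""
    while not condition():
        yield


def sender(channel, log):
    """Reads top to bottom like a thread, but keeps no private stack while blocked."""
    for packet in ("a", "b"):
        yield from wait_until(lambda: channel["free"])
        channel["free"] = False           # claim the channel
        log.append(f"sent {packet}")


class SenderStateMachine:
    """The same behaviour written by hand as an explicit state machine."""

    def __init__(self, channel, log):
        self.channel = channel
        self.log = log
        self.packets = ["a", "b"]
        self.index = 0                    # the only state kept between calls

    def step(self):
        """Run until blocked; return True while there is more work to do."""
        if self.index == len(self.packets):
            return False
        if not self.channel["free"]:
            return True                   # still blocked, try again next round
        self.channel["free"] = False
        self.log.append(f"sent {self.packets[self.index]}")
        self.index += 1
        return self.index < len(self.packets)
```

A scheduler would call next(thread, None) or step() once per round. Both versions produce the same log, but the generator version keeps the control flow in one linear function, which is the readability benefit described above.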
- I sat and poked with the programming of the IP stack when I found out.
So the transition from the minimal IP stack to programming abstractions is not as great as it at first may seem.
- I try to keep things apart, even if they are linked.
Protothreads is also part of the minimal operating system Contiki, which is his latest project. The fact that Protothreads is seeing more use is only due to it having been around longer.
- It's called the Hollywood principle. Do not call us, we'll call you.
According to Adam Dunkels, it is easier to get programmers to embrace an IP stack or a thread library than to switch operating systems.
Contiki, which recently turned six years old, contains the IP stack uIP as well as the radio stack Rime. The operating system has been ported to, among others, Texas Instruments MSP430 and x86 processors, and there are companies that use the operating system on Atmel's AVR CPU and Hitachi's HC12.
- One of the coolest things about Contiki is that it measures the energy consumption.
This feature can be used to study what consumes power in such a small wireless sensor. This works because the processor controls all functions.
Listening drains battery.
It turns out that the radio is the major energy consumer, not the CPU. Most people are probably not aware that it costs as much or more to listen on the radio as to send. The radio module used at SICS for 802.15.4, the radio standard on which Zigbee among others is based, draws 20 mA when listening and 18 mA when transmitting.
This means that a battery-powered Zigbee node in practice can only be used in a star model with a base station that collects data. If you instead configure a mesh network, where the nodes forward data before it reaches the collection point, the battery will run out quickly.
- Saab did a trial where they used Zigbee nodes to protect a Jas aircraft but the batteries lasted only a few days.
The problem is that the nodes have to go back and listen quite often to know if any of the other nodes send a packet to be forwarded. How often partly depends on what response time you want the system to have and how accurate the clocks are.
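A rough calculation shows why listening dominates battery life. Only the 20 mA listening current comes from the figures above; the battery capacity of 2000 mAh, the sleep current of 0.01 mA and the function name battery_life_hours are assumptions chosen for the sake of the example.

```python
def battery_life_hours(capacity_mah, listen_ma, sleep_ma, duty_cycle):
    """Estimated lifetime when the radio listens for `duty_cycle` of the time.

    duty_cycle is a fraction between 0 and 1; the rest of the time the
    node is assumed to sleep at sleep_ma.
    """
    average_ma = duty_cycle * listen_ma + (1 - duty_cycle) * sleep_ma
    return capacity_mah / average_ma


# Assumed 2000 mAh battery, radio listening all the time at 20 mA:
always_on = battery_life_hours(2000, 20, 0.01, 1.0)    # 100.0 hours, about four days

# Same battery, radio listening only 1 percent of the time:
duty_cycled = battery_life_hours(2000, 20, 0.01, 0.01)
```

Under these assumptions a node that listens continuously lasts only about four days, which is in line with the trial mentioned above, while listening 1 percent of the time stretches the lifetime to roughly a year. The price is a longer response time, which is the trade-off described in the text.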
- I am convinced that it is possible to improve this.
By continuously measuring the power consumption of each node, the remaining capacity on each can be known. The information can be used to dynamically change the protocol. But even the operating system may help to lower power consumption.
- You can see the operating system as a program that helps me to make other programs, namely to implement the protocol.
By building functions to send from one to all, from one to only one or from many, a programmer does not have to worry about the details.
- It raises the level of abstraction. In the end it comes down to a communication protocol such as 6lowpan or Zigbee.
A number of these communication methods are implemented in Contiki.
Another effect of letting the operating system take care of parts of the protocol is that it can access the information contained in the header and process it during the decoding. There may be information on signal strength, time stamps, sender and other things that can be used to optimize energy consumption in the whole system.
- Many believe that it is slower to do it this way but there is no measurable difference.
A piquant detail with the sensor modules that can be purchased from, for example, Texas Instruments and Atmel, is that they introduce extra latency. This is because the radio chipset and the host processor communicate via an SPI bus which is not capable of more than 180 kbit/s. The radio, however, peaks at 250 kbit/s.
- If we also do multi-hop, sending data packets through multiple nodes to a collection point, the effective data rate decreases to 10 kbit/s.
Preloading to increase the data rate
There are ways to increase the data rate. If you reduce the packet size it can be up to 20 kbit/s. Running over several radio channels and avoiding interference, it can reach 60 kbit/s. But because you cannot listen and send simultaneously with these radio modules, the theoretical maximum speed is just under 112 kbit/s if one also takes into account the need for a small preamble and checksum.
One solution is to pre-load the radio with the next packet and then send it as soon as a new packet has been received.
- Again there is an abstraction level that we can add to the operating system.
Research results are all available as open source through the SICS website and it has resulted in a doctorate.
- There is an extra bonus of working in such a place as SICS.
In February, he received the Chester Carlson Award of Engineering Sciences. The prize is named after Xerox's founder, Swedish descendant Chester Carlson.
- It was incredibly fun to get the recognition, even if it was stuff I did around 2001 and 2002.
In early April, it was Microsoft's turn to praise him when he was awarded the Roger Needham Award.
The awards have meant that he ended up in the media spotlight, which certainly can be good if he is serious about his thoughts of starting a business based on the research results.
- It would be fun to do and it interests me more and more. But it is difficult to sell platform software as a CD in a cardboard box. Only Microsoft has succeeded at that.
And the software developed by Adam Dunkels exists as open source.