Cores are generated from Confluence; a modern logic design language. Confluence is a simple, yet highly expressive language that compiles into Verilog, VHDL, and C. See Confluent.org for more info. Several cores are provided in Verilog, Vhdl, and C. If you don't see the configuration you need, chances are we can easily generate it for you. The Reconfigurable Computing Array (RCA) is a platform for dynamic reconfigurable computing. RCA consists of a fine-grained array of reconfigurable "square" logic tiles. Similar to an FPGA CLB, a tile can be programmed to perform a wide variety of functions.
Overview
Unlike FPGAs, RCA has no routing fabric. Rather, all tiles
communicate directly with their nearest neighbor, i.e., north, south, west, east.
Because a tile's inputs are registered, the lack of routing fabric
prevents end-to-end combinatorial logic design that is possible with general
purpose FPGAs.
However, the advantage of "hard-wiring" tiles is 2 fold: greater logic density
and improved speed. FPGAs consume 80-90% of their area on routing; only 10%
yields useful logic in the form of CLBs. Without the routing fabric, it is possible RCA
can increase logic density by a factor of 10.
Secondly, because signals are registered across tile boundaries, timing is
deterministic and constant. Further more, since tiles are fine-grained, clocks rates
into the GHz should be possible.
The goal of this project is to develop an understanding of optimal tile architecture
trade-offs and RCA compiler technology.
Tile Structure
A tile is square, having four 1-bit inputs and four 1-bit outputs named
north, south, west, and east. An array is a collection of tiles organized
like a checkerboard, each side connecting to an adjacent tile. For instance, the east output of
a tile of the left plugs into the west input of a tile on the right.
In terms of tile architecture, there are several possibilities.
The initial architecture is based on 3-to-1 look-up tables (LUTs).
There are four LUTs per tile -- one for each direction -- each LUT with three
8-to-1 multiplexers for input data selection.
The following illustrates the tile architecture (only the north datapath shown):
Top Level Interface and Array Configuration
At the top level, RCA has 4 input data buses and 4 output data buses;
and input and output bus for each side of the array (N, S, W, E).
Bit 0 of "north_i", "north_o", "south_i," and "south_o" corresponds to the western most tile.
Likewise, bit 0 of "west_i", "west_o", "east_i", and "east_o" corresponds to
the northern most tile. All tile interconnection registers are synchronized
on the "clock_main_c" clock.
In addition to the data busses, the configuration bus handles the programming and
reconfiguration of the array. Configuration is synchronized on the "clock_config_c" clock.
Each data path within each tile is addressable.
Configuration addressing is as follows (msb on the left):
- ConfigAddr = {RowSelect, ColSelect, DirSelect}
- RowSelect of 0 corresponds to the northern most row.
- ColSelect of 0 corresponds to the wester most column.
- DirSelect: 00=north, 01=south, 10=west, 11=east.
The configuration data is 18-bits. It defines the LUT function,
the input MUX selection, and the output MUX selection, for a specify tile
datapath. The follow defines the configuration data format:
- ConfigData[17] : Output Select (0=direct, 1=registered)
- ConfigData[16:14] : Input Select 2
- ConfigData[13:11] : Input Select 1
- ConfigData[10:8] : Input Select 0
- 000=north_in
- 001=south_in
- 010=west_in
- 011=east_in
- 100=north_state
- 101=south_state
- 110=west_state
- 111=east_state
- ConfigData[7:0] : LUT data {f(7), f(6), f(5), f(4), f(3), f(2), f(1), f(0)}
Routing and Function
With the lack of routing fabric, data routing is performed in the configuration of each tile.
Because every tile input is registered, designs on RCA are micro-pipelined.
To simplify pipeline data aliment, each tile output can come directly from the LUT
or delayed 1 cycle though an output register.
With each tile having 4 independent datapaths (N, S, W, E), function and routing can be
grouped onto the same tile. For instance, a function can be performed from West and South to East,
while at the same time data is routed from North to South. Note the South input and South output are
separate datapaths.
Embedded Extensions
As with platform FPGAs, RCA can benefit from specialized embedded components,
such as block ram, hardware multipliers, and processors. Implementing embedded
components is possible by replacing internal tiles groups with hard IP.