The Elliptic Curve Group core is for computing the addition of two elements in the elliptic curve group, and the addition of $c$ identical elements in the elliptic curve group.
The elliptic curve is super-singular $E:y^2=x^3-x+1$ in affine coordinates defined over a Galois field $GF(3^m)$, $m=97$, whose irreducible polynomial is $x^97+x^12+2$.
The elliptic curve group is the set of solutions $(x,y)$ over $GF(3^m)$ to the equation of $E$, together with an additional point at infinity, denoted $O$. An element in the elliptic curve group is also called “a point”. The elliptic curve group is abelian. The group law is described in the document/specification.
The Elliptic Curve Group core consists of two modules, one computing the addition of two elliptic curve group elements ($P_1+P_2$) and the other computing the addition of many identical elliptic curve group elements ($c⋅P_1$). The first module is called $point_add$. The second module is called $point_scalar_mult$.
The core is written in Verilog 2001, and it is carefully optimized for FPGA. For example, input signals are synchronous and sampled at the rising edge of the clock. Output signals are driven by flip-flops, and not directly connected to input signals by combinational logic. There is no latch, and only one clock domain in entire core.
The $point_add$ module runs at 192 MHz on the Xilinx Virtex-4 XC4VLX200-11FF1513 FPGA board. It computes one addition within 2.7 microseconds if with a 100MHz clock. The $point_add$ module uses 12,099 (6%) LUTs, 6,694 (7%) slices, 6,141 (3%) flip-flops of the XC4VLX200-11FF1513 FPGA board.
The $point_scalar_mult$ module runs at 148 MHz on the Xilinx Virtex-4 XC4VLX200-11FF1513 FPGA board. It computes one addition within 0.552 milliseconds if with a 100MHz clock. The $point_scalar_mult$ module uses 13,780 (7%) LUTs, 7,272 (8%) slices, 7,451 (4%) flip-flops of the XC4VLX200-11FF1513 FPGA board.
The core is open source, under the license of LGPL version 3.
- Elliptic Curve Group for hyper-elliptic curve $y^2=x^3-x+1$
- The irreducible polynomial is $x^97+x^12+2$
- Fully synchronous design
- Fully synthesize-able
- ONLY ONE clock domain in entire core
- NO latch
- All output signals are buffered
- Vendor independent code
- The core is ready and available in Verilog from OpenCores svn
- using projective coordinates may improve speed
- adopting base-3 scalar multiplication value may improve speed, requiring base-2 to base-3 transforming function, and point tripling function
If this project has helped you, please consider donating an FPGA to Homer Hsing (Xilinx FPGA is preferred). To donate him will help him develop more valuable project, and is to help you.