OpenCores
URL https://opencores.org/ocsvn/neorv32/neorv32/trunk

Subversion Repositories neorv32

Compare Revisions

  • This comparison shows the changes necessary to convert path
    /neorv32/trunk/sw/example/floating_point_test
    from Rev 55 to Rev 56

Rev 55 → Rev 56

/README.md
1,5 → 1,7
# NEORV32 `Zfinx` Floating-Point Extension
 
The NEORV32 floating-point unit (FPU) implements the `Zfinx` RISC-V extension. The extension can be enabled via the `CPU_EXTENSION_RISCV_Zfinx` top configuration generic.
 
The RISC-V `Zfinx` single-precision floating-point extension uses the integer register file `x` instead of the dedicated floating-point `f` register file (which is
defined by the RISC-V `F` single-precision floating-point extension). Hence, the standard data transfer instructions from the `F` extension are **not** available in `Zfinx`:
 
6,7 → 8,6
* floating-point load/store operations (`FLW`, `FSW`) and their compressed versions
* integer register file `x` <-> floating point register file `f` move operations (`FMV.W.X`, `FMV.X.W`)
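
Because `Zfinx` keeps floating-point values in the integer register file, switching between the `float` view and the raw 32-bit view of a value needs no dedicated move instruction; in C it is only a reinterpretation of the same bits. A minimal sketch, assuming a union shaped like the `float_conv_t` used by the test program (the type name `float_bits_t` and both helper functions are hypothetical and not part of the NEORV32 sources):

```c
#include <stdint.h>

// Hypothetical stand-in for the project's float_conv_t union.
typedef union {
  uint32_t binary_value; // raw IEEE 754 single-precision bit pattern
  float    float_value;  // the same 32 bits viewed as a float
} float_bits_t;

// Return the raw bit pattern of a float (no value conversion takes place).
static uint32_t float_to_bits(float f) {
  float_bits_t tmp;
  tmp.float_value = f;
  return tmp.binary_value;
}

// Return the float whose bit pattern is 'bits' (no value conversion takes place).
static float bits_to_float(uint32_t bits) {
  float_bits_t tmp;
  tmp.binary_value = bits;
  return tmp.float_value;
}
```

Reading a union member other than the one last written is permitted in C (unlike C++), which is why a union is a common choice for this kind of reinterpretation.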
 
 
:information_source: More information regarding the RISC-V `Zfinx` single-precision floating-point extension can be found in the official GitHub repo:
[`github.com/riscv/riscv-zfinx`](https://github.com/riscv/riscv-zfinx).
 
14,6 → 15,12
Make sure you **do not** use the `f` ISA attribute when compiling applications that use floating-point arithmetic (`-march=rv32i*f*` is **NOT ALLOWED!**).
 
 
### :warning: FPU Limitations
 
* The FPU **does not support subnormal numbers** yet. Subnormal FPU inputs and subnormal FPU results are always *flushed to zero*. The *classify* instruction `FCLASS` will never set the "subnormal" mask bits.
* Rounding mode `0b100` "round to nearest, ties to max magnitude" is not supported yet; this and all invalid rounding mode configurations behave as "round towards zero" (truncation).
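
The flush-to-zero behaviour described above can be modelled on the raw bit pattern. The following is only a sketch: the function name `flush_subnormal_bits` is hypothetical, and keeping the sign of the flushed value is an assumption rather than something this README states. A single-precision value is subnormal when its exponent field is all zeros and its mantissa field is non-zero.

```c
#include <stdint.h>

// Hypothetical model: replace a subnormal bit pattern by a zero of the same sign.
// Sign preservation is an assumption, not a documented property of the FPU.
static uint32_t flush_subnormal_bits(uint32_t bits) {
  uint32_t exponent = (bits >> 23) & 0xFFu;  // 8-bit exponent field
  uint32_t mantissa = bits & 0x007FFFFFu;    // 23-bit mantissa field

  if ((exponent == 0u) && (mantissa != 0u)) {
    return bits & 0x80000000u; // keep only the sign bit
  }
  return bits; // zeros, normal numbers, infinities and NaNs pass through unchanged
}
```

Working on the bit pattern rather than on a `float` keeps the model independent of how the host machine itself treats subnormal numbers.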
 
 
## Intrinsic Library
 
The NEORV32 `Zfinx` floating-point extension can still be used via the provided **intrinsic library**. This library uses "custom" inline assembly instructions
36,3 → 43,10
 
The provided test program `main.c` verifies all currently implemented `Zfinx` instructions by checking the functionality against the pure software-based emulation model
(GCC soft-float library).
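
The general shape of such a check is: draw pseudo-random operands, run the operation through two independent implementations, and compare the raw result bits. The sketch below illustrates that idea only. It assumes the classic xorshift32 shift triple 13/17/5 (the test program calls an `xorshift32()` function, but its parameters are not shown here), and the two `fsgnjx_model_*` functions are hypothetical stand-ins for an intrinsic and its emulation counterpart.

```c
#include <stdint.h>

// One xorshift32 step; the 13/17/5 shift triple is an assumption.
// The state must be non-zero.
static uint32_t xorshift32_next(uint32_t *state) {
  uint32_t x = *state;
  x ^= x << 13;
  x ^= x >> 17;
  x ^= x << 5;
  *state = x;
  return x;
}

typedef uint32_t (*binop_t)(uint32_t a, uint32_t b);

// Two hypothetical bit-level models of sign-injection XOR (FSGNJX.S):
// the result is 'a' with its sign bit XORed with the sign bit of 'b'.
static uint32_t fsgnjx_model_a(uint32_t a, uint32_t b) {
  return a ^ (b & 0x80000000u);
}

static uint32_t fsgnjx_model_b(uint32_t a, uint32_t b) {
  return (a & 0x7FFFFFFFu) | ((a ^ b) & 0x80000000u);
}

// Count how many random operand pairs produce different result bits.
static uint32_t count_mismatches(binop_t dut, binop_t ref,
                                 uint32_t seed, uint32_t num_cases) {
  uint32_t state = seed;
  uint32_t errors = 0;
  for (uint32_t i = 0; i < num_cases; i++) {
    uint32_t a = xorshift32_next(&state);
    uint32_t b = xorshift32_next(&state);
    if (dut(a, b) != ref(a, b)) {
      errors++;
    }
  }
  return errors;
}
```

Comparing bit patterns instead of `float` values avoids the pitfall that a NaN never compares equal to itself, so matching NaN results would otherwise be counted as errors.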
 
 
## Resources
 
* Great page with online calculators for floating-point arithmetic: [http://www.ecs.umass.edu/ece/koren/arith/simulator/](http://www.ecs.umass.edu/ece/koren/arith/simulator/)
* A handy tool for visualizing floating-point numbers in their binary representation: [https://www.h-schmidt.net/FloatConverter/IEEE754.html](https://www.h-schmidt.net/FloatConverter/IEEE754.html)
* This helped me to understand what results the different FPU operations generate when given "special" inputs like NaN: [https://techdocs.altium.com/display/FPGA/IEEE+754+Standard+-+Overview](https://techdocs.altium.com/display/FPGA/IEEE+754+Standard+-+Overview)
/main.c
82,6 → 82,8
#define RUN_CLASSIFY_TESTS (1)
//** Run unsupported instructions tests when != 0 */
#define RUN_UNAVAIL_TESTS (1)
//** Run average instruction execution time test when != 0 */
#define RUN_TIMING_TESTS (0)
/**@}*/
 
 
143,7 → 145,8
#if (SILENT_MODE != 0)
neorv32_uart_printf("SILENT_MODE enabled (only showing actual errors)\n");
#endif
neorv32_uart_printf("Test cases per instruction: %u\n\n", (uint32_t)NUM_TEST_CASES);
neorv32_uart_printf("Test cases per instruction: %u\n", (uint32_t)NUM_TEST_CASES);
neorv32_uart_printf("NOTE: The NEORV32 FPU does not support subnormal numbers yet. Subnormal numbers are flushed to zero.\n\n");
 
 
// clear exception status word
413,7 → 416,7
opa.binary_value = get_test_vector();
opb.binary_value = get_test_vector();
riscv_intrinsic_fdivs(opa.float_value, opb.float_value);
if (neorv32_cpu_csr_read(CSR_MCAUSE) == 0) {
if (neorv32_cpu_csr_read(CSR_MCAUSE) != TRAP_CODE_I_ILLEGAL) {
neorv32_uart_printf("%c[1m[FAILED]%c[0m\n", 27, 27);
err_cnt_total++;
}
426,7 → 429,7
opa.binary_value = get_test_vector();
opb.binary_value = get_test_vector();
riscv_intrinsic_fsqrts(opa.float_value);
if (neorv32_cpu_csr_read(CSR_MCAUSE) == 0) {
if (neorv32_cpu_csr_read(CSR_MCAUSE) != TRAP_CODE_I_ILLEGAL) {
neorv32_uart_printf("%c[1m[FAILED]%c[0m\n", 27, 27);
err_cnt_total++;
}
439,7 → 442,7
opa.binary_value = get_test_vector();
opb.binary_value = get_test_vector();
riscv_intrinsic_fmadds(opa.float_value, opb.float_value, -opa.float_value);
if (neorv32_cpu_csr_read(CSR_MCAUSE) == 0) {
if (neorv32_cpu_csr_read(CSR_MCAUSE) != TRAP_CODE_I_ILLEGAL) {
neorv32_uart_printf("%c[1m[FAILED]%c[0m\n", 27, 27);
err_cnt_total++;
}
452,7 → 455,7
opa.binary_value = get_test_vector();
opb.binary_value = get_test_vector();
riscv_intrinsic_fmsubs(opa.float_value, opb.float_value, -opa.float_value);
if (neorv32_cpu_csr_read(CSR_MCAUSE) == 0) {
if (neorv32_cpu_csr_read(CSR_MCAUSE) != TRAP_CODE_I_ILLEGAL) {
neorv32_uart_printf("%c[1m[FAILED]%c[0m\n", 27, 27);
err_cnt_total++;
}
465,7 → 468,7
opa.binary_value = get_test_vector();
opb.binary_value = get_test_vector();
riscv_intrinsic_fnmadds(opa.float_value, opb.float_value, -opa.float_value);
if (neorv32_cpu_csr_read(CSR_MCAUSE) == 0) {
if (neorv32_cpu_csr_read(CSR_MCAUSE) != TRAP_CODE_I_ILLEGAL) {
neorv32_uart_printf("%c[1m[FAILED]%c[0m\n", 27, 27);
err_cnt_total++;
}
478,7 → 481,7
opa.binary_value = get_test_vector();
opb.binary_value = get_test_vector();
riscv_intrinsic_fnmsubs(opa.float_value, opb.float_value, -opa.float_value);
if (neorv32_cpu_csr_read(CSR_MCAUSE) == 0) {
if (neorv32_cpu_csr_read(CSR_MCAUSE) != TRAP_CODE_I_ILLEGAL) {
neorv32_uart_printf("%c[1m[FAILED]%c[0m\n", 27, 27);
err_cnt_total++;
}
488,7 → 491,325
#endif
 
 
// final report
// ----------------------------------------------------------------------------
// Instruction execution timing test
// ----------------------------------------------------------------------------
 
#if (RUN_TIMING_TESTS != 0)
 
uint32_t time_start, time_sw, time_hw;
const uint32_t num_runs = 4096;
 
neorv32_uart_printf("\nAverage execution time tests (%u runs)\n", num_runs);
 
 
// signed integer to float
neorv32_uart_printf("FCVT.S.W: ");
time_sw = 0;
time_hw = 0;
err_cnt = 0;
for (i=0; i<num_runs; i++) {
opa.binary_value = get_test_vector();
 
// hardware execution time
time_start = neorv32_cpu_csr_read(CSR_CYCLE);
{
res_hw.float_value = riscv_intrinsic_fcvt_sw((int32_t)opa.binary_value);
}
time_hw += neorv32_cpu_csr_read(CSR_CYCLE) - time_start;
time_hw -= 4; // remove the 2 dummy instructions
 
// software (emulation) execution time
time_start = neorv32_cpu_csr_read(CSR_CYCLE);
{
res_sw.float_value = riscv_emulate_fcvt_sw((int32_t)opa.binary_value);
}
time_sw += neorv32_cpu_csr_read(CSR_CYCLE) - time_start;
 
if (res_sw.binary_value != res_hw.binary_value) {
err_cnt++;
}
}
 
if (err_cnt == 0) {
neorv32_uart_printf("cycles[SW] = %u vs. cycles[HW] = %u\n", time_sw/num_runs, time_hw/num_runs);
}
else {
neorv32_uart_printf("%c[1m[TEST FAILED!]%c[0m\n", 27, 27);
err_cnt_total++;
}
 
 
// float to signed integer
neorv32_uart_printf("FCVT.W.S: ");
time_sw = 0;
time_hw = 0;
err_cnt = 0;
for (i=0; i<num_runs; i++) {
opa.binary_value = get_test_vector();
 
// hardware execution time
time_start = neorv32_cpu_csr_read(CSR_CYCLE);
{
res_hw.binary_value = (uint32_t)riscv_intrinsic_fcvt_ws(opa.float_value);
}
time_hw += neorv32_cpu_csr_read(CSR_CYCLE) - time_start;
time_hw -= 4; // remove the 2 dummy instructions
 
// software (emulation) execution time
time_start = neorv32_cpu_csr_read(CSR_CYCLE);
{
res_sw.binary_value = (uint32_t)riscv_emulate_fcvt_ws(opa.float_value);
}
time_sw += neorv32_cpu_csr_read(CSR_CYCLE) - time_start;
 
if (res_sw.binary_value != res_hw.binary_value) {
err_cnt++;
}
}
 
if (err_cnt == 0) {
neorv32_uart_printf("cycles[SW] = %u vs. cycles[HW] = %u\n", time_sw/num_runs, time_hw/num_runs);
}
else {
neorv32_uart_printf("%c[1m[TEST FAILED!]%c[0m\n", 27, 27);
err_cnt_total++;
}
 
 
// addition
neorv32_uart_printf("FADD.S: ");
time_sw = 0;
time_hw = 0;
err_cnt = 0;
for (i=0; i<num_runs; i++) {
opa.binary_value = get_test_vector();
opb.binary_value = get_test_vector();
 
// hardware execution time
time_start = neorv32_cpu_csr_read(CSR_CYCLE);
{
res_hw.float_value = riscv_intrinsic_fadds(opa.float_value, opb.float_value);
}
time_hw += neorv32_cpu_csr_read(CSR_CYCLE) - time_start;
time_hw -= 4; // remove the 2 dummy instructions
 
// software (emulation) execution time
time_start = neorv32_cpu_csr_read(CSR_CYCLE);
{
res_sw.float_value = riscv_emulate_fadds(opa.float_value, opb.float_value);
}
time_sw += neorv32_cpu_csr_read(CSR_CYCLE) - time_start;
 
if (res_sw.binary_value != res_hw.binary_value) {
err_cnt++;
}
}
 
if (err_cnt == 0) {
neorv32_uart_printf("cycles[SW] = %u vs. cycles[HW] = %u\n", time_sw/num_runs, time_hw/num_runs);
}
else {
neorv32_uart_printf("%c[1m[TEST FAILED!]%c[0m\n", 27, 27);
err_cnt_total++;
}
 
 
// subtraction
neorv32_uart_printf("FSUB.S: ");
time_sw = 0;
time_hw = 0;
err_cnt = 0;
for (i=0; i<num_runs; i++) {
opa.binary_value = get_test_vector();
opb.binary_value = get_test_vector();
 
// hardware execution time
time_start = neorv32_cpu_csr_read(CSR_CYCLE);
{
res_hw.float_value = riscv_intrinsic_fsubs(opa.float_value, opb.float_value);
}
time_hw += neorv32_cpu_csr_read(CSR_CYCLE) - time_start;
time_hw -= 4; // remove the 2 dummy instructions
 
// software (emulation) execution time
time_start = neorv32_cpu_csr_read(CSR_CYCLE);
{
res_sw.float_value = riscv_emulate_fsubs(opa.float_value, opb.float_value);
}
time_sw += neorv32_cpu_csr_read(CSR_CYCLE) - time_start;
 
if (res_sw.binary_value != res_hw.binary_value) {
err_cnt++;
}
}
 
if (err_cnt == 0) {
neorv32_uart_printf("cycles[SW] = %u vs. cycles[HW] = %u\n", time_sw/num_runs, time_hw/num_runs);
}
else {
neorv32_uart_printf("%c[1m[TEST FAILED!]%c[0m\n", 27, 27);
err_cnt_total++;
}
 
 
// multiplication
neorv32_uart_printf("FMUL.S: ");
time_sw = 0;
time_hw = 0;
err_cnt = 0;
for (i=0; i<num_runs; i++) {
opa.binary_value = get_test_vector();
opb.binary_value = get_test_vector();
 
// hardware execution time
time_start = neorv32_cpu_csr_read(CSR_CYCLE);
{
res_hw.float_value = riscv_intrinsic_fmuls(opa.float_value, opb.float_value);
}
time_hw += neorv32_cpu_csr_read(CSR_CYCLE) - time_start;
time_hw -= 4; // remove the 2 dummy instructions
 
// software (emulation) execution time
time_start = neorv32_cpu_csr_read(CSR_CYCLE);
{
res_sw.float_value = riscv_emulate_fmuls(opa.float_value, opb.float_value);
}
time_sw += neorv32_cpu_csr_read(CSR_CYCLE) - time_start;
 
if (res_sw.binary_value != res_hw.binary_value) {
err_cnt++;
}
}
 
if (err_cnt == 0) {
neorv32_uart_printf("cycles[SW] = %u vs. cycles[HW] = %u\n", time_sw/num_runs, time_hw/num_runs);
}
else {
neorv32_uart_printf("%c[1m[TEST FAILED!]%c[0m\n", 27, 27);
err_cnt_total++;
}
 
 
// Max
neorv32_uart_printf("FMAX.S: ");
time_sw = 0;
time_hw = 0;
err_cnt = 0;
for (i=0; i<num_runs; i++) {
opa.binary_value = get_test_vector();
opb.binary_value = get_test_vector();
 
// hardware execution time
time_start = neorv32_cpu_csr_read(CSR_CYCLE);
{
res_hw.float_value = riscv_intrinsic_fmaxs(opa.float_value, opb.float_value);
}
time_hw += neorv32_cpu_csr_read(CSR_CYCLE) - time_start;
time_hw -= 4; // remove the 2 dummy instructions
 
// software (emulation) execution time
time_start = neorv32_cpu_csr_read(CSR_CYCLE);
{
res_sw.float_value = riscv_emulate_fmaxs(opa.float_value, opb.float_value);
}
time_sw += neorv32_cpu_csr_read(CSR_CYCLE) - time_start;
 
if (res_sw.binary_value != res_hw.binary_value) {
err_cnt++;
}
}
 
if (err_cnt == 0) {
neorv32_uart_printf("cycles[SW] = %u vs. cycles[HW] = %u\n", time_sw/num_runs, time_hw/num_runs);
}
else {
neorv32_uart_printf("%c[1m[TEST FAILED!]%c[0m\n", 27, 27);
err_cnt_total++;
}
 
 
// Comparison
neorv32_uart_printf("FLE.S: ");
time_sw = 0;
time_hw = 0;
err_cnt = 0;
for (i=0; i<num_runs; i++) {
opa.binary_value = get_test_vector();
opb.binary_value = get_test_vector();
 
// hardware execution time
time_start = neorv32_cpu_csr_read(CSR_CYCLE);
{
res_hw.float_value = riscv_intrinsic_fles(opa.float_value, opb.float_value);
}
time_hw += neorv32_cpu_csr_read(CSR_CYCLE) - time_start;
time_hw -= 4; // remove the 2 dummy instructions
 
// software (emulation) execution time
time_start = neorv32_cpu_csr_read(CSR_CYCLE);
{
res_sw.float_value = riscv_emulate_fles(opa.float_value, opb.float_value);
}
time_sw += neorv32_cpu_csr_read(CSR_CYCLE) - time_start;
 
if (res_sw.binary_value != res_hw.binary_value) {
err_cnt++;
}
}
 
if (err_cnt == 0) {
neorv32_uart_printf("cycles[SW] = %u vs. cycles[HW] = %u\n", time_sw/num_runs, time_hw/num_runs);
}
else {
neorv32_uart_printf("%c[1m[TEST FAILED!]%c[0m\n", 27, 27);
err_cnt_total++;
}
 
 
// Sign-injection
neorv32_uart_printf("FSGNJX.S: ");
time_sw = 0;
time_hw = 0;
err_cnt = 0;
for (i=0; i<num_runs; i++) {
opa.binary_value = get_test_vector();
opb.binary_value = get_test_vector();
 
// hardware execution time
time_start = neorv32_cpu_csr_read(CSR_CYCLE);
{
res_hw.float_value = riscv_intrinsic_fsgnjxs(opa.float_value, opb.float_value);
}
time_hw += neorv32_cpu_csr_read(CSR_CYCLE) - time_start;
time_hw -= 4; // remove the 2 dummy instructions
 
// software (emulation) execution time
time_start = neorv32_cpu_csr_read(CSR_CYCLE);
{
res_sw.float_value = riscv_emulate_fsgnjxs(opa.float_value, opb.float_value);
}
time_sw += neorv32_cpu_csr_read(CSR_CYCLE) - time_start;
 
if (res_sw.binary_value != res_hw.binary_value) {
err_cnt++;
}
}
 
if (err_cnt == 0) {
neorv32_uart_printf("cycles[SW] = %u vs. cycles[HW] = %u\n", time_sw/num_runs, time_hw/num_runs);
}
else {
neorv32_uart_printf("%c[1m[TEST FAILED!]%c[0m\n", 27, 27);
err_cnt_total++;
}
#endif
 
 
// ----------------------------------------------------------------------------
// Final report
// ----------------------------------------------------------------------------
 
if (err_cnt_total != 0) {
neorv32_uart_printf("\n%c[1m[ZFINX EXTENSION VERIFICATION FAILED!]%c[0m\n", 27, 27);
neorv32_uart_printf("%u errors in %u test cases\n", err_cnt_total, test_cnt*(uint32_t)NUM_TEST_CASES);
529,10 → 850,6
tmp.binary_value = xorshift32();
}
 
// subnormal numbers are not supported yet!
// flush them to zero
//tmp.float_value = subnormal_flush(tmp.float_value);
 
return tmp.binary_value;
}
 
/neorv32_zfinx_extension_intrinsics.h
90,7 → 90,7
*
* @warning Subnormal numbers are not supported yet! Flush them to zero.
*
* @param[in] tmp Source operand 1.
* @param[in] tmp Source operand.
* @return Result.
**************************************************************************/
float subnormal_flush(float tmp) {
167,13 → 167,11
/**********************************************************************//**
* Single-precision floating-point addition
*
* @note "noinline" attributed to make sure arguments/return values are in a0 and a1.
*
* @param[in] rs1 Source operand 1 (a0).
* @param[in] rs2 Source operand 2 (a1).
* @return Result.
**************************************************************************/
float __attribute__ ((noinline)) riscv_intrinsic_fadds(float rs1, float rs2) {
inline float __attribute__ ((always_inline)) riscv_intrinsic_fadds(float rs1, float rs2) {
 
float_conv_t opa, opb, res;
opa.float_value = rs1;
189,6 → 187,9
// fadd.s a0, a0, a1
CUSTOM_INSTR_R2_TYPE(0b0000000, a1, a0, 0b000, a0, 0b1010011);
 
// dummy instruction to prevent GCC "constprop" optimization
asm volatile ("add %[res], %[input], x0" : [res] "=r" (result) : [input] "r" (result) );
 
res.binary_value = result;
return res.float_value;
}
197,13 → 198,11
/**********************************************************************//**
* Single-precision floating-point subtraction
*
* @note "noinline" attributed to make sure arguments/return values are in a0 and a1.
*
* @param[in] rs1 Source operand 1 (a0).
* @param[in] rs2 Source operand 2 (a1).
* @return Result.
**************************************************************************/
float __attribute__ ((noinline)) riscv_intrinsic_fsubs(float rs1, float rs2) {
inline float __attribute__ ((always_inline)) riscv_intrinsic_fsubs(float rs1, float rs2) {
 
float_conv_t opa, opb, res;
opa.float_value = rs1;
219,6 → 218,9
// fsub.s a0, a0, a1
CUSTOM_INSTR_R2_TYPE(0b0000100, a1, a0, 0b000, a0, 0b1010011);
 
// dummy instruction to prevent GCC "constprop" optimization
asm volatile ("add %[res], %[input], x0" : [res] "=r" (result) : [input] "r" (result) );
 
res.binary_value = result;
return res.float_value;
}
227,13 → 229,11
/**********************************************************************//**
* Single-precision floating-point multiplication
*
* @note "noinline" attributed to make sure arguments/return values are in a0 and a1.
*
* @param[in] rs1 Source operand 1 (a0).
* @param[in] rs2 Source operand 2 (a1).
* @return Result.
**************************************************************************/
float __attribute__ ((noinline)) riscv_intrinsic_fmuls(float rs1, float rs2) {
inline float __attribute__ ((always_inline)) riscv_intrinsic_fmuls(float rs1, float rs2) {
 
float_conv_t opa, opb, res;
opa.float_value = rs1;
249,6 → 249,9
// fmul.s a0, a0, a1
CUSTOM_INSTR_R2_TYPE(0b0001000, a1, a0, 0b000, a0, 0b1010011);
 
// dummy instruction to prevent GCC "constprop" optimization
asm volatile ("add %[res], %[input], x0" : [res] "=r" (result) : [input] "r" (result) );
 
res.binary_value = result;
return res.float_value;
}
257,13 → 260,11
/**********************************************************************//**
* Single-precision floating-point minimum
*
* @note "noinline" attributed to make sure arguments/return values are in a0 and a1.
*
* @param[in] rs1 Source operand 1 (a0).
* @param[in] rs2 Source operand 2 (a1).
* @return Result.
**************************************************************************/
float __attribute__ ((noinline)) riscv_intrinsic_fmins(float rs1, float rs2) {
inline float __attribute__ ((always_inline)) riscv_intrinsic_fmins(float rs1, float rs2) {
 
float_conv_t opa, opb, res;
opa.float_value = rs1;
279,6 → 280,9
// fmin.s a0, a0, a1
CUSTOM_INSTR_R2_TYPE(0b0010100, a1, a0, 0b000, a0, 0b1010011);
 
// dummy instruction to prevent GCC "constprop" optimization
asm volatile ("add %[res], %[input], x0" : [res] "=r" (result) : [input] "r" (result) );
 
res.binary_value = result;
return res.float_value;
}
287,13 → 291,11
/**********************************************************************//**
* Single-precision floating-point maximum
*
* @note "noinline" attributed to make sure arguments/return values are in a0 and a1.
*
* @param[in] rs1 Source operand 1 (a0).
* @param[in] rs2 Source operand 2 (a1).
* @return Result.
**************************************************************************/
float __attribute__ ((noinline)) riscv_intrinsic_fmaxs(float rs1, float rs2) {
inline float __attribute__ ((always_inline)) riscv_intrinsic_fmaxs(float rs1, float rs2) {
 
float_conv_t opa, opb, res;
opa.float_value = rs1;
309,6 → 311,9
// fmax.s a0, a0, a1
CUSTOM_INSTR_R2_TYPE(0b0010100, a1, a0, 0b001, a0, 0b1010011);
 
// dummy instruction to prevent GCC "constprop" optimization
asm volatile ("add %[res], %[input], x0" : [res] "=r" (result) : [input] "r" (result) );
 
res.binary_value = result;
return res.float_value;
}
317,12 → 322,10
/**********************************************************************//**
* Single-precision floating-point convert float to unsigned integer
*
* @note "noinline" attributed to make sure arguments/return values are in a0 and a1.
*
* @param[in] rs1 Source operand 1 (a0).
* @return Result.
**************************************************************************/
uint32_t __attribute__ ((noinline)) riscv_intrinsic_fcvt_wus(float rs1) {
inline uint32_t __attribute__ ((always_inline)) riscv_intrinsic_fcvt_wus(float rs1) {
 
float_conv_t opa;
opa.float_value = rs1;
336,6 → 339,9
// fcvt.wu.s a0, a0
CUSTOM_INSTR_R2_TYPE(0b1100000, x1, a0, 0b000, a0, 0b1010011);
 
// dummy instruction to prevent GCC "constprop" optimization
asm volatile ("add %[res], %[input], x0" : [res] "=r" (result) : [input] "r" (result) );
 
return result;
}
 
343,12 → 349,10
/**********************************************************************//**
* Single-precision floating-point convert float to signed integer
*
* @note "noinline" attributed to make sure arguments/return values are in a0 and a1.
*
* @param[in] rs1 Source operand 1 (a0).
* @return Result.
**************************************************************************/
int32_t __attribute__ ((noinline)) riscv_intrinsic_fcvt_ws(float rs1) {
inline int32_t __attribute__ ((always_inline)) riscv_intrinsic_fcvt_ws(float rs1) {
 
float_conv_t opa;
opa.float_value = rs1;
362,6 → 366,9
// fcvt.w.s a0, a0
CUSTOM_INSTR_R2_TYPE(0b1100000, x0, a0, 0b000, a0, 0b1010011);
 
// dummy instruction to prevent GCC "constprop" optimization
asm volatile ("add %[res], %[input], x0" : [res] "=r" (result) : [input] "r" (result) );
 
return (int32_t)result;
}
 
369,12 → 376,10
/**********************************************************************//**
* Single-precision floating-point convert unsigned integer to float
*
* @note "noinline" attributed to make sure arguments/return values are in a0 and a1.
*
* @param[in] rs1 Source operand 1 (a0).
* @return Result.
**************************************************************************/
float __attribute__ ((noinline)) riscv_intrinsic_fcvt_swu(uint32_t rs1) {
inline float __attribute__ ((always_inline)) riscv_intrinsic_fcvt_swu(uint32_t rs1) {
 
float_conv_t res;
 
387,6 → 392,9
// fcvt.s.wu a0, a0
CUSTOM_INSTR_R2_TYPE(0b1101000, x1, a0, 0b000, a0, 0b1010011);
 
// dummy instruction to prevent GCC "constprop" optimization
asm volatile ("add %[res], %[input], x0" : [res] "=r" (result) : [input] "r" (result) );
 
res.binary_value = result;
return res.float_value;
}
395,12 → 403,10
/**********************************************************************//**
* Single-precision floating-point convert signed integer to float
*
* @note "noinline" attributed to make sure arguments/return values are in a0 and a1.
*
* @param[in] rs1 Source operand 1 (a0).
* @return Result.
**************************************************************************/
float __attribute__ ((noinline)) riscv_intrinsic_fcvt_sw(int32_t rs1) {
inline float __attribute__ ((always_inline)) riscv_intrinsic_fcvt_sw(int32_t rs1) {
 
float_conv_t res;
 
413,6 → 419,9
// fcvt.s.w a0, a0
CUSTOM_INSTR_R2_TYPE(0b1101000, x0, a0, 0b000, a0, 0b1010011);
 
// dummy instruction to prevent GCC "constprop" optimization
asm volatile ("add %[res], %[input], x0" : [res] "=r" (result) : [input] "r" (result) );
 
res.binary_value = result;
return res.float_value;
}
421,13 → 430,11
/**********************************************************************//**
* Single-precision floating-point equal comparison
*
* @note "noinline" attributed to make sure arguments/return values are in a0 and a1.
*
* @param[in] rs1 Source operand 1 (a0).
* @param[in] rs2 Source operand 2 (a1).
* @return Result.
**************************************************************************/
uint32_t __attribute__ ((noinline)) riscv_intrinsic_feqs(float rs1, float rs2) {
inline uint32_t __attribute__ ((always_inline)) riscv_intrinsic_feqs(float rs1, float rs2) {
 
float_conv_t opa, opb;
opa.float_value = rs1;
443,6 → 450,9
// feq.s a0, a0, a1
CUSTOM_INSTR_R2_TYPE(0b1010000, a1, a0, 0b010, a0, 0b1010011);
 
// dummy instruction to prevent GCC "constprop" optimization
asm volatile ("add %[res], %[input], x0" : [res] "=r" (result) : [input] "r" (result) );
 
return result;
}
 
450,13 → 460,11
/**********************************************************************//**
* Single-precision floating-point less-than comparison
*
* @note "noinline" attributed to make sure arguments/return values are in a0 and a1.
*
* @param[in] rs1 Source operand 1 (a0).
* @param[in] rs2 Source operand 2 (a1).
* @return Result.
**************************************************************************/
uint32_t __attribute__ ((noinline)) riscv_intrinsic_flts(float rs1, float rs2) {
inline uint32_t __attribute__ ((always_inline)) riscv_intrinsic_flts(float rs1, float rs2) {
 
float_conv_t opa, opb;
opa.float_value = rs1;
472,6 → 480,9
// flt.s a0, a0, a1
CUSTOM_INSTR_R2_TYPE(0b1010000, a1, a0, 0b001, a0, 0b1010011);
 
// dummy instruction to prevent GCC "constprop" optimization
asm volatile ("add %[res], %[input], x0" : [res] "=r" (result) : [input] "r" (result) );
 
return result;
}
 
479,13 → 490,11
/**********************************************************************//**
* Single-precision floating-point less-than-or-equal comparison
*
* @note "noinline" attributed to make sure arguments/return values are in a0 and a1.
*
* @param[in] rs1 Source operand 1 (a0).
* @param[in] rs2 Source operand 2 (a1).
* @return Result.
**************************************************************************/
uint32_t __attribute__ ((noinline)) riscv_intrinsic_fles(float rs1, float rs2) {
inline uint32_t __attribute__ ((always_inline)) riscv_intrinsic_fles(float rs1, float rs2) {
 
float_conv_t opa, opb;
opa.float_value = rs1;
501,6 → 510,9
// fle.s a0, a0, a1
CUSTOM_INSTR_R2_TYPE(0b1010000, a1, a0, 0b000, a0, 0b1010011);
 
// dummy instruction to prevent GCC "constprop" optimization
asm volatile ("add %[res], %[input], x0" : [res] "=r" (result) : [input] "r" (result) );
 
return result;
}
 
508,13 → 520,11
/**********************************************************************//**
* Single-precision floating-point sign-injection
*
* @note "noinline" attributed to make sure arguments/return values are in a0 and a1.
*
* @param[in] rs1 Source operand 1 (a0).
* @param[in] rs2 Source operand 2 (a1).
* @return Result.
**************************************************************************/
float __attribute__ ((noinline)) riscv_intrinsic_fsgnjs(float rs1, float rs2) {
inline float __attribute__ ((always_inline)) riscv_intrinsic_fsgnjs(float rs1, float rs2) {
 
float_conv_t opa, opb, res;
opa.float_value = rs1;
530,6 → 540,9
// fsgnj.s a0, a0, a1
CUSTOM_INSTR_R2_TYPE(0b0010000, a1, a0, 0b000, a0, 0b1010011);
 
// dummy instruction to prevent GCC "constprop" optimization
asm volatile ("add %[res], %[input], x0" : [res] "=r" (result) : [input] "r" (result) );
 
res.binary_value = result;
return res.float_value;
}
538,13 → 551,11
/**********************************************************************//**
* Single-precision floating-point sign-injection NOT
*
* @note "noinline" attributed to make sure arguments/return values are in a0 and a1.
*
* @param[in] rs1 Source operand 1 (a0).
* @param[in] rs2 Source operand 2 (a1).
* @return Result.
**************************************************************************/
float __attribute__ ((noinline)) riscv_intrinsic_fsgnjns(float rs1, float rs2) {
inline float __attribute__ ((always_inline)) riscv_intrinsic_fsgnjns(float rs1, float rs2) {
 
float_conv_t opa, opb, res;
opa.float_value = rs1;
560,6 → 571,9
// fsgnjn.s a0, a0, a1
CUSTOM_INSTR_R2_TYPE(0b0010000, a1, a0, 0b001, a0, 0b1010011);
 
// dummy instruction to prevent GCC "constprop" optimization
asm volatile ("add %[res], %[input], x0" : [res] "=r" (result) : [input] "r" (result) );
 
res.binary_value = result;
return res.float_value;
}
568,13 → 582,11
/**********************************************************************//**
* Single-precision floating-point sign-injection XOR
*
* @note "noinline" attributed to make sure arguments/return values are in a0 and a1.
*
* @param[in] rs1 Source operand 1 (a0).
* @param[in] rs2 Source operand 2 (a1).
* @return Result.
**************************************************************************/
float __attribute__ ((noinline)) riscv_intrinsic_fsgnjxs(float rs1, float rs2) {
inline float __attribute__ ((always_inline)) riscv_intrinsic_fsgnjxs(float rs1, float rs2) {
 
float_conv_t opa, opb, res;
opa.float_value = rs1;
590,6 → 602,9
// fsgnjx.s a0, a0, a1
CUSTOM_INSTR_R2_TYPE(0b0010000, a1, a0, 0b010, a0, 0b1010011);
 
// dummy instruction to prevent GCC "constprop" optimization
asm volatile ("add %[res], %[input], x0" : [res] "=r" (result) : [input] "r" (result) );
 
res.binary_value = result;
return res.float_value;
}
598,12 → 613,10
/**********************************************************************//**
* Single-precision floating-point number classification
*
* @note "noinline" attributed to make sure arguments/return values are in a0 and a1.
*
* @param[in] rs1 Source operand 1 (a0).
* @return Result.
**************************************************************************/
uint32_t __attribute__ ((noinline)) riscv_intrinsic_fclasss(float rs1) {
inline uint32_t __attribute__ ((always_inline)) riscv_intrinsic_fclasss(float rs1) {
 
float_conv_t opa;
opa.float_value = rs1;
617,6 → 630,9
// fclass.s a0, a0
CUSTOM_INSTR_R2_TYPE(0b1110000, x0, a0, 0b001, a0, 0b1010011);
 
// dummy instruction to prevent GCC "constprop" optimization
asm volatile ("add %[res], %[input], x0" : [res] "=r" (result) : [input] "r" (result) );
 
return result;
}
 
628,8 → 644,6
/**********************************************************************//**
* Single-precision floating-point division
*
* @note "noinline" attributed to make sure arguments/return values are in a0 and a1.
*
* @warning This instruction is not supported and should raise an illegal instruction exception when executed.
*
* @param[in] rs1 Source operand 1 (a0).
636,7 → 650,7
* @param[in] rs2 Source operand 2 (a1).
* @return Result.
**************************************************************************/
float __attribute__ ((noinline)) riscv_intrinsic_fdivs(float rs1, float rs2) {
inline float __attribute__ ((always_inline)) riscv_intrinsic_fdivs(float rs1, float rs2) {
 
float_conv_t opa, opb, res;
opa.float_value = rs1;
652,6 → 666,9
// fdiv.s a0, a0, a1
CUSTOM_INSTR_R2_TYPE(0b0001100, a1, a0, 0b000, a0, 0b1010011);
 
// dummy instruction to prevent GCC "constprop" optimization
asm volatile ("add %[res], %[input], x0" : [res] "=r" (result) : [input] "r" (result) );
 
res.binary_value = result;
return res.float_value;
}
660,14 → 677,12
/**********************************************************************//**
* Single-precision floating-point square root
*
* @note "noinline" attributed to make sure arguments/return values are in a0 and a1.
*
* @warning This instruction is not supported and should raise an illegal instruction exception when executed.
*
* @param[in] rs1 Source operand 1 (a0).
* @return Result.
**************************************************************************/
float __attribute__ ((noinline)) riscv_intrinsic_fsqrts(float rs1) {
inline float __attribute__ ((always_inline)) riscv_intrinsic_fsqrts(float rs1) {
 
float_conv_t opa, res;
opa.float_value = rs1;
681,6 → 696,9
// fsqrt.s a0, a0
CUSTOM_INSTR_R2_TYPE(0b0101100, a1, a0, 0b000, a0, 0b1010011);
 
// dummy instruction to prevent GCC "constprop" optimization
asm volatile ("add %[res], %[input], x0" : [res] "=r" (result) : [input] "r" (result) );
 
res.binary_value = result;
return res.float_value;
}
689,8 → 707,6
/**********************************************************************//**
* Single-precision floating-point fused multiply-add
*
* @note "noinline" attributed to make sure arguments/return values are in a0, a1 and a2.
*
* @warning This instruction is not supported and should raise an illegal instruction exception when executed.
*
* @param[in] rs1 Source operand 1 (a0)
698,7 → 714,7
* @param[in] rs3 Source operand 3 (a2)
* @return Result.
**************************************************************************/
float __attribute__ ((noinline)) riscv_intrinsic_fmadds(float rs1, float rs2, float rs3) {
inline float __attribute__ ((always_inline)) riscv_intrinsic_fmadds(float rs1, float rs2, float rs3) {
 
float_conv_t opa, opb, opc, res;
opa.float_value = rs1;
717,6 → 733,9
// fmadd.s a0, a0, a1, a2
CUSTOM_INSTR_R3_TYPE(a2, a1, a0, 0b000, a0, 0b1000011);
 
// dummy instruction to prevent GCC "constprop" optimization
asm volatile ("add %[res], %[input], x0" : [res] "=r" (result) : [input] "r" (result) );
 
res.binary_value = result;
return res.float_value;
}
725,8 → 744,6
/**********************************************************************//**
* Single-precision floating-point fused multiply-sub
*
* @note "noinline" attributed to make sure arguments/return values are in a0, a1 and a2.
*
* @warning This instruction is not supported and should raise an illegal instruction exception when executed.
*
* @param[in] rs1 Source operand 1 (a0)
734,7 → 751,7
* @param[in] rs3 Source operand 3 (a2)
* @return Result.
**************************************************************************/
float __attribute__ ((noinline)) riscv_intrinsic_fmsubs(float rs1, float rs2, float rs3) {
inline float __attribute__ ((always_inline)) riscv_intrinsic_fmsubs(float rs1, float rs2, float rs3) {
 
float_conv_t opa, opb, opc, res;
opa.float_value = rs1;
753,6 → 770,9
// fmsub.s a0, a0, a1, a2
CUSTOM_INSTR_R3_TYPE(a2, a1, a0, 0b000, a0, 0b1000111);
 
// dummy instruction to prevent GCC "constprop" optimization
asm volatile ("add %[res], %[input], x0" : [res] "=r" (result) : [input] "r" (result) );
 
res.binary_value = result;
return res.float_value;
}
761,8 → 781,6
/**********************************************************************//**
* Single-precision floating-point fused negated multiply-sub
*
* @note "noinline" attributed to make sure arguments/return values are in a0, a1 and a2.
*
* @warning This instruction is not supported and should raise an illegal instruction exception when executed.
*
* @param[in] rs1 Source operand 1 (a0)
770,7 → 788,7
* @param[in] rs3 Source operand 3 (a2)
* @return Result.
**************************************************************************/
float __attribute__ ((noinline)) riscv_intrinsic_fnmsubs(float rs1, float rs2, float rs3) {
inline float __attribute__ ((always_inline)) riscv_intrinsic_fnmsubs(float rs1, float rs2, float rs3) {
 
float_conv_t opa, opb, opc, res;
opa.float_value = rs1;
789,6 → 807,9
// fnmsub.s a0, a0, a1, a2
CUSTOM_INSTR_R3_TYPE(a2, a1, a0, 0b000, a0, 0b1001011);
 
// dummy instruction to prevent GCC "constprop" optimization
asm volatile ("add %[res], %[input], x0" : [res] "=r" (result) : [input] "r" (result) );
 
res.binary_value = result;
return res.float_value;
}
797,8 → 818,6
/**********************************************************************//**
* Single-precision floating-point fused negated multiply-add
*
* @note "noinline" attributed to make sure arguments/return values are in a0, a1 and a2.
*
* @warning This instruction is not supported and should raise an illegal instruction exception when executed.
*
* @param[in] rs1 Source operand 1 (a0)
806,7 → 825,7
* @param[in] rs3 Source operand 3 (a2)
* @return Result.
**************************************************************************/
float __attribute__ ((noinline)) riscv_intrinsic_fnmadds(float rs1, float rs2, float rs3) {
inline float __attribute__ ((always_inline)) riscv_intrinsic_fnmadds(float rs1, float rs2, float rs3) {
 
float_conv_t opa, opb, opc, res;
opa.float_value = rs1;
825,6 → 844,9
// fnmadd.s a0, a0, a1, a2
CUSTOM_INSTR_R3_TYPE(a2, a1, a0, 0b000, a0, 0b1001111);
 
// dummy instruction to prevent GCC "constprop" optimization
asm volatile ("add %[res], %[input], x0" : [res] "=r" (result) : [input] "r" (result) );
 
res.binary_value = result;
return res.float_value;
}
841,7 → 863,7
* @param[in] rs2 Source operand 2.
* @return Result.
**************************************************************************/
float riscv_emulate_fadds(float rs1, float rs2) {
float __attribute__ ((noinline)) riscv_emulate_fadds(float rs1, float rs2) {
 
float opa = subnormal_flush(rs1);
float opb = subnormal_flush(rs2);
858,7 → 880,7
* @param[in] rs2 Source operand 2.
* @return Result.
**************************************************************************/
float riscv_emulate_fsubs(float rs1, float rs2) {
float __attribute__ ((noinline)) riscv_emulate_fsubs(float rs1, float rs2) {
 
float opa = subnormal_flush(rs1);
float opb = subnormal_flush(rs2);
875,7 → 897,7
* @param[in] rs2 Source operand 2.
* @return Result.
**************************************************************************/
float riscv_emulate_fmuls(float rs1, float rs2) {
float __attribute__ ((noinline)) riscv_emulate_fmuls(float rs1, float rs2) {
 
float opa = subnormal_flush(rs1);
float opb = subnormal_flush(rs2);
892,7 → 914,7
* @param[in] rs2 Source operand 2.
* @return Result.
**************************************************************************/
float riscv_emulate_fmins(float rs1, float rs2) {
float __attribute__ ((noinline)) riscv_emulate_fmins(float rs1, float rs2) {
 
float opa = subnormal_flush(rs1);
float opb = subnormal_flush(rs2);
933,7 → 955,7
* @param[in] rs2 Source operand 2.
* @return Result.
**************************************************************************/
float riscv_emulate_fmaxs(float rs1, float rs2) {
float __attribute__ ((noinline)) riscv_emulate_fmaxs(float rs1, float rs2) {
 
float opa = subnormal_flush(rs1);
float opb = subnormal_flush(rs2);
974,7 → 996,7
* @param[in] rs1 Source operand 1.
* @return Result.
**************************************************************************/
uint32_t riscv_emulate_fcvt_wus(float rs1) {
uint32_t __attribute__ ((noinline)) riscv_emulate_fcvt_wus(float rs1) {
 
float opa = subnormal_flush(rs1);
 
988,7 → 1010,7
* @param[in] rs1 Source operand 1.
* @return Result.
**************************************************************************/
int32_t riscv_emulate_fcvt_ws(float rs1) {
int32_t __attribute__ ((noinline)) riscv_emulate_fcvt_ws(float rs1) {
 
float opa = subnormal_flush(rs1);
 
1002,7 → 1024,7
* @param[in] rs1 Source operand 1.
* @return Result.
**************************************************************************/
float riscv_emulate_fcvt_swu(uint32_t rs1) {
float __attribute__ ((noinline)) riscv_emulate_fcvt_swu(uint32_t rs1) {
 
return (float)rs1;
}
1014,7 → 1036,7
* @param[in] rs1 Source operand 1.
* @return Result.
**************************************************************************/
float riscv_emulate_fcvt_sw(int32_t rs1) {
float __attribute__ ((noinline)) riscv_emulate_fcvt_sw(int32_t rs1) {
 
return (float)rs1;
}
1027,7 → 1049,7
* @param[in] rs2 Source operand 2.
* @return Result.
**************************************************************************/
uint32_t riscv_emulate_feqs(float rs1, float rs2) {
uint32_t __attribute__ ((noinline)) riscv_emulate_feqs(float rs1, float rs2) {
 
float opa = subnormal_flush(rs1);
float opb = subnormal_flush(rs2);
1055,7 → 1077,7
* @param[in] rs2 Source operand 2.
* @return Result.
**************************************************************************/
uint32_t riscv_emulate_flts(float rs1, float rs2) {
uint32_t __attribute__ ((noinline)) riscv_emulate_flts(float rs1, float rs2) {
 
float opa = subnormal_flush(rs1);
float opb = subnormal_flush(rs2);
1080,7 → 1102,7
* @param[in] rs2 Source operand 2.
* @return Result.
**************************************************************************/
uint32_t riscv_emulate_fles(float rs1, float rs2) {
uint32_t __attribute__ ((noinline)) riscv_emulate_fles(float rs1, float rs2) {
 
float opa = subnormal_flush(rs1);
float opb = subnormal_flush(rs2);
1105,7 → 1127,7
* @param[in] rs2 Source operand 2.
* @return Result.
**************************************************************************/
float riscv_emulate_fsgnjs(float rs1, float rs2) {
float __attribute__ ((noinline)) riscv_emulate_fsgnjs(float rs1, float rs2) {
 
float opa = subnormal_flush(rs1);
float opb = subnormal_flush(rs2);
1142,7 → 1164,7
* @param[in] rs2 Source operand 2.
* @return Result.
**************************************************************************/
float riscv_emulate_fsgnjns(float rs1, float rs2) {
float __attribute__ ((noinline)) riscv_emulate_fsgnjns(float rs1, float rs2) {
 
float opa = subnormal_flush(rs1);
float opb = subnormal_flush(rs2);
1179,7 → 1201,7
* @param[in] rs2 Source operand 2.
* @return Result.
**************************************************************************/
float riscv_emulate_fsgnjxs(float rs1, float rs2) {
float __attribute__ ((noinline)) riscv_emulate_fsgnjxs(float rs1, float rs2) {
 
float opa = subnormal_flush(rs1);
float opb = subnormal_flush(rs2);
1215,7 → 1237,7
* @param[in] rs1 Source operand 1.
* @return Result.
**************************************************************************/
uint32_t riscv_emulate_fclasss(float rs1) {
uint32_t __attribute__ ((noinline)) riscv_emulate_fclasss(float rs1) {
 
float opa = subnormal_flush(rs1);
 
1287,7 → 1309,7
* @param[in] rs2 Source operand 2.
* @return Result.
**************************************************************************/
float riscv_emulate_fdivs(float rs1, float rs2) {
float __attribute__ ((noinline)) riscv_emulate_fdivs(float rs1, float rs2) {
 
float opa = subnormal_flush(rs1);
float opb = subnormal_flush(rs2);
1303,7 → 1325,7
* @param[in] rs1 Source operand 1.
* @return Result.
**************************************************************************/
float riscv_emulate_fsqrts(float rs1) {
float __attribute__ ((noinline)) riscv_emulate_fsqrts(float rs1) {
 
float opa = subnormal_flush(rs1);
 
1324,7 → 1346,7
* @param[in] rs3 Source operand 3
* @return Result.
**************************************************************************/
float riscv_emulate_fmadds(float rs1, float rs2, float rs3) {
float __attribute__ ((noinline)) riscv_emulate_fmadds(float rs1, float rs2, float rs3) {
 
float opa = subnormal_flush(rs1);
float opb = subnormal_flush(rs2);
1343,7 → 1365,7
* @param[in] rs3 Source operand 3
* @return Result.
**************************************************************************/
float riscv_emulate_fmsubs(float rs1, float rs2, float rs3) {
float __attribute__ ((noinline)) riscv_emulate_fmsubs(float rs1, float rs2, float rs3) {
 
float opa = subnormal_flush(rs1);
float opb = subnormal_flush(rs2);
1362,7 → 1384,7
* @param[in] rs3 Source operand 3
* @return Result.
**************************************************************************/
float riscv_emulate_fnmsubs(float rs1, float rs2, float rs3) {
float __attribute__ ((noinline)) riscv_emulate_fnmsubs(float rs1, float rs2, float rs3) {
 
float opa = subnormal_flush(rs1);
float opb = subnormal_flush(rs2);
1381,7 → 1403,7
* @param[in] rs3 Source operand 3
* @return Result.
**************************************************************************/
float riscv_emulate_fnmadds(float rs1, float rs2, float rs3) {
float __attribute__ ((noinline)) riscv_emulate_fnmadds(float rs1, float rs2, float rs3) {
 
float opa = subnormal_flush(rs1);
float opb = subnormal_flush(rs2);
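
Note on the helpers used throughout this diff: every function converts between `float` and its raw 32-bit pattern through a `float_conv_t` variable, and every emulation function first passes its operands through `subnormal_flush()`. Neither definition is part of the hunks shown here, so the following is only a minimal sketch under stated assumptions. The member names `float_value` and `binary_value` are taken from the diff; the union layout, the helper names `float_to_bits`, `bits_to_float` and `subnormal_flush_sketch`, and the choice to keep the sign bit when flushing are assumptions, not the library's actual code.

```c
#include <assert.h>
#include <stdint.h>

// Assumed layout: the member names appear in the diff, the definition does not.
typedef union {
  uint32_t binary_value; // raw IEEE 754 single-precision bit pattern
  float    float_value;  // the same 32 bits viewed as a float
} float_conv_t;

static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits wide");

// Hypothetical helper: return the bit pattern of a float.
static uint32_t float_to_bits(float f) {
  float_conv_t conv;
  conv.float_value = f;
  return conv.binary_value;
}

// Hypothetical helper: build a float from a bit pattern.
static float bits_to_float(uint32_t bits) {
  float_conv_t conv;
  conv.binary_value = bits;
  return conv.float_value;
}

// Hypothetical stand-in for subnormal_flush(): a value with an all-zero
// exponent field and a non-zero mantissa is subnormal and is replaced by
// zero. Keeping the sign bit is an assumption of this sketch.
static float subnormal_flush_sketch(float f) {
  uint32_t bits     = float_to_bits(f);
  uint32_t exponent = (bits >> 23) & 0xFFu;
  uint32_t mantissa = bits & 0x007FFFFFu;

  if ((exponent == 0u) && (mantissa != 0u)) {
    bits &= 0x80000000u; // keep only the sign bit
  }
  return bits_to_float(bits);
}
```

With these helpers, `subnormal_flush_sketch(bits_to_float(0x00000001u))` yields a value whose bit pattern is all zeros, while normal numbers such as `1.0f` pass through unchanged. This mirrors the flush-to-zero behavior that the README describes for the hardware FPU.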
