Conversion Rules for hls_float
You can convert between different sizes of hls_float data types through assignment or by using the convert_to() function. For example:

    using namespace ihc;
    hls_float<8, 32> myFloat = ...;
    hls_float<3, 18> myFloat2 = myFloat; // use rounding rules defined by hls_float type
    // use rounding rules defined in convert_to() function call
    hls_float<3, 18> myFloat3 = myFloat.convert_to<3, 18, ihc::fp_config::FP_Round::RZERO>();
To convert between native types (for example, float or double) and hls_float data types, assign to or from the types. Type conversion in an assignment occurs according to the rules in Table 1.
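Table 1 describes conversion between float and hls_float<8, 23> as a bit cast, meaning the bits are reinterpreted rather than recomputed. The following is a minimal sketch of that layout using only the C++ standard library, assuming a 32-bit IEEE 754 float. The names Fields8_23, split_float, join_float, and check_float_layout are hypothetical and are not part of the hls_float API.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper type (not part of the hls_float API): the three
// fields of a value with an 8-bit exponent and a 23-bit mantissa.
struct Fields8_23 {
    std::uint32_t sign;      // 1 bit
    std::uint32_t exponent;  // 8 bits, biased by 127
    std::uint32_t mantissa;  // 23 bits, implicit leading 1 is not stored
};

// Reinterpret the bits of a float without changing them (a bit cast),
// then split them into sign, exponent, and mantissa fields.
inline Fields8_23 split_float(float f) {
    static_assert(sizeof(float) == sizeof(std::uint32_t),
                  "this sketch assumes a 32-bit float");
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);
    return Fields8_23{bits >> 31, (bits >> 23) & 0xFFu, bits & 0x7FFFFFu};
}

// Inverse direction: reassemble the fields and bit cast back to float.
inline float join_float(const Fields8_23& v) {
    const std::uint32_t bits =
        (v.sign << 31) | (v.exponent << 23) | v.mantissa;
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}

// Usage sketch: 1.0f has a zero sign, a biased exponent of 127, and an
// all-zero mantissa; splitting and rejoining reproduces the value.
inline void check_float_layout() {
    const Fields8_23 one = split_float(1.0f);
    assert(one.sign == 0 && one.exponent == 127 && one.mantissa == 0);
    assert(join_float(one) == 1.0f);
}
```

Because no bits change in this step, any rounding happens only in the subsequent hls_float to hls_float conversion that Table 1 describes.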
For two hls_float variables in a binary operation, the hls_float variable with the larger exponent bit width is considered to be the larger variable. If the two variables have the same exponent bit width, the variable with the larger mantissa bit width is considered to be the larger variable. The operands are then unified to the larger type before the binary operation occurs.
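The selection rule above can be sketched as a compile-time type computation. This is an illustration only and does not use the real hls_float header: fp_format is a hypothetical stand-in that carries just the exponent and mantissa widths, and larger_format is a hypothetical trait that applies the rule as stated.

```cpp
#include <type_traits>

// Hypothetical stand-in for hls_float<E, M>; it carries only the two widths.
template <int E, int M>
struct fp_format {
    static constexpr int exponent_bits = E;
    static constexpr int mantissa_bits = M;
};

// Select the "larger" of two formats: the larger exponent width wins, and
// the mantissa width breaks ties between equal exponent widths.
template <typename A, typename B>
struct larger_format {
    static constexpr bool a_wins =
        (A::exponent_bits > B::exponent_bits) ||
        (A::exponent_bits == B::exponent_bits &&
         A::mantissa_bits >= B::mantissa_bits);
    using type = typename std::conditional<a_wins, A, B>::type;
};

// Exponent width decides first, even when the other mantissa is wider.
static_assert(
    std::is_same<larger_format<fp_format<8, 7>, fp_format<5, 10>>::type,
                 fp_format<8, 7>>::value,
    "larger exponent width wins");

// Equal exponent widths: the wider mantissa decides.
static_assert(
    std::is_same<larger_format<fp_format<8, 23>, fp_format<8, 32>>::type,
                 fp_format<8, 32>>::value,
    "mantissa width breaks the tie");
```

One consequence visible in the first check is that the unified type can have a narrower mantissa than one of the operands, so that operand's mantissa is rounded during unification.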
Native floating-point data types and hls_float data types are converted to hls_float data types according to the rules in Table 1.
The Intel® oneAPI DPC++/C++ Compiler also provides some operations that leave the precision of input types untouched and provide control over the output precision. For more details, refer to Operations with Explicit Precision Controls.

Table 1

| Data Type | From hls_float To Data Type | From Data Type To hls_float |
|---|---|---|
| hls_float with higher representable range | Keep exponent equivalent. The mantissa is rounded according to the rounding mode of the target hls_float (with the higher representable range). | +Inf if the source of the conversion is out of the representable range. Otherwise, keep exponent equivalent. The mantissa is rounded according to the rounding mode of the target hls_float (with the smaller representable range). |
| float | Convert the original hls_float to hls_float<8, 23> with the previous hls_float rule, and then bit cast to float. | Bit cast float to hls_float<8, 23>, and then convert to the target hls_float precision using the hls_float to hls_float rules described previously. |
| double | Convert the original hls_float to hls_float<11, 52> with the earlier hls_float rule, and then bit cast to double. | Bit cast double to hls_float<11, 52>, and then convert to the target hls_float precision using the hls_float to hls_float rules described earlier. |
| long double (emulation only) (Linux only) | Convert the original hls_float to hls_float<15, 63> with the earlier hls_float rule, and then insert an explicit 1 bit at the MSB of the fraction bits to get an approximate equivalent of the 80-bit representation of a long double. | Drop the explicit 1 fraction bit to convert the long double to a 79-bit hls_float<15, 63>. |
| C++ native integer types | Truncate toward zero. Converting an hls_float value that is larger than the range of the integer type is undefined behavior. | Round to the nearest, with ties broken to even. If the integer value is too large, the hls_float value saturates to plus infinity. |
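The integer row of the table can be illustrated with native float, which the table treats as bit-equivalent to hls_float<8, 23>. The sketch below uses only standard C++ casts, so it shows the rounding behavior itself and not the hls_float implementation. It assumes IEEE 754 single precision and a compiler that rounds integer to float conversions to nearest with ties to even, which the C++ standard permits but does not require. The names to_int_truncating and to_float_nearest are hypothetical.

```cpp
#include <cstdint>

// Floating point to integer: the fractional part is discarded, so the
// result moves toward zero for both positive and negative inputs.
// As in the table, a value outside the integer range is undefined behavior,
// so callers must check the range first.
constexpr std::int32_t to_int_truncating(float x) {
    return static_cast<std::int32_t>(x);
}

// Integer to floating point: a 24-bit significand cannot hold every
// 32-bit integer, so large values are rounded. Assumption: the
// implementation rounds to nearest, ties to even.
constexpr float to_float_nearest(std::int32_t n) {
    return static_cast<float>(n);
}

// Truncation toward zero.
static_assert(to_int_truncating(2.9f) == 2, "positive values round down");
static_assert(to_int_truncating(-2.9f) == -2, "negative values round up");

// 2^24 + 1 lies exactly halfway between 2^24 and 2^24 + 2; the tie goes to
// the neighbor with an even significand, which is 2^24.
static_assert(to_float_nearest(16777217) == 16777216.0f, "tie to even");
```

The saturation to plus infinity mentioned in the table cannot be shown with this stand-in, because every 32-bit integer is within the range of float; it applies to hls_float types with a narrower exponent.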
