# Data Types and Operations

## Data Type Selection Considerations

- Select the most appropriate data type for your application. For example, do not define a variable as `float` if the data type `short` is sufficient.
- Ensure that both sides of an arithmetic expression have the same data type. For example, if one operand is a floating-point value and the other is an integer, the mismatch causes the Intel® oneAPI DPC++/C++ Compiler to create implicit conversion operators, which can become expensive if they are present in large numbers.
- Take advantage of padding if it exists in your data structures. For example, if you only need the `float3` data type, which has the same size as `float4`, you can change the data type to `float4` and use the extra dimension to carry an unrelated value.
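The last two points can be illustrated on the host with plain C++ rather than SYCL. This is only a sketch: `Vec3Padded` and `Vec4` are hypothetical stand-ins for `float3` and `float4`, and the 16-byte alignment is an assumption chosen here to mimic a padded layout.

```cpp
#include <type_traits>

// Hypothetical stand-in for a padded 3-element vector. The 16-byte alignment
// is an assumption made for this sketch so the type occupies four float slots.
struct alignas(16) Vec3Padded {
  float x, y, z;
};

// Hypothetical 4-element vector: same footprint, but the fourth slot is usable.
struct alignas(16) Vec4 {
  float x, y, z, w;
};

// The padded 3-element type is no smaller than the 4-element type.
static_assert(sizeof(Vec3Padded) == sizeof(Vec4), "both occupy 16 bytes");

// Mixed operand types: the int operand is implicitly converted to float.
constexpr float scale_mixed(float value, int factor) { return value * factor; }

// Matching operand types: no conversion is needed inside the expression.
constexpr float scale_matched(float value, float factor) { return value * factor; }

// An int multiplied by a double literal silently becomes a double expression.
static_assert(std::is_same<decltype(2 * 1.5), double>::value, "int * double is double");
static_assert(std::is_same<decltype(2.0f * 1.5f), float>::value, "float * float is float");
```

Both `scale_` functions return the same value for small inputs; the difference is the conversion that the mixed version needs before it can multiply.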

## Arithmetic Operation Considerations

- Introduce floating-point arithmetic operations only when necessary.
- The Intel® oneAPI DPC++/C++ Compiler defaults floating-point constants to the double data type. Add an `f` suffix to the constant to make it a single-precision floating-point operation.
  - For example, the arithmetic operation `sin(1.0)` represents a double-precision floating-point sine function. The arithmetic operation `sin(1.0f)` represents a single-precision floating-point sine function.
- If you do not require a full-precision result for a complex function, compute simpler arithmetic operations to approximate the result. Consider the following example scenarios:
  - Instead of computing the function `pow(x,n)` where `n` is a small value, approximate the result by performing repeated squaring operations, because they require far fewer hardware resources and less area.
  - Be aware of both the original and the approximated area use, because in some cases computing a result via approximation might result in excess area use. For example, the `sqrt` function is not resource-intensive. Other than a rough approximation, replacing the `sqrt` function with arithmetic operations that the host must compute at runtime might result in larger area use.
- If your kernel performs a complex arithmetic operation with a constant that the Intel® oneAPI DPC++/C++ Compiler computes at compilation time (for example, `log(PI/2.0)`), perform the arithmetic operation on the host instead and pass the result as an argument to the kernel at runtime.
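The literal-suffix and repeated-squaring points above can be sketched in plain C++. The helper name `pow_by_squaring` is hypothetical and not part of any library. It assumes a small non-negative integer exponent, and its result may differ from `pow` in the last bits.

```cpp
#include <cassert>
#include <type_traits>

// A literal without a suffix is double; the f suffix makes it float.
static_assert(std::is_same<decltype(1.0), double>::value, "1.0 is double");
static_assert(std::is_same<decltype(1.0f), float>::value, "1.0f is float");

// One double constant promotes the whole expression to double ...
static_assert(std::is_same<decltype(1.0f * 0.5), double>::value, "promoted");
// ... while matching float constants keep it in single precision.
static_assert(std::is_same<decltype(1.0f * 0.5f), float>::value, "single");

// Hypothetical helper: x raised to a small non-negative integer power n,
// built from multiplications only (repeated squaring).
inline float pow_by_squaring(float x, unsigned n) {
  float result = 1.0f;
  float base = x;
  while (n != 0) {
    if (n & 1u) result *= base;  // include this power of x if the bit is set
    base *= base;                // x, x^2, x^4, x^8, ...
    n >>= 1u;
  }
  return result;
}

// Usage example: 2^3 needs a handful of multiplications, no pow() call.
inline void example_usage() {
  assert(pow_by_squaring(2.0f, 3) == 8.0f);
}
```

For a fixed small `n`, the loop can also be written out by hand, for example `x2 = x * x; x4 = x2 * x2;`.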

## Optimizing Floating-Point Operations

**Tree Balancing**

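Consider the following expression, in which the parentheses impose a strict order of evaluation:

    result = (((A * B) + C) + (D * E)) + (F * G);

The `-Xsfp-relaxed` argument relaxes this ordering so that the Intel® oneAPI DPC++/C++ Compiler can balance the tree of floating-point operations:

    dpcpp -fintelfpga -Xshardware -Xsfp-relaxed <source_file>.cpp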
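Regrouping floating-point operations is something you opt into because floating-point addition is not associative, so a balanced tree can round differently from a strictly ordered chain. The following host-side sketch shows the effect in plain C++. The function names are hypothetical, and the sketch assumes default IEEE 754 semantics (no fast-math options).

```cpp
#include <cassert>

// Strictly ordered chain: each addition waits for the previous result.
inline float sum_chain(float a, float b, float c, float d) {
  return ((a + b) + c) + d;
}

// Balanced tree: the two inner additions are independent of each other.
inline float sum_balanced(float a, float b, float c, float d) {
  return (a + b) + (c + d);
}

inline void example_usage() {
  // For well-scaled inputs, both groupings give the same answer.
  assert(sum_chain(1.0f, 2.0f, 3.0f, 4.0f) == 10.0f);
  assert(sum_balanced(1.0f, 2.0f, 3.0f, 4.0f) == 10.0f);

  // With a large cancellation, the grouping changes the rounded result:
  // the chain cancels 1e20f before adding the final 1.0f, while the
  // balanced tree absorbs both 1.0f terms into the large values first.
  assert(sum_chain(1e20f, 1.0f, -1e20f, 1.0f) == 1.0f);
  assert(sum_balanced(1e20f, 1.0f, -1e20f, 1.0f) == 0.0f);
}
```

If your algorithm tolerates such differences, the balanced form shortens the dependency chain from three sequential additions to two.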

**Rounding Operations**

The following command compiles with the `-Xsfpc` argument, which affects rounding as described below:

    dpcpp -fintelfpga -Xshardware -Xsfpc <source_file>.cpp

- Remove floating-point rounding operations and conversions whenever possible. Where possible, the `-Xsfpc` argument directs the Intel® oneAPI DPC++/C++ Compiler to round a floating-point operation only once, at the end of the tree of floating-point operations.
- Carry additional mantissa bits to maintain precision. The Intel® oneAPI DPC++/C++ Compiler carries additional precision bits through the floating-point calculations and removes these precision bits at the end of the tree of floating-point operations.
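The difference between rounding after every operation and rounding once at the end can be imitated on the host. In this sketch, `double` stands in for the additional mantissa bits. That is an illustrative assumption, not a description of how the compiler implements `-Xsfpc`, and the function names are hypothetical. It also assumes that `float` arithmetic is evaluated in single precision, as on typical x86-64 and Arm targets.

```cpp
#include <cassert>

// Rounds to single precision after every addition.
inline float add3_round_each_step(float a, float b, float c) {
  float partial = a + b;  // first rounding to float
  return partial + c;     // second rounding to float
}

// Carries extra mantissa bits (double here) and rounds to float once.
inline float add3_round_once(float a, float b, float c) {
  double wide = static_cast<double>(a) + static_cast<double>(b) +
                static_cast<double>(c);
  return static_cast<float>(wide);  // single rounding at the end
}

inline void example_usage() {
  const float big = 16777216.0f;  // 2^24: neighbouring floats are 2 apart
  // Each intermediate 1.0f is lost to rounding when applied one at a time.
  assert(add3_round_each_step(big, 1.0f, 1.0f) == 16777216.0f);
  // With extra bits, the two additions accumulate to a representable value.
  assert(add3_round_once(big, 1.0f, 1.0f) == 16777218.0f);
}
```

Rounding once therefore both removes intermediate rounding steps and can change the result, which is worth keeping in mind when comparing results against a reference computed without this argument.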

## Avoid Expensive Functions

The following functions are expensive to implement:

- Integer division and modulo (remainder) operators
- Most floating-point operators except addition, multiplication, absolute value, and comparison. For more information about optimizing floating-point operations, refer to the Optimizing Floating-Point Operations section.
- Atomic functions

In contrast, the following functions are inexpensive:

- Binary logic operations such as `AND`, `NAND`, `OR`, `NOR`, `XOR`, and `XNOR`
- Logical operations with one constant argument
- Shift by constant
- Integer multiplication and division by a constant that is a power of two
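The following sketch shows why a power-of-two constant is cheap: for unsigned integers, multiplication and division by such a constant are equivalent to a shift by a constant, and the remainder is equivalent to a logical AND with a constant mask. The function names are hypothetical, and the equivalences are stated for unsigned operands only, because negative signed values round differently under shifting than under division.

```cpp
#include <cassert>
#include <cstdint>

// Division by 8 (2^3) is a right shift by a constant.
inline std::uint32_t div_by_8(std::uint32_t x) { return x >> 3; }

// Multiplication by 8 is a left shift by a constant.
inline std::uint32_t mul_by_8(std::uint32_t x) { return x << 3; }

// Remainder modulo 8 is a logical AND with the constant mask 7.
inline std::uint32_t mod_8(std::uint32_t x) { return x & 7u; }

inline void example_usage() {
  assert(div_by_8(100u) == 100u / 8u);
  assert(mul_by_8(5u) == 5u * 8u);
  assert(mod_8(100u) == 100u % 8u);
}
```

If a divisor or modulus in your kernel is under your control, choosing a power of two lets the expensive operators in the first list reduce to operations from the second.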

If an expensive operation yields the same result for every work-item, computing it in the kernel is redundant. In the following example, every work-item performs the same floating-point division:

    // This function is used in kernel code.
    void myKernel(accessor<int, access::mode::read, access::target::global_buffer> a,
                  accessor<int, access::mode::write, access::target::global_buffer> b,
                  cl::sycl::id<1> wiID,
                  const float c,
                  const float d) {
      // Inefficient: each work-item must calculate c divided by d.
      b[wiID] = a[wiID] * (c / d);
    }

Instead, calculate the value once on the host and pass the result to the kernel as an argument:

    void myKernel(accessor<int, access::mode::read, access::target::global_buffer> a,
                  accessor<int, access::mode::write, access::target::global_buffer> b,
                  cl::sycl::id<1> wiID,
                  const float c_divided_by_d) {
      // The host calculates c divided by d once and passes it into the kernel
      // to avoid redundant expensive calculations.
      b[wiID] = a[wiID] * c_divided_by_d;
    }
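The host side of this pattern might look like the following plain C++ sketch, in which an ordinary loop stands in for the kernel launch. The names `work_item_divides`, `work_item_precomputed`, and `run_precomputed` are hypothetical, and no SYCL runtime is involved.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for a work-item body that performs the division itself.
inline int work_item_divides(int a, float c, float d) {
  return static_cast<int>(a * (c / d));
}

// Stand-in for a work-item body that receives the precomputed quotient.
inline int work_item_precomputed(int a, float c_divided_by_d) {
  return static_cast<int>(a * c_divided_by_d);
}

// Hypothetical host routine: the loop plays the role of the kernel launch.
inline std::vector<int> run_precomputed(const std::vector<int>& a, float c,
                                        float d) {
  const float c_divided_by_d = c / d;  // computed once on the host
  std::vector<int> b(a.size());
  for (std::size_t i = 0; i < a.size(); ++i) {
    b[i] = work_item_precomputed(a[i], c_divided_by_d);
  }
  return b;
}

inline void example_usage() {
  const std::vector<int> a{1, 2, 3, 4};
  const std::vector<int> b = run_precomputed(a, 3.0f, 2.0f);
  // Both variants produce the same values; only the division count differs.
  for (std::size_t i = 0; i < a.size(); ++i) {
    assert(b[i] == work_item_divides(a[i], 3.0f, 2.0f));
  }
}
```

The result is unchanged, but the division runs once instead of once per work-item.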