User Type and > 8 arguments

I'm having trouble calling an ArBB function with >8 parameters. Any idea what I'm doing wrong?

7 arguments ok.

8 dense& arguments ok.

Simplified code snippet below:

using namespace sim;

class mat3
{
public:

    mat3& operator=(const mat3& input)
    {
        m_a00 = input.m_a00;
        m_a01 = input.m_a01;
        m_a02 = input.m_a02;

        m_a10 = input.m_a10;
        m_a11 = input.m_a11;
        m_a12 = input.m_a12;

        m_a20 = input.m_a20;
        m_a21 = input.m_a21;
        m_a22 = input.m_a22;

        return (*this);
    }

private:

    f64 m_a00, m_a01, m_a02;
    f64 m_a10, m_a11, m_a12;
    f64 m_a20, m_a21, m_a22;

};

void arbb_args(arbb::dense<mat3>& x1, arbb::dense<mat3>& x2, arbb::dense<mat3>& x3, arbb::dense<mat3>& x4,
               arbb::dense<mat3>& x5, arbb::dense<mat3>& x6, arbb::dense<mat3>& x7, arbb::dense<mat3>& x8)
{
    x2 = x1;
    x3 = x1;
    x4 = x1;
    x5 = x1;
    x6 = x1;
    x7 = x1;
    x8 = x1;
}

void arbb_seven(arbb::dense<mat3>& x1, arbb::dense<mat3>& x2, arbb::dense<mat3>& x3, arbb::dense<mat3>& x4,
                arbb::dense<mat3>& x5, arbb::dense<mat3>& x6, arbb::dense<mat3>& x7)
{
    x2 = x1;
    x3 = x1;
    x4 = x1;
    x5 = x1;
    x6 = x1;
    x7 = x1;
    //x8 = x1;
}

int main(int argc, char** argv)
{

    std::vector<double> aa1(90,0); dense<mat3> ab1; bind(ab1,10,&aa1[0],&aa1[10],&aa1[20],&aa1[30],&aa1[40],&aa1[50],&aa1[60],&aa1[70],&aa1[80]);
    std::vector<double> aa2(90,0); dense<mat3> ab2; bind(ab2,10,&aa2[0],&aa2[10],&aa2[20],&aa2[30],&aa2[40],&aa2[50],&aa2[60],&aa2[70],&aa2[80]);
    std::vector<double> aa3(90,0); dense<mat3> ab3; bind(ab3,10,&aa3[0],&aa3[10],&aa3[20],&aa3[30],&aa3[40],&aa3[50],&aa3[60],&aa3[70],&aa3[80]);
    std::vector<double> aa4(90,0); dense<mat3> ab4; bind(ab4,10,&aa4[0],&aa4[10],&aa4[20],&aa4[30],&aa4[40],&aa4[50],&aa4[60],&aa4[70],&aa4[80]);
    std::vector<double> aa5(90,0); dense<mat3> ab5; bind(ab5,10,&aa5[0],&aa5[10],&aa5[20],&aa5[30],&aa5[40],&aa5[50],&aa5[60],&aa5[70],&aa5[80]);
    std::vector<double> aa6(90,0); dense<mat3> ab6; bind(ab6,10,&aa6[0],&aa6[10],&aa6[20],&aa6[30],&aa6[40],&aa6[50],&aa6[60],&aa6[70],&aa6[80]);
    std::vector<double> aa7(90,0); dense<mat3> ab7; bind(ab7,10,&aa7[0],&aa7[10],&aa7[20],&aa7[30],&aa7[40],&aa7[50],&aa7[60],&aa7[70],&aa7[80]);
    std::vector<double> aa8(90,0); dense<mat3> ab8; bind(ab8,10,&aa8[0],&aa8[10],&aa8[20],&aa8[30],&aa8[40],&aa8[50],&aa8[60],&aa8[70],&aa8[80]);

    call(arbb_seven)(ab1,ab2,ab3,ab4,ab5,ab6,ab7);
    std::cout << " seven arg ok " << std::endl;
    call(arbb_args)(ab1,ab2,ab3,ab4,ab5,ab6,ab7,ab8);
    std::cout << " eight, not for me...(out_of_bounds error) " << std::endl;

    return 0;
}

Hello,

Currently the number of supported arguments is limited to 64. The "seven" argument case works because 7 x 9 = 63 is still below this limit, whereas 8 x 9 = 72 exceeds it. A user-defined type (mat3 in your case) appears to contribute just a single entity per container entry, but each such entry ultimately consists of a series of scalars (3 x 3 = 9 scalars in your case). Intel ArBB actually builds a structure of arrays (SoA) instead of an array of structures (AoS). It lets us view the data as an array of structures and work with it accordingly, but the data is laid out as a structure of arrays in order to exploit SIMD extensions. In your case, exceeding the number of function arguments is a reminder of this fact.
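The arithmetic behind this can be sketched in plain C++ (not the ArBB API). The names below are hypothetical; only the limit of 64 and the 9 scalars per mat3 come from the explanation above, and the sketch assumes that exactly 64 scalars would still be accepted.

```cpp
#include <cstddef>

// Hypothetical constants for illustration only.
constexpr std::size_t kMaxScalarArgs  = 64;     // limit quoted above
constexpr std::size_t kScalarsPerMat3 = 3 * 3;  // nine f64 members per mat3

// Number of scalar streams a call consumes when each container argument
// holds a type that flattens to `scalars_per_element` scalars (SoA layout).
constexpr std::size_t flattened_arg_count(std::size_t containers,
                                          std::size_t scalars_per_element)
{
    return containers * scalars_per_element;
}

// Assumption: a count equal to the limit is still allowed.
constexpr bool fits_limit(std::size_t containers,
                          std::size_t scalars_per_element)
{
    return flattened_arg_count(containers, scalars_per_element) <= kMaxScalarArgs;
}

static_assert(fits_limit(7, kScalarsPerMat3),  "7 x 9 = 63 stays within the limit");
static_assert(!fits_limit(8, kScalarsPerMat3), "8 x 9 = 72 exceeds the limit");
```

By the same count, eight containers of a plain scalar type would use only 8 of the 64 slots.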

We may consider raising this limit, mainly because of user-defined types (thank you for your input!). On the other hand, there is a way not only to overcome this limit but also to try out dense<f64, 3> in this case: one may view the slices/pages of this 3-dimensional dense container as a stack of 3x3 matrices.
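To make the "stack of 3x3 matrices" view concrete, here is a minimal plain C++ sketch. Mat3Stack is a hypothetical stand-in written for illustration, not an ArBB type; it only shows how one 3-dimensional container can replace a container of nine-member structs.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a 3-dimensional container: `pages` matrices of
// 3 x 3 doubles stored contiguously. Plain C++, not the ArBB API.
class Mat3Stack
{
public:
    explicit Mat3Stack(std::size_t pages) : pages_(pages), data_(pages * 9, 0.0) {}

    std::size_t pages() const { return pages_; }

    // Element (row, col) of matrix number `page`.
    double& at(std::size_t page, std::size_t row, std::size_t col)
    {
        assert(page < pages_ && row < 3 && col < 3);
        return data_[(page * 3 + row) * 3 + col];
    }

    // Element-wise copy of the whole stack: one operation on one container
    // instead of nine member-wise assignments per matrix.
    void copy_from(const Mat3Stack& other)
    {
        assert(other.pages_ == pages_);
        data_ = other.data_;
    }

private:
    std::size_t pages_;
    std::vector<double> data_;
};
```

Passed to a function, such a container counts as one argument regardless of how many scalars each matrix holds.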

The following rules may also help to write well-performing code:

  • Match an operation's dimensionality to the dimensionality of the container, e.g. do not decompose a dense container into rows/columns using loops if a higher-dimensional operator is available.
  • Use the most specialized operation you can find in the Intel ArBB API, e.g. clamp() instead of min()/max() or select().
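As a rough analogy for the second rule, here is the same idea in standard C++17 rather than ArBB. The function names are hypothetical; std::clamp, std::min and std::max are the standard library calls. Both functions return the same values, but the second states the intent as a single operation.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// General form: clamping composed from max() and min().
std::vector<double> clamp_with_min_max(std::vector<double> v, double lo, double hi)
{
    assert(lo <= hi);
    for (double& x : v) {
        x = std::min(std::max(x, lo), hi);
    }
    return v;
}

// Specialized form: one call that names the operation directly (C++17).
std::vector<double> clamp_direct(std::vector<double> v, double lo, double hi)
{
    assert(lo <= hi);
    for (double& x : v) {
        x = std::clamp(x, lo, hi);
    }
    return v;
}
```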

It may even be possible to implement some of your intended functionality with very regular element-wise operations applied to the whole 3D dense container, so that in some cases we do not need to look at individual "slices" at all. Another example illustrating these hints is a simple matrix-vector multiplication:

void matvec_product(const dense& matrix, const dense& vector, dense& result)
{
result = add_reduce(matrix * repeat_row(vector, matrix.num_rows()));
}
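For readers without ArBB at hand, the following plain C++ sketch spells out what the one-line expression above is meant to compute. It assumes that repeat_row() stacks the vector as every row of a matrix-shaped container and that add_reduce() then sums along each row, which is what a matrix-vector product requires. matvec_reference and Matrix are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Scalar reference for add_reduce(matrix * repeat_row(vector, rows)):
// multiply every row element-wise by the vector, then sum along each row.
std::vector<double> matvec_reference(const Matrix& matrix, const std::vector<double>& vec)
{
    std::vector<double> result(matrix.size(), 0.0);
    for (std::size_t r = 0; r < matrix.size(); ++r) {
        assert(matrix[r].size() == vec.size());
        for (std::size_t c = 0; c < vec.size(); ++c) {
            result[r] += matrix[r][c] * vec[c];
        }
    }
    return result;
}
```

The explicit loops here are exactly what the whole-container formulation avoids; the sketch is only a way to check results.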

Thank you very much for providing the detailed information!
