Workaround for vector dependence

Hello all,

I've got a critical image processing routine that I would very much like to accelerate. However, the compiler tells me that it can't be vectorized because of "existence of vector dependence". This is true, since the new value for a pixel is the sum of its original value and a fraction of the previous pixel (previous in the horizontal direction). The code is as follows:

for (col = firstCol + 1; col <= lastCol; ++col) {
    buffer[col] = fraction * buffer[col - 1] + buffer[col];
}

Also note that the starting column is not the first, but first+1.

Since this is pretty standard, I believe that someone has probably already come up with an idea to accelerate such a piece of code. Does anyone have any suggestions/ideas/comments?

Thanks in advance

Alex


That code is pretty much standard and it should vectorize. Put #pragma ivdep just before the loop and see if it helps. On the other hand, the vectorization may be prevented by the datatypes used in the code: some datatypes cannot be vectorized because there are no suitable instructions to support them.

Regards,
Igor Levicki

Alex,

>> new value for a pixel is the sum of its original value plus a fraction of the previous pixel

Is this the value of the previous pixel prior to modification or post modification?

If prior to modification, then you do have a vector dependence, so why not use 2 buffers and a pointer?

float BufferA[BufferSize];
float BufferB[BufferSize];
float* buffer = BufferA;   // an array name already decays to float*
void HalfAverage()
{
    // Read from the current buffer and write into the other one.
    float* OldBuffer = buffer;
    if (buffer == BufferA)
        buffer = BufferB;
    else
        buffer = BufferA;
    for (int col = firstCol + 1; col <= lastCol; ++col) {
        buffer[col] = fraction * OldBuffer[col - 1] + OldBuffer[col];
    }
}
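
The answer to the question above changes the result, not just the speed. Here is a minimal sketch comparing the two readings; the helper names inPlace and twoBuffer are hypothetical and are not taken from the original code:

```cpp
#include <cstddef>
#include <vector>

// Reading 1: previous pixel AFTER modification (in-place recurrence).
// Each iteration needs the value written by the one before it,
// so the loop carries a true dependence and runs serially.
std::vector<float> inPlace(std::vector<float> buf, float fraction,
                           std::size_t firstCol, std::size_t lastCol) {
    for (std::size_t col = firstCol + 1; col <= lastCol; ++col)
        buf[col] = fraction * buf[col - 1] + buf[col];
    return buf;
}

// Reading 2: previous pixel BEFORE modification (separate input and output).
// Every iteration reads only the untouched input, so iterations are
// independent of each other.
std::vector<float> twoBuffer(const std::vector<float>& in, float fraction,
                             std::size_t firstCol, std::size_t lastCol) {
    std::vector<float> out(in);  // columns outside the range are copied as-is
    for (std::size_t col = firstCol + 1; col <= lastCol; ++col)
        out[col] = fraction * in[col - 1] + in[col];
    return out;
}
```

With the input {1, 2, 3, 4} and a fraction of 0.5, the first version accumulates along the row while the second does not, so the two buffers are only a fix if the second reading is the intended one.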
Jim Dempsey

www.quickthreadprogramming.com

Thanks for the answer. I have tried #pragma ivdep, but the compiler still complains about a vector dependence. Also, all the data are float. I'll post sample code below.

Alex

Thanks for the answer, Jim.

>> why not use 2 buffers and a pointer?

As a matter of fact, I am already using 2 buffers; my example was not fully representative of my code. The code is in fact (with img an instance of an Image class):

for (int col = img.firstCol + 1; col <= img.lastCol; ++col) {
    buffer1[col] = buffer1[col - 1] * fr + img(row, col);
}

I have attached a self-contained C++ file that illustrates the problem. The simple loops that do not vectorize are located at line 143 and line 152. The command line for compiling is as follows:

icl /QxN /Qansi-alias /Qvec-report2 binary_Z.cpp

Any ideas/suggestions/comments more than welcome! Thanks in advance.

Alex

Attachments:

binary_Z.cpp (4.25 KB)
