vectorization issue

jimdempseyatthecove
October 20, 2008 7:14 AM PDT
Quoting - h_krishnan

In my application, I have the following pattern occurring at many places:

loop_over_i(i)
{
   do_something(i); // common for all loops
   do_remaining_things(i);
}

The do_something part is common for most loops and so I am trying to extract it out with a class. However, the loop no longer vectorizes. A sample code is given below:

#include <iostream>
using namespace std;

class Loop
{
public:
    Loop(int max) : max_(max), count_(0), sqr_(0)
    { }
    void more()
    { 
      ++count_ ;
      sqr_ = count_ * count_;
    }
    bool done() const
    {
      return count_ >= max_;
    }
    operator int() const
    {
      return count_;
    }
    int sqr() const
    {
      return sqr_;
    }
private:
    int count_, max_, sqr_;
};

int main()
{
    int square[50];
    // this loop vectorizes
    for (int i = 0; i < 50; ++i)
      {
        square[i] = i*i;
      }
    // this loop doesn't vectorize
    for (Loop loop(50); !loop.done(); loop.more())
    {
      square[loop] = loop.sqr();
    }
}
How can I implement what I need in a way that the loop vectorizes?

When indexing square[i], elements i, i+1, i+2, ... are in adjacent memory locations, which permits vectorization. The compiler's vectorizer recognizes loop syntax such as i<50 and ++i and can analyze such a loop as a candidate for vectorization.

When the loop control is hidden inside the class Loop and its member functions, the compiler no longer sees a simple counter with a known trip count, so it does not have the information it needs to vectorize the loop. The compiler _potentially_ could vectorize the code if the member functions are inlined into the loop, but the !loop.done() exit test may still present a problem.
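One way to keep the i<n and ++i form visible to the compiler while still factoring out the shared work is to leave the counted loop in a small helper and pass the per-loop work in as a callable. The following is only a sketch under assumptions: loop_over, do_something and fill_squares are hypothetical names, the lambda requires a C++11 compiler, and whether the loop actually vectorizes still depends on the compiler inlining the calls.

```cpp
// Hypothetical helper: runs body(i) for i in [0, n) with a plain counted
// loop, so the induction variable and trip count stay visible.
template <typename Body>
inline void loop_over(int n, Body body)
{
    for (int i = 0; i < n; ++i)
    {
        body(i);
    }
}

// Hypothetical stand-in for the work that is common to all loops.
inline int do_something(int i)
{
    return i * i;
}

// Example use: the remaining per-loop work is supplied as a lambda.
inline void fill_squares(int* out, int n)
{
    loop_over(n, [out](int i) { out[i] = do_something(i); });
}
```

Calling fill_squares(square, 50) fills the array the same way as the hand-written loop in the question. Checking the compiler's vectorization report is the only way to confirm what it did with the loop.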

Also note that your more() function computes the square for the element after the last one: on the final call it evaluates 50*50, which is never stored. Consider having more() return a bool and removing the done() function:

  for(Loop loop(50); loop.more();)

where

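   // Assumes the constructor initializes count_ to -1, so that the
   // first call moves to index 0 and sqr() always matches the index.
   inline bool more()
   {
     if(++count_ >= max_) return false;
     sqr_ = count_ * count_;
     return true;
   }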

This form will not necessarily vectorize either. That depends on how aggressive the compiler is at vectorization.
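For reference, a complete sketch of the class with more() doubling as the exit test might look like the following. It assumes count_ starts at -1 so that the index returned by operator int() and the value returned by sqr() stay in step. fill_squares_with_loop is a hypothetical wrapper added only to make the sketch easy to check.

```cpp
// Sketch of Loop where more() both advances and tests for completion.
class Loop
{
public:
    // count_ starts at -1 so the first more() call lands on index 0.
    explicit Loop(int max) : count_(-1), max_(max), sqr_(0) { }

    // Advance to the next index; returns false once the range is exhausted.
    bool more()
    {
        if (++count_ >= max_) return false;
        sqr_ = count_ * count_;   // computed only for valid indices
        return true;
    }
    operator int() const { return count_; }
    int sqr() const { return sqr_; }

private:
    int count_, max_, sqr_;
};

// Hypothetical wrapper: fills out[0..n) using the class-driven loop.
inline void fill_squares_with_loop(int* out, int n)
{
    for (Loop loop(n); loop.more();)
    {
        out[loop] = loop.sqr();
    }
}
```

With this arrangement no square is computed past the last element, and nothing is written when n is zero.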

Jim Dempsey


--------

Blog: The Parallel Void


www.quickthreadprogramming.com

