    h_krishnan, October 19, 2008 8:28 PM PDT
    vectorization issue

    In my application, I have the following pattern occurring at many places:

    loop_over_i(i)
    {
       do_something(i); // common for all loops
       do_remaining_things(i);
    }

    The do_something part is common to most loops, so I am trying to extract it into a class. However, once I do that, the loop no longer vectorizes. Sample code is given below:

    #include <iostream>
    using namespace std;
    
    class Loop
    {
    public:
        Loop(int max) : max_(max), count_(0), sqr_(0)
        { }
        void more()
        { 
          ++count_ ;
          sqr_ = count_ * count_;
        }
        bool done() const
        {
          return count_ >= max_;
        }
        operator int() const
        {
          return count_;
        }
        int sqr() const
        {
          return sqr_;
        }
    private:
        int count_, max_, sqr_;
    };
    
    int main()
    {
        int square[50];
        // this loop vectorizes
        for (int i = 0; i < 50; ++i)
          {
            square[i] = i*i;
          }
        // this loop doesn't vectorize
        for (Loop loop(50); !loop.done(); loop.more())
        {
          square[loop] = loop.sqr();
        }
    }
    How can I implement what I need in a way that the loop vectorizes?


    kalven, October 19, 2008 11:12 PM PDT
    Re: vectorization issue




    jimdempseyatthecove, October 20, 2008 6:14 AM PDT
    Re: vectorization issue

    When you index square[i] with a plain counter, elements i, i+1, i+2, ... are in adjacent memory locations, which is what permits vectorization. The compiler's vectorizer recognizes loop syntax such as i < 50 and ++i and can analyze such a loop as a candidate for vectorization.

    When the loop is driven by the class Loop and its member functions, the compiler does not have the information it needs to vectorize the loop. It could _potentially_ vectorize the code if the member functions are inlined, but the !loop.done() test may still present a problem.
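To make this concrete, here is a minimal sketch of the Loop-based loop with its member functions expanded by hand, next to the plain counted form. The function names fill_inlined and fill_counted are hypothetical and do not appear in the original code. In the first form, the value stored in each iteration was computed at the end of the previous iteration, so it is carried across iterations; in the second, each iteration depends only on its own index. Whether either form actually vectorizes depends on the compiler and its settings.

```cpp
#include <cassert>

// Hypothetical helper: the Loop-based loop with more(), done(), operator int()
// and sqr() expanded by hand into plain local variables.
void fill_inlined(int* square, int max)
{
    assert(square != nullptr || max <= 0);
    int count = 0;
    int sqr = 0;                 // state carried from one iteration to the next
    while (!(count >= max))      // !loop.done()
    {
        square[count] = sqr;     // square[loop] = loop.sqr()
        ++count;                 // loop.more() ...
        sqr = count * count;     // ... which also computes one value past the end
    }
}

// Hypothetical helper: the same result written as a counted loop in which
// each iteration depends only on its own index.
void fill_counted(int* square, int max)
{
    assert(square != nullptr || max <= 0);
    for (int i = 0; i < max; ++i)
    {
        square[i] = i * i;
    }
}
```

Both functions fill square[0] through square[max-1] with the same values; only the shape of the loop differs.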

    Also note that your more() function computes the result for the element after the last element. Consider changing more() to return a bool and removing the done() function.

      for(Loop loop(50); loop.more();)

    where

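       // Assumes the constructor initializes count_ to -1 (one before the first
       // element), so that operator int() returns the index that matches sqr().
       inline bool more()
       {
         if(count_ + 1 >= max_) return false; // nothing left; no value is computed past the end
         ++count_;
         sqr_ = count_ * count_;
         return true;
       }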

    This form will not necessarily vectorize either. That depends on how aggressive the compiler is about vectorization.
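A different way to factor out the common step, sketched here under stated assumptions, is to keep the plain counted loop and pass the loop-specific part in as a callable. This is not the approach described above, just one possible alternative. The names do_something (as a stand-in for the common step), loop_over and fill_squares are hypothetical, and the lambda requires C++11 or later; a hand-written function object would play the same role with an older compiler. Whether the resulting loop vectorizes still depends on the compiler being able to inline the callable.

```cpp
#include <cassert>

// Hypothetical stand-in for the "common for all loops" step from the question.
inline int do_something(int i)
{
    return i * i;
}

// Hypothetical helper: runs a plain counted loop, performs the common step,
// and hands the index and the common result to the loop-specific body.
template <typename Body>
inline void loop_over(int max, Body body)
{
    assert(max >= 0);
    for (int i = 0; i < max; ++i)
    {
        const int sqr = do_something(i); // common part
        body(i, sqr);                    // loop-specific part
    }
}

// Example use: fills square[0..max-1] with i*i.
inline void fill_squares(int* square, int max)
{
    loop_over(max, [square](int i, int sqr) { square[i] = sqr; });
}
```

The loop header stays in the i < max, ++i form, and the index is an ordinary local variable rather than a data member of an object.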

    Jim Dempsey



    Blog: The Parallel Void
    www.quickthreadprogramming.com

    h_krishnan, October 23, 2008 8:49 PM PDT
    Re: vectorization issue

    Good point about the issue with my more() function. I didn't realize I was doing something so stupid.
    I am using inline functions throughout, but vectorization still failed with a "known dependency" message.
    I guess I need to find another way to implement this.
    Thanks for your feedback.

     



    jimdempseyatthecove, October 25, 2008 9:25 AM PDT
    Re: vectorization issue

    The fundamental problem is that by making the code opaque through encapsulation with member functions and iterators, you also make it difficult, if not impossible, for the compiler to vectorize the code, or even to determine whether it is vectorizable (as it chews through the data). This is the penalty you pay for progress. With "old school" programming techniques, such as keeping one array per member holding that member for every object (as opposed to a C++ array of objects), the data layout favors vectorization. The requirements of the application indicate which of the two techniques is better. If you need vectorization for performance, consider unpackaging the objects.
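As a rough illustration of the two layouts, here is a minimal sketch. The types Particle and Particles and the functions shift_x_objects and shift_x_members are hypothetical examples, not taken from the application discussed in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Array of objects: the x, y and mass of one particle sit next to each other,
// so consecutive x values are separated by the other members.
struct Particle
{
    float x;
    float y;
    float mass;
};

inline void shift_x_objects(std::vector<Particle>& particles, float dx)
{
    for (std::size_t i = 0; i < particles.size(); ++i)
    {
        particles[i].x += dx;   // strided access
    }
}

// Arrays of like members: every x is adjacent to the next x.
struct Particles
{
    std::vector<float> x;
    std::vector<float> y;
    std::vector<float> mass;
};

inline void shift_x_members(Particles& particles, float dx)
{
    const std::size_t n = particles.x.size();
    // All member arrays are expected to describe the same number of particles.
    assert(particles.y.size() == n && particles.mass.size() == n);
    float* x = particles.x.data();
    for (std::size_t i = 0; i < n; ++i)
    {
        x[i] += dx;             // contiguous access
    }
}
```

Both functions do the same work. In the second layout the loop reads and writes one contiguous run of floats, which is the pattern a vectorizer handles most easily.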

    Jim

     



