Tuner double rescheduled

As an exercise, I am trying to implement a bitonic sorter with CnC.

My (naive) implementation uses one item collection, one tag collection, and one step collection for the sort part, and another set of the three for the merge part.

For both merge and sort, the tags are composed of lowIndex, delta, and direction. (lowIndex and lowIndex + delta are the bounds of the interval of the vector to sort/merge; direction is either ascending or descending.)
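To make the tag layout concrete, here is a minimal sequential sketch in plain C++ with no CnC involved. The names Tag, Direction, bitonic_sort, and bitonic_merge are hypothetical stand-ins and not the definitions from the attached header. The sketch also assumes that delta is the interval length and a power of two, which is how the tuner further down splits it.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the types in the attached header.
enum Direction { ASCENDING, DESCENDING };

struct Tag {
    int lowIndex;   // first index of the interval
    int delta;      // interval length (assumed to be a power of two)
    Direction dir;  // order the interval should end up in
};

// Turn a bitonic interval into a sorted one: compare-exchange elements
// that are delta/2 apart, then merge both halves in the same direction.
void bitonic_merge(std::vector<int>& v, const Tag& t) {
    if (t.delta <= 1) return;
    const int m = t.delta / 2;
    for (int i = t.lowIndex; i < t.lowIndex + m; ++i) {
        const bool out_of_order =
            (t.dir == ASCENDING) ? (v[i] > v[i + m]) : (v[i] < v[i + m]);
        if (out_of_order) std::swap(v[i], v[i + m]);
    }
    bitonic_merge(v, Tag{t.lowIndex, m, t.dir});
    bitonic_merge(v, Tag{t.lowIndex + m, m, t.dir});
}

// Sort the lower half ascending and the upper half descending, which
// yields a bitonic interval, then merge it in the requested direction.
void bitonic_sort(std::vector<int>& v, const Tag& t) {
    assert(t.delta >= 0 && (t.delta & (t.delta - 1)) == 0);
    if (t.delta <= 1) return;
    const int m = t.delta / 2;
    bitonic_sort(v, Tag{t.lowIndex, m, ASCENDING});
    bitonic_sort(v, Tag{t.lowIndex + m, m, DESCENDING});
    bitonic_merge(v, t);
}
```

Calling bitonic_sort(v, Tag{0, 8, ASCENDING}) on a vector of eight ints sorts it in place. In the CnC version, each recursive call would correspond to one tag put into the sort or merge tag collection.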

I wanted to use a tuner, since I have dependencies for the merge step on the two halves that I have to merge, but when I use it the number of rescheduled steps doubles instead of decreasing.

 

struct merge_tuner : public CnC::step_tuner<>{
    template< class dependency_consumer >
    void depends( const bitonic_tag & tag, bitonic_context & c, dependency_consumer & dC ) const;
};

template< class dependency_consumer >
void merge_tuner :: depends( const bitonic_tag & tag, bitonic_context & c, dependency_consumer & dC ) const{
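    // A merge over a single element has nothing to wait for.
    if(tag.delta > 1){
        int lo = tag.lowIndex;
        int m = tag.delta / 2;

        // The two halves this merge step consumes.
        bitonic_tag first (lo, m, ASCENDING);
        bitonic_tag second (lo + m, m, DESCENDING);

        // Declare both items up front so the step is only scheduled once they exist.
        dC.depends(c.m_merge_item,first);
        dC.depends(c.m_merge_item,second);
    }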

}
CnC::step_collection< bitonic_merge_step, merge_tuner>       merge_steps;

 

Any idea what I am doing wrong?

Attachments:
bitonic_0.h (1.16 KB)
bitonic_indexes_0.cpp (8.34 KB)

Hi,

The printed statistics are a little misleading. The re-queue count includes the step instances that were delayed by tuner::depends. Without the tuner, some of the merge steps find all their input available when they actually get scheduled, so they are never re-queued. Hence the number of re-queues appears to be lower without the tuner.

I guess we should improve the statistics output in the upcoming release. Thanks for pointing this out.

Just FYI: I checked your program and it behaves correctly. Activating your tracing statements and using wc to count the number of suspended invocations shows that there is not a single suspend when using the tuner:

~/cnc/apps/bitonic> ./bitonic_indexes 5 |& grep "Suspend" | wc -l
0

If I disable the tuner, I get between 2048 and 2054 suspends:

~/cnc/apps/bitonic> ./bitonic_indexes 5 |& grep "Suspend" | wc -l
2054

The actual number of instances is 4095, so roughly half of them have all their input available right away when they get scheduled.

 
