Detecting Theft by Hyperobject Abuse

Intel® Cilk™ Plus employs work stealing, where threads steal work from other threads.  Though a good Intel Cilk Plus program should not depend on whether work is stolen or not, you might be curious about when it occurs in a program.  This blog shows how to satisfy that curiousity with a holder hyperobject, a generally useful abstraction that I'll abuse somewhat to detect stealing.  

Hyperobjects are Cilk's way of doing parallel reductions.  The best reference on them is the award winning paper "Reducers and Other Cilk++ Hyperobjects".   I'll summarize their proper use as background for their abuse. 

Intel® Cilk™ Plus allows control flow to fork into multiple strands of execution.  A hyperobject is a special kind of object for which there are multiple views.  Any two concurrently executing strands get separate views that they can safely update without locking.  Here is a trivial example:

#include <cilk/cilk.h>

#include <cilk/reducer_opadd.h>


cilk::reducer_opadd<int> X;


void f() {

    X += 1;

}


int g() {

    cilk_spawn f();

    X += 2;

    cilk_sync;

    return X.get_value();

}


Variable X is declared as a hyperobject for doing addition reduction over type int.    Here is what happens when function g() is called and there is an idle thread that successfully steals.

    1. The cilk_spawn causes control flow to fork into two strands of execution.  One strand calls f(), which executes X+=1.

    1. The other strand is a continuation of the caller's execution.  The idle thread may steal the continuation and executes "X+=2".  If stealing does not occur, the strand executes after f() returns.

    1. Execution waits at the cilk_sync until both strands complete. 

    1. The value of X is returned. 



As a practical matter, stealing is unlikely in this example because f() executes so quickly that the original thread will get to execution of the continuation before a thief can grab it.  But for exposition's sake, assume that += is slow. 

If X were an ordinary int, having two strands concurrently update X would be unsafe, because one of the updates might stomp on the other.  Declaring it as a hyperobject avoids the problem.  When the thief operates on X, it gets a gets a fresh view, initialized to 0, the identity element for addition.   The cilk_sync causes the views to be merged into a single view.  The declaration of X implies merging by addition, so the net effect of calling g() is X+=3.  

If the continuation is not stolen, a fresh view is not created.  The X+=1 happens first, followed by X+=2, both on the same view.  Thus the next effect of g() is still X+=3.

The reduction operation should be associative, or as in the case of floating-point addition, practically associative for given circumstances.  But it need not be commutative.   When two views merge, the reduction operation is always applied such that the left operand is the view for the spawned routine and the right operand is the view for the stolen continuation.  

Now about detecting steals.  The idea is to detect when a view is fresh.  I'll use a global boolean hyperobject "Seen" for this purpose.   I'll use "Seen==false" to indicate that a view is fresh.  This convention is simplifies the code by exploiting default initialization of new views, so I do not have to explicitly specify their value.  

The domain of the "reduction" is boolean values.  The reduction operation is "x op y → x".  It's associative but not commutative.  A hyperobject with this reduction operation is called a holder, because it holds the left value.   The right value is irrelevant because it corresponds to the view created by a thief, and inspected by that thief.    (Exercise for the mathematically inclined: do left and right identity values exist for this operation?  What are they?)

Here is a complete example of using a holder to detect steals.  You can compile and run with Intel® Cilk Plus compiler.    

#include <cilk/cilk.h>

#include <cilk/reducer.h>


template<typename U>

class Holder {

    struct Monoid : cilk::monoid_base<U> {

        static void reduce(U *left, U *right) {}

    };


    cilk::reducer<Monoid> impl;

public:

     inline U& get_view() {

        return impl.view();

    }

};


Holder<bool> Seen;


#include <cstdio>


int main() {

    Seen.get_view() = true;

    cilk_for( int i=0; i<100000000; ++i ) {

        bool& x = Seen.get_view();

        if( !x ) {

            std::printf("Iteration %d was stolenn",i);

            x = true;  // Must not forget this part.

        }

    }

    return 0;

}


 Here is an explanation of the program's parts:

    • Template Holder<U> defines a holder for views of type U, using the template cilk::monoid_base defined in <cilk/reducer.h>.  

    • Template class cilk::reducer<Monoid> requires that signature Monoid::reduce compute "*left = *left op *right".  The implementation is trivial for the holder reduction operation  "x op y → x".

    • The views conceptually live in Holder::impl.  A strand invokes impl.view() to get a reference to its view.

    • Seen is declared as a Holder<bool>.  The views of the bool live in Seen.impl.  The initial view is default initialized to false.

    • Function main marks the initial view as seen.

    • Function main executes a cilk_for loop, which parcels out chunks of iterations as work.

    • Each iteration inspects its view of Seen.  If the view is false, then it is a freshly created view, which indicates that the chunk was stolen.  The view is  marked as seen after the theft is reported.



The code abuses hyperobjects in the sense that its visible behavior depends on whether steals happen or not.  Not all uses of holders are abusive.  Consider a scratchpad variable that is used for temporary storage, but its final value does not matter.  Changing the variable to a holder enables a Cilk program to safely operate on it, without any locks, because each thread will get its own view as necessary.  Furthermore,  each view is operated on in the left-to-right order of the original program.  For some applications, that's a valuable property that enables maintaining complex state in the scratchpad, which is not necessarily a practical thing to do with thread-local storage.

Footnote: Intel and Cilk are trademarks of Intel Corporation in the U.S. and/or other countries.

有关编译器优化的更完整信息,请参阅优化通知