<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Blogs &#187; graph</title>
	<atom:link href="http://software.intel.com/en-us/blogs/tag/graph/feed/" rel="self" type="application/rss+xml" />
	<link>http://software.intel.com/en-us/blogs</link>
	<description></description>
	<lastBuildDate>Fri, 25 May 2012 22:49:19 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.1.3</generator>
		<item>
		<title>Understanding the Internals of tbb::graph : Balancing Push and Pull</title>
		<link>http://software.intel.com/en-us/blogs/2011/05/26/understanding-the-internals-of-tbbgraph-balancing-push-and-pull/</link>
		<comments>http://software.intel.com/en-us/blogs/2011/05/26/understanding-the-internals-of-tbbgraph-balancing-push-and-pull/#comments</comments>
		<pubDate>Thu, 26 May 2011 16:57:21 +0000</pubDate>
		<dc:creator>Michael Voss (Intel)</dc:creator>
				<category><![CDATA[Parallel Programming]]></category>
		<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA["Graph Community Preview Feature"]]></category>
		<category><![CDATA[graph]]></category>
		<category><![CDATA[TBB]]></category>

		<guid isPermaLink="false">http://software.intel.com/en-us/blogs/2011/05/26/understanding-the-internals-of-tbbgraph-balancing-push-and-pull/</guid>
		<description><![CDATA[In this post, I’m going to describe the hybrid push-pull protocol used by Intel® Threading Building Blocks graph Community Preview Feature.  You can find other graph-related posts here, including an introduction and some examples.  You can download the open-source version of Intel® TBB at http://www.threadingbuildingblocks.org and are encouraged to provide feedback about the graph via the [...]]]></description>
			<content:encoded><![CDATA[<p>In this post, I’m going to describe the hybrid push-pull protocol used by Intel® Threading Building Blocks graph Community Preview Feature.  You can find other graph-related posts <a href="http://software.intel.com/en-us/blogs/tag/graph" target="_blank">here</a>, including an introduction and some examples.  You can download the open-source version of Intel® TBB at <a href="http://www.threadingbuildingblocks.org" target="_blank">http://www.threadingbuildingblocks.org</a> and are encouraged to provide feedback about the graph via <a href="http://software.intel.com/en-us/forums/intel-threading-building-blocks/" target="_blank">the forum</a>.</p>
<p>The hybrid push-pull protocol used by tbb::graph biases communication to prevent polling and to reduce unnecessary retries.  Understanding the details of this protocol is not necessary to use tbb::graph, but it makes understanding its performance easier. </p>
<p>Nodes in a graph are persistent and exist until a user explicitly destroys them. But unlike some actor systems, a thread is not assigned to each tbb::graph node.  Tasks are created on demand to execute node bodies and pass messages between nodes when there is activity in the graph.   Consequently, a tbb::graph node does not spin in a loop waiting for messages to arrive.  Instead, when a message arrives, a task is created to apply the receiving node’s body to the incoming message.</p>
<p>If nodes always accept incoming messages, this is straightforward to implement.  Each time a message arrives at a node, a task can be created to apply the body to that message and to forward the result to the node’s successors.  However some nodes, such as function_nodes or limiter_nodes, can reject an incoming message.  For example, a function_node will reject an incoming message if it has reached its maximum allowable concurrency.  </p>
<p>The challenge is to create a protocol for dealing with message rejection that is efficient and ensures that messages aren’t accidentally dropped.   To be efficient in a non-preemptive tasking system like Intel® Threading Building Blocks, it’s important to not create many small tasks or tasks that waste resources by spinning.  Creating a new task to retry at each rejection may generate many small useless tasks, and repeatedly issuing <em>sends</em> in a loop until one is accepted is likewise inefficient.</p>
<p>Instead, tbb::graph uses a hybrid push-pull protocol as a more efficient alternative.</p>
<p style="text-align: center;"><a href="http://software.intel.com/en-us/blogs/wordpress/wp-content/uploads/2011/05/StateDiagram.bmp"><img class="size-full wp-image-34287 aligncenter" title="StateDiagram" src="http://software.intel.com/en-us/blogs/wordpress/wp-content/uploads/2011/05/StateDiagram.bmp" alt="" width="410" height="161" /></a></p>
<p style="text-align: left;">Figure 1: A state diagram for the hybrid push-pull protocol used by tbb::graph.</p>
<p style="text-align: left;">In the state diagram shown in Figure 1, edges dynamically switch between a push and a pull protocol at rejections.  An Intel® TBB graph can be described as G = ( V, S, L ), where V is the set of nodes, S is the set of edges that are currently using a push protocol, and L is the set of edges that are currently using a pull protocol.  For each edge (Vi, Vj), Vi is the predecessor / sender and Vj is the successor / receiver.  When an edge is in the push set S, messages over it are initiated by the sender, which tries to put to the receiver.  When an edge is in the pull set L, messages are initiated by the receiver, which tries to get from the sender.  If a message attempt across an edge fails, the edge is moved to the other set.  For example, if a put across the edge (Vi, Vj) is rejected, the edge is removed from the push set S and placed in the pull set L.</p>
<p style="text-align: left;">This protocol results in a reduction in the messages across an edge, while maintaining quick response times.  If a sender, Vi, generates data at a faster rate than its successor Vj, the edge will transition into pull mode, eliminating the many rejections that Vi would see if it were to continue to send.   Likewise, if a receiver Vj processes data faster than its sender Vi, the edge will stay in push mode, allowing Vi to send data as soon as it is generated.</p>
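<p style="text-align: left;">The transitions described above can be sketched as a tiny state machine for a single edge.  This is a toy model for illustration only, not code from the tbb::graph implementation: the names EdgeMode, Edge, on_put and on_get are hypothetical, and the choice to start every edge in push mode is an assumption.</p>

```cpp
#include <cassert>

// Toy model of one edge in the hybrid push-pull protocol (not TBB code).
// All names here are hypothetical and exist only for this sketch.
enum class EdgeMode { Push, Pull };

struct Edge {
    // Assumption: a new edge starts in the push set S.
    EdgeMode mode = EdgeMode::Push;

    // Sender-initiated attempt, only meaningful while the edge is in push mode.
    // `accepted` says whether the receiver took the message.
    void on_put(bool accepted) {
        assert(mode == EdgeMode::Push);
        if (!accepted) mode = EdgeMode::Pull;   // rejected put: move S -> L
    }

    // Receiver-initiated attempt, only meaningful while the edge is in pull mode.
    // `supplied` says whether the sender had an item to hand over.
    void on_get(bool supplied) {
        assert(mode == EdgeMode::Pull);
        if (!supplied) mode = EdgeMode::Push;   // rejected get: move L -> S
    }
};
```

<p style="text-align: left;">A successful attempt leaves the edge where it is; only a failed put or a failed get moves it to the other set, which is why a slow receiver ends up pulling and a fast receiver keeps being pushed to.</p>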
<p style="text-align: left;">There are two interesting scenarios in a tbb::graph program when considering this protocol: (1) when there is no buffering between the sender and receiver and (2) when there is buffering between the sender and receiver.   Using tbb::graph, there are two ways that one can deal with a node that rejects messages.  If no buffering is placed before the node, rejected messages will be dropped.  If buffering is placed before the receiver, then messages will be buffered until the receiver can consume them.  The protocol in Figure 1 works in both cases to reduce unnecessary rejections.</p>
<p style="text-align: left;">Let’s first consider the unbuffered case, as shown in Figure 2 below.</p>
<p style="text-align: left;"><a href="http://software.intel.com/en-us/blogs/wordpress/wp-content/uploads/2011/05/Unbuffered.bmp"><img class="aligncenter size-full wp-image-34289" title="Unbuffered" src="http://software.intel.com/en-us/blogs/wordpress/wp-content/uploads/2011/05/Unbuffered.bmp" alt="" /></a></p>
<p style="text-align: left;">Figure 2: A simple sub-graph with two function_nodes.</p>
<p style="text-align: left;">In Figure 2, we have a function_node, f1, sending its output to a function_node, f2.   Node f1 has unlimited concurrency, which allows its body to be applied concurrently to multiple inputs. Node f2, however, is restricted to a concurrency of 1.    If f2 is busy applying its body to another message when a new message arrives from f1, it rejects the incoming message.  Without a buffer between the two nodes, the message is dropped.  Admittedly, it is rare that a user will want messages to be dropped.  In fact, in an upcoming update of the graph API, buffering will be added by default to the input of a function_node if it has limited concurrency.   After that update, users will have to explicitly choose the dropping behavior.</p>
<p style="text-align: left;">In any case, let’s say that f2 in Figure 2 rejects a message sent by f1.  Because of the protocol in Figure 1, the edge f1-&gt;f2 is changed so that f1 will no longer put to f2 and instead f2 must pull from f1.  This does not mean that f1 will not apply its body to messages as they arrive at f1, but just that it will no longer send the results of its body to f2.   If f1 had other successors that did not reject the message, they would continue to receive subsequent results.</p>
<p style="text-align: left;">When f2 finishes executing its body, it becomes free to process new messages and tries to pull from f1.  Because a function_node does not have input or output buffering, it can never produce output on demand.  Thus, it always rejects attempts to pull from it.  This rejection returns the edge to push mode, and subsequently f1 will again push to f2. </p>
<p style="text-align: left;">So what has this accomplished?  If f1 and f2 are imbalanced, these switches, from push to pull and then back to push, reduce the number of failed puts sent from f1 to f2.  While f2 is busy, f1 stops sending messages that will ultimately be rejected.  Only when f2 becomes free is the edge re-established for pushing.</p>
<p style="text-align: left;">The more common case, and what will become the default in an upcoming update, is when there is buffering between the two nodes as shown in Figure 3.</p>
<p style="text-align: left;"><a href="http://software.intel.com/en-us/blogs/wordpress/wp-content/uploads/2011/05/Buffered.bmp"><img class="aligncenter size-full wp-image-34291" title="Buffered" src="http://software.intel.com/en-us/blogs/wordpress/wp-content/uploads/2011/05/Buffered.bmp" alt="" /></a></p>
<p style="text-align: left;">Figure 3: A simple sub-graph with two function_nodes and a queue between them.</p>
<p>In this case, f1 puts its result to q1, which always accepts.  In turn, q1 attempts to forward items to f2 in first-in first-out order.  If f2 rejects an item, it remains at the head of the queue in q1, until it can be successfully passed to f2.  In this example, the edge f1-&gt;q1 will always stay in push mode, since q1 does not reject.  However, the edge q1-&gt;f2 may change since f2 can reject if it is busy.</p>
<p>Again, let’s assume that at times items arrive in q1 at a faster rate than they can be consumed by f2.  If f2 rejects a send from q1, the edge q1-&gt;f2 is transitioned to the pull state.  Because q1 buffers rejected items, this message is not lost.  When f2 finishes executing its body, it tries to pull from q1.  Since there is an item buffered in q1, the pull succeeds and the item is given to f2.  Since the pull was successful, the edge q1-&gt;f2 stays in pull mode.  When f2 finishes execution on the item it has just pulled, it will pull again from q1.  It will continue to do so, until q1 is empty and therefore rejects f2’s pull request.</p>
<p>As with the previous scenario, the switching of modes from push to pull reduces the number of messages across the edge.  Only when f2 is available to do work does it make a request for an item from q1, thereby removing the need for q1 to periodically poll or send items speculatively.</p>
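<p>The buffered scenario can be walked through with a small single-threaded simulation.  This is only a sketch under stated assumptions: BufferedEdgeSim and its members are hypothetical names, f2 is modeled as a busy flag with a concurrency of 1, and none of this is the real tbb::graph code.  It counts how many puts from q1 to f2 are rejected.</p>

```cpp
#include <deque>

// Toy, single-threaded model of f1 -> q1 -> f2 (hypothetical, not TBB code).
struct BufferedEdgeSim {
    std::deque<int> q1;       // items buffered in q1, in FIFO order
    bool f2_busy = false;     // f2 has a concurrency of 1
    bool pull_mode = false;   // state of the edge q1->f2
    int rejected_puts = 0;    // puts from q1 that f2 rejected
    int processed = 0;        // items f2 has finished

    // f1 puts an item to q1. q1 always accepts, then may try to push to f2.
    void arrive(int item) {
        q1.push_back(item);
        if (pull_mode) return;        // in pull mode q1 does not push
        if (f2_busy) {                // f2 rejects: the edge switches to pull
            ++rejected_puts;
            pull_mode = true;
        } else {                      // f2 accepts the head of the queue
            q1.pop_front();
            f2_busy = true;
        }
    }

    // f2 finishes its body. If the edge is in pull mode, f2 pulls from q1.
    void f2_done() {
        ++processed;
        f2_busy = false;
        if (!pull_mode) return;
        if (q1.empty()) {             // q1 rejects the pull: back to push mode
            pull_mode = false;
        } else {                      // the pull succeeds: stay in pull mode
            q1.pop_front();
            f2_busy = true;
        }
    }
};
```

<p>If five items arrive while f2 is still working on the first one, this model records a single rejected put (the second item).  The remaining items are buffered silently and later pulled one at a time as f2 becomes free, and the edge returns to push mode only once q1 is empty.</p>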
]]></content:encoded>
			<wfw:commentRss>http://software.intel.com/en-us/blogs/2011/05/26/understanding-the-internals-of-tbbgraph-balancing-push-and-pull/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Using the Intel® Threading Building Blocks Graph Community Preview Feature: An Implementation of Dining Philosophers.</title>
		<link>http://software.intel.com/en-us/blogs/2011/01/10/using-the-intel-threading-building-blocks-graph-community-preview-feature-an-implementation-of-dining-philosophers/</link>
		<comments>http://software.intel.com/en-us/blogs/2011/01/10/using-the-intel-threading-building-blocks-graph-community-preview-feature-an-implementation-of-dining-philosophers/#comments</comments>
		<pubDate>Mon, 10 Jan 2011 18:59:34 +0000</pubDate>
		<dc:creator>Michael Voss (Intel)</dc:creator>
				<category><![CDATA[Open Source]]></category>
		<category><![CDATA[Parallel Programming]]></category>
		<category><![CDATA[Community preview feature]]></category>
		<category><![CDATA[graph]]></category>
		<category><![CDATA[TBB]]></category>
		<category><![CDATA[Threading Building Blocks]]></category>

		<guid isPermaLink="false">http://software.intel.com/en-us/blogs/2011/01/10/using-the-intel-threading-building-blocks-graph-community-preview-feature-an-implementation-of-dining-philosophers/</guid>
		<description><![CDATA[Intel® Threading Building Blocks (Intel® TBB) Version 3 Update 5 introduced the class graph as a Community Preview (CP) feature. There is an introductory post that provides an overview of the class and the nodes that can be used with it. You can download the open-source version of this release at www.threadingbuildingblocks.org and are encouraged [...]]]></description>
			<content:encoded><![CDATA[<p>Intel® Threading Building Blocks (Intel® TBB) Version 3 Update 5 introduced the class <code>graph</code> as a <a href="http://software.intel.com/en-us/articles/intel-tbb-community-preview-features/">Community Preview (CP) feature</a>. There is <a href="http://software.intel.com/en-us/blogs/2010/12/23/intel-threading-building-blocks-version-30-update-5-introduces-graph-as-a-community-preview-feature-2/">an introductory post</a> that provides an overview of the class and the nodes that can be used with it. You can download the open-source version of this release at <a href="http://www.threadingbuildingblocks.org">www.threadingbuildingblocks.org</a> and are encouraged to provide feedback about the graph via the <a href="http://software.intel.com/en-us/forums/intel-threading-building-blocks/">forum</a>. In a previous post, I provided <a href="http://software.intel.com/en-us/blogs/2011/01/03/using-the-intel-threading-building-blocks-graph-community-preview-feature-creating-a-simple-message-graph/">an example that created a simple message graph</a>.  In this post, I describe a more complicated example that highlights some interesting features of the API.</p>
<p>This example will demonstrate:</p>
<ul>
<li>How to use the graph's run function</li>
<li>How to mix explicit puts with explicit edges</li>
<li>The non-greedy nature of the join_node</li>
</ul>
<p>In this post, I'll provide an implementation for the Dining Philosophers problem shown below.  In this problem, several philosophers are sitting together at a table.  Each philosopher needs to both think and eat, but can only do one of these at a time.  They each think, eat, think, eat, etc...  In the figure below, the philosophers are using chopsticks to eat noodles.  They must grab both the chopstick to their left and the chopstick to their right before eating. To complicate things, the chopsticks are shared with their neighbors.  So a philosopher's left chopstick is their left neighbor's right chopstick.  And their right chopstick is their right neighbor's left chopstick.  </p>
<p>  <a href="http://software.intel.com/en-us/blogs/wordpress/wp-content/uploads/2011/01/DiningPhilosophersTable.bmp"><img class="aligncenter size-full wp-image-22640" src="http://software.intel.com/en-us/blogs/wordpress/wp-content/uploads/2011/01/DiningPhilosophersTable.bmp" alt="" width="228" height="281" /></a></p>
<p>Dining Philosophers is a challenging problem because it will deadlock without proper cooperation between the philosophers.  For example, if all of the philosophers start by grabbing their left chopstick, then there will be no right chopstick available for any of them.  None of them will be able to eat (and subsequently think) unless their right neighbor gives up the chopstick they have already claimed.  There are a number of <a href="http://en.wikipedia.org/wiki/Dining_philosophers_problem">existing solutions to the Dining Philosophers problem</a>.</p>
<p>I'll use the <code>tbb::graph</code> and its associated node classes to implement a solution to Dining Philosophers.  In my solution, each philosopher will be an object that contains a <code>join_node</code> that will capture the chopsticks and a <code>function_node</code> that will perform the eating and thinking. The chopsticks will be null objects and their places on the table will be implemented as <code>queue_nodes</code>. If a <code>queue_node</code> has an item, it means that the chopstick is available at that place; otherwise it is not available. Each <code>queue_node</code> will contain at most one item.  The graph for 4 philosophers will therefore be structured like the figure below.</p>
<p><a href="http://software.intel.com/en-us/blogs/wordpress/wp-content/uploads/2011/01/DiningPhilosophersGraph.bmp"><img class="aligncenter size-full wp-image-22643" src="http://software.intel.com/en-us/blogs/wordpress/wp-content/uploads/2011/01/DiningPhilosophersGraph.bmp" alt="" width="476" height="318" /></a></p>
<p>As with all Community Preview Features, the graph must be explicitly enabled. This is done by defining its macro, <code>TBB_PREVIEW_GRAPH</code>, before including the header file as shown below.</p>
<p><code>#define TBB_PREVIEW_GRAPH 1</code><br />
<code>#include "tbb/graph.h"</code></p>
<p>The main function is shown below. </p>
<p><code>const char *names[] = </code><br />
<code>{ "Archimedes", "Aristotle", "Democritus", "Epicurus", "Euclid", </code><br />
<code>"Heraclitus", "Plato", "Pythagoras", "Socrates", "Thales" };</code></p>
<p><code>int main(int argc, char *argv[]) {</code><br />
<code>  int num_threads = 0;</code><br />
<code>  int num_philosophers = 10;</code><br />
<code>  if ( argc &gt; 1 ) num_threads = atoi(argv[1]);</code><br />
<code>  if ( argc &gt; 2 ) num_philosophers = atoi(argv[2]);</code></p>
<p><code>  if ( num_threads &lt; 1 || num_philosophers &lt; 1 || num_philosophers &gt; 10 ) exit(1);</code></p>
<p><code>  tbb::task_scheduler_init init(num_threads);</code><br />
<code>  tbb::graph g;</code><br />
<code>  printf("\n%d philosophers with %d threads\n\n", </code><br />
<code>         num_philosophers, num_threads);</code></p>
<p><code>  std::vector&lt; tbb::queue_node&lt;chopstick&gt; * &gt; places;</code><br />
<code>  for ( int i = 0; i &lt; num_philosophers; ++i ) {</code><br />
<code>    tbb::queue_node&lt;chopstick&gt; *qn_ptr = new tbb::queue_node&lt;chopstick&gt;(g);</code><br />
<code>    qn_ptr-&gt;try_put(chopstick());</code><br />
<code>    places.push_back( qn_ptr );</code><br />
<code>  }</code></p>
<p><code>  std::vector&lt; philosopher &gt; philosophers;</code><br />
<code>  for ( int i = 0; i &lt; num_philosophers; ++i ) {</code><br />
<code>    philosophers.push_back( philosopher( names[i], g,</code><br />
<code>                                         places[i], </code><br />
<code>                                         places[(i+1)%num_philosophers] ) );</code><br />
<code>    g.run( philosophers[i] );</code><br />
<code>  }</code><br />
<code>  g.wait_for_all();</code></p>
<p><code>  for ( int i = 0; i &lt; num_philosophers; ++i ) philosophers[i].check();</code></p>
<p><code>  return 0;</code><br />
<code>}</code></p>
<p>After the initial command line processing is done in the main function above, a <code>graph</code> object is instantiated. A vector <code>places</code> of <code>queue_node&lt;chopstick&gt;</code> pointers is then populated with queues that will represent the places at the table.</p>
<p>After each <code>queue_node</code> is created, a single <code>chopstick</code> object is put to it, indicating that a chopstick is initially available at that location.</p>
<p>After the queues are created, the main function then populates a vector of <code>philosopher</code> objects. After each philosopher is added to the vector, it is passed to the <code>graph</code> object's <code>run</code> function. As I will show shortly, <code>class philosopher</code> not only contains a <code>function_node</code> and <code>join_node</code> but it is also a function object, defining a <code>void operator()()</code>. The <code>graph</code>'s <code>run</code> function executes this function object in a task that is a child of the graph's root task. No calls to <code>g.wait_for_all()</code> will return until all tasks that are children of this root task complete.  The philosophers use their <code>operator()</code> functions to think once and then insert themselves into the graph. The main function ends by checking each philosopher object to verify that it has called think and eat the proper number of times.</p>
<p>There is also a version of <code>run</code> that takes a second argument: <code>template&lt;typename Receiver, typename Body&gt; void run( Receiver &amp;r, Body body )</code>. Like the version used in this example, it creates a task that runs <code>body</code>, but it also sends the value returned by <code>body</code> to the receiver <code>r</code>.</p>
<p>My declaration of <code>class philosopher</code> is shown below:</p>
<p><code>const int think_time = 1; </code><br />
<code>const int eat_time = 1; </code><br />
<code>const int num_times = 10; </code></p>
<p><code>class chopstick {}; </code></p>
<p><code>class philosopher { </code><br />
<code>public: </code></p>
<p><code>  typedef tbb::queue_node&lt; chopstick &gt; chopstick_buffer; </code><br />
<code>  typedef tbb::join_node&lt; chopstick, chopstick &gt; join_type; </code></p>
<p><code>  philosopher( const char *name, tbb::graph &amp;the_graph,</code><br />
<code>               chopstick_buffer *left, chopstick_buffer *right ) : </code><br />
<code>   my_name(name), my_graph(&amp;the_graph),</code><br />
<code>   my_left_chopstick(left), my_right_chopstick(right),</code><br />
<code>   my_join(new join_type(the_graph)), my_function_node(NULL),</code><br />
<code>   my_count(new int(num_times)) {</code><code>} </code></p>
<p><code>  void operator()(); </code><br />
<code>  void check(); </code></p>
<p><code>private: </code></p>
<p><code>  const char *my_name; </code><br />
<code>  tbb::graph *my_graph; </code><br />
<code>  chopstick_buffer *my_left_chopstick; </code><br />
<code>  chopstick_buffer *my_right_chopstick; </code><br />
<code>  join_type *my_join; </code><br />
<code>  tbb::function_node&lt; join_type::output_type, tbb::continue_msg &gt; *my_function_node; </code><br />
<code>  int *my_count; </code></p>
<p><code>  friend class node_body; </code></p>
<p><code>  void eat_and_think( ); </code><br />
<code>  void eat( ); </code><br />
<code>  void think( ); </code><br />
<code>  void make_my_node(); </code></p>
<p><code>}; </code></p>
<p>Each philosopher has a <code>const char *my_name</code> that holds its name. It also has pointers to the graph, the two chopstick queues that it is seated near, its <code>join_node</code>, its <code>function_node</code> and the counter that it will use to track how many more times it needs to eat.</p>
<p>Let's first look at the definition of <code>void operator()()</code>, which is invoked by the tasks enqueued by calls to run in main. This function calls <code>think</code> and then <code>make_my_node</code>.  So each philosopher will first think and then afterwards insert itself into the graph.</p>
<p><code>void philosopher::operator()() { </code><br />
<code>  think(); </code><br />
<code>  make_my_node(); </code><br />
<code>} </code></p>
<p>Both function <code>think</code> and function <code>eat</code> (which will be used later) are straightforward functions that just sleep:</p>
<p><code>void philosopher::think() { </code><br />
<code>  printf("%s thinking\n", my_name ); </code><br />
<code>  SLEEP(think_time); </code><br />
<code>  printf("%s done thinking\n", my_name ); </code><br />
<code>} </code></p>
<p><code>void philosopher::eat() { </code><br />
<code>  printf("%s eating\n", my_name ); </code><br />
<code>  SLEEP(eat_time); </code><br />
<code>  printf("%s done eating\n", my_name ); </code><br />
<code>} </code></p>
<p>The function <code>make_my_node</code> is responsible for creating the <code>function_node</code> and connecting both the <code>join_node</code> and <code>function_node</code> to the rest of the graph. The <code>join_node</code>'s input ports are stored in a <code>std::tuple</code>, which is returned by the call to <code>inputs()</code>. I use the template function <code>std::get</code> to access the needed element. The implementation of <code>make_my_node</code> is shown below:</p>
<p><code>void philosopher::make_my_node() { </code><br />
<code>  my_left_chopstick-&gt;register_successor( std::get&lt;0&gt;(my_join-&gt;inputs()) ); </code><br />
<code>  my_right_chopstick-&gt;register_successor( std::get&lt;1&gt;(my_join-&gt;inputs()) ); </code><br />
<code>  my_function_node = </code><br />
<code>    new tbb::function_node&lt; join_type::output_type, tbb::continue_msg &gt;( *my_graph, </code><br />
<code>      tbb::graph::serial, </code><code>node_body( *this ) ); </code><br />
<code>  tbb::make_edge( *my_join, *my_function_node ); </code><br />
<code>} </code></p>
<p>The <code>class node_body</code> is a straightforward function object that invokes the corresponding philosopher's <code>eat_and_think</code> function.</p>
<p><code>class node_body { </code><br />
<code>  philosopher &amp;my_philosopher; </code><br />
<code>public: </code><br />
<code>  node_body( philosopher &amp;p ) : my_philosopher(p) { } </code><br />
<code>  void operator()( philosopher::join_type::output_type ) { </code><br />
<code>    my_philosopher.eat_and_think(); </code><br />
<code>  } </code><br />
<code>}; </code></p>
<p>The implementation of <code>eat_and_think()</code> calls the philosopher's function <code>eat</code> and then decrements its count. If the philosopher still needs to eat and think more, then it puts its chopsticks back down on the table and thinks. Otherwise, it removes its <code>join_node</code> from the graph before putting its chopsticks back down on the table.</p>
<p><code>void philosopher::eat_and_think( ) { </code><br />
<code>  eat(); </code><br />
<code>  --(*my_count); </code></p>
<p><code>  if (*my_count &gt; 0) { </code><br />
<code>    my_left_chopstick-&gt;try_put( chopstick() ); </code><br />
<code>    my_right_chopstick-&gt;try_put( chopstick() ); </code><br />
<code>    think(); </code><br />
<code>  } else { </code><br />
<code>    my_left_chopstick-&gt;remove_successor( std::get&lt;0&gt;(my_join-&gt;inputs()) );</code><br />
<code>    my_right_chopstick-&gt;remove_successor( std::get&lt;1&gt;(my_join-&gt;inputs()) );</code><br />
<code>    my_left_chopstick-&gt;try_put( chopstick() ); </code><br />
<code>    my_right_chopstick-&gt;try_put( chopstick() ); </code><br />
<code>  } </code><br />
<code>} </code></p>
<p>The code above demonstrates that nodes can be connected by explicit edges, as is the case for the <code>queue_node</code>s and the <code>join_node</code>. User code can also do explicit <code>try_put</code>s to nodes. In this example, there is no explicit edge from the philosopher back to its chopstick queues. However, <code>eat_and_think</code> explicitly calls <code>try_put</code> to put chopstick objects into the queues.</p>
<p>Finally at the end of main, each philosopher's function <code>check</code> is called to verify that it has been executed the correct number of times (and it also does some cleanup).</p>
<p><code>void philosopher::check() { </code><br />
<code>  if ( *my_count != 0 ) { </code><br />
<code>    printf("ERROR: philosopher %s still had to run %d more times\n", my_name, *my_count); </code><br />
<code>    exit(1); </code><br />
<code>  } else { </code><br />
<code>    printf("%s done.\n", my_name); </code><br />
<code>  } </code><br />
<code>  delete my_function_node; </code><br />
<code>  delete my_join; </code><br />
<code>  delete my_count; </code><br />
<code>} </code></p>
<p>When I execute this example using four philosophers and a single thread, "philosophers 1 4", it runs in about 80 seconds. This matches the expected serial time of 4 philosophers x ( 10 thinks + 10 eats ) x 1 second each = 80 seconds. When I run it using all 8 threads available on my desktop, it completes in about 21 seconds.</p>
<p>The reason this example works at all is because of the non-greedy nature of the <code>join_node</code>. A <code>join_node</code> creates a <code>std::tuple</code> from the items it receives at its input ports. However, it does not greedily consume items as they appear. Instead, once it has received notification that an item is available at each port it then attempts to reserve each of these items. If it is successful, only then does it create the tuple and consume the items. If it cannot reserve an item at any one port, it releases all reservations it has previously made.</p>
<p>In the Dining Philosophers problem, the <code>join_node</code> prevents deadlock by never holding a chopstick unless it can acquire both. It may reserve one of the chopsticks, but if it cannot reserve the other, it puts the first one back on the table and tries again.</p>
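<p>The reserve-then-consume behavior described above can be sketched in a few lines.  This is a single-threaded toy that only illustrates the idea of taking both items or neither; Place and try_take_both are hypothetical names, and the real <code>join_node</code> also has to cope with concurrent access, which this sketch ignores.</p>

```cpp
// Toy stand-in for a place on the table (hypothetical, not TBB code).
struct Place {
    bool available = true;   // a chopstick is lying at this place
    bool reserved = false;   // someone has reserved it but not yet taken it

    bool try_reserve() {
        if (!available || reserved) return false;
        reserved = true;
        return true;
    }
    void release()  { reserved = false; }                     // give up a reservation
    void consume()  { reserved = false; available = false; }  // actually take the item
    void put_back() { available = true; }                     // return the chopstick
};

// Non-greedy acquisition: reserve both chopsticks or end up holding neither.
bool try_take_both(Place &left, Place &right) {
    if (!left.try_reserve()) return false;
    if (!right.try_reserve()) {
        left.release();      // could not get both, so let go of the first
        return false;
    }
    left.consume();
    right.consume();
    return true;
}
```

<p>Because a failed attempt always releases what it reserved, no philosopher in this model is ever left holding a single chopstick while waiting for the other, which is the property that rules out the classic deadlock.</p>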
]]></content:encoded>
			<wfw:commentRss>http://software.intel.com/en-us/blogs/2011/01/10/using-the-intel-threading-building-blocks-graph-community-preview-feature-an-implementation-of-dining-philosophers/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Using the Intel® Threading Building Blocks Graph Community Preview Feature: Creating a Simple Message Graph.</title>
		<link>http://software.intel.com/en-us/blogs/2011/01/03/using-the-intel-threading-building-blocks-graph-community-preview-feature-creating-a-simple-message-graph/</link>
		<comments>http://software.intel.com/en-us/blogs/2011/01/03/using-the-intel-threading-building-blocks-graph-community-preview-feature-creating-a-simple-message-graph/#comments</comments>
		<pubDate>Mon, 03 Jan 2011 14:40:19 +0000</pubDate>
		<dc:creator>Michael Voss (Intel)</dc:creator>
				<category><![CDATA[Open Source]]></category>
		<category><![CDATA[Parallel Programming]]></category>
		<category><![CDATA[Community preview feature]]></category>
		<category><![CDATA[graph]]></category>
		<category><![CDATA[TBB]]></category>
		<category><![CDATA[Threading Building Blocks]]></category>

		<guid isPermaLink="false">http://software.intel.com/en-us/blogs/2011/01/03/using-the-intel-threading-building-blocks-graph-community-preview-feature-creating-a-simple-message-graph/</guid>
		<description><![CDATA[Intel® Threading Building Blocks (Intel® TBB) Version 3 Update 5 introduced the class graph as a Community Preview (CP) feature. There is an introductory post that provides an overview of the class and the nodes that can be used with it. You can download the open-source version of this release at www.threadingbuildingblocks.org and are encouraged [...]]]></description>
			<content:encoded><![CDATA[<p>Intel® Threading Building Blocks (Intel® TBB) Version 3 Update 5 introduced the class <code>graph</code> as a <a href="http://software.intel.com/en-us/articles/intel-tbb-community-preview-features/">Community Preview (CP) feature</a>. There is <a href="http://software.intel.com/en-us/blogs/2010/12/23/intel-threading-building-blocks-version-30-update-5-introduces-graph-as-a-community-preview-feature-2/">an introductory post</a> that provides an overview of the class and the nodes that can be used with it. You can download the open-source version of this release at <a href="http://www.threadingbuildingblocks.org">www.threadingbuildingblocks.org</a> and are encouraged to provide feedback about the graph via the <a href="http://software.intel.com/en-us/forums/intel-threading-building-blocks/">forum</a>. In a previous post, I provided <a href="http://software.intel.com/en-us/blogs/2010/12/27/using-the-intel-threading-building-blocks-graph-community-preview-feature-creating-a-simple-dependency-graph-2/">an example that created a dependency graph</a>. In this post, I describe how to make a simple message graph using class <code>graph</code>.</p>
<p>This example will calculate the sum of x*x + x*x*x for all x = 1 to 10. This is a simple syntactic example only. Since each node in a graph may execute as an independent task, the granularity of each node should follow the general guidelines for tasks as described in Section 3.2.3 of the Intel® Threading Building Blocks Tutorial. But for demonstration purposes, I will use an artificial, tiny example here and inflate the time spent in each node by sleeping for 1 second after each operation.  I use the Linux function <code>sleep</code> for this. If you want to enter this example yourself, you'll have to use the appropriate sleep function for your system.</p>
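<p>If you are not on Linux, one portable option is the C++11 standard library. The sketch below is an assumption on my part rather than part of the original example: <code>pause_for</code> is a hypothetical helper name, and it requires a compiler that provides <code>&lt;thread&gt;</code> and <code>&lt;chrono&gt;</code>.</p>

```cpp
#include <chrono>
#include <thread>

// Hypothetical helper (not part of the original example): blocks the calling
// thread for at least the requested duration using only the standard library.
inline void pause_for(std::chrono::milliseconds duration) {
    std::this_thread::sleep_for(duration);
}
```

<p>With this helper, a call such as <code>pause_for(std::chrono::seconds(1))</code> could stand in for <code>sleep(1)</code> in the node bodies.</p>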
<p>The basic layout of the graph that I’ll create is shown in the figure below. Each value enters through the <code>input</code> node. This node will broadcast the value to both <code>squarer</code> and <code>cuber</code>, which will calculate x*x and x*x*x respectively (and sleep for 1 second). The output of each of these nodes will be placed in an unbounded buffer. A tuple containing both values will be created by the <code>join</code> node and forwarded to <code>summer</code>, which will add both values to the running total (and sleep for 1 second). The <code>squarer</code> and <code>cuber</code> will allow unlimited concurrency, that is, they will be allowed to process multiple values simultaneously. The final <code>summer</code>, which updates a shared total, will be allowed to process only a single incoming tuple at a time, eliminating the need for a lock around the shared value.</p>
<p><a href="http://software.intel.com/en-us/blogs/wordpress/wp-content/uploads/2010/12/messageFig3.jpg"><img class="aligncenter size-full wp-image-22393" src="http://software.intel.com/en-us/blogs/wordpress/wp-content/uploads/2010/12/messageFig3.jpg" alt="" width="557" height="208" /></a></p>
<p>As with all Community Preview Features, the graph must be explicitly enabled. This is done by defining its macro, <code>TBB_PREVIEW_GRAPH</code>, before including the header file as shown below.</p>
<p><code>#define TBB_PREVIEW_GRAPH 1</code><br />
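<code>#include "tbb/graph.h"</code><br />
<code>#include &lt;cstdio&gt;   // printf</code><br />
<code>#include &lt;tuple&gt;    // std::tuple, std::get</code><br />
<code>#include &lt;unistd.h&gt; // sleep (POSIX)</code></p>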
<p>This example performs three basic types of operations: square, cube and sum. The classes below define these operations and will be used as the body objects for my <code>function_node</code>s.</p>
<p><code>struct square { </code><br />
<code>  int operator()(int v) {</code><br />
<code>    printf("squaring %d\n", v);</code><br />
<code>    sleep(1); </code><br />
<code>    return v*v; </code><br />
<code>  }</code><br />
<code>};</code></p>
<p><code>struct cube {</code><br />
<code>  int operator()(int v) {</code><br />
<code>    printf("cubing %d\n", v);</code><br />
<code>    sleep(1); </code><br />
<code>    return v*v*v; </code><br />
<code>  }</code><br />
<code>};</code></p>
<p><code>class sum {</code><br />
<code>  int &amp;my_sum;</code><br />
<code>public:</code><br />
<code>  sum( int &amp;s ) : my_sum(s) {}</code><br />
<code>  int operator()( std::tuple&lt;int,int&gt; v ) {</code><br />
<code>    printf("adding %d and %d to %d\n", std::get&lt;0&gt;(v), std::get&lt;1&gt;(v), my_sum);</code><br />
<code>    my_sum += std::get&lt;0&gt;(v) + std::get&lt;1&gt;(v);</code><br />
<code>    return my_sum;</code><br />
<code>  }</code><br />
<code>};</code></p>
<p>In function <code>main</code>, the graph is set up and then the values 1 through 10 are put into the <code>input</code> node. All the nodes in this example pass around values of type <code>int</code>. The nodes used below are all class templates and therefore can be used with any type that supports copy construction, including pointers and objects. Note that values are copied as they pass between nodes, so passing around large objects should be avoided.</p>
<p><code>using namespace tbb; </code></p>
<p><code>int main() {</code><br />
<code>  int result = 0;</code></p>
<p><code>  graph g;</code><br />
<code>  broadcast_node&lt;int&gt; input;</code><br />
<code>  function_node&lt;int,int&gt; squarer( g, graph::unlimited, square() );</code><br />
<code>  buffer_node&lt;int&gt; square_buffer(g);</code><br />
<code>  function_node&lt;int,int&gt; cuber( g, graph::unlimited, cube() );</code><br />
<code>  buffer_node&lt;int&gt; cube_buffer(g);</code><br />
<code>  join_node&lt;int,int&gt; j( g );</code><br />
<code>  function_node&lt;std::tuple&lt;int,int&gt;,int&gt; summer( g, graph::serial, sum(result) );</code></p>
<p><code>  make_edge( input, squarer );</code><br />
<code>  make_edge( input, cuber );</code><br />
<code>  make_edge( squarer, square_buffer );</code><br />
<code>  make_edge( square_buffer, std::get&lt;0&gt;( j.inputs() ) );</code><br />
<code>  make_edge( cuber, cube_buffer );</code><br />
<code>  make_edge( cube_buffer, std::get&lt;1&gt;( j.inputs() ) );</code><br />
<code>  make_edge( j, summer );</code></p>
<p><code>  for (int i = 1; i &lt;= 10; ++i)</code><br />
<code>    input.try_put(i);</code><br />
<code>  g.wait_for_all();</code><br />
<code>  printf("Final result is %d\n", result);</code><br />
<code>  return 0;</code><br />
<code>}</code></p>
<p>Towards the top of the code above, the graph nodes are created: one <code>broadcast_node</code>, three <code>function_node</code> objects, two <code>buffer_node</code> objects, and one <code>join_node</code>. After the nodes are created, they are linked together using <code>make_edge</code> calls. Both the nodes and edges correspond directly to the figure presented earlier.</p>
<p>Once the graph has been set up, the values 1 through 10 are put into the <code>input</code> node. The <code>wait_for_all</code> call will block until there is no more activity in the graph. The result of the computation is then printed to <code>stdout</code> and should look something like:</p>
<p><code>cubing 10 </code><br />
<code>squaring 1</code><br />
<code>cubing 1</code><br />
<code>squaring 2</code><br />
<code>cubing 2</code><br />
<code>cubing 3</code><br />
<code>squaring 4</code><br />
<code>squaring 3</code><br />
<code>squaring 5</code><br />
<code>cubing 5</code><br />
<code>cubing 4</code><br />
<code>squaring 6</code><br />
<code>cubing 6</code><br />
<code>squaring 7</code><br />
<code>cubing 7</code><br />
<code>squaring 8</code><br />
<code>squaring 10</code><br />
<code>adding 1 and 1000 to 0</code><br />
<code>cubing 8</code><br />
<code>squaring 9</code><br />
<code>cubing 9</code><br />
<code>adding 4 and 1 to 1001</code><br />
<code>adding 9 and 8 to 1006</code><br />
<code>adding 25 and 27 to 1023</code><br />
<code>adding 36 and 125 to 1075</code><br />
<code>adding 49 and 64 to 1236</code><br />
<code>adding 64 and 216 to 1349</code><br />
<code>adding 16 and 343 to 1629</code><br />
<code>adding 100 and 512 to 1988</code><br />
<code>adding 81 and 729 to 2600</code><br />
<code>Final result is 3410</code></p>
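<p>Notice that the tuples in this run do not pair each square with the cube of the same x (for example, 1 is added together with 1000). The total is unaffected, because every square and every cube is still added exactly once: 385 + 3025 = 3410. The sketch below checks that arithmetic with plain C++ and no graph at all; the function names are hypothetical and are not part of the example above.</p>

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Sum of x*x for x = first..last (inclusive).
long sum_of_squares(int first, int last) {
    long total = 0;
    for (int x = first; x <= last; ++x) total += static_cast<long>(x) * x;
    return total;
}

// Sum of x*x*x for x = first..last (inclusive).
long sum_of_cubes(int first, int last) {
    long total = 0;
    for (int x = first; x <= last; ++x) total += static_cast<long>(x) * x * x;
    return total;
}

// Adds both members of every pair to a running total, mirroring what the
// summer body does with each tuple it receives, whatever the pairing is.
long total_of_pairs(const std::vector<std::pair<int, int> >& pairs) {
    long total = 0;
    for (std::size_t i = 0; i < pairs.size(); ++i)
        total += pairs[i].first + pairs[i].second;
    return total;
}
```

<p>Feeding <code>total_of_pairs</code> the ten pairs printed in the run above gives the same value as <code>sum_of_squares(1, 10) + sum_of_cubes(1, 10)</code>.</p>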
<p>One might note that most of the square and cube operations happened before any of the sum operations in the above run. This is an artifact of using a for loop to inject the values 1-10 into the graph. In a future example, I'll demonstrate how a <code>source_node</code> can be used to create a better execution ordering.</p>
]]></content:encoded>
			<wfw:commentRss>http://software.intel.com/en-us/blogs/2011/01/03/using-the-intel-threading-building-blocks-graph-community-preview-feature-creating-a-simple-message-graph/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Using the Intel® Threading Building Blocks Graph Community Preview Feature: Creating a Simple Dependency Graph.</title>
		<link>http://software.intel.com/en-us/blogs/2010/12/27/using-the-intel-threading-building-blocks-graph-community-preview-feature-creating-a-simple-dependency-graph/</link>
		<comments>http://software.intel.com/en-us/blogs/2010/12/27/using-the-intel-threading-building-blocks-graph-community-preview-feature-creating-a-simple-dependency-graph/#comments</comments>
		<pubDate>Mon, 27 Dec 2010 16:18:17 +0000</pubDate>
		<dc:creator>Michael Voss (Intel)</dc:creator>
				<category><![CDATA[Open Source]]></category>
		<category><![CDATA[Parallel Programming]]></category>
		<category><![CDATA[Community preview feature]]></category>
		<category><![CDATA[graph]]></category>
		<category><![CDATA[TBB]]></category>
		<category><![CDATA[Threading Building Blocks]]></category>

		<guid isPermaLink="false">http://software.intel.com/en-us/blogs/2010/12/27/using-the-intel-threading-building-blocks-graph-community-preview-feature-creating-a-simple-dependency-graph/</guid>
		<description><![CDATA[ Intel® Threading Building Blocks (Intel® TBB) Version 3 Update 5 introduced the class graph as a Community Preview (CP) feature. There is an introductory post that provides an overview of the class and the nodes that can be used with it. You can download the open-source version of this release at www.threadingbuildingblocks.org and are encouraged [...]]]></description>
			<content:encoded><![CDATA[<p>Intel® Threading Building Blocks (Intel® TBB) Version 3 Update 5 introduced the class <code>graph</code> as a <a href="http://software.intel.com/en-us/articles/intel-tbb-community-preview-features/">Community Preview (CP) feature</a>. There is <a href="http://software.intel.com/en-us/blogs/2010/12/23/intel-threading-building-blocks-version-30-update-5-introduces-graph-as-a-community-preview-feature-2/">an introductory post</a> that provides an overview of the class and the nodes that can be used with it. You can download the open-source version of this release at <a href="http://www.threadingbuildingblocks.org/">www.threadingbuildingblocks.org</a> and are encouraged to provide feedback about the graph via the <a href="http://software.intel.com/en-us/forums/intel-threading-building-blocks/">forum</a>. In this post, I describe how to make a simple dependency graph using class <code>graph</code>.</p>
<p>In this example, I set up five computations A-E, and enforce the partial ordering shown in the figure below. For each edge in the graph, the node at the tail of the edge must complete its execution before the node at the head may begin.</p>
<p><a href="http://software.intel.com/en-us/blogs/wordpress/wp-content/uploads/2010/12/depend_graph1.jpg"><img class="aligncenter size-full wp-image-22409" src="http://software.intel.com/en-us/blogs/wordpress/wp-content/uploads/2010/12/depend_graph1.jpg" alt="" width="166" height="335" /></a></p>
<p>As with all Community Preview Features, the graph must be explicitly enabled. This is done by defining its macro, <code>TBB_PREVIEW_GRAPH</code>, before including the header file as shown below. </p>
<p><code>#define TBB_PREVIEW_GRAPH 1</code><br />
<code>#include "tbb/graph.h"</code> </p>
<p>To keep this example simple, I make the bodies of the nodes just print out their names and then sleep for 1 second. I use the Linux function <code>sleep</code> for this. If you want to enter this example yourself, you'll have to use the appropriate sleep function for your system. All of these nodes are able to use the class <code>body</code> below. </p>
<p><code>#include &lt;cstdio&gt;</code><br />
<code>#include &lt;string&gt;</code><br />
<code>#include &lt;unistd.h&gt;</code></p>
<p><code>struct body { </code><br />
<code>  std::string my_name; </code><br />
<code>  body( const char *name ) : my_name(name) {} </code><br />
<code>  void operator()( continue_msg ) const { </code><br />
<code>    printf("%s\n", my_name.c_str()); </code><br />
<code>    sleep(1);</code><br />
<code>  } </code><br />
<code>}; </code> </p>
<p>In function <code>main</code>, the graph is set up once and then run three times. All of the nodes in this example pass around <code>continue_msg</code> objects. This type is defined in graph.h and is used to communicate that a node has completed its execution. </p>
<p><code>using namespace tbb; </code></p>
<p><code>int main() {</code><br />
<code>  graph g;</code><br />
<code>  broadcast_node&lt; continue_msg &gt; start;</code><br />
<code>  executable_node&lt; continue_msg &gt; a( g, body("A"));</code><br />
<code>  executable_node&lt; continue_msg &gt; b( g, body("B"));</code><br />
<code>  executable_node&lt; continue_msg &gt; c( g, body("C"));</code><br />
<code>  executable_node&lt; continue_msg &gt; d( g, body("D"));</code><br />
<code>  executable_node&lt; continue_msg &gt; e( g, body("E"));</code><br />
<code>  make_edge( start, a );</code><br />
<code>  make_edge( start, b );</code><br />
<code>  make_edge( a, c );</code><br />
<code>  make_edge( b, c );</code><br />
<code>  make_edge( c, d );</code><br />
<code>  make_edge( a, e );</code><br />
<code>  for (int i = 0; i &lt; 3; ++i ) {</code><br />
<code>    start.try_put( continue_msg() );</code><br />
<code>    g.wait_for_all();</code><br />
<code>  }</code><br />
<code>  return 0;</code><br />
<code>}</code> </p>
<p>The first line in function <code>main</code> instantiates a <code>graph</code> object <code>g</code>. On the next line, a <code>broadcast_node</code> named <code>start</code> is created. Anything passed to this node will be broadcast to all of its successors. The <code>start</code> node is used in the for loop at the bottom of main to launch the execution of the rest of the graph. </p>
<p>In the code above, five <code>executable_node</code> objects are created, named a through e. Each node is constructed with a reference to graph <code>g</code> and the function object to invoke when it runs. The successor / predecessor relationships are set up by the <code>make_edge</code> calls that follow the declaration of the nodes.</p>
<p>After the nodes and edges are set up, the <code>try_put</code> in each iteration of the for loop results in a broadcast of a <code>continue_msg</code> to both <code>a</code> and <code>b</code>. Both <code>a</code> and <code>b</code> are waiting for a single <code>continue_msg</code>, since they both have only a single predecessor, <code>start</code>. </p>
<p>When they receive the message from <code>start</code>, they execute their body objects. When complete, they each forward a <code>continue_msg</code> to their successors, and so on. The graph uses Intel® Threading Building Blocks tasks to execute the node bodies as well as forward messages between the nodes, allowing computation to execute concurrently when possible. </p>
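<p>The "wait for one message per predecessor" rule can be pictured with a small counter. The class below is a hypothetical, single-threaded model written only for illustration; it is not TBB code, and the real nodes must additionally be safe to use from several threads at once.</p>

```cpp
#include <cstddef>

// Hypothetical model of a node that runs its body only after it has received
// one message from each of its predecessors. Not thread-safe.
class dependency_counter {
    std::size_t expected_;  // number of predecessors
    std::size_t received_;  // messages seen in the current round
public:
    explicit dependency_counter(std::size_t predecessor_count)
        : expected_(predecessor_count), received_(0) {}

    // Records one incoming message. Returns true when the body should run,
    // and resets the count so the node is ready for the next round.
    bool receive() {
        if (++received_ < expected_) return false;
        received_ = 0;
        return true;
    }
};
```

<p>In this model, a node such as <code>a</code> would be built with a count of 1 and fire on every message, while <code>c</code>, which has both <code>a</code> and <code>b</code> as predecessors, would be built with a count of 2 and fire only on the second message of each round.</p>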
<p>If you run this example on a system with at least two threads, the output for each iteration of the loop will look something like: </p>
<p>A<br />
B<br />
(pause)<br />
E<br />
C<br />
(pause)<br />
D </p>
<p>The (pause) above is not shown in the output, but represents a noticeable (1 second) delay. </p>
<p>The critical path in this example is {start, B, C, D}, which has a length of 3 seconds (the path {start, A, C, D} is equally long). When run on a system with more than one hardware thread, the execution of the critical path can be overlapped with the execution of nodes A and E. The three executions of the graph will therefore take roughly 9 seconds in total, or 3 seconds per iteration. </p>
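<p>The 3-second figure can be checked by computing the longest path through the dependency graph. The sketch below uses only the standard library and assumes, as in the example, that every node except <code>start</code> takes 1 second and that enough threads are available to run independent nodes together. The names <code>example_edges</code>, <code>finish_time</code> and <code>critical_path_seconds</code> are hypothetical.</p>

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

typedef std::vector<std::pair<std::string, std::string> > edge_list;

// Edges of the example graph, written as (tail, head).
edge_list example_edges() {
    const char* raw[][2] = { {"start", "A"}, {"start", "B"}, {"A", "C"},
                             {"B", "C"},     {"C", "D"},     {"A", "E"} };
    edge_list edges;
    for (std::size_t i = 0; i < sizeof(raw) / sizeof(raw[0]); ++i)
        edges.push_back(std::make_pair(std::string(raw[i][0]), std::string(raw[i][1])));
    return edges;
}

// Earliest time (in seconds) at which `node` can finish: its own cost plus
// the latest finish time among its predecessors. Results are memoized.
int finish_time(const std::string& node, const edge_list& edges,
                std::map<std::string, int>& memo) {
    std::map<std::string, int>::iterator found = memo.find(node);
    if (found != memo.end()) return found->second;
    int latest_predecessor = 0;
    for (std::size_t i = 0; i < edges.size(); ++i)
        if (edges[i].second == node)
            latest_predecessor = std::max(latest_predecessor,
                                          finish_time(edges[i].first, edges, memo));
    int cost = (node == "start") ? 0 : 1;  // start only broadcasts
    memo[node] = latest_predecessor + cost;
    return memo[node];
}

// Length of the critical path: the latest finish time over all nodes.
int critical_path_seconds(const edge_list& edges) {
    std::map<std::string, int> memo;
    int longest = 0;
    for (std::size_t i = 0; i < edges.size(); ++i) {
        longest = std::max(longest, finish_time(edges[i].first, edges, memo));
        longest = std::max(longest, finish_time(edges[i].second, edges, memo));
    }
    return longest;
}
```

<p>Under these assumptions, multiplying the result of <code>critical_path_seconds(example_edges())</code> by the three iterations of the loop gives the estimate for the whole run.</p>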
<p>For additional information about the graph and its nodes, you can read Appendix D in the Intel® Threading Building Blocks Reference Manual. I will also be posting additional examples that highlight other key features of the graph over the coming weeks.</p>
]]></content:encoded>
			<wfw:commentRss>http://software.intel.com/en-us/blogs/2010/12/27/using-the-intel-threading-building-blocks-graph-community-preview-feature-creating-a-simple-dependency-graph/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Intel® Threading Building Blocks Version 3.0 Update 5 Introduces Graph as a Community Preview Feature.</title>
		<link>http://software.intel.com/en-us/blogs/2010/12/23/intel-threading-building-blocks-version-30-update-5-introduces-graph-as-a-community-preview-feature-2/</link>
		<comments>http://software.intel.com/en-us/blogs/2010/12/23/intel-threading-building-blocks-version-30-update-5-introduces-graph-as-a-community-preview-feature-2/#comments</comments>
		<pubDate>Thu, 23 Dec 2010 17:13:09 +0000</pubDate>
		<dc:creator>Michael Voss (Intel)</dc:creator>
				<category><![CDATA[Open Source]]></category>
		<category><![CDATA[Parallel Programming]]></category>
		<category><![CDATA[Community preview feature]]></category>
		<category><![CDATA[graph]]></category>
		<category><![CDATA[TBB]]></category>
		<category><![CDATA[Threading Building Blocks]]></category>

		<guid isPermaLink="false">http://software.intel.com/en-us/blogs/2010/12/23/intel-threading-building-blocks-version-30-update-5-introduces-graph-as-a-community-preview-feature-2/</guid>
		<description><![CDATA[There are some applications that, even on systems with shared memory, are best organized as computations that explicitly pass messages. These messages may contain data or simply act as signals that a computation has completed. The new class graph and its associated nodes can be used to express such applications. These classes are available as [...]]]></description>
			<content:encoded><![CDATA[<p>There are some applications that, even on systems with shared memory, are best organized as computations that explicitly pass messages. These messages may contain data or simply act as signals that a computation has completed. The new class <code>graph</code> and its associated nodes can be used to express such applications. These classes are available as a <a href="http://software.intel.com/en-us/articles/intel-tbb-community-preview-features/">Community Preview (CP) feature</a> in Intel® Threading Building Blocks Version 3.0 Update 5. You can download the open-source version of this release at <a href="http://www.threadingbuildingblocks.org/">www.threadingbuildingblocks.org</a> and are encouraged to provide feedback about its features and design via the <a href="http://software.intel.com/en-us/forums/intel-threading-building-blocks/">forum</a>.</p>
<p><em>There are three types of components used to implement a graph: a graph object, nodes and edges.</em></p>
<p>A user first creates a <code>graph </code>object. This object acts as the owner of all tasks created on behalf of the graph and its nodes. In many ways it's like a <code>task_group</code>, and users can call <code>wait_for_all </code>on the graph object if they need to wait for all of the tasks related to the graph to complete.</p>
<p>Next, users create the nodes and edges. Nodes process or buffer messages as they pass through them. There are some node types that invoke user-provided function objects, while others direct, buffer or combine messages in a pre-specified way. The user creates nodes and then links them together by calling the functions <code>make_edge </code>or <code>make_edges</code>.</p>
<p>In future posts over the coming weeks, I will provide several example applications that highlight key features of the graph and its node types. Detailed descriptions of class <code>graph </code>and the different node types are provided in Appendix D of the Intel® Threading Building Blocks Reference Manual.</p>
<p><em>As with all Community Preview features, the graph must be explicitly enabled by setting its macro before including its header:</em></p>
<p><code>#define TBB_PREVIEW_GRAPH 1</code><br />
<code>#include "tbb/graph.h"</code></p>
<p>Below are brief descriptions of the node types currently provided in graph.h. In future posts, I will use these nodes to construct several example applications.</p>
<p><em>The first three node types invoke user-provided function objects and are used to wrap user computations:</em></p>
<p><code>template &lt; typename OType &gt; class source_node;</code><br />
A <code>source_node </code>has no predecessors in the graph. It generates messages of a generic type <code>OType </code>by calling a user-provided body object. The generated message is broadcast to all of its successors.</p>
<p><code>template &lt; typename IType, typename OType &gt; class function_node;</code><br />
A <code>function_node </code>receives messages of a generic type <code>IType</code>, applies a body object to each message, and broadcasts its output of type <code>OType </code>to its successors in the graph.</p>
<p><code>template &lt; typename OType &gt; class executable_node;</code><br />
An <code>executable_node </code>receives messages of type <code>continue_msg</code>. When it has received a <code>continue_msg </code>from each of its predecessors, it invokes its body object to generate a message of type <code>OType </code>that it passes to its successors in the graph.</p>
<p><em>The remaining 10 node types all perform pre-defined operations that buffer, direct or combine messages:</em></p>
<p><code>class continue_node;</code><br />
A <code>continue_node </code>receives <code>continue_msg </code>messages. When it receives a message from each of its predecessors, it forwards a single <code>continue_msg </code>to all of its successors.</p>
<p><code>template &lt; typename T &gt; class broadcast_node;</code><br />
A node that broadcasts each incoming message to all of its successors.</p>
<p><code>template &lt; typename T &gt; class write_once_node;</code><br />
A <code>write_once_node </code>buffers a single item that cannot be overwritten. The value may be cleared explicitly, after which a new value may be set.</p>
<p><code>template &lt; typename T &gt; class overwrite_node;</code><br />
An <code>overwrite_node </code>buffers a single item that can be overwritten.</p>
<p><code>template &lt; typename T &gt; class buffer_node;</code><br />
A <code>buffer_node</code> is an unbounded buffer of messages of type <code>T</code>. Messages are forwarded in arbitrary order.</p>
<p><code>template &lt; typename T &gt; class queue_node;</code><br />
A <code>queue_node</code> is an unbounded buffer of messages of type <code>T</code>. Messages are forwarded in first-in first-out order.</p>
<p><code>template &lt; typename T &gt; class priority_queue_node;</code><br />
A <code>priority_queue_node </code>is an unbounded buffer of messages of type <code>T</code>. Messages are forwarded in priority order.</p>
<p><code>template &lt; typename T &gt; class sequencer_node;</code><br />
A <code>sequencer_node </code>is an unbounded buffer of messages of type <code>T</code>. Messages are forwarded in sequence order.</p>
<p><code>template &lt; typename T &gt; class limiter_node;</code><br />
A <code>limiter_node </code>limits the number of messages that may pass through it. It contains an embedded <code>continue_node </code>that may be used to decrement the count to allow additional items through.</p>
<p><code>template &lt; typename T1, typename T2, … &gt; class join_node;</code><br />
A <code>join_node </code>creates a <code>std::tuple&lt;T1,T2,…&gt;</code> from a set of messages received at its inputs.</p>
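<p>To make the difference between first-in first-out and priority forwarding concrete, here is a standard-library analogy. It does not use the graph classes at all, and it assumes "priority order" means largest value first, which is how <code>std::priority_queue</code> behaves by default; the post above does not say which ordering <code>priority_queue_node</code> uses. The function names are hypothetical.</p>

```cpp
#include <cstddef>
#include <queue>
#include <vector>

// Order in which a first-in first-out buffer would hand out the arrivals.
std::vector<int> fifo_order(const std::vector<int>& arrivals) {
    std::queue<int> buffer;
    for (std::size_t i = 0; i < arrivals.size(); ++i) buffer.push(arrivals[i]);
    std::vector<int> forwarded;
    while (!buffer.empty()) {
        forwarded.push_back(buffer.front());
        buffer.pop();
    }
    return forwarded;
}

// Order in which a priority buffer would hand out the same arrivals,
// assuming the largest value has the highest priority.
std::vector<int> priority_order(const std::vector<int>& arrivals) {
    std::priority_queue<int> buffer;
    for (std::size_t i = 0; i < arrivals.size(); ++i) buffer.push(arrivals[i]);
    std::vector<int> forwarded;
    while (!buffer.empty()) {
        forwarded.push_back(buffer.top());
        buffer.pop();
    }
    return forwarded;
}
```

<p>For arrivals 3, 1, 2, the first function keeps the arrival order, while the second hands out 3, then 2, then 1.</p>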
]]></content:encoded>
			<wfw:commentRss>http://software.intel.com/en-us/blogs/2010/12/23/intel-threading-building-blocks-version-30-update-5-introduces-graph-as-a-community-preview-feature-2/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

