Digital Logic Simulation with the Intel® TBB Flow Graph, Part 1: Using the or_node

In this multi-part blog, I’m going to show you how to put together a simple logic simulation program using the Intel® Threading Building Blocks flow graph feature. Please note that this example does NOT demonstrate a practical approach to digital logic simulation. The purpose of the example is to demonstrate the use of several types of flow graph nodes and how they can be composed to make more interesting components. I’ll start by designing basic logic gates that are composed of flow graph nodes.

Consider an AND gate. In its simplest form, it takes two inputs, and produces a single output. The first thing that comes to mind to represent this is the flow graph function_node: it could take a pair as input, and a body that computes the logical AND operation on the items in the pair, and puts out the result as its output. That might work, but let’s think a little more about how such a gate might receive its two input signals: a function_node takes a single argument, so I’d have to group the two inputs together. However, both inputs will be coming from different senders, and may not be available at the same time. Should I preface the function_node with a join_node? Possibly, but there’s still a limitation with a join_node: it gathers together the inputs and when it has received the full complement, it then sends them along as a tuple. But this still isn’t exactly the behavior I want. What I really want is when either of the inputs becomes available, the function_node should be told about it, because it will need to change its output value when any of its input values change.

Thus, the first decision about gates is this: Gates are responsive: when any input changes, the gate will check if its output needs to change. To simplify this a little, and make our flow graph have to do a little less work, I’ll make this second decision: Gates are lazy; a gate will send data to its output port only when that data differs from the previous value sent to that output port. This will certainly reduce the number of tasks doing redundant work in the graph.

So, on the input side, something reports changes on any input port, and on the output side, something produces output, or not, depending on if the output value has changed. Neither of these behaviors corresponds exactly to a function_node. However, the new feature multifunction_node (formerly the Community Preview feature (CPF) multioutput_function_node) can certainly meet the output needs: it can optionally produce an output. For the input, if the title of this blog hasn’t given it away already, my choice is the or_node. The or_node will pass along any input it receives on any input port at any time, giving exactly the responsiveness I need. The or_node is currently a CPF in Intel® TBB. Figure 1 illustrates this basic logic gate design. Note that the or_node takes a variable number of inputs – no need to limit it to two – and the multifunction_node takes a body that produces either no output or one output. In general, the multifunction_node can produce zero or more outputs of varying types, but for the gate implementation, zero or one output will suffice. Let’s take a look at the actual code for this, the gate template class.

First, I set up a type signal_t to represent the signal data being transferred. Since I’m allowing the gates to fire only when the output state changes, it helps to have an additional undefined state for initialization.

typedef enum { low=0, high, undefined } signal_t;

Next, I define a few potential input configurations to gates. I could go all out and add eight_input gates, but I couldn’t dredge up a use for them from the dark and rarely-visited corner of my brain where I keep the knowledge leftover from a digital logic course so many years ago.

typedef tuple < signal_t > one_input;
typedef tuple < signal_t, signal_t > two_input;
typedef tuple < signal_t, signal_t, signal_t > three_input;
typedef tuple < signal_t, signal_t, signal_t, signal_t > four_input;

Now I’m ready to set up the gate template.

template < typename GateInput >
class gate {
protected:
typedef or_node < GateInput > input_port_t;
typedef multifunction_node < typename input_port_t::output_type, tuple < signal_t > > gate_fn_t;
typedef typename gate_fn_t::output_ports_type ports_type;
public:
static const int N = std::tuple_size < GateInput > ::value;

template < typename Body >
gate(graph& g, Body b) : my_graph(g), in_ports(g), gate_fn(g, 1, b) {
make_edge(in_ports, gate_fn);
}
virtual ~gate() {}
virtual gate& operator=(const gate& src) { return *this; }
sender < signal_t > & get_out() { return output_port < 0 > (gate_fn); }
receiver < signal_t > & get_in(size_t port) {
return gate_helper < N > ::get_inport(in_ports, (int)port);
}
protected:
graph& my_graph;
private:
input_port_t in_ports;
gate_fn_t gate_fn;
};

The class is templated by the input configuration, GateInput, so for example, I would pass in two_input if I wanted to make a gate with two inputs. Then I define two types. First, input_port_t, which is the type of the or_node that that I’ll pass the input configuration to, as specified by GateInput. Second is gate_fn_t, which is the multifunction_node that takes the output from the or_node, performs the function of the gate, and outputs a single signal_t (or nothing). These types are used to declare the actual graph nodes in_ports and gate_fn, in the private section of the class above.

The gate constructor initializes the two graph nodes, making them belong to a graph g that is passed in as a reference parameter. Additionally, the constructor takes a function object b that performs the actual logical operation on the inputs to the gate, and determines what the new output will be. So in the case of an AND gate, I would pass in a function object that computes a logical AND operation. The constructor also completes this small component by connecting the two graph nodes with the make_edge function.

In order to connect this gate to other components, I’ve provided methods to access the input ports and the output port. get_in takes a port number and returns a reference to an input port capable of receiving data, i.e. a receiver& in the flow graph jargon. It uses the gate_helper::get_inport function shown below to extract the input port to the or_node. get_out returns a reference to the output port of the multifunction_node which is capable of sending data, i.e. a sender&.

template < int N >
struct gate_helper {
template < typename TupleType >
static inline receiver < signal_t > & get_inport(or_node < TupleType > & in_ports, int port) {
if (N-1 == port) return input_port < N-1 > (in_ports);
else return gate_helper < N-1 > ::get_inport(in_ports, port);
}
};
template < >
struct gate_helper < 1 > {
template < typename TupleType >
static inline receiver < signal_t > & get_inport(or_node < TupleType > & in_ports, int port) {
return input_port < 0 > (in_ports);
}
};

Now that I have a building block for creating a wide variety of logic gates, I’ll use it for designing an AND gate. When creating the derived class and_gate, the main purpose is to define the functor that gets passed to the gate_fn object inside the gate base class. and_body computes a logical AND operation over all the inputs to the gate, including undefined inputs, so the function is not completely trivial.

template < typename GateInput >
class and_gate : public gate < GateInput > {
using gate < GateInput > ::N;
using gate < GateInput > ::my_graph;
typedef typename gate < GateInput > ::ports_type ports_type;
typedef typename gate < GateInput > ::input_port_t input_port_t;
class and_body {
signal_t ports[N];
signal_t state;
bool touched;
public:
and_body() : state(undefined), touched(false)
for (int i=0; i < N; ++i) ports[i] = undefined;
}
void operator()(const typename input_port_t::output_type& v, ports_type& p) {
ports[v.indx] = or_output_helper < N > ::get_or_output(v);
signal_t new_state=high;
size_t i=0;
while (i < N) {
if (ports[i] == low)
new_state = low; break;
else if (ports[i] == undefined && new_state != low)
new_state = undefined;
++i;
}
if (!touched || state != new_state) {
state = new_state;
std::get < 0 > (p).try_put(state);
touched = true;
}
}
};
public:
and_gate(graph& g) : gate < GateInput > (g, and_body()) {}
and_gate(const and_gate < GateInput > & src) : gate < GateInput > (src.my_graph, and_body()) {}
~and_gate() {}
};

The and_body keeps track of the states of the gate’s input ports and output port. These are all initially undefined. The operator() for and_body receives the or_node output in parameter v, which indicates that data was received on one of the input ports. The input port that received data is specified in v.indx. Accessing the data from that port is a little more challenging, as the entire input tuple is passed in v.result. I wrote a helper function or_output_helper::get_or_output to select the v.indx-th port of the tuple v.result. This value is used to update the locally stored state of the appropriate port, and then the new output state is calculated. The new state is checked to see if it differs from the old state, and if so, the new state is sent out on the appropriate output port of the multifunction_node (which in this case, since there is only one output, is always port zero). Note also that the very first time a gate receives data, i.e. when touched is false, the new state is sent out even if it is not different from the initial state. This is useful when the gate is a part of a larger circuit. It allows any initial settings on input ports to propagate through the graph and register at any possible output devices that might exist.

The helper function that extracts the or_node output is as follows:

template < int N >
struct or_output_helper {
template < typename OrOutputType >
static inline signal_t get_or_output(const OrOutputType& out) {
if (N-1 == out.indx) return std::get < N-1 > (out.result);
else return or_output_helper < N-1 > ::get_or_output(out);
}
};
template < >
struct or_output_helper < 1 > {
template < typename OrOutputType >
static inline signal_t get_or_output(const OrOutputType& out) {
return std::get < 0 > (out.result);
}
};

Given an AND gate, it’s easy to see how to make OR gates and any other sort of basic logic gate from the base class gate.

As the or_node is currently a Community Preview feature, it’s a good time to have a look at it and give us your feedback.

In Part 2 of this blog, I’ll show you how to put together a variety of basic logic gates to make a four-bit adder.

For more complete information about compiler optimizations, see our Optimization Notice.