Developer Guide

Clustering the Datapath

Dynamically scheduling every operation adds area overhead because each operation needs its own handshaking control logic. To reduce this overhead, the compiler groups fixed-latency operations into clusters. A cluster of fixed-latency operations, such as arithmetic operations, needs fewer handshaking interfaces than the same operations scheduled individually, which reduces the area overhead.
Clustered Logic
If A, B, and C in Figure 1 do not contain variable-latency operations, the compiler can cluster them together, as the figure illustrates. Clustering the logic reduces area because the logic within the cluster no longer needs signals to stall data flow or the other handshaking logic.
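The grouping described above can be sketched in a few lines of Python. This is a minimal illustration, not the compiler's actual algorithm: the `load` and `store` operations, the `cluster` helper, and the rule of one handshaking interface per scheduled unit are all assumptions made for the example.

```python
from itertools import groupby

# Hypothetical datapath: (operation name, has fixed latency?).
# "load" and "store" are assumed variable-latency operations for illustration.
datapath = [
    ("load", False),
    ("A", True),
    ("B", True),
    ("C", True),
    ("store", False),
]

def cluster(ops):
    """Group runs of consecutive fixed-latency operations into one unit each."""
    units = []
    for fixed, group in groupby(ops, key=lambda op: op[1]):
        names = [name for name, _ in group]
        if fixed:
            units.append(names)               # the whole run becomes one cluster
        else:
            units.extend([n] for n in names)  # each stays individually scheduled
    return units

unclustered_units = [[name] for name, _ in datapath]
clustered_units = cluster(datapath)

# Simplifying assumption: one handshaking interface per scheduled unit.
print(len(unclustered_units), "interfaces without clustering")
print(len(clustered_units), "interfaces with clustering:", clustered_units)
```

Under these assumptions, A, B, and C collapse into a single unit, so the datapath needs three handshaking interfaces instead of five.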

Cluster Types

The Intel® oneAPI DPC++/C++ Compiler can create the following types of clusters:
  • Stall-Enable Cluster (SEC): This cluster type passes the handshaking logic to every pipeline stage in the cluster in parallel. This means that if the cluster is stalled by logic from further down in the datapath, all logic in the SEC stalls at the same time.
    Stall-Enable Cluster
  • Stall-Free Cluster (SFC): This cluster type adds a first in, first out (FIFO) buffer to the end of the cluster that can accommodate at least the entire latency of the pipeline in the cluster. This FIFO is often called a capacity FIFO because it can accommodate the capacity of the cluster. The capacity of a cluster is the minimum number of valid data pieces that the cluster can operate on simultaneously. Capacity is always less than or equal to the latency of the datapath.
    Because of this FIFO, the pipeline stages in the cluster do not require any handshaking logic. The stages can run freely and drain into the capacity FIFO, even if the cluster is stalled by logic further down in the datapath.
Stall-Free Cluster
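The difference between the two cluster types under a downstream stall can be sketched with a small cycle-level model. This is a simplified behavioral illustration, not the hardware the compiler generates: the function names `run_sec` and `run_sfc`, the use of `None` for a bubble, and the entry rule that an SFC accepts new data only while its in-flight data still fits in the capacity FIFO are assumptions made for the example.

```python
from collections import deque

def run_sec(inputs, stall, n_stages):
    """Stall-enable cluster: one shared stall signal freezes every stage."""
    stages = [None] * n_stages          # None models a bubble
    pending = deque(inputs)
    out = []
    for stalled in stall:
        if stalled:
            continue                    # every stage holds its data this cycle
        if stages[-1] is not None:
            out.append(stages[-1])      # last stage hands data downstream
        nxt = pending.popleft() if pending else None
        stages = [nxt] + stages[:-1]    # all stages advance together
    return out, stages

def run_sfc(inputs, stall, n_stages, fifo_depth):
    """Stall-free cluster: stages always advance and drain into a FIFO."""
    stages = [None] * n_stages
    fifo = deque()
    pending = deque(inputs)
    out = []
    for stalled in stall:
        if not stalled and fifo:
            out.append(fifo.popleft())  # downstream reads from the FIFO
        if stages[-1] is not None:
            fifo.append(stages[-1])     # stages drain even during a stall
        # Assumed entry rule: accept new data only if everything in flight
        # still fits in the capacity FIFO.
        in_flight = sum(s is not None for s in stages[:-1]) + len(fifo)
        accept = bool(pending) and in_flight < fifo_depth
        nxt = pending.popleft() if accept else None
        stages = [nxt] + stages[:-1]
    return out, stages, list(fifo)

# Two free cycles followed by three stalled cycles.
stall = [False, False, True, True, True]
print(run_sec([1, 2, 3, 4], stall, n_stages=3))
print(run_sfc([1, 2, 3, 4], stall, n_stages=3, fifo_depth=3))
```

In this model the SEC stages keep holding their data during the stall, while the SFC stages continue to advance and the data accumulates in the FIFO. Both variants deliver the data in order once the stall is released.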

Cluster Characteristics

Different cluster architectures result in different characteristics for each cluster type:
  • Bubble Handling: SECs remove only leading bubbles in the pipeline, and only under limited circumstances. A leading bubble is a bubble that arrives before the first piece of valid data arrives in the cluster. SECs do not remove any bubbles that arrive afterwards.
    SFCs can use the capacity FIFO to remove all bubbles from the pipeline.
  • Capacity: For SECs, capacity varies. The best-case capacity equals the number of register stages, and the worst-case capacity is 1, because all registers in an SEC share the same stall signal. A capacity of 1 means that the cluster may be able to hold only a single piece of data, regardless of the number of pipeline stages in the cluster.
    This worst-case scenario corresponds to a single piece of valid data at the end of the cluster pipeline and bubbles in the rest of the pipeline.
    SFCs have a capacity equal to the depth of the capacity FIFO.
  • Handshaking: The capacity FIFO inside SFCs allows them to take advantage of hyper-optimized handshaking between clusters. For more information, refer to Hyper Optimized Handshaking.
    SECs do not support this capability.
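The capacity and bubble-handling characteristics above can be illustrated with a short behavioral sketch. It is a simplified model under stated assumptions, not a description of the generated hardware: the helper names are hypothetical, `None` stands for a bubble, the SEC model ignores the limited leading-bubble removal, and the SFC model assumes the capacity FIFO is deep enough for all the data shown.

```python
from collections import deque

def sec_held_when_stalled(stream, n_stages):
    """Shift `stream` into an SEC until valid data reaches the last stage,
    then stall. Returns how many valid items the frozen pipeline holds."""
    stages = [None] * n_stages
    for item in stream:
        if stages[-1] is not None:
            break                       # shared stall freezes all stages here
        stages = [item] + stages[:-1]
    return sum(s is not None for s in stages)

def sfc_fifo_contents(stream, n_stages):
    """Run `stream` through free-running SFC stages while downstream is
    stalled. Only valid data enters the FIFO, so bubbles disappear."""
    stages = [None] * n_stages
    fifo = deque()
    for item in list(stream) + [None] * n_stages:   # extra cycles flush stages
        if stages[-1] is not None:
            fifo.append(stages[-1])
        stages = [item] + stages[:-1]
    return list(fifo)

# Worst case for an SEC: one valid item followed by bubbles.
print(sec_held_when_stalled([7, None, None, None], n_stages=4))
# Best case for an SEC: every stage holds valid data.
print(sec_held_when_stalled([1, 2, 3, 4], n_stages=4))
# An SFC packs valid data densely in its capacity FIFO.
print(sfc_fifo_contents([1, None, 2, None, None, 3], n_stages=3))
```

In the first call, the four-stage SEC holds a single valid item when it stalls, which matches the worst-case capacity of 1. In the last call, the bubbles between valid items never reach the FIFO, so the downstream logic sees a dense stream.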
