Developer Guide


Task Parallelism

The compiler achieves concurrency by scheduling independent operations to execute simultaneously, but it does not extract concurrency at coarser granularities (for example, across loops).
To execute larger code structures in parallel with each other, you must write them as separate kernels that launch simultaneously. These kernels then run asynchronously with respect to each other, and you can synchronize them and pass data between them using pipes, as illustrated in the following figure:
[Figure: Multiple Kernels Running Asynchronously]
This is similar to how a program running on a CPU can leverage threads running on separate cores to achieve simultaneous asynchronous behavior.
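The CPU analogy can be made concrete with a minimal sketch in standard C++. This is not FPGA kernel code and does not use any pipe API from the toolchain. It assumes two threads stand in for two kernels, and a hypothetical `BlockingFifo` class (a bounded queue with blocking reads and writes) stands in for a pipe. The names `BlockingFifo` and `run_two_stage_pipeline` are illustrative only.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for a pipe: a bounded FIFO whose write blocks when
// full and whose read blocks when empty. Not part of any FPGA toolchain API.
template <typename T>
class BlockingFifo {
public:
    explicit BlockingFifo(std::size_t capacity) : capacity_(capacity) {
        assert(capacity > 0);  // a zero-capacity FIFO could never accept data
    }

    // Blocks the calling thread while the FIFO is full.
    void write(T value) {
        std::unique_lock<std::mutex> lock(mutex_);
        not_full_.wait(lock, [this] { return queue_.size() < capacity_; });
        queue_.push(std::move(value));
        not_empty_.notify_one();
    }

    // Blocks the calling thread while the FIFO is empty.
    T read() {
        std::unique_lock<std::mutex> lock(mutex_);
        not_empty_.wait(lock, [this] { return !queue_.empty(); });
        T value = std::move(queue_.front());
        queue_.pop();
        not_full_.notify_one();
        return value;
    }

private:
    std::size_t capacity_;
    std::queue<T> queue_;
    std::mutex mutex_;
    std::condition_variable not_full_;
    std::condition_variable not_empty_;
};

// Runs two stages concurrently. The stages share no state other than the
// FIFO, so the blocking reads and writes are their only synchronization.
std::vector<int> run_two_stage_pipeline(const std::vector<int>& input) {
    BlockingFifo<int> fifo(4);  // capacity chosen arbitrarily for the sketch
    std::vector<int> output(input.size());

    // Stage 1 (the "producer kernel"): doubles each element and writes it.
    std::thread producer([&] {
        for (int x : input) {
            fifo.write(2 * x);
        }
    });

    // Stage 2 (the "consumer kernel"): reads each value and adds one.
    std::thread consumer([&] {
        for (std::size_t i = 0; i < output.size(); ++i) {
            output[i] = fifo.read() + 1;
        }
    });

    producer.join();
    consumer.join();
    return output;
}
```

Calling `run_two_stage_pipeline({1, 2, 3})` from a `main` function returns a vector holding 3, 5, and 7. Neither stage waits for the other to finish. The consumer starts processing as soon as the first value arrives, which is the behavior that separate kernels connected by pipes are meant to provide.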

