Choose between static scheduling and dynamic scheduling for the assignment of work to particular threads in a data-decomposition problem on a dual-processor system. The static scheduling model calls for work to be divided among threads at the beginning of processing. Each thread works on its allotted data until it is finished. The first thread to finish waits idle while the other thread(s) finish. Dynamic scheduling consists of scheduling work one piece at a time; as each thread finishes its assigned work, it is assigned another portion, until all pieces of work have been allocated and processed.
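The two models can be sketched in a few lines. The following is a minimal illustration, not a reference implementation: the names `run_static`, `run_dynamic`, and `process` are hypothetical, and Python threads are used only to show the assignment policy, not to measure performance. Each function returns a list recording which thread handled each piece of work.

```python
import queue
import threading


def run_static(items, num_threads, process):
    """Static scheduling: divide the work into contiguous chunks up front."""
    owners = [None] * len(items)
    chunk = (len(items) + num_threads - 1) // num_threads  # ceiling division

    def worker(tid):
        start = tid * chunk
        # Each thread works only on its own chunk, then stops (and idles).
        for i in range(start, min(start + chunk, len(items))):
            process(items[i])
            owners[i] = tid

    threads = [threading.Thread(target=worker, args=(t,))
               for t in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return owners


def run_dynamic(items, num_threads, process):
    """Dynamic scheduling: a thread takes the next unclaimed piece when free."""
    owners = [None] * len(items)
    pending = queue.Queue()
    for i in range(len(items)):
        pending.put(i)

    def worker(tid):
        while True:
            try:
                i = pending.get_nowait()
            except queue.Empty:
                return  # all pieces have been allocated
            process(items[i])
            owners[i] = tid

    threads = [threading.Thread(target=worker, args=(t,))
               for t in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return owners


if __name__ == "__main__":
    slices = list(range(8))
    print(run_static(slices, 2, lambda s: None))   # [0, 0, 0, 0, 1, 1, 1, 1]
    print(run_dynamic(slices, 2, lambda s: None))  # varies from run to run
```

With static scheduling the mapping from work item to thread is fixed and repeatable; with dynamic scheduling it depends on which thread happens to be free, so it can change from run to run.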
In the case of a video-processing application, for example, the picture might be subdivided into slices of macroblocks, each of which consists of blocks of pixels, and separate threads might each be assigned to a set of slices:
As addressed in a separate item, Processing Loads on Systems with Hyper-Threading Technology, dynamic scheduling is more efficient on systems that support Hyper-Threading Technology, because a balanced workload is important for taking advantage of Hyper-Threading Technology.
Use static scheduling for data-decomposition problems on dual-processor systems. This general rule should be verified empirically for each application, but the video-processing example given above shows the reasoning: it is faster to decode the picture when the co-located parts of the pictures are still in the cache. Although dynamic scheduling balances the load better, it does not guarantee that co-located parts of the pictures are decoded by the same processor. Thus, static scheduling provides better performance in this case than dynamic scheduling, as shown in the following chart:
This performance differential arises because processing incurs more bus transactions with dynamic scheduling than with static scheduling, as shown in the following table:
Front-side bus data activity is higher on a dual-processor system than on a processor with Hyper-Threading Technology because the logical processors in a processor with Hyper-Threading Technology share the second-level cache, whereas the physical processors in a dual-processor system do not. As a result, dynamic scheduling is slower overall than static scheduling on dual-processor systems.
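The cache argument above can be illustrated with a toy model. This sketch makes several assumptions: the per-slice costs are hypothetical, the helper names are invented for illustration, and nothing here measures a real cache or bus. It only counts how many slices change processor between two consecutive pictures, which is the situation in which co-located data sitting in one processor's cache cannot be reused by the other.

```python
import heapq


def static_owners(num_slices, num_workers):
    """Contiguous split: slice i always belongs to the same worker."""
    chunk = (num_slices + num_workers - 1) // num_workers  # ceiling division
    return [i // chunk for i in range(num_slices)]


def dynamic_owners(costs, num_workers):
    """Hand each slice, in order, to whichever worker becomes free first."""
    free_at = [(0, w) for w in range(num_workers)]  # (time free, worker id)
    heapq.heapify(free_at)
    owners = []
    for cost in costs:
        t, w = heapq.heappop(free_at)   # earliest-free worker takes the slice
        owners.append(w)
        heapq.heappush(free_at, (t + cost, w))
    return owners


def moved(prev, curr):
    """Count slices whose owner differs between two consecutive pictures."""
    return sum(1 for a, b in zip(prev, curr) if a != b)


if __name__ == "__main__":
    # Hypothetical decode costs per slice for two consecutive pictures.
    picture1 = [1, 1, 1, 1, 1, 1, 1, 1]
    picture2 = [2, 1, 1, 1, 1, 1, 1, 1]

    s1 = static_owners(len(picture1), 2)
    s2 = static_owners(len(picture2), 2)
    print("static :", moved(s1, s2), "slices changed processor")   # 0

    d1 = dynamic_owners(picture1, 2)
    d2 = dynamic_owners(picture2, 2)
    print("dynamic:", moved(d1, d2), "slices changed processor")   # 6
```

In this model a single slower slice is enough to shift most of the dynamic assignment, while the static assignment never changes. On processors with separate second-level caches, every slice that moves is a candidate for extra bus traffic.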