Producer-consumer problem and max subarray problem

First of all, I know it might seems to be a little late for that, but I think that this article might help a few people. And for those who don't have conclusive results, I think that only a few hours are enough to implement this solution (If you already have a sequential algorithm of course). During my curriculum at the INSA, I studied a little parallelism and especially a very famous problem in parallelization and synchonization: the producer-consumer problem.

The idea of the problem is quiet simple ; here is an extract of the wikipedia article on the problem:

"The problem describes two processes, the producer and the consumer, who share a common, fixed-size buffer used as a queue. The producer's job is to generate a piece of data, put it into the buffer and start again. At the same time the consumer is consuming the data (i.e., removing it from the buffer) one piece at a time. The problem is to make sure that the producer won't try to add data into the buffer if it's full and that the consumer won't try to remove data from an empty buffer."

Below  is a simple illustration of the problem. We have a producer that creates "tasks" and store them in a buffer and we have un consumer that simply execute the tasks in the queue. As soon as the consumer finishes a task, he picks a new one in the queue.

Illustration of the producer-consumer problem

This example can be generalized with several consumers (that we can consider as workers). The problem then consist in synchronizing this whole process to avoid conflicts of any kind.

This kind of multi-process synchronization problem seems to suit perfectly the Maximum SubArray problem. Indeed, if we search for the maximum sum by iterating on the starting line of the array (i0) and then search for the best array for this line, the problem can easily be divided in sub-tasks (created by the producer) that the consumers (the threads) will have to solve separatly. The producer will simply create a task for each line (or groups of several lines), and then consumers (the threads) will simply solve the problem for each line, and start a new task in the queue as soon as it finishes. We then only need to share the value of the maximum sum and the associated indexes to check that the local solution is better than the global one.

Here is a "generic" solution to solve this multi-process synchronization problem:

For the producer:
    create a task T
    lock the queue Q
        add T at the end of Q
        send a signal through Q to the consumers
    unlock Q

For the consumers:
    lock Q
    while there are consumers
        wait for signal from Q
        for each element E of Q
            do task E
        end for
        erase all the elements from Q
    end while
    unlock Q

Now that you have the concept and a generic solution, it's up to you to make a proper implementation of this problem.

Hint: use pthreads and semaphores.
http://en.wikipedia.org/wiki/Producer-consumer_problem#Using_semaphores
分类:
如需更全面地了解编译器优化,请参阅优化注意事项