Lossless JPEG & multithreading

I'm evaluating IPP for lossless JPEG encoding. Using CJPEGEncoder, I call SetSource, SetDestination, SetParams, and finally WriteHeader and WriteData.
Am I correct in assuming that I will need to add the multithreading myself? I would prefer that multiple cores could work on one image. Can anyone tell me what is needed?


Hello,

We investigated the opportunity to add multithreading to JPEG lossless mode when we implemented lossless JPEG support in the IPP codec. The result of that investigation was a decision not to apply the threading model we use for JPEG baseline mode to lossless mode.

The baseline threading model is based on parallel processing of MCU rows, with the work divided into independent stages. For baseline decoding there are three basic stages: Huffman decoding; inverse quantization and IDCT; and upsampling plus color conversion. Consider the two-thread case: we allocate buffers for two MCU rows, then the first thread Huffman-decodes the first MCU row in the image and, when finished, starts decoding the next row. At that moment the second thread can start dequantization and IDCT, plus upsampling and color conversion, of the first MCU row. When the second thread finishes an MCU row, it checks whether it can start processing the next one; once it does, the first thread can begin Huffman decoding of the subsequent row, and so on. This approach provides good workload balance between two threads, though the balance degrades as the number of threads increases.
The problem with lossless mode is that most of the computation in JPEG lossless mode is spent in the Huffman entropy coding functions (about 90% of total time for lossless encoding/decoding). That leaves no chance of acceptable workload balance, even for two threads.
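The staged MCU-row hand-off described above can be sketched generically with two threads and a small queue. This is an illustrative sketch with placeholder stage work, not IPP code; the names (`run_pipeline`, `ready`) are invented for the example:

```cpp
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Simulated two-stage MCU-row pipeline: stage 1 "entropy-decodes" a row
// and hands it off; stage 2 "dequantizes/IDCTs/color-converts" it.
std::vector<int> run_pipeline(int num_rows) {
    std::queue<int> ready;          // rows finished by stage 1
    std::mutex m;
    std::condition_variable cv;
    bool stage1_done = false;
    std::vector<int> output;        // rows in the order stage 2 finished them

    std::thread stage1([&] {
        for (int row = 0; row < num_rows; ++row) {
            // ... Huffman decoding of this MCU row would happen here ...
            {
                std::lock_guard<std::mutex> lk(m);
                ready.push(row);
            }
            cv.notify_one();        // wake stage 2: a row is ready
        }
        { std::lock_guard<std::mutex> lk(m); stage1_done = true; }
        cv.notify_one();
    });

    std::thread stage2([&] {
        for (;;) {
            int row;
            {
                std::unique_lock<std::mutex> lk(m);
                cv.wait(lk, [&] { return !ready.empty() || stage1_done; });
                if (ready.empty()) return;  // stage 1 finished, queue drained
                row = ready.front();
                ready.pop();
            }
            // ... dequantization, IDCT, upsampling, color conversion ...
            output.push_back(row);
        }
    });

    stage1.join();
    stage2.join();
    return output;
}
```

The two threads overlap in time: while stage 2 works on row N, stage 1 is already decoding row N+1, which is exactly the balance property Vladimir describes.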
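To see why a 90% entropy-coding share defeats stage pipelining: a pipelined decoder's throughput is bounded by its slowest stage. A small generic sketch of that bound (not IPP code):

```cpp
#include <algorithm>

// Throughput of a two-stage pipeline is limited by the slowest stage.
// If entropy coding takes fraction f of the per-row work and everything
// else takes 1 - f, the best two-thread speedup is 1 / max(f, 1 - f).
double pipeline_speedup(double entropy_fraction) {
    return 1.0 / std::max(entropy_fraction, 1.0 - entropy_fraction);
}
```

With f = 0.5 the two stages overlap perfectly (2x speedup); with f = 0.9, as in lossless mode, the ceiling is only about 1.11x, which is why the baseline model was not worth carrying over.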

By the way, we are currently working on implementing a different threading model, based on processing JPEG restart intervals in parallel. Restart intervals can be decoded independently, which should give good threading scalability across multiple cores. The drawback of this approach is that not every JPEG file is encoded with restart intervals.

Regards,
Vladimir

Quoting - Vladimir Dudnik (Intel)

Hi Vladimir,
Thanks for your reply. We will probably stick with our current 3rd-party lossless JPEG encoder, but there are other parts of the IPP that we can benefit from. (Scaling, cropping, rotation, color space conversion).

Regards,
Magnus

Am I correct in assuming that performance is the main reason you are not moving to IPP lossless JPEG?
Could you share some benchmark data from your current solution?

Vladimir

Quoting - Vladimir Dudnik (Intel)

In my case, I cut the image into two pieces and run two threads to compress them. I get almost a 200% performance gain; of course, you end up with a little JPEG file header overhead.
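That split-and-compress approach can be sketched generically: divide the image rows in half and encode each band on its own thread. Here `compress_band` is a placeholder (it just copies the band) standing in for whatever lossless encoder is actually used; all names are invented for the example:

```cpp
#include <cstdint>
#include <thread>
#include <vector>

// Placeholder for a real lossless JPEG encode of a band of rows;
// it just copies the pixels so the example is self-contained.
static std::vector<uint8_t> compress_band(const std::vector<uint8_t>& pixels,
                                          size_t row_bytes,
                                          size_t first_row, size_t num_rows) {
    auto begin = pixels.begin() + first_row * row_bytes;
    return std::vector<uint8_t>(begin, begin + num_rows * row_bytes);
}

// Encode the top and bottom halves of the image on two threads.
// The price is two independent streams, i.e. two file headers.
std::vector<std::vector<uint8_t>> compress_halves(const std::vector<uint8_t>& pixels,
                                                  size_t row_bytes, size_t rows) {
    size_t top = rows / 2;
    std::vector<std::vector<uint8_t>> out(2);   // sized up front: each thread
    std::thread t0([&] {                        // writes its own element only
        out[0] = compress_band(pixels, row_bytes, 0, top);
    });
    std::thread t1([&] {
        out[1] = compress_band(pixels, row_bytes, top, rows - top);
    });
    t0.join();
    t1.join();
    return out;
}
```

Because the two halves share no state, the threads need no synchronization beyond the final joins, which is why the poster sees near-linear scaling.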

As I reported in another post, JPEG lossless is slow on CPUs at a level lower than CPU type 33.
