I'm currently using a parallel pipeline with I/O-bound filters. Let me briefly recap my project, if you can bear with me.
A serial filter reads a string, a parallel filter transforms it, and a serial filter writes the result back. To get better performance, or so I thought, the writer and transformer threads don't communicate directly. Instead, the transformer filter pushes to a concurrent queue, and the writer pops from it.
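To make the layout concrete, here is a minimal sketch of the same shape. It is not the actual TBB code: it assumes plain std::thread workers instead of TBB filters and a mutex-protected std::queue instead of a TBB concurrent queue, and the names StringQueue, transform and run_pipeline are hypothetical, chosen only for illustration.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for a concurrent queue: mutex-protected, with a
// "closed" flag so the consumer knows when no more items will arrive.
class StringQueue {
public:
    void push(std::string s) {
        {
            std::lock_guard<std::mutex> lock(m_);
            q_.push(std::move(s));
        }
        cv_.notify_one();
    }
    void close() {
        {
            std::lock_guard<std::mutex> lock(m_);
            closed_ = true;
        }
        cv_.notify_all();
    }
    // Blocks until an item is available; returns false once closed and drained.
    bool pop(std::string& out) {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return !q_.empty() || closed_; });
        if (q_.empty()) return false;
        out = std::move(q_.front());
        q_.pop();
        return true;
    }
private:
    std::mutex m_;
    std::condition_variable cv_;
    std::queue<std::string> q_;
    bool closed_ = false;
};

// Placeholder for the expensive transformer (the real one is an optimizer).
std::string transform(const std::string& s) {
    return std::string(s.rbegin(), s.rend());
}

// Serial reader -> parallel transformers -> queue -> single writer.
std::vector<std::string> run_pipeline(const std::vector<std::string>& input,
                                      unsigned workers) {
    assert(workers > 0);
    StringQueue to_writer;
    std::atomic<std::size_t> next{0};  // "reader": hands out one input at a time

    std::vector<std::thread> transformers;
    for (unsigned i = 0; i < workers; ++i) {
        transformers.emplace_back([&] {
            for (std::size_t k = next.fetch_add(1); k < input.size();
                 k = next.fetch_add(1)) {
                to_writer.push(transform(input[k]));
            }
        });
    }

    // The writer is the only thread touching `written`, so no lock is needed.
    std::vector<std::string> written;
    std::thread writer([&] {
        std::string s;
        while (to_writer.pop(s)) written.push_back(std::move(s));
    });

    for (auto& t : transformers) t.join();
    to_writer.close();  // all producers are done; let the writer drain and exit
    writer.join();
    return written;
}
```

Note that in this sketch the writer receives results in completion order, not input order, because the queue decouples it from the transformers.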
I launch the program with the number of threads set in main, in my case tbb::task_scheduler_init init(6). However, in Xcode I see only two threads running. The pipeline is also slow. I expected some slowness, since my transformer is actually an optimizer with O(n^3) complexity, but I think I'm doing something really wrong here.
So, how can I profile my program to see what is slowing down the pipeline? I'd really like to optimize it as much as I can.
I could also put a queue between the reader and the transformer threads, if that would help. Or the problem could lie in the writer, but I don't really know how to check whether that is the case...
Any hints would be really appreciated!