I have a tricky problem under Red Hat 126.96.36.199 ELSMP.
I have an audio application reading from disk. All is well, except that about 2-4 times out of 10,000 occurrences, the time it takes to complete the block-level IO call increases significantly: the submit_bio call takes over half a second of wall-clock time to complete.
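For anyone wanting to see the same kind of outlier from user space, here is a minimal sketch, not the actual application code. It times each read() call and records the ones that exceed a threshold. The function name `find_slow_reads`, the block size, and the 0.5 second threshold are all assumptions chosen for illustration. It measures the whole read() system call as seen by the process, not submit_bio itself, and reads served from the page cache will not reach the block layer at all.

```python
import time


def find_slow_reads(path, block_size=64 * 1024, threshold_s=0.5):
    """Read `path` block by block and return (block_index, seconds)
    for every read() call that took at least `threshold_s` seconds.

    Hypothetical helper for illustration only.
    """
    slow = []
    # buffering=0 disables Python-level buffering, so each f.read()
    # corresponds to a single read() system call.
    with open(path, "rb", buffering=0) as f:
        index = 0
        while True:
            start = time.monotonic()
            data = f.read(block_size)
            elapsed = time.monotonic() - start
            if not data:  # end of file
                break
            if elapsed >= threshold_s:
                slow.append((index, elapsed))
            index += 1
    return slow
```

Logging the block index together with the elapsed time makes it possible to line up a slow read with whatever a kernel-side trace shows at the same moment.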
Using VTune, what would be the best way to profile or trace the kernel and find out what is causing the extra delay? I suspect it is either a long code path within the normal IO stack or some competing thread that is preempting the code.
My gut feeling is that the user-level thread is not being resumed properly when the read completes. An unrelated event later causes the process queues to be scanned, and only then is the ready-to-run process found.
This is running on a recent quad-core Xeon system, and I can reproduce the problem within a few minutes.