Determining root cause of Linux system call latency

I have a tricky problem under Red Hat Enterprise Linux (ELsmp kernel).

I have an audio application reading from disk. All is well, except that about 2-4 times out of 10,000 occurrences, the time it takes to complete the block-level I/O call increases significantly: the submit_bio call takes over half a second of wall-clock time to complete.

Using VTune, what would be the best way to profile/trace the kernel and find out what is causing the extra delay? It is either a long code path within the normal I/O stack, or some competing thread that is preempting the code.

My gut feeling is that the user-level thread is not being resumed properly when the read event completes. An unrelated event later causes the process queues to be scanned, and only then is the ready-to-run process found.

This is running on a recent quad-core Xeon system. I can reproduce the problem within a few minutes.
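
To show how the stall looks from user space, here is a minimal Python sketch of the kind of measurement involved. It is not the actual audio application: the function name `time_reads`, the 64 KiB block size, and the 0.5 second threshold are all assumptions chosen for illustration. It times each read() call with a monotonic clock and records the ones that exceed the threshold. Note that it measures the whole read() from user space, not submit_bio itself, and reads served from the page cache will never reach the disk.

```python
import os
import time


def time_reads(path, block_size=64 * 1024, threshold_s=0.5):
    """Read `path` sequentially and time every read() call.

    Returns (total_reads, slow), where `slow` is a list of
    (file_offset, elapsed_seconds) for reads slower than `threshold_s`.
    All parameter defaults are illustrative assumptions.
    """
    slow = []
    total = 0
    offset = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        while True:
            start = time.monotonic()          # wall-clock style, never goes backwards
            data = os.read(fd, block_size)
            elapsed = time.monotonic() - start
            if not data:                      # empty result means end of file
                break
            total += 1
            if elapsed > threshold_s:
                slow.append((offset, elapsed))
            offset += len(data)
    finally:
        os.close(fd)
    return total, slow


if __name__ == "__main__":
    import sys

    total, slow = time_reads(sys.argv[1])
    print(f"{total} reads, {len(slow)} slower than threshold")
    for off, secs in slow:
        print(f"  offset {off}: {secs:.3f} s")
```

Run against a large file, this prints the offsets of any reads that stalled, which gives timestamps to line up against whatever kernel trace is collected.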

