Isee thatthe GPU I am using,supports 16 concurrent kernels. I am starting 10 kernels by looping through clEnqueueNDRangeKernel for 10 times. Each of the kernel is tied to a different command queue. How do I get to know that the kernels are executing concurrently?
One way which I have thought is to get the time before and after the NDRangeKernel statement. I might have to use events so as to ensure the execution of the kernel has completed. But I still feel that the loop will start the kernels sequentially. Can someone tell me if this is the right way to start concurrent kernels..?