Example of app submitting work directly to GPU (bypassing KMD)?

Is there example code for the use case where a user app wants to submit a workload directly to the (gen9) GPU, i.e. bypassing the kernel mode driver?

Hello (name withheld),

Can you expand on the goal and define what you mean by 'submit'? 

You may be interested in the C for Media project.

You may also be interested in precompiling kernels before execution, either to an intermediate target such as SPIR or to a target-specific executable. See the -x spir toggle here for an example.

Other than that, I'm not aware of any way of bypassing the kernel mode driver to access gen9.




Hi MichaelC,

I am referring to what I think is described by the patent here. By "submit" I mean enqueuing a context to some controller in the GPU directly from the user application or OpenCL runtime, rather than having the kernel mode driver submit on the application's behalf. I believe this is possible as long as the user application and the kernel mode driver set up some initial agreement. The code on this page also seems to allude to it, through the "ContextIndex" and "SubmissionByProxy" fields of the following struct:

// PURPOSE: To represent the context ID structure and execlist/submit queues
typedef struct UK_CONTEXT_ID_MAP_REC
{
    union
    {
        struct
        {
            ULONG    ContextIndex          : KM_BIT_RANGE(  19,  0);  // NOTE: This can be index in the app context pool in direct submission case or LRCA itself in proxy submission case
            ULONG    SubmissionByProxy     : KM_BIT_RANGE(  20, 20);  // If KMD or other context submitted this context. This means, ContextID is LRCA[31:20]
            ULONG    Reserved              : KM_BIT_RANGE(  22, 21);  // Required by HW
            ULONG    SWCounter             : KM_BIT_RANGE(  28, 23);  // Used for tracking IOMMU group resubmits (or if submit by proxy is true, lower 6 bits QWIndex).
            ULONG    EngineId              : KM_BIT_RANGE(  31, 29);
        };
        ULONG        ContextIdDword;
    };
} /* typedef name not shown in this excerpt */;
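
To make the layout concrete, here is a minimal sketch that mirrors the bit ranges above with plain shifts and masks. It only illustrates how I read the KM_BIT_RANGE(hi, lo) annotations. The names ctx_id_pack, ctx_id_field, field_mask and the CTX_* constants are hypothetical helpers of my own, not part of any driver header, and nothing here talks to the hardware or shows how a submission would actually be made.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions copied from the KM_BIT_RANGE(hi, lo) annotations in the
 * struct. All names in this block are hypothetical, for illustration only. */
enum {
    CTX_INDEX_LO  = 0,  CTX_INDEX_HI  = 19,  /* ContextIndex      */
    CTX_PROXY_LO  = 20, CTX_PROXY_HI  = 20,  /* SubmissionByProxy */
    CTX_SWCNT_LO  = 23, CTX_SWCNT_HI  = 28,  /* SWCounter         */
    CTX_ENGINE_LO = 29, CTX_ENGINE_HI = 31   /* EngineId          */
};

/* Right-aligned mask covering (hi - lo + 1) bits. */
uint32_t field_mask(unsigned hi, unsigned lo)
{
    assert(hi >= lo && hi < 32);
    unsigned width = hi - lo + 1;
    return width == 32 ? 0xFFFFFFFFu : ((1u << width) - 1u);
}

/* Extract bits [hi:lo] of the 32-bit context ID dword. */
uint32_t ctx_id_field(uint32_t dword, unsigned hi, unsigned lo)
{
    return (dword >> lo) & field_mask(hi, lo);
}

/* Build the dword from its fields. Values wider than their field are
 * truncated; the Reserved bits [22:21] are left as zero. */
uint32_t ctx_id_pack(uint32_t index, uint32_t by_proxy,
                     uint32_t sw_counter, uint32_t engine)
{
    return ((index      & field_mask(CTX_INDEX_HI,  CTX_INDEX_LO))  << CTX_INDEX_LO)
         | ((by_proxy   & field_mask(CTX_PROXY_HI,  CTX_PROXY_LO))  << CTX_PROXY_LO)
         | ((sw_counter & field_mask(CTX_SWCNT_HI,  CTX_SWCNT_LO))  << CTX_SWCNT_LO)
         | ((engine     & field_mask(CTX_ENGINE_HI, CTX_ENGINE_LO)) << CTX_ENGINE_LO);
}
```

With these helpers, ctx_id_pack(5, 0, 3, 1) gives 0x21800005, and ctx_id_field(0x21800005, CTX_SWCNT_HI, CTX_SWCNT_LO) gives back 3. I used explicit shifts rather than C bit-fields because the order in which a compiler allocates bit-fields within a word is implementation-defined, so shifts make the assumed bit positions unambiguous.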
