Error when loading network to GNA inference engine

Hi All, 

   I'm trying to run inference on a very simple model defined as a frozen TensorFlow graph (attached) and converted to IR format with Model Optimizer (attached). Unfortunately, when I execute my code I get the following error:

Exception: [GNAPlugin] in function void GNAPluginNS::GNAPlugin::LoadNetwork(InferenceEngine::ICNNNetwork &): The plugin does not support layer: matrix_multiplication_explicit/MatMul:Gemm

AFAIU this means that my model includes a GEMM layer, which the GNA plugin does not support. Is there any way to force Model Optimizer NOT to use GEMM in the IR representation and to use something the GNA plugin supports instead?

BTW, I'm currently using the following command to convert .pb file to IR:

python --input_model matrix_mul_explicit.pb --input "input_x_float","input_y_float" --input_shape (1,8,8),(1,8,1)

Thanks in advance for any help.

Attachment: matrix_mul_explicit.zip (application/zip, 1.5 KB)

Dear Karol Duzinkiewicz 

According to the Supported Frameworks document for TensorFlow, MatMul should be converted to FullyConnected. So yes, it's true that, according to the Supported Devices document, Gemm is not supported by GNA, but the bigger question is: why is MatMul not being converted to FullyConnected? FullyConnected is supported by the GNA plugin.

I think it's a bug. 

Let me reproduce your error and file a bug on your behalf.


Best Reply

Dear Karol Duzinkiewicz 

I later found that this is in fact not a bug, but you're correct - GNA does not support GEMM at the moment.

The BatchMatMul operation in the attached model has dynamic inputs. You have to dump your .pb model to text to see the BatchMatMul node; you will not see it in the IR generated by MO.

node {
  name: "prefix/matrix_multiplication_explicit/MatMul"
  op: "BatchMatMul"
  input: "prefix/input_x_float"
  input: "prefix/input_y_float"
  attr {
    key: "T"
    value {
      type: DT_DOUBLE
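
Once the .pb has been dumped to text, a few lines of Python can list each node's op type so that BatchMatMul is easy to spot. This is only a minimal sketch: `list_node_ops` and `find_ops` are hypothetical helper names, the regular expression assumes that `name` and `op` are the first two fields of every `node` entry (as in the excerpt above), and the `Placeholder` node in the sample is made up for illustration.

```python
import re

# Matches the opening of a top-level `node { name: "..." op: "..." ...` entry.
# Assumption: the text dump lists `name` and `op` first, as in the excerpt above.
NODE_HEADER = re.compile(r'node\s*\{\s*name:\s*"([^"]*)"\s*op:\s*"([^"]*)"')


def list_node_ops(pbtxt):
    """Return (node name, op type) pairs found in a text-format graph dump."""
    return NODE_HEADER.findall(pbtxt)


def find_ops(pbtxt, op_type):
    """Return the names of all nodes whose op type equals op_type."""
    return [name for name, op in list_node_ops(pbtxt) if op == op_type]


if __name__ == "__main__":
    # Illustrative sample only; the Placeholder node is an assumption.
    sample = '''
node {
  name: "prefix/input_x_float"
  op: "Placeholder"
}
node {
  name: "prefix/matrix_multiplication_explicit/MatMul"
  op: "BatchMatMul"
  input: "prefix/input_x_float"
  input: "prefix/input_y_float"
}
'''
    print(find_ops(sample, "BatchMatMul"))
```

Searching for the op type rather than the node name matters here, because the node is still called `.../MatMul` even though its op is `BatchMatMul`.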

BatchMatMul is listed in the Supported Framework Layers document and mapped to GEMM, but it appears under ONNX rather than TensorFlow; that is actually a documentation bug.

FullyConnected - You can find the FullyConnected definition here. It takes a 2D or 4D input blob. It is properly represented in the aforementioned document and correctly mapped back to MatMul. 

Gemm - a pure matrix multiplication operation with no restrictions on its inputs. Unfortunately, GEMM is supported by only a few plugins, and GNA is not among them.
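
To make the distinction concrete, here is a rough plain-Python sketch of the two cases. It is not how Model Optimizer or any plugin implements these layers, and the helper names are hypothetical. It assumes that a FullyConnected-style layer carries its weights as a constant stored with the layer, while a Gemm-style multiplication receives both operands at inference time, which is the "dynamic inputs" situation in the attached model. Bias and weight transposition are left out for brevity.

```python
def matmul(a, b):
    """Multiply an (m x k) matrix by a (k x n) matrix given as nested lists."""
    inner = len(b)
    cols = len(b[0])
    return [[sum(row[p] * b[p][j] for p in range(inner)) for j in range(cols)]
            for row in a]


def batch_matmul(x, y):
    """Gemm-like case: both operands arrive at inference time."""
    return [matmul(xb, yb) for xb, yb in zip(x, y)]


def make_fully_connected(weights):
    """FullyConnected-like case: weights are fixed when the layer is built."""
    def layer(x):
        # Only the data tensor is a runtime input; weights are a constant.
        return matmul(x, weights)
    return layer


if __name__ == "__main__":
    # Shapes follow the --input_shape values in the question: (1,8,8) and (1,8,1).
    x = [[[1.0 if i == j else 0.0 for j in range(8)] for i in range(8)]]
    y = [[[float(i)] for i in range(8)]]
    out = batch_matmul(x, y)
    print(len(out), len(out[0]), len(out[0][0]))
```

In the model from the question both `input_x_float` and `input_y_float` are graph inputs, so there is no constant operand that could serve as the weights of a FullyConnected layer.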

Hope it helps and I apologize for the confusion.


OK, thanks - now I get it.
