Caffe - BVLC AlexNet Model vs Intel Optimized AlexNet Model

Hi All,

In Intel Optimized Caffe there are the default BVLC models and optimized versions of most of the BVLC models. What type of optimization has been done, and how can I validate it?

As of now, I am interested in the difference between the following two: the BVLC AlexNet model and the Intel Optimized AlexNet model.

I am using an Intel Xeon Phi 7210.

Thanks.

Chetan Arvind Patil

Dear Chetan,

Primarily, if you look at the train_val.prototxt of both the BVLC and Intel Optimized AlexNet models, you may not find any differences. But take a look at the two solver.prototxt files and you will find a set of hyperparameters that differs from BVLC's, including the CPU solver mode. This implies that the parameters defined in the solver are what bring the optimized performance on Intel Architecture.
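For illustration, a tuned solver.prototxt might differ from BVLC's along these lines (a hypothetical sketch; the actual values ship in the models directory of your Intel Caffe installation, so compare those rather than trusting the numbers below):

# Hypothetical excerpt of an Intel-tuned AlexNet solver.prototxt;
# the exact hyperparameter values depend on your Intel Caffe release.
net: "models/intel_optimized_models/alexnet/train_val.prototxt"
base_lr: 0.01
lr_policy: "poly"   # BVLC's reference AlexNet solver uses "step"
power: 0.6
max_iter: 250000
momentum: 0.9
weight_decay: 0.0005
solver_mode: CPU    # BVLC's reference solver defaults to GPU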

That does not mean the Caffe source code itself is left unoptimized for Intel. If you check the source code of the layers (written in C++), you will find certain modifications made by Intel. Mainly, Intel has added distributed training support, which BVLC Caffe did not have on its own (see MLSL, the library Intel uses to add this distributed capability to BVLC Caffe).

In addition, Intel Caffe has added several new layers in C++ that you can use to define your own custom topologies.

But beyond all these changes, the most significant one Intel made was integrating the MKL libraries with Caffe. When you build Caffe with MKL, it makes sure that hardware features like multiple cores and threads, AVX support, etc. are utilized while executing on a CPU.
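To make that concrete, here is a minimal stand-alone C++ sketch (illustrative only, not Caffe's actual source) of the kind of call an inner-product layer reduces to when Caffe is built with MKL; MKL threads the GEMM across the cores and uses the widest vector instructions the CPU supports:

// Fully connected forward pass as one MKL SGEMM call, the same BLAS
// routine Caffe dispatches to when built with MKL.
// Compile with: icc fc_sgemm.cpp -mkl
#include <mkl.h>
#include <vector>
#include <cstdio>

int main() {
    const int batch = 256, in_dim = 9216, out_dim = 4096;  // AlexNet fc6-like shape
    std::vector<float> bottom(batch * in_dim, 1.0f);    // input activations
    std::vector<float> weight(out_dim * in_dim, 0.01f); // layer weights
    std::vector<float> top(batch * out_dim, 0.0f);      // output activations

    // top = bottom * weight^T; MKL parallelizes this internally.
    cblas_sgemm(CblasRowMajor, CblasNoTrans, CblasTrans,
                batch, out_dim, in_dim,
                1.0f, bottom.data(), in_dim,
                weight.data(), in_dim,
                0.0f, top.data(), out_dim);

    std::printf("top[0] = %f (MKL threads: %d)\n", top[0], mkl_get_max_threads());
    return 0;
}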

 

Kindly let me know if the above details clarify your concerns. You will also learn a lot more as you start working with Intel Caffe.

 

Thanks

Anand

Hi Anand,

Thank you for the detailed response.

By default, Intel Caffe spawns 64 OpenMP threads on Xeon Phi. Is there a way to verify which parts of Caffe are being threaded? I assume most of this threading belongs to the Caffe framework itself rather than to the models fed into it?

Thanks.

Chetan Arvind Patil

Dear Chetan,

You can use the VTune profiler to see the performance of your code. Threading happens in the MKL engine, so if you build Caffe with MKL, it is taken care of automatically.
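If you just want to confirm the thread counts before reaching for VTune, a small stand-alone check like this (illustrative, not part of Caffe) prints what OpenMP and MKL will use; both can be overridden with the OMP_NUM_THREADS and MKL_NUM_THREADS environment variables before launching Caffe:

// Quick check of the thread counts OpenMP and MKL will use.
// Compile with: icc check_threads.cpp -qopenmp -mkl
#include <mkl.h>
#include <omp.h>
#include <cstdio>

int main() {
    std::printf("OpenMP max threads: %d\n", omp_get_max_threads());
    std::printf("MKL max threads:    %d\n", mkl_get_max_threads());
    return 0;
}

Running Caffe itself under VTune's hotspots analysis will then show which MKL kernels those threads actually spend their time in.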

 

Thanks

Anand

Hi Anand,

When using OpenMP threads, exactly what part is being threaded? Does it depend on the model/solver?

Thanks.

Chetan Arvind Patil

Dear Chetan,

OpenMP threads are used inside Intel MKL to parallelize some of the math routine calls, such as General Matrix-Matrix Multiplication (GEMM). This is used for forward and backward propagation and is meant to reduce computation time.

Also note that if the workload is not big enough to use all the cores of the Xeon Phi efficiently, you may not see much of a performance gain.
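You can see this effect directly by timing the same GEMM at different MKL thread counts; a small matrix stops benefiting from extra threads long before 64 cores are busy (again an illustrative sketch, not Caffe code):

// Time one SGEMM at several MKL thread counts to see how a given
// problem size scales. Compile with: icc gemm_scaling.cpp -mkl
#include <mkl.h>
#include <vector>
#include <cstdio>

static double time_gemm(int n, int threads) {
    mkl_set_num_threads(threads);
    std::vector<float> a(n * n, 1.0f), b(n * n, 1.0f), c(n * n, 0.0f);
    double t0 = dsecnd();  // MKL's wall-clock timer
    cblas_sgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
                n, n, n, 1.0f, a.data(), n, b.data(), n,
                0.0f, c.data(), n);
    return dsecnd() - t0;
}

int main() {
    for (int threads : {1, 16, 64}) {
        // n=64 barely benefits from more threads; n=4096 is where
        // the Xeon Phi's cores pay off.
        std::printf("n=64   threads=%2d: %.6f s\n", threads, time_gemm(64, threads));
        std::printf("n=4096 threads=%2d: %.6f s\n", threads, time_gemm(4096, threads));
    }
    return 0;
}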

 

Thanks

Anand

Dear Chetan,

Is the query clarified? Shall I close this thread?
 

Thanks

Anand

Hi Anand,

Yes. 

Thanks.

Chetan Arvind Patil
