Not getting the required computational power!

When I run this CNN program on my laptop, one epoch takes around 5 seconds, but when I run the same code on the cluster, the ETA for one epoch is around 60-80 seconds. How do I fix this?

Hi Prajjawal,

Could you please share the complete code so that we can look for possible optimizations and get back to you?

Please fill in the details below.

1. TensorFlow version and wheel file name used
2. Python version
3. Steps followed
4. Attach the complete code
5. Software used (TensorFlow, Keras, etc.)

Thanks,

Rajeswari Ponnuru.

TensorFlow version: 1.3

Python 3.5

Steps:

conda info -e
source activate test_env
pip install ipykernel

python -m ipykernel install --user --name=nn

a. echo jupyter notebook | qsub   

b. [u4336@c001 ~]$ qstat -f <jobID> | grep exec_host

exec_host = c001-n036/0

ssh -L 8896:localhost:8896 colfax ssh -L 8896:localhost:8896 c001-n036

Software used:

Keras with Theano as backend / TensorFlow

Hi Prajjawal,

As requested, please share the complete code that you executed, not just part of it.

Thanks,

Rajeswari Ponnuru.

 

Hi Prajjawal,

The code you posted is compatible with Python 2.7. Did you make any changes to make it work with Python 3.5, or did you use Python 2.7?

Python 3.5 has the '_pickle' module, whereas Python 2.7 has 'cPickle'.

Thanks,

Rajeswari Ponnuru.

 

 

It should be:
import six.moves.cPickle as pickle
You'll have to make that change.

You should use Python 2.7 for this for compatibility.
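
For reference, the same import can also be written without the six package. The following is only a minimal sketch using the standard library, not the code from the original program; the `payload` dictionary is a made-up value used to check that the import works on whichever interpreter runs it.

```python
import sys

try:
    import cPickle as pickle  # Python 2.7: C-accelerated pickle module
except ImportError:
    import pickle  # Python 3: the C accelerator (_pickle) is used automatically

# Round-trip a small, made-up object to confirm the import works here.
# Protocol 2 can be read by both Python 2.7 and Python 3.
payload = {"epoch": 1, "loss": 0.5}
restored = pickle.loads(pickle.dumps(payload, protocol=2))
print(sys.version_info[:2], restored == payload)
```

Either form lets the rest of the code call pickle.load and pickle.dump unchanged on both Python versions.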

Hi Prajjawal,

Can I assume that you used Python 2.7 for the source code links you provided?

Please confirm which version you used for this experiment.

-- Rajeswari Ponnuru.

 

Yes, Python 2.7.

Hi Prajjawal,

Please try the environment settings below to improve performance on the cluster.

Add the code below to your existing code and run the model.

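# ******* START CODE ADDITION  ***********
import os

# Set the OpenMP/KMP variables first, before TensorFlow is imported
# and the session is created.
os.environ["OMP_NUM_THREADS"] = "64"
os.environ["KMP_BLOCKTIME"] = "0"
os.environ["KMP_SETTINGS"] = "1"
os.environ["KMP_AFFINITY"] = "granularity=fine,verbose,compact,1,0"

import tensorflow as tf
from keras import backend as K

tf.app.flags.DEFINE_integer('inter_op', 2, """Inter Op Parallelism Threads.""")
tf.app.flags.DEFINE_integer('intra_op', 64, """Intra Op Parallelism Threads.""")
FLAGS = tf.app.flags.FLAGS

# Pass the thread counts to the session; defining the flags alone has no effect.
config = tf.ConfigProto(inter_op_parallelism_threads=FLAGS.inter_op,
                        intra_op_parallelism_threads=FLAGS.intra_op)
sess = tf.Session(config=config)
K.set_session(sess)

# ******* END CODE ADDITION  ***********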

Thanks,

Rajeswari Ponnuru.
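
The thread counts of 64 above are fixed values, and the same numbers may not suit a machine with a different core count (for example, the laptop). As a rough sketch only, the values could be derived from the machine instead. The helper name `suggested_thread_settings` and the one-thread-per-logical-CPU rule are assumptions made for illustration, not a recommendation from this thread; the best values depend on the node and should be measured.

```python
import multiprocessing


def suggested_thread_settings(cpu_count, inter_op=2):
    """Hypothetical helper: build starting values for the thread settings.

    Assumes one intra-op/OpenMP thread per CPU reported by the OS.
    This is only a starting point to tune from, not a tuned result.
    """
    return {
        "OMP_NUM_THREADS": str(cpu_count),  # environment variables must be strings
        "intra_op": cpu_count,
        "inter_op": inter_op,
    }


# multiprocessing.cpu_count() reports logical CPUs and works on Python 2.7 and 3.
settings = suggested_thread_settings(multiprocessing.cpu_count())
print(settings)
```

The resulting values can then replace the hard-coded "64" and the flag defaults, and the epoch time can be compared for a few different settings.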

 

 
