How to get benefit from Python (IDP) acceleration?

My Python script is not being accelerated at all.

I installed Intel Distribution for Python (IDP) using l_python2_pu3_2017.3.053, obtained from the Intel® Distribution for Python* page. I ran the installer and got no error messages. Then I loaded the Intel environment with "source /opt/intel/intelpython2/bin/activate". When I call "python", the script runs under Intel Python. My problem is that I cannot see any improvement when using IDP compared to the standard Python from Ubuntu. There must be something wrong with my environment.
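
To confirm that the Intel build is really the one running and that its NumPy is MKL-backed, a minimal check like the following can help (under IDP, sys.version should mention "Intel Corporation" and the NumPy build configuration should list MKL libraries; the exact section names vary by NumPy version):

import sys
import numpy as np

# Under IDP the build string mentions "Intel Corporation"
print(sys.version)

# Under IDP the BLAS/LAPACK sections should list MKL libraries (e.g. mkl_rt)
np.__config__.show()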

This is the information about my system:
Hardware:  Intel® Core™ i7-4500U @ 1.8 GHz processor, 64-bit, 8 GB RAM
Software:  Ubuntu 16.04 LTS operating system, Intel® Distribution for Python* 2.7.13, standard Python* 2.7.12, scikit-learn 0.19.0

My code is simple and uses the scikit-learn library. Since scikit-learn uses SciPy and NumPy, which are accelerated by Intel, I assumed its performance would improve too, but that is not what happened. Here is the code:


import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn import svm
import time
import sys

input_file = "drill39mil.csv"

# Read data
mydata = pd.read_csv(input_file, header = 0, delimiter = ",")

# Break data into train and test dataset
train_mydata, test_mydata = train_test_split(mydata, test_size = 0.2)

# Provided your CSV has a header row and the label column is named "classe"
train_data_target = train_mydata["classe"]
test_data_target = test_mydata["classe"]

# Select all but the last column as features (the last column holds the class label)
train_data = train_mydata.iloc[:,:-1]
test_data = test_mydata.iloc[:,:-1]

start = time.time()
#######       Classifier
clf = svm.SVC()

# Perform training, train_data_target)

# Make class predictions for all observations
Z = clf.predict(test_data)

# Compare predicted class labels with actual class labels
accuracy = np.mean(Z == test_data_target)
print("Predicted model accuracy: " + str(accuracy))

end = time.time()
print("Time (s):" + str(end - start))
print (sys.version)

The outputs when running Intel Python are:
Predicted model accuracy: 0.543351131452
Time (s):628.276842833
2.7.13 |Intel Corporation| (default, Apr 27 2017, 15:33:46)
[GCC 4.8.2 20140120 (Red Hat 4.8.2-15)]

And when running standard Python from Ubuntu:
Predicted model accuracy: 0.550597508263
Time (s):589.650998831
2.7.12 (default, Nov 19 2016, 06:48:10)
[GCC 5.4.0 20160609]

As you can see, Intel Python is slower than standard Python. Can anyone give me a tip about what is going wrong?

Thanks in advance,

Flávio Mello


Hi Flavio,

What are the dimensions of the "drill39mil.csv" file?  How many features?

If your dataset is oddly sized, or not big enough for vectorization to kick in, the cost of setting up the accelerated code paths may, at that small scale, make them slower than the standard variants of those packages. I just want to rule that out before we look at other parts of your environment and setup.
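
In the meantime, one way to separate scikit-learn from the underlying libraries is to time a pure NumPy operation under both interpreters: a large dense matrix multiply goes straight to BLAS (MKL under IDP) and should show a clear gap if the acceleration is active. A rough sketch (the 2000x2000 size is just an illustrative choice):

import time
import numpy as np

# A large dense matrix multiply is dominated by the BLAS backend
n = 2000
a = np.random.rand(n, n)
b = np.random.rand(n, n)

start = time.time()
c =, b)
print("GEMM time (s): " + str(time.time() - start))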



Hi David,

Thank you for your attention.

The dataset is composed of 21 features (columns) and 39329 records (rows).


Flávio Mello

Best Reply

Hi Flavio,

After talking with engineering: scikit-learn's svm.SVC() doesn't use NumPy (which we accelerate) and doesn't use our Intel® DAAL library; instead it uses libsvm, which we haven't accelerated yet. So for now it runs at the same speed as the standard version of scikit-learn, since it doesn't leverage any of our current accelerations. But in many other areas of scikit-learn we do have significant accelerations, and we are looking to add more this year (such as svm.SVC()). Thanks for bringing this to our attention, and please let us know if you have any further questions.
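
For readers who want to see where the acceleration does apply, one option is to time an estimator whose cost is dominated by dense linear algebra, since that work goes through the MKL-backed NumPy/SciPy. A minimal sketch using PCA on synthetic data (PCA is an illustrative pick here, not a confirmed list of accelerated estimators; run it under both interpreters and compare):

import time
import numpy as np
from sklearn.decomposition import PCA

# Synthetic stand-in for a real dataset; wide enough that the SVD dominates
X = np.random.rand(20000, 500)

start = time.time()
PCA(n_components=50).fit(X)
print("PCA fit time (s): " + str(time.time() - start))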



Hi David,

Thank you for the information.


Flávio Mello
