I am setting the following OpenMP thread parameters in bash before training CIFAR-10. However, Intel Caffe overrides them and falls back to its default of 64 threads. Also, the threads are scattered instead of compact.
Can anyone please share where I am going wrong?
Environment variables set before running training of CIFAR-10:
The CIFAR-10 prototxt sets the engine to MKL:
# reduce learning rate after 120 epochs (60000 iters) by factor of 10
# then another factor of 10 after 10 more epochs (5000 iters)
# The train/test net protocol buffer definition
# test_iter specifies how many forward passes the test should carry out.
# In the case of CIFAR10, we have test batch size 100 and 100 test iterations,
# covering the full 10,000 testing images.
# Carry out testing every 1000 training iterations.
# The base learning rate, momentum and the weight decay of the network.
# The learning rate policy
# Display every 200 iterations
# The maximum number of iterations
# snapshot intermediate results
# solver mode: CPU or GPU
I0927 11:14:16.258430 14454 cpu_info.cpp:468] OpenMP environmental variables are specified: no
I0927 11:14:16.258483 14454 cpu_info.cpp:471] OpenMP thread bind allowed: yes
I0927 11:14:16.258538 14454 cpu_info.cpp:474] Number of OpenMP threads: 64
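The first log line reports that no OpenMP environment variables were seen by the process. One possible cause, which nothing in this post confirms, is that the variables were assigned in the shell without `export`, so the Caffe process never inherited them. The sketch below shows how that difference could be checked. The helper name `is_exported` and the values `16` and `compact` are illustrative assumptions, not the settings used here.

```shell
#!/usr/bin/env bash
# Start from a clean state so the demonstration is deterministic.
unset OMP_NUM_THREADS KMP_AFFINITY

# Hypothetical helper: report whether a variable reaches child processes.
# printenv runs as a child process, so it only sees exported variables.
is_exported() {
  local name="$1"
  if printenv "$name" > /dev/null; then
    echo "$name: visible to child processes"
  else
    echo "$name: NOT visible to child processes"
  fi
}

# Illustrative values only (assumptions, not taken from the post).
OMP_NUM_THREADS=16            # plain assignment: stays local to this shell
export KMP_AFFINITY=compact   # exported: inherited by programs started here

is_exported OMP_NUM_THREADS
is_exported KMP_AFFINITY
```

If a variable shows up as not visible, a program launched from the same shell would not see it either, which would be consistent with the "specified: no" line in the log.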
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
On-line CPU(s) list: 0-255
Thread(s) per core: 4
Core(s) per socket: 64
NUMA node(s): 2
Vendor ID: GenuineIntel
CPU family: 6
Model name: Intel(R) Xeon Phi(TM) CPU 7210 @ 1.30GHz
CPU MHz: 1098.957
L1d cache: 32K
L1i cache: 32K
L2 cache: 1024K
NUMA node0 CPU(s): 0-255
NUMA node1 CPU(s):
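For reference, the default of 64 threads in the log equals the "Core(s) per socket" value above, while the machine exposes 256 logical CPUs. The arithmetic can be sketched as follows, assuming a single socket (the socket count is not shown in the listing, but 64 cores times 4 threads matches the on-line CPU list 0-255). The helper `lscpu_field` is a hypothetical name, and the text it parses is copied from the output above rather than read from a live system.

```shell
#!/usr/bin/env bash
# Two fields copied from the lscpu output above.
lscpu_text='Thread(s) per core: 4
Core(s) per socket: 64'

# Hypothetical helper: print the value of one lscpu field by its label.
lscpu_field() {
  printf '%s\n' "$lscpu_text" | awk -F': *' -v key="$1" '$1 == key { print $2 }'
}

threads_per_core="$(lscpu_field 'Thread(s) per core')"
cores="$(lscpu_field 'Core(s) per socket')"

# Assumes one socket, which the listing does not state explicitly.
echo "physical cores: $cores"
echo "logical CPUs:   $((cores * threads_per_core))"
```

So a default of 64 would correspond to one thread per physical core, leaving the other three hardware threads on each core unused unless the thread count is raised.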