To understand how XNOR-Nets are accelerated on CPUs, I have been reading a paper which claims that most CPUs can execute 64 binary operations in one clock cycle, and which calculates the speedup accordingly.
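To make that claim concrete, here is how I understand it. This is only a minimal sketch, not the paper's code: the function name `binary_dot`, the bit packing (bit 1 means +1, bit 0 means -1) and the use of the GCC/Clang builtin `__builtin_popcountll` are my own assumptions. One 64-bit XNOR stands in for 64 binary multiplications, and a popcount stands in for the accumulation.

```c
#include <stddef.h>
#include <stdint.h>

/* Dot product of two {-1,+1} vectors packed one element per bit
 * (assumed encoding: bit 1 = +1, bit 0 = -1).
 * n_words is the number of 64-bit words in each vector. */
static int binary_dot(const uint64_t *a, const uint64_t *b, size_t n_words)
{
    int matches = 0;
    for (size_t i = 0; i < n_words; ++i) {
        /* XNOR: a result bit is 1 wherever the two inputs agree,
         * i.e. wherever the product of the two elements is +1. */
        uint64_t agree = ~(a[i] ^ b[i]);
        /* Count the agreeing positions (GCC/Clang builtin). */
        matches += __builtin_popcountll(agree);
    }
    int n_bits = (int)(n_words * 64);
    /* Each match contributes +1 and each mismatch -1. */
    return 2 * matches - n_bits;
}
```

With this sketch, each loop iteration handles 64 elements with one XOR, one NOT and one popcount, which is where I assume the "64 binary operations" figure comes from. Whether that really costs one clock cycle depends on the throughput of those instructions on the particular CPU.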
To calculate the speedup of an XNOR-Net myself, I need to know how many binary operations per clock cycle KNL (Knights Landing) processors can execute. How can I find this information for a given CPU?
Does AVX-512 imply that 512 bitwise operations are possible every clock cycle?
If so, can you suggest reference material I could use to implement bitwise convolution operations that take advantage of the Intel architecture?