# SDE 7.15 for Linux has no 64-bit libs

The recently released SDE 7.15 for Linux seems to have 32-bit libraries instead of 64-bit ones in intel64/pin_ext_lib and intel64/xed_ext_lib. Is this an oversight or am I missing something?
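One quick way to check what a given library file actually is: in an ELF file, byte 4 of the header (`EI_CLASS`) is 1 for 32-bit objects and 2 for 64-bit objects. Below is a minimal sketch in C. The helper names `elf_class_from_header` and `elf_class_of_file` are made up for illustration and are not part of SDE or Pin. Note that static `.a` archives start with `!<arch>` rather than an ELF header, so this sketch returns 0 for them and you would have to inspect an extracted member instead.

```c
#include <stddef.h>
#include <stdio.h>

/* Returns 32 or 64 for an ELF header, 0 if the bytes are not ELF or the
   class byte is unrecognized. hdr must point at the first len bytes of the file. */
int elf_class_from_header(const unsigned char *hdr, size_t len)
{
    if (len < 5)
        return 0;
    /* ELF magic: 0x7f 'E' 'L' 'F' */
    if (hdr[0] != 0x7f || hdr[1] != 'E' || hdr[2] != 'L' || hdr[3] != 'F')
        return 0;
    if (hdr[4] == 1)   /* ELFCLASS32 */
        return 32;
    if (hdr[4] == 2)   /* ELFCLASS64 */
        return 64;
    return 0;
}

/* Hypothetical convenience wrapper: classifies the file at `path`.
   Returns -1 if the file cannot be opened. */
int elf_class_of_file(const char *path)
{
    unsigned char hdr[5];
    FILE *f = fopen(path, "rb");
    if (!f)
        return -1;
    size_t n = fread(hdr, 1, sizeof hdr, f);
    fclose(f);
    return elf_class_from_header(hdr, n);
}
```

Calling `elf_class_of_file` on each shared object under intel64/pin_ext_lib would tell you whether the 32-bit impression comes from the files themselves or from the tool used to inspect them.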

# SSE ucomiss/comiss strange behavior

Hello. When I run this code:

# Measuring Core Voltage

I am using an Atom N2600 processor. The Intel Software Developer's Manual says that a P-state can be requested by writing to MSR 0x199 and that the locked P-state can be read from MSR 0x198. The core voltage is given as MSR_PERF_STATUS[47:32] * (float) 1/(2^13).

The value that I see in MSR_PERF_STATUS (MSR 0x198) is 0x62d104306001045. Bits [47:32] are always 0x1043, irrespective of the value that I write to MSR 0x199.

When I use the formula: 0x1043 = 4163, so Voltage = 4163/(2^13) ≈ 0.51 V, which is a really low voltage for the processor to operate stably at.
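To make the arithmetic easy to re-check, here is a small sketch in C of the decoding described above. It only applies the quoted formula to a raw 64-bit MSR value; whether that formula is actually valid for the N2600 is exactly the open question. The function names are made up for illustration.

```c
#include <stdint.h>

/* Extracts bits [47:32] of a raw MSR_PERF_STATUS (0x198) value. */
uint16_t perf_status_voltage_field(uint64_t msr)
{
    return (uint16_t)((msr >> 32) & 0xFFFF);
}

/* Applies the quoted formula: field * 1/(2^13), result in volts.
   Assumption: the formula applies to this CPU model at all. */
double perf_status_core_voltage(uint64_t msr)
{
    return perf_status_voltage_field(msr) / 8192.0;
}
```

For the value 0x62d104306001045 the field comes out as 0x1043 and the computed voltage is 4163/8192, a little under 0.51 V, which matches the hand calculation.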

# Why does _mm_mulhrs_epi16() always do biased rounding to positive infinity?

Does anyone know why the pmulhrsw instruction or

_mm_mulhrs_epi16(x, y) := RoundDown((x * y + 16384) / 32768)

always rounds towards positive infinity? To me, this is terribly biased for negative numbers, because then a sequence like -0.6, 0.6, -0.6, 0.6, ... won't add up to 0 on average.

Is this behavior intentional or unintentional? If it's intentional, what could be the use? Is there an easy way to make it less biased?
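A scalar sketch in C of one 16-bit lane makes the rounding easy to poke at. It implements the formula quoted above literally, using explicit floor division so it does not depend on how the compiler shifts negative numbers. The name `mulhrs_model` is made up, and this models the formula rather than calling the intrinsic. Under this model only exact halves are pushed toward positive infinity; a value like -0.6 still rounds to -1.

```c
#include <stdint.h>

/* Scalar model of one lane: floor((x * y + 16384) / 32768). */
int16_t mulhrs_model(int16_t x, int16_t y)
{
    int32_t t = (int32_t)x * (int32_t)y + 16384;
    /* C division truncates toward zero, so step down by one
       when the remainder is negative to get a true floor. */
    int32_t q = t / 32768;
    if (t % 32768 < 0)
        q -= 1;
    return (int16_t)q;
}
```

With y = 1, the input x is the product in units of 1/32768: `mulhrs_model(16384, 1)` (that is, +0.5) gives 1, `mulhrs_model(-16384, 1)` (that is, -0.5) gives 0, while `mulhrs_model(19661, 1)` and `mulhrs_model(-19661, 1)` (about ±0.6) give 1 and -1.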

Lucky for me, I can just change the order of my operations to get a less biased result (my function is a signed geometric mean):

# AVX512f on non-MIC this year?

Hi all,

Can we expect AVX512f on non-MIC systems this year, or only on Knights Landing during 2015?

Thanks,

Angus.

# What does it imply to disable syscall in Intel SGX

I've recently been looking into programming with the Intel Software Guard Extensions (SGX) facility. The idea of SGX is to create an enclave in which security-sensitive code is loaded and executed. Most importantly, restrictions on memory access to that enclave (and many other restrictions) are enforced by hardware.

# Vector programming. SSE4.2 to AVX2 conversion examples.

In this blog I’ll try to show how to convert SSE4.2 assembly to AVX2 (using the schemes from the blog Programming using AVX2) and how this affects performance.

• Easy case. When it is enough to add the “v” prefix and replace “xmm” with “ymm”.

Suppose we have the following loop:

# how best to implement AVX2 _mm256_cmplt_epi32?

AVX2 appears to offer only _mm256_cmpeq_epi32 and _mm256_cmpgt_epi32. What's the most efficient way to implement _mm256_cmplt_epi32 given the available AVX2 functions?
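Since a < b is the same as b > a for signed 32-bit integers, one common approach is simply to swap the operands of _mm256_cmpgt_epi32, which costs nothing extra. A minimal sketch in C follows. The names `cmplt_epi32_sketch` and `cmplt_epi32_array8` are made up, and the `target("avx2")` attribute is a GCC/Clang-specific assumption so the file builds without -mavx2; with that flag you can drop the attribute.

```c
#include <immintrin.h>
#include <stdint.h>

/* a < b per signed 32-bit lane, expressed as b > a. */
__attribute__((target("avx2")))
static inline __m256i cmplt_epi32_sketch(__m256i a, __m256i b)
{
    return _mm256_cmpgt_epi32(b, a);
}

/* Helper for experimenting: compares 8 ints from a and b and writes
   -1 (all bits set) to mask[i] where a[i] < b[i], else 0. */
__attribute__((target("avx2")))
void cmplt_epi32_array8(const int32_t *a, const int32_t *b, int32_t *mask)
{
    __m256i va = _mm256_loadu_si256((const __m256i *)a);
    __m256i vb = _mm256_loadu_si256((const __m256i *)b);
    _mm256_storeu_si256((__m256i *)mask, cmplt_epi32_sketch(va, vb));
}
```

The code needs an AVX2-capable CPU at run time. Equal lanes produce 0, as expected for a strict less-than.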

# Interpreting Intel SDE avx/sse transition tracker

Hello, I am running Intel SDE in 'ast' mode (the AVX/SSE transition tracker) on Mac OS X. I am struggling to interpret the results.

First off, in the output, it shows function addresses, not function names. Should it not show the symbols? I built my app with -g.

Next, this is the output I see: are these numbers indicative of excessive transitions? Or are they in a normal range?