matt.j
Total Points:
40
Registered User
August 13, 2009 7:24 PM PDT
Opcode semantics
Hi,


Firstly, I apologize if this is the wrong forum; I could not find a more relevant one.

I'm looking for clarification regarding a claim that there is a 1-cycle difference between the instructions:

0x3B (cmp reg, mem)
0x39 (cmp mem, reg)

As the two are functionally equivalent, I assume it would have to do with the decoding logic, but I would like to know whether the claim holds true in the first place.
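
To make the two encodings concrete, here is a minimal Python sketch that assembles both forms by hand. It assumes 32-bit operands and a plain `[base]` memory operand with no displacement; the helper names (`modrm`, `cmp_reg_mem`, and so on) are made up for illustration and do not come from any assembler library. It only shows the byte encodings and says nothing about timing.

```python
# Register numbers as used in the ModRM byte (32-bit registers).
# esp and ebp are left out on purpose: with mod=00 they select a SIB byte
# and a disp32 respectively, which this sketch does not handle.
REGS = {"eax": 0, "ecx": 1, "edx": 2, "ebx": 3, "esi": 6, "edi": 7}


def modrm(mod, reg, rm):
    """Pack the mod (2 bits), reg (3 bits) and r/m (3 bits) fields."""
    return (mod << 6) | (reg << 3) | rm


def cmp_reg_mem(reg, base):
    """cmp reg, [base] using opcode 0x3B (CMP r32, r/m32)."""
    return bytes([0x3B, modrm(0b00, REGS[reg], REGS[base])])


def cmp_mem_reg(base, reg):
    """cmp [base], reg using opcode 0x39 (CMP r/m32, r32)."""
    return bytes([0x39, modrm(0b00, REGS[reg], REGS[base])])


def cmp_reg_reg_3b(a, b):
    """cmp a, b with opcode 0x3B: a goes in the reg field, b in r/m."""
    return bytes([0x3B, modrm(0b11, REGS[a], REGS[b])])


def cmp_reg_reg_39(a, b):
    """cmp a, b with opcode 0x39: a goes in the r/m field, b in reg."""
    return bytes([0x39, modrm(0b11, REGS[b], REGS[a])])


if __name__ == "__main__":
    print(cmp_reg_mem("eax", "ebx").hex())     # cmp eax, [ebx]
    print(cmp_mem_reg("ebx", "eax").hex())     # cmp [ebx], eax
    print(cmp_reg_reg_3b("eax", "ebx").hex())  # cmp eax, ebx (0x3B form)
    print(cmp_reg_reg_39("eax", "ebx").hex())  # cmp eax, ebx (0x39 form)
```

With a memory operand the two opcodes share the same ModRM byte and differ only in which operand is subtracted from which, so the resulting flags are mirrored. With two register operands, the same comparison can be encoded with either opcode.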

Additionally, if it is true, where could I find documentation of such details? None of the manuals I've read (not even the IA-32 optimization manual) mentions these things.


Thanks,

Matt.
matt.j
Total Points:
40
Registered User
October 31, 2009 3:28 AM PDT
#1
Any ideas, anyone?


bronxzv
Total Points:
405
Status Points:
355
Green Belt
October 31, 2009 3:53 AM PDT
#2 Reply to #1
Quoting - matt.j
Any ideas, anyone?

No idea, but here is the best place I know of for this kind of subtlety:

http://www.agner.org/optimize/



c0d1f1ed
Total Points:
860
Status Points:
360
Brown Belt
November 2, 2009 12:38 AM PST
#3
Quoting - matt.j
Hi,


Firstly, I apologize if this is the wrong forum; I could not find a more relevant one.

I'm looking for clarification regarding a claim that there is a 1-cycle difference between the instructions:

0x3B (cmp reg, mem)
0x39 (cmp mem, reg)

As the two are functionally equivalent, I assume it would have to do with the decoding logic, but I would like to know whether the claim holds true in the first place.

Additionally, if it is true, where could I find documentation of such details? None of the manuals I've read (not even the IA-32 optimization manual) mentions these things.


Thanks,

Matt.


Just a guess, but normally an instruction with a memory location as the 'destination' operand takes an extra cycle for the write operation. cmp doesn't actually write anything to its destination, but it might simplify the hardware to handle all such instructions uniformly and skip the write at a later stage (where latency might also be less of an issue). Compilers shouldn't normally emit this form anyway, so the CPU designers can afford compromises like this.


