Calculation of Cycles Per Instruction (CPI) for Intel processors.

Dear sir, I am exploring how to calculate processor speed in MIPS, MOPS or GFLOPS. I know how to calculate the clock rate. I need a way to calculate the Cycles Per Instruction (CPI) value for a given Intel processor. Please suggest a method I should follow to calculate CPI. I have also heard about benchmark programs that determine CPU speed in MIPS, MOPS or GFLOPS, but I don't know where they are available. Please give me some links where I can get these test programs. I don't know whether this is the right place to ask these kinds of questions. If it is not, please give me some links where I can find the related information.


You will find a program that shows some processor info at
A good source is if you want to calculate it by yourself for a given procedure.

Dear sir, thanks for your reply. I know about the CPUID software, and I downloaded the PDF you specified. Please suggest how I can write a program that uses the instruction set to calculate processor speed in MIPS. I have also heard that some benchmark programs help to estimate MIPS; what are these benchmark programs and where can I get them? I am using Windows with the VC++ compiler, programming in C or C++. Is it necessary to learn assembly language programming to calculate MIPS? Awaiting your reply. Thanks in advance.

What do you need these values for?
You can calculate the theoretical throughput from the tables given in the link to Agner Fog's documentation. You will get values such as 1, 2 or 3 (independent) operations per cycle, each possibly giving multiple results, such as 4 for an SSE single-precision operation. You will have to take the latencies into account. All this depends on, e.g., the processor, the actual instruction, memory access times, and also on what else is running on the computer.
You can create some test routines which, for example, calculate a dot product for a long vector (e.g. 1024 singles) and measure the time it takes to run this procedure 100000 times. This gives a very rough estimate of the actual floating-point performance and depends heavily on the compiler and the compiler switches used. It also depends on the memory bandwidth of your processor. Keep in mind that modern processors can process many things in parallel: they have multiple cores, use hyperthreading and are superscalar. If you start your test program several times simultaneously, you will see the result vary.
You will also see that the theoretical throughput is almost never reached.
To get a measured result equal to, or as near as possible to, the theoretical one, you will need to read a lot of documentation and do many experiments.

I agree with you, and I would like to hear other people's opinions on this too, as I have been in hot water about this before. Rating a CPU/GPU in terms of FLOPS is the hardware manufacturer's prerogative when benchmarking the chip. If I may use an analogy: "it only shows how fast a car's engine can rev without the clutch engaged". Once you start using the CPU, the rating changes according to which components of the CPU are in use at the time. The GFLOPS rating is highly subjective, and one needs to be careful when using it to compare one computing platform against another.
