Originally Posted by aayman_farzand
4th and 5th image should be what you're looking for.
To talk about that review (I don't think the reviewer fully gets it, IMO) and the question put to you: IPC means Instructions Per Cycle. If you want actual performance, which the reviewer hinted at by mentioning the "higher frequency allowances," then you want IPS, which means Instructions Per Second.
Now, IPC varies by task. Depending on the instruction set used and how the software is coded, you will get different numbers. The measure bakes in multiple aspects of an architecture, from cache changes, to latency affecting how quickly work completes, to memory speed affecting how well the cache and each core are kept fed.
I previously argued with Anton over at anandtech about whether memory speed and timings/latency need to be standardized for IPC testing. His position is that you should use software that can isolate IPC while the chip runs at stock values with stock memory configurations, so that you don't artificially gimp other elements of the architecture. For example, slower memory affects keeping the cores fed, and cache speed can drop when you drop the core speed, so you end up changing the behavior of something other than the core that still affects the core's performance. It is a good argument, and it is why he uses SPEC for checking IPC.
IPC also varies by task and instruction set, so quoting a single number for IPC becomes problematic.
Now, when you multiply IPC by the frequency (how many cycles complete in a period, typically a second), you arrive at IPS. This is what is more commonly seen as overall performance.
How these play together is fairly straightforward. If you have an IPC of 1.2 (20% higher than another chip) and a speed of 4GHz, 1.2*4=4.8 for your IPS (in billions of instructions per second). If you instead have an IPC of 1, but a speed of 5GHz, 1*5=5 IPS, so the lower-IPC chip still comes out ahead. Welcome to the comparison of AMD Zen and Zen+ versus Intel, or even Zen 2.
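To put numbers on that, here is a minimal Python sketch of the IPC times frequency arithmetic. The function name and the two "chips" are purely illustrative, using the same made-up figures as above rather than measurements of any real part.

```python
def ips_giga(ipc, freq_ghz):
    """Billions of instructions per second = IPC * clock speed in GHz."""
    return ipc * freq_ghz

# Hypothetical chips, same example numbers as in the post
chip_a = ips_giga(1.2, 4.0)  # higher IPC, lower clock
chip_b = ips_giga(1.0, 5.0)  # lower IPC, higher clock

print(f"chip A: {chip_a:.1f} billion instr/s")
print(f"chip B: {chip_b:.1f} billion instr/s")
print("faster overall:", "A" if chip_a > chip_b else "B")
```

The point is only that a 20% IPC lead does not win if the other chip clocks 25% higher on that same task.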
So IPC bakes in certain issues, like latency effects, but they only show up in certain tasks. If the task is latency sensitive, you could see a dive in performance. If the test is set up so it all stays within the cache system and doesn't make memory calls, then with Zen, due to the architecture, you may get a higher number.
This is why, although IPC is good to know and understand, there is much more to the performance of a chip. It should also show why IPC is harder to discuss: it is shorthand for gains in an architecture, but it is not the end-all of overall performance, nor should it ever be.
But it is NOT just a marketing thing. It is quantifiable by task: so long as you have a task with a set number of instructions, you can factor out the frequency (backing out from IPS to IPC), which allows for a performance analysis. But trying to pinpoint how much IPC came from which change? Now you are talking voodoo. That can be difficult to impossible at times, mainly because we did not design the chips, nor did we test the effects between each proposed change/revision.
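As a rough sketch of what "factoring out the frequency" looks like, assuming you know the instruction count of the task, the wall time, and that the clock stayed fixed for the whole run (a big assumption with boost behavior), the arithmetic in Python would be something like this. The function name and all numbers are hypothetical.

```python
def ipc_from_run(instructions, seconds, freq_hz):
    """Back IPC out of a timed run: instructions / (seconds * cycles per second)."""
    cycles = seconds * freq_hz  # total cycles, assuming a constant clock
    return instructions / cycles

# Hypothetical run: 48 billion instructions in 10 s at a fixed 4 GHz
ipc = ipc_from_run(48e9, 10.0, 4.0e9)
print(f"IPC for this task: {ipc:.2f}")
```

That gives you one IPC number for that one task. It says nothing about which architectural change produced it.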
I hope that helps (and I'm sure it causes more questions to arise)...
Edit: To be clear, setting a core speed that all chips can hit and setting the memory to equal frequency and latency is an attempt to control factors when comparing different CPU architectures, and it gives an idea of the behavior of the core. But this can also be misleading at times, as it can affect other aspects of the architecture that contribute to IPC, such as latency or cache behavior.