I'm not an expert in this stuff, but my take is that since CPUs tend to do the same kinds of work over and over, there are a lot of ways they can be optimised over time so that they do the same thing quicker. There's a lot that can influence it, and the changes in each generation help different types of workloads to a greater or lesser degree.
There are some instructions and features that, together with the architectural changes to support them, make a CPU much faster than one without them when the software is written to use them; AVX would be an example. There's also the size and speed of the caches and of memory access, since the compute units have to wait until they can read or write the data they need. The same goes for the pipelines that execute instructions: if they're not wide or sophisticated enough, then even when the task in another pipeline finishes, everything ends up waiting on the slowest one, so there's less advantage in splitting the work up in the first place. Speculative execution has been around for a long time but has recently become controversial (the Spectre and Meltdown vulnerabilities abuse it); the idea is that the CPU does some work in anticipation that it will be needed, since it would be slower to wait and find out, and keeping as much of the CPU busy as possible is the goal behind plenty of other, older features too.
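To make the AVX point a bit more concrete, here's a toy sketch of my own (not from any real codebase): the same array sum written one float at a time, and then eight floats at a time with AVX intrinsics. The hardware does roughly the same amount of useful work either way, but the vector version issues far fewer instructions to do it, and only helps if the software was actually built to use it.

```c
/* A minimal sketch: summing an array the scalar way vs. with AVX intrinsics.
   Compile with something like: gcc -O2 -mavx sum.c */
#include <immintrin.h>
#include <stdio.h>

#define N 1024 /* multiple of 8 so the vector loop needs no leftover handling */

/* One float per iteration. */
static float sum_scalar(const float *a)
{
    float s = 0.0f;
    for (int i = 0; i < N; i++)
        s += a[i];
    return s;
}

/* Eight floats per iteration using 256-bit AVX registers. */
static float sum_avx(const float *a)
{
    __m256 acc = _mm256_setzero_ps();
    for (int i = 0; i < N; i += 8)
        acc = _mm256_add_ps(acc, _mm256_loadu_ps(a + i));

    /* Add the 8 lanes of the accumulator together at the end. */
    float lanes[8];
    _mm256_storeu_ps(lanes, acc);
    float s = 0.0f;
    for (int i = 0; i < 8; i++)
        s += lanes[i];
    return s;
}

int main(void)
{
    float a[N];
    for (int i = 0; i < N; i++)
        a[i] = 1.0f;
    printf("scalar: %f\n", sum_scalar(a));
    printf("avx:    %f\n", sum_avx(a));
    return 0;
}
```

And for the cache/memory side of it, another toy sketch: the same sum over a big 2D array, walked in two different orders. One order touches consecutive addresses, so every cache line it pulls in gets fully used; the other jumps about 16 KB between accesses, so the CPU spends most of its time waiting on memory. (To actually measure the difference you'd need timing code and data the compiler can't optimise away; this just shows the two access patterns.)

```c
#include <stdio.h>
#include <stdlib.h>

#define ROWS 4096
#define COLS 4096

int main(void)
{
    /* One big ROWS x COLS array of ints, laid out row by row in memory. */
    int *m = calloc((size_t)ROWS * COLS, sizeof *m);
    if (!m)
        return 1;

    long long sum = 0;

    /* Row-major walk: consecutive addresses, cache-friendly. */
    for (int r = 0; r < ROWS; r++)
        for (int c = 0; c < COLS; c++)
            sum += m[(size_t)r * COLS + c];

    /* Column-major walk: each access is COLS * sizeof(int) = 16 KB away
       from the previous one, so the caches help far less. */
    for (int c = 0; c < COLS; c++)
        for (int r = 0; r < ROWS; r++)
            sum += m[(size_t)r * COLS + c];

    printf("%lld\n", sum);
    free(m);
    return 0;
}
```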
There's a cost to excessive optimisation too, because it can make the CPU slower at more general tasks; but if you did decide to focus on only one purpose, the CPU could be made much more efficient, with a much higher IPC than a general-purpose design. It can also be difficult to predict the best kind of optimisation, since you don't know exactly what tasks software will demand or which instructions will be called for over and over. GPUs are a good example of that trade-off: they're heavily specialised for wide, parallel workloads, and their architectural differences and optimisations (or the lack of them) can show up quite clearly in benchmarks.