
Zen 4 - "Houston, we have a problem!"

You're a TL;DR person and want to get to the problem right away? I don't blame you - I would too. But skipping my writing in the beginning kind of sucks because I made a thread with a slightly larger purpose than to just describe the problem. If you must, just look at the picture after the text below and start reading below it. If you have more self control, don't! (until you're supposed to lol)



Zen 2 was great. AMD crawled right up Intel's behind: matching their IPC for the first time. Well, it felt like the first time! lol. (even though it was only 12 years or so).
Unfortunately they didn't bring overclocking to the table, but efficiency? I almost forgave them for selling me an unlocked chip that needs 1.5V to get to 4.4GHz!!
The only word is epic: AMD, trading blows with Intel after so long, while consuming about HALF the power?! X570 though.. was a little lacking. It's not like AMD didn't have time to pack it full of all the relevant modern features and I/O we could all possibly want, and for cheap - they were given notice January 2011, the release of Sandy Bridge 10 months before Bulldozer's release, that they'd need to scrap their still unreleased processor - entirely - and everything that followed if they were to ever compete again with Intel, even in the mid-range. So they worked on Zen/Zen2 and 470/570 while they sold their many many-core low-IPC processors to people who did heavily multi-threaded things on the regular - but just the ones living in buildings with utilities included, as well as rooms with individual climate control. And they sold those cheap cheap cheaaap APUs too - what a disaster!! GPU acceleration of CPU workloads in AMD's APUs was a great idea on paper (where it remained from conception til now). AMD's APUs were just slow, hot CPUs with decent GPUs attached in a jerry-rigged kind of way (compared to Intel's integrated offerings). They were great if you had the time to wait for 4 minutes of booting, 7 of logging in and opening everything, and 5 minutes to navigate 3 links deep into a wrestling streaming site to watch wrestling!

All those APUs did was ruin AMD in the eyes of millions of people who bought them for their image - they went in thinking AMD was competent and a company that offered the lowest prices and the best value products. They were only right on one of their preconceptions - price. Things had changed at AMD and the buyers hadn't noticed yet. For a few years, AMD chips were found in the absolute cheapest laptops - base models available for about $75 less than the cheapest Intel options on sale.
They were cheap - but when you could spend 20% more for a comparatively compact, capable, and relatively cooler-running computer? AMD offered no value, and normally patient people got angry after wasting so many hours of their lives over 2-3 years because of AMD. So angry they had to spread the hate around by leaving the PC platform entirely, switching to Apple! (remember, these were the bargain hunters too).
All of AMD's inconsiderate actions weren't for nothing though, they did survive to make some good stuff recently, and I know they're not dragging their feet with Zen 4 because they LEARNED FROM THE PAST and DON'T WANT TO REPEAT IT. RIGHT AMD?

So now we have Zen 4 and x670. Modern connectivity without 3rd party add-in cards! Not that I don't like ASMedia's solutions - I just prefer my essential controllers in the chipset and on the CPU itself (why not?!) It all looks pretty great. But, there's something I want to point out.
I'm really hoping that Zen 4 isn't just Zen 3 with proper I/O and DDR5/PCIe5.0, but the observation I made below is almost exactly how I tracked Intel's most severe stagnation from the 2nd gen through 7th (and beyond)

SuperPi 1M is what we're looking at:
I'll continue below the image.
(turns out, this preamble is for review and tone!)

[Image: SuperPi 1M results for the 5950X and 7950X]


Again, we're just looking at SuperPi 1M because it's not dependent on RAM speed.

The 5950X boosts to 4.9GHz for one thread.
The 7950X boosts to 5.7GHz for one thread.

If thermals allow, I think both will increase by another 150MHz or so, but we're ignoring that because it's the same for both chips and the effect on the ratio is negligible

Clockspeed: 5700/4900 = 1.16326, so the 7950X runs at 116.33% of the 5950X's maximum single-threaded clock speed (16.33% faster).

Look at the image above: in SuperPi, the 7950X calculates 1M digits of Pi at 116.6% of the 5950X's speed (16.6% faster).
Again, RAM speed is irrelevant
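
The comparison above can be boiled down to one per-clock figure. This is only a minimal sketch: it assumes the 1.166 SuperPi speed ratio read off the image and the single-thread boost clocks quoted above, and the function name is made up for illustration.

```python
def per_clock_ratio(speedup: float, new_mhz: float, old_mhz: float) -> float:
    """Return the speedup that is left after dividing out the clock speed ratio."""
    return speedup / (new_mhz / old_mhz)


clock_ratio = 5700 / 4900   # 7950X vs 5950X single-thread boost clocks
superpi_speedup = 1.166     # 7950X vs 5950X in SuperPi 1M, taken from the image
ipc_ratio = per_clock_ratio(superpi_speedup, 5700, 4900)

print(f"clock ratio:     {clock_ratio:.4f}")
print(f"per-clock ratio: {ipc_ratio:.4f}")
```

A per-clock ratio this close to 1.0 means nearly all of the SuperPi gain is accounted for by clock speed alone, at least for this one benchmark.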

Now, Pi is not the most complicated thing to compute - I'm sure very few, if any, "new" instructions were used, especially in SuperPi. But we can still infer that internally, a large portion of the Zen 4 CPU design remained unchanged (except for transistor size) from the previous Zen 3.

We can also infer that in most cases of day to day computing, whether you have a 7950X or 5950X at 5GHz, both will perform the same except when memory bandwidth is a limiting factor. Usually it's not, and when it is, it's not something single or lightly threaded.

7950X is better than 5950X in more ways than one, but IPC is the same (at least for the parts of the chip used to calculate Pi in SuperPi, which is a good amount of the chip and is also used for general purpose computing).

From 2600K to 7700K, the majority (probably more than 80%) of Intel's CPU performance increases came from plain old clock speed increases. That's why, except for AVX2, if you put 2133 RAM into both (the 7700K's rated speed and the 2600K's maximum speed) and clock them both to 5GHz, the 2600K is more of a brother to the 7700K than the 7600K is!

AMD, WE DO NOT WANT THIS FOR OUR NEW CHIPS.

Definitely keep the clock speed increases coming whenever you can, and never deliberately hold back clock speeds (like Intel did - since Sandy Bridge 2500K/2600K, 4.6-5GHz was the range you could expect your chip would be within when fed a safe voltage. We've seen it before - it'll be even easier to spot this time!)

My hands hurt. Discuss!
1 - 20 of 68 Posts

·
Registered
Joined
·
1,109 Posts
Yup. Gotta remember that Comet Lake is just a lot more Skylake! And then people forget about Rocket Lake, because the IPC improvements didn't cover for losing two cores on the top SKU.
Rocket Lake is about 15-18% better IPC than Comet Lake, but about 1-6% worse in gaming performance due to increased latency.
Alder Lake P cores (Golden Cove) are about 19% faster than Rocket Lake.

Alder Lake P cores have about 40% better IPC than Comet Lake.
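
As a quick sanity check on how those figures stack, successive percentage gains multiply rather than add. A minimal sketch using only the percentages quoted in this post (the helper name is hypothetical):

```python
def compound(*gains_pct: float) -> float:
    """Combine successive percentage gains into one overall percentage gain."""
    total = 1.0
    for gain in gains_pct:
        total *= 1.0 + gain / 100.0
    return (total - 1.0) * 100.0


# Comet Lake -> Rocket Lake (15-18%), then Rocket Lake -> Golden Cove (19%)
low = compound(15, 19)
high = compound(18, 19)
print(f"Comet Lake -> Golden Cove: {low:.2f}% to {high:.2f}%")
```

That lands at roughly 37% to 40%, which is consistent with the "about 40%" figure.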
 

·
Registered
Joined
·
1,745 Posts

·
Iconoclast
Joined
·
33,714 Posts
Tom's Hardware shows a sizable IPC increase. There was a second reviewer (I forgot where) that does a similar test with the 7950X and 5950X clocked at 4.0GHz with similar IPC increases.

AMD Ryzen 9 7950X and Ryzen 5 7600X Review: A Return to Gaming Dominance | Tom's Hardware (tomshardware.com)

View attachment 2573672
Doesn't y-cruncher use AVX-512?

Zen 4's AVX-512 performance is impressive on its own, and even more so when we consider that it's the only mainstream architecture that still (barring older Alder Lake setups with old/modified firmware) has any AVX-512 instructions enabled.

However, stuff that uses it is extremely rare in the consumer space, so factoring it in to general IPC is a bit optimistic.
 

·
Registered
Joined
·
1,745 Posts
Doesn't y-cruncher use AVX-512?

Zen 4's AVX-512 performance is impressive on its own, and even more so when we consider that it's the only mainstream architecture that still (barring older Alder Lake setups with old/modified firmware) has any AVX-512 instructions enabled.

However, stuff that uses it is extremely rare in the consumer space, so factoring it in to general IPC is a bit optimistic.
Read the article, they have other benchmarks. There are also other reviews that support AMD's IPC gain claims. The IPC gain from 5000 to 7000 is in line with the gains from 1000 to 2000 to 3000 to 5000, if not a bit more.

The only problems AMD has are A: gaming performance vs the 5800X3D and B: the cost/performance of the mid and low range 7000 series vs the competition.
 

·
Registered
Joined
·
1,064 Posts
Discussion Starter · #8 ·
Rocket Lake is about 15-18% better IPC than Comet Lake...
Yup. Gotta remember that Comet Lake is just a lot more Skylake!...
Sandy Bridge to Ivy Bridge has 5.8% IPC...
Also have to remember the IPC improvements are averages, which, as time goes on, include more and more... less commonly used instructions for common tasks. Large portions of the chip weren't touched at all from Sandy to Ivy, from Ivy to Haswell, Haswell to Broadwell, and on. Don't treat this next thing I say as absolute truth because it might not be, but I strongly believe that improvements from memory speed increases are by and large treated as "IPC" improvements. For example, this 2500K machine I'm on right now, in Passmark's Performance Test, there are two CPU sub-tests which are highly dependent on memory bandwidth - "Physics" and "Prime Numbers". They almost scale linearly with memory frequency and number of cores. It doesn't seem to matter if it was a 4 core 9100 or 2500, if they were both running similarly performing 2133 RAM, they'd have results close enough to be from the same CPU with Windows 8.1 instead of 10.

Also, some "IPC" improvements aren't technically "IPC". New instructions used to do the same work in less time, aren't IPC. You can't work them into a figure reflective of computing anything other than that specific (or type of) task

For example, running SuperPi 1M like I did above (a test as divorced from memory bandwidth as possible), you find the actual IPC improvements reflective of simple tasks common to computing. If you ran every Intel CPU from 2nd to 12th gen at 5GHz and ran SuperPi 1M, your results would be very reflective of how quick systems feel during normal operation. Not the timing of large tasks sometimes dependent on new instruction sets - normal tasks.

Also, you can see very well the L1/L2/L3 cache bandwidth increases over the years using the Memory and Cache Benchmark in AIDA64. These improvements also pushed "IPC" along. The cache was always capable of running much faster than it did in earlier generations, but Intel chose to not run it at its potential, just like they chose to not run the CPUs at their potential. If Intel had clocked the 2500K/2600K as aggressively as they did the 12700K/12900K, even without thermal velocity boost and newer turbo boost modes that didn't exist yet, the 2500K would probably turbo to 4.5GHz and the 2600K 4.6GHz. The 2500K would probably do 100MHz higher than the 2600K because of HT, but it'd be sold 100MHz lower (it's Intel's practice and one that makes sense in modern marketing - product differentiation is important). Back when Intel's 2nd gen was current, programs were, for the most part, optimized to run on single core systems, so the real world benefit of going from 2500K to 2600K was smaller and less frequently felt by users - the 100MHz speed difference kept the 2600K the clear winner (though performance was so close, I doubt anyone running a 2600K would've noticed a difference in their day to day computing if someone swapped out their CPU with a 2500K).
 

·
Iconoclast
Joined
·
33,714 Posts
Read the article, they have other benchmarks. There are also other reviews that support AMD's IPC gain claims.
I know they have other benchmarks and I'm not discounting AMD's (13% mean) IPC claim. I'm pointing out that a ~30% IPC gain in y-cruncher, which is the only thing shown in that chart, is an outlier.

The IPC gain from 5000 to 7000 is in line with the gains from 1000 to 2000 to 3000 to 5000, if not a bit more.
AMD claimed a mean gain of 15% with Zen(+) to Zen 2, 19% with Zen 2 to Zen 3, and 13% with Zen 3 to Zen 4. It's in a similar ballpark, but it's definitely less, overall, especially if the AVX-512 outliers are trimmed.
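
For a rough sense of scale, here is a minimal sketch that compounds the three claimed gains quoted above and compares each generation to the geometric mean. The only inputs are the 15%, 19% and 13% figures; the dictionary labels are just for readability.

```python
import math

# AMD's claimed mean IPC gains per generation, as quoted above
claimed = {"Zen(+) -> Zen 2": 15, "Zen 2 -> Zen 3": 19, "Zen 3 -> Zen 4": 13}

factors = [1 + pct / 100 for pct in claimed.values()]
cumulative = math.prod(factors)                  # total claimed gain, Zen(+) to Zen 4
geo_mean = cumulative ** (1 / len(factors))      # average per-generation factor

print(f"cumulative: {(cumulative - 1) * 100:.1f}%")
print(f"per-generation geometric mean: {(geo_mean - 1) * 100:.1f}%")
for step, pct in claimed.items():
    print(f"{step}: {pct}% ({pct - (geo_mean - 1) * 100:+.1f} vs mean)")
```

By this arithmetic the Zen 3 to Zen 4 step sits a couple of points under the average of the three claimed steps, before any trimming of AVX-512 outliers.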

Also, some "IPC" improvements aren't technically "IPC". New instructions used to do the same work in less time, aren't IPC. You can't work them into a figure reflective of computing anything other than that specific (or type of) task
Anything that results in more instructions being retired over a given number of clock cycles is technically IPC. Doesn't matter if it comes from a wider front-end, larger buffers, new instruction sets, faster memory, SMT, or whatever, it's still IPC. Faster memory improves IPC by reducing the number of cycles spent waiting on memory access. SMT improves IPC by ramming more work into gaps in the execution pipelines, etc and so forth.

Yes, IPC is highly subjective to the task at hand, with or without simply reducing it to performance per clock.
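
To illustrate the memory point with numbers: a minimal sketch where IPC is instructions retired divided by total cycles, and total cycles include time spent stalled on memory. All cycle counts here are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def ipc(instructions: int, busy_cycles: int, stall_cycles: int) -> float:
    """Instructions retired per clock cycle, counting memory stalls as cycles."""
    return instructions / (busy_cycles + stall_cycles)


# Hypothetical workload: same core, same instruction stream, two memory setups
instructions = 1_000_000
busy_cycles = 400_000

slow_mem = ipc(instructions, busy_cycles, stall_cycles=200_000)
fast_mem = ipc(instructions, busy_cycles, stall_cycles=100_000)

print(f"slower memory: {slow_mem:.2f} IPC")
print(f"faster memory: {fast_mem:.2f} IPC")
```

Nothing about the core changed between the two runs, yet the measured IPC did, which is why the figure depends so heavily on the task and the platform.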
 

·
Because I was inverted...
Joined
·
1,094 Posts
AMD, WE DO NOT WANT THIS FOR OUR NEW CHIPS.
AMD does not care at all what you, or "we" want. PC gamers / PC owners are not their primary customer. You are forgetting what Ryzen is, and what it isn't.

Since literally day one, ALL Zen cores are designed as server cores; then they overvolt them, boost the clocks up, and make the best desktop parts they can out of them. From the Zeppelin dies on Zen 1 to the brand new Zen 4 chiplets, they are all the exact same dies that go in the Epyc server CPUs. Now if you keep in mind the fact that the CPUs are designed for servers, and that is AMD's #1, #2, and #3 priority, with laptops and mobile sitting at #4 and custom parts like Xbox sitting at #5, then you start to understand why the desktop parts are the way they are: why things like X399 were never fixed, why X570 was never fixed, why the desktop BIOS is always a disaster, why Infinity Fabric is an ancient through-the-package interconnect and runs STUPID slow at only 2000-2133MHz. So why hasn't AMD put any effort into making a better interconnect? Simple: Epyc doesn't need a faster interconnect. Servers run a lot of slow memory (by desktop standards). Fast memory clocks with tight timings are not the priority on servers. Thus AMD does not care that the maximum possible bandwidth of the Infinity Fabric is a bottleneck. Cache is king, both on server parts and desktop parts, so guess what we see? More cache, 3D stacked cache (which was 100% developed for the server market, not the desktop market).

Ryzen is simply something they do with excess parts to make extra money. Desktop Ryzen is AMD's side hustle. Once you remember that, understand that desktops and gaming are AMD's lowest priority, and lower your expectations accordingly, you will be much happier.

Zen 4 is the way that it is because it is what AMD needed for Epyc. Nothing more and nothing less.
 

 

·
Registered
Joined
·
1,064 Posts
Discussion Starter · #13 ·
Anything that results in more instructions being retired over a given number of clock cycles is technically IPC. Doesn't matter if it comes from a wider front-end, larger buffers, new instruction sets, faster memory, SMT, or whatever, it's still IPC. Faster memory improves IPC by reducing the number of cycles spent waiting on memory access. SMT improves IPC by ramming more work into gaps in the execution pipelines, etc and so forth.

Yes, IPC is highly subjective to the task at hand, with or without simply reducing it to performance per clock.
We're saying the same thing about it - if the way the information is processed hasn't changed, sometimes a task is helped by memory bandwidth, and other times it's not memory bandwidth limited, so there is no improvement. If the task is completed in less time once the data has arrived for processing, then you have an architecture upgrade - a real, repeatable, non-specific increase in processing speed.
 

·
Registered
Joined
·
1,064 Posts
Discussion Starter · #14 ·
AMD does not care at all what you, or "we" want. PC gamers / PC owners are not their primary customer. You are forgetting what Ryzen is, and what it isn't.

Since literally day one, ALL Zen cores are designed as server cores; then they overvolt them, boost the clocks up, and make the best desktop parts out of them. From the Zeppelin dies on Zen 1 to the Zen 4 chiplets, they are all exactly the same dies that go in the Epyc server CPUs. Now if you keep in mind the fact that the CPUs are designed for servers, and that is AMD's #1, #2, and #3 priority, then you start to understand why the desktop parts are the way they are: why things like X399 were never fixed, why X570 was never fixed, why the desktop BIOS is always a disaster, why Infinity Fabric is an ancient through-the-package interconnect and runs STUPID slow (at only 2000-2133MHz? 6 years later, and we are still at only 2000-2133MHz!), etc. etc.

Ryzen is simply something they do with excess parts to make extra money. Desktop Ryzen is AMD's side hustle. Once you remember that, understand that desktops and gaming are the lowest priority, and lower your expectations accordingly, you will be much happier.

Zen 4 is the way that it is because it is what AMD needed for Epyc. Nothing more and nothing less.
Damn them lol. They do customize the parts for desktop to an extent though, like Intel uses the same core architectures for server/workstation/PC. They should do more - re-invest into development. What helps desktop has the potential to help server, so why not? Their stock is sky-high so it's not like they don't have the money.
 

·
Because I was inverted...
Joined
·
1,094 Posts
Damn them lol. They do customize the parts for desktop to an extent though, like Intel uses the same core architectures for server/workstation/PC. They should do more - re-invest into development. What helps desktop has the potential to help server, so why not? Their stock is sky-high so it's not like they don't have the money.
Not really. The Intel desktop dies are purpose-built custom desktop parts. With AMD, the only desktop parts that are physically customized are the APUs.

The Zen 1/1+/2/3/4 dies are not customized desktop parts at all; chiplets from the same wafer will end up in desktop, Threadripper, and Epyc SKUs.
 

·
Registered
Joined
·
1,064 Posts
Discussion Starter · #16 · (Edited)
Not really. The Intel desktop dies are purpose-built custom desktop parts. With AMD, the only desktop parts that are physically customized are the APUs.

The Zen 1/1+/2/3/4 dies are not customized desktop parts at all; chiplets from the same wafer will end up in desktop, Threadripper, and Epyc SKUs.
Yes Intel's are purpose built, but I'd wager 99%+ of the transistor arrangements within cores of the same type are the same whether they're in a server, workstation, or PC. Just because Intel's designs are monolithic and have to be manufactured independently doesn't mean they've been customized much. Intel does AMD's external connections internally, on the same piece of silicon. Intel's larger server parts use mesh instead of ring interconnect, but the differences past that (and the bigger memory controller /more PCIe lanes) are next to nil.

But I get what you mean - AMD made no such effort. And it's sad because when you consider how much work went into everything else, it's really not much effort at all, and the potential upside is huge. Maybe not huge, but it's there. AMD has substantial financial fortitude to fund development that could improve things. I still think AMD should invest in this customization.

They did make the X3D chip. You could argue that the cache is just one thing, and that one thing's just cache. But all other customizations are just things too. No, they didn't take it far with the 5800X3D. How far the same thing will get them this generation now that they've got higher memory bandwidth and possibly an upgraded memory controller remains to be seen. SuperPi 1M results aren't encouraging
 

·
Premium Member
Joined
·
864 Posts
But I get what you mean - AMD made no such effort. And it's sad because when you consider how much work went into everything else, it's really not much effort at all, and the potential upside is huge. Maybe not huge, but it's there. AMD has substantial financial fortitude to fund development that could improve things. I still think AMD should invest in this customization.
The fact that they're able to produce the same CCD and use it from the bottom to the top of their CPU product lines is why AMD is doing so well. Just being competitive with Intel and moving product is a success; it means that they're not giving away parts (ahem, Bulldozer) and can build their technology. And really, the biggest thing they've done is put graphics on their 7000-series I/O dies. That's not a killer feature for some enthusiasts, or at least hasn't been, but with Intel doing it for every consumer-facing SKU by default, that's one less thing to worry about and one reason to not skip AMD.

And remember - any 'customization' AMD makes has a cost, if they use the same CCD everywhere except for APUs. It's reasonable to assume that AMD has run the numbers for their supply chain - i.e., TSMC - and Intel has run the numbers for their internal supply chain, and both arrived at their current strategies rationally.

They did make the X3D chip. You could argue that the cache is just one thing, and that one thing's just cache. But all other customizations are just things too. No, they didn't take it far with the 5800X3D. How far the same thing will get them this generation now that they've got higher memory bandwidth and possibly an upgraded memory controller remains to be seen. SuperPi 1M results aren't encouraging
I've heard that the X3D CCD is also a direct transplant from their Epyc and now TR Pro line - it's just cache, and seeing that the 7700X is pushing similar minimum frametimes as the 5800X3D (which Intel will probably match with 13th gen), there may not be much benefit to a hypothetical 7800X3D. Need to keep in mind that DDR5, while having higher 'access' latencies, is far lower latency overall than DDR4, and also while at these speeds, latency decreases quickly with increases in clockspeed.
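
On the latency point, one piece that is easy to put numbers on is CAS latency in nanoseconds, which is the CL cycle count divided by the memory clock (half the data rate). A minimal sketch follows; the three kits are hypothetical examples picked for illustration, not anything tested in this thread, and this covers CAS latency only, not total access latency as seen by the CPU.

```python
def cas_latency_ns(cl: int, data_rate_mts: int) -> float:
    """CAS latency in ns: CL cycles at a memory clock of data_rate / 2 MHz."""
    return cl * 2000 / data_rate_mts


# Hypothetical example kits: (label, CL, data rate in MT/s)
kits = [
    ("DDR4-3200 CL16", 16, 3200),
    ("DDR5-4800 CL40", 40, 4800),
    ("DDR5-6000 CL30", 30, 6000),
]

for label, cl, rate in kits:
    print(f"{label}: {cas_latency_ns(cl, rate):.2f} ns")
```

With these example timings, the slower DDR5 kit has a noticeably higher CAS latency in nanoseconds than the DDR4 kit, while the faster DDR5 kit matches it. That is the "latency decreases quickly with clockspeed" part of the argument.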

I'm honestly wondering if there's any more for AMD to do here, aside from cut prices. Wait for allocation for TSMC's next process shrink? Increase native cache on the cores themselves? Faster DDR5?
 

·
Because I was inverted...
Joined
·
1,094 Posts
Yes Intel's are purpose built, but I'd wager 99%+ of the transistor arrangements within cores of the same type are the same whether they're in a server, workstation, or PC. Just because Intel's designs are monolithic and have to be manufactured independently doesn't mean they've been customized much. Intel does AMD's external connections internally, on the same piece of silicon. Intel's larger server parts use mesh instead of ring interconnect, but the differences past that (and the bigger memory controller /more PCIe lanes) are next to nil.

But I get what you mean - AMD made no such effort. And it's sad because when you consider how much work went into everything else, it's really not much effort at all, and the potential upside is huge. Maybe not huge, but it's there. AMD has substantial financial fortitude to fund development that could improve things. I still think AMD should invest in this customization.

They did make the X3D chip. You could argue that the cache is just one thing, and that one thing's just cache. But all other customizations are just things too. No, they didn't take it far with the 5800X3D. How far the same thing will get them this generation now that they've got higher memory bandwidth and possibly an upgraded memory controller remains to be seen. SuperPi 1M results aren't encouraging
More memory bandwidth doesn’t do them much good when IF is the limiting factor.
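
To put rough numbers on the Infinity Fabric point, here is a minimal sketch. It assumes a 32 byte per clock read link between one CCD and the I/O die (a commonly cited figure that does not come from this thread) and a hypothetical dual-channel DDR5-6000 kit. Only the 2000-2133MHz fabric clock range comes from the posts above.

```python
def dram_bandwidth_gbs(data_rate_mts: int, bus_bytes: int = 16) -> float:
    """Peak DRAM bandwidth in GB/s for a 128-bit (16-byte) dual-channel bus."""
    return data_rate_mts * bus_bytes / 1000


def fabric_read_gbs(fclk_mhz: int, bytes_per_clock: int = 32) -> float:
    """Peak read bandwidth in GB/s of one fabric link (assumed 32 B/clock)."""
    return fclk_mhz * bytes_per_clock / 1000


dram = dram_bandwidth_gbs(6000)      # hypothetical DDR5-6000 kit
fabric = fabric_read_gbs(2000)       # fabric clock from the thread

print(f"DRAM peak:        {dram:.1f} GB/s")
print(f"fabric link peak: {fabric:.1f} GB/s")
print(f"limiting factor:  {'fabric' if fabric < dram else 'DRAM'}")
```

Under those assumptions a single link tops out below what the memory can deliver, so faster RAM alone would not raise the ceiling for one CCD.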

If AMD made something like Intel’s EMIB I’d be a lot more excited about them.

I am absolutely ready for Sapphire Rapids-X
 

·
Registered
Joined
·
1,064 Posts
Discussion Starter · #20 ·
It's reasonable to assume that AMD has run the numbers for their supply chain - i.e., TSMC - and Intel has run the numbers for their internal supply chain, and both arrived at their current strategies rationally.
Yes, they might've, but what they should do is take some of the capital gained from investors and use it for R&D to improve their products. When people buy stocks (shares of AMD), the shares AMD owns of their own company increase in value. Part of the reason those people invested is because they want and believe AMD will do well, and they contributed capital for them to do so. When they buy shares, share price goes up, and the company is worth more. This is reflected in the share price (not entirely, not immediately, and to varying degrees depending on investor confidence), but the relationship is there, and it relates directly to AMD's worth derived from the shares it owns of itself.

Anyway, if AMD takes money from investors by selling its shares at the higher price the investors (in part) bid it up to, and uses it to develop their product line into something even better than it is, that should increase their sales, which, eventually will cause the share price to increase well past the maybe 1% reduction AMD might have caused by raising capital by selling those shares it owned of itself. That's what's supposed to happen anyway. That's how it generally works, except there are other market forces now which prop things up too, but that's not our focus right now.

I've heard that the X3D CCD is also a direct transplant from their Epyc and now TR Pro line - it's just cache
Interesting, I had no idea.

I'm honestly wondering if there's any more for AMD to do here, aside from cut prices. Wait for allocation for TSMC's next process shrink? Increase native cache on the cores themselves? Faster DDR5?
There's always something to do if you don't want stagnation. Did AMD design the essence of what makes up the Zen line to be a viable framework for a few generations? Did AMD deliberately hold back in the beginning like Intel did with Sandy Bridge? Is AMD facing problems with its Infinity Fabric not increasing in performance as it might have been expected to with new nodes? Only they know, and hopefully they're doing the right thing. After suffering for so long I hope AMD hasn't been affected in ways that ultimately make it operate suboptimally (regarding development)
 