Now this inter-CCX latency is almost 3 times higher than Zen 4's. This is probably the root of all these multi-core performance problems, because it increases core-to-core communication time and decreases chip-to-chip bandwidth, and on top of that it increases the RAM latency too.
There is no inter-CCX latency on the single CCX parts, but odd performance issues remain. Memory latency is also not significantly different between Raphael and Granite Ridge. The inter-CCX latency issue is just one problem of many, and is probably more of a symptom than a cause.

all they needed to do to bandwidth-unlock the 8c/16t part was dump the single CCD...
Splitting an 8c/16t part into two CCXes would hit performance far harder, far more often, than the single-CCD bandwidth limitations do.
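Rough numbers for context, a minimal back-of-the-envelope sketch assuming the commonly quoted GMI link widths of 32 bytes/clk read and 16 bytes/clk write per CCD (not figures confirmed by AMD documentation):

```python
# Back-of-the-envelope: single-CCD fabric bandwidth vs. dual-channel DDR5.
# Assumed link widths (32 B/clk read, 16 B/clk write per CCD) are the
# commonly quoted GMI figures, not numbers confirmed by AMD.

GB = 1e9  # decimal GB, the way memory bandwidth is usually quoted

def fabric_bw_gbs(fclk_mhz: float, bytes_per_clk: int) -> float:
    """Peak fabric bandwidth for one CCD link at a given FCLK."""
    return fclk_mhz * 1e6 * bytes_per_clk / GB

def dram_bw_gbs(mt_per_s: float, channels: int = 2, bus_bytes: int = 8) -> float:
    """Theoretical peak DRAM bandwidth (2x 64-bit channels by default)."""
    return mt_per_s * 1e6 * channels * bus_bytes / GB

for fclk in (2000, 3000):
    print(f"FCLK {fclk}: ~{fabric_bw_gbs(fclk, 32):.0f} GB/s read, "
          f"~{fabric_bw_gbs(fclk, 16):.0f} GB/s write per CCD")
print(f"DDR5-6000 dual channel: ~{dram_bw_gbs(6000):.0f} GB/s theoretical")
```

Under those assumptions a single CCD at FCLK 2000 tops out around 64 GB/s read against ~96 GB/s of theoretical DDR5-6000 bandwidth; raising FCLK to 3000 would roughly close that gap.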

Yeah, but that's a big change.
Keeping the Ryzen 7000 architecture and power limits, moving to 4 nm, and upgrading the memory controller to allow for 3000 FCLK would have made an amazing product.
It would need to be monolithic and would need to be something significantly different from the current Fabric implementation. A hypothetical monolithic desktop part built from the ground up probably wouldn't even connect the memory controller with Fabric; they'd give it its own stop on the ring bus.

Infinity Fabric exists for the same reason chiplets exist: to make designs more modular and ultimately cheaper. There is a whole lot that is within the realm of technical possibility that will never happen because it would be a massive waste of money for AMD (or whoever). Replacing the IOD, or moving to a new substrate/interconnect, or spending their expensive TSMC wafer allotments building an entire die flavor just to make a handful of enthusiasts happy, or to win a few client benchmarks, doesn't jibe with running a profitable business.
 
so I had this funny feeling testing the 9950X ES... and I was right.

AMD made so many improvements for server that they basically crippled the single-CCD parts even more... so much that they probably should not exist...

4c/4c, theoretically what the 9700X should have been. @kailz, what was your best balls-out 1b?

View attachment 2669622
16,4s with AVX512 Cannonlake. The Zen 5 workload is slower on the single-CCD parts ;(

Keeping them in :D it's just PBO and 1,43v you know :) Best it can do is around 5550 effective on this cooling; I didn't delid the CPU either. I think MAX 1b with LN2 will be 14 seconds, so not HWBOT-worthy compared to the 12900K AVX for it.
 
Maybe one day I can beat my 12900KF YC Pi 1b, but I need 1,2 seconds or so and better cooling.

+unlocked PMIC :D


Probably can shave it under 15s; this was just testing my new kit, 7400C36 Patriots.
 
I'm a Karhu addict, I just want to match my Intel 450 MB/s output. Is that too much to ask? 🤣
 
y-cruncher is bandwidth limited, but y-cruncher is an outlier. What real-world tasks are going to perform better on two four-core CCXes than one eight-core?

'In the olden days' (2018), I built a TR 2950X system; it had not-so-hot inter-CCX latency - HOWEVER, one could switch the TR 2950X NUMA modes, which gave some big improvements in latency. Something similar for the Granite Ridge parts could be useful, no?
You can set most any Ryzen setup to split the CPU into NUMA nodes based on L3 caches. It doesn't do anything to inter-CCX latency, it just convinces most OS schedulers to keep within a node as much as possible.

Older Threadrippers had different memory controllers on different dies and using NUMA could reduce memory latency because anything running on one CCX wouldn't touch a non-local memory controller unless absolutely necessary. This doesn't apply to any AM4 or AM5 CPU because none of them have this topology.
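If anyone wants to sanity-check which cores share which L3 before flipping that option, the CCX boundaries are just the L3 sharing domains. A minimal sketch, assuming a Linux box with the standard sysfs cache topology files (the BIOS option name varies by board, often something like "ACPI SRAT L3 Cache As NUMA Domain" under AMD CBS):

```python
# Minimal sketch: group logical CPUs by the L3 cache they share.
# On Ryzen each group corresponds to one CCX; with "L3 as NUMA domain"
# enabled, each group should also show up as its own NUMA node.
import glob
import os
from collections import defaultdict

def l3_groups() -> dict[str, set[int]]:
    groups: dict[str, set[int]] = defaultdict(set)
    for cpu_dir in glob.glob("/sys/devices/system/cpu/cpu[0-9]*"):
        cpu = int(os.path.basename(cpu_dir)[3:])
        for idx in glob.glob(os.path.join(cpu_dir, "cache/index*")):
            try:
                with open(os.path.join(idx, "level")) as f:
                    if f.read().strip() != "3":
                        continue  # only interested in L3
                with open(os.path.join(idx, "shared_cpu_list")) as f:
                    groups[f.read().strip()].add(cpu)
            except FileNotFoundError:
                continue  # some CPUs may not expose cache info
    return groups

if __name__ == "__main__":
    for shared, cpus in sorted(l3_groups().items()):
        print(f"L3 shared by CPUs {shared}: {sorted(cpus)}")
```

On a dual-CCD part this should print two groups; once the BIOS option is set, `numactl --hardware` should report matching NUMA nodes.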
 
y-cruncher just prefers latency.

I can still get it fast with 2x CCD @ 1:1.
 
View attachment 2669638

This is on a slow version of Windows, need W11 for it, but I was trying some stuff yesterday with just PBO +200 and -45 all-core.
Once I set a static OC, for example 5600 and 1,35v, it's way slower. And I need more cooling for 5675+ static.
Yes, those were my findings and why I felt the need to mention it when you beat my single-CCD 1b.

Unfortunately, until AMD fixes AGESA and stops thinking I'm overheating and throttling back my CPU, I can't optimize anything sub-zero for actual "performance". So I have just used it to gauge how I want to run benchmarks based on what gains I can measure while static. Static is at least good for that: when static, nothing jumps around. A tune is faster or it isn't.

That wasn't an upgrade, kaliz. That thing has just been dusted off and ghetto-mount adapted. I had it way back on Deneb 😉
 
...you can borrow mine - this is the 8000 one, have an extra for 8200 :p

View attachment 2669640




@Blameless - yeah, on my TR 2950X the latency gain via NUMA/UMA was something like (-)12 ns, mid 70s to low 60s
 
V-cache CPU... try non-V-cache vs non-V-cache.

Also we are talking 8c rankings.

I'm already 10.777 16c untuned, just flipping benches back to back.
 
Mate, if you lower this score by 600ns it's a top score already for that part, but on the 9950X ES I saw Dom getting 10,8s!

 
About to put retail under ice. Maybe I'll try not running the default 1b config and tuning it instead 😁

 
Regarding that ARdPtrInitValMP0/1 setting, it's been around since AM4.

I have it set to "0" on all my saved AMD NVRAM dumps going back to at least my 3900X. IIRC, I first noticed it in an MSI board that exposed a bunch of settings that MSI gave silly brand-specific names, and I noted that it improved memory benches slightly, so I started looking for it in my other boards (it's hidden on most of them but editable via AMISCE or other tools). It's also enabled on my 7800X3D system, which explains why I haven't been seeing any advantage to 2:3 ratios on any of my AM5 setups.
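If anyone wants to hunt for it on their own board, here's a minimal sketch of the kind of thing I mean, assuming you've already exported the setup NVRAM to a text file with AMISCE/SCEWIN or whatever your vendor tool produces (the filename below is just a placeholder):

```python
# Minimal sketch: search an exported BIOS setup/NVRAM text dump for the
# ARdPtrInitValMP0/1 settings. "nvram_dump.txt" is a placeholder name;
# produce the dump yourself with AMISCE/SCEWIN or a similar tool first.
import re
import sys

path = sys.argv[1] if len(sys.argv) > 1 else "nvram_dump.txt"
pattern = re.compile(r"ARdPtrInitValMP[01]", re.IGNORECASE)

with open(path, errors="replace") as f:
    for lineno, line in enumerate(f, start=1):
        if pattern.search(line):
            print(f"{lineno}: {line.rstrip()}")
```

Whatever the dump format, the setting name should appear verbatim, so a plain text search is usually enough to confirm whether your board exposes it.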
 