Overclock.net - An Overclocking Community

The Stilt 07-07-2019 06:01 AM

Strictly technical: Matisse (Not really)
 
07/08/2019 6:33 PM (GMT) - Update on the bios issue on the Crosshair VIII Hero motherboard ("the thing").

Earlier today I received a response to my inquiries from ASUS. The response was rather technical and I cannot go into the specifics of what exactly it involved.
However, it confirmed my suspicions of what actually caused the observed anomalies. Long story short: a mistake was made and it has affected the results of multiple reviewers, including my own. In my own case, I ended
up discarding my affected multithreaded results altogether, before even releasing them. I'm still angry because a lot of my own and other people's work went to waste because of it. But like I said, mistakes do happen.
In this case all of the evidence and known facts suggest that this was indeed a mistake, caused by an extremely tight schedule and miscommunication between several different parties. In fact, all of the facts I can personally verify
indicate that despite the rather suspicious way this mistake happened, there never was any malicious intent involved.

ASUS also provided me with new bios versions for both the Crosshair VIII Hero and Formula boards, which correct the mistake made in the bios builds newer than the AMD-approved 0066.
Based on my own testing done on the 3900X SKU, the CPU now meets its specification in terms of the allowed power consumption (the same way as the approved 0066 build did). The new build has not yet been validated, so
it will take some time until its changes are reflected in the builds available to the larger audience.

What kind of effects will the fixed bioses have then?

Based on my own testing (do note that silicon variation exists and that the sample size is one for 3900X):

- ~27W lower average package power consumption (VDDCR_CPU & VDDCR_SoC, i.e. the main power rails)
- 7°C lower temperature (tDie, while using DeepCool Assassin II cooler)
- < 90MHz average frequency loss across all twelve cores in MT workloads

The above figures were recorded during Blender 2.80b runs, but they should translate almost directly to Cinebench R20 NT as well (based on my experience).

The peak power difference between the faulty and the fixed bioses is around 35W (Prime95).

Although there is no question that a mistake was made, I'd still like to thank ASUS for two specific reasons: they didn't try to deny the existence of the issue (which btw. is the usual response within the industry), and they fixed it immediately.
I also do feel bad for the bios engineer, who had to stay over(over)-time to get the bios build done. Thanks for that. I also have to feel bad for ASUS, because this mistake might have smirched the reputation of their brand new Crosshair VIII -series motherboards.
And make no mistake, these are among the best, if not the best, X570 boards available on the market (a personal opinion).

At this point you should ask yourself if ASUS paid me off?
Everyone can be bought, it's just a matter of the offered sum or bargain. Everyone claiming otherwise either lives in self-deception or frankly, is a moron.
I myself could definitely be bought. And rather cheaply too, I think. The thing is that, at least until writing this, nobody has even tried to do so.

Besides this statement, I also corrected an error AMD pointed out to me.
Although the 3900X CPU has a fused (factory programmed) Fmax ceiling of 4.65GHz, AMD only advertises a 4.60GHz maximum boost.
I must admit that I was initially surprised to see the 3900X having a 4.65GHz fused maximum boost limit, since AMD indeed only mentions 4.60GHz in their marketing materials.
Nevertheless, I have yet to reach the advertised 4.6GHz either, so in that regard the only thing which changes is the CPU falling 25MHz short instead of 75MHz short of its advertised frequency.
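
For reference, the 25MHz and 75MHz figures follow directly from the numbers in this post. A minimal sketch, assuming the 4.575GHz peak reported for the two best cores of the tested 3900X sample is the observed maximum (the function and constant names are just illustrative):

```python
# All frequencies in MHz, taken from the figures quoted in the post.
FUSED_FMAX_MHZ = 4650        # fused (factory programmed) Fmax ceiling
ADVERTISED_BOOST_MHZ = 4600  # maximum boost AMD advertises
BEST_OBSERVED_MHZ = 4575     # peak of the two best cores on the tested sample


def shortfall_mhz(reference_mhz, observed_mhz):
    """How far the observed peak falls short of a reference frequency."""
    return reference_mhz - observed_mhz


print(shortfall_mhz(FUSED_FMAX_MHZ, BEST_OBSERVED_MHZ))        # 75
print(shortfall_mhz(ADVERTISED_BOOST_MHZ, BEST_OBSERVED_MHZ))  # 25
```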

-------------------------------------------------------------------

First and foremost, a word of warning. When reading ANY of the AMD Ryzen 3000-series "Matisse" launch-day reviews, the first thing you should do is navigate to the page which lists the hardware setups.
AMD supplied four different motherboards to the media, one each from ASRock, ASUS, GIGABYTE and MSI. In case of the ASUS Crosshair VIII Hero Wi-Fi motherboard, the media was instructed to use the 0066 bios build,
which had been vetted and approved by AMD. However, newer bios builds were available and ASUS has also (allegedly) told the media to use those versions. What exactly has transpired here is still under investigation,
but regardless of the actual reasons behind it, the consequences might be rather significant. In practical terms, all reviews which were done on ASUS Crosshair VIII Formula or Hero motherboards using a bios build other than 0066 must
be considered invalid, at least partially. Reviews using other ASUS motherboard models (not provided by AMD) are under suspicion as well.


A few days ago, I noticed certain anomalies while measuring the power consumption of the different Matisse SKUs. Inspection of the power management parameters revealed no issues which could have explained those anomalies.
The external power measurements (VRM DCR) revealed that the CPU was consuming significantly more power than its power management should have allowed it to. I initially suspected that this was AMD's own doing, in an effort
to boost the performance of the new CPUs even further, but further investigation indicated otherwise.

AMD had no part in it, and the actions by ASUS are the sole reason behind it. The investigation revealed that ASUS is altering one or more power
management parameters of the CPU, causing it to believe it consumes less power than it actually does. As a result, the frequencies will be higher than the actual power budget would normally allow. Tricks like this are pretty much a common (mal)practice these
days; however, there is a good reason why this must be considered worse than the others: this "thing" is completely undetectable without external measurements and rather deep knowledge, and there is no way to disable it either.
Even a person such as myself, who can control most things on these platforms, cannot disable this "thing". As you may notice, at the moment I call this issue the "thing", since I'm giving ASUS the benefit of the doubt.
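
To illustrate why under-reported power leads to higher real consumption, here is a rough sketch. It is not a description of what the bios actually changed (that has not been disclosed). It only assumes, hypothetically, that the power manager sees some fixed fraction of the real package power and keeps the seen value at the limit. The function name, the 100W limit and the 0.8 fraction are all made-up illustration values.

```python
def actual_power_at_limit(power_limit_w, reported_fraction):
    """Real package power when the *reported* power sits exactly at the limit.

    reported_fraction is a hypothetical ratio of reported power to real
    power (1.0 = honest telemetry, below 1.0 = under-reporting).
    """
    if not 0 < reported_fraction <= 1:
        raise ValueError("reported_fraction must be in (0, 1]")
    return power_limit_w / reported_fraction


# Hypothetical example values, not measurements from this post.
limit_w = 100.0
honest = actual_power_at_limit(limit_w, 1.0)
skewed = actual_power_at_limit(limit_w, 0.8)
print(f"honest telemetry: {honest:.1f} W, under-reporting: {skewed:.1f} W")
```

The power manager sees the same value at the limit in both cases, which is why external measurements are needed to notice the difference.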

The release schedule of Ryzen 3000-series CPUs was rather ridiculous to begin with for two reasons. The retail (or PR, production ready) silicon has been available for at least two months, and relatively finished motherboard designs even longer than that.
Yet AMD had decided to enforce EXTREMELY strict control (NLTR, nothing leaves the room) over the silicon samples. I could have had several different X570 motherboard models months ago, but I managed to lay my hands on the first CPUs just three weeks ago (give or take). The actual CPU samples were distributed to the media just six days prior to the launch date.

Due to the extremely tight schedule, I have worked around 16 hours per day, for the last couple of days. There is nothing I hate more in this world than seeing my work being wasted.
This time a substantial part of it was wasted because of something I had no control over. Unless ASUS can clearly prove that this "thing" happened due to a human error and wasn't intentional, I have to reconsider my relations with them.
Mistakes do happen, but regardless of the actual reasons behind it, it definitely shouldn't have happened.

Although AMD instructed the media to use the approved 0066 bios build with the Crosshair VIII Hero, at the moment I have no idea how many of the reviewers ended up following those instructions and how many thought it would be a good idea to use the latest build (which in case of a new platform, it most often is). Potentially this "thing" might have caused significant financial losses as well, in terms of additional salaries required to get the products re-tested with proper settings.

So then, what is affected? Technically every scenario on every Ryzen 3000-series SKU, which might be power limited. Purely single threaded workloads are fine, as well as at least most of the pure gaming tests.
However, every multithreaded CPU workload / benchmark must be considered invalid, if the ASUS Crosshair VIII Hero with any bios version other than 0066 was used as the platform.

I used the Crosshair VIII Formula for my tests, and since this model wasn't supplied to the media by AMD, there was no "official" (i.e. vetted and approved) bios build for it either.
In my case I ended up discarding all of my multithreaded results. Since the Ryzen 3000-series multithreaded results were invalid, there was no point in keeping the multithreaded results for the other platforms either.
Since single threaded workloads are never power limited, these results were fine. In case of testing the SMT-yield on different architectures, the power limits were disabled anyway to avoid any potential biasing, so these results are included as well.

I originally intended to provide a lot more, but unfortunately the reality is that there was never enough time to do it all. The various different issues on several platforms and the "thing" (which was confirmed only yesterday) didn't help things either.
The issues with AGESA cross-compatibility also prevented testing the SMT-yield on Pinnacle Ridge. Because of that, I only provide the figures for Matisse, Coffee Lake Refresh and Skylake-X.

Test setups

AMD Ryzen 7 2700X (IPC / SMT = 3.800GHz fixed)
ASUS ROG Crosshair VIII Formula (Bios 0605, µCode 0x0800820D 4/16/2019, SMU 43.22)
2x16GB Corsair LPX 3333C16 running at 2666MHz 12-12-12-28 (IPC)
Deepcool Assassin II cooler
Windows 10 Education 10.0.18362.175

AMD Ryzen 7 3700X (IPC / SMT = 3.800GHz fixed, SKU-SKU ST up to 4.40GHz)
ASUS ROG Crosshair VIII Formula (Bios 0605, µCode 0x08701013 6/11/2019, SMU 46.37)
2x16GB Corsair LPX 3333C16 running at 2666MHz 12-12-12-28 (IPC & SMT), 3200MHz 14-14-14-32, 1:1 FCLK (SKU-SKU ST)
Deepcool Assassin II cooler
Windows 10 Education 10.0.18362.175

AMD Ryzen 9 3900X (SKU-SKU ST up to 4.6GHz)
ASUS ROG Crosshair VIII Formula (Bios 0605, µCode 0x08701013 6/11/2019, SMU 46.37)
2x16GB Corsair LPX 3333C16 running at 2666MHz 12-12-12-28 (IPC & SMT), 3200MHz 14-14-14-32, 1:1 FCLK (SKU-SKU ST)
Deepcool Assassin II cooler
Windows 10 Education 10.0.18362.175

i9-9900K (IPC / SMT = 3.800GHz fixed, SKU-SKU ST = 1C = 5.00GHz, 2C = 5.00GHz as defined by the fuses), ring offset -3.
ASUS ROG Strix Z390-E Gaming (Bios 1005, modified with µCode 0xBE 5/17/2019 includes all available mitigations, ME FW 12.0.40.1433)
All scenarios: 2x16GB Corsair LPX 3333C16, running at 2666MHz 12-12-12-28
Deepcool Assassin II cooler
Windows 10 Education 10.0.18362.175

i9-9920X (IPC / SMT = 3.800GHz fixed, SKU-SKU ST = 1-2C = 4.5GHz (TBM, SSE), 1-2C = 3.9GHz (AVX2), 1-2C = 3.7GHz (AVX512) as defined by the fuses), mesh 2.4GHz.
ASUS ROG Rampage VI Apex (Bios 1705, modified with µCode 0x2000005E 4/2/2019 includes all available mitigations, ME FW 11.11.65.1590)
All scenarios: 4x16GB Corsair LPX 3333C16, running at 2666MHz 12-12-12-28
Deepcool Assassin II cooler
Windows 10 Education 10.0.18362.175

The IPC

For the first time in over a decade, AMD has reached IPC parity with Intel.
Based on the results of 32 individual workloads, Zen 2 even manages to provide slightly higher average IPC than Coffee Lake-S Refresh.
Thanks to its AVX-512 resources, Skylake-X manages to stay ahead in this test suite, however not by a large margin.



Individual results: https://imgur.com/a/AonND9l

NOTE: The gallery link was updated on 7/9/2019 for the following reason: in case of the tonemapping test, I had misunderstood the actual performance restrictions of the chain.
The original title of the tonemap chart was "ZIMG 2.91"; however, the author pointed out to me that ZIMG itself is not the bottleneck in this case. Therefore the title of the chart has been changed from ZIMG 2.91 to FFMpeg 4.14.
The results (in any regard) are unchanged. The original, mislabeled gallery can be seen here: https://imgur.com/a/LeuwqnD for reference purposes only.

"ER" (Extremities removed):

Pinnacle Ridge - Coffee Lake SR = Particle Force (Hi), Vampire Numbers (Lo)
Pinnacle Ridge - Skylake-X = Linpack (Hi), Vampire Numbers (Lo)
Pinnacle Ridge - Matisse = Particle Force (Hi), Vampire Numbers (Lo)
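
A minimal sketch of how an "extremities removed" average can be computed: drop the single highest and single lowest result of a comparison, then average the rest. The post does not state which kind of average was used, so the arithmetic mean here is an assumption, and the workload names and percentages are made-up placeholders rather than results from the charts.

```python
def average_without_extremes(relative_results):
    """Mean of the results after dropping the highest and lowest entry.

    relative_results maps a workload name to its result relative to the
    baseline, in percent. Returns (mean, removed_high, removed_low).
    """
    if len(relative_results) < 3:
        raise ValueError("need at least three results to remove both extremes")
    high = max(relative_results, key=relative_results.get)
    low = min(relative_results, key=relative_results.get)
    kept = [v for name, v in relative_results.items() if name not in (high, low)]
    return sum(kept) / len(kept), high, low


# Placeholder data (percent of baseline), purely for illustration.
sample = {"workload A": 150, "workload B": 110, "workload C": 120, "workload D": 90}
mean, high, low = average_without_extremes(sample)
print(mean, high, low)  # 115.0 workload A workload D
```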

The SMT-yield



Individual results: https://imgur.com/a/bUgp153

SKU vs. SKU results



Individual results: https://imgur.com/a/y4HAZPF

NOTE: The gallery link was updated on 7/9/2019 for the following reason: in case of the tonemapping test, I had misunderstood the actual performance restrictions of the chain.
The original title of the tonemap chart was "ZIMG 2.91"; however, the author pointed out to me that ZIMG itself is not the bottleneck in this case. Therefore the title of the chart has been changed from ZIMG 2.91 to FFMpeg 4.14.
The results (in any regard) are unchanged. The original, mislabeled gallery can be seen here: https://imgur.com/a/otWpc5H for reference purposes only.

"ER" (Extremities removed):

3700X-9900K = Stockfish (Hi), Vampire Numbers (Lo)
3700X-9920X = Linpack (Hi), Eigen (Lo)
3700X-3900X = Vampire Numbers (Hi), Lame (Lo)

A word regarding the "Auto Overclocking" feature...

The new "auto overclocking" feature, which is advertised with up to 200MHz frequency increase, in reality does close to nothing, at least on higher-end SKUs.
The lower-end SKUs, such as the Ryzen 5 3600, definitely get some advantage; however, the higher-end SKUs such as the 3700X and 3900X can be completely maxed out simply by increasing or removing the power limit (through PBO).
These SKUs are already clocked so high that further frequency improvements theoretically made possible by the "Auto OC" feature are disallowed by the silicon fitness monitoring feature (FIT), due to the required voltage for higher frequencies being too high. For instance,
on the 3700X test sample the best core of the CPU raises its frequency by 25MHz when the highest 200MHz option is selected. The remaining seven cores stay at their default frequency, which varies between 4.35GHz and 4.375GHz.
Meanwhile on the 3900X, which has a stock max boost of 4.6GHz, there are no gains whatsoever. In fact, none of the cores within this CPU even reach the advertised 4.6GHz. The two best cores reach 4.575GHz, while the ten other cores reach 4.325 - 4.4GHz peak. The variation between the different cores even on the same piece of silicon appears to be huge, which would indicate that the process isn't very mature at this point. Even AMD themselves state in their slides that the frequencies are limited by the voltage they can safely feed to the CPU.

The overclocking capabilities

Essentially, if we're talking about the higher-end SKUs, there are basically none.
Based on my experience, the best-case scenario on 6C CCDs (3600, 3600X and 3900X) is around 4.25GHz, at relatively safe voltage levels.
In case of the 3900X, that is provided you can cool a chip with two of those 6C CCDs. On SKUs with 8C CCDs (3700X, 3800X and 3950X) the best case is around 4.15GHz. The 3950X is expected to be thermally limited, as a whole.
The biggest limit is the intensity (heat per area), the second is the voltage you can safely feed to the silicon. For example, the 9900K, which has a reputation of being an inferno, has a theoretical intensity of ~1.15W/mm² when operating at 5.0GHz (200W @ 174mm²).
Meanwhile Matisse can easily reach an intensity of > 1.5W/mm² (120W+ @ 74mm²). The second issue is that beyond ~3.8GHz the V/F curve becomes extremely steep. According to FIT, the safe voltage levels for the silicon are around 1.325V in high-current loads
and up to 1.47V in low-current loads (i.e. ST), depending on the silicon characteristics. Because the stock boost operation is already limited by the silicon voltage reliability, the only way to eke out every last bit of all-core performance is using OC-Mode. Like on previous Ryzen generations, entering OC-Mode also means that you will lose the turbo boost (all cores operate at the same frequency). On the higher-end SKUs, the single threaded performance penalty from doing so will be massive. For example on the 3900X, you'd be trading an additional ~100MHz of all-core frequency for a loss of up to 450MHz in ST frequency. Personally, I advise against overclocking the higher-end SKUs at all, and recommend instead increasing the power limits and trying your luck with the "Auto OC" feature (which most likely isn't beneficial).
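
The heat intensity figures quoted in this section are simply power divided by die area. A small sketch reproducing them from the numbers in the text (the function name is only for illustration, and 120W is the lower bound of the "120W+" given for Matisse):

```python
def heat_intensity_w_per_mm2(power_w, die_area_mm2):
    """Heat intensity as power per unit of die area (W/mm²)."""
    return power_w / die_area_mm2


# 9900K at 5.0GHz: 200W over 174mm²
print(round(heat_intensity_w_per_mm2(200, 174), 2))  # 1.15
# Matisse CCD: 120W (lower bound) over 74mm²
print(round(heat_intensity_w_per_mm2(120, 74), 2))   # 1.62
```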

The V/F testing was done using full resource utilization (FRU), meaning the stability was tested using 256-bit workloads.
Unlike Intel designs, Matisse does not feature an offset for 256-bit workloads. This means that to ensure the stability of the CPU cores in every scenario, they must be tested using this kind of workload.
On Matisse, the delta in power consumption between the scalar and 256-bit vector instructions is massive, as expected (37%). That being said, there seem to be other design related factors limiting the maximum achievable frequency.
Despite significantly lower power consumption and therefore also lower temperatures, stability even in pure scalar workloads could not be achieved at much higher frequencies, compared to the FRU scenario.



Performance per Watt

As expected, Matisse provides significantly higher performance per watt than its competition, thanks to its leading edge 7nm manufacturing process. Some of you might notice that Matisse's power efficiency seems to peak at 3.5GHz, despite the fact that semiconductors do not behave like that. The reason behind this was revealed by Vmin testing, which clearly illustrated that Matisse lacks a fused V/F (voltage-frequency) curve below 3.4GHz. This means that at frequencies below 3.4GHz the voltage is always at the same level as it is at 3.4GHz. The stock (fused) V/F curve appears to be extremely well optimized as well, leaving only the temperature factor on the table.
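
Why a voltage floor produces an efficiency peak can be shown with a toy model. This is only a sketch under loud assumptions: performance is taken as proportional to frequency, power as a constant static part plus a dynamic part proportional to V² × f, and every constant below is made up for illustration (none of them are measured Matisse values). Only the 3.4GHz floor comes from the text.

```python
FLOOR_MHZ = 3400            # below this the voltage no longer drops (from the text)
V_FLOOR = 1.0               # hypothetical voltage at and below the floor, in volts
SLOPE_V_PER_MHZ = 0.00025   # hypothetical V/F slope above the floor
C_DYN = 0.01                # hypothetical dynamic power coefficient, W per (MHz * V^2)
P_STATIC_W = 10.0           # hypothetical constant (static) power


def voltage(freq_mhz):
    """Flat at V_FLOOR below the floor, rising linearly above it."""
    return V_FLOOR + SLOPE_V_PER_MHZ * max(0, freq_mhz - FLOOR_MHZ)


def power_w(freq_mhz):
    """Static power plus dynamic power proportional to V^2 * f."""
    return P_STATIC_W + C_DYN * voltage(freq_mhz) ** 2 * freq_mhz


def efficiency(freq_mhz):
    """Performance per watt, with performance assumed proportional to frequency."""
    return freq_mhz / power_w(freq_mhz)


# Scan 2.0GHz to 4.4GHz in 100MHz steps and pick the most efficient point.
best_mhz = max(range(2000, 4401, 100), key=efficiency)
print(best_mhz)  # 3400
```

In this model, going below the floor only lowers frequency while the fixed power stays, so efficiency drops. Above the floor the rising voltage makes power grow faster than frequency. With these made-up constants the peak lands exactly on the floor; with different constants it can land somewhat above it.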


The Stilt 07-07-2019 09:04 AM

IPC and SMT yield figures in, SKU-SKU ST results later.

lordzed83 07-07-2019 12:57 PM

Quote:

Originally Posted by The Stilt (Post 28029498)
IPC and SMT yield figures in, SKU-SKU ST results later.

Fantastic writeup. Can't wait to see what the 3900X can run at on my cooling. 2700x survived 14 months of constant 24/7 load with 1.425vcore :)

chakku 07-07-2019 01:20 PM

Awesome writeup and interesting analysis on the overclocking.

Is there a chance we could see some CCD latency testing and compare how data hopping from a core on one chiplet to the other compares to a single chiplet processor? Ie how a theoretical 2CCX/2CCX + 2CCX/2CCX 8 core processor would fare against the 3700X's 4CCX/4CCX? (I know this is probably outside the scope of your analysis but it's worth a try since Matisse does change the layout entirely)

Luminair 07-07-2019 02:31 PM

Great work as usual, thanks for investigating the motherboard mystery.

One of the reviewers lost their 3900x while overclocking, so I think it's safe to say that you're right and AMD has left no room to spare.

The Stilt 07-07-2019 02:34 PM

Quote:

Originally Posted by chakku (Post 28030026)
Awesome writeup and interesting analysis on the overclocking.

Is there a chance we could see some CCD latency testing and compare how data hopping from a core on one chiplet to the other compares to a single chiplet processor? Ie how a theoretical 2CCD/2CCD + 2CCD/2CCD 8 core processor would fare against the 3700X's 4CCD/4CCD? (I know this is probably outside the scope of your analysis but it's worth a try since Matisse does change the layout entirely)

I can try testing the latencies tomorrow, I should have a suitable tool somewhere.

chakku 07-07-2019 02:48 PM

Quote:

Originally Posted by The Stilt (Post 28030174)
I can try testing the latencies tomorrow, I should have a suitable tool somewhere.

Awesome, this review appears to suggest that there's no latency difference between CCDs regardless of which chiplet they're on, so it would be good to get another reference point.

Heuchler 07-07-2019 05:39 PM

Quote:

Originally Posted by The Stilt (Post 28030174)
I can try testing the latencies tomorrow, I should have a suitable tool somewhere.

Thanks for doing this. Worth so much more than all the youtube tech guys' reviews combined.

datspike 07-07-2019 05:51 PM

Beautiful work.
Incredible how much more information on 3xxx CPUs I've got from your work than from YouTube nonsense

Diablix6 07-07-2019 10:05 PM

Great review, thank you.


As already said, this post provides more relevant info for us than most of the youtube / standard press.


I wonder if I will need to get a custom loop for the planned 3900X/3950X, or if I can safely work with the stock cooler, since as you and many other reviewers wrote, OC provides almost no benefit. And paying 400+$ for a custom loop seems to be too steep a price for 'bout 50MHz boost over the stock cooler.


All times are GMT -7.