3rd gen. Threadripper CPUs have a power management related quirk that is worth discussing.
Unlike the 3rd gen. Ryzen AM4 CPUs, 3rd gen. TR CPUs do not use telemetry for their power management related decisions.
AM4 CPUs base their decisions (i.e. know their current power consumption) on the current & voltage telemetry (SVI2 TFN), sourced from the motherboard VRM controller.
Meanwhile on the 3rd gen. Threadripper CPUs, this is not the case.
Instead of relying on external telemetry, the power management decisions are based solely on internal calculations (voltage, frequency, utilisation, etc).
I'm not sure what the reasons behind this configuration are, but I'd expect it to have something to do with TR's close relation to Rome EPYC (which uses cLDOs for the main planes) or possibly with the limitations of the AMD SVI2 standard itself.
In practice, this can affect users in a couple of different ways. In my case, even at stock, the power consumption seen by the CPU's power management is slightly inflated, by roughly 6-11% (depending on the workload).
Because the CPU is pegged against the default 280W power limit in most properly multithreaded workloads, this means that some performance is lost.
Granted, the CPU I'm using is an engineering sample, so the inflated power reporting at stock could be an ES related anomaly; those do sometimes exist.
However, the next issue will affect anyone who plans to undervolt the CPU. While undervolting 3rd gen. Ryzens generally isn't recommended, at least in my case there is an undervolting margin of ~50mV to be had, without any performance penalty in any workload.
The issue with undervolting in this internal power management mode is that, unlike with telemetry, the CPU cannot "see" the undervolt. Despite the undervolt, the CPU still calculates the voltage based on its internal targets / models and doesn't acknowledge that the voltage has been reduced.
Because of that, the power consumption seen by the CPU doesn't change either. There will usually be a small performance boost from undervolting, but that's due to the lowered temperature and nothing else.
Here is an example:
This is at stock, with the default 280W PPT, 215A TDC and 300A EDC limits. PPT and EDC are constantly pegged to their limits.
Although PPT reads ~280W constantly during the workload, the actual measured power consumption for the whole package is 263.178W on average.
The average CPU voltage is 1.15516V.
Now, everything else remains identical, but a negative offset of 50mV has been applied to the CPU.
The average actual power consumption has dropped to 243.809W, yet the CPU's power management still sees the same power draw as before.
You can also see that the CPU voltage has dropped to 1.11357V and that the CPU temperature has dropped by 3.25°C.
Reducing the CPU voltage by 50mV improved CB20 NT performance by ~40 points, which is far less than the expected improvement. In this case the improvement is solely caused by the lowered temperature (which the CPU acknowledges).
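To put the example in numbers, here is a small Python sketch that works out how far the CPU's own power figure sits above the measured package power in both runs. It only uses the figures quoted above; treating PPT as exactly 280W is my assumption (the reading is "~280W"), and the variable and function names are made up for illustration.

```python
# Figures quoted in the example above.
ppt_reported_w = 280.0       # PPT as seen by the CPU (assumed to be exactly 280W)
measured_stock_w = 263.178   # measured average package power at stock
measured_uv_w = 243.809      # measured average package power with the 50mV negative offset
vcore_stock_v = 1.15516      # average CPU voltage at stock
vcore_uv_v = 1.11357         # average CPU voltage with the offset applied


def percent_over(reported, actual):
    """Return how much `reported` exceeds `actual`, as a percentage of `actual`."""
    return (reported - actual) / actual * 100.0


over_stock = percent_over(ppt_reported_w, measured_stock_w)
over_uv = percent_over(ppt_reported_w, measured_uv_w)
power_saved_w = measured_stock_w - measured_uv_w
vdrop_mv = (vcore_stock_v - vcore_uv_v) * 1000.0

print(f"Over-reporting at stock:       {over_stock:.2f} %")
print(f"Over-reporting when undervolted: {over_uv:.2f} %")
print(f"Actual package power saved:    {power_saved_w:.3f} W")
print(f"Average voltage drop:          {vdrop_mv:.2f} mV")
```

With these inputs the over-reporting comes out at roughly 6.4% at stock and roughly 14.8% with the undervolt, so the gap between what the CPU thinks it draws and what it actually draws more than doubles.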
So how do you get the full advantage of undervolting, then?
As I've said before, the most recent version of HWiNFO has complete support for the Zenith II Extreme.
Since it is able to report the output currents and powers for both of the domains accurately, you can make adjustments based on the observed delta.
Prior to undervolting, write down the averages of "CPU Core Power (SVI2 TFN)" and "CPU Core Current (SVI2 TFN)" during a sustained and stable workload (e.g. 5 minutes of Blender).
Apply the undervolt and repeat the same procedure for these two values.
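If you log the sensors to a CSV file instead of reading the averages off the screen, the two averages can be computed with a few lines of Python. This is only a sketch: I'm assuming the log has one header row whose column names start with the sensor names quoted above (possibly followed by a unit suffix), and the helper name `average_columns` is made up. The sample rows are placeholders to show the call, not measurements.

```python
import csv
import io


def average_columns(csv_text, wanted):
    """Average every column whose header starts with one of the `wanted` names."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)

    # Map each wanted sensor name to the first column whose header starts with it.
    index = {}
    for name in wanted:
        for i, col in enumerate(header):
            if col.strip().startswith(name):
                index[name] = i
                break
        else:
            raise KeyError(f"column not found: {name}")

    sums = {name: 0.0 for name in wanted}
    counts = {name: 0 for name in wanted}
    for row in reader:
        for name, i in index.items():
            try:
                value = float(row[i])
            except (IndexError, ValueError):
                continue  # skip blank or non-numeric cells (e.g. a repeated header row)
            sums[name] += value
            counts[name] += 1

    for name in wanted:
        if counts[name] == 0:
            raise ValueError(f"no numeric samples for: {name}")
    return {name: sums[name] / counts[name] for name in wanted}


# Placeholder log with made-up values, only to demonstrate the call.
sample_log = (
    "Time,CPU Core Power (SVI2 TFN) [W],CPU Core Current (SVI2 TFN) [A]\n"
    "00:00:01,200.0,170.0\n"
    "00:00:02,198.0,172.0\n"
)
sensors = ["CPU Core Power (SVI2 TFN)", "CPU Core Current (SVI2 TFN)"]
averages = average_columns(sample_log, sensors)
print(averages)
```

Run it once on the stock log and once on the undervolted log, and keep both sets of averages for the next step.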
When you have both sets of these values, go to the "Precision Boost Override" menu in the BIOS and select "Manual".
The new PPT limit will be 280W + the difference between the stock and undervolted "CPU Core Power (SVI2 TFN)" values. TDC will be 215A + the difference between the stock and undervolted "CPU Core Current (SVI2 TFN)" values.
EDC can be set by applying the same percentage increase that TDC received to the default 300A value. PBO Scalar should be manually set to 1x so that it does not change from the stock value when manual PBO mode is used.
So in my case, the power consumption of the CPU cores themselves (CR Pout) dropped from 197.329W to 178.302W (19.027W) and the current for the CPU cores (CR Iout) dropped from 170.813A to 160.329A (10.484A).
Hence, my new PPT limit became 299W (instead of 280W), TDC limit became 226A (instead of 215A) and EDC limit became 315A (300 * (226/215)).
After the whisky-tango calibration described above, the Cinebench R20 NT score improved by ~300pts with the -50mV undervolt, while maintaining pretty much the same power consumption as stock.
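The arithmetic behind those limits can be sketched in Python as follows. The function and constant names are made up for illustration; the stock limits and the measured values are the ones quoted above. The sketch leaves the results unrounded, so rounding to the values you actually type into the BIOS is up to you.

```python
STOCK_PPT_W = 280.0   # default PPT limit
STOCK_TDC_A = 215.0   # default TDC limit
STOCK_EDC_A = 300.0   # default EDC limit


def compensated_limits(power_stock_w, power_uv_w, current_stock_a, current_uv_a):
    """Raise the stock limits by the power / current the undervolt actually saved."""
    ppt = STOCK_PPT_W + (power_stock_w - power_uv_w)
    tdc = STOCK_TDC_A + (current_stock_a - current_uv_a)
    # EDC gets the same relative increase that TDC received.
    edc = STOCK_EDC_A * (tdc / STOCK_TDC_A)
    return ppt, tdc, edc


# Averages of "CPU Core Power (SVI2 TFN)" and "CPU Core Current (SVI2 TFN)"
# at stock and with the 50mV negative offset, as quoted above.
ppt, tdc, edc = compensated_limits(197.329, 178.302, 170.813, 160.329)
print(f"PPT: {ppt:.3f} W, TDC: {tdc:.3f} A, EDC: {edc:.3f} A")
```

Unrounded, this gives roughly 299.0W, 225.5A and 314.6A, which lines up with the 299W / 226A / 315A above once you go to whole numbers (I rounded TDC up and used the rounded figure for EDC).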