Originally Posted by 1usmus
Why 1:1 (UCLK:MEMCLK) will not work on current generations of processors:
1) UCLK is not able to work with adequate voltage at frequencies above 1900-2000 MHz. Accordingly, we cannot overclock the RAM beyond 1900-2000 MHz, so we lose bandwidth and data access delays get worse.
2) The internal interfaces are designed for specific frequencies and have certain signal / noise tolerances.
3) Separate components of the memory controller (buffers and caches) have no overclocking options. They also have design limitations, and they are bottlenecks. This is confirmed by high-frequency memory (over 3533), which brings no benefit.
At the moment, I hesitate to predict anything, because the limiting IF frequency in UCLK == MEMCLK mode may limit the overclocking of RAM. Accordingly, we will lose memory bandwidth and delays will worsen. In this case, two memory channels will not be enough even for a single-chip configuration.
About the new, wider communications interface you wrote about: this is true, and these entries refer to it:
* CAKE CRC perf bounds Control
* CAKE CRC perf bounds
* ACPI SLIT Distance Control
* ACPI SLIT remote relative distance
* ACPI SLIT virtual distance
* ACPI SLIT same socket distance
* ACPI SLIT remote socket distance
* ACPI SLIT local SLink distance
* ACPI SLIT remote SLink distance
* ACPI SLIT local inter-SLink distance
* ACPI SLIT remote inter-SLink distance
But a wider interface at identical frequencies, as you wrote above, will not improve the delays, because the fine-tuning of timings will remain.
Confirmation of this will also be published on TechPowerUp.
What I am talking about in regards to Valhalla/Zen 2, though, is the changes that have been made. "One interesting detail AMD disclosed with their GPU announcement is that the infinity fabric now supports 100 GB/s (BiDir) per link. If we assume the Infinity Fabric 2 still uses 16 differential pairs as with first-generation IF, it would mean the IF 2 now operates at 25 GT/s, identical to NVLink 2.0 data rate. However, since AMD’s IF is twice as wide, it provides twice the bandwidth per link over Nvidia’s NVLink." https://fuse.wikichip.org/news/1815/...zen-2-details/
This suggests that there are non-trivial improvements to the Infinity Fabric implementation, with the supported bandwidth doubling. Note that, for part of its calculation, the quote assumes the same number of differential pairs as in the first generation.
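The arithmetic behind the quoted 25 GT/s figure can be sketched as follows. This is only a rough check under the quote's own assumptions: 16 differential pairs per direction, one bit per transfer per lane, and no encoding overhead. The function names are mine, and the 8-lane NVLink width is simply half of 16, taken from the quote's "twice as wide" remark.

```python
def per_lane_rate_gtps(bidir_gbytes_per_s: float, lanes_per_direction: int) -> float:
    """Per-lane transfer rate in GT/s, assuming 1 bit per transfer per lane."""
    per_direction_gbytes = bidir_gbytes_per_s / 2    # bidirectional figure counts both directions
    per_direction_gbits = per_direction_gbytes * 8   # bytes -> bits
    return per_direction_gbits / lanes_per_direction


def bidir_bandwidth_gbytes(rate_gtps: float, lanes_per_direction: int) -> float:
    """Inverse calculation: bidirectional link bandwidth in GB/s."""
    per_direction_gbytes = rate_gtps * lanes_per_direction / 8  # bits -> bytes
    return per_direction_gbytes * 2


# Quoted figure: 100 GB/s bidirectional, assumed 16 pairs per direction.
if2_rate = per_lane_rate_gtps(100, 16)
print(f"IF2 per-lane rate: {if2_rate} GT/s")  # 25.0

# Same per-lane rate on a link half as wide (the quote's NVLink comparison).
half_width = bidir_bandwidth_gbytes(if2_rate, 16 // 2)
print(f"Half-width link at the same rate: {half_width} GB/s")  # 50.0
```

If the real link uses a different pair count, the per-lane rate changes accordingly, which is exactly why the quote's assumption matters.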
So, moving forward, I'm using publicly available data for discussing what may come in the future generation, not current generation, chips.
1) There is a voltage/power issue with Infinity Fabric. I cannot find the source at the moment, but someone analyzed the power draw of the data fabric relative to package power and estimated core power, and showed that the data fabric is a large power hog. Beyond voltage and power, that also means the data fabric contributes to heat on the chip and eats into the TDP, depending on how it is calculated, so keeping it cool is another issue that needs to be considered. Arguendo, if they doubled the bandwidth by doubling the speed, changing the ratio so that IF2 runs at the same speed as the DDR rather than using the divisor of 2, there are still open questions about how the power reduction was accomplished and how much heat the data fabric generates. This means they may have found a way around the effective cap that you mentioned. But the point on limiting memory overclock is still valid (more on that in a moment).
2) This is true, it is designed for certain frequencies and signal-to-noise ratios. In fact, as I mentioned under point one, IF gen 1 was a power hog, which then translates into heat. Heat can affect the signal-to-noise ratio in specific scenarios, as thermal radiation can degrade the signal integrity of the data.
3) Great point. It suggests that, to implement this with Zen 2, AMD would have had to address those areas in the design and testing phase to allow for a 1:1 setting that current gen CPUs could not accomplish, whether that means the cache, the timings that were tuned for a 1:2 ratio, the power requirements, or the signal integrity. They may even have it so that when the divisor is changed, the UMC automatically switches to a pre-determined set of timings and settings that allows it to work (for example, slightly loosened timings, since the tighter timings would be too aggressive at the higher speed).
As we both have mentioned in this discussion, if AMD were to achieve double the frequency and bandwidth with IF gen 2, it would not be a trivial undertaking. Many notable changes would need to be made, and it would be an impressive evolution of the data fabric.
Even if that is accomplished, your point is still valid: using UCLK == MCLK could limit memory frequencies, thereby limiting the RAM overclock, which loses bandwidth and increases latency. Let's say the rumor is true that the officially supported memory speed increases to 3200MHz. That was rumored a while ago, but no new information or speculation on this front has appeared for 3-5 months. If IF2 has been tuned to support 3200MHz in 1:1 mode, then it may not allow much speed beyond this. That could mean 1:1 mode is only usable at low-3000MHz speeds, with higher frequencies not possible in that mode.
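To make the concern concrete, here is a small sketch of how a UCLK ceiling would cap the memory speed in each mode. It rests on assumptions that are mine, not AMD's: that 1:1 means UCLK equals MEMCLK, that 1:2 means UCLK is half of MEMCLK, and that "3200MHz" means a DDR4-3200 data rate (1600 MHz MEMCLK). The 1600 MHz and 1900 MHz ceilings are purely illustrative.

```python
def max_data_rate_mts(uclk_ceiling_mhz: float, memclk_per_uclk: int) -> float:
    """Highest DDR data rate (MT/s) reachable under a given UCLK ceiling.

    memclk_per_uclk: 1 for 1:1 mode, 2 for 1:2 mode (assumed UCLK = MEMCLK / 2).
    """
    max_memclk_mhz = uclk_ceiling_mhz * memclk_per_uclk
    return max_memclk_mhz * 2  # DDR: two transfers per memory clock


# Hypothetical ceilings: 1600 MHz (tuned for DDR4-3200) and 1900 MHz.
for ceiling in (1600, 1900):
    one_to_one = max_data_rate_mts(ceiling, 1)
    one_to_two = max_data_rate_mts(ceiling, 2)
    print(f"UCLK ceiling {ceiling} MHz: 1:1 caps at {one_to_one:.0f} MT/s, "
          f"1:2 caps at {one_to_two:.0f} MT/s")
```

In 1:2 mode the fabric ceiling stops being the binding limit in this model, so the DIMMs and the memory controller would cap the overclock instead.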
Now, AMD, by moving the memory controller to the I/O die, should be able to bin the memory controller and the I/O die overall for performance. That can help achieve the higher official memory speed. But it also means that in certain scenarios, such as mainstream chips with dual-channel memory support, which need less bandwidth than Threadripper or Epyc potentially do, it may be more beneficial to use 1:2 mode so the memory can hit high-3000 to 4000MHz speeds on the binned I/O dies. The benefits of 1:1 may not scale there, and 1:1 would also reduce the memory frequency that can be selected, thereby affecting memory bandwidth and latency. It would also increase the power draw, causing more heat, etc.
On the other hand, for Threadripper and Epyc users, with quad-channel and octa-channel memory configurations, achieving higher speeds is a bit more difficult, and they have more potential memory bandwidth than the mainstream CPUs. I would argue this is where the true benefit of a 1:1 setting would lie.
All of this has a couple of implications:
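For a rough sense of the bandwidth gap between channel counts, here is a back-of-the-envelope calculation. It assumes standard 64-bit (8-byte) DDR4 channels and uses the same DDR4-3200 data rate for every platform purely for comparison. These are theoretical peaks, not measured numbers, and the function name is my own.

```python
def peak_bandwidth_gbytes(data_rate_mts: int, channels: int, bus_bytes: int = 8) -> float:
    """Theoretical peak memory bandwidth in GB/s.

    data_rate_mts: DDR data rate in MT/s (e.g. 3200 for DDR4-3200).
    bus_bytes: bytes moved per transfer per channel (8 for a 64-bit channel).
    """
    return data_rate_mts * bus_bytes * channels / 1000  # MB/s -> GB/s


# Channel counts discussed above: mainstream, Threadripper, Epyc.
for name, channels in (("dual", 2), ("quad", 4), ("octa", 8)):
    bw = peak_bandwidth_gbytes(3200, channels)
    print(f"{name}-channel DDR4-3200: {bw:.1f} GB/s")
```

With four to eight channels, giving up some memory frequency to stay in 1:1 costs a smaller share of what each core needs than it does on a dual-channel part, which is the intuition behind the argument above.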
A) the new divisor likely is not going to be backwards compatible with current generation offerings from AMD,
B) the new divisor may not be the best setup for mainstream CPUs on the new generation, depending on different factors, and
C) when overclocking with the new processors, if the above changes were made to allow lower power draw and higher frequencies, a person may want to explore the trade-offs of using 1:1 vs 1:2 relative to the stable function of memory and IF, in the event that memory overclocking is able to achieve higher frequencies due to binning of I/O dies.
So I must admit, I have numerous embedded assumptions about what would have had to take place for this to be achievable on Zen 2. It also shows why the data fabric is important, and why Intel and Nvidia are bidding for the purchase of Mellanox for their IP.
Sorry if I missed something. Just waking up with my morning coffee, so forgive me if my morning cloudiness affected some element of this discussion. If nothing else, this shows how much we still don't know about the upcoming platform.
And I'm looking forward to seeing that article!