Overclock.net - An Overclocking Community - View Single Post - NEW!!! DRAM Calculator for Ryzen™ 1.7.3 (overclocking DRAM on AM4) + MEMbench 0.8 (DRAM bench)
post #4138 of (permalink) Old 03-18-2019, 01:09 AM - Thread Starter
AMD overclocker/developer
1usmus
Join Date: Jun 2017
Location: Ukraine / Germany
Posts: 1,624
Quote: Originally Posted by jedidude75 View Post
Could you clarify what you mean by #6? If I understand you correctly, there is a 2CCD mode which links one memory channel to one chiplet and the other channel to the second chiplet, and a non-2CCD mode which acts like normal, with both chiplets having access to both RAM channels. Is this what you mean? If so, is there any benefit to this, or why would they implement it?
This is inside information, and it is not certain to be reliable.
There are 2 scenarios:

1) Each chiplet gets 2 Unified Memory Controllers (UMCs); when there are 2 chiplets in the system, only one UMC is active per chiplet.
2) There is no UMC in the chiplet; communication with IO (two UMCs) goes over the "long-range" link based on serialization (CAKE -> IFOP). That is, all the blocks and interfaces of the Zen architecture remain but are arranged slightly differently.

Quote: Originally Posted by ajc9988 View Post
The XFR is actually quite impressive. If I read it right, they added FCLK, and we need to find out what that bus does. On Intel Skylake, Anand wrote the following:

The register in question is called the FCLK (or ‘f-clock’), which controls some of the cross-frequency compensation mechanisms between the ring interconnect of the CPU, the System Agent, and the PEG (PCI Express Graphics). Basically this means it is to do with data from the processor to the GPUs. So when data is handed from one end to another, this element of the processor manages the data buffers to allow that cross boundary migration in a lossless way. This is a ratio frequency setting which is tied directly to the base frequency of the processor (the BCLK, typically 100 MHz), and can be set at 4x, 8x or 10x for 400 MHz, 800 MHz or 1000 MHz respectively.
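The ratio arithmetic in that quote can be written out as a minimal sketch. It only restates the numbers given above (BCLK of 100 MHz, ratios of 4x, 8x and 10x); `fclk_mhz` is a hypothetical helper name, not part of any Intel tool or API.

```python
BCLK_MHZ = 100.0  # typical base clock, as stated in the quote

ALLOWED_RATIOS = (4, 8, 10)  # the only ratios the quote lists


def fclk_mhz(ratio: int, bclk_mhz: float = BCLK_MHZ) -> float:
    """Return FCLK as ratio x BCLK (hypothetical helper for illustration)."""
    if ratio not in ALLOWED_RATIOS:
        raise ValueError("the quote lists only 4x, 8x and 10x")
    return ratio * bclk_mhz


for r in ALLOWED_RATIOS:
    print(f"{r}x -> {fclk_mhz(r):.0f} MHz")
```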

They also now allow for the clock set on the Infinity Fabric (UCLK) to select the divisor, which means we are looking at IF being clocked equal to the memory frequency at dual rate instead of single rate (like 3200MHz instead of 1600MHz), potentially. That has a lot of implications on performance if I'm reading that correctly! EXCITED!!!

Edit: For anyone better with limits in calculus, here are some data points from a pro-Intel review company, PCPerspective (run by Ryan Shrout, who also ran Shrout Research and regularly attacked AMD). They did show the latency of going off-CCX, although their memory timings were crap and I get lower latency than they ever achieved through a combination of core clock, memory speed, timings, etc.

Another way would be to test Zen or Zen+ with SiSoft Sandra's latency test at different memory speeds, then extrapolate the expected drop in latency for a speed double the single rate. In other words, find the limit the curve is approaching, since latency does not drop linearly as the speed of the memory controller, and therefore of the Infinity Fabric, increases. This can show how the bandwidth doubles with the upcoming Infinity Fabric changes due to doubling the fabric speed, while the latency improvement would be estimated through this calculation. (Math is the reason I dropped out of engineering/physics in undergrad; the only way to pass calc II is to have already taken calc II, even though calc I can handle this problem.)

With that information, we can estimate a lot about the upcoming performance increase related to reduced latency, as well as looking at whether there were bandwidth limitations on data related to the IF. Unfortunately, we cannot fully get the picture, but a data point is a data point.
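The extrapolation idea in the quote can be sketched roughly as follows. This assumes, purely for illustration, that latency follows `floor + k / frequency`, which is one simple curve that flattens toward a limit; nothing in the thread confirms that model. The sample points are synthetic, generated from the model itself, and are not measurements. Replace them with your own results.

```python
def fit_latency_floor(samples):
    """Least-squares fit of latency = floor + k / freq.

    samples: list of (freq_mhz, latency_ns) pairs from your own runs.
    Returns (floor_ns, k). The 1/freq model is an assumption.
    """
    if len(samples) < 2:
        raise ValueError("need at least two measurements")
    xs = [1.0 / f for f, _ in samples]  # regress latency against 1/freq
    ys = [lat for _, lat in samples]
    n = len(samples)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0.0:
        raise ValueError("need at least two distinct frequencies")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    k = sxy / sxx
    floor_ns = mean_y - k * mean_x  # the limit the curve approaches
    return floor_ns, k


def predict_latency(floor_ns, k, freq_mhz):
    """Latency the fitted curve predicts at a given fabric clock."""
    return floor_ns + k / freq_mhz


# SYNTHETIC placeholder points built from floor=50, k=32000; not real data.
demo = [(f, 50.0 + 32000.0 / f) for f in (1066.0, 1333.0, 1466.0, 1600.0)]
floor_ns, k = fit_latency_floor(demo)
print(f"floor ~ {floor_ns:.1f} ns, at 3200 MHz ~ "
      f"{predict_latency(floor_ns, k, 3200.0):.1f} ns")
```

If real measurements do not sit close to a straight line when plotted against 1/frequency, the model is the wrong one and the extrapolated value should not be trusted.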
Pinnacle Ridge CPUs also support multiple reference clock inputs. Motherboards which support the feature will allow "Synchronous" (default) and "Asynchronous" operation. In synchronous-mode the CPU has a single reference clock input, just like Summit Ridge did. In this configuration increasing the BCLK frequency will increase CPU, MEMCLK and PCI-E frequencies.

In asynchronous-mode the CPU cores will have their own reference clock input. MEMCLK, FCLK and PCI-E input will always remain at 100.0MHz, while the CPU input becomes separately adjustable. This allows even finer CPU frequency control than the already very fine-grained "Fine Grain PStates" (25MHz intervals) do.
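The difference between the two modes can be illustrated with a small sketch. It assumes a simplified model in which every clock is just a multiplier times its reference input; `derive_clocks`, `cpu_multiplier` and `mem_ratio` are hypothetical names for illustration and do not correspond to any BIOS setting or AMD tool.

```python
from dataclasses import dataclass

FIXED_REFCLK_MHZ = 100.0  # reference kept by MEMCLK/PCI-E in asynchronous mode


@dataclass
class Clocks:
    cpu_mhz: float
    memclk_mhz: float
    pcie_ref_mhz: float


def derive_clocks(cpu_refclk_mhz: float, cpu_multiplier: float,
                  mem_ratio: float, asynchronous: bool) -> Clocks:
    """Simplified model: each clock = multiplier x its reference input."""
    # Synchronous: a single reference feeds everything, so raising it
    # raises CPU, MEMCLK and PCI-E together.
    # Asynchronous: only the CPU cores follow the adjustable reference.
    other_ref = FIXED_REFCLK_MHZ if asynchronous else cpu_refclk_mhz
    return Clocks(
        cpu_mhz=cpu_refclk_mhz * cpu_multiplier,
        memclk_mhz=other_ref * mem_ratio,
        pcie_ref_mhz=other_ref,
    )


# Same 102 MHz reference bump in both modes (illustrative numbers only).
print(derive_clocks(102.0, 40.0, 16.0, asynchronous=False))
print(derive_clocks(102.0, 40.0, 16.0, asynchronous=True))
```

In the synchronous case the memory and PCI-E clocks drift with the reference; in the asynchronous case only the CPU clock moves.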

Despite some wild speculation, the asynchronous clocking capability makes no difference to the memory & data fabric (“IF”) frequency relations. These “two” frequencies are permanently tied together in every currently existing Zen design and changing the current topology would require a major overhaul to the foundations of the die.

If there is no UMC in the chiplet, then memory latency will increase. Of course, this negative effect can be reduced by using a new memory controller and new links.

Last edited by 1usmus; 03-18-2019 at 01:20 AM.