Overclock.net - An Overclocking Community - View Single Post - NEW!!! DRAM Calculator for Ryzen™ 1.5.1 (overclocking DRAM on AM4) + MEMbench 0.7 (DRAM bench)

post #4139 (permalink), 03-18-2019, 06:54 AM
ajc9988
New to Overclock.net
 
Join Date: Mar 2015
Posts: 457
Rep: 32 (Unique: 26)
Quote: Originally Posted by 1usmus
Pinnacle Ridge CPUs also support multiple reference clock inputs. Motherboards which support the feature will allow "Synchronous" (default) and "Asynchronous" operation. In synchronous-mode the CPU has a single reference clock input, just like Summit Ridge did. In this configuration increasing the BCLK frequency will increase CPU, MEMCLK and PCI-E frequencies.

In asynchronous-mode the CPU cores will have their own reference clock input. MEMCLK, FCLK and PCI-E input will always remain at 100.0MHz, while the CPU input becomes separately adjustable. This allows even finer grain CPU frequency control, than the already extremely low granularity "Fine Grain PStates" (with 25MHz intervals) do.

Despite some wild speculation, the asynchronous clocking capability makes no difference to the memory & data fabric (“IF”) frequency relations. These “two” frequencies are permanently tied together in every currently existing Zen design and changing the current topology would require a major overhaul to the foundations of the die.

If there is no UMC in the chiplet, then the delay will increase. Of course, this negative effect can be reduced by applying a new memory controller and new links.
Did not realize that was added with Pinnacle Ridge, which is cool. I'm still on a first gen 1950X. But, that addressed the first point.

With UCLK, the MCLK and UCLK are still tied together; the difference here is the divisor selected. So, separate from the sync and async capabilities, the UCLK setting deals specifically with just the divisor.

When Summit Ridge was being worked on, The Stilt mentioned that in MB debug mode, they could change the ratio for MCLK and UCLK so that the uncore, which is the infinity fabric, would run 1:1. In the final product, which carried through to Pinnacle Ridge, the divisor or ratio of MCLK to UCLK was set to 2:1. Many found it easier to refer to it as being clocked at what the RAM's single data rate would be, instead of using the double data rate frequency defined by the user. https://forums.anandtech.com/threads.../post-38778725

AMD has already announced that IF, which acts like an uncore, will have double the bandwidth with Zen 2. Aside from other changes that may have occurred to IF, the UMC, etc., the easiest way to double bandwidth would be to double the clock speed. Rather than adding an independent clock gen for that and redesigning everything you mentioned, you could leave them tied together and just change the divisor to what was already possible on first gen Zen in debug mode. Obviously, there may have been stability issues with the first gen or two using a 1:1 divisor for the relationship between those components, which would explain why 2:1 was used in the final products, and that seems to be resolved with Zen 2. To be clear, I'm not saying solving the issue was trivial, only that this is the most obvious way to approach it.

So, as you said, adding IF between the I/O die housing the UMC and the core chiplets means latency will go up, because requests have to traverse the IF to get to the UMC. If you leave the UMC's MCLK and UCLK tied, then changing the divisor would double the speed of the IF and, more importantly, lower the latency. It is like keeping the latency timing the same but running at 3200MHz instead of 1600MHz, so that the real-world latency in ns is reduced a fair amount. This is what I believe the UCLK setting refers to: the divisor that set IF to half the speed of the memory. Changing that divisor changes the ratio, which allows the increase in speed, and that translates to more bandwidth and lower latency.
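To put rough numbers on the "same timing, faster clock" point, here is a minimal Python sketch. The 16-cycle figure is purely a made-up value for illustration, not a measured IF or UMC latency, and `latency_ns` is just a helper name I picked.

```python
def latency_ns(cycles: int, clock_mhz: float) -> float:
    """Convert a latency counted in clock cycles into nanoseconds."""
    # One cycle at f MHz lasts 1000 / f nanoseconds.
    return cycles * 1000.0 / clock_mhz


# Hypothetical cycle count, chosen only to make the arithmetic easy to follow.
CYCLES = 16

slow = latency_ns(CYCLES, 1600.0)  # same cycle count at 1600MHz
fast = latency_ns(CYCLES, 3200.0)  # same cycle count at 3200MHz

print(f"{CYCLES} cycles at 1600MHz: {slow} ns")
print(f"{CYCLES} cycles at 3200MHz: {fast} ns")
```

With the cycle count held constant, doubling the clock halves the time in ns, which is the whole argument in one line of arithmetic.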

"XFR Enhancement:
1) FCLK Frequency
2) MEMCLK Frequency
3) UCLK DIV1 MODE:
a) Auto
b) UCLK==MEMCLK
c) UCLK==MEMCLK/2"

For this, I am relying specifically on selection 3 under XFR Enhancement, where (3)(b) would be 1:1 for UCLK to MEMCLK and (3)(c) would set a 1:2 ratio, which is what was used on Zen 1 and Zen+. This would leave the two clocks tied together and only change one factor in how they are tied.
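Reading the menu strings literally, the relationship I am describing can be sketched like this in Python. This is only my interpretation of the option names; `uclk_mhz` is a made-up helper, the 1600MHz MEMCLK is an example value, and what "Auto" actually picks is unknown to me, so the sketch refuses to guess.

```python
def uclk_mhz(memclk_mhz: float, mode: str) -> float:
    """Return UCLK for a given MEMCLK, taking the menu labels at face value."""
    if mode == "UCLK==MEMCLK":
        # Option (3)(b): UCLK runs 1:1 with MEMCLK.
        return memclk_mhz
    if mode == "UCLK==MEMCLK/2":
        # Option (3)(c): UCLK runs at half of MEMCLK.
        return memclk_mhz / 2
    # "Auto" (or anything else): behavior is not documented in the listing.
    raise ValueError(f"cannot derive UCLK for mode {mode!r}")


# Example MEMCLK only; not taken from any real configuration.
MEMCLK = 1600.0

for mode in ("UCLK==MEMCLK", "UCLK==MEMCLK/2"):
    print(mode, "->", uclk_mhz(MEMCLK, mode), "MHz")
```

Either way the two clocks stay locked to each other; only the divisor between them changes.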

As such, the UCLK setting would define the MEM/IF relationship separately from the sync and async modes, which relate to the core clock and the other clocks that would be affected. Does that make it clearer why I see the UCLK divisor as separate from your explanation of the sync and async relationship of clocks?

Edit: and this is for the community more than you, Yuri:

While looking through a couple of other forums, I've seen comments denigrating IF because its clock is slower than the ring bus or mesh on Intel chips. Frequency is NOT what matters, although it points to two factors that do matter: (1) bandwidth, and (2) latency.

Bandwidth determines how much data can pass over the connection in a given period of time (which is where the Gbps figures for ethernet, etc. come from). Each fabric can transfer a different amount per clock, so speed will affect the bandwidth a fabric carries, but the bandwidth is what is important, not the speed. It doesn't matter what speed each runs at if the amount of data moved is equal, and in that case latency is what helps determine which is faster.

A good example of this principle is HBM2 vs GDDR. GDDR uses a narrower bus interface, but clocks really fast. HBM2 has a very wide data bus, but runs at much slower speeds. This means that you can get the same amount of bandwidth from each tech, even though the frequency each runs at is very different. Because of this, the impact of latency is then considered. Not going to dive in too deep there, but I wanted to show that looking at frequency alone does not necessarily give an idea of how something performs.
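As a rough illustration of the wide-and-slow vs narrow-and-fast point, here is a small Python sketch. The bus widths and per-pin rates are round illustrative numbers I chose so the totals match, not the specs of any particular card, and `bandwidth_gb_per_s` is a made-up helper name.

```python
def bandwidth_gb_per_s(bus_width_bits: int, gbit_per_s_per_pin: float) -> float:
    """Peak bandwidth in GB/s = bus width (bits) * per-pin rate (Gbit/s) / 8."""
    return bus_width_bits * gbit_per_s_per_pin / 8


# Illustrative values only: a wide, slow interface (HBM-style)
# and a narrow, fast interface (GDDR-style).
wide_slow = bandwidth_gb_per_s(bus_width_bits=1024, gbit_per_s_per_pin=2.0)
narrow_fast = bandwidth_gb_per_s(bus_width_bits=256, gbit_per_s_per_pin=8.0)

print(f"wide and slow:   {wide_slow} GB/s")
print(f"narrow and fast: {narrow_fast} GB/s")
```

The per-pin rate differs by 4x between the two, yet the peak bandwidth comes out the same, which is why frequency on its own tells you very little.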


Last edited by ajc9988; 03-18-2019 at 07:21 AM.