Originally Posted by elmor
Yes, unless you want higher copy bandwidth.
I suppose theoretically, since each has 16-bit-wide lanes for writes (32 bits for reads), which makes 2x16 = 32 bits total. It will depend on the application and whether it uses both dies to access memory. From what I remember, AIDA64 always tests memory using core 0, which wouldn't show any improvement, but that could have changed. I'll re-do these tests with a 3900X when I get my hands on one.
It would seem that the decoupled FCLK is locked at 1800MHz while the MCLK is increased with the faster kits, running at a 2:1 ratio rather than 1:1 as was originally communicated. That can be seen in gaming frame rates as memory speed increases above 3600MT/s.
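A minimal sketch of the clock behaviour described above, assuming the FCLK tracks MCLK 1:1 up to an 1800MHz cap and the memory controller falls back to a 2:1 ratio beyond it (the function name and the exact switchover point are my assumptions, not anything AMD documents):

```python
def auto_clocks(ddr_mts, fclk_cap_mhz=1800):
    """Hypothetical helper: derive MCLK/FCLK/UCLK from a DDR4 rating.
    Assumes FCLK follows MCLK 1:1 up to the cap, then holds at the cap
    while the memory controller clock drops to MCLK/2 (2:1 mode)."""
    mclk = ddr_mts / 2  # DDR transfers twice per memory clock
    if mclk <= fclk_cap_mhz:
        return {"mclk": mclk, "fclk": mclk, "uclk": mclk}          # 1:1 mode
    return {"mclk": mclk, "fclk": fclk_cap_mhz, "uclk": mclk / 2}  # 2:1 mode

auto_clocks(3600)  # DDR4-3600: everything at 1800 MHz, 1:1
auto_clocks(4000)  # DDR4-4000: MCLK 2000, FCLK held at 1800, UCLK 1000
```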
Please remember that those AMD slides in the presentation deck are a little misleading and do not show the full picture.
If we do the maths:
The L3 cache is clocked at CPU frequency, and the slides show it communicating at 32 bytes/cycle. At a CPU frequency of 4200MHz that works out to roughly 134GB/s of theoretical bandwidth per 32-byte interface, and the AIDA cache benchmarks show real-world aggregate throughput between cores and cache in the 400+GB/s range, depending on CPU frequency. The timings required for the cache memory are baked into the UEFI and are not user-adjustable.
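The per-interface figure is just width-per-cycle times clock; a quick sketch of that arithmetic (the function name is mine, for illustration only):

```python
def theoretical_bw_gbs(bytes_per_cycle, clock_mhz):
    # Theoretical bandwidth = bus width per cycle x clock rate,
    # converted from MHz to GB/s.
    return bytes_per_cycle * clock_mhz * 1e6 / 1e9

# One 32 B/cycle cache interface at a 4200 MHz core clock:
theoretical_bw_gbs(32, 4200)  # 134.4 GB/s
```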
The Infinity Fabric interconnects are also shown at 32 bytes/cycle, yet they are capped at 1800MHz. With 3600MT/s RAM installed, that is a maximum theoretical bandwidth of 57.6GB/s.
On the 3900X slide, 2 interconnects are shown between the CCDs and the I/O die. What is not shown is that those 2 interconnects merge into a single interconnect, also running at 32 bytes/cycle at 1800MHz, that connects the data fabric in the I/O die to the memory controller circuitry. That aspect of Zen 2 is shared with Zen 1 and Zen+. That single interconnect limits Ryzen CPUs to a total of 57.6GB/s between the CPU cores and the dual-channel RAM, which itself has a theoretical max bandwidth that matches the Infinity Fabric up to 1800MHz. Above that frequency, the Infinity Fabric will start to bottleneck installed 4000+MT/s RAM.
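The fabric cap and the dual-channel DDR4 figure line up exactly at DDR4-3600, which is easy to verify with the same width-times-clock arithmetic (helper names are mine):

```python
def fabric_bw_gbs(fclk_mhz, bytes_per_cycle=32):
    # Data fabric: 32 bytes per fabric clock.
    return bytes_per_cycle * fclk_mhz * 1e6 / 1e9

def ddr4_dual_channel_bw_gbs(mts):
    # Two 64-bit (8-byte) channels x transfers per second.
    return 2 * 8 * mts * 1e6 / 1e9

fabric_bw_gbs(1800)             # 57.6 GB/s -- the capped fabric
ddr4_dual_channel_bw_gbs(3600)  # 57.6 GB/s -- matched at DDR4-3600
ddr4_dual_channel_bw_gbs(4000)  # 64.0 GB/s -- exceeds the capped fabric
```

This is why 3600MT/s is the sweet spot: above it the RAM can deliver more than the single fabric link can carry.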
That 57.6GB/s of total end-to-end bandwidth also has to be shared with the GPU and its DMA memory access requirements. A 2080 Ti running at full speed demands upwards of 15GB/s of data that, while separated from the hardware by the kernel's virtual memory management, ultimately has to come from the L3 cache or the RAM over those physical interconnects. That forces the CPU cores to sacrifice memory bandwidth and leads to longer CPU stalls, as the cores sit idle waiting on memory requests because of the reduced bandwidth available to the CPU.
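Putting rough numbers on that sharing (the 15GB/s GPU figure is the worst-case estimate from above; this is back-of-envelope arithmetic, not a measurement):

```python
TOTAL_BW_GBS = 57.6  # capped fabric/memory path, from the figures above
GPU_DMA_GBS = 15.0   # assumed peak 2080 Ti DMA traffic

# What's left over for the CPU cores when the GPU is pulling at full rate:
cpu_share = TOTAL_BW_GBS - GPU_DMA_GBS  # ~42.6 GB/s
```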
The tests shown above show real-world throughput, which includes the latency overhead caused by the RAM's need to operate with appropriate timings.
Conversely, Intel's ringbus ties L3 cache speed to the ring topology interconnect and also transfers data at 32 bytes/cycle, yet with a cache multiplier of x42 that gives the transport, which likewise has to share bandwidth between CPU and GPU memory needs, a total theoretical bandwidth of roughly 134GB/s, providing a surplus of bandwidth to be shared between CPU and GPU memory access.
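The ring figure follows from the same arithmetic, with the ring clock derived from multiplier times base clock (assuming the standard 100MHz BCLK; function name is mine):

```python
def ring_bw_gbs(cache_multiplier, bclk_mhz=100, bytes_per_cycle=32):
    # Ring/L3 clock = cache multiplier x BCLK; bandwidth = width x clock.
    return bytes_per_cycle * cache_multiplier * bclk_mhz * 1e6 / 1e9

ring_bw_gbs(42)  # 134.4 GB/s, versus the 57.6 GB/s Infinity Fabric cap
```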