Overclock.net - An Overclocking Community
Thread: [TechSpot] Micron starts sampling DDR5 RDIMMs with industry partners
  Topic Review (Newest First)
01-22-2020 02:19 AM
8051
Quote: Originally Posted by epic1337
Why multiplex?
DDR5 does in fact have more effective bandwidth per pin.

I'm quoting the white paper one more time...

In this case they didn't multiplex it; they simply increased the command length to fit more commands per pin.
The pin-outs for DDR5 suggest they are time multiplexing commands and address data across the exact same pins:
http://www.softnology.biz/pdf/JESD79...d%20Rev0.1.pdf

As does this: "Note: Since some commands are multi-cycle, the pins may not be interchanged between devices on the same bus."
This implies time multiplexing.

I believe an activate command (carrying the row address) has to precede any read or write of any type, and it goes over the exact same pins as the read or write command.

For all I know this is how DDR4 works now.

The mirror mode is interesting because it's something used on GDDR5. It swaps the functionality of pins, but why would you ever want to do this?
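A toy sketch of the serialization being described, with made-up cycle counts purely for illustration (the CaCommand names and numbers here are hypothetical, not taken from the JEDEC document):

Code:
# On a single shared command/address bus, a multi-cycle ACTIVATE has to
# finish occupying the CA pins before a READ that needs those same pins
# can be issued, so the two commands are serialized in time.

from dataclasses import dataclass

@dataclass
class CaCommand:
    name: str
    ca_cycles: int  # clock cycles the command's encoding occupies on the CA bus

def schedule(commands):
    """Print the start cycle of each command on one shared CA bus."""
    cycle = 0
    for cmd in commands:
        print(f"cycle {cycle:2d}: {cmd.name}")
        cycle += cmd.ca_cycles
    return cycle

# Hypothetical 2-cycle encodings; the point is only that they cannot overlap.
busy = schedule([CaCommand("ACTIVATE (row address)", 2),
                 CaCommand("READ (column address)", 2)])
print(f"shared CA bus is busy for {busy} cycles before the next command slot")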
01-21-2020 04:16 PM
epic1337
Quote: Originally Posted by EniGma1987
Do we *know* they are multiplexing any of that, though? Or are we just assuming based on the info we see? Maybe they are not multiplexing anything and have two fully separate channels that can send/receive at the same time as each other.
Why multiplex?
DDR5 does in fact have more effective bandwidth per pin.

I'm quoting the white paper one more time...

Quote:
Protocol Features for Performance

In addition to higher data rates and improvements to the I/O circuitry, DDR5 introduces other new protocol
features unrelated to data rate that are integral to increasing bandwidth and performance. For example, DDR5
DIMMs feature two 40-bit (32 bits plus ECC) independent channels. When combined with a new default burst
length of 16 (BL16) in the DDR5 component, this allows a single burst to access 64B of data (the typical CPU
cache line size) using only one of the independent channels, or only half of the DIMM. Providing this ability to
interleave accesses from these two independent channels enables tremendous improvements to concurrency,
essentially turning an 8-channel system as we know it today into a 16-channel system.

In the DRAM array, the number of bank groups (BGs) is doubling in DDR5 as compared to DDR4, keeping the
number of banks-per-BG the same, which effectively doubles the number of banks in the device. This enables
controllers to avoid the performance degradations associated with sequential memory accesses within the
same bank (for example, causing tCCD_S to be the sequential access restriction, instead of the much longer
tCCD_L). The addition of same-bank refreshes and improvements to the pre/postambles on the command bus
(by introducing an interamble) help to mitigate the traditional performance bottlenecks commonly observed in
DDR4, improving the overall effective bandwidth of the memory interface.


Data Burst Length Increase


DDR5 SDRAM default burst length increases from BL8 (seen on DDR4) to BL16 and improves command/address and
data bus efficiency. The same read or write CA bus transaction can now provide twice as much data on the data bus while
limiting the exposure to IO/array timing constraints within the same bank. Reducing the commands required to access a
given amount of data also improves the power profile for read and write accesses.
The burst length increase also reduces the number of IOs required to access the same 64B cache line data payload. The
default burst length increase enables a dual sub-channel for the DDR5 DIMM architecture (shown in Figure 2), which
increases overall channel concurrency, flexibility and count. For systems that utilize a 128B cache line data payload,
DDR5 adds a burst length of 32 option specifically for x4-configured devices. This further improves the command/address,
data bus efficiency and overall power profile.
In this case they didn't multiplex it; they simply increased the command length to fit more commands per pin.
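A minimal sketch of the arithmetic behind the dual sub-channel claim, using only the widths and burst lengths quoted above (the helper function is my own illustration, not from the white paper):

Code:
# One DDR5 sub-channel is 32 data bits wide (plus ECC), so a BL16 burst
# moves 32 bits x 16 beats = 64 bytes, a full CPU cache line, from only
# half of the DIMM. A DDR4 channel needs its whole 64-bit width and BL8
# to deliver the same payload.

def burst_bytes(data_bits: int, burst_length: int) -> int:
    """Bytes delivered by one burst on a channel of the given data width."""
    return data_bits * burst_length // 8

ddr4 = burst_bytes(data_bits=64, burst_length=8)    # full 64-bit DDR4 channel, BL8
ddr5 = burst_bytes(data_bits=32, burst_length=16)   # one 32-bit DDR5 sub-channel, BL16
print(f"DDR4 BL8  x 64-bit channel:     {ddr4} B per burst")
print(f"DDR5 BL16 x 32-bit sub-channel: {ddr5} B per burst")
# Both come out to 64 B, but DDR5 leaves the second sub-channel free for an
# independent access - the "8-channel system becomes a 16-channel system" point.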
01-21-2020 03:00 PM
EniGma1987
Quote: Originally Posted by 8051
I notice they don't talk about the additional latency of time multiplexing all command signals and address lines over the same bus, though.

In previous DDR revisions I could send address data and control signals simultaneously, with no bus turn-around times for that.
Do we *know* they are multiplexing any of that, though? Or are we just assuming based on the info we see? Maybe they are not multiplexing anything and have two fully separate channels that can send/receive at the same time as each other.
01-21-2020 02:09 PM
8051
Quote: Originally Posted by epic1337
Total latency might be lower in DDR5.

This came from Micron's DDR5 white paper; you guys should give it a read at least.
I notice they don't talk about the additional latency of time multiplexing all command signals and address lines over the same bus, though.

In previous DDR revisions I could send address data and control signals simultaneously, with no bus turn-around times for that.
01-21-2020 12:14 PM
epic1337
Total latency might be lower in DDR5.

Quote:
Refresh Commands

In addition to the standard ALL-BANK REFRESH command (REFab) available on DDR5 and earlier DDR SDRAM
products, DDR5 introduces a SAME-BANK REFRESH (REFsb) command. The REFsb command targets the same bank
in all bank groups, as designated by bank bits via command/address bits when the REFsb command is issued.
REFRESH commands on SDRAM devices require that the banks targeted for refresh are idle (precharged, no data
activity) prior to the command being issued, and the banks cannot resume subsequent write and read activity for the
duration of the REFRESH command (timing parameter tRFC). REFRESH commands are issued at an average periodic
interval (timing parameter tREFI). For REFab commands, the system must ensure all banks are idle prior to issuing the
command, on an average of once every 3.9µs in "normal" refresh mode, with a duration of 295ns for a 16Gb DDR5
SDRAM device.

The performance benefit of the REFsb command is that only one bank in each bank group needs to be idle before issuing
the command. The remaining 12 banks (for a 16Gb, x4/x8 device; blue cells in Figure 3) do not have to be idle when the
REFsb command is issued, and the only timing constraint to non-refreshed banks is the same-bank-refresh-to-activate
delay (timing parameter tREFSBRD). REFsb commands can only be issued in the fine granularity refresh (FGR) mode,
meaning each bank must receive a REFRESH command every 1.95µs on average. The REFsb duration is only 130ns for
a 16Gb DDR5 SDRAM device, which also reduces the system access lockout (tRFCsb) to actively refreshing banks (red
cells in Figure 3). A restriction when using REFsb is that each "same bank" must receive one REFsb command prior to
that "same bank" being issued a second REFsb command, but the REFsb commands can be issued in any bank order.

Depending upon the read/write command ratio, simulations indicate a 6% to 9% increase in system performance
throughput when using REFsb as compared to REFab, as shown in Figure 4. Furthermore, REFsb reduces the refresh
impact to average idle latency from 11.2ns to 5.0ns, as highlighted in Table 1. Calculations are based on standard
queuing theory and are applicable for a single bank with randomly driven data traffic.
This came from Micron's DDR5 white paper; you guys should give it a read at least.
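A quick back-of-the-envelope check on those refresh numbers, using the 16Gb figures quoted above (the lockout ratio is my own simplification and ignores scheduling effects):

Code:
# Fraction of time a given bank is locked out by refresh: REFab at an
# average tREFI of 3.9us with tRFC = 295ns, versus REFsb in FGR mode at
# an average tREFI of 1.95us with tRFCsb = 130ns.

def refresh_lockout(t_refi_ns: float, t_rfc_ns: float) -> float:
    """Fraction of the refresh interval during which the bank is unavailable."""
    return t_rfc_ns / t_refi_ns

refab = refresh_lockout(t_refi_ns=3900.0, t_rfc_ns=295.0)   # all-bank refresh
refsb = refresh_lockout(t_refi_ns=1950.0, t_rfc_ns=130.0)   # same-bank refresh (FGR)
print(f"REFab: ~{refab:.1%} of the time, every bank in the device is locked out")
print(f"REFsb: ~{refsb:.1%} lockout, and it only hits one bank per bank group")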
01-21-2020 11:55 AM
Asmodian
Quote: Originally Posted by 8051
Something tells me it'll be quite a while before DDR5 can beat the highest-end DDR4 kits in latency, and maybe it never will.
We said the same thing about DDR2, DDR3, and DDR4 when they came out, but it wasn't true. It took a few years after release, but they did eventually get to about the same real latencies as the previous generation.
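A quick worked example of how the real latency stays roughly flat across generations (the speed/CL pairs below are illustrative, not specific kits):

Code:
# First-word CAS latency in nanoseconds: CL cycles divided by the actual
# clock, which is half the data rate (MT/s) for DDR memory.

def cas_latency_ns(data_rate_mt_s: int, cl_cycles: int) -> float:
    """CL in nanoseconds = cycles * 2000 / data rate in MT/s."""
    return cl_cycles * 2000.0 / data_rate_mt_s

for name, rate, cl in [("DDR3-1600 CL9 ", 1600,  9),
                       ("DDR4-3200 CL16", 3200, 16),
                       ("DDR5-6400 CL32", 6400, 32)]:
    print(f"{name}: {cas_latency_ns(rate, cl):.2f} ns")
# All three land around 10-11 ns: the cycle count keeps doubling,
# but so does the clock, so the real latency barely moves.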
01-20-2020 01:32 AM
8051
Quote: Originally Posted by Asmodian
As long as you can get the clock high enough, you can keep latencies the same. For some reason, no one seems to want to decrease real memory latencies.

Edit: While I would like to see memory with lower real latencies, it is probably better to use the pins for extra lanes rather than control signals with today's high-core-count CPUs. Get the burst size and clocks up while increasing parallelism: more bandwidth for more threads at similar or only slightly worse real latencies. Subsequent threads get much lower latencies since they do not need to wait for the first thread's burst to finish.
Something tells me it'll be quite a while before DDR5 can beat the highest-end DDR4 kits in latency, and maybe it never will.
01-19-2020 12:09 PM
The Pook
256 GB RAM on 4 DIMM boards?

01-19-2020 06:12 AM
Asmodian
As long as you can get the clock high enough, you can keep latencies the same. For some reason, no one seems to want to decrease real memory latencies.

Edit: While I would like to see memory with lower real latencies, it is probably better to use the pins for extra lanes rather than control signals with today's high-core-count CPUs. Get the burst size and clocks up while increasing parallelism: more bandwidth for more threads at similar or only slightly worse real latencies. Subsequent threads get much lower latencies since they do not need to wait for the first thread's burst to finish.
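A rough illustration of that last point, counting data-bus occupancy only and ignoring bank timings (the speeds below are illustrative assumptions):

Code:
# Time a burst occupies the data bus = beats / data rate. With one shared
# 64-bit DDR4 channel, a second requester queues behind the whole burst;
# with two independent DDR5 sub-channels it can be served in parallel.

def burst_bus_time_ns(burst_length: int, data_rate_mt_s: int) -> float:
    """Nanoseconds a single burst occupies its (sub-)channel's data bus."""
    return burst_length * 1000.0 / data_rate_mt_s

ddr4_burst = burst_bus_time_ns(burst_length=8, data_rate_mt_s=3200)    # DDR4-3200, BL8
ddr5_burst = burst_bus_time_ns(burst_length=16, data_rate_mt_s=6400)   # DDR5-6400, BL16
print(f"DDR4: a second request queues behind ~{ddr4_burst:.2f} ns of bus time")
print(f"DDR5: a burst also takes ~{ddr5_burst:.2f} ns, but the other sub-channel is free")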
01-19-2020 05:53 AM
8051
Quote: Originally Posted by EniGma1987
Looks like, according to that Micron page, DDR5 greatly reduces all those previously necessary connections where we used to have the RAS, CAS, WE, etc. command/address signals.
It looks to me like they're planning on time multiplexing address and command data over the same bus. While time multiplexing address data was already standard with DDR from the get-go, the control signals all had independent pins. Low latency and a time-multiplexed bus work against each other.
This thread has more than 10 replies; the topic review above shows only the most recent posts.
