[Guru3d] AMD Working on Vertical (3D) Stacking of DRAM onto processors - Page 3 - Overclock.net - An Overclocking Community

post #21 of 38 (permalink) 03-18-2019, 03:06 PM
mmonnin
Quote: Originally Posted by KyadCK View Post
Zen 2 already removed the memory controller from the CPU, so moot point.

That being said, the slide's picture is literally a copy/paste of the Vega HBM2 slide and says GPU on it. The slide also says "On-Die 3D Stacked Memory in Development", so in the future it seems they are hoping to put HBM or DRAM directly onto the Ryzen IO Die.

Also, yes, the HBM stack does need pinouts through the interposer, they need to get power from somewhere and AMD does not use IVRs. However, dies are not attached to the interposer with TSVs, they are micro ball soldered. TSVs are only for the actual RAM to the controller;

It was just pointed out to me that the memory controller is a difference. Regardless, an HBM/HMC stack is guaranteed to be more expensive, which is the point I was trying to make.


post #22 of 38 (permalink) 03-18-2019, 04:39 PM
geoxile
If only HSA were still alive RIP

post #23 of 38 (permalink) 03-18-2019, 05:27 PM
rluker5
Quote: Originally Posted by JackCY View Post
It has been wanted for many many years to have a nice large L4 cache/RAM on CPU. But AMD and Intel barely ever make even a CPU with nice L4 cache, let alone put RAM on the CPU. Why we still don't have a huge L4 cache on CPUs is beyond me. Putting RAM on CPU... more relevant for mobile use than desktop, server.
The L4 cache you are thinking of was a lot more than just a RAM chip on the PCB. It had a whole extra concurrent memory controller and used 1/4 of the L3 cache as L4 tags. The cache without the controller and tags isn't the same. It would just be some mismatched RAM.
Zen 2 is getting that I/O chiplet. And a good amount of L3. Maybe there is some way they can work in double the RAM channels and controllers and keep them straight with that stuff. Lowering Ryzen's RAM latency to less than Intel's would be good for gaming.
Attached image: BDW-H-Map.png


post #24 of 38 (permalink) 03-19-2019, 12:45 AM
Cyber Locc
Quote: Originally Posted by Hwgeek View Post
For OEM's and for us as a product it will be much better, less debugging of Ram issues/compatibility with chipset/socket/CPU- just send the CPU for RMA, less power usage/cleaner and simpler MB PCB design.
It's going to be more expensive for us tho- since no more keeping your current memory modules and just upgrading CPU/MB, need to buy all new CPU/MB combo.
Quote:
Sounds like RAM manufacturers will be fighting to sign a deal with Intel & AMD to get their memory paired with future CPU's.
Same situation like on GPU's/smartphone's and other devices with memory soldered on board. But for Memory Brands (not the manufacturers) this going to be bad, they will be forced to go for other new products.
I don't know where you get less power usage? It will still consume the same amount of power.

Whether the board is cleaner is subjective. It will increase the pin count, by a lot. CPUs will be massive. Have you not seen how many chips it takes to make 11GB on the 2080 Ti? Multiply that by 3, or 6 lol.

And this will 100% be bad for memory manufacturers. Do you think every one of them is going to get a contract? Lol, no.

So just like with GPUs, 1 or 2 will get a contract, they will barely be able to keep up with supply, prices will rise, and the other manufacturers will suffer from no business until the next shot at a contract.

If this actually happens (and I think you guys are overthinking it), I think it will be like an L4 cache, not full-on system memory. It will be bad for everyone.

What about OEMs? Now they are supposed to have what, 50 CPU SKUs? Mobo makers are supposed to make 50 boards? They can't be the same sized chips. So now we need an i3 in 4GB, 8GB, 16GB, then an i5 in 4-32, an i7 in 4-64, plus all the different models of those? Now we have i9s to add too.

Then for us overclockers, RIP that, the heat will be extreme. CPU blocks massive.


This will not simplify or improve anything, it will make things worse.

Then there is right to repair. RAM sticks die, a lot. So what happens when some of the RAM, overheated by the CPU, dies 1 day after warranty on a 2000 dollar CPU? Ya, let's hope for that case.


Last edited by Cyber Locc; 03-19-2019 at 12:52 AM.
post #25 of 38 (permalink) 03-19-2019, 06:30 AM
KyadCK
Quote: Originally Posted by Cyber Locc View Post
I don't know where you get less power usage? It will still consume the same amount of power.

Whether the board is cleaner is subjective. It will increase the pin count, by a lot. CPUs will be massive. Have you not seen how many chips it takes to make 11GB on the 2080 Ti? Multiply that by 3, or 6 lol.

And this will 100% be bad for memory manufacturers. Do you think every one of them is going to get a contract? Lol, no.

So just like with GPUs, 1 or 2 will get a contract, they will barely be able to keep up with supply, prices will rise, and the other manufacturers will suffer from no business until the next shot at a contract.

If this actually happens (and I think you guys are overthinking it), I think it will be like an L4 cache, not full-on system memory. It will be bad for everyone.

What about OEMs? Now they are supposed to have what, 50 CPU SKUs? Mobo makers are supposed to make 50 boards? They can't be the same sized chips. So now we need an i3 in 4GB, 8GB, 16GB, then an i5 in 4-32, an i7 in 4-64, plus all the different models of those? Now we have i9s to add too.

Then for us overclockers, RIP that, the heat will be extreme. CPU blocks massive.

This will not simplify or improve anything, it will make things worse.

Then there is right to repair. RAM sticks die, a lot. So what happens when some of the RAM, overheated by the CPU, dies 1 day after warranty on a 2000 dollar CPU? Ya, let's hope for that case.
Components closer together require less power to communicate, as you need less voltage to accomplish the same task. See: Every Node shrink.
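
For anyone who wants the reasoning behind that in numbers, here is a rough sketch using the standard first-order rule that charging a capacitive load once costs about E = C * V^2 from the supply. The capacitance and voltage values are made-up illustrative inputs (assumptions, not measurements of any real DDR4 or HBM link); the only point is that a shorter link means less capacitance to drive, and any voltage reduction helps quadratically.

```python
def switching_energy_joules(capacitance_farads: float, voltage_volts: float) -> float:
    """First-order energy drawn from the supply to charge a load once: E = C * V^2."""
    return capacitance_farads * voltage_volts ** 2


# Hypothetical loads, chosen only to illustrate the scaling (not measured values).
long_link = switching_energy_joules(5e-12, 1.2)          # long board trace, 5 pF
short_link = switching_energy_joules(0.5e-12, 1.2)       # short on-package link, 0.5 pF
short_link_low_v = switching_energy_joules(0.5e-12, 1.0) # same short link at lower voltage

for name, energy in [
    ("long link @ 1.2 V", long_link),
    ("short link @ 1.2 V", short_link),
    ("short link @ 1.0 V", short_link_low_v),
]:
    print(f"{name}: {energy * 1e12:.2f} pJ per charge")
```

With these assumed inputs the short link costs a tenth of the energy per charge, and dropping the voltage cuts it further. Real links add termination and driver overheads that this ignores.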

Stacking RAM on the CPU would reduce board pin count and complexity. See: Any HBM based GPU.

There are only three major DRAM manufacturers these days: Hynix, Samsung, and Micron. AMD has no problem working with Hynix and Samsung to make HBM, and there is no reason Micron could not, except that they're working with Intel on HMC.

There are actually more than two GPU companies. ARM, Broadcom, Qualcomm, and Apple all make their own GPUs. They just do not make what you would consider to be a performance GPU.

L4 cache has proven to be effective in Broadwell-C. I agree that it will be L4 as it will be near impossible to put enough RAM on the chip to satisfy all needs, but I disagree that L4 is bad for "everyone".

Intel already has far more than 50 SKUs. This Z370 board supports 39 alone:
https://www.msi.com/Motherboard/supp...RO#support-cpu

This one supports 63 SKUs:
https://www.asus.com/Motherboards/X99E_WS/HelpDesk_CPU/

This one supports 51 SKUs:
https://www.asus.com/us/Motherboards.../HelpDesk_CPU/

And so on.

They can be the same sized chips. AMD wants to do 3D stacking, and that means putting the RAM on top of the IO die. The socket will not change between models, just like the "socket" did not change between 16GB Vega FE and 8GB Vega.

DDR4/HBM RAM does not make much heat or use much power. Almost all our RAM could be run without a heat sink if we wanted to, they just look nice. As long as RAM speeds are decoupled from the core speeds, there would be zero change in how you overclock today.

Have you ever actually seen a TR4 chip? Not that it matters, larger blocks just means more surface area and less thermal density. That is a good thing for overclocking, not a bad thing.

It will improve plenty, or they would not bother to implement it.

RAM never dies in the grand scheme of things. In the last five years, across over 5000 replaced/disposed assets spanning the last 8 model generations (Core 2 Duo on DDR2 through 8000-series on DDR4) and over 6000 active assets in use, laptops and desktops alike, with an average of 1.5 sticks per PC, I have seen exactly three RAM stick deaths. In comparison, I have seen over a thousand HDD failures and a few hundred PSU failures as cause of "death". I have never seen a CPU fail. MB failures were maybe a few dozen, but almost always user damage.
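
As a back-of-envelope check on those figures, the sketch below turns them into a failure rate. It treats "over 5000" and "over 6000" as exact counts (an assumption), so the result is if anything an upper bound on the rate described.

```python
# Figures quoted in the post, treated as exact counts for this estimate (assumption).
retired_assets = 5000
active_assets = 6000
sticks_per_pc = 1.5
ram_failures = 3

total_sticks = (retired_assets + active_assets) * sticks_per_pc
ram_failure_rate = ram_failures / total_sticks

print(f"sticks observed: {total_sticks:.0f}")
print(f"RAM failure rate: {ram_failure_rate:.4%}")
```

That works out to 3 failures in 16,500 sticks, or under 0.02% over the period.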

The RAM would not be overheated by the CPU. RAM is capable of running at 80C+ just like a CPU can, and the RAM temp would never be higher than the CPU temp, so it literally can not die that way as the CPU would throttle itself before hitting dangerous limits. Once again, I will point you to Fury/Vega, which put out far more power than AMD's CPUs do and have no problems doing so.


Last edited by KyadCK; 03-19-2019 at 06:35 AM.
post #26 of 38 (permalink) 03-19-2019, 07:20 AM
EniGma1987
Quote: Originally Posted by rluker5 View Post
Zen 2 is getting that I/O chiplet. And a good amount of L3. Maybe there is some way they can work doubling ram channels and controllers and keep them straight with that stuff. Lowering Ryzen's ram latency to less than Intel's would be good for gaming.

AMD is only going to make 1 IO die, just like they only make 1 core die. So there will be 8 memory channels in the IO die on both the AM4 socket and the Epyc socket. They don't need to work on anything to fit more channels in, as they already have 8. The problem is that the AM4 socket cannot just magically get more memory channels. It does not have the pins to get any more channels out of the socket. So just like in the past, they will disable the unused channels.
If AMD is going to use the stacked memory as actual system RAM then that would be a very bad choice. 1) It will run at AMD's actual supported memory speed, which will mean any DIMMs you add in will have to match that same slow speed and timings. 2) You need a lot of pins for memory, so even if they are able to technically stack dozens of layers of transistors, you run into an actual limit on the number of traces you can run through the stack. To use 8 channels on stacked memory and give us crazy channel counts on a consumer socket with 32GB of RAM on board, AMD would have to significantly enlarge the IO die so they could fit the traces through the stack and still have room for the actual memory cells. They will want to go smaller if anything, which means even less space for traces to memory chips. So system memory just seems like a bad idea outside of laptops. A better idea is to use stacked memory cells as an L4 cache, as others have said.
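
To give a sense of scale for the pin/trace argument, here is a small sketch comparing data-interface widths and the peak bandwidth they imply. It assumes a 64-bit DDR4 channel at 3200 MT/s (an illustrative speed grade) and a 1024-bit HBM2 stack at 2 Gbit/s per pin (the top rate in the HBM2 spec as I understand it). The helper name is made up for this example.

```python
def peak_bandwidth_gb_s(bus_width_bits: int, transfers_per_sec: float) -> float:
    """Peak bandwidth in GB/s = (bus width in bytes) * (transfers per second) / 1e9."""
    return bus_width_bits / 8 * transfers_per_sec / 1e9


# Assumed interface parameters (illustrative, see the note above).
DDR4_CHANNEL_BITS = 64
HBM2_STACK_BITS = 1024

ddr4_channel = peak_bandwidth_gb_s(DDR4_CHANNEL_BITS, 3200e6)
ddr4_dual_channel = 2 * ddr4_channel
hbm2_stack = peak_bandwidth_gb_s(HBM2_STACK_BITS, 2000e6)

print(f"DDR4-3200, one channel : {ddr4_channel:.1f} GB/s over {DDR4_CHANNEL_BITS} data lines")
print(f"DDR4-3200, dual channel: {ddr4_dual_channel:.1f} GB/s over {2 * DDR4_CHANNEL_BITS} data lines")
print(f"HBM2, one stack        : {hbm2_stack:.1f} GB/s over {HBM2_STACK_BITS} data lines")
```

One stack needs 16 times the data lines of a DDR4 channel, which is why those connections stay inside the package rather than going out through socket pins.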


Last edited by EniGma1987; 03-19-2019 at 07:48 AM.
post #27 of 38 (permalink) 03-19-2019, 07:43 AM
epic1337
they'll just leave the unused channels blank, it'll operate in dual-channel mode, or quad-channel mode if AM4 is pinned to all 4dimm slots properly.
but yes, there'd be issue if they just split the channels for on-package + off-package, the on-package ram would pull down the clock speed of the off-package ram.

anyway, the best solution is to use the HBCC they developed for this, so the on-package dram has its own cache controller.

post #28 of 38 (permalink) 03-19-2019, 08:22 AM
KyadCK
Quote: Originally Posted by epic1337 View Post
they'll just leave the unused channels blank, it'll operate in dual-channel mode, or quad-channel mode if AM4 is pinned to all 4dimm slots properly.
but yes, there'd be issue if they just split the channels for on-package + off-package, the on-package ram would pull down the clock speed of the off-package ram.

anyway, the best solution is to use the HBCC they developed for this, so the on-package dram has its own cache controller.
Agreed.

One "Core" chiplet (8c/16t), one "GPU" chiplet (1024-1280 shader-ish), one "I/O Die" chiplet with the usual dual-channel, but also with 2-4GB of HBM (single stack) on top to act as L4 eDRAM cache/VRAM like Broadwell-C, and add in HBCC for multi pool management. Give it 125-140w to go ham on desktop models.

Finally after what feels like a decade, an APU that lives up to what HSA started.

Also they want to do 3D stacking because consoles, which obviously would not upgrade RAM anyway. Selling what amounts to an entire PC on a chip would be good for them.

post #29 of 38 (permalink) 03-19-2019, 10:15 AM
rluker5
Quote: Originally Posted by EniGma1987 View Post
AMD is only going to make 1 IO die, just like they only make 1 core die. So there will be 8 memory channels in the IO die on both the AM4 socket and the Epyc socket. They don't need to work on anything to fit more channels in, as they already have 8. The problem is that the AM4 socket cannot just magically get more memory channels. It does not have the pins to get any more channels out of the socket. So just like in the past, they will disable the unused channels.
If AMD is going to use the stacked memory as actual system RAM then that would be a very bad choice. 1) It will run at AMD's actual supported memory speed, which will mean any DIMMs you add in will have to match that same slow speed and timings. 2) You need a lot of pins for memory, so even if they are able to technically stack dozens of layers of transistors, you run into an actual limit on the number of traces you can run through the stack. To use 8 channels on stacked memory and give us crazy channel counts on a consumer socket with 32GB of RAM on board, AMD would have to significantly enlarge the IO die so they could fit the traces through the stack and still have room for the actual memory cells. They will want to go smaller if anything, which means even less space for traces to memory chips. So system memory just seems like a bad idea outside of laptops. A better idea is to use stacked memory cells as an L4 cache, as others have said.
I meant that AMD could use some of those unused channels for an on-die eDRAM-like cache, and use some of that 16MB L3 for tags to track what is in the on-package DRAM, so that data can be served from there while everything else is accessed from off-package DRAM concurrently. I probably just stated it sloppily, as I often do.
You don't need pins for what isn't leaving the package. If they put in more eDRAM modules, they could run even more concurrent L4 caches.
It is really just wishful thinking. The odds they will use these bits in such a way, even when they are all so close together, are pretty slim.

Edit: This is assuming that there is actually more than one memory controller on that I/O die. If there is just one with more channels, and it is not segmentable into differently configured modules, then my eDRAM idea won't work and was based on a misunderstanding.


Last edited by rluker5; 03-19-2019 at 10:22 AM.
post #30 of 38 (permalink) 03-19-2019, 11:10 AM
mouacyk
If it helps any, eDRAM on Broadwell added another $100 to MSRP. Unlikely to be cost effective now either.
