Overclock.net › Forums › Industry News › Hardware News › Cache and memory in the many-core era

Cache and memory in the many-core era - Page 2

post #11 of 21
Is it just me, or were about 4-6 pages deleted?

Anyway..
post #12 of 21
Quote:
Originally Posted by DuckieHo View Post
Nope... For current quad-cores, it's more like the other way around. Intel's current implementation is two dual-cores next to each other. Each core has a dedicated L1 cache and each pair shares an L2... then it uses the FSB so the two sets of cores can communicate. AMD's implementation will give each core its own L1 and L2 cache, plus an L3 cache shared between all four cores. AMD's current design is better at scaling. Intel of course will have a new architecture next year, though.
True about Intel's quad-core, I forgot to mention that.

AMD's approach is much slower than Intel's. The cores cannot talk to each other, and must spend vital time accessing memory or the page file in order to get at something that the next core over might have in cache. Intel's approach is tougher to implement because each core has to have "dibs" on a particular memory location so that the other doesn't try to throw some new data into it.
    
post #13 of 21
Quote:
Originally Posted by trueg50 View Post
True about Intel's quad-core, I forgot to mention that.

AMD's approach is much slower than Intel's. The cores cannot talk to each other, and must spend vital time accessing memory or the page file in order to get at something that the next core over might have in cache. Intel's approach is tougher to implement because each core has to have "dibs" on a particular memory location so that the other doesn't try to throw some new data into it.
Huh? What do you mean?

AMD's approach is theoretically faster. Each core has two dedicated caches (L1 and L2) plus the shared L3, so they can communicate with each other just fine. Intel's is theoretically slower, since traffic between the two dual-core pairs has to go through the FSB and chipset.

AMD:
CPU1->L1->L2-> |shared L3|
CPU2->L1->L2-> |shared L3|
CPU3->L1->L2-> |shared L3|
CPU4->L1->L2-> |shared L3|

Intel:
CPU1->L1-> |shared L2| =
CPU2->L1-> |shared L2| => |FSB link|
----------------------------------- |FSB link|
CPU3->L1-> |shared L2| => |FSB link|
CPU4->L1-> |shared L2| =


AMD 8-core:
CPU1->L1->L2-> |shared L3|
CPU2->L1->L2-> |shared L3|
CPU3->L1->L2-> |shared L3|
CPU4->L1->L2-> |shared L3|
CPU5->L1->L2-> |shared L3|
CPU6->L1->L2-> |shared L3|
CPU7->L1->L2-> |shared L3|
CPU8->L1->L2-> |shared L3|

Intel 8-core:
CPU1->L1-> |shared L2| =
CPU2->L1-> |shared L2| => |FSB link|
----------------------------------- |FSB link|
CPU3->L1-> |shared L2| => |FSB link|
CPU4->L1-> |shared L2| =
-----------------------------------
CPU5->L1-> |shared L2| =
CPU6->L1-> |shared L2| => |FSB link|
----------------------------------- |FSB link|
CPU7->L1-> |shared L2| => |FSB link|
CPU8->L1-> |shared L2| =
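
The diagrams above can be turned into a small model to make the difference concrete. This is only a sketch: the names `TOPOLOGIES` and `meeting_point` are made up for illustration, the core groupings are read straight off the quad-core diagrams, and it says nothing about actual latency.

```python
# Hypothetical model of the quad-core diagrams above; the groupings are
# read off the ASCII art, not taken from vendor documentation.
TOPOLOGIES = {
    # Every core has private L1/L2 and all four cores share one L3.
    "amd_quad": {"shared_cache": "L3", "groups": [[1, 2, 3, 4]]},
    # Two dual-core pairs; each pair shares an L2, pairs meet over the FSB.
    "intel_quad": {"shared_cache": "L2", "groups": [[1, 2], [3, 4]]},
}


def meeting_point(topology, core_a, core_b):
    """Return where two cores first share a resource in the diagram."""
    layout = TOPOLOGIES[topology]
    for group in layout["groups"]:
        if core_a in group and core_b in group:
            return "shared " + layout["shared_cache"]
    # No common cache in the diagram: traffic has to leave the pair.
    return "FSB link"


for name in TOPOLOGIES:
    print(name, "1<->2:", meeting_point(name, 1, 2))
    print(name, "1<->3:", meeting_point(name, 1, 3))
```

In this toy model any two AMD cores meet at the shared L3, while Intel cores 1 and 3 only meet at the FSB link.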
post #14 of 21
Quote:
Originally Posted by DuckieHo View Post
Huh? What do you mean?

AMD's approach is theoretically faster. Each core has two dedicated caches (L1 and L2) plus the shared L3, so they can communicate with each other just fine. Intel's is theoretically slower, since traffic between the two dual-core pairs has to go through the FSB and chipset.

AMD:
CPU1->L1->L2-> |shared L3|
CPU2->L1->L2-> |shared L3|
CPU3->L1->L2-> |shared L3|
CPU4->L1->L2-> |shared L3|

Intel:
CPU1->L1-> |shared L2| =
CPU2->L1-> |shared L2| => |FSB link|
----------------------------------- |FSB link|
CPU3->L1-> |shared L2| => |FSB link|
CPU4->L1-> |shared L2| =


AMD 8-core:
CPU1->L1->L2-> |shared L3|
CPU2->L1->L2-> |shared L3|
CPU3->L1->L2-> |shared L3|
CPU4->L1->L2-> |shared L3|
CPU5->L1->L2-> |shared L3|
CPU6->L1->L2-> |shared L3|
CPU7->L1->L2-> |shared L3|
CPU8->L1->L2-> |shared L3|

Intel 8-core:
CPU1->L1-> |shared L2| =
CPU2->L1-> |shared L2| => |FSB link|
----------------------------------- |FSB link|
CPU3->L1-> |shared L2| => |FSB link|
CPU4->L1-> |shared L2| =
-----------------------------------
CPU5->L1-> |shared L2| =
CPU6->L1-> |shared L2| => |FSB link|
----------------------------------- |FSB link|
CPU7->L1-> |shared L2| => |FSB link|
CPU8->L1-> |shared L2| =

Don't the cores on AMD chips communicate over the HT link anyway?
post #15 of 21
Quote:
Originally Posted by Melcar View Post
Don't the cores on AMD chips communicate over the HT link anyway?
They communicate with other components and the chipset via HyperTransport, and reach local memory through the on-die memory controller. The cores communicate with each other on die.
post #16 of 21
Sorry, I should have clarified.

In human terms, they are probably the same speed. In terms of clocks, however, the Intel approach is going to be faster. When a CPU needs to write something to memory or pull something out of memory, going all the way to RAM takes a long time, something like 10-20 times longer than getting it from its own L2 cache.

As multi-threading improves, each core will be utilized for similar things (i.e., instead of one core running Quake 4, the load is divided), so the cores need to speak with each other more than ever.

Intel: If core 0 needs some data that core 1 has in cache, it just grabs it out of the shared cache pool.

AMD: If core 0 needs data that core 1 has in cache... well, it has no ability to talk with that other core, and it has to access memory to get that data (something like 30 clock cycles instead of 5). They don't communicate. Thus the HT bus is handy for accessing memory.
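
To put rough numbers on the cache-versus-memory argument, here is a quick back-of-the-envelope sketch. The 5 and 30 cycle figures are the ones quoted in this post, not measurements, and the hit rates and the `average_cycles` helper are assumptions picked purely for illustration.

```python
def average_cycles(hit_rate, cache_cycles=5, memory_cycles=30):
    """Expected cost of one access when a fraction hit_rate is served from cache."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_rate * cache_cycles + (1.0 - hit_rate) * memory_cycles


# Assumed hit rates, chosen only to show the trend.
for hit_rate in (0.5, 0.9, 0.99):
    print(f"hit rate {hit_rate:.0%}: {average_cycles(hit_rate):.2f} cycles per access")
```

The more often the data is already in a cache the core can reach, the closer the average gets to the cache figure, which is why it matters whether one core can see what another has cached.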

I forget which one, but a hardware review site tested a Q6600 (I think; it was a while ago) on a 1066 MHz and a 1333 MHz bus. There was no clock speed change, just the FSB, and there was no performance increase. This suggests the FSB has yet to be "saturated" and become a bottleneck.
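
For a sense of scale, the theoretical peak of the two bus speeds can be worked out. This sketch assumes the usual 64-bit (8-byte) wide FSB data path and treats 1066 and 1333 as millions of transfers per second (MT/s); `peak_bandwidth_gb_s` is just an illustrative helper, and real throughput will be lower than the peak.

```python
FSB_WIDTH_BYTES = 8  # assumption: 64-bit FSB data path


def peak_bandwidth_gb_s(mega_transfers_per_s, width_bytes=FSB_WIDTH_BYTES):
    """Theoretical peak in GB/s, counting 1 GB as 10**9 bytes."""
    return mega_transfers_per_s * 1_000_000 * width_bytes / 1_000_000_000


for speed in (1066, 1333):
    print(f"{speed} MT/s -> {peak_bandwidth_gb_s(speed):.1f} GB/s peak")
```

On paper the faster bus adds about 25% more headroom, so seeing no gain from it is at least consistent with the bus not being the limit in that test.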

Just my 2 cents on that article.
    
post #17 of 21
Well, at least AMD has come out with a newer FSB system, and Intel has yet to even start on a new one; they have been using the same one for years. Also, it is not only the CPU that the FSB and HT link are for. The FSB system has to go through other controllers to talk to the PCI and PCIe buses, whereas the HT link does not. HT also does not need a northbridge, which allows you to run everything on one bus and use the southbridge to control the hard drives and so on. With the Intel FSB system you must have a controller to talk to the things that the FSB does not have free access to. So it's not just about speeding up your CPU, it is about the system as a whole.
    
post #18 of 21
Quote:
Originally Posted by jonny1989 View Post
Well, at least AMD has come out with a newer FSB system, and Intel has yet to even start on a new one; they have been using the same one for years. Also, it is not only the CPU that the FSB and HT link are for. The FSB system has to go through other controllers to talk to the PCI and PCIe buses, whereas the HT link does not. HT also does not need a northbridge, which allows you to run everything on one bus and use the southbridge to control the hard drives and so on. With the Intel FSB system you must have a controller to talk to the things that the FSB does not have free access to. So it's not just about speeding up your CPU, it is about the system as a whole.
Yes, Intel has kept the FSB for many years, but there has been no need to change it: it has yet to be a bottleneck.

HT by itself in no way eliminates the northbridge. AMD CPUs from the Socket 939 era on (I believe it was) have had the memory controller on the CPU, and that is what eliminates the need for a traditional northbridge. Nehalem will have an integrated memory controller too, which is one of the reasons for the move to the new socket.
    
post #19 of 21
Quote:
Originally Posted by trueg50 View Post

AMD: If core 0 needs data that core 1 has in cache... well, it has no ability to talk with that other core, and it has to access memory to get that data (something like 30 clock cycles instead of 5). They don't communicate. Thus the HT bus is handy for accessing memory.
What about the L3 cache on forthcoming AMD CPUs?
    
post #20 of 21
No idea. I expected them to come with shared L3 caches with the 7x series when those were released. It does seem to be the next logical step for the next CPU. I wonder if Barcelona will have a shared L3; again, it seems logical, but we will see.
    