Overclock.net › Forums › Industry News › Rumors and Unconfirmed Articles › [Various] AMD's Zen To Have 10 Pipelines Per Core - Details Leaked In Patch (Updated)

[Various] AMD's Zen To Have 10 Pipelines Per Core - Details Leaked In Patch (Updated) - Page 18

post #171 of 758
Quote:
Originally Posted by spurdomantbh

That image was proven fake. It's not 2x256bit FMAC, because FMAC requires both FADD and FMUL to work. So since it's 2x128bit FADD and 2x128bit FMUL, that results in 2x128bit FMAC, which can fuse into 1x256bit FMAC.

Well, actually, the design states it can only execute one 128-bit FMAC per cycle.

128-bit FADD/FMUL/FMAC{iADD/iMUL} + 128-bit FADD/FMUL/FMAC{iADD/SHUF} + 128-bit FADD/SHUF + 128-bit FADD/FMAC{iADD} if you check out the execution allocation.

FP0 needs FP3 to execute FMA and FP1 needs FP3 to execute FMA. The integer/memory side is lackluster, and the FPU side is a clusterhump.
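
A minimal sketch of why that pairing caps FMA throughput at one per cycle. It only models the claim made in this post (each FMA occupies FP0 or FP1 plus FP3); the pipe names and the pairing rule are assumptions taken from the leaked patch discussion, not from AMD documentation, and `max_fma_per_cycle` is a hypothetical helper.

```python
# Hypothetical model of the pipe pairing described above. The pairs come from
# this post ("FP0 needs FP3", "FP1 needs FP3"), not from any AMD document.
FMA_PAIRS = [("FP0", "FP3"), ("FP1", "FP3")]


def max_fma_per_cycle(pairs):
    """Return the largest number of pairs that can issue in the same cycle
    without two of them needing the same pipe."""
    best = 0
    n = len(pairs)
    # Try every subset of pairs; the sets are tiny, so brute force is fine.
    for mask in range(1 << n):
        chosen = [pairs[i] for i in range(n) if mask & (1 << i)]
        used = [pipe for pair in chosen for pipe in pair]
        if len(used) == len(set(used)):  # no pipe is used twice
            best = max(best, len(chosen))
    return best


print(max_fma_per_cycle(FMA_PAIRS))
```

Both pairs need FP3, so only one of them fits in a cycle. If the second pair borrowed a different pipe instead (say FP2, which is the typo theory), the same function would report two.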

BD/PD: 1 x 128-bit FMAC/iMAC + 1 x 128-bit FMAC/XBAR + 128-bit iADD + 128-bit iADD
SR/XV: 1 x 128-bit FMAC/iADD/iMAC + 1 x 128-bit FMAC/XBAR + 128-bit iADD/SHUF
P.S. Packed Integer MAC = XOP, XBAR = Super Shuf

Zen FPU = {Mind-blown} // BD/SR FPU = {Oh that is pretty simple.}
Edited by Seronx - 10/9/15 at 4:29am
AMD FX ~Seronx (16 items)
CPU: FX-9800P
Motherboard: Acer Wasp
Graphics: R7 M440
RAM: SK Hynix HMA41GS6AFR8N-TF
Hard Drive: KINGSTON RBU-SNS8152S3128GG2
Hard Drive: TOSHIBA MQ01ABD100
Optical Drive: HL-DT-ST DVDRAM GUE1N
Cooling: Stock
OS: Microsoft Windows 10 Home Build 14393
Monitor: Viewsonic XG2401 24 Hz-144 Hz
Keyboard: Ducky Channel Shine 3
Power: Stock 65W
Case: Acer Exoskeleton
Mouse: Steelseries Rival 300
Mouse Pad: Razer Megasoma
Audio: AMD-Realtek ALC255
post #172 of 758
Quote:
Originally Posted by Seronx

FP0 needs FP3 to execute FMA and FP1 needs FP3 to execute FMA. The integer/memory side is lackluster, and the FPU side is a clusterhump.

Indeed, I noticed it, but as I said previously in this thread, I think it might be a typo, since FP2 is fully used in other operations. I don't see how crippling that one unit for that one instruction makes sense in any way. Who knows though, just seems too weird of a design choice IMO.
post #173 of 758
Quote:
Originally Posted by The Stilt

By saying "your particular selection of benchmarks heavily favors Haswell's improvements" you of course mean that these benchmarks are floating point heavy instead of being integer heavy.

15h CPUs never had real issues with integer performance since they had sufficient resources built into them, which never was the case with FP.
It would be silly to use integer heavy workloads to predict the performance of Zen, since the improvement in integer performance won't decide the fate of AMD. The floating point performance of Zen will. 15h CPUs are behind the competition in integer performance too, but the difference isn't even remotely as massive as in floating point.

By which I mean your ENTIRE selection of benchmarks is known to heavily favor one genre of instructions over the entire spectrum of instructions. And, in fact, to favor the instructions not most heavily used in the majority of applications, i.e. it is not a representative sample. You need to add more to this.

WebXPRT, 3dPM, Google Octane V2, and many other real-world benchmarks don't show anywhere near the improvements on the Intel side (Intel over Intel).

Benefits over Sandy Bridge for Haswell:

Single Threaded (using forced affinity, or single threaded benchmarks):
CB-R10 ST: 19%
CB-15 ST: 16%
CB-R11.5 ST: 14%
3dPM ST: 5%
7-zip: 0%*
WebXPRT: -1.5%*
Octane V2: -3%*

Multi-threaded (i7 2600k @ 4.4 vs i7 4790k @ 4.4):
HB-4K: 29%
Agisoft: 28%
HB-LQ: 19%
x265: 17%
x264-P2: 16%
CB-R10 MT: 16%
CB-15 MT: 15%
CB-R11.5 MT: 12%
x264-P1: 9%
3dPM: 6%
7-Zip: 4%*
WebXPRT: 3%*
Octane V2: -5%*

* I'll be re-running these benchmarks soon to verify; most of these are just collected from the web.

In the end, that is a 13% improvement over Sandy Bridge, which is also in line with Intel's claims and most/all reviews.
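
For anyone who wants to check the arithmetic, here is a quick sketch that averages the two lists above. The percentages are copied straight from this post; the plain arithmetic mean is an assumption about how the 13% figure was reached. With these inputs the multi-threaded list averages to 13%, the single-threaded list to about 7%, and the two pooled together to about 11%.

```python
from statistics import mean

# Haswell-over-Sandy Bridge gains in percent, copied from the lists above.
single_threaded = {
    "CB-R10 ST": 19, "CB-15 ST": 16, "CB-R11.5 ST": 14, "3dPM ST": 5,
    "7-zip": 0, "WebXPRT": -1.5, "Octane V2": -3,
}
multi_threaded = {
    "HB-4K": 29, "Agisoft": 28, "HB-LQ": 19, "x265": 17, "x264-P2": 16,
    "CB-R10 MT": 16, "CB-15 MT": 15, "CB-R11.5 MT": 12, "x264-P1": 9,
    "3dPM": 6, "7-Zip": 4, "WebXPRT": 3, "Octane V2": -5,
}

# Plain arithmetic means; an assumption, since the post does not say how
# the headline number was computed.
st_mean = mean(single_threaded.values())
mt_mean = mean(multi_threaded.values())
pooled_mean = mean(list(single_threaded.values()) + list(multi_threaded.values()))

print(f"single-threaded mean: {st_mean:.1f}%")
print(f"multi-threaded mean:  {mt_mean:.1f}%")
print(f"pooled mean:          {pooled_mean:.1f}%")
```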

Notice that your benchmark choices are entirely at the top end of the spectrum? You're showing Haswell in the best possible light, and also focusing on the one area that will show 15h CPUs in their worst light. Using just the above benchmarks (and POV-Ray, whose results seem to be in yet another spreadsheet...) puts Zen in the position of matching Haswell.

I have another collection of tests which compares Intel CPUs from Penryn to Skylake. The benchmarks aren't fully uniform, but they are as close as I could manage with the time spread. The full spread of benchmarks (as many as 20 comparing adjacent iterations) shows the following:

Penryn: 100%
Nehalem: 109%
Sandy: 118.8%
Ivy: 125.94%
Haswell: 137.27%
Skylake: 149.63%
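
The chain above is what you get from compounding per-generation gains. A small sketch, assuming steps of 9%, 9%, 6%, 9% and 9%: those step sizes are my back-calculation from the listed figures, not numbers taken from the underlying spreadsheet.

```python
# Per-generation gains are assumptions, chosen because compounding them
# reproduces the figures listed above (to the listed precision).
steps = [
    ("Nehalem", 0.09),
    ("Sandy", 0.09),
    ("Ivy", 0.06),
    ("Haswell", 0.09),
    ("Skylake", 0.09),
]

level = 100.0  # Penryn is the baseline
table = {"Penryn": level}
for name, gain in steps:
    level *= 1 + gain  # each generation builds on the previous one
    table[name] = level

for name, value in table.items():
    print(f"{name}: {value:.2f}%")
```

The point of compounding rather than adding: five steps that sum to 42% end up at roughly 149.6%, not 142%.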

Intel's claims are very little different. In fact, Intel's claimed performance increase for Haswell over Penryn is just 39.39%, so my results are beautifully in line with Intel's.

Excavator is effectively tied with Penryn, though it undoubtedly excels in some areas I am unable to do a direct comparison, so I assume them to be equal. This method, too, shows Zen and Haswell almost exactly even.

Again, of course, provided Zen actually gives us a 40% boost.

A 38% boost changes the story, as does a 42% boost. But that's just because Intel has almost exclusively focused on improvements that help sell certain classes of CPU.
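
To put numbers on that sensitivity, here is a rough sketch using the index values from this post (Haswell 137.27, Skylake 149.63) and the same assumption that Excavator equals Penryn at 100. `zen_index` is a hypothetical helper for illustration only.

```python
# Index values from the list earlier in this post (Penryn = 100).
HASWELL = 137.27
SKYLAKE = 149.63
EXCAVATOR = 100.0  # assumed equal to Penryn, as stated in the post


def zen_index(boost_pct):
    """Projected Zen index for a given IPC boost over Excavator, in percent."""
    return EXCAVATOR * (1 + boost_pct / 100)


for boost in (38, 40, 42):
    zen = zen_index(boost)
    vs_haswell = zen / HASWELL - 1
    vs_skylake = zen / SKYLAKE - 1
    print(f"+{boost}%: index {zen:.1f}, "
          f"{vs_haswell:+.1%} vs Haswell, {vs_skylake:+.1%} vs Skylake")
```

On these inputs a 40% boost lands roughly 2% ahead of Haswell and roughly 6% behind Skylake, and a couple of points either way moves Zen from about level with Haswell to a few percent ahead.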

Skylake is out of reach for Zen, but Zen+ should match or exceed it. At which point in time Intel will already have Kaby or Cannon out, and will maintain a 10~15% or greater IPC lead.

Zen gets AMD closer, but it does not deliver them an IPC win.

I expect AMD to make up for that deficit using SMT and more cores at lower price points. If they can't reach 4.5GHz, having Haswell IPC would be a moot point... since Haswell CAN reach 4.5GHz.
post #174 of 758
God bless you guys.

Still wish I went to school.
post #175 of 758
I always thought the gap between Nehalem and SB should be quite big. A possible 25% improvement in IPC. On the other hand, Skylake seems to show too much improvement over Haswell; as I recall, it's only a 4% IPC improvement at best.

If the above comparison between the different generations of Intel CPUs is true, then it's sad to see AMD being at least 5 generations behind.

Let's just hope DX12 can give Zen more of an edge in gaming (due to its higher number of cores). Then even if its IPC only matches Haswell, the extra cores will offset the lower-IPC disadvantage against Kaby Lake.
Project Frostbite (15 items)
CPU: Intel Core I7-6800K OC 4.2GHz @ 1.28 Vcore
Motherboard: ASUS X99 Sabertooth
Graphics: Gigabyte Aorus GTX 1080 Ti 11G
RAM: Corsair Vengeance White LED 4x8GB DDR4-3200MHz
Hard Drive: Samsung Evo 850 1TB
Cooling: Corsair H100i V2 AiO
OS: Window 10 64-Bits Home Premium
Monitor: Acer Predator XB271HU
Keyboard: Tesoro Spectrum Mechanical Keyboard (Blue Switc...
Power: Thermaltake Toughpower 1000W Gold
Case: Phantek Enthoo Pro Full Acrylic Black
Mouse: Razer Diamondback 3G
Audio: Audio Engine A2+ Speaker
Audio: Audio Engine D1 DAC
Audio: Audio Engine S8 Subwoofer
post #176 of 758
Quote:
Originally Posted by looncraz


Again, of course, provided Zen actually gives us a 40% boost.

A 38% boost changes the story, as does a 42% boost. But that's just because Intel has almost exclusively focused on improvements that help sell certain classes of CPU.

Skylake is out of reach for Zen, but Zen+ should match or exceed it. At which point in time Intel will already have Kaby or Cannon out, and will maintain a 10~15% or greater IPC lead.

Zen gets AMD closer, but it does not deliver them an IPC win.

I expect AMD to make up for that deficit using SMT and more cores at lower price points. If they can't reach 4.5GHz, having Haswell IPC would be a moot point... since Haswell CAN reach 4.5GHz.

Zen's 40% boost over Excavator is supposed to be independent of process shrinkage. It's impossible to determine where Zen will be in comparison to Skylake, because none of the Intel benchmarks really separate improvement due to the architecture from improvement due to the process shrink.
Edited by variant - 11/12/15 at 12:16pm
post #177 of 758
Quote:
Originally Posted by spurdomantbh

That image was proven fake. It's not 2x256bit FMAC, because FMAC requires both FADD and FMUL to work. So since it's 2x128bit FADD and 2x128bit FMUL, that results in 2x128bit FMAC, which can fuse into 1x256bit FMAC.

Not entirely fake.
post #178 of 758
Quote:
Originally Posted by st0necold

God bless you guys.

Still wish I went to school.

You don't have to go to school (assuming you mean college) to learn it, you just have to be at least as insane as I am :D

Of course, it isn't something you can figure out in a year or two on your own, either. Guidance is extremely helpful.
post #179 of 758
Quote:
Originally Posted by variant

Zen's 40% boost over Excavator is supposed to be independent of process shrinkage. It's impossible to determine where Zen will be in comparison to Skylake, because none of the Intel benchmarks really separate improvement due to the architecture from improvement due to the process shrink.

This is true, we don't know what Intel did to reach Skylake's performance levels. If the transistors switch fast enough and all they did was simplify a couple of stages as a result, then this is easily something that AMD could have done independently of the larger architecture as well (or, perhaps, something they expect to do for Zen+).

Maybe this is why Intel won't disclose what they have done? Because, in essence, they did nothing.
post #180 of 758
Quote:
Originally Posted by looncraz

This is true, we don't know what Intel did to reach Skylake's performance levels. If the transistors switch fast enough and all they did was simplify a couple of stages as a result, then this is easily something that AMD could have done independently of the larger architecture as well (or, perhaps, something they expect to do for Zen+).

Maybe this is why Intel won't disclose what they have done? Because, in essence, they did nothing.

Considering Penryn and Nehalem were 45nm and Skylake is now at 14nm, there should have been large gains from process shrinkage alone, and 8 cores should probably be standard for a top-of-the-line consumer CPU. Yet we don't even see 50% gains and we are still only getting 4 cores. Instead they've added an integrated GPU, and we have seen much larger increases in the power of the iGPU with each generation. We even know Kaby Lake, which will be competing with Zen+, is basically Skylake with a better iGPU. Intel's focus simply does not appear to have been on providing a better CPU, but instead on providing a better APU.