[AT] AMD Zen Microarchitecture: Dual Schedulers, Micro-Op Cache and Memory Hierarchy Revealed - Page 10

post #91 of 233
I really love the armchair electrical engineers here. Uh huh, yeah, I'm sure a 14nm Excavator would absolutely destroy Zen. That's why AMD went through the trouble of designing a new core from scratch, right?

AMD took a gamble with Bulldozer, but lost since highly threaded applications didn't show up (and still haven't). It's simple: if the Bulldozer architecture could be made to work, they would have done so. It can't, so they're throwing it out.
post #92 of 233
Quote:
Originally Posted by AmericanLoco View Post

I'm sure a 14nm Excavator would absolutely destroy Zen. That's why AMD went through the trouble of designing a new core from scratch, right?
In the case of a 20nm/14nm 15h architecture, it wouldn't be 14nm Excavator. The valid nomenclature would be NG2 Bulldozer, if no codename was offered.

Bulldozer => Bulldozer
Enhanced Bulldozer => Piledriver
Next-Gen Bulldozer => Steamroller
Enhanced Next-gen Bulldozer => Excavator
Next-Gen 2 Bulldozer => ???

The reason for the trouble of designing a new core from scratch could be ANYTHING.
Quote:
Originally Posted by AmericanLoco View Post

AMD took a gamble with Bulldozer, but lost since highly threaded applications didn't show up (and still haven't). It's simple: if the Bulldozer architecture could be made to work, they would have done so. It can't, so they're throwing it out.
Bulldozer works better with lightly threaded applications; this is due to higher clocks than Intel/AMD Jaguar. The issue was heavily threaded applications, which could thrash the execution units, L1i, L1d, and L2. There are really only two workloads, memory(MLP/memory-level-parallelism) and execution(IPC/instructions-per-cycle), which are divided by energy(EPI/energy-per-instruction).

Stoney Ridge replacing Carrizo-L/Beema/Kabini is all the proof needed. If you're running a lightweight browser, which is faster: a 2 GHz core @ 15 W or a 3.5 GHz core @ 15 W?


AMD has pointed out there is no EPI improvement between Zen(GF14LPP) and Excavator(GF28A). Except, Excavator(GF28HPA) has a 10% EPI improvement over Excavator(GF28A).

Ars Technica;
Quote:
AMD has fully taken the wraps off its brand new seventh generation APU architecture Bristol Ridge, which it announced earlier this year. It promises users around a 20-percent boost in CPU performance and a 37-percent boost in GPU performance over Bristol Ridge's predecessor Carrizo, which launched in 2015.
Quote:
"We didn't change the shape of the transistor, but we changed transistor implant and gave the transistor much more mobility," explained Macri to Ars. "At any given voltage, we get more current out. It's typically what you'd call a process variant. GlobalFoundries did a great piece of work here. We basically got an extra 200MHz or so out of the core, for a nice 10 percent boost in performance, which is greater than what you typically get out of a simple process tweak. But this wouldn't have made a new product. I wouldn't be calling this a seventh generation product if all we did was get this."

There is also the FIVR in Bristol/Stoney but apparently AMD pulled a Skylake and Zen doesn't have a FIVR;
http://dl.acm.org/citation.cfm?id=2934586
Quote:
This paper describes modeling and implementation of a fully digital integrated linear voltage regulation system implemented in a 28nm x86-64 core to reduce power gating entry or exit latency. Running on a 100 MHz clock, the controller samples voltage using a time-to-digital converter, and controls a set of PFETs organized in a ring topology around the CPU cores to drop voltage down to a specified target value. A simple analytical model is developed and validated through fast Matlab-Simulink simulation, enabling quick design turnaround and reducing schedule impact.

The regulation system is designed to support input-output voltages in the range 1.3 V - 0.55 V. Digitally-controlled header resistance values range from 1.5 Ω to 2 mΩ. Stable processor behavior is observed down to 0.6 V, enabling fast pseudo-power gating entry and exit. In a high-performance x86-64 dual-core microprocessor chip, the controller enables an effective 6% frequency increase for lightly threaded applications by increasing the boost state residency.

Signals: "We are launching this new architecture, look at it! LOOK AT IT!" *Meanwhile, behind the scenes*: when is GlobalFoundries going to get to a 1.0 PDK for 22FDX so we can Pentium M ourselves?
Edited by Seronx - 8/18/16 at 7:42pm
post #93 of 233
Please read those charts more carefully.
Quote:
energy(EPI/energy-per-instruction).
Quote:
AMD has pointed out there is no EPI improvement between Zen(GF14LPP) and Excavator(GF28A). Except, Excavator(GF28HPA) has a 10% EPI improvement over Excavator(GF28A).

AMD's chart does NOT say that energy per instruction stays the same as Excavator. It says energy per cycle. The energy per cycle remains the same, but the IPC increases by over 40%. That means the EPI is BETTER than Excavator: over 40% more instructions for the same energy.
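
The arithmetic behind that can be sketched in a few lines. This is only a sketch under the two figures read off the chart (energy per cycle unchanged, IPC up 40%); the function name and the normalization to Excavator = 1.0 are made up for illustration and are not measured data.

```python
def energy_per_instruction(energy_per_cycle: float, ipc: float) -> float:
    """EPI = (energy per cycle) / (instructions per cycle)."""
    if ipc <= 0:
        raise ValueError("ipc must be positive")
    return energy_per_cycle / ipc


# Normalized, assumed inputs: Excavator is the 1.0 baseline for both quantities.
excavator_epi = energy_per_instruction(energy_per_cycle=1.0, ipc=1.0)
zen_epi = energy_per_instruction(energy_per_cycle=1.0, ipc=1.4)

# Fraction of energy saved per instruction.
epi_reduction = 1.0 - zen_epi / excavator_epi
# Extra instructions retired per unit of energy.
instr_per_energy_gain = excavator_epi / zen_epi - 1.0

print(f"EPI reduction: {epi_reduction:.1%}")
print(f"Instructions per unit energy: +{instr_per_energy_gain:.1%}")
```

With those inputs, energy per instruction falls by about 29% (1 - 1/1.4), which is the same statement as 40% more instructions per unit of energy.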
post #94 of 233
Quote:
Originally Posted by AmericanLoco View Post

AMD's chart does NOT say that energy per instruction stays the same as Excavator. It says energy per cycle. The energy per cycle remains the same, but the IPC increases by over 40%. That means the EPI is BETTER than Excavator: over 40% more instructions for the same energy.
Zen peaks @ 3.2 GHz, Excavator(GF28HPA) peaks @ 4.4 GHz. Both consume the same TDP. This implies EPI has worsened(same units/4 units) or is the same(more units/6 units).

-> Throw in 22FDX & all of its optimization.
-> CPUID Fn8000_001A_EAX with bit 2 @ value 1.
-> Addition in the AGLU pipes. (& everything else that can be applied to a complex ALU[EX0/EX1])
-> 32B Loads, 32B Stores
-> Lower Vmin(0.85v to 0.5v/0.9v to 0.55v)
-> Higher Fmax(4.4 GHz to 5 GHz)

...and Zen is dead on arrival.
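
For readers unfamiliar with the CPUID item in that list: Fn8000_001A is AMD's performance-optimization leaf, and as far as I recall from AMD's CPUID documentation, EAX bit 0 is FP128, bit 1 is MOVU, and bit 2 is FP256 (256-bit AVX ops executed at full width). Treat those bit names as an assumption to check against the current spec. Python cannot issue CPUID itself, so this sketch only decodes a register value you have already read some other way; the example value is hypothetical.

```python
from typing import Dict

# Assumed bit layout of CPUID Fn8000_001A_EAX (verify against AMD's spec).
PERF_OPT_BITS = {0: "FP128", 1: "MOVU", 2: "FP256"}


def decode_fn8000_001a_eax(eax: int) -> Dict[str, bool]:
    """Map each known bit of the EAX value to a named boolean flag."""
    return {name: bool((eax >> bit) & 1) for bit, name in PERF_OPT_BITS.items()}


# Hypothetical register value with only bit 2 set, as in the list above.
example_eax = 0b100
flags = decode_fn8000_001a_eax(example_eax)
print(flags)
```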
Edited by Seronx - 8/18/16 at 8:16pm
post #95 of 233
Quote:
Originally Posted by Seronx View Post

Zen peaks @ 3.2 GHz, Excavator(GF28HPA) peaks @ 4.4 GHz. Both consume the same TDP. This implies EPI has worsened(same units/4 units) or is the same(more units/6 units).

Can you imagine if someone were to make comparisons between an ES CPU's clocks and a post production CPU's clocks and make definitive judgements based off of that? Like the person thinks ES clocks should be comparable to those of final CPU clocks, like the 2.5 GHz of Bulldozer ES CPUs. That would be something to see...if someone were to do that. I'm not saying it's you...tho.
post #96 of 233
Quote:
Originally Posted by Seronx View Post

Zen peaks @ 3.2 GHz, Excavator(GF28HPA) peaks @ 4.4 GHz. Both consume the same TDP. This implies EPI has worsened(same units/4 units) or is the same(more units/6 units).

-> Throw in 22FDX & all of its optimization.
-> CPUID Fn8000_001A_EAX with bit 2 @ value 1.
-> Addition in the AGLU pipes.
-> 32B Loads, 32B Stores

Lower Voltages(0.85v to 0.5v/0.9v to 0.55v), Higher Frequencies (4.4 GHz to 5 GHz)... and Zen is dead on arrival.
You don't need to make any assumptions. It's laid out right there for you. The chart literally says the energy per cycle is the SAME as Excavator. The chart also states the IPC is over 40% higher. If the energy per cycle is the same and the instructions per cycle are higher, then by elementary school math the energy required per instruction is reduced.

There is literally no possible way you can spin it as energy per instruction being the same or worse... and stop pretending that you know what speeds Zen runs at. It's literally laid out right there in front of you. Just give up and admit you're wrong, otherwise you're just making yourself look worse. Keep throwing around your meaningless "technical CPU jargon" too - you're not fooling anyone into thinking that you know what you're talking about.
Edited by AmericanLoco - 8/18/16 at 8:17pm
post #97 of 233
Quote:
Originally Posted by one-shot View Post

Can you imagine if someone were to make comparisons between an ES CPU's clocks and a post production CPU's clocks and make definitive judgements based off of that? Like the person thinks ES clocks should be comparable to those of final CPU clocks, like the 2.5 GHz of Bulldozer ES CPUs. That would be something to see...if someone were to do that. I'm not saying it's you...tho.
1D/2D are pre-production SKUs. The 2.8 GHz Bulldozer was a pre-production FX-8150. Pre-production is up for comparison. The only problems with pre-production are idle and stock clocks, not Turbo clocks. These issues do not plague Summit Ridge like they did Bulldozer.

2D Summit Ridge(4c/8t@3.2 GHz) with an RX 480 performs worse than an A10-7890K(4c/4t@4.4 GHz) with an RX 480, which no one really saw since they were focused on 1D(8c/16t@2.8 GHz). That means Raven Ridge might actually be worse than Bristol Ridge. It would absolutely get obliterated by a 22FDX SKU. Summit Ridge to Kaveri is being optimistic for Raven Ridge to Bristol Ridge.
Edited by Seronx - 8/18/16 at 8:27pm
post #98 of 233
I need to buy me some popcorn!

And coffee. Tons of it.

This will be a long day of reading, I sense!
post #99 of 233
Zen plus LN2 should be interesting.
post #100 of 233
Quote:
Originally Posted by Seronx View Post

Zen peaks @ 3.2 GHz, Excavator(GF28HPA) peaks @ 4.4 GHz. Both consume the same TDP. This implies EPI has worsened(same units/4 units) or is the same(more units/6 units).

-> Throw in 22FDX & all of its optimization.
-> CPUID Fn8000_001A_EAX with bit 2 @ value 1.
-> Addition in the AGLU pipes. (& everything else that can be applied to a complex ALU[EX0/EX1])
-> 32B Loads, 32B Stores
-> Lower Vmin(0.85v to 0.5v/0.9v to 0.55v)
-> Higher Fmax(4.4 GHz to 5 GHz)

...and Zen is dead on arrival.

Wrong. That's energy per cycle, not energy per instruction. IPC is instructions per cycle, so Zen executes 40% more instructions per cycle at the same power draw. That means the efficiency gain is greater than 40% while operating at the same power draw, and in order to match power draw you'd have to overclock the Zen processor.

The 40% improvement over Excavator is a different question. It really depends on how AMD chooses to calculate it: 40% measured at Zen's clock or at FX's clock. Either way, if it's at Zen's clock, a 2.0 GHz Zen is as fast as a 2.8 GHz Excavator. So a 2.8 GHz Zen has the same power draw as a 2.8 GHz Excavator, while its performance is equal to a 3.92 GHz Excavator.

Now imagine that in a laptop! :)
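
The clock figures above come from multiplying by the IPC ratio, which can be sketched like this. It assumes performance scales linearly with clock times IPC and takes the 1.4 ratio straight from the "40%" claim; the function name is made up for illustration, and real workloads will not scale this cleanly.

```python
def equivalent_clock_ghz(clock_ghz: float, ipc_ratio: float) -> float:
    """Clock the lower-IPC core would need to match the higher-IPC core's throughput."""
    return clock_ghz * ipc_ratio


# Assumed Zen-to-Excavator IPC ratio, taken from the 40% figure in the post.
IPC_RATIO = 1.4

for zen_clock in (2.0, 2.8):
    excavator_clock = equivalent_clock_ghz(zen_clock, IPC_RATIO)
    print(f"{zen_clock:.1f} GHz Zen ~ {excavator_clock:.2f} GHz Excavator")
```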