[WCCFT]AMD Zen Engineering Sample Benchmarks Leak Out – Summit Ridge CPU Faster Than The Intel Core i5 4670K In AotS Benchmark - Page 31 - Overclock.net

post #301 of 306 Old 10-05-2016, 07:59 PM
AMD Overclocker
 
BulletBait's Avatar
 
Join Date: Jun 2015
Location: St. Paul, MN
Posts: 1,393
Mentioned: 0 Post(s)
Tagged: 0 Thread(s)
Quoted: 0 Post(s)
Liked: 150
Quote:
Originally Posted by Marios145 View Post

The BD/PD module can execute 2x128bit FP instructions and also has 2x integer cores(2 cores sharing the FPU).

Zen core can execute 2x128bit FP instructions and also has an integer core(there's no sharing here).

If I remember correctly, the Zen FMAC is a split 2x 256bit profile, carrying forward the 15h architecture of a split 2x 128bit FMAC. The difference is assignment of the whole FMAC (both 256bit) to a single core instead of sharing the FMAC between two cores.



So while Zen can theoretically perform 512bit instructions, the instructions weren't included in the Linux update. The speculation was that although those FMACs could pair to perform a 512bit instruction, the ALUs did not support it. So it was a bit of 'future proofing' built into the Zen FMACs, likely for Zen+, when the 512bit AVX instruction set starts to become mainstream.

Maybe I'm completely wrong, but that's the way I read the news and AMD's block diagram.

BulletBait is offline  
post #302 of 306 Old 10-05-2016, 08:03 PM
Programmer
 
Join Date: Sep 2011
Posts: 651
Mentioned: 0 Post(s)
Tagged: 0 Thread(s)
Quoted: 0 Post(s)
Liked: 104
Quote:
Originally Posted by Marios145 View Post

The BD/PD module can execute 2x128bit FP instructions and also has 2x integer cores(2 cores sharing the FPU).

Zen core can execute 2x128bit FP instructions and also has an integer core(there's no sharing here).

That gives BD/PD a total of 8x128bit FPUs.

It gives Zen a total of 16x128bit FPUs.

Cinebench uses mostly SSE, which is 128bit. (Correct me if I'm wrong.)

Now, all other things being equal, the 8C/8T Zen will theoretically have twice the FP performance of 4M/8C/8T BD and derivatives.

While Zen can, theoretically, execute four concurrent floating point instructions, not all of its pipelines are equal... or independent... enough to say it can execute 2x128 instructions... though it's not terribly far off, to be fair. Scaling should be less than you might think... because ILP extraction is greatly improved... which is something that is diminished when you are executing two threads at once on Zen.

When SMT is active, we know that the execution units are not segmented, they are competitively utilized. That includes the FPU. SSE instructions can span across the entire FPU with just one thread, so adding another thread will not extract much more performance (well, 20% or so is a fair guess).

Without SMT, having twice the FPU width will only add, maybe, 50% more performance... more in some cases, much much less in others (bordering on zero). The rest of the CPU will determine how that relates to program performance.

However, each Zen FPU is superior to each Excavator FPU so, technically, Zen will have more than double the theoretical floating point capabilities... but hamstrung by ILP limitations and supporting infrastructure issues. SSE tasks may execute, in some cases, 100%+ faster, but by the time those results are usable, we have eaten up half of the advantage in other areas (such as waiting on an AGU or ALU).
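To put rough numbers on the point that the rest of the CPU determines how a faster FPU shows up in program performance, here is a minimal sketch using Amdahl's law. The FP-bound fractions below are hypothetical inputs chosen for illustration, not measurements of Zen or any real workload, and `overall_speedup` is just a name picked for this example.

```python
# Amdahl's law: only the FP-bound part of the runtime gets faster.
# fp_fraction: share of runtime spent in FP work (hypothetical input).
# fp_speedup:  how much faster that FP work runs (2.0 = "100% faster").
def overall_speedup(fp_fraction, fp_speedup):
    return 1.0 / ((1.0 - fp_fraction) + fp_fraction / fp_speedup)

# Hypothetical workloads with different shares of FP-bound time.
for fp_fraction in (0.25, 0.5, 0.75, 1.0):
    gain = overall_speedup(fp_fraction, 2.0)
    print(f"FP share {fp_fraction:.2f}: {gain:.2f}x overall")
```

Under these assumptions, doubling FP throughput only doubles overall performance when the workload is entirely FP-bound; a workload that spends half its time elsewhere gains about a third.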
looncraz is offline  
post #303 of 306 Old 10-05-2016, 08:08 PM
Programmer
 
Join Date: Sep 2011
Posts: 651
Mentioned: 0 Post(s)
Tagged: 0 Thread(s)
Quoted: 0 Post(s)
Liked: 104
Quote:
Originally Posted by BulletBait View Post

If I remember correctly, the Zen FMAC is a split 2x 256bit profile, carrying forward the 15h architecture of a split 2x 128bit FMAC. The difference is assignment of the whole FMAC (both 256bit) to a single core instead of sharing the FMAC between two cores.



So while Zen can theoretically perform 512bit instructions, the instructions weren't included in the Linux update. The speculation was that although those FMACs could pair to perform a 512bit instruction, the ALUs did not support it. So it was a bit of 'future proofing' built into the Zen FMACs, likely for Zen+, when the 512bit AVX instruction set starts to become mainstream.

Maybe I'm completely wrong, but that's the way I read the news and AMD's block diagram.



Due to the fact that multiple pipelines have to work together to execute certain instructions, it seems the pipelines are something like 288-bit or 304-bit combined.
looncraz is offline  
post #304 of 306 Old 10-05-2016, 08:16 PM
*cough* Stock *cough*
 
Marios145's Avatar
 
Join Date: May 2011
Posts: 344
Mentioned: 0 Post(s)
Tagged: 0 Thread(s)
Quoted: 0 Post(s)
Liked: 70
Anandtech:
Quote:
The FP Unit uses four pipes rather than three on Excavator, and we are told that the latency in Zen is reduced as well for operations (though more information on this will come at a later date). We have two MUL and two ADD in the FP unit, capable of joining to form two 128-bit FMACs, but not one 256-bit AVX. In order to do AVX, the unit will split the operations accordingly.

But the theoretical peak FLOPS should be:
4 Zen cores = 4 BD modules
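A quick sketch of the arithmetic behind that equality, assuming (per the AnandTech quote) two 128-bit FMACs per Zen core and two 128-bit FMACs per BD module, single-precision (32-bit) elements, and counting each FMA as two FLOPs. The function name is made up for this example, and clock speed is left out, so these are per-cycle peaks only.

```python
# Peak single-precision FLOPs per cycle for a group of cores or modules.
# Assumption: every FMAC retires one fused multiply-add per lane per cycle.
def peak_flops_per_cycle(units, fmacs_per_unit, width_bits=128, element_bits=32):
    lanes = width_bits // element_bits   # 4 SP lanes in a 128-bit FMAC
    flops_per_fma = 2                    # one multiply plus one add
    return units * fmacs_per_unit * lanes * flops_per_fma

zen_4_cores = peak_flops_per_cycle(units=4, fmacs_per_unit=2)
bd_4_modules = peak_flops_per_cycle(units=4, fmacs_per_unit=2)
zen_8_cores = peak_flops_per_cycle(units=8, fmacs_per_unit=2)

print(zen_4_cores, bd_4_modules)   # 64 64
print(zen_8_cores / bd_4_modules)  # 2.0
```

On paper, then, 4 Zen cores match 4 BD modules, and an 8-core Zen has twice the per-cycle peak of a 4-module BD. This says nothing about how often either design actually reaches that peak.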




EDIT:
Techreport:
Quote:
Bulldozer's 128-bit FMAC units will work together on 256-bit vectors, effectively producing a single 256-bit vector operation per cycle. Intel's Sandy Bridge, due early in 2011, will have two 256-bit vector units capable of producing a 256-bit multiply and a 256-bit add in a single cycle, double Bulldozer's AVX peak.
....
With dual 128-bit FMACs, Bulldozer's peak FLOPS throughput should be comparable to Sandy Bridge's peak with AVX and 256-bit vectors.
Up to Ivy Bridge it was OK, but then in Haswell:
Anandtech:
Quote:
The other major addition to the execution engine is support for Intel's AVX2 instructions, including FMA (Fused Multiply-Add). Ports 0 & 1 now include newly designed 256-bit FMA units. As each FMA operation is effectively two floating point operations, these two units double the peak floating point throughput of Haswell compared to Sandy/Ivy Bridge. A side effect of the FMA units is that you now get two ports worth of FP multiply units, which can be a big boon to legacy FP code.
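The per-cycle peaks implied by the two quotes above can be sketched like this. It is a back-of-the-envelope model that assumes single-precision (32-bit) lanes, one result per lane per unit per cycle, and FMA counted as two FLOPs; the helper names are made up for the example and nothing here is a measured figure.

```python
SP_BITS = 32  # single-precision element width

def lanes(width_bits):
    return width_bits // SP_BITS

def fma_flops(n_units, width_bits):
    # Each FMA unit does a multiply and an add per lane per cycle.
    return n_units * lanes(width_bits) * 2

def mul_plus_add_flops(width_bits):
    # One multiply unit and one separate add unit, one FLOP per lane each.
    return 2 * lanes(width_bits)

peak = {
    "Bulldozer module (2x 128-bit FMAC)": fma_flops(2, 128),
    "Sandy/Ivy Bridge core (256-bit MUL + 256-bit ADD)": mul_plus_add_flops(256),
    "Haswell core (2x 256-bit FMA)": fma_flops(2, 256),
}

for name, flops in peak.items():
    print(f"{name}: {flops} SP FLOPs/cycle")
```

With these assumptions a Bulldozer module and a Sandy Bridge core both come out at 16 SP FLOPs per cycle, which matches the "comparable" claim, while Haswell doubles that to 32.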

If you scroll down here, you will see that AVX on BD/PD is slower compared to XOP, even SSE3 is faster.


So yes, Zen MUST have 100% improvement in pure FP workloads, no excuses this time.
Marios145 is offline  
post #305 of 306 Old 10-06-2016, 05:24 AM
ATI Enthusiast
 
Join Date: Nov 2010
Location: England
Posts: 1,724
Mentioned: 0 Post(s)
Tagged: 0 Thread(s)
Quoted: 0 Post(s)
Liked: 61
Quote:
Originally Posted by DarkIdeals View Post

In the story the storm god Susano'o fought an eight headed serpent called Yamata no Orochi by leaving eight Sake barrels out to get all eight heads drunk, then cutting off each head one by one etc.. interesting piece of mythology for those who like that kinda thing
AMD = Naruto fans confirmed!

Phoenixlight is offline  
post #306 of 306 Old 10-06-2016, 05:28 AM
Overclocker
 
ku4eto's Avatar
 
Join Date: Oct 2013
Location: Bulgaria , Sofia
Posts: 2,777
Mentioned: 0 Post(s)
Tagged: 0 Thread(s)
Quoted: 0 Post(s)
Liked: 109
Quote:
Originally Posted by Phoenixlight View Post

AMD = Naruto fans confirmed!
Yes, because you need Sharingan for Susano'o. Sharingan is red, AMD are red -> AMD are SASKEEEE fans.

Enough with the OT :D

Avatar from the manga : GATE - Thus the JSDF Fought There!
Can be found on Bato.to website.

Previous Hardware:
CPU:
Intel Pentium 2 400 @400Mhz, AMD Athlon Orion 1000 @1Ghz, AMD Athlon Thunderbird 1000B @1Ghz, AMD Athlon 64 LE-1640 Orleans @2.6Ghz, OC to 2.9Ghz, AMD Athlon 64 X2 5050e Brisbane @2.6Ghz OC to 2.9Ghz, AMD Phenom II 960T @3.6Ghz 1605T 2.8Ghz CPU-NB

GPU:
Voodoo 3, GeForce2 MX 400, GeForce4 MX 440, Inno3D 7300GT 256MB AGP8X, Sapphire Radeon X550 256MB PCI-Ex16, PowerColor Radeon HD6950 1GB

Monitors:
Belinea 17" CRT, KTC 17" CRT, HP 19" Office monitor.
ku4eto is offline  