
Just a heads up for anyone buying X299, if you want full AVX capability, you need to buy the 7900X and up - Page 3

post #21 of 26
I just received benchmark results showing that both the 7900X and the 7800X have full-throughput AVX512.

From HWBOT: http://forum.hwbot.org/showthread.php?p=490227#post490227

I wrote this benchmark myself. It is the same benchmark that found the Ryzen FMA bug.


Here's the 7900X @ 4.5 GHz:


The benchmark shows: 1443.84 GFlops
Theoretical limit (full-throughput AVX512): (2 FMA/cycle) * (2 Flops/FMA) * (8 DP/instruction for AVX512) * (10 cores) * (4.5 GHz) = 1440 GFlops

(It's slightly over the theoretical limit since there are minor timing fluctuations.)

And here's the 7800X @ 4.5 GHz:


The benchmark shows: 872.832 GFlops
Theoretical limit (half-throughput AVX512): (1 FMA/cycle) * (2 Flops/FMA) * (8 DP/instruction for AVX512) * (6 cores) * (4.5 GHz) = 432 GFlops
Theoretical limit (full-throughput AVX512): (2 FMA/cycle) * (2 Flops/FMA) * (8 DP/instruction for AVX512) * (6 cores) * (4.5 GHz) = 864 GFlops

(Again, timing fluctuations make it slightly above the theoretical limit.)
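
The arithmetic above can be checked with a short Python sketch. This is only an illustration of the formula used in the post, not part of the benchmark itself; the names `peak_gflops` and `closest_mode` are hypothetical, and the only inputs are the figures quoted above (8 DP lanes per AVX512 instruction, 2 Flops per FMA, 4.5 GHz, and the reported GFlops).

```python
def peak_gflops(fma_per_cycle, cores, ghz, dp_lanes=8, flops_per_fma=2):
    """Theoretical double-precision peak in GFlops.

    fma_per_cycle: FMA instructions issued per core per cycle
                   (1 = half-throughput, 2 = full-throughput AVX512).
    dp_lanes:      doubles per 512-bit vector (512 / 64 = 8).
    """
    return fma_per_cycle * flops_per_fma * dp_lanes * cores * ghz


def closest_mode(measured, cores, ghz):
    """Return 'half' or 'full', whichever theoretical limit is nearer."""
    limits = {
        "half": peak_gflops(1, cores, ghz),
        "full": peak_gflops(2, cores, ghz),
    }
    return min(limits, key=lambda mode: abs(limits[mode] - measured))


if __name__ == "__main__":
    # (name, cores, measured GFlops) as reported in the post, both at 4.5 GHz.
    results = [("7900X", 10, 1443.84), ("7800X", 6, 872.832)]
    for name, cores, measured in results:
        full = peak_gflops(2, cores, 4.5)
        mode = closest_mode(measured, cores, 4.5)
        print(f"{name}: measured {measured} vs full limit {full} -> {mode}")
```

The 7800X result sits roughly 1% above the full-throughput limit and about 2x the half-throughput limit, which is why it cannot be explained by a single FMA unit per core.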

This is very puzzling:
  • Did AnandTech get it wrong? IIRC, there was another source that said the same thing.
  • Did Intel initially intend the lower-core parts to have half-throughput, but decide to change that at the last moment because of Ryzen?
  • Is it because these are ES samples and the retail ones will only have half-throughput?
  • I have first-hand information directly from Intel that servers are also supposed to be split into this half vs. full-throughput AVX512 thing. But that was several months ago. Did things change this quickly?

Edited by Mysticial - 6/23/17 at 6:40pm
post #22 of 26
Quote:
Originally Posted by Mysticial

I just received benchmark results showing that both 7900X and the 7800X have the full-throughput AVX512.

It very well could be an ES thing. It needs to be tested on retail.
post #23 of 26
Quote:
Originally Posted by Seijitsu

It very well could be an ES thing, needs to be tested on retail.

Agreed. ES always show benefits to entice the consumer.
post #24 of 26
Thread Starter 
Quote:
Originally Posted by Mysticial

I just received benchmark results showing that both 7900X and the 7800X have the full-throughput AVX512.


There are other sources reporting the same thing as AnandTech, namely that the parts below the 7900X only get half-throughput AVX512.

http://www.tomshardware.com/reviews/intel-core-i9-7900x-skylake-x,5092-3.html

Quote:
Core i9-7900X employs two 256-bit AVX FMA units per core that operate in parallel, whereas Ryzen's Zen architecture divides 256-bit AVX operations across two FMA units per core. Intel deactivates one FMA per core on the sub-10-core Skylake-X models. As such, Core i9-7900K has an inherent advantage in the y-cruncher benchmark, a single- and multi-threaded program that computes Pi using AVX instructions. We tested with version 0.7.2.9469, which includes Ryzen optimizations.

The -7900X's single-core SHA2-256 test results are nearly twice that of the two previous-generation models due to Intel's targeted AVX2 optimizations for hashing performance. That same advantage carries over to the threaded test. Intel offers AVX-512 support with the Skylake-X processors but doesn't employ all 11 features in the desktop models. Instead, the company targets specific feature sets at different market segments.

http://techreport.com/review/32111/intel-core-i9-7900x-cpu-reviewed-part-one

Quote:
What's more, not every Core X chip in the lineup will enjoy the same boost in SIMD performance from AVX-512. Only the Core i9 series of CPUs will ship with the dedicated AVX-512 FMA. The Core i7-7800X and Core i7-7820X will still have the wider registers for AVX-512, but they'll only execute instructions using the pair of 256-bit AVX units common to all Skylake chips. This exercise in segmentation might surprise people expecting a uniform performance increase from AVX-512 across all the CPUs that support it. (The Kaby Lake-X Core i5-7640X and Core i7-7740X won't support AVX-512 at all.)

Because of those caveats, we may be waiting a while for mainstream desktop applications that can really take advantage of all the extra parallelism on offer from these new instructions. Scientific-computing, deep-learning, and financial-services folks will probably be drooling for AVX-512, but regular Joes and Janes probably won't see any major speedups until companies recompile their software (at the very least). That assumes AVX-512 is coming to mainstream Intel CPUs, as well.


It looks like there are multiple sources.

If they changed it at the last minute, it has so far not been communicated to us or, it seems, to most tech reviewers.
post #25 of 26
Is anyone with a retail 7800X or 7820X willing to test the AVX512 throughput?
post #26 of 26
Looks like the retail 7800X does in fact have full-throughput:

http://www.pcgameshardware.de/Skylake-X-Codename-266252/News/Core-i7-7800X-AVX512-Durchsatz-1232713/

The guys who run that site sent me an email this morning. Using the same FLOPs benchmark, they were able to show full-throughput on the 7800X.

Alright... Now I'm interested in the story behind this mess.