Just a heads up for anyone buying X299, if you want full AVX capability, you need to buy the 7900X and up - Page 3 - Overclock.net - An Overclocking Community

post #21 of 27 (permalink) Old 06-23-2017, 07:37 PM
Hardware Enthusiast
 
Mysticial's Avatar
 
Join Date: Aug 2015
Posts: 606
Rep: 66
I just received benchmark results showing that both the 7900X and the 7800X have full-throughput AVX512.

From HWBOT: http://forum.hwbot.org/showthread.php?p=490227#post490227

I wrote this benchmark myself. It is the same benchmark that found the Ryzen FMA bug.


Here's the 7900X @ 4.5 GHz:


The benchmark shows: 1443.84 GFlops
Theoretical limit (full-throughput AVX512): (2 FMA/cycle) * (2 Flops/FMA) * (8 DP/instruction for AVX512) * (10 cores) * (4.5 GHz) = 1440 GFlops

(It's slightly over the theoretical limit since there are minor timing fluctuations.)

And here's the 7800X @ 4.5 GHz:


The benchmark shows: 872.832 GFlops
Theoretical limit (half-throughput AVX512): (1 FMA/cycle) * (2 Flops/FMA) * (8 DP/instruction for AVX512) * (6 cores) * (4.5 GHz) = 432 GFlops
Theoretical limit (full-throughput AVX512): (2 FMA/cycle) * (2 Flops/FMA) * (8 DP/instruction for AVX512) * (6 cores) * (4.5 GHz) = 864 GFlops

(Again, timing fluctuations make it slightly above the theoretical limit.)
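
For anyone who wants to redo the arithmetic, here is a minimal sketch of the theoretical-limit formula used above. The function name `peak_gflops` and its parameter names are made up for illustration; the numbers are the ones from this post.

```python
def peak_gflops(fma_per_cycle, cores, ghz, dp_lanes=8, flops_per_fma=2):
    """Theoretical double-precision peak in GFlops.

    fma_per_cycle: 512-bit FMA instructions issued per cycle per core
    dp_lanes:      doubles per AVX512 instruction (512 bits / 64 bits = 8)
    flops_per_fma: one multiply plus one add per lane
    """
    return fma_per_cycle * flops_per_fma * dp_lanes * cores * ghz


# 7900X: 10 cores at 4.5 GHz, full throughput
print(peak_gflops(2, 10, 4.5))  # 1440.0

# 7800X: 6 cores at 4.5 GHz, half vs. full throughput
print(peak_gflops(1, 6, 4.5))   # 432.0
print(peak_gflops(2, 6, 4.5))   # 864.0
```

The measured 872.832 GFlops on the 7800X sits right at the full-throughput figure and at roughly twice the half-throughput one.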

This is very puzzling:
  • Did AnandTech get it wrong? IIRC, another source said the same thing.
  • Did Intel initially intend the lower-core-count parts to have half throughput, but decide to change that at the last moment because of Ryzen?
  • Is it because these are engineering samples (ES) and the retail chips will only have half throughput?
  • I have first-hand information directly from Intel that servers are also supposed to be split into half-throughput and full-throughput AVX512 parts. But that was several months ago. Did things change this quickly?
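
One way to frame the puzzle is to run the formula backwards and ask how many 512-bit FMAs per cycle per core the measured numbers imply. This is only a rough sketch: `implied_fma_per_cycle` is a hypothetical helper, and it assumes every core really held 4.5 GHz for the whole run.

```python
def implied_fma_per_cycle(measured_gflops, cores, ghz, dp_lanes=8, flops_per_fma=2):
    """FMA issue rate per core that a measured GFlops figure would imply."""
    return measured_gflops / (flops_per_fma * dp_lanes * cores * ghz)


# (chip, measured GFlops, core count), all at 4.5 GHz as reported above
results = [("7900X", 1443.84, 10), ("7800X", 872.832, 6)]

for name, gflops, cores in results:
    rate = implied_fma_per_cycle(gflops, cores, 4.5)
    print(f"{name}: {rate:.3f} FMA/cycle/core")
```

Both chips come out at about 2 FMA/cycle per core. A half-throughput part would be expected to land near 1.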
Mysticial is offline  
post #22 of 27 (permalink) Old 06-23-2017, 07:41 PM
Linux Lobbyist
 
Seijitsu's Avatar
 
Join Date: Jan 2017
Posts: 180
Rep: 14
Quote:
Originally Posted by Mysticial View Post

I just received benchmark results showing that both 7900X and the 7800X have the full-throughput AVX512.

It very well could be an ES thing; it needs to be tested on retail chips.
Seijitsu is offline  
post #23 of 27 (permalink) Old 06-23-2017, 07:49 PM
New to Overclock.net
 
Nebulous's Avatar
 
Join Date: Sep 2006
Location: The Empire State
Posts: 309
Rep: 7
Quote:
Originally Posted by Seijitsu View Post

It very well could be an ES thing; it needs to be tested on retail chips.

Agreed. ES always show benefits to entice the consumer.

ASRock Z270 Extreme 4 / i7 7700K @ 5.0GHz / GSKILL Ripjaws V F4-3200C16-8GVGB 16GB (2x8GB)
H2O = XSPC RayStorm / Coolgate CG480 / Lowara D5 / XSPC Dual Bay Res V1
LITEON LCS-256L9S-11 256GB SSD / WD RED Nas 3tb x2 -RAID-0 / WD VelociRaptor 300GB
Zotac GTX 1070 AMP! Extreme @ 2100 / 4600
Creative Sound Blaster Z / Logitech Z560 / Senn HD518
Seasonic Prime Titanium SSR-650TD 650w
Phanteks Enthoo Primo / Win10 Pro x64
** Under TITLE II, SECTION 210.340: it is legal to overclock **


Nebulous is offline  
post #24 of 27 (permalink) Old 06-24-2017, 09:19 AM - Thread Starter
Meeeeeeeow!
 
CrazyElf's Avatar
 
Join Date: Dec 2011
Location: Ontario, Canada
Posts: 2,223
Rep: 428
Quote:
Originally Posted by Mysticial View Post

I just received benchmark results showing that both 7900X and the 7800X have the full-throughput AVX512.


There are other sources reporting the same thing as AnandTech, namely that the sub-10-core parts only get half-throughput AVX-512.

http://www.tomshardware.com/reviews/intel-core-i9-7900x-skylake-x,5092-3.html

Quote:
Core i9-7900X employs two 256-bit AVX FMA units per core that operate in parallel, whereas Ryzen's Zen architecture divides 256-bit AVX operations across two FMA units per core. Intel deactivates one FMA per core on the sub-10-core Skylake-X models. As such, Core i9-7900K has an inherent advantage in the y-cruncher benchmark, a single- and multi-threaded program that computes Pi using AVX instructions. We tested with version 0.7.2.9469, which includes Ryzen optimizations.

The -7900X's single-core SHA2-256 test results are nearly twice that of the two previous-generation models due to Intel's targeted AVX2 optimizations for hashing performance. That same advantage carries over to the threaded test. Intel offers AVX-512 support with the Skylake-X processors but doesn't employ all 11 features in the desktop models. Instead, the company targets specific feature sets at different market segments.

http://techreport.com/review/32111/intel-core-i9-7900x-cpu-reviewed-part-one

Quote:
What's more, not every Core X chip in the lineup will enjoy the same boost in SIMD performance from AVX-512. Only the Core i9 series of CPUs will ship with the dedicated AVX-512 FMA. The Core i7-7800X and Core i7-7820X will still have the wider registers for AVX-512, but they'll only execute instructions using the pair of 256-bit AVX units common to all Skylake chips. This exercise in segmentation might surprise people expecting a uniform performance increase from AVX-512 across all the CPUs that support it. (The Kaby Lake-X Core i5-7640X and Core i7-7740X won't support AVX-512 at all.)

Because of those caveats, we may be waiting a while for mainstream desktop applications that can really take advantage of all the extra parallelism on offer from these new instructions. Scientific-computing, deep-learning, and financial-services folks will probably be drooling for AVX-512, but regular Joes and Janes probably won't see any major speedups until companies recompile their software (at the very least). That assumes AVX-512 is coming to mainstream Intel CPUs, as well.


It looks like there are multiple sources.

If they changed it at the last minute, it seems that so far this has not been communicated to us or to most tech reviewers.
CrazyElf is offline  
post #26 of 27 (permalink) Old 07-10-2017, 07:08 PM
Hardware Enthusiast
 
Mysticial's Avatar
 
Join Date: Aug 2015
Posts: 606
Rep: 66
Mysticial is offline  
post #27 of 27 (permalink) Old 08-15-2017, 05:11 AM
New to Overclock.net
 
autoshot's Avatar
 
Join Date: Nov 2014
Posts: 54
Rep: 0
Hello everyone!

I discovered this topic while trying to find out how well Handbrake supports AVX, since I'm planning to finally upgrade my Xeon X5650 rig but still don't know which CPU to pick as a successor :( More precisely, my system is (and will be, at least for now) primarily used to convert 4K30 H.264 and 1080p60 H.264 footage from various sources (drone, action cam, iPhone) to 4K30 H.265, 1080p30 H.264/H.265 and 720p30/60 H.264 using Handbrake. In addition to that, I sometimes run Monte Carlo AC/DC optimal power flow simulations for my Ph.D. and occasionally play games like GTA V or Project CARS.

By now I've narrowed the choice down to the TR 1950X or the i9 7940X. Unfortunately, both CPUs have their upsides and downsides :(

AMD
- soldered
- significantly better value
- 64 PCIe lanes
- motherboard power supply (apparently) not undersized
- I like AMD :)

INTEL
- more future-proof thanks to AVX512? (important to me since I usually keep my systems for many years)
- probably better overclockability (after delidding, that is)
- Thunderbolt 3 support
- higher single-core IPC
- availability of CPU-coolers

A year ago I was 100% certain I would buy Skylake-X. Then the first information about Threadripper became public, along with Intel's decision to no longer solder the heat spreader of its new HEDT CPUs, so I started doubting my initial plan and finally decided to go for AMD this time. At some point, however, I found out how much AVX can speed up things like video encoding, and now my fear is that TR might not be the best choice for my future system after all, especially given the encoding results from AnandTech and the fact that Handbrake apparently cannot use more than 20 threads anyway but will probably make use of AVX512 soon. What would you do?


autoshot is offline  