Theories on why the SMT hurts the performance of gaming in Ryzen and some recommendations for the future - Page 5 - Overclock.net - An Overclocking Community

Theories on why the SMT hurts the performance of gaming in Ryzen and some recommendations for the future

post #41 of 56 (permalink) Old 03-06-2017, 03:03 AM
New to Overclock.net
 
superstition222's Avatar
 
Join Date: Sep 2014
Posts: 2,725
Rep: 218 (Unique: 102)
Quote:
Originally Posted by gtbtk View Post

Asrock did not even have any product ready on release day. That is not a business choice made from choice
It's also very possible for board makers to do a poor job even if they could have done a better one. That can be the result of business decisions.

It's not like software hasn't frequently had the "ship it in beta form and issue endless patches" model of development for a long time, in many cases. Board makers have a long history of issuing multiple revisions of the same board.

My Gigabyte UD3P 2.0 board, like all the others, will not boot with a multiplier that gives more than 4.4 GHz (I don't remember the number... 22 maybe). ASRock FX boards have been shipped without LLC and with poor VRM cooling. One board didn't even have the thermal pad covering all the VRMs, at least according to one person's picture here. ASRock made a board that is listed as supporting the 9000-series FX CPUs but isn't robust enough in terms of VRM cooling and quality to reliably do so.

Poor support for the features of Intel's Broadwell C CPUs is something that is pretty common, like being able to disable TDP throttling and manually adjust the EDRAM speed successfully. AsRock, oddly enough, has a good reputation in that area, as far as I know. Board makers decided that investing more time into making Broadwell C work better wasn't in their interest or Intel's.

Free expression is the base of human rights, the root of human nature, and the mother of truth. — 刘晓波 (Liu Xiaobo)
post #42 of 56 (permalink) Old 03-06-2017, 03:50 AM
stock...hahaha
 
gtbtk's Avatar
 
Join Date: Aug 2016
Location: Hong Kong
Posts: 1,616
Rep: 124 (Unique: 87)
Quote:
Originally Posted by superstition222 View Post
 
Quote:
Originally Posted by gtbtk View Post

Asrock did not even have any product ready on release day. That is not a business choice made from choice
It's also very possible for board makers to do a poor job even if they could have done a better one. That can be the result of business decisions.

It's not like software hasn't frequently had the "ship it in beta form and issue endless patches" model of development for a long time, in many cases. Board makers have a long history of issuing multiple revisions of the same board.

My Gigabyte UD3P 2.0 board, like all the others, will not boot with a multiplier higher than 4.4 GHz (don't remember the number... 22 maybe). AsRock FX boards have been shipped without LLC and with poor VRM cooling. One board didn't even have the thermal pad covering all the VRMs, at least according to one person's picture here. AsRock made a board that said it supports the 9000 series FX CPUs but isn't robust enough in terms of VRM cooling and quality to reliably do so.

Poor support for the features of Intel's Broadwell C CPUs is something that is pretty common, like being able to disable TDP throttling and manually adjust the EDRAM speed successfully. AsRock, oddly enough, has a good reputation in that area, as far as I know. Board makers decided that investing more time into making Broadwell C work better wasn't in their interest or Intel's.

 

There is a difference between shipping and selling beta hardware and software and not shipping anything so you have nothing to sell.

 

Let's be honest here: companies, and I am not just talking about motherboard vendors, all aim to do the minimum amount of work possible and sell the product they have at the highest price they can convince customers to pay for it.

 

Broadwell desktop CPUs, with their production run of what, a month(?), did not exactly sell a lot of units. With multiple vendors each offering multiple models for those CPUs, they maybe sold a thousand of each motherboard model, making a total profit of, say, $20,000 for the entire family of products after the R&D costs are accounted for. Why would you put a BIOS developer being paid $120,000 a year onto developing BIOS upgrades for a product that you will never sell again and that has maybe 1,000 to 5,000 users of the CPU in total? In 2 man-months, you have just eaten all of the profit you made from that entire family of products. After that you are ensuring a loss. You would put that developer to work on a new product that stands a chance of selling 100,000-plus units and being much more profitable.
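
The back-of-the-envelope numbers above can be checked with a quick sketch. The figures are just the rough guesses from the post ($20,000 profit for the product family, $120,000 a year for the developer), not real sales data, and the function name is made up for illustration.

```python
def months_until_profit_is_gone(total_profit, annual_salary):
    """Return how many developer-months of salary equal the total profit."""
    monthly_cost = annual_salary / 12  # salary only; ignores any overhead
    return total_profit / monthly_cost


# Rough guesses from the post, not real sales data.
months = months_until_profit_is_gone(total_profit=20_000, annual_salary=120_000)
print(months)  # 2.0
```

With those guesses the developer costs $10,000 a month, so the whole $20,000 is gone after two months, which matches the "2 man-months" figure.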

 

Ryzen is a product with much greater potential than Broadwell. Not having a product available to be included in the day-one reviews is a huge loss of inexpensive "free" marketing potential. The only reason you would not try to ship volume-selling hardware is if you know that shipping a totally unfinished product will damage your reputation even more than doing on-the-fly patches like Asus has been doing.

post #43 of 56 (permalink) Old 03-06-2017, 05:10 AM
Daisy Chained Dumb Switch
 
navjack27's Avatar
 
Join Date: Aug 2015
Location: Bath, New York
Posts: 1,227
Rep: 50 (Unique: 39)
don't even talk to me about broadwell-c... i friggin LOVE that series of CPUs. i love the concept of the speed that L4 cache brings to the table.

EDIT: in fact, since getting my 5820k main desktop, i retired my 5775c to a 24/7 folding/boinc machine and i just feel bad for doing that to it, but i don't have the desk space to have two full machines at my access at the moment. i'd set it up next to my main rig on the same monitor in a heartbeat LOL

post #44 of 56 (permalink) Old 03-06-2017, 12:06 PM
New to Overclock.net
 
superstition222's Avatar
 
Join Date: Sep 2014
Posts: 2,725
Rep: 218 (Unique: 102)
Quote:
Originally Posted by gtbtk View Post

Broadwell desktop CPUs, with their production run of what, a month(?) did not exactly sell a lot of units. As there are multiple vendors with multiple models in their range to the CPUs, they maybe sold a thousand of each motherboard model making a total profit of say $20,000 for the entire family of products after the R&D costs are accounted for. Why would you put a bios developer being paid $120,000 a year onto developing bios upgrades for a product that you will never sell again and only has maybe 1000 to 5000 users of the CPU in total? In 2 man months, you have just eaten all of the profit you made from that entire family of products. After that you are ensuring a loss. You would put that developer working on a new product that stands the chance of selling 100,000 plus units and being much more profitable.

Ryzen is a product with much greater potential than Broadwell, not having a product available to be included in the day one reviews is a huge loss of inexpensive "free" marketing potential. The only reason you would not try and ship volume selling hardware is if by doing so with a totally unfinished product, you know it will damage your reputation even more than doing on the fly patches like Asus have been doing. 
Broadwell C didn't require a new special board. As for seeking profit, neither ASUS nor Gigabyte bothered to put premium features on their Zen boards, like a hybrid water-air cooler, features that have been on Intel boards since 2013. As for why board makers should have fully supported Broadwell C chips — it's because if they claim to support a CPU they need to fully support it.

Free expression is the base of human rights, the root of human nature, and the mother of truth. — 刘晓波 (Liu Xiaobo)
post #45 of 56 (permalink) Old 03-06-2017, 01:42 PM
stock...hahaha
 
gtbtk's Avatar
 
Join Date: Aug 2016
Location: Hong Kong
Posts: 1,616
Rep: 124 (Unique: 87)
Quote:

Broadwell C didn't require a new special board. As for seeking profit, neither ASUS nor Gigabyte bothered to put premium features on their Zen boards, like a hybrid water-air cooler, features that have been on Intel boards since 2013. As for why board makers should have fully supported Broadwell C chips — it's because if they claim to support a CPU they need to fully support it.

 

Broadwell C did require BIOS upgrades to support the chips, in just the same way Z170 needed BIOS updates to support Kaby Lake. Support just means that the manufacturer guarantees you can run a CPU on a given piece of hardware and will assist you for the life of that product. It doesn't guarantee you will get any upgrades after the fact.

 

With regard to the Zen motherboards, the manufacturers' planned full range of boards has almost certainly not been announced yet. Given the market share of the Bulldozer and Piledriver AMD CPUs, I would think that the vendors are probably taking a wait-and-see approach. If there is good take-up of Ryzen, they will expand the range. That is called risk mitigation.

 

If Ryzen fails, and I am certainly not suggesting it will, they have enough product to cover what demand there is, but why would they throw good money after bad to support a CPU that no one is buying?

post #46 of 56 (permalink) Old 03-06-2017, 01:55 PM
New to Overclock.net
 
superstition222's Avatar
 
Join Date: Sep 2014
Posts: 2,725
Rep: 218 (Unique: 102)
Quote:
Originally Posted by gtbtk View Post

Broadwell C did require Bios upgrades to support the chips in just the same way z170 needed bios updates to support Kaby Lake.
A BIOS update is hardly the same thing as a different board.

As for support, all the features of the product (the board and the CPU) should be fully supported. Otherwise it should be labeled partial or minimal support rather than just "supported". For Broadwell C that means enabling the user to adjust the TDP, to avoid TDP throttling, and it means being able to adjust the EDRAM clock. Some brands made the effort to provide this support for at least one board, some did it in a half-baked manner, and some didn't bother.

Free expression is the base of human rights, the root of human nature, and the mother of truth. — 刘晓波 (Liu Xiaobo)
post #47 of 56 (permalink) Old 03-06-2017, 04:36 PM
New to Overclock.net
 
mozmo's Avatar
 
Join Date: Feb 2017
Posts: 18
Rep: 1 (Unique: 1)
Ryzen will never be as good as Intel in applications that rely heavily on coherent memory sharing.

The L3 in Ryzen is a victim cache, not inclusive. It's split in two (one per CCX), so the chip acts like a cluster-on-die design. The L3 is not the LLC for the whole system the way the L3 is in Intel designs. This means any coherent locks or dependent memory sharing are going to be much slower than on Intel, because a lot of the time they will need to go through slower DDR4 to ensure memory coherency.

This is why it falls behind in gaming: gaming depends on coherency and memory sharing a lot more. Improving the Windows scheduler to recognize the clusters will help somewhat, but you'll still hit scaling issues if a thread on CCX1 needs to share data with a thread on CCX2. The bandwidth between the two is only 22 GB/s, which is not fast; you're looking at around 50-100 ns of pipeline stall vs 10 ns on Intel.
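
To get a feel for what those latency figures could mean per frame, here is a rough sketch. It takes the 100 ns and 10 ns numbers from the post as given (they are not measured here), and the handoff count and frame rate are purely hypothetical values picked for round numbers. It is also a worst case that assumes every handoff stalls one after another with nothing overlapping.

```python
def stall_ms_per_frame(handoffs_per_frame, latency_ns):
    """Total stall time per frame, in milliseconds, if every handoff waits."""
    return handoffs_per_frame * latency_ns / 1_000_000


def share_of_frame_budget(stall_ms, fps):
    """Fraction of one frame's time budget consumed by the stalls."""
    frame_budget_ms = 1000 / fps
    return stall_ms / frame_budget_ms


HANDOFFS = 10_000  # hypothetical count of dependent cross-thread handoffs per frame
FPS = 100          # hypothetical target frame rate (10 ms per frame)

# Latency figures are the ones quoted in the post, not measurements.
for label, latency_ns in [("cross-CCX, 100 ns", 100), ("same cache, 10 ns", 10)]:
    stall = stall_ms_per_frame(HANDOFFS, latency_ns)
    share = share_of_frame_budget(stall, FPS)
    print(f"{label}: {stall:.2f} ms stalled, {share:.0%} of the frame")
```

Real CPUs hide part of this with out-of-order execution and prefetching, so treat the result as an upper bound for the assumed inputs rather than a prediction.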
post #48 of 56 (permalink) Old 03-06-2017, 04:42 PM
New to Overclock.net
 
superstition222's Avatar
 
Join Date: Sep 2014
Posts: 2,725
Rep: 218 (Unique: 102)
Quote:
Originally Posted by mozmo View Post

Ryzen will never be as good as intel in heavily coherent memory sharing applications.

The L3 in Ryzen is a victim cache not inclusive, it's broken into 2(CCX) and acts like a cluster on die chip. The L3 is not the LLC in the system like the L3 is in intel designs. This means any coherent locks/dependent memory sharing is going to be much slower than intel because a lot of the time it will need to go through slower DDR4 to ensure memory coherency.

This is why it falls behind in gaming, gaming depends on coherency and memory sharing a lot more. Improving the windows scheduler to recognize the clusters will help somewhat but you'll still hit scaling issues if a thread from CCX1 need to share data to a thread on CCX2. The bandwidth between these 2 is only 22gb/s and not fast, you're looking at around 50-100ns of pipeline stall vs 10ns on intel.
The worst Ryzen gaming results are typically with those that favor single-threaded performance and fewer threads being heavily used, right? So, a quad that doesn't deal with the CCX would be most optimal? Things like Dolphin would also benefit from higher clocks and fewer cores/threads.

Is there a way to have a quad (half of Ryzen) and have 8 threads via SMT or does that involve the CCX1 to CCX2 latency issue? A 4/8 part with a high enough clock that doesn't have the CCX to CCX latency issue should be pretty competitive.
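
One way to approximate that 4/8 setup on an 8-core chip is to restrict a game to the logical CPUs of a single CCX with CPU affinity, which should keep that process's threads from crossing between CCXs. A minimal sketch follows, with loud assumptions: the helper names are hypothetical, the two numbering schemes for SMT siblings are guesses that have to be checked against the real topology, and `os.sched_setaffinity` only exists on some Unix systems (on Windows you would use the affinity setting in Task Manager or similar instead).

```python
import os


def ccx_logical_cpus(ccx_index, cores_per_ccx=4, threads_per_core=2,
                     total_cores=8, siblings_adjacent=True):
    """Return the set of logical CPU ids that belong to one CCX.

    siblings_adjacent=True assumes SMT siblings are numbered next to each
    other (0/1, 2/3, ...). False assumes siblings are numbered after all
    the physical cores (0 and 8, 1 and 9, ...). Both layouts are
    assumptions; check the real topology on your system.
    """
    first_core = ccx_index * cores_per_ccx
    cpus = set()
    for core in range(first_core, first_core + cores_per_ccx):
        for thread in range(threads_per_core):
            if siblings_adjacent:
                cpus.add(core * threads_per_core + thread)
            else:
                cpus.add(core + thread * total_cores)
    return cpus


def pin_to_ccx(ccx_index, **layout):
    """Restrict the current process to one CCX where the OS allows it."""
    wanted = ccx_logical_cpus(ccx_index, **layout)
    if not hasattr(os, "sched_setaffinity"):
        return None  # affinity call not available on this platform
    usable = wanted & os.sched_getaffinity(0)
    if not usable:
        return None  # this machine has none of the assumed CPU ids
    os.sched_setaffinity(0, usable)
    return usable


print(sorted(ccx_logical_cpus(0)))
print(sorted(ccx_logical_cpus(1, siblings_adjacent=False)))
```

Pinning this way also limits the process to the L3 slice of that one CCX, so it trades cache capacity and core count for staying off the CCX-to-CCX link.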

I wonder if Zen+ will have an eDRAM L4.

Free expression is the base of human rights, the root of human nature, and the mother of truth. — 刘晓波 (Liu Xiaobo)
post #49 of 56 (permalink) Old 03-06-2017, 04:56 PM
New to Overclock.net
 
Kuivamaa's Avatar
 
Join Date: Feb 2013
Location: Finland
Posts: 4,594
Rep: 218 (Unique: 113)
Quote:
Originally Posted by superstition222 View Post

The worst Ryzen gaming results are typically with those that favor single-threaded performance and fewer threads being heavily used, right? So, a quad that doesn't deal with the CCX would be most optimal? Things like Dolphin would also benefit from higher clocks and fewer cores/threads.

Is there a way to have a quad (half of Ryzen) and have 8 threads via SMT or does that involve the CCX1 to CCX2 latency issue? A 4/8 part with a high enough clock that doesn't have the CCX to CCX latency issue should be pretty competitive.

I wonder if Zen+ will have an eDRAM L4.

No, Ryzen game performance seems to depend on engine sensitivities and nuances. Single-thread performance is strong, MT even more so, yet there are both poorly and well threaded games that run well and less well on Ryzen. It comes down to what the engine is doing and whether it touches non-optimized areas.

post #50 of 56 (permalink) Old 03-06-2017, 08:37 PM
New to Overclock.net
 
mozmo's Avatar
 
Join Date: Feb 2017
Posts: 18
Rep: 1 (Unique: 1)
Quote:
Originally Posted by superstition222 View Post

The worst Ryzen gaming results are typically with those that favor single-threaded performance and fewer threads being heavily used, right? So, a quad that doesn't deal with the CCX would be most optimal? Things like Dolphin would also benefit from higher clocks and fewer cores/threads.

Is there a way to have a quad (half of Ryzen) and have 8 threads via SMT or does that involve the CCX1 to CCX2 latency issue? A 4/8 part with a high enough clock that doesn't have the CCX to CCX latency issue should be pretty competitive.

I wonder if Zen+ will have an eDRAM L4.
Watch Dogs 2 and BF1 are heavily threaded and they perform worse on Ryzen. Rise of the Tomb Raider in DX12 spreads load across all threads and runs worse. GTA5 is another one that spreads load and runs worse. Lots of games use many threads now.

The IPC of Ryzen is roughly the same as Broadwell-E in single thread. Ryzen runs at higher clocks and still loses badly to Broadwell-E, but largely only in gaming, which is heavily cache dependent. Most other workloads that Ryzen does well in are ones that scale 100% across all cores because each thread has no dependency on another thread or on any shared data.

If you look at the architecture of these chips, they are very similar now: same decoder width, same number of integer/FP units, similar register count, out-of-order window, trace cache, etc.

The only differences now are the L3 (split victim vs full inclusive) and 128-bit FMACs vs 256-bit in Intel chips.

AVX performance on Ryzen is half the rate of Intel, and cache performance and latency are worse than Intel's.
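
The "half the rate" figure follows directly from the unit widths if both designs have the same number of FMA units per core. A quick sketch of that arithmetic is below; the count of two units per core is an assumption for illustration and not something stated in the post, and the function name is made up.

```python
def flops_per_cycle(vector_bits, fma_units, element_bits=32):
    """Peak floating-point operations per core per clock cycle."""
    lanes = vector_bits // element_bits
    return fma_units * lanes * 2  # one FMA counts as a multiply plus an add


UNITS = 2  # assumed number of FMA units per core, same for both chips

narrow = flops_per_cycle(vector_bits=128, fma_units=UNITS)
wide = flops_per_cycle(vector_bits=256, fma_units=UNITS)
print(narrow, wide, narrow / wide)
```

This is peak throughput only; real AVX code rarely sustains it, so the gap in actual workloads can be smaller than the ratio suggests.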

Luckily no games use AVX otherwise we'd have an even larger meltdown by AMD fans.