[Various] Ashes of the Singularity DX12 Benchmarks - Page 172 - Overclock.net - An Overclocking Community

post #1711 of 2682, 09-01-2015, 03:19 PM
Xuper

More Fire !!

 

Quote by AMD_Robert:
 Maxwell cards are now also crashing out of the benchmark as they spend >3000ms trying to compute one of the workloads.

The author is not interpreting the results correctly.

Look at the height of the graphics bars.

Look at the height of the compute bars.

Notice how NVIDIA's async results are the height of those bars combined? This means the workloads are running serially, otherwise compute wouldn't have to wait on graphics and the bars would not be additive.

Compare that to the GCN results. Compute and graphics together, async shading bars are no higher than any other workload, demonstrating that frame latencies are not affected when the workloads are running together.

//EDIT: Asynchronous shading isn't simply whether or not a workload can contain compute and graphics. It's whether or not that workload can overlay graphics and compute, processing them both simultaneously without the pipeline latency getting any longer than the longest job. This is what GCN shows, but Maxwell does not.

//15:45 Central Edit: This benchmark has now been updated. GPU utilization of Maxwell-based graphics cards is now dropping to 0% under async compute workloads. As the workloads get more aggressive, the application ultimately crashes as the architecture cannot complete the workload before Windows terminates the thread (>3000ms hang).

 

https://www.reddit.com/r/pcgaming/comments/3j87qg/nvidias_maxwell_gpus_can_do_dx12_async_shading/
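
The bar-height argument in the quote above can be sketched in a few lines. This is only an illustration: the function names and the sample timings are made up for the example, not taken from the benchmark. The idea is that a combined time near graphics + compute suggests serial execution, while a combined time near the longer of the two suggests overlap.

```python
def serial_latency(graphics_ms: float, compute_ms: float) -> float:
    """Expected frame time if compute has to wait for graphics (or vice versa)."""
    return graphics_ms + compute_ms


def overlapped_latency(graphics_ms: float, compute_ms: float) -> float:
    """Expected frame time if both workloads run concurrently."""
    return max(graphics_ms, compute_ms)


def looks_serial(graphics_ms: float, compute_ms: float, combined_ms: float) -> bool:
    """True if the measured combined time is closer to the sum than to the max."""
    serial = serial_latency(graphics_ms, compute_ms)
    overlapped = overlapped_latency(graphics_ms, compute_ms)
    return abs(combined_ms - serial) < abs(combined_ms - overlapped)


if __name__ == "__main__":
    # Hypothetical timings, chosen only to show the two cases.
    print(looks_serial(20.0, 10.0, 29.5))  # bars are additive
    print(looks_serial(20.0, 10.0, 20.5))  # combined bar no taller than the longest job
```

A real measurement would be noisier than this, so the closer-to-sum test is a rough classifier rather than proof of how the hardware schedules the work.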


CPU : AMD Ryzen 1600X | Memory : [Ripjaws V] F4-3200C16D-16GVKB  | Motherboard : Asus Prime X370 Pro | Graphic : XFX AMD Radeon R9 290 Double Dissipation | Monitor : AOC 931Sw | HDD : 1x120 GB SSD Samsung Evo , 1x2TB Seagate

 

 

post #1712 of 2682, 09-01-2015, 03:23 PM
Mahigan

Aside from the Async stuff...

Here's what I think they did at Beyond3D:
  1. They set the number of threads per kernel to 32 (they're CUDA programmers after all).
  2. They've bumped the Kernel count up to 512 (16,384 Threads total).
  3. They're scratching their heads wondering why the results don't make sense when comparing GCN to Maxwell 2.

Here's why that's not how you code for GCN:

[Embedded slides not preserved]

Why?:
  1. Each CU can have 40 Kernels in flight (each made up of 64 threads to form a single Wavefront).
  2. That's 2,560 Threads total PER CU.
  3. An R9 290x has 44 CUs or the capacity to handle 112,640 Threads total.
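
The arithmetic in the two lists above is easy to check. The sketch below just restates the post's figures (40 in flight per CU, 64 threads each, 44 CUs, and a test of 512 kernels of 32 threads); the constant names are made up for the example.

```python
# Figures as stated in the post; names are illustrative only.
WAVEFRONT_SIZE = 64        # threads per GCN wavefront
WAVEFRONTS_PER_CU = 40     # in flight per compute unit (the post calls them kernels)
CU_COUNT_R9_290X = 44      # compute units on an R9 290X

TEST_KERNELS = 512         # kernel count the post attributes to the Beyond3D test
TEST_THREADS_PER_KERNEL = 32

threads_per_cu = WAVEFRONTS_PER_CU * WAVEFRONT_SIZE
gpu_thread_capacity = threads_per_cu * CU_COUNT_R9_290X
test_threads = TEST_KERNELS * TEST_THREADS_PER_KERNEL
fraction_of_capacity = test_threads / gpu_thread_capacity

if __name__ == "__main__":
    print(f"threads per CU:       {threads_per_cu}")
    print(f"GPU thread capacity:  {gpu_thread_capacity}")
    print(f"threads in the test:  {test_threads}")
    print(f"fraction of capacity: {fraction_of_capacity:.1%}")
```

On these numbers the 512-kernel test asks for roughly a seventh of the threads the card can keep in flight, which is the point the post is making about not pushing GCN.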

If you load up GCN with Kernels made up of 32 Threads you're wasting resources. If you're not pushing GCN you're wasting compute potential. Slide number 4 states that latency is hidden by executing overlapping wavefronts. This is why GCN appears to have a high degree of latency, yet you can execute a ton of work on GCN without affecting the latency. With Maxwell/2, latency rises like a staircase the more work you throw at it. I'm not sure if the folks at Beyond3D are aware of this or not.
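
To put a number on "wasting resources": a minimal sketch, assuming each thread group gets padded up to whole wavefronts (64 lanes on GCN) or warps (32 lanes on NVIDIA), so that any leftover lanes sit idle. The function name is made up for the example.

```python
def lane_utilization(threads_per_group: int, simd_width: int) -> float:
    """Fraction of SIMD lanes doing useful work for one thread group.

    Assumes the group is rounded up to a whole number of wavefronts/warps.
    """
    if threads_per_group <= 0 or simd_width <= 0:
        raise ValueError("threads_per_group and simd_width must be positive")
    waves = -(-threads_per_group // simd_width)  # ceiling division
    return threads_per_group / (waves * simd_width)


if __name__ == "__main__":
    for label, width in (("GCN wavefront (64)", 64), ("NVIDIA warp (32)", 32)):
        for threads in (32, 64):
            util = lane_utilization(threads, width)
            print(f"{label}: {threads} threads -> {util:.0%} of lanes busy")
```

Under that assumption, a 32-thread group fills a 32-wide warp completely but only half of a 64-wide wavefront.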


Conclusion:

I think they geared this test towards NVIDIA's CUDA architectures and are wondering why their results don't make sense on GCN. If true... DERP! That's why I said the single Latency results don't matter. This test is only good if you're checking on Async functionality.


GCN was built for Parallelism, not serial workloads like NVIDIA's architectures. This is why you don't see GCN taking a hit with 512 Kernels.

What did Oxide do? They built two paths: one with Shaders Optimized for CUDA and the other with Shaders Optimized for GCN. On top of that, GCN has Async working. Therefore it is not hard to determine why GCN performs so well in Oxide's engine. It's a better architecture if you push it and code for it. If you're only using light compute work, NVIDIA's architectures will be superior.

This means that the burden is on developers to ensure they're optimizing for both. In the past, this hasn't been the case. Going forward... I hope they do. As for GameWorks titles, don't count on them being optimized for GCN. That's a given. Oxide played fair, others... might not.

"Once you eliminate the impossible, whatever remains, no matter how improbable, must be the truth." - Arthur Conan Doyle (Sherlock Holmes)
post #1713 of 2682, 09-01-2015, 03:37 PM
Forceman

I'm not sure even the guy who wrote it knows what it is doing, so I'm not sure it's wise for AMD to be piping up quite yet. Maybe instead of calling Nvidia out over the TDR they should explain why their cards take 4 to 5 times as long to complete the compute only portion.

Quote:
Originally Posted by Mahigan View Post

Aside from the Async stuff...

Here's what I think they did at Beyond3D:
  1. They set the amount of threads, per kernel, to 32 (they're CUDA programmers after-all).
  2. They've bumped the Kernel count to up to 512 (16,384 Threads total).
  3. They're scratching their heads wondering why the results don't make sense when comparing GCN to Maxwell 2

Here's why that's not how you code for GCN




Why?:
  1. Each CU can have 40 Kernels in flight (each made up of 64 threads to form a single Wavefront).
  2. That's 2,560 Threads total PER CU.
  3. An R9 290x has 44 CUs or the capacity to handle 112,640 Threads total.

If you load up GCN with Kernels made up of 32 Threads you're wasting resources. If you're not pushing GCN you're wasting compute potential.


Conclusion:

I think they geared this test towards nVIDIAs CUDA architectures and are wondering why their results don't make sense on GCN. If true... DERP! That's why I said the single Latency results don't matter. This test is only good if you're checking on Async functionality.


GCN was built for Parallelism, not serial workloads like nVIDIAs architectures. This is why you don't see GCN taking a hit with 512 Kernels.

Exactly. I'm not convinced this little test is actually testing what they think it is, or that it is returning meaningful results.

Forceman's Law: Any AMD/Nvidia GPU thread, no matter what the topic, will eventually include a post referencing the GTX 970.
post #1714 of 2682, 09-01-2015, 03:40 PM
Paul17041993

And all this is why the engine I'm designing will make use of hardware profiles, so that the ideal low-level settings are the automatic default for the particular hardware in use.
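
A rough idea of what such hardware profiles could look like. This is a hypothetical sketch, not the poster's engine: every name here is invented, the SIMD widths are the 64-wide wavefront and 32-wide warp discussed earlier in the thread, and the async flags simply mirror the claims made in this thread rather than anything verified.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HardwareProfile:
    name: str
    simd_width: int          # threads per wavefront (GCN) or warp (NVIDIA)
    use_async_compute: bool  # assumption: follows the discussion in this thread


# Hypothetical profile table keyed by an architecture string.
PROFILES = {
    "gcn": HardwareProfile("AMD GCN", simd_width=64, use_async_compute=True),
    "maxwell": HardwareProfile("NVIDIA Maxwell", simd_width=32, use_async_compute=False),
}

# Conservative fallback for hardware without a dedicated profile.
DEFAULT_PROFILE = HardwareProfile("generic", simd_width=32, use_async_compute=False)


def select_profile(architecture: str) -> HardwareProfile:
    """Pick the profile for an architecture name, falling back to the default."""
    return PROFILES.get(architecture.lower(), DEFAULT_PROFILE)


def thread_group_size(profile: HardwareProfile, desired_threads: int) -> int:
    """Round the requested group size up to a whole number of wavefronts/warps."""
    waves = -(-desired_threads // profile.simd_width)  # ceiling division
    return waves * profile.simd_width


if __name__ == "__main__":
    for arch in ("GCN", "Maxwell", "something-else"):
        profile = select_profile(arch)
        print(arch, "->", profile.name, thread_group_size(profile, 32))
```

The point of the lookup is that the same shader request (say, 32 threads) gets sized differently per architecture without the calling code having to know why.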


post #1715 of 2682, 09-01-2015, 03:42 PM
gamervivek

Quote:
Originally Posted by Themisseble View Post

That is desktop 7870
https://www.techpowerup.com/gpudb/1966/radeon-hd-8970m.html

Not this again. It's a GCN1.1 or GCN2 part, not Pitcairn.

https://www.overclock.net/t/1571391/tpu-amd-also-quietly-launches-the-radeon-r9-370x-sapphire-gives-it-vapor-x-treatment/20#post_24349701

If you can keep your beer while all about you are spilling theirs and blaming it on you, then you'd be a playa, my brutha. - Dudyard Broling
gamervivek is offline  
post #1716 of 2682, 09-01-2015, 03:45 PM
spacin9

And a few things of note in my ongoing assessment:

With DX 11 in a Win 10 environment, all 12 threads of my hex core CPU are being stressed and will go as high as 80-100%. So I'm seeing higher CPU usage in DX 11. It sits around 50% most of the time in DX 12. I do not see a real gain for DX 11 in Win 10. It seems to be a bit worse if anything.

With DX 11 in a Win 7 environment, 4 threads are not used at all... it seems it just uses 7-8 threads of my hex core CPU. I don't know if that's expected or not... just thought it was interesting that there seems to be evidence of the promise of DX 12.
post #1717 of 2682, 09-01-2015, 03:46 PM
Mahigan

Quote:
Originally Posted by Forceman View Post

I'm not sure even the guy who wrote it knows what it is doing, so I'm not sure it's wise for AMD to be piping up quite yet. Maybe instead of calling Nvidia out over the TDR they should explain why their cards take 4 to 5 times as long to complete the compute only portion.
Exactly. I'm not convinced this little test is actually testing what they think it is, or that it is returning meaningful results.

You're absolutely right. But it is fun watching them scratch their heads. I don't feel like creating an account there. I think the Async results are the only meaningful part of their tests. Maybe they'll figure out what kind of coding is required to enable Async on Maxwell 2, if it can perform the task. That's what I'm looking for.

"Once you eliminate the impossible, whatever remains, no matter how improbable, must be the truth." - Arthur Conan Doyle (Sherlock Holmes)
post #1718 of 2682, 09-01-2015, 03:55 PM
epic1337

this makes me wonder, will Nvidia hurry the release of their successor card?
they're in a pitfall with maxwell, a dead-end you might say, since there's no clear route for "refining drivers" to make it perform better in the future.
well of course they're still great cards for DX11, just not for future games that would be using Vulkan or DX12 as their primary API.

they've been pretty laid-back with their releases, look at how long it took them to release the GTX960 and GTX950.

trolling an adult is very dangerous, don't try it at home nor at work. you don't want to play tag with a rabid man.
post #1719 of 2682, 09-01-2015, 04:09 PM
semitope

Quote:
Originally Posted by Mahigan View Post

I'm not sure if Fable Legends will make heavy use of Async or not.

Hopefully it does. They were one of the devs praising it.
post #1720 of 2682, 09-01-2015, 05:00 PM
Clocknut

Quote:
Originally Posted by epic1337 View Post

this i wonder, will Nvidia hurry their release to their successor card?
they're in a pitfall with maxwell, a dead-end you might say, since theres no clear route for "refining drivers" to make it perform better in the future.
well of course they're still great cards for DX11, just not for future games that would be using Vulkan or DX12 as it's primary API.

they've been pretty laid-back with their releases, look at how long it took them to release GTX960 and GTX950.
It will only depend on 2 things:

1. The speed of DirectX 12 adoption
2. The time it takes for Pascal to arrive (that's assuming it has at least GCN 1.0-level asynchronous compute capability)

*On point no. 2, I am not even sure if Pascal has that parallelism feature. If Pascal was already well past the design phase and still didn't have GCN 1.0 capability, then God help them; they will have to at least bake in something to miraculously make it work within Pascal's architecture.
