[computerbase.de] DOOM + Vulkan Benchmarked. - Page 37

post #361 of 632
Quote:
Originally Posted by EightDee8D View Post

Quote:
Originally Posted by Defoler View Post

Crystal dynamics blablabla my maxwell is obsolete mimimi


Should I also post the 1080 launch stream where they showed a 1080 running Doom on Vulkan? Or will you stop spreading BS? A dev having worked with AMD in the past doesn't mean they won't get sponsored by Nvidia.

And if 3DMark can run async on Pascal, why can't it on Maxwell? Who is stopping them from enabling it in the driver? Oh, no conspiracy now? Awww.

And AotS was a game, but it got dismissed as nothing but a benchmark. 3DMark is a freaking benchmark, so why does it count now? Because hypocrisy? lol

post #362 of 632
Is the Fire Strike bench tuned for AMD on the tessellation part of the bench?
post #363 of 632
Quote:
Originally Posted by PontiacGTX View Post

It is a synthetic benchmark, so it is quite a bit different from DX11 games. Even games which have overhead don't show it on all levels/maps, and some DX11 games don't have issues with the DX11 draw call limit at all.

What I liked was hearing that 3DMark spokesperson claiming that the "driver" is responsible for the behavior we see in a DX12 benchmark. DX12 and Driver... let that sink in.

This may be true for nVIDIA (due to their inclusion of static scheduling) but it most certainly is not as true for AMD (due to their hardware scheduling). Kollock stated the same thing regarding AMD's hardware scheduler.

It seems to me that optimizations are lacking for the AMD path (if there are actually separate AMD and nVIDIA paths to begin with). The programmer is the one responsible for "marking up" the tasks he/she wants executed in parallel (as the Microsoft sample code I shared shows and as Kollock explained). So if the programmer did not mark up many of these tasks for the AMD hardware, then of course you are not going to receive all of the potential performance. A small amount of marked-up work would fit nVIDIA's Pascal architecture well but would end up under-utilizing GCN for the reasons I mentioned in a previous post (Pascal GPCs and Dynamic Load Balancing explanation).

All of the games we have seen "mark up" a lot more work to be executed in parallel than what 3DMark stated with their "10-20%" claim. It seems to me that 3DMark should have gone for 40% of a frame being executed in parallel for AMD (which is what AotS does) and stuck to 10-20% for nVIDIA. That way they would have one perfectly optimized path for each architecture. This is how games are being programmed (like Ashes of the Singularity), with separate optimized paths for AMD and nVIDIA. The kicker is that it is nVIDIA's driver which is responsible for scheduling such tasks onto the nVIDIA hardware. This means that nVIDIA would incur a larger CPU overhead (as we have seen under AotS). We can expect the same for nVIDIA hardware under Doom Vulkan: absent Asynchronous Compute + Graphics, the nVIDIA hardware is tied with the AMD hardware in terms of CPU overhead. Once the Async path is implemented, nVIDIA's CPU overhead will be higher, as I mentioned in my initial coverage of nVIDIA's Async Compute capabilities.

We will likely end up with a version of 3DMark which will not at all represent the performance we will be seeing in upcoming DX12 titles for AMD. I think that the nVIDIA performance is perfectly optimized though... so what we see in 3DMark perfectly highlights what we can expect from Pascal.

As for Maxwell... when Async Compute is enabled... we should be seeing a drop in performance due to the GPU stalls caused by the fences. Even if the nVIDIA driver says "No Async Compute" the fences remain. This is what Kollock mentioned and what we have seen thus far in actual games making use of the technology.
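
To put rough numbers on the "marked up" argument, here is a toy model in Python. It is only a sketch under loud assumptions: the 10 ms frame, the fence count and the per-fence stall are made-up values, `frame_time_ms` is a hypothetical helper, and "perfect overlap" is a best case that no real GPU reaches. It is not how 3DMark, AotS or any driver actually schedules work.

```python
def frame_time_ms(total_ms, async_fraction, can_overlap,
                  fence_count=0, fence_stall_ms=0.0):
    """Toy estimate of GPU frame time.

    total_ms        -- GPU work in the frame if everything ran serially
    async_fraction  -- share of that work the programmer marked up as async compute
    can_overlap     -- True if the hardware can run compute alongside graphics
    fence_count     -- number of fences in the frame (assumed value)
    fence_stall_ms  -- stall cost per fence when nothing overlaps (assumed value)
    """
    compute_ms = total_ms * async_fraction
    graphics_ms = total_ms - compute_ms
    if can_overlap:
        # Best case: the marked-up compute work hides entirely behind graphics.
        return max(graphics_ms, compute_ms)
    # No overlap: the work runs back to back and every fence adds a stall.
    return graphics_ms + compute_ms + fence_count * fence_stall_ms


if __name__ == "__main__":
    for fraction in (0.10, 0.20, 0.40):
        overlapped = frame_time_ms(10.0, fraction, can_overlap=True)
        serialized = frame_time_ms(10.0, fraction, can_overlap=False,
                                   fence_count=4, fence_stall_ms=0.1)
        print(f"async share {fraction:.0%}: "
              f"overlapped {overlapped:.1f} ms, serialized {serialized:.1f} ms")
```

In this model the more work gets marked up, the more an overlapping architecture can hide, while hardware that cannot overlap only pays for the fences. That is the shape of the argument above, nothing more.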
post #364 of 632
Quote:
Originally Posted by Bauxno View Post

Is the Fire Strike bench tuned for AMD on the tessellation part of the bench?

No, indeed the tessellation is killing Radeon cards.
post #365 of 632
I think there's not enough tessellation, otherwise the 480 would be faster than the 390/X. It has better tessellation performance.
post #366 of 632
Quote:
Originally Posted by Mahigan View Post

What I liked was hearing that 3DMark spokesperson claiming that the "driver" is responsible for the behavior we see in a DX12 benchmark. DX12 and Driver... let that sink in.

This may be true for nVIDIA (due to their inclusion of static scheduling) but it most certainly is not as true for AMD (due to their hardware scheduling). Kollock stated the same thing regarding AMD's hardware scheduler.

It seems to me that optimizations are lacking for the AMD path (if there are actually separate AMD and nVIDIA paths to begin with). The programmer is the one responsible for "marking up" the tasks he/she wants executed in parallel (as the Microsoft sample code I shared shows and as Kollock explained). So if the programmer did not mark up many of these tasks for the AMD hardware, then of course you are not going to receive all of the potential performance. A small amount of marked-up work would fit nVIDIA's Pascal architecture well but would end up under-utilizing GCN for the reasons I mentioned in a previous post (Pascal GPCs and Dynamic Load Balancing explanation).

All of the games we have seen "mark up" a lot more work to be executed in parallel than what 3DMark stated with their "10-20%" claim. It seems to me that 3DMark should have gone for 40% of a frame being executed in parallel for AMD (which is what AotS does) and stuck to 10-20% for nVIDIA. That way they would have one perfectly optimized path for each architecture. This is how games are being programmed (like Ashes of the Singularity), with separate optimized paths for AMD and nVIDIA. The kicker is that it is nVIDIA's driver which is responsible for scheduling such tasks onto the nVIDIA hardware. This means that nVIDIA would incur a larger CPU overhead (as we have seen under AotS). We can expect the same for nVIDIA hardware under Doom Vulkan: absent Asynchronous Compute + Graphics, the nVIDIA hardware is tied with the AMD hardware in terms of CPU overhead. Once the Async path is implemented, nVIDIA's CPU overhead will be higher, as I mentioned in my initial coverage of nVIDIA's Async Compute capabilities.

We will likely end up with a version of 3DMark which will not at all represent the performance we will be seeing in upcoming DX12 titles for AMD. I think that the nVIDIA performance is perfectly optimized though... so what we see in 3DMark perfectly highlights what we can expect from Pascal.

As for Maxwell... when Async Compute is enabled... we should be seeing a drop in performance due to the GPU stalls caused by the fences. Even if the nVIDIA driver says "No Async Compute" the fences remain. This is what Kollock mentioned and what we have seen thus far in actual games making use of the technology.

But then wasn't the overhead of the AMD DX11 driver solely down to draw calls? Because it seems the overhead you are talking about with DX12 and async compute is about compute + graphics. Nvidia can either avoid using async compute + graphics and just focus on the DX12 path, or optimize for async compute only, like 3DMark Time Spy is doing (but then this is a synthetic benchmark, it isn't a game).
Quote:
Originally Posted by EightDee8D View Post

I think there's not enough tessellation, otherwise the 480 would be faster than the 390/X. It has better tessellation performance.

From AnandTech:


Quote:
Under the hood, the engine only makes use of FL 11_0 features, which means it can run on video cards as far back as GeForce GTX 680 and Radeon HD 7970. At the same time it doesn't use any of the features from the newer feature levels, so while it ensures a consistent test between all cards, it doesn't push the very newest graphics features such as conservative rasterization.

That said, Futuremark has definitely set out to make full use of FL 11_0. Futuremark has published an excellent technical guide for the benchmark, which should go live at the same time as this article, so I won't recap it verbatim. But in brief, everything from asynchronous compute to resource heaps get used. In the case of async compute, Futuremark is using it to overlap rendering passes, though they do note that "the asynchronous compute workload per frame varies between 10-20%." On the work submission front, they're making full use of multi-threaded command queue submission, noting that every logical core in a system is used to submit work.
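
For anyone wondering what "every logical core in a system is used to submit work" could look like in practice, here is a minimal Python sketch of the pattern. It is an illustration only, not Futuremark's code or the D3D12 API: `record_command_list` and `build_frame` are hypothetical names, and the "commands" are just strings standing in for real GPU command lists.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def record_command_list(draw_calls):
    """Stand-in for recording one command list on a worker thread."""
    return [f"draw({call})" for call in draw_calls]


def build_frame(draw_calls, workers=None):
    """Record command lists in parallel, one contiguous chunk per worker."""
    workers = workers or os.cpu_count() or 1
    # Ceiling division so every draw call lands in exactly one chunk.
    chunk = max(1, -(-len(draw_calls) // workers))
    chunks = [draw_calls[i:i + chunk] for i in range(0, len(draw_calls), chunk)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in chunk order, so the final submission
        # order is deterministic even though recording happens in parallel.
        return list(pool.map(record_command_list, chunks))


if __name__ == "__main__":
    command_lists = build_frame(list(range(10)), workers=4)
    for index, commands in enumerate(command_lists):
        print(f"command list {index}: {commands}")
```

The point of the pattern is that recording is spread across threads while the final hand-off to the queue keeps a fixed order.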

Edited by PontiacGTX - 7/17/16 at 7:09am
post #367 of 632
Quote:
Originally Posted by LionS7 View Post

No, indeed the tessellation is killing Radeon cards.

So for some unknown reason they made a code path so that even when async is set to on, Maxwell cards don't take the perf hit they take in almost every DX12 game, but they didn't do the same for a critical part of their DX11 bench to balance the tessellation between AMD and Nvidia?

That tells me a lot about this company.
post #368 of 632
Quote:
Originally Posted by PontiacGTX View Post


From AnandTech:

I saw that, but why isn't the 480 faster than the 390/X? Maybe it's not optimized for Polaris?
post #369 of 632
Quote:
Originally Posted by Bauxno View Post

So for some unknown reason they made a code path so that even when async is set to on, Maxwell cards don't take the perf hit they take in almost every DX12 game, but they didn't do the same for a critical part of their DX11 bench to balance the tessellation between AMD and Nvidia?

That tells me a lot about this company.

Richard Huddy said it in his interview with PCPer, talking about Batman's cape in Arkham City. It was over-tessellated by Nvidia to hurt Radeon performance more than GeForce performance. So Huddy said that Radeon is weaker than GeForce in tessellation. You can find the interview on PCPer's official YouTube channel. We can find many more examples if we want.
post #370 of 632
Quote:
Originally Posted by EightDee8D View Post

I saw that, but why isn't the 480 faster than the 390/X? Maybe it's not optimized for Polaris?

It isn't faster overall? Or in graphics?
Quote:
Graphics test 1
Graphics test 1 focuses more on rendering of transparent elements. It utilizes the A-buffer heavily to render transparent geometries and big particles in an order-independent manner. Graphics test 1 draws particle shadows for selected light sources. Ray-marched volumetric illumination is enabled only for the directional light. All post-processing effects are enabled.

Graphics test 2
Graphics test 2 focuses more on ray-marched volume illumination with hundreds of shadowed and unshadowed spot lights. The A-buffer is used to render glass sheets in an order-independent manner. Also, lots of small particles are simulated and drawn into the A-buffer. All post-processing effects are enabled.