Originally Posted by ZealotKi11er
If there is serious CPU overhead then yes. Something like AoTS.
Well, considering that a stock-clocked RX 480 sits between an R9 390 and an R9 390X, and that an R9 390X often matches or beats a GTX 980 Ti in DX12 titles, it is not a stretch to imagine an RX 480 coming mighty close to a GTX 980 Ti in DX12 titles.
As for Total War - Warhammer... it does make use of Asynchronous Compute + Graphics, last I heard. That would likely explain the performance loss on the GTX 980 Ti, since it introduces latency into the execution pipeline.
I would bet that a GTX 1070/1080 would either get similar performance going from DX11 to DX12 or a slight bump (ever so slight). This is mostly due to two things. First, reduced latency when flushing an SM, i.e. when moving from one compute task to another (a.k.a. finer-grained preemption). Second, the capability to run the Graphics and Compute tasks in parallel, but in two separate GPCs. That can allow for a performance boost when very minute degrees of Async Compute + Graphics are introduced into the execution pipeline, the limit being that you only have so many GPCs available at any given time. So while the GTX 1070 and 1080 do not support Asynchronous Compute + Graphics, they do have some slight tweaks to alleviate the performance losses we witnessed with Maxwell.
What will determine the performance boost for the RX 480 (relative to past GCN parts) is the degree to which the larger instruction buffers (per CU) help to boost single-threaded performance. If the RX 480 is already getting a significant single-threaded boost under DX11, then moving to DX12 will not bring the same performance boost we have witnessed with prior GCN GPUs. Asynchronous Compute + Graphics-wise though... I would still expect a performance boost for the RX 480, perhaps even more so than with prior GCN GPUs, due to the caching improvements in the Polaris architecture.
As for the larger instruction buffers... basically Polaris can buffer 20 or 22 DWORDs' worth of instructions vs 16 for prior GCN GPUs. This means the RX 480 makes fewer requests to the CPU for work, which leads to fewer GPU stalls if the CPU is busy with other work when the GPU is ready to receive new work. This translates into a performance boost in single-threaded scenarios (DX9/10/11).
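To put rough numbers on the buffer point, here is a toy sketch. It is not a model of how the hardware actually fetches work. It just assumes one refill request each time a buffer of a given size runs dry. The function name and the 1760-DWORD stream length are made up for illustration.

```python
def refill_requests(total_dwords: int, buffer_dwords: int) -> int:
    """Count refills needed to consume a stream, one refill per full buffer.

    Toy assumption: each refill fetches exactly buffer_dwords (or whatever
    is left), so the count is the ceiling of total / buffer size.
    """
    if buffer_dwords <= 0:
        raise ValueError("buffer_dwords must be positive")
    # Integer ceiling division, avoids floating-point rounding.
    return -(-total_dwords // buffer_dwords)


# Hypothetical stream length, chosen only so the three sizes divide evenly.
STREAM_DWORDS = 1760

for size in (16, 20, 22):
    print(f"{size} DWORD buffer: {refill_requests(STREAM_DWORDS, size)} refills")
```

Under that assumption the 16 DWORD buffer needs 110 refills, the 20 DWORD one 88, and the 22 DWORD one 80, so roughly 20 to 27 percent fewer chances to stall on a busy CPU. How much of that shows up as real frame rate is a separate question.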
So I do not expect Pascal to suffer much from the use of DX12 and Asynchronous Compute + Graphics, but Maxwell and Kepler GPUs are basically heading towards obsolescence very fast (as I had expected).
Edited by Mahigan - 7/2/16 at 2:34pm