Multi-threaded command listing & deferred rendering (DirectX runtime MT):
Originally Posted by sugarhell
This number, that DX11 uses only 60% of the Fury X's shaders: is there a fact behind it? Research? Proof? Because I call BS.
I am pretty sure that DX11 can use 100% of the Fury X's shaders.
People who don't have proper technical knowledge (i.e. being knowledgeable about tech news is not enough) SHOULDN'T post things like that.
We know that AMD uses more CPU cycles for the shaders. We don't know why; we can only speculate.
DirectX works by creating bundles (batches) of commands (command lists). These bundles or batches of commands are sent from the API to the graphics driver. The driver can make some changes to these commands (shader replacements, reordering of commands, etc.) and then translates them into ISA (Instruction Set Architecture, the GPU's language) command lists (grids/threads) before sending them to the GPU for processing.
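To make that pipeline a bit more concrete, here is a rough C++ sketch of the three stages: the application batches commands into a list, the driver may rewrite some of them, and then it translates the list into something GPU-shaped. Everything here is made up for illustration (`ApiCommand`, `IsaCommand`, `driverTranslate`, the shader IDs 7 and 70); none of it is a real Direct3D or driver type, and a real driver does far more than this.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical API-level command; not a real Direct3D type.
struct ApiCommand {
    std::string name;  // e.g. "SetShader", "Draw"
    int arg;           // e.g. a shader ID or a vertex count
};

// A command list is modeled as an ordered batch of API commands.
using CommandList = std::vector<ApiCommand>;

// Hypothetical translated command, standing in for GPU ISA.
struct IsaCommand {
    std::string opcode;
    int operand;
};

// Stand-in for the driver step: optionally rewrite a command
// (here, an invented shader replacement), then translate it.
std::vector<IsaCommand> driverTranslate(const CommandList& list) {
    std::vector<IsaCommand> out;
    out.reserve(list.size());
    for (const ApiCommand& cmd : list) {
        int arg = cmd.arg;
        if (cmd.name == "SetShader" && arg == 7) {
            arg = 70;  // pretend shader 7 has a driver-side replacement
        }
        out.push_back({"ISA_" + cmd.name, arg});
    }
    return out;
}
```

Feeding it a list such as `{{"SetShader", 7}, {"Draw", 3}}` gives back two translated commands, with the shader swapped and the draw left untouched. The point is only that the application never talks to the GPU in its own vocabulary; the driver sits in between and costs CPU time for every batch.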
Multi-threaded command listing allows the DirectX driver to pre-record lists of commands on idling CPU cores. These lists of commands are then played back to the graphics driver using the CPU's primary core (thread 0). Why? The DirectX driver can only run on the primary CPU thread.
Multi-threaded rendering (DirectX runtime MT + DirectX driver MT):
This is more or less the same as above (the DirectX runtime can also scale past 4 cores) except for the last part: the DirectX driver doesn't need to play back the commands over the primary CPU thread. Any CPU core/thread can talk directly to the graphics driver and send it its own command lists. How? The DirectX driver is split amongst every CPU thread.
NVIDIA
NVIDIA's driver uses more than one thread (hidden driver threads) to perform the DirectX driver translations into ISA. These commands are kept in system memory and fetched in bulk by the GigaThread Engine. This saves on CPU time: commands can be sent in bulk, and the CPU can then handle other complex tasks without creating a stall. Lower DX11 API overhead. Result: higher draw call rate.
AMD
The AMD driver wouldn't benefit from being multi-threaded because there is only a 64-thread slot in the command processor. So even if multi-threaded command listing and deferred rendering were used, the command processor could only fetch 64 threads at a time. That means constant fetching or streaming of work. If the CPU is busy with some other work, a stall occurs, and the GPU waits for the CPU to feed it. Hence GCN's higher DX11 API overhead. Result: lower draw call rate.
Edited by Mahigan - 2/24/16 at 9:37am
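The "constant fetching" argument in the AMD section above boils down to simple arithmetic, so here is a tiny C++ sketch of it. It only counts how many separate fetches a front end needs for a given amount of queued work. The function name `fetchRounds` is invented for this example, the 64 comes from the post, and the whole thing is a toy model under that assumption, not a description of how either vendor's hardware actually schedules work.

```cpp
#include <cassert>
#include <cstddef>

// Number of fetches needed to drain `queued` work items when the
// front end can take at most `slots` items per fetch.
// Assumption: slots must be greater than zero.
std::size_t fetchRounds(std::size_t queued, std::size_t slots) {
    // Ceiling division: a partially filled last fetch still counts.
    return (queued + slots - 1) / slots;
}
```

With 1000 queued items, `fetchRounds(1000, 64)` comes out to 16 fetches, while a front end that can take everything at once, `fetchRounds(1000, 1000)`, needs 1. In this toy model every extra fetch is another moment where the CPU has to be available to feed the GPU, which is the stall risk described above.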