Overclock.net › Forums › Industry News › Video Game News › [computerbase.de] DOOM + Vulkan Benchmarked.

[computerbase.de] DOOM + Vulkan Benchmarked. - Page 40

post #391 of 632

Are you really sure that the driver can't ignore a fence, even if it says "I can't do it" (this is what FM_Jarnis mentioned)?

post #392 of 632
I've been reading this thread for the past hour and a half, and I'd like to thank those of you who understand this a lot better than I do for your detailed explanations and comments (Mahigan being one of them).

Thank you, this thread cleared up a lot of false claims I used to think were correct.
post #393 of 632
Quote:
Originally Posted by Xuper

Are you really sure that the driver can't ignore a fence, even if it says "I can't do it" (this is what FM_Jarnis mentioned)?

Nope. A driver cannot ignore a fence, which is why Maxwell incurs a performance penalty under AotS when Async is turned on even though the driver does not support Asynchronous Compute. This is why AMD, nVIDIA, and Microsoft suggested that every DX12 developer keep a non-Async Compute path in their code.
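
For anyone who wants to see what a fence wait looks like in practice, here is a minimal sketch in Python. It is a CPU-side analogy only, not Direct3D 12 code: the `Fence` class, `run_frame`, and the two queue functions are hypothetical names made up for illustration, loosely mirroring what `ID3D12CommandQueue::Signal` and `ID3D12CommandQueue::Wait` do with an `ID3D12Fence`. It assumes a fence is just a counter that only goes up, and that a waiter stalls until the counter reaches the value it asked for.

```python
import threading


class Fence:
    """Hypothetical stand-in for a GPU fence: a counter that only increases."""

    def __init__(self):
        self._value = 0
        self._cond = threading.Condition()

    @property
    def completed_value(self):
        with self._cond:
            return self._value

    def signal(self, value):
        # Raise the counter (never lower it) and wake every waiter.
        with self._cond:
            self._value = max(self._value, value)
            self._cond.notify_all()

    def wait(self, value, timeout=None):
        # Block until the counter reaches `value`.
        # Returns False if the timeout expires first.
        with self._cond:
            return self._cond.wait_for(lambda: self._value >= value, timeout)


def run_frame():
    """Two simulated queues; graphics must not read before compute has written."""
    fence = Fence()
    log = []

    def compute_queue():
        log.append("compute: writes buffer")
        fence.signal(1)  # the work guarded by fence value 1 is done

    def graphics_queue():
        fence.wait(1)  # stalls here until compute signals value 1
        log.append("graphics: reads buffer")

    graphics = threading.Thread(target=graphics_queue)
    compute = threading.Thread(target=compute_queue)
    graphics.start()  # started first on purpose; it still has to wait
    compute.start()
    graphics.join()
    compute.join()
    return log
```

Calling `run_frame()` always logs the compute write before the graphics read, even though the graphics thread is started first. In this toy model, the stall at `fence.wait(1)` is the cost of the synchronization point.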

Have a read..
http://www.dualshockers.com/2016/03/14/directx12-requires-different-optimization-on-nvidia-and-amd-cards-lots-of-details-shared/

Below you can check out a summary of the most interesting points, and the slides that were showcased during the presentation. Most of the data was obviously developer-facing, and very technical, but there are definitely some points even gamers like us can take away from the presentation.


"Consider architecture specific paths"

  • DirectX 12 is for those who want to achieve maximum GPU and CPU performance, but it demands significant engineering time, as it requires developers to write code at the driver level that DirectX 11 takes care of automatically. For that reason, it’s not for everyone.
  • Since it’s “closer to the metal” than DirectX 11, it requires different settings on certain things for Nvidia and AMD cards.
  • With DirectX 12 you’re not CPU-bound for rendering.
  • The command lists written in DirectX 12 need to be running as much as possible, without any delay at any point. There should be 15-30 of them per frame, bundled into 5-10 “ExecuteCommandLists” calls, each of which should include at least 200 microseconds of GPU work. Preferably more, up to 500 microseconds.
  • Scheduling latency on the operating system’s side takes 60 microseconds, so developers should put more than that much GPU work in each call; otherwise, whatever is left of the 60 microseconds is wasted idling.
  • Bundles, which are the main new feature of DirectX 12, are great to send work to the GPU very early in each frame, and that’s very advantageous for applications that require very low latency like VR.
  • They’re not inherently faster on the GPU. The gain is all on the CPU side, so they need to be used wisely. Optimizing bundles diverges for Nvidia and AMD cards, and requires a different approach for each. In particular, for AMD cards bundles should be used only if the game is struggling on the CPU side.
  • Compute queues still haven’t been completely researched on DirectX 12. For the moment, they can offer 10% gains if done correctly, but there might be more gains coming as more research is done on the topic.
  • Since those gains don’t automatically happen unless things are set up correctly, developers should always check whether they actually materialize, as poorly scheduled compute tasks can result in the opposite outcome.
  • The use of root signature tables is where optimization between AMD and Nvidia diverges the most, and developers will need brand-specific settings in order to get the best benefits on both vendors’ cards.
  • When developers find themselves with not enough video memory, DirectX 12 allows them to create overflow heaps in system memory, moving resources out of video memory at their own discretion.
  • Using aliased memory on DirectX 12 allows developers to save even more GPU memory.
  • DirectX 12 introduces Fences, which are basically GPU semaphores, making sure that the GPU has finished working on a resource before it moves on to the next.
  • Multi-GPU functionality is now embedded in the DirectX 12 API.
  • It’s important for developers to keep in mind the bandwidth limitations of different versions of PCI Express (the interface between motherboard and video card), as PCIe 2.0 is still common and provides half the bandwidth of PCIe 3.0.
  • DirectX 12 includes a “Set Stable Power State” API, and some are using it. It’s only really useful for profiling, and even then only sometimes. It reduces performance and should not be used in a shipped game.
  • When deciding whether to use a pixel shader or a compute shader, there are “extreme” differences in pros and cons on Nvidia and AMD cards (as shown by the table in the gallery).
  • Conservative rasterization lets you draw all the pixels touched by a triangle of your 3D models. It was possible before using a geometry shader trick, but it was quite slow. Now it’s possible to enable neat effects like the ray traced shadows in Tom Clancy’s The Division. In the picture in the gallery below you can see the detail of the shadow, with the bike’s spokes visible on the ground. That wasn’t possible without using a ray traced technique, which is enabled only with conservative rasterization.
  • Tiled resources can now be used on 3D assets, and grant “extreme” performance and memory saving benefits.
  • DirectX 11 is still “very much alive” and will continue to be on the side of DirectX 12 for a while.
  • Developers can’t mix and match DirectX 11 and DirectX 12. Either they commit to DirectX 12 entirely, or they shouldn’t use it.
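
To put the command list numbers from the summary in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes a deliberately simple model that is not taken from the slides: the GPU work in one call hides the 60 microsecond scheduling latency of the next submission, and whatever part of that latency is not hidden shows up as idle GPU time. The function names and the model itself are illustrative assumptions; only the 60, 200, and 5-10 figures come from the summary.

```python
SCHEDULING_LATENCY_US = 60.0  # OS scheduling latency quoted in the summary


def idle_per_call_us(gpu_work_us, latency_us=SCHEDULING_LATENCY_US):
    """Idle GPU time for one submission under the assumed model.

    GPU work covers the scheduling latency; only the uncovered
    remainder (if any) counts as idle time.
    """
    return max(0.0, latency_us - gpu_work_us)


def frame_idle_us(work_per_call_us):
    """Total idle time over all submissions in one frame."""
    return sum(idle_per_call_us(work) for work in work_per_call_us)


# Ten tiny submissions of 20 us each: 40 us idle per call.
tiny_calls = [20.0] * 10
# Ten submissions at the recommended minimum of 200 us each.
recommended_calls = [200.0] * 10

wasted_tiny = frame_idle_us(tiny_calls)                # 400.0 us idle
wasted_recommended = frame_idle_us(recommended_calls)  # 0.0 us idle
```

Under these assumptions, ten 20 microsecond submissions leave the GPU idle for 400 microseconds per frame, while ten submissions at the recommended 200 microseconds leave no idle time at all. Real drivers and schedulers are more complicated, so treat this as an illustration of why the slides ask for larger batches, not as a measurement.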

Edited by Mahigan - 7/17/16 at 11:29am
post #394 of 632
Then why isn't Maxwell losing performance in Time Spy with Async on?
post #395 of 632
Quote:
Originally Posted by Greenland

Then why isn't Maxwell losing performance in Time Spy with Async on?

Either it makes very little use of it, or the async isn't really the async that MS dictates.
post #396 of 632
Quote:
Originally Posted by Greenland

Then why isn't Maxwell losing performance in Time Spy with Async on?

It could be due to...
  • Running compute shaders specifically optimized for nVIDIA GPUs (meaning optimized short running shaders).
  • Making very little use of Asynchronous Compute (less than any game title making use of it so far).
  • Not running Graphics and Compute in parallel.
post #397 of 632
Quote:
Originally Posted by Mahigan


The PowerPoint is available directly from Nvidia.
post #398 of 632
Is it possible that the fences are there even when the async toggle is off in the benchmark settings? Wouldn't that explain why Maxwell has no penalty from switching it on?
post #399 of 632
@Mahigan, unrelated, but you said that the 480 might be ROP bottlenecked, and that it could reach 980 Ti performance if it had 64 ROPs.

Do you still think that's the case? That was a couple of weeks ago.
post #400 of 632
Quote:
Originally Posted by specopsFI

Is it possible that the fences are there even when the async toggle is off in the benchmark settings? Wouldn't that explain why Maxwell has no penalty from switching it on?

 

Then it's not a benchmark! What is it if they're the same?
