Overclock.net › Forums › Industry News › Software News › [PCPER] NVIDIA Publishes DirectX 12 Tips for Developers

[PCPER] NVIDIA Publishes DirectX 12 Tips for Developers - Page 3

post #21 of 127
This one is also interesting...
Quote:
Don’ts

Don’t use Raster Order View (ROV) techniques pervasively
Guaranteeing order doesn’t come for free
Always compare with alternative approaches like advanced blending ops and atomics

Basically ROV, although Maxwell 2 supports it, comes with a hefty performance penalty.

Quote:
Do's

Use hardware conservative raster for full-speed conservative rasterization
No need to use a GS to implement a ‘slow’ software base conservative rasterization

Well, they do need to use a GS for that on GCN hardware. Therefore I suspect nVIDIA would like developers to use vendor-ID-specific paths.
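To make the "separate paths" idea concrete, here is a minimal sketch. It keys on the reported conservative rasterization capability rather than on the vendor ID the post mentions, which is a swapped-in alternative, not something the NVIDIA tips prescribe. The enum values are modelled on D3D12_CONSERVATIVE_RASTERIZATION_TIER; the function name and the returned path names are hypothetical.

```python
from enum import IntEnum


class ConservativeRasterTier(IntEnum):
    """Modelled on D3D12_CONSERVATIVE_RASTERIZATION_TIER (0 = not supported)."""
    NOT_SUPPORTED = 0
    TIER_1 = 1
    TIER_2 = 2
    TIER_3 = 3


def pick_conservative_raster_path(tier: ConservativeRasterTier) -> str:
    """Return the name of a hypothetical render path for the reported tier."""
    if tier >= ConservativeRasterTier.TIER_1:
        # Hardware conservative raster is available: use it directly.
        return "hardware"
    # No hardware support: fall back to the slower geometry-shader approach.
    return "geometry_shader_fallback"


if __name__ == "__main__":
    for tier in ConservativeRasterTier:
        print(tier.name, "->", pick_conservative_raster_path(tier))
```

In a real D3D12 application the tier would come from a feature-support query on the device; here it is simply passed in as an argument.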
post #22 of 127
Thread Starter 
Does Maxwell 2 support PreEmption?
post #23 of 127
Quote:
Originally Posted by Mahigan View Post

Well,

That's it then. The theory was right.
You meant it as a joke... but it is there.
That context switch is costly, sometimes to the tune of thousands of ms if a context switch gets stuck at the end of a long draw call. GCN's ACEs simply transfer intermediate results into the LDS (Local Data Share Cache), work on the pre-empted request, then pull the intermediate results back from cache. All within a single cycle.
Does it mean AMD can always retire 1 op each cycle? That's what I was suspecting, as AMD's pipes are 4-deep vector and asynchronous shading removes that latency, from what I understand.
PS: If that's possible, AMD's vector units get the best of both scalar and vector capabilities.
Edited by mtcn77 - 9/27/15 at 9:32am
post #24 of 127
Quote:
Originally Posted by Glottis View Post

oh yes, game devs should totally cripple performance on nvidia gpus by needlessly overusing async compute like Ashes did just to prove some sick point. yes, that's so good and healthy for gaming community! if you actually bothered reading what more neutral game devs have to say about this matter you would see they just want to optimize their game so people have best experience regardless of gpu brand. speaking of maxwell 2, you do realize that 980Ti is still #1 DX12 card out there. it's only in mid range where amd has a few faster cards and only in async heavy game like ashes.

Ashes of the Singularity made little use of Async Compute. What Oxide likely did was use many context switches (Graphics to Compute) in the minimal Async Compute they did make use of. So it is not heavy usage of Async Compute that leads to a performance loss on the nVIDIA Maxwell 2 architecture, it is the use of many context switches. The problem is that by limiting the use of context switches, you limit the use of Asynchronous Compute.

Ashes makes mild usage of Async Compute, as does Fable Legends (5% use). Therefore these are not Async-heavy games. Deus Ex: Mankind Divided will likely be a heavier Async title. All of the DX12 titles I've looked at, so far, make use of Async Compute to one degree or another. Which is why the performance we're seeing in the recent game engine benchmarks will likely translate across the board.

Hence my recommendations :)
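The distinction drawn in this post (number of switches versus amount of compute work) can be shown with a toy model. This is only a sketch under loud assumptions: work is serialized, every item costs the same, and each change of workload type adds a fixed penalty. The function names and all the numbers are made up for illustration and are not measurements of any GPU.

```python
def count_context_switches(workloads):
    """Count transitions between different workload types in submission order."""
    return sum(1 for prev, cur in zip(workloads, workloads[1:]) if prev != cur)


def total_time_ms(workloads, work_ms, switch_ms):
    """Toy serial model: each item costs work_ms, each type change adds switch_ms."""
    return len(workloads) * work_ms + count_context_switches(workloads) * switch_ms


# Same amount of graphics and compute work, submitted in two different orders.
interleaved = ["gfx", "compute"] * 4      # 8 items, 7 switches
batched = ["gfx"] * 4 + ["compute"] * 4   # 8 items, 1 switch

if __name__ == "__main__":
    print(total_time_ms(interleaved, work_ms=1.0, switch_ms=0.5))  # 11.5
    print(total_time_ms(batched, work_ms=1.0, switch_ms=0.5))      # 8.5
```

Both submissions contain the same amount of compute work; only the number of graphics/compute transitions differs, and in this model that alone accounts for the gap.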
post #25 of 127
What are we to do now? :) NVIDIA is better in DX11, while the 290 seems to be absolutely out of this world on price/performance in DX12... Multi-adapter? :D
Can async be assigned only to an AMD card while running multi-adapter, with the GameWorks stuff assigned to the NVIDIA card?
post #26 of 127
Quote:
Originally Posted by ZealotKi11er View Post

Does Maxwell 2 support PreEmption?

Coarse grained preemption. Which means you're limited as to when you can preempt. Usually at the end of a draw call. This is why Maxwell 2 reuses the previously rendered frame in VR when a preemption request is made. If it didn't do this then you'd see a visual glitch (missing frame). This glitch made people feel sick.
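A rough way to picture preemption at draw-call boundaries: the request has to wait until whichever draw call is currently running finishes. The sketch below computes that wait for a list of draw-call durations. It is a simplified model with hypothetical names and invented timings, not a description of how any driver actually schedules work.

```python
from itertools import accumulate


def preemption_wait_ms(draw_durations_ms, request_time_ms):
    """Time a request waits if preemption is only possible at draw-call ends.

    draw_durations_ms: durations of back-to-back draw calls, starting at t = 0.
    request_time_ms:   when the preemption request arrives.
    """
    for boundary in accumulate(draw_durations_ms):
        if boundary >= request_time_ms:
            # First draw-call boundary at or after the request.
            return boundary - request_time_ms
    return 0.0  # request arrived after all draws had already finished


if __name__ == "__main__":
    draws = [2.0, 10.0, 1.0]  # one long draw call in the middle
    print(preemption_wait_ms(draws, 3.0))  # 9.0: stuck behind the 10 ms draw
    print(preemption_wait_ms(draws, 2.0))  # 0.0: arrived exactly on a boundary
```

The worst case is a request that arrives just after a long draw call has started, which is the situation VR frame reuse is described as papering over.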
post #27 of 127
Quote:
Originally Posted by 47 Knucklehead View Post


And if AMD comes up with their own list, do you really expect them to say "DO: Support Conservative Rasterization and Raster Order Views, etc"? Yeah, I think not. :rolleyes:

At least AMD don't have a performance problem with these features, as they're officially not supported. So they just don't run at all, and everyone is aware. It's not like we know how much async compute is OK with Nvidia and how much isn't.

So for all we know, the best call is to simply disable async compute features when running on Nvidia hardware. I think that'd make for a really good tip. Not this questionable claim that context switching is 'heavyweight' in DX12. (edit: and I honestly don't like it when people communicate something untrue while being unclear enough that they don't necessarily mean something untrue. They could always claim that, in the context of the tips, it is implied that they mean 'on our hardware', though from a linguistic standpoint I have a hard time seeing it.)
Edited by Tivan - 9/27/15 at 9:22am
post #28 of 127
Thread Starter 
Quote:
Originally Posted by Mahigan View Post

Coarse grained preemption. Which means you're limited as to when you can preempt. Usually at the end of a draw call. This is why Maxwell 2 reuses the previously rendered frame in VR when a preemption request is made. If it didn't do this then you'd see a visual glitch (missing frame). This glitch made people feel sick.

Doesn't that increase latency though?
post #29 of 127
Quote:
Originally Posted by ZealotKi11er View Post

Doesn't that increase latency though?
Like the render-ahead frame target, I guess. A value of 3 improves fps by hiding CPU latency, but the extra frames are wasted if the CPU cannot correctly anticipate what you will do next.
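As a back-of-the-envelope illustration of the render-ahead comparison: if up to N frames are queued ahead of display, the oldest queued frame reflects input that is roughly N frame times old. The helper below is a worst-case simplification with a hypothetical name; real pipelines add and hide latency in other places too.

```python
def render_ahead_latency_ms(queued_frames, fps):
    """Rough upper bound on latency added by a queue of pre-rendered frames."""
    frame_time_ms = 1000.0 / fps
    return queued_frames * frame_time_ms


if __name__ == "__main__":
    # Three frames queued at 50 fps: up to about 60 ms of added latency.
    print(render_ahead_latency_ms(3, 50))
```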
post #30 of 127
Quote:
Originally Posted by ZealotKi11er View Post

This time around AMD and Nvidia have to tell developers what to do instead of doing it themselves on their end. The true results of this change will show once more games come out in DX12, in how their Day 1 performance/stability is.

That's because they have added DX12 support retroactively. They released cards before DX12 was finalized and promised support so you wouldn't have to upgrade.

This is the cost of that backwards support: quirky behavior that needs to be worked around.
Edited by RagingCain - 9/27/15 at 9:23am