Overclock.net › Forums › Industry News › Software News › [Various] Ashes of the Singularity DX12 Benchmarks
[Various] Ashes of the Singularity DX12 Benchmarks - Page 217  

post #2161 of 2682
Quote:
Originally Posted by Kollock View Post

Wow, lots more posts here. There are just too many things to respond to, so I'll try to answer what I can.

/inconvenient things I'm required to ask or they won't let me post anymore
Regarding screenshots and other info from our game, we appreciate your support but please refrain from disclosing these until after we hit early access. It won't be long now.
/end

Regarding batches, we use the term batches just because we are counting both draw calls and dispatch calls. Dispatch calls are compute shaders, draw calls are normal graphics shaders. Though sometimes everyone calls dispatches draw calls, they are different, so we thought we'd avoid the confusion by calling everything a batch.
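
To make the counting concrete, here is a minimal sketch of a batch counter, assuming a frame is just a list of commands tagged as either a draw or a dispatch. The command names and the `count_batches` helper are hypothetical and are not taken from the Ashes engine.

```python
from collections import Counter

# Hypothetical command stream for one frame; the names are illustrative only.
frame_commands = [
    ("draw", "terrain"),
    ("draw", "units"),
    ("dispatch", "particle_sim"),
    ("draw", "ui"),
    ("dispatch", "lighting"),
]

def count_batches(commands):
    """Return per-kind counts plus the combined batch total."""
    kinds = Counter(kind for kind, _ in commands)
    return {
        "draw_calls": kinds["draw"],          # graphics work
        "dispatch_calls": kinds["dispatch"],  # compute work
        "batches": kinds["draw"] + kinds["dispatch"],
    }

stats = count_batches(frame_commands)
print(stats)
```

The only point of the sketch is that the batch figure is the sum of both kinds of call, so it can be larger than a pure draw call count for the same frame.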

Regarding CPU load balancing on D3D12, that's entirely the application's responsibility. So if you see a case where it's not load balancing, it's probably the application, not the driver/API. We've done some additional tuning of the engine even in the last month and can clearly see usage cases where we can load 8 cores at maybe 90-95% load. Getting to 90% on an 8 core machine makes us really happy. Keeping our application tuned to scale like this is definitely an ongoing effort.
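
As a rough illustration of what "the application's responsibility" can look like, the sketch below splits a frame's batches evenly across worker threads, lets each worker record its own command list, and keeps the lists in a fixed order for submission. This is an assumed structure, not Oxide's actual scheduler: `split_evenly`, `record_command_list` and `record_frame` are hypothetical names, and Python threads will not show real multi-core scaling, so treat it as a shape of the idea only.

```python
from concurrent.futures import ThreadPoolExecutor

def split_evenly(items, parts):
    """Split items into `parts` contiguous chunks whose sizes differ by at most one."""
    base, extra = divmod(len(items), parts)
    chunks, start = [], 0
    for i in range(parts):
        size = base + (1 if i < extra else 0)
        chunks.append(items[start:start + size])
        start += size
    return chunks

def record_command_list(batches):
    """Stand-in for a worker thread recording its own command list."""
    return [f"cmd:{b}" for b in batches]

def record_frame(batches, workers=8):
    """Record one command list per worker and return them in chunk order."""
    chunks = split_evenly(batches, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in input order, so submission order is deterministic.
        return list(pool.map(record_command_list, chunks))

lists = record_frame(list(range(20)), workers=8)
print([len(cl) for cl in lists])
```

If the chunks were uneven (say, one worker got most of the batches), the other cores would sit idle no matter what the driver did, which is the kind of imbalance the post attributes to the application.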

Additionally, hitches and stalls are largely the application's responsibility under D3D12. In D3D12, essentially everything that could cause a stall has been removed from the API. For example, the pipeline objects are designed such that the dreaded shader recompiles won't ever have to happen. We also have precise control over how long a graphics command is queued up. This is pretty important for VR applications.

Also keep in mind that the memory model for D3D12 is completely different from D3D11, at an OS level. I'm not sure if you can honestly compare things like memory load against each other. In D3D12 we have more control over residency and we may, for example, intentionally keep something unused resident so that there is no chance of a micro-stutter if that resource is needed. There is no reliable way to do this in D3D11. Thus, comparing memory residency between the two APIs may not be meaningful, at least not until everyone's had a chance to really tune things for the new paradigm.
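
The residency point can be sketched with a toy model: a memory budget with least-recently-used eviction, plus a "pinned" set of resources that are kept resident even when unused. D3D12 exposes residency through device calls such as MakeResident and Evict; the class below does not call them, and `ResidencySet`, its budget and the resource names are all hypothetical.

```python
from collections import OrderedDict

class ResidencySet:
    """Toy residency tracker: LRU eviction, except for pinned resources."""

    def __init__(self, budget_mb):
        self.budget_mb = budget_mb
        self.resident = OrderedDict()  # name -> size in MB, least recently used first
        self.pinned = set()            # resources the application refuses to evict

    def used_mb(self):
        return sum(self.resident.values())

    def touch(self, name, size_mb, pin=False):
        """Make a resource resident, or mark it as recently used."""
        if pin:
            self.pinned.add(name)
        if name in self.resident:
            self.resident.move_to_end(name)
            return
        self.resident[name] = size_mb
        self._evict_to_budget(keep=name)

    def _evict_to_budget(self, keep):
        # Walk from least to most recently used, skipping pinned resources.
        for candidate in list(self.resident):
            if self.used_mb() <= self.budget_mb:
                break
            if candidate in self.pinned or candidate == keep:
                continue
            del self.resident[candidate]

rs = ResidencySet(budget_mb=100)
rs.touch("explosion_fx", 40, pin=True)  # rarely used, but kept to avoid a hitch later
rs.touch("terrain", 50)
rs.touch("units", 40)                   # over budget: terrain goes, explosion_fx stays
print(list(rs.resident), rs.used_mb())
```

A memory monitor looking at this process would count the pinned but idle resource as "in use", which is one reason a raw memory comparison against a D3D11 build may say little about efficiency.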

Regarding SLI and CrossFire situations, yes, support is coming. However, those options in the ini file probably do not do what you think they do, just FYI. Some posters here have been remarkably perceptive on different multi-GPU modes that are coming, and let me just say that we are looking beyond just the standard CrossFire and SLI configurations of today. We think that multi-GPU situations are an area where D3D12 will really shine (once we get all the kinks ironed out, of course). I can't promise when this support will be unveiled, but we are committed to doing it right.

Regarding Async Compute, a couple of points on this. First, though we are the first D3D12 title, I wouldn't hold us up as the prime example of this feature. There are probably better demonstrations of it. This is a pretty complex topic, and fully understanding it requires detailed knowledge of the particular GPU in question that only an IHV can provide. I certainly wouldn't hold Ashes up as the premier example of this feature.

We actually just chatted with Nvidia about Async Compute. Indeed, the driver hasn't fully implemented it yet, although it appeared as if it had. We are working closely with them as they fully implement Async Compute. We'll keep everyone posted as we learn more.

Also, we are pleased that D3D12 support on Ashes should be functional on Intel hardware relatively soon (actually, it's functional now; it's just a matter of getting the right driver out to the public).

Thanks!

@Kollock Thank you :)
post #2162 of 2682
Quote:
Originally Posted by Mahigan View Post

Quote:
Originally Posted by Noufel View Post

I don't know if the performance gain will be the same as GCN with a simple driver optimization ???

Slower context switching... like AMD said. It depends on the work load. At least now it might be working. If it works for Maxwell 2, in theory, they could make it work for Maxwell. Unless Maxwell's Warp Schedulers don't function Asynchronously.

Why didn't nVIDIA just admit to this in the first place? Why stay silent on the matter? Like I said earlier...

"We're aware of the issue and we're working to get it resolved as quickly as possible".
They didn't know that it would be a major issue, and thanks to people like you, Mahigan, they admitted it. :thumb:
post #2163 of 2682
Quote:
Originally Posted by Mahigan View Post


Nope. That's where you get the "Performance per Watt" figures of Kepler and Maxwell/2. Hardware schedulers take up a lot of power.


Things just got a whole lot more interesting :D

 

I seeeeeeee! :thumb:

post #2164 of 2682
Quote:
Originally Posted by Noufel View Post

I don't know if the performance gain will be the same as GCN with a simple driver optimization ???

Probably not, that's what my theory was about... but it should help the GTX 980 Ti get a boost. Once Oxide works on more post-processing effects and optimizes for the Fury X, the GCN parts should get another boost in that direction as well (something Kollock mentioned).

At least now we know that Asynchronous Compute is in fact software driven for Maxwell 2. We know it can be activated with a driver update, at least that's what nVIDIA have stated. We're still at a wait and see phase imo.

Now watch as the internet explodes once again LOL
post #2165 of 2682
Quote:
Originally Posted by Mahigan View Post

Nope. That's where you get the "Performance per Watt" figures of Kepler and Maxwell/2. Hardware schedulers take up a lot of power.


Things just got a whole lot more interesting :D

I don't have a 750W PSU for a single CPU/GPU setup for nothing lol.

Also - I must thank Kollock for continuing to monitor and answer questions in this thread.
More importantly, answering with professionalism and restraint - it's a rarity in this thread haha.
post #2166 of 2682

OK, quick question: does that mean Nvidia has to implement it in every game? Should Nvidia create a new profile for each game that makes heavy use of AC?

post #2167 of 2682
Quote:
Originally Posted by Xuper View Post

OK, quick question: does that mean Nvidia has to implement it in every game? Should Nvidia create a new profile for each game that makes heavy use of AC?

That's up to the developers. If they code a game to make use of a high amount of Asynchronous Compute... then, in theory, it should hit Maxwell 2 harder than GCN. GCN was built to do everything in hardware.

What I'm wondering is how the CPU load will look once nVIDIA activate it in their drivers. For GCN, the driver just sends the load to the GPU based on the queue the developer picks (Compute/Graphics/Copy) and the GCN schedulers (ACEs and Graphics Command Processor) handle the rest... it seems that for Maxwell 2, the driver actually plays a larger role in distributing the work to the various elements without the graphics card being involved in the process. In theory, this should show up as higher CPU overhead for nVIDIA when handling Asynchronous Compute.
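
For readers trying to picture the three queue types, here is a heavily idealized sketch. It assumes each piece of work is tagged with the queue the developer picked and compares running everything back to back against a best case where the three queues overlap perfectly with no contention. The durations are made up, and the model says nothing about how GCN or Maxwell 2 actually schedule work.

```python
from collections import defaultdict

# Hypothetical per-frame work in milliseconds, tagged with the chosen queue.
work = [
    ("graphics", 6.0),
    ("compute", 2.0),
    ("copy", 1.0),
    ("graphics", 4.0),
    ("compute", 1.5),
]

def per_queue_totals(items):
    """Sum the work submitted to each queue type."""
    totals = defaultdict(float)
    for queue, ms in items:
        totals[queue] += ms
    return dict(totals)

def serial_time(items):
    """Everything runs one after another, as if there were a single queue."""
    return sum(ms for _, ms in items)

def ideal_overlap_time(items):
    """Best case: queues run fully in parallel, so the longest queue sets the frame time."""
    return max(per_queue_totals(items).values())

print(per_queue_totals(work))
print(serial_time(work), ideal_overlap_time(work))
```

Real hardware lands somewhere between the two numbers, because the queues compete for the same shader units, bandwidth and caches, and because whoever does the scheduling (hardware or driver) adds its own overhead and latency.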

It also means more latency... and this is what the VR guys have been talking about. It is also what I mentioned in my original theory.

Now we know nVIDIA didn't lie... well... wait and see before we draw that conclusion. Software support is still valid as many DX12 features are emulated in software in this generation of cards.

Quote:
Originally Posted by Noufel View Post

They didn't know that it would be a major issue, and thanks to people like you, Mahigan, they admitted it. :thumb:

Don't thank me... thank everyone... everyone who made this into a big issue compelling a response :)
Edited by Mahigan - 9/4/15 at 3:08pm
post #2168 of 2682
Quote:
Originally Posted by Mahigan View Post

Probably not, that's what my theory was about... but it should help the GTX 980 Ti get a boost. Once Oxide works on more post-processing effects and optimizes for the Fury X, the GCN parts should get another boost in that direction as well (something Kollock mentioned).

At least now we know that Asynchronous Compute is in fact software driven for Maxwell 2. We know it can be activated with a driver update, at least that's what nVIDIA have stated. We're still at a wait and see phase imo.

Now watch as the internet explodes once again LOL

I don't understand - is it software only? Is there nothing at all in the NV GPU hardware that could carry out such functionality?
It's hard to follow all these posts.
post #2169 of 2682
Quote:
Originally Posted by Mahigan View Post

That's up to the developers. If they code a game to make use of a high amount of Asynchronous Compute... then, in theory, it should hit Maxwell 2 harder than GCN. GCN was built to do everything in hardware.

What I'm wondering is how the CPU load will look once nVIDIA activate it in their drivers. For GCN, the driver just sends the load to the GPU based on the queue the developer picks (Compute/Graphics/Copy) and the GCN schedulers (ACEs and Graphics Command Processor) handle the rest... it seems that for Maxwell 2, the driver actually plays a larger role in distributing the work to the various elements without the graphics card being involved in the process. In theory, this should show up as higher CPU overhead for nVIDIA when handling Asynchronous Compute.

It also means more latency... and this is what the VR guys have been talking about. It is also what I mentioned in my original theory.

Now we know nVIDIA didn't lie... well... wait and see before we draw that conclusion. Software support is still valid as many DX12 features are emulated in software in this generation of cards.
Don't thank me... thank everyone... everyone who made this into a big issue compelling a response :)

Well, they didn't lie TECHNICALLY. If Async Compute is CPU-based on Maxwell 2, this would mean problems if the CPU has a low core count (i3 or i5) and is already being pushed hard by the nature of the game.
post #2170 of 2682
Quote:
Originally Posted by Mahigan View Post

That's up to the developers. If they code a game to make use of a high amount of Asynchronous Compute... then, in theory, it should hit Maxwell 2 harder than GCN. GCN was built to do everything in hardware.

What I'm wondering is how the CPU load will look once nVIDIA activate it in their drivers. For GCN, the driver just sends the load to the GPU based on the queue the developer picks (Compute/Graphics/Copy) and the GCN schedulers (ACEs and Graphics Command Processor) handle the rest... it seems that for Maxwell 2, the driver actually plays a larger role in distributing the work to the various elements without the graphics card being involved in the process. In theory, this should show up as higher CPU overhead for nVIDIA when handling Asynchronous Compute.

It also means more latency... and this is what the VR guys have been talking about. It is also what I mentioned in my original theory.

Now we know nVIDIA didn't lie... well... wait and see before we draw that conclusion. Software support is still valid as many DX12 features are emulated in software in this generation of cards.
Don't thank me... thank everyone... everyone who made this into a big issue compelling a response :)

Cool! :thumb:

But thanks anyway Mahigan! :D
Quote:
Originally Posted by ku4eto View Post

Well, they didn't lie TECHNICALLY. If Async Compute is CPU-based on Maxwell 2, this would mean problems if the CPU has a low core count (i3 or i5) and is already being pushed hard by the nature of the game.

So us folks with 12 and 16 threads shouldn't worry AT ALL right?
This thread is locked  