
[Gamer's Nexus] PCIe x16/x16 vs. x8/x8 (Dual Titan V Bandwidth Limit Test)

#1 ·
Quote:
We've previously found unexciting differences of <1% gains between x16 vs. x8 PCIe 3.0 arrangements, primarily relying on GTX 1080 Ti GPUs for the testing. There were two things we wanted to overhaul on that test: (1) Increase the count of GPUs to at least two, thereby placing greater strain on the PCIe bus (x16/x16 vs. x8/x8), and (2) use more powerful GPUs.
...
It's time to revisit PCIe bandwidth testing. We're looking at the point at which a GPU can exceed the bandwidth limitations of the PCIe Gen3 slot, particularly when in x8 mode. This comparison includes dual Titan V testing in x8 and x16 configurations, pushing the limits of the 1GB/s/lane limits of the PCIe slots.

Testing PCIe x8 vs. x16 lane arrangements can be done a few ways, including: (1) Tape off the physical pins on the PCIe foot of the GPU, thereby forcing x8 mode; (2) switch motherboard PCIe generation to Gen2 for half the bandwidth, but potentially introduce variables; (3) use a motherboard with slots which are physically wired for x8 or x16.

Our test platform includes 2x Titan V cards, the EVGA X299 Dark motherboard, an Intel i9-7980XE, and 32GB 3866MHz GSkill Trident Z Black. We were using the Titan Vs under air for these tests. When overclocked, they were set to a stable OC that was achievable on both cards -- +150 core and HBM2.

Ashes is being run at 4K with completely maxed settings. We set them to "Crazy," then manually increment all options to the highest point, including 8xMSAA. This was required to ensure adequate GPU work, thereby reducing the potential for a CPU bottleneck.
...
What we are left with, however, is a somewhat strong case for waning PCIe bandwidth sufficiency as we move toward the next generation - likely named something other than Volta, but built atop it. SLI or HB SLI bridges may still be required on future nVidia designs, as it's possible that a 1080 Ti successor could encounter this same issue, and would need an additional bridge to transact without limitations.


Source

So we are finally reaching a point where PCIe 3.0 x8 links may be a bottleneck. PCIe 4.0 is coming - probably around 2019 to 2020 - and its main attraction has been NVMe SSDs, but if GPUs continue to advance, the slot itself may become the limiting factor. Keep in mind that we should not be comparing against the Titan V specifically - the point is that the big-die GPUs of 2019 or 2020 would otherwise also be using PCIe 3.0.
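As a rough sketch of the raw numbers involved (nominal transfer rates and line encoding only - real-world throughput is somewhat lower), per direction:

```cpp
// Back-of-envelope PCIe bandwidth per direction, by generation and lane count.
// Nominal transfer rates and line encodings only; protocol overhead is ignored.
#include <cstdio>

int main() {
    struct Gen { const char* name; double gt_per_s; double encoding; };
    const Gen gens[] = {
        {"PCIe 2.0", 5.0,  8.0 / 10.0},    // 8b/10b encoding
        {"PCIe 3.0", 8.0,  128.0 / 130.0}, // 128b/130b encoding
        {"PCIe 4.0", 16.0, 128.0 / 130.0},
    };
    const int lane_counts[] = {8, 16};

    for (const Gen& g : gens)
        for (int lanes : lane_counts) {
            double GBps = g.gt_per_s * g.encoding / 8.0 * lanes; // GT/s -> GB/s per lane
            printf("%s x%-2d ~ %5.1f GB/s\n", g.name, lanes, GBps);
        }
    return 0;
}
```

Gen4 x8 lands at roughly the same ~15.8 GB/s as Gen3 x16, which is exactly why a doubled per-lane rate would relieve mainstream x8/x8 setups.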

Edit: Oh, and we finally have a solution that can average more than 60 fps at 4K on Crazy settings in Ashes of the Singularity: Escalation. The problem, of course, is the cost, and the fact that the 0.1% lows are still hovering around 20 fps, with no gains from the second GPU.
 
#2 ·
That's a relatively minimal hit for cutting bandwidth in half in a top of the line multi-GPU rendering configuration that doesn't have any sort of bridge dedicated to frame compositing.

Probably time for those looking to continue using top-of-the-line multi-GPU configs to run x16/x16 3.0 if possible, but that's a pretty small segment.

Single card is probably fine at far lower interface bandwidth, probably even with most external GPU enclosures over Thunderbolt (something I'd also like to see tested).
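For a ballpark on the Thunderbolt case: a TB3 enclosure typically tunnels something in the neighborhood of a PCIe 3.0 x4 link, so a quick comparison against full-size slots (illustrative figures, assuming that x4 tunnel; real enclosures lose a bit more to protocol overhead) looks like this:

```cpp
// Rough comparison of a Thunderbolt 3 eGPU link vs. full-size PCIe 3.0 slots,
// per direction. Assumes the enclosure tunnels roughly a Gen3 x4 link.
#include <cstdio>

int main() {
    const double lane_GBps = 8.0 * (128.0 / 130.0) / 8.0; // ~0.985 GB/s per Gen3 lane

    printf("Thunderbolt 3 eGPU (~Gen3 x4): ~%4.1f GB/s\n", 4 * lane_GBps);
    printf("Desktop slot, Gen3 x8:         ~%4.1f GB/s\n", 8 * lane_GBps);
    printf("Desktop slot, Gen3 x16:        ~%4.1f GB/s\n", 16 * lane_GBps);
    return 0;
}
```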
 
#3 ·
Quote:
Originally Posted by CrazyElf View Post

Edit: Oh, and we finally have a solution that can average more than 60 fps at 4K on Crazy settings in Ashes of the Singularity: Escalation. The problem, of course, is the cost, and the fact that the 0.1% lows are still hovering around 20 fps, with no gains from the second GPU.
That would be the game engine doing something that takes significant CPU time, causing a bottleneck that prevents the GPUs from being fed data any faster during that calculation.
 
#4 ·
This puts people wanting this setup in a weird situation: go for the 8700K, which is the best gaming CPU but is limited on PCIe bandwidth, or buy the more workstation-style 7900X, with no bandwidth problems but clearly less CPU performance...

In the end, SLI, especially with a card this expensive, is not very practical these days, so... it's interesting to see - it shows that the new PCIe generation, with double the bandwidth per lane, could soon be beneficial in some dual-GPU cases.
 
#5 ·
Quote:
Originally Posted by HMBR View Post

This puts people wanting this setup in a weird situation: go for the 8700K, which is the best gaming CPU but is limited on PCIe bandwidth, or buy the more workstation-style 7900X, with no bandwidth problems but clearly less CPU performance...

In the end, SLI, especially with a card this expensive, is not very practical these days, so... it's interesting to see - it shows that the new PCIe generation, with double the bandwidth per lane, could soon be beneficial in some dual-GPU cases.
The 7900X is not less performant. I am running it at 4.7 GHz daily, and I can go up to 4.8, which removes any possible bottlenecks.
 
#8 ·
Quote:
Originally Posted by Nautilus View Post

At x16 it does not make any difference, single vs. SLI - no bottleneck.
That's not what I'm saying... two Titan Vs @ x16 are 14% faster than two Titan Vs @ x8. But for a single Titan V, x16 vs. x8 makes no difference.
Edit: ah, it's because they don't use an SLI bridge - they rely on that DX12 multi-GPU thing (can't remember what it's called, lol).
 
#9 ·
Quote:
Originally Posted by Nautilus View Post

The 7900X is not less performant. I am running it at 4.7 GHz daily, and I can go up to 4.8, which removes any possible bottlenecks.
Skylake-X will always perform worse in high-refresh-rate scenarios than the older ring-bus CPUs. In this case the 8700K would overtake it in performance, and it clocks higher than your 7900X on average.
 
#10 ·
Hi,
Here are some more practical numbers on how PCIe and bridge bandwidth can influence performance/scaling when games are maxed out (especially with a G-SYNC panel, which puts extra load on the required bandwidth). They are about a year and a half old by now, but they show very well that a mainstream socket with PCIe 3.0 x8/x8 (simulated by running Gen2 x16/x16) is not sufficient in some cases, especially at high resolution with temporal AA and adaptive sync. So PCIe 4.0 x8/x8 is going to be a big boost for upcoming mainstream sockets. The interesting thing is that mGPU under DX12 does not take the same scaling hit from reduced bandwidth - I saw some good numbers on YouTube, for example - but then there are not really that many titles on the market supporting mGPU under that API.

These tests were made by a friend of mine (Blaire on the German 3DCenter forums), who is also a beta tester for NVIDIA drivers, so you can call him very experienced.


Yes, the socket is not really up to date, but with 4K plus GW effects and temporal AA you are GPU-limited in SLI.


Some hard cases - you can clearly see that when using a flex bridge and Gen2 x16/x16, the scaling does not do well:

Original post (in German)

Another new game showing how important PCIe bandwidth is: Hellblade, with UE4 and temporal AA. With Blaire's custom SLI bits you get very good scaling and consistent frametimes (~50-70% scaling, as always when games use temporal AA), but some users running 3.0 x8/x8 systems with SLI reported no scaling at all - it turned out to be a bit better when disabling G-SYNC. In fact, as you can see, the frametimes in mGPU are much smoother than with a single GPU! This was with 2-way TITAN X (Pascal).


Have a look at this comment and the replies:
1080 in SLI... made absolutely zero difference.

Regards,
Edge
 
#12 ·
GN stated at 1:00 that the 1080 Ti test was single-card. Multi-GPU uses more PCIe bandwidth per card. They may have reached the same conclusion had they tested SLI 1080 Tis, just to a lesser extent.

I tested this too, just a bit, by tossing my PhysX GPU into a slot that forced an x16/x0 pair down to two x8s, which made my SLI run x16/x8. My loss in Deus Ex: Mankind Divided was bigger in DX12; DX11 stayed pretty close to the same. Don't know why, but it did.
 
#13 ·
My 1080 is on a riser out of the second slot. I probably don't have to worry.
 
#14 ·
Most games don't have a problem, but I am currently testing this with a 6850K @ 4.5 GHz and SLI 1080 Tis @ 2 GHz, and in Total War: Warhammer 2 @ 4K DSR there was a 10% difference between PCIe 3.0 and 2.0 x16 (the same speed difference as 3.0 x16 vs. x8). 1080p was partially CPU-limited, which is why that test didn't make much sense.

There are a couple of games which can supposedly lose up to 20% performance with x8 PCIe in SLI: Fallout 4 (-20%), Witcher 3 (-20%), Watch Dogs (-10%):
https://www.forum-3dcenter.org/vbulletin/showpost.php?p=11059897&postcount=2261

Modern SSAO (e.g. HBAO+) and AA solutions (e.g. TSSAA) especially seem to require a lot of bandwidth.
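One plausible reason, sketched below with assumed numbers: under alternate-frame rendering, temporal techniques need last frame's buffers, which the other GPU produced, so a full-resolution history copy has to cross the link (or bridge) every frame. The buffer format and the one-copy-per-frame figure are illustrative assumptions, not measurements:

```cpp
// Rough estimate of the inter-GPU traffic temporal AA can add under AFR:
// each GPU must fetch the history buffer rendered by the other GPU last frame.
// Format (RGBA16F) and one copy per frame are assumptions, not measured values.
#include <cstdio>

int main() {
    const double width = 3840.0, height = 2160.0; // 4K render target
    const double bytes_per_pixel = 8.0;           // assumed RGBA16F history buffer
    const double fps = 60.0;

    const double buffer_MB   = width * height * bytes_per_pixel / 1e6;
    const double traffic_GBs = buffer_MB * fps / 1e3; // one copy per frame

    printf("History buffer: ~%.0f MB -> ~%.1f GB/s at %.0f fps,\n",
           buffer_MB, traffic_GBs, fps);
    printf("a sizeable slice of PCIe 3.0 x8's ~7.9 GB/s before any other traffic.\n");
    return 0;
}
```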
 
#15 ·
Quote:
Originally Posted by pas008 View Post

Did they do this PCIe test on a Titan X with the HB SLI bridge and a ribbon bridge?
They tested the Titan V, which doesn't support SLI.

If you go to https://youtu.be/i8iE_sQBFXk?t=1m57s

AotS uses explicit multi-GPU under DX12, so SLI and CrossFire support isn't needed.

Oops, misunderstood - read below.
 
#16 ·
Quote:
Originally Posted by pas008 View Post

Did they do this PCIe test on a Titan X with the HB SLI bridge and a ribbon bridge?
In their previous test they basically did exactly that, yes. The only difference was that it was not the older Titan X but the somewhat newer and more performant 1080 Ti, which offered pretty much the same performance as the Titan XP v1 at that time (before v2 was released).
 
#17 ·
Quote:
Originally Posted by profundido View Post

In their previous test they basically did exactly that, yes. The only difference was that it was not the older Titan X but the somewhat newer and more performant 1080 Ti, which offered pretty much the same performance as the Titan XP v1 at that time (before v2 was released).
Have a link to that one? Can't find it with my limited internet at work.
 
#18 ·
#19 ·
Quote:
Originally Posted by profundido View Post

I went through the articles on their site and found it:

https://www.gamersnexus.net/guides/2963-intel-12k-marketing-blunder-pcie-lane-scaling-benchmarks

and also:

https://www.gamersnexus.net/guides/2488-pci-e-3-x8-vs-x16-performance-impact-on-gpus

Really nice tests they did there
Thanks.
I know I've seen them before, I just couldn't get access to anything at work.
I was going to hunt them down once I was done at work, but the kiddos got me all sidetracked.

Really wish they hadn't compared different CPUs;
ring vs. mesh is really an inconvenience for this test, to me,
though I know it wouldn't make that much of a difference - the gaps would just possibly get bigger.

I am more interested in how much GPU power it takes before PCIe lane bandwidth and generation actually start to matter.

Sorry, typed fast at work again.
 
#20 ·
Wait, they didn't use SLI bridges? Doesn't that make these tests sorta flawed, since NVIDIA cards aren't really designed for a bridgeless setup? I wonder how using DX12 multi-GPU over PCIe would affect an AMD card? Or does it just bypass any hardware-based SLI/CrossFire bridging?
 
#21 ·
Quote:
Originally Posted by Talon720 View Post

Wait, they didn't use SLI bridges? Doesn't that make these tests sorta flawed, since NVIDIA cards aren't really designed for a bridgeless setup? I wonder how using DX12 multi-GPU over PCIe would affect an AMD card? Or does it just bypass any hardware-based SLI/CrossFire bridging?
It depends on how the program wants to use the multiple GPUs. For typical DX12 multi-GPU you don't actually enable SLI, and thus you don't use an SLI bridge. There are games that do use SLI mode and need the bridge, but the way games like AotS do it, you don't enable SLI at all. You can even mix NVIDIA and AMD GPUs in the same system and it works. In the case of these tests, because of the way the game functions, an SLI bridge would do nothing, so the test is valid for this game. I believe the mode is called "explicit multi-GPU".
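To make that concrete, here is a minimal sketch (Windows-only, trimmed to the enumeration step) of how an explicit multi-GPU title sees the hardware: it enumerates the adapters and creates an independent D3D12 device on each one, and any cross-GPU copies it schedules go over PCIe, so the driver's SLI path - and the bridge - never comes into play:

```cpp
// Minimal sketch: enumerating adapters the way a DX12 explicit multi-GPU title does.
// Each physical GPU gets its own independent device; SLI is never enabled, so any
// inter-GPU transfers the application schedules travel over PCIe, not a bridge.
#include <d3d12.h>
#include <dxgi1_4.h>
#include <wrl/client.h>
#include <cstdio>

using Microsoft::WRL::ComPtr;

int main() {
    ComPtr<IDXGIFactory4> factory;
    if (FAILED(CreateDXGIFactory1(IID_PPV_ARGS(&factory)))) return 1;

    for (UINT i = 0; ; ++i) {
        ComPtr<IDXGIAdapter1> adapter;
        if (factory->EnumAdapters1(i, &adapter) == DXGI_ERROR_NOT_FOUND) break;

        DXGI_ADAPTER_DESC1 desc;
        adapter->GetDesc1(&desc);
        if (desc.Flags & DXGI_ADAPTER_FLAG_SOFTWARE) continue; // skip the WARP adapter

        ComPtr<ID3D12Device> device;
        if (SUCCEEDED(D3D12CreateDevice(adapter.Get(), D3D_FEATURE_LEVEL_11_0,
                                        IID_PPV_ARGS(&device)))) {
            wprintf(L"Independent D3D12 device created on: %s\n", desc.Description);
        }
    }
    return 0;
}
```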
 
#23 ·
Quote:
Originally Posted by okcomputer360 View Post

What about mining and hash rates, where the cards are usually running at or near full capacity? Any bottlenecks between PCIe 2.0 and 3.0?

Thank you
Mining/hashing generally uses very little PCIe bandwidth, and even a 2.0 x1 link won't be a bottleneck for most mining or password-cracking setups.
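For a rough sense of why: the dataset being hashed lives in VRAM, so the bus mostly carries small job packets one way and 32-byte results the other. The packet sizes and job rate below are illustrative assumptions, not measurements of any particular miner:

```cpp
// Rough sketch of mining's PCIe footprint: the dataset stays in VRAM, so only tiny
// job packets and hash results cross the link. All figures are assumed for illustration.
#include <cstdio>

int main() {
    const double pcie2_x1_MBps   = 500.0;  // ~PCIe 2.0 x1, per direction
    const double job_bytes       = 128.0;  // assumed job/work header sent to the GPU
    const double result_bytes    = 32.0;   // one hash/share returned
    const double jobs_per_second = 1000.0; // deliberately generous assumption

    const double traffic_MBps = (job_bytes + result_bytes) * jobs_per_second / 1e6;
    printf("Bus traffic: ~%.2f MB/s of ~%.0f MB/s available on PCIe 2.0 x1\n",
           traffic_MBps, pcie2_x1_MBps);
    return 0;
}
```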