The issue is real, and NVIDIA's statement is technically correct but carefully worded for marketing. They clearly thought about the answer, but in doing so they gave away that this is a hardware issue that cannot be fixed. It might not have a very big influence on average FPS, but it does have a big influence on frametime, which is extremely annoying and causes massive intermittent stuttering. Allow me to explain:
The official post said this:
"However the 970 has a different configuration of SMs than the 980, and fewer crossbar resources to the memory system. To optimally manage memory traffic in this configuration, we segment graphics memory into a 3.5GB section and a 0.5GB section."
What they are saying here is that the GTX 970 has fewer resources than the GTX 980 to access the same amount of memory. What they don't tell you is that NVIDIA can guarantee optimal memory performance ONLY if the 3.5GB section is used. Because the GTX 970 has FEWER resources but the SAME amount of memory as the GTX 980, it can never match the GTX 980's memory bandwidth when ALL of the vRAM is used for reading/writing, i.e. when the GTX 970 is FORCED to read or fetch data stretching across the ENTIRE memory address space, including the 0.5GB partition. If only the 3.5GB partition is used, no bandwidth issues can occur; that's why the driver tries to force everything into the 3.5GB partition.
In gaming, if the full 4GB must be used, this translates to the frequent frametime spikes and bad performance that people have reported. But because all of the vRAM rarely has to be read or written at the same time during gameplay, the 0.5GB partition does not have to be accessed ALL the time. It is accessed SOMETIMES, for example when you turn around or fly through a game world really fast. I suspect that the driver incorporates all sorts of optimizations to make sure that whatever memory you need to access often is stored in the 3.5GB segment, because it can access that partition without bandwidth loss. However, you NOTICE when it accesses the 0.5GB partition or swaps data out of it, because this takes more time due to the bandwidth limitations of that partition. In your game this translates to a split-second freeze or stutter. The video card does not have to access this partition ALL the time, so this influences your average FPS by only 1-3%, as NVIDIA showed.
However, this is very misleading. A 1-3% drop does not seem like much, but even if the video card has to access its 0.5GB partition just 2 times per second to copy and read some memory, you will notice it, because this induces huge frametime spikes. They are very annoying to the user since they appear as stutters or intermittent freezes on the screen. Because they are not very frequent they do not have a drastic impact on average FPS, but the gameplay experience is horrible! Imagine having to put up with multiple split-second freezes every time you turn your head!
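To put rough numbers on why a small average FPS drop can hide a big stutter, here is a minimal sketch in Python. All the figures in it (100 frames at 10 ms, two frames stretched to 25 ms) are made up for illustration and are not measurements from any card.

```python
def summarize(frametimes_ms):
    """Return (average FPS, worst frametime in ms) for a list of frametimes."""
    total_s = sum(frametimes_ms) / 1000.0
    avg_fps = len(frametimes_ms) / total_s
    worst_ms = max(frametimes_ms)
    return avg_fps, worst_ms


# Hypothetical numbers: every frame takes 10 ms when all data sits in fast memory.
smooth = [10.0] * 100

# Same 100 frames, but two of them stall for 25 ms (illustrative spike size).
spiky = [10.0] * 98 + [25.0] * 2

smooth_fps, smooth_worst = summarize(smooth)
spiky_fps, spiky_worst = summarize(spiky)

# Relative loss in average FPS caused by the two spikes.
fps_loss_pct = 100.0 * (smooth_fps - spiky_fps) / smooth_fps

print(f"smooth: {smooth_fps:.1f} FPS, worst frame {smooth_worst:.1f} ms")
print(f"spiky:  {spiky_fps:.1f} FPS, worst frame {spiky_worst:.1f} ms")
print(f"average FPS loss: {fps_loss_pct:.1f}%")
```

With these assumed values the average FPS loses only a few percent, while the worst frame takes 2.5 times as long as a normal one. That is the gap between what an average FPS chart shows and what you feel on screen.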
To avoid these frametime spikes and stuttering, the driver tries to keep everything in the 3.5GB partition. If the 0.5GB partition is required, you're going to see frametime spikes sooner or later.
The actual average bandwidth of the GTX 970 is therefore much lower than advertised if the card is actually using all 4GB of its vRAM. The last 512MB is just for show: it can be used if it is really needed, but it is too slow to keep your frametime consistently low, i.e. you will not have a smooth gaming experience. Unfortunately this is clearly a hardware design flaw and it cannot be fixed. The only way to avoid the issue is to play games that require 3.5GB of vRAM or less.
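The bandwidth argument can be written down as a simple model. This is only a sketch under my own assumptions: the two segments are read one after the other (never in parallel), the data is spread evenly over the full 4GB, and the bandwidth values are placeholder units, not NVIDIA's real figures.

```python
def effective_bandwidth(fast_bw, slow_bw, slow_fraction):
    """Average bandwidth when slow_fraction of the bytes sit in the slow segment.

    Assumption: the segments are accessed serially, so the time per byte is
    the weighted sum of the time per byte in each segment.
    """
    time_per_byte = (1.0 - slow_fraction) / fast_bw + slow_fraction / slow_bw
    return 1.0 / time_per_byte


# Placeholder bandwidths in arbitrary units (hypothetical, not measured).
FAST_BW = 100.0
SLOW_BW = 25.0

# Data spread evenly over all 4GB puts 0.5GB out of 4GB in the slow segment.
slow_fraction = 0.5 / 4.0

only_fast = effective_bandwidth(FAST_BW, SLOW_BW, 0.0)
full_card = effective_bandwidth(FAST_BW, SLOW_BW, slow_fraction)

print(f"3.5GB segment only: {only_fast:.1f} units")
print(f"all 4GB in use:     {full_card:.1f} units")
```

The point of the model is that the slow segment drags the average down by more than its 12.5% share of the memory, because the card spends a disproportionate amount of time waiting on it.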
The benchmarks by NVIDIA are very misleading. They show games running at low FPS, so the frametime spikes have even less impact on total performance, because the baseline performance is bad to begin with. Unfortunately I am a GTX 970 SLI user, so I have very good FPS in most games while gaming at 2K, and I suffer the most from it, because when the 4GB is truly needed the frametime spikes are VERY significant relative to the average frametime. Watch Dogs can easily use 4GB of vRAM, and this time the stuttering is not due to the engine: I used to play it on 2x GTX 670 4GB and had no issues. Now the same game with the same settings stutters on my GPUs. Far Cry 4 easily uses 4GB with 4xMSAA, and there the stuttering is also very apparent once the 3.5GB cap is reached. There are plenty more examples. They are not fabricated by users; they are just very hard to benchmark and prove, since the stuttering is very intermittent and very dependent on the game and settings. And the fact that having 4GB loaded does not mean all 4GB has to be accessed simultaneously makes it even harder.
The explanation NVIDIA gave us is very misleading and should have been made public beforehand. If I had known this I would have bought 2x GTX 980, since I game at high resolutions and suffer greatly from this issue, and not only in Far Cry 4 and Watch Dogs. The bandwidth issues, however, are real, as the following benchmark shows:
THE FOLLOWING BENCHMARK IS DEVELOPED BY NVIDIA AND PART OF THE CUDA DEVELOPER TOOLKIT.
IT MEASURES DEVICE TO DEVICE MEMORY COPY PERFORMANCE, MEANING IT IS COPYING A VRAM PORTION FROM ONE PART OF THE VRAM TO ANOTHER.
Unfortunately I cannot redistribute it due to the EULA, but it shows more clearly what the other benchmark also showed. You can download and compile it yourself because it is part of the official NVIDIA CUDA toolkit. This benchmark is independent of DWM settings and the other weird glitches Nai's benchmark seemed to be subject to.
With the settings I use, it starts by copying 1GB of vRAM to another portion of the vRAM, so 2GB of vRAM is used in total in the first iteration. Each following iteration does the same with 64MB more. It ends by copying 1.875GB of vRAM to another portion of the vRAM, so at the end 3.75GB is used on the card for the copy benchmark, meaning that the 0.5GB partition has to be used one way or another. The memory bandwidth should stay the same throughout the test, but this copy operation is seriously limited by the GTX 970 design flaw. As soon as the last 512MB of the card is needed, the performance starts dropping.
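For anyone who wants to check the arithmetic of that sweep, here is a small Python sketch. It does not run the benchmark or touch the GPU; it only lists the copy sizes described above and flags where source plus destination no longer fit in 3.5GB. It assumes nothing else is occupying vRAM, which is not true on a real desktop, so in practice the spill can start a bit earlier.

```python
START_MB = 1024          # first copy: 1GB
STEP_MB = 64             # each iteration copies 64MB more
END_MB = 1920            # last copy: 1.875GB
FAST_SEGMENT_MB = 3584   # the 3.5GB section


def sweep():
    """Return (copy size, total footprint, spills past 3.5GB) per iteration."""
    rows = []
    for copy_mb in range(START_MB, END_MB + 1, STEP_MB):
        # A device-to-device copy needs a source and a destination buffer.
        footprint_mb = 2 * copy_mb
        spills = footprint_mb > FAST_SEGMENT_MB
        rows.append((copy_mb, footprint_mb, spills))
    return rows


rows = sweep()
for copy_mb, footprint_mb, spills in rows:
    note = "needs the 0.5GB section" if spills else "fits in 3.5GB"
    print(f"copy {copy_mb:4d} MB, footprint {footprint_mb:4d} MB: {note}")
```

Only the last iterations of the sweep are forced beyond the 3.5GB section, which matches the performance holding steady for most of the run and dropping at the end.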
This is why people are reporting stuttering. It is a hardware limitation and it can't be fixed; we have to deal with it.