Originally Posted by Mahigan
No it isn't.
You're thinking about streaming in textures leading to GPU stalls (how things have been done thus far). That's not how Vega works. Rather than pausing the GPU in order to stream in textures (or using Asynchronous Compute to allow streaming in parallel with other work on the GPU), HBCC retains all of the required and most frequently accessed data in the framebuffer, and can then stream and access data residing outside of the framebuffer, on any storage device, without incurring a GPU stall. That means the GPU never pauses, ever; utilization would (most likely) be above 90% at all times (hence the tachometer AMD is introducing on the GPU).
So no, system RAM is no longer "too slow", and neither would a hard drive or SSD be, and yes... this is a gaming feature. AMD claimed a +50% boost in average FPS and a +100% boost in minimum FPS in Deus Ex due to this feature (this feature alone... we're not talking about any other changes in Vega here).
Yeah... Vega is not gunning for Pascal, it is gunning for Volta.
Not sure what a GPU stall due to memory overcommitment is?
With Vega, this is no longer an issue.
The second case is Texture Data Transfer related: http://www.crcnetbase.com/doi/abs/10.1201/b12288-35
Of course, asynchronous transfers are bi-directional but only go one way at a time. HBCC allows seamless streaming... transfers in both directions at once.
I think you can see what this means.
I don't know how bi-directional it will be. Most likely it will just transfer what is needed to the GPU.
But there are some mistakes in the understanding of the idea. It isn't really "seamless streaming".
What Nvidia says is true: when you overcommit data to the GPU through huge buffers, once the GPU runs out of memory it needs to "clean house" and fetch the new buffers.
Nvidia's "fix" is just to put more memory on the GPU, hence the 11/12GB on their top-end cards, plus careful work on optimizing the software.
That is an issue AMD doesn't exactly solve. The GPU will still need to fetch the memory, and it seems AMD isn't going to put 16GB on their top-end card.
So their fix with the HBCC is to put the GPU's memory pool in system memory, and the GPU will only pull in what it actually needs through HBCC management.
It doesn't provide +50%/+100% boosts. That is the wrong interpretation of what it can do.
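To make the "GPU only gets what it actually needs" idea concrete, here is a toy sketch in Python of a small local memory acting as a cache in front of a bigger pool. Everything in it is an assumption for illustration: the `PageCache` class, the page granularity, and the least-recently-used eviction are my own stand-ins, not AMD's actual HBCC policy, which isn't described in this thread.

```python
from collections import OrderedDict


class PageCache:
    """Toy model (hypothetical): small local memory caching pages of a larger pool."""

    def __init__(self, capacity_pages):
        self.capacity = capacity_pages
        self.resident = OrderedDict()  # page id -> True, oldest access first
        self.hits = 0
        self.misses = 0

    def access(self, page):
        """Touch a page; return True if it was already resident."""
        if page in self.resident:
            # Already local: just mark it as most recently used.
            self.resident.move_to_end(page)
            self.hits += 1
            return True
        # Not local: it has to be fetched from the larger pool.
        self.misses += 1
        if len(self.resident) >= self.capacity:
            # Assumed policy: evict the least recently used page.
            self.resident.popitem(last=False)
        self.resident[page] = True
        return False


cache = PageCache(capacity_pages=2)
for page in [1, 2, 1, 3, 2]:
    cache.access(page)
```

The point of the sketch is only that fetches happen per page on demand instead of whole buffers being swapped at once. Every miss is still a real transfer over the bus, which is why it manages the problem without making it disappear.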
AMD also went on to show how HBCC seemingly halves memory requirements, by deliberately capping the amount of addressable memory on the HBCC-aware system to only 2 GB - half of the 4 GB addressable by the non-HBCC-aware system, while claiming that even so, the HBCC-enabled system still showed "the same or better performance" through its better memory management and bandwidth speeds. If these results do hold up to scrutiny, this should benefit implementations of "Vega" with lower amounts of video memory, while simultaneously reducing production costs and overall end-user pricing, since smaller memory pools would be needed for the same effect.
Meaning their idea is that instead of using 4GB of memory, you use 2GB, without losing performance.
Another way to see it is that instead of adding more costly HBM2 stacks, they can use fewer, put just 8GB of HBM2 on Vega, and still get the performance results that Nvidia gets with 12GB.
It won't make their GPU +50% faster in a comparable situation.
Basically, if a game engine doesn't commit that much memory and its buffers stay under the 8GB on the GPU, the HBCC is doing nothing. And Nvidia, with 11/12GB, doesn't have as much of a problem.
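A rough way to see this: the toy function below (hypothetical, not anything from AMD or a real driver) replays the same working set every frame against a fixed-size local memory with LRU eviction, and counts how many pages have to be fetched from outside per frame. Treat one page as 1GB just to match the numbers in this thread; real page sizes are far smaller.

```python
from collections import OrderedDict


def misses_per_frame(working_set_pages, capacity_pages, frames):
    """Count pages fetched from outside local memory on each frame (toy model)."""
    resident = OrderedDict()  # page id -> True, oldest access first
    result = []
    for _ in range(frames):
        misses = 0
        # Assumption: each frame touches every page of the working set once, in order.
        for page in range(working_set_pages):
            if page in resident:
                resident.move_to_end(page)
            else:
                misses += 1
                if len(resident) >= capacity_pages:
                    resident.popitem(last=False)  # evict least recently used
                resident[page] = True
        result.append(misses)
    return result


fits = misses_per_frame(working_set_pages=6, capacity_pages=8, frames=3)
overcommitted = misses_per_frame(working_set_pages=10, capacity_pages=8, frames=3)
```

When the working set fits (6 pages in 8), only the first frame fetches anything and the cache controller has nothing left to do. When it doesn't fit (10 pages in 8), this naive sequential pattern is a worst case for LRU and every frame keeps fetching, which is the only situation where smarter management can make a difference.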
The future benefit comes if developers start using huge 8K textures, which would cause Nvidia cards to bottleneck on GPU memory, while AMD will have the HBCC and so won't suffer as much.
It can be useful for AMD in the future. I hope.
There is also the issue of actual system memory, texture loading, etc. Even with a fast SSD and fast memory, people aren't going to put 512TB of memory in their system, and it is capped anyway by the system.