Originally Posted by jojoenglish85
I don't care about the bus size, i see every other card in the 700 series with an option of 4GB or more, they intentionally left the 780 like that. I wonder why
GTX 770 and GTX 760 are both GK104, which has a 256-bit bus (four 64-bit memory controllers).
For the GTX 660 Ti in 2012 NVIDIA is once again going to use their asymmetrical memory technique in order to outfit the GTX 660 Ti with 2GB of memory on a 192bit bus, but they’re going to be implementing it slightly differently. Whereas the GTX 550 Ti mixed memory chip density in order to get 1GB out of 6 chips, the GTX 660 Ti will mix up the number of chips attached to each controller in order to get 2GB out of 8 chips. Specifically, there will be 4 chips instead of 2 attached to one of the memory controllers, while the other controllers will continue to have 2 chips. By doing it in this manner, this allows NVIDIA to use the same Hynix 2Gb chips they already use in the rest of the GTX 600 series, with the only high-level difference being the width of the bus connecting them.
Of course at a low-level it’s more complex than that. In a symmetrical design with an equal amount of RAM on each controller it’s rather easy to interleave memory operations across all of the controllers, which maximizes performance of the memory subsystem as a whole. However complete interleaving requires that kind of a symmetrical design, which means it’s not quite suitable for use on NVIDIA’s asymmetrical memory designs. Instead NVIDIA must start playing tricks. And when tricks are involved, there’s always a downside.
The best case scenario is always going to be that the entire 192bit bus is in use by interleaving a memory operation across all 3 controllers, giving the card 144GB/sec of memory bandwidth (192bit * 6GHz / 8). But that can only be done at up to 1.5GB of memory; the final 512MB of memory is attached to a single memory controller. This invokes the worst case scenario, where only 1 64-bit memory controller is in use and thereby reducing memory bandwidth to a much more modest 48GB/sec.
How NVIDIA spreads out memory accesses will have a great deal of impact on when we hit these scenarios. In the past we’ve tried to divine how NVIDIA is accomplishing this, but even with the compute capability of CUDA memory appears to be too far abstracted for us to test any specific theories. And because NVIDIA is continuing to label the internal details of their memory bus a competitive advantage, they’re unwilling to share the details of its operation with us. Thus we’re largely dealing with a black box here, one where poking and prodding doesn’t produce much in the way of meaningful results.
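To make the arithmetic in the quote easier to check, here is a rough Python sketch. The chip counts, the 2Gb (256MB) chip size, the 64-bit controllers and the 6GHz effective rate come from the quote; the function and variable names are made up for illustration, and this says nothing about how NVIDIA actually maps addresses.

```python
# Rough check of the quoted GTX 660 Ti numbers (names here are illustrative only).

def bandwidth_gb_s(bus_width_bits, data_rate_gbps):
    """Peak bandwidth in GB/s: bus width (bits) * effective data rate (Gbps) / 8 bits per byte."""
    return bus_width_bits * data_rate_gbps / 8

CONTROLLER_WIDTH_BITS = 64
DATA_RATE_GBPS = 6                  # the "6GHz" effective GDDR5 rate from the quote
CHIP_MB = 256                       # one 2Gb chip holds 256MB
chips_per_controller = [2, 2, 4]    # 8 chips total, one controller doubled up

mb_per_controller = [n * CHIP_MB for n in chips_per_controller]
total_mb = sum(mb_per_controller)

# Only as much memory as the smallest controller holds can be striped across all of them.
interleaved_mb = min(mb_per_controller) * len(mb_per_controller)
leftover_mb = total_mb - interleaved_mb

best_case = bandwidth_gb_s(CONTROLLER_WIDTH_BITS * len(chips_per_controller), DATA_RATE_GBPS)
worst_case = bandwidth_gb_s(CONTROLLER_WIDTH_BITS, DATA_RATE_GBPS)

print(total_mb, interleaved_mb, leftover_mb)
print(best_case, worst_case)
```

This gives 2048MB total, of which 1536MB can be striped across all three controllers and 512MB hangs off a single controller, with 144 GB/sec and 48 GB/sec as the two bandwidth extremes, matching the quote.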
Full GK104 (GTX 770/GTX 680)
GTX 780, worst case where an entire GPC is off and you lose a raster engine
* 12 of 15 total possible SMX, compared to 14/15 SMX on TITAN
* To get 4GB while avoiding the odd interleaving situation of an asymmetric setup, they'd have to use 4 memory controllers like on GTX 680/GTX 770. Given that a "good" GTX 780 with all raster engines intact has 5 GPCs, that would be odd. If a "bad" GTX 780 with an entire GPC/raster engine gone had two of its memory controllers disabled, you could get 4GB, but at the expense of bandwidth (which would make it worse than a 3GB GTX 780 with the 384-bit bus).
* They could just swap two 512MB modules for 1GB modules (i.e. mixed memory density). Since there are 6 memory controllers, swapping two modules would mean that even keeping dual channel wouldn't be an issue. Currently it's 6 controllers x 512MB in the GTX 780 and 6 controllers x 1024MB for TITAN. The GTX 660 Ti / GTX 660 / GTX 650 Ti Boost have 3 memory controllers. There has to be a reason they went through all the trouble to avoid using mixed memory density on the lower-end GTX 660 Ti, though...
--> For reference there's 24 modules of 64M × 16 GDDR5 used on TITAN (12 modules are on the back), 12 modules of 64M × 16 GDDR5 are on GTX 780
--> GTX 550 Ti had a mixed memory density setup and required custom logic in the drivers and the die itself.
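For comparison, the 4GB options in the list above can be put through a simple model in Python. This is only a sketch under loud assumptions: it assumes memory is striped evenly across whichever controllers still have free space, that each controller is 64-bit, and that the GTX 780 runs the same 6Gbps effective rate as the 660 Ti figure quoted earlier. The `segments` helper is hypothetical and is not how NVIDIA's hardware or driver is known to work.

```python
def segments(mb_per_controller, controller_width_bits=64, data_rate_gbps=6):
    """Split a layout into regions striped over the controllers that still have free memory.

    Returns a list of (size_mb, controllers_used, peak_gb_s) tuples.
    Idealized model only; the real interleaving scheme is not public.
    """
    remaining = list(mb_per_controller)
    out = []
    while any(remaining):
        active = [r for r in remaining if r > 0]
        step = min(active)                       # depth every active controller can supply
        width = controller_width_bits * len(active)
        out.append((step * len(active), len(active), width * data_rate_gbps / 8))
        remaining = [r - step if r > 0 else 0 for r in remaining]
    return out

current_780 = segments([512] * 6)                # 3GB on six controllers
mixed_780 = segments([512] * 4 + [1024] * 2)     # two controllers bumped to 1GB
four_ctrl = segments([1024] * 4)                 # 4GB on four controllers

for name, layout in [("current", current_780), ("mixed", mixed_780), ("4-controller", four_ctrl)]:
    print(name, layout)
```

Under these assumptions the mixed-density card keeps 3GB at the full 288 GB/sec and puts the last 1GB on two controllers at 96 GB/sec, while the 4-controller card gets all 4GB at 192 GB/sec.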
Back when GK110 was still a rumor, it was thought to have a 512-bit bus with 4GB VRAM.
Edited by AlphaC - 10/14/13 at 5:28pm