Originally Posted by raghu78
wow this thread has gone totally crazy. btw alatar you are hoping (or praying) that the chiphell rumours are not true.
Because that would mean AMD will have the single GPU crown. That's a tough pill to swallow for diehard Nvidia supporters. On the other hand I am cautiously optimistic that R9 390X has a very good chance of being 65% faster than R9 290X (roughly 50% faster than GTX 980). If you look at it, R9 390X will most probably sport an improved GCN 2.0 architecture. GCN 1.2 aka Tonga made improvements to tessellation performance, ROP performance and memory bandwidth efficiency.
But AMD has not yet improved the core shader / stream processor / compute unit architecture. It's a ripe candidate for improving perf and efficiency. With GCN 2.0 AMD can improve perf/sp, perf/transistor and perf/sq mm in a significant way with architectural efficiency and perf improvements.
With just 37.5% more shaders, R9 290X was 35% faster than R9 280X / HD 7970. This was with no major core architectural improvements and no major bandwidth improvements. With 45% more cores, 512 GB/s of HBM (60% more bandwidth than R9 290X) and Tonga color compression, which further improves effective bandwidth by 40%, you are looking at 1.6 * 1.4 / 1.45 = 1.54 times the bandwidth per sp or per compute unit. So a >50% increase in available memory bandwidth per sp or per compute unit. AMD has already improved the rest of the GPU to allow smooth perf scaling when more sp are added. Now it's easier for AMD to focus on improving the GPU shader or sp or compute unit.
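For anyone who wants to check the arithmetic above, here is a rough Python sketch. The R9 390X inputs are the rumoured figures from this post, not confirmed specs, and the function name is just made up for illustration. The 2816 and 2048 shader counts are the R9 290X and R9 280X stream processor counts behind the 37.5% figure.

```python
def bandwidth_per_shader_ratio(raw_bw_gain, compression_gain, shader_gain):
    """Effective memory bandwidth per shader, relative to the baseline GPU."""
    return raw_bw_gain * compression_gain / shader_gain


# Rumoured / claimed inputs from the post (assumptions, not confirmed specs).
raw_bw_gain = 512 / 320    # 512 GB/s HBM vs 320 GB/s on R9 290X -> 1.6
compression_gain = 1.40    # claimed gain from Tonga color compression
shader_gain = 1.45         # rumoured 45% more shaders than R9 290X

ratio = bandwidth_per_shader_ratio(raw_bw_gain, compression_gain, shader_gain)
print(f"Effective bandwidth per shader vs R9 290X: {ratio:.2f}x")

# How well did the last shader increase scale? R9 290X vs R9 280X.
shader_increase = 2816 / 2048 - 1   # 0.375, i.e. 37.5% more shaders
perf_increase = 0.35                # 35% faster, as stated in the post
scaling_efficiency = perf_increase / shader_increase
print(f"Perf gained per unit of added shaders: {scaling_efficiency:.2f}")
```

This only reproduces the ratio in the post. It says nothing about whether the extra bandwidth actually turns into frame rate.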
btw the memory subsystem (memory controller plus DRAM) on modern GPUs can draw anywhere from 30 - 50% of the board power. For 256 bit GPUs it's closer to 30% and for 512 bit GPUs it's 50% or even higher. Here is a study on memory controller/DRAM power consumption as a % of board power on Quadro FX 5800 (based on GTX 280 with 512 bit GDDR3 memory bus) and HD 6990 (with dual GPUs each connected to 256 bit GDDR5 memory bus)
HBM cuts power by 65% when compared against GDDR5 for a given bandwidth.
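To put those two numbers together, here is a back-of-the-envelope Python sketch. It assumes the 65% cut applies to the whole memory share of board power (controller, DRAM and I/O) and that memory power scales linearly with bandwidth. Both are simplifications of mine and not something the study or AMD states. The function and parameter names are hypothetical.

```python
def board_power_saving(memory_share, hbm_reduction=0.65, bandwidth_scale=1.0):
    """Fraction of total board power saved by moving from GDDR5 to HBM.

    memory_share:    fraction of board power drawn by the memory subsystem
    hbm_reduction:   power cut at equal bandwidth (65% per the post)
    bandwidth_scale: new bandwidth / old bandwidth; power is ASSUMED to
                     scale linearly with bandwidth (my simplification)
    """
    hbm_memory_power = memory_share * (1 - hbm_reduction) * bandwidth_scale
    return memory_share - hbm_memory_power


for share in (0.30, 0.50):
    same_bw = board_power_saving(share)
    more_bw = board_power_saving(share, bandwidth_scale=1.6)
    print(f"memory share {share:.0%}: "
          f"saves {same_bw:.1%} of board power at equal bandwidth, "
          f"{more_bw:.1%} with 60% more bandwidth")
```

Under these assumptions a 512 bit card gets back somewhere between a fifth and a third of its board power, which could then be spent on the shader core.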
More importantly HBM also reduces die size significantly on the core GPU (or alternatively frees up transistors to be used for scaling the core GPU perf) by moving the vast majority of memory controller complexity and transistors to the HBM stack, which has a base logic die + 4 stacked DRAM dies (5mKGSD - molded known good stacked die). Since the stacked HBM on a silicon interposer does not need to drive memory I/O through the printed circuit board, it cuts power in a massive way. HBM, TSVs and 2.5D stacking are truly breakthrough and revolutionary technologies, and AMD will have at least an 18 month time to market lead against Nvidia.
What all this does is allow AMD to scale GPU performance in a huge way at the same 28nm node. Other techniques like adaptive voltage control which are found in AMD Carrizo could also be found on the R9 3xx GPUs as these power reduction techniques are commonly shared by AMD GPUs and APUs.
Finally the GF 28SHP process, which is the node at which the R9 390X and R9 380X GPUs are likely to be built, is a much better process than TSMC 28HP at which R9 290X was built. This is well known from the power improvements seen in Beema (built at GF 28SHP) compared to Kabini (TSMC 28HP):
"AMD claims a 19% reduction in core leakage/static current for Puma+ compared to Jaguar at 1.2V, and a 38% reduction for the GPU. The drop in leakage directly contributes to a substantially lower power profile for Beema and Mullins."
In summary AMD has a lot of levers to work on to provide a true GPU perf beast even at the 28nm node, and that's exactly what they seem to have done with the R9 390X. We will know in 2-3 months if AMD has hit a home run.