Quote:
Originally Posted by dantoddd
Quote:
SIMD Hierarchy
4 Graphics Processing Clusters (GPC)
4 Streaming Multiprocessors (SM) per GPC = 16 SM
96 Stream Processors (SP) per SM = 1536 CUDA cores
TMU / Geometry Domain
8 Texture Units (TMU) per SM = 128 TMUs
32 Raster OPeration Units (ROPs)
Memory
256-bit wide GDDR5 memory interface
2048 MB (2 GB) memory amount standard
Clocks/Other
950 MHz core/CUDA core (no hot-clocks)
1250 MHz actual (5.00 GHz effective) memory, 160 GB/s memory bandwidth
2.9 TFLOP/s single-precision floating point compute power
486 GFLOP/s double-precision floating point compute power
Estimated die-area 340mm²
sooooo..................
what do these numbers mean to laymen like me
GK104 pushes many more pixels per clock than GF110. All of the compute logic runs at the same clock (the shaders no longer run at 2x the ROP clock), so basically the shaders are more parallel-oriented than pipelined compared to Fermi. They probably took a cue from AMD's SP architecture.
This comes as a trade-off against geometry performance and memory bandwidth. GK104 will push 25% fewer polygons per clock than GF110 and has a 33% narrower bus, although the clocks are pushed somewhat higher. Also, the clusters (SMs) are more crammed with logic units than in GF110, much like GF104 vs. GF100.
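If you want to check the bus comparison yourself, here is a rough Python sketch. The GK104 numbers are the ones from the quoted list. The GF110 numbers (384-bit bus, about 4.008 GT/s effective memory rate) are not in this thread; they are my assumption based on the GTX 580's published specs. The helper name `bandwidth_gb_s` is just made up for illustration.

```python
def bandwidth_gb_s(bus_width_bits, effective_rate_gt_s):
    """Peak memory bandwidth in GB/s: bytes per transfer times transfers per second."""
    return bus_width_bits / 8 * effective_rate_gt_s

# GK104 figures come from the quoted spec list.
gk104_bus, gk104_rate = 256, 5.0
# GF110 figures are ASSUMPTIONS (GTX 580 published specs), not from this thread.
gf110_bus, gf110_rate = 384, 4.008

bus_reduction = 1 - gk104_bus / gf110_bus
gk104_bw = bandwidth_gb_s(gk104_bus, gk104_rate)
gf110_bw = bandwidth_gb_s(gf110_bus, gf110_rate)
bw_reduction = 1 - gk104_bw / gf110_bw

print(f"bus width: {bus_reduction:.0%} narrower")
print(f"bandwidth: {gk104_bw:.0f} vs {gf110_bw:.1f} GB/s ({bw_reduction:.0%} lower)")
```

Under those assumptions the bus is a third narrower, but the higher memory clock closes part of the gap, so the bandwidth deficit is smaller than the bus-width deficit.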
Aside from that, it's a completely different architecture from Fermi, so we'll have to wait and see how it actually performs in the real world. Also, they say nothing about the PolyMorph engines (the tessellation units).
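As for where the totals in the quoted list come from, they are mostly just multiplication. A quick Python sketch, with two assumptions that are mine and not stated in the list: each CUDA core does one fused multiply-add (2 FLOPs) per clock, and double precision runs at 1/6 of the single-precision rate (inferred from 486 / 2918).

```python
# Figures taken from the quoted spec list.
gpcs = 4
sms_per_gpc = 4
sps_per_sm = 96
tmus_per_sm = 8
core_clock_ghz = 0.950

sms = gpcs * sms_per_gpc         # total streaming multiprocessors
cuda_cores = sms * sps_per_sm    # total stream processors
tmus = sms * tmus_per_sm         # total texture units

# ASSUMPTION: one fused multiply-add per core per clock = 2 FLOPs.
flops_per_core_per_clock = 2
sp_gflops = cuda_cores * flops_per_core_per_clock * core_clock_ghz

# ASSUMPTION: FP64 at 1/6 of the FP32 rate, inferred from the listed numbers.
dp_gflops = sp_gflops / 6

print(f"{sms} SMs, {cuda_cores} CUDA cores, {tmus} TMUs")
print(f"SP: {sp_gflops / 1000:.1f} TFLOP/s, DP: {dp_gflops:.0f} GFLOP/s")
```

So the headline TFLOP/s number is just core count times clock times two, which is why it says nothing about how the chip performs in actual games.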