THIS ARTICLE REVEALS all of the important information regarding the GeForce 8800 series, which is set to be released to the world on November 8th, 2006 in San Jose. We have learned that during the traditional Editor's Day in San Francisco Nvidia kept its rules, so "no porn surfing" and "no leaks to the Inquirer" banners were shown. But we have no hard feelings about that. It is up to the companies to either respect millions of our readers, including employees of Nvidia or... not.
As you already know, Adrianne Curry, a Playboy bunny, America's Next Top Model star and an actor from My Fair Brady, is the demo chick for G80. After we posted the story, we received a growl from Graphzilla, but we are here to serve you, our dear readers. However, that was just a story about a person who posed for the G80. Now, it's time to reveal the hardware. Everything you want to know, and don't want to wait until November 8th for, lies in this article. Get your popcorn ready; this will be a messy ride.
For starters, the 8800 launch is a hard one, so expect partners to have boards in store for the big day's press conference at 11AM on the 8th. The board delivery will go in several waves, with the first two separated by days. The boards were designed by ASUSTeK, and feature a departure from the usual suspects at Micro-Star International. This is also the first ever black graphics card from Nvidia. Bear in mind that every 8800GTX and 8800GTS is manufactured by ASUS. AIBs (add-in board vendors) can only change the cooling, while no overclocking is allowed on 1st gen products. Expect a very limited allocation of these boards, with the UK alone getting a mediocre 200 boards.
G80 is a 681 million transistor chip manufactured by TSMC. Since Graphzilla opted for the traditional approach, it eats up around 140 Watts of power. The rest gets eaten by Nvidia's I/O chip, video memory and the losses in power conversion on the PCB itself.
If you remember the previous marchitecture, the G70 GPU embedded in 7800GTX 256MB, you will probably remember that the Pixel and Vertex Shader units worked at a different clock speed. G80 takes it one step forward, with a massive increase in clocks of Shader units.
GigaThread is the name of the G80 marchitecture which supports thousands of executing threads - similar to ATI's RingBus, keeping all of the Shader units well fed. G80 comes with 128 scalar Shader units, which Nvidia calls Stream Processors.
The reason Nvidia went with the SP description is a DirectX 10 function called Stream Output, and the fact that those Shader units will now work on Pixel, Vertex, Geometry and Physics instructions, but not all at the same time. The function, in short, enables data from vertex or geometry shaders to be sent to memory and forwarded back to the top of the GPU pipeline in order to be processed again. This enables developers to put in more shiny lighting calculations, physical calculations, or just more complex geometry processing in the engine. Read: more stuff for fewer transistors.
In order to enable that, Nvidia pulled a CPU approach and stuffed L1 and L2 cache across the chip. On the other hand, you might like to know that both Geometry and Vertex Shader programs support Vertex Texturing.
And when it comes to texturing itself, G80 features 64 Texture Filtering Units, which can feed the rest of the GPU with 64 pixels in a single clock. For comparison, the GF7800GTX could manage only 24. Depending on the method of texture sampling and filtering used, G80 ranges from 18.4 to 36.8 billion texels in a single second. Pixel wise, the G80 churns out 36.8 billion finished pixels in a single second.
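For readers who like to check the sums, here is a back-of-the-envelope sketch of where those texel figures most likely come from. It assumes the 575MHz core clock quoted for the 8800GTX elsewhere in this article, and it assumes the lower bound is simply the half-rate case (32 texels per clock); Nvidia has not confirmed that breakdown to us, and the function name is ours.

```python
# Figures quoted in the article.
CORE_CLOCK_MHZ = 575         # 8800GTX core clock
TEXTURE_FILTER_UNITS = 64    # texture filtering units on G80


def texel_rate_billions(texels_per_clock: int, clock_mhz: int) -> float:
    """Texels per second, in billions, for a given per-clock throughput."""
    return texels_per_clock * clock_mhz / 1000


# Full rate: every filtering unit delivers one texel per clock.
peak = texel_rate_billions(TEXTURE_FILTER_UNITS, CORE_CLOCK_MHZ)

# Assumption: the slower filtering modes run at half rate (32 per clock).
low = texel_rate_billions(TEXTURE_FILTER_UNITS // 2, CORE_CLOCK_MHZ)

print(f"{low:.1f} to {peak:.1f} billion texels/s")  # 18.4 to 36.8 billion texels/s
```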
When it comes to RingBus vs. GigaThread, DAAMIT's X1900 has a branch granularity of 48 pixels, while the X1800 can do 16. GeForce 8800GTX can do 32 pixel threads in some cases, but mostly the chip will be able to do 16, thus you can expect Nvidia to lose out on the GPGPU front (for instance, in Folding@Home stuff).
However, Nvidia claims 100% efficiency, and we know for sure that ATI is mostly running in high 60s to high 70s in percentage points.
How many pixels can G80 push?
One of the things we are using to describe the traditional pixel pipeline is the number of pixels a chip can render in a single clock. With programmable units, the traditional pipeline died out, but many hacks out there are still using this inaccurate description.
To cut a long story short, on the pixel-rendering side, G80 can render the same number of pixels per clock as the G70 (7800) and G71 (7900) chips.
The G80 chip in its full configuration comes with six Raster Operation Partitions (ROP) and each can render four pixels. So, 8800GTX can churn out 24, and 8800GTS can push 20 pixels per clock. However, these are complete pixels. If you use only Z-processing, you can expect a massive 192 pixels if one sample per pixel is used. If 4x FSAA is being used, then this number drops to 48 pixels per clock.
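The per-clock numbers above fall out of simple multiplication, sketched below. Two things here are our own inference rather than anything Nvidia told us: that the GTS gets its 20 pixels from five active partitions (20 divided by four), and that the Z-only figure is a fixed budget of 192 samples per clock shared out among the samples each pixel needs.

```python
PIXELS_PER_ROP_PARTITION = 4     # from the article
Z_SAMPLES_PER_CLOCK = 192        # Z-only rate at one sample per pixel


def pixels_per_clock(rop_partitions: int) -> int:
    """Complete (colour + Z) pixels rendered per clock."""
    return rop_partitions * PIXELS_PER_ROP_PARTITION


def z_only_pixels_per_clock(samples_per_pixel: int) -> int:
    """Z-only pixels per clock, assuming a fixed per-clock sample budget."""
    return Z_SAMPLES_PER_CLOCK // samples_per_pixel


gtx = pixels_per_clock(6)   # 24, full six-partition G80
gts = pixels_per_clock(5)   # 20, assuming five partitions are enabled

print(gtx, gts)                                                 # 24 20
print(z_only_pixels_per_clock(1), z_only_pixels_per_clock(4))   # 192 48
```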
For game developers, the important information is that eight MRT (Multiple Render Targets) can be utilised and the ROPs support Frame Buffer blending of FP16 and FP32 render targets and every type of Frame Buffer surface can be used with FSAA and HDR.
If you are not a game developer, this sentence above means that Nvidia now supports FP32 blending, which was not a thing in the past, and FSAA/HDR combination will be supported by default. In fact, 16xAA and 128-bit HDR are supported at the same time.
Lumenex Engine - New FSAA and HDR explained
ROPs are also in charge of AntiAliasing, which has remained very similar to the GeForce 7 series, albeit with quality adjustments. The G80 chip supports multi-sampling (MSAA), supersampling (SSAA) and transparency adaptive anti-aliasing (TAA). The four new single-GPU modes are 8x, 8xQ, 16x and 16xQ. Of course, you can't expect that you will be able to have enough horsepower to run the latest games with 16xQ enabled on a single 8800GTX, right?
Wrong. In certain games you can buy today, you can enjoy full 16xQ with the performance of regular 4xAA. The reason is exactly the difference between those 192 and 48 pixels in a single clock. But in games which aren't able to utilise 16x and 16xQ optimisations, you're far better off with lower AntiAliasing settings.
This mode Nvidia now calls "Application Enhanced", joining the two old scoundrels "Application Override" and "Application Controlled". Only "App Enhanced" is the new mode, and the idea is probably that the application talks with Nvidia's driver in order to decide which piece of a scene gets the AA treatment, and which does not. Can you say.... partial AA?
Now, where did we hear that one before.... ah, yes. EAA on Rendition Verite in the late 90s of the past century and on Matrox Parhelia in the early 21st century.
On the HDR (High Dynamic Range) side, Nvidia has designed the feature around the OpenEXR spec, offering 128-bit precision (32-bit FP per component, Red:Green:Blue:Alpha channel) instead of today's 64-bit version. Nvidia is calling its new feature True HDR, although you can bet your arse this isn't the last feature that vendors will call "true". Can't wait for "True AA", "True AF" and so on...
Anisotropic filtering has been raised in quality to match ATI's X1K marchitecture, so now Nvidia offers angle-independent Aniso Filtering as well, thus killing the shimmering effect which was so annoying in numerous battles in Alterac Valley (World of WarCraft), Spywarefied (pardon, BattleField), Enemy Territory and many more. GeForce 7 looks like it was in the stone age compared to the smoothness of the GeForce 8 series. Expect interesting screenshots of D3D AF-Tester Ver1.1 in many of the GF8 reviews on the 8th.
Oh yeah, you can use AA in conjunction with both high-quality AF and 128-bit HDR. The external I/O chip now offers 10-bit DAC and supports over a billion colours, unlike 16.7 million in previous GeForce marchitectures.
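The "over a billion" claim is easy to sanity-check. The sketch below assumes the DAC's bit depth applies to each of the three colour channels (red, green, blue), which is the usual reading of such specs; the function name is ours.

```python
def colour_count(bits_per_channel: int, channels: int = 3) -> int:
    """Number of distinct colours for a given per-channel bit depth."""
    return 2 ** (bits_per_channel * channels)


print(f"{colour_count(8):,}")    # 16,777,216 (the old 16.7 million)
print(f"{colour_count(10):,}")   # 1,073,741,824 (just over a billion)
```

In other words, two extra bits per channel buy 64 times as many colours.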
Since PhysX failed to take off in a spectacular manner, DAAMIT's Menage-a-Trois and Nvidia's SLI-Physics used Havok to create simpler physics computation on respective GPUs. Quantum Effects should take things to a more professional (usable) level, with hardware calculation of effects such as smoke, fire and explosions added to the mix of rigid body physics, particle effects, fluid, cloth and many more things that should make their way into games of tomorrow.
Developed under the codename P355, the 8800GTX is Nvidia's flagship implementation. It features a fully fledged G80 chip clocked at 575MHz. Inside the GPU, there are 128 scalar Shader units clocked at 1.35GHz and raw Shader power is around 520GFLOPS. So, if anyone starts to talk about teraflops on a single GPU, we can tell you that we're around a year away from that number becoming true. Before G90 and R700, these claims come from marketing alone.
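Here is a rough sketch of how a figure of around 520GFLOPS can be reached. The unit count and shader clock are from this article; the three floating-point operations per unit per clock (a multiply-add counted as two, plus one extra multiply) is our assumption about how the number is counted, not something stated above.

```python
SHADER_UNITS = 128          # from the article
SHADER_CLOCK_MHZ = 1350     # from the article (1.35GHz)
FLOPS_PER_UNIT_PER_CLOCK = 3   # assumption: multiply-add (2) + multiply (1)


def peak_gflops(units: int, clock_mhz: int, flops_per_clock: int) -> float:
    """Theoretical peak shader throughput in GFLOPS."""
    return units * clock_mhz * flops_per_clock / 1000


gtx_gflops = peak_gflops(SHADER_UNITS, SHADER_CLOCK_MHZ, FLOPS_PER_UNIT_PER_CLOCK)
print(f"{gtx_gflops:.1f} GFLOPS")   # 518.4 GFLOPS

# Shader clock the same 128 units would need for a full teraflop.
teraflop_clock_mhz = 1_000_000 / (SHADER_UNITS * FLOPS_PER_UNIT_PER_CLOCK)
print(f"{teraflop_clock_mhz:.0f} MHz")
```

Under those assumptions the same chip would need nearly double its shader clock to hit a teraflop, which is why the number stays in marketing territory for now.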
768MB of Samsung memory is clocked at 900MHz DDR, or 1800 MegaTransfers (1.8GHz), yielding a commanding 86.4 GB/s of memory bandwidth.
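The bandwidth figure follows directly from the transfer rate and the width of the memory bus. A minimal sketch, using the 384-bit GTX memory controller mentioned later in this article (the function name is ours):

```python
def memory_bandwidth_gbs(clock_mhz: int, bus_width_bits: int, transfers_per_clock: int = 2) -> float:
    """Peak memory bandwidth in GB/s (1 GB = 10^9 bytes).

    DDR memory moves data twice per clock, hence transfers_per_clock = 2.
    """
    megatransfers = clock_mhz * transfers_per_clock   # MT/s
    bytes_per_transfer = bus_width_bits // 8          # bus width in bytes
    return megatransfers * bytes_per_transfer / 1000  # MB/s -> GB/s


gtx_bandwidth = memory_bandwidth_gbs(clock_mhz=900, bus_width_bits=384)
print(f"{gtx_bandwidth:.1f} GB/s")   # 86.4 GB/s
```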
The PCB is a massive 10.3 inches, or 27 centimetres, and on top of the PCB there are a couple of new things. First of all, there are two power connectors, and secondly - the GTX features two new SLI MIO connectors. Their usage is "TBA" (To Be Announced), but we can tell you that this is not the only 8800 you will be seeing on the market. Connectors are two dual-link DVIs and one HDTV 7-pin out. HDMI 1.3 support is here from day one, but we don't think you'll be seeing too many 8800GTX boards with an HDMI connection.
The cooler is not a water/air hybrid, but a more manufacturer-friendly aluminium design with a copper heat pipe. The fan is expected to be silent as a grave, and several AIBs are planning a more powerful version for 2nd gen 8800GTX, expected to be overclocked to 600 MHz for GPU and 1 GHz DDR for the memory.
The board's recommended price has changed a couple of times and stands at 599 dollars/euros, or 399 pounds. However, due to the expected massive shortage, expect these prices to hit stratospheric levels.
Codenamed P356, the 8800GTS is a smaller brother of the GTX. The G80 chip is the same as on the GTX, but the amount of wiring has been cut, so you have the 320-bit memory controller instead of 384-bit, 96 Shader units instead of 128 and 20 pixels per clock instead of 24.
The board itself is long and comes with a simpler layout than the GTX one. Dual-Link DVI, 7-pin HDTV out come by default. "Only" one 6-pin PEG connector is used, and power-supply requirements are lighter on the wallet.
The clocks have been set at 500MHz for the GPU, 1.2GHz for Shader Units, while the 640MB of memory has been clocked down to 800MHz DDR, or 1600 MegaTransfers (1.6GHz), yielding bandwidth of 64GB/s. Both pixel and texel fill-rate fell by a significant margin, to 24 billion pixels and 16 to 32 billion texels.
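To put a number on that "significant margin", the sketch below compares the GTS figures with the GTX ones quoted in this article. It only divides the published numbers; nothing here is measured, and the dictionary layout is ours.

```python
# Headline figures as quoted in the article: (GTX, GTS).
QUOTED = {
    "memory bandwidth (GB/s)": (86.4, 64.0),
    "pixel fill-rate (billion/s)": (36.8, 24.0),
    "peak texel fill-rate (billion/s)": (36.8, 32.0),
}


def gts_share_of_gtx(gtx: float, gts: float) -> float:
    """GTS figure as a percentage of the GTX figure, to one decimal place."""
    return round(gts / gtx * 100, 1)


shares = {name: gts_share_of_gtx(gtx, gts) for name, (gtx, gts) in QUOTED.items()}

for name, share in shares.items():
    print(f"{name}: GTS delivers {share}% of the GTX")
```

On paper, then, the GTS keeps roughly three quarters of the GTX's memory bandwidth and about two thirds of its quoted pixel fill-rate.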
Recommended price is 399 dollars/euros, but who are we kidding? Expect street prices at least 100 dollars/euros higher.
Performance is CPU Bound
Yes, you've read it correctly. Both GTS and GTX are maxing out the CPUs of today, and even Kentsfield and the upcoming 4x4 will not have enough CPU to max out the graphics card - the G80 chip just eats up all the processing power that a CPU can provide to it.
Having said this, expect fireworks with AMD's 4x4 platform once true quad-core FX chips become available.
In the end
Nvidia has a really strong line-up for the upcoming Yuletide shopping madness. However, within the ranks of Graphzilla's troopers there is an obvious intent to bury all of the more advanced features that the competition will offer in a couple of months' time. 512-bit memory interface, more pixels per clock, second gen RingBus marchitecture... all this is hidden in the dungeons of Markham and DAAMIT's R&D Labs in Santa Clara and Marlboro.
Also, we have to say that the market is now set for a repeat of 2005 and the R520/580 vs. G70/71 duel, since Nvidia will probably offer a spring refresh of the high-end model at the same time as DAAMIT launches the long delayed R600 chip. µ