NVIDIA GK110 "Titan" Performance Analysis

post #1 of 71
Thread Starter 
I received a few questions about the GK110 GPU after my post the other day, so I decided to make a thread about it for anyone interested.

First of all, ALL of this information is either public or derived only from public information. GK110-based cards are already available to the public and I have had access to them for a while now. What I will not and cannot talk about is benchmark performance, unreleased information and features, prices, future cards, etc. I cannot disclose any NDA information.

I will also skip what is already widely covered, such as the improved FP64 performance, and focus on gaming applications.



GK110 GPU block diagram (left) and GK104 GPU block diagram (right)

Performance

Die Layout, extra cores, more meat
Compared with GK104, GK110 appears to have not only additional cores and other units but also more advanced logic in several areas. Most notably, its central command processor looks significantly larger and more advanced, and it appears to have more instruction and data cache as well. This suggests that the GPU can support more complicated drivers and deeper pipeline queues, which should be particularly useful for games with long, complicated graphics pipelines such as the original Crysis, Metro 2033, and GTA4.

SIMD scheduling performance improvements
You will also notice that GK110 has 2 additional SPUs. This gives a theoretical 50% improvement in pipeline throughput over GK104, meaning that GK110 could in theory issue 50% more frames than GK104 in the same amount of time (note that I said issue, not render). This will be great for games with heavy shadow map rendering and long graphics pipelines.

SMX and IPC gains
Although I remember hearing somewhere that the shader units are the same, the diagram clearly shows that they are not the same at all. Most notably, they appear to have many more cache blocks (they appear larger too, but I'm not sure). This will be great for pretty much everything, as the GPU will suffer fewer cache misses and, in theory, achieve a higher average IPC. I'm also guessing that the beefier SMXs will improve average IPC on their own. Note that I am assuming typical cases that fall within a reasonable standard deviation (theoretical isn't always practical). Based on the diagram, I would guess that you will see between a 90% and 150% increase in average IPC for the entire GPU compared to GK104.
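The link between fewer cache misses and a higher average IPC can be illustrated with the standard textbook stall model, CPI = base CPI + (memory accesses per instruction × miss rate × miss penalty). This is only a minimal sketch: the function name and every number in it are hypothetical placeholders, not GK104 or GK110 measurements, and a real GPU hides much of this latency by switching between warps.

```python
def average_ipc(base_cpi, accesses_per_instr, miss_rate, miss_penalty_cycles):
    """Average IPC under a simple stall model (illustrative only).

    All inputs are hypothetical; none of them come from NVIDIA data.
    """
    # Extra cycles per instruction spent waiting on cache misses.
    stall_cpi = accesses_per_instr * miss_rate * miss_penalty_cycles
    return 1.0 / (base_cpi + stall_cpi)


# Hypothetical case: more cache halves the miss rate, everything else equal.
small_cache_ipc = average_ipc(1.0, 0.3, 0.10, 100)
large_cache_ipc = average_ipc(1.0, 0.3, 0.05, 100)

relative_gain = large_cache_ipc / small_cache_ipc - 1.0
print(f"small cache IPC: {small_cache_ipc:.2f}")
print(f"large cache IPC: {large_cache_ipc:.2f}")
print(f"relative IPC gain: {relative_gain:.0%}")
```

With these made-up inputs, halving the miss rate raises IPC by roughly 60%, which shows why extra cache alone could account for a sizeable share of a per-clock gain without saying anything about the actual figure for GK110.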

Clock Speed and Overall Performance Gains
If you assume reasonable power targets, the clocks are likely to be much lower than they were on GK104. I would guess they will need to be somewhere in the 720MHz range. Compared to, say, a 1GHz GK104, my analysis shows at least a 64.8% increase in performance! Most likely this number will be closer to 86.4%, and no higher than a 108% increase (that might occur in some hand-picked cases, but I doubt it). This would indeed put it very close to the current GTX 690, which is often about 83% faster than a GTX 680 at 1GHz.
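The percentages above can be reproduced by scaling the estimated IPC gain by the clock ratio (720MHz / 1GHz = 0.72). The sketch below shows that arithmetic; the function names are made up for illustration, and the middle IPC figure of 120% is an assumption inferred from 86.4 / 0.72, since the post only states the 90% and 150% bounds. For comparison it also computes the compounded form, clock ratio × (1 + IPC gain) − 1, which treats the IPC gain as a multiplier on per-clock work. The two conventions give different results, so it is worth being explicit about which one is meant.

```python
def scaled_gain(clock_ratio, ipc_gain):
    """IPC gain scaled directly by the clock ratio (reproduces the quoted figures)."""
    return clock_ratio * ipc_gain


def compounded_gain(clock_ratio, ipc_gain):
    """Net gain when the IPC gain multiplies the work done per clock."""
    return clock_ratio * (1.0 + ipc_gain) - 1.0


CLOCK_RATIO = 720 / 1000  # guessed 720MHz GK110 vs. a 1GHz GK104

# 0.90 and 1.50 are the IPC bounds from the post; 1.20 is inferred (86.4 / 0.72).
IPC_GAINS = (0.90, 1.20, 1.50)

for gain in IPC_GAINS:
    print(
        f"IPC +{gain:.0%}: "
        f"scaled {scaled_gain(CLOCK_RATIO, gain):.1%}, "
        f"compounded {compounded_gain(CLOCK_RATIO, gain):.1%}"
    )
```

The scaled form yields 64.8%, 86.4% and 108%, matching the numbers in the post, while the compounded form yields 36.8%, 58.4% and 80.0% for the same inputs.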

VS gtx690
Many people are saying it will have 80% of the performance of a GTX 690, but I just don't see that. Based on my analysis, I believe it will be much closer to a GTX 690 than that. It may be that 80% is correct in benchmarks, but after looking at some benchmarks on AnandTech, I think that in many games like Metro 2033, once the drivers have matured, the performance will be very close.


Memory

As for the memory, I can't say. I know that the professional cards I've seen, which are available now, have 6GB of VRAM, but I can't say anything about a consumer equivalent. I can say, however, that 6GB of VRAM is useless for games. Any game that uses more than 1GB of VRAM plus 0.5GB per 1080p monitor is very poorly written. I don't care how many crazy effects the game has or how many trees are being rendered. I know first hand about this type of coding and I can tell you that things just don't scale the way a lot of people think. Today, resources can be quickly swapped from RAM to VRAM, objects can be instanced, textures compressed, passes instanced, etc. There's just no reason why anyone playing a well written game would need more than that. That said, if you have a 3 monitor setup at 1080p, you'd be fine with 3GB. A lot of people who have issues with VRAM are using SLI or CrossFire and don't realize that the memory is mirrored across the cards. So if you have 2GB per card and 4 cards, you still only have 2GB of usable VRAM! I could go on forever about this lol.

I will admit, however, that this limit is being pushed. GTA4, for example, uses about 90% of my 1GB of VRAM on my 1200p screen, and it is so heavily deferred that it would probably add 0.8GB per additional monitor. But that is still under 3GB for 3 monitors!
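The memory arithmetic above can be sketched in a few lines. The function names are made up for illustration, and the defaults are simply the figures from the post (1GB base plus 0.5GB per 1080p monitor, and roughly 0.9GB plus 0.8GB per extra monitor for GTA4). They are rules of thumb, not measurements.

```python
def rule_of_thumb_vram_gb(monitors_1080p, base_gb=1.0, per_monitor_gb=0.5):
    """VRAM a well written game should need, per the rule of thumb above."""
    return base_gb + per_monitor_gb * monitors_1080p


def usable_vram_gb(per_card_gb, num_cards):
    """Usable VRAM in SLI/CrossFire: resources are mirrored, so it does not add up."""
    if num_cards < 1:
        raise ValueError("num_cards must be at least 1")
    return per_card_gb


def gta4_estimate_gb(monitors, first_monitor_gb=0.9, per_extra_monitor_gb=0.8):
    """Rough GTA4 estimate: ~90% of 1GB on one screen, +0.8GB per extra screen."""
    if monitors < 1:
        raise ValueError("monitors must be at least 1")
    return first_monitor_gb + per_extra_monitor_gb * (monitors - 1)


print(f"3x 1080p, rule of thumb: {rule_of_thumb_vram_gb(3):.1f} GB")
print(f"4 cards with 2GB each:   {usable_vram_gb(2.0, 4):.1f} GB usable")
print(f"GTA4 on 3 monitors:      {gta4_estimate_gb(3):.1f} GB")
```

Both three-monitor estimates come out at 2.5GB, which is why a 3GB card is described as sufficient, and four mirrored 2GB cards still leave only 2GB to work with.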

It is very late now, but I will add more commentary another time. Until then, I hope my analysis has been useful. Please post any questions or comments below and I will try to answer them when I get time (it might be a few days' wait, though, because I am very busy).
Edited by texcoord3 - 1/27/13 at 11:11pm
post #2 of 71
sounds cool, but i will believe it when i see it. You will also need more proof than what you presented, to convince me you've tried it.
post #3 of 71
Thread Starter 
Quote:
Originally Posted by th3illusiveman View Post

sounds cool, but i will believe it when i see it. You will also need more proof than what you presented, to convince me you've tried it.

Tried the GPU or the analysis? I guess I can go back and make some figures highlighting what I'm talking about on the dies. I can also add my math. As for trying the GPUs, it doesn't matter much anyway, since the cards I've seen are literally incapable of putting out a video signal. They are for compute purposes only.
post #4 of 71
What about the card's performance in productivity apps, like rendering, since most if not all of us use Photoshop/Max/Maya? Also, I read somewhere that a GTX 670 was compared to a GTX 580 for rendering, and the GTX 580 was 50% faster. What can be expected from GK110?
post #5 of 71
Quote:
Originally Posted by texcoord3 View Post

VS gtx690
Many people are saying it will have 80% of the performance of a GTX 690, but I just don't see that. Based on my analysis, I believe it will be much closer to a GTX 690 than that. It may be that 80% is correct in benchmarks, but after looking at some benchmarks on AnandTech, I think that in many games like Metro 2033, once the drivers have matured, the performance will be very close.


You say that the GK110 "Titan" will be much closer in performance to a GTX 690 than 80%. Wouldn't you say that 80% is close to a GTX 690? Aren't you contradicting yourself?
post #6 of 71
Thread Starter 
Quote:
Originally Posted by cowie View Post

What are you doing rehashing the rumor mill? You bring nothing new to the table.
my unedited first post is the real cowie, the edited version is the new kinder, gentler cowie

I'm not doing that at all. I'm using my knowledge of computer systems architecture, graphics processor architecture, and general theoretical computer science to examine the GK110 die and relate the architecture to gains that are relevant to gaming. I have not seen this done online before, but if it has been, I certainly don't want to waste my time rehashing what someone else has already produced.
post #7 of 71
Thread Starter 
Quote:
Originally Posted by Systemlord View Post

You say that the GK110 "Titan" will be much closer in performance to a GTX 690 than 80%. Wouldn't you say that 80% is close to a GTX 690? Aren't you contradicting yourself?

Closer than 80%, i.e. somewhere between 80% and 100%.
post #8 of 71
Well, I thank you for taking the time to complete your write-up, Texcoord. +Rep. Please forgive some of our less polite members here on OCN; some of us do have manners.
Quote:
You say that the GK110 "Titan" will be much closer in performance to a GTX 690 than 80%. Wouldn't you say that 80% is close to a GTX 690? Aren't you contradicting yourself?
Isn't 90% or 100% closer than 80%?
Edited by Swolern - 1/27/13 at 11:38pm
post #9 of 71
Thread Starter 
Quote:
Originally Posted by huzzug View Post

What about the card's performance in productivity apps, like rendering, since most if not all of us use Photoshop/Max/Maya? Also, I read somewhere that a GTX 670 was compared to a GTX 580 for rendering, and the GTX 580 was 50% faster. What can be expected from GK110?

The GK110 cards currently available for productivity and compute work are beasts. They surpass every Fermi card and every other Kepler card. You can read about them here: http://www.anandtech.com/show/6446/nvidia-launches-tesla-k20-k20x-gk110-arrives-at-last

If there were a consumer card based on the GK110 GPU, I am honestly not sure how its productivity performance would compare. I do know that GeForce cards are not designed for these purposes and are not designed to be left on (under load) for extended periods of time. Also, NVIDIA usually adds extra features to the drivers for its professional cards and gives them more memory, which is always useful.
Edited by texcoord3 - 1/27/13 at 11:42pm
post #10 of 71
Quote:
Originally Posted by texcoord3 View Post

Closer than 80%, i.e. somewhere between 80% and 100%.

Oh, I misunderstood. So it's somewhere between 80% and 100%. Got ya.