Overclock.net › Forums › Graphics Cards › NVIDIA › Fermi Tessellation Performance Comparison

Fermi Tessellation Performance Comparison - Page 8

post #71 of 105
Thread Starter 
Quote:
Originally Posted by brettjv View Post

What they're showing so far is that fewer tess units = larger perf hit from Extreme.
Except for the 580s oddly enough.

Majin saw a >32% drop.
KonigGeist saw a >34% drop.
Ferrari8608 saw a >35% drop.

Also I'd really like to see the results from a 450/550 Ti. If the theory holds, their 4 units should be a huge bottleneck.
post #72 of 105
Quote:
Originally Posted by RagingCain View Post
nVidia have had one ace card in their pocket, and that has been extremely progressive and pro-active driver support. Unfortunately for them, AMD/ATi appear to have learned from past mistakes. They can have superior hardware (ala 58xx/59xx), and still lose customers due to shoddy support.
Emphasis mine.

Thank you! Getting a concession on this point is like pulling teeth.

Further on your point (not in the quote), the most I've seen nvidia bend on this pricing strategy is with the 590. I was almost certain, based on the historical precedent you so eloquently detail, that the 590 would land much nearer a "580x2" price than match its nearest competitor. Color me surprised (although nobody else was, apparently).

My apologies for the random digression. Very nice work filling out this very interesting data. The 580 results, in comparison to the rest, are quite intriguing. Keep it coming, fellas.
post #73 of 105
Although it's easy to see the drop in performance from turning tessellation on, each card has a different ratio of shaders to tessellation units, so you still can't draw an accurate comparison based on the actual number of tessellation units. The results are not answering your fundamental question.

The number of shaders and the core clock need to be squared off against the number of tessellation units.

Basically, a higher number of shaders relative to tessellation units is going to give a bigger performance difference because of the ratio. This is why we are seeing a higher percentage drop with the 460/560: their shader power outweighs their tessellation power, which gives a wider gap between results.

We need to figure out how to find the actual performance difference between tessellation units.

Edit: The easiest way is to take, for example, the 460 and 560 Ti and measure the difference between the two in raw shader power. Then, with that difference in mind, look at the difference between the two cards' performance with tess turned on. Because we know how much more rendering power the 560 has, we can play this off against it also having 1 more tessellation unit.

Hope you know what I mean. I need to do this with figures for it to make sense but haven't got the time at the moment; I'll try to get round to it. This should give a more accurate comparison for judging tessellation unit performance.

Edit2: This could also mean that the extra shaders in the 580 are outweighing the 1 extra tess unit it has compared to the 480/570, giving a slightly higher percentage difference.

Edit3: Bottom line: next-gen GPUs that support tessellation are going to need a much higher ratio of tessellation units to shaders to eliminate any bottleneck.
Edited by Juliancahillane - 8/2/11 at 7:29am
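A rough sketch of the 460 vs 560 Ti comparison described in the Edit above, in Python. Everything here is an assumption for illustration: "raw shader power" is approximated as CUDA cores times shader clock, the core counts and stock shader clocks are the reference specs as I remember them, and the FPS values are made-up placeholders, not results from this thread. The names `CardRun` and `scaling_ratios` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CardRun:
    name: str
    cuda_cores: int
    shader_clock_mhz: float
    fps_tess_off: float
    fps_tess_on: float


def raw_shader_power(run: CardRun) -> float:
    # Crude proxy for shader throughput: cores x shader clock.
    return run.cuda_cores * run.shader_clock_mhz


def scaling_ratios(base: CardRun, other: CardRun) -> dict:
    # How much "other" gains over "base" on paper and in each test mode.
    return {
        "shader_power": raw_shader_power(other) / raw_shader_power(base),
        "fps_tess_off": other.fps_tess_off / base.fps_tess_off,
        "fps_tess_on": other.fps_tess_on / base.fps_tess_on,
    }


# PLACEHOLDER FPS numbers, not measurements. Swap in real Heaven runs.
gtx460 = CardRun("GTX 460", 336, 1350.0, fps_tess_off=60.0, fps_tess_on=36.0)
gtx560ti = CardRun("GTX 560 Ti", 384, 1644.0, fps_tess_off=80.0, fps_tess_on=50.0)

ratios = scaling_ratios(gtx460, gtx560ti)
for key, value in ratios.items():
    print(f"{key}: {value:.3f}x")
```

If the tess-on ratio comes out noticeably higher than the tess-off ratio, that extra gain would be a candidate for what the additional tessellation unit contributes. With both runs made at the same settings and Heaven version, that is about as far as this simple comparison can go.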
    
post #74 of 105
Thread Starter 
Quote:
Originally Posted by Juliancahillane View Post
Although it's easy to see the drop in performance from turning tessellation on, each card has a different ratio of shaders to tessellation units, so you still can't draw an accurate comparison based on the actual number of tessellation units. The results are not answering your fundamental question.

The number of shaders and the core clock need to be squared off against the number of tessellation units.
You bring up an interesting point.

The number of shaders on each GPU, and the number of tessellation units, is directly tied to the number of SMs. The number of shaders can vary, but there is always 1 tessellation unit per SM.

The 470, 480, 570, and 580 each have 32 CUDA cores (shaders) per SM, which also means they each have 32 shaders to every 1 tessellation unit.

The 450, 550, 460 and 560 Ti have 48 CUDA cores per SM, so 48 shaders for each tess unit.

In that regard, you are right. The ratio of shaders to tess units in the 450-560Ti makes their performance more heavily weighted towards raw shader performance, and would explain their larger drop.

With that in mind, if we had 450/550 Ti results to analyze, they might look similar to the 460/560 results we've seen thus far.
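To put numbers on the ratios above, here is a small Python sketch. The cores-per-SM figures and the one-tess-unit-per-SM rule come from this post; the SM counts per card are from NVIDIA's published specs as I remember them, so treat them as assumptions and double-check before relying on them. `FERMI_CARDS` and `summarize` are just illustrative names.

```python
# card -> (SM count, CUDA cores per SM). One tessellation unit per SM.
# SM counts are assumed from published specs; verify before relying on them.
FERMI_CARDS = {
    "GTX 470": (14, 32),
    "GTX 480": (15, 32),
    "GTX 570": (15, 32),
    "GTX 580": (16, 32),
    "GTS 450": (4, 48),
    "GTX 550 Ti": (4, 48),
    "GTX 460": (7, 48),
    "GTX 560 Ti": (8, 48),
}


def summarize(card: str) -> tuple:
    """Return (total shaders, tess units, shaders per tess unit)."""
    sm_count, cores_per_sm = FERMI_CARDS[card]
    tess_units = sm_count  # one tessellation unit per SM
    shaders = sm_count * cores_per_sm
    return shaders, tess_units, shaders // tess_units


for card in FERMI_CARDS:
    shaders, tess_units, ratio = summarize(card)
    print(f"{card:<11} {shaders:>4} shaders, {tess_units:>2} tess units, {ratio}:1")
```

The absolute number of tess units varies a lot (4 to 16), but the ratio only takes two values, 32:1 or 48:1, which is the grouping the drop percentages seem to follow.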

Quote:
Edit2: This could also mean that the extra shaders in the 580 are outweighing the 1 extra tess unit it has compared to the 480/570, giving a slightly higher percentage difference.
This isn't supported though, as the 580's shader to tessellation unit ratio is the same as the 480/570/470.
Edited by Booty Warrior - 8/2/11 at 7:44am
post #75 of 105
Ahh, I didn't realise that. In that case the shader clocks would be the variable. The balance also isn't quite right, as there is a large deficit in performance with tess on; maybe this becomes more apparent with higher numbers of shaders and tess units, as on the 580.

Edit: But we should still use the method I described, or something along those lines, to work out the tessellation unit performance.
Edited by Juliancahillane - 8/2/11 at 7:56am
    
post #76 of 105
It looks like those runs are average for my card. I just ran it again and got 82.1 fps with tessellation off and 54.6 fps with it on extreme, for a drop of 33.5%.
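For anyone who wants to check their own runs the same way, the percentage drop is just the FPS lost divided by the tessellation-off FPS. A minimal Python sketch (the function name `tess_drop_percent` is only illustrative):

```python
def tess_drop_percent(fps_off: float, fps_on: float) -> float:
    """Percentage of frame rate lost when tessellation is enabled."""
    if fps_off <= 0:
        raise ValueError("fps_off must be positive")
    return (fps_off - fps_on) / fps_off * 100.0


# The run above: 82.1 fps with tessellation off, 54.6 fps on extreme.
drop = tess_drop_percent(82.1, 54.6)
print(f"{drop:.1f}%")  # prints 33.5%
```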
post #77 of 105
Thread Starter 
Quote:
Originally Posted by Juliancahillane View Post
Ahh, I didn't realise that. In that case the shader clocks would be the variable.
So far that doesn't seem to be the case. Balla's run the test at 3 different clock settings and his results were within ~1% of each other.

As I understand it, Fermi uses the CUDA cores to do tess computations in each "unit," so I don't think their clocks would impact the relative performance loss.
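A quick way to see why clocks could drop out of the percentage, sketched in Python. The big assumption, which is an idealization and not something measured here, is that an overclock scales the tess-off and tess-on frame rates by the same factor. The 82.1/54.6 fps pair is a run reported elsewhere in this thread and is used only as sample input; `scaled_drop` is a hypothetical helper.

```python
import math


def drop_percent(fps_off: float, fps_on: float) -> float:
    return (fps_off - fps_on) / fps_off * 100.0


def scaled_drop(fps_off: float, fps_on: float, clock_scale: float) -> float:
    # Assumes both frame rates scale linearly by the same clock factor.
    return drop_percent(fps_off * clock_scale, fps_on * clock_scale)


base = drop_percent(82.1, 54.6)
for scale in (0.9, 1.0, 1.1, 1.25):
    # The scale factor cancels out of the ratio, so the drop is unchanged.
    assert math.isclose(scaled_drop(82.1, 54.6, scale), base)
    print(f"clock x{scale}: {scaled_drop(82.1, 54.6, scale):.1f}% drop")
```

If real runs at different clocks show the percentage moving by more than run-to-run noise, that would mean the two modes are not scaling equally and the assumption does not hold.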

Quote:
Edit: But we should still use the method I described, or something along those lines, to work out the tessellation unit performance.
That should also work with the 580/480, since they share the same architectural differences as the 460 and 560 Ti.

Some 450/550 Ti results would also be really helpful right now.
post #78 of 105
Quote:
Originally Posted by HeadlessKnight View Post
From what I see in these benchmarks, the new 500 series is far superior in performance to the old 400 series, especially the 570 & 580.
Also, I am wondering: does this reflect real-world performance in all other applications?
What benchmarks? The numbers are all over the place. People are testing without 4xAA (a major difference), using Heaven 2.1 instead of 2.5, etc.

Check page number 3, Ballas' GTX 470 @ 880/1760/2150 is within 4 FPS (on extreme settings) of KonigGeist's GTX 580 @ 816 MHz/1632/2004 (I'm assuming it's not overclocked even further).

Those numbers make me hope that Kepler will be at least twice as powerful compared to Fermi.
post #79 of 105
Quote:
Originally Posted by Wulfgar View Post
What benchmarks? The numbers are all over the place. People are testing without 4xAA (a major difference), using Heaven 2.1 instead of 2.5, etc.

Check page number 3, Ballas' GTX 470 @ 880/1760/2150 is within 4 FPS (on extreme settings) of KonigGeist's GTX 580 @ 816 MHz/1632/2004 (I'm assuming it's not overclocked even further).

Those numbers make me hope that Kepler will be at least twice as powerful compared to Fermi.
My 580 was running at stock then, but it had a glitched BIOS, so those aren't normal numbers. Running it several times in the past day or two, after fixing the BIOS, I've been getting around 50-55 fps on extreme; however, the percentage lost has remained the same.
But I agree that Kepler needs to be more of an upgrade.
post #80 of 105
Quote:
Originally Posted by Booty Warrior View Post
You bring up an interesting point.

The number of shaders on each GPU, and the number of tessellation units, is directly tied to the number of SMs. The number of shaders can vary, but there is always 1 tessellation unit per SM.

The 470, 480, 570, and 580 each have 32 CUDA cores (shaders) per SM, which also means they each have 32 shaders to every 1 tessellation unit.

The 450, 550, 460 and 560 Ti have 48 CUDA cores per SM, so 48 shaders for each tess unit.

In that regard, you are right. The ratio of shaders to tess units in the 450-560Ti makes their performance more heavily weighted towards raw shader performance, and would explain their larger drop.

With that in mind, if we had 450/550 Ti results to analyze, they might look similar to the 460/560 results we've seen thus far.


This isn't supported though, as the 580's shader to tessellation unit ratio is the same as the 480/570/470.
Yup. Bottom line, the cards based on GF100/110 take a considerably lower perf hit from tessellation, precisely because of the SM layout. Like you say, on the GF100/110 cards you have 32 CUDA cores per tess unit; on GF104/114 cards you have 48 CUDA cores per tess unit.

The results so far show roughly matching drop-offs for all 32-shader SM cards (around 33%), and matching (but worse) drop-offs for all the cards with 48-shader SMs (around 40%).

The different ratio of shaders to tess units is the key here.

I think w/further testing of GTX580's we're going to discover it has the same perf hit as the other 32-shader/SM cards, i.e. that 35% was an outlier.

I don't think clocks come into play here because the tess units are obviously being OC'd at the same rate as the core, and the OC scaling for them looks to be the same.
    