
[SemiAccurate] Physics hardware makes Kepler/GK104 fast

14K views 203 replies 80 participants last post by  Dmac73 
#1 ·
Quote:
Nvidia's Kepler/GK104 chip has an interesting secret, a claimed Ageia PhysX hardware block that really isn't. If you were wondering why Nvidia has been beating the dead horse called PhysX for so long, now you know, but it only gets more interesting from there.

Sources tell SemiAccurate that the 'big secret' lurking in the Kepler chips are optimisations for physics calculations. Some are calling this PhysX block a dedicated chunk of hardware, but more sources have been saying that it is simply shaders, optimisations, and likely a dedicated few new ops. In short, marketing may say it is, but under the heat spreader, it is simply shaders and optimisations.
(Source)

Since this article is kind of long, I'll highlight a few things:

Basically, Charlie as we know him is back (smack-talking and criticizing nVidia), and he backs up his previous statements.
Quote:
We stated earlier, Kepler wins in most ways vs the current AMD video cards. How does Nvidia do it with a $299 card? Is it raw performance? Massive die size? Performance per metric? The PhysX 'hardware block'? Cheating? The easy answer is yes, but let's go into a lot more detail.

GK104 is the mid-range GPU in Nvidia's Kepler family, has a very small die, and the power consumption is far lower than the reported 225W. How low depends on what is released and what clock bins are supported by the final silicon. A1 stepping cards seen by SemiAccurate had much larger heatsinks than the A2 versions, and recent rumours suggest there may be an A3 to fix persistent PCIe3 headaches.
He also releases some details about the GK104 architecture:
Quote:
The architecture itself is very different from Fermi, SemiAccurate's sources point to a near 3TF card with a 256-bit memory bus. Kepler is said to have a very different shader architecture from Fermi, going to much more AMD-like units, caches optimised for physics/computation, and clocks said to be close to the Cayman/Tahiti chips. The initial target floating among the informed is in the 900-1000MHz range. Rumours have it running anywhere from about 800MHz in early silicon to 1.1+GHz later on, with early stepping being not far off later ones. Contrary to some floating rumours, yields are not a problem for either GK104 or TSMC's 28nm process in general.

Performance is likewise said to be a tiny bit under 3TF from a much larger shader count than previous architectures. This is comparable to the 3.79TF and 2048 shaders on AMD's Tahiti, GK104 isn't far off either number. With the loss of the so called "Hot Clocked" shaders, this leaves two main paths to go down, two CUs plus hardware PhysX unit or three. Since there is no dedicated hardware physics block, the math says each shader unit will probably do two SP FLOPs per clock or one DP FLOP.
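To put that math in some perspective, here is a quick back-of-the-envelope check. The shader count below is just an assumed round figure for illustration (nothing is confirmed), combined with the article's 2 SP FLOPs per shader per clock and the rumoured 900-1000MHz clock range:

Code:
// Back-of-the-envelope SP throughput check for the rumoured numbers above.
// The GK104 shader count is an assumed illustrative figure, not a confirmed spec.
#include <cstdio>

int main() {
    const int    shaders      = 1536;   // assumed, for illustration only
    const double clock_ghz    = 0.95;   // middle of the rumoured 900-1000MHz range
    const int    sp_per_clock = 2;      // 2 SP FLOPs per shader per clock, per the article's math

    const double tflops = shaders * clock_ghz * sp_per_clock / 1000.0;
    std::printf("GK104 (assumed): ~%.2f TFLOPS SP\n", tflops);  // ~2.92 TF, "a tiny bit under 3TF"

    // The article's Tahiti reference point: 2048 shaders at ~925MHz
    std::printf("Tahiti: ~%.2f TFLOPS SP\n", 2048 * 0.925 * 2 / 1000.0);  // ~3.79 TF
    return 0;
}

Of course that is only theoretical throughput, not game performance, which is exactly the distinction the rest of the article gets into.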
And he reveals nVidia's tactics.

EDIT:

There seems to be a general misconception about the performance boost of GK104, so here are the parts where it's all explained:
Quote:
In the same way that AMD's Fusion chips count GPU FLOPS the same way they do CPU FLOPS in some marketing materials, Kepler's 3TF won't measure up close to AMD's 3TF parts. Benchmarks for GK104 shown to SemiAccurate have the card running about 10-20% slower than Tahiti. On games that both heavily use physics related number crunching and have the code paths to do so on Kepler hardware, performance should seem to be well above what is expected from a generic 3TF card. That brings up the fundamental question of whether the card is really performing to that level?

This is where the plot gets interesting. How applicable is the "PhysX block"/shader optimisations to the general case? If physics code is the bottleneck in your app, a goal Nvidia appears to actively code for, then uncorking that artificial impediment should make an app positively fly. On applications that are written correctly without artificial performance limits, Kepler's performance should be much more marginal. Since Nvidia is pricing GK104 against AMD's mid-range Pitcairn ASIC, you can reasonably conclude that the performance will line up against that card, possibly a bit higher. If it could reasonably defeat everything on the market in a non-stacked deck comparison, it would be priced accordingly, at least until the high end part is released.

All of the benchmark numbers shown by Nvidia, and later to SemiAccurate, were overwhelmingly positive. How overwhelmingly positive? Far faster than an AMD HD7970/Tahiti for a chip with far less die area and power use, and it blew an overclocked 580GTX out of the water by unbelievable margins. That is why we wrote this article. Before you take that as a backpedal, we still think those numbers are real, the card will achieve that level of performance in the real world on some programs.

The problem for Nvidia is that once you venture outside of that narrow list of tailored programs, performance is likely to fall off a cliff, with peaky performance the likes of which haven't been seen in a long time. On some games, GK104 will handily trounce a 7970, on others, it will probably lose to a Pitcairn. Does this mean it won't actually do what is promised? No, it will. Is this a problem? Depends on how far review sites dare to step outside of the 'recommended' list of games to benchmark in the reviewers guide.
 
#4 ·
LOL, This will be interesting.......

Oh Nvidia, u so crazy
 
#6 ·
Quote:
Originally Posted by Khemhotep View Post

If it is able to TROUNCE a 7970 in at least a few games, for only $300?!?!=BIG WIN.
Yeah, Crysis 2 and Vantage only. Have fun!


In all seriousness, it is pretty hard to believe it will be as good as this article makes out.
 
#8 ·
I wish nvidia would stop with the PissX campaigns already. As long as it is proprietary and does not run on AMD cards it is not going to be widely used.
 
#11 ·
Quote:
Originally Posted by tx-jose View Post

ohh lawddd here we go!!!!

I so hope this is true!!!
I can see the forum arguments already. Forget Bulldozer, forget Fermi. This will be the topic that will crash OCN servers lol
 
#13 ·
Wait and see, that's how it has always been.
All we can do right now is speculate.
 
#15 ·
I don't quite get this..... I do not understand how middleware could improve overall performance (unless it is the initial wrapper or is used to resolve some specific heavy workload).
 
#16 ·
So it's kind of like BD: made for the future but sucking altogether. I bet it's only good in DX11 games.
 
#17 ·
Quote:
Originally Posted by PureBlue View Post

I don't think you understand what that article is getting at.
Actually, I do. How about you click the link and read the original story? Overall performance will be right where a mid-range GPU should be, but in games that support the built-in PhysX optimisations, performance will be off the charts. I'm skipping this and waiting for the real high-end Kepler. The GK104 will be a good buy for only $300.
 
#18 ·
Quote:
Originally Posted by DuckieHo View Post

I don't quite get this..... I do not understand how middleware could improve overall performance (unless it is the initial wrapper or is used to resolve some specific heavy workload).
It looks like Nvidia has consultants working pretty hands-on alongside the game developers' programmers, helping write performance optimizations as the games go through their various milestones.

Smart


Right from the beginning, the games are written and coded in such a way as to take maximum performance advantage of Nvidia's GPU architecture.
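Something along these lines is what I'm picturing. This is purely an illustrative sketch of the kind of vendor-specific code path being speculated about here; the function names are made up and it is obviously not real PhysX, TWIMTBP, or game code:

Code:
// Purely illustrative sketch of a vendor-specific code path.
// Names are made up; this is not real PhysX, TWIMTBP, or game code.
#include <cstdio>

enum class GpuVendor { Nvidia, Amd, Other };

// Hypothetical stand-in for however an engine identifies the installed GPU
// (in practice this would come from the driver/adapter information).
GpuVendor detect_gpu_vendor() { return GpuVendor::Nvidia; }

void simulate_physics_generic(float dt) {   // generic fallback path
    std::printf("generic physics step, dt=%.4f\n", dt);
}

void simulate_physics_tuned(float dt) {     // path tuned for one architecture
    std::printf("vendor-tuned physics step, dt=%.4f\n", dt);
}

void step_physics(float dt) {
    // If only one vendor gets the tuned path, every other card falls back to the
    // slower generic code, which is the "stacked deck" the article talks about.
    if (detect_gpu_vendor() == GpuVendor::Nvidia)
        simulate_physics_tuned(dt);
    else
        simulate_physics_generic(dt);
}

int main() {
    step_physics(1.0f / 60.0f);  // one 60Hz frame's physics step
    return 0;
}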
 
#19 ·
Quote:
Originally Posted by Khemhotep View Post

Actually, I do. How about you click the link and read the original story? Overall performance will be right where a mid-range GPU should be, but in games that support the built-in PhysX optimisations, performance will be off the charts. I'm skipping this and waiting for the real high-end Kepler. The GK104 will be a good buy for only $300.
That does not make sense. PhysX is not a performance bottleneck today. If you play a game with a dedicated GTX580 for PhysX or with PhysX off, the game does not run any better.

Quote:
Originally Posted by tehRealChaZZZy View Post

It looks like Nvidia has consultants working pretty hands-on alongside the game developers' programmers, helping write performance optimizations as the games go through their various milestones.
Smart

Right from the beginning, the games are written in such a way as to take maximum performance advantage of Nvidia's GPU architecture.
It is correct that TWIMTBP developers assist in coding. However, they do not restructure the entire game code, nor do they write code in anticipation of a new architecture that is over a year away.... Benchmarks generally consist of recently released games, and reports have NVIDIA doing well in those.
 
#20 ·
Quote:
Originally Posted by Khemhotep View Post

Actually, I do. How about you click the link and read the original story? Overall performance will be right where a mid-range GPU should be, but in games that support the built-in PhysX optimisations, performance will be off the charts. I'm skipping this and waiting for the real high-end Kepler. The GK104 will be a good buy for only $300.
You originally said

"If it is able to TROUNCE a 7970 in at least a few games, for only $300?!?!=BIG WIN."

The thing is, it will beat the 7970 in the games and programs that take advantage of the PhysX, or optimized shaders, that the article describes. That might show well in benchmarks, but when it comes to real-life gaming those games and programs are few amongst many, and the 7970 will still be the better card.

These are dirty tricks by Nvidia to sell their cards and make them look better than they really are.
 
#21 ·
Quote:
Originally Posted by DuckieHo View Post

That does not make sense. PhysX is not a performance bottleneck today. If you play a game with a dedicated GTX580 for PhysX or with PhysX off, the game does not run any better.
Or they could have implemented PhysX in a way that even AMD cards can run it, but as soon as an Nvidia GPU runs it, it will be a lot faster. The problem is that current games will not run any better. I would hate to buy a GPU for the future; I want the performance now.
 
#22 ·
Quote:
Originally Posted by ZealotKi11er View Post

Or they could have implemented PhysX in a way that even AMD cards can run it, but as soon as an Nvidia GPU runs it, it will be a lot faster. The problem is that current games will not run any better. I would hate to buy a GPU for the future; I want the performance now.
Or DX11 code that runs in a less-than-efficient way when it sees an ATI card and not an Nvidia one? Like they did with AA in Batman? Just a guess. A bad one maybe, though, lol.
 
#23 ·
Quote:
Originally Posted by ZealotKi11er View Post

Or they could have implemented PhysX in a way that even AMD cards can run it, but as soon as an Nvidia GPU runs it, it will be a lot faster. The problem is that current games will not run any better. I would hate to buy a GPU for the future; I want the performance now.
Nope, because AMD GPUs cannot execute the PhysX path.

Quote:
Originally Posted by Newbie2009 View Post

Or DX11 code that runs in a less-than-efficient way when it sees an ATI card and not an Nvidia one? Like they did with AA in Batman? Just a guess. A bad one maybe, though, lol.
Nope, DX11 is a Microsoft API. The API passes the commands to the video card through the drivers.
 
#24 ·
So if this is true, GK104, a $299 GPU, will exceed the performance of a 7970, a $550 GPU, in some games, while its performance in others will drop to the level of a 7870 or 7850 (Pitcairn). Again, if true, the fact that a mid-range GPU can outperform the competition's high-end GPU in some games, while matching their mid-range GPU in others, is a positive, not a negative.

I'm sure a lot of people would be thrilled if they could spend $299 on a GPU and get 7970 or greater performance out of it in popular games like BF3 and what not.
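For a rough sense of the value angle, here is a quick calculation using only the rumoured figures floating around this thread (the $299 and $550 prices, and the article's claim of roughly 10-20% behind Tahiti in the shown benchmarks); none of it is confirmed:

Code:
// Rough performance-per-dollar comparison using the rumoured figures in this thread.
#include <cstdio>

int main() {
    const double gk104_price  = 299.0;  // rumoured GK104 price
    const double tahiti_price = 550.0;  // HD7970 launch price

    // The article claims GK104 ran roughly 10-20% slower than Tahiti in the shown benchmarks.
    const double rel_perfs[] = {0.80, 0.90};
    for (double rel_perf : rel_perfs) {
        const double value_ratio = (rel_perf / gk104_price) / (1.0 / tahiti_price);
        std::printf("At %.0f%% of Tahiti's speed: ~%.2fx the performance per dollar\n",
                    rel_perf * 100.0, value_ratio);
    }
    return 0;
}

Even at the pessimistic end that works out to roughly 1.5x the performance per dollar, which is why a lot of people would be happy with that trade-off.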
 
#25 ·
Quote:
Originally Posted by DuckieHo View Post

It is correct that TWIMTBP developers assist in coding. However, they do not restructure the entire game code, nor do they write code in anticipation of a new architecture that is over a year away....
Actually, I'd say that's more than likely what is happening.
As the games go through various testing stages, I could see how Nvidia's embedded in-house agents could give 'suggestions' that make any programmer's job a lot easier as the programmers deal with performance issues and develop and optimize their code.
 