Overclock.net - An Overclocking Community - View Single Post - [Techpowerup] NVIDIA DLSS and its Surprising Resolution Limitations

Post #79, 02-20-2019, 09:09 AM
Posted by TheBlademaster01
Quote: Originally Posted by DNMock
Shoddy math will give a hint I think.

Going from 16nm to 12nm should yield a density increase of ~42% (144 / 256), although admittedly I have no clue what the real change is, nor do I know if the process shrink affects only lengths or if it affects lengths and widths the same... Someone fix this for me if they know a more accurate number.

GP100 is on a 610mm^2 die and has

Texture Units - 224
CUDA - 3584
Tensor - 0

GV100 is on an 815mm^2 die and has

Texture Units - 320
CUDA - 5120
Tensor - 640

Between the bigger die size and the increased density, you should see a total uptick of about 79% across the board for a GP100 at that size on that process.

Theoretical GP100 on 12nm process at 815 mm^2

CUDA - 6415
Texture Units - 400

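Compared with the real GV100 (5120 CUDA, 320 Texture Units), that leaves a gap of 1295 CUDA and 80 Texture Units, which gives us 640 Tensor cores = 1295 CUDA and 80 Texture Units

Or 1 Tensor core = roughly 2 CUDA and 0.1 Texture Units as far as overall real estate goes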

Going out on a limb and saying an RT core is the same size as a Tensor core, the 2080 Ti has 544 Tensor cores and 68 RT cores, adding up to 612 total. To further extrapolate, that would put a 2080 Ti with no Tensor/RT cores at about 5576 CUDA instead of the 4352 it got, giving a final total of about 25% of the Tensor/CUDA area budget being used on Tensor/RT cores.

Obviously that's taking a lot of freedom and making a series of extrapolations on that fast and loose guess, so take that with a dump truck of salt.
Haha, nice

Yes, it's valid to assume that transistor area scales quadratically with feature size. Aside from some (non-trivial) manufacturing-process-specific factors, a transistor's drive strength/current depends on the ratio of its width to its length. So, in order to get similar performance, you can reduce the width by approximately the same ratio as the feature size reduction.
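
To put a number on the quadratic assumption, here is a rough sketch. It takes the 16nm and 12nm node names at face value as linear feature sizes, which is an assumption (node names are largely marketing labels, so the real shrink may well be smaller). The function name `density_gain` is just for illustration.

```python
def density_gain(old_nm: float, new_nm: float) -> float:
    """Ideal density multiplier if width and length both shrink by new_nm / old_nm."""
    linear_ratio = new_nm / old_nm   # 12 / 16 = 0.75
    area_ratio = linear_ratio ** 2   # 0.5625, i.e. 144 / 256
    return 1.0 / area_ratio


gain = density_gain(16, 12)
print(f"area per transistor: {1.0 / gain:.4f}x")  # 0.5625x
print(f"transistor density:  {gain:.2f}x")        # 1.78x
```

Under that idealised assumption, 144 / 256 is the area each transistor takes up after the shrink, and the matching density multiplier is its inverse, 256 / 144.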

I can follow the rest of your reasoning. I think equating the size of an RT core with that of a Tensor core is not accurate, but the error might be negligible because of how few of them there are on the chip.
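
A quick way to see how much the RT core size matters is to make it a variable. This is only a sketch built on the quoted estimate: the "1 Tensor core = 2 CUDA" area ratio and the core counts come from the post above, while `rt_size_in_tensor_units` and the exact definition of "share" are my own assumptions for illustration.

```python
# Figures taken from the quoted estimate (not measured die areas).
CUDA_PER_TENSOR = 2.0   # area of one Tensor core, expressed in CUDA cores
TENSOR_CORES = 544      # RTX 2080 Ti
RT_CORES = 68
CUDA_CORES = 4352


def tensor_rt_share(rt_size_in_tensor_units: float) -> float:
    """Fraction of the CUDA + Tensor + RT area spent on Tensor and RT cores.

    rt_size_in_tensor_units is a hypothetical knob: 1.0 means an RT core
    takes up the same area as a Tensor core.
    """
    tensor_equivalents = TENSOR_CORES + RT_CORES * rt_size_in_tensor_units
    extra = tensor_equivalents * CUDA_PER_TENSOR  # area in CUDA-core units
    return extra / (CUDA_CORES + extra)


for k in (0.5, 1.0, 2.0, 4.0):
    print(f"RT core = {k} x Tensor core -> share {tensor_rt_share(k):.1%}")
```

With these inputs, halving or doubling the assumed RT core size moves the share by less than three percentage points, since there are only 68 RT cores against 544 Tensor cores. The absolute figure depends on how the share is defined, so it need not match the 25% quoted above.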

I Googled around, and your "pseudo-science" is within 1% of the result from someone who solved the same problem graphically, so who knows.


