
[Real World Tech] Tile-based Rasterization in Nvidia GPUs

post #1 of 42
Thread Starter 
Quote:
Nvidia has constantly evolved the architecture of their GPUs in each generation to enhance performance and power-efficiency. The company has discussed the changes in the programmable shader cores for the Maxwell and Pascal generations, which have generally eliminated or reduced scheduling logic and placed a greater burden on the compiler. However, Nvidia’s architects have avoided disclosing details about the fixed function graphics hardware – in some cases denying changes.

Starting with the Maxwell architecture, Nvidia’s high-performance GPUs have borrowed techniques from low-power mobile graphics architectures. Specifically, Maxwell and Pascal use tile-based immediate-mode rasterizers that buffer pixel output, instead of conventional full-screen immediate-mode rasterizers. Using simple DirectX shaders, we demonstrate the tile-based rasterization in Nvidia’s Maxwell and Pascal GPUs and contrast this behavior with the immediate-mode rasterizer used by AMD.




Source: http://www.realworldtech.com/tile-based-rasterization-nvidia-gpus/

Also posted on Anandtech: http://www.anandtech.com/show/10536/nvidia-maxwell-tile-rasterization-analysis
post #2 of 42
The AnandTech title is kinda funny. I thought everyone knew Nvidia had been using a tile-based renderer since Maxwell? They have a line or two about it on the specs page for the graphics cards over at Nvidia. It was kinda funny a couple of years ago when I posted that Nvidia looked like they were moving to a tile-based renderer and most people didn't believe me. :P

Edit: that post, from 2 years ago: http://www.overclock.net/t/1496430/expr-nvidia-gm204-with-256bit-wide-flagship-will-512bit/130#post_22453169
Edited by EniGma1987 - 8/1/16 at 12:03pm
post #3 of 42
Wasn't familiar with much of this stuff, thanks for the video, was a good watch.
post #4 of 42
Interesting. Are there any cons to tile based?
post #5 of 42
Quote:
Originally Posted by tkenietz View Post

Interesting. Are there any cons to tile based?

It would make the chip too power-efficient to be branded as a Radeon Global Warming Accelerator /s :D
post #6 of 42
Quote:
Originally Posted by sherlock View Post

It would make the chip too power-efficient to be branded as a Radeon Global Warming Accelerator /s :D

Lol

I only ask because it doesn't seem like it's a new way of doing it, and if it's much more efficient why isn't it being adopted as the standard way? Is it possible for GCN to adopt this method or would it require a whole new redesign?
post #7 of 42
It would likely require a redesign of most of the GPU's back end (the whole rasterizer section and the cache subsystem that feeds rasterization, and with it how the memory controllers operate, the texture mapping units and such) and *maybe* the scheduler, but the GCN cores and compute clusters themselves shouldn't need a redesign.


Mobile chips have done this for a long time, and Nvidia has adopted it too, in a way. They are really using a sort of hybrid tile-based + immediate-mode renderer. AMD could do the same thing. Going to a 100% tile-based renderer would likely need a major shift in how graphics are handled in game engines, the OS, and APIs, as well as in the hardware and drivers. But that probably isn't necessary, as Nvidia's hybrid approach seems to work great for performance and efficiency as well as compatibility.
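
For anyone wondering what the difference between the two looks like, here is a rough toy sketch in Python. It is not how Nvidia's hardware works internally: the rectangle "primitives", the `TILE` size of 4, and the function names `immediate_order` and `tiled_order` are all made up for illustration. It only shows the order in which pixels get touched. Immediate mode finishes one primitive across the whole screen before starting the next, while the tiled version finishes everything inside one tile before moving on.

```python
# Toy model only: primitives are axis-aligned rectangles (x0, y0, x1, y1),
# and TILE is a made-up tile size, not a real hardware parameter.
TILE = 4


def covered(rect):
    """List the pixels a rectangle covers, row by row."""
    x0, y0, x1, y1 = rect
    return [(x, y) for y in range(y0, y1) for x in range(x0, x1)]


def immediate_order(prims):
    """Finish each primitive across the whole screen before the next one."""
    return [(i, px) for i, rect in enumerate(prims) for px in covered(rect)]


def tiled_order(prims, width, height, tile=TILE):
    """Walk the screen tile by tile; inside a tile, keep submission order."""
    out = []
    for ty in range(0, height, tile):
        for tx in range(0, width, tile):
            for i, rect in enumerate(prims):
                for (x, y) in covered(rect):
                    if tx <= x < tx + tile and ty <= y < ty + tile:
                        out.append((i, (x, y)))
    return out


prims = [(0, 0, 8, 4), (0, 0, 8, 4)]  # two overlapping 8x4 rectangles
imm = immediate_order(prims)
til = tiled_order(prims, width=8, height=4)
print("immediate:", [i for i, _ in imm])
print("tiled:    ", [i for i, _ in til])
```

Both orders touch exactly the same pixels, and each pixel still sees the primitives in submission order. Only the path across the screen changes, which is what lets a tiled design keep the pixels it is working on in a small on-chip buffer.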

If you think about it, it is pretty impressive that AMD has managed to reach Maxwell-level efficiency with Polaris when they have a hardware scheduler sucking power, more hardware features and components, and no tile-based renderer for power savings.
Edited by EniGma1987 - 8/1/16 at 5:04pm
post #8 of 42
Quote:
Originally Posted by EniGma1987 View Post

It would likely require a redesign of most of the GPU's back end (the whole rasterizer section and the cache subsystem that feeds rasterization, and with it how the memory controllers operate, the texture mapping units and such) and *maybe* the scheduler, but the GCN cores and compute clusters themselves shouldn't need a redesign.


Mobile chips have done this for a long time, and Nvidia has adopted it too, in a way. They are really using a sort of hybrid tile-based + immediate-mode renderer. AMD could do the same thing. Going to a 100% tile-based renderer would likely need a major shift in how graphics are handled in game engines, the OS, and APIs, as well as in the hardware and drivers. But that probably isn't necessary, as Nvidia's hybrid approach seems to work great for performance and efficiency as well as compatibility.

If you think about it, it is pretty impressive that AMD has managed to reach Maxwell-level efficiency with Polaris when they have a hardware scheduler sucking power, more hardware features and components, and no tile-based renderer for power savings.

It is a feat, but the way I understand it, AMD's ROPs are currently held back from their full potential by memory constraints. If they implemented something similar to this method of rasterization, they could potentially boost performance a bit, or cut back further and keep similar performance while bringing power consumption closer to that of Nvidia cards, all while still keeping a hardware scheduler and ACEs. It would also help margins somewhat by allowing a narrower bus and fewer ROPs.
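
To put a rough number on the memory side of that argument, here is a back-of-the-envelope sketch in Python. Everything in it is an assumption for illustration: the function name `framebuffer_write_bytes`, the 1080p resolution, the overdraw factor of 3, and 4 bytes per pixel. It also ignores ROP caches, color compression, and depth traffic, and it treats the tiled case as a best case where every overlapping write is resolved on-chip, so take it as a toy model and not a measurement of any real card.

```python
def framebuffer_write_bytes(width, height, overdraw, bytes_per_pixel=4, tiled=False):
    """Bytes of color data written to off-chip memory for one frame (toy model)."""
    pixels = width * height
    # Best-case tiled assumption: overlapping writes are resolved in the
    # on-chip tile buffer, so each pixel leaves the chip exactly once.
    writes_per_pixel = 1 if tiled else overdraw
    return pixels * writes_per_pixel * bytes_per_pixel


imm = framebuffer_write_bytes(1920, 1080, overdraw=3)
til = framebuffer_write_bytes(1920, 1080, overdraw=3, tiled=True)
print(f"immediate: {imm / 1e6:.1f} MB/frame, tiled: {til / 1e6:.1f} MB/frame")
```

In this model the write traffic scales with overdraw for the immediate-mode case and stays flat for the tiled one, which is the kind of saving that could let a design get by with a narrower bus.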
post #9 of 42
One would hope once AMD starts making some money for more R&D, they implement this too... maybe they have with Polaris.
post #10 of 42
I think Nvidia would not appreciate this public revelation as this should have been their "secret sauce". I can see AMD wasting no time looking into this for their own use.