AMD might have just launched its new Vega GPU architecture with a slew of Vega-based products (Radeon Vega, Radeon RX Vega, Radeon Pro WX, and Radeon Instinct), but the real king is NVIDIA's now months-old Volta GPU architecture.
We don't hear much about Volta because it's still a while out from finding its way into consumer GeForce graphics cards, but the supercomputer, AI, and deep learning markets are now receiving their new Volta-based Tesla V100 accelerators, which means... BENCHMARK TIME!
First off, let's look at the difference between the previous-gen Pascal-based Tesla P100 and the new Volta-based Tesla V100. We start with 12x more deep learning training performance, going from 10 TFLOPS on P100 up to a freakin' is-it-real 120 TFLOPS of 'DL training' on V100. NVIDIA has some huge memory bandwidth numbers on Tesla V100 as well, with 900GB/sec available - up from 720GB/sec on Tesla P100 (1.25x). NVLink 2.0 is also featured, pushing the interconnect bandwidth up from 160GB/sec to a huge 300GB/sec (1.9x), while L1 cache climbs to 10MB, up from 1.3MB on Tesla P100 (a 7.7x increase).
The new NVIDIA Tesla V100 has been tested in Geekbench 4's compute tests, with an out-of-this-world score of 743,537... the next closest is the P100-based system with just 320,031 in comparison, less than half the V100's result. Even HP's impressive Z8 G4 workstation PC is only capable of 278,706 points, and that system rocks 9 x PCIe slots with Quadro GP100 cards inside.
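If you want to sanity-check the multipliers quoted above, here is a quick Python sketch that recomputes them from the figures in this article. The dictionary layout and the function names (`spec_ratios`, `geekbench_lead`) are made up for illustration and aren't part of any NVIDIA or Geekbench tooling; the numbers are just the ones listed here, not fresh measurements.

```python
# Figures quoted in the article, as (Tesla P100, Tesla V100) pairs.
SPECS = {
    "DL training (TFLOPS)": (10, 120),
    "Memory bandwidth (GB/s)": (720, 900),
    "NVLink bandwidth (GB/s)": (160, 300),
    "L1 cache (MB)": (1.3, 10),
}

# Geekbench 4 compute scores quoted in the article.
GEEKBENCH = {
    "Tesla V100": 743_537,
    "Tesla P100": 320_031,
    "HP Z8 G4": 278_706,
}


def spec_ratios(specs=SPECS):
    """Return the V100-over-P100 ratio for each spec (hypothetical helper)."""
    return {name: new / old for name, (old, new) in specs.items()}


def geekbench_lead(scores=GEEKBENCH, leader="Tesla V100"):
    """Return how many times higher the leader scores than each other system."""
    top = scores[leader]
    return {name: top / score for name, score in scores.items() if name != leader}


if __name__ == "__main__":
    for name, ratio in spec_ratios().items():
        print(f"{name}: {ratio:.2f}x")
    for name, ratio in geekbench_lead().items():
        print(f"V100 vs {name}: {ratio:.2f}x")
```

Going by these numbers, the V100's Geekbench result works out to roughly 2.3 times the P100 system's score and roughly 2.7 times the Z8 G4's.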