Overclock.net › Articles › State Of The Workstation

State of the workstation

Just testing out OCN's article/blog feature.

AMD - Solidworks, Maya, PTC Creo 2.0
W600 - Cape Verde Pro (HD7750 w/ 2GB ram)
W5000 -Pitcairn LE (HD 7850 cutdown) 1,267.20 GFLOPS SPFP; 79.2 GFLOPs DPFP , 2GB GDDR5 , 102.4 GB/s memory bandwidth
W5000 DVI - 128-bit memory bus :(
W7000 - Pitcairn XT (HD 7870GE) 2.4 TFLOPs SPFP, 152 GFLOPs DPFP, 4GB ECC GDDR5 , 154GB/s memory bandwidth
W8000 - Tahiti Pro (HD 7950 on 256bit memory bus) 3.23 TFLOPs SPFP and 806 GFLOPs DPFP, 4GB ECC GDDR5, 176GB/s
W9000 - Tahiti XT (HD7970GE 6GB) 4.0 TFLOPs SPFP, 1 TFLOP DPFP, 6GB ECC 384-bit GDDR5, 264 GB/s memory bandwidth

S7000 - Pitcairn XT (HD7870) 2.4 TFLOPs SPFP, 152 GFLOPs DPFP, 4GB GDDR5 , 154GB/s memory bandwidth
S9000 - Tahiti Pro (HD7950) 3.23 TFLOPS single-precision and 806 GFLOPS double-precision, 6GB GDDR5 384-bit , 264 GB/s memory bandwidth
S10000 - Tahiti Pro x2 (HD 7990 kind of) 5.91 TFLOPS single and 1.48 TFLOPS DP, 6GB GDDR5 384-bit , 480 GB/s memory bandwidth
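
As a rough cross-check on the bandwidth figures above: GDDR5 bandwidth in GB/s is the bus width in bytes times the effective per-pin data rate in Gbps. The sketch below back-solves the data rate implied by the numbers in this list. The helper name is made up for illustration, and the results are implied values, not official memory clocks.

```python
# Bus width (bits) and bandwidth (GB/s) as quoted in the list above.
cards = {
    "W8000": (256, 176.0),
    "W9000": (384, 264.0),
    "S9000": (384, 264.0),
}

def implied_data_rate_gbps(bus_width_bits, bandwidth_gb_s):
    """Per-pin data rate implied by: bandwidth = (bus width / 8) * data rate."""
    return bandwidth_gb_s / (bus_width_bits / 8)

for name, (bus, bw) in cards.items():
    rate = implied_data_rate_gbps(bus, bw)
    print(f"{name}: {bus}-bit bus, {bw:.0f} GB/s -> {rate:.2f} Gbps per pin")
```

All three come out at the same per-pin rate, so on these numbers the W8000's lower bandwidth is down to its narrower 256-bit bus rather than slower memory.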

NVIDIA - the only option for CUDA, Autodesk Autocad (because AMD sucks at that), Bunkspeed, Adobe Premiere, CATIA Live Render
K600 -GK107
K2000 -GK107 (GTX 650) 384 CUDA cores,732.67 GFLOPS,~30GFlops DPFP?, 2GB GDDR5 , 128-bit , 64 GBps
K2000D -GK107 (GTX 650)
K4000 -GK106 (GTX 650 Ti Boost) 768 CUDA cores, 1,244.16 GFLOPS, ~52Gflops DPFP ?, 3GB GDDR5 , 192-bit , 134GBps
K5000 -GK104 (GTX 680) 1536 CUDA cores, 2.1 Teraflops , 90 Gflops DPFP, 4GB ECC 256bit GDDR5, 173 GB/s ... ANSYS can use this for solver
K6000 - (not released)

* all Kepler Quadros listed above have DPFP = 1/24 SPFP

Tesla K10 - GK104 x2 (gtx 690) 4.58 teraflops SPFP, 0.19 teraflops DPFP, 8GB GDDR5, 320 GBytes/s memory bandwidth (DPFP=1/24 SPFP)
Tesla K20 - GK110 (TITAN) DPFP = 1/3 FP32
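
The "~30 GFLOPs?" style DPFP guesses above can be sanity-checked by applying the 1/24 ratio to the SPFP numbers. This is only a sketch: the SPFP figures are the ones quoted in this article, and the function name is made up for illustration.

```python
# SPFP figures in GFLOPS as quoted in this article; DPFP ratio for these cards is 1/24.
KEPLER_DP_RATIO = 1 / 24

spfp_gflops = {
    "Quadro K2000": 732.67,
    "Quadro K4000": 1244.16,
    "Quadro K5000": 2168.83,
    "Tesla K10": 4580.0,  # 4.58 TFLOPS for both GPUs combined
}

def dpfp_from_spfp(spfp, ratio=KEPLER_DP_RATIO):
    """Theoretical double-precision rate from the single-precision rate."""
    return spfp * ratio

for name, spfp in spfp_gflops.items():
    print(f"{name}: {spfp:.2f} GFLOPS SPFP -> {dpfp_from_spfp(spfp):.1f} GFLOPS DPFP")
```

The results land close to the ~30, ~52, 90 and 0.19 TFLOPS figures listed above, so those estimates are consistent with the 1/24 ratio.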


OLDER GEN
AMD - if for some reason you need Firepro instead of Radeons with VLIW5 to do SHA-256 stuff
V3900 - Turks (HD 6670 cut down) 624.00 GFLOPS
V4900 - Turks (HD 6670) 768.00 GFLOPS
V5900 - Cayman LE GL (HD 6950/HD6930 cut down) 614.40 GFLOPS , 154 GFLOPS DPFP
V7900 - Cayman Pro GL (hd6950 cut down , maybe HD6930 2GB since it has same specs?) 1,856.00 GFLOPS , 464 GFLOPs DPFP



NVIDIA
* only two screens unless SLI for Fermi
600 - GF108 , DPFP= 1/12 SPFP
2000 - GF106 , DPFP= 1/12 SPFP , 480.00 GFLOPS SPFP
4000 - GF100 , DPFP= 1/2 SPFP , 486.40 GFLOPS SPFP
5000 - GF100 , DPFP= 1/2 SPFP ,722.30 GFLOPS SPFP
6000 - GF100 , DPFP= 1/2 SPFP, 6GB GDDR5 on 384-bit bus 1,027.71 GFLOPS SPFP ... ANSYS can use this or Tesla for solver

Tesla M2050 - GF100 , 448 shaders, 1 TFLOPs, 3GB 384bit GDDR5 , 148.4 GB/s memory bandwidth http://www.techpowerup.com/gpudb/1534/.html
Shading Units: 448
TMUs: 56
ROPs: 48
SM: 14
Pixel Rate: 16.1 GPixel/s
Texture Rate: 32.2 GTexel/s
Floating-point performance: 1,030.40 GFLOPS
Tesla M2070 - GF100 , 448 shaders, 1 TFLOPs, 6GB 384 bit GDDR5, 150.3 GB/s http://www.techpowerup.com/gpudb/1535/.html
Shading Units: 448
TMUs: 56
ROPs: 48
SM: 14
Pixel Rate: 16.1 GPixel/s
Texture Rate: 32.2 GTexel/s
Floating-point performance: 1,030.40 GFLOPS
Tesla M2090 - GF100 , 512 shaders, 1.3 TFLOPs, 6GB 384-bit GDDR5, 177.6 GB/s http://www.techpowerup.com/gpudb/1537/.html
Shading Units: 512
TMUs: 56
ROPs: 48
SM: 16
Pixel Rate: 20.8 GPixel/s
Texture Rate: 36.4 GTexel/s
Floating-point performance: 1,331.20 GFLOPS
*For M2090, M2070, M2050 DPFP = 1/2 FP32
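
For reference, the theoretical SPFP number is just shaders x 2 FLOPs per clock (one fused multiply-add) x shader clock. A small sketch that works backwards from the Tesla figures above to the shader clock they imply, then applies the 1/2 DPFP ratio. The clocks are derived values, not quoted specs, and the function names are made up for illustration.

```python
FLOPS_PER_SHADER_PER_CLOCK = 2  # one fused multiply-add counted as 2 FLOPs
FERMI_TESLA_DP_RATIO = 1 / 2    # DPFP = 1/2 FP32, as noted above

# (SPFP GFLOPS, shading units) as listed above.
teslas = {
    "Tesla M2050": (1030.40, 448),
    "Tesla M2070": (1030.40, 448),
    "Tesla M2090": (1331.20, 512),
}

def implied_shader_clock_mhz(gflops, shaders):
    """Shader clock implied by: GFLOPS = shaders * 2 * clock (GHz)."""
    return gflops * 1000 / (shaders * FLOPS_PER_SHADER_PER_CLOCK)

def dpfp_gflops(gflops, ratio=FERMI_TESLA_DP_RATIO):
    return gflops * ratio

for name, (gflops, shaders) in teslas.items():
    clock = implied_shader_clock_mhz(gflops, shaders)
    print(f"{name}: ~{clock:.0f} MHz shader clock, {dpfp_gflops(gflops):.1f} GFLOPS DPFP")
```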


Possible softmod/SMD resistor hackable
Firepro W7000 ... but 4 displayports vs whatever is on HD 7870 GE , 4GB ECC vs 2GB
Firepro W9000... but 6 Displayports and ECC VRAM, Sapphire Vapor-X 6GB
V4900 ... but outputs are different than HD6670 , also for $150 it's not worth the hassle (Firepros have 3yr support)

GTX 650 Ti ... to K4000 , but 3GB VRAM instead of 2GB on the consumer one
GTX 650 2GB ... to K2000D
GTX 680 4GB to K5000 , but without ECC RAM

***
Quadro K2000 = 732.67 GFLOPS (GTX 650 = 812.54 GFLOPS)
Firepro V4900 = 768.00 GFLOPS (HD6670 = 768.00 GFLOPS)
Quadro K4000 = 1,244.16 GFLOPS (GTX 650 TI Boost = 1,505.28 GFLOPS)
Firepro W5000 = 1,267.20 GFLOPS (75% of HD 7850 shaders 1,761.28 GFLOPS = 1,321 GFLOPs)
Firepro V7900 = 1,856.00 GFLOPS (HD6930 = 1,920.00 GFLOPS)
Quadro K5000 = 2,168.83 GFLOPS (GTX 680 = 3,090.43 GFLOPS) <-- so much unused headroom, probably TDP was a problem or the ECC VRAM (W7000 has 4GB ECC though): Pitcairn has better performance per watt than anything from NVIDIA
Firepro W7000 = 2,432.00 GFLOPS (HD 7870 = 2,560.00 GFLOPS) * 4GB ECC VRAM is not available on consumer HD7870
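
To make the "unused headroom" point easier to see, here is a quick sketch that expresses each workstation card as a percentage of its consumer sibling's SPFP figure. The numbers are the ones from the list above; the W5000 is compared against the full HD 7850 figure rather than the 75% estimate.

```python
# (workstation card, its GFLOPS, consumer sibling, its GFLOPS), from the list above.
pairs = [
    ("Quadro K2000", 732.67, "GTX 650", 812.54),
    ("Firepro V4900", 768.00, "HD 6670", 768.00),
    ("Quadro K4000", 1244.16, "GTX 650 Ti Boost", 1505.28),
    ("Firepro W5000", 1267.20, "HD 7850", 1761.28),
    ("Firepro V7900", 1856.00, "HD 6930", 1920.00),
    ("Quadro K5000", 2168.83, "GTX 680", 3090.43),
    ("Firepro W7000", 2432.00, "HD 7870", 2560.00),
]

def pro_vs_consumer(pro_gflops, consumer_gflops):
    """Workstation card's theoretical SPFP as a fraction of the consumer card's."""
    return pro_gflops / consumer_gflops

for pro, pro_gf, consumer, con_gf in pairs:
    pct = 100 * pro_vs_consumer(pro_gf, con_gf)
    print(f"{pro}: {pct:.1f}% of {consumer}")
```

On these figures the K5000 sits at roughly 70% of a GTX 680, while the W7000 is at 95% of an HD 7870.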



AMD may gain in the workstation world this coming year.
Adobe CC is going to be optimized for OpenCL
Apple's Mac Pro is using Firepros.
PTC Creo 2.0 is optimized for Firepros
Solidworks has Firepro optimizations (but so does Nvidia)
Cinema4D /NX have Firepro support
Autodesk Inventor runs faster on Radeons than Geforce/Firepro/Quadro

SPECviewperf 12 is coming too: http://www.spec.org/gwpg/publish/enews-7-13-web.html , http://www.develop3d.com/blog/2013/07/siggraph-2013-1-spec-unveils-new-workstation-benchmarks

Comments (6)

"
I ran the latest Nvidia 314.22 drivers and Quadro 311.35. It seems the 314.22 drivers are a little bit better so I'm using those.
I did some benchmarking to compare the cards before and after the mods.
Benchmark                        GTX 680 #1      GTX 680 #2      K5000 #1        K5000 #2
3DMark 11                        9022            8987            9077            9016
Passmark 8 (3D Graphics Mark)    6044            6091            6025            5996
PCMark Vantage (Gaming)          19336           18956           18880           16177
PhysX                            10158-166 fps   10003-165 fps   10176-167 fps   10123-166 fps
SPECviewperf 11
  Catia-03                       6.05            5.98            5.9             10.20
  Ensight-04                     32.20           32.23           32.20           32.27
  Lightwave-01                   13.23           12.84           13.14           13.22
  Maya-03                        12.77           12.73           12.86           12.85
  Proe-05                        0.96            1.00            1.00            0.99
  Sw-02                          11.09           11.37           11.36           12.78
  Tcvis-02                       1.01            1.17            1.02            1.02
  Snx-01                         3.42            3.37            3.40            3.42
As you can see all the scores between stock and modded cards are about the same. The problem is with the SPECviewperf 11 scores. This is the benchmark for Graphic and CAD programs. This is what the Quadro cards were made for. The scores for the modded K5000 should be MUCH higher. Take a look here.
http://www.xbitlabs.com/articles/graphics/display/nvidia-quadro-k5000_4.html

It looks to me that just because the computer thinks it’s a Quadro K5000 does not mean that it will act like a K5000.
I even tried this benchmark with the Quadro drivers and got the same results. Hopefully It's just a driver issue and not a hardware issue."
http://www.eevblog.com/forum/projects/hacking-nvidia-cards-into-their-professional-counterparts/210/

sigh.
This is interesting stuff!
I don't know if it is the same with GPUs or not, but GFLOPS really isn't that accurate a measurement of performance for CPUs. I have an FX 6300 OC'd to 4.8GHz that can run just as well as my friend's stock i5-3570k in day-to-day tasks, compiling and quad-core games. Yet IBT shows my FX 6300 at only 40 GFLOPS and the i5 at stock at 100 GFLOPS; plus, when the i5 is OC'd to 4.5GHz, according to IBT it is faster than my i7-3820 at the same clock.

Again, for GPUs it may be different, but for CPUs it is not an accurate measurement of performance.
EaquitasAbsum, it's rather accurate if the drivers are tuned properly. It takes into account memory bandwidth (not bus width) and core clock.

Overclocking core leads to higher floating point performance as does overclocking memory.

At first I didn't understand why the Firepro V4900 would sometimes beat the V5900 in workstation benches (presumably those not using over 1GB VRAM), and then I saw the single-precision floating point performance: the V5900's is about 150 GFLOPS lower (614.40 vs 768.00). This may be due to the V5900's lower memory clock of 500MHz vs the V4900's 1000MHz, in addition to a lower core clock of 600MHz instead of 800MHz. It's not so much the fill rate, because the V4900 has a much lower pixel fill rate.
The more you know! Thanks for the detailed response!