[VRZ] NVIDIA's 18K+ Tesla K20 GPUs claims 'world's fastest open-science supercomputer' - Page 4

post #31 of 51
Quote:
Originally Posted by Capt View Post

Japan still has the fastest supercomputer. It's got around 700,000 cpu cores.

As soon as I read that, I thought you would own an AMD X6 or 8-core Bulldozer. I looked at your sig rig and there we have it: an AMD X6.

Power doesn't come purely from the number of cores.
    
CPU: AMD Ryzen R5 1600; Motherboard: Asus PRIME B350 PLUS; Graphics: AMD Radeon HD7950; RAM: 16GB Corsair Vengeance (2x8GB)
Hard Drive: 1TB WD Blue; Hard Drive: 500GB WD Blue; Hard Drive: 120GB Hitachi; OS: Windows 10 Pro
Monitor: LG 32LD450; Monitor: Dell; Keyboard: Ducky DK9008 OCN Edition; Power: Corsair TX650v2
Case: Fractal Design Core 3000
post #32 of 51
Quote:
Originally Posted by Capt View Post

Japan still has the fastest supercomputer. It's got around 700,000 cpu cores.
http://www.maximumpc.com/article/news/japans_k_supercomputer_still_fastest_world_top500_says

How is 10 petaflops faster than 20 petaflops?
post #33 of 51
Quote:
Originally Posted by Capt View Post

Japan still has the fastest supercomputer. It's got around 700,000 cpu cores.

http://www.maximumpc.com/article/news/japans_k_supercomputer_still_fastest_world_top500_says

That article is a year old. This new computer is twice as fast as that one, regardless of the number of cores.
Foldatron (17 items)
CPU: i7 950; Motherboard: EVGA x58 3-way SLI; Graphics: EVGA GTX 660ti; Graphics: GTX 275
RAM: 3x2GB Corsair Dominator DDR3-1600; Hard Drive: 80GB Intel X25-M SSD; Hard Drive: 2TB WD Black; Hard Drive: 150GB WD Raptor
Hard Drive: 2x 150GB WD V-raptor in RAID0; OS: Win7 Home 64-bit OEM; Monitor: 55" LED 120hz 1080p Vizio; Keyboard: MS Natural Ergonomic Keyboard 4000
Power: 750W PC P&C Silencer; Case: CoolerMaster 690
Mat (10 items)
CPU: Intel Core i5 2500S; Graphics: AMD 6770M; RAM: 8GB (2x4GB) at 1333Mhz; Hard Drive: 1TB, 7200 rpm
Optical Drive: LG 8X Dual-Layer "SuperDrive"; OS: OS X Lion; Monitor: 27" iMac screen; Keyboard: Mac wireless keyboard
Mouse: Mac wireless mouse
Work iMac (9 items)
CPU: i7-2600K; Graphics: AMD 6970M 1GB; RAM: 16GB PC3-10600 DDR3; Hard Drive: 1TB 7200rpm
Hard Drive: 256GB SSD; Optical Drive: 8x DL "SuperDrive"; OS: OS X 10.7 Lion; Monitor: 27" 2560x1440 iMac display
Monitor: 27" Apple thunderbolt display
post #34 of 51
That is um... a lot of calculations... NOW MINIATURIZE IT! I want it to fit under the palm of my hand, buried deep into the tissue... holo capabilities optional.
Sovereign (13 items)
CPU: Intel Core i5 3570K; Motherboard: ASRock Z77 Pro4; Graphics: EVGA GTX 470; RAM: Kingston 4x4GB DDR3 1600
Hard Drive: 128 GB SSD; Optical Drive: LITE-ON Black 24X DVD Writer; OS: Windows 7 PRO x64; Monitor: ASUS 1920x1080 LED 23.6"
Keyboard: Logitech USB Media Keyboard; Power: CORSAIR CMPSU-750TX 750W; Case: Antec Nine Hundred Two; Mouse: Logitech
Audio: Turtlebeach Earforce P11
post #35 of 51
They spelled Crysis wrong in the article. :P

/grammarspellingnazi
Everest - Intel (19 items)
CPU: Intel i7 4790k; Motherboard: Gigabyte Z97X Gaming 7; Graphics: MSI Geforce GTX 1080 Ti Gaming X; RAM: 16GB (2x8) Patriot Viper 1866Mhz
Hard Drive: Seagate 3TB, WD 500GB HDD, WD 640GB HD; Hard Drive: Samsung 850 EVO 512GB; Optical Drive: Samsung DVD-Burner; Cooling: Corsair H110 w/ Dual Aerocool DS 140mm fans
OS: Windows 10 Pro; Monitor: Dell S2716DG (1440p, 144hz Gsync); Monitor: AOC U3477 PQU (3440x1440 IPS); Keyboard: Logitech G810 Orion Spectrum
Power: Evga SuperNOVA 750 G2; Case: NZXT Phantom 530 Black; Mouse: Logitech G502 Proteus Core; Mouse Pad: Corsair MM400
Audio: Creative Sound Blaster E5 DAC/AMP; Audio: Sennheiser HD 598 Headphones; Audio: HyperX Cloud Headset
post #36 of 51
Quote:
Originally Posted by Capt View Post

Japan still has the fastest supercomputer. It's got around 700,000 cpu cores.
http://www.maximumpc.com/article/news/japans_k_supercomputer_still_fastest_world_top500_says

lolwut?
Quote:
Originally Posted by andrews2547 View Post

As soon as I read that I thought you would own an AMD X6 or 8 core Bulldozer, I looked at your sig rig and there we have it. An AMD X6.
Power doesn't come purely from the amount of cores.

This ^^
Quote:
Originally Posted by Frank33 View Post

how is 10 Petaflop faster than 20 Petaflop?

More of This ^^
Quote:
Originally Posted by lordikon View Post

That article is a year old. This new computer is twice as fast as that one, regardless of the number of cores.

And This ^^
Pwnz0r Radley IV (15 items)
CPU: Q9550; Motherboard: Gigabyte X38-DS4; Graphics: Asus GTX 480; RAM: 2x2gb Corsair XMS2
Hard Drive: 2 x 500GB WD Caviar Black (RAID-0); Monitor: 22" AOC 1080P; Power: Enermax Infiniti 720W
post #37 of 51
Well, that's it. You guys wanna' go grab a few drinks?
post #38 of 51
Quote:
Originally Posted by lordikon View Post

I'm glad the idea came across easily. The theory is simple, but I may have oversimplified how much work would be behind it. It's a similar architecture to how many PS3 games are written, or how distributed computing works. The SPEs have no direct access to the system memory; all they can do is work with data they are given and then pass it back. So you have the PPE (main CPU) create a queue of work, and whenever an SPE is free to do more work you send it something from the queue. The SPE will do the work and send it back to the PPE. The complicated part is timing this whole process so you can send and get back all the data you'll need in time to render that frame. Some data may not be needed by any particular time, some data might, so you now need to balance it such that you get back all the data you need on time, while the other type of work can still go on over the course of time (probably a few frames).
If you want to do something like this with a system where each core has access to system memory, then you just need to make sure to lock the queue when something is taking work from it, so you don't get two cores at once trying to grab work from the top of the pile.
Finally, once the work is done by the core, it often needs to still be processed. If you're distributing a lot of work on one set of data, then the results all need to be combined again after the pieces have been calculated. As mentioned earlier, this process is similar to distributed computing. Take something like the folding@home program that Stanford runs, they create a large queue of work, and when someone requests work they are sent the data. After the data has the calculations finished for it they are sent back to Stanford for processing. The difference between something like distributed computing and a super computer is that you aren't guaranteed to get back the results, people may not do the work, so you have to keep a copy of the work you send out so that you can re-send it out again if you don't get a result within a certain amount of time.

Err, Jaguar/Titan is a big cluster of networked compute nodes; it really isn't like PS3 game programming.

There is a job queuing system, but it's something different from what you describe. It's a user/time management system, not a process-level scheduler.

Really, all it is is a whole bunch of basic computers (CPU, mem, GPUs) that are networked together. You write parallel code using some communication API (e.g. MPI), but beyond that you're just spawning the same process across however many cores you need. GPUs get used like they get used in normal computers (although some crazy people are trying to get node communication happening on the GPU itself).

It's AMD Bulldozers with eight (?) cores per proc and I think 2 procs per node, so there's some intra-node parallelism similar to PS3 cores, but that's not the thrust of Jaguar/Titan - or any compute cluster.
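
For anyone who hasn't seen the "same process on every core" style, here is a rough toy sketch in Python. It is not MPI and not how Titan is actually programmed: threads stand in for the separate processes, and `worker`, `run_spmd`, `rank` and `size` are made-up names for illustration. In real MPI code you would get the rank from something like MPI_Comm_rank and combine results with something like MPI_Reduce.

```python
from concurrent.futures import ThreadPoolExecutor


def worker(rank, size, n):
    """Every copy runs this same function; only `rank` differs."""
    # Each rank takes a strided share of 0..n-1 and sums the squares.
    return sum(i * i for i in range(rank, n, size))


def run_spmd(size, n):
    # Stand-in for launching `size` copies of one program (threads here,
    # separate processes on separate nodes in a real cluster).
    with ThreadPoolExecutor(max_workers=size) as pool:
        partials = list(pool.map(lambda r: worker(r, size, n), range(size)))
    # Stand-in for a reduce step: combine the partial results.
    return sum(partials)


total = run_spmd(size=4, n=1000)
```

The point is only that nobody hands out work items one at a time: each copy works out its own share from its rank, and the results are combined at the end.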
Callisto (13 items)
CPU: 2500K 4.7ghz @ 1.37v; Motherboard: MSI P67A-GD65; Graphics: EVGA GTX 680; RAM: Mushkin Enhanced Redline 8GB (2 x 4GB) DDR3 1866
Hard Drive: Intel SSD 40GB; Seagate Cavier Green 1TB; OS: Scientific Linux 6.3/Windoze7; Monitor: 22" Samsung SyncMaster 2232BW; Power: Corsair TX850
post #39 of 51
Quote:
Originally Posted by dharmaBum View Post

Quote:
Originally Posted by lordikon View Post

I'm glad the idea came across easily. The theory is simple, but I may have oversimplified how much work would be behind it. It's a similar architecture to how many PS3 games are written, or how distributed computing works. The SPEs have no direct access to the system memory; all they can do is work with data they are given and then pass it back. So you have the PPE (main CPU) create a queue of work, and whenever an SPE is free to do more work you send it something from the queue. The SPE will do the work and send it back to the PPE. The complicated part is timing this whole process so you can send and get back all the data you'll need in time to render that frame. Some data may not be needed by any particular time, some data might, so you now need to balance it such that you get back all the data you need on time, while the other type of work can still go on over the course of time (probably a few frames).
If you want to do something like this with a system where each core has access to system memory, then you just need to make sure to lock the queue when something is taking work from it, so you don't get two cores at once trying to grab work from the top of the pile.
Finally, once the work is done by the core, it often needs to still be processed. If you're distributing a lot of work on one set of data, then the results all need to be combined again after the pieces have been calculated. As mentioned earlier, this process is similar to distributed computing. Take something like the folding@home program that Stanford runs, they create a large queue of work, and when someone requests work they are sent the data. After the data has the calculations finished for it they are sent back to Stanford for processing. The difference between something like distributed computing and a super computer is that you aren't guaranteed to get back the results, people may not do the work, so you have to keep a copy of the work you send out so that you can re-send it out again if you don't get a result within a certain amount of time.

Err, Jaguar/Titan is a big cluster of networked compute nodes; it really isn't like PS3 game programming.

There is a job queuing system, but it's something different from what you describe. It's a user/time management system, not a process-level scheduler.

Really, all it is is a whole bunch of basic computers (CPU, mem, GPUs) that are networked together. You write parallel code using some communication API (e.g. MPI), but beyond that you're just spawning the same process across however many cores you need. GPUs get used like they get used in normal computers (although some crazy people are trying to get node communication happening on the GPU itself).

It's AMD Bulldozers with eight (?) cores per proc and I think 2 procs per node, so there's some intra-node parallelism similar to PS3 cores, but that's not the thrust of Jaguar/Titan - or any compute cluster.

Yea, I guess I should've mentioned that the PS3 description is more of an abstraction. There's no way all of that hardware would run on a single system, of course. The abstraction being the PS3 SPE would be like a node in the cluster. I was trying to use an example of how these kinds of techniques are used in more common scenarios, like in a PS3. Different hardware, different architectures, similar concept of creating work, distributing it, getting it back, processing it, etc.
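
To make the "create work, distribute it, get it back, process it" idea concrete, here is a minimal shared-memory sketch in Python. It is only an illustration under my own assumptions: `run_work_queue` is a made-up name, squaring a number is a placeholder for the real work, and threads stand in for cores. The part that matters is the lock around the queue, so two workers can't grab the same item from the top of the pile.

```python
import threading
from collections import deque


def run_work_queue(items, num_workers=4):
    queue = deque(items)            # the pile of pending work
    queue_lock = threading.Lock()   # guards taking work off the pile
    results = []
    results_lock = threading.Lock()

    def worker():
        while True:
            # Lock the queue while taking work from it.
            with queue_lock:
                if not queue:
                    return
                item = queue.popleft()
            value = item * item     # placeholder for the real work
            with results_lock:
                results.append(value)

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # Combine the pieces once every worker has finished.
    return sum(results)
```

Calling `run_work_queue(range(10))` hands out ten items to four workers and adds up the results at the end. A distributed version would also need to keep a copy of each item and re-send it if no result comes back in time, which this sketch leaves out.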
post #40 of 51
Quote:
Originally Posted by andrews2547 View Post

As soon as I read that I thought you would own an AMD X6 or 8 core Bulldozer, I looked at your sig rig and there we have it. An AMD X6
Power doesn't come purely from the amount of cores.

From the source article.
Quote:
Titan is a Cray XK7 system containing 18,688 nodes, each of which holds an Nvidia Tesla K20 GPU and a 16-core AMD Opteron 6274 processor.
Daily Driver (21 items)
CPU: Phenom II X6 1090T; Motherboard: Gigabyte GA-890FXA-UD5; Graphics: XFX HD5670 1GB; Graphics: Gigabyte HD6570 1GB DDR3
RAM: Microcenter Value RAM; Hard Drive: OCZ Vertex 2; Hard Drive: Maxtor STM3200820AS; Hard Drive: Seagate ST316002 3AS
Optical Drive: Asus DRW-24B1ST; Cooling: Thermalright Venomous X Black; Cooling: Sanyo Denki - San Ace 9SG1212P1G01 120mm x 38mm...; OS: Win 7 Ultimate x64
Monitor: Dell U2311H; Monitor: Dell E193FP; Keyboard: Dell OEM keyboard; Power: Corsair TX850V2
Case: CM 690 II Advanced; Mouse: Razer Death Adder Black Edition 3.5G; Mouse Pad: Steelseries QKC; Audio: On board Realtek HD audio
Audio: Dell OEM 5.1 speaker system