[En.Expreview] NVIDIA Kepler GK100 and GK104 Specifications Spotted - Page 6

post #51 of 158
Quote:
Originally Posted by BallaTheFeared
Too good to be true, these cards would utterly destroy everything AMD has out.

1500 cores is something we haven't heard before; 1000 cores "slightly faster than a 580" at 1 GHz sounds like they put Bulldozer cores instead of Fermi cores in it.


 

The cores are going to be clocked at GPU speed now (should be just above 1 GHz according to reports), so it's really no big surprise that they are using a lot more cores now. 1500 sounds good to me!
post #52 of 158

Don't read anything into this OP, this makes way more sense:
Quote:

NVIDIA's Kepler GPU specifications have been revealed over at 3DCenter, detailing the upcoming 28nm GK-104 and GK-100 chip based graphics cards.

According to the released info, Nvidia's next-gen flagship GK-100/GK-112 chip will feature a total of 1024 shaders (CUDA cores), 128 texture units (TMUs), 64 ROPs and a 512-bit GDDR5 memory interface. The 28nm next-gen beast would outperform the current dual-chip GeForce GTX 590 GPU.

Release is expected in Q2 2012.

Next would be the GK-104 core, which would replace the current performance-segment cards such as the GTX 560/GTX 560 Ti. It would feature 640-768 shader units (CUDA cores) and 80-96 texture units (TMUs); the memory interface would most likely remain GDDR5 at 256- to 384-bit. Performance is expected to be a bit better than the current fastest GeForce GTX 580 GPU.

Its expected time of arrival would be 2H 2012. The GK-104 chip would support DirectX 11.1 and be compatible with PCIe Gen 3.

Fudzilla has also reported that Nvidia already has Kepler samples in hand, and yields are reportedly better than those of the 40nm-based cards.

http://wccftech.com/nvidia-kepler-gk104-gk100-specifications-detailed-gk100-rumored-launch-q2-2012/

 

I believe the GK-104 cards will compete directly with the 7970 while being priced in the $200-$300 range.

 

640-768 CUDA cores on a 28nm fab, with clocks over 1 GHz (overclocking), will easily go toe to toe with the 7970; the higher-end part (à la 768) will probably beat it.


Edited by BallaTheFeared - 12/28/11 at 11:16am
    
post #53 of 158
Quote:
Originally Posted by Nowyn

I don't get what you're trying to say with that, because in my quote I don't really disagree with most of what you said.
Though you should get some facts straight.
1) G80 and GT200 were SIMD, while Fermi is MIMD, meaning it is better suited for complex calculations, being able to use different functions in the same cycle.
2) Nvidia has had ECC memory support since the first Fermi.
3) On the compute side of things, Nvidia is years ahead in adoption of CUDA, which is being taught at some universities, has support for major IDEs like Visual Studio, and offers features like native C++ support as well as a bunch of other languages.
AMD took a step in the right direction; now it's up to them to provide tools and features, as well as make using their GPUs for computing attractive. DirectCompute (which is not really hardcore GPGPU-oriented) and OpenCL sound good on paper, but they are not at the point where they can match CUDA's capabilities, there's not much movement with only AMD behind it, and we all know that AMD doesn't work as closely with devs as they should.

1. There are many different kinds of SIMD architecture. SIMD is a very overarching term for one of four fundamental types of processing styles (in fact, VLIW is just a specialist subset of SIMD. Edit: I somewhat misspoke here; the VLIW architecture was a MIMD of SIMD with VLIW ALUs, so VLIW is SIMD for our purposes). The kind implemented by GCN is much different from G80, G200, AMD x86 SIMD, Intel SIMD, etc. There's little practical difference between modern MIMD and SIMD (modern SIMD isn't technically SIMD) when working with a GPU. That said, how many times is having MIMD actually better than SIMD in the kinds of workloads suited for GPUs? There are a few, but they aren't the majority, so is sacrificing performance for MIMD worth the cost?

Here's a blog entry on the topic.
Quote:
I'd guess everyone has heard something of the large, public, flame war that erupted between Intel and Nvidia about whose product is or will be superior: Intel Larrabee, or Nvidia's CUDA platforms. There have been many detailed analyses posted about details of these, such as who has (or will have) how many FLOPS, how much bandwidth per cycle, and how many nanoseconds latency when everything lines up right. Of course, all this is “peak values,” which still means “the values the vendor guarantees you cannot exceed” (Jack Dongarra’s definition), and one can argue forever about how much programming genius or which miraculous compiler is needed to get what fraction of those values.

Such discussion, it seems to me, ignores the elephant in the room. I think a key point, if not the key point, is that this is an issue of MIMD (Intel Larrabee) vs. SIMD (Nvidia CUDA).

If you question this, please see the update at the end of this post. Yes, Nvidia is SIMD, not SPMD.

I’d like to point to a Wikipedia article on those terms, from Flynn’s taxonomy, but their article on SIMD has been corrupted by Intel and others’ redefinition of SIMD to “vector.” I mean the original. So this post becomes much longer.

MIMD (Multiple Instruction, Multiple Data) refers to a parallel computer that runs an independent separate program – that’s the “multiple instruction” part – on each of its simultaneously-executing parallel units. SMPs and clusters are MIMD systems. You have multiple, independent programs barging along, doing things that may have nothing to do with each other, or may not. When they are related, they barge into each other at least occasionally, hopefully as intended by the programmer, to exchange data or to synchronize their otherwise totally separate operation. Quite regularly the barging is unintended, leading to a wide variety of insidious data- and time-dependent bugs.

SIMD (Single Instruction, Multiple Data) refers to a parallel computer that runs the EXACT SAME program – that’s the “single instruction” part – on each of its simultaneously-executing parallel units. When ILLIAC IV, the original 1960s canonical SIMD system, basically a rectangular array of ALUs, was originally explained to me late in grad school (I think possibly by Bob Metcalfe) it was put this way:

Some guy sits in the middle of the room, shouts ADD!, and everybody adds.

I was a programming language hacker at the time (LISP derivatives), and I was horrified. How could anybody conceivably use such a thing? Well, first, it helps that when you say ADD! you really say something like “ADD Register 3 to Register 4 and put the result in Register 5,” and everybody has their own set of registers. That at least lets everybody have a different answer, which helps. Then you have to bend your head so all the world is linear algebra: Add matrix 1 to matrix 2, with each matrix element in a different parallel unit. Aha. Makes sense. For that. I guess. (Later I wrote about 150 KLOC of APL, which bent my head adequately.)

Unfortunately, the pure version doesn’t make quite enough sense, so Burroughs, Cray, and others developed a close relative called vector processing: You have a couple of lists of values, and say ADD ALL THOSE, producing another list whose elements are the pair wise sums of the originals. The lists can be in memory, but dedicated registers (“vector registers”) are more common. Rather than pure parallel execution, vectors lend themselves to pipelining of the operations done. That doesn’t do it all in the same amount of time – longer vectors take longer – but it’s a lot more parsimonious of hardware. Vectors also provide a lot more programming flexibility, since rows, columns, diagonals, and other structures can all be vectors. However, you still spend a lot of thought lining up all those operations so you can do them in large batches. Notice, however, that it’s a lot harder (but not impossible) for one parallel unit (or pipelined unit) to unintentionally barge into another’s business. SIMD and vector, when you can use them, are a whole lot easier to debug than MIMD because SIMD simply can’t exhibit a whole range of behaviors (bugs) possible with MIMD.

Intel’s SSE and variants, as well as AMD and IBM’s equivalents, are vector operations. But the marketers apparently decided “SIMD” was a cooler name, so this is what is now often called SIMD.

Bah, humbug. This exercises one of my pet peeves: Polluting the language for perceived gain, or just from ignorance, by needlessly redefining words. It damages our ability to communicate, causing people to have arguments about nothing.

Anyway, ILLIAC IV, the CM-1 Connection Machine (which, bizarrely, worked on lists – elements distributed among the units), and a variety of image processing and hard-wired graphics processors have been rather pure SIMD. Clearspeed’s accelerator products for HPC are a current example.

Graphics, by the way, is flat-out crazy mad for linear algebra. Graphics multiplies matrices multiple times for each endpoint of each of thousands or millions of triangles; then, in rasterizing, for each scanline across each triangle it interpolates a texture or color value, with additional illumination calculations involving normals to the approximated surface, doing the same operations for each pixel. There’s an utterly astonishing amount of repetitive arithmetic going on.

Now that we’ve got SIMD and MIMD terms defined, let’s get back to Larrabee and CUDA, or, strictly speaking, the Larrabee architecture and CUDA. (I’m strictly speaking in a state of sin when I say “Larrabee or CUDA,” since one’s an implementation and the other’s an architecture. What the heck, I’ll do penance later.)

Larrabee is a traditional cache-coherent SMP, programmed as a shared-memory MIMD system. Each independent processor does have its own vector unit (SSE stuff), but all 8, 16, 24, 32, or however many cores it has are independent executors of programs. As are each of the threads in those cores. You program it like MIMD, working in each program to batch together operations for each program’s vector (SIMD) unit.

CUDA, on the other hand, is basically SIMD at its top level: You issue an instruction, and many units execute that same instruction. There is an ability to partition those units into separate collections, each of which runs its own instruction stream, but there aren’t a lot of those (4, 8, or so). Nvidia calls that SIMT, where the “T” stands for “thread” and I refuse to look up the rest because this has a perfectly good term already existing: MSIMD, for Multiple SIMD. (See pet peeve above.) The instructions it can do are organized around a graphics pipeline, which adds its own set of issues that I won’t get into here.
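
[Not part of the blog post: as a rough illustration of that top-level SIMD/SIMT model, here is a minimal CUDA sketch. Every thread in the launch executes the same add instruction, but each on its own registers and its own array element, which is the "shout ADD! and everybody adds" idea scaled up to thousands of lanes.]

Code:
#include <stdio.h>
#include <stdlib.h>
#include <cuda_runtime.h>

// One instruction stream, broadcast to many lanes: every thread runs this same
// statement, but each on its own registers and its own element of the arrays.
__global__ void vecAdd(const float *a, const float *b, float *c, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;   // each thread's own index
    if (i < n)
        c[i] = a[i] + b[i];                          // "ADD!" -- and everybody adds
}

int main(void)
{
    const int n = 1 << 20;
    size_t bytes = n * sizeof(float);

    float *ha = (float *)malloc(bytes), *hb = (float *)malloc(bytes), *hc = (float *)malloc(bytes);
    for (int i = 0; i < n; ++i) { ha[i] = 1.0f; hb[i] = 2.0f; }

    float *da, *db, *dc;
    cudaMalloc((void **)&da, bytes);
    cudaMalloc((void **)&db, bytes);
    cudaMalloc((void **)&dc, bytes);
    cudaMemcpy(da, ha, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(db, hb, bytes, cudaMemcpyHostToDevice);

    // Thousands of parallel lanes fed by a single kernel (one instruction stream).
    vecAdd<<<(n + 255) / 256, 256>>>(da, db, dc, n);
    cudaMemcpy(hc, dc, bytes, cudaMemcpyDeviceToHost);

    printf("c[0] = %f\n", hc[0]);                    // expect 3.0
    cudaFree(da); cudaFree(db); cudaFree(dc);
    free(ha); free(hb); free(hc);
    return 0;
}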

Which is better? Here are basic arguments:

For a given technology, SIMD always has the advantage in raw peak operations per second. After all, it mainly consists of as many adders, floating-point units, shaders, or what have you, as you can pack into a given area. There’s little other overhead. All the instruction fetching, decoding, sequencing, etc., are done once, and shouted out, um, I mean broadcast. The silicon is mainly used for function, the business end of what you want to do. If Nvidia doesn’t have gobs of peak performance over Larrabee, they’re doing something really wrong. Engineers who have never programmed don’t understand why SIMD isn’t absolutely the cat’s pajamas.

On the other hand, there’s the problem of batching all those operations. If you really have only one ADD to do, on just two values, and you really have to do it before you do a batch (like, it’s testing for whether you should do the whole batch), then you’re slowed to the speed of one single unit. This is not good. Average speeds get really screwed up when you average with a zero. Also not good is the basic need to batch everything. My own experience in writing a ton of APL, a language where everything is a vector or matrix, is that a whole lot of APL code is written that is basically serial: One thing is done at a time.
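
[Also not from the blog: a rough CUDA sketch of that batching problem. Lanes in a warp share one instruction stream, so when only one lane has work to do the rest of its warp idles, and that stretch runs at the speed of a single unit.]

Code:
#include <cuda_runtime.h>

// Hypothetical kernel illustrating the batching problem: while only thread 0
// runs the serial prelude, the other lanes of its warp sit idle, so the wide
// SIMD hardware buys nothing until the batched part begins.
__global__ void serialThenBatch(float *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;

    if (threadIdx.x == 0) {
        // Serial part: e.g. a test that decides whether the batch is worth doing.
        float acc = 0.0f;
        for (int k = 0; k < 1024; ++k)
            acc += data[blockIdx.x];                 // one lane working, the rest waiting
        data[blockIdx.x] = acc;
    }
    __syncthreads();                                 // everyone rejoins here

    if (i < n)
        data[i] *= 2.0f;                             // the batched, fully parallel part
}

int main(void)
{
    const int n = 1 << 20;
    float *d;
    cudaMalloc((void **)&d, n * sizeof(float));
    cudaMemset(d, 0, n * sizeof(float));
    serialThenBatch<<<(n + 255) / 256, 256>>>(d, n);
    cudaDeviceSynchronize();
    cudaFree(d);
    return 0;
}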

So Larrabee should have a big advantage in flexibility, and also familiarity. You can write code for it just like SMP code, in C++ or whatever your favorite language is. You are potentially subject to a pile of nasty bugs that aren’t there, but if you stick to religiously using only parallel primitives pre-programmed by some genius chained in the basement, you’ll be OK.

[Here’s some free advice. Do not ever even program a simple lock for yourself. You’ll regret it. Case in point: A friend of mine is CTO of an Austin company that writes multithreaded parallel device drivers. He’s told me that they regularly hire people who are really good, highly experienced programmers, only to let them go because they can’t handle that kind of work. Granted, device drivers are probably a worst-case scenario among worst cases, but nevertheless this shows that doing it right takes a very special skill set. That’s why they can bill about $190 per hour.]

But what about the experience with these architectures in HPC? We should be able to say something useful about that, since MIMD vs. SIMD has been a topic forever in HPC, where forever really means back to ILLIAC days in the late 60s.

It seems to me that the way Intel's headed corresponds to how that topic finally shook out: A MIMD system with, effectively, vectors. This is reminiscent of the original, much beloved, Cray SMPs. (Well, probably except for Cray’s even more beloved memory bandwidth.) So by the lesson of history, Larrabee wins.

However, that history played out over a time when Moore’s Law was producing a 45% CAGR in performance. So if you start from basically serial code, which is the usual place, you just wait. It will go faster than the current best SIMD/vector/offload/whatever thingy in a short time and all you have to do is sit there on your dumb butt. Under those circumstances, the very large peak advantage of SIMD just dissipates, and doing the work to exploit it simply isn’t worth the effort.

Yo ho. Excuse me. We’re not in that world any more. Clock rates aren’t improving like that any more; they’re virtually flat. But density improvement is still going strong, so those SIMD guys can keep packing more and more units onto chips.

Ha, right back at ‘cha: MIMD can pack more of their stuff onto chips, too, using the same density. But… It’s not sit on your butt time any more. Making 100s of processors scale up performance is not like making 4 or 8 or even 64 scale up. Providing the same old SMP model can be done, but will be expensive and add ever-increasing overhead, so it won’t be done. Things will trend towards the same kinds of synch done in SIMD.

Furthermore, I've seen game developer interviews where they strongly state that Larrabee is not what they want; they like GPUs. They said the same when IBM had a meeting telling them about Cell, but then they just wanted higher clock rates; presumably everybody's beyond that now.

Pure graphics processing isn’t the end point of all of this, though. For game physics, well, maybe my head just isn't built for SIMD; I don't understand how it can possibly work well. But that may just be me.

If either doesn't win in that game market, the volumes won't exist, and how well it does elsewhere won't matter very much. I'm not at all certain Intel's market position matters; see Itanium. And, of course, execution matters. There Intel at least has a (potential?) process advantage.

I doubt Intel gives two hoots about this issue, since a major part of their motivation is to ensure that the x86 architecture rules the world everywhere.

But, on the gripping hand, does this all really matter in the long run? Can Nvidia survive as an independent graphics and HPC vendor? More density inevitably will lead to really significant graphics hardware integrated onto silicon with the processors, so it will be “free,” in the same sense that Microsoft made Internet Explorer free, which killed Netscape. AMD sucked up ATI for exactly this reason. Intel has decided to build the expertise in house, instead, hoping to rise above their prior less-than-stellar graphics heritage.

My take for now is that CUDA will at least not get wiped out by Larrabee for the foreseeable future, just because Intel no longer has Moore’s 45% CAGR on its side. Whether Nvidia will survive as a company depends on many things not relevant here, and on how soon embedded graphics becomes “good enough” for nearly everybody, and "good enough" for HPC.


Update 8/24/09.

There was some discussion on Reddit of this post; it seems to have aged off now – Reddit search doesn’t find it. I thought I'd comment on it anyway, since this is still one of the most-referenced posts I've made, even after all this time.

Part of what was said there was that I was off-base: Nvidia wasn’t SIMD, it was SPMD (separate instructions for each core). Unfortunately, some of the folks there appear to have been confused by Nvidia-speak. But you don’t have to take my word for that. See this excellent tutorial from SIGGRAPH 09 by Kayvon Fatahalian of Stanford. On pp. 49-53, he explains that in “generic-speak” (as opposed to “Nvidia-speak”) the Nvidia GeForce GTX 285 does have 30 independent MIMD cores, but each of those cores is effectively 1024-way SIMD: it works with groups of 32 “fragments” running as I described above, multiplied by 32 contexts also sharing the same instruction stream for memory stall overlap. So, to get performance you have to think SIMD out to 1024 if you want to get the parallel performance that is theoretically possible. Yes, then you have to use MIMD (SPMD) 30-way on top of that, but if you don’t have a lot of SIMD you just won’t exploit the hardware. (source)
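
(Doing the arithmetic on those figures, as my own back-of-envelope rather than Fatahalian's: 30 cores x 32 fragments x 32 contexts = 30,720 independent work items in flight to actually fill a GTX 285, which is why you have to "think SIMD out to 1024" per core before the 30-way MIMD on top even matters.)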

2. I thought that Nvidia ECC support still wasn't "complete". I stated this without verifying, so I might be incorrect on this point.

3. CUDA works on Nvidia hardware only. Most mobile GPU makers are starting to support OpenCL (for example, all PowerVR 5xx units have hardware support for OpenCL; PowerVR is still working on the firmware, and then it's ready for prime time). CUDA is popular because Fermi was that much better than AMD at compute, so that's what was used. A newer version of OpenCL is in the works to improve some of the rough areas in 1.1, so even the API advantages will become non-issues (they are already non-issues if supporting more than Nvidia products is a priority). Visual Studio does have OpenCL integration abilities. AMD has claimed that they are going to hire a large number of people to fix the developer-relations problem. Lastly, why would anyone programming a GPU for HPC want to program in C++? The performance hit isn't worth it.
Edited by hajile - 12/28/11 at 12:37pm
post #54 of 158
Meh, nothing but weak speculation from known weak sources. 1500 CUDA cores, really? It doesn't take a rocket scientist to realize they won't be making a chip with triple the transistors of GF100. At best they could aim for 2x, i.e. 1024 CUDA cores, and even that would be a huge chip. Of course the 1500 number was due to dropping the hot clocks, so 1500 CUDA cores at half the old clocks would equal 750 old CCs? Fun speculation, but rather pointless, and it has no basis in reality. I've yet to see a single Kepler rumor that was believable, considering how tight-lipped Nvidia has been. At this point I'd be satisfied with some more concrete info on when Nvidia is going to release their high-end cards. The current sources are far from concrete, I'd say.

My guess is they'll release in Q2 and end up at best 20% faster than the 7970 as per usual, and the 670/770 will compete head to head against the 7970. That has been the gist of it the past few gens. This time, though, the prices seem to be staying quite high.
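
For a rough sanity check of the hot-clock arithmetic above, here is a back-of-envelope peak-throughput comparison. The GTX 580 figures are its real specs (512 cores at a ~1544 MHz shader clock); the Kepler line is pure assumption taken from the rumor (1500 cores at a ~1.0 GHz chip clock), and both assume the usual 2 FLOPs per core per clock from an FMA.

Code:
#include <stdio.h>

int main(void)
{
    /* GTX 580 (GF110): 512 CUDA cores hot-clocked at ~1544 MHz */
    double gtx580 = 512 * 1.544 * 2.0;      /* ~1581 GFLOPS peak */

    /* Rumored GK100: 1500 cores at chip clock, assumed ~1.0 GHz, no hot clock */
    double gk100  = 1500 * 1.0 * 2.0;       /* ~3000 GFLOPS peak */

    printf("GTX 580 peak: %.0f GFLOPS\n", gtx580);
    printf("GK100 rumor : %.0f GFLOPS (%.1fx GTX 580)\n", gk100, gk100 / gtx580);

    /* If the new chip clock instead ends up near half the old ~1544 MHz shader
       clock, 1500 cores are worth about the same raw throughput as ~750
       hot-clocked Fermi cores, which is the equivalence pointed out above. */
    return 0;
}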
post #55 of 158
Quote:
Originally Posted by Pantsu

Meh, nothing but weak speculation from known weak sources. 1500 Cuda cores, really? It doesn't take a rocket scientist to realize they won't be making a chip with triple the transistors of GF100. At best they could aim for 2x i.e. 1024 Cuda cores, and even that would be a huge chip. Of course the 1500 number was due to dropping the hot clocks, so 1500 cuda cores at half the old clocks would equal 750 old CCs? Fun speculation, but rather pointless and has no basis in reality. I've yet to see a single Kepler rumor that was believable considering how tight-lipped Nvidia has been. At this point I'd be satisfied with some more concrete info on when Nvidia is going to release their high end cards. The current sources are far from concrete I'd say. rolleyes.gif
My guess is they'll release Q2 and end up at best 20% faster than 7970 as per usual, and 670/770 will compete head to head against 7970. That has been the gist of it the past few gens. This time though the prices seem to stay quite high.

My guess is that if these rumours are accurate, the dual GK100 might have ~1500 CUDA cores. Aside from the number of CUDA cores, I'm more interested in the other fixed-function hardware inside the chip (TMUs, ROPs, tessellation units, etc.). CUDA cores don't say a whole lot about gaming performance.
post #56 of 158
Quote:
Originally Posted by Pantsu

Meh, nothing but weak speculation from known weak sources. 1500 Cuda cores, really? It doesn't take a rocket scientist to realize they won't be making a chip with triple the transistors of GF100. At best they could aim for 2x i.e. 1024 Cuda cores, and even that would be a huge chip. Of course the 1500 number was due to dropping the hot clocks, so 1500 cuda cores at half the old clocks would equal 750 old CCs? Fun speculation, but rather pointless and has no basis in reality. I've yet to see a single Kepler rumor that was believable considering how tight-lipped Nvidia has been. At this point I'd be satisfied with some more concrete info on when Nvidia is going to release their high end cards. The current sources are far from concrete I'd say. rolleyes.gif
My guess is they'll release Q2 and end up at best 20% faster than 7970 as per usual, and 670/770 will compete head to head against 7970. That has been the gist of it the past few gens. This time though the prices seem to stay quite high.

Who said anything about triple the transistors? They said triple the CUDA cores, which could be highly simplified versions of what they currently produce (it being a new architecture).

You also mention dropping 'hot clocks' - Nvidia's cores already ran MUCH faster than AMD's, running at double the core speed.

If anything, they may disassociate the core from the 'CUDA cores' a bit further - just like AMD does - and have a more simplified, many-CUDA-core architecture backed up by a more efficient back end.

This way they could produce a 1500 or even 2000 CUDA core chip running at 1.2 GHz+ no problem. Sure, it would be a hot, big beast for 28nm, but isn't that what NV does best?
Edited by Dublin_Gunner - 12/28/11 at 3:35pm
post #57 of 158
Quote:
Originally Posted by Denim-187

Quote:
GK100
Kepler architecture with completely changed shader units (no more Hotclocks, ie Shader Clock = Chip clock)
estimated ~ 1500 (1D) shader units
estimated 512-bit DDR memory interface (up to GDDR5)
Performance: still too uncertain for a forecast, but certainly about half the level of 590 GeForce GTX
Launch: second quarter of 2012
This better be a misconception...

That's what I was thinkin'. O.o
     
post #58 of 158
1K CUDA cores is just not possible.
post #59 of 158
Even with a die shrink, 1024 CUDA cores is not physically possible on that die size. Hell, AMD nearly doubled their die size, yet they only added 33% more stream processors.

This is weak speculation at best, and an outright lie at worst. Don't hold your breath guys. I doubt Nvidia will pull a rabbit out of their hat this time.
post #60 of 158
Quote:
Originally Posted by Imglidinhere

That's what I was thinkin'. O.o

überhalb literally means above in German. So they imply it's going to be faster than a GTX 590. Simple translation error.