Originally Posted by hajile
There's an interesting idea here (talking theoretically, as I don't think it's the case). Apple could buy the entire IP of MIPS for a couple of hundred million, less than their investments in buying Intrinsity and PA Semi. The basics behind ARM and MIPS are fairly similar (with most of the architecture differences favoring MIPS). There wouldn't be too much retraining required (in fact, with the ARM 64-bit rip-off of MIPS, there are even fewer differences). ARM becoming popular means increased licensing costs (already seen in the difference in cost between similarly performing MIPS and ARM chips). Buying a complete ISA cuts these costs forever. China's MIPS cores offer a buy-in to the performance (as previously mentioned). There's little or no security risk to Apple because they would have the design “source code”.
Emulation isn't exactly the right term (if I understand the design). A more accurate description would be that (like Intel and AMD) Loongson decodes x86 instructions into micro-ops, and those micro-ops are MIPS opcodes. The performance hit is similar to the speedup that could be achieved on Intel or AMD chips if one could bypass the decoder and code the micro-ops directly. The big question is patents. While my understanding is that ISAs can't be patented (just as APIs can't), the fundamental methods necessary for efficient (i.e. competitive) implementations can be and are patented. The fundamental x86 architecture (like the fundamental ARM and MIPS architectures) is no longer patent protected (the relevant patents have expired). Only new instructions (with patented implementations), barriers to entry (R&D costs), and trade secrets stand in the way. It could be possible to use less efficient methods that bypass these patents and still ensure basic x86 execution.
It can't be stressed enough that a 40W, 1GHz, 65nm, 128 GFLOPS Loongson beats a 77W, 3.9GHz, 22nm, 50 GFLOPS 3770K Ivy Bridge (the Loongson has about 2.5x the floating-point throughput). Even 70% of this performance while emulating x86 still beats the 3770K by 80%. A 3960X achieves 65 GFLOPS while using 130W of power. If we assume that Haswell doubles this performance while reducing power consumption by 75%, we then have equivalent performance and power consumption on x86, but with billions more transistors, a 4x faster clock speed (we know who has the best float IPC here), and 3 node jumps. It's little wonder that China could make a petaflop computer which used only 1/3 the power of similarly powerful machines (all x86) over here. I don't think there's ever been a significant length of time when x86 has held the GFLOPS record (that's almost always been SPARC or POWER since MIPS and Alpha left the market).
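For anyone who wants to check the arithmetic in the quoted post, here is a rough Python sketch. It takes the GFLOPS and wattage figures exactly as quoted (none of them are independently verified here), and the "Haswell" entry is purely the post's what-if: double the 3960X's throughput at a quarter of its power. The names `chips`, `gflops_per_watt`, and `hypothetical_haswell` are made up for illustration.

```python
# Figures as quoted in the post above; taken at face value, not verified.
chips = {
    "Loongson": {"gflops": 128.0, "watts": 40.0},
    "3770K": {"gflops": 50.0, "watts": 77.0},
    "3960X": {"gflops": 65.0, "watts": 130.0},
}


def gflops_per_watt(chip):
    """Throughput per watt for one entry of `chips`."""
    return chip["gflops"] / chip["watts"]


def hypothetical_haswell(base):
    """The post's what-if: double the throughput, cut power by 75%."""
    return {"gflops": base["gflops"] * 2.0, "watts": base["watts"] * 0.25}


loongson = chips["Loongson"]
ivy = chips["3770K"]

# Native Loongson throughput relative to the 3770K.
native_ratio = loongson["gflops"] / ivy["gflops"]

# Same comparison at 70% of native throughput (the x86 translation case).
emulated_gflops = 0.70 * loongson["gflops"]
emulated_ratio = emulated_gflops / ivy["gflops"]

# Assumed Haswell figures derived from the 3960X numbers.
haswell = hypothetical_haswell(chips["3960X"])

print(f"Loongson vs 3770K (native):   {native_ratio:.2f}x")
print(f"Loongson vs 3770K (70% case): {emulated_ratio:.2f}x")
for name, chip in chips.items():
    print(f"{name}: {gflops_per_watt(chip):.2f} GFLOPS/W")
print(f"Hypothetical Haswell: {haswell['gflops']:.0f} GFLOPS at "
      f"{haswell['watts']:.1f}W, {gflops_per_watt(haswell):.2f} GFLOPS/W")
```

On these numbers the what-if Haswell lands at 130 GFLOPS and 32.5W, so roughly level with the Loongson on throughput and somewhat ahead on GFLOPS per watt, which is about what the post means by "equivalent".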
A 2500K gets 130 GFLOPS in IBT, in raw unoptimized GFLOPS, at 5GHz using the AVX extensions; however, that says nothing about the real performance of these chips. The 3570K (essentially a 2500K) uses 77W, but that includes the integrated GPU as well, so the real power consumption is closer to 45W. Imagine if you could program directly to the hardware in assembly, bypassing the instruction set.
I'll wager that Intel's chips will hold on to their theoretical maximum in real workloads. As I understand it, MIPS is a RISC-style CPU, so comparing it to a primarily CISC x86 instruction set is fruitless. At the very core, Intel's CISC instructions get decoded into RISC-like micro-ops, so there is massive overhead. The same can be said of Atom. Intel actually adds extra stages, I believe.
If we break it down to the micro-op level, I don't think there's much difference between these CPUs, which have totally different purposes anyway. Loongson was designed for fluid simulations, something its 256-bit floating-point vector processors are good at.
At the micro-architecture level, Intel (and, dare I say, AMD) appears to have the better, more well-rounded chip.
Edited by BizzareRide - 10/6/12 at 10:14am