GCN 1.1 has more power states, new instructions, and TrueAudio.
The biggest change here is support for flat (generic) addressing, which will be critical to enabling effective use of pointers within a heterogeneous compute context. Coupled with that is a subtle change to how the ACEs (compute queues) work, allowing GPUs to have more ACEs and more queues in each ACE, versus the hard limit of 2 we’ve seen in Southern Islands. The number of ACEs is not fixed – Hawaii has 8 while Bonaire only has 2 – but it means it can be scaled up for higher-end GPUs, console APUs, etc. Finally, GCN 1.1 also introduces some new instructions, including a Masked Quad Sum of Absolute Differences (MQSAD) and some FP64 floor/ceiling/truncation vector functions.
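To make the flat addressing idea a bit more concrete, here is a toy Python model of one pointer space that gets decoded into different backing memories by address range. Everything in it (the `Aperture` type, the `resolve` helper, the range names, bases and sizes) is made up for illustration; it is not AMD's actual memory map, just a sketch of why a single "generic" pointer is convenient for compute code.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Aperture:
    name: str  # which memory backs this range (hypothetical label)
    base: int  # first flat address covered by the range
    size: int  # length of the range in bytes


# Hypothetical layout: real bases and sizes are decided by hardware and driver.
APERTURES = (
    Aperture("system", 0x0000_0000, 0x4000_0000),
    Aperture("gpu_vram", 0x4000_0000, 0x4000_0000),
    Aperture("local_scratch", 0x8000_0000, 0x0001_0000),
)


def resolve(flat_addr: int) -> Tuple[str, int]:
    """Map one flat address to (backing memory, offset inside it)."""
    for ap in APERTURES:
        if ap.base <= flat_addr < ap.base + ap.size:
            return ap.name, flat_addr - ap.base
    raise ValueError(f"address {flat_addr:#x} is not mapped")
```

The point of the sketch is that the caller only ever holds one kind of address; which memory it lands in is decided at access time rather than by picking a different instruction or pointer type.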
Along with these architectural changes, there are a couple of other hardware features that at this time we feel are best lumped under the GCN 1.1 banner when talking about PC GPUs, as GCN 1.1 parts were the first parts to introduce these features and every GCN 1.1 part (at least thus far) has them. AMD’s TrueAudio would be a prime example of this, as both Hawaii and Bonaire have integrated TrueAudio hardware, with AMD setting clear expectations that we should also see TrueAudio on future GPUs and future APUs.
AMD’s Crossfire XDMA engine is another feature that is best lumped under the GCN 1.1 banner. We’ll get to the full details of its operation in a bit, but the important part is that it’s a hardware level change (specifically an addition to their display controller functionality) that’s once again present in Hawaii and Bonaire, although only Hawaii is making full use of it at this time.
Finally we’d also roll AMD’s power management changes into the general GCN 1.1 family, again for the basic reasons listed above. AMD’s new Serial VID interface (SVI2), necessary for the large number of power states Hawaii and Bonaire support and the fast switching between them, is something that only shows up starting with GCN 1.1. AMD has implemented power management a bit differently in each product from an end user perspective – Bonaire parts have the states but lack the fine grained throttling controls that Hawaii introduces – but the underlying hardware is identical.
With that in mind, that’s a short but essential summary of what’s new with GCN 1.1. As we noted way back when Bonaire launched as the 7790, the underlying architecture isn’t going through any massive changes, and as such the differences are primarily of interest to programmers more than end users. But they are distinct differences that will play an important role as AMD gears up to launch HSA next year. Consequently what limited fracturing there is between GCN 1.0 and GCN 1.1 is primarily due to the ancillary features, which unlike the core architectural changes are going to be of importance to end users. The addition of XDMA, TrueAudio, and improved power management (SVI2) are all small features on their own, but they are features that make GCN 1.1 a more capable, more reliable, and more feature-filled design than GCN 1.0.
AMD's Bonaire graphics processor has been kicking around inside the Radeon HD 7790 since March, and all the while, it's been harboring some secret features. Behind closed doors at the GPU14 event, we learned that Bonaire is based on the same "IP pool" as Hawaii, the next-gen GPU scheduled to premiere inside the R9 290X later this year.
In short, Bonaire has many of the same architectural perks as Hawaii: improved shaders (which also appeared in the Kabini APU), embedded TrueAudio DSP cores, and greater flexibility when it comes to connecting multiple monitors. Bonaire also has the same power management mojo as Hawaii, but unlike the other features, AMD made that functionality public at the 7790's launch.
Like Hawaii, Bonaire has shaders that support flat memory addressing and MQSAD (or masked quad sum of absolute difference) operations. With flat addressing, the idea seems to be to combine system and GPU memory into a single address space. This, among other things, should help facilitate the development of GPU computing applications.
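For anyone wondering what a masked quad sum of absolute differences actually computes, here is a rough Python sketch. It assumes "masked" means a reference byte of 0 is skipped, and "quad" means the same 4-byte reference is compared at four consecutive byte offsets of the source; the quoted text does not spell out the hardware's exact rule, so treat `masked_sad` and `masked_quad_sad` as illustrative helpers rather than the real instruction semantics.

```python
def masked_sad(src, ref):
    """Sum |s - r| over byte pairs, skipping pairs whose reference byte is 0."""
    return sum(abs(s - r) for s, r in zip(src, ref) if r != 0)


def masked_quad_sad(window, ref):
    """Return four masked SADs: ref compared at byte offsets 0..3 of window."""
    if len(window) != 7 or len(ref) != 4:
        raise ValueError("expected a 7-byte window and a 4-byte reference")
    return [masked_sad(window[off:off + 4], ref) for off in range(4)]


# Second reference byte is 0, so that position never contributes to any sum.
print(masked_quad_sad([10, 20, 30, 40, 50, 60, 70], [12, 0, 33, 40]))
# prints [5, 25, 55, 85]
```

This is the kind of inner loop used in block matching for video and image processing, which is why having it as a single instruction is handy for compute workloads.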
Bonaire also supports AMD's new TrueAudio technology. Inside the GPU silicon are Tensilica HiFi EP Audio DSP cores, a streaming DMA engine, 384KB of shared internal memory, and a low-latency bus interface that ties the DSP cores to the GPU's frame buffer and main system memory. AMD doesn't say how many DSP cores there are, but it tells us they run at 800MHz, and it claims Bonaire and Hawaii have the same DSP config. That means the two chips should have the same audio processing capabilities, despite their being aimed at wildly different price points.
Thanks to TrueAudio, game developers will be able to implement advanced spatialization and reverb effects based on in-game geometry. At the GPU14 event, AMD demoed elevation and depth perception simulations on a 7.1-speaker setup. A 3D sound stage was also emulated using two speakers. The TrueAudio pipeline is programmable, so developers should have some freedom to tweak those effects and perhaps to use the DSPs for other things.
As we understand it, using specialized DSP cores is better than simply processing advanced audio effects in software, which can tax low-end CPUs and yield inconsistent performance. Crucially for audio, the specialized DSP approach also incurs lower latency than processing sound in GPU shaders via DirectCompute or OpenCL.
Bonaire is rigged to offer higher floating-point math performance, more texturing capability, and better tessellation performance than Cape Verde. Also, as you'll see on the next page, AMD equips Bonaire with substantially faster GDDR5 RAM, which gives it a bandwidth advantage despite its identical memory controller setup.
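The bandwidth point is simple arithmetic: with the same bus width, peak bandwidth scales directly with the memory data rate. A quick sketch is below; the 128-bit bus and the 4.5 and 6.0 Gbps data rates are assumed figures for illustration, not numbers taken from the quoted text, and `peak_bandwidth_gbs` is just a helper name chosen here.

```python
def peak_bandwidth_gbs(bus_width_bits, data_rate_gbps):
    """Peak memory bandwidth in GB/s: bytes per transfer times per-pin data rate."""
    return bus_width_bits / 8 * data_rate_gbps


# Assumed example: same 128-bit bus, slower vs. faster GDDR5.
slower = peak_bandwidth_gbs(128, 4.5)
faster = peak_bandwidth_gbs(128, 6.0)
print(slower, faster)  # prints 72.0 96.0
```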
In addition to the different unit mix, Bonaire has learned a trick from Trinity and Richland, AMD's mainstream APUs. That trick takes the form of a new Dynamic Power Management (DPM) microcontroller, which enables Bonaire to switch between voltage levels much quicker than Cape Verde or other members of the Southern Islands family.
Bonaire has eight discrete DPM states, each with a different clock speed and voltage. Bonaire can switch between those states as quickly as every 10 milliseconds, which removes the need for the "inferred" states seen in Tahiti, that is, clock speed reductions without corresponding voltage cuts. This means the GPU can very quickly select the optimal clock speed and voltage combination to offer the best performance at the predefined power envelope.
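Here is a small Python sketch of what "select the best clock and voltage combination inside a power envelope" could look like. The eight states, their clocks and voltages, the power constant `K`, and the simple power model (proportional to voltage squared times clock) are all assumptions made up for the example; AMD's actual table and selection logic are not given in the text.

```python
from typing import NamedTuple, Sequence


class DpmState(NamedTuple):
    clock_mhz: int
    voltage: float


# Hypothetical table of eight states, sorted from slowest to fastest.
STATES = (
    DpmState(300, 0.80), DpmState(500, 0.85),
    DpmState(650, 0.90), DpmState(750, 0.95),
    DpmState(850, 1.00), DpmState(925, 1.05),
    DpmState(1000, 1.10), DpmState(1075, 1.15),
)

K = 0.07  # assumed scaling constant, watts per (MHz * V^2)


def estimated_power(state: DpmState) -> float:
    """Rough dynamic power estimate: proportional to V^2 * f."""
    return K * state.voltage ** 2 * state.clock_mhz


def pick_state(states: Sequence[DpmState], power_limit_w: float) -> DpmState:
    """Return the fastest state that fits the limit, else the slowest state."""
    best = states[0]
    for state in states:
        if estimated_power(state) <= power_limit_w:
            best = state
    return best


print(pick_state(STATES, 85.0).clock_mhz)  # prints 1000
```

In this toy model a controller would simply call `pick_state` again at each evaluation interval (the text quotes intervals as short as 10 ms). Because every state carries its own voltage, dropping a state lowers both clock and voltage together, unlike the clock-only "inferred" states.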
The HD 7790 and later cards have more power states ("p-states"), which saves power under dynamic loads such as gaming. For GPGPU work, where the GPU is loaded at 100% all the time, it matters much less.
With Kaveri also having TrueAudio, it would be a top-to-bottom TrueAudio stack once they release 384-bit/256-bit memory Radeons at ~$200-300 with the feature. TrueAudio ought to be named TrueDSP though.
Edited by AlphaC - 11/16/13 at 1:54pm