From the AMDZone forum: http://www.amdzone.com/phpbb3/viewto...138624#p205857
"Average integer IPC will be higher. If you consider that the average IPC with pure integer code is somewhere around 0.8-1, the only logical conclusion is that the K10 ALUs/AGUs are pretty much underutilized. The main idea with Bulldozer is better utilization of resources.
If you have 2 ALUs with utilization of 60%, that gives you an IPC of 1.2 on average.
If you have 3 ALUs with utilization of 35%, that gives you an IPC of 1.05 on average.
Which is better, and more power efficient?
AMD's approach is to make a smaller, better-utilized integer core. Intel's approach is to make a larger integer core and increase utilization with hyperthreading.
However, hyperthreading brings ~25% more utilization of the 3-way integer core. If your integer core has 45% utilization with a single thread, that is an IPC of 1.35 per single thread, which is ~20% faster than K10 for integer ops. With a hyperthreaded core, IPC can go up to 1.65, which is 52-55% utilization of the 3 ALUs in the Sandy Bridge core.
With Bulldozer and two threads you can go up to 60% utilization of the two 2-ALU cores, which is 2.4 IPC with two threads per module vs. 1.5-1.6 IPC with two threads per Intel FAT core. That is ~50% faster than the FAT core per clock.
The goal is to reach 50% more performance with 33% more ALUs (4 ALUs vs. 3 ALUs), probably within the same power envelope.
With a single thread, integer IPC will be slightly lower than SB's, but not all workloads are integer. There is a lot of mixed code, and the BD module has dedicated load/store and data caches vs. the SB core."
Thanks to Tosh for the link to the discussion that the above was taken from.
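For anyone who wants to check the numbers, the arithmetic in the quoted post seems to boil down to IPC = ALUs per core x utilization x number of cores. Here is a rough Python sketch of that. The function name `ipc` and the variable names are my own, the utilization figures are the poster's estimates rather than measurements, and the 55% hyperthreaded figure is just the upper end of the 52-55% range he mentions.

```python
def ipc(alus_per_core, utilization, cores=1):
    """Average integer IPC under the quoted post's simple model:
    ALUs per core * fraction of ALU slots kept busy * number of cores."""
    return alus_per_core * utilization * cores


# Figures taken from the quoted post (estimates, not benchmarks).
two_alu_core = ipc(2, 0.60)          # 2 ALUs at 60% utilization
three_alu_core = ipc(3, 0.35)        # 3 ALUs at 35% utilization
sb_single = ipc(3, 0.45)             # 3-ALU core, one thread, 45%
sb_ht = ipc(3, 0.55)                 # same core with hyperthreading, 55%
bd_module = ipc(2, 0.60, cores=2)    # module with two 2-ALU cores at 60%

# The post compares the module against 1.5-1.6 IPC for the FAT core.
speedup_low = bd_module / 1.6
speedup_high = bd_module / 1.5
alu_ratio = 4 / 3                    # 4 ALUs per module vs. 3 per FAT core

if __name__ == "__main__":
    print(f"2 ALUs @ 60%:            {two_alu_core:.2f} IPC")
    print(f"3 ALUs @ 35%:            {three_alu_core:.2f} IPC")
    print(f"3 ALUs @ 45% (1 thread): {sb_single:.2f} IPC")
    print(f"3 ALUs @ 55% (HT):       {sb_ht:.2f} IPC")
    print(f"2 x 2 ALUs @ 60%:        {bd_module:.2f} IPC")
    print(f"Module vs. FAT core:     {speedup_low:.2f}x to {speedup_high:.2f}x")
    print(f"ALU count ratio:         {alu_ratio:.2f}x")
```

Note that the model treats utilization as a free parameter, so it only shows that the post's figures are internally consistent, not that real hardware would hit them.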