Overclock.net - An Overclocking Community

Thread: [AMD] AMD Financial Analyst Day 2015

  Topic Review (Newest First)
05-20-2015 06:27 PM
Tojara
Quote:
Originally Posted by deepor View Post

AMD are also the ones that still might sell a CPU under their FX brand without any integrated graphics, at a price that competes with the Intel CPUs for the LGA1150 socket. The integrated graphics eat up a lot of space on the die of those Intel CPUs. A competing product that comes close to the per-core performance of those CPUs, but has six or eight cores instead of a maximum of four, would be a very attractive CPU for me personally.

I think a product like that would compete very well, and manufacturing costs would still be good for AMD's profit. The added cores compared to Intel would not increase the cost for them because the die would be missing the graphics part. For me, 16 threads for the price of an LGA1150 i7 sounds like a pretty sweet deal, even if the cores are only roughly similar in performance.

The platform will probably be rather cheap as well, if it really has an integrated north- and southbridge. There isn't all that much left on the motherboard at that point.
05-20-2015 12:04 PM
deepor
Quote:
Originally Posted by Redwoodz View Post

Correct. I think AMD is going for performance per watt. I'm fine with a 2.8GHz chip that is 40% faster.

AMD are also the ones that still might sell a CPU under their FX brand without any integrated graphics, at a price that competes with the Intel CPUs for the LGA1150 socket. The integrated graphics eat up a lot of space on the die of those Intel CPUs. A competing product that comes close to the per-core performance of those CPUs, but has six or eight cores instead of a maximum of four, would be a very attractive CPU for me personally.

I think a product like that would compete very well, and manufacturing costs would still be good for AMD's profit. The added cores compared to Intel would not increase the cost for them because the die would be missing the graphics part. For me, 16 threads for the price of an LGA1150 i7 sounds like a pretty sweet deal, even if the cores are only roughly similar in performance.
05-20-2015 11:22 AM
Redwoodz
Quote:
Originally Posted by Alatar View Post

Realistically performance/watt of a CPU architecture / node combo at desktop/server clocks is basically the same thing as absolute performance.

If you have better perf/watt you can just keep adding cores and clocks until the competing architecture with worse perf/watt can't keep up anymore.

This is especially true for servers where due to the prices big dies aren't as much of an issue.

Correct. I think AMD is going for performance per watt. I'm fine with a 2.8GHz chip that is 40% faster.
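
A rough way to put a number on that: if single-thread performance is modeled as IPC times clock (a simplification), and the 40% is read as an IPC gain over the older core (an assumption, since the post does not name the baseline), the older core would need the clock computed below to keep up. The function name is made up for this sketch.

```python
def equivalent_clock_ghz(clock_ghz: float, ipc_gain: float) -> float:
    """Clock the older core would need to match the newer one,
    under the simplified model: performance = IPC * clock."""
    return clock_ghz * (1.0 + ipc_gain)


# 2.8 GHz with a 40% IPC gain (assumed reading of the post).
print(round(equivalent_clock_ghz(2.8, 0.40), 2))  # 3.92
```

This ignores memory-bound workloads and turbo behavior, where the scaling is not linear.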
05-10-2015 06:43 AM
Alatar
Quote:
Originally Posted by Nnimrod View Post

Well, according to Lisa, absolute performance and being competitive with Intel matter. So it's more than just performance/$ or performance/watt.

Realistically performance/watt of a CPU architecture / node combo at desktop/server clocks is basically the same thing as absolute performance.

If you have better perf/watt you can just keep adding cores and clocks until the competing architecture with worse perf/watt can't keep up anymore.

This is especially true for servers where due to the prices big dies aren't as much of an issue.
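
The argument above can be sketched as a toy model: fix a power budget, fit as many cores as it allows, and compare total throughput. All numbers and names here are hypothetical, and the model ignores uncore power, clock/voltage scaling, and die cost.

```python
def throughput_at_budget(power_budget_w, watts_per_core, perf_per_core):
    """Pack as many cores as fit into a fixed power budget and
    return (core count, total throughput in arbitrary units)."""
    cores = int(power_budget_w // watts_per_core)
    return cores, cores * perf_per_core


# Hypothetical: same per-core performance, different power per core.
budget = 95.0
arch_a = throughput_at_budget(budget, watts_per_core=10.0, perf_per_core=1.0)
arch_b = throughput_at_budget(budget, watts_per_core=15.0, perf_per_core=1.0)
print(arch_a)  # (9, 9.0)
print(arch_b)  # (6, 6.0)
```

With equal per-core performance, the design with better perf/watt fits more cores into the same budget and ends up with more absolute throughput.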
05-10-2015 04:53 AM
The Stilt
Quote:
Originally Posted by Themisseble View Post

Why do you think the FPU is mainly for AVX2?

I don't, that was just the simplest example I could think of.
05-10-2015 04:47 AM
Kuivamaa
Quote:
Originally Posted by The Stilt View Post

I think it is rather optimistic to expect the FPU performance to double with a 256-bit FMAC.

All 15h family cores have two 128-bit FMACs, which are automatically configured in either unganged (2 * 128-bit) or ganged (256-bit) mode.
If the "slave core" within a compute unit is shed, the remaining "master core" (BSC) has both FMACs at its private disposal.
On Bulldozer this resulted in a slight performance boost, especially in FP workloads, but in later µArch iterations (Piledriver and newer) it makes absolutely no difference.

Why exactly this is, is beyond me.

The 256-bit FMAC is indeed mandatory for Zen, as otherwise it will suffer severely once AVX2 becomes more common.

AMD Piledriver

- Similar microarchitecture to Bulldozer.
- Supports fused multiply-and-add instructions in both the FMA3 and FMA4 form. FMA3 is compatible with Intel processors. See Wikipedia for a discussion of the incompatibility between these instruction sets.
- The throughput of FMA3 instructions is only half as much as the throughput of FMA4 instructions, even though they are doing exactly the same calculations.
- Memory writes with the 256-bit AVX registers are exceptionally slow. The measured throughput is 5 - 6 times slower than on the previous model (Bulldozer), and 8 - 9 times slower than two 128-bit writes. No explanation for this has been found. This design flaw is likely to negate any advantage of using the AVX instruction set.
- The problems with cache performance on the Bulldozer seem to have been fixed in the Piledriver.


http://www.agner.org/optimize/blog/read.php?i=285

It could be errata. Regardless, I expect a huge boost in FPU performance come Zen.
05-10-2015 04:09 AM
Themisseble
Quote:
Originally Posted by The Stilt View Post

I think it is rather optimistic to expect the FPU performance to double with a 256-bit FMAC.

All 15h family cores have two 128-bit FMACs, which are automatically configured in either unganged (2 * 128-bit) or ganged (256-bit) mode.
If the "slave core" within a compute unit is shed, the remaining "master core" (BSC) has both FMACs at its private disposal.
On Bulldozer this resulted in a slight performance boost, especially in FP workloads, but in later µArch iterations (Piledriver and newer) it makes absolutely no difference.

Why exactly this is, is beyond me.

The 256-bit FMAC is indeed mandatory for Zen, as otherwise it will suffer severely once AVX2 becomes more common.

Why do you think the FPU is mainly for AVX2?
05-10-2015 03:56 AM
The Stilt
Quote:
Originally Posted by Themisseble View Post

With a 256-bit FPU they should reach 2x FPU performance, and that's what they need.

I think it is rather optimistic to expect the FPU performance to double with a 256-bit FMAC.

All 15h family cores have two 128-bit FMACs, which are automatically configured in either unganged (2 * 128-bit) or ganged (256-bit) mode.
If the "slave core" within a compute unit is shed, the remaining "master core" (BSC) has both FMACs at its private disposal.
On Bulldozer this resulted in a slight performance boost, especially in FP workloads, but in later µArch iterations (Piledriver and newer) it makes absolutely no difference.

Why exactly this is, is beyond me.

The 256-bit FMAC is indeed mandatory for Zen, as otherwise it will suffer severely once AVX2 becomes more common.
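
For reference, the peak-width arithmetic behind the ganged/unganged point can be sketched as follows. This is an idealized model only, assuming single-precision (32-bit) lanes, one FMA issued per unit per cycle, and an FMA counted as two floating-point operations; the function name is made up for illustration and says nothing about measured throughput.

```python
def peak_flops_per_cycle(units: int, width_bits: int, lane_bits: int = 32) -> int:
    """Idealized peak FLOPs per cycle: each unit issues one FMA per lane
    per cycle, and one FMA counts as two floating-point operations."""
    lanes = width_bits // lane_bits
    return units * lanes * 2


unganged = peak_flops_per_cycle(units=2, width_bits=128)  # 2 x 128-bit
ganged = peak_flops_per_cycle(units=1, width_bits=256)    # 1 x 256-bit
print(unganged, ganged)  # 16 16
```

Under these assumptions both modes have the same peak per compute unit, so a 256-bit operation on its own does not raise the ceiling.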
05-10-2015 03:50 AM
Cyrious
Quote:
Originally Posted by Themisseble View Post

With a 256-bit FPU they should reach 2x FPU performance, and that's what they need.

x2 if the block diagram is in any way correct. 256-bit floating point is going to be almost 4 times faster, as there are hopefully going to be two 256-bit float units that can each run one 256-bit float op per cycle, vs. the construction cores taking two cycles per 256-bit float op.

Of course, I could be completely wrong about it, so take it with a grain of salt.
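
The "almost 4 times" figure follows directly from the numbers assumed in this post, which are the poster's guesses rather than confirmed specifications. A minimal sketch of that arithmetic, with a hypothetical helper name:

```python
from fractions import Fraction


def ops_per_cycle(units: int, cycles_per_op: int) -> Fraction:
    """256-bit float ops completed per cycle across all units."""
    return Fraction(units, cycles_per_op)


# As assumed in the post: one 256-bit op every two cycles on the
# construction cores, vs. two units each finishing one op per cycle.
old = ops_per_cycle(units=1, cycles_per_op=2)
new = ops_per_cycle(units=2, cycles_per_op=1)
print(new / old)  # 4
```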
05-10-2015 02:58 AM
Themisseble
Quote:
Originally Posted by The Stilt View Post

I'll call Zen a success if:

- The average IPC increases by 40% or more and IPC in FP workloads increases by 55% or more, over Piledriver
- The base clocks are at least 3.2GHz at release on 8-core SKUs
- The desktop platform (socket infrastructure) supports up to 16 cores / 32 threads or alternatively two separate nodes

With a 256-bit FPU they should reach 2x FPU performance, and that's what they need.
This thread has more than 10 replies.
