
[techreport] PS4 architect discusses console's custom AMD processor - Page 5

post #41 of 108
Thread Starter 
The Cell processor was the grandfather of APUs.

AMD made a true APU, while Sony made one large processor that played the roles of both CPU and GPU.
post #42 of 108
Quote:
Originally Posted by sdlvx View Post

No, it's a significantly larger number of instructions.

You are also forgetting about AVX. I've seen custom-compiled Blender end up more than twice as fast with AVX and other optimizations enabled when comparing Gentoo to Windows.

Phoronix has also seen massive gains in some things (but not always) by enabling AVX.

However, what I'm getting at is that if a game can make very good use of AVX, FMA, etc. (which are part of Jaguar), it would be unsurprising to me if Jaguar in the PS4 ended up twice as fast as a Windows version running the same code in legacy SSE mode.

This has been a major point that people discussing the PS4 seem to constantly miss. As someone who has cut his render times in half by switching to AVX instead of a standard 'runs for everyone' Windows exe, I think assuming Jaguar will be comparable to desktop CPU performance just because they are both x86 is horribly short-sighted. I wouldn't be surprised if a program properly optimized for AVX were twice as fast on the PS4 as SSE code running on the PC version of the APU AMD is going to sell. And if it's x87 (like Skyrim), the AVX version would humiliate it. All on the same silicon.

I think a lot of people who are writing off the PS4 because it's a 1.6GHz 8-core Jaguar are going to get a very harsh lesson in x86 software optimization and in how CISC architectures work, where new instructions are meant to deliver more performance, not just efficiency or microarchitectural tweaks. It should wake a lot of people up to the fact that a bunch of CPU graphs are completely useless if you don't know what software optimizations and compiler settings were used. Needless to say, I'm looking forward to the day when people realize that running four games for a CPU review and declaring a winner is virtually meaningless, given how much difference software choices can make.

EDIT: I'm just going to make a raw speculation based on what I've seen with compiler optimizations and playing with AVX in Gentoo: in Blender I matched a 3930K clock for clock with my FX-8350 when the 3930K ran the default blender.org exe and I had a fully optimized setup. That, quite honestly, should never happen, and it's a massive outlier, but it's proof that software optimization can make a huge difference.

If people are calling Jaguar a little behind IB in IPC (though it can't clock nearly as well, so I'm not saying Jaguar is close to i3 performance in general use cases), effectively making it an 8-core 1.6GHz IB, I wouldn't be surprised if, comparing the SSE version of a game to the AVX version, a single Jaguar core running AVX code came close to a 3GHz-range IB core running SSE (below SSE3).

I know this is a sensitive subject and it makes people heated, but these are different instructions, so it's not exactly fair to say one is better than the other overall. So don't misunderstand and assume I'm saying Jaguar is better than IB or something; it's not. However, if Jaguar is in an environment where it can run code highly tuned for AVX, FMA, etc., and IB is in an environment where it's running code designed to also run on ancient hardware with a small instruction set, then Jaguar is at a massive advantage. People aren't giving it enough credit, mainly because they see that both are x86 CPUs and assume they perform the same.

AVX is a new thing, and it's an x86 extension, not specifically an x86-64 one. It is also available in 32-bit code, so...? Performance gains should show up in both; you might not see the same boosts in 32-bit as you would in 64-bit, but it's complicated. The biggest boost comes from memory pools, which doesn't just mean accessing 64 bits of RAM.
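
To make the width argument concrete, here's a rough sketch (illustrative only, not from any real engine): the same multiply-add loop written as plain scalar code and as 8-wide AVX intrinsics. One AVX iteration does the work of eight scalar ones, which is where the big gains come from when a build actually targets the instruction set. Build with something like g++ -O2 -mavx; on a CPU without AVX the vector version will fault.

Code:
#include <immintrin.h>  // AVX intrinsics
#include <cstdio>

// Scalar version: one float per iteration -- what a "runs for everyone" build does.
void madd_scalar(const float* a, const float* b, float* out, int n) {
    for (int i = 0; i < n; ++i)
        out[i] += a[i] * b[i];
}

// AVX version: eight floats per iteration in 256-bit registers.
// (Assumes n is a multiple of 8 to keep the sketch short.)
void madd_avx(const float* a, const float* b, float* out, int n) {
    for (int i = 0; i < n; i += 8) {
        __m256 va = _mm256_loadu_ps(a + i);
        __m256 vb = _mm256_loadu_ps(b + i);
        __m256 vo = _mm256_loadu_ps(out + i);
        vo = _mm256_add_ps(_mm256_mul_ps(va, vb), vo);
        _mm256_storeu_ps(out + i, vo);
    }
}

int main() {
    float a[8] = {1, 2, 3, 4, 5, 6, 7, 8};
    float b[8] = {2, 2, 2, 2, 2, 2, 2, 2};
    float o[8] = {0};
    madd_avx(a, b, o, 8);
    printf("o[0] = %f\n", o[0]);  // prints 2.0
    return 0;
}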
post #43 of 108
Quote:
Originally Posted by Artikbot View Post

It doesn't matter at all. You are programming for a specific platform with fully coherent memory down to the CPU caches (that's why I mentioned the L2), to the point that the CPU is not a standalone CPU and the GPU is not a standalone GPU. You can make use of each for whatever you need; there's no need to program from a traditional PC point of view.

For example, you can have the GPU work in real time with the CPU's physics calculations without even needing to move the data from one place to another.

sooo... is the Titan + 3930K rig obsolete due to this new programming method? I wonder if there's a way to backport this programming model to the PC point of view (separate GPU and CPU).
post #44 of 108
Thread Starter 
Quote:
Originally Posted by ChronoBodi View Post

sooo... is the Titan + 3930K rig obsolete due to this new programming method? I wonder if there's a way to backport this programming model to the PC point of view (separate GPU and CPU).

In terms of performance? No.

In terms of technology and implementation? Yes and no...

AMD could make Intel and nVidia look outdated in a matter of months if they decided to build an ultimate APU, provided they had the resources and TSMC and/or GlobalFoundries had the needed tools. Imagine an APU with 8 Steamroller cores on par with Ivy Bridge in single-threaded and with an i7 4770K in multi-threaded performance, plus a GPU on the level of an 8970 with 12GB of GDDR5M/DDR4 memory, all in one neat package. Power consumption would be 500-650 watts and the TDP would be huge, but a great Phanteks cooler and Indigo Extreme would cool it nicely.

HSA in Kaveri will make all of Intel's and nVidia's products look technologically old. HSA is a programmer's dream come true; next to it, coding for a dedicated CPU and GPU will look as painful as the Cell processor inside the PlayStation 3 was. The PlayStation 4's hardware is not impressive, but the technology is.
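
To illustrate why that coherent shared memory is such a big deal (this is just a stand-in sketch in plain C++, not actual HSA code, and the "kernel" is a trivial placeholder): a discrete card pays for a staging copy into and out of its own memory around every compute step, while a shared-memory APU can work on the buffer in place.

Code:
#include <chrono>
#include <cstdio>
#include <cstring>
#include <vector>

// Placeholder for the "GPU" work; on a real APU this would be a GPU kernel.
static void kernel(float* data, size_t n) {
    for (size_t i = 0; i < n; ++i) data[i] *= 2.0f;
}

int main() {
    const size_t n = 32 * 1024 * 1024;            // 128 MB of floats
    std::vector<float> host(n, 1.0f), device(n);  // "device" mimics card VRAM
    using clk = std::chrono::steady_clock;

    // Discrete-card style: copy in, compute, copy out.
    auto t0 = clk::now();
    std::memcpy(device.data(), host.data(), n * sizeof(float));
    kernel(device.data(), n);
    std::memcpy(host.data(), device.data(), n * sizeof(float));
    auto t1 = clk::now();

    // HSA/APU style: CPU and GPU see the same coherent buffer -- no copies.
    kernel(host.data(), n);
    auto t2 = clk::now();

    auto ms = [](clk::time_point a, clk::time_point b) {
        return std::chrono::duration<double, std::milli>(b - a).count();
    };
    printf("copy-in/out: %.1f ms   in-place: %.1f ms\n", ms(t0, t1), ms(t1, t2));
    return 0;
}

For short kernels the copies alone can dwarf the compute, and that round trip is exactly what a unified address space gets rid of.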

-edit-
@kx11

The PlayStation 4 was supposed to have a semi-custom Kaveri APU, but the cost of getting it into production within the time frame Sony was aiming for was too high; Sony wanted to be ahead of, or at least launch at the same time as, Microsoft's Xbox 720, so as not to repeat their earlier mistake. If they had chosen Kaveri, the PlayStation 4 would have cost the same as or more than the PlayStation 3 at launch, since it's an exotic part.
Edited by vampirr - 4/28/13 at 1:38pm
post #45 of 108
Great to hear, hope this helps AMD turn some profit.
post #46 of 108
As long as it can hold a steady 60FPS I'm good; if I see an ounce of lag in heavy scenes I won't even bother.
post #47 of 108
Quote:
Originally Posted by Tsumi View Post

Back then PCs didn't really have CrossFire/SLI technologies (I think), and they most definitely did not have 200+ watt GPUs.

SLI has been around since 2005 (early implementations), and by 2006 they had even started making dual-GPU graphics cards.

We've got a lot of newcomers around here lately who don't remember the days before. I remember running Nvidia 6xxx series cards together; back then it was much different than it is today. It has gotten loads better.
post #48 of 108
Thread Starter 
SLI was made by 3dfx, and then nVidia bought them; SLI existed way before 2005.
post #49 of 108
Quote:
Originally Posted by ChronoBodi View Post

sooo... is the Titan + 3930K rig obsolete due to this new programming method? I wonder if there's a way to backport this programming model to the PC point of view (separate GPU and CPU).

I'm fairly sure there will be backward compatibility. Though the i5 4750K may not be the most cost-efficient, easily OC-able CPU down the road as games with native 8-core support are ported to PC.

For HSA, AMD will have the upper hand in games that support it. Initially HSA won't be used widely, like many instruction set extensions such as SSE2 (introduced in the early 2000s; AutoCAD 2010 only just started requiring it). But if nVidia and Intel don't include HSA compatibility, they may be at a disadvantage in the long run.
post #50 of 108
I always forget about Voodoo; I had one too. =( I'd say it would be fun to build an old rig, but really, that would be a lie lol