[See the SAP SD 2-Tier Internet Configuration chart in the above article link]
Intel probably feels that news of its next-generation Xeon processors with the Nehalem core has been a little thin on the ground in recent weeks. But now, most likely with the acquiescence of Intel itself, two results in the SAP SD 2-Tier Internet Configuration benchmark have appeared, which show the forthcoming Xeon X5570 dual-socket processor for servers in a particularly good light. At a clock frequency of 2.93 GHz and with a virtually identical system configuration, the new processor delivers nearly twice the performance of its 3.33 GHz predecessor, the Xeon X5470: 25,000 SAPS points compared with 12,600.
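The "nearly twice" claim can be checked with a few lines of arithmetic. The sketch below uses only the SAPS scores and clock frequencies quoted above; the variable and function names are illustrative, and SAPS per GHz is only a rough per-clock indicator assumed here for comparison, not an official SAP metric.

```python
# Figures as quoted in the article; all names are illustrative.
X5570 = {"saps": 25_000, "clock_ghz": 2.93}  # new Nehalem-based Xeon
X5470 = {"saps": 12_600, "clock_ghz": 3.33}  # predecessor


def speedup(new, old):
    """Ratio of the SAPS scores of two systems."""
    return new["saps"] / old["saps"]


def saps_per_ghz(system):
    """Rough per-clock throughput (an assumed indicator, not an SAP metric)."""
    return system["saps"] / system["clock_ghz"]


if __name__ == "__main__":
    print(f"Overall speedup: {speedup(X5570, X5470):.2f}x")
    print(f"Per-clock gain:  {saps_per_ghz(X5570) / saps_per_ghz(X5470):.2f}x")
```

Because the new processor runs at a lower clock frequency, the per-clock gain comes out somewhat higher than the overall speedup.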
The SAP SD releases also confirm the speculation about the model name of the forthcoming Xeon as the X5570, which Intel has so far referred to by the code name Nehalem-EP. Hewlett-Packard (HP) plans to fit the new processors in its ProLiant DL380 G6 server; the Fujitsu Siemens Computers (FSC) machine is expected to be named Primergy TX300 S5. HP is currently using the quad-core Xeons mentioned in the above benchmark in its ProLiant DL380 G5; both benchmark runs from HP were carried out with SAP ERP 6.0. The operating system in each instance was Windows Server 2003 and the database was SQL Server 2005. The ProLiant DL380 G5 had 32 GBytes of memory and the DL380 G6 48 GBytes. The new Nehalem-EP servers will use six channels of registered DDR3 DIMMs, or three per CPU, rather than four fully-buffered DIMM memory channels. The processors communicate via QPI.
Unfortunately, no server manufacturer has published SAP SD 2-Tier benchmark results for a machine with four of the new 45 nm Opterons (Opteron 8384). The performance difference between configurations with eight of the old 65 nm Opterons (Opteron 8360 SE) and those with eight of the new AMD processors is an impressive 32 per cent, which suggests that an HP ProLiant DL585 G5 with four Opteron 8384s should deliver about 25,000 SAPS. That would make a server with four of the new Opterons about as fast in the SAP SD 2-Tier benchmark as a machine with two Xeon X5570s.
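The Opteron estimate above is a simple percentage scaling. As a minimal sketch, the code below works backwards from the 25,000 SAPS estimate and the 32 per cent gain to the baseline score they imply. That baseline is derived here for illustration only, it is not a published benchmark result, and the function names are hypothetical.

```python
GAIN_PERCENT = 32        # gain of the new over the old Opterons, per the article
PROJECTED_SAPS = 25_000  # the article's estimate for the new Opterons


def scale_saps(baseline_saps, gain_percent):
    """Project a SAPS score by applying a percentage gain to a baseline."""
    return baseline_saps * (1 + gain_percent / 100)


def implied_baseline(projected_saps, gain_percent):
    """Back-calculate the baseline score that a projection implies."""
    return projected_saps / (1 + gain_percent / 100)


if __name__ == "__main__":
    baseline = implied_baseline(PROJECTED_SAPS, GAIN_PERCENT)
    print(f"Implied baseline (derived, not published): {baseline:,.0f} SAPS")
    print(f"Scaled back up: {scale_saps(baseline, GAIN_PERCENT):,.0f} SAPS")
```

The projection assumes that the gain measured on eight-processor configurations carries over unchanged to other configurations, which is an approximation.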
The Nehalem Xeons make life difficult for their own stablemates, the expensive six-core MP Xeons (Dunnington), whose excellent SAP SD performance Intel is keen to publicise. However, these MP Xeons will also run in systems with eight or even more processors, something the Xeon 5000 series does not cater for. Intel also has plans to deploy the Nehalem architecture (Nehalem-EX, Beckton) in the multiprocessor (MP) Xeon 7000 series, but not until the second half of 2009.
As the Core i7, in which the Nehalem architecture first appeared, has shown, the new processors from Intel also have a much higher floating-point performance than their predecessors. As a result, the Nehalem Xeons are already receiving preferential treatment in some tenders for HPC clusters, such as the HLRN II cluster in Berlin and Hanover, the NASA Pleiades system and the French Bull CEA/GENCI cluster. A publication issued by the EU project PRACE describes plans to build a supercomputer at the Jülich research centre (in partnership with the CEA) using 3072 modules, each with two Nehalem Xeons, for a total of 24,576 CPU cores. A combined supercomputer is to be built at the Stuttgart High Performance Computer Center (HLRS), apparently for the Teraflop Workbench. This will comprise an NEC SX-9 vector computer with four to eight nodes (or up to 128 cores) and 13 TFlop/s, together with an HPC cluster of 64 to 512 dual-Xeon systems said to deliver up to 50 TFlop/s. Until now, the Teraflop Workbench environment comprised an SX-8 vector computer, together with a Linux cluster of 200 older Xeon machines.
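The core count quoted for the Jülich system follows from the module count. The short sketch below derives the number of cores per processor that the quoted figures imply; the constant names are illustrative, and the per-processor result is a derived value rather than one stated in the PRACE publication.

```python
MODULES = 3072          # modules planned for the Jülich system
CPUS_PER_MODULE = 2     # two Nehalem Xeons per module
TOTAL_CORES = 24_576    # total CPU cores quoted above

total_cpus = MODULES * CPUS_PER_MODULE

# Derived: cores per processor implied by the quoted totals.
cores_per_cpu, remainder = divmod(TOTAL_CORES, total_cpus)

if __name__ == "__main__":
    print(f"Processors: {total_cpus}")
    print(f"Implied cores per processor: {cores_per_cpu}")
```

The division leaves no remainder, so the quoted figures are consistent with quad-core processors.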
The Japanese research centre Riken has ordered a Nehalem cluster from Fujitsu. This will contain 1024 Primergy servers, each with two Nehalem-EP processors, delivering around 108 TFlop/s. That is approximately nine times more than the x86-based Riken Super Combined Cluster System (RCCS), which is also a combination of parallel and (NEC SX-7) vector computers. Riken will also be using a co-processor known as GRAPE that it developed with IBM especially for protein research. This is now in version MDGRAPE-3 and is believed to be capable of approximately 165 GFlop/s per chip.
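For a rough sense of scale, the Riken figures can be broken down per processor. The sketch below uses only the numbers quoted above, which the article itself gives as approximations, so the derived values are order-of-magnitude estimates rather than specifications. All names are illustrative.

```python
CLUSTER_TFLOPS = 108       # quoted throughput of the new cluster (approximate)
SERVERS = 1024             # Primergy servers
CPUS_PER_SERVER = 2        # Nehalem-EP processors per server
SPEEDUP_FACTOR = 9         # "approximately nine times" the existing system
MDGRAPE3_GFLOPS = 165      # quoted per-chip figure for MDGRAPE-3 (approximate)

processors = SERVERS * CPUS_PER_SERVER

# Derived estimates, not published figures.
gflops_per_processor = CLUSTER_TFLOPS * 1000 / processors
implied_existing_tflops = CLUSTER_TFLOPS / SPEEDUP_FACTOR

if __name__ == "__main__":
    print(f"Processors: {processors}")
    print(f"Approx. GFlop/s per processor: {gflops_per_processor:.1f}")
    print(f"Implied existing system: {implied_existing_tflops:.0f} TFlop/s")
    print(f"MDGRAPE-3 chip vs one processor: "
          f"{MDGRAPE3_GFLOPS / gflops_per_processor:.1f}x")
```

The last ratio compares a special-purpose chip with a general-purpose processor, so it indicates scale only and says nothing about performance on general workloads.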