Head-to-Head: 12-core Opteron and 6-core Xeon
Original Article Date: 2010-07-22
Separated by only a couple of weeks, Intel and AMD launched new CPUs back in the
spring. Intel's 5600-series ("Westmere") improved on the existing Nehalem
5500-series through a die shrink, increased cache size, and the introduction of
their first 6-core CPUs. Not to be outdone, AMD launched a completely overhauled
design with their 6100-series ("Magny-Cours"), in which the world's first
mainstream 8-core and 12-core CPUs saw the light of day.
Now that a few months have passed, the launch hype and buzz have given way
to hard questions about the real-world performance of this new crop of server
and workstation processors.
Many of you have been asking me, simply, which is going to be the fastest at the
best price? AMD or Intel? The answer from me (annoyingly) is - "Depends".
[Image: A recent build using 2 of the new Intel Xeon X5680 CPUs]
With Intel, we're dealing much more with a known quantity, since the Westmere
5600s are simply an improvement on an already established thoroughbred line of
CPUs - the "Nehalem" 5500 series. Benchmark figures for the 5600-series show
the expected performance gains from the 45nm-to-32nm die shrink, and speed
increases of 50% or more for the new 6-core chips over the previous 4-core
models.
[Image: One of our first Quad AMD Opteron 6100-series workstations]
The AMD 6100-series Opterons have been harder to judge, since we're dealing not
only with a completely new design (new socket, quad-channel DDR3 RAM instead of
dual-channel DDR2, etc.), but also with an unprecedented number of processing
cores available to the common dual-processor super-user. The earlier answer
"Depends" comes down to whether
software is able to take advantage of the 24 or 48 processing cores now
available on a 2P or 4P Opteron-based system. Most of the commercial software
that is running on PCs today (including some benchmarking tools) was written with
only a few processing cores (or only one!) in mind, and so when such software is
run on a new Opteron system, it will only use a fraction of the potential
processing power of the new machine.
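To put a rough number on that fraction, here is a minimal Python sketch. It
assumes, for illustration only, that each software thread keeps exactly one
core busy and that the remaining cores sit idle; the function name
`core_utilization` is hypothetical and not part of any benchmarking tool.

```python
def core_utilization(software_threads, total_cores):
    """Fraction of cores kept busy, assuming one busy core per thread."""
    if software_threads < 1 or total_cores < 1:
        raise ValueError("thread and core counts must be positive")
    # A program cannot use more cores than the machine has.
    return min(software_threads, total_cores) / total_cores


# Core counts from the article: 24 cores (2P) and 48 cores (4P).
for cores in (24, 48):
    for threads in (1, 4):
        share = core_utilization(threads, cores)
        print(f"{threads} thread(s) on {cores} cores: {share:.1%} of the machine")
```

Under these assumptions, a single-threaded program on a 48-core system leaves
about 98% of the machine idle.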
AMD Core/Die/Socket Architecture and Correct DIMM Bank Assignment
An additional issue with the AMD CPU package design is operating system support.

The picture shows that the AMD design is not a true 8- or 12-core die, but is
in fact two 4- or 6-core dies incorporated into a single package. Intel did the
same trick with their first quad-core CPUs 3 years ago, before moving to a true
quad-core with Nehalem, so it's not like AMD can be bashed exclusively for doing
this "cheat".
Why is this significant? Because it introduces an additional level of complexity
in assigning which core should get which DIMM bank when asking for RAM.
The 6100-series package contains two dies, each with its own dual-channel
memory controller, which together provide the four DDR3 channels per socket.
Ideally, the operating system works with the memory controllers to determine
how best to
route requests for memory from CPUs, but if the OS is ignorant about such an
architecture, it will effectively do a random assignment of cores to DIMM banks,
which is likely to result in RAM bottlenecks, especially when you also consider
that a system may have two or four sockets, all of which can share RAM between
themselves. At this time (July 2010), the latest versions of Windows 7 and
Windows Server fall into this category, and it is unclear to me which versions
of Linux can efficiently assign DIMM banks to cores.
This does not mean all is lost, however. Even within Windows, a skilled
programmer can set appropriate core and bank "affinity" to ensure efficient
routing of memory requests. So we may be in a phase where, to routinely take
advantage of the extra power of the AMD 8/12-core design, we have to wait for
operating systems to catch up in their support for it.
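As a rough illustration of what setting core affinity looks like in practice,
the sketch below uses Python's `os.sched_setaffinity`, which is available on
Linux only (Windows exposes affinity through different APIs, not shown here).
The topology map `NODE_TO_CPUS` is a made-up example, not the real layout of any
Opteron system, and the helper names are hypothetical. The sketch pins cores
only; steering memory to a particular DIMM bank needs NUMA-aware OS policy or a
NUMA library, which is outside this example.

```python
import os

# Hypothetical topology: which CPU ids belong to which die (NUMA node).
# A real program would query the OS instead of hard-coding this.
NODE_TO_CPUS = {0: [0, 1, 2, 3, 4, 5], 1: [6, 7, 8, 9, 10, 11]}


def cpus_for_node(node_to_cpus, node, allowed):
    """Return the CPUs of one node that this process is allowed to use."""
    return set(node_to_cpus[node]) & set(allowed)


def pin_to_node(node, node_to_cpus=NODE_TO_CPUS):
    """Pin the current process to one node's cores (Linux only).

    Returns the CPU set that was applied, or None if nothing was changed.
    """
    if not hasattr(os, "sched_setaffinity"):
        return None  # platform without this call, e.g. Windows or macOS
    allowed = os.sched_getaffinity(0)  # 0 means "the calling process"
    target = cpus_for_node(node_to_cpus, node, allowed)
    if not target:
        return None  # none of that node's cores are available to us
    os.sched_setaffinity(0, target)
    return target


if __name__ == "__main__":
    print("Pinned to CPUs:", pin_to_node(0))
```

Keeping a process on the cores of one die is only half of the story told
above, but it is the half that an application can control most directly.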
Opposing Chips Benchmarked
The usual caveat applies here, of course - "Benchmarks are synthetic and may
not approximate performance in real-world computing scenarios". And yet, lacking
direct and informative feedback on a wide variety of real-world applications,
they're really all we have. Benchmarks at least give us a general idea of the
performance of specific CPUs in comparison to one another.
So, if you've kept reading after the above section's warnings about optimizing
your software for the 6100-series Opteron, then good, because AMD's new CPU has
really shone in the benchmarks against Intel's.
The first report is from Lavalys Everest, a software suite which contains a
series of ten integer and floating point benchmarks that can be performed. The
Everest graph (pictured), which is a normalized aggregate of all the tests
carried out, shows that the 6100-series Opteron competes very well against the
fastest Intel Xeon X5680 CPU. Note also the improvement of both Intel and AMD
chips against their previous generation models (the W5590/X5550 and Opteron 2435
respectively).
I also performed some arithmetic benchmarks of 4 x Opteron 6168 1.9GHz and 2 x
Xeon X5680 3.33GHz CPUs in the latest version of SiSoft Sandra. There is no
comparison graph for these, since I've only very recently started using this
benchmark. I can report, however, that the aggregate performance of the Opterons
exceeded that of the Xeons by 40%. Considering that there are twice as many
Opteron CPUs as Xeons, this would mean each Xeon is outperforming each
Opteron CPU by about 40%. This is at odds with the SPEC results, which I detail
below, suggesting that Sandra may not be optimized for the Opteron.
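The per-CPU arithmetic behind that estimate can be checked with a short Python
sketch. The 40% aggregate lead and the CPU counts come from the Sandra run
described above; the function name `per_cpu_ratio` is my own, and the
calculation assumes performance scales evenly across the CPUs in each system.

```python
def per_cpu_ratio(aggregate_ratio, cpus_a, cpus_b):
    """Per-CPU performance of system A relative to system B.

    aggregate_ratio is A's total score divided by B's total score.
    """
    return aggregate_ratio * cpus_b / cpus_a


# 4 Opterons scored 40% higher in aggregate than 2 Xeons.
opteron_vs_xeon = per_cpu_ratio(1.40, cpus_a=4, cpus_b=2)
xeon_vs_opteron = 1 / opteron_vs_xeon

print(f"Each Opteron delivers {opteron_vs_xeon:.2f}x one Xeon")
print(f"Each Xeon delivers {xeon_vs_opteron:.2f}x one Opteron")
```

Under that assumption each Opteron delivers about 0.70 of a Xeon, so each Xeon
comes out at roughly 1.43 times an Opteron, in line with the "about 40%" figure.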
The final report is from the SPEC organization, a consortium that includes chip
manufacturers and is intended to provide balanced reporting of hardware
performance. The first of the two SPEC graphs below shows raw floating point and
integer multi-core ("Rates") performance.
The performance graph shows that the fastest system is the one using four
Opteron 6100-series CPUs, outperforming the fastest dual Xeon system by about
two-to-one. A head-to-head comparison of two-CPU Xeon and Opteron systems shows
near-parity between the two CPU manufacturers.
The performance graph also shows that the Xeons deliver the expected
improvement over the previous generation (X5680 vs W5590 - about a 20%
improvement due to the die shrink). The new generation Opterons, however,
outperform the previous generation 2439 model by almost two-to-one.
These results, which favor the Opterons, seem to suggest that the SPEC platform
is able to take full advantage of the more complex core/die/socket architecture
of these CPUs, since they compare so well to the Xeons and the previous
generation Opterons.
The second graph compiled from SPEC data shows CPU multi-core performance indexed
to the cost of the CPU, namely "price-performance".

Here, our #1 selling CPU, the Xeon E5620, shows why it is so - because at its
competitive price point, it offers excellent performance. But even this
processor is beaten by the 8-core Opteron 6128, which, at its even lower price
point, achieves the top price-performance score. As expected, higher-performance
and higher-clocked CPUs on both sides show lower price-performance.
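A price-performance index of this kind is simply a benchmark score divided by
the CPU's price. The Python sketch below shows the calculation; the scores and
prices in it are placeholder values chosen for illustration, not SPEC results or
actual list prices, and the names `price_performance` and `rank_by_value` are
hypothetical.

```python
def price_performance(score, price_usd):
    """Benchmark score per dollar of CPU price."""
    if price_usd <= 0:
        raise ValueError("price must be positive")
    return score / price_usd


def rank_by_value(cpus):
    """Sort (name, score, price) tuples from best to worst value."""
    return sorted(cpus, key=lambda c: price_performance(c[1], c[2]), reverse=True)


# Placeholder data only: NOT real benchmark scores or prices.
sample = [
    ("CPU A (fast, expensive)", 300.0, 1500.0),
    ("CPU B (mid-range)", 200.0, 400.0),
    ("CPU C (budget)", 150.0, 250.0),
]

for name, score, price in rank_by_value(sample):
    print(f"{name}: {price_performance(score, price):.2f} points per dollar")
```

With these made-up numbers the cheapest part ranks first even though it has the
lowest raw score, which is the general pattern described above for lower-priced
CPUs.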
If the benchmarks shown in this article are any guide to performance, then they
show that, with the correct optimizations, AMD can still beat Intel in the
server and workstation space, both in flat-out performance (using four
processors vs Intel's two) and on price-performance. But that key word
"optimizations" is sure to make a few of you nervous about taking the AMD
plunge.
My final word (at least, for now) on this is: if you're in academic computing,
and have full control over your software, the AMD Opteron is sure to please.
If you're reliant on third-party software, especially on the Windows platform,
Intel is likely to be the safer bet.
Best regards,
Ben Ranson
Chief Systems Engineer
Electronics Nexus
http://elnexus.com
ben@elnexus.com
1-877-773-5366