Opteron 6200 & Xeon E7 Series CPU Benchmarks Compared
Original Article Date: 2012-01-11
Last year saw the release of a new crop of CPUs in the enterprise multi-socket
arena from both Intel and AMD. In April, Intel announced their new E7-series
Xeon, codenamed Westmere-EX, built upon the excellent and highly popular
"Westmere" technology from the dual-socket space. And last November, AMD
followed up with the release of their 6200-series Opteron, codenamed Interlagos
or Bulldozer (depending on whether you're referring to the CPU package or core
design).
Both the AMD and Intel chips can be used in a quad-socket configuration
(i.e. four physical CPUs on a single motherboard). That said, the Opterons are
also intended for the larger dual-socket market, whilst the Xeons can also be
configured in octo-socket (eight physical CPUs), dual-motherboard machines. For
the purposes of comparison, however, we'll focus on
quad-socket performance, with a couple of examples of 2P and 8P results
demonstrating scalability.
To show performance comparisons, we have used publicly available results from
the SPEC CPU2006 benchmark suite, which can be found at the SPEC website,
http://spec.org. SPEC is an independent industry consortium whose aim is to
provide impartial performance data on a range of computing equipment.
I have focused on two benchmarks in the SPEC CPU2006 series: CINT2006 RATES and
CFP2006 RATES. Both are designed to make full use of all available CPU cores in
a system, so one would expect to see higher performance figures in systems with
more sockets and more cores per processor. CINT performs integer-based
calculations, whilst CFP performs floating-point math.
The following graph shows a comparison of a selection of Opteron 6200 series CPUs
with Xeon E7s. Most are in a quad-socket configuration, although you'll find a
couple in dual and octo-socket configurations for comparison.
Results: Direct Comparison of 6200-series Opteron with E7-series Xeon
My first and most important observation from the graph concerns the raw
performance of quad-socket offerings from AMD and Intel.
The graph shows integer and floating point performance for systems with four
CPUs each, from both Intel and AMD. It quickly becomes apparent that the Intel
E7 systems, with 32 and 40 cores in total, are equivalent in performance to the
AMD systems ranging from 48 to 64 cores in total. For
example, note that the top-end 16-core 6282SE Opteron is a match for the top-end
10-core Xeon on floating point, and is not far behind it on integer either. And
the mid-range Opteron 12-core 6234 compares with the "entry-level" E7-4820.
The fact that the Opteron CPU needs more cores to match the Xeon is expected. In
general AMD have made up for their weaker core performance by simply piling more
cores into each CPU package, up to 60% more than Intel. Apart from relative cost
(which I'll discuss below), the core count is perhaps the most significant
difference between the two offerings, and one likely to affect end-user choice.
Why? Because certain enterprise software applications benefit from a smaller
number of more powerful cores, either through cheaper software licensing, or
because the software cannot make good use of higher core counts (whether by
design or just poor programming). A decision as to which way to go (lower core
count and faster cores vs higher core count and slower cores) should be made in
full consultation with the primary software vendor. It is very often the case,
however, that software will be able
to fully utilize all available cores efficiently - this is especially common in
academic computing where the end-user has full control over the software
development process.
Results: CPU Socket and Core Number Scalability
Intel's E7 Xeon CPUs come mostly in 8 and 10 core variants (plus a few 6-core
models), so there is little scope for comparing models of different core
counts. AMD, however, have CPUs with 4, 8, 12 and 16 cores per socket,
providing a considerable range of parallel computing capability. Generally
speaking, the higher the core count, the lower the clock speed - compare the
four-core 6204 at 3.3GHz with the 6282SE at 2.6GHz, for instance.
To directly compare core-count scaling across the range, though, we need to keep
the clock speed constant. This is possible because three of the models run at
2.6GHz - the 6212 (8-core), 6238 (12-core) and 6282SE (16-core). Comparison of
integer performance for these three CPUs shows us scores of 460, 620 and 700
respectively, and floating point scores of 489, 720 and 910 respectively.
Interestingly, it appears that floating-point performance scales almost linearly
with core count (a gain of about 86% from 8 to 16 cores), whereas integer
performance improves by only about 50% when the core count is doubled. Since
higher core counts cost more, the choice of whether to spend extra money on
high core count models may depend on whether your application is predominantly
integer or floating point based. But how do you know which type is used in your
application? Well, integer based programming is your typical general purpose
data processing, most commonly found in servers, such as email, web and database
hosting, as well as video and photo editing and 2D video effects. Floating point
math is most commonly found in scientific and engineering analysis, where
computer models of real world phenomena are the order of the day. In short,
then, if you're a scientist or engineer, go for more cores! If you're not, it's
most likely better to go with faster cores in smaller numbers.
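As a rough check on the core-scaling figures, the arithmetic can be sketched in a few lines of Python. The scores are the ones quoted in the text for the three 2.6GHz Opterons; the function names are illustrative only.

```python
# Scores quoted in the text for the 2.6GHz Opterons: 6212, 6238, 6282SE.
cores = [8, 12, 16]
int_scores = [460, 620, 700]
fp_scores = [489, 720, 910]


def gain_percent(scores):
    """Percentage improvement from the first score to the last."""
    return (scores[-1] / scores[0] - 1) * 100


def per_core(scores, core_counts):
    """Score divided by core count, one value per CPU model."""
    return [s / c for s, c in zip(scores, core_counts)]


int_gain = gain_percent(int_scores)
fp_gain = gain_percent(fp_scores)

print(f"Integer gain, 8 to 16 cores: {int_gain:.0f}%")
print(f"Floating point gain, 8 to 16 cores: {fp_gain:.0f}%")
print("Integer score per core:", [round(x, 1) for x in per_core(int_scores, cores)])
print("Floating point score per core:", [round(x, 1) for x in per_core(fp_scores, cores)])
```

The per-core figures make the same point from another angle: the floating point score per core stays fairly steady as cores are added, while the integer score per core falls away.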
So, that is core scaling. What about socket, or physical CPU package, scaling?
How do eight sockets compare with four, or two? Well, interestingly, almost the
opposite effect of integer vs floating-point performance seems to occur with the
Opteron - when you double the number of CPUs from 2 to 4, integer performance
scales almost linearly, whereas floating point performance increases only by
about 80%.
The Xeon E7, when doubled from 4 to 8 CPU sockets, increases in performance by
about 70% for both integer and floating point, which is something of a
disappointment. The suggestion here is that the Xeon does not scale as well as
the Opteron when increasing the total number of CPUs, but this is not a
like-for-like comparison, since with the Opteron we're going from 2 to 4 CPUs,
and with the Xeon from 4 to 8.
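One way to put these socket-scaling figures on a common footing is to convert each percentage gain into a scaling efficiency, i.e. the speedup achieved divided by the ideal speedup. The following is a minimal sketch using the approximate gains quoted above; the function name is hypothetical.

```python
def scaling_efficiency(gain_percent, socket_ratio=2):
    """Fraction of the ideal speedup achieved when the socket count
    is multiplied by socket_ratio."""
    speedup = 1 + gain_percent / 100
    return speedup / socket_ratio


# Approximate gains quoted in the text for a doubling of sockets.
quoted_gains = {
    "Opteron 2P to 4P, floating point": 80,
    "Xeon E7 4P to 8P, integer and floating point": 70,
}

for label, gain in quoted_gains.items():
    print(f"{label}: {scaling_efficiency(gain):.0%} of ideal")
```

On this measure an 80% gain is 90% of ideal and a 70% gain is 85% of ideal, so the gap between the two is narrower than the raw percentages suggest, even before allowing for the different socket counts involved.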
Results: Comparison of New Generation with Older Generation CPUs
I included a couple of results of older CPU models, just to demonstrate any
improvement in the new models.
The previous generation Opteron 6174 has the same number of cores and almost
the same clock speed as the new 6234. It turns out that there is very little
difference in performance between these two, which makes one wonder a little
whether there was actually any significant change between these models. However,
the newer model is significantly cheaper, as I'll discuss in the next section.
The older generation Xeon X7560 is compared with the new E7-4820; both have
eight cores and similar clock speeds. Just as with the Opteron old-new
comparison, results are similar, and again the newer model is significantly
cheaper (see next section).
Results: Factoring In Cost
The following graph takes the results of the first graph, and divides each result
by the total cost of the CPUs required to produce that result. This shows,
therefore, basic price-performance for each CPU set.
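The calculation behind this graph can be sketched as follows. The score, socket count and unit price below are placeholders chosen purely for illustration, not data from the benchmarks or from any price list, and the function name is hypothetical.

```python
def performance_per_dollar(score, sockets, unit_price):
    """Benchmark score divided by the total cost of the CPUs that produced it."""
    total_cpu_cost = sockets * unit_price
    return score / total_cpu_cost


# Placeholder figures only: neither the score nor the price is real data.
example_score = 1000.0
example_sockets = 4
example_unit_price = 500.0

ratio = performance_per_dollar(example_score, example_sockets, example_unit_price)
print(f"{ratio:.2f} points per dollar")
```

Note that only CPU cost is counted here, as in the graph; motherboard, memory and software licensing costs are left out.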
The most obvious and most important observation from this graph is how much more
performance per dollar the Opterons provide compared to the Xeons. This is an
expected result, since the Opteron 6200-series is more of a mass-market CPU aimed
at both the dual-socket and four-socket space, so it benefits from economies of
scale, with far more units sold than the specialised four and eight socket Xeon
E7 series. Consequently, where these two CPU series meet in the middle, in the
four-socket space, AMD have a very significant advantage in price-performance:
between four and fifteen times more performance per dollar spent! This is likely
to be the most significant factor in an end-user deciding between AMD and Intel
for their next enterprise four-socket machine.
The other notable trend in this graph is the significant saving offered by the
new generation CPUs over the previous generation. For instance, compare the
Opteron 6174 with the newer 6234, which delivers equivalent performance at a
third of the price. Similarly, the newer Xeon E7-4820 delivers similar
performance to the older X7560 at a half to a third of the price.
Summary
- AMD and Intel are comparable in raw performance, but AMD must use up to 60% more
cores than Intel to get this performance. As high core counts benefit scientific
programming more than general purpose server computing, it would seem that AMD
would be the choice for academics and engineers.
- AMD have a huge price-performance advantage over Intel in the four-socket
enterprise computing space. This alone may push most end-users toward AMD over
Intel.
- AMD no longer provide an eight-socket solution, whereas Intel do. So the prize
for absolute maximum performance in a single box still goes to Intel. For some
end-users, this is all that counts, regardless of the price.
- The newer generations of CPUs offer equivalent or better performance than the
previous generation, at a fraction of the price. Advantage: consumer.
Overall, therefore, I would favor AMD at this time, on the basis of their very
large advantage in price over Intel in this space. But Intel still have the lead
when it comes to maximum raw performance in a single box at any cost.
Best regards,
Ben Ranson
Chief Systems Engineer
Electronics Nexus
http://elnexus.com
ben@elnexus.com