Enterprise & Cloud Benchmarks

Below you can find Intel's internal benchmarking numbers. The EPYC 7601 is the reference (performance=1), the 8160 is represented by the light blue bars, the top of the line 8180 numbers are dark blue. On a performance per dollar metric, it is the light blue worth observing.

Java benchmarks are typically unrealistically tuned, so it is a sign on the wall when an experienced benchmark team is not capable to make the Intel 8160 shine: it is highly likely that the AMD 7601 is faster in real life.

The node.js and PHP runtime benchmarks are very different. Both are open source server frameworks to generate for example dynamic page content. Intel uses a client load generator to generate a real workload. In the case of the PHP runtime, MariaDB (MySQL derivative) 10.2.8 is the backend.

In the case of Node.js, mongo db is the database. A node.js server spawns many different single threaded processes, which is rather ideal for the AMD EPYC processor: all data is kept close to a certain core. These benchmarks are much harder to skew towards a certain CPU family. In fact, Intel's benchmarks seem to indicate that the AMD EPYC processors are pretty interesting alternatives. Surely if Intel can only show a 5% advantage with a 10% more expensive processor, chances are that they perform very much alike in the real world. In that case, AMD has a small but tangible performance per dollar advantage.

The DPDK layer 3 Network Packet Forwarding is what most of us know as routing IP packets. This benchmark is based upon Intel own Data Plane Developer Kit, so it is not a valid benchmark to use for an AMD/Intel comparison.

We'll discuss the database HammerDB, NoSQL and Transaction Processing workloads in a moment.

The second largest performance advantage has been recorded by Intel testing the distributed object caching layer memcached. As Intel notes, the benchmark was not a processing-intensive workload, but rather a network-bound workload. As AMD's dual socket system is seen as a virtual 8-socket system, due to the way that AMD has put four dies onto each processor and each die has a sub-set of PCIe lanes linked to it, AMD is likely at a disadvantage.


Intel's example of network bandwidth limitations in a pseudo-socket configuration

Suppose you have two NICs, which is very common. The data of the first NIC will, for example, arrive in NUMA node 1, Socket 1, only to be accessed by NUMA node 4, Socket 1. As a result, there is some additional latency incurred. In Intel's case, you can redirect a NIC to each socket. With AMD, this has to be locally programmed, to ensure that the packets that are sent to each NICs are processed on each virtual node, although this might incur additional slowdown.

The real question is whether you should bother to use a 2S system for Memached. After all, it is distributed cache layer that scales well over many nodes, so we would prefer a more compact 1S system anyway. In fact, AMD might have an advantage as in the real world, Memcached systems are more about RAM capacity than network or CPU bottlenecks. Missing the additional RAM-as-cache is much more dramatic than waiting a bit longer for a cache hit from another server.

The virtualization benchmark is the most impressive for the Intel CPUs: the 8160 shows a 37% performance improvement. We are willing to believe that all the virtualization improvements have found their way inside the ESXi kernel and that Intel's Xeon can deliver more performance. However, in most cases, most virtualization systems run out of DRAM before they run out of CPU processing power. The benchmarking scenario also has a big question mark, as in the footnotes to the slides Intel achieved this victory by placing 58 VMs on the Xeon 8160 setup versus 42 VMs on the EPYC 7601 setup. This is a highly odd approach to this benchmark.

Of course, the fact that the EPYC CPU has no track record is a disadvantage in the more conservative (VMware based) virtualization world anyway.

Competitive Analysis and Price Comparisons Database Performance & Variability
Comments Locked

105 Comments

View All Comments

  • Johan Steyn - Monday, December 18, 2017 - link

    I am so glad people are realising ANandtechs rubish, probably led by Ian who wrote that terrible Threadripper review. Maybe he will realise it as more complain. It all depends on how much Intel is paying him...
  • mapesdhs - Wednesday, November 29, 2017 - link

    ANSYS is one of those cases where having massive RAM really matters. I doubt if any site would bother speccing out a system properly for that. One ANSYS user told me he didn't care about the CPU, just wanted 1TB RAM, and that was over a decade ago.
  • rtho782 - Tuesday, November 28, 2017 - link

    > Xeon Platinum 8160 (24 cores at 2.1 - 3.7 GHz, $4702k)

    $4,702,000? Intel really have bumped up their pricing!!
  • bmf614 - Tuesday, November 28, 2017 - link

    The pricetag discussion really needs to include software licensing as well. Windows Datacenter and SQL server on a machine with 64 cores will cost more than the hardware itself. This is the reason that the Xeon 5122 exists.
  • bmf614 - Tuesday, November 28, 2017 - link

    Also isnt it kind of silly to invest in a server platform with limited PCIE performance when faster and faster storage and networking is becoming commonplace?
  • Polacott - Tuesday, November 28, 2017 - link

    it really seems that AMD has crushed Intel this time. Also Charlie has some interest points about security ( has this topic being even analyzed here ? https://www.semiaccurate.com/2017/11/20/epyc-arriv... )
    Software WILL be tuned for Epyc, so a safe bet will not be getting Xeon but Epic, for sure.
    And power consumption and heat is really important as is an interesting part of datacenter maintenance costs.
    I really don't get how the article ends up in this conclusion.
  • Johan Steyn - Monday, December 18, 2017 - link

    Intel's financial support helps them reach this conclusion. Very sad
  • ZolaIII - Tuesday, November 28, 2017 - link

    As usually Intel cheated. Clients won't use neither their property compiler nor a software but GNU one's. Now let me show you a difference:
    https://3s81si1s5ygj3mzby34dq6qf-wpengine.netdna-s...
    Other than that this is boring as ARM NUMA based server chips are coming with some backup from good old veterans when it comes to to supercomputing and this time around Intel won't have even a compiler advantage to drag about it.
    Sources:
    https://www.nextplatform.com/2017/11/27/cavium-tru...
    http://www.cavium.com/newsevents-Cray-Catapults-Ar...
    Now this are the real news & melancholic ones for me as it brings back memories how it all started. & guess what? We are back their on the start again.
  • toyotabedzrock - Tuesday, November 28, 2017 - link

    Linux 4.15 has code to increase EPYC performance and enable the memory encryption features. 4.16 will have the code to enable the virtual machine memory encryption.
  • duploxxx - Friday, December 1, 2017 - link

    thx for sharing the article Johan, as usual those are the ones I will always read.

    Interesting to get feedback from Intel on benchmark compares, this tells how scared they really are from the competition. There is no way around, I' ve been to many OEM and large vendor events lately. One thing is for sure, the blue team was caught with there pants down and there is for sure interest from IT into this new competitor.

    Now talking a bit under the hood, having had both systems from beta stages.

    I am sure Intel will be more then happy to tell you if they were running the systems with jitter control. Off course they wont tell the world about this and its related performance issues.

    Second, will they also share to the world that there so called AVX enhancement have major clock speed disadvantages to the whole socket. really nice in virtual environments :)

    Third, the turbo boosting that is nowhere near the claimed values when running virtualization?
    Yes the benchmarking results are nice, but they don't give real world reality, its based on synthetic benches. Real world gets way less turbo boost due to core hot spots and there co-related TDP.

    There are reasons why large OEM did not yet introduce EPYC solutions, they are still optimizing BIOS and microcode as they want to bring a solid performing platform. The early tests from Intel show why.

    Even the shared VMware bench can be debated with no shared version info as the 6.5u1 has got major updates to the hypervisor with optimizations for EPYC.

    Sure DB benches are an Intel advantage, there is no magic to it looking at the die configurations, there are trade offs. But this is ONLY when the DB are bigger then certain amount of dies so we are talking here about 16+ cores from the 32 cores/socket systems for example, anything lower will have actually more memory bandwidth then the Intel part. So how reliable are these benchmarks for a day to day production.... not all are running the huge sizes. And those who do should not just compare based on synthetical benches provided but do real life testing.

    Aint it nice that a small company brings a new CPU line and already Intel needs to select there top bin parts as a counter part to show benchmarks to be better. There are 44 other bins available on the Intel portfolio, you can probably already start guessing how well they really fare against there competitor....

Log in

Don't have an account? Sign up now