CPU Performance

For simplicity, we are listing the percentage performance differentials in all of our CPU testing – the number shown is the % performance of having SMT2 enabled compared to having the setting disabled. Our benchmark suite consists of over 120 tests, full details of which can be found in our #CPUOverload article.
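As a rough illustration of how such a percentage can be derived from raw measurements, here is a minimal Python sketch. The function name `relative_performance`, the `higher_is_better` flag, and the example numbers are all hypothetical; the article does not state how time-based and score-based results were normalized, so the inversion for time-based tests is an assumption.

```python
def relative_performance(smt_on, smt_off, higher_is_better=True):
    """Return SMT-on performance as a percentage of the SMT-off baseline.

    Assumption: for time-based tests (lower is better) the ratio is
    inverted, so that a value above 100% always means SMT on is faster.
    """
    if smt_on <= 0 or smt_off <= 0:
        raise ValueError("measurements must be positive")
    ratio = smt_on / smt_off if higher_is_better else smt_off / smt_on
    return 100.0 * ratio


if __name__ == "__main__":
    # Hypothetical score-based result: points, higher is better.
    print(round(relative_performance(1186, 1000), 1))
    # Hypothetical time-based result: seconds, lower is better.
    print(round(relative_performance(80.0, 100.0, higher_is_better=False), 1))
```

With this convention, a value of 100% means no change, and anything below 100% means enabling SMT made the test slower.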

Here are the single threaded results.

Single Threaded Tests, AMD Ryzen 9 5950X

AnandTech           SMT Off (Baseline)   SMT On
y-Cruncher          100%                 99.5%
Dwarf Fortress      100%                 99.9%
Dolphin 5.0         100%                 99.1%
CineBench R20       100%                 99.7%
Web Tests           100%                 99.1%
GeekBench (4+5)     100%                 100.8%
SPEC2006            100%                 101.2%
SPEC2017            100%                 99.2%

Interestingly enough, our single threaded performance was within roughly a single percentage point across the stack (SPEC2006 being the largest swing at +1.2%). Given that running with SMT disabled should arguably give each thread more resources and more consistency, the fact that we see no meaningful difference means that AMD’s implementation of giving a single thread access to all the resources, even in SMT mode, is quite good.

The multithreaded tests are a bit more diverse:

Multi-Threaded Tests, AMD Ryzen 9 5950X

AnandTech               SMT Off (Baseline)   SMT On
Agisoft Photoscan       100%                 98.2%
3D Particle Movement    100%                 165.7%
3DPM with AVX2          100%                 177.5%
y-Cruncher              100%                 94.5%
NAMD AVX2               100%                 106.6%
AIBench                 100%                 88.2%
Blender                 100%                 125.1%
Corona                  100%                 145.5%
POV-Ray                 100%                 115.4%
V-Ray                   100%                 126.0%
CineBench R20           100%                 118.6%
HandBrake 4K HEVC       100%                 107.9%
7-Zip Combined          100%                 133.9%
AES Crypto              100%                 104.9%
WinRAR                  100%                 111.9%
GeekBench (4+5)         100%                 109.3%

Here we have a number of different factors affecting the results.

Starting with the two tests that scored statistically worse with SMT2 enabled: y-Cruncher and AIBench. Both tests are memory-bound and compute-bound in parts, where the memory bandwidth per thread can become a limiting factor in overall run-time. y-Cruncher is arguably a math synthetic benchmark, and AIBench is still a set of early-beta AI workloads for Windows, so both are quite far away from real world use cases.

Most of the rest of the benchmarks gain between roughly +5% and +35%, which includes a number of our rendering tests, molecular dynamics, video encoding, compression, and cryptography. This is where we can see both threads on each core interleaving inside the buffers and execution units, which is the goal of an SMT design. There are still some bottlenecks in the system that stop both threads from getting absolute full access, which could be buffer size, retire rate, op-queue limitations, memory limitations, and so on. Each benchmark is likely different.

The outliers are 3DPM, 3DPM with AVX2, and Corona. These three gain 45% or more, with 3DPM at +65.7% and its AVX2 variant at +77.5%. All of these tests are very light on cache and memory requirements, and put the increased Zen 3 execution port distribution to good use. These benchmarks are compute heavy as well, so splitting some of that memory access and compute in the core helps SMT2 designs mix those operations to greater effect. The fact that 3DPM in AVX2 mode gets a higher benefit might be down to coalescing operations for an AVX2 load/store implementation: there is less waiting to pull data from the caches, and less contention, which adds some extra performance.
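The grouping described above can be reproduced from the multi-threaded table. The following is a minimal Python sketch rather than the methodology used for the article: the percentages are copied from the table, while the names `RESULTS`, `classify` and `group_results`, and the cut-off values (below 95% counts as a regression, below 102% as neutral, 145% and above as an outlier) are assumptions chosen to line up with the buckets in the text.

```python
# SMT On results relative to SMT Off = 100%, copied from the multi-threaded table.
RESULTS = {
    "Agisoft Photoscan": 98.2,
    "3D Particle Movement": 165.7,
    "3DPM with AVX2": 177.5,
    "y-Cruncher": 94.5,
    "NAMD AVX2": 106.6,
    "AIBench": 88.2,
    "Blender": 125.1,
    "Corona": 145.5,
    "POV-Ray": 115.4,
    "V-Ray": 126.0,
    "CineBench R20": 118.6,
    "HandBrake 4K HEVC": 107.9,
    "7-Zip Combined": 133.9,
    "AES Crypto": 104.9,
    "WinRAR": 111.9,
    "GeekBench (4+5)": 109.3,
}


def classify(pct, regress_below=95.0, neutral_below=102.0, outlier_from=145.0):
    """Bucket one relative score; the thresholds are illustrative assumptions."""
    if pct < regress_below:
        return "regression"
    if pct < neutral_below:
        return "neutral"
    if pct >= outlier_from:
        return "outlier"
    return "gain"


def group_results(results):
    """Group benchmark names by bucket, preserving table order."""
    groups = {"regression": [], "neutral": [], "gain": [], "outlier": []}
    for name, pct in results.items():
        groups[classify(pct)].append(name)
    return groups


if __name__ == "__main__":
    for bucket, names in group_results(RESULTS).items():
        print(f"{bucket:>10}: {', '.join(names)}")
```

With these assumed thresholds, Agisoft Photoscan (98.2%) lands in a separate neutral bucket rather than alongside the two regressions, which matches the text treating only y-Cruncher and AIBench as statistically worse.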

Overall

In an ideal world, both threads on a core will have full access to all resources, and not block each other. However, that just means that the second thread looks like it has its own core completely. The reverse SMT method, of using one global core and splitting it into virtual cores with no contention, is known as VISC, and the company behind that was purchased by Intel a few years ago, but nothing has come of it yet. For now, we have SMT, and by design it will accelerate some key workloads when enabled.

In our CPU results, the single threaded benchmarks showed no meaningful difference between SMT enabled and SMT disabled in our real-world or synthetic workloads. This means that even in SMT enabled mode, if one thread is running, it gets everything the core has on offer.

For multi-threaded tests, there is clearly a spectrum of workloads that benefit from SMT.

Those that don’t are either hyper-optimized on a one-thread-per-core basis, or memory latency sensitive.

Most real-world workloads see a small uplift, averaging 22%. Rendering and ray tracing can vary depending on the engine, and on how much bandwidth, cache, and core resources each thread requires, potentially moving the execution bottleneck somewhere else in the chain. Execution-limited tests that don’t probe memory or the cache at all, which to be honest are most likely to be hyper-optimized compute workloads, scored up to +77% in our testing.
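As a sanity check on these summary figures, the 16 multi-threaded results listed in the table can be averaged directly. This is a minimal Python sketch; the names `SMT_ON_PCT` and `uplift_summary` are hypothetical. Averaging only the listed values gives an arithmetic mean uplift of about +20.6%, so the 22% quoted above presumably draws on a wider or differently weighted set of tests from the full suite. That explanation is an assumption, not something the article states.

```python
from statistics import fmean, geometric_mean

# SMT On percentages (SMT Off = 100%) for the 16 multi-threaded tests in the table.
SMT_ON_PCT = [
    98.2, 165.7, 177.5, 94.5, 106.6, 88.2, 125.1, 145.5,
    115.4, 126.0, 118.6, 107.9, 133.9, 104.9, 111.9, 109.3,
]


def uplift_summary(pcts):
    """Summarize uplift, in percentage points, relative to the 100% baseline."""
    return {
        "arithmetic_mean_uplift": fmean(pcts) - 100.0,
        # The geometric mean is a common choice for averaging ratios.
        "geometric_mean_uplift": geometric_mean(pcts) - 100.0,
        "max_uplift": max(pcts) - 100.0,
        "min_uplift": min(pcts) - 100.0,
    }


if __name__ == "__main__":
    for key, value in uplift_summary(SMT_ON_PCT).items():
        print(f"{key}: {value:+.1f}%")
```

The geometric mean is included because the inputs are ratios. It always comes out at or below the arithmetic mean, so the choice of averaging method alone would not raise the figure to 22%.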

126 Comments

  • CityBlue - Thursday, December 3, 2020 - link

    Sure, there are numerous examples of non-core silicon in x86 environments that are insecure, however this article is about SMT yet it avoids mentioning the ever growing list of security problems with Intel's implementation of SMT. The Intel implementation of SMT is so badly flawed from a security perspective that the only way to secure Intel CPUs is to completely disable SMT, and that's the bottom line recommendation of many kernel and distribution developers that have been trying to "fix" Intel CPUs for the past few years.
  • abufrejoval - Thursday, December 3, 2020 - link

    I use an Intel i7 in my pfSense firewall appliance, which is based on BSD.

    BSD tends to remind you that you should run it with SMT disabled because of these side channel exposure security issues.

    Yet, with the only workload being pfSense and no workload under an attacker's control able to sniff data, I just don't see why SMT should be a risk there, while the extra threads for deep inspection with Suricata should help keep that deeper analysis from creating a bottleneck in the uplink.

    You need to be aware of the architectural risks that exist on the CPUs you use, but to argue that SMT should always be off is a bit too strong.

    Admittedly, when you have 16 real cores to play with, disabling SMT hurts somewhat less than on an i3-7350K (2C/4T), a chip I purchased once out of curiosity, only to have it replaced by an i5-7600K (4C/4T) just before Kaby Lake left the shelves and became temptingly cheap.

    It held up pretty well, actually, especially because it did go to 4,8 GHz without more effort than giving it permission to turbo. But I'm also pretty sure the 4 real cores of the i5-7600k will let the system live longer as my daughter's gaming rig.
  • CityBlue - Thursday, December 3, 2020 - link

    > to argue that SMT should always be off is a bit too strong.

    Not really - if you're a kernel or distro developer then Intel SMT "off" is the only sane recommendation you can give to end users given the state of Intel CPUs released over the last 10 years (note: the recommendation isn't relating to SMT in general, not even AMD SMT, it is only Intel SMT).

    However if end users with Intel CPUs choose to ignore the recommendation then that's their choice, as presumably they are doing so while fully informed of the risks.
  • leexgx - Saturday, December 5, 2020 - link

    The SMT risk is more a server issue than a consumer issue
  • schujj07 - Thursday, December 3, 2020 - link

    This article wasn't talking about Intel's implementation, only SMT performance on Zen 3. If this were about SMT on Intel then it would make sense, otherwise no.
  • CityBlue - Thursday, December 3, 2020 - link

    The start of the article is discussing the pros and cons of SMT *in general* and then discusses where SMT is used, and where it is not used, giving examples of Intel x86 CPUs. Why not then mention the SMT security concerns with Intel CPUs too? That's a rhetorical question btw, we all know the reason.
  • schujj07 - Friday, December 4, 2020 - link

    Since this is an AMD focused article, there isn't the side channel attack vector for SMT. Therefore why would you mention side channel attacks for Intel CPUs? That doesn't make any sense since Intel CPUs are only stated for who uses SMT and Intel's SMT marketing name. Hence bringing up Intel and side channel attack vectors would be including extraneous data/information to the article and take away from the stated goal. "In this review, we take a look at AMD’s latest Zen 3 architecture to observe the benefits of SMT."
  • mode_13h - Sunday, June 6, 2021 - link

    > The Intel implementation of SMT is so badly flawed from a security perspective that
    > the only way to secure Intel CPUs is to completely disable SMT

    That's not true. The security problems with SMT are all related to threads from different processes sharing the same physical core. To this end, the Linux kernel now has an option not to do that, since it's the kernel which decides what threads run where. So, you can still get most of the benefits of SMT, in multithreaded apps, but without the security risks!
  • dotjaz - Thursday, December 3, 2020 - link

    Windows knows how to allocate threads to the same CCX (after patches of course). It not only knows the physical core, it also knows the topology.
  • leexgx - Saturday, December 5, 2020 - link

    A lot of people forget to install the AMD chipset drivers, which can result in some small loss of performance (but the BIOS also needs to be kept up to date to complete the CCX group support and the best-cores support advertised to Windows)
