Gaming Performance (Discrete GPU)

For our gaming tests, we are using our AMD Ryzen 9 5950X paired with an NVIDIA RTX 2080 Ti graphics card. Our standard test suite consists of 12 titles, tested at four configurations:

  • Stage 1: Actual Gaming (1080p Maximum Quality, or equivalent)
  • Stage 2: All About Pixels (‘4K Minimum’ Quality)
  • Stage 3: Medium Low (‘1440p Minimum’)
  • Stage 4: Lowest Lows (720p Minimum or lower)

The final three settings form a set of CPU-limited tests, and help find the point where we move from being CPU limited to GPU limited. Some users baulk at this testing, finding it irrelevant; however, these configurations have been widely requested over the years. The counterpoint to this testing is the first setting, 1080p Maximum: 1080p was requested because it is the most popular gaming resolution, and Maximum Quality because this graphics card should be able to handle almost everything at that resolution at very playable framerates.

All the details for our gaming tests can be found in our #CPUOverload article.

Stage 1: Actual Gaming
AMD Ryzen 9 5950X, SMT On vs SMT Off
AnandTech         Settings         Average
Chernobylite      1080p Max        100%     -
Civilization 6    1080p Max        103%     -
Deus Ex: MD       1080p Max        99%      100%
Final Fantasy 14  1080p Max        102%     -
Final Fantasy 15  8K Standard      100%     99%
World of Tanks    1080p Max        100%     102%
World of Tanks    4K Max           103%     102%
Borderlands 3     1080p Max        101%     103%
F1 2019           1080p Ultra      103%     106%
Far Cry 5         1080p Ultra      104%     104%
GTA V             1080p Max        99%      100%
RDR 2             1080p Max        100%     100%
Strange Brigade   1080p Ultra      101%     101%

In real-world gaming situations, there’s very little to pick between having SMT enabled or disabled. Almost universally, enabling it is either neutral or a smidgen better, with F1 2019, Civilization 6, and Far Cry 5 seemingly the biggest beneficiaries. I’ve also added in the Stage 3 result from World of Tanks, just because that benchmark doesn’t really have a proper settings menu.

Stage 2: All About Pixels
AMD Ryzen 9 5950X, SMT On vs SMT Off
AnandTech         Settings         Average
Chernobylite      4K Low           99%      -
Civilization 6    4K Min           105%     -
Deus Ex: MD       4K Min           98%      100%
Final Fantasy 14  4K Min           102%     -
Final Fantasy 15  4K Standard      100%     100%
Borderlands 3     4K Very Low      101%     104%
F1 2019           4K Ultra Low     100%     100%
Far Cry 5         4K Low           101%     100%
GTA V             4K Low           100%     101%
RDR 2             8K Min           100%     100%
Strange Brigade   4K Low           100%     100%

At high resolution with minimal quality, the only outlier is Civilization 6, where average frame rates are about 5% higher with SMT enabled.

Stage 3: Medium Low
AMD Ryzen 9 5950X, SMT On vs SMT Off
AnandTech         Settings         Average
Chernobylite      1440p Low        100%     -
Civilization 6    1440p Min        105%     -
Deus Ex: MD       1440p Min        97%      96%
Final Fantasy 14  1440p Min        102%     -
Final Fantasy 15  1080p Standard   101%     105%
World of Tanks    1080p Standard   101%     101%
Borderlands 3     1440p Very Low   103%     105%
F1 2019           1440p Ultra Low  99%      99%
Far Cry 5         1440p Low        99%      99%
GTA V             1440p Low        100%     99%
RDR 2             1440p Low        100%     100%
Strange Brigade   1440p Low        100%     100%

At the more medium settings, we’re starting to see more variation: Borderlands 3 gains a few percent from SMT, while Deus Ex: MD drops off a bit with SMT enabled.

Stage 4: Lowest Lows
AMD Ryzen 9 5950X, SMT On vs SMT Off
AnandTech         Settings         Average
Chernobylite      360p Low         106%     -
Civilization 6    480p Min         102%     -
Deus Ex: MD       600p Min         91%      91%
Final Fantasy 14  768p Min         102%     -
Final Fantasy 15  720p Standard    99%      102%
World of Tanks    768p Min         101%     100%
Borderlands 3     360p Very Low    108%     110%
F1 2019           768p Ultra Low   102%     105%
Far Cry 5         720p Low         100%     101%
GTA V             720p Low         99%      98%
RDR 2             384p Low         100%     103%
Strange Brigade   720p Low         95%      95%

This is perhaps our most varied set of results, with Deus Ex: MD showing an almost 10% drop with SMT enabled. Deus Ex: MD is usually considered a CPU-limited title, but so is Chernobylite, which sees a 6% gain. Borderlands 3, which is more of a modern game, is +8-10% with SMT enabled. However, I doubt anyone is playing at these resolutions.

Overall Gaming Performance

If we take full averages from all the data points, then we’re seeing a rough +1% gain in performance in the more complex scenarios across the board.

Resolution Average Comparison
AMD Ryzen 9 5950X, SMT On vs SMT Off
AnandTech  Setting      aka               Average
Stage 1    1080p Max    Actual Gaming     101%     101%
Stage 2    4K+ Min      All About Pixels  101%     101%
Stage 3    1440p Min    Medium Lows       101%     101%
Stage 4    < 768p Min   Lowest Lows       100%     101%

In reality, any loss or gain is highly dependent on the title in question, and can swing from one side of the line to the other. It’s clear that Deus Ex prefers SMT off, and that F1 2019 and Borderlands prefer SMT on, but we are talking fine margins here.
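As a rough check on how the per-stage averages relate to the per-title results, the sketch below recomputes the Stage 4 row from the Stage 4 table. It assumes a plain arithmetic mean over the rounded percentages, skipping the "-" entries; the article does not state how the averages were computed, so treat this as one plausible reading. The names `stage4` and `column_mean` are illustrative, and the second value column is left unnamed because the tables do not label it.

```python
# SMT-on results relative to SMT-off (100 = parity), copied from the Stage 4
# table. Each tuple is (Average column, second column); None marks a "-" entry.
stage4 = {
    "Chernobylite":     (106, None),
    "Civilization 6":   (102, None),
    "Deus Ex: MD":      (91, 91),
    "Final Fantasy 14": (102, None),
    "Final Fantasy 15": (99, 102),
    "World of Tanks":   (101, 100),
    "Borderlands 3":    (108, 110),
    "F1 2019":          (102, 105),
    "Far Cry 5":        (100, 101),
    "GTA V":            (99, 98),
    "RDR 2":            (100, 103),
    "Strange Brigade":  (95, 95),
}


def column_mean(rows, index):
    """Arithmetic mean of one value column, ignoring missing entries."""
    values = [row[index] for row in rows.values() if row[index] is not None]
    return sum(values) / len(values)


average_mean = column_mean(stage4, 0)
second_mean = column_mean(stage4, 1)
print(f"Stage 4: {average_mean:.1f}% / {second_mean:.1f}%")
```

Under that assumption the two means come out at roughly 100.4% and 100.6%, which round to the 100% and 101% shown in the Stage 4 row. A large swing in one title (Deus Ex: MD at 91%) is cancelled out by gains elsewhere, which is why the per-stage averages sit so close to parity.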


View All Comments

  • quadibloc - Monday, December 14, 2020

    The SPARC chips used SMT a lot, even going beyond 2-way, so I'm surprised they weren't mentioned as examples.
  • mode_13h - Sunday, June 6, 2021

    > When SMT is enabled, depending on the processor, it will allow two, four,
    > or eight threads to run on that core

    Intel's HD graphics GPUs win the oddball award for supporting 7 threads per EU, at least up through Gen 11, I think.

    IIRC, AMD supports 12 threads per CU, on GCN. I don't happen to know how many "warps" Nvidia simultaneously executes per SM, in any of their generations.
  • mode_13h - Sunday, June 6, 2021

    Thanks for looking at this, although I was disappointed in the testing methodology. You should be separately measuring how the benchmarks respond to simply having more threads, without introducing the additional variable of SMT on/off. One way to do this would be to disable half of the cores (often an option you see in BIOS) and disable SMT. Then separately re-test with SMT on, and then with SMT off but all cores on. This way, we could compare SMT on/off with the same number of threads. Ideally, you'd also do this on a single-die/single-CCX CPU, to ensure no asymmetry in which cores were disabled.

    Even better would be to disable any turbo, so we could just see the pipeline behavior. Although, controlling for more variables poses a tradeoff between shedding more insight into the ALU behavior and making the test less relevant to real-world usage.

    The reason to hold the number of threads constant is that software performance doesn't scale linearly with the number of threads. Due to load-balancing issues or communication overhead (e.g. lock contention), performance of properly-designed software always scales sub-linearly with the number of threads. So, by keeping the number of threads constant, you'd eliminate that variable.

    Of course, in real-world usage, users would be deciding between the two options you tested (SMT on/off; always using all cores). So, that was most relevant to the decision they face. It's just that you're limited in your insights into the results, if you don't separately analyze the thread-scaling of the benchmarks.
  • mode_13h - Sunday, June 6, 2021

    Oops, I also intended to mention OS scheduling overhead as another source of overhead, when running more threads. We tend not to think of the additional work that more threads creates for the OS, but each thread the kernel has to manage and schedule has a nonzero cost.
  • mode_13h - Sunday, June 6, 2021

    As for the article portion, I also thought too little consideration was given towards the relative amounts of ILP in different code. Something like a zip file compressor should have relatively little ILP, since each symbol in the output tends to have a variable length in the input, meaning decoding of the next symbol can't really start until the current one is mostly done. Text parsing and software compilation also tend to fall in this category.

    So, I was disappointed not to see some specific cases of low-ILP (but high-TLP) highlighted, such as software compilation benchmarks. This is also a very relevant use case for many of us. I spend hours per week compiling software, yet I don't play video games or do 3D photo reconstruction.
  • mode_13h - Sunday, June 6, 2021

    A final suggestion for any further articles on the subject: rather than speculate about why certain benchmarks are greatly helped or hurt by SMT, use tools that can tell you!! To this end, Intel has long provided VTune and AMD has a tool called μProf.
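
The three-configuration comparison suggested in the comments above (half the cores with SMT off, half the cores with SMT on, all cores with SMT off) could be laid out roughly as follows. This is only a sketch of that idea, not a methodology used in the article: `Config`, `build_matrix`, and `compare` are hypothetical names, 2-way SMT is assumed, and the benchmark scores would have to come from real runs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    """One BIOS configuration (hypothetical helper, not from the article)."""
    cores: int
    smt: bool

    @property
    def threads(self) -> int:
        # Assumes 2-way SMT, as on the Ryzen 9 5950X.
        return self.cores * 2 if self.smt else self.cores


def build_matrix(total_cores: int) -> dict:
    """The three runs described in the comment, for a CPU with total_cores cores."""
    half = total_cores // 2
    return {
        "half_cores_smt_off": Config(half, smt=False),
        "half_cores_smt_on": Config(half, smt=True),
        "all_cores_smt_off": Config(total_cores, smt=False),
    }


def compare(scores: dict) -> dict:
    """Ratios between scores (higher is better), keyed like build_matrix()."""
    base = scores["half_cores_smt_off"]
    return {
        # Same cores, twice the threads: what toggling SMT alone does.
        "smt_on_vs_off_same_cores": scores["half_cores_smt_on"] / base,
        # Same SMT state, twice the cores: how the benchmark scales with threads.
        "double_cores_smt_off": scores["all_cores_smt_off"] / base,
        # Same thread count: SMT threads versus the same number of full cores.
        "smt_vs_cores_same_threads":
            scores["half_cores_smt_on"] / scores["all_cores_smt_off"],
    }


for name, cfg in build_matrix(16).items():  # the 5950X has 16 cores
    print(f"{name}: {cfg.cores} cores, {cfg.threads} threads")
```

On a 16-core part, the second and third configurations both run 16 threads, so the last ratio compares SMT against extra physical cores at a fixed thread count, which is the variable the comment asks to isolate.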

