Power Consumption, Temperature

Two other arguments for having SMT enabled or disabled come down to power consumption and temperature.

With SMT enabled, the core utilization is expected to be higher, with more instructions flowing through and being processed per cycle. This naturally increases the power requirements on the core, but might also reduce the frequency of the core. The trade-off is meant to be that the work going through the core should be more than enough to make up for extra power used, or any lower frequency. The lower frequency should enable a more efficient throughput, assuming the voltage is adjusted accordingly.
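As a rough illustration of why a lower frequency with an adjusted voltage can be more efficient, here is the usual first-order textbook model of dynamic power. It is an approximation we are assuming for illustration, not something measured in this article: the symbols (activity factor \alpha, switched capacitance C, voltage V, frequency f) are not taken from any vendor documentation, and static/leakage power is ignored.

```latex
P_{\text{dyn}} \approx \alpha \, C \, V^{2} f
\qquad\Rightarrow\qquad
\frac{\text{throughput}}{P_{\text{dyn}}}
\;\propto\; \frac{\text{IPC} \cdot f}{\alpha \, C \, V^{2} f}
\;=\; \frac{\text{IPC}}{\alpha \, C \, V^{2}}
```

Under this simplified model, frequency cancels out of the efficiency term, so the gain comes from the lower voltage that a lower frequency permits, and from any increase in IPC that SMT provides. SMT also tends to raise the activity factor, which is why the extra work has to outweigh the extra power.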

This is perhaps where AMD and Intel differ slightly. Intel’s turbo frequency range is hard-bound to specific frequency values based on core loading, regardless of how many threads are active or how many threads per core are active. The behavior becomes a little more opportunistic once we reach steady-state power, although exactly how long that takes will depend on what the motherboard has set the turbo power duration to. AMD’s frequency is continually opportunistic from the moment load is applied: it scales down as more cores are loaded, but it will balance up and down based on core load at all times. On the thermal side, the result depends on the heat density being generated in each core, and temperature also acts as a feedback loop into the turbo algorithm if the power limit has not been reached.

For our analysis here, we’ve picked two benchmarks: Agisoft, a variable-threaded test that performs practically the same with SMT On or Off, and 3DPMavx, a pure MT test that gets the biggest gain from SMT.

Agisoft

Photoscan from Agisoft is a 2D image to 3D model creator, using dozens of high-quality 2D images to generate related point maps to form a 3D model, before finally texturing the model using the images provided. It is used in archiving artefacts, as well as converting 2D sculpture into 3D scenes. Our test analyses a standardized set of 85 x 18 megapixel photos, with a result measured in time to complete.

Simply looking at CPU temperatures while running our real-world Agisoft test, our current setup (MSI X570 Godlike with Noctua NH12S) shows that both modes will flutter around 74°C sustained. Perhaps the interesting element is at the beginning of the test, where the CPU temperatures are higher in SMT Off mode. Looking into the data, the processor runs at 4300 MHz with SMT Off during this period, compared to 4150 MHz when SMT is enabled. This would account for the difference.

Looking at power, we can see that for the bulk of the test, both modes have similar package power consumption, around 130 W. SMT Off mode draws more power during the first couple of minutes of the test, due to the higher frequency. Clearly, having only one thread per core changes the thermal density in this part of the test enough to allow for a higher turbo.

If we measure the total power of the test, it’s basically identical in any metric that matters. Nearer the end of the test, where the workload is more variably threaded, SMT Off mode seems to come in under SMT On for power. The benchmark completion time is essentially the same due to the nature of the test, but SMT Off comes in at 2% lower power overall.

3DPMavx (3D Particle Movement)

Our 3DPM test is an algorithmic sequence of non-interactive random three-dimensional movement, designed to simulate molecular diffusive movement inside a gas or a fluid. The simulation is made non-interactive (i.e. no two molecules will collide) due to the original average movement of each particle taking collisions into account. Our test cycles through six movement algorithms at ten seconds apiece, followed by ten seconds of idle, with the whole loop being repeated six times, taking about 20 minutes, regardless of how fast or slow the processor is. The related performance figure is millions of particle movements per second. Each algorithm has been accelerated for AVX2.

On the temperature side of things, it is clear that the SMT Off mode again puts up a higher thermal profile. Temperatures this time peak at 66°C, but the difference between the two modes is clear.

On the power side, we can see why SMT Off mode is warmer: the cores are drawing more power. Looking at the data, SMT Off mode is running at ~4350 MHz, compared to SMT On, which is running closer to 4000 MHz.

With the higher frequency with SMT Off, the estimated total power consumption is 6.8% higher. The gap appears to be very consistent throughout the benchmark, which lasts about 20 minutes total.
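For readers who want to reproduce this kind of comparison from their own logs, a minimal sketch follows. It assumes package power has been sampled at known timestamps and integrates it with the trapezoidal rule; the function name `total_energy_joules` and the sample values are hypothetical placeholders, not our measured data. Because this test runs for a fixed duration regardless of processor speed, comparing total energy is equivalent to comparing average power.

```python
from typing import Sequence


def total_energy_joules(times_s: Sequence[float], power_w: Sequence[float]) -> float:
    """Integrate sampled power (W) over time (s) with the trapezoidal rule."""
    if len(times_s) != len(power_w):
        raise ValueError("times_s and power_w must have the same length")
    energy = 0.0
    for i in range(1, len(times_s)):
        dt = times_s[i] - times_s[i - 1]
        # Area of the trapezoid between two consecutive samples.
        energy += 0.5 * (power_w[i] + power_w[i - 1]) * dt
    return energy


# Hypothetical samples, one every 10 seconds; NOT data from this article.
times = [0.0, 10.0, 20.0, 30.0]
smt_on_w = [100.0, 100.0, 100.0, 100.0]
smt_off_w = [107.0, 107.0, 107.0, 107.0]

e_on = total_energy_joules(times, smt_on_w)
e_off = total_energy_joules(times, smt_off_w)
print(f"SMT Off uses {100.0 * (e_off / e_on - 1.0):.1f}% more energy")
```

A real log would have many more samples and uneven spacing, which the trapezoidal rule handles without changes.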

But let us add in the performance numbers. Because 3DPMavx can take advantage of SMT On, that mode scores +77.5% by having two threads per core rather than one (a score of 10245 vs 5773). Combined, this makes SMT On mode +91% better in performance per watt on this benchmark.
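The arithmetic behind these two figures can be sketched as follows, using only the rounded numbers quoted in the text. The variable names are ours, and the 6.8% figure is treated as the ratio of SMT Off power to SMT On power, which is an assumption about how it was derived. With these rounded inputs the performance-per-watt gain lands at roughly +90%, close to the +91% quoted above, which presumably comes from the unrounded measurements.

```python
# Figures quoted in the text above.
score_smt_on = 10245           # 3DPMavx score with SMT On
score_smt_off = 5773           # 3DPMavx score with SMT Off
power_ratio_off_vs_on = 1.068  # assumption: SMT Off draws 6.8% more than SMT On


def percent_gain(ratio: float) -> float:
    """Convert a ratio such as 1.25 into a percentage gain such as 25.0."""
    return (ratio - 1.0) * 100.0


perf_ratio = score_smt_on / score_smt_off

# perf/W ratio = (perf_on / power_on) / (perf_off / power_off)
#              = perf_ratio * (power_off / power_on)
perf_per_watt_ratio = perf_ratio * power_ratio_off_vs_on

print(f"Performance gain with SMT On:    {percent_gain(perf_ratio):+.1f}%")
print(f"Perf-per-watt gain with SMT On:  {percent_gain(perf_per_watt_ratio):+.1f}%")
```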

126 Comments

  • quadibloc - Friday, December 4, 2020

    On the one hand, if one program does a lot of integer calculations, and the other does a lot of floating-point calculations, putting them on the same core would seem to make sense because they're using different execution resources. On the other hand, if you use two threads from the same program on the same core, then you may have less contention for that core's cache memory.
  • linuxgeex - Thursday, December 3, 2020

    One of the key ways in which SMT helps keep the execution resources of the core active, is cache misses. When one thread is waiting 100+ clocks on reads from DRAM, the other thread can happily keep running out of cache. Of course, this is a two-edged sword. Since the second thread is consuming cache, the first thread was more likely to have a cache miss. So for the very best per-thread latency you're best off to disable SMT. For the very best throughput, you're best off to enable SMT. It all depends on your workload. SMT gives you that flexibility. A core without SMT lacks that flexibility.
  • Dahak - Thursday, December 3, 2020

    Now I only glanced through the article, but would it make more sense to use a lower core count cpu to see the benefits of SMT as using a higher core count might mean it will use the real core over the smt cores?
  • Holliday75 - Thursday, December 3, 2020

    Starting to think the entire point of the article was this subject is so damn complicated and hard to quantify for the average user that there is no point in trying unless you are running work loads in a lab environment to find the best possible outcome for the work you plan on doing. Who is going to bother doing that unless you work for a company where it makes sense to do so.
  • 29a - Thursday, December 3, 2020

    I wonder what the charts would look like with a dual core SMT processor. I think the game tests would have a good chance of changing. I'm a little surprised the author didn't think to test CPU's with more limited resources.
  • Klaas - Thursday, December 3, 2020

    Nice article. And nice benchmarks.

    Too bad you could not include more real-life tests.
    I think that a processor with far fewer threads (like a 5600X)
    is easier to saturate with real-life software that actually uses threads.
    The cache size and memory channel bandwidth ratio per core is quite different
    for those 'smaller' processors. That will probably result in different benchmark results...
    So it would be interesting to see what those processors will do, SMT ON versus SMT OFF.
    I don't think the end result will be different, but it could even be a bigger victory for SMT ON.

    Another interesting area is virtualization.
    And as already mentioned in more comments it is very important that the Operating Systems
    will assign threads to the right core or SMT-Core combinations.
    That is even more important in virtualization situations...
  • MDD1963 - Thursday, December 3, 2020

    Determining the usefulness of SMT with 16 cores on tap is not quite as relevant as when this experiment might be done with, say, a 5600X or 5800X....; naturally 16 cores without SMT might still be plenty (as even 8 non SMT cores on the 9700K proved)
  • thejaredhuang - Thursday, December 3, 2020

    This would be a better test on a CPU that doesn't have 16 base cores. If you could try it on a 4C/8T part I think the difference would be more pronounced.
  • dfstar - Thursday, December 3, 2020

    The basic benefit of SMT is it allows the processor to hide the impact of long latency instructions on average IPC, since it can switch to a new thread and execute those instructions. In this way it is similar to OOO (which leverages speculative execution to do the same) and also more flexible than fine-grained multi-threading. There is an overhead and cost (area/power) due to the duplicated structures in the core that will impact the perf/watt of pure single-threaded workloads, I don't think disabling SMT removes all this impact ...
  • GreenReaper - Thursday, December 3, 2020

    Perhaps not. But at the same time, it is likely that any non-SMT chip that has a SMT variant actually *is* a SMT chip, it is just disabled in firmware - either because it is broken on that chip, or because the non-SMT variant sold better.
