Modern computer processors constantly change their operating frequency (and voltage) depending on workload. On Intel processors, this is often handled by the operating system, which requests a particular level of performance, known as the Performance State or P-State, from the processor. The processor then adjusts its frequencies and voltage levels to accommodate, in a DVFS (dynamic voltage and frequency scaling) sort of way, but only at the P-States fixed at the time of production. Running the system at the maximum all the time would be best for performance, but because of the high voltage involved it is also the least efficient way to run a processor. The wasted energy means shorter battery life or thermal throttling on mobile devices. With the P-State model, the operating system can request lower P-States in order to save power, and if a task requires more performance and the power/thermal budgets are sufficient, the P-State can be raised to accommodate. This 'technology' on Intel processors has historically been called 'Speed Step'.

With Skylake, Intel's newest 6th generation Core processors, this changes. The processor has been designed so that, with the right commands, the OS can hand control of the frequency and voltage back to the processor. Intel is calling this technology 'Speed Shift'. We’ve discussed Speed Shift before in Ian’s Skylake architecture analysis, but despite the in-depth talk from Intel, Speed Shift was noticeably absent at the launch of the processors. This is due to one of the requirements for Speed Shift: it needs operating system support to hand control of processor performance over to the CPU, and Intel had to work with Microsoft to get this functionality enabled in Windows 10. As of right now, anyone with a Skylake processor is not getting the benefit of the technology. A patch will be rolled out in November for Windows 10 which will enable this functionality, but it is worth noting that it will take a while for it to roll out to new Windows 10 purchases.

Compared to Speed Step / P-State transitions, Intel's new Speed Shift technology changes the game by having the operating system relinquish some or all control of the P-States and hand that control to the processor. This has a couple of noticeable benefits. First, the processor can ramp its frequency up and down much faster than the OS can. Second, the processor has much finer control over its states, allowing it to choose the optimal performance level for a given task and use less energy as a result. Specific jumps in frequency are reduced to around 1 ms with Speed Shift's CPU control, from 20-30 ms under OS control, and going from an efficient power state to maximum performance can be done in around 35 ms, compared to around 100 ms with the legacy implementation. As seen in the images below, neither technology can jump from low to high instantly, because to maintain data coherency through frequency/voltage changes there is an element of gradient as data is realigned.

The point of ramping up quickly is to increase the overall responsiveness of the system, rather than lingering at lower frequencies while waiting for the OS to pass commands through a translation layer. Speed Shift cannot increase absolute maximum performance, but on short workloads that require a brief burst of performance, it can make a big difference in how quickly that task gets done. Ultimately, much of what we do falls into this category, such as web browsing or office work. As an example, web browsing is all about getting the page loaded quickly, and then getting the processor back down to idle.
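
To make the burst argument concrete, here is a minimal sketch of a toy model. It assumes the clock rises linearly from idle to maximum over the ramp time and then holds, which is a simplification and not Intel's actual stepping behavior. The function name `completion_time_ms` and the workload sizes are hypothetical; the 400 MHz, 3.4 GHz, 35 ms and 100 ms figures are the ones quoted in this article.

```python
import math


def completion_time_ms(work_cycles, f_idle_mhz, f_max_mhz, ramp_ms):
    """Toy model: time (ms) to retire work_cycles when the clock rises
    linearly from f_idle_mhz to f_max_mhz over ramp_ms, then holds."""
    # 1 MHz sustained for 1 ms retires 1000 cycles, so expressing work in
    # kilocycles lets MHz * ms multiply out directly.
    work_kc = work_cycles / 1000.0
    if ramp_ms <= 0 or f_max_mhz == f_idle_mhz:
        # No ramp to model: the whole job runs at the maximum clock.
        return work_kc / f_max_mhz
    slope = (f_max_mhz - f_idle_mhz) / ramp_ms  # MHz gained per ms
    ramp_work_kc = 0.5 * (f_idle_mhz + f_max_mhz) * ramp_ms
    if work_kc >= ramp_work_kc:
        # The burst outlasts the ramp: the remainder runs at full clock.
        return ramp_ms + (work_kc - ramp_work_kc) / f_max_mhz
    # The burst ends mid-ramp: solve 0.5*slope*t^2 + f_idle*t = work for t.
    disc = f_idle_mhz ** 2 + 2.0 * slope * work_kc
    return (-f_idle_mhz + math.sqrt(disc)) / slope


if __name__ == "__main__":
    # Hypothetical workloads: a 50-million-cycle burst and a ~10 second job.
    for label, cycles in (("short burst", 50e6), ("long job", 34e9)):
        fast = completion_time_ms(cycles, 400, 3400, 35)   # Speed Shift ramp
        slow = completion_time_ms(cycles, 400, 3400, 100)  # legacy ramp
        print(f"{label}: {fast:.1f} ms vs {slow:.1f} ms")
```

Under these assumptions the short burst finishes in about 30 ms with the faster ramp versus about 46 ms with the slower one, while the long job changes by well under 1%. That is the same pattern the benchmarks in this piece are chosen to expose.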

For this short piece, Intel was able to provide us with the Windows 10 patch for Speed Shift ahead of time, so that we could test and see what kind of gains it can achieve. This gives us a somewhat unique situation, since we can isolate this one variable on a new processor and measure its impact on various workloads.

To test Speed Shift, I’ve chosen several tasks with workloads that could show some gain from it. Tests which run the processor at its maximum frequency for long periods of time are not going to show any significant gain, since you are not limited by the responsiveness of the processor in those cases. The first test is PCMark 8, a benchmark that attempts to represent real-life tasks and whose workload is not constant. In addition, I’ve run the system through several JavaScript tests, which are the best-case scenario for something like Speed Shift, since the processor has to quickly complete a task in order to allow you to enjoy a website.

The processor in question is an Intel Core i7-6600U, with a base frequency of 2.6 GHz and a turbo frequency of 3.4 GHz. Despite the base frequency being rated on the box at 2.6 GHz, the processor can go all the way down to 400 MHz when idle, so being able to ramp up quickly could make a big impact even on the U-series Skylake processors. My guess is that it will be even more beneficial to the Y-series Core m3/m5/m7 parts, since they have a larger dynamic range and typically more thermal constraints.

PCMark 8

PCMark 8 - Home

PCMark 8 - Work

Both the Home and Work tests show a very small gain with Speed Shift enabled. The length of these benchmarks, which run between 30 and 50 minutes, would likely mask any gains on short workloads. I think this illustrates that Speed Shift is just one more tool, and not a holy grail for performance. The gain on Home is just under 3%, and the difference on the Work test is negligible.

JavaScript Tests

JavaScript is one of the use cases where short burst workloads are the name of the game, and here Speed Shift has a much bigger impact. All tests were done with the Microsoft Edge browser.

Mozilla Kraken 1.1

Google Octane 2.0

WebXPRT 2015

WebXPRT 2013

The time to complete the Kraken 1.1 test is the least affected, with just a 2.6% performance gain, but Octane's score shows an increase of over 4%. The big win here though is WebXPRT. WebXPRT includes subtests, and in particular the Photo Enhancement subtest can see up to a 50% improvement in performance. This bumps the scores up significantly, with WebXPRT 2015 showing an almost 20% score increase and WebXPRT 2013 a 26% gain. These leaps in performance are certainly the kind that would be noticeable to the end user manipulating photographs in something like Picasa or watching web-page based graph adjustments such as live stock feeds.
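
Kraken reports a completion time, where lower is better, while Octane and WebXPRT report scores, where higher is better, so the percentages come from two different calculations. A minimal sketch of both follows. It assumes the gain for a timed test is expressed as a speedup; the article does not say which convention was used. The function names and the sample figures are hypothetical and are not the measured results.

```python
def gain_from_scores(baseline_score, new_score):
    """Percent gain for a benchmark where a higher score is better."""
    return (new_score / baseline_score - 1.0) * 100.0


def gain_from_times(baseline_ms, new_ms):
    """Percent gain for a benchmark where a lower time is better,
    expressed as a speedup (assumed convention)."""
    return (baseline_ms / new_ms - 1.0) * 100.0


if __name__ == "__main__":
    # Hypothetical inputs chosen only to illustrate the arithmetic.
    print(f"score-based: {gain_from_scores(100.0, 126.0):.1f}%")
    print(f"time-based:  {gain_from_times(1026.0, 1000.0):.1f}%")
```

The two conventions only diverge noticeably for large changes, so for gains of a few percent the choice makes little practical difference.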

Power Consumption

The other side of the coin is power consumption. A processor that can quickly ramp up to its maximum frequency could consume more power, due to the greater penalty of increasing the voltage. But if it can complete the task quickly and get back to idle again, there is a chance to be more efficient: the work is done in tens of milliseconds rather than hundreds, and the frequency ramps up and down again before the old P-State method has decided to do anything. The principle of 'work fast, finish now' was the backbone of Intel's 'Race To Sleep' strategy during the ultrabook era, which focused on bursts of response-related performance. However, the drive for battery life means that efficiency has tended to matter more, especially as devices and batteries get smaller.
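
The trade-off can be sketched as simple energy bookkeeping over a fixed window: a burst at some active power, followed by idle for the rest of the window. This is an illustrative model only. The function name `energy_mj` and every power and duration figure below are hypothetical, since SoC power was not measured for this piece.

```python
def energy_mj(active_power_w, active_ms, idle_power_w, window_ms):
    """Energy in millijoules for one burst followed by idle time.

    Watts multiplied by milliseconds gives millijoules directly.
    """
    if active_ms > window_ms:
        raise ValueError("burst does not fit inside the window")
    idle_ms = window_ms - active_ms
    return active_power_w * active_ms + idle_power_w * idle_ms


if __name__ == "__main__":
    # Hypothetical: a fast, high-power burst vs a slower, lower-power one,
    # each measured over the same one-second window.
    fast = energy_mj(active_power_w=10.0, active_ms=30.0,
                     idle_power_w=0.5, window_ms=1000.0)
    slow = energy_mj(active_power_w=6.0, active_ms=50.0,
                     idle_power_w=0.5, window_ms=1000.0)
    print(f"fast burst: {fast:.0f} mJ, slow burst: {slow:.0f} mJ")
```

With these made-up inputs the two cases land within about 1% of each other (785 mJ versus 775 mJ), which shows how a faster ramp can improve responsiveness without necessarily moving net energy use much in either direction.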

Due to the way modern processors work, we don’t have the tools to directly measure the SoC power. Intel has told us that Speed Shift does not impact battery life very much, one way or the other, so to verify this, I've run our light battery life test with the option disabled and enabled.

Core i7-6600U Battery Efficiency

This task is likely one of the best-case scenarios for Speed Shift. It consists of launching four web pages per minute, with plenty of idle time in between. Although Speed Shift seems to have a slight edge, it is very small and would fall within the margin of error on this test. Some tasks may see a slight improvement in efficiency, and others may see a slight regression, but Speed Shift is less of a power-saving tool than other pieces of Skylake. Looking at it another way, if, for example, the XPS 13 with Skylake were to get 15 hours of battery life, Speed Shift would only change the result by about 7 minutes. Responsiveness increases, but net power use remains about the same.
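
For a sense of scale, the 7 minute figure can be turned into a relative change with one line of arithmetic. This is a small sketch using only the two numbers quoted above; the function name is hypothetical.

```python
def relative_change_percent(baseline_hours, delta_minutes):
    """Express a change in minutes as a percentage of a runtime in hours."""
    return delta_minutes / (baseline_hours * 60.0) * 100.0


if __name__ == "__main__":
    # 7 minutes out of a 15 hour (900 minute) runtime.
    print(f"{relative_change_percent(15, 7):.2f}%")
```

That works out to under 1% of total runtime, which is why the difference sits inside the margin of error for this test.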

Final Words

Skylake did not bring the large leap in clock-for-clock performance that we have become accustomed to with new Intel microarchitectures, but when you look at the overall package, there was a decent net gain in performance combined with new technologies. For example, being able to maintain higher Turbo frequencies on multiple cores has increased the stock-to-stock performance more than the smaller IPC gains.

Speed Shift is just one small part of the overall performance gain, and one that we have not been able to look at until now. It does lead to some pretty big gains in task completion, if the workloads are bursty and short enough for it to make a difference. It can’t increase the absolute performance of the processor, but it can get it to maximum performance in a much shorter amount of time, as well as get it back down to idle quicker. Intel is billing it as improved responsiveness, and it’s pretty clear that they have achieved that.

The one missing link is operating system support. We’ve been told that the patch to enable this is coming to Windows 10 in November. While this short piece looks at what Speed Shift can bring to the table in terms of performance, if you'd like to read more about how it is implemented, please check out the Skylake architecture analysis which goes into more detail.

Update: Daniel Rubino at Windows Central has tested the latest Windows 10 Insider build 10586 and it appears to enable Speed Shift on his Surface Pro 4, which is in line with the November timeline we were provided.

54 Comments

  • sr1030nx - Sunday, November 8, 2015 - link

    NoScript?
  • Denithor - Monday, November 9, 2015 - link

    In your first two paragraphs you misspelled noticeable as "noticable" - lost an "e" somehow.
  • MrSpadge - Tuesday, November 10, 2015 - link

    The first two frequency steps are VERY quick with Speed Shift. The CPU seems to go from 1.0 GHz to ~2.5 GHz in about 2 ms. That's encouraging. When doing number crunching on GPUs with BOINC we always have the problem that idle CPUs are very slow to feed the GPU with new work. It's not much work required, but it's got to happen quickly - otherwise the GPU runs dry. That's why partly loading the CPU with other tasks is currently a good "work-around" for that.
  • danwat1234 - Saturday, October 15, 2016 - link

    Wait.... the time it takes for a CPU core to ramp to full turboboost is way less than a second. .. I run Seti@home on my laptop's GPU and it takes maybe 20 seconds or so of CPU crunching for the GPU to start working. During that time, the CPU core is at full clockspeed I assume. I don't think that is your issue.
    Maybe the CPU power is just a limitation in your case or the priority of the process needs to be increased. What project are you running?
