Concluding Remarks: Arm v9 for Android

When we move through significant revisions of Arm’s architecture, up to v8 and now v9, it’s important to note that the new features defined in the ISA do not always fundamentally improve performance – it’s up to the microarchitecture teams to build the cores to the ISA specifications, and the implementation teams to enable the core in silicon with frequency and power efficiency. Accomplishing that requires a good process node, design technology co-optimization, and then partners that can execute by building the best devices for that processor.

Qualcomm’s target with the Snapdragon 8 Gen 1 is very clearly the 2022 Android flagship smartphones: new cores, new graphics, enhanced machine learning capabilities, a step function in camera processing power, and an integrated X65 modem, all built on Samsung’s 4nm process node technology. The flagship Android space is an area in which Qualcomm has been comfortable for a number of years; however, the increased thermals of last generation’s Snapdragon 888 gave a number of analysts in the space a bit of a squeaky bum moment.

It’s hard to tell immediately from our small set of tests whether that remains the case. Samsung’s 4nm node has improvements over the previous generation 5nm node; however, the numbers in Qualcomm’s presentation were above and beyond those that Samsung provided, perhaps indicating that additional improvements in both architecture and implementation have led to those performance numbers.

Our testing shows +19% floating point performance on the X2 core, which is almost the +20% that Qualcomm quotes, but only +8% in integer performance, which is often the more widely quoted figure. We’re seeing power efficiency improvements for sure on the X2 core, with an overall efficiency improvement of 17%, but peak power has also increased, in part because some of our tests make use of the additional cache in the system. Our machine learning tests are +75% over the previous generation, although not the 4x that Qualcomm states; we still need to do more work here on power efficiency testing. On the gaming side, our 'first run' numbers showcase some explosive gains in GPU throughput.
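For readers who want to see how uplift and efficiency percentages of this kind relate to each other, here is a minimal Python sketch. It assumes efficiency is defined simply as benchmark score divided by average power, which is one common convention and not necessarily the exact method behind the figures above. The function names and all input values are hypothetical and chosen only for illustration; they are not measurements from this review.

```python
def uplift(new: float, old: float) -> float:
    """Relative change of `new` over `old` as a fraction (0.19 means +19%)."""
    return new / old - 1.0


def efficiency_uplift(new_score: float, new_power_w: float,
                      old_score: float, old_power_w: float) -> float:
    """Relative change in score-per-watt between two generations.

    Assumption: efficiency = score / average power. Other definitions
    (for example, score per joule over a whole run) will give different values.
    """
    new_eff = new_score / new_power_w
    old_eff = old_score / old_power_w
    return uplift(new_eff, old_eff)


if __name__ == "__main__":
    # Hypothetical inputs, not data from the article.
    old_score, new_score = 100.0, 119.0   # arbitrary benchmark units
    old_power, new_power = 4.0, 4.2       # watts

    print(f"performance: {uplift(new_score, old_score):+.1%}")
    print(f"power:       {uplift(new_power, old_power):+.1%}")
    print(f"efficiency:  "
          f"{efficiency_uplift(new_score, new_power, old_score, old_power):+.1%}")
```

The point of the sketch is that a performance gain and an efficiency gain can coexist with higher power draw: as long as the score grows faster than the power, score-per-watt still improves.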

Although we’ve only done a few tests here, I would be remiss if I didn’t mention the elephant in the room: MediaTek. In the last month MediaTek announced a return to the high-end with a flagship processor of its own, using the same 1+3+4 configuration with slightly higher frequencies, more cache, and built on TSMC’s N4 process. I feel implementation will be the key metric here, so we will be analyzing how well MediaTek has been able to optimize for TSMC N4 compared with Qualcomm on Samsung 4nm. I should point out that a processor is more than just its CPU cores: we’ll also see Adreno vs Mali on graphics, the different machine learning approaches, and how the two companies approach 5G and connectivity, which has been one of Qualcomm’s most prominent strengths to date.

We look forward to testing the Snapdragon 8 Gen 1 in more detail in the New Year, as well as seeing how many of the main smartphone OEMs choose Qualcomm for their flagship devices.


169 Comments


  • eastcoast_pete - Tuesday, December 14, 2021 - link

    I guess that's one "impressive" benchmark score, just not the one any user would hope for. Less than 10% complete after half an hour is pretty abysmal. Doesn't bode well for ARM's supposedly much improved LITTLE core designs. Just for comparison, how did the last A55 cores perform in that test?
  • dudedud - Wednesday, December 15, 2021 - link

    Andrei said something along the lines of 14hrs for both int and fp SPEC 06.
  • Wilco1 - Thursday, December 16, 2021 - link

    Remember the little cores are much slower than the big cores since they have very little cache and run at a low frequency. In SD888 the little cores are 6.4 times slower than the big core. That should be reduced to about 5 times in 8gen1.

    I think having 4 little cores is too much, they don't contribute to benchmark scores, so you could have just 1 or 2 for background tasks and use the area to quadruple the tiny L2.
  • iphonebestgamephone - Thursday, December 16, 2021 - link

    They do contribute to benchmark scores. Four of them can be helpful when you need to load up all 4 big cores for the foreground and there are still some background tasks going on.
  • Wilco1 - Saturday, December 18, 2021 - link

    That does not make sense. The little cores have almost no L2 cache so will be competing with (and slowing down) the big cores due to the small L3. Having fewer little cores with a much larger L2 means more L3 is available per core, improving performance when all cores are loaded.

    A little core is most useful for background tasks when the screen is off and mid/big cores are powered down. If you have more background tasks than a single little core could handle, then it's not really "background", and it would be better to run them on a mid core since that will be several times more efficient than 4 little cores (see the efficiency graph, the mid core in eg. SD888 has about the same efficiency as a maxed out little core).
  • iphonebestgamephone - Sunday, December 19, 2021 - link

    When all 3 mid cores and the prime core are fully loaded in apps that use 4 threads, the little cores are doing all the background tasks. How much L3 do background tasks need anyway?
  • syxbit - Tuesday, December 14, 2021 - link

    >>There is no AV1 decode engine in this chip, with Qualcomm’s VPs stating that the timing for their IP block did not synchronize with this chip.

    This is very disappointing. The Radeon 6800, which launched over a year ago, has hardware AV1 decode. I imagine the 2021 Exynos and Tensor chips will all do AV1.
  • TheinsanegamerN - Tuesday, December 14, 2021 - link

    AV1 won't be necessary for a decade at least. AV1 only hit stable 1.0 spec in 2019; this chip was likely already in the design phase beforehand.
  • movax2 - Wednesday, December 15, 2021 - link

    YouTube and Netflix already use AV1 for a good portion of their videos. Your statement is wrong.
  • GC2:CS - Tuesday, December 14, 2021 - link

    The GPU upgrade seems absolutely massive.

    I have not seen a 50% gain in years, if I remember correctly.
