AMD Beema/Mullins Architecture & Performance Preview
by Anand Lal Shimpi on April 29, 2014 12:00 AM EST
When AMD launched its Kabini and Temash APUs last year it delivered a compelling cost/performance story, but its power story wasn’t all that impressive. Despite being built out of relatively low power components, nearly all of AMD’s entry level APUs carried 15W TDPs, with a couple weighing in at 8 - 9W and only a single 1GHz dual-core part dropping down to 3.9W. By comparison, Intel was shipping full blown Haswell Ultrabook parts at 15W - offering substantially better CPU performance, in a similar thermal envelope (although at a higher cost). The real disruption for AMD was Intel’s Bay Trail, which showed up with a similar looking micro architecture running at substantially higher clock speeds and TDPs below 8W.
AMD seemed to have all of the right pieces to build a power efficient mobile SoC, but for some reason we weren’t seeing it. Today that begins to change with the successors to Kabini and Temash.
Codenamed Beema and Mullins, these are the 2014 updates to Kabini and Temash (respectively). Beema is aimed at entry level notebooks, while Mullins targets tablets. For now, both are designed for Windows machines. Although I suspect we’ll eventually see AMD address the native Android market head on, for now AMD is relying on running Android on top of Windows for those who really want it. No word on if/when we’ll get a socketed Beema for entry level desktops.
Like their predecessors, Beema and Mullins combine four low power AMD x86 cores (Puma+ this time, instead of Jaguar) with 128 GCN based Radeon GPU cores. AMD will continue to offer a couple of dual-core SKUs, but they are harvested from a quad-core die. AMD remains unwilling to release official die area figures, but there is a slight increase in transistor count:
AMD/Intel Transistor Count & Die Area Comparison
|SoC|Process Node|Transistor Count|Die Area|
|---|---|---|---|
|AMD Zacate|TSMC 40nm|450M+|75mm²|
|AMD Kabini/Temash|TSMC 28nm|914M|~107mm² (est)|
|AMD Beema/Mullins|GF 28nm|930M|~107mm² (est)|
|AMD Llano|GF 32nm SOI|1.18B|228mm²|
|AMD Trinity/Richland|GF 32nm SOI|1.30B|246mm²|
|AMD Kaveri|GF 28nm SHP|2.41B|245mm²|
|Intel Haswell (4C/GT2)|Intel 22nm|1.40B|177mm²|
I’d expect a similar die size to Kabini/Temash. It’s interesting to note that these SoCs have a transistor count somewhere south of Apple’s A7.
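To put the table in perspective, a rough density figure can be derived by dividing transistor count by die area. The sketch below does that in Python using only the numbers listed in the table. The function names are illustrative, the Kabini/Temash and Beema/Mullins die areas are estimates, and Zacate's "450M+" is treated as exactly 450M, so its result is a lower bound. Vendors also count transistors differently, so cross-vendor comparisons should be read as approximate.

```python
# Figures copied from the comparison table. Areas for Kabini/Temash and
# Beema/Mullins are estimates; Zacate's count is a lower bound ("450M+").
CHIPS = [
    ("AMD Zacate", 450e6, 75.0),
    ("AMD Kabini/Temash", 914e6, 107.0),
    ("AMD Beema/Mullins", 930e6, 107.0),
    ("AMD Llano", 1.18e9, 228.0),
    ("AMD Trinity/Richland", 1.30e9, 246.0),
    ("AMD Kaveri", 2.41e9, 245.0),
    ("Intel Haswell (4C/GT2)", 1.40e9, 177.0),
]


def density_mtr_per_mm2(transistors, area_mm2):
    """Return millions of transistors per square millimetre."""
    return transistors / area_mm2 / 1e6


def percent_increase(old, new):
    """Return the relative change from old to new, in percent."""
    return (new - old) / old * 100.0


for name, transistors, area in CHIPS:
    print(f"{name:24s} {density_mtr_per_mm2(transistors, area):5.2f} MTr/mm^2")

# The "slight increase" from Kabini/Temash to Beema/Mullins.
print(f"Transistor count change: +{percent_increase(914e6, 930e6):.2f}%")
```

With these inputs the generational increase in transistor count works out to a little under 2%, which is consistent with the expectation of a similar die size.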
Puma+ is based on the same microarchitecture as Jaguar. We’re still looking at a 2-wide OoO design with the same number of execution units and data structures inside the chip. The memory interface remains unchanged as well at 64 bits wide. These new SoCs are still built on the same 28nm process as their predecessors. The process, however, has seen some improvements. Not only are both the CPU and GPU designs slightly better optimized for lower power operation, but both benefit from improvements to the manufacturing process resulting in substantial decreases in leakage current.
AMD claims a 19% reduction in core leakage/static current for Puma+ compared to Jaguar at 1.2V, and a 38% reduction for the GPU. The drop in leakage directly contributes to a substantially lower power profile for Beema and Mullins.
AMD also went in and tweaked the SoC’s memory interface. Kabini/Temash had a standard PC-like DDR3 memory interface. All of the complexity required for broad memory module compatibility and variations in trace routing was handled by the controller itself. This not only added complexity to the DDR3 interface but power as well. With Beema and Mullins, AMD took a page from the smartphone SoC design guide and traded flexibility for power. These platforms now ship with stricter guidelines as to what sort of memory can be used on board and how traces must be routed. The result is a memory interface that shaves off more than 500mW when in this stricter, low power mode. OEMs looking to ship a design with socketed DRAM can still run the memory interface in a higher power mode to ensure memory compatibility.
These SoCs won’t be available in a PoP configuration unfortunately - OEMs will have to rely on discrete DRAM packages rather than a fully integrated solution. Beema/Mullins also show up to a 200mW reduction in power consumed by the display interface compared to Kabini/Temash.
The combination of all of this is 20% lower idle power compared to the previous generation of AMD entry level and low power APUs. AMD put together a nice graph illustrating its progress over the years:
Beema and Mullins are definitely in a good place, however they still do consume more power at idle than the smartphone SoCs we typically find in iOS and Android tablets. AMD isolated APU power for the graph above and is using an “eReader” workload (aka display on but not animating, system otherwise idle). It just so happens I gathered similar data for our 2013 Nexus 7 review. The workloads and measurements are different (AMD isolates APU power, I’m looking at total platform power minus display) but it’s enough to put things in perspective:
AMD has dropped power consumption considerably over the years, but it’s still not as power efficient as high end mobile silicon.
AMD sees no value in supporting Microsoft's Connected Standby standard at this point, which makes sense given the limited success of Windows 8 tablets. Once again this seems to point to AMD eventually adopting Android for its tablet aspirations.
Looking forward, AMD has more tricks up its sleeve to continue to drive power down. Most interesting on the list? We’ll see an integrated voltage regulator (à la Haswell’s FIVR) from AMD in 2015.
name99 - Tuesday, April 29, 2014
"I’d expect a similar die size to Kabini/Temash. It’s interesting to note that these SoCs have a transistor count somewhere south of Apple’s A7."
Isn't this something of an apples to oranges comparison?
This AMD SOC is basically CPU+GPU+memory controller.
A7 is all that plus secure storage, ISP, h264 encoder/decoder (the genuine low power deal, not some "hardware assisted" frankenstein that runs the CPU and GPU [together, both at high power] to do the job) along with god knows what else --- flash controller? fingerprint recognition cell?
mczak - Tuesday, April 29, 2014
Kabini / Temash also have full custom hw video encode/decode (all gcn based chips do), though if you want some hybrid mode is still available, so that should be pretty comparable. Flash controller and the like, too. Yes no ISP, but OTOH there's quite a lot of stuff the A7 won't do too (like 2xsata, the 4x1 and 1x4 pcie 2.0 connectivity, 2xUSB 3.0, high-speed i/o isn't exactly cheap). Anyway, the transistor count and die size are comparable after all (based on the official numbers, Kabini is slightly larger, but the a7 has slightly more transistors, though there are different methods to count transistors and measure die size, not to mention they come out of different fabs), and it shouldn't be a surprise.
lmcd - Friday, May 2, 2014
AMD should try partnering with Broadcom (as Broadcom has no real SoCs for smartphones).
200380051 - Tuesday, April 29, 2014
I am eager to see how Mantle-enabled games will perform on these Mullins tablets. It seems a good fit from a technical standpoint. It might just push the PC gaming sphere to dig into tablet space. This in turn directly expands the market of game studios.
Also, I wonder if AMD's mobile lineup is to be the first product they'll roll out on Samsung's 14nm FinFET process. The process will be available starting 2015, as per their agreement. It's up to AMD to cook us a shrunk revision of these chips in a timely fashion.
Things are getting interesting.
MartinT - Wednesday, April 30, 2014
It seems to me that performance numbers for these parts don't tell even half the story without the accompanying power readings, considering the 'use whatever power until the chassis burns the user' approach of AMD's turbo implementation.
kirilmatt - Wednesday, April 30, 2014
How AMD did this is amazing. Imagine if this was released instead of Kabini/Temash. This destroys Bay Trail. I only hope that it gets released soon so it doesn't have to compete with Intel's 14nm SoCs. Anyways, good job AMD!
R3MF - Wednesday, April 30, 2014
Ubuntu tablet please...
purerice - Wednesday, April 30, 2014
Would one way to test the "non-turbo" performance be to loop some test 100 times and see the performance decrease over time? Considering the turbo would decrease as the CPU/APU heats up, we could see the performance difference and also how long you really get "turbo" turned on for.
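The looped test suggested in this comment can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a calibrated benchmark: `workload`, `run_repeated`, and `slowdown_ratio` are hypothetical names, the integer loop stands in for a real test, and a rising per-run time only hints at clock reductions since background activity can also change timings.

```python
import time


def workload(n=200_000):
    """Stand-in CPU-bound task (assumption: any fixed-size test would do)."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def run_repeated(runs=100, work=workload):
    """Run the same workload back to back and record each run's duration."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        work()
        durations.append(time.perf_counter() - start)
    return durations


def slowdown_ratio(durations, window=10):
    """Mean duration of the last `window` runs over the first `window` runs.

    A value well above 1.0 suggests the later runs were slower, which is
    what sustained thermal throttling would look like.
    """
    first = durations[:window]
    last = durations[-window:]
    return (sum(last) / len(last)) / (sum(first) / len(first))


if __name__ == "__main__":
    times = run_repeated(runs=100)
    print(f"late/early slowdown ratio: {slowdown_ratio(times):.2f}")
```

The index of the run where durations start to climb gives a rough idea of how long the boost clocks were held.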
azazel1024 - Thursday, May 1, 2014
I am impressed, but I am curious as to why Bay Trail beats it in the PCMark testing by a fair margin, but not in individual CPU benchmarks. If that is thermal limits...well, I will say that a lot of tablet workloads are very short term. Windows tablet workloads (at least mine)...not so much.
Enough of what I do would likely hit those thermal constraints and at least in my testing, my T100 doesn't clock down even under very prolonged workloads, like 15+ minutes of converting RAW to JPEG images. Or long gaming, like an hour or two of KSP.
That and I have concerns about that idle and low power use. Seems to be pretty good under higher load and performance seems to be there (with caveat/concern)...but idle and low power could be an issue. According to those AMD specs, the APU itself is using darn near 2w of power streaming 1080p. Based on my math, my T100 TOTAL uses around 2.4w of power when streaming 1080p (around 13hrs of run time, 31whr battery). I assume that the display, wifi, signal processor, memory, etc, etc are consuming more than .6w of power.
Having a much bigger battery or much shorter run time could be a big sticking point for a lot of tablet users (I know I'd have an issue if my 6-7hrs gaming/10hrs normal use/13hrs video turned into more like 3hrs gaming/6hrs normal use/8hrs video).
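The battery arithmetic in the comment above can be checked with a short Python sketch. The 31Wh capacity and 13 hour runtime are the commenter's figures; the function names are illustrative, and the 3.0W total used in the second calculation is purely an assumed value to show how runtime scales, not a measurement of any AMD platform.

```python
def average_power_w(battery_wh, hours):
    """Average platform power implied by a battery capacity and runtime."""
    return battery_wh / hours


def estimated_runtime_h(battery_wh, power_w):
    """Runtime implied by a battery capacity and a constant power draw."""
    return battery_wh / power_w


# Commenter's figures: 31Wh battery, about 13 hours of 1080p streaming.
print(f"Implied platform power: {average_power_w(31, 13):.2f} W")

# Assumed example only: a platform drawing 3.0W total on the same battery.
print(f"Runtime at 3.0 W: {estimated_runtime_h(31, 3.0):.1f} h")
```

The first figure comes out just under 2.4W, matching the commenter's estimate for total platform power.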
FITCamaro - Wednesday, May 7, 2014
My next tablet will likely be a Windows 8.1 tablet. I'd love the high end AMD CPU tested here even if it doesn't do as well on power as Bay Trail but bests it in GPU performance. Would be nice to be able to do better light mobile gaming.