After a series of piecemeal announcements from different hardware vendors over the past year, the future of High Bandwidth Memory 2 (HBM2) is finally coming into focus. Continuing the industry’s ongoing momentum with HBM2 technology, late last month JEDEC published an updated revision of the HBM2 standard. The updated standard added support for even faster memory speeds of up to 3.2Gbps/pin, and in the process pushed the fastest speed for a complete stack of HBM2 memory to 410GB/sec. Meanwhile the memory manufacturers themselves have been preparing for this moment for a while, and Samsung has put out their own matching announcement regarding their Flashbolt HBM2 memory.

First and foremost, let’s dive into the latest version of the HBM2 standard. JESD235C, as it’s officially called, is a relatively small update to the HBM2 standard. The previous revision, published a couple of years back, introduced more sizable changes: it added 12-Hi memory stacks and expanded both the speed and capacity of HBM2 memory. The latest revision, by contrast, is a more measured update focusing on performance.

The biggest change here is that the HBM2 standard has officially added support for two higher data rates, bringing 2.8Gbps/pin and 3.2Gbps/pin into the standard. Coming from the previous standard’s maximum rate of 2.4Gbps/pin, this represents an up-to 33% increase in memory bandwidth in the case of 3.2Gbps HBM2. Or to put this in more practical numbers, a single stack of 3.2Gbps HBM2 will deliver 410GB/sec of bandwidth, up from 307GB/sec in the last standard. For a modern, high-end processor supporting 4 stacks (4096-bit) of memory, this brings the aggregate bandwidth available to a whopping 1.64 TB/sec.
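The per-stack figures follow directly from the per-pin data rate and the 1024-bit interface of a stack. As a minimal sketch of that arithmetic (the function and constant names are illustrative and not taken from the JEDEC standard):

```python
# Illustrative arithmetic only; the names here are hypothetical, not from JESD235C.
HBM2_BUS_WIDTH_BITS = 1024  # interface width of a single HBM2 stack


def stack_bandwidth_gbs(pin_rate_gbps: float,
                        bus_width_bits: int = HBM2_BUS_WIDTH_BITS) -> float:
    """Peak bandwidth in GB/s: per-pin rate (Gb/s) x number of pins / 8 bits per byte."""
    return pin_rate_gbps * bus_width_bits / 8


old = stack_bandwidth_gbs(2.4)          # previous maximum data rate
new = stack_bandwidth_gbs(3.2)          # JESD235C maximum data rate
uplift_pct = (new / old - 1) * 100      # relative increase per stack
four_stacks_tbs = 4 * new / 1000        # 4 stacks, i.e. a 4096-bit interface

print(f"Per stack: {old:.1f} GB/s -> {new:.1f} GB/s (+{uplift_pct:.0f}%)")
print(f"Four stacks: {four_stacks_tbs:.2f} TB/s")
```

The exact result for 3.2Gbps is 409.6GB/sec, which is where the rounded 410GB/sec figure comes from.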

HBM2 Memory Generations
                                 JESD235C    JESD235B     JESD235A
Max Bandwidth Per Pin            3.2 Gb/s    2.4 Gb/s     2 Gb/s
Max Die Capacity                 2 GB        2 GB         1 GB
Max Dies Per Stack               12          12           8
Max Capacity Per Stack           24 GB       24 GB        8 GB
Max Bandwidth Per Stack          410 GB/s    307.2 GB/s   256 GB/s
Effective Bus Width (1 Stack)    1024-bit    1024-bit     1024-bit
Voltage                          1.2 V       1.2 V        1.2 V

All told, this latest update keeps even a single stack of HBM2 quite competitive on the bandwidth front. For comparison’s sake, a 256-bit GDDR6 memory bus with 14Gbps memory can reach 448GB/sec of aggregate bandwidth; so a single stack of HBM2 only slightly trails that. And, of course, HBM2 can scale up to a larger number of stacks more easily than GDDR6 can scale up in bus width, keeping larger HBM2 topologies well ahead of discrete GDDR6 memory chips as far as bandwidth is concerned.
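To make the comparison concrete, here is a small sketch that applies the same rate-times-width arithmetic to both memory types. The helper name is hypothetical, and the stack counts are simply examples of how HBM2 scales by adding 1024-bit stacks:

```python
# Illustrative comparison; the helper name is hypothetical.
def peak_bandwidth_gbs(pin_rate_gbps: float, bus_width_bits: int) -> float:
    """Peak bandwidth in GB/s for a memory interface of the given width."""
    return pin_rate_gbps * bus_width_bits / 8


# 256-bit GDDR6 bus at 14 Gb/s per pin
gddr6_256bit = peak_bandwidth_gbs(14, 256)

# HBM2 at 3.2 Gb/s per pin, 1024 bits per stack
hbm2_by_stacks = {n: peak_bandwidth_gbs(3.2, 1024 * n) for n in (1, 2, 4)}

print(f"GDDR6, 256-bit @ 14 Gb/s: {gddr6_256bit:.0f} GB/s")
for stacks, bandwidth in hbm2_by_stacks.items():
    ratio = bandwidth / gddr6_256bit
    print(f"HBM2, {stacks} stack(s) @ 3.2 Gb/s: {bandwidth:.1f} GB/s ({ratio:.2f}x)")
```

A single stack lands just under the 256-bit GDDR6 configuration, while two or more stacks pull well ahead.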

The trade-off, as always, is cost and capacity. HBM2 remains a premium memory technology – due in part to the complexities involved in TSVs and die stacking, and in part to manufacturer product segmentation – and there aren’t currently any signs that this will change. Meanwhile the latest HBM2 standard does not increase memory capacities at all – either through density or larger stacks – so the maximum size of a single stack remains 24GB, allowing a 4 stack configuration to pack up to 96GB of memory.
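The capacity ceiling is simple multiplication of die capacity, stack height, and stack count. A minimal sketch, with hypothetical helper names:

```python
# Illustrative arithmetic only; the function names are hypothetical.
def stack_capacity_gb(die_capacity_gb: int, dies_per_stack: int) -> int:
    """Capacity of one stack: capacity per DRAM die x dies in the stack."""
    return die_capacity_gb * dies_per_stack


def total_capacity_gb(die_capacity_gb: int, dies_per_stack: int, stacks: int) -> int:
    """Capacity of a multi-stack configuration."""
    return stack_capacity_gb(die_capacity_gb, dies_per_stack) * stacks


max_stack = stack_capacity_gb(2, 12)          # 2 GB dies in a 12-Hi stack
max_four_stacks = total_capacity_gb(2, 12, 4)  # four such stacks

print(f"12-Hi stack of 2 GB dies: {max_stack} GB")
print(f"Four stacks: {max_four_stacks} GB")
```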


[Figure: HBM In A Nutshell]

Meanwhile, it’s interesting to note that as of JESD235C, JEDEC has backed off just a bit with regards to standardizing HBM2 die stack dimensions. In the previous version of the standard, the dimensions for 12-Hi stacks were listed as “TBD”, but for the new revision the group has seemingly punted on any standardization whatsoever. As a result, there isn’t a single standard height for 12-Hi stacks, leaving it up to memory manufacturers to set their own heights, and for customers to accommodate any differences between the manufacturers.

It is also worth noting that while the HBM2 standard doesn’t directly impose power limits on its own, the standard does specify regular operating voltages. HBM2 since its inception has operated at 1.2V, and the latest standard has not changed this. So the faster memory speeds should come with little (if any) increase in power consumption, as they won’t require higher voltages to drive them.

Finally, it looks like JEDEC has passed on formally adopting the “HBM2E” moniker for the latest memory standard. In pre-standard technology announcements, Samsung, SK Hynix, and others all referred to the memory as HBM2E. And indeed, Samsung still does. However this appears to be an entirely informal arrangement, as the official wording both on JEDEC’s page and in the standard itself continues to refer to the memory as HBM2. So it is almost guaranteed that we’re going to see the two terms thrown around interchangeably over the next couple of years.

Samsung Flashbolt Memory Update: Volume Production In H1’2020

Following the HBM2 standard update, Samsung this afternoon has also issued its own announcement offering an update on the status of their third-generation Flashbolt HBM2E memory. Samsung was the first company to release information on the new speeds, announcing Flashbolt almost a year ago during NVIDIA’s 2019 GPU Technology Conference. At the time Samsung’s announcement was still preliminary, and the company wasn’t saying when they would actually go into mass production. But now we finally have our answer: the first half of this year.

Given that almost a year has passed since the original Flashbolt announcement, Samsung’s announcement is as much a reminder that Flashbolt exists as it is a proper update. Still, today’s announcement offers a bit more detail than Samsung’s relatively high-level reveal last year.

Samsung HBM2 Memory Comparison
                              Flashbolt               Aquabolt    Flarebolt   Flarebolt   Flarebolt   Flarebolt
Total Capacity                16 GB                   8 GB        8 GB        4 GB        8 GB        4 GB
Bandwidth Per Pin             3.2 Gb/s (4.2 Gb/s OC)  2.4 Gb/s    2 Gb/s      2 Gb/s      1.6 Gb/s    1.6 Gb/s
Number of DRAM ICs per Stack  8                       8           8           4           8           4
DRAM IC Process Technology    1y                      20 nm       20 nm       20 nm       20 nm       20 nm
Effective Bus Width           1024-bit                1024-bit    1024-bit    1024-bit    1024-bit    1024-bit
Voltage                       1.2 V?                  1.2 V       1.35 V      1.35 V      1.2 V       1.2 V
Bandwidth per Stack           410 GB/s (538 GB/s OC)  307.2 GB/s  256 GB/s    256 GB/s    204.8 GB/s  204.8 GB/s

Of particular note, Samsung is only announcing 16GB stacks at this time, built using 2GB dies stacked in an 8-Hi configuration. And while this doesn’t preclude Samsung eventually going to 12-Hi, 24GB stacks in the future, it isn’t where the company is going to start. The memory dies themselves are being manufactured on Samsung’s 1y process technology.

Meanwhile, Samsung appears to be setting some ambitious targets for data rates for Flashbolt. Along with supporting the new 3.2Gbps HBM2 standard, Samsung claims that they are able to go out of spec with Flashbolt, taking the memory to an even speedier 4.2Gbps. This would be a further 31% data rate increase over 3.2Gbps HBM2, and it would push the bandwidth available in a single stack to 538GB/sec, or better than half a terabyte a second. The key phrase here, of course, is “out of spec”; it’s not clear whether there are any HBM2 memory controllers that will be able to keep up with Samsung’s data rates, and of course there’s the question of power consumption. So while it’s all but guaranteed that Samsung has customers lined up to use Flashbolt at 3.2Gbps, it will be interesting to see whether any kind of high-volume products ship at data rates higher than that.

Overall, this makes Samsung the second vendor to announce out of spec HBM2 memory. Last year SK Hynix announced their own HBM2E effort, which is expected to reach 3.6Gbps. So whatever happens, it would seem we’ll now have multiple vendors shipping HBM2E memory rated to go faster than the brand-new 3.2Gbps spec.
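To put the out-of-spec rates in context, the sketch below computes each vendor’s per-stack bandwidth and its margin over the 3.2Gbps standard. The data rates are the ones quoted above; the variable names are hypothetical, and the SK Hynix bandwidth figure is derived here from its quoted data rate rather than taken from a vendor announcement:

```python
# Illustrative arithmetic; rates are those quoted in the article, names are hypothetical.
SPEC_MAX_GBPS = 3.2      # highest per-pin rate in JESD235C
BUS_WIDTH_BITS = 1024    # interface width of one HBM2 stack

vendor_rates_gbps = {
    "Samsung Flashbolt": 4.2,
    "SK Hynix HBM2E": 3.6,
}

results = {}
for name, rate in vendor_rates_gbps.items():
    bandwidth_gbs = rate * BUS_WIDTH_BITS / 8          # GB/s per stack
    over_spec_pct = (rate / SPEC_MAX_GBPS - 1) * 100   # margin over the standard
    results[name] = (bandwidth_gbs, over_spec_pct)
    print(f"{name}: {rate} Gb/s per pin, {bandwidth_gbs:.1f} GB/s per stack, "
          f"{over_spec_pct:.1f}% above spec")
```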

Source: Samsung

  • PeachNCream - Tuesday, February 4, 2020

    I hope you're right and that AMD is heading in that direction. HBM costs will need to come down quite a bit though, since APUs represent a portion of AMD's lower priced computing segment. The other problem is the inevitable improvement of technology. If HBM is cheap enough to be added to the APU package, what will higher performing products offer in terms of capabilities on a relative scale? Will a hypothetical HBM-equipped APU have enough performance to get relatively closer to the low- to mid-range discrete GPUs on sale if/when they go on sale, and what does APU performance end up looking like when running games and other software that will be available at that time? Hopefully, we'll see the gap close, but it is going to be hard to predict how the cards will fall.
  • Xajel - Tuesday, February 4, 2020

    Almost 3 years ago, Samsung announced an LC-HBM2/3 solution that uses a narrower bus. IIRC it was something like 448~512-bit, compared to the 1024-bit bus current HBM has.

    This narrower bus allows the use of much less expensive organic interposer packaging rather than the silicon interposer required by the wider 1024-bit bus. In theory it could be used like any chip/chiplet in an MCM package.

    But Samsung has been silent since then and nothing was released; it seems Samsung didn't find enough interest in the product. Sad if that was the case, as it has a lot of potential for a variety of products.
  • 29a - Tuesday, February 4, 2020

    I'd love to see what an APU with 4 stacks of this would perform like.
  • ballsystemlord - Friday, February 7, 2020

    Grammar error:

    "Of particular note, Samsung is only announcement 16GB stacks at this time,..."
    Wrong suffix:
    "Of particular note, Samsung is only announcing 16GB stacks at this time,..."
