Microsoft: DirectStorage 1.1 with GPU Decompression Finally on Its Way
by Ryan Smith on October 14, 2022 10:45 AM EST
As part of this week’s Microsoft Ignite developers conference, Microsoft’s DirectX team has published a few blog posts offering updates on the state of various game development-related projects. The biggest and most interesting of these is an update on DirectStorage, Microsoft’s API for enabling faster game asset loading. In short, the long-awaited 1.1 update, which adds support for GPU asset decompression, is finally on its way, with Microsoft intending to release the API to developers by the end of this year.
As a quick refresher, DirectStorage is Microsoft’s next-generation game asset loading API, and is designed to take advantage of the modern capabilities of both GPUs and storage hardware to allow for game assets to be more efficiently transferred directly to the GPU. On the I/O side of matters, DirectStorage offers new batched I/O operations that are designed to cut down on the number of individual I/O operations, reducing the overall I/O overhead. But even more notable than that, DirectStorage also enables (or rather, will enable) GPU asset decompression, allowing for modern compressed assets to bypass the CPU and be decompressed on the GPU instead.
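To make the batching idea concrete, here is a minimal sketch in Python. It is not the DirectStorage API: `BatchQueue`, `enqueue`, and `submit` are hypothetical names, and an in-memory byte string stands in for an asset file. It only shows the general shape of the pattern, where many read requests are queued cheaply and then handed over in a single submission rather than one call per read.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class BatchQueue:
    """Hypothetical request queue: collect reads, then submit them together."""
    source: bytes                       # stand-in for an asset file on disk
    pending: List[Tuple[int, int]] = field(default_factory=list)
    submissions: int = 0                # number of trips into the "I/O layer"

    def enqueue(self, offset: int, size: int) -> None:
        # Queuing is pure bookkeeping; no I/O happens here.
        self.pending.append((offset, size))

    def submit(self) -> List[bytes]:
        # A single submission services every queued request.
        self.submissions += 1
        results = [self.source[off:off + size] for off, size in self.pending]
        self.pending.clear()
        return results


asset_file = bytes(range(256)) * 4      # 1024 bytes of dummy asset data
queue = BatchQueue(asset_file)
for offset in range(0, 1024, 128):      # eight 128-byte read requests
    queue.enqueue(offset, 128)
chunks = queue.submit()                 # one submission for all eight reads
```

In this toy version the saving is only a counter, but it mirrors the claim in the text: the per-request overhead is paid once per batch instead of once per read.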
The significance of DirectStorage is that Microsoft wants PCs (and consoles) to be able to better leverage the low random access times and high transfer rates of modern SSDs, enabling games to quickly stream in new assets rather than having to pre-load everything or suffering noticeably slow asset loading, as can be the case today. Under current game development paradigms, the CPU can be a bottlenecking factor in scaling up I/O rates to meet what SSDs can provide, as there are significant CPU costs both to tracking so many I/O operations and for decompressing game assets before passing them on to the GPU. DirectStorage, in turn, is designed to minimize both of these loads, and ultimately, try to remove the CPU as much as possible from game asset streaming.
DirectStorage technology was already implemented on Microsoft’s Xbox Series X/S consoles for their launch in 2020, so more recent efforts have been around porting DirectStorage to Windows and accounting for the non-homogeneous hardware ecosystem. Earlier this year Microsoft rolled out DirectStorage 1.0, which implemented the I/O batching improvements, but not the GPU decompression capabilities. This is where DirectStorage 1.1 will come in, as it will finally be enabling the second (and most important) aspect of DirectStorage for PCs.
By allowing GPUs to do game asset decompression, that entire process is offloaded from the CPU. This not only frees the CPU up for other tasks, but it removes a potentially critical bottleneck in game asset streaming. Because modern SSDs are so fast – on the order of hundreds of thousands of IOPS and data transfer rates hitting 7GB/second – the CPU is the weakest link between speedy SSDs and massively parallel GPUs. So under DirectStorage, the CPU is getting cut out almost entirely.
As far as the performance benefits of DirectStorage 1.1 go, the full gains will depend on both the hardware used and how much data a game or other application is attempting to push. Games moving large amounts of data on very fast systems are expected to see the largest gains from the full DirectStorage 1.1 stack, though even lighter games can benefit from the fast access times to NVMe SSDs.
As part of Microsoft’s blog post, the company posted a screenshot from their Bulk Loading sample program for game developers, which offers a simple demonstration and benchmark of DirectStorage 1.1 in action. In Microsoft’s case, they were able to load 5.65GB of assets in 0.8 seconds using GPU decompression on an undisclosed PC, versus 2.36 seconds on the same system with CPU decompression – while maxing out the load on the CPU in the process. Like most SDK sample programs, this is a simple test case focused on just one feature, so the real-world gains aren’t likely to be quite so extreme, but it underscores the performance benefits of moving asset decompression from the CPU to the GPU when you have a large amount of asset data.
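The figures quoted from the screenshot translate into effective load rates with some simple arithmetic. The sketch below uses only the three numbers given above; the variable and function names are ours, and the result says nothing about hardware that Microsoft did not disclose.

```python
# Figures quoted from Microsoft's Bulk Loading sample screenshot.
ASSET_SIZE_GB = 5.65
GPU_DECOMPRESSION_TIME_S = 0.8
CPU_DECOMPRESSION_TIME_S = 2.36


def throughput_gb_per_s(size_gb: float, seconds: float) -> float:
    """Effective asset throughput: data loaded divided by wall-clock time."""
    return size_gb / seconds


gpu_rate = throughput_gb_per_s(ASSET_SIZE_GB, GPU_DECOMPRESSION_TIME_S)
cpu_rate = throughput_gb_per_s(ASSET_SIZE_GB, CPU_DECOMPRESSION_TIME_S)
speedup = CPU_DECOMPRESSION_TIME_S / GPU_DECOMPRESSION_TIME_S

print(f"GPU decompression: {gpu_rate:.2f} GB/s")   # roughly 7.1 GB/s
print(f"CPU decompression: {cpu_rate:.2f} GB/s")   # roughly 2.4 GB/s
print(f"Speedup: {speedup:.2f}x")                  # just under 3x
```

In other words, the sample loads assets almost three times faster with GPU decompression on that particular test system.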
Moving under the hood, DirectStorage GPU decompression is being enabled via the introduction of GDeflate, a general purpose compression algorithm that was originally developed by NVIDIA. GDeflate is a GPU-optimized variation on Deflate, which has been designed to better mesh with the massively parallel (and not-very-serial) nature of GPUs.
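To illustrate why a parallel-friendly layout matters, here is a rough sketch using standard Deflate through Python's `zlib` module. This is not GDeflate and does not reproduce its bitstream; the 64 KiB chunk size is an arbitrary choice for the example, and CPU threads merely stand in for the GPU's parallel execution. The point is only that data compressed as independent chunks can be decompressed by many workers at once, whereas a single long Deflate stream has to be decoded serially.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor
from typing import List

CHUNK_SIZE = 64 * 1024  # illustrative chunk size, chosen arbitrarily


def compress_chunked(data: bytes, chunk_size: int = CHUNK_SIZE) -> List[bytes]:
    """Compress each fixed-size chunk as its own independent zlib stream."""
    return [
        zlib.compress(data[start:start + chunk_size])
        for start in range(0, len(data), chunk_size)
    ]


def decompress_chunked(chunks: List[bytes]) -> bytes:
    """Decompress chunks concurrently; map() returns results in input order."""
    with ThreadPoolExecutor() as pool:
        return b"".join(pool.map(zlib.decompress, chunks))


original = b"texture-data-" * 50_000        # repetitive dummy asset data
compressed = compress_chunked(original)     # independent compressed chunks
restored = decompress_chunked(compressed)   # decoded in parallel, then joined
```

The trade-off in a scheme like this is that each chunk starts with no history from its neighbors, which typically costs some compression ratio in exchange for the parallelism.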
DirectStorage, in turn, will be implementing GDeflate support in two different manners. The first (and preferred) manner is to pass things off to the GPU drivers and have the GPU vendor take care of it as they see fit. This will allow hardware vendors to optimize for the specific hardware/architecture used, and leverage any special hardware processing blocks if they’re available. All three major GPU vendors are eager to get the show on the road, and it's likely some (if not all) of them will have DirectStorage 1.1-capable drivers ready before the API even ships to game developers.
Failing that, Microsoft is also providing a generic (but optimized) DirectCompute GDeflate decompressor, which can be run on any DirectX 12 Shader Model 6.0-compliant GPU. This means that, in some form or another, GDeflate will be available with virtually any PC GPU made in the last 10 years – though more recent GPUs are expected to offer much better performance.
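The order of preference described above can be summarized as a small decision function. This is purely illustrative: `GpuInfo`, its fields, and the returned labels are hypothetical, the real selection happens inside DirectStorage and the drivers rather than in game code, and the final CPU branch is our assumption for GPUs that meet neither condition.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class GpuInfo:
    """Hypothetical description of a GPU; fields are illustrative only."""
    driver_supports_gdeflate: bool      # vendor driver ships its own decompressor
    shader_model: Tuple[int, int]       # e.g. (6, 0) for Shader Model 6.0


def choose_decompression_path(gpu: GpuInfo) -> str:
    # Preferred: let the GPU vendor's driver handle GDeflate as it sees fit.
    if gpu.driver_supports_gdeflate:
        return "vendor-driver"
    # Fallback: Microsoft's generic DirectCompute decompressor on SM 6.0+ GPUs.
    if gpu.shader_model >= (6, 0):
        return "directcompute-fallback"
    # Assumption, not stated in the article: decompress on the CPU otherwise.
    return "cpu"


new_gpu = GpuInfo(driver_supports_gdeflate=True, shader_model=(6, 6))
older_gpu = GpuInfo(driver_supports_gdeflate=False, shader_model=(6, 0))
legacy_gpu = GpuInfo(driver_supports_gdeflate=False, shader_model=(5, 1))
```

Here `new_gpu` would take the driver path, `older_gpu` the generic DirectCompute path, and `legacy_gpu` falls through to the assumed CPU path.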
Otherwise, the only things that will eventually be needed to take advantage of GPU decompression – and DirectStorage 1.1 in general – will be Windows 10 1909 (or later) or Windows 11, as well as a fast storage device. Technically, DirectStorage works against any storage device, including SATA SSDs, but it is explicitly being optimized for (and will deliver the best results on) systems using NVMe SSDs.
Do note, however, that it will be up to individual games to implement DirectStorage to see the benefits of the API. That means not only using the necessary API hooks, but also shipping games with assets packed using the new GDeflate algorithm. The vast backwards compatibility of GDeflate means that game devs can essentially hit the ground running here on DX12 games – anything worth running a new game on is going to support DirectStorage and GDeflate – but the fact that it involves game assets means that full DirectStorage 1.1 support cannot be trivially added to existing games. Developers would need to redistribute (or otherwise recompress) game assets for GDeflate, which is certainly do-able, but would require gamers to re-download a large part of a game. So gamers should plan on seeing DirectStorage 1.1 arrive as a feature in future games, rather than backported into existing games.
Finally, as for Microsoft’s audience at hand (developers), this week’s announcement from Microsoft is meant to prod them into getting ready for the updated API ahead of its release later this year. Microsoft isn’t releasing the API documentation or tools at this time, but they are encouraging developers to get started with DirectStorage 1.0, so that they can take the next step and add GPU decompression once 1.1 is available later this year.
Source: Microsoft DirectX Dev Blog
Threska - Sunday, October 16, 2022
Keep their porn from prying eyes.
GreenReaper - Friday, December 2, 2022
Or they are the system admin and want to keep away prying eyes.
deil - Monday, October 17, 2022
This version is at least usable, so it's a big step forward, don't ask Microshift to nail it, or they will nail the coffin of this tech.
Theolendras - Monday, October 17, 2022
This would be awesome. Theoretically, the CPU involvement in textures should be minimized as much as possible; there is not much that comes to mind that requires it that couldn't be accomplished more efficiently by the GPU.
Threska - Friday, October 14, 2022
Kind of funny how people want to cut the CPU out of the picture while AMD SAM puts it back in.
atragorn - Friday, October 14, 2022
Think of it like this. If you want to move the water from a five gallon bucket but you can only move a single cup at a time it's going to take a while. If you could use a larger container it will take less time. Depends on the size but potentially much much faster. Like drinking a glass of water a teaspoon at a time versus just chugging the glass down. 😂
That's what SAM is meant for.
AMD is assuming lots of CPU power since obviously they sell them... Good for them 😉
Microsoft has an alternative method.
Unlike most people here, most consumers have a weak CPU compared to what we likely have.
Console systems need to maximize their utilization since typically they have a weaker CPU and quite limited resources compared to say an enthusiast computer. So if they can move assets with less CPU power that is good. So it makes perfect sense for them to do this.
Different needs require different solutions. It just so happens to also make sense with these fast NVMe drives being sold now. A HDD would not benefit much if at all from this. The bottleneck is the HDD itself.
The new Baldur's Gate game for example loads much faster on an NVMe drive and this will probably make it load even faster still. Good stuff 😃
One thing about the article, game devs could simply send a utility to decompress and recompress assets instead of making us download the same assets over again. Not that hard to do that 🙂
Billy Tallis - Friday, October 14, 2022
AMD's SAM is just another name for Resizable BAR, which does not make the CPU any more involved in anything. It's just a matter of dropping a 32-bit compatibility hack so that driver software doesn't have to jump through hoops to access all of a GPU's memory. It only matters when the CPU already needed to be interacting with the GPU's memory.
BinaryTB - Friday, October 14, 2022
Will implementing DirectStorage in PC games require less work if the game is also on Xbox consoles? Meaning, GDeflate and the APIs, are they already "working" in Xbox consoles and don't require porting that portion to PC games (which I assume they do right now when making a PC version of the game).
Silver5urfer - Friday, October 14, 2022
The thing with this DX12 Direct Storage API is, first on paper it sounds like something great. But in reality it does not matter much due to multiple reasons which I will explain.
First is the Storage to Performance, the argument of loading assets faster is only beneficial to the garbage consoles which were limited by HDDs and moved to NVMe finally. So they already have some kind of hardware API in them on both PS5 and XBSX.
On PC if we compare the titles' loading speed without any of the so called Storage APIs the loading speed is literally the same. So I sincerely do not expect major things out of this. DX11 vs DX12 was heralded as similar BS and today Vulkan powered RDR2, DOOM Eternal have significant performance over DX12. Same for so many titles running that API that didn't provide anything new.
Also if we talk about game fidelity, there's not a single game today which does not use TAA horrible technology and pushes Rasterization to max. The reason I brought it up here is Crysis 3 launched in 2013 and it's been 10 years. There are barely any few games which provide such extravagant experience in Tessellation and no TAA blurfest game. Because there's no developer today who will make any game exclusive to PC. Even if the new consoles are out they barely are fast DMC 5 on PC on a 1080Ti runs at 180FPS Full HD. So if anyone develops for a PC the fidelity would be blowing everything out of water but that era is gone because nowadays games have politics, and lowest effort done ever. Just look at Arkham Knight and compare that to modern Gotham Knights, it's insane how worse the latest game looks and that's very important because not only you have got a visual downgrade BUT pisspoor optimization as the Min Spec is so laughable for that new Gotham Knights. So I do not expect literally anything from this so called Direct X Storage from both AAA developers AND M$.
Heck Red Dead Redemption 2 is a fantastic photorealism and on top the RAGE scripting with the leading Euphoria engine for physics, still it is hampered by TAA thus reducing massive fidelity. Do not take Cybertrash because it's garbage and it's not fit to be called as a game regardless of Nvidia shoving infinite PR BS on it with idiotic Psycho mode. There are barely any new high effort AAA titles.
TAA is mentioned here because nobody is pushing for high fidelity anymore, since console is the primary market and target. Witcher 3's massive downgrade was because of? Consoles. So there's nothing but dust in expecting any sort of high fidelity changes in the AAA gaming space.
Unreal Engine 5 is being heralded as some next gen but in reality their metahuman was total garbage. RE Engine from Capcom beats it totally in that Facial animations. RDR2 and GTA V fidelity showcase is ultra massive, look at GTA V Natural Vision Evolved mod, it stresses out RTX3090, that engine is ultimate in technology scale. Also the UE4 ports so many of them suffer from horrible stutters, I think Epig recently fixed some of it gotta see how UE5 games will hold up.
Second, Hardware Unboxed did a review on how NVMe SSDs have impact, in majority of games there's basically no improvement vs SATA SSD, and forget PCIe 4.0 vs 3.0 SSDs. So having a few seconds shaven off won't provide any benefit whatsoever, since the CPU is not going to magically improve FPS here nor add any benefit to the equation.
Overrated glorified tech that MS wants just because they can lock PC users in to the Windows 11 disaster, which hampers the Win32 shell and downgrades the entire Explorer, PLUS VBS CPU problems and nightmarish instability with their OS RTM release. Win10 1909 or later means LTSC2019 is outdated, and only 21H2 is supported. No thanks, I can stick to the old WDDM stable LTSC since the new LTSC needs more time to fix its bugs.
Silver5urfer - Friday, October 14, 2022
Forgot to mention Metro Exodus, 4A Engine is also on par with superb FPS experience but it is also marred by TAA. Id Tech 6 was also using no TAA which is why DOOM 2016 looked solid and now TAA based id Tech 7 DOOM Eternal loses out a lot of fidelity with blur.
Still both of those are the only other noteworthy engines and games that can sit atop. However Exodus is the only RT worthy title in the current industry because of its RTGI. Adding reflections with a horrible performance hit is worthless when you can get SSR done in a much better way, just more stickers and advertising nonsense here in AAA nowadays.
All in all concluding that developer talent provides more fidelity experience and real innovation than these sticker PR features which do not do anything tangible in reality for PC esp.