
Timu

Member
Oct 25, 2017
15,540
Eh, the Xbox Series NVMe drive is pretty much the same as a PCIe Gen 3 drive for read/write speeds; in fact it's slower than my PCIe Gen 3 drive. And the 2 lanes of PCIe 4.0 bandwidth on the Series X are equivalent to 4 lanes of PCIe 3.0 on a PC. So I don't think there is going to be such a dramatic difference, since I'm sure devs will go for what's easiest, i.e., having DirectStorage support be similar to what the Series X is doing in games with storage.
While that is true for Gen 3, what about games that target Gen 4 speeds? From what I've heard, Gen 4 is supposed to be better once utilized. Otherwise, what would be the point of getting Gen 4 NVMe drives if Gen 3 drives are nearly as good? While I expect Gen 3 to be the norm, certain games will probably use Gen 4 to make those drives worthwhile.
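For reference, the lane math behind that equivalence (a quick sketch with rough per-lane figures after encoding overhead, not spec-exact numbers):

```python
# Approximate usable bandwidth per PCIe lane in GB/s, one direction.
PCIE_LANE_GBPS = {3: 0.985, 4: 1.969}

def link_bandwidth(gen: int, lanes: int) -> float:
    """Approximate one-direction bandwidth of a PCIe link in GB/s."""
    return PCIE_LANE_GBPS[gen] * lanes

series_x_like = link_bandwidth(4, 2)  # x2 Gen 4, roughly 3.9 GB/s
pc_gen3_x4 = link_bandwidth(3, 4)     # x4 Gen 3, roughly 3.9 GB/s
```

So an x2 Gen 4 link and an x4 Gen 3 link really do land within a rounding error of each other.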
 
Last edited:

Bonfires Down

Member
Nov 2, 2017
2,814
Sure, but it's sort of like if Uber forced all of its drivers to use manual transmission cars; you could blame all the unskilled Uber drivers for their herky-jerky rides because they spent 10 years driving automatic and are now fumbling through driving stick. You could point to the manual transmission cars as more efficient, easier to repair, with greater control over shifting, and all the rest of it, and say it's not the manual transmission that is at fault, it's just a bunch of drivers who don't know what they are doing.

As the person riding in the back seat, I can look at how I rode with Driver A before and it was fine and smooth, and when I rode with Driver A after the Uber Manual Policy Mandate it was herky-jerky and a bad experience. It kind of doesn't matter to me that nVidia was spending millions making uniquely tuned automatic transmissions as smooth as possible for each individual car manufacturer.
I agree with this. It's not our problem to figure out what the stuttering issue is with DX12, we just know there is a problem compared to DX11. I'm sure once developers get their heads around DX12 we will all be better off than with DX11, but as it stands it's still an issue.
 
OP
OP
V3N1X

V3N1X

Prophet of Truth
Member
Oct 16, 2021
796
Alexandria, Egypt
No worries; since I got the Hubble imagery from your link, it wasn't a total waste of time.

I'm not sure I'm getting any perceptual differences with my setup. My speeds already seem broken either way. I'm seeing similar read speed averages (5,000 to 8,000 MB/s), and max read speeds upwards of 50,000 MB/s. Which. Uh... wow.

It's not broken, I think? There's a scale number in the overlay next to the max field... 50,000 there would be 5GB/s-ish.

You also don't need to use the built-in OSD, you can use Afterburner/CapFrameX to detect read speeds as well.

Edit: Also, I think the more important metric for DirectStorage wouldn't be sustained read speeds but random 64K reads, which DS does in queues and batches so you don't have to wait on each read request to complete, as was the case with the old IO stacks.
 

Teeth

Member
Nov 4, 2017
3,935
Talk about missing the point in its entirety.

Microsoft isn't forcing anyone to use DX12, but they can't keep adding new features to an old API like DX11... if FromSoftware, for example, didn't have time to optimize the DX12 back-end properly, they could have easily shipped DX11 on PC. I suspect they just took the faster, easier way out: since they were gonna support DX12 on Xbox Series consoles anyway, they chose the same for PC.

Choices like the above crop up in software dev all the time, and choosing your technology correctly is one of the most important decisions that can make or break a project: technical debt piles up when your engineers are unfamiliar with a core piece of technology your codebase is built upon.

The SDK is not to blame here in any way, shape, or form... maybe invest in training your engineers on new SDKs/APIs, or hire people with a history of optimizing for said SDKs and put them to work.

Edit: In the next few years, take a look at geometry pipelines as engines/devs take the leap to Mesh Shaders instead of Vertex Shaders... new tech with the potential to extract more performance, but also more responsibility, since they run like compute shaders and you can do anything with them... who's responsible for properly learning how to utilize/optimize the new geometry pipeline?

The SDK is not to blame, but from a consumer perspective it makes people wary when they hear that something runs on tech that has historically caused a lot of developers problems. Not even just developers with historically... janky... tech. Powerhouses like DICE have had issues. It's not blaming the SDK; it's the canary in the coal mine for "will this cause new problems in the face of solving old ones".

So it wouldn't be blaming DX12 for causing issues, it'd be blaming places for choosing to use DX12.

Also "choosing the correct technology" is often not up to the people directly having to do the coding.
 

Hoddi

Member
Mar 29, 2021
59
No worries; since I got the Hubble imagery from your link, it wasn't a total waste of time.

I'm not sure I'm getting any perceptual differences with my setup. My speeds already seem broken either way. I'm seeing similar read speed averages (5,000 to 8,000 MB/s), and max read speeds upwards of 50,000 MB/s. Which. Uh... wow.

I wouldn't put too much focus on max read speeds (although the benchmark mode is good for testing your SSD). This is ultimately an SFS demo meaning that each read operation is just 64KB in size. It's just doing so many of them that it would start affecting CPU performance without DirectStorage. There's an option for enabling/disabling DS and there's frankly a big difference in CPU utilization between having it off and on.

It's also worth noting that rendering resolution and framerate have a big effect on the read rate. Running in fullscreen at 4k shows reads up to 400-500MB/s while running in a small window drops it into the sub-50MB/s range.
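Rough numbers on why the window size moves the read rate that much (the tile counts below are made-up illustrative values, not measured from the demo):

```python
TILE_BYTES = 64 * 1024  # SFS streams 64 KiB tiles, per the discussion above

def stream_rate_mb_s(tiles_per_frame: int, fps: int) -> float:
    """Sustained read rate needed to feed a sampler-feedback streamer, in MB/s."""
    return tiles_per_frame * TILE_BYTES * fps / 1e6

# More pixels on screen -> more unique tiles touched per frame -> higher read rate.
fullscreen_4k = stream_rate_mb_s(tiles_per_frame=120, fps=60)  # hundreds of MB/s
small_window = stream_rate_mb_s(tiles_per_frame=10, fps=60)    # tens of MB/s
```

With those (hypothetical) tile counts you land right in the 400-500MB/s vs sub-50MB/s ranges described above.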
 
Last edited:

Hoddi

Member
Mar 29, 2021
59
On a different note, DirectStorage isn't locked to NVMe drives.

[image: nlrbi2Y.png]

[image: CvlD5G3.png]
 
Last edited:

ILikeFeet

DF Deet Master
Banned
Oct 25, 2017
61,987
As can be seen in the video, there is virtually no difference in frame rate, or most other metrics, when enabling DirectStorage. However, in the normal demo mode, which seems to better approximate the usual gaming experience, turning DirectStorage on lowers CPU usage and temperature by quite a bit: from around 66 degrees and utilization in the low 20s (percent) to about 59 degrees and around 10% utilization. In benchmark mode these metrics are closer, but the situation seems reversed: there is slightly higher CPU utilization with DirectStorage on.

A couple of things should be noted.

First, GPU based asset decompression is not supported by the DirectStorage SDK yet. Microsoft says that this is on the roadmap, however.


 
Nov 8, 2017
13,098
Based on how adoption of low-level APIs has gone, I'm excited to see what the top-tier developers will do with it, but also frightened of what this might result in when not-so-technically-inclined teams have granular control over IO lol
 
OP
OP
V3N1X

V3N1X

Prophet of Truth
Member
Oct 16, 2021
796
Alexandria, Egypt


Also of note, the SFS demo from Intel isn't using the IO Rings/BypassIO APIs specifically designed for Windows 11... so I expect even more of a decrease in CPU usage when these APIs are used.

Edit: Also, contrary to what the article says, BypassIO is not disabled in Win11; it's active for specific configurations only (maybe they're still ironing out bugs).

 
Last edited:

Vimto

Member
Oct 29, 2017
3,714
Compiled version.

If you want to use the 16k images then you'll have to edit demo-hubble.bat and point it to where you've extracted the file. Demo.bat and stress.bat should otherwise run without any changes.

Edit: Fixed the original link with a proper version.

Thanks, ran it on my Samsung 980 Pro and was averaging 4.5GB/s... although it seems the speed was degrading over time, maybe throttling as it heats up?
[image: DKfV44v.png]
 
OP
OP
V3N1X

V3N1X

Prophet of Truth
Member
Oct 16, 2021
796
Alexandria, Egypt
Thanks, ran it on my Samsung 980 Pro and was averaging 4.5GB/s... although it seems the speed was degrading over time, maybe throttling as it heats up?
[image: DKfV44v.png]

You don't need to worry about that at all; the benchmark is deliberately imposing conditions under which constant disk reads are needed.

In real scenarios, games would never need to constantly load data from disk like this. The "demo" configuration is more akin to real-world use, where you can see how rapid camera movement leads to data loads in the hundreds of MB/s.
 
OP
OP
V3N1X

V3N1X

Prophet of Truth
Member
Oct 16, 2021
796
Alexandria, Egypt


Great talk, also confirms GPU decompression is coming soon.

For now, the process is: Storage -> System Memory -> CPU Decompression (through a custom decompression queue) -> Copy to VRAM

I confirmed on the DX Discord that they plan to introduce GPU decompression which would make the process: Storage -> System Memory -> Copy to VRAM & Decompress

They're also working on DMA to VRAM, so peer-to-peer and the process would be Storage -> VRAM & Decompress

GPU decompression will come first before peer-to-peer but they have no timeline as of yet.
 

vixolus

Prophet of Truth
Member
Sep 22, 2020
54,297
Great talk, also confirms GPU decompression is coming soon.

For now, the process is: Storage -> System Memory -> CPU Decompression (through a custom decompression queue) -> Copy to VRAM

I confirmed on the DX Discord that they plan to introduce GPU decompression which would make the process: Storage -> System Memory -> Copy to VRAM & Decompress

They're also working on DMA to VRAM, so peer-to-peer and the process would be Storage -> VRAM & Decompress

GPU decompression will come first before peer-to-peer but they have no timeline as of yet.
Is GPU decompression already part of the Series X/S DirectStorage API, or will that come with the Windows upgrade? Did anyone ask/answer a similar question in the Discord?
 
OP
OP
V3N1X

V3N1X

Prophet of Truth
Member
Oct 16, 2021
796
Alexandria, Egypt
Is GPU decompression already part of the Series X/S DirectStorage API, or will that come with the Windows upgrade? Did anyone ask/answer a similar question in the Discord?

Series X|S consoles have a dedicated HW decompression block that takes care of that as part of the Velocity architecture... the Series X|S are quite far ahead of PC in that regard.

They're essentially at v4, they're not even using GPU compute for decompression, but have dedicated HW to do it... This will require new GPUs on PC with dedicated silicon for decompression, while GPU decompression (v2 essentially above) will work with existing GPUs that gamers have right now on their PCs.

PC DirectStorage will get there eventually, but for now the goal is DMA to VRAM with GPU compute decompression which would be quite ideal with current hardware.

Edit:

Just to clarify as to the version analogy:

v1: Storage -> System Memory -> CPU Decompression (through a custom decompression queue) -> Copy to VRAM [PC is HERE]
v2: Storage -> System Memory -> Copy to VRAM -> Decompress via Compute Shader
v3: Storage -> VRAM -> Decompress via Compute Shader [Achievable on the hardware available in PCs today]
v4: Storage -> VRAM -> Decompress via Dedicated HW [Xbox Series S|X are essentially here, requires HW advancements in PC GPUs]
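To make the v1/v2 difference concrete, a toy Python sketch; zlib stands in for the real GPU-friendly codec, and the gpu_* functions are placeholders for work that would actually run on the card:

```python
import zlib

def gpu_upload(data: bytes) -> bytes:
    return data  # pretend this copy crossed the PCIe bus into VRAM

def gpu_decompress(data: bytes) -> bytes:
    return zlib.decompress(data)  # pretend this ran as a compute shader

def load_v1(compressed: bytes) -> bytes:
    """v1: decompress on the CPU, then copy the full-size asset to VRAM."""
    asset = zlib.decompress(compressed)  # CPU decompression queue
    return gpu_upload(asset)             # big copy over PCIe

def load_v2(compressed: bytes) -> bytes:
    """v2: copy the *compressed* bytes to VRAM, decompress in a compute shader."""
    on_gpu = gpu_upload(compressed)      # smaller copy over PCIe
    return gpu_decompress(on_gpu)        # compute-shader decompression
```

The win in v2 is that the PCIe copy moves the smaller compressed payload and the CPU never has to touch it.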
 
Last edited:

Firefly

Member
Jul 10, 2018
8,621
Series X|S consoles have a dedicated HW decompression block that takes care of that as part of the Velocity architecture... the Series X|S are quite far ahead of PC in that regard.

They're essentially at v4, they're not even using GPU compute for decompression, but have dedicated HW to do it... This will require new GPUs on PC with dedicated silicon for decompression, while GPU decompression (v2 essentially above) will work with existing GPUs that gamers have right now on their PCs.

PC DirectStorage will get there eventually, but for now the goal is DMA to VRAM with GPU compute decompression which would be quite ideal with current hardware.
How does RTX I/O factor into this pipeline?
 
OP
OP
V3N1X

V3N1X

Prophet of Truth
Member
Oct 16, 2021
796
Alexandria, Egypt
How does RTX I/O factor into this pipeline?

RTX I/O in my opinion is just Nvidia's marketing name for DirectStorage support in their GPUs... doesn't sound like Nvidia will do anything proprietary there.

Same way they're still using the "RTX" moniker even though it's just DXR (was always slated to be DXR/VRTe... for D3D/Vulkan respectively).
 

dgrdsv

Member
Oct 25, 2017
11,846
RTX I/O in my opinion is just Nvidia's marketing name for DirectStorage support in their GPUs... doesn't sound like Nvidia will do anything proprietary there.
These two:
v2: Storage -> System Memory -> Copy to VRAM -> Decompress via Compute Shader
v3: Storage -> VRAM -> Decompress via Compute Shader [Achievable on the hardware available in PCs today]
Are proprietary since they are done by the GPU driver. You could argue that this is the main part of "RTX I/O" but you're right in a sense that "RTX I/O" as a whole is just Nvidia's name for DirectStorage support.
It could be that "RTX I/O" will be used on platforms without DirectStorage though (Linux or even Windows with Vulkan) so there are possible terminological differences.
 

ILikeFeet

DF Deet Master
Banned
Oct 25, 2017
61,987
You might be wondering if that's substantially faster than games run without DirectStorage, and Ono admits the answer is actually no, not yet: while you'll definitely see a huge speed boost from an SSD over the magnetic spinning platters of a hard drive, and from an NVMe SSD over a slower SATA-based drive, the current implementation of DirectStorage in Forspoken is only removing one of the big I/O bottlenecks — others exist on the CPU.

[image: forspoken_ssd_speed_2.jpg]

 

Vimto

Member
Oct 29, 2017
3,714
Wait, current API is capable of pushing 2.8GB/s?

How is that possible? And if so, why aren't we seeing 2-3 second loading times in older games?
 

Henrar

Member
Nov 27, 2017
1,905
Wait, current API is capable of pushing 2.8GB/s?

How is that possible? And if so, why aren't we seeing 2-3 second loading times in older games?
Because games are doing more during loading than just moving files from disk into runtime memory. The bottleneck in those games exists elsewhere.
 

Edward850

Software & Netcode Engineer at Nightdive Studios
Verified
Apr 5, 2019
990
New Zealand
Wait, current API is capable of pushing 2.8GB/s?

How is that possible? And if so, why aren't we seeing 2-3 second loading times in older games?
For most games, loading isn't just about moving data from disk to RAM. It can be: for example, the idtech1 games (Doom, etc.) had assets highly optimised around memory alignment and structures, meaning most tasks were either straight-up memory copies or aligned around linear reads. The cost, however, was that no compression was used to store the assets, and tools had to be designed around manipulating and compiling binary structures. Unless you were bleeding for RAM, that is, as it'd keep having to swap stuff off the zone allocator and then fetch it again off the disk for larger operations. :V

As time went on this kind of design was phased out, in exchange for compressed data and various forms of data pre-processing (parsing text files and implicit structures that needed to be expanded out to more complex states, for example). This kind of data is highly non-linear and can require a lot of CPU overhead to process, let alone read the information in the first place. Coupled with decompressing data on the CPU to then hand off to the GPU, things get busy very quick.

Kind of the funny thing about all this? The idea of games being able to load and swap levels and level data instantly has always been possible without SSDs. Halo 2 was already doing this on the original Xbox (in the campaign anyway, though occasionally popping in texture mips due to needing to rush the game out the door at the very end). Heck, Crash Bandicoot was doing a form of this on the PS1 as it streamed level chunks off the disc. It's just that workflows became highly focused around unprocessed and compressed data, which made things rather inefficient on the CPU side. Not that it was a bad decision in a vacuum, mind you; sort of a "pick your poison" scenario. SSDs and DirectStorage now provide alternatives by allowing highly efficient non-linear reads and the potential to stream and process data directly to, and on, the GPU instead.
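The binary-vs-textual trade-off Edward850 describes can be sketched in a few lines (a toy illustration, nothing like the actual idtech formats):

```python
import struct
import zlib

# A toy "asset": three little-endian uint32 vertex indices.
RECORD = struct.Struct("<3I")

def load_binary(blob: bytes) -> tuple:
    """idtech1-style: the on-disk bytes ARE the in-memory structure."""
    return RECORD.unpack(blob)  # effectively a memory copy

def load_textual(blob: bytes) -> tuple:
    """Later-style: decompress, parse text, then build the structure."""
    text = zlib.decompress(blob).decode("ascii")
    return tuple(int(tok) for tok in text.split())

indices = (7, 42, 9001)
binary_blob = RECORD.pack(*indices)                        # 12 bytes, zero parsing
textual_blob = zlib.compress("7 42 9001".encode("ascii"))  # smaller archives, more CPU
```

The binary path is basically a memory copy; the textual path buys smaller archives and friendlier tooling at the cost of CPU work on every load.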
 
Last edited:

dgrdsv

Member
Oct 25, 2017
11,846
That's pretty much what I expected. DS isn't providing much improvement in load times, and likely won't provide much even with GPU decompression. Even 4.5 -> 1.9 sec isn't anything worth pursuing, IMO.
Now the ability to use storage for streaming more data each frame is much more interesting. But games will have to start requiring NVMe SSDs for that to happen.

Wait, current API is capable of pushing 2.8GB/s?

How is that possible? And if so, why aren't we seeing 2-3 second loading times in older games?
They are fully capable of pushing way more than 2.8GB/s; PCIe 4 NVMe drives hit >6GB/s easily. APIs are not, and never were, the issue. The need to process the data after it's read, before it can be used, is the issue.
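A crude serial model makes the point (the numbers are made up but plausible):

```python
def load_time_s(compressed_gb: float, uncompressed_gb: float,
                read_gb_s: float, decompress_gb_s: float) -> float:
    """Crude serial model: read the archive, then decompress it on the CPU."""
    return compressed_gb / read_gb_s + uncompressed_gb / decompress_gb_s

# A 2 GB archive expanding to 4 GB, against a ~1 GB/s CPU decompressor:
slow_drive = load_time_s(2.0, 4.0, read_gb_s=2.8, decompress_gb_s=1.0)
fast_drive = load_time_s(2.0, 4.0, read_gb_s=6.0, decompress_gb_s=1.0)
```

More than doubling drive speed barely moves total load time when decompression dominates, which is why offloading the decompression is the interesting part.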
 
OP
OP
V3N1X

V3N1X

Prophet of Truth
Member
Oct 16, 2021
796
Alexandria, Egypt
That's pretty much what I've expected. DS isn't providing much improvement in load times, and likely won't provide much even with GPU decompression. Even 4.5->1.9 sec isn't anything worth pursuing IMO.
Now the ability to use storage for streaming more data each frame is much more interesting. But games will have to start requiring NVMe SSDs for that to happen.


They are fully capable of pushing way more than 2.8GB/s. PCIe4 NVMe drives are hitting >6GB/s easily. APIs are not and never were an issue. The need to process the data which is read before it can be used is the issue.

The stated goal of DirectStorage is minimizing CPU overhead; as of the current version, savings are between 20% and 40%, which will increase a lot more when GPU decompression and DMA to VRAM come along.

No idea why they're comparing loading times here; the improvements aren't expected to show up best there.
 

Deleted member 14089

Oct 27, 2017
6,264
The stated goal of DirectStorage is minimizing CPU overhead; as of the current version, savings are between 20% and 40%, which will increase a lot more when GPU decompression and DMA to VRAM come along.

No idea why they're comparing loading times here; the improvements aren't expected to show up best there.

I advise you not to engage with that user, unless you want a headache.

Nonetheless, it's a premature conclusion by them anyway.
 
OP
OP
V3N1X

V3N1X

Prophet of Truth
Member
Oct 16, 2021
796
Alexandria, Egypt
These two:

Are proprietary since they are done by the GPU driver. You could argue that this is the main part of "RTX I/O" but you're right in a sense that "RTX I/O" as a whole is just Nvidia's name for DirectStorage support.
It could be that "RTX I/O" will be used on platforms without DirectStorage though (Linux or even Windows with Vulkan) so there are possible terminological differences.

There's nothing proprietary about using a compute shader to run a GPU-optimized decompression algorithm... it's non-proprietary by nature.

Vendor-specific driver optimization can be done to make a shader run faster on certain hardware, but that's about it.
 

Sweet Blue

Member
Nov 1, 2018
244
Ok, I have a pretty vague idea of what the availability of this API can bring to the game industry and...
I tried out the Hubble demo in benchmark mode on my machine and I had this:

[image: ldrnmNL.jpeg]


I'm unsure of what the bandwidth represents because, uh... the numbers seem crazy high?
Is my NVMe streaming textures @ 1TB/s? O_o
 

Deleted member 93062

Account closed at user request
Banned
Mar 4, 2021
24,767
Ok, I have a pretty vague idea of what the availability of this API can bring to the game industry and...
I tried out the Hubble demo in benchmark mode on my machine and I had this:

[image: ldrnmNL.jpeg]


I'm unsure of what the bandwidth represents because, uh... the numbers seem crazy high?
Is my NVMe streaming textures @ 1TB/s? O_o
It seems like it's averaging out at 1.118GB/s.
 

dgrdsv

Member
Oct 25, 2017
11,846
The stated goal of DirectStorage is minimizing CPU overhead; as of the current version, savings are between 20% and 40%, which will increase a lot more when GPU decompression and DMA to VRAM come along.

No idea why they're comparing loading times here; the improvements aren't expected to show up best there.
20-40% of what? Current CPUs are hardly even loaded by reading NVMe drives at their full speed. You'll also be hard-pressed to find a PCIe 4 CPU that isn't reasonably current.

There's nothing proprietary about using a compute shader to run a GPU-optimized decompression algorithm... it's non-proprietary by virtue.

Vendor specific driver optimization can be done to make a shader run faster on certain hardware, but that's about it.
What? A compute shader is a program. It can be proprietary just as easily as any program.
It can also not be, but considering that we're talking about a Windows GPU driver here, it will 100% be proprietary.
Note that DirectX is also "proprietary".
 
OP
OP
V3N1X

V3N1X

Prophet of Truth
Member
Oct 16, 2021
796
Alexandria, Egypt
20-40% of what? Current CPUs are hardly even loaded by reading NVMe drives at their full speed. You'll also be hard pressed to find a PCIE4 CPU which isn't somewhat current.


What? Compute shader is a program. It can be proprietary just as easily as any program.
It can also not be but considering that we're talking about a Windows GPU driver here - it will 100% be proprietary.
Note that DirectX is also "proprietary".

Microsoft and the GPU vendors are working on a vendor-agnostic compression format that would be effective/suitable to decompress on the GPU (check Andrew Yeung's talk at GameStack Live last year)... you can write your own decompression shader for that format, but what has that got to do with RTX I/O?

I have no idea how to simplify this further: the shader can be optimized at the driver level, or Nvidia can write an optimized shader for their own hardware... but everything about the compression format will be available for everyone to see. RTX I/O has nothing to do with this.

20-40% of the current load on CPUs caused by IO requests; it's literally in one of the slides from the talk by one of the engineers developing DirectStorage. What are we even discussing here?

[image: 60E1jdu.png]
 
Last edited:

Flappy Pannus

Member
Feb 14, 2019
2,340
Thanks so much for the further explanatory posts, V3N1X. I was wondering precisely what the point of this was right now since, from my understanding, the primary reason for this existing was to move decompression entirely off the CPU and onto the GPU, which this version doesn't quite do yet. Every subsequent question I had, you've answered.

The only downside to me is that this further amplifies the outsized bottleneck of all these game launchers on PC when it comes to actually getting into a game. The game can load in 4 seconds!... after you wait 30 seconds for the publisher's launcher to load and sync all your save games. :(
 

Deleted member 93062

Account closed at user request
Banned
Mar 4, 2021
24,767
Thanks so much for the further explanatory posts, V3N1X. I was wondering precisely what the point of this was right now since, from my understanding, the primary reason for this existing was to move decompression entirely off the CPU and onto the GPU, which this version doesn't quite do yet. Every subsequent question I had, you've answered.

The only downside to me is that this further amplifies the outsized bottleneck of all these game launchers on PC when it comes to actually getting into a game. The game can load in 4 seconds!... after you wait 30 seconds for the publisher's launcher to load and sync all your save games. :(
Next Microsoft needs to bring Quick Resume to PC!
 

Flappy Pannus

Member
Feb 14, 2019
2,340
Insightful video about DirectStorage



Somewhat. The video focuses pretty much solely on the advantage of having the GPU decompress textures directly into VRAM, and while that is a huge step, there are other bottlenecks in the current storage APIs that DS relieves that weren't touched upon. This video could have been made two years ago when DS was first announced; it doesn't actually describe what v1 is bringing to the table.
 
OP
OP
V3N1X

V3N1X

Prophet of Truth
Member
Oct 16, 2021
796
Alexandria, Egypt
Thanks so much for the further explanatory posts, V3N1X. I was wondering precisely what the point of this was right now since, from my understanding, the primary reason for this existing was to move decompression entirely off the CPU and onto the GPU, which this version doesn't quite do yet. Every subsequent question I had, you've answered.

The only downside to me is that this further amplifies the outsized bottleneck of all these game launchers on PC when it comes to actually getting into a game. The game can load in 4 seconds!... after you wait 30 seconds for the publisher's launcher to load and sync all your save games. :(

🙏

In regards to loading saves, I feel the same way... like if the game can run offline, why wouldn't they sync the saves asynchronously and let me play?
 
OP
OP
V3N1X

V3N1X

Prophet of Truth
Member
Oct 16, 2021
796
Alexandria, Egypt
Would this be an issue for DX12 games that use async shader compilation?

I'd say let the game hook in its custom shader compilation code in that case? And instead of going right back to where you were, you'd get a loading screen while compilation finishes.

I don't know how feasible that would be though, might be more intricacies to figure out.
 

JahIthBer

Member
Jan 27, 2018
10,376
Series X|S consoles have a dedicated HW decompression block that takes care of that as part of the Velocity architecture... the Series X|S are quite far ahead of PC in that regard.

They're essentially at v4, they're not even using GPU compute for decompression, but have dedicated HW to do it... This will require new GPUs on PC with dedicated silicon for decompression, while GPU decompression (v2 essentially above) will work with existing GPUs that gamers have right now on their PCs.

PC DirectStorage will get there eventually, but for now the goal is DMA to VRAM with GPU compute decompression which would be quite ideal with current hardware.

Edit:

Just to clarify as to the version analogy:

v1: Storage -> System Memory -> CPU Decompression (through a custom decompression queue) -> Copy to VRAM [PC is HERE]
v2: Storage -> System Memory -> Copy to VRAM -> Decompress via Compute Shader
v3: Storage -> VRAM -> Decompress via Compute Shader [Achievable on the hardware available in PCs today]
v4: Storage -> VRAM -> Decompress via Dedicated HW [Xbox Series S|X are essentially here, requires HW advancements in PC GPUs]
Tensor cores might take some of the load off, but Nvidia claims tensor cores do more than they actually do, so who knows.