Are you DirectStorage ready?

Someone has compiled the DS 1.2 BulkLoadDemo into an executable you can run yourself, so I did a re-run and got:

2TB Sabrent Rocket 4 Plus Gaming (DirectStorage-optimised firmware):
23.45 GB/s // Loaded in 0.37s

8TB Sabrent Rocket 4 Plus:
19.97 GB/s // Loaded in 0.44s

8TB Samsung 870 QVO (SATA, connected via a USB-C 3.1 Gen 2 enclosure):
2.10 GB/s // Loaded in 4.11s

So assuming the above stays true in games that use DS1.2, even a SATA drive would be leveraging fast GPU decompression for real time streaming right to the GPU.
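
As a rough sanity check on the three results above, throughput multiplied by load time gives the amount of data each run implies. This is only a sketch: it assumes the demo's GB/s figure is simply data size divided by load time, which the posts don't confirm, and the dictionary and function names are made up for illustration.

```python
# Figures as posted above: (throughput in GB/s, load time in seconds).
runs = {
    "2TB Rocket 4 Plus Gaming": (23.45, 0.37),
    "8TB Rocket 4 Plus": (19.97, 0.44),
    "8TB 870 QVO over USB-C": (2.10, 4.11),
}


def implied_size_gb(gb_per_s, seconds):
    """Data size implied by a throughput and a load time (assumed relation)."""
    return gb_per_s * seconds


sizes = {name: implied_size_gb(*figures) for name, figures in runs.items()}

for name, size in sizes.items():
    print(f"{name}: ~{size:.2f} GB")
```

All three land between roughly 8.6 and 8.8 GB, so the runs look consistent with each other and the gap between drives is down to load time alone.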


Can you please check whether your GPU usage, power draw, temperature and VRAM usage change between DS 1.1 and 1.2?

That information is far more interesting than your loading times above.

Thanks
 
The 4 highlighted stats between DS 1.1 and 1.2 below:

1.1:
lDxYwWt.png


1.2:
pCTYdSa.png


I wasn't expecting there to be any difference really but you can see the 1.2 run did spool the fans up into low speed idle. I reset the values between 1.2 and 1.1 runs and let each cycle 5 times before screenshotting. They're essentially exactly the same when accounting for margin of error and looking at the average values in the right column.
 
Tried the BulkLoadDemo: 5-point-something GB (don't know the exact figure as it was only on screen for a second) loads in just under 7 seconds on an Intel DC P4600 (a Gen 3 x4 datacentre drive; wow, it's bad at this workload). Any good?

Correction.

5.47 GB in 0.27 seconds for the 980 Pro.
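
For comparison with the GB/s figures posted earlier, the two runs can be turned into effective throughput. A quick sketch, assuming the 5.47 GB figure applies to both runs and rounding "just under 7 seconds" up to 7.0; the function name is just for illustration.

```python
def throughput_gb_per_s(size_gb, seconds):
    """Effective throughput: data size divided by load time."""
    return size_gb / seconds


# 980 Pro run: 5.47 GB in 0.27 s.
fast = throughput_gb_per_s(5.47, 0.27)

# P4600 run: same data assumed, 7.0 s used as a stand-in for "just under 7".
slow = throughput_gb_per_s(5.47, 7.0)

print(f"980 Pro: ~{fast:.2f} GB/s")
print(f"P4600:   ~{slow:.2f} GB/s (at least, since the real time was under 7 s)")
```

The 980 Pro works out to around 20 GB/s, which is in the same ballpark as the Rocket 4 Plus results above.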

--

Test done on Windows 10 21H2, so not even the latest build of Windows 10, never mind 11. The test proves you can distribute the DS libraries with the app/game and not rely on artificial OS limitations.

I also have a boxed 2TB SN850X (heatsink version) I might test with. Undecided yet whether the PC or the PS5 gets it; it was an impulse buy based on recent price drops.
 
RTX IO doesn't require DS, so hopefully we see use of that instead, as that's not dependent on Windows 11.
RTX IO is GPU decompression, which is one of the features of DirectStorage. So if a game supports RTX IO, then it will also support DS, and vice versa.

The benefit of RTXIO is that it doesn't care what OS or hardware you have, it's open source by Nvidia to work on any platform that wants to support it. It just works better on RTX hardware because the hardware is engineered for it, and the driver is optimised for it.
 

Check the average column and not the peak. Average power draw is 10 W higher, average core clock and memory clock are both several hundred MHz higher, and both GPU core load and VRAM load are 10% higher.

The GPU is definitely working harder with DS 1.2 as I expected. It's cool to see how much harder and get that confirmation, thanks mate.
 
It is working a bit harder, but ultimately the main takeaway is up to 10w higher power draw as an average, is this meaningful in the real world? That is open to debate given we are running cards that run at 350 watts or more during these gaming sessions :p

But yeah, to see the differences between DS 1.2 vs 1.1, this is interesting to compare for sure.
 

We'll see next week when we get our first DS 1.2 game.
 
What is the game?

Ultimately the test will be whether it stutters; that's what it really needs to fix.



 
Even if GPU decompression creates enough headroom that the CPU cores aren't bogged down and causing stutter, a game can still stutter if it's poorly optimised and barely makes proper use of even two cores on, say, an 8-core CPU. Case in point: most games that have come out in the last 2 years lol, resulting in months of patches. Even today games like Last of Us still have not quite nailed it, and all Unreal Engine games also suffer, as evidenced by Digital Foundry's recent video on the tech of UE 5.2.

Ratchet uses the Insomniac engine though and it's Nixxes doing the port, so most are expecting it to be largely excellent and well optimised. Spider-Man, for example, uses the same engine and basically runs excellently.
 
DS/RTXIO tech focus by DF, nothing new shown but seeing the technology outlined in this way in one place is really helpful as plenty of people are still asking what RTXIO is etc.


The main takeaway is that a SATA SSD with RTX IO loads textures faster in Portal Prelude than a Gen 3 NVMe drive without RTX IO. The absolute difference is small, but it's still a large percentage difference all the same, even though both instances are in the 1.x seconds to load range. Place this function into a modern game with vastly larger textures in an open world environment, like with Ratchet next week, and those 0.x second differences could suddenly become several seconds of difference. That could mean a loading pause or hitch versus a seamless transition passing through rifts.
 
Some very interesting details in there, like DS/RTX IO making a SATA drive faster for gaming than an NVMe drive with no DS/RTX IO.

Also nice to see my other theory confirmed: GPU type may affect performance, and decompression performance does vary between GPUs.

Now while not massive in Portal since it's such a simple looking game, it does show that an RTX 4090 with RTX IO/DS 1.2 is going to load data faster than a system with a lower end GPU. It's not significant so no one needs to worry about it: on an otherwise identical PC with a 12900K and an NVMe storage drive, a 4090 gives about 30% faster decompression and loading times than an RTX 2060, and 10% to 15% faster than an RTX 4070.
 
It makes sense since GPU decompression leverages the compute potential of the GPU, so the more powerful the GPU, the faster the decompression will be. Unlike for example the hardware AV1 encoders on all 40 series cards where encoding speeds are largely the same regardless of the model of 40 series since it's the dedicated chip which is the same across the card models in the family - Any differences will come from the rest of the system which is still leveraged to a lower degree, but still important all the same.

What I'd really like to see is normal decompression being done on the GPU now, like if I unzipped a large download then instead of what currently happens, the CPU doing all the work, the GPU is used instead. GPU decompression is used in the medical industry this way too for MRI/CT scans and suchlike where multi-GB/TB scans take place so there's no reason Windows apps can't make use of it either for stuff like decompressing files via the GPU instead surely. Obviously you'd need to have fast NVMe drives to write the data to and this is where PCIe Gen 4 drives and above come into their own as both read and write speeds will be paramount.

Edit*
Did some googling and it seems WinZip supports GPU acceleration for encryption:
 
There was a period when I would seek out the best, but now I don't care so much. WinRAR does the job. What exactly does 7-Zip do better?
 