Adding operational modes that process low-precision workloads more efficiently isn't exactly adding hardware (unless you're referring to the shader cores themselves versus their X1/X counterparts, in which case, sure), but that's beside the point at this juncture. Nobody was claiming the XSX GPU doesn't have the means to accelerate DirectML workloads, just that it doesn't have hardware designed for and aimed squarely at that singular purpose à la tensor cores; the latter is what I and others who've replied to Alucardx23 were referring to with "dedicated hardware". I'm sure you can agree there's a distinction to be made between machine-learning-oriented shader cores and a pool of discrete machine-learning-oriented cores, and that's what we were trying to illustrate to Alucardx23, as he was conflating the two. That's all. The argument was never "The XSX GPU can't do what the tensor cores do."
Late edit: You may also want to read Liabe Brave's posts here and here (the latter especially). As it turns out, the ability to use an FP32 operation to process 4x INT8 or 8x INT4 isn't something Microsoft added; rather, it's inherent to the RDNA architecture. This means, incontrovertibly and unequivocally, that the XSX GPU isn't unique in that regard: said functionality is also supported by the PS5 GPU, any RDNA2-based GPUs AMD has in R&D, and even last year's RDNA1-based Radeon 5700 XT.
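As a quick illustration of that packed-math capability, here's a minimal Python sketch. The dot4_i32_i8 name just mirrors the dp4a-style dot-product instructions RDNA exposes (something along the lines of V_DOT4_I32_I8); it only emulates the arithmetic, it's not shader code:

```python
import numpy as np

# Minimal sketch (not vendor code) of what a packed dot-product instruction
# does in a single ALU slot: four INT8 multiplies plus an INT32 accumulate,
# occupying the same 32-bit lane and issue slot that one FP32 operation would.
def dot4_i32_i8(a_packed, b_packed, acc):
    """Emulate four INT8 multiply-accumulates into an INT32 accumulator."""
    return int(acc + np.dot(a_packed.astype(np.int32), b_packed.astype(np.int32)))

a = np.array([12, -7, 33,  5], dtype=np.int8)  # four INT8 values in one 32-bit register
b = np.array([-2, 19,  4, 11], dtype=np.int8)
print(dot4_i32_i8(a, b, acc=0))  # one op, four MACs; INT4 packs eight values the same way
```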
I never said it was unique to the SX; in fact, I said both consoles, as I was aware that this is an RDNA2 feature.
However, as Dictator pointed out, the consoles have a custom RDNA2, so they may not necessarily pick up all available features, and he said Sony hasn't confirmed that feature to them. Now, I think that's more down to their messaging being a bit odd than to them not having it.
However, I've been doing a bit of research on the topic; if anyone wants to validate these findings, I more than welcome the feedback:
- I remembered that back in the Xenos days, each of its ALU cores had a vector and a scalar unit. The vector unit could do the same operations as the scalar unit, but on up to a 4-wide vector at once. This was touted at the time as a big advantage because they could perform one vector multiply-add per cycle.
However, during real-world usage it was noted that most operations were scalar, and that had one effect: most of the ALU hardware was wasted because the vector unit sat underutilized when doing scalar math. So for the next generation of the architecture, AMD replaced the vector unit with 4 scalar units that could still be ganged together for vector operations.
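Just to illustrate that underutilization effect with made-up numbers (this is not Xenos data, purely a sketch of the argument):

```python
# Hypothetical back-of-the-envelope illustration, assuming a 4-wide vector
# unit that issues one instruction per cycle no matter how many of its lanes
# the instruction actually needs.
def vector_lane_utilization(instruction_widths):
    """Average fraction of the 4 vector lanes doing useful work."""
    return sum(min(w, 4) for w in instruction_widths) / (4 * len(instruction_widths))

# A shader dominated by scalar math, with the occasional vec3/vec4 operation:
mostly_scalar = [1, 1, 1, 4, 1, 1, 3, 1, 1, 1]
print(f"~{vector_lane_utilization(mostly_scalar):.0%} of the vector ALU lanes are busy")
```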
- A similar thing happens with the tensor cores. The tensor cores are set up to work on matrices: they can do a full 4x4 matrix multiplication in one cycle. For regular shader ALUs, I believe you need to distribute the matrix operation across 4 shader cores to get it done in a single cycle. That's where RPM comes in: by using INT8 operations you can pack a full 4x4 matrix into the registers and perform one 4x4 matrix multiplication per cycle. For dimensions higher than 4x4 you still need more cores, but the same is true for the tensor cores; Nvidia also distributes the load across multiple cores if the matrix is too big.
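A rough sketch of what that distribution looks like, in plain NumPy (just to show the arithmetic, not GPU code):

```python
import numpy as np

# A 4x4 INT8 matrix multiply is 16 dot products of length 4. With packed INT8
# math each dot product is one shader-lane op, so the work is spread across
# lanes/cycles; a tensor core consumes the same 4x4 problem as a single
# matrix-multiply-accumulate operation.
A = np.random.randint(-128, 128, size=(4, 4)).astype(np.int8)
B = np.random.randint(-128, 128, size=(4, 4)).astype(np.int8)

# "Shader-style": 16 independent length-4 dot products (one packed op each)
C_shader = np.empty((4, 4), dtype=np.int32)
for i in range(4):
    for j in range(4):
        C_shader[i, j] = np.dot(A[i].astype(np.int32), B[:, j].astype(np.int32))

# "Tensor-core-style": the whole 4x4 product as one matrix operation
C_tensor = A.astype(np.int32) @ B.astype(np.int32)
assert np.array_equal(C_shader, C_tensor)  # same result, very different op counts
```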
- The above is likely why DLSS has so far been implemented without the tensor cores, and performance-wise it still ran very well across the whole RTX lineup. So what are the advantages of the tensor cores, then? From what I gather, one is that they can perform a full 4x4 matrix multiplication in FP16 per cycle, and roughly double that rate with INT8, so in either FP16 or INT8 each tensor core can perform far more matrix calculations per cycle than a single shader core.
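To put the per-core gap in rough numbers (these are the commonly cited per-operation shapes; treat them as assumptions rather than spec-sheet facts):

```python
# Back-of-the-envelope multiply-accumulates per issued operation.
macs_per_op = {
    "scalar FP32 FMA (shader lane)":  1,          # one multiply-add
    "packed INT8 dot4 (shader lane)": 4,          # four INT8 MACs in one 32-bit lane
    "tensor core 4x4x4 FP16 MMA":     4 * 4 * 4,  # a full small-matrix product per op
}
for name, macs in macs_per_op.items():
    print(f"{name:34s} {macs:3d} MACs per op")
```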
- The problem is that the tensor cores suffer from a similar issue to Xenos: they excel at matrix operations but are severely underutilized for scalar or smaller matrix workloads. I could only find partial documentation on the subject, but Nvidia actually acknowledges it. The tensor cores also target non-consumer workloads, though: for example, Nvidia wants their GPUs used in clusters for training the models, not just running them (especially true for FP16 matrix math), and even there utilization of the tensor cores is nowhere near the theoretical peak. This is supported by benchmarks comparing a 2080ti to a 1080ti in machine learning performance. If you account for the fact that the 1080ti lacks some optimizations for int math and doesn't support RPM, and normalize the results just by the RPM multipliers, the 2080ti stays in most cases below a 40% improvement over the 1080ti.
- I found more references for what I said above, but this one has a nice summary:
news.ycombinator.com
Granted, 36% over a 1080ti is no slouch, but it also means that due to the poor utilization the tensor cores are delivering nowhere near the 110 TFLOPS (FP16) or 220 TOPS (INT8) peaks, and that's for the training phase, which in theory has higher utilization of the tensor cores than just running the model.
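Some very rough math on that, using the public peak figures and the ~36% number from the link above. This ignores memory bandwidth and the fact that neither card reaches its peak in practice, so it's only an order-of-magnitude sketch:

```python
# Rough utilization sketch: take the 1080ti's public FP32 peak as a baseline,
# apply the ~36% measured training speedup, and compare the implied effective
# throughput against the quoted FP16 tensor peak.
gtx_1080ti_fp32_peak = 11.3          # TFLOPS, public spec figure
rtx_2080ti_tensor_fp16_peak = 110.0  # TFLOPS, figure quoted above
measured_speedup = 1.36

implied_tflops = gtx_1080ti_fp32_peak * measured_speedup
print(f"implied effective throughput: ~{implied_tflops:.0f} TFLOPS")
print(f"fraction of the FP16 tensor peak: ~{implied_tflops / rtx_2080ti_tensor_fp16_peak:.0%}")
# -> roughly 15 TFLOPS, i.e. ~14% of the quoted tensor peak
```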
Keep in mind this isn't an "Nvidia lied" post or anything like that: the gap is simply because the theoretical operation count assumes a full matrix to multiply every cycle. If you don't have enough matrices to multiply, you're only using a fraction of the operations you could be performing per cycle.
The SX figures, on the other hand, are much lower because they count scalar operations. But the thing about scalar operations is that you can always use them: they carry a penalty when dealing with matrices, but the quoted performance can be fully achieved even by the simplest calculations.
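For reference, the Series X figures follow directly from the shader throughput and the packing factors, which is exactly why ordinary shader code can reach them (12.15 TFLOPS is Microsoft's public FP32 figure; the 2x/4x/8x factors are the standard packed-math multipliers):

```python
# Series X quoted ML figures derived from the FP32 shader throughput:
xsx_fp32_tflops = 12.15
print(f"FP16 (2x packed): {xsx_fp32_tflops * 2:.1f} TFLOPS")  # ~24.3
print(f"INT8 (4x packed): {xsx_fp32_tflops * 4:.1f} TOPS")    # ~49
print(f"INT4 (8x packed): {xsx_fp32_tflops * 8:.1f} TOPS")    # ~97
```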
With all that, I think that even though RTX can obviously achieve a much higher theoretical matrix operation rate, in real-world scenarios utilization of the matrix units is so low that you don't really see the benefit unless your code is all about multiplying matrices, which even ML training often isn't.
I don't know how that would translate to DLSS performance on the consoles, but it honestly sounds like it shouldn't be a problem for them. The fact that they have to give up graphics and compute ALU time to perform the ML workloads could be a problem, but as a counterpoint, I think you can easily win that back by scaling down the resolution a bit, and I assume the process scales somewhat with image resolution. So even an extreme case of a game running at 1080p on SX and scaling up to 4K could translate, on Lockhart, to the game running at 540p and scaling back to 1080p.
I would say the fact that Lockhart targets a resolution 4x lower than SX but only has about 3x less processing power could already account for that. They're setting Lockhart at a higher flops-per-pixel rate than SX, likely to account for work that doesn't scale with resolution, and perhaps also for the extra load of machine-learning upscaling while still reaching the same performance. This could even lead to situations where a game runs slightly below 4K on SX and gets reconstructed to a higher resolution, while on Lockhart it's still 1080p + reconstruction.
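Putting rough numbers on that flops-per-pixel point (the Series X figure is public; Lockhart's ~4 TFLOPS is only the commonly rumoured figure, so treat it as an assumption):

```python
# Compute budget per output pixel, assuming each console reconstructs up to
# its target resolution from a lower internal one.
series_x = {"tflops": 12.15, "output_pixels": 3840 * 2160}  # 4K target
lockhart = {"tflops": 4.0,   "output_pixels": 1920 * 1080}  # 1080p target (rumoured spec)

for name, spec in (("Series X", series_x), ("Lockhart", lockhart)):
    per_pixel = spec["tflops"] * 1e12 / spec["output_pixels"]
    print(f"{name}: ~{per_pixel / 1e6:.2f} MFLOPS per output pixel")
# Lockhart comes out ~30% higher per pixel, which is headroom for work (like
# ML upscaling) that doesn't shrink with resolution.
```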
Obviously we'll only know more once we see actual game code running on them, but so far I'm fairly confident that similar models (with a 4x resolution increase and that resolve sub-pixel detail) can be used effectively in real time on the consoles.