
JahIthBer

Member
Jan 27, 2018
10,382
Man, a lot of people are assuming Switch 2 is going to be a beast with the latest Ampere technology, even though the Switch used Maxwell when it was 3 years old. Simmer down the expectations, people.
I'm not going to lie though, if Nintendo made a high-end console with an Ampere GPU, that would be sick; it's just unlikely.
 
May 24, 2019
22,192
My only issue with DLSS at the moment is how it is incredibly oversharpened.

Yeah. I'd kind of rather have a soft bilinear upscale free of sharpening, ghosting and combing.

I think I'm pretty savvy at being annoyed by temporal artifacting. The water splashes drove me nuts in Uncharted 4, and I could never find anyone else who was bothered by them, or even noticed them, through Googling.
 

GrrImAFridge

ONE THOUSAND DOLLARYDOOS
Member
Oct 25, 2017
9,674
Western Australia
It is 100% a togglable slider. They left it as is probably just because basic market research told them people like images sharp to the point of having ringing artefacts.

The sharpening halos are the problem as I see it. The extra detail on inner surfaces in most shots, which makes it look different from native, is usually just because native has TAA over-averaging results on inner surfaces; turn off TAA in such shots and inner-surface detail actually looks remarkably like DLSS.

The Nvidia bloke who did the DLSS 2.0 presentation mentioned on Twitter that they neglected to include the DLSS sharpness slider because the game itself doesn't have one. A bit of a silly oversight, really, but I'm sure it'll be added soon enough.
 
Last edited:

EvilBoris

Prophet of Truth - HDTVtest
Verified
Oct 29, 2017
16,683
I think the interesting point is that to start getting into "differences" you have to do 400-800% zooms.

With normal 2160p checkerboarding, the over-averaging of pixel results makes the difference between a true native image and a checkerboarded result in side-by-sides noticeably different in aggregate.

Also, the fact that we are comparing and looking for "which is better / where does it fail" in side-by-sides with native, and zooming in to work out why it is deficient, while it is also 130% faster.

That is... rather unique.


It is 100% a togglable slider. They left it as is probably just because basic market research told them people like images sharp to the point of having ringing artefacts.


One of the lead developers behind the technology said sharpness is actually scalable; they just haven't included it in any games yet.
Whew.

I remember how upset I was with it when Xbox One seemed to do it by default on many titles.
 

Principate

Member
Oct 31, 2017
11,186
Man, a lot of people are assuming Switch 2 is going to be a beast with the latest Ampere technology, even though the Switch used Maxwell when it was 3 years old. Simmer down the expectations, people.
I'm not going to lie though, if Nintendo made a high-end console with an Ampere GPU, that would be sick; it's just unlikely.
Switch 2 is a good few years out. By the time Nintendo releases it, what is the latest Ampere technology now will be quite old, so there's a good chance it'll use it.
 

Beatle

Member
Dec 4, 2017
1,123
By the time Switch 2 comes out we may be looking at DLSS 3.0 and all of its advantages... the mind boggles. Exciting times ahead.
 

Matemático

Banned
Mar 22, 2019
332
Brazil
Man, a lot of people are assuming Switch 2 is going to be a beast with the latest Ampere technology, even though the Switch used Maxwell when it was 3 years old. Simmer down the expectations, people.
I'm not going to lie though, if Nintendo made a high-end console with an Ampere GPU, that would be sick; it's just unlikely.

Switch 2 should launch by 2023; Ampere will be 3 years old by then.
 

Deleted member 4970

User requested account closure
Banned
Oct 25, 2017
12,240
I played at 4K/30 ultra with my 2070 yesterday! DLSS 2.0 is the future! Hope it gets added to Metro Exodus
 

tuxfool

Member
Oct 25, 2017
5,858
The big TOPS differential makes the NVIDIA GPU more capable of DLSS-like algorithms (and of course that difference is powered by the tensor cores), but the specific point about the tensor cores being separate from the ALUs might not matter, because DLSS is implemented somewhere in the middle of the graphics pipeline, so the ALUs must wait for the DLSS algorithm to finish before they can continue. This would have been different if DLSS 2.0 slotted in at the end of the pipeline, but it doesn't.

Therefore, the TOPS difference alone should dictate the time differential for computing the DLSS upscaling.
I should point out that it isn't necessarily the position in the pipeline that causes this, but rather that they share execution resources with the SM.

They could slot in at the end and still prevent further computation because the SMs would be blocked from executing.
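To make the pipeline point concrete, here's a minimal sketch of where a DLSS-style pass sits in a frame (illustrative Python pseudocode with hypothetical function names; real integrations go through NVIDIA's NGX/engine plugins, not anything like this):

import numpy as np

def rasterize_and_shade(scene, res):
    # Stub for the main pass: runs on the shader ALUs at the lower internal
    # resolution and produces color, depth and per-pixel motion vectors.
    h, w = res
    return np.zeros((h, w, 3)), np.zeros((h, w)), np.zeros((h, w, 2))

def dlss_reconstruct(color, depth, motion_vectors, out_res):
    # Stub for the reconstruction step. A real implementation blends
    # reprojected history with the new samples using learned weights;
    # here we just nearest-neighbour upscale to keep the sketch runnable.
    oh, ow = out_res
    h, w = color.shape[:2]
    rows = np.arange(oh) * h // oh
    cols = np.arange(ow) * w // ow
    return color[rows][:, cols]

def apply_post_and_ui(image):
    return image  # Stub: post-processing and UI composite at output res.

def render_frame(scene, render_res=(1080, 1920), output_res=(2160, 3840)):
    color, depth, mvecs = rasterize_and_shade(scene, render_res)
    # The reconstruction sits mid-pipeline: it needs the low-res color plus
    # depth/motion vectors, and while it runs (on tensor cores or packed-math
    # shader ALUs) the SMs that share its execution resources can't progress.
    upscaled = dlss_reconstruct(color, depth, mvecs, output_res)
    # Post-processing and UI then run at the full output resolution, which is
    # why the upscale can't simply be bolted on after everything else.
    return apply_post_and_ui(upscaled)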
 

UltraMagnus

Banned
Oct 27, 2017
15,670
Switch 2 should launch by 2023; Ampere will be 3 years old by then.

It's fairly obvious too that the Switch was intended to launch for holiday 2016; it got pushed into Q1 2017 because the software wasn't ready, which had nothing to do with the hardware.

Even for a 2022 launch, 7nm Ampere will by then be fairly mature tech, not something crazy cutting-edge.
 

ppn7

Member
May 4, 2019
740
540->1080 didn't work for me. I could just see artifacts everywhere.

I didn't test it, but I can see these artifacts in Alex's video, around Jess and the moving environment. It looks like motion-interpolation artifacts on a TV.

My only issue with DLSS at the moment is how it is incredibly oversharpened.
I'm not sure if this is just because gamers have a terrible eye for that kind of thing and prefer it (so it's forced for extra BAMpop) or if it's an unfortunate side effect of the training.

I too prefer a slightly softer image to an oversharpened one. For me, aliasing is what reduces immersion.
 

Principate

Member
Oct 31, 2017
11,186
It's fairly obvious too that the Switch was intended to launch for holiday 2016; it got pushed into Q1 2017 because the software wasn't ready, which had nothing to do with the hardware.

Even for a 2022 launch, 7nm Ampere will by then be fairly mature tech, not something crazy cutting-edge.
Yeah, basically I'd expect a different generation of tech for the next Switch if Nintendo sticks with Nvidia. No reason not to.
 
Oct 25, 2017
14,741
540->1080 didn't work for me. I could just see artifacts everywhere.
But the question is, what about 540->1080 vs 720p with regular upscaling? Alex is comparing it to native resolution to show how efficient it is (or isn't), but if you had the power to run at native res, you wouldn't need DLSS in the first place. The real world comparison would be DLSS vs regular sub-native res.
 
Jun 2, 2019
4,947
Man, a lot of people are assuming Switch 2 is going to be a beast with the latest Ampere technology, even though the Switch used Maxwell when it was 3 years old. Simmer down the expectations, people.
I'm not going to lie though, if Nintendo made a high-end console with an Ampere GPU, that would be sick; it's just unlikely.

There's a difference between the Switch and the next installment of the machine.

Switch is basically using a leftover design from NVidia that nobody wanted.

When Nintendo decides it's time to release a new Switch (since they seem to have a 20-year contract), they're going to want a new chip, and Nvidia is not going to pass up the chance to get the extra marketing from implementing such techniques in a tablet-like device.

Whether they use Ampere or not, it will AT LEAST have tensor cores for DLSS, and probably RT cores if Nintendo wants to repeat the Switch's success when it comes to getting big third-party games.
 

eso76

Prophet of Truth
Member
Dec 8, 2017
8,115
Amazing.
I hope this kills the 8k nonsense before it even starts
 
May 24, 2019
22,192
But the question is, what about 540->1080 vs 720p with regular upscaling? Alex is comparing it to native resolution to show how efficient it is (or isn't), but if you had the power to run at native res, you wouldn't need DLSS in the first place. The real world comparison would be DLSS vs regular sub-native res.

I'd take a blown up native 720P. Artifacts bug me more than softness.

edit: The solution seems acceptable with the higher base reses, though. It's definitely useful.
 
Last edited:

Minsc

Member
Oct 28, 2017
4,123
But the question is, what about 540->1080 vs 720p with regular upscaling? Alex is comparing it to native resolution to show how efficient it is (or isn't), but if you had the power to run at native res, you wouldn't need DLSS in the first place. The real world comparison would be DLSS vs regular sub-native res.

Not to mention the framerate.

Would you rather play blurrier at 15-20 fps or sharper at 30-40 fps? Blurrier at 60 fps, or sharper at the magical 120 fps?

I dunno, there are lots of uses, and if you can adjust the sharpness of it as well... damn. I like my stuff sharp anyway.

I'd take a blown up native 720P easy. Artifacts bug me more than softness.

What about framerate? Would you take 15fps or 30+ fps? And did you see you can adjust the sharpness of the scaling too, to make it look softer like the traditional method?
 
May 24, 2019
22,192
What about framerate? Would you take 15fps or 30+ fps? And did you see you can adjust the sharpness of the scaling too, to make it look softer like the traditional method?

I'd just want a computer that could run the game at 60fps in at least native 1080P at that point :)

I know about the sharpening thing, but it's the artifacting you can see around the moving dark posts here that I don't like (I know it's compressed YouTube, but the native side didn't have it):
DnQlMI9.jpg

It reminds me of combing in a badly deinterlaced video.
 

Terbinator

Member
Oct 29, 2017
10,241

Minsc

Member
Oct 28, 2017
4,123
I'd just want a computer that could run the game at 60fps in at least native 1080P at that point :)

I know about the sharpening thing, but it's the artifacting you can see around the moving dark posts here that I don't like (I know it's compressed YouTube, but the native side didn't have it):
DnQlMI9.jpg

It reminds me of combing in a badly deinterlaced video.

Yeah, I'll have to see it in motion while playing a game to weigh the options properly, but I think it'll be an easier sell further into next gen for me, when the choice is either spend another $2000, or get awesome ray tracing (and higher framerates too, against all logic - more detail, more graphics and more framerate?) and deal with a bit of combing, versus skipping out on ray tracing entirely or having it turned way down.

Who knows, maybe down the road they'll release a version 2.5 that removes the bulk of the combing artifacts too.
 

ppn7

Member
May 4, 2019
740
Next step for Nvidia: trying to do motion interpolation (30fps looking like 60fps) with low input lag and no artifacts.
 

Fafalada

Member
Oct 27, 2017
3,066
I think the interesting point is that to start getting into "differences" you have to do 400-800% zooms.
The <1080p upscales didn't need zooming - there was a ton of temporal artifacting in your video that was actually rather disturbing to me.
The 1080p+ ones do hold up well in normal viewing, though (sharpening defaults aside). I wish that NVidia presentation had actually gone into more detail - it was disappointingly marketing-oriented, though I suppose they aren't interested in discussing implementation details at this stage.

Anyway, as this gen has sorted out temporal stability on regular displays for the most part, I really want to see how this holds up on low-persistence displays (you guys actually have those fancy high-spec CRT panels, so you should be able to test some of that), especially for VR (the part of the talk that alluded to, but shared no details about, how the neighbour clamping improves things will play a big role here), as that's the next image-quality step to take for future hw/sw, and many current AA/reconstruction methods just don't make that cut.
 

tusharngf

Member
Oct 29, 2017
2,288
Lordran
Man, a lot of people are assuming Switch 2 is going to be a beast with the latest Ampere technology, even though the Switch used Maxwell when it was 3 years old. Simmer down the expectations, people.
I'm not going to lie though, if Nintendo made a high-end console with an Ampere GPU, that would be sick; it's just unlikely.

Imagine an RTX 3060 for the Switch Pro docked version :D
 

Dictator

Digital Foundry
Verified
Oct 26, 2017
4,930
Berlin, 'SCHLAND
The <1080p upscales didn't need zooming - there was a ton of temporal artifacting in your video that was actually rather disturbing to me.
The 1080p+ ones do hold up well in normal viewing, though (sharpening defaults aside).
I was talking about the 2160p results here, but yes.
I wish that NVidia presentation had actually gone into more detail - it was disappointingly marketing-oriented, though I suppose they aren't interested in discussing implementation details at this stage.
I wish it had also gone a bit more into how the NN weighting was done, rather than just giving an overview of what it is trying to do better than other clamping/history-buffer solutions. Although it was enjoyable seeing the various ideas about reprojecting and rejecting laid out on a 2D graph.
 
OP
ILikeFeet

DF Deet Master
Banned
Oct 25, 2017
61,987
It's not doubling frames like they're saying. It's letting you use lower resolution in order to pump out more frames (right?)

I think they're looking for the magic juice to run 30fps things through that is good looking and playable.
I'm curious as to why motion interpolation would be desired over upscaling and using the overhead to boost framerate and visuals. I'd assume the latter is way easier than the former. Though I guess having motion vectors would help interpolation significantly
 

ppn7

Member
May 4, 2019
740
For videos? Because for games, that's what this is (sans the no artifacts).

Yeah, both give similar results but not in the same way. It could be used for gaming too.

It's not doubling frames like they're saying. It's letting you use lower resolution in order to pump out more frames (right?)

I think they're looking for the magic juice to run 30fps things through that is good looking and playable.

Motion interpolation puts an image between the previous and the next frame. You're right, it's not the same as DLSS.

I'm curious as to why motion interpolation would be desired over upscaling and using the overhead to boost framerate and visuals. I'd assume the latter is way easier than the former. Though I guess having motion vectors would help interpolation significantly

I don't know if one day we will see DLSS implemented directly in TVs the way the G-Sync module was in monitors. But right now AI motion interpolation works on any content; it's not as good as using DLSS to get more FPS, but it works, albeit with high input lag and/or many artifacts.

I'm guessing the question was for passing through locked framerate things like existing console games. A working alternative to using a TV's motion smoothing.

It was a question about TV manufacturers. I don't know if they will keep going with AI motion interpolation and their own AI upscaling, or if maybe Nvidia could partner with them and try to innovate in the TV market.
I think TV processors are too weak compared to tensor cores, right? So I'm not sure we will see any big improvement on TVs.
 

ppn7

Member
May 4, 2019
740
Oculus solved this a few years ago - didn't need ML for it either although maybe there's additional quality gains possible.

I didn't know that, but it seems it only works when you get FPS drops, while on a TV it's always on once you choose to use it.
And I don't know if it works well in VR?
 

Fafalada

Member
Oct 27, 2017
3,066
I didn't know that, but it seems it only works when you get FPS drops, while on a TV it's always on once you choose to use it.
Well, if you're already at the target framerate there isn't really any use in inserting additional frames. Anyway, if you want low-latency response you can't really interpolate; it needs to be a form of extrapolation - and that needs access to control inputs (and potentially game data), so a TV is the wrong place to do the processing.
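(For illustration only - a minimal NumPy sketch of the extrapolation idea, i.e. pushing the last rendered frame forward along its motion vectors instead of waiting to interpolate between two finished frames. It's not any vendor's actual implementation, and it leaves the disocclusion holes a real solution has to fill:)

import numpy as np

def extrapolate_frame(last_frame, motion_vectors):
    # last_frame: (h, w, 3) image; motion_vectors: (h, w, 2) in pixels per frame.
    h, w = last_frame.shape[:2]
    out = np.zeros_like(last_frame)
    ys, xs = np.mgrid[0:h, 0:w]
    # Forward-project each pixel along its motion vector to guess the next frame.
    tx = np.clip(xs + np.rint(motion_vectors[..., 0]).astype(int), 0, w - 1)
    ty = np.clip(ys + np.rint(motion_vectors[..., 1]).astype(int), 0, h - 1)
    out[ty, tx] = last_frame[ys, xs]
    return out  # holes remain where geometry was revealed (disocclusion)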

And I don't know if it works well in VR?
It was built 'for' VR, so yes. It's possible to do something equivalent on flat-screen games, but it will require API/driver and per-game modifications to integrate with the process.

I don't know if one day we will see DLSS implemented directly in TVs
We won't - for the same reason as above. DLSS integrates 'inside' the rendering pipeline not at the end of it - and it needs access to internal data from the game to work.
 

Lukas Taves

Banned
Oct 28, 2017
5,713
Brazil
Adding operational modes that process low-precision workloads more efficiently isn't exactly adding hardware (unless you're referring to the shader cores themselves versus their X1/X counterparts, in which case, sure), but that's beside the point at this juncture. Nobody was claiming the XSX GPU doesn't have the means to accelerate DirectML workloads, just that it doesn't have hardware designed for and aimed squarely at this singular purpose a la tensor cores; the latter is what myself and others who've replied to Alucardx23 were referring to with "dedicated hardware". I'm sure you can agree there's a distinction to make between machine learning-oriented shader cores and a pool of discrete machine learning-oriented cores, and that's what we were trying to illustrate to Alucardx23, as he was conflating the two. That's all. The argument was never "The XSX GPU can't do what the tensor cores do."

Late edit: You may also want to read Liabe Brave's posts here and here (the latter especially). As it turns out, the ability to use an FP32 operation to process 4x INT8 or 8x INT4 isn't something Microsoft added; rather, it's inherent to the RDNA architecture. This means, incontrovertibly and unequivocally, that the XSX GPU isn't unique in that regard: said functionality is also supported by the PS5 GPU, any RDNA2-based GPUs AMD has in R&D, and even last year's RDNA1-based Radeon 5700 XT.
I never said it was unique to SX; in fact, I said both consoles, as I was aware that this is an RDNA 2 feature.

However, as Dictator pointed out, the consoles have custom RDNA 2, so they may not necessarily pick up all available features, and he said Sony hasn't confirmed that feature to them. Now, I think that's more due to their messaging being a bit weird than them not having it.

However, I've been doing a bit of research on the topic; if anyone wants to validate these findings, I'm more than welcoming of feedback:

- I remembered that back in the Xenos days, each of its ALUs had a vector unit and a scalar unit. The vector unit could do the same operations as the scalar unit, but on up to a 4-wide vector at once. This was touted at the time as a big advantage because they could perform one vector multiply-add per cycle.
However, in real-world usage it turned out that most operations were scalar, and that had one effect: most of the ALU hardware was wasted because the vector unit sat underutilized during scalar math. So for the next generation of the architecture, AMD replaced the vector unit with 4 scalar units that could together support a vector operation.

- A similar thing happens with the tensor cores. The tensor cores are set up to operate on matrices; they can do a full matrix multiplication in one cycle. With regular shader ALUs, I believe you need to distribute the matrix operation across 4 shader cores to have it performed in a single cycle. That's where RPM comes in: by using INT8 operations you can pack a full 4x4 matrix into the registers and perform one 4x4 matrix multiplication per cycle. For dimensions higher than 4x4 you still need more cores, but the same is true for the tensor cores; Nvidia also distributes the load across multiple cores if the matrix is too big. (There's a quick arithmetic sketch of this after the list.)

- The above is likely why DLSS has so far been implemented without the tensor cores (as with 1.9) and still performed very well across the whole RTX lineup. So what are the advantages of the tensor cores then? From what I gather, one is that they can perform 4x4 matrix calculations in FP16 space, so in INT8 they can actually perform one 16x16 matrix multiplication per cycle; either way, in FP16 or INT8, each core can do far more matrix calculations per cycle than a single shader core.

- The problem is that the tensor cores suffer from a similar issue to Xenos: they excel at matrix operations but are severely underutilized for scalar or smaller-matrix workloads. I could only find partial documentation on the subject, but Nvidia acknowledges this. The tensor cores also target non-consumer workloads; for example, Nvidia wants its GPUs used in clusters for training models, not just running them (especially true for FP16 matrix math), and even there utilization of the tensor cores is nowhere near as high. This is supported by benchmarks comparing a 2080 Ti to a 1080 Ti in machine learning performance: if you account for the fact that the 1080 Ti lacks some optimizations for integer math and doesn't support RPM, and normalize the results just by the RPM multipliers, the 2080 Ti stays in most cases below a 40% improvement over the 1080 Ti.

- I found more references for what I said above, but this one has a nice summary:
Granted, 36% over a 1080 Ti is no slouch, but it also means that, due to poor utilization, the tensor cores are delivering nowhere near the 110 TFLOPS (FP16) or 220 TOPS (INT8), and that's for the training phase, which in theory has higher tensor-core utilization than running the model.
Keep in mind this is simply because the higher theoretical operation count assumes a full matrix to be multiplied. If you don't have enough matrices to multiply, you are only using a fraction of the operations you could be performing in a cycle; this isn't an "Nvidia lied" post or anything like that.
The SX figures, on the other hand, are much lower because they are scalar operations, but the thing about scalar operations is that you can always use them, so while they carry a penalty when dealing with matrices, the quoted performance is fully achieved even by the simplest calculations.
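A quick arithmetic sketch of the matrix and utilization points above (the per-clock tensor-core figure and the 1080 Ti peak are the commonly quoted ones, the rest are the numbers from this post; treat it all as illustrative rather than authoritative):

# Illustrative arithmetic only; peak figures are commonly quoted values,
# not measured or vendor-audited numbers.

# One Volta/Turing-style tensor core does a 4x4x4 matrix FMA per clock:
tensor_fma_per_clock = 4 * 4 * 4                    # 64 fused multiply-adds
tensor_flops_per_clock = tensor_fma_per_clock * 2   # 128 FLOPs (FP16 inputs)

# A scalar shader ALU does 1 FMA per clock; with 4x INT8 packing (RPM-style)
# one 32-bit lane carries 4 INT8 multiply-accumulates at once.
packed_int8_fma_per_alu_clock = 4
print(tensor_fma_per_clock // packed_int8_fma_per_alu_clock)  # 16 packed shader ops ~ 1 tensor op

# The utilization point, using the figures quoted above:
theoretical_gap = 110 / 11.3   # ~9.7x: quoted 2080 Ti tensor FP16 peak vs ~11.3 TFLOPS FP32 on a 1080 Ti
measured_gap = 1.36            # the ~36% benchmark improvement cited above
print(measured_gap / theoretical_gap)  # ~0.14, i.e. most of the matrix peak goes unused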

With all that, I think that even though RTX can obviously achieve much higher theoretical matrix throughput, in real-world scenarios utilization of the matrix units is so low that you don't really see the benefits unless your code is all about multiplying matrices, which even ML training often isn't.

I don't know how that would translate to DLSS performance on consoles, but it honestly sounds like it shouldn't be a problem for them. The fact that they have to give up graphics and compute ALU time to perform the ML workload could be an issue, but as a counterpoint I think you can easily win that back by scaling the resolution down a bit, and I assume the process scales somewhat with image resolution. So even an extreme case of a game running at 1080p on SX and scaling up to 4K could see the same game running at 540p on Lockhart and scaling back up to 1080p. I'd say the fact that Lockhart targets a resolution 4x lower than SX while only having about 3x less processing could already account for that: they are setting Lockhart up with a higher flops-per-pixel rate than SX, likely to cover work that doesn't scale with resolution and perhaps also the extra load of machine-learning upscaling, while still reaching the same performance. This could even lead to situations where the game is slightly below 4K on SX and reconstructed to a higher resolution, while on Lockhart it's still 1080p and reconstructed.
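In rough numbers (the SX figure is the announced one, the Lockhart figure is just the rumoured ~4 TF implied by "3x less processing", so treat this as purely illustrative):

# Illustrative only: Lockhart's ~4 TF is a rumour, SX's 12.15 TF is the announced figure.
sx_tflops, lockhart_tflops = 12.15, 4.0
pixels_4k = 3840 * 2160        # ~8.29 Mpixels
pixels_1080p = 1920 * 1080     # ~2.07 Mpixels, exactly a quarter of 4K

sx_tf_per_mpixel = sx_tflops / (pixels_4k / 1e6)                 # ~1.47
lockhart_tf_per_mpixel = lockhart_tflops / (pixels_1080p / 1e6)  # ~1.93
print(lockhart_tf_per_mpixel / sx_tf_per_mpixel)  # ~1.3x more compute per output pixel on Lockhart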

Obviously we will only know more once we see actual game code running on them, but thus far I'm fairly confident that similar models (with a 4x resolution increase and that kind of resolved subpixel detail) can be used effectively in real time on the consoles.
 

Dictator

Digital Foundry
Verified
Oct 26, 2017
4,930
Berlin, 'SCHLAND
So what are the advantages of the tensor cores then? From what I gather, 1 is that they can actually perform matrix 4x4 calculations on an FP16 space,
I think the advantage is the relative TOPS to die space - an RTX 2060 will most likely be slower than the XSX GPU in almost everything, but separate tensor cores mean it is 2x faster at INT8, INT4, etc.
Which is really not too bad for such a small GPU - it makes ML really practical for real time with those numbers.
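Rough numbers behind that (the peaks are the commonly quoted figures and approximate; sustained throughput is another matter):

# Approximate, commonly quoted peaks; illustrative only.
xsx_fp32_tflops = 12.15
xsx_int8_tops = xsx_fp32_tflops * 4    # ~49 TOPS via 4x INT8 packing on the shader ALUs
xsx_int4_tops = xsx_fp32_tflops * 8    # ~97 TOPS via 8x INT4 packing

rtx2060_tensor_int8_tops = 103         # quoted tensor-core peak for the 2060 (approx.)
rtx2060_tensor_int4_tops = 206

print(rtx2060_tensor_int8_tops / xsx_int8_tops)  # ~2.1x - the "2x faster at INT8" above
print(rtx2060_tensor_int4_tops / xsx_int4_tops)  # ~2.1x at INT4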
 

Galava

▲ Legend ▲
Member
Oct 27, 2017
5,080
Really looking forward to seeing DLSS 2.0 on Monster Hunter World AND, of course, Cyberpunk 2077... that's going to be interesting.
 

Rndom Grenadez

Prophet of Truth
Member
Dec 7, 2017
5,637
Basically just one of the best forms of upscaling we've seen so far. You can render a game internally at half the final output resolution and it still looks shockingly good. On PC it's being touted as a way to use GPU intensive features like ray tracing without as big of a performance hit, and on a Switch 2 it could mean games looking much better in docked mode.
Games will look prettier with the same old lower-than-low resolution.
Games can be rendered at lower resolution and still have really good IQ, because it's not really the loss of texture resolution or visual effects that hurts Switch ports, but the low-ass resolution. This solves that problem.
On Switch 2, XSX and PS5 ports could run at 540p upscaled to 1080p undocked and at 1080p upscaled to 4K docked.
Whoa, so what you're saying is that there's potential for Nintendo not to be left behind and still receive similar 3rd-party support next gen, because they won't need to output a resolution higher than 540p? Leaving the Switch 2 GPU to focus on image quality rather than resolution, since resolution will be bumped from 540p to 1080p by DLSS?
 

Dekuman

Member
Oct 27, 2017
19,026
I think it's more that AI upscaling just isn't a magic bullet. Some TVs have had AI upscaling for years (not realtime, done with a trained DSP). It's pretty hard to see the supposed advantages. Game engines do have more potential upside, though, since it's not totally post-process.

That is true. I think DF's excitement is that it's being done right now with existing hardware - actually the exact same hardware as in the Switch - being used to upscale and clean up video streaming.

So as a jumping-off point, I get the impression they were postulating what else might be possible in a future iteration with all-new hardware specifically designed for upscaling.

My point (commenting on the video) was that their testing is just basic upscaling/sharpening and has nothing to do with DLSS 2.0, which has specific hardware behind it. The results could be much better on a gaming device with such technology.
 
OP
ILikeFeet

DF Deet Master
Banned
Oct 25, 2017
61,987
Whoa, so what you're saying is that there's potential for Nintendo not to be left behind and still receive similar 3rd-party support next gen, because they won't need to output a resolution higher than 540p? Leaving the Switch 2 GPU to focus on image quality rather than resolution, since resolution will be bumped from 540p to 1080p by DLSS?
Don't go in expecting better 3rd-party support. That's just a fact of life for Nintendo.
 

ToD_

Member
Oct 27, 2017
405
As many others have said by now, the artifacts that look like too much sharpening was applied (white and black halos near high-contrast edges) are distracting. If this can indeed be adjusted, that would be swell. That being said, the technology really is impressive regardless. I am looking forward to seeing how DLSS matures, and I'm hopeful a lot more games start making use of it. It kind of seems like a no-brainer to implement this in PC games going forward, given that (from my understanding) training is no longer needed on a game-by-game basis.

Also, that was a great video. Those ray tracing and DLSS analyses are very interesting.
 

xyla

Member
Oct 27, 2017
8,385
Germany
Has anyone here tried running 900p upscaled to 1440p with a 2060 Super?

If the game ever leaves the Epic store, I wanna try it, and seeing that the results for the 2060 are okay for 1080p, maybe by then I'll have a 1440p monitor but still the same card.

Would love to know if it can handle everything on high but DLSS'd to 1440p at 60.
 

Lukas Taves

Banned
Oct 28, 2017
5,713
Brazil
I think the advantage is the relative TOPS to die space - an RTX 2060 will most likely be slower than the XSX GPU in almost everything, but separate tensor cores mean it is 2x faster at INT8, INT4, etc.
Which is really not too bad for such a small GPU - it makes ML really practical for real time with those numbers.
But that's the thing I was trying to get at: it offers more theoretical performance because in theory each unit can do 4x more ops, but those numbers assume matrix math. For scalar operations each tensor core performs the same as a shader core, so even with tensor cores it could be that a 2060 doesn't match the SX's ML performance in practice, because the matrix units would be underutilized.

Of course, that's based on current benchmarks; I don't know if tensor cores being a thing will lead to more code using matrix math to take advantage of them. Perhaps that's what Nvidia did going from DLSS 1.9 to 2.0, for instance.

And if they did, then yeah, even a 2060 should pull way ahead of the SX's performance.
 

Fafalada

Member
Oct 27, 2017
3,066
Most of the ALU hardware was wasted because the vector unit sat underutilized during scalar math. So for the next generation of the architecture, AMD replaced the vector unit with 4 scalar units that could together support a vector operation.
This is a continuous balancing act - scalar execution units have higher efficiency but trade off die-space / power / thermals for it. IIRC RDNA is back on SIMD execution units.
Also worth noting that this is the very same argument of "narrow vs wide" and associated efficiencies applied to Mhz vs CU count...

Anyway software plays a big role in that as well - it's obviously relatively easy to construct benchmarks that favor one or the other - but how that maps to real-world use is a different question. Also note tensor accelerators typically don't just outperform GPUs - they do so at much better efficiency, which may not matter to a console as much but it's relevant in the types of applications NVidia targets.
But even if costs turn out higher on console, reconstruction performance savings will more than offset that. The better question is how competing models to DLSS are going to turn out. NVidia has spent a fair amount of time on this and they're not sharing implementation details.
 

ThreepQuest64

Avenger
Oct 29, 2017
5,735
Germany
Has anyone here tried running 900p upscaled to 1440p with a 2060 Super?
2070 Super user here, and 60fps isn't possible at 960p (volumetric lights and global reflections on medium, everything else maxed). I have to go with 720p, which also gives a little overhead for intense scenes. So with a 2060 Super, which has about 20% less performance IIRC, you would either turn down some ray-tracing options (I'd start with indirect diffuse lighting first, because it's subtle but needs a tremendous amount of performance) or need to scale from lower than 960p.
 

Lukas Taves

Banned
Oct 28, 2017
5,713
Brazil
This is a continuous balancing act - scalar execution units have higher efficiency but trade off die-space / power / thermals for it. IIRC RDNA is back on SIMD execution units.
Also worth noting that this is the very same argument of "narrow vs wide" and associated efficiencies applied to Mhz vs CU count...

Anyway software plays a big role in that as well - it's obviously relatively easy to construct benchmarks that favor one or the other - but how that maps to real-world use is a different question. Also note tensor accelerators typically don't just outperform GPUs - they do so at much better efficiency, which may not matter to a console as much but it's relevant in the types of applications NVidia targets.
But even if costs turn out higher on console, reconstruction performance savings will more than offset that. The better question is how competing models to DLSS are going to turn out. NVidia has spent a fair amount of time on this and they're not sharing implementation details.
Yeah, that's basically the point I wanted to make. Current workloads don't favor matrix performance that much, but that could simply be the nature of not having had good matrix performance until recently.

And that having dedicated cores is not exactly a good fit for a console when even a 2060 is much bigger than the whole SX APU. (Granted, the 2060 is on a bigger node, but I would expect the 2060 to still be bigger even at 7nm, and at that point that's significantly more die area for dedicated cores.)

And finally, yeah, I agree the actual implementation will matter more. The fact that even MS's demos were done based on a model generated by Nvidia shows how far ahead Nvidia might be on that.