
orzkare

Member
Apr 9, 2020
653
Japan
AMD Navi 21, 22 and 23 Massive Technical Specifications Leak – Flagship Navi 21 GPU To Have 80 CUs

Technical specifications of AMD's upcoming Navi 21, 22 and 23 GPUs have leaked out via Reddit user stblr (via VideoCardz). This is an absolute motherlode of a leak that the user has managed to put together with what appears to be painstaking accuracy. The information was reverse-engineered from drivers present in macOS and is likely legitimate. We will not be marking this as a rumor, as we have no reason to believe this information is inaccurate in any way.

Navi 21: 80 CUs, 2.2 GHz boost clock and 22.5 TFLOPs of compute
Navi 22: 40 CUs, 2.5 GHz boost clock and 12.8 TFLOPs of compute
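For reference, the leaked TFLOPs figures follow directly from the CU counts and boost clocks. A quick sketch, assuming RDNA's 64 shaders per CU and 2 FP32 ops per clock:

```python
# FP32 TFLOPs = CUs * 64 shaders per CU * 2 ops per clock (FMA) * boost clock (GHz) / 1000
def rdna_tflops(cus, boost_ghz):
    return cus * 64 * 2 * boost_ghz / 1000

print(f"Navi 21: {rdna_tflops(80, 2.2):.1f} TFLOPs")  # ~22.5
print(f"Navi 22: {rdna_tflops(40, 2.5):.1f} TFLOPs")  # ~12.8
```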

wccftech.com

 

Dio

Member
Oct 25, 2017
8,097
I just hope, at the very least, that their RT solution is good and that they can get on the level of the hardware encoding NVENC offers. With at least both of these, I'm game.
 

Ra

Rap Genius
Moderator
Oct 27, 2017
12,203
Dark Space
People fighting the wars can stop calling the PS5's clock insane or worrisome now. 2.5 GHz, yeesh.

There has to be a Navi 21 with a cut-down CU count, no?
 

StereoVSN

Member
Nov 1, 2017
13,620
Eastern US
Interesting. Navi 21 looks pretty good. Let's see ray tracing and other features out of AMD come October. It's not like it will be possible to acquire a 3080 (or 3070) through normal means by then anyway.
 

Herne

Member
Dec 10, 2017
5,312
So that would put their top card past a 3070, but not quite a 3080? At least in terms of teraflops?
Nvidia and AMD teraflops are not analogous. Even cards from the same manufacturer but on different architectures are not analogous.

The most recent rumours (to be taken with a liberal pinch of salt, of course), put the top Navi 21 product above the 3080, who knows how far above.
 

TheNerdyOne

Member
Oct 28, 2017
521
Oh, new thread, yay. Yeah, RDNA2 is shaping up to be a complete monster, and anyone still doubting that after we have hard specs from AMD themselves is delusional.
 

TheNerdyOne

Member
Oct 28, 2017
521
Sounds great, but without a very solid DLSS competitor will they be able to compete?
When DLSS gains support in more than six games per year, this might become relevant, but as it stands, 12 supported titles in two years is irrelevant in the grand scheme of things. AMD might be able to counter by releasing a GPU that's fast enough to overcome the performance gains DLSS provides. Since Navi 21 is only a 240W GPU per the spec, they could release a 350W card like Nvidia did at the top end. Will they? I don't know, but they could. A larger part at a lower clock speed could gain them another 25-30% performance at the very top of the stack; the only question is whether they feel the need to, if the current 80 CU part is already competitive or faster.
 

mordecaii83

Avenger
Oct 28, 2017
6,860
Ampere's TF is based off of FP16 performance. It's not a 1:1 comparison.

80CU RDNA2 should compete with 3080 on specs alone.
No it's not, Ampere has 30TF FP32 performance.

Edit: It's even in the article from the OP; it says "29.8 TF" single precision (aka FP32). FP64 is double precision and FP16 is half precision.
 

TheNerdyOne

Member
Oct 28, 2017
521
No it's not, Ampere has 30TF FP32 performance.

Great, then why is it only 50-80% faster than a 10 TF Turing part? Did they really lose that much perf/flop? AMD clearly hasn't lost any with RDNA2, and in fact will have gained some thanks to all the new features RDNA2 supports that RDNA1 doesn't (variable rate shading, mesh shaders, SFS, etc.). The part nobody seems to mention is that Ampere is UP TO 30 TF of FP32 performance, when you're not doing a single INT calculation at all, i.e. pretty much only for some types of compute and for tile-based rendering, not for gaming.

Anyway, they've demonstrated roughly 2x flops/watt with RDNA2 (a 238W RDNA2 part being 2.3x faster than a 225W RDNA1 part is absolutely massive, and blows away the 50% perf/watt they claimed earlier in the year). I think they baited Nvidia with the 50% perf/watt claims when they knew it was much higher than that. We'll see soon enough, I suppose.

Then again, Microsoft said at Hot Chips that the XSX GPU uses the same amount of power as the Xbox One GPU, which also backs up RDNA2 being insanely efficient vs RDNA1. (I know the Xbox One is GCN; the point is that the XSX SoC has to be 150W or less, of which the GPU must be around 120W, or the statement in Microsoft's slide doesn't work. If they're able to be 35% faster than a 5700 XT while drawing half as much power, that pretty much backs up what we're seeing here with the higher-end parts.)
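A minimal sketch of the perf-per-watt arithmetic described above, taking the quoted figures at face value (leaked and claimed numbers, not measurements):

```python
# Implied perf/watt gain = relative performance / relative power draw
rdna1_power_w = 225   # RDNA1 reference point (RX 5700 XT class board power)
rdna2_power_w = 238   # leaked Navi 21 board power
relative_perf = 2.3   # claimed Navi 21 speedup over that RDNA1 part

perf_per_watt_gain = relative_perf / (rdna2_power_w / rdna1_power_w)
print(f"Implied perf/watt gain: {perf_per_watt_gain:.2f}x")  # ~2.17x, i.e. "roughly 2x"
```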
 

mordecaii83

Avenger
Oct 28, 2017
6,860
Great, then why is it only 50-80% faster than a 10 TF Turing part? Did they really lose that much perf/flop? AMD clearly hasn't lost any with RDNA2, and in fact will have gained some thanks to all the new features RDNA2 supports that RDNA1 doesn't (variable rate shading, mesh shaders, SFS, etc.). The part nobody seems to mention is that Ampere is UP TO 30 TF of FP32 performance, when you're not doing a single INT calculation at all, i.e. pretty much only for some types of compute and for tile-based rendering, not for gaming. Anyway, they've demonstrated roughly 2x flops/watt with RDNA2 (a 238W RDNA2 part being 2.3x faster than a 225W RDNA1 part is absolutely massive, and blows away the 50% perf/watt they claimed earlier in the year).
Why do you want to war in every thread? Someone posted incorrect information, I simply posted a correction.
Woah, that's some awesome BW efficiency gains over Navi 1.0.

AMD most likely has the perf / TF crown this gen.
Yeah, it's very possible. Nvidia said they increased FP32 specifically to help scaling with RTX, but in rasterized workloads it's possible AMD takes the "perf/TF crown". We'll see how they each end up in a few years with next-gen games.
 

BoredLemon

Member
Nov 11, 2017
1,002
DirectML is a thing, so let's see how that turns out.
DirectML is just an API for doing machine learning-related work.
An API is probably the least important part of DLSS.
DLSS is first and foremost a combination of hardware and an AI model, both of which came out of Nvidia's long and massive investment in AI R&D.
And DirectML provides none of that.
 

TSM

Member
Oct 27, 2017
5,821
AMD most likely has the perf / tf crown this gen.

Third parties will be pushing ray tracing, and potentially DirectML, in both of which Nvidia should have a huge advantage given the dedicated hardware their GPUs have. Performance in big third-party games will slowly shift away from pure rasterization.
 

TheNerdyOne

Member
Oct 28, 2017
521
Third parties will be pushing ray tracing, and potentially DirectML, in both of which Nvidia should have a huge advantage given the dedicated hardware their GPUs have. Performance in big third-party games will slowly shift away from pure rasterization.

RDNA2 has dedicated hardware-accelerated RT too... everyone seems to think AMD's RT is a pure software solution when it absolutely isn't. 80 CU RDNA2 has 320 RT cores capable of 704 billion BVH intersect tests/second. This is a fact, and it's a hardware solution. Traversal is done in software; AMD clearly determined that was a better solution overall than doing traversal in hardware, or they would have also designed their RT cores for traversal. They're not stupid, no matter how much people want to claim they are. The RTX 3080 has 68 SMs; at its rated boost clock of 1710 MHz (I know it boosts higher than this in practice), it can do 116 billion BVH intersect tests per second (each RT core can only do one BVH intersect test per clock, and it still only has 68 of them).
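A small sketch of the arithmetic behind those throughput figures. The one-test-per-unit-per-clock rate and the unit counts (4 ray accelerators per CU, 1 RT core per SM) follow the post's assumptions; these are peak paper rates, not measured performance:

```python
# Peak BVH box-test throughput = units * tests per unit per clock * clock (GHz -> billions per second)
def box_tests_per_sec_billions(units, clock_ghz, tests_per_clock=1):
    return units * tests_per_clock * clock_ghz

navi21  = box_tests_per_sec_billions(80 * 4, 2.2)  # 4 ray accelerators per CU -> 320 units
rtx3080 = box_tests_per_sec_billions(68, 1.71)     # 1 RT core per SM, official boost clock

print(f"Navi 21:  {navi21:.0f} G tests/s")   # ~704
print(f"RTX 3080: {rtx3080:.0f} G tests/s")  # ~116
```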
 

Duxxy3

Member
Oct 27, 2017
21,699
USA
If it were just the TFLOPs number and it were somehow directly comparable, Navi 21 would be above the 3070 and 2080 Ti but below the 3080. Navi 22 looks like a direct replacement for the 5700 XT.

If I had to guess on pricing, I'd put Navi 21 at $599 and Navi 22 at $399: right between the 3070 and 3080, and right below the 3070, respectively.

Should be an interesting reveal.
 

Ra

Rap Genius
Moderator
Oct 27, 2017
12,203
Dark Space
When DLSS gains support in more than six games per year, this will become maybe relevant, but as it stands, 12 supported titles in two years is irrelevant in the grand scheme of things.
How many major releases are there a year? If those games are supporting DLSS (hint: they are), Nvidia has the advantage.

That is what matters.
 

TSM

Member
Oct 27, 2017
5,821
RDNA2 has dedicated hardware-accelerated RT too... everyone seems to think AMD's RT is a pure software solution when it absolutely isn't. 80 CU RDNA2 has 320 RT cores capable of 704 billion BVH intersect tests/second. This is a fact, and it's a hardware solution. Traversal is done in software; AMD clearly determined that was a better solution overall than doing traversal in hardware, or they would have also designed their RT cores for traversal. They're not stupid, no matter how much people want to claim they are.

It's yet to be shown that AMD's solution is as performant as Nvidia's dedicated hardware. Also Nvidia's RT hardware is in addition to their rasterization whereas AMD has to give up rasterization performance to do ray tracing.
 

TheNerdyOne

Member
Oct 28, 2017
521
How many major releases are there a year? If those games are supporting DLSS (hint: they are), Nvidia has the advantage.

That is what matters.

How many major releases there are will depend on who you ask, really, but it's more than six. I mean, Ubisoft puts out about that many games every year and those are considered major, and that's just one company, so yeah. Sure, not every game is a Cyberpunk-level event, but then we only get one or two of those a generation.
 

Ra

Rap Genius
Moderator
Oct 27, 2017
12,203
Dark Space
How many major releases there are will depend on who you ask, really, but it's more than six. I mean, Ubisoft puts out about that many games every year and those are considered major, and that's just one company, so yeah. Sure, not every game is a Cyberpunk-level event, but then we only get one or two of those a generation.
I'm glad we agree that DLSS is far from irrelevant.
 

scabobbs

Member
Oct 28, 2017
2,103
Benchmarks aren't too far away; we'll see what AMD's flagship can do and for what price. It would have to be one godly card to beat the 3080's price/perf. Feels like there's almost zero chance that happens.
 

TheNerdyOne

Member
Oct 28, 2017
521
I'm glad we agree that DLSS is far from irrelevant.
DLSS is irrelevant insofar as it's not widely supported; it took two years for 12 games to get support. It also isn't native image quality 100% of the time; there are tradeoffs. It's better in some ways and worse in others, so it's not a direct replacement for native rendering, and unless that changes, it's not apples to apples. Beyond that, the DLSS performance gain on the quality setting seems to be 30-35%, so all AMD has to do to counter is release a card with 35% more raster performance, and then it would still win, even in the apples-to-oranges comparison. AMD is absolutely capable of doing that based on what we're seeing here out of a 238W part, now that Nvidia has told the market that 350W+ at the high end is the new normal; AMD still has 110W to play with for a larger GPU if they actually end up needing it. I half expect them to drop a 350W monstrosity the day after Nvidia reveals whatever it's got coming next year to counter RDNA2. It would be a very fitting move, and a smart one, but hey, that's speculation. These leaked specs, however, are not speculation; it's already a monster, and those clock speeds are pure insanity...
 

Mirado

Member
Jul 7, 2020
1,187
Key points to keep in mind:

1) Please avoid comparing Nvidia FLOPS to AMD FLOPS. They aren't using the same metric.
2) We don't know how well these cards perform with different workloads. The top part might beat a 3080 but not if the workload is ray-traced.
3) AMD's driver team stinks, or at least isn't as good as Nvidia's. Real-world performance of AMD (and ATI before them) GPUs is often disappointing compared to what the hardware is capable of on paper. As an old ATI (and later AMD) fan, I've been bitten in the ass more than once by this.

With that said, I'm always up for more competition and choice, and I hope RDNA2 really does make a dent in Nvidia's high-end GPU monopoly. I'm tired of AMD having nothing better than an xx60 or xx70 competitor as their flagship.
 

TheNerdyOne

Member
Oct 28, 2017
521
It's yet to be shown that AMD's solution is as performant as Nvidia's dedicated hardware. Also Nvidia's RT hardware is in addition to their rasterization whereas AMD has to give up rasterization performance to do ray tracing.

When AMD has 5x more RT hardware, at a 10%+ higher clock speed to boot, and potentially a shader performance advantage as well (we'll find out soon), I'm not sure how much that's going to matter here. AMD can do 6x more BVH intersect tests/sec, and all of that is purely on dedicated hardware. The question then becomes: which is the more intensive task, the intersect testing or the traversal? And does it even matter when you're shader-bound on the other side of the RT pass anyway? We'll have to wait and see it tested. Either way, it's not as cut and dried as you're making it out to be; there are advantages and disadvantages to both companies' approaches, and clearly AMD's engineers thought the tradeoffs were worth it to get the performance they're getting with this solution, or they would have done something else.
 

RivalGT

Member
Dec 13, 2017
6,393
These just need to be priced well; even then I'm not sure if they will be able to compete with Nvidia's mid-range. I'm optimistic that these will perform well for the price.
 

TheNerdyOne

Member
Oct 28, 2017
521
These just need to be priced well; even then I'm not sure if they will be able to compete with Nvidia's mid-range. I'm optimistic that these will perform well for the price.

Nvidia's upper mid-range (3070) is ~35-40% faster than a 5700 XT, if it actually matches a 2080 Ti on average like Nvidia claims... uh, guess what? AMD's mid-range is 40-45% faster than a 5700 XT by the numbers, not accounting for the perf/flop increases that will come from support for the new DX12U features like VRS, mesh shading and SFS, which will shift that balance further. AMD's 80 CU part is significantly faster than a mere 35-40% better than a 5700 XT (it's twice the shaders, with definitely improved perf/flop, at a 300 MHz higher clock speed, and almost the same power draw). Price is a thing we definitely need to hear about, though; I expect they won't disappoint there, they're obviously coming to play this time.
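A rough check on those percentages, comparing the leaked Navi 22 figure against commonly cited 5700 XT numbers. Which 5700 XT clock you baseline against is an assumption here, and TFLOPs across architectures aren't directly comparable anyway:

```python
# Paper FP32 comparison: Navi 22 (leak) vs RX 5700 XT (public specs)
navi22_tf         = 12.8  # 40 CUs @ 2.5 GHz boost, from the leak
rx5700xt_game_tf  = 9.0   # ~1.755 GHz game clock
rx5700xt_boost_tf = 9.75  # 1.905 GHz boost clock

print(f"vs 5700 XT game clock:  +{(navi22_tf / rx5700xt_game_tf  - 1) * 100:.0f}%")  # ~+42%
print(f"vs 5700 XT boost clock: +{(navi22_tf / rx5700xt_boost_tf - 1) * 100:.0f}%")  # ~+31%
```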
 

Relix

Member
Oct 25, 2017
6,219
While AMD lacks a DLSS equivalent I won't even consider it; I went with a 3080 this generation. While it's true few games use it, it was only introduced two years ago. More and more games are coming with it, especially now that UE4 includes it in its pipeline. Also, as far as I am aware, Nvidia has better RT tech.
 

Deleted member 25042

User requested account closure
Banned
Oct 29, 2017
2,077
I think they'll be competitive in rasterization vs Ampere.
RT, less so.
ML isn't even a talking point yet for AMD afaic as we've seen nothing to think they're even ready to offer something similar to NV (please don't say DirectML..)

High-end Ampere being such a power hog was pretty disappointing to me, so if AMD can offer similar enough performance in rasterization plus decent RT in a 250-275W power envelope at a competitive price point, that'd be pretty nice.
 

Deleted member 34714

User requested account closure
Banned
Nov 28, 2017
1,617
When AMD has 5x more RT hardware, at a 10%+ higher clock speed to boot, and potentially a shader performance advantage as well (we'll find out soon), I'm not sure how much that's going to matter here. AMD can do 6x more BVH intersect tests/sec, and all of that is purely on dedicated hardware. The question then becomes: which is the more intensive task, the intersect testing or the traversal? And does it even matter when you're shader-bound on the other side of the RT pass anyway? We'll have to wait and see it tested. Either way, it's not as cut and dried as you're making it out to be; there are advantages and disadvantages to both companies' approaches, and clearly AMD's engineers thought the tradeoffs were worth it to get the performance they're getting with this solution, or they would have done something else.
You'd better hope RDNA2 RT performs as well as you keep saying. I still have doubts based on console RT performance so far. Even DF said the XSX's RT performance was comparable to a 2060, despite its 13 TF.
 

TheNerdyOne

Member
Oct 28, 2017
521
While AMD lacks a DLSS equivalent I won't even consider it; I went with a 3080 this generation. While it's true few games use it, it was only introduced two years ago. More and more games are coming with it, especially now that UE4 includes it in its pipeline. Also, as far as I am aware, Nvidia has better RT tech.

We have no idea whose RT tech is better; nobody has actually tested it in an apples-to-apples fashion on released hardware and software. On paper, they both have different strengths and weaknesses versus each other. Navi 21 can do ~6x more BVH intersect tests/second than the RTX 3080, but has to do traversal on the shaders; we don't really know the impact that's going to have in the real world yet.
 

TSM

Member
Oct 27, 2017
5,821
These just need to be priced well; even then I'm not sure if they will be able to compete with Nvidia's mid-range. I'm optimistic that these will perform well for the price.

Judging by the recent Steam survey, it's not really performance that is AMD's problem in 2020. They had cards comparable to Nvidia's 20 series, but they really didn't sell many. Nvidia has so many value-added features, like DLSS, Broadcast and game-ready drivers for most significant releases, that it's going to be tough to win mindshare back.
 

TheNerdyOne

Member
Oct 28, 2017
521
You'd better hope RDNA2 RT performs as well as you keep saying. I still have doubts based on console RT performance so far. Even DF said the XSX's RT performance was comparable to a 2060, despite its 13 TF.

DF were looking at the Minecraft RT demo, a demo done in a few days by a single guy from what I understand, versus a released piece of software worked on by hundreds of devs over the course of months or years. Furthermore, neither they nor anyone else have talked about the way AMD's solution works, or the strengths and weaknesses of the approach. They're basing their statement on a flops figure that doesn't actually translate into anything; it's as useless a figure as Nvidia's gigarays was. And on incomplete, early tech-demo software at that... We're going to need testing by actually reliable tech outlets like Hardware Unboxed, in apples-to-apples comparisons, before we can draw any conclusions; we definitely can't do so based on the Minecraft RT tech demo alone, which is what DF did. If they want to compare it to something, find a build of Minecraft RTX that was as early in development as the XSX Minecraft demo and compare those, but of course they can't do that.
 

nitewulf

Member
Nov 29, 2017
7,195
That is only an API. The question becomes: who has done the AI upscaling machine learning work they would need to have a DLSS alternative?
Well, the assumption is that MS is doing that. But how it will compare to DLSS, how far along they are, etc., are all unknowns at this point. Will AMD come up with a solution? We don't know.
 

TheNerdyOne

Member
Oct 28, 2017
521
Can you provide a source?

I can provide a source for the XSX BVH intersect tests/sec figure, and show you how you arrive at it: it's RT cores times clock speed. Each RT core can do a single intersect test per cycle; the XSX has 208 RT cores at 1825 MHz, which gets you 379.6 billion intersect tests/second, the figure Microsoft explicitly states as the spec. The same is true on Turing and Ampere: each RT core can do one intersect test per cycle, and each of the 68 SMs on the 3080 has one RT core, ergo 116 billion intersect tests/sec (at 1710 MHz, the official boost clock; it's higher in practice, but that complicates the math).

[Image: Xbox-DXR2.jpg, Xbox Series X DXR slide showing the ray-box test figure]
That's confirmation of the ray-box (BVH intersect) test figure: 380 G/sec (it's 379.6 G/sec, but they rounded up). As I said, traversal being done on shaders on RDNA2 complicates things; it's not cut and dried, but the fact that they can do traversal and shading concurrently is a plus. I'm not sure if you can do shading concurrently with RT on Ampere; I don't think you could on Turing. They clearly felt having a metric fuckton of intersect test performance was a worthwhile solution. I am not a developer, and I cannot say for certain whether this approach is a good one or not; I can say that AMD's engineers aren't idiots, and that there are definitely going to be tradeoffs to either solution. If you're curious, it's 4 RT cores per CU, so 80 CU RDNA2 has 320 RT cores, meaning at 2.2 GHz it can do 704 billion BVH intersect tests/second.
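Plugging the slide's numbers into the same units-times-clock arithmetic (the one-test-per-clock rate is the post's assumption):

```python
# XSX ray-box figure check: 52 CUs * 4 intersection units * 1 test per clock * 1.825 GHz
xsx_units = 52 * 4                 # 208 units, as quoted above
xsx_rate  = xsx_units * 1 * 1.825  # GHz -> billions of tests per second
print(f"XSX: {xsx_rate:.1f} G ray-box tests/s")  # 379.6, rounded to 380 on the slide
```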
 