
Deleted member 16908

Oct 27, 2017
9,377
From Cerny's "Road to PS5" talk:
Another set of issues for the GPU involved size and frequency. How big do we make the GPU and what frequency do we run it at?

This is a balancing act. The chip has a cost, and there's a cost for whatever we use to supply that chip with power and to cool it.

In general, I like running the GPU at a higher frequency. Let me show you why.


Here's two possible configurations for a GPU roughly of the level of the PlayStation 4 Pro. This is a thought experiment; don't take these configurations too seriously.

[Slide: two possible GPU configurations with the same calculated teraflops]


If you just calculate teraflops you get the same number, but actually the performance is noticeably different because teraflops is defined as the computational capability of the vector ALU.

That's just one part of the GPU. There are a lot of other units and those other units all run faster when the GPU frequency is higher: at 33% higher frequency, rasterization goes 33% faster; processing the command buffer goes that much faster; the L2 and other caches have that much higher bandwidth, and so on.

About the only downside is that system memory is 33% further away in terms of cycles. But the large number of benefits more than counterbalanced that.

As a friend of mine says, a rising tide lifts all boats.


Also it's easier to fully use 36CUs in parallel than it is to fully use 48CUs. When triangles are small, it's much harder to fill all those CUs with useful work.
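
To put rough numbers on that: teraflops is just shader ALU count × 2 FLOPs per clock (a fused multiply-add) × frequency, so a narrow-and-fast and a wide-and-slow configuration can land on exactly the same headline figure. A quick Python sketch; the CU counts and clocks here are illustrative (picked so the 33% figure from the quote works out), not read off the slide:

Code:
# Teraflops = CUs * shader ALUs per CU * 2 FLOPs per clock (FMA) * clock (GHz)
def tflops(cus, ghz, shaders_per_cu=64, ops_per_clock=2):
    return cus * shaders_per_cu * ops_per_clock * ghz / 1000.0

narrow_fast = tflops(36, 1.00)   # 36 CUs at 1.00 GHz -> ~4.6 TF
wide_slow   = tflops(48, 0.75)   # 48 CUs at 0.75 GHz -> ~4.6 TF

print(narrow_fast, wide_slow)    # same headline teraflops
print(1.00 / 0.75 - 1)           # ~0.33: the narrow config clocks 33% higher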


Is this true, or is this just Sony trying to make the PS5's 36 CU GPU punch above its weight by running it at an insanely high frequency?

If it is better, then why does AMD not shoot for such high frequencies with their PC GPUs? Aren't the upcoming RDNA 2 cards going to have a ton of CUs? Isn't that a bad idea?

Please enlighten me.
 

Searsy82

Member
May 13, 2019
860
I think the best way to look at it is that there is no right or wrong. It is the right choice for what they wanted to accomplish.
 

ImaginaShawn

Banned
Oct 27, 2017
2,532
Going by what we know of RDNA, it scales better with more CUs than with high clock speed, and by a lot.
 

plagiarize

Eating crackers
Moderator
Oct 25, 2017
27,508
Cape Cod, MA
At the same teraflops, with a good enough cooling setup, yep. He is spot on. Higher clock speeds are better for all the reasons he lays out. But that doesn't mean higher clock speeds are better than slower clock speeds with more teraflops.
 

ValeYard

Member
Oct 25, 2017
445
I give Mark Cerny lots of credit as one of the most intelligent and visionary voices in the industry. Nevertheless, I also remember a lot of smart people telling us why ESRAM was the way to go in console architecture, or that Cell was great for that matter.

We'll just have to see how things pan out and what the devs can do with the hardware they've been given, I guess.
 

Deleted member 224

Oct 25, 2017
5,629
If the title were a generality and the tradeoff were linear, then you would see everything with 1 CU and a much higher clock speed. Since you don't, it's clearly not true in general, and this is at most a local maximization problem.
This. Sony clocked higher because they wanted to hit a specific performance spec (10 TF) but needed a specific number of CUs because of their BC solution.

It was the right choice. Might as well squeeze out as much as you can. But it's a 12 TF machine vs a 10 TF machine. That's it. There are no secret tricks that will make either machine perform better than what the specs suggest.
 

ForZoey

Member
May 29, 2019
66
Mark Cerny is correct in the context in which he made those specific remarks. He had a finite budget and wanted to invest a significant amount in the SSD and custom I/O unit. The best way to do that (1) while remaining competitive and (2) without driving up cost was to go small and fast. RDNA is more power efficient, allowing far higher clocks. These power gains would have been communicated to Cerny by AMD early on.

Digital Foundry, Oct 10, 2019 (at 00:27:14):

The easiest win in improving graphics performance from a console SoC is to improve the clock speed. Every part of the GPU gets extra juice from that, far more so than adding Compute Units and whatnot.




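
One way to see the "every part of the GPU gets extra juice" point: compute throughput scales with both CU count and clock, but the fixed-function parts (rasterizers, ROPs, command processor, cache bandwidth) scale with clock alone. A toy Python model, with placeholder unit counts rather than real PS5 or Series X figures:

Code:
# Placeholder unit counts; only the scaling behaviour matters here.
def gpu_throughput(cus, ghz, rasterizers=4, prims_per_clock=1, rops=64):
    return {
        "compute_tflops":  cus * 64 * 2 * ghz / 1000.0,         # scales with CUs AND clock
        "gtris_per_sec":   rasterizers * prims_per_clock * ghz,  # scales with clock only
        "gpixels_per_sec": rops * ghz,                           # scales with clock only
    }

print(gpu_throughput(36, 1.00))  # narrow and fast
print(gpu_throughput(48, 0.75))  # wide and slow

In this toy model both configurations report identical TFLOPS, but the wider, lower-clocked one ends up with about 25% less triangle-setup and fill-rate throughput.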
 

Favio Bolo

Banned
Aug 17, 2020
387
Probably not. This whole overclock thing is to min-max weaker hardware.
We'll have to see how much the CPU downclocks to let the GPU get all the power.
 

wiggler

Member
Oct 27, 2017
473
If the title were a generality and the tradeoff were linear, then you would see everything with 1 CU and a much higher clock speed. Since you don't, it's clearly not true in general, and this is at most a local maximization problem.
If there were no ceiling on how high we could clock CPUs, we'd still have single-core CPUs.
 

Khrol

Member
Oct 28, 2017
4,179
Probably best to just wait and see, but generally it's good to ignore most PR fluff.
 

Lobster Roll

signature-less, now and forever
Member
Sep 24, 2019
34,305
Both consoles are efficiently and intelligently designed. Cerny is marketing his console.
 
Feb 23, 2019
1,426
Why is there still this myth that Sony went with 36 CUs for BC? That's not a requirement.

They went with 36 CUs because it's cheaper to overclock, power, and cool a smaller chip than it is to fab a larger chip. End of story.
 

platocplx

2020 Member Elect
Member
Oct 30, 2017
36,072
There are pluses and minuses, and he's right in what he explained as the advantages; he also talked about the disadvantages.
 

dodmaster

Member
Apr 27, 2019
2,548
Hopefully there won't be much of a noticeable difference either way and we avoid plumbing the depths of another Grassgate.
 

Deleted member 224

Oct 25, 2017
5,629
Why is there still this myth that Sony went with 36 CUs for BC? That's not a requirement.

They went with 36 CUs because it's cheaper to overclock, power, and cool a smaller chip than it is to fab a larger chip. End of story.
Because that's how it worked for the PS4 Pro.

The fact that a number of BC titles will run in "base PS4 mode" or "boost mode" also seems to imply that. The PS5 will basically be mimicking those systems when running certain games. Otherwise all titles would see framerate boosts.
 

Garrett 2U

Banned
Oct 25, 2017
5,511
The variable clocks and massive console size tell me that no, it probably isn't the better strategy.
 

senj

Member
Nov 6, 2017
4,430
I wouldn't claim to be able to tell from a piece of paper, so "wait and see" is the only real answer.

But FWIW, GPUs have been trending towards wide and slow for years because that's where the performance is for embarrassingly parallel workloads, so I have to at least raise an eyebrow about it. He's not wrong that you put more pressure on the rasterizers with wide-and-slow, though; I'd just be surprised if moving the bottleneck to memory instead is really much of a win.
 

Deleted member 7148

Oct 25, 2017
6,827
I'm putting my money on slow and girthy.
 

Hey Please

Avenger
Oct 31, 2017
22,824
Not America
Conventional wisdom, and indeed tests done by DF, show that wider and slower is better than faster and narrower (5700 XT vs. an overclocked 5700).

However, it is vital to remember that how things play out on PC (given its assembled nature and how developers optimize for PC titles in general) may not yield the same performance results on a purpose-built closed system with its own OS, APIs, and performance profilers. Every part of the system is set in stone, and optimization involves making every part work in sync with every other part for the best outcome. Furthermore, given that consoles generally aim for locked framerates (30, 60, and now 120 for select titles), the wide-vs-narrow question becomes moot, as it simply becomes a matter of reaching a target performance throughout the entirety of the title.
 

delete

Member
Jul 4, 2019
1,189
He isn't wrong, obviously, in what he says, but more CUs means more overall transistors to do any processing.

I think the reason they chose 36 CUs is that they eventually want to release a PS5 Pro with at least double the number of CUs (at least 72) and at least the same clock speed; if they went too big too soon, it would add cost both now and later, when they design a PS5 Pro.

From Cerny's "Road to PS5" talk:

Is this true, or is this just Sony trying to make the PS5's 36 CU GPU punch above its weight by running it at an insanely high frequency?

If it is better, then why does AMD not shoot for such high frequencies with their PC GPUs? Aren't the upcoming RDNA 2 cards going to have a ton of CUs? Isn't that a bad idea?

Please enlighten me.

It's rumoured that AMD's Big Navi (the next-gen graphics card to compete with Nvidia's 3000 series) will have 80 CUs and will run at a high frequency similar to the PS5's. Sony running the GPU at high frequencies is in line with the rumours and information released so far about AMD's RDNA 2 graphics cards.

What Cerny said isn't spin, but it's also not the whole picture, and probably only time will tell.
 

Waaghals

Member
Oct 27, 2017
856
It can be harder to get effective utilization from a chip with very many CUs. I don't think the Series X is quite big enough for that to be a problem.
Using a smaller chip but clocking it higher allows for more dies per wafer. That should bring the cost per chip down. Then you compensate with higher clocks to gain some performance back.

Not being an expert, I assume there is no guarantee of the chip clocking as high as you hope when you are starting out, which may explain why MS went with a big chip in the Series X to ensure the highest performance. At any rate, as with CUs, you do not get a linear increase in performance when raising clocks.

There were some rumors that Sony had poor yields for the PS5. That might not be outright broken dies, but rather that many of them do not clock high enough at a reasonable voltage.

Of course, that rumor could also be baseless.
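
Rough numbers on the dies-per-wafer point, using the standard dies-per-wafer approximation and a simple Poisson yield model. The die areas and defect density below are made up purely for illustration; they are not the actual SoC figures:

Code:
import math

def dies_per_wafer(die_area_mm2, wafer_diameter_mm=300):
    # Usable wafer area divided by die area, minus an edge-loss correction
    r = wafer_diameter_mm / 2
    return (math.pi * r**2 / die_area_mm2
            - math.pi * wafer_diameter_mm / math.sqrt(2 * die_area_mm2))

def poisson_yield(die_area_mm2, defects_per_mm2=0.001):
    # Larger dies are more likely to catch at least one killer defect
    return math.exp(-die_area_mm2 * defects_per_mm2)

for area in (300, 360):  # hypothetical "smaller" vs "larger" SoC, in mm^2
    candidates = dies_per_wafer(area)
    good = candidates * poisson_yield(area)
    print(f"{area} mm^2: ~{candidates:.0f} dies/wafer, ~{good:.0f} good after yield")

With these made-up numbers the smaller die comes out to roughly 30% more good chips per wafer, before binning for clock speed even enters the picture.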
 

Tomo815

Banned
Jul 19, 2019
1,534
The way I see it, Sony told Cerny that he had a strict budget and had to work out something special within the confines of that budget.

So Cerny had a choice between spending more on additional CUs, or having fewer CUs and using the budget elsewhere (the SSD, for example), and he took the second option.