
Deleted member 16908

Oct 27, 2017
9,377
From Cerny's "Road to PS5" talk:
Another set of issues for the GPU involved size and frequency. How big do we make the GPU and what frequency do we run it at?

This is a balancing act. The chip has a cost, and there's a cost for whatever we use to supply that chip with power and to cool it.

In general, I like running the GPU at a higher frequency. Let me show you why.


Here's two possible configurations for a GPU roughly of the level of the PlayStation 4 Pro. This is a thought experiment; don't take these configurations too seriously.

[Slide: two possible GPU configurations with the same calculated teraflops]


If you just calculate teraflops you get the same number, but actually the performance is noticeably different because teraflops is defined as the computational capability of the vector ALU.

That's just one part of the GPU. There are a lot of other units and those other units all run faster when the GPU frequency is higher: at 33% higher frequency, rasterization goes 33% faster; processing the command buffer goes that much faster; the L2 and other caches have that much higher bandwidth, and so on.

About the only downside is that system memory is 33% further away in terms of cycles. But the large number of benefits more than counterbalanced that.

As a friend of mine says, a rising tide lifts all boats.


Also it's easier to fully use 36CUs in parallel than it is to fully use 48CUs. When triangles are small, it's much harder to fill all those CUs with useful work.
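
To put rough numbers on that: teraflops is just shader ALU count × 2 FLOPs per clock (a fused multiply-add) × frequency, so a narrow-and-fast and a wide-and-slow configuration can land on exactly the same headline figure. A quick Python sketch; the CU counts and clocks here are illustrative (picked so the 33% figure from the quote works out), not read off the slide:

Code:
# Teraflops = CUs * shader ALUs per CU * 2 FLOPs per clock (FMA) * clock (GHz)
def tflops(cus, ghz, shaders_per_cu=64, ops_per_clock=2):
    return cus * shaders_per_cu * ops_per_clock * ghz / 1000.0

narrow_fast = tflops(36, 1.00)   # 36 CUs at 1.00 GHz -> ~4.6 TF
wide_slow   = tflops(48, 0.75)   # 48 CUs at 0.75 GHz -> ~4.6 TF

print(narrow_fast, wide_slow)    # same headline teraflops
print(1.00 / 0.75 - 1)           # ~0.33: the narrow config clocks 33% higher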


Is this true, or is this just Sony trying to make the PS5's 36 CU GPU punch above its weight by running it at an insanely high frequency?

If it is better, then why does AMD not shoot for such high frequencies with their PC GPUs? Aren't the upcoming RDNA 2 cards going to have a ton of CUs? Isn't that a bad idea?

Please enlighten me.
 

Searsy82

Member
May 13, 2019
860
I think the best way to look at it is that there is no right or wrong. It is the right choice for what they wanted to accomplish.
 

ImaginaShawn

Banned
Oct 27, 2017
2,532
Going by what we know of RDNA, it scales better with more CUs than with high clock speed, and by a lot.
 

plagiarize

Eating crackers
Moderator
Oct 25, 2017
27,508
Cape Cod, MA
At the same teraflops, with a good enough cooling setup, yep. He is spot on. Higher clock speeds are better for all the reasons he lays out. But that doesn't mean higher clock speeds are better than slower clock speeds with more teraflops.
 

ValeYard

Member
Oct 25, 2017
445
I give Mark Cerny lots of credit as one of the most intelligent and visionary voices in the industry. Nevertheless, I also remember a lot of smart people telling us why ESRAM was the way to go in console architecture, or that Cell was great for that matter.

We'll just have to see how things pan out and what the devs can do with the hardware they've been given, I guess.
 

Deleted member 224

Oct 25, 2017
5,629
If the title were a generality and the tradeoff were linear, then you would see everything with 1 CU and a much higher clock speed. Since you don't, it's clearly not true in general, and this is at most a local maximization problem.
This. Sony clocked higher because they wanted to hit a specific performance spec (10 TF) but needed a specific number of CUs because of their BC solution.

It was the right choice. Might as well squeeze out as much as you can. But it's a 12 TF machine vs a 10 TF machine. That's it. There are no secret tricks that will make either machine perform better than what the specs suggest.
 

ForZoey

Member
May 29, 2019
66
Mark Cerny is correct in the context in which he made those specific remarks. He had a finite budget and wanted to invest a significant amount in the SSD and custom I/O unit. The best way to do that (1) while remaining competitive and (2) without driving up cost was to go small and fast. RDNA is more power efficient, allowing far higher clocks. These power gains would have been communicated to Cerny by AMD early on.

Digital Foundry, Oct 10, 2019 (at 00:27:14):

The easiest win in improving graphics performance from a console SoC is to improve the clock speed. Every part of the GPU gets extra juice from that, far more so than adding Compute Units and whatnot.




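
One way to see the "every part of the GPU gets extra juice" point: compute throughput scales with both CU count and clock, but the fixed-function parts (rasterizers, ROPs, command processor, cache bandwidth) scale with clock alone. A toy Python model, with placeholder unit counts rather than real PS5 or Series X figures:

Code:
# Placeholder unit counts; only the scaling behaviour matters here.
def gpu_throughput(cus, ghz, rasterizers=4, prims_per_clock=1, rops=64):
    return {
        "compute_tflops":  cus * 64 * 2 * ghz / 1000.0,         # scales with CUs AND clock
        "gtris_per_sec":   rasterizers * prims_per_clock * ghz,  # scales with clock only
        "gpixels_per_sec": rops * ghz,                           # scales with clock only
    }

print(gpu_throughput(36, 1.00))  # narrow and fast
print(gpu_throughput(48, 0.75))  # wide and slow

In this toy model both configurations report identical TFLOPS, but the wider, lower-clocked one ends up with about 25% less triangle-setup and fill-rate throughput.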
 

Favio Bolo

Banned
Aug 17, 2020
387
Probably not. This whole overclock thing is to min-max weaker hardware.
We'll have to see how much the CPU downclocks to let the GPU get all the power.
 

wiggler

Member
Oct 27, 2017
473
If the title were a generality and the tradeoff were linear, then you would see everything with 1 CU and a much higher clock speed. Since you don't, it's clearly not true in general, and this is at most a local maximization problem.
If there were no ceiling on how high we could clock CPUs, we'd still have single-core CPUs.
 

Khrol

Member
Oct 28, 2017
4,179
Probably best to just wait and see, but generally it's good to ignore most PR fluff.
 

Lobster Roll

signature-less, now and forever
Member
Sep 24, 2019
34,305
Both consoles are efficiently and intelligently designed. Cerny is marketing his console.
 
Feb 23, 2019
1,426
Why is there still this myth that Sony went with 36 CUs for BC? That's not a requirement.

They went with 36 CUs because it's cheaper to overclock, power, and cool a smaller chip than it is to fab a larger chip. End of story.
 

platocplx

2020 Member Elect
Member
Oct 30, 2017
36,072
There are pluses and minuses, and he's right in what he explained as the advantages; he also talked about the disadvantages.
 

dodmaster

Member
Apr 27, 2019
2,548
Hopefully there won't be much of a noticeable difference either way and we avoid plumbing the depths of another Grassgate.
 

Deleted member 224

Oct 25, 2017
5,629
Why is there still this myth that Sony went with 36 CUs for BC? That's not a requirement.

They went with 36 CUs because it's cheaper to overclock, power, and cool a smaller chip than it is to fab a larger chip. End of story.
Because that's how it worked for the PS4 Pro.

The fact that a number of BC titles will run in "base PS4 mode" or "boost mode" also seems to imply that. The PS5 will basically be mimicking those systems when running certain games. Otherwise all titles would see framerate boosts.
 

Garrett 2U

Banned
Oct 25, 2017
5,511
The variable clocks and massive console size tell me that no, it probably isn't the better strategy.
 

senj

Member
Nov 6, 2017
4,430
I wouldn't claim to be able to tell from a piece of paper, so "wait and see" is the only real answer.

But FWIW, GPUs have been trending towards wide and slow for years because that's where the performance is for embarrassingly parallel workloads, so I have to at least raise an eyebrow about it. He's not wrong that you put more pressure on the rasterizers with wide-and-slow, though; I'd just be surprised if moving the bottleneck to memory instead is really much of a win.
 

Deleted member 7148

Oct 25, 2017
6,827
I'm putting my money on slow and girthy.
 

Hey Please

Avenger
Oct 31, 2017
22,824
Not America
Conventional wisdom, and indeed tests done by DF, show that wider and slower is better than faster and narrower (5700 XT vs. an overclocked 5700).

However, it is vital to remember that how things play out on PC (given its assembled nature and how developers optimize for PC titles in general) may not yield the same performance results on a purpose-built closed system with its own OS, APIs, and performance profilers. Every part of the system is set in stone, and optimization involves making every part work in sync with every other part for the best outcome. Furthermore, given that consoles generally aim for locked framerates (30, 60, and now 120 for select titles), the wide-vs-narrow question becomes moot, as it simply becomes a matter of reaching a target performance throughout the entirety of the title.
 

delete

Member
Jul 4, 2019
1,189
He isn't wrong, obviously, in what he says, but more CUs means more overall transistors to do any processing.

I think the reason they chose 36 CUs is that they eventually want to release a PS5 Pro with at least double the number of CUs (at least 72) and at least the same clock speed; if they went too big too soon, it would add cost both now and later, when they design a PS5 Pro.

From Cerny's "Road to PS5" talk:

Is this true, or is this just Sony trying to make the PS5's 36 CU GPU punch above its weight by running it at an insanely high frequency?

If it is better, then why does AMD not shoot for such high frequencies with their PC GPUs? Aren't the upcoming RDNA 2 cards going to have a ton of CUs? Isn't that a bad idea?

Please enlighten me.

It's rumoured that AMD's Big Navi (the next-gen graphics card to compete with Nvidia's 3000 series) will have 80 CUs and will run at a high frequency similar to the PS5's. Sony running the GPU at high frequencies is in line with the rumours and information released so far about AMD's RDNA 2 graphics cards.

What Cerny said isn't spin, but it's also not the whole picture, and probably only time will tell.
 

Waaghals

Member
Oct 27, 2017
856
It can be harder to get effective utilization from a chip with very many CUs. I don't think the Series X is quite big enough for that to be a problem.
Using a smaller chip but clocking it higher allows for more dies per wafer. That should bring the cost per chip down. Then you compensate with higher clocks to gain some performance back.

Not being an expert, I assume there is no guarantee of the chip clocking as high as you hope when you are starting out, which may explain why MS went with a big chip in the Series X to ensure the highest performance. At any rate, as with CUs, you do not get a linear increase in performance when raising clocks.

There were some rumors that Sony had poor yields for the PS5. That might not be outright broken dies, but rather that many of them do not clock high enough at a reasonable voltage.

Of course, that rumor could also be baseless.
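
Rough numbers on the dies-per-wafer point, using the standard dies-per-wafer approximation and a simple Poisson yield model. The die areas and defect density below are made up purely for illustration; they are not the actual SoC figures:

Code:
import math

def dies_per_wafer(die_area_mm2, wafer_diameter_mm=300):
    # Usable wafer area divided by die area, minus an edge-loss correction
    r = wafer_diameter_mm / 2
    return (math.pi * r**2 / die_area_mm2
            - math.pi * wafer_diameter_mm / math.sqrt(2 * die_area_mm2))

def poisson_yield(die_area_mm2, defects_per_mm2=0.001):
    # Larger dies are more likely to catch at least one killer defect
    return math.exp(-die_area_mm2 * defects_per_mm2)

for area in (300, 360):  # hypothetical "smaller" vs "larger" SoC, in mm^2
    candidates = dies_per_wafer(area)
    good = candidates * poisson_yield(area)
    print(f"{area} mm^2: ~{candidates:.0f} dies/wafer, ~{good:.0f} good after yield")

With these made-up numbers the smaller die comes out to roughly 30% more good chips per wafer, before binning for clock speed even enters the picture.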
 

Tomo815

Banned
Jul 19, 2019
1,534
The way I see it, Sony told Cerny that he had a strict budget and had to work out something special within the confines of that budget.

So Cerny had a choice between spending more on additional CUs, or having fewer CUs and using the budget elsewhere (the SSD, for example), and he took the second option.