A couple choice quotes here:
So, when I made the statement that the GPU will spend most of its time at or near its top frequency, that is with 'race to idle' taken out of the equation - we were looking at PlayStation 5 games in situations where the whole frame was being used productively. The same is true for the CPU, based on examination of situations where it has high utilisation throughout the frame, we have concluded that the CPU will spend most of its time at its peak frequency."
This is important to show they're not gaming the metric.
Cerny also stresses that power consumption and clock speeds don't have a linear relationship. Dropping frequency by 10 per cent reduces power consumption by around 27 per cent.
This is a cubic relationship. 0.9^3 = 72.9%. That means drastic reductions in power. Downclocks should indeed be minor.
It's an innovative approach, and while the engineering effort that went into it is likely significant, Mark Cerny sums it up succinctly: "One of our breakthroughs was finding a set of frequencies where the hotspot - meaning the thermal density of the CPU and the GPU - is the same. And that's what we've done. They're equivalently easy to cool or difficult to cool - whatever you want to call it."
This is also an important point. Some Intel desktop CPUs run faster than their integrated graphics-less counterparts because those idle graphics cores actually add as a thermal spreader. If you had uneven thermal density, they'd be contributing to each other's hot spots.
There's likely more to discover about how boost will influence game design. Several developers speaking to Digital Foundry have stated that their current PS5 work sees them throttling back the CPU in order to ensure a sustained 2.23GHz clock on the graphics core. It makes perfect sense as most game engines right now are architected with the low performance Jaguar in mind - even a doubling of throughput (ie 60fps vs 30fps) would hardly tax PS5's Zen 2 cores. However, this doesn't sound like a boost solution, but rather performance profiles similar to what we've seen on Nintendo Switch. "Regarding locked profiles, we support those on our dev kits, it can be helpful not to have variable clocks when optimising. Released PS5 games always get boosted frequencies so that they can take advantage of the additional power," explains Cerny.
Yes, developers already see dropped clocks, but it's intentional. Some PC benchmarking sites tear their hair out trying to interpret benchmark results because of unpredictable boost behavior. This eliminates that.
"All of the game logic created for Jaguar CPUs works properly on Zen 2 CPUs, but the timing of execution of instructions can be substantially different," Mark Cerny tells us. "We worked to AMD to customise our particular Zen 2 cores; they have modes in which they can more closely approximate Jaguar timing. We're keeping that in our back pocket, so to speak, as we proceed with the backwards compatibility work."
This is the subject of several of Cerny's patents.
GPUs process hundreds or even thousands of wavefronts; the Tempest engine supports two," explains Mark Cerny. "One wavefront is for the 3D audio and other system functionality, and one is for the game. Bandwidth-wise, the Tempest engine can use over 20GB/s, but we have to be a little careful because we don't want the audio to take a notch out of the graphics processing. If the audio processing uses too much bandwidth, that can have a deleterious effect if the graphics processing happens to want to saturate the system bandwidth at the same time."
This is very important. It gives us an upper bound for how much memory bandwidth we would want on a per CU basis. Since the clock is the same as the GPU, PS5's number is 720GB/s.
As a result, with the GPU if you're getting 40 per cent VALU utilisation, you're doing pretty damn well. By contrast, with the Tempest engine and its asynchronous DMA model, the target is to achieve 100 percent VALU utilisation in key pieces of code."
This shows just how little of a system's teraflops can be realistically used. 40% VALU usage!