I haven't been on this thread for a while; I've seen many people starting pretty rough conversations and getting (rightfully) banned for it. We've apparently drifted into talking purely about clocks, so I want to make sure I address a couple of things said earlier that are causing a lot of turmoil now.
I mean, yes it does. It's inefficient. They're pushing past the efficiency point of their GPU design.
[...]
What traditional boost?
Where's the smart design in pushing clocks higher than the efficiency point?
[...]
Then again, it's fine not to care about power draw. I didn't raise it as a point against them, just that it's not the smart design some people claim. There's nothing smart about pushing clocks past the point of inefficiency. It almost sounds like an afterthought.
We actually don't know where the efficient, linear part of the curve lies for these new RDNA2 GPUs; given the new process node and AMD's claims of improved clocking circuitry, there should be a larger threshold before we see poor returns on clock frequency. And Sony didn't only push clocks higher: the XSX chip definitely can't reach clocks that high, and it generally draws more power anyway, so Sony had headroom.
The 3.2GHz cores were not conservative for their time unless you're comparing those devices with pretty pricey alternatives. The fact that power scales exponentially with clock is not some big secret, and there is no magic sweet spot where this isn't true. At lower clock speeds it's possible to keep a constant voltage and merely ride a power-of-two curve, but as Intel has noted, with current leakage and other effects at higher frequencies it's getting a lot closer to a power of three. Yes, you can find a "sweet spot" where you're closer to scaling with the square of the clock speed, but that still means you get less than a 5% performance increase for 10% additional power. Once it reaches cube scaling, it's closer to a 3% increase for each additional 10% of power.
This isn't new. We've been pushing clock speeds as high as we can get away with forever, because it keeps die sizes, and therefore manufacturing costs, down for a given performance level.
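To put numbers on the square/cube claim above (a quick sketch in Python; the exponents are the idealized ones from the quote, not measured silicon):

```python
# Toy model: if power scales as f**n with everything else fixed, a
# +10% power budget buys a clock (and roughly performance) gain of
# 1.10 ** (1/n) - 1.

for n, regime in [(2, "square-law (constant voltage)"),
                  (3, "cube-law (voltage rising with clock)")]:
    gain = 1.10 ** (1 / n) - 1
    print(f"{regime}: +10% power -> +{gain * 100:.1f}% performance")

# square-law (constant voltage): +10% power -> +4.9% performance
# cube-law (voltage rising with clock): +10% power -> +3.2% performance
```

That lines up with the "less than 5%" and "closer to 3%" figures quoted.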
I wouldn't say it's exponential; the formula for dynamic power is something like P ≈ C * V² * f (C is capacitance, V the supply voltage, and f the clock frequency the design runs at). Capacitance more or less stays the same, but you need to increase voltage to push higher clocks, and at some point, yes, you could see exponential voltage requirements to push clocks further (by which point you've probably also burned through all your thermal headroom).
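To illustrate that relation with numbers (a minimal sketch; the voltage/frequency pairs are invented, not a real RDNA2 V/f table):

```python
# Dynamic power: P ~ C * V**2 * f. Capacitance C is treated as
# constant; the voltages below are illustrative guesses only.

def dynamic_power(c: float, v: float, f_ghz: float) -> float:
    """Relative dynamic power for capacitance c, voltage v, clock f."""
    return c * v ** 2 * f_ghz

base = dynamic_power(1.0, 1.00, 2.0)   # baseline: 1.00 V @ 2.0 GHz
boost = dynamic_power(1.0, 1.10, 2.2)  # +10% clock needing +10% voltage

print(f"clock +10%, voltage +10% -> power x{boost / base:.2f}")
# -> power x1.33: once voltage has to rise with the clock, a 10%
# clock bump costs about a third more dynamic power.
```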
It depends what the V/f curve looks like for the new AMD parts; TSMC reports N7P drawing on average 7% lower power at iso-frequency. By design, the XSX GPU should use marginally higher voltages and leak more power. Temperature also tends to raise leakage power, so I hope they carefully tuned for low temperatures (<80°C would be nice).
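On the temperature point, a deliberately crude sketch (leakage growing roughly exponentially with die temperature is a common rule of thumb, but the doubling rate below is an invented illustration, not a measured figure):

```python
# ASSUMPTION for illustration only: leakage power doubles every 30 C.

def leakage(p_ref: float, t_ref_c: float, t_c: float,
            doubling_c: float = 30.0) -> float:
    """Relative leakage at temperature t_c, doubling every doubling_c."""
    return p_ref * 2 ** ((t_c - t_ref_c) / doubling_c)

for t in (60, 80, 95):
    print(f"{t} C: leakage x{leakage(1.0, 60.0, t):.2f} vs 60 C")
# 60 C: x1.00, 80 C: x1.59, 95 C: x2.24 -- hence the appeal of
# keeping the die under ~80 C.
```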
My understanding is that's how it works on the XSX as well: whether you access the faster or the slower pool, the entire bus is locked so no other device can access it for the duration of the memory transaction. The difference is that memory transactions take the same amount of time on the PS5 regardless of where the data sits, while the majority of CPU memory accesses will likely take longer on the XSX.
I think they mentioned games getting 13.5GB, with 6GB of the 16GB being part of the slower pool. 6 out of 10 memory controllers are linked to the slower pool, so it does create quite a lot of contention when accessing at least 12GB of the memory, though.
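To make the layout concrete, the arithmetic as a small sketch (the chip counts and per-pin speed are the publicly stated XSX figures; the contention concern is the speculation from the posts above, not a confirmed detail):

```python
# XSX memory: a 320-bit bus = ten 32-bit GDDR6 channels at 14 Gbps
# per pin, populated with six 2GB chips and four 1GB chips (16GB).
per_chip_gbs = 14 * 32 / 8          # 56 GB/s per channel

fast_pool_gbs = 10 * per_chip_gbs   # 10GB striped across all 10 chips
slow_pool_gbs = 6 * per_chip_gbs    # 6GB lives on the six 2GB chips

print(f"fast 10GB pool: {fast_pool_gbs:.0f} GB/s")  # 560 GB/s
print(f"slow  6GB pool: {slow_pool_gbs:.0f} GB/s")  # 336 GB/s

# The six 2GB chips carry 12GB of the address space (6GB fast +
# 6GB slow), so slow-pool traffic occupies the same six channels
# the fast pool needs -- the contention mentioned above.
```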
There is no large GPU power difference between the two consoles. It's the smallest it has ever been, and a resolution difference will absolutely be the only tangible delta you are going to (most probably not) notice.
While CPU cache is insanely fast, it's also much, much smaller than the unified GDDR6 memory pool. It seems to me like each would allow for different approaches, meaning techniques optimized for the console architecture may not translate to the PC's advantages.
Remember, the goal of the cache is not to replace a faster pool of memory; it can only effectively store a "very" small fraction of assets and hide the larger latencies of accessing VRAM. I think the I/O ASICs would be an interesting addition to future generations of desktop CPUs, but I don't think separate pools of faster and slower memory would require that much rework?
Clocks are some of the "easiest" parts of a GPU/CPU to adjust.
Definitely untrue, especially when you consider that +100/200MHz is a big deal for high-end GPUs. If Sony asked for that much of a frequency hike, it's unlikely AMD and TSMC could supply enough chips that pass the 2.2GHz threshold; binning would not allow it.
Performance scales linearly with clocks; it's the temperatures that do not.
Those in-game results are off because the 5700 XT is bad at rendering games at 4K; maybe it's the bandwidth or some other limitation elsewhere. A better test would be running games at 1440p.
The Firestrike scores scale linearly as you increase clocks.
I'm afraid that's not a great way to test; a Firestrike score is not a measure of in-game performance. When raising clocks you also see diminishing returns, because your clocks can be stable on average yet still cause performance regressions when the faster calculations produce errors; this shows up quite significantly when overclocking memory clocks, too. I don't think the "5700 XT is bad at rendering 4K games": a GPU with more CUs usually shows a bigger advantage as resolution increases, because the size of the problem grows and lower frequencies become less of an issue. It's confirmed on both RDNA and Turing when gaming at 1080p/1440p and 4K. Using more games also makes for a better figure.
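A toy way to see why in-game results drift away from linear clock scaling at 4K (purely illustrative numbers; this just models a frame as partly clock-bound and partly bound elsewhere, e.g. by bandwidth):

```python
# Crude Amdahl-style model of a frame: a fraction of frame time
# scales with core clock, the rest does not. The splits are invented.

def fps_gain(clock_ratio: float, compute_fraction: float) -> float:
    """Relative FPS gain when only the clock-bound part speeds up."""
    frame_time = compute_fraction / clock_ratio + (1 - compute_fraction)
    return 1 / frame_time - 1

for res, frac in [("1440p", 0.85), ("4K", 0.60)]:
    print(f"{res}: +10% core clock -> +{fps_gain(1.10, frac) * 100:.1f}% FPS")
# 1440p: +8.4% FPS, 4K: +5.8% FPS -- the less clock-bound the
# resolution, the less an overclock shows up in games.
```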
But they don't. This kind of variable clock logic has never been used in PCs, let alone consoles; that's what everyone has been saying. It is NOT like what we have seen in games before. PCs don't downclock based on load, they just drop frames and run hotter. The way Cerny described it, he clearly says it's based on activity.
I don't necessarily agree: Intel, AMD and Nvidia all use boost algorithms clearly based on load, temperature and power, and the AMD firmware made for consoles is not unique in that regard.
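For reference, a stripped-down caricature of that kind of boost loop (every constant is invented; real DVFS firmware such as Nvidia's GPU Boost or AMD's SmartShift is far more involved):

```python
# Toy boost loop: raise clocks while under every limit, back off
# as soon as power or temperature goes over. All numbers made up.
F_MIN, F_MAX, STEP = 1500, 2230, 25          # MHz
POWER_LIMIT_W, TEMP_LIMIT_C = 180.0, 95.0

def next_clock(f_mhz: int, power_w: float, temp_c: float,
               utilization: float) -> int:
    """One control step: boost under the limits, throttle over them."""
    if power_w > POWER_LIMIT_W or temp_c > TEMP_LIMIT_C:
        return max(F_MIN, f_mhz - STEP)      # over a limit: back off
    if utilization > 0.95:
        return min(F_MAX, f_mhz + STEP)      # loaded with headroom: boost
    return f_mhz                             # light load: hold

print(next_clock(2100, 150.0, 70.0, 0.99))   # 2125 -> boosting
print(next_clock(2200, 185.0, 70.0, 0.99))   # 2175 -> power-throttled
```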