
MetalKhaos

Member
Oct 31, 2017
1,692
I've had a PS2 fat disk drive fail on me, my PS3 got YLOD (a lot of OG units suffer this too), my PS4 had the jet engine issue (not a failure I know) and my Pro had a full drive failure after 6 months.

I've also had 1 OG Xbox drive failure and 1 RROD.

Electronics are fickle

They are, and as a correction, I had an RROD on my OG 360 as well, though to be fair, the early 360s were rife with that problem.
 

neoak

Member
Oct 25, 2017
15,252
In response to someone citing hardware failures in previous generations?

Context ftw.
Again, it is being called a bad design and a possible point of failure after a long period of time. What is that period of time? Who knows?

My launch Wii had memory video artifacts due to being on for WiiConnect24 and Nintendo not turning on the fan for it, but that took a few years to develop.

PC Video cards have trouble with memory developing video artifacts due to hot memory.

You're dismissing stuff that happens in PCs too.

You quoted me when I said it would take years to know.

Edit: and if you're trying to dismiss the PS2 drive failure rate, ask how many people had to rebuy PS2s.
 

acebeam

Alt-Account
Banned
Nov 23, 2020
128
The PS4 and specifically that picture shows square pads that cover almost the entire die, which is exactly what the GN video suggests.

Will it work fine as is? Probably, but a few pennies' worth of pads could lower the temp a couple of degrees.

In neither the PS4 nor the PS5 does the thermal pad cover the whole memory chip, and the pad is basically the same size on both (maybe a little smaller). The PS5 just uses a round pad.
 
Last edited:

Mokubba

Member
Oct 27, 2017
467
Good to see a follow up video but even after watching the first I didn't see anything to worry about.

Also, in the comments, he said he might do a video with them pointing a heat gun at the memory module to see at what temps it starts to affect gameplay.

That could be a good idea to see how much headroom the memory modules have.
 

GoldPunch

Member
Apr 20, 2018
40
Turkey
Yes, it's a good idea, but I don't know if Gamers Nexus can sacrifice their own PS5 for it.
 

MrFox

VFX Rendering Pipeline Developer
Verified
Jun 8, 2020
1,435
Update video with questions and answers.


Very, very nice follow-up. I recommend everyone here watch it carefully and in full. I agree with most of his explanations.

He used the side of the case because it's a flip chip, and he still gets a reasonable temperature measurement because of the ground-plane conduction. But which side of the case he uses would be the difference between getting the hotspot of the controller and PHY versus the DRAM array, which is much colder. The logic is a big line through the middle, with the DRAM arrays on each side. (I'm not entirely sure; it should be fun to test all sides of one chip.)

He's still wrong about the TIM being a standard pad; the engineer said it's a cured compound. It's round because it's literally applied in liquid form. I do agree with his opinion that it's easy to replace it with a pad with good characteristics, so it's not a big deal. In any case, the chances are slim that their cured compound has wildly different characteristics from a "good" silpad, because Sony implied its purpose is to absorb the height variability between each chip and the aluminum EMI shield. But a double-sided sticky pad would solve this too, so I don't understand the reasoning.

https://www.4gamer.net/games/990/G999027/20201016035/
(google translate)
"The TIM applied to the shield looks like a circular rubber sheet, but it seems to be liquid at the manufacturing process stage. It cures over time and eventually becomes a rubber-like texture. The height of the chip group that is the heat source is expected to vary to some extent, but this TIM is liquid at the time of application, so that variation can be absorbed."

The liquid metal is also not a known alloy, and I hope modders understand what they're getting into. Personally, I'm certainly not taking that side off without knowing what alloy was chosen and why; usually they'd modify the alloy to be less reactive to whatever they have around the well. So it's a risk to replace it.

https://www.4gamer.net/games/990/G999027/20201016035/
(google translate)
"Although the composition of the liquid metal and the collaborative manufacturer were not disclosed, it is said that a general gallium alloy is used as the liquid metal TIM. However, the interview emphasized that SIE is a custom-made product."

Anyway, with that said, I'll do a teardown and put some thermocouples on mine. I have a lot of different 1 mm thick pads, from 1.5 W/mK (about 10 cents) to 13 W/mK vertically aligned graphite pads ($1 each). It would be fun to measure the stock ones, but I'm not sure how to do this. It might need thicker ones than 1 mm.
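
For a rough sense of what those pad ratings mean, the conduction resistance of a flat pad can be estimated as R = t / (k * A). The sketch below is only a ballpark: the 14 mm x 12 mm footprint is an assumed package size with full pad coverage, contact resistance at both faces is ignored, and the function name is made up for illustration.

```python
# Rough 1-D conduction estimate for a thermal pad: R = t / (k * A).
# ASSUMPTIONS (not taken from any teardown): 14 mm x 12 mm pad footprint,
# full coverage of the chip, and no contact resistance at either face.

def pad_resistance_c_per_w(k_w_per_mk, thickness_mm=1.0,
                           width_mm=14.0, length_mm=12.0):
    """Return the pad's thermal resistance in degrees C per watt."""
    thickness_m = thickness_mm / 1000.0
    area_m2 = (width_mm / 1000.0) * (length_mm / 1000.0)
    return thickness_m / (k_w_per_mk * area_m2)

# The two ends of the pad range mentioned above, both 1 mm thick.
for k in (1.5, 13.0):
    print(f"{k:>4} W/mK -> {pad_resistance_c_per_w(k):.2f} C/W")
```

Under those assumptions the 1.5 W/mK pad lands near 4 C/W and the 13 W/mK pad near 0.5 C/W. A smaller real contact area or a thicker pad would raise both figures.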

Big respect for both DF and GN having to deal with their twitter feed, what a mess.
 
Last edited:

GoldPunch

Member
Apr 20, 2018
40
Turkey
Thank you.

By the way, just as we can check temps with AIDA64, I think Sony is able to see the memory temperatures on the software side, if there is a temperature sensor, of course. If what he said is true, it would be a disgrace if they didn't realize that this memory module would get this hot.
 
Oct 27, 2017
4,912
His main point in the new video comes down to pointing out that the cooling on one side had bad engineering design/uneven cost-cutting. GN's content does have both nerdy and technical analysis of how a product was made as well as straight-to-the-point consumer advice so I think there was some confusion between him being critical and him saying whether or not to buy it.

For the nerds, there's some fun discussion to be had here but for the people who just want to know if the console will work fine for a long time, he's firmly saying that no one should make an assumption either way. The engineers at Sony have certainly tested everything and will have an idea how long each component would last in the real-world but whether that durability target was 3 years or 30 years, you'll just have to wait and see. Basically, a corner was cut but it was intentional and will be pointless to speculate how much it'll matter until you have actual data in your hands.
 

MrFox

VFX Rendering Pipeline Developer
Verified
Jun 8, 2020
1,435
It's obvious their control loop would include the memory modules. But it's up to Micron, Samsung, and more importantly JEDEC to determine the operating limits and the method of measurement. Not Steve, not AIDA64, and not some random internet poster.

The correct way to measure the temperature in order to comply with TOPER, so that it has a guaranteed reliability and longevity, is described in JEDEC document JESD51-2.
"Operating Temperature TOPER is the case surface temperature on the center / top side of the DRAM"

Basically, Steve took a measurement that gives a much higher temperature reading (from the die side of the flip chip, which would be quite a few degrees hotter under max load), and other people also ignored that GDDR6 is designed for a higher operating temperature (raised by 10C compared to GDDR5). This led to the wrong conclusion that "it's too hot" because someone on the internet said so.

And I repeat that it's not simple to measure the top center of the chip in compliance with the JEDEC methodology. You have to remove the TIM and replace it with something else, and immediately your test is no longer valid if you wanted to see how efficient the TIM was. His solution was clever, but he ended up with an unknown additional delta-T in the form of Tj-Tc, and JEDEC doesn't require Tj in the reliability data, only Tc at the center of the case. So there's no way to know if it exceeds specs; there are no specs to compare it to. What his test provides is an interesting ballpark figure, with a margin of error significant enough to require discussion.
 

Deleted member 15311

User requested account closure
Banned
Oct 27, 2017
1,088
It happened a lot in some Vaios with Nvidia GPUs. My wife's Vaio had that problem.
 

MrFox

VFX Rendering Pipeline Developer
Verified
Jun 8, 2020
1,435
If the material felt close enough to a silicone pad, I mean close enough to fool GN into thinking it's a normal pad, it would be a much softer cured compound than the usual "cured" stuff that is relatively hard and brittle.

I wasn't familiar with this stuff at all. I see there are quite a few companies making this kind of silicone-based potting compound, so it's impossible to know what they used. Parker makes a product that cures to a relatively soft, rubbery hardness. It's basically a two-part silicone filled with aluminum oxide. After application, it would cure into a round shape as we see in the PS5. Parker CIP35 cures to a hardness of 55 Shore A, which sounds about right for what would feel like a silpad.

But the whole idea is that if you tear it off, then it behaves like a silpad that requires compressive force to be effective: the harder the compound, the more clamping force it requires. The pads we see on the PS4 and other products are extremely soft, thick, and gummy for that reason; they pretty much stick to the surfaces and need almost no clamping. A cured compound can be harder (denser means potentially more conductive) and doesn't require ANY force applied as long as it's left in place undisturbed.

Ref:

THERM-A-FORM CIP35 Material - Parker Chomerics | DigiKey

Parker Chomerics' THERM-A-FORM is a dispensable form-in-place compound designed for heat transfer without compressive force in electronics cooling applications.
 

GoldPunch

Member
Apr 20, 2018
40
Turkey
MrFox

A new thermal review has just arrived. His temperatures are not as bad as Gamers Nexus's, even though he didn't use the stand. What do you think about these temperatures?

 

MrFox

VFX Rendering Pipeline Developer
Verified
Jun 8, 2020
1,435
It's nice, he shows exactly how his setup is done, and he closed everything back. This guy repairs consoles I think? He did a proper teardown of the DS5, I watched his video to know where the black section clips were.

I liked his tests trying to make it overheat to see how it would react: he covered the whole outlet and even disconnected the fan completely, and he never got any memory corruption; it didn't even crash. At least it's a good empirical test to show there's nothing "borderline" about the thermal design; it has a lot of margin available for all conditions. I do wish he had let it continue without the fan until it really crashed, but I'm not doing this on mine either, so I can't complain.

He didn't use the stand, and I noticed a bit of heat coming out of the bottom on mine, so the stand is probably more useful than we previously thought. Personally if my PS5 needed to be vertical, I would definitely use the stand for thermal reasons.

Sadly, he didn't pick up on the fact that it's a cured compound, meaning he didn't read the engineer's interview either. He did mention he thinks the "pads" are cheap because they are brittle and disintegrate easily. I'm not sure about that subjective statement: brittleness can mean a higher oxide or ceramic density, so you can't tell anything from brittleness alone.

He destroyed them as much as GN did before doing the tests, so all the questions I have about the nature or comparative performance of this cured compound remain unanswered. We still have no idea if tearing them off affects it, since there are no before-and-after tests. We still don't know if repairing a PS5 needs a cleanup there and reapplication of a cured solution, or if a normal "gummy pad" would be fine.
 

MrFox

VFX Rendering Pipeline Developer
Verified
Jun 8, 2020
1,435
I finally did a first sweep of temperature tests on mine.

I placed a probe carefully at the center of each RAM chip, but on the other side of the TIM, so the heat goes through the TIM and through the thin aluminum layer. I rebuilt the PS5 completely and tested it right in my equipment rack, squeezed on the shelf, so everything is a real-world situation. This won't provide any figures about the efficiency of the TIM, nor the exact die temperature, but it will show if there's an outlier chip that gets hotter, and hint at the possibility of experimental errors in other people's tests if they have outlier values when measuring from the other side of the board.

Code:
[FRONT]
   8  1
7        2      [FAN]
6        3
   5  4
[REAR]

Chip   YouTube (4K)   Astrobot (Memory Level)

1      45.0C          50.5C
2      39.7C          48.8C
3      41.1C          52.6C
4      43.0C          54.4C
5      44.2C          55.2C
6      42.1C          56.6C
7      41.7C          55.6C
8      40.9C          55.0C

The assumption required here is that the TIM on each chip and the die-to-case characteristics are almost identical, and that each chip dissipates the same power. The expected variation would come from airflow, the shape of the aluminum sheet, differences on the PCB, and proximity to heat sources.

The thermal resistance of the aluminum thickness is negligible, and while the TIM is unknown, anything correctly applied would be maybe 2C/W on a pad of this area and thickness. The limit we're looking for here is 95C on the surface of the case. It's up to the chip manufacturer to guarantee those specs, including the GDDR6 packaging for automotive use that allows a 105C case surface temperature (which indicates automotive GDDR6 used in self-driving cars would need to be tested to be reliable up to a 120C die, unless they significantly underclock).

The YouTube test was just to see what happens with medium-low power consumption: a 4K playback for about 30 mins. For the Astrobot test I entered the memory level and left the character there for 1 hour (it seemed to reach a steady state after about 30 mins). It's not the worst consumption across the entire game, but it allows me to redo the exact same test with the sensor placed elsewhere (I had to do 3 separate tests, since I only have 3 probes, and I didn't want to fill the precious space over the aluminum sheet with 8 wires that would really mess up the airflow). I'll be able to redo an exact test after the TIM is torn off or replaced, and I might test again after drilling a tiny hole in the aluminum plate to put a sensor right on the surface of the chip.

So there isn't any outlier chip. It shows that chips 1, 2, and 3 are cooled a little better by their proximity to the aluminum area closest to the fan, while chips 4, 5, 6, 7, and 8 are on the other side and under the optical drive, so that surface area of the aluminum heatsink isn't getting as much cooling. All of the chips are an equal distance from the main heat source too. The chip closest to the NAND controller could have been an issue, but it doesn't seem to show anything.
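
As a quick check of that reading, here is a small sketch that averages the Astrobot column from the table above by side. The grouping (1-3 near the fan, 4-8 on the far side) follows the post; the variable and function names are just for illustration.

```python
# Astrobot (Memory Level) probe readings from the table above, in degrees C.
astrobot = {1: 50.5, 2: 48.8, 3: 52.6, 4: 54.4,
            5: 55.2, 6: 56.6, 7: 55.6, 8: 55.0}

# Grouping follows the post: chips 1-3 near the fan, 4-8 on the far side.
fan_side = [1, 2, 3]
far_side = [4, 5, 6, 7, 8]

def mean_temp(readings, chips):
    """Average the readings for the given chip numbers."""
    return sum(readings[c] for c in chips) / len(chips)

fan_mean = mean_temp(astrobot, fan_side)
far_mean = mean_temp(astrobot, far_side)
spread = max(astrobot.values()) - min(astrobot.values())

print(f"fan side mean: {fan_mean:.1f}C")
print(f"far side mean: {far_mean:.1f}C")
print(f"hottest minus coolest: {spread:.1f}C")
```

The far side averages roughly 5C warmer than the fan side, and the full spread across all eight chips stays under 8C, which fits the "no outlier" reading.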

A die 20C hotter than the point of measurement here would be reasonable: the math adds up to 15C junction to case, 5C through the TIM, and a negligible drop through the aluminum thickness. However, the GDDR6 spec is a 95C case surface temperature, leading to a calculated limit of 110C junction, as opposed to GDDR5, which had an 85C case limit and 100C junction.
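
To make that arithmetic concrete, the sketch below stacks the two estimated deltas on top of the hottest probe reading. The 5C and 15C figures are the post's estimates rather than measurements, the limits are the ones quoted above, and the names are illustrative only.

```python
# Stack the post's estimated deltas on top of a probe reading.
# The 5C (TIM) and 15C (junction-to-case) figures are estimates from the
# post, not measurements; 95C is the case limit quoted above.
TIM_DELTA_C = 5.0
JUNCTION_TO_CASE_C = 15.0
CASE_LIMIT_C = 95.0
JUNCTION_LIMIT_C = CASE_LIMIT_C + JUNCTION_TO_CASE_C  # calculated 110C

def estimate(probe_c):
    """Return (case, junction, case headroom) estimates in degrees C."""
    case_c = probe_c + TIM_DELTA_C
    junction_c = case_c + JUNCTION_TO_CASE_C
    return case_c, junction_c, CASE_LIMIT_C - case_c

hottest_probe_c = 56.6  # chip 6 in the Astrobot test
case_c, junction_c, headroom_c = estimate(hottest_probe_c)
print(f"case ~{case_c:.1f}C, junction ~{junction_c:.1f}C, "
      f"headroom to the case limit ~{headroom_c:.1f}C")
```

This only covers the Astrobot load that was tested; a heavier workload would shift every figure up by the same amount.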

Other people's tests on GDDR6 confirm the 15C delta we should expect from GDDR6 between the die and the case surface:
www.igorslab.de

GDDR6 memory temperatures comprehensibly explained and remeasured - is AMD doing everything right? | Basics | Page 2 | igor´sLAB

All the errors and confusions about the suddenly occurring value of the memory temperatures of AMD’s current graphics cards understandably lead to uncertainty among many users. Modern memory chips…

Basically, my test is closer to the case temperature (JEDEC requirements), while the GN and Spawn Wave tests measure the die temperature almost directly (a much more important figure for overclockers). Combining all of this with 15C junction-to-case and 5C through the TIM, it all adds up except for the GN outlier chip. Spawn Wave is 5C higher than my hottest chip, but I didn't try to find the highest consumption across Astrobot, so it seems a reasonable difference.

TL;DR

Unless I missed something, the math doesn't logically add up to more than a 70C to 75C chip surface temp (against a 95C allowed design limit) unless there's been a ridiculous amount of damage to the TIM, or my Astrobot test is very far from the highest consumption.

Whichever way we try to deduce the actual case surface temperature, to see how close it is to the JEDEC requirements, either from the top going through the TIM, or from the bottom going through the vias, die, and package, I don't see any indication the 95C limit is reached within that thermal sandwich.
 
Last edited:

Mubrik_

Member
Dec 7, 2017
2,723
MrFox
Thanks for the great additions to the thread. I doubt anyone should be worried after your detailed explanations on top of the videos we've seen.
 

MrFox

VFX Rendering Pipeline Developer
Verified
Jun 8, 2020
1,435
Thanks MrFox for the testing!
Do you think PS5DE runs cooler then?
I don't know, but I'm guessing there wouldn't be any difference. Instead of a drive, there's a closed plastic cover, which I assume would be designed with exactly the same clearance.

I missed something else in my post above: the farthest chips from the fan are also closest to the VRMs, which are cooled by the same large aluminum sheet, so that would also account for the slightly higher temp on that side of the SoC.
 

Iron Eddie

Banned
Nov 25, 2019
9,812
His quote,
"I don't know why the console audience is so much more willing to accept what's given to them"

Probably for two reasons, there is no other choice (until a PS5 slim comes out) and the price. Console gamers don't expect top of the line parts to be used.
 

Issen

Member
Nov 12, 2017
6,807
I don't think making sure that components are adequately cooled by the cooling solution is exactly "top of the line" stuff.
 

ManOfWar

Member
Jan 6, 2020
2,464
Brazil
This is an extremely useful and insightful post, should put everyone at ease about their PS5s for the long run.

Thanks for taking the time and effort.
 

mordecaii83

Avenger
Oct 28, 2017
6,852
Hopefully the worries can finally be put to rest, thanks for the in-depth testing!
 

Ploid 6.0

Member
Oct 25, 2017
12,440
I finally did a first sweep of temperature tests on mine.

I placed a probe carefully in the center of each RAM chip but on the other side of the TIM, so the heat goes through the TIM and through the thin aluminum layer. I rebuilt the PS5 completely and tested it right in my equipment rack, squeezed on the shelf so everything is a real-world situation. This won't provide any figures about the efficiency of the TIM, nor the exact die temperature, but it will show if there's an outlier chip that gets hotter, and hint at the possibility of experimental errors in other people's tests if they have outlier values when measuring from the other side of the board.

Code:
[FRONT]
   8  1
7        2      [FAN]
6        3
   5  4
[REAR]

Chip   Youtube(4K)    Astrobot(Memory Level)

1    45.0C        50.5C
2    39.7C        48.8C
3    41.1C        52.6C
4    43.0C        54.4C
5    44.2C        55.2C
6    42.1C        56.6C
7    41.7C        55.6C
8    40.9C        55.0C

The assumption required here is that the TIM on each chip and the die-to-case characteristics are almost identical, and that each chip dissipates the same power. The variation expected would come from airflow, the shape of the aluminum sheet, differences on the PCB, and proximity to heat sources.

The thermal resistance contributed by the aluminum thickness is negligible, and while the TIM is unknown, anything correctly applied would be maybe 2C/W on a pad of this area and thickness. The limit we're looking at here is the 95C on the surface of the case. It's up to the chip manufacturer to guarantee those specs, including the GDDR6 packaging for automotive use that allows a 105C case surface temperature (which indicates automotive GDDR6 used in self-driving cars would need to be tested to be reliable up to a 120C die, unless they significantly underclock).

The YouTube test was just to see what happens with medium-low power consumption: 4K playback for about 30 mins. For the Astrobot test I entered the memory level and left the character there for 1 hour (it seemed to reach a steady state after about 30 mins). It's not the worst consumption across the entire game, but it allows me to redo the exact same test with the sensor placed elsewhere (I had to do 3 separate tests, since I only have 3 probes, and I didn't want to fill the precious space over the aluminum sheet with 8 wires that would really mess up the airflow). I'll be able to redo an exact test after the TIM is torn off or replaced, and I might test again after drilling a tiny hole in the aluminum plate to put a sensor right on the surface of the chip.

So there isn't any outlier chip. The data shows chips 1, 2 and 3 are cooled a little better thanks to their proximity to the aluminum area closest to the fan, while chips 4 through 8 sit on the other side, under the optical drive, where that part of the aluminum heatsink isn't getting as much cooling. All of the chips are an equal distance from the main heat source too. The chip closest to the NAND controller could have been an issue, but it doesn't seem to show anything.
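To put rough numbers on the fan-side versus drive-side split, here is a small Python sketch that uses only the Astrobot column from the table above. The grouping of chips 1 to 3 versus 4 to 8 follows the description in the post; the variable and function names are made up for illustration and are not from any tool used in the test.

```python
from statistics import mean

# Astrobot (memory level) readings from the table above, in degrees C,
# keyed by chip number.
astrobot = {1: 50.5, 2: 48.8, 3: 52.6, 4: 54.4,
            5: 55.2, 6: 56.6, 7: 55.6, 8: 55.0}

fan_side = [1, 2, 3]            # chips nearest the fan, per the post
drive_side = [4, 5, 6, 7, 8]    # chips under the optical drive, per the post


def group_mean(readings, chips):
    """Average temperature of the listed chips."""
    return mean(readings[c] for c in chips)


def max_gap_within(readings, chips):
    """Spread between the hottest and coldest chip in a group."""
    temps = [readings[c] for c in chips]
    return max(temps) - min(temps)


fan_mean = group_mean(astrobot, fan_side)
drive_mean = group_mean(astrobot, drive_side)

print(f"fan side mean:   {fan_mean:.1f}C, spread {max_gap_within(astrobot, fan_side):.1f}C")
print(f"drive side mean: {drive_mean:.1f}C, spread {max_gap_within(astrobot, drive_side):.1f}C")
print(f"difference between groups: {drive_mean - fan_mean:.1f}C")
```

The gap between the two groups is larger than the spread inside the drive-side group, which fits the reading that location and airflow explain the differences rather than one misbehaving chip.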

A die 20C hotter than the point of measurement here would be reasonable: the math adds up to 15C junction-to-case, 5C through the TIM, and a negligible drop through the aluminum thickness. However, the GDDR6 spec is a 95C case surface temperature, leading to a calculated limit of 110C junction, as opposed to GDDR5, which had an 85C case limit and a 100C junction limit.
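The stack-up above can be written out as a few lines of Python. This is only a sketch of the arithmetic in the post: the 5C TIM drop and 15C junction-to-case delta are the post's estimates, not measured values, the aluminum drop is treated as zero, and the function and constant names are invented for the example.

```python
# Estimates taken from the post, all in degrees C.
TIM_DROP_C = 5.0             # assumed drop through the thermal pad
JUNCTION_TO_CASE_C = 15.0    # assumed die-to-case-surface delta for GDDR6
CASE_LIMIT_C = 95.0          # GDDR6 case surface limit quoted in the post
JUNCTION_LIMIT_C = CASE_LIMIT_C + JUNCTION_TO_CASE_C  # 110C, as calculated in the post


def estimate_from_probe(measured_c):
    """Work back from a probe reading on the aluminum side of the TIM.

    Returns (estimated case surface temp, estimated junction temp).
    The drop through the aluminum sheet is treated as negligible.
    """
    case_c = measured_c + TIM_DROP_C
    junction_c = case_c + JUNCTION_TO_CASE_C
    return case_c, junction_c


hottest_probe_c = 56.6  # chip 6 in the Astrobot column of the table
case_c, junction_c = estimate_from_probe(hottest_probe_c)
case_headroom_c = CASE_LIMIT_C - case_c
junction_headroom_c = JUNCTION_LIMIT_C - junction_c

print(f"estimated case surface: {case_c:.1f}C (headroom {case_headroom_c:.1f}C)")
print(f"estimated junction:     {junction_c:.1f}C (headroom {junction_headroom_c:.1f}C)")
```

With these assumptions the hottest chip lands a little over 60C at the case surface, so even a heavier workload or a somewhat worse pad would still leave room under the 95C limit.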

Other people's tests on GDDR6 confirm the 15C delta we should expect from GDDR6 between the die and the case surface:
www.igorslab.de

GDDR6 memory temperatures comprehensibly explained and remeasured - is AMD doing everything right? | Basics | Page 2 | igor´sLAB

All the errors and confusions about the suddenly occurring value of the memory temperatures of AMD's current graphics cards understandably lead to uncertainty among many users. Modern memory chips…

Basically my test is closer to the case temperature (JEDEC requirements), while the GN and Spawn Wave tests are measuring the die temperature almost directly (a much more important figure for overclockers). Combining all of this with 15C junction-to-case and 5C through the TIM, it all adds up except for the GN outlier chip. Spawn Wave is 5C higher than my hottest chip, but I didn't try to find the highest consumption across Astrobot, so it seems a reasonable difference.

TL;DR

Unless I missed something, the math doesn't logically add up to more than 70C to 75C chip surface temp (against a 95C allowed design limit) unless there's been a ridiculous amount of damage to the TIM, or my Astrobot test is very far from the highest consumption.

Whichever way we try to deduce the actual case surface temperature, to see how close it is to the JEDEC requirements, either from the top going through the TIM, or from the bottom going through the vias, die, and package, I don't see any indication the 95C limit is reached within that thermal sandwich.
Wow, what an effort, thanks for all of this. Interesting.
 

MrFox

VFX Rendering Pipeline Developer
Verified
Jun 8, 2020
1,435
I did one last test before I removed the temp sensors. There was a question about whether the thin bottom vent is useful. It's a vent that is blocked when you put it vertically without the stand, and the design of the stand looks very purposefully shaped to keep this vent unobstructed. So I tested with the vent open and then completely blocked, then open again to confirm.

Astrobot test:
Vent open 1 hour
Vent blocked 30 min
Vent reopened 30 min

Coldest chip:
Open: 49.1C
Blocked: 50.6C
Reopened: 49.4C

Hottest chip:
Open: 57.2C
Blocked: 58.4C
Reopened: 57.5C

While it's not a big difference because the fan will be spinning up to compensate for the SoC and memory temps, there's also a corner of the console that could slowly accumulate trapped heat if those vents are blocked. This test was at 20C ambient and it would be a lot worse in the summer without A/C.
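For anyone who wants the vent effect as a single number, here is a quick Python sketch over the readings above. Averaging the "open" and "reopened" readings as the baseline is my own choice for the example, not something the test procedure prescribes, and the names are hypothetical.

```python
# Vent test readings from the post, in degrees C.
vent_test = {
    "coldest": {"open": 49.1, "blocked": 50.6, "reopened": 49.4},
    "hottest": {"open": 57.2, "blocked": 58.4, "reopened": 57.5},
}


def blocked_rise(readings):
    """Temperature rise with the vent blocked, relative to the average
    of the two unblocked readings (before and after)."""
    baseline = (readings["open"] + readings["reopened"]) / 2
    return readings["blocked"] - baseline


rises = {chip: blocked_rise(r) for chip, r in vent_test.items()}

for chip, rise in rises.items():
    print(f"{chip} chip: +{rise:.2f}C with the bottom vent blocked")
```

Both chips come out at a bit over 1C, which lines up with the "not a big difference" reading at 20C ambient.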

About the white panels: they are designed to provide the correct front-to-back airflow for AV racks/cabinets and to slightly muffle the fan noise, and they also close off the holes that were designed for vacuum cleaning. So if you really want to remove them for some reason, make sure you plug the vacuum holes. On the positive side, removing the panels helps the fan run a little more efficiently. You gain some, you lose some. It can't be a big difference either way, so it's not even worth testing. Unlike the PS4, there are no parts in the upstream path of the fan, making this inconsequential. (The HDD on the PS4 was cooled by the inlet air before it reached the fan, so those who put a hole in the case above the fan risked overheating the HDD, since these drives degrade above 40C and it gets no more air once you create an easier path. This is why you often see HDDs placed as the first component upstream of the fan in consoles; we have no such issues with SSDs. That path also helped cool the memory on the launch PS4.)
 
Last edited:

GoldPunch

Member
Apr 20, 2018
40
Turkey
This is new. He just replaced thermal pads with new ones.




And this is from TronicsFix - Liquid metal vs thermal paste

 
Last edited: