
Deleted member 29195

User requested account closure
Banned
Nov 1, 2017
402
Are you a computer scientist? Doesn't read like it.

Teraflops is definitely a measure of computing power (with caveats). What it definitely isn't is a measure of graphics performance. So it definitely means something, but one just has to understand its value is only in very specific contexts.
lol yes i'm a computer scientist. I'll revise my hyperbole of "it means nothing" to "no developer cares about TFLOPS". Teraflops is totally a measure of performance, and if you're an academic writing a paper then I'm happy to read your tables that mention it. But it's purely theoretical. Machine learning is pure FLOPS so it's a bit applicable there.

It's not a measure of GPU performance though. A GPU is much, much more than FLOPS and using it to measure the GPU's performance is marketing smoke and mirrors. It's being equated to a "horsepower" number and it's just not that. The way it's used in these articles means nothing.
 
Last edited:

Briareos

Member
Oct 28, 2017
3,037
Maine
It's equally silly to pretend TFLOPs aren't a useful metric along with bandwidth and functional utility. Yes, you can engineer low occupancy workloads with silly bandwidth requirements that don't exercise your ALU, but it's still an important metric for many workloads. There's really no point in discussing specifics of GPU architecture here, though, it's not an audience that understands or appreciates it.
 

tuxfool

Member
Oct 25, 2017
5,858
lol yes i'm a computer scientist. Teraflops is totally a measure of performance, and if you're an academic writing a paper then I'm happy to read your tables that mention it. It's not a measure of GPU performance though. A GPU is much, much more than FLOPS and using it to measure the GPU's performance is marketing smoke and mirrors.

90% of people don't even know what a "floating point operation" is, let alone that GPUs don't even spend most of their time doing them.
But then TFs aren't meaningless. They have meaning in specific contexts. The stated figure is a perfectly valid thing for a GPU manufacturer to state.

Car manufacturers state the top speed of their cars; that says nothing about the specific requirements to achieve that speed, nor is it a complete characterization of the performance of the car.

Maybe one should be educating people on what these figures mean, instead of what they don't.
 

Deleted member 29195

User requested account closure
Banned
Nov 1, 2017
402
It's equally silly to pretend TFLOPs aren't a useful metric along with bandwidth and functional utility. Yes, you can engineer low occupancy workloads with silly bandwidth requirements that don't exercise your ALU, but it's still an important metric for many workloads. There's really no point in discussing specifics of GPU architecture here, though, it's not an audience that understands or appreciates it.
They're a useful metric for researchers and academics. Not enthusiasts on a gaming forum. I disagree about the esoteric nature of the workloads. I think low occupancy workloads will become increasingly important as time goes on. Volumetric effects and ray tracing are both pretty hard to make high-occupancy. I get that you don't think there's a point to explaining it, but I think it's worth saying how pointless this whole discussion is.

But then TFs aren't meaningless. They have meaning in specific contexts. The stated figure is a perfectly valid thing for a GPU manufacturer to state.

Car manufacturers state the top speed of their cars; that says nothing about the specific requirements to achieve that speed, nor is it a complete characterization of the performance of the car.

Maybe one should be educating people on what these figures mean, instead of what they don't.
Fine, I'll satisfy your pedantry. TFs aren't useful here because there are so many performance characteristics that differ between a console and a desktop GPU. This is the point of my earlier posts. When comparing two desktop GPUs, TFs may be a valid measure. They're still mostly marketing fuzz, but they can be useful if you know that your workload is (as said above) high-occupancy, meaning the majority of operations are floating-point math. Machine learning is an applicable workload.

Comparing a console GPU and a desktop GPU with just TFLOPs is nonsense though. There's so much more going on. I'm sorry if saying "TFs don't mean anything" pricked you, but as far as this conversation is concerned, they don't. They're not useful and they just end up confusing people who think they know more than they do.
 

Dictator

Digital Foundry
Verified
Oct 26, 2017
4,930
Berlin, 'SCHLAND
From a GTC slide....



Console GPUs don't have to travel the PCI-express bus. They're allowed to perform better as a result. Comparing a console GPU to a desktop GPU is just marketing babble. "Teraflops don't mean shit if you can't feed the beast"
Wait what. Are you saying that console GPUs do not have bandwidth considerations or something? What on earth do you mean here?
 

Deleted member 29195

User requested account closure
Banned
Nov 1, 2017
402
Wait what. Are you saying that console GPUs do not have bandwidth considerations or something? What on earth do you mean here?
I mean there's no latency from travelling through the PCI-E bus. On a desktop you have to make that extra trip, and deal with coordinating that extra hardware. Whereas on a console, there's no trip. The CPU and GPU lie together in an SOC, and use a unified memory architecture.

I was trying to find a simple way of saying "Hey, they ripped apart your GPU and shoved all the parts onto the same board, so it's completely different"

As for the TFLOPS quote, it's something I've heard a lot as a graphics programmer: "The ALU performance doesn't matter if you can't feed the beast" i.e. it doesn't matter if you're 12TFLOPS if you don't have 12TFLOPS of fresh data to work on coming in every second. The beast may be able to eat all that data, but you have to be able to feed it.
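
If you want the "feed the beast" point in concrete terms, here's a minimal roofline-style sketch. The 12 TFLOPS and 500 GB/s numbers are illustrative assumptions, not any real console's or card's spec.

```python
# Roofline-style back-of-the-envelope: attainable FLOPS is capped by
# min(peak compute, bandwidth x arithmetic intensity).
# Both hardware numbers below are illustrative assumptions.

PEAK_TFLOPS = 12.0      # assumed peak ALU throughput (TFLOP/s)
BANDWIDTH_GBS = 500.0   # assumed memory bandwidth (GB/s)

def attainable_tflops(flops_per_byte: float) -> float:
    """Attainable TFLOP/s for a kernel with the given arithmetic intensity
    (floating-point ops per byte of data moved)."""
    bandwidth_limit = BANDWIDTH_GBS * flops_per_byte / 1000.0  # GFLOP/s -> TFLOP/s
    return min(PEAK_TFLOPS, bandwidth_limit)

for intensity in (1, 4, 16, 64):
    print(f"{intensity:>3} FLOP/byte -> {attainable_tflops(intensity):5.2f} TFLOP/s usable")

# A kernel doing ~1 FLOP per byte only ever sees ~0.5 of those 12 TFLOPS:
# the ALUs starve unless the data keeps arriving.
```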
 

Briareos

Member
Oct 28, 2017
3,037
Maine
They're a useful metric for researchers and academics.
As someone who has spent much of the last eight years doing low level performance analysis of GPU microcode execution, I can assure you it's more than just researchers and academics to whom it is relevant. Props for explaining aniso throughput, though I'm guessing it gets lost in the noise, as it similarly would if we started talking about TA FIFO queue stalls.
but I think it's worth saying how pointless this whole discussion is
Perhaps a more useful conversation would be how IHVs *could* articulate the relevant aggregate throughput of their hardware to consumers who are largely unable to make those judgments, but even that's probably hopeless. Ultimately consumers should judge their purchases on the available software since it's the games that are the entire point here, but in absence of titles I suppose this is the conversation you get.
 

tokkun

Member
Oct 27, 2017
5,399
Wait what. Are you saying that console GPUs do not have bandwidth considerations or something? What on earth do you mean here?

Consoles use a shared coherent view of unified system memory between the CPU and GPU. If you want to learn more, the term for this is hUMA (heterogeneous Uniform Memory Access).

The GPU can access main memory without traversing the PCI bus, so latency for a main memory reference for code running on the GPU is closer to 50 ns rather than 1 us+. From a programming standpoint, this allows you to write interactions between code running on CPU and GPU using a synchronous model rather than an asynchronous one.

FWIW, this tech exists in AMD's general-purpose APUs; it is not console-exclusive. It is just that no one on PC optimizes their graphics engines around APUs.
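
To put rough numbers on why that matters for anything chatty, here's a tiny latency model using the ballpark figures above (~50 ns for a unified-memory access vs ~1 us+ across PCIe). The access count is made up purely for illustration.

```python
# Rough cost of many small, dependent CPU<->GPU handoffs.
# Latencies are the ballpark figures quoted above; the workload is invented.

UNIFIED_NS = 50      # assumed round trip when CPU and GPU share memory on one SoC
PCIE_NS = 1_000      # assumed round trip when the request crosses the PCIe bus

def stall_ms(dependent_accesses: int, latency_ns: int) -> float:
    """Total stall time if each access must complete before the next starts."""
    return dependent_accesses * latency_ns / 1e6

N = 100_000  # hypothetical fine-grained handoffs per frame
print(f"unified memory: {stall_ms(N, UNIFIED_NS):6.1f} ms of stalls")
print(f"across PCIe:    {stall_ms(N, PCIE_NS):6.1f} ms of stalls")

# This is why PC engines batch work and stage data up front instead of
# bouncing fine-grained requests across the bus every frame.
```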
 

Dictator

Digital Foundry
Verified
Oct 26, 2017
4,930
Berlin, 'SCHLAND
I mean there's no latency from travelling through the PCI-E bus. On a desktop you have to make that extra trip, and deal with coordinating that extra hardware. Whereas on a console, there's no trip. The CPU and GPU lie together in an SOC, and use a unified memory architecture.

I was trying to find a simple way of saying "Hey, they ripped apart your GPU and shoved all the parts onto the same board, so it's completely different"

As for the TFLOPS quote, it's something I've heard a lot as a graphics programmer: "The ALU performance doesn't matter if you can't feed the beast" i.e. it doesn't matter if you're 12TFLOPS if you don't have 12TFLOPS of fresh data to work on coming in every second. The beast may be able to eat all that data, but you have to be able to feed it.
This is a real consideration if the GPU and CPU are addressing the same memory - something a computer scientist may want to do. But in an environment where GPUs and CPUs are doing very specialised things, like in a video game, it is not very important. Last-gen consoles like the PS4 and X1 showed that UMA shared memory was not really utilised, as it is still too slow and still bottlenecks even without PCI-E when you share an address space. Just looking at the hardware design back then shows that, with how slow the various buses were.

I do not think it will be dramatically better this time, relatively speaking, as we are looking at game consoles meant for gaming, not data science machines utilising huge shared data sets.
 

Deleted member 29195

User requested account closure
Banned
Nov 1, 2017
402
As someone who has spent much of the last eight years doing low level performance analysis of GPU microcode execution, I can assure you it's more than just researchers and academics to whom it is relevant. Props for explaining aniso throughput, though I'm guessing it gets lost in the noise, as it similarly would if we started talking about TA FIFO queue stalls.

Perhaps a more useful conversation would be how IHVs *could* articulate the relevant aggregate throughput of their hardware to consumers who are largely unable to make those judgments, but even that's probably hopeless. Ultimately consumers should judge their purchases on the available software since it's the games that are the entire point here, but in absence of titles I suppose this is the conversation you get.
I getchu and I know you're right. I don't know many developers who think like that though. There's a lot of "write a shader and forget about it". I was trying to be hyperbolic, and it didn't work out. I hate when people bring up TFLOPS so yeah. My point is that TFLOPS is not a good measure for comparing a desktop GPU to a console GPU. It's a measure for determining theoretical performance when making those considerations as an engineer that's been co-opted by marketing and twisted beyond its original meaning.

Also thanks I'm honestly proud of the aniso thing lol. Felt like a real-life interview there for a second.

I like the "this is what the FPS is" charts that compare various games. That's probably the most accurate metric you can find on how a game will perform. Look at something that uses the same engine and a similar art direction, and it'll probably run pretty similar.
 
Last edited:

VariantX

Member
Oct 25, 2017
16,880
Columbia, SC
Only thing I can do is shrug because you get what you pay for, including the rest of the PC around that card that costs as much as or more than a console alone.
 

Bosch

Banned
May 15, 2019
3,680


link is time stamped.


I think this is the first time Nvidia has referenced a next-generation console in their slides.
Then again, the slides are not explicit about what the comparison metric is here, whether that is raw TF, average framerates, ray tracing capabilities, etc.

And you guys had any doubts? Only in dreamland would consoles perform like an RTX 2080.

Consoles will be in line with a 2070.
 

chromatic9

Member
Oct 25, 2017
2,003
The better these consoles are in terms of pure price/performance, the more pressure it puts on a market that has been price gouged to hell and back for the past few years. We might get a veritable Ryzen moment for GPUs out of this launch, and that alone excites me.

Exactly.

We don't want an Xbox One 7770 1.3 tflop situation. That was laughably weak even when we heard it back in 2012. Give the GPU designers something to do once the consoles launch to push the tech forward.
 

Conkerkid11

Avenger
Oct 25, 2017
13,945
Are you a computer scientist? Doesn't read like it.

Teraflops is definitely a measure of computing power (with caveats). What it definitely isn't is a measure of graphics performance. So it definitely means something, but one just has to understand its value is only in very specific contexts.
Computer scientists are generally programmers though. What does being a computer scientist have to do with anything?
 

tokkun

Member
Oct 27, 2017
5,399
But then TFs aren't meaningless. They have meaning in specific contexts. The stated figure is a perfectly valid thing for a GPU manufacturer to state.

Car manufacturers state the top speed of their cars; that says nothing about the specific requirements to achieve that speed, nor is it a complete characterization of the performance of the car.

Maybe one should be educating people on what these figures mean, instead of what they don't.

GPU manufacturers never give us enough context to make an accurate assessment, though. We have seen them state TF figures that are not actually possible to achieve even in a synthetic workload due to instruction issue bandwidth. We have seen them treat FP16 and FP32 as the same. We have seen them treat MAC as two operations.

In your car analogy, it is like a manufacturer stating a top speed and later you find out that it was a theoretical speed only achievable in a vacuum. Not only is it completely useless in a practical sense, it also muddies the definition so much as to make any sort of comparison meaningless.
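
For anyone curious how the headline number gets produced in the first place, here's the usual accounting with a made-up GPU; the ALU count and clock are assumptions, not any real product.

```python
# How a headline TFLOPS figure is typically derived: ALU count x clock x 2,
# because a fused multiply-add (MAC) is counted as two operations.
# The GPU below is hypothetical; the point is how flexible the accounting is.

def theoretical_tflops(alus: int, clock_ghz: float, ops_per_alu_per_clock: int = 2) -> float:
    return alus * clock_ghz * ops_per_alu_per_clock / 1000.0

print(theoretical_tflops(2560, 1.8))     # ~9.2 "TFLOPS", counting each MAC as 2 ops
print(theoretical_tflops(2560, 1.8, 4))  # ~18.4 if FP16 runs at double rate and that's the number quoted

# Same silicon, double the headline figure, and neither number says whether
# the front end can issue enough instructions to ever reach it.
```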
 

Deleted member 29195

User requested account closure
Banned
Nov 1, 2017
402
This is a real consideration if the GPU and CPU are addressing the same memory - something a computer scientist may want to do. But in an environment where GPUs and CPUs are doing very specialised things, like in a video game, it is not very important. Last-gen consoles like the PS4 and X1 showed that UMA shared memory was not really utilised, as it is still too slow and still bottlenecks even without PCI-E when you share an address space. Just looking at the hardware design back then shows that, with how slow the various buses were.

I do not think it will be dramatically better this time, relatively speaking, as we are looking at game consoles meant for gaming, not data science machines utilising huge shared data sets.
I think the point of my very ranty (and honestly inebriated) posts was that comparing the two right now is pointless. I wanted to use the unified memory architecture and the lack of a PCI-E slot as two good ways to show a layman that they're apples and oranges. The point isn't that one is faster than the other (but boy howdy do I got some papers on that), but just that they can be faster and slower in unexpected ways.

In your car analogy, it is like a manufacturer stating a top speed and later you find out that it was a theoretical speed only achievable in a vacuum. Not only is it completely useless in a practical sense, it also muddies the definition so much as to make any sort of comparison meaningless.
Thanks for putting it better than I could. This is what I meant.
 

GhostTrick

Member
Oct 25, 2017
11,304
From a GTC slide....

As someone who actually goes to GTC, the entire premise of this thread is extremely flawed. And I don't really know what else to say.

Console GPUs don't have to travel the PCI-express bus. They're allowed to perform better as a result. Comparing a console GPU to a desktop GPU is just marketing babble. "Teraflops don't mean shit if you can't feed the beast"

If you're reading this thread, don't take this seriously. Please. It has no technical worth whatsoever. Jensen makes bizarre claims like this all the time to drum up enthusiasm during his speeches. It's just how he rolls.


Ah, it's been a while since I saw the good old "PCI express bus".
If this held any truth, why is there next to no performance loss/gain when you compare PCIe 2.0, 3.0 and 4.0?
 

z0m3le

Member
Oct 25, 2017
5,418
Rumoured recent specs... not official specs... officially we have 2x the performance of the Xbox One X, which doesn't say much.
I agree that it is still rumoured; however, the XB1X is a 16nm SoC with 6 TFLOPs in a console half the size of the Xbox Series X, which is using a 7nm process that can fit twice the transistors, run higher clock speeds, and use less power per transistor per cycle. Microsoft also used the phrase 'when we do the math': 4x CPU (the XB1X is a 2GHz 8-core CPU; the Ryzen 2-based Xbox is 8 cores and probably 16 threads at ~3.5GHz, which works out to something similar to 4 times the CPU performance) and 2x GPU performance. Since Microsoft markets the XB1X as 6 TFLOPs, that math leads the average person to read it as 12 TFLOPs. Again, considering the technology and the size of the Xbox Series X, there is no reason it can't be a 12 TFLOPs console.
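
Roughly how that "4x the CPU" reading pencils out, with the per-clock (IPC plus SMT) uplift being my own illustrative assumption rather than anything Microsoft has stated:

```python
# Back-of-the-envelope for the "4x CPU" reading. The per-clock factor is an
# assumed, illustrative number, not an official spec.

xb1x_clock_ghz   = 2.0   # 8-core Jaguar in the Xbox One X
next_clock_ghz   = 3.5   # rumoured 8-core / 16-thread Zen 2 clock from this thread
per_clock_factor = 2.3   # assumed combined IPC + SMT uplift from Jaguar to Zen 2

speedup = (next_clock_ghz / xb1x_clock_ghz) * per_clock_factor
print(f"~{speedup:.1f}x CPU throughput")  # ~4.0x, which is how "4x the CPU" pencils out
```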
 

Deleted member 29195

User requested account closure
Banned
Nov 1, 2017
402
Ah, it's been a while since I saw the good old "PCI express bus".
If this held any truth, why is there next to no performance loss/gain when you compare PCIe 2.0, 3.0 and 4.0?
Because PCI-E 2/3/4 increase the bandwidth. They do not lower latency. 128GB/s doesn't mean shit if it takes an hour for the first byte to get there. You still have to talk to a chip that's physically separate from your CPU and main memory. It's still the same trip regardless of the version.
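
To put rough numbers on the latency-versus-bandwidth point (the request size and the ~1 us bus latency are illustrative assumptions):

```python
# A fatter bus shortens the transfer, not the wait for the first byte.
# Bandwidths are nominal x16 figures; the 1 us latency is an assumption.

def request_time_us(size_bytes: float, bandwidth_gbs: float, latency_us: float) -> float:
    return latency_us + size_bytes / (bandwidth_gbs * 1e9) * 1e6

SMALL_REQUEST = 256  # a tiny, latency-bound read (bytes)
for gen, bw in (("PCIe 2.0 x16", 8.0), ("PCIe 3.0 x16", 16.0), ("PCIe 4.0 x16", 32.0)):
    print(f"{gen}: {request_time_us(SMALL_REQUEST, bw, latency_us=1.0):.3f} us")

# Quadrupling the bandwidth barely moves a 256-byte request: the ~1 us trip
# across the bus dominates, which is the point being made above.
```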
 

Bjones

Member
Oct 30, 2017
5,622
Comparing Nvidia and AMD tflops doesn't work. AMD has always boasted higher numbers, but their cards generally perform worse per tflop than the competition.

Same thing for the 5700 XT: it has 9.75 tflops but it's around 8% slower than a 2070 Super with only 9 tflops. A 12 tflop Navi is definitely not faster than a 2080.

The 5700 XT is close, and better in some games, so they seem pretty even to me, and that's just a 0.75 difference. 10 vs 12 isn't close.
 

Admiral Woofington

The Fallen
Oct 25, 2017
14,892
I don't think it's a negative that these consoles will perform so God damn well if it directly translates to Nvidia being forced to lower their God damn prices and introduce more powerful cards at better prices as well. What is my incentive to build a $2000+ machine with the latest and greatest cards when I could wait a short while, buy the next-gen Xbox, and play Cyberpunk at or around the same level as a 2080 for somewhere around $600?
 

z0m3le

Member
Oct 25, 2017
5,418
Ah, it's been a while since I saw the good old "PCI express bus".
If this held any truth, why is there next to no performance loss/gain when you compare PCIe 2.0, 3.0 and 4.0?
He is talking about how consoles don't have to move data from DDR4 memory to the GPU's VRAM. It's a bottleneck that doesn't affect game performance because developers move as much of that data as possible into VRAM while the scene is loading, and try to keep the need for new transfers to a minimum outside of loading.
Because PCI-E 2/3/4 increase the bandwidth. They do not lower latency. 128GB/s doesn't mean shit if it takes an hour for the first byte to get there. You still have to talk to a chip that's physically separate from your CPU and main memory. It's still the same trip regardless of the version.
I haven't dived into this really, because all of this is just a hobby for me, but usually when you increase speed, you also increase latency. Do these PCI-E buses not come with an increase in latency as they get faster?
 

Dr Pears

Member
Sep 9, 2018
2,671
Nvidia should be grateful for the new consoles. The new consoles will probably push more Ray Tracing in new game releases compared to currently.
 

Deleted member 29195

User requested account closure
Banned
Nov 1, 2017
402
I haven't dived into this really, because all of this is just a hobby for me, but usually when you increase speed, you also increase latency. Do these PCI-E buses not come with an increase in latency as they get faster?
I have no idea, PCI-E is very much out of my depth, but honestly you've piqued my interest too. I do wonder if that is a consideration.
 

Heshinsi

Member
Oct 25, 2017
16,091
Why are people referring to the desktop 2080 video cards, when the presentation (where the comparison slide is from) is about laptop 2080s? The laptop variant is quite a bit below the desktop model, so isn't it a bit disingenuous for Nvidia to put up a slide like that? I mean, it could technically be true that the RTX 2080 is more powerful than whatever is in the next-gen consoles, but I doubt they mean the laptop model.
 

z0m3le

Member
Oct 25, 2017
5,418
The 5700 XT is close, and better in some games, so they seem pretty even to me, and that's just a 0.75 difference. 10 vs 12 isn't close.
In these terms, the 5700 XT's 9.75 TFLOP GPU is acting like an RTX 2070 Super with only 8.28 TFLOPs instead of 9 TFLOPs (that's the 8%-slower figure here), which is why it's notable. Of course there are many other differences between these cards, but if you were just using TFLOPs as an indicator of ultimate performance, this is how it looks to the average user: you could downclock the 2070 Super by 8% and see similar performance to the 5700 XT despite those numbers.
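
A quick sketch of that arithmetic, using the rough figures quoted in this thread rather than proper benchmark data:

```python
# If a 9.75 TFLOPS card benchmarks ~8% slower than a 9 TFLOPS card,
# TFLOPS alone mispredicts the result. Figures are the rough ones from this thread.

rtx_2070s_tflops = 9.0
rx_5700xt_tflops = 9.75
relative_5700xt_perf = 0.92  # "around 8% slower" in games

effective_tflops = rtx_2070s_tflops * relative_5700xt_perf
perf_per_tflop = effective_tflops / rx_5700xt_tflops

print(f"5700 XT performs like a {effective_tflops:.2f} TFLOPS 2070 Super")  # ~8.28
print(f"relative perf per TFLOP vs the 2070 Super: {perf_per_tflop:.2f}")   # ~0.85

# So comparing a 10-12 TFLOPS RDNA part to a Turing card by TFLOPS alone
# already drops ~15% of the picture, before bandwidth or drivers enter into it.
```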
I have no idea, PCI-E is very much out of my depth, but honestly you've piqued my interest too. I do wonder if that is a consideration.
The PCI-E lanes are generally faster because of wider lanes, but I assume there is a clock increase to the bus as well, which is why I asked. Maybe they don't increase clock rates for this very reason.
 

Firefly

Member
Jul 10, 2018
8,621
Nvidia's pricing is irrelevant for next-gen consoles because they won't be manufacturing a GPU for them. AMD has aggressive pricing, so these remarks about a "$600 RTX 2080 in a < $500 console" make no sense.
 

MPrice

Alt account
Banned
Oct 18, 2019
654
lol Nvidia always take these little potshots at consoles when they don't get a contract. Imagine having to point out that a card this size is better than what you can fit into a console:

 

Minsc

Member
Oct 28, 2017
4,118
Please don't be so pedantic. It's pointless for business reasons. You could push a shooter to 120FPS but it wouldn't look good. It wouldn't go through HDMI. No TV would be able to display it, and no average consumer would be able to tell the difference. As a result, we prioritized things that the consumer will notice or has had marketed to them (4K).

Huh? Isn't like every single TV on the market for > $200 capable of displaying 120fps? Generally they can do even more now too, if you spend a little more.

It would be pretty neat if games on next gen consoles had presets for

Best possible graphics at any framerate
30fps
60fps
Highest possible framerate reducing graphics

Once in a while you see a console game allow for some customization in graphics, but it's pretty rare. Either way, I don't think the lack of 120fps support has anything to do with TV hardware, because I believe the TVs out now support that framerate just fine.
 

Deleted member 29195

User requested account closure
Banned
Nov 1, 2017
402
Huh? Isn't like every single TV on the market for > $200 capable of displaying 120fps? Generally they can do even more now too, if you spend a little more.
Mine can only do 60 and 30. Don't know about others. And I don't know enough about HDMI to know if the standard supports that.

EDIT: Did some googling. HDMI supports 120. Seems pretty uncommon for a TV to have a 120FPS setting though.
 

Minsc

Member
Oct 28, 2017
4,118
Mine can only do 60 and 30. Don't know about others. And I don't know enough about HDMI to know if the standard supports that.

EDIT: Did some googling. HDMI supports 120. Seems pretty uncommon for a TV to have a 120FPS setting though.

The resetera darling HDTV does - LG's C9. If you filter on rtings, there's a ton of others that do, but I'm not sure rtings is even all that comprehensive of a database.

Anyway, you gotta figure if the TV can do 4k@60fps (and really - which TVs now can't), it can probably do 1080p@120. And more and more newer TVs are doing 4k@120fps like the C9 now too.

Also the new cables seem to support even 8k@120hz so there's that too, think even that's in a few TVs as well.
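
A quick sanity check on the raw pixel rates behind the "4K60 implies 1080p120" logic; this ignores bit depth, chroma subsampling and link overhead, so it's only a ballpark, not a claim about any specific TV:

```python
# Raw pixel rates only; ignores bit depth, chroma subsampling and link overhead.

def pixels_per_second(width: int, height: int, fps: int) -> int:
    return width * height * fps

uhd_60  = pixels_per_second(3840, 2160, 60)    # ~498 Mpx/s
fhd_120 = pixels_per_second(1920, 1080, 120)   # ~249 Mpx/s, half the rate of 4K60
uhd_120 = pixels_per_second(3840, 2160, 120)   # ~995 Mpx/s, HDMI 2.1 territory

for label, rate in (("4K60", uhd_60), ("1080p120", fhd_120), ("4K120", uhd_120)):
    print(f"{label:>8}: {rate / 1e6:.0f} Mpx/s")
```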
 

Deleted member 29195

User requested account closure
Banned
Nov 1, 2017
402
The resetera darling HDTV does - LG's C9. If you filter on rtings, there's a ton of others that do, but I'm not sure rtings is even all that comprehensive of a database.

Anyway, you gotta figure if the TV can do 4k@60fps, it can probably do 1080p@120. And more and more newer TVs are doing 4k@120fps like the C9 now too.
That's a big difference from "Any TV > 200".

You could have a discussion for hours about why businesses are choosing to prioritize 4K over FPS, but they are, and that's just the decision. I don't necessarily agree with it, but it's the reality of today and tomorrow.