Let me remind you that this started with you quoting me and saying "the XSX does not have dedicated hardware for this purpose, so committing resources to DirectML reconstruction means taking them away from other rendering processes."
I replied with:
"We knew that many inference algorithms need only 8-bit and 4-bit integer positions for weights and the math operations involving those weights comprise the bulk of the performance overhead for those algorithms," says Andrew Goossen. "So we added special hardware support for this specific scenario. The result is that Series X offers 49 TOPS for 8-bit integer operations and 97 TOPS for 4-bit integer operations. Note that the weights are integers, so those are TOPS and not TFLOPs. The net result is that Series X offers unparalleled intelligence for machine learning." Link
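To make the quote concrete: this is a hedged sketch of what "8-bit integer weights" means in inference, not Microsoft's or DirectML's actual implementation. Float weights are mapped onto small integers with a shared scale, the dot product runs entirely in integer math (the operations the quoted TOPS figures count), and one float multiply at the end recovers an approximate real-valued result. The function names and values here are illustrative assumptions.

```python
# Illustrative int8 weight quantization, not the XSX hardware path.

def quantize_int8(weights):
    """Map float weights onto [-127, 127] with a single shared scale."""
    scale = max(abs(w) for w in weights) / 127.0
    q = [round(w / scale) for w in weights]
    return q, scale

def int8_dot(q_weights, q_activations, w_scale, a_scale):
    """Integer multiply-accumulate, then one float rescale at the end."""
    acc = sum(w * a for w, a in zip(q_weights, q_activations))  # pure integer ops
    return acc * w_scale * a_scale

weights = [0.5, -1.27, 0.02, 0.9]
q, w_scale = quantize_int8(weights)
# Activations are assumed quantized elsewhere with their own scale.
acts, a_scale = [10, -3, 7, 1], 0.01
result = int8_dot(q, acts, w_scale, a_scale)
```

Because the multiply-accumulates are integer rather than floating-point operations, throughput for them is quoted in TOPS, not TFLOPS, and halving the weight width to 4 bits is what lets the quoted figure roughly double.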
Things shouldn't be so complicated. There is no need to move the conversation to "Oh, but the XSX will also have to use shader cores, so the performance won't be the same as the tensor cores." I have already said that the tensor cores will have more performance, so there is no need to go there; that was never the point. However you want to put it, the XSX does have dedicated hardware to accelerate machine learning code. The minute you admit this, it invalidates your original claim that "the XSX does not have dedicated hardware for this purpose." It does have it, and I have no clearer way to show that than quoting someone from Microsoft itself saying they added hardware for this purpose. This will be my last reply to you on this.