How would the PS5 Pro and Xbox Series X be upgraded in hardware?

During the previous generation of consoles, we saw the arrival of the PS4 Pro and the Xbox One X. Both were designed to take advantage of 4K TVs without the need to launch a new generation of consoles. These iterations had a completely new SoC, with better specs but fully compatible with the standard versions of PS4 and Xbox One. Can we expect the same from the PS5 and Xbox Series X?

Upgraded versions of PS5 and Xbox Series X?

First of all, we have to take into account that every product needs a reason to exist. In other words, there has to be a reason to buy it around which its marketing can be built. The first thing that may come to mind is the existence of 8K TVs, but the power required to render graphics at that resolution would require very large chips if the GPU were expanded with more full cores.

What makes more sense is improving the part related to AI, namely deep learning and machine learning. These are becoming very important thanks to NVIDIA, while AMD lags considerably behind, to the point that it has not yet implemented Tensor-type units in its gaming GPUs.

In the case of Microsoft, we have the DirectML libraries, which are used for super-resolution algorithms in the style of NVIDIA’s DLSS and for the denoising used in ray tracing. Super-resolution algorithms such as DLSS are especially interesting: because they render at a lower resolution and then upscale the image, they allow higher frame rates, which improves the performance of games with ray tracing.

Chips are more and more expensive to manufacture

Developing and deploying new manufacturing nodes is increasingly expensive. These costs end up being passed on to the price of the wafers from which the chips are made. The result? Although more transistors fit per mm², the cost per unit of chip area has increased. We have seen it in the SoCs of SONY’s consoles, where that of the PS4 measured 348 mm², that of the PS4 Pro 324 mm², and that of the PS5 is close to 290 mm².

All of this means that adding compute units to the GPU is getting harder and harder. Fortunately, adding units for AI computation costs far less than adding more GPU cores. Thanks to them, super-resolution algorithms can be applied by adding less than 5% to the transistor count of an SoC.

SONY and Microsoft use TSMC’s 7nm node. An improved iteration should take advantage of a more advanced node, but that comes with the additional cost of the move, since TSMC’s 5nm design rules are not the same as those of 7nm. While it’s too early to say anything, the ideal node for SONY and Microsoft is 6nm: its 18% higher density gives them enough space for tensor units and, on top of that, it allows a good part of the design of the PS5 and Xbox Series X SoCs to be reused without having to adapt it to a new node.
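
To put the density figure in perspective, here is a rough back-of-the-envelope sketch. It assumes, purely for illustration, that the 18% density gain applies uniformly to the whole die (in practice logic, cache and analog blocks do not shrink equally) and that the AI units add 5% to the transistor count, the upper bound mentioned earlier. The function names are hypothetical.

```python
def relative_area(density_gain, extra_transistors=0.0):
    """Die area relative to the original design (1.0 means same size).

    density_gain: fractional density increase of the new node (0.18 = 18%).
    extra_transistors: fractional growth in transistor count (0.05 = 5%).
    Assumption: the density gain applies uniformly to the whole die.
    """
    return (1.0 + extra_transistors) / (1.0 + density_gain)


def freed_fraction(density_gain):
    """Fraction of the original area freed by the shrink alone."""
    return 1.0 - relative_area(density_gain)


if __name__ == "__main__":
    print(f"Area freed by an 18% denser node: {freed_fraction(0.18):.1%}")
    print(f"Relative area with 5% more transistors: {relative_area(0.18, 0.05):.3f}")
```

Under these assumptions the shrink alone frees roughly 15% of the area, so a 5% increase in transistors would still leave the die smaller than the original.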

Zen 3+ as the processor

One advantage of the Zen 3 architecture is that it supports the integrated memory controller designed for Zen 2, Scalable Data Fabric in AMD lingo. It would therefore cost AMD nothing to swap the Zen 2 cores, in the style of the Ryzen 4000, for the Zen 3 cores of the Ryzen 5000U and Ryzen 5000H, since they could do it without having to change the rest of the hardware and without losing compatibility with games.

Zen 3’s higher IPC would help the frame rate of games or give the GPU more time to improve graphics. It would also be key to freeing up extra milliseconds for post-processing.
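
As a rough illustration of where those extra milliseconds would come from, the sketch below models CPU time per frame as inversely proportional to instructions per clock, at the same clock speed and instruction count. Both the 19% uplift and the 10 ms baseline are assumptions chosen for the example, not measurements of any console, and the function names are hypothetical.

```python
def cpu_frame_time_ms(base_ms, ipc_uplift):
    """CPU time per frame after an IPC uplift.

    Assumption: same clock speed and same number of instructions per frame,
    so time scales with 1 / (1 + ipc_uplift).
    """
    return base_ms / (1.0 + ipc_uplift)


def freed_ms(base_ms, ipc_uplift):
    """Milliseconds per frame freed for other work, such as post-processing."""
    return base_ms - cpu_frame_time_ms(base_ms, ipc_uplift)


if __name__ == "__main__":
    base = 10.0    # hypothetical CPU time per frame, in milliseconds
    uplift = 0.19  # assumed IPC gain, for illustration only
    print(f"New CPU time: {cpu_frame_time_ms(base, uplift):.2f} ms")
    print(f"Freed per frame: {freed_ms(base, uplift):.2f} ms")
```

Real games rarely scale this cleanly, since memory latency and GPU-bound frames limit the gain, so the result is best read as an upper bound.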

Super resolution and computer vision

Computer vision needs Tensor units to run as fast as possible. The idea is to teach the system to see and observe in order to draw simple conclusions. It must first learn to see basic shapes and later identify objects from these shapes. Thanks to this, an AI can do the following:

Identify the objects.
Identify the fonts.
Observe and act like a painter.

Super-resolution algorithms like NVIDIA’s DLSS rely on the processing unit’s ability to look at an image and reproduce it as faithfully as possible at another resolution. This requires training the AI so that the system can see. These algorithms are built on matrix multiplication, which runs much faster on tensor-type units. Although computer vision can run on different types of hardware, effects such as NVIDIA’s DLSS or FidelityFX Super Resolution are greatly accelerated by these types of units.
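
The operation these units accelerate is a small matrix multiply-accumulate, D = A x B + C. The following pure-Python sketch shows how a larger matrix product can be broken into 4 x 4 tiles, each handled by one such operation. It is only an illustration of the idea: the tile size, the function names and the restriction to square matrices whose size is a multiple of 4 are assumptions made for the example.

```python
TILE = 4  # assumed tile size, matching a 4 x 4 tensor unit


def tile_mma(a, b, c):
    """Return a x b + c for three TILE x TILE matrices (lists of lists).

    This is the step a tensor-type unit would perform in hardware.
    """
    return [
        [c[i][j] + sum(a[i][k] * b[k][j] for k in range(TILE)) for j in range(TILE)]
        for i in range(TILE)
    ]


def get_tile(m, row, col):
    """Copy the TILE x TILE block of m whose top-left corner is (row, col)."""
    return [r[col:col + TILE] for r in m[row:row + TILE]]


def tiled_matmul(a, b):
    """Multiply two square matrices whose size is a multiple of TILE."""
    n = len(a)
    out = [[0] * n for _ in range(n)]
    for row in range(0, n, TILE):
        for col in range(0, n, TILE):
            acc = [[0] * TILE for _ in range(TILE)]
            # Accumulate the partial products along the shared dimension.
            for k in range(0, n, TILE):
                acc = tile_mma(get_tile(a, row, k), get_tile(b, k, col), acc)
            for i in range(TILE):
                out[row + i][col:col + TILE] = acc[i]
    return out
```

On a GPU the same decomposition lets many tiles be processed in parallel, which is where the speed-up over general-purpose ALUs comes from.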

Tensor cores in upgraded versions of PS5 and Xbox Series X

AMD has implemented tensor-type units in its CDNA architecture, which is not a gaming GPU architecture and is aimed at the high-performance computing and AI market. It is an evolution of the GCN architecture to which AMD added the ability to execute tensor-type instructions. If we look at the architecture, we will see that there is no separate Tensor core or NPU. How did AMD do it? Well, the way they did it may work for upgraded versions of PS5 and Xbox Series X.

The first thing we need to understand is that Tensor units are really an evolution of the systolic array, a chain of ALUs in which each ALU takes as input the data calculated by the previous one. A systolic chain moves data in one direction, a systolic matrix in two dimensions and a Tensor unit in three. Think of each dimension as the direction of the arrows in a diagram of a 4 x 4 Tensor unit.

Therefore, the connections between the different Tensor ALUs can be described as follows:

ALU number	1st dimension	2nd dimension	3rd dimension
1	2	5	6
2	3	6	7
3	4	7	8
4	Output	8	Output
5	6	9	10
6	7	10	11
7	8	11	12
8	Output	12	Output
9	10	13	14
10	11	14	15
11	12	15	16
12	Output	16	Output
13	14	Output	Output
14	15	Output	Output
15	16	Output	Output
16	Output	Output	Output
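
The connections in the table follow a regular pattern on a 4 x 4 grid: the first dimension points to the next ALU in the row, the second to the ALU directly below, and the third to the one diagonally below and to the right. ALUs in the last column or last row send their result out of the array instead. The short sketch below regenerates the links from that rule. The function name and the "Output" marker are choices made for this example.

```python
def tensor_links(side=4):
    """Return (alu, dim1, dim2, dim3) tuples for a side x side grid of ALUs.

    ALUs are numbered from 1, row by row. A link that would leave the grid
    is reported as the string "Output".
    """
    rows = []
    for alu in range(1, side * side + 1):
        last_col = alu % side == 0
        last_row = alu > side * (side - 1)
        dim1 = "Output" if last_col else alu + 1                     # right neighbour
        dim2 = "Output" if last_row else alu + side                  # neighbour below
        dim3 = "Output" if last_col or last_row else alu + side + 1  # diagonal neighbour
        rows.append((alu, dim1, dim2, dim3))
    return rows


if __name__ == "__main__":
    for row in tensor_links():
        print("\t".join(str(value) for value in row))
```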

What did AMD do in CDNA? Something very simple: they decided to make the SIMD units double as Tensor units. How? By adding a series of communication channels between the 16 ALUs that make up the SIMD unit, so that they can act as a Tensor unit without a separate one having to be implemented in each compute unit. This is ideal for implementing a Tensor unit in a console, where there will be little space on the chip.
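
To make the idea of ALUs passing data to their neighbours more concrete, here is a toy simulation of a two-dimensional systolic array multiplying two matrices. It is a textbook-style sketch, not AMD's actual design: it assumes 16 ALUs arranged as a 4 x 4 grid, with each ALU forwarding one operand to the right and one downwards on every step, and it does not model the third, diagonal dimension described above. All names are hypothetical.

```python
def systolic_matmul(a, b):
    """Multiply two n x n matrices on a simulated n x n grid of ALUs.

    Values of a enter from the left and travel right. Values of b enter
    from the top and travel down. Each ALU multiplies the pair it currently
    holds and adds the product to its own accumulator.
    """
    n = len(a)
    acc = [[0] * n for _ in range(n)]    # one accumulator per ALU
    a_reg = [[0] * n for _ in range(n)]  # operand travelling to the right
    b_reg = [[0] * n for _ in range(n)]  # operand travelling downwards

    for t in range(3 * n - 2):           # steps needed to drain the array
        new_a = [[0] * n for _ in range(n)]
        new_b = [[0] * n for _ in range(n)]
        for i in range(n):
            for j in range(n):
                if j == 0:
                    k = t - i            # row i is fed with a delay of i steps
                    new_a[i][j] = a[i][k] if 0 <= k < n else 0
                else:
                    new_a[i][j] = a_reg[i][j - 1]  # taken from the left neighbour
                if i == 0:
                    k = t - j            # column j is fed with a delay of j steps
                    new_b[i][j] = b[k][j] if 0 <= k < n else 0
                else:
                    new_b[i][j] = b_reg[i - 1][j]  # taken from the upper neighbour
        a_reg, b_reg = new_a, new_b
        for i in range(n):
            for j in range(n):
                acc[i][j] += a_reg[i][j] * b_reg[i][j]
    return acc


if __name__ == "__main__":
    print(systolic_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
```

The only hardware the scheme needs beyond ordinary multiply-add ALUs is the set of links between neighbours, which is why reusing an existing 16-lane SIMD unit is cheap in terms of chip area.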