In reality, wherever you look in the industry, architectures differ and are increasingly tailored to specific sectors, yet they share the same foundations, the same problems and the same advantages. That is why we are going to look at the main hardware bottlenecks and their evolution, to understand where we are headed.
Different components, different limitations: performance and latency
Logically, the limitations or bottlenecks differ from component to component, but in every case they share one factor of greater or lesser weight: latency. In some components it is decisive, in others it stays in the background, but it will undoubtedly shape performance in the years to come. And it makes no difference whether we talk about PC or console: allowing for their peculiarities, both are equally affected.
To get an idea of just how important latency is, there is a table that became famous in its day and illustrates what the latencies of various components (nanoseconds, microseconds, milliseconds) would mean if translated into time as we humans normally perceive it.
As can be seen, in a 3 GHz processor a single clock cycle of just 0.3 ns would correspond to 1 second on our scale. An L3 access, which takes on average around 13 ns depending on the processor architecture, represents 43 seconds of our life.
Operation | Average latency | Equivalent on a human scale
---|---|---
One clock cycle at 3 GHz | 0.3 ns | 1 second
CPU L1 cache access | 0.9 ns | 3 seconds
CPU L2 cache access | 2.8 ns | 9 seconds
CPU L3 cache access | 12.9 ns | 43 seconds
RAM access | 70 to 100 ns | 3.5 to 5.5 minutes
NVMe SSD I/O | 7 to 150 µs | 2 hours to 2 days
Hard disk I/O | 1 to 10 ms | 11 days to 4 months
Internet, San Francisco to New York | 40 ms | 1.2 years
Internet, San Francisco to Australia | 183 ms | 6 years
OS virtualization reboot | 4 seconds | 127 years
Hardware virtualization reboot | 40 seconds | 1,200 years
Physical system reboot | 90 seconds | 3 millennia
If we extrapolate this to RAM at 100 ns, the wait would stretch to 5.5 minutes on our scale. Perhaps the most striking entries are the Internet latencies, something familiar that anyone can relate to: 40 milliseconds between San Francisco and New York would equate to losing 1.2 years of our life, and if we change the destination to Australia, no less than 6 years.
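To make the table's arithmetic explicit, here is a minimal C sketch that rescales the latencies above using the same rule as the first rows: one 3 GHz clock cycle (0.3 ns) becomes one second. The latency figures are the illustrative values from the table, not measurements.

```c
#include <stdio.h>

int main(void) {
    /* One clock cycle at 3 GHz, rounded to 0.3 ns as in the table. */
    const double cycle_ns = 0.3;

    const struct { const char *name; double ns; } rows[] = {
        { "L1 cache access", 0.9 },
        { "L2 cache access", 2.8 },
        { "L3 cache access", 12.9 },
        { "RAM access", 100.0 },
    };

    for (size_t i = 0; i < sizeof rows / sizeof rows[0]; i++) {
        /* Map real nanoseconds onto the human scale: 1 cycle -> 1 s. */
        double human_s = rows[i].ns / cycle_ns;
        printf("%-16s %6.1f ns -> %5.0f s (%.1f min)\n",
               rows[i].name, rows[i].ns, human_s, human_s / 60.0);
    }
    return 0;
}
```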
Latency, then, matters enormously on both PC and console, and as we can see, the industry has been fighting for more than 40 years, generation after generation, to reduce it and thereby increase performance per instruction and per cycle. That said, let's see how it affects the main components and whether improvements are coming in the short or long term.
CPU latency
The CPU is by far the component that suffers the most from latency. Former AMD chief architect Jim Keller summed it up brilliantly at the time:
Performance limits are the predictability of instructions and data
In other words, if you can predict which resources each instruction and piece of data will need, you can manage them better, waste less time between them and thus increase performance.
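Keller's point can be felt on almost any modern CPU. The following C sketch times the exact same loop over the same values twice: once with the branch outcome effectively random (unsorted data) and once fully predictable (sorted data). The several-fold speed difference typically observed comes purely from the predictability of the branch, not from the amount of work.

```c
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define N (1 << 20)

static int cmp(const void *a, const void *b) {
    return *(const unsigned char *)a - *(const unsigned char *)b;
}

/* Sums values above a threshold; the if() is the branch under test. */
static long long timed_sum(const unsigned char *data, double *seconds) {
    long long sum = 0;
    clock_t t0 = clock();
    for (int pass = 0; pass < 100; pass++)
        for (int i = 0; i < N; i++)
            if (data[i] >= 128)
                sum += data[i];
    *seconds = (double)(clock() - t0) / CLOCKS_PER_SEC;
    return sum;
}

int main(void) {
    unsigned char *data = malloc(N);
    double t_random, t_sorted;
    for (int i = 0; i < N; i++)
        data[i] = (unsigned char)(rand() & 0xFF);

    long long s1 = timed_sum(data, &t_random);  /* unpredictable branch */
    qsort(data, N, 1, cmp);
    long long s2 = timed_sum(data, &t_sorted);  /* predictable branch   */

    printf("unsorted: %.2f s, sorted: %.2f s (sums %lld / %lld)\n",
           t_random, t_sorted, s1, s2);
    free(data);
    return 0;
}
```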
Latency is again the problem here. AMD saw it first, and now Intel will partly address it in Raptor Lake: increasing cache sizes to mitigate access times and the movement of instructions and data through the cache hierarchy.
The aim is to avoid going out to RAM, or at least to limit those access cycles as much as possible. AMD already did this with Ryzen from Zen 2 to Zen 3; Intel will now do it in its next architecture.
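A classic way to see why bigger caches pay off is a pointer-chasing micro-benchmark: once the working set outgrows each cache level, every dependent load jumps toward DRAM latency. This is a minimal sketch, with buffer sizes chosen arbitrarily to straddle typical L1/L2/L3 capacities; exact numbers will vary by CPU and the timing is deliberately crude.

```c
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int main(void) {
    const size_t sizes_kib[] = { 16, 128, 1024, 8192, 65536 };
    const long steps = 10 * 1000 * 1000;

    for (size_t s = 0; s < sizeof sizes_kib / sizeof sizes_kib[0]; s++) {
        size_t n = sizes_kib[s] * 1024 / sizeof(size_t);
        size_t *next = malloc(n * sizeof *next);

        /* Build a random single cycle (Sattolo's algorithm) so every
         * load depends on the previous one and defeats the prefetcher. */
        for (size_t i = 0; i < n; i++) next[i] = i;
        for (size_t i = n - 1; i > 0; i--) {
            size_t j = (size_t)rand() % i;
            size_t tmp = next[i]; next[i] = next[j]; next[j] = tmp;
        }

        volatile size_t idx = 0;
        clock_t t0 = clock();
        for (long k = 0; k < steps; k++) idx = next[idx];
        double ns = (double)(clock() - t0) / CLOCKS_PER_SEC * 1e9 / steps;

        printf("%8zu KiB working set: %6.2f ns per access\n",
               sizes_kib[s], ns);
        free(next);
    }
    return 0;
}
```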
RAM and GDDR6 memory
Latency is perhaps the most important aspect of these two memory types. RAM is always questioned over its latency, but what is really demanded of it is more bandwidth, more frequency, more speed, without compromising the ratio between speed and timings. DDR5 kicked this off in earnest, and although we notice it less on PCs than on servers, it is a necessary technology for the industry in general.
As for GDDR6, latency matters less than the bandwidth it delivers: the compute capability of GPUs keeps growing, and they need to be fed with data from their attached memory. Latency is therefore secondary, though far from negligible.
Nor are latency improvements in sight beyond GDDR6X as such, which raises speed and frequency while keeping latency at the same clock cycles.
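As a rough illustration of why bandwidth is the headline figure here, GDDR6 throughput is simply per-pin data rate times bus width. A minimal sketch, using 14 Gbps on a 256-bit bus as an assumed example configuration:

```c
#include <stdio.h>

int main(void) {
    /* Assumed example configuration, common for mid-range GDDR6 GPUs. */
    double gbps_per_pin = 14.0;   /* effective data rate per pin, Gbit/s */
    int bus_width_bits  = 256;    /* memory bus width in bits            */

    /* bytes/s = (bits of bus / 8) * per-pin rate */
    double gb_per_s = gbps_per_pin * bus_width_bits / 8.0;
    printf("%.0f Gbps x %d-bit bus = %.0f GB/s\n",
           gbps_per_pin, bus_width_bits, gb_per_s);   /* -> 448 GB/s */
    return 0;
}
```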
SSDs, their performance and latency on PC
SSDs are the least dependent on this factor, but low latency is still needed for high-bandwidth random operations. Controllers have to exchange ever more data with the NAND cells, so clock cycles cannot be wasted if raw bandwidth, measured in IOPS, is not to suffer.
In summary, CPU cores are the most affected by their caches, and the same will soon happen with GPUs, which are also growing their caches and moving them out of the shader groups, as AMD did with Infinity Fabric and Infinity Cache; the aim there is precisely to depend less on ever-faster GDDR while not taking up space inside the CUs.
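The link between latency and IOPS can be shown with back-of-the-envelope arithmetic: with one outstanding request (queue depth 1), IOPS is roughly the inverse of per-operation latency, and bandwidth is IOPS times block size. The figures below are assumed illustrative values, not any particular drive's specification.

```c
#include <stdio.h>

int main(void) {
    double latency_us = 80.0;   /* assumed NVMe random-read latency */
    double block_kib  = 4.0;    /* typical random I/O block size    */

    /* At queue depth 1, one operation completes before the next starts. */
    double iops_qd1 = 1e6 / latency_us;              /* ops per second */
    double mib_s    = iops_qd1 * block_kib / 1024.0; /* MiB per second */

    printf("QD1: %.0f IOPS -> %.1f MiB/s at %.0f KiB blocks\n",
           iops_qd1, mib_s, block_kib);
    return 0;
}
```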
There is a very curious interactive animation in which, simply by clicking on the various elements represented, the importance of latency across the components of a system becomes perfectly clear: you just click on each part and watch how the data flows.
It is especially interesting to click through system memory, then L2 and then L1, to see how the organization, performance and latency flow between them; really curious and instructive. Seen in this light, AMD's move with Zen 2 and Zen 3 was crucial in being able to take on Intel, at the cost of a very large share of die area, something Intel now has to replicate and which its lithographic process previously spared it from.
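What that click-through animation conveys can also be written as the classic average memory access time (AMAT) formula, where each level either hits or forwards the request down the hierarchy. A minimal sketch using the latencies from the table above and assumed, purely illustrative miss rates; note how shrinking the L3 miss rate, which is what a bigger L3 buys you, cuts the average:

```c
#include <stdio.h>

int main(void) {
    /* Latencies from the table above (ns). */
    double t_l1 = 0.9, t_l2 = 2.8, t_l3 = 12.9, t_ram = 100.0;
    /* Assumed illustrative miss rates for L1 and L2. */
    double m_l1 = 0.10, m_l2 = 0.50;

    /* AMAT = t_L1 + m_L1 * (t_L2 + m_L2 * (t_L3 + m_L3 * t_RAM)) */
    for (double m_l3 = 0.50; m_l3 >= 0.099; m_l3 -= 0.20) {
        double amat = t_l1 + m_l1 * (t_l2 + m_l2 * (t_l3 + m_l3 * t_ram));
        printf("L3 miss rate %.0f%% -> AMAT %.2f ns\n", m_l3 * 100, amat);
    }
    return 0;
}
```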
Temperature: a more serious problem than latency for PC performance?
Logically, temperature is a determining factor in the performance of any chip. The problem is that it is inherent to the technology: any chip under voltage will run at a higher or lower temperature simply by operating. The more complex the chip, the more cores and units it packs and the higher its frequency, the more voltage it needs, and so the hotter it runs.
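That chain from frequency to voltage to heat follows from the dynamic power of CMOS logic, which scales roughly as C·V²·f. A minimal sketch with assumed example voltage/frequency points (not any real chip's V/f curve) shows how power, and therefore heat, grows much faster than clock speed:

```c
#include <stdio.h>

int main(void) {
    /* Assumed example operating points: higher clocks need more voltage. */
    struct { double ghz, volts; } pts[] = {
        { 3.0, 0.90 }, { 4.0, 1.05 }, { 5.0, 1.25 },
    };
    /* Dynamic power ~ C * V^2 * f; capacitance C cancels in the ratio. */
    double base = pts[0].volts * pts[0].volts * pts[0].ghz;

    for (int i = 0; i < 3; i++) {
        double rel = pts[i].volts * pts[i].volts * pts[i].ghz / base;
        printf("%.1f GHz @ %.2f V -> %.2fx relative dynamic power\n",
               pts[i].ghz, pts[i].volts, rel);
    }
    return 0;
}
```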
As expected, this is a somewhat ambiguous issue: heat will always be a limiting factor, but at the same time it is not something technology can eliminate, and we simply have to live with it, as we have since the first chip was created.
In short, latency is the key factor that limits, and will keep limiting, the performance of a PC's various components more than any other, because higher overall speed is worthless if access times and the transfer of information between components keep falling behind.