There are many explanations for this curious disparity, but the main one, and the basis of them all, is precisely the purpose of each type of hardware and the way software approaches it. Seen from this perspective, the range of explanations widens and calls for a more in-depth look, since it runs from the lithographic process all the way to software developers …
GPUs are always one step ahead of processors – here’s why
The first reason is, logically, what processors and graphics cards are used for. As we well know, a CPU is an extremely complex component, the heart of the system, but when we talk about workloads where we want a performance gain in a single thread, and with it in IPC, we must keep in mind that a key limiting factor is precisely the frequency.
And with that comes the limitation of the node in use. Architectural improvements, a better front-end and a much more streamlined back-end, along with faster access to caches and registers, generally increase performance, but we cannot forget the parallelism that these modern processors need.
If we add all of the above together, we have a bottleneck imposed first and foremost by the lithographic process. Packing more transistors per mm² is ideal if you want to include more cores and thus raise overall performance, but at the thread level we have to drive a single thread at the highest frequency possible. We are currently at around 5 GHz with Intel, so if on top of this limitation we apply Amdahl's Law (the serial part of a workload limits how much it can be accelerated, and the more complex it gets the harder this becomes, even if the rest is parallelized), we face a difficulty that can grow steeply in certain tasks.
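For reference, Amdahl's Law can be written out explicitly. In the form below (our own notation, not from the original text), $p$ is the fraction of the workload that can be parallelized and $N$ is the number of cores:

$$S(N) = \frac{1}{(1 - p) + \frac{p}{N}}$$

So even a workload that is 95% parallel tops out at a speedup of $1/(1-p) = 20$ no matter how many cores you throw at it, and with 16 cores it only reaches about 9.1x.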
Another point to discuss is, of course, the execution units and instructions added to a processor, where we can optimize and gain performance in more or less complex ways, but these are usually direct improvements to a single thread. Of course, a CPU also works in parallel thanks to technologies such as speculative or out-of-order execution, not to mention the growing number of available cores, the caches and RAM access, or technologies such as HT or SMT.
At the end of the day, all of these technologies are trying to do one very simple thing: keep every CPU core and thread busy for as long as possible, in the best possible order for each task, so that there are no delays between the data. Why is this the case, and how is it different on GPUs?
Superscalar execution and parallelization, key to the differences between CPU and GPU
The CPU has to perform many different tasks, both simple and complex, but it also has to interconnect with every component of the PC, which means receiving information and transmitting it over different buses at the highest possible speed. A GPU, on the other hand, has a different and much simpler way of working.
This exchange of information, in terms of the way work is done, is called a context switch, and here the GPU has a big advantage: by its very nature, the work it has to do requires very few context switches, because it is extremely parallelizable and the loads are generally homogeneous.
Developers operate differently here, since a GPU has as many cores as the shaders integrated into its silicon, so parallelization is extremely easy: current designs can integrate up to 6,912 shaders.
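To make this concrete, here is a minimal CUDA sketch of the kind of homogeneous, embarrassingly parallel load a GPU is built for; the kernel name, array size and launch geometry are illustrative assumptions, not something from the article. Each shader core runs the same tiny operation on its own element, so there is essentially nothing to context-switch:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// One thread per element: every shader core runs the same
// homogeneous operation on its own slice of the data.
__global__ void saxpy(int n, float a, const float *x, float *y) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)                        // guard against the last partial block
        y[i] = a * x[i] + y[i];
}

int main() {
    const int n = 1 << 20;            // ~1M elements, illustrative size
    float *x, *y;
    cudaMallocManaged(&x, n * sizeof(float));
    cudaMallocManaged(&y, n * sizeof(float));
    for (int i = 0; i < n; ++i) { x[i] = 1.0f; y[i] = 2.0f; }

    // Launch enough blocks that every element gets its own thread.
    int threads = 256;
    int blocks  = (n + threads - 1) / threads;
    saxpy<<<blocks, threads>>>(n, 2.0f, x, y);
    cudaDeviceSynchronize();

    printf("y[0] = %f\n", y[0]);      // expect 4.0
    cudaFree(x);
    cudaFree(y);
    return 0;
}
```

The same loop on a CPU would run serially, or across a few dozen threads at most; here the launch spawns one lightweight thread per element and the hardware schedules them across all available shaders.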
Therefore, we have a large number of cores to work with, whose performance is logically limited by the speed of the node each chip is designed on, and at the same time by the chip's efficiency. Keep in mind that with GPUs we are talking about huge dies, with power consumption that would be unthinkable for a processor.
The tradeoff is lower clock speeds due to the nature of the architecture, but the parallelization is unmatched, so it is easier to scale performance with it. Finally, we must take Dennard scaling into account, which we have discussed more than once and which treats efficiency as its main pillar: energy use is kept in proportion to the area of the chip.
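As a reminder, the dynamic-power relation behind Dennard scaling can be sketched as follows (the symbols are ours: $C$ is switched capacitance, $V$ supply voltage, $f$ clock frequency):

$$P \approx C \, V^2 f$$

In classic Dennard scaling, shrinking a transistor by a factor $k$ reduces $C$ and $V$ along with it, so power per transistor drops roughly as $1/k^2$ and power density per unit of chip area stays constant, which is exactly the "energy in proportion to area" behavior described above.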
Therefore, if you can parallelize a series of tasks, simply adding more cores to a GPU will raise performance much further; the number of transistors is also far larger, and consumption grows with it, but that heat can be dissipated. Since a GPU does not push up against the frequency limit of a node, it is not constrained in that respect but in efficiency, and since it has more headroom there than a CPU, it allows greater gains once we combine everything explained above.
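Putting the two laws together with an illustrative number (assumed for the example, not sourced from the article): if a workload is 99% parallelizable, Amdahl's Law gives

$$S(6912) = \frac{1}{0.01 + \frac{0.99}{6912}} \approx 98.6$$

so a 6,912-shader GPU still extracts a near-hundredfold speedup from it, while a CPU chasing the same gain through frequency alone would have hit the node's clock wall long before.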