For many years, Intel, AMD, NVIDIA and IBM have tried to increase the amount of work their processors can extract from the programs they run.
As we well know, CPUs contain many types of execution units, such as ALUs and FPUs, which have kept growing in number, power and capability.
This raised a serious problem at the time, which was none other than distributing instructions among those ALUs: they were sometimes left idle, contributing little work to the processor.
The distribution and efficiency of operations are important
Units called schedulers are responsible for fetching, ordering and issuing the instructions for each clock cycle. They work on the data held in the registers, which in turn pull what they need from RAM, and they then dispatch each instruction to its designated ALU or FPU.
As the number of registers and instructions per cycle grew, the ALUs were again underused because programs could not keep them fed, and whenever that was improved it became necessary to add more ALUs so that the rest of the execution resources stayed efficient.
The solution was to implement SMT, branded by Intel as Hyper-Threading, where a single core presents itself to the system as multiple logical threads, two in this case, to tackle the problem at hand. The catch is that SMT, or HT, has a clear bottleneck: when a thread stalls, it leaves the core with less work in flight than usual. So how was this new setback resolved?
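The way SMT hides a stall can be illustrated with a toy model. The sketch below is not how real hardware works, just a minimal single-issue core simulator in which each instruction is represented only by its latency in cycles; the `simulate` function and the example latencies are invented for illustration.

```python
def simulate(streams, smt=True):
    """Count cycles for a single-issue core to retire every instruction.

    Each instruction is its latency in cycles: 1 models an ALU op,
    larger values model loads that stall their thread on memory.
    The issue port is busy for one cycle per instruction, but the
    thread is busy until the instruction retires. Without SMT the core
    sees a single thread, so it simply waits out every stall.
    """
    if not smt:
        streams = [[lat for s in streams for lat in s]]
    pos = [0] * len(streams)      # next instruction index per thread
    ready = [0] * len(streams)    # cycle at which each thread can issue again
    cycle, finish = 0, 0
    while any(pos[t] < len(streams[t]) for t in range(len(streams))):
        # pick a thread that still has work and is not stalled
        runnable = [t for t in range(len(streams))
                    if pos[t] < len(streams[t]) and ready[t] <= cycle]
        if runnable:
            t = min(runnable, key=lambda t: ready[t])
            lat = streams[t][pos[t]]
            pos[t] += 1
            ready[t] = cycle + lat
            finish = max(finish, ready[t])
        cycle += 1
    return finish

two_threads = [[1, 4, 1, 1], [1, 4, 1, 1]]  # each 4 stands for a memory stall
print(simulate(two_threads, smt=False), simulate(two_threads, smt=True))
# 14 cycles back to back vs 10 cycles interleaved
```

With SMT the core issues from one thread while the other waits on memory, which is exactly the latency hiding the text describes; when both threads stall at once (cycles 4 and 5 here), the bottleneck reappears.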
At first by including more cores, but the problem soon came back, and so SIMT was developed.
SIMT, the first step toward doing many calculations at once
SIMT stands for Single Instruction, Multiple Threads, and its basic purpose is simply to reduce the overhead of the scheduler, or in other words, to raise thread throughput whenever it is not limited by latency, whether from the registers or from access to RAM or the HDD/SSD (which is why storage has become so important in the industry in recent years and why manufacturers have pushed so hard on its performance).
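The programming model this produces is the one GPU kernels use: every thread runs the same instruction stream, and only its thread id, and therefore the data it touches, changes. A minimal sketch in Python, where the `kernel` and `launch` names are made up for illustration and a plain loop stands in for the hardware's parallel scheduling:

```python
def kernel(tid, a, b, out):
    # Every thread executes this same instruction stream; only the
    # thread id (and thus the data element it touches) differs.
    out[tid] = a[tid] + b[tid]

def launch(kernel, n_threads, *args):
    # On a GPU the hardware runs these threads in parallel; the loop
    # here only stands in for that.
    for tid in range(n_threads):
        kernel(tid, *args)

a, b = [1, 2, 3, 4], [10, 20, 30, 40]
out = [0] * 4
launch(kernel, 4, a, b, out)
print(out)  # [11, 22, 33, 44]
```

One instruction stream, many threads: the scheduler no longer has to juggle a different program per execution unit, which is the overhead reduction described above.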
This came as a relief for AMD and Intel, even for IBM, and later for NVIDIA and ATI, because the load on the out-of-order scheduler (OoO) improved.
Over the years these units gained ever more capacity, to the point where the threads could no longer fill a clock cycle and, ironically, the original problem resurfaced: the execution units were only being used selectively.
This is where SIMD comes into play, since what designers wanted was neither to waste resources waiting on other resources, nor to stall for lack of them.
Parallelism came to CPUs, and above all to GPUs, thanks to SIMD
SIMD stands for Single Instruction, Multiple Data and, as its name indicates, it aims to process many pieces of data with a single instruction; or, put in terms that sound more familiar, it performs the same operation on many values at once.
It is therefore a capability built into the ALUs, one that GPUs exploit massively, which is why they are so good at AI or HPC workloads. In the CPU it is kept to a more measured balance within its own environment, but in general the SIMD a CPU uses is a means of letting a single instruction do several things at a time.
This is easy to understand from a parallelism standpoint: SIMD units take a single instruction and use it to compute many values at once. That way they can complete many operations in one cycle, speeding up the work.
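As a rough sketch, a SIMD unit can be pictured as an adder with several lanes that all fire on the same instruction. The lane width and the `simd_add` helper below are assumptions for illustration; in Python the loop runs serially, whereas the hardware computes every lane in the same cycle:

```python
LANES = 4  # vector width of our pretend SIMD unit (e.g. 4 x float32 = 128 bits)

def simd_add(a, b):
    """One 'instruction' that adds LANES values in a single step.

    On real hardware all lanes are computed simultaneously; the loop
    here only stands in for that parallel circuitry.
    """
    assert len(a) == len(b) == LANES
    return [x + y for x, y in zip(a, b)]

# Four additions completed by what the hardware counts as one instruction:
print(simd_add([1.0, 2.0, 3.0, 4.0], [0.5, 0.5, 0.5, 0.5]))
# [1.5, 2.5, 3.5, 4.5]
```

A scalar ALU would need four add instructions for the same result; the SIMD unit retires them all in one, which is the per-cycle speedup the paragraph above describes.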
In processors this is limited, because CPU workloads do not usually arrive in bulk; in graphics cards, however, the core of the scheduler is paired with SIMT and SIMD execution units, and both make sense at their different scales.
So, much as in how a modern CPU works, the SIMT scheduler only issues instructions that the SIMD units can execute, which lets engineers size those units according to performance results that indicate whether capacity needs to be added or removed.
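Putting the two together, the GPU arrangement described here can be sketched as a SIMT scheduler that walks over groups of threads and, for each group, issues one instruction that a SIMD unit applies across all lanes. The warp width of 4 and the `gpu_add` name are invented for this sketch; real GPUs use wider groups, such as 32 threads:

```python
WARP = 4  # SIMD width per issued instruction (illustrative; real GPUs use e.g. 32)

def gpu_add(a, b):
    """Element-wise add in the SIMT-over-SIMD style described above."""
    out = [0] * len(a)
    # SIMT side: the scheduler issues one instruction per group of threads
    for warp_start in range(0, len(a), WARP):
        # SIMD side: every lane of the group executes that same instruction
        for lane in range(WARP):
            i = warp_start + lane
            if i < len(a):  # lanes past the end of the data are masked off
                out[i] = a[i] + b[i]
    return out

print(gpu_add([1, 2, 3, 4, 5], [5, 4, 3, 2, 1]))  # [6, 6, 6, 6, 6]
```

The scheduler only ever issues work in SIMD-sized chunks, so widening or narrowing the SIMD units is a sizing decision that can be made from measured performance, as the text notes.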
In addition, it should be noted that the number of registers is also tied to the scheduler, to the speed and capacity of RAM, and to the load/store units. If we add to this the power budget of today's processors, we come closer to understanding how Intel and AMD manage their resources, how they optimize their IPC, and why load balancing inside the CPU is so important.