Intel runs into an AI problem: too many irons in the fire
If there’s one company with the resources, market share, and developer relationships to break NVIDIA’s dominance in AI, it’s Intel. And yet it is team green that is reaping the benefits of the rise of deep learning and of the services built on top of it. The reason is simple: these applications rely on CUDA libraries, which only run on NVIDIA hardware. NVIDIA has even launched its own CPU, Grace, based on the ARM ISA, to push Intel and AMD out of specialized data centers.
However, success, even when built on your own work, also comes down to luck, and sometimes to your rivals’ missteps. And honestly, Intel in AI is the perfect example of a house divided against itself: too many proposals that are not compatible with one another, some of them even competing for the same space.
It is best to simply list them so you can see the extent of the mess:
- AI-oriented AVX-512 instructions, such as VNNI, that run on select Xeon and Intel Core processors (see the sketch after this list).
- Still without leaving the CPU, there are the AMX units, which are Tensor-core-like units built into the processor itself.
- And speaking of Tensor-like cores, we can also find them in Intel Arc GPUs under the name XMX.
- We can also use Intel FPGAs for this.
- All this without mentioning the various chips specialized in deep learning and machine learning that the company has developed.
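To make the first bullet concrete, here is a minimal sketch of the kind of AI-oriented AVX-512 instruction involved: VNNI’s VPDPBUSD fuses 8-bit multiplies with 32-bit accumulation in a single instruction. This is an illustrative example, not Intel’s code, and it assumes a CPU and compiler flags (e.g. -mavx512f -mavx512vnni) that actually expose the instruction.

```cpp
// Illustrative sketch of an "AI" AVX-512 instruction: VPDPBUSD multiplies
// 64 pairs of 8-bit values and accumulates into 16 lanes of 32-bit sums
// in a single instruction. Requires AVX-512 VNNI hardware and compiler support.
#include <immintrin.h>
#include <cstdint>
#include <cstdio>

int main() {
    alignas(64) uint8_t a[64];   // e.g. quantized activations (unsigned 8-bit)
    alignas(64) int8_t  b[64];   // e.g. quantized weights (signed 8-bit)
    for (int i = 0; i < 64; ++i) { a[i] = 1; b[i] = 2; }

    __m512i va  = _mm512_load_si512(a);
    __m512i vb  = _mm512_load_si512(b);
    __m512i acc = _mm512_setzero_si512();

    // Each of the 16 lanes gets the sum of 4 u8*s8 products added to it.
    acc = _mm512_dpbusd_epi32(acc, va, vb);

    alignas(64) int32_t out[16];
    _mm512_store_si512(out, acc);
    printf("lane 0 = %d\n", (int)out[0]);  // 1*2 summed 4 times = 8
    return 0;
}
```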
As you can see, there are too many bets placed on solving a single problem: running AI algorithms, whether on a PC, a workstation or a server.
OneAPI as a solution to Intel’s problems
Instead of looking for a unified hardware solution, what they came up with was a universal development API, called OneAPI, which covers every type of hardware made by the company co-founded by the late Gordon Moore. In practice, they have thrown the various hardware options into a contest with one another and left it to end users to pick the winner. In other words, it’s a cruel game of musical chairs in which the different solutions proposed for AI will be dismantled one by one until only a single one is left.
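To see what OneAPI looks like in practice, here is a minimal sketch in SYCL, the C++ programming model behind Intel’s DPC++ compiler: the same kernel can be dispatched to whichever device the runtime selects, whether that is a Xeon, an Arc GPU or an FPGA. This is an illustrative example based on SYCL 2020, not an official Intel sample.

```cpp
// Minimal SYCL/DPC++ sketch of the oneAPI idea: one kernel, any device.
#include <sycl/sycl.hpp>
#include <vector>
#include <iostream>

int main() {
    sycl::queue q{sycl::default_selector_v};  // let the runtime pick a device
    std::cout << "Running on: "
              << q.get_device().get_info<sycl::info::device::name>() << "\n";

    std::vector<float> x(1024, 1.0f), y(1024, 2.0f);
    {
        sycl::buffer bx{x}, by{y};
        q.submit([&](sycl::handler& h) {
            sycl::accessor ax{bx, h, sycl::read_only};
            sycl::accessor ay{by, h, sycl::read_write};
            h.parallel_for(sycl::range<1>{1024},
                           [=](sycl::id<1> i) { ay[i] += ax[i]; });
        });
    }  // buffer destruction copies the result back into y
    std::cout << "y[0] = " << y[0] << "\n";  // 3
    return 0;
}
```

The device choice can also be forced, for instance with sycl::cpu_selector_v or sycl::gpu_selector_v, which is precisely the musical-chairs situation described above: the code stays the same while the hardware underneath competes.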
Guide to shooting yourself in the foot
It has long been clear that the base architecture of GPUs, with some modifications, makes for great AI processors. That is not only down to the compute itself but also to their use of high-bandwidth memories, and this matters: sustaining that level of compute requires truly enormous bandwidth. The big problem is that the RAM paired with conventional processors is typically optimized for latency rather than bandwidth, and bandwidth is exactly what the huge number of AI operations depends on.
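A back-of-the-envelope roofline calculation shows why. The figures below are illustrative assumptions, not measured Intel numbers, but they capture the point: a memory-bound int8 layer is capped by bandwidth long before it gets anywhere near the compute peak.

```cpp
// Rough roofline sketch of why bandwidth, not latency, caps AI throughput.
// All figures are illustrative assumptions, not vendor specifications.
#include <algorithm>
#include <cstdio>

int main() {
    // A streaming int8 dot-product layer does roughly 2 ops per byte read
    // (one multiply plus one add per weight byte): ~2 ops/byte of intensity.
    const double ops_per_byte = 2.0;
    const double peak_tops    = 100.0;   // assumed compute peak, in TOPS
    const double ddr5_gbs     = 300.0;   // assumed multi-channel DDR5, GB/s
    const double hbm_gbs      = 1000.0;  // assumed HBM stack, GB/s

    auto attainable = [&](double gbs) {
        // Attainable throughput = min(compute peak, bandwidth * intensity)
        return std::min(peak_tops, gbs * ops_per_byte / 1000.0);  // TOPS
    };
    printf("DDR5-bound: %.1f TOPS\n", attainable(ddr5_gbs));  // ~0.6 TOPS
    printf("HBM-bound:  %.1f TOPS\n", attainable(hbm_gbs));   // ~2.0 TOPS
    return 0;
}
```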
This goes so far that Intel’s Sapphire Rapids, in order to take advantage of the AMX units present in its 56 Golden Cove cores, has to resort to HBM memory. And if you are wondering why there is no Arc A790 with a higher core count, here is the reason: Intel wanted the whole server AI push to be tied to selling those processors, because that is Intel’s core business.
However, there is a reason a GPU works better for this, and not just because NVIDIA has proven it: the operations and instructions used in AI are extremely simple, which lets them fit comfortably into GPU cores that are far smaller than CPU cores. A graphics chip, even one with 56 cores, would cost far less than a processor of this size; just look at how mammoth Sapphire Rapids is.
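To see just how simple those operations are, here is a scalar sketch of the multiply-accumulate loop that almost every deep-learning layer boils down to; it is precisely because this pattern is so trivial that it can be replicated across thousands of small GPU lanes. This is purely illustrative code, not any library’s implementation.

```cpp
// The "extremely simple" operation in question: a fully-connected layer is
// just nested multiply-accumulate loops (no bias or activation shown here).
#include <vector>
#include <cstddef>

// out[i] = sum_j in[j] * w[i][j]
void dense_layer(const std::vector<float>& in,
                 const std::vector<std::vector<float>>& w,
                 std::vector<float>& out) {
    for (std::size_t i = 0; i < w.size(); ++i) {
        float acc = 0.0f;
        for (std::size_t j = 0; j < in.size(); ++j)
            acc += in[j] * w[i][j];   // multiply-accumulate: the whole trick
        out[i] = acc;
    }
}

int main() {
    std::vector<float> in(8, 1.0f), out(4);
    std::vector<std::vector<float>> w(4, std::vector<float>(8, 0.5f));
    dense_layer(in, w, out);   // each out[i] == 4.0
    return 0;
}
```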
A graphics card can be placed in any computer
Continuing the previous argument, if you want to work in any AI-related discipline on a PC, all you have to do is buy a graphics card, and this is why Intel’s artificial intelligence strategy is flawed: not everyone has the means or the resources to set up a high-caliber server or workstation. To put it in perspective, the official price of the Intel Xeon Max 9480 is around 12,000 dollars, and that is for the CPU alone; as you can imagine, these chips are not aimed at people using a normal computer.
The idea behind the fourth-generation Xeon is simple: why would you want a graphics card to run inference if your processor already has the necessary units? The problem is that another division within Intel was developing a GPU for exactly the same purpose.