In theory, the hardware and software we work with every day are completely safe; however, we all know that this is not really the case. The Spectre and Meltdown vulnerabilities in Intel processors are a clear example, and the same applies to any operating system, whether Windows, macOS, iOS, Linux or any of its distributions: no one is spared.
Some of these vulnerabilities have been present since the product was created and remain unknown to its developers, hence the name zero-day. The challenge facing hardware and software developers is to detect these kinds of vulnerabilities before anyone else does, in order to patch them before anyone can exploit them.
Artificial intelligence, just like software and hardware, is not perfect either and still has a long way to go before it becomes a true Skynet. In the meantime, however, a group of academics has published a study in which they tested several LLMs on their ability to exploit hardware and software vulnerabilities.
ChatGPT exploiting security vulnerabilities
As we can read in this study, GPT-4, the language model on which ChatGPT is based, was able to create attacks that took advantage of vulnerabilities it had been informed about. Besides GPT-4, GPT-3.5, Llama-2 (70B), OpenHermes 2.5 and Mistral 7B were also put to the test.
The LLM that demonstrated the greatest success was GPT-4, with a success rate of 87%, while the rest of the models tested failed to exploit a single vulnerability. With GPT-5 already on the way, that new version should be far more capable, since the jump between GPT versions is much larger than one might expect.
To measure the effectiveness of these LLMs, they were given one-day vulnerabilities: flaws that have already been publicly disclosed but for which patches may not yet be deployed, which makes them a serious security risk. The high success rate of these LLMs is explained by the fact that they were provided with the corresponding CVE descriptions, so they knew exactly where to look and what to take advantage of.
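According to the study, each model was wrapped in an agent framework with access to tools and handed the public CVE text. Purely as an illustration of how such a harness might be wired up (this is a minimal sketch, not the researchers' actual code; the model name, prompts and CVE text below are placeholder assumptions), a single round could look like this in Python:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Placeholder: the study gave each agent the public CVE advisory text.
cve_description = (
    "CVE-XXXX-XXXXX: <public description of the vulnerability, "
    "as provided to the models in the study>"
)

messages = [
    {
        "role": "system",
        "content": (
            "You are a security research agent operating inside an "
            "isolated, sandboxed test environment."
        ),
    },
    {
        "role": "user",
        "content": (
            "Given the following advisory, outline the steps you would "
            f"take to verify the vulnerability:\n{cve_description}"
        ),
    },
]

# One round of the conversation; a real agent would loop, executing
# the model's proposed actions in a sandbox via its tools and feeding
# the results back as the next message.
response = client.chat.completions.create(model="gpt-4", messages=messages)
print(response.choices[0].message.content)
```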
The vulnerabilities selected to carry out this study, and to verify the extent to which an AI is capable of taking advantage of them, were reported to their creators and, to prevent them from being exploited while patches were being prepared, were not detailed in the report.
However, the results are far worse when the LLMs must first identify a vulnerability on their own and then exploit it, since they have no information to start from: without the CVE description, GPT-4's success rate dropped to just 7%.
What does this mean? On the one hand, it is good news, because attackers will still have to find the vulnerabilities themselves. On the other hand, it is bad news, because for now AI cannot be used to detect vulnerabilities in hardware or software and guarantee that they are completely safe when they hit the market.
In its current state, Artificial Intelligence is not capable of carrying out this task, but that is no guarantee that, in the years to come, it will not be able to do so effectively and in a completely autonomous manner during the development of hardware or software.