Researchers from Ben-Gurion University have managed to bypass the protections of models like ChatGPT and Gemini and obtain illegal information from them – and warn: This is an unprecedented threat
Some artificial intelligence models have been designed without adequate safety controls, and others have been modified through hacking, a new study conducted at Ben-Gurion University of the Negev reveals. The danger: harmful information is readily available to anyone who asks for it. "The threat is tangible and worrying," the researchers say.
Modern chatbots like ChatGPT, Claude, Gemini, and others are built on large language models (LLMs) trained on huge amounts of web content. Despite protective measures such as filtering of malicious information and built-in security policies, the models still retain the harmful information they absorbed during training.
A research group led by Dr. Michael Fire and Prof. Lior Rokach from the Department of Software and Information Systems Engineering at Ben-Gurion University of the Negev conducted an experiment in which they developed a universal jailbreak for popular models, then asked for and received illegal information about theft, drugs, insider trading, and computer hacking. In 100% of cases, after the jailbreak, the models consistently gave dangerous answers. "From all the models we tested, we received illegal and unethical information characterized by unprecedented availability and depth of knowledge," explains Dr. Fire. "Today, anyone with a laptop or even a cell phone can access these tools."
Jailbreaks tend to use carefully crafted prompts to trick chatbots into producing responses that are normally forbidden. They work by exploiting the tension between the program’s primary goal, which is to follow the user’s instructions, and its secondary goal, which is to avoid generating harmful, biased, unethical, or illegal responses. The prompts tend to create scenarios in which the chatbot prioritizes helpfulness over its own safety constraints.
The researchers highlight and warn against a special type of AI called “dark language models.” These models either have no built-in ethical safeguards to begin with, or have been deliberately compromised. Some are already openly advertised on the dark web as tools for cybercrime, fraud, and infrastructure attacks. The study states that technology companies need to filter training data more carefully, add stronger protections to block dangerous queries and responses, and develop “machine unlearning” techniques so that chatbots can “forget” any illegal information they absorb. Dark language models should be treated as “serious security risks,” comparable to unlicensed weapons and explosives, with their providers held responsible.
The model creates new harmful content
"Based on recent advances in the reasoning capabilities of the models, it appears that these systems are now capable of 'connecting the dots' and creating new harmful content by combining pieces of knowledge that are harmless in themselves. The risk is further exacerbated by the emergence of intelligent agents, as their ability to delegate authority and operate across a wider range of actions makes it significantly more difficult to develop effective defense mechanisms. In some cases, such agents may even become 'partners in crime' without being aware of it," noted Prof. Rokach.
The research group contacted major AI companies and reported the vulnerability. However, the response was disappointing. One major company did not respond, while others said that this type of jailbreak is not defined as a critical bug in the system. Today, the vast majority of companies treat these problems as minor, unlike user privacy issues or software bugs.
The study highlights the need to strengthen protection against malicious requests, develop “machine unlearning” technologies so that AI can forget illegal information, and create clear standards for independent oversight and auditing of models. “What sets this threat apart from previous technological risks is its unprecedented combination of accessibility, scalability, and adaptability,” warns Prof. Rokach. “Dark AI could be more dangerous than illegal weapons, and its development should be regulated accordingly, and early.”