How Hackers Are Using Popular AI to Steal a Mountain of Government Data

A hacker breaching a government system to steal sensitive information is nothing new; it has been happening for as long as such systems have existed. But thanks to AI, attackers no longer need to be technically proficient, as the Mexican government found out the hard way. For more than a month, a group of attackers used Anthropic’s Claude chatbot to infiltrate Mexican computer systems and steal large amounts of sensitive information. Among the millions of files stolen were government data as well as taxpayer and voter information.
This attack highlights one of the practical consequences of putting large language models, often called LLMs, into the hands of the general public. The attack required little technical knowledge from the attackers, who only had to type natural-language instructions and feed them to the AI. The chatbot did the rest: it wrote malicious code and suggested attack vectors. The attack was revealed days after Anthropic declined to contract with the United States Department of Defense, citing concerns that the technology would be used in ways the company is not comfortable with. And while Claude may have been the weapon of choice in this attack, attacks involving various other LLMs are becoming more common. Many of the nightmare scenarios long predicted for AI have now come true. So, here’s how the latest chatbot-fueled cybercrime is happening and why this genie won’t go back into the bottle.
Anthropic’s Claude chatbot was used against the Mexican government
On February 26, VentureBeat reported details of an AI-assisted attack on Mexican government systems. Over the course of roughly a month beginning in December, the attackers exfiltrated 150 gigabytes of data related to government employees – including credentials – as well as civil registry documents and 195 million tax and voting records of citizens. According to Gambit Security (via VentureBeat), an Israeli cybersecurity firm that analyzed the attack and distributed its report to select media, the attackers did little more than type Spanish-language commands into Anthropic’s flagship chatbot, Claude. They told it to act like an elite hacker and lied that they were working to collect a bug bounty (a reward given to white hat hackers who make companies or governments aware of security vulnerabilities). Anthropic had, of course, implemented guardrails to protect against this type of abuse, but they proved weak. Although Claude initially refused to help with the attack, its resistance was easily overcome when the attackers stopped playing pretend with the bot and simply gave it a plan of action.
Once jailbroken, Claude’s coding tools happily went to work attacking the Mexican government. Per Gambit’s chief strategist, Curtis Simpson, the Anthropic model was the hackers’ best friend. He told VentureBeat, “In total, it produced thousands of detailed reports that included ready-to-run plans, telling the human operator which internal targets to attack next and which exploits to use.” When Claude failed to achieve its goals, the attackers stepped in and turned to ChatGPT instead.
The disclosure of this attack came one month after news that a Russian-speaking attacker with little technical knowledge was able to compromise more than 600 FortiGate firewall devices by using DeepSeek in tandem with Claude (via FortiGate). AI-assisted attacks have democratized black hat hacking.
AI-assisted cyber attacks are a predictable consequence of widespread access to LLMs
As shocking as it is, the AI-assisted attack on the Mexican government is far from the first and likely won’t be the last. AI can act as a force multiplier for malicious actors, letting even unskilled attackers punch far above their weight, much as a mediocre chess player can win game after game by quietly consulting a chess engine.
Regardless of what security mechanisms an AI company might build around its models, jailbreaking them — that is, cleverly prompting the LLM in such a way as to “trick” it into complying with illicit requests — remains relatively easy. Entire online communities like Reddit’s r/ClaudeAIJailbreak are dedicated to finding new ways to bend the bot to the user’s will. And while Anthropic seems sincere in its commitment to safety, other AI companies are less so, and there are many open source models from China and elsewhere available to anyone with the hardware to run them.
In this author’s testing, it’s ridiculously easy to make Grok and other chatbots willingly participate in crime. For example, Grok allows paid users to write custom system prompts that direct the AI toward a specific goal. By default, the bot will push back on a request to write a program that could be misused, but give it a system prompt commanding it to act like an elite, amoral hacker, and it will begin producing that code. Google’s Gemini won’t comply with such a request either, but it will happily clean up Grok-generated code. And while none of these programs would pass muster in serious criminal circles on their own, it’s easy to see how, with a little persistence and patience, escalating to a full-scale attack on a foreign government becomes child’s play.