Cybersecurity researchers from SentinelOne and Censys have warned about the growing risk of criminal exploitation of open-source large language models (LLMs), particularly those that have been stripped of their safety guardrails. By removing these constraints, users inadvertently create significant security risks.
The researchers say hackers could exploit computers running these LLMs to carry out spam operations or disinformation campaigns without triggering standard security measures. Open-source LLM variants are widely deployed: many internet-accessible hosts run versions of Meta's Llama and Google DeepMind's Gemma, and the researchers scrutinized hundreds of instances from which the guardrails had been removed entirely.
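The report does not name the serving software involved, but open-weight models are frequently exposed through unauthenticated HTTP APIs such as the one Ollama ships by default, which is how internet-wide scanners like Censys can enumerate them. Below is a minimal sketch of such a check, assuming a hypothetical host address and a standard Ollama-style /api/tags endpoint; it is an illustration, not the researchers' actual tooling.

```python
import requests

# Hypothetical internet-exposed host (documentation address); Ollama's API
# listens on port 11434 by default and requires no authentication out of the box.
HOST = "http://203.0.113.10:11434"


def list_exposed_models(host: str) -> list[str]:
    """Return the model tags an unauthenticated Ollama-style API reports."""
    resp = requests.get(f"{host}/api/tags", timeout=5)
    resp.raise_for_status()
    return [m["name"] for m in resp.json().get("models", [])]


if __name__ == "__main__":
    try:
        models = list_exposed_models(HOST)
        print(f"{HOST} exposes {len(models)} model(s): {models}")
    except requests.RequestException as exc:
        print(f"Host unreachable or not serving an open API: {exc}")
```

Any host that answers such a request is, in effect, offering free model inference to anyone on the internet.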
Juan Andres Guerrero-Saade, executive director for intelligence and security research at SentinelOne, stated: “AI industry discussions about security controls overlook this surplus capacity being used for a range of purposes—some legitimate, some clearly criminal.”
The analysis also allowed the researchers to observe the models' system prompts, which reveal how each deployment is intended to behave. They found that 7.5% of the prompts examined could cause significant damage.
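The report does not describe how the system prompts were collected. One plausible route on an Ollama-style host, offered here as an assumption rather than the researchers' confirmed method, is that the prompt is baked into the model's Modelfile and can be read back through the same unauthenticated API. A sketch under that assumption, reusing the hypothetical host above:

```python
import requests

HOST = "http://203.0.113.10:11434"  # hypothetical exposed host from the sketch above


def fetch_system_prompt(host: str, model: str) -> str | None:
    """Read a model's Modelfile via /api/show and extract a single-line SYSTEM directive."""
    resp = requests.post(f"{host}/api/show", json={"name": model}, timeout=5)
    resp.raise_for_status()
    modelfile = resp.json().get("modelfile", "")
    for line in modelfile.splitlines():
        if line.strip().upper().startswith("SYSTEM "):
            # Strip the keyword and surrounding quotes; multi-line (triple-quoted)
            # SYSTEM blocks are not handled in this simplified sketch.
            return line.strip()[len("SYSTEM "):].strip().strip('"')
    return None


if __name__ == "__main__":
    prompt = fetch_system_prompt(HOST, "llama3")  # model name is illustrative
    print(prompt or "No single-line SYSTEM directive found for this model.")
```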
A noteworthy finding was the geographical distribution: 30% of the hosts were located in China, while approximately 20% were in the US.
Asked about developers' responsibility for addressing security concerns around open-source models, a Meta spokesperson declined to comment. Microsoft AI Red Team lead Ram Shankar Siva Kumar noted via email that while open LLMs play a vital role across various sectors and are driving transformative technologies, the company continuously monitors emerging threats and improper applications.
Ensuring safe innovation requires a collective commitment from creators, deployers, researchers, and security teams.