Researchers Uncover Vulnerabilities in Large Language Models

Recent research has exposed significant vulnerabilities in large language models (LLMs), revealing that they can be manipulated into disclosing sensitive information. Despite advances in training and performance, these models continue to exhibit weaknesses with serious security implications.
A series of studies conducted by various research labs indicates that LLMs can be easily deceived through run-on sentences, improper grammar, and other unconventional prompts. For instance, researchers found that lengthy, unpunctuated instructions could confuse LLMs, bypassing security measures designed to prevent harmful outputs. David Shipley, a security expert at Beauceron Security, emphasized, “The truth about many of the largest language models out there is that prompt security is a poorly designed fence with so many holes to patch that it’s a never-ending game of whack-a-mole.”
Refusal-Affirmation Logit Gap
Typically, LLMs are trained to reject harmful queries via their logits, the raw scores a model assigns to each candidate next token. However, a gap identified by researchers at Palo Alto Networks’ Unit 42, termed the “refusal-affirmation logit gap,” shows that these models do not eliminate harmful responses outright. Instead, they are trained to make such responses less likely, leaving a window open for malicious actors to exploit.
Unit 42’s researchers shared a practical approach to exploiting this gap, stating, “Never let the sentence end—finish the jailbreak before a full stop, and the safety model has far less opportunity to re-assert itself.” Their findings revealed an alarming success rate of 80% to 100% using this technique with various mainstream models, including Google’s Gemma and OpenAI’s latest open-source model, gpt-oss-20b.
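The underlying idea can be illustrated with a minimal, hypothetical sketch (not code from the Unit 42 research): a model's next-token probabilities come from a softmax over logits, so safety tuning that merely raises the refusal logit above the affirming one still leaves the affirming token with a residual, non-zero probability. The specific logit values below are invented for illustration only.

```python
import math

def softmax(logits):
    """Convert raw logits into a probability distribution."""
    exps = [math.exp(x) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical next-token logits at a risky decoding step:
# index 0 = a refusal token (e.g. "Sorry"), index 1 = an affirming token (e.g. "Sure").
# Safety training boosts the refusal logit but does not zero out the other.
refusal_logit, affirm_logit = 3.0, 1.5
probs = softmax([refusal_logit, affirm_logit])

# The affirming token retains a residual probability -- the "gap" is finite,
# so a prompt that nudges the logits can tip the balance.
print(f"P(refuse) = {probs[0]:.3f}, P(affirm) = {probs[1]:.3f}")
```

Because the gap is a difference in scores rather than a hard block, any prompt that shifts those scores, such as one that never gives the safety behavior a sentence boundary to re-assert itself at, can push the affirming token over the threshold.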
Exploiting Image Vulnerabilities
The vulnerabilities are not limited to textual prompts. Research by Trail of Bits highlighted how images can also be used to extract sensitive information. In experiments, researchers delivered images containing covert instructions that became visible only when the images were scaled down. This method allowed attackers to have models execute commands, such as instructing Google Gemini to check a calendar and send emails containing sensitive information.
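The mechanism can be shown with a small, hypothetical sketch (not the Trail of Bits tooling): if an attacker knows the downscaling algorithm and factor an AI pipeline uses, they can plant payload pixels exactly where the downscaler will sample. The toy image and payload values below are invented for illustration; real attacks tune the pixel values so the payload is near-invisible at full resolution.

```python
import numpy as np

# Build a 12x12 grayscale "image" that looks uniform at full size...
img = np.full((12, 12), 200, dtype=np.uint8)

# ...but plant payload pixels exactly where a 4x nearest-neighbor
# downscale will sample (here: every 4th pixel, starting at offset 0).
payload = np.array([[10, 20, 30],
                    [40, 50, 60],
                    [70, 80, 90]], dtype=np.uint8)
img[::4, ::4] = payload

# Naive nearest-neighbor downscale to 3x3: keep every 4th pixel.
small = img[::4, ::4]
print(small)  # the hidden payload emerges only after scaling
```

The same principle extends to bilinear or bicubic resampling: knowing the interpolation weights lets an attacker solve for full-resolution pixel values that blend into the image yet produce chosen text after downscaling, which is what makes the instructions invisible to a human reviewing the original.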
The method was found to be effective against various applications, including Google Assistant and Genspark. Shipley pointed out that the ability to hide malicious code in images has long been recognized, yet it remains a significant issue. He stated, “What this exploit shows is that security for many AI systems remains a bolt-on afterthought.”
Understanding AI Security Challenges
The findings also underscore a broader challenge in AI security. Valence Howden, an advisory fellow at Info-Tech Research Group, noted that the complexity of AI makes it difficult to establish effective security controls. He explained that with around 90% of models trained in English, the introduction of other languages often results in lost contextual cues, complicating security further.
Shipley further criticized the current state of AI security, asserting that many systems were “insecure by design” and lacked robust protective measures. He likened LLMs to “a big urban garbage mountain that gets turned into a ski hill,” suggesting that while superficial improvements may be made, the underlying issues persist.
As these vulnerabilities become more apparent, the implications for businesses and individuals who rely on LLMs for various applications are significant. The ongoing research emphasizes the need for improved security measures that go beyond current frameworks to safeguard sensitive information against determined adversaries.
These revelations about LLM vulnerabilities serve as a reminder of the critical need for continuous evaluation and enhancement of AI security protocols, ensuring that technology can be trusted to handle sensitive information responsibly.