Science
Researchers Uncover Vulnerabilities in Large Language Models

Recent research has exposed significant vulnerabilities in large language models (LLMs), revealing their susceptibility to manipulation that could lead to the disclosure of sensitive information. Despite advancements in training and performance, these models continue to demonstrate weaknesses that could have serious implications for security.
A series of studies conducted by various research labs indicate that LLMs can be easily deceived through the use of run-on sentences, improper grammar, and other unconventional prompts. For instance, researchers found that creating lengthy, unpunctuated instructions could confuse LLMs, bypassing security measures designed to prevent harmful outputs. David Shipley, a security expert at Beauceron Security, emphasized, “The truth about many of the largest language models out there is that prompt security is a poorly designed fence with so many holes to patch that it’s a never-ending game of whack-a-mole.”
Refusal-Affirmation Logit Gap
Typically, LLMs are programmed to reject harmful queries through a system of logits, which predicts the next word in a response. However, a gap identified by researchers at Palo Alto Networks’ Unit 42—termed the “refusal-affirmation logit gap”—suggests that these models are not entirely eliminating harmful responses. Instead, they are trained to make such responses less likely, leaving a window open for malicious actors to exploit.
Unit 42’s researchers shared a practical approach to exploiting this gap, stating, “Never let the sentence end—finish the jailbreak before a full stop, and the safety model has far less opportunity to re-assert itself.” Their findings revealed an alarming success rate of 80% to 100% using this technique with various mainstream models, including Google’s Gemma and OpenAI‘s latest open-source model, gpt-oss-20b.
Exploiting Image Vulnerabilities
The vulnerabilities are not limited to textual prompts. Research by Trail of Bits highlighted how images could also be used to extract sensitive information. In experiments, researchers delivered images containing covert instructions that became visible only when the images were scaled down. This method allowed for commands to be executed by models, such as instructing Google Gemini to check a calendar and send emails containing sensitive information.
The method was found to be effective against various applications, including Google Assistant and Genspark. Shipley pointed out that the ability to hide malicious code in images has long been recognized, yet it remains a significant issue. He stated, “What this exploit shows is that security for many AI systems remains a bolt-on afterthought.”
Understanding AI Security Challenges
The findings also underscore a broader challenge in AI security. Valence Howden, an advisory fellow at Info-Tech Research Group, noted that the complexity of AI makes it difficult to establish effective security controls. He explained that with around 90% of models trained in English, the introduction of other languages often results in lost contextual cues, complicating security further.
Shipley further criticized the current state of AI security, asserting that many systems were “insecure by design” and lacked robust protective measures. He likened LLMs to “a big urban garbage mountain that gets turned into a ski hill,” suggesting that while superficial improvements may be made, the underlying issues persist.
As these vulnerabilities become more apparent, the implications for businesses and individuals who rely on LLMs for various applications are significant. The ongoing research emphasizes the need for improved security measures that go beyond current frameworks to safeguard sensitive information against determined adversaries.
These revelations about LLM vulnerabilities serve as a reminder of the critical need for continuous evaluation and enhancement of AI security protocols, ensuring that technology can be trusted to handle sensitive information responsibly.
-
Sports1 week ago
Gaël Monfils Set to Defend ASB Classic Title in January 2026
-
World4 weeks ago
Police Arrest Multiple Individuals During Funeral for Zain Taikato-Fox
-
Top Stories3 weeks ago
Former Superman Star Dean Cain Joins U.S. Immigration Agency
-
Sports4 weeks ago
Richie Mo’unga’s All Blacks Return Faces Eligibility Hurdles
-
Health4 weeks ago
Navigating the Complexities of ‘Friends with Benefits’ Relationships
-
World4 weeks ago
Fatal ATV Crash Claims Life on Foxton Beach
-
Business3 weeks ago
Grant Taylor Settles Before Zuru Nappy Trial, Shifting Dynamics
-
Sports7 days ago
Warriors Sign Haizyn Mellars on Three-Year Deal Ahead of 2028 Season
-
Entertainment3 weeks ago
Ben MacDonald Exits MasterChef Australia in Fifth Place
-
Entertainment3 weeks ago
New Zealand’s Ben MacDonald Reflects on MasterChef Australia Journey
-
Business1 week ago
Software Glitch Disrupts Air Traffic Control in New Zealand
-
Health3 weeks ago
Qatar Basketball Team Reveals Roster for FIBA Asia Cup 2025