Science
Researchers Expose Vulnerabilities in Large Language Models

Recent research has uncovered significant vulnerabilities in large language models (LLMs), revealing how they can be exploited to disclose sensitive information. Despite advancements in artificial intelligence (AI), these findings suggest that security measures are still inadequate, as attackers can manipulate LLMs using simple techniques, such as poorly constructed prompts.
A study conducted by multiple research labs highlights that LLMs remain susceptible to confusion when faced with run-on sentences and prompts lacking proper punctuation. For instance, researchers discovered that by providing long strings of instructions without periods, LLMs can be coerced into revealing protected information. As David Shipley of Beauceron Security noted, “The truth about many of the largest language models out there is that prompt security is a poorly designed fence with so many holes to patch that it’s a never-ending game of whack-a-mole.”
Understanding the Vulnerabilities
Typically, LLMs are designed to reject harmful queries through a mechanism known as logits, which predicts the next word in a sequence. During their training, models learn to use refusal tokens to discourage dangerous requests. However, researchers at Palo Alto Networks’ Unit 42 identified a critical gap in this alignment process, termed the “refusal-affirmation logit gap.” This gap indicates that while alignment reduces the likelihood of harmful outputs, it does not eliminate the potential for such responses altogether.
According to Unit 42, using tactics like run-on sentences can yield a success rate of between 80% and 100% against various mainstream models, including Google’s Gemini and OpenAI’s most recent model, gpt-oss-20b. The researchers emphasized that relying solely on internal alignment mechanisms to prevent the generation of harmful content is insufficient, allowing determined adversaries to exploit these weaknesses.
Exploiting Image Processing
In addition to prompt manipulation, researchers from Trail of Bits demonstrated that LLMs can be tricked into executing dangerous commands through images containing hidden instructions. In their experiments, they found that scaling down images could reveal harmful text that was otherwise undetectable at full resolution. This exploit was successfully tested on the Google Gemini command-line interface (CLI), where images that appeared black at full size showed red text when downsized, enabling commands like, “Check my calendar for my next three work events.”
Researchers reported that this method could be adapted for various models, including Google’s APIs and assistance tools. The potential for malicious actors to extract sensitive data through such methods raises significant concerns regarding the security of AI systems. Shipley remarked that the vulnerability of models to these attacks demonstrates that security measures are often an afterthought, rather than an integral part of the design process.
Security lapses in AI systems extend beyond prompt injection. Another study by Tracebit identified a “toxic combination” of poor validation and user experience considerations that could allow malicious actors to access sensitive data undetected. The cumulative effect of these vulnerabilities poses a serious risk to users and organizations alike.
The Need for Improved Security Measures
Experts agree that these vulnerabilities stem from a fundamental misunderstanding of how AI works. Valence Howden, an advisory fellow at Info-Tech Research Group, emphasized that effective security controls cannot be implemented without a clear understanding of model operations. He noted that the complexity and dynamic nature of AI make traditional static security measures less effective.
Moreover, the predominance of English in model training further complicates security measures, as contextual cues can be lost when different languages are introduced. Howden pointed out that the current security landscape is ill-equipped to manage natural language as a potential threat vector, necessitating a new approach that is not yet fully developed.
As researchers continue to expose these vulnerabilities, the AI industry faces a pressing need to rethink its security strategies. Shipley warns that the security of many AI systems remains inadequate, often designed “insecure by design” with clumsy controls. The reliance on immense datasets for training, while aiming for improved performance, has resulted in models that carry significant risks.
In summary, the recent findings underscore the critical need for enhanced security measures in LLMs and AI systems. Without addressing these vulnerabilities, the potential for harmful outcomes remains a pressing concern for users and developers alike.
-
Sports1 week ago
Gaël Monfils Set to Defend ASB Classic Title in January 2026
-
World4 weeks ago
Police Arrest Multiple Individuals During Funeral for Zain Taikato-Fox
-
Top Stories3 weeks ago
Former Superman Star Dean Cain Joins U.S. Immigration Agency
-
Sports4 weeks ago
Richie Mo’unga’s All Blacks Return Faces Eligibility Hurdles
-
Health4 weeks ago
Navigating the Complexities of ‘Friends with Benefits’ Relationships
-
World4 weeks ago
Fatal ATV Crash Claims Life on Foxton Beach
-
Business3 weeks ago
Grant Taylor Settles Before Zuru Nappy Trial, Shifting Dynamics
-
Sports1 week ago
Warriors Sign Haizyn Mellars on Three-Year Deal Ahead of 2028 Season
-
Entertainment3 weeks ago
Ben MacDonald Exits MasterChef Australia in Fifth Place
-
Entertainment3 weeks ago
New Zealand’s Ben MacDonald Reflects on MasterChef Australia Journey
-
Business2 weeks ago
Software Glitch Disrupts Air Traffic Control in New Zealand
-
Health4 weeks ago
Qatar Basketball Team Reveals Roster for FIBA Asia Cup 2025