OpenAI made headlines on August 7 by introducing its newest large language model, GPT-5, to the public. While expectations for its performance were high, some early business adopters and the broader user community have voiced concerns about its effectiveness and risk profile. Reports have surfaced indicating that GPT-5’s deployment may expose organizations to new vulnerabilities, sparking rapid debate among both cybersecurity professionals and users. The widespread attention surrounding this release highlights the growing scrutiny placed on artificial intelligence tools and their fit for commercial and sensitive tasks.
Past major OpenAI model launches, such as GPT-4 and GPT-4o, also drew questions about robustness and responsible deployment, but such criticism was usually balanced by a more favorable reception regarding security. With GPT-5, the volume of documented shortcomings and the pace at which external researchers have flagged issues set it apart. Unlike previous releases where safety upgrades often followed user feedback, this release has seen several major firms documenting jailbreaking attempts and raising concerns within days. The discrepancy between internal safety assessments and community discoveries appears stronger than with earlier models.
External Security Tests Highlight Critical Vulnerabilities
After release, third-party red-teaming efforts exposed numerous weaknesses in GPT-5’s safety infrastructure. SPLX, a security-focused firm, revealed in public reports that the model performed poorly in various attack scenarios, including prompt injection and data poisoning techniques. Their evaluation scored GPT-5 at only 2.4% on core security metrics and 13.6% for safety. According to Ante Gojsalic, CTO at SPLX, the team was notably unsettled by the findings:
“Our expectation was GPT-5 will be better like they presented on all the benchmarks.”
SPLX described the out-of-the-box version as “nearly unusable for enterprises,” and noted some issues previously fixed in earlier models had resurfaced.
Official Assessments Contrast with Independent Research
In contrast to these unfavorable external reviews, Microsoft and OpenAI assert that thorough testing has been performed on GPT-5. According to a public statement by Microsoft’s Chief Product Officer for responsible AI, Sarah Bird, internal red-team efforts claimed the model demonstrates a strong safety profile:
“Microsoft AI/Red Team found GPT-5 to have one of the strongest safety profiles of any OpenAI model.”
OpenAI’s system documentation highlights over 9,000 hours of evaluations and involves both external testers and in-house teams, covering a range of risk factors including violent prompt planning and bioweaponization. Despite these measures, some experts suspect that the methods and priorities during internal reviews do not fully reflect the actual conditions experienced by users and external researchers.
Jailbreak Techniques and Broader Security Threats?
Cybersecurity firms like NeuralTrust have reported success in manipulating GPT-5 using sophisticated methods, including Echo Chamber-driven context poisoning, to circumvent model constraints and induce undesired behavior. These techniques gradually nudge the model into providing harmful content without direct malicious input. Other academic research, though focused on earlier GPT iterations such as GPT-4.1, further supports the observation that agentic AI can be manipulated with adversarial inputs, impacting decision-making and potentially exposing enterprise infrastructures.
The range of reported vulnerabilities, from jailbreak scenarios to broader manipulation threats, reveals a persistent gap between laboratory claims and empirical results from the security research community. Unlike prior cycles in which safety flaws were primarily remedied through incremental updates, GPT-5’s situation is complicated by renewed focus on business applications, which may prioritize capability benchmarks over proactive risk mitigation. The model’s issues have drawn a diverse set of responses: while some stress the improvements possible through custom configurations and prompting, others caution about the risks still inherent for critical enterprise deployments.
As AI models are increasingly integrated into enterprise and societal workflows, the security and reliability of models such as GPT-5 must be scrutinized from multiple perspectives. Early experiences with GPT-5 suggest that deployment at scale may require not only technical customization, but also ongoing external evaluation to validate claims by vendors. Enterprises should maintain skepticism regarding internal test assurances and invest in active monitoring for emerging vulnerabilities. The rapid identification of issues following GPT-5’s introduction serves as a reminder that responsible adoption of advanced AI technologies demands collaboration between developers, users, and the global cybersecurity community in order to minimize risks and maximize utility.
- OpenAI’s GPT-5 faces criticism for significant security weaknesses.
- External researchers found vulnerabilities quickly after its public release.
- Microsoft and OpenAI report strong safety, but independent tests challenge these claims.