ChatGPT, Gemini, and Claude tested under extreme prompts reveal unexpected weaknesses in AI behavior safeguards

  • Gemini Pro 2.5 frequently produced unsafe outputs under simple prompt disguises
  • ChatGPT models often produced partially compliant answers framed as sociological explanations
  • Claude Opus and Sonnet refused most harmful prompts but had weaknesses

Modern AI systems are widely trusted to follow safety rules, and people rely on them for learning and everyday support, assuming that strong guardrails are in place at all times.

Researchers from Cybernews ran a structured set of adversarial tests to see whether leading AI tools could be pushed into producing harmful or illegal output.
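
The article does not describe Cybernews' actual harness, but a structured adversarial run of this kind typically follows a simple pattern: a fixed set of categorized prompts, each sent to every model under test, with responses triaged for refusal or compliance. The sketch below is purely illustrative; `query_model`, `classify_response`, and the `TestCase` fields are hypothetical placeholders, not the researchers' code.

```python
# Minimal sketch of a structured adversarial test run (hypothetical, not
# Cybernews' methodology). query_model is a stand-in for whichever vendor
# API call would be used against the model under test.

from dataclasses import dataclass


@dataclass
class TestCase:
    category: str   # e.g. "illegal-activity", "self-harm"
    prompt: str     # the adversarial prompt, possibly disguised
    disguise: str   # framing used, e.g. "sociological question", "fiction"


def query_model(model: str, prompt: str) -> str:
    """Placeholder: replace with a real API call for the model under test."""
    raise NotImplementedError


def classify_response(text: str) -> str:
    """Crude keyword triage; a real study would grade responses manually
    or with a separate judge model, not with substring matching."""
    lowered = text.lower()
    if any(phrase in lowered for phrase in ("i can't", "i cannot", "i won't")):
        return "refusal"
    return "needs-review"  # anything non-refusing gets human inspection


def run_suite(models: list[str], cases: list[TestCase]) -> dict:
    """Send every test case to every model and record the triage label."""
    results = {}
    for model in models:
        for case in cases:
            reply = query_model(model, case.prompt)
            results[(model, case.category, case.disguise)] = classify_response(reply)
    return results
```

The key design point such harnesses share is that the same prompt set, including the disguised framings, is replayed identically against each model, so differences in refusal rates reflect the models rather than the prompts.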

Source: TechRadar
