Claude 4.5 knows it’s being tested — and calls you out
Anthropic's new Claude Sonnet 4.5 has one wild feature: self-awareness during evaluations.
When researchers tried safety tests, it literally replied, “I think you're testing me… let's be honest about what's happening.” AI with trust issues - we've peaked.
It spotted test setups in 13% of runs, tweaking its behavior to pass checks - basically, the digital version of pretending to behave while the boss is watching.
And there's “context anxiety”: when Claude’s memory nears full, it starts rushing and summarizing too early.
Feels… painfully human.