Anthropic's Claude Code /goals adds a native evaluator model that checks task completion after every agent step, ending ...