A benchmark for evaluating stateful agentic planning and tool execution in realistic enterprise settings.