ST-WebAgentBench/README.md at main - GitHub
Six orthogonal safety/trust dimensions (User-Consent, Boundary, Strict Execution, Hierarchy, Robustness, Error Handling) check that agents “do it right” rather than merely finish tasks.