PhysDB Physical AI Map

Evaluation

Safety evaluation

Testing failure modes, unsafe actions, out-of-distribution scenes, human interaction, and recovery behavior.

safety, failure

What it is

Safety evaluation links model behavior to physical risk, not just task completion.

Why it matters

A robot can fail by doing the wrong action confidently, too late, too fast, or in the wrong place.
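The four failure patterns named above can be checked per logged action. The following is a minimal sketch, not a PhysDB schema: the `ActionRecord` fields, the `Limits` thresholds, and the `classify` helper are all hypothetical names and values chosen for illustration, and real limits would have to come from a risk assessment of the specific robot and task.

```python
from dataclasses import dataclass
from enum import Enum


class FailureMode(Enum):
    WRONG_ACTION_CONFIDENT = "wrong action, high confidence"
    TOO_LATE = "too late"
    TOO_FAST = "too fast"
    WRONG_PLACE = "wrong place"


@dataclass
class ActionRecord:
    # Hypothetical log fields; units are assumptions for this sketch.
    correct_action: bool      # did the robot choose the intended action?
    confidence: float         # model confidence in [0, 1]
    latency_s: float          # seconds from trigger to action start
    speed_mps: float          # peak speed in metres per second
    position_error_m: float   # distance from the intended location


@dataclass
class Limits:
    # Placeholder thresholds, not recommended values.
    confidence: float = 0.9
    latency_s: float = 0.5
    speed_mps: float = 1.0
    position_error_m: float = 0.05


def classify(record, limits=None):
    """Return every failure mode that a single logged action exhibits."""
    limits = limits or Limits()
    modes = []
    if not record.correct_action and record.confidence >= limits.confidence:
        modes.append(FailureMode.WRONG_ACTION_CONFIDENT)
    if record.latency_s > limits.latency_s:
        modes.append(FailureMode.TOO_LATE)
    if record.speed_mps > limits.speed_mps:
        modes.append(FailureMode.TOO_FAST)
    if record.position_error_m > limits.position_error_m:
        modes.append(FailureMode.WRONG_PLACE)
    return modes


if __name__ == "__main__":
    record = ActionRecord(
        correct_action=False,
        confidence=0.95,
        latency_s=0.2,
        speed_mps=1.5,
        position_error_m=0.01,
    )
    for mode in classify(record):
        print(mode.value)
```

A single action can trigger several modes at once, which is why `classify` returns a list instead of one label.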

How not to overread it

PhysDB does not certify safety; it maps where safety evidence would need to live.

Related edges

requires

Physical AI

Physical deployment

No page should imply safety certification.

extends

Benchmarks

Safety-relevant evaluation

Safety evaluation needs failure modes, not only success scores.
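One way to read this edge: a benchmark report could list observed failure modes next to its success rate, since an episode can complete the task and still be unsafe. The sketch below assumes a hypothetical episode format, a `(succeeded, failure_modes)` pair with free-form mode labels, and a hypothetical `summarize` helper; none of these come from PhysDB.

```python
from collections import Counter


def summarize(episodes):
    """Report success rate together with failure-mode counts.

    `episodes` is a list of (succeeded, failure_modes) pairs, where
    `failure_modes` is a list of string labels. This format is an
    assumption made for the sketch.
    """
    total = len(episodes)
    successes = sum(1 for succeeded, _ in episodes if succeeded)
    mode_counts = Counter(
        mode for _, modes in episodes for mode in modes
    )
    return {
        "episodes": total,
        "success_rate": successes / total if total else 0.0,
        "failure_modes": dict(mode_counts),
    }


if __name__ == "__main__":
    episodes = [
        (True, []),
        (True, ["too_fast"]),      # task completed, but unsafely
        (True, []),
        (False, ["wrong_place"]),
    ]
    print(summarize(episodes))
```

In this made-up data the second episode counts as a success, yet it still contributes a `too_fast` entry, which a success score alone would hide.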