The Science of Embodied AI Safety
Toward interpretable, secure, and steerable robot foundation models.
Robot foundation models are poised to be the "brain" of truly general-purpose robots that work alongside humans. They also bring new challenges and risks. We train researchers, run challenge workshops, and build open educational infrastructure for physical AI safety.
Featured Research
Launching the Physical AI Safety Institute
Robotics is shifting from perception–planning–control pipelines to general-reasoning foundation models: the era of Physical AI. Acting on the physical world expands the AI risk surface, because failures can now cause real, physical harm. Yet the methods to interpret, align, and control these models barely exist. We're launching the Physical AI Safety Institute (PAISI) to build this important, neglected, and tractable field.
Mechanistic Interpretability for Steering Vision-Language-Action Models
Introduces the first interpretable control interface for Vision-Language-Action (VLA) models. Demonstrates that these systems can be inspected and directly steered internally, rather than deployed as opaque black-box policies.
About the Institute
PAISI is a capacity-building organization growing the community that develops techniques to interpret, align, and control robot foundation models.
Fellowship
12-week cohorts pairing researchers with mentors. Stipends, physical robots, and compute.
Workshops
Shared benchmarks at CoRL, RSS, and ICRA — turning scattered safety research into cumulative progress.
Open Course
Free, self-paced. Seven modules from classical robot safety through interpretability to deployment evaluation.
Programs
Research Fellowship
12 weeks. $10K stipend. Physical robots, compute, and structured mentorship. Each fellow works within a research stream led by a mentor on a concrete problem in physical AI safety.
Research streams
- Mechanistic interpretability: sparse autoencoders and activation patching for VLA architectures.
- Adversarial robustness: jailbreaking and prompt injection in the embodied action space.
- Runtime monitoring: detecting when a policy operates outside its competence envelope.
- Sim-to-real safety transfer: do safety behaviors survive deployment on physical hardware?
- Security: cyberattack vectors, data poisoning, and supply-chain risks for deployed robot models.
Challenge Workshop Series
Twice a year at CoRL, RSS, or ICRA, where the field's direction gets set. Each workshop ships a concrete safety challenge weeks ahead, with baselines and an evaluation harness. Finalists test on real hardware. Every event publishes a report, so results compound into benchmarks the field can build on.
Launching 2026
Open Online Course
Physical AI safety has no shared curriculum yet, so this course is our effort to build one. Seven free modules from classical robot safety through VLA interpretability, adversarial robustness, and runtime monitoring to deployment evaluation. Lectures, problem sets, and coded exercises on open-source simulators. Updated annually from fellowship and workshop findings.
Launching 2027
Team
Kaylene Stocking
Co-Founder / Science
Co-author of the first mechanistic interpretability work for robot foundation models (CoRL 2025). Research Assistant Professor at Toyota Technological Institute at Chicago. PhD in EECS from UC Berkeley; Graduate Fellow, Kavli Center for Ethics, Science, and the Public.
Bear Häon
Co-Founder / Strategy + Ops
Co-author of the first mechanistic interpretability work for robot foundation models (CoRL 2025). Master's in engineering with an EECS/ME concentration from UC Berkeley, supported as a Schmidt Futures Quad Fellow, NSF DToD Fellow, and Foresight Fellow.
Claire Tomlin
Board Member
Chair of Electrical Engineering and Computer Science at UC Berkeley. Research leader in hybrid systems, control theory, and safety verification for autonomous systems — from UAVs to air traffic protocols. MacArthur Fellow (2006); IEEE Transportation Technologies Award (2017).