The Science of Embodied AI Safety

Toward interpretable, secure, and steerable robot foundation models.

Robot foundation models are poised to be the "brain" of truly general-purpose robots that work alongside humans. They also bring new challenges and risks. We train researchers, run challenge workshops, and build open educational infrastructure for physical AI safety.

Featured Research

Blog Post

Launching the Physical AI Safety Institute

Kaylene Stocking, Bear Häon

Robotics is shifting from perception–planning–control pipelines to general reasoning foundation models — the era of Physical AI. Acting on the physical world expands the AI risk surface — failures can now cause real, physical harm. Yet the methods to interpret, align, and control these models barely exist. We're launching PAISI to build this important, neglected, and tractable field.

CoRL 2025

Mechanistic Interpretability for Steering Vision-Language-Action Models

Bear Häon*, Kaylene Stocking*, Ian Chuang, Claire Tomlin

Introduces the first interpretable control interface for Vision-Language-Action (VLA) models. Demonstrates that these systems can be inspected and steered directly through their internal representations, rather than deployed as opaque black-box policies.

Project Page → OpenReview →

Supported by NSF Safe Learning-Enabled Systems.

About the Institute

PAISI is a capacity-building organization growing the community that develops techniques to interpret, align, and control robot foundation models.

Fellowship

12-week cohorts pairing researchers with mentors. Stipends, physical robots, and compute.

Workshops

Shared benchmarks at CoRL, RSS, and ICRA — turning scattered safety research into cumulative progress.

Open Course

Free, self-paced. Seven modules from classical robot safety through interpretability to deployment evaluation.

Programs

Research Fellowship

12 weeks. $10K stipend. Physical robots, compute, and structured mentorship. Each fellow works within a research stream led by a mentor on a concrete problem in physical AI safety.

Research streams

Mechanistic interpretability — sparse autoencoders and activation patching for VLA architectures.
Adversarial robustness — jailbreaking and prompt injection in the embodied action space.
Runtime monitoring — detecting when a policy operates outside its competence envelope.
Sim-to-real safety transfer — do safety behaviors survive deployment on physical hardware?
Security — cyberattack vectors, data poisoning, and supply-chain risks for deployed robot models.

Express Interest →

Challenge Workshop Series

Twice a year at CoRL, RSS, or ICRA, the venues where the field's direction gets set. Each workshop releases a concrete safety challenge weeks ahead, with baselines and an evaluation harness. Finalists test on real hardware. Every event publishes a report, so results compound into benchmarks the field can build on.

Launching 2026

Open Online Course

Physical AI safety has no shared curriculum yet, and this course is our effort to build one. Seven free modules run from classical robot safety through VLA interpretability, adversarial robustness, and runtime monitoring to deployment evaluation. Lectures, problem sets, and coding exercises on open-source simulators. Updated annually from fellowship and workshop findings.

Launching 2027

Team

Kaylene Stocking

Co-Founder / Science

Co-author of the first mechanistic interpretability work for robot foundation models (CoRL 2025). Research Assistant Professor at the Toyota Technological Institute at Chicago. PhD in EECS from UC Berkeley; Graduate Fellow, Kavli Center for Ethics, Science, and the Public.

Bear Häon

Co-Founder / Strategy + Ops

Co-author of the first mechanistic interpretability work for robot foundation models (CoRL 2025). Engineering master's degree with an EECS/ME concentration from UC Berkeley, supported as a Schmidt Futures Quad Fellow, NSF DToD Fellow, and Foresight Fellow.

Claire Tomlin

Board Member

Chair of Electrical Engineering and Computer Science at UC Berkeley. Research leader in hybrid systems, control theory, and safety verification for autonomous systems — from UAVs to air traffic protocols. MacArthur Fellow (2006); IEEE Transportation Technologies Award (2017).

Support Our Work

Fund fellowships, challenge workshops, and open course development.

Get in Touch