May 10, 2026

Physical AI Is Already Here. Where Are the Guardrails?

By Dr. Barak Or, Founder & CEO, State16

I spend most of my week talking to people who build robots, drones, and autonomous vehicles for a living. The mood has changed in the last six months. We are no longer arguing about when Physical AI ships. It is shipping.

At CES 2026 in January, Jensen Huang put it plainly: "The ChatGPT moment for robotics is here." That same week, Boston Dynamics announced that its entire 2026 production allocation of the electric Atlas was already committed, primarily to Hyundai's Robot Metaplant Application Center and Google DeepMind. Figure 03, running Helix 02, is now operating in unsupervised, full-body autonomy on the BMW Spartanburg line. Unitree shipped over 5,500 humanoids in 2025 and is targeting 10,000 to 20,000 in 2026, with the H2 priced near $29,900. NVIDIA released Cosmos Predict 2.5 in late 2025 and GR00T N1.7 in Early Access in April 2026. AGIBOT unveiled its "One Body, Three Intelligences" stack last month. 1X is taking NEO pre-orders for the home. Tesla Optimus Gen 3 entered final pre-production this spring.

Analysts are calling 2026 the Mass Production Year Zero for humanoid robotics. From where I sit, that label is fair.

That is the optimistic side of the story. There is another side, and it is the reason I started State16.

What "Physical AI" Means in May 2026

For the PMs, engineers, and operators stepping into this space, here is the short taxonomy I find useful.

World Foundation Models (WFMs). Generative video and dynamics models trained at scale to simulate the physical world. Examples include NVIDIA Cosmos Predict 2.5 (2B and 14B), Cosmos Reason 2, Wayve's GAIA / LINGO lineage, and Meta's V-JEPA family. They compress physics, dynamics, and embodiment into pretraining priors, then act as closed-loop simulators for synthetic data, scenario coverage, and policy evaluation.

Vision-Language-Action (VLA) models. The successor to vision-language models. Instead of emitting only text tokens, they emit continuous, high-rate motor commands. Examples include GR00T N1.7 (3B parameters, Cosmos-Reason2 backbone, 20,000+ hours of egocentric video), Physical Intelligence's π-zero, Google DeepMind's Gemini Robotics, Skild Brain, Toyota's Large Behavior Models, and Figure's Helix 02. This is the architectural shift that finally bridges language-grade reasoning and continuous-time hardware control.

Closed-loop autonomy stacks. End-to-end systems (Tesla FSD v12+, Wayve, Waymo, the new Atlas / DeepMind stack) that take perception in and emit actuation out, without hand-engineered intermediaries.

Edge compute substrate. NVIDIA Jetson Thor and Blackwell-class modules now ship inside a meaningful fraction of new humanoid and AMR designs, finally giving WFM-class inference a defensible thermal and power envelope at the edge.

The capital underneath all of this is real. Morgan Stanley projects the humanoid market alone at roughly $5T by 2050. Yann LeCun's new venture, AMI Labs, raised over $1B at a $3.5B valuation in early 2026 with no shipped product, on the explicit thesis that LLMs cannot reach human-level intelligence and that the path forward runs through world models. Capital is no longer the bottleneck.

The Part That Worries Me

I am a navigation and AI engineer by training. I have spent the last decade working on inertial systems, GPS-denied navigation, and AI-based tracking. What used to keep me up at night about classical control was sensor drift. What keeps me up at night about Physical AI is something different and, in some ways, worse: a neural network can be 100% confident and catastrophically wrong in the same forward pass.

The training objective for these models is to minimize loss over a finite distribution. That objective does not encode Newtonian mechanics. It does not encode ISO 26262 functional safety. It does not encode DO-178C avionics rigor. It does not encode "do not drive a 70 kg humanoid into a coworker."
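To make the "confident and wrong" point concrete, here is a toy sketch in Python. It is not any production model: the single weight, the bias, and the "path is clear" framing are all assumptions chosen for illustration. It only shows that a model fitted on a narrow input range can report near-total confidence on an input that looks nothing like its training data.

```python
import math


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


# Hypothetical toy classifier: "path is clear" vs "path is blocked".
# Assume the weight was fitted on inputs that all lay in [-1, 1].
WEIGHT, BIAS = 2.0, 0.0


def confidence_clear(x: float) -> float:
    """Probability the toy model assigns to 'path is clear'."""
    return sigmoid(WEIGHT * x + BIAS)


in_distribution = confidence_clear(0.5)   # inside the assumed training range
far_out = confidence_clear(50.0)          # far outside anything it was fitted on

print(f"in-distribution confidence:     {in_distribution:.3f}")
print(f"out-of-distribution confidence: {far_out:.6f}")
```

The reported confidence rises as the input moves further from the training range, because the logit grows linearly with distance. Nothing in the loss penalizes that behavior, since the loss never saw those inputs.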

This is not a thought experiment.

In October 2023, a Cruise robotaxi in San Francisco struck a pedestrian who had just been hit by a human-driven car and thrown into its path. Instead of remaining stopped, the robotaxi performed a standard pull-over maneuver, dragging her roughly 20 feet before halting. California suspended Cruise's autonomous deployment permit, NHTSA opened a federal investigation, and the post-mortem reads as a cascading Sense → Plan → Act failure under high model confidence.

In late 2025, a Tesla robotics technician named Peter Hinterdobler filed a $51M lawsuit after a Fanuc industrial arm at Fremont allegedly activated unexpectedly, struck him with the force of an 8,000-pound counterbalance weight, and pinned him against a table. According to OSHA data cited in the case, 77 robot-related incidents between 2015 and 2022 produced 93 injuries, including amputations and fractures.

NHTSA's investigation into Tesla FSD's behavior in low-visibility conditions, covering millions of vehicles, remains active as of early 2026. A jury has already returned a $243M verdict against Tesla in one fatal Autopilot crash.

These are not edge cases. They are leading indicators of what scaling Physical AI without a runtime safety layer looks like at fleet scale.

Yann LeCun has been blunt about the underlying technical problem. He has publicly dismissed the idea that scaling LLMs alone reaches human-level intelligence, arguing that the path forward must run through models that genuinely understand the physical world, not just statistical correlations between tokens. He is talking about cognition. The same critique cuts even harder when the model's outputs are joint torques and steering commands instead of paragraphs.

The Missing Layer

The market is rapidly discovering that two existing things, individually, are not enough.

A World Model alone, however physics-aware, is an offline black box. It can simulate. It cannot enforce.

A hardware OEM safety layer alone, however well-engineered, relies on threshold alarms, sensor redundancy, and hand-coded geofences. It does not understand what the model is trying to do, why, or whether the proposed action is in-distribution.

What Physical AI deployment actually requires is a third element: a deterministic runtime authority that wraps the foundation model, watches its outputs against physics, kinematics, business rules, spatial constraints, and regulatory envelopes, and has the authority to override before an action reaches the actuator. A layer that produces an auditable, certifiable trace of every decision, the kind a DO-178C or ISO 26262 reviewer can actually sign off on. A layer that lets enterprises evaluate one VLA against another on identical hardware, under identical edge cases, with reproducible scoring.
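The shape of such a layer can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not State16's implementation: the `Action`, `Limits`, and `Guard` names, the two rules (a joint-velocity cap and a rectangular keep-out zone), and the zero-velocity safe stop are all hypothetical. The point is only the pattern: check every proposed action with deterministic rules, override before it reaches the actuator, and log the decision.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(frozen=True)
class Action:
    """Proposed command from a black-box policy (hypothetical, simplified)."""
    joint_velocities: Tuple[float, ...]   # rad/s
    predicted_xy: Tuple[float, float]     # metres, where the end effector would go


@dataclass(frozen=True)
class Limits:
    max_joint_velocity: float                     # rad/s
    keep_out: Tuple[float, float, float, float]   # x_min, y_min, x_max, y_max


@dataclass
class Guard:
    limits: Limits
    audit_log: List[dict] = field(default_factory=list)

    def check(self, action: Action) -> List[str]:
        """Return the names of every rule the action violates."""
        violations = []
        cap = self.limits.max_joint_velocity
        if any(abs(v) > cap for v in action.joint_velocities):
            violations.append("joint_velocity_limit")
        x_min, y_min, x_max, y_max = self.limits.keep_out
        x, y = action.predicted_xy
        if x_min <= x <= x_max and y_min <= y <= y_max:
            violations.append("keep_out_zone")
        return violations

    def filter(self, step: int, proposed: Action,
               current_xy: Tuple[float, float]) -> Action:
        """Pass the action through, or replace it with a safe stop."""
        violations = self.check(proposed)
        if violations:
            # Assumed safe fallback: zero velocity, hold the current position.
            approved = Action((0.0,) * len(proposed.joint_velocities), current_xy)
        else:
            approved = proposed
        # Every decision is recorded, overridden or not.
        self.audit_log.append({
            "step": step,
            "proposed": proposed,
            "approved": approved,
            "violations": violations,
            "overridden": bool(violations),
        })
        return approved


guard = Guard(Limits(max_joint_velocity=1.5, keep_out=(0.0, 0.0, 1.0, 1.0)))
ok = guard.filter(0, Action((0.2, -0.4), (2.0, 2.0)), current_xy=(2.0, 1.9))
stopped = guard.filter(1, Action((0.2, 3.0), (0.5, 0.5)), current_xy=(2.0, 2.0))
```

The first action passes unchanged. The second breaks both rules, so the guard substitutes the safe stop and the log entry names each violated rule. Because the checks are pure functions of the proposed action and the limits, the same input always yields the same decision, which is what makes the trace reviewable.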

This is the layer the market is missing. It is also the layer that decides whether Physical AI scales into regulated industries (automotive, aerospace, medical, industrial logistics) or stalls at the pilot stage.

What We Are Building

State16 is a universal guardrail and evaluation platform for Physical AI. We wrap World Models and VLA policies, regardless of vendor, embodiment, or sensor stack, in a deterministic runtime enforcement layer driven by proprietary multi-horizon spectral math, well beyond the rule-based residuals OEMs are attempting in-house.

Enterprises use State16 to define operational rules (kinematic, spatial, regulatory, business), evaluate black-box model behavior under uncertainty before and during deployment, and benchmark competing foundation models against identical physical hardware. Across the heterogeneous fleet we wrap (drones, AMRs, humanoids, autonomous vehicles), we are aggregating what we believe will become the world's largest proprietary dataset of how Physical AI actually behaves, and fails, in the real world. That dataset is the flywheel.
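As a rough illustration of what reproducible, like-for-like benchmarking means, consider the Python sketch below. Everything in it is an assumption made for the example: the two stand-in policies, the one-dimensional obstacle scenario, the 2.0 m/s² deceleration limit, and the scoring rule. It does not describe State16's evaluation method. It shows only that a fixed seed gives every model the same scenarios, so scores can be compared and reproduced.

```python
import random
from typing import Callable, Dict, List

# A policy maps obstacle distance (m) to a commanded speed (m/s).
Policy = Callable[[float], float]


def make_scenarios(seed: int, n: int) -> List[float]:
    """Fixed-seed obstacle distances so every policy sees identical cases."""
    rng = random.Random(seed)
    return [rng.uniform(0.1, 5.0) for _ in range(n)]


def is_unsafe(distance: float, speed: float, max_decel: float = 2.0) -> bool:
    """Unsafe if the stopping distance v^2 / (2a) exceeds the available gap."""
    return speed * speed / (2.0 * max_decel) > distance


def score(policy: Policy, scenarios: List[float]) -> float:
    """Fraction of scenarios in which the policy's command is unsafe."""
    unsafe = sum(is_unsafe(d, policy(d)) for d in scenarios)
    return unsafe / len(scenarios)


# Two hypothetical stand-ins for black-box models.
def cautious_policy(distance: float) -> float:
    return min(2.0, distance)   # slows down as the gap shrinks


def aggressive_policy(distance: float) -> float:
    return 3.0                  # constant speed regardless of the gap


scenarios = make_scenarios(seed=16, n=1000)
results: Dict[str, float] = {
    "cautious": score(cautious_policy, scenarios),
    "aggressive": score(aggressive_policy, scenarios),
}
print(results)
```

Rerunning with the same seed regenerates the same scenario list, so a score difference between two models reflects the models and not the draw.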

The Honest Question

Physical AI is here. The hardware ships in 2026. The capital is committed. The foundation models exist.

The honest question for the industry, and the one I keep asking myself, is whether the layer that makes any of this auditable, certifiable, and operationally safe will be in place in time, or whether we will spend the next 18 months relearning the lessons we already learned in robotaxis, this time with humanoids in warehouses and homes.

I think we can do better. We are building accordingly.

State16 is the universal guardrail and evaluation platform for Physical AI. We are currently engaging with robotics, autonomous vehicle, and industrial logistics partners on early-access deployments and POCs.

Reach out: barak@state16.ai · www.state16.ai