Evolution path

The path from AI video to world models is a shift from clips to state.

This page traces how generated clips become action-conditioned worlds, persistent 3D spaces, and agent societies.

From clip to place

Stage 1

AI video made synthetic reality visible.

The first mainstream mental model was simple: type a prompt, receive a moving scene. That made synthetic environments easy to see, share, and evaluate.

The limit is that the viewer remains outside the frame. The clip can be impressive while still lacking controllable space, persistent state, or action feedback.

Stage 2

Interactive worlds changed the verb from watch to control.

Genie, Oasis, and related systems make user action part of the generated output. Movement, steering, and camera control become important signals, not just afterthoughts.

This is the bridge from AI video toward world models. The user is no longer only watching a generated scene. The user is testing whether the scene behaves like a place.

Stage 3

Persistent worlds make return visits and downstream workflows possible.

Marble, HY-World 2.0, and other 3D world systems shift attention toward editable geometry, exported assets, larger spaces, and reusable world state.

Once a world can be returned to, edited, exported, or populated with agents, it becomes a platform surface rather than a one-time generated asset.
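
The three stages can be read as three different interfaces. The toy sketch below is only an illustration under stated assumptions: every name in it (generate_clip, InteractiveWorld, PersistentWorld) is hypothetical, the "frames" are plain strings, and none of it reflects how Genie, Oasis, Marble, or HY-World 2.0 are actually built. It shows only what changes from stage to stage: first a prompt returns fixed frames, then an action shapes the next frame, then the world's state can be exported and reloaded.

```python
from __future__ import annotations

from dataclasses import dataclass, field


def generate_clip(prompt: str, num_frames: int = 4) -> list[str]:
    """Stage 1 (hypothetical): prompt in, fixed frames out. No actions, no kept state."""
    return [f"{prompt} | frame {i}" for i in range(num_frames)]


@dataclass
class InteractiveWorld:
    """Stage 2 (hypothetical): the next frame depends on the user's action."""

    prompt: str
    x: int = 0  # toy stand-in for a camera position

    def step(self, action: str) -> str:
        # The action changes internal state, and the frame reflects that state.
        if action == "left":
            self.x -= 1
        elif action == "right":
            self.x += 1
        return f"{self.prompt} | camera x={self.x}"


@dataclass
class PersistentWorld(InteractiveWorld):
    """Stage 3 (hypothetical): state can be edited, exported, and returned to."""

    objects: dict[str, int] = field(default_factory=dict)

    def place(self, name: str) -> None:
        # Editing the world: record an object at the current camera position.
        self.objects[name] = self.x

    def export(self) -> dict:
        # A plain dict stands in for an exported asset or saved world file.
        return {"prompt": self.prompt, "x": self.x, "objects": dict(self.objects)}

    @classmethod
    def load(cls, data: dict) -> PersistentWorld:
        return cls(prompt=data["prompt"], x=data["x"], objects=dict(data["objects"]))


if __name__ == "__main__":
    print(generate_clip("forest", 2))   # watch: the viewer stays outside the frame

    world = PersistentWorld("forest")
    world.step("right")                 # control: the action shapes the next frame
    world.place("tree")                 # edit: the world keeps what was added
    saved = world.export()

    later = PersistentWorld.load(saved) # return visit: the state survives
    print(later.step("left"), later.objects)
```

The design point is that the Stage 1 function has nothing to come back to, while the Stage 3 object can be saved, reloaded, and handed to other tools or agents.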