World model evolution

World model evolution from Explore to Build.

The useful story is not a list of dates. World models become practical in stages, as people gain the ability to explore possible worlds, create spaces, control them, simulate action, and build inhabited systems.

01 / Explore

Explore Possible Worlds

The first user-facing leap is that a generated scene can continue, change, and suggest what may happen next.
What changed
The goal moves beyond making a frame. The model has to keep enough structure to forecast what can happen next.
Why it matters
Surveys organize the field around two jobs: representing the world as it is now and predicting future states to support decisions.
Boundary
This stage describes a capability of world models; it is not a claim that every visual demo performs deep physical reasoning.
Interactive Game World: Genie

AI learns world rules and creates playable environments.

Realtime Playable World: Oasis

An action-conditioned world model renders an interactive Minecraft-like experience frame by frame.
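
The two jobs described in this stage, representing the world now and predicting future states, can be illustrated with a toy sketch. This is not how Genie or Oasis work internally: `WorldState`, `encode`, and `predict` are hypothetical names, and the "world" is assumed to be a single point moving at constant velocity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorldState:
    """Hypothetical compact state: position and velocity on a line."""
    position: float
    velocity: float


def encode(prev_obs: float, curr_obs: float, dt: float = 1.0) -> WorldState:
    """Job 1: represent the world now from two consecutive observations."""
    return WorldState(position=curr_obs, velocity=(curr_obs - prev_obs) / dt)


def predict(state: WorldState, steps: int, dt: float = 1.0) -> list[WorldState]:
    """Job 2: roll the state forward to forecast future states."""
    future = []
    for _ in range(steps):
        state = WorldState(state.position + state.velocity * dt, state.velocity)
        future.append(state)
    return future


state = encode(prev_obs=0.0, curr_obs=2.0)
forecast = predict(state, steps=3)
print([s.position for s in forecast])  # [4.0, 6.0, 8.0]
```

A learned world model replaces both hand-written functions with trained networks, but the split between "what is the state" and "what comes next" stays the same.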

02 / Create

Create Spaces

The output becomes a place: 360-degree horizons, spatial scenes, editable 3D worlds, and reusable geometry.
What changed
World models stop looking like only video and start producing scenes that can be entered, edited, or exported.
Why it matters
3D/4D surveys separate spatial world modeling from ordinary 2D video generation because geometry and time matter.
Boundary
Skybox still sits in the panoramic-environment lane; Marble and HY-World are stronger examples of spatial worlds.
360 Environment World: Skybox

Generate immersive world horizons from text.

Spatial World Model: Marble

AI starts generating worlds you can enter.

3D World Model: HY-World

Tencent's HY stack turns prompts and visual inputs into persistent 3D worlds.

03 / Control

Control the World

The user begins steering the world with movement, prompts, camera direction, or player-like actions.
What changed
The screen stops being a clip. It becomes a place that reacts while the next moment is still being generated.
Why it matters
Embodied and action-model surveys emphasize action-conditioned rollouts, control, and long-horizon consistency.
Boundary
Interactive demos are not automatically open-ended platforms; claims about availability and access should stay backed by sources.
Realtime Playable World: Oasis

An action-conditioned world model renders an interactive Minecraft-like experience frame by frame.

Interactive Game World: Genie

AI learns world rules and creates playable environments.

Open World Model: Happy Oyster

An AI-generated open world with wandering exploration.
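
The idea of an action-conditioned rollout, where each new frame depends on the previous state and the latest user action, can be sketched on a tiny grid. This is a minimal illustration under stated assumptions: `ACTIONS`, `step`, `render`, and `rollout` are hypothetical names, the transition rule is hand-written rather than learned, and the "frames" are text rather than pixels.

```python
from typing import Dict, List, Tuple

Position = Tuple[int, int]

# Hypothetical action set for a player-like agent on a small grid.
ACTIONS: Dict[str, Position] = {
    "up": (0, -1),
    "down": (0, 1),
    "left": (-1, 0),
    "right": (1, 0),
}


def step(pos: Position, action: str, size: int = 5) -> Position:
    """Return the next position for an action, clamped to the grid edges."""
    dx, dy = ACTIONS[action]
    x = min(max(pos[0] + dx, 0), size - 1)
    y = min(max(pos[1] + dy, 0), size - 1)
    return (x, y)


def render(pos: Position, size: int = 5) -> str:
    """Render one text 'frame'; a real system would produce pixels."""
    rows = []
    for y in range(size):
        rows.append("".join("@" if (x, y) == pos else "." for x in range(size)))
    return "\n".join(rows)


def rollout(start: Position, actions: List[str]) -> List[str]:
    """Generate frames one at a time, each conditioned on the latest action."""
    pos, frames = start, []
    for action in actions:
        pos = step(pos, action)
        frames.append(render(pos))
    return frames


frames = rollout((0, 0), ["right", "right", "down"])
print(frames[-1])
```

The loop is the point: the next frame does not exist until the action arrives, which is what separates a controllable world from a pre-rendered clip.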

04 / Simulate

Simulate Action

World models become useful when they can support action, planning, physical consistency, and simulation.
What changed
The important question shifts from visual fidelity to whether a generated rollout can guide or test behavior.
Why it matters
Physical AI and embodied-AI surveys point to metrics like task performance, physical consistency, and real-time control.
Boundary
The seven-model visual axis shows the simulation direction; robotics-specific systems remain in the news layer.
3D World Model: HY-World

Tencent's HY stack turns prompts and visual inputs into persistent 3D worlds.

Realtime Playable World: Oasis

An action-conditioned world model renders an interactive Minecraft-like experience frame by frame.

Interactive Game World: Genie

AI learns world rules and creates playable environments.
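
To make "a generated rollout can guide or test behavior" concrete, here is a minimal planning sketch: candidate action sequences are imagined inside a model, scored, and only the first action of the best plan is executed before replanning. Everything here is an assumption for illustration: `model`, `rollout_cost`, and `plan_first_action` are hypothetical names, the world is a 1D line, and the model is a hand-written stand-in for a learned one.

```python
from itertools import product
from typing import Sequence

ACTIONS = (-1, 0, 1)  # hypothetical 1D moves: left, stay, right


def model(state: int, action: int) -> int:
    """Toy stand-in for a learned world model: predicts the next state."""
    return state + action


def rollout_cost(state: int, plan: Sequence[int], goal: int) -> int:
    """Imagine the plan inside the model; sum the distance to the goal at every step."""
    cost = 0
    for action in plan:
        state = model(state, action)
        cost += abs(goal - state)
    return cost


def plan_first_action(state: int, goal: int, horizon: int = 3) -> int:
    """Score every short plan in imagination and return the first action of the best one."""
    best = min(
        product(ACTIONS, repeat=horizon),
        key=lambda plan: rollout_cost(state, plan, goal),
    )
    return best[0]


state, goal = 0, 4
trajectory = [state]
for _ in range(6):
    # In this toy, the model also plays the role of the real environment.
    state = model(state, plan_first_action(state, goal))
    trajectory.append(state)
print(trajectory)  # [0, 1, 2, 3, 4, 4, 4]
```

The cost is summed over the whole imagined rollout rather than taken only at the final step; scoring only the end state would let the planner delay moving without penalty.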

05 / Build

Build Inhabited Worlds

A world becomes a social system when many agents can form routines, roles, markets, and memory.
What changed
The world is no longer only terrain or camera motion. It becomes a substrate for collective behavior.
Why it matters
World-model surveys include social simulacra and decision environments as part of the wider evolution.
Boundary
Project Sid is a many-agent simulation lane, not a generative visual world model like Marble or Genie.
Many-Agent Civilization: Project Sid

A Minecraft-based simulation where large groups of AI agents specialize, form rules, and transmit culture.