World Models Can Render Anything. But Can They Think?

Introducing WM Bench: A Benchmark for Cognitive Intelligence in World Models

FINAL Bench Family · March 2026

The field of world models has made remarkable progress. From NVIDIA Cosmos to Meta V-JEPA 2, from DeepMind Genie 3 to Physical Intelligence π0, the pace of development is extraordinary.

Yet a question remains largely unanswered: How do we measure whether a world model actually understands what is happening, rather than just rendering it convincingly?

FID tells us a model's output looks realistic. FVD tells us its videos flow naturally. HumanML3D and BABEL tell us its motions are human-like. None of them tell us whether the model thinks.

The Gap We're Trying to Address

Consider a simple scenario: a charging beast, 3 meters away, closing fast. A world model with excellent FID scores can generate that scene beautifully. But does it know the character should sprint away, not walk? Does it respond differently when the threat is a human rather than an animal? Does it remember that the left c
