2.1 KiB
2.1 KiB
title, created_date, updated_date, aliases, tags
| title | created_date | updated_date | aliases | tags |
|---|---|---|---|---|
| General Robotics | 2024-10-24 | 2024-10-24 |
General Robotics
Hardware
Software
AI and Machine Learning
World Model
https://www.1x.tech/discover/1x-world-model https://worldmodels.github.io/ https://techterrain.substack.com/p/world-models-vs-kalman-filter?r=vudom&utm_campaign=post&utm_medium=web&triedRedirect=true
[!Quote]- World Model Definition by Yann LeCunn Lots of confusion about what a world model is. Here is my definition:
Given:
- an observation x(t)
- a previous estimate of the state of the world s(t)
- an action proposal a(t)
- a latent variable proposal z(t)
A world model computes:
- representation: h(t) = Enc(x(t))
- prediction: s(t+1) = Pred( h(t), s(t), z(t), a(t) )
Where- Enc() is an encoder (a trainable deterministic function, e.g. a neural net)
- Pred() is a hidden state predictor (also a trainable deterministic function).
- the latent variable z(t) represents the unknown information that would allow us to predict exactly what happens. It must be sampled from a distribution or or varied over a set. It parameterizes the set (or distribution) of plausible predictions.
The trick is to train the entire thing from observation triplets (x(t),a(t),x(t+1)) while preventing the Encoder from collapsing to a trivial solution on which it ignores the input.
Auto-regressive generative models (such as LLMs) are a simplified special case in which
- the Encoder is the identity function: h(t) = x(t),
- the state is a window of past inputs
- there is no action variable a(t)
- x(t) is discrete
- the Predictor computes a distribution over outcomes for x(t+1) and uses the latent z(t) to select one value from that distribution.
The equations reduce to:
s(t) = [x(t),x(t-1),...x(t-k)]
x(t+1) = Pred( s(t), z(t) )
There is no collapse issue in that case.