The agent has to decide between two actions - moving the cart left or right - so that the pole attached to it stays upright.
The state is the difference between the current screen patch and the previous one. Because this difference encodes how the scene changed between two consecutive frames, the agent can take the velocity of the pole into account from a single image.
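As a rough illustration of the frame-difference idea, here is a minimal sketch in plain Python. The `Frame` type, the `frame_difference` helper, and the toy one-row "screen patches" are all hypothetical and chosen for illustration; a real implementation would most likely operate on image tensors rather than nested lists.

```python
from typing import List

# Hypothetical representation: a grayscale patch as rows of pixel intensities.
Frame = List[List[float]]


def frame_difference(current: Frame, previous: Frame) -> Frame:
    """Return the element-wise difference current - previous."""
    same_shape = len(current) == len(previous) and all(
        len(c_row) == len(p_row) for c_row, p_row in zip(current, previous)
    )
    if not same_shape:
        raise ValueError("frames must have the same shape")
    return [
        [c - p for c, p in zip(c_row, p_row)]
        for c_row, p_row in zip(current, previous)
    ]


# Toy example: a one-pixel "pole" moves one column to the right.
previous = [[0.0, 1.0, 0.0]]
current = [[0.0, 0.0, 1.0]]

state = frame_difference(current, previous)
print(state)  # [[0.0, -1.0, 1.0]]
```

The negative entry marks where the pole was and the positive entry marks where it is now, so the direction and size of the motion are visible in one array. If nothing moves, the difference is all zeros.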