Markov Chains

kw4u19
Nov 5, 2020
Updated: Jan 20, 2021

How does a game’s AI decide which action to take? One approach uses matrix multiplication in the form of Markov chains.

You begin with a state matrix, which keeps track of the values the AI will use to make decisions.

Next, there is a transition matrix. This describes how the values in the state matrix change every time there is some in-game action.

When this happens, the state matrix is multiplied by the transition matrix, and you get a new state matrix.
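The multiplication step can be sketched in a few lines of Python. This is only an illustration: the helper name step and all of the numbers are made up for the example, and the state is assumed to be a single row of values. In a strict Markov chain each row of the transition matrix sums to 1; the sketch does not enforce that.

```python
def step(state, transition):
    """Multiply a 1 x n state (row vector) by an n x n transition matrix."""
    n = len(state)
    return [sum(state[i] * transition[i][j] for i in range(n)) for j in range(n)]

# Made-up numbers, chosen only to show the mechanics.
state = [0.25, 0.75]
transition = [
    [0.5, 0.5],  # how the first value feeds into each new value
    [0.0, 1.0],  # how the second value feeds into each new value
]

new_state = step(state, transition)
print(new_state)  # [0.125, 0.875]
```

Calling step again on new_state applies the same in-game action a second time.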

Here is an example state matrix:

S = [ 0.1, 0.1, 0.1 ]

Each number represents the safety of one location (A, B and C here). The total safety of the area is 4. The higher the number, the safer the location. An AI needs to know which location is safest so it can move to cover when the player is shooting at it.

A transition matrix has to be made for when a shot is fired into the area. A transition matrix shows how an event at one location influences the others. When the player shoots at location A, for example, that location will obviously be much less safe than the others.

A shot is fired at position A. Here is the transition matrix:

A is the active target, so it isn’t very safe at all; we give it a value of 10% safe.

The overall safety of the area has decreased, so the other locations become less safe. Let’s say the safety decreases by 20% in these other zones.

Because A is the active target, the other locations benefit and are somewhat safer, so A has an influence on them.

We multiply the beginning state matrix by this transition matrix to get the new overall safety of the area.

(STATE MATRIX) X (TRANSITION A MATRIX) = NEW STATE MATRIX
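Here is a rough sketch of that multiplication in Python. The transition matrix below is an assumption built from the description above: A keeps 10% of its safety, B and C keep 80% of theirs, and the 0.1 for A's influence on B and C is a made-up figure. The names step and TRANSITION_A are also just for this example.

```python
def step(state, transition):
    """Multiply a 1 x n state (row vector) by an n x n transition matrix."""
    n = len(state)
    return [sum(state[i] * transition[i][j] for i in range(n)) for j in range(n)]

LOCATIONS = ["A", "B", "C"]
state = [0.1, 0.1, 0.1]  # starting safety of A, B and C

# Assumed transition matrix for "a shot is fired at A".
# Row i says how location i's old value contributes to each new value.
TRANSITION_A = [
    [0.1, 0.1, 0.1],  # A keeps 10% and has a small (made-up) influence on B and C
    [0.0, 0.8, 0.0],  # B loses 20% of its safety
    [0.0, 0.0, 0.8],  # C loses 20% of its safety
]

new_state = step(state, TRANSITION_A)
print([round(v, 3) for v in new_state])  # [0.01, 0.09, 0.09]

# The AI moves to the location with the highest safety value.
# B and C tie here, and max() returns the first of them.
safest = LOCATIONS[max(range(len(LOCATIONS)), key=lambda i: new_state[i])]
print(safest)  # B
```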

MY MARKOV CHAIN

Scenario: The player is trying to escape an earthquake, and the ground is breaking beneath them. There are multiple directions they can choose to travel in, and the AI’s goal is to predict which path they will take in order to try and cut them off.

As the player chooses paths along one axis (X), the AI will cause more damage along the perpendicular axis ahead of them to make them fail.

So, as the player chooses different paths further to the left, right or centre of the track, the AI will destroy the path the player is most likely to take. This means the player would have to play very tactically and erratically.

There are 3 options: paths A, B and C.

Here is the state matrix:

S = [ 0.1, 0.1, 0.1]

Each value represents how much a path is used, and therefore where the ground will crumble most. While a value is low, the AI knows the player is not using that path often, so it won’t want to cause too much damage there and will focus on the other routes instead.

Once the player chooses a path, its number will increase, and that’s where the ground will mostly crumble to challenge the player.

Here is a transition matrix for when the player chooses path A:

This shows that when the player uses path A, the biggest change is applied to the value for that chosen path, and smaller changes are applied to the other two paths.
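To see how this could play out over a few choices, here is a small Python sketch. Everything in it is an assumption for illustration: the helper names, the boost of 1.5 for the chosen path and the decay of 0.9 for the other two are made-up values, and the usage numbers are left unnormalised.

```python
PATHS = ["A", "B", "C"]

def step(state, transition):
    """Multiply a 1 x n state (row vector) by an n x n transition matrix."""
    n = len(state)
    return [sum(state[i] * transition[i][j] for i in range(n)) for j in range(n)]

def transition_for(chosen, boost=1.5, decay=0.9):
    """Build a transition matrix for the player choosing one path.

    boost and decay are hypothetical: the chosen path's value grows,
    the other two shrink slightly.
    """
    k = PATHS.index(chosen)
    n = len(PATHS)
    return [
        [(boost if i == k else decay) if i == j else 0.0 for j in range(n)]
        for i in range(n)
    ]

usage = [0.1, 0.1, 0.1]  # starting state matrix for paths A, B and C

# The player picks A twice, then B.
for choice in ["A", "A", "B"]:
    usage = step(usage, transition_for(choice))

# The AI damages the path with the highest usage value.
target = PATHS[max(range(len(PATHS)), key=lambda i: usage[i])]
print(target)  # A
```

With these made-up numbers, path A still has the highest value after the player switches to B, so a player would need to change paths often to keep the AI guessing.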
