Markov Chains: A Practitioner's Refresher

Transition matrices, n-step probabilities, and stationary distributions — the mathematical scaffolding behind regime-switching models, credit migration matrices, and any system where the next state depends only on the current one.

Markov Chains appear throughout quantitative finance: regime-switching models for asset returns, credit rating migration matrices, Hidden Markov Models for signal detection, and any system where the probability of transitioning to the next state depends only on the current state. This post walks through the core mechanics — transition matrices, multi-step probabilities, forecasting, and stationary distributions.

Definition

A Markov Chain is a stochastic process describing a sequence of possible events in which the probability of each event depends only on the state attained in the previous event.

[Figure: a two-state Markov chain with states E and A. From state A, the probability of transitioning to E is 0.4 and of remaining in A is 0.6.]

Formally:

  • Let $s_t$ be a random variable taking values in $\{1, 2, \ldots, N\}$
  • Let $a_{i,j} = \mathbb{P}(s_t = j \mid s_{t-1} = i)$ denote the transition probability from state $i$ to state $j$
  • $s_t$ is a Markov Chain if $\mathbb{P}(s_t = j \mid s_{t-1} = i, s_{t-2} = k, \ldots) = \mathbb{P}(s_t = j \mid s_{t-1} = i) = a_{i,j}$

The state dynamics are fully specified by the transition matrix:

$$A = \begin{bmatrix} a_{1,1} & \cdots & a_{1,N} \\ \vdots & & \vdots \\ a_{i,1} & \cdots & a_{i,N} \\ \vdots & & \vdots \\ a_{N,1} & \cdots & a_{N,N} \end{bmatrix}$$

The elements of each row must sum to unity:

$$a_{i,1} + \cdots + a_{i,N} = 1$$
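A minimal sketch in NumPy: the two-state matrix below is hypothetical, with probabilities chosen only for illustration, but it shows the row-stochastic structure and the unit-row-sum check.

```python
import numpy as np

# Hypothetical 2-state transition matrix: rows index the current state,
# columns the next state, so each row is a probability distribution.
A = np.array([
    [0.9, 0.1],   # from state 1: stay with prob 0.9, move to state 2 with prob 0.1
    [0.4, 0.6],   # from state 2: move to state 1 with prob 0.4, stay with prob 0.6
])

# Every row must sum to unity.
assert np.allclose(A.sum(axis=1), 1.0)
```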

State Vectors

Let $e_i$ denote the $i$-th row vector of the $N \times N$ identity matrix. Let $\xi_t$ denote a $1 \times N$ row vector that equals $e_i$ when the state $s_t$ is equal to $i$:

$$\xi_t = (0, \ldots, 0, \underbrace{1}_{i\text{-th element}}, 0, \ldots, 0)$$

The expectation of $\xi_{t+1}$ is a vector whose $j$-th element is the probability that $s_{t+1} = j$:

$$\mathbb{E}(\xi_{t+1} \mid s_t = i) = (a_{i,1}, \ldots, a_{i,N})$$

We infer that $\mathbb{E}(\xi_{t+1} \mid s_t = i) = \xi_t \mathbf{A}$, or more generally $\mathbb{E}(\xi_{t+1} \mid \xi_t) = \xi_t \mathbf{A}$, and since $s_t$ follows a Markov Chain:

$$\mathbb{E}(\xi_{t+1} \mid \xi_1, \ldots, \xi_t) = \xi_t \mathbf{A}$$

In plain English: when $s_t = i$, $\xi_t$ selects the $i$-th row of the identity matrix. Multiplying by the transition matrix $\mathbf{A}$ extracts the corresponding row of transition probabilities. Because the chain is Markov, only the current state matters; the rest of the history is irrelevant. It is really just a structured way to look up the right transition probabilities.
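The row-lookup mechanics can be checked directly. The 2-state matrix here is hypothetical; the point is that $\xi_t \mathbf{A}$ just reads off a row of $\mathbf{A}$.

```python
import numpy as np

# Hypothetical transition matrix (values for illustration only).
A = np.array([[0.9, 0.1],
              [0.4, 0.6]])

# xi_t = e_2: the indicator vector when s_t is the second state.
xi = np.array([0.0, 1.0])

# E(xi_{t+1} | xi_t) = xi_t A simply extracts row 2 of A.
expected_next = xi @ A
assert np.allclose(expected_next, A[1])
```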

Two-Step Transition Probabilities

The probability that $s_{t+2} = j$ given $s_t = i$ is:

$$\mathbb{P}(s_{t+2} = j \mid s_t = i) = a_{i,1}a_{1,j} + \cdots + a_{i,N}a_{N,j}$$

This is the $(i, j)$ element of $\mathbf{A}^2$. Intuitively, to get from state $i$ to state $j$ in two steps, we sum over all possible intermediate states, weighting each path by the product of its transition probabilities.
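The path-summing intuition is easy to verify numerically, again with a hypothetical 2-state matrix: the $(i, j)$ entry of $\mathbf{A}^2$ matches the explicit sum over intermediate states.

```python
import numpy as np

# Hypothetical transition matrix.
A = np.array([[0.9, 0.1],
              [0.4, 0.6]])

A2 = A @ A  # two-step transition matrix

# Element (i, j) of A^2 as an explicit sum over intermediate states k.
i, j = 0, 1
by_hand = sum(A[i, k] * A[k, j] for k in range(A.shape[0]))
assert np.isclose(A2[i, j], by_hand)
```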

N-Step Transition Probabilities

In general, the probability that $s_{t+m} = j$ given $s_t = i$ is the $(i, j)$ element of $\mathbf{A}^m$:

$$\mathbb{E}(\xi_{t+m} \mid \xi_t, \xi_{t-1}, \ldots, \xi_1) = \xi_t \mathbf{A}^m$$

We locate the appropriate entry in the transition matrix raised to the $m$-th power.
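In NumPy, `np.linalg.matrix_power` computes $\mathbf{A}^m$ directly. A sketch with a hypothetical matrix and horizon:

```python
import numpy as np

# Hypothetical transition matrix and horizon.
A = np.array([[0.9, 0.1],
              [0.4, 0.6]])
m = 5

Am = np.linalg.matrix_power(A, m)  # m-step transition probabilities

# Each row of A^m is still a probability distribution.
assert np.allclose(Am.sum(axis=1), 1.0)
```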

Forecasting Transitions

Assume we have received information $I_t$ up to date $t$. Let

$$\pi_t = [\mathbb{P}(s_t = 1 \mid I_t), \ldots, \mathbb{P}(s_t = N \mid I_t)]$$

denote the conditional probability distribution over the state space. Since $\pi_t = \mathbb{E}(\xi_t \mid I_t)$, we infer that

$$\pi_{t+1} = \mathbb{E}(\xi_{t+1} \mid I_t) = \mathbb{E}(\xi_t \mid I_t)\,\mathbf{A}$$

If $I_t$ contains no leading information, the forecast state distribution is:

$$\pi_{t+1} = \pi_t \mathbf{A}$$

The one-step-ahead conditional distribution is the current distribution multiplied by the transition matrix. All the relevant information is already embedded in $\pi_t$.
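Forecasting is then a single vector-matrix product. Both the matrix and the current distribution below are hypothetical numbers for illustration:

```python
import numpy as np

# Hypothetical transition matrix and current state distribution pi_t.
A = np.array([[0.9, 0.1],
              [0.4, 0.6]])
pi_t = np.array([0.7, 0.3])

# One-step-ahead forecast: pi_{t+1} = pi_t A.
pi_next = pi_t @ A
assert np.isclose(pi_next.sum(), 1.0)
```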

Stationary Distributions

A distribution $\pi$ is stationary if it satisfies $\pi = \pi \mathbf{A}$: the forecast distribution is the same as the current one. If the Markov Chain is ergodic, the system

$$\begin{aligned} \pi &= \pi \mathbf{A} \\ \pi \iota &= 1 \end{aligned}$$

has a unique solution, where $\iota$ denotes the $N \times 1$ vector of ones. This stationary distribution is the long-run proportion of time spent in each state, regardless of the starting point, a property that makes it central to equilibrium analysis in credit models, economic regime models, and any setting where we care about steady-state behaviour.
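One way to solve the system numerically is to stack $\pi(\mathbf{A} - I) = 0$ with the normalisation $\pi \iota = 1$ and solve by least squares; the 2-state matrix below is hypothetical and there are other approaches (e.g. the left eigenvector of $\mathbf{A}$ for eigenvalue 1).

```python
import numpy as np

# Hypothetical transition matrix.
A = np.array([[0.9, 0.1],
              [0.4, 0.6]])
N = A.shape[0]

# Stack the stationarity conditions (A^T - I) pi^T = 0 with the
# normalisation sum(pi) = 1 into one overdetermined linear system.
M = np.vstack([A.T - np.eye(N), np.ones(N)])
b = np.zeros(N + 1)
b[-1] = 1.0
pi, *_ = np.linalg.lstsq(M, b, rcond=None)

# Verify stationarity: pi A = pi, and pi is a distribution.
assert np.allclose(pi @ A, pi)
assert np.isclose(pi.sum(), 1.0)
```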

Where This Shows Up in Practice

The machinery above is the scaffolding behind several standard tools in quantitative finance:

  • Regime-switching models for asset returns (Hamilton, 1989) treat the economy as a Markov chain over latent states — typically a “high-volatility” and “low-volatility” state — and use maximum likelihood or Bayesian inference to back out the transition probabilities from observed returns.
  • Credit rating migration matrices published by rating agencies are transition matrices for a Markov chain over rating categories. The n-step probability calculation gives the distribution of likely ratings n years from today, which feeds directly into credit portfolio risk models.
  • Hidden Markov Models extend the framework by treating the state as unobservable and the observed data (returns, volumes) as emissions from the hidden state. HMMs are used for trade classification, market microstructure analysis, and anomaly detection.
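The credit migration use case can be sketched with the same machinery; the three-bucket matrix below is entirely hypothetical (real agency matrices have many more rating categories), with default modelled as an absorbing state.

```python
import numpy as np

# Hypothetical one-year rating migration matrix over three buckets:
# investment grade (IG), high yield (HY), default (D).
M = np.array([
    [0.95, 0.04, 0.01],  # IG
    [0.10, 0.80, 0.10],  # HY
    [0.00, 0.00, 1.00],  # D: absorbing, no exit once in default
])

# Distribution of ratings 5 years out for a bond starting in IG.
start = np.array([1.0, 0.0, 0.0])
dist_5y = start @ np.linalg.matrix_power(M, 5)
assert np.isclose(dist_5y.sum(), 1.0)
```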

In all three applications, the stationary distribution tells you the long-run proportion of time the system spends in each state — a quantity that matters for setting capital reserves, computing unconditional risk premia, and understanding the base-rate behaviour of the system you’re modelling.