Hossein Falaki Computer Science Department, University of California, Los Angeles firstname.lastname@example.org
This is an introduction to the concepts of Markov chains, jump processes, and diffusions. We will start by defining the Markov property and Markov chains, which will lead to the definition of the transition matrix of a Markov chain. We will derive the stationary distribution of a Markov chain and present some results related to Markov chain Monte Carlo (MCMC) methods. We will define the mixing rate of a Markov chain and observe the arrow of time. We will generalize these concepts by considering continuous-time transitions, which will lead us to jump processes. We will derive the Kolmogorov equations to find the generator matrix of a jump process, and use these tools to model Brownian motion as a special case. We will introduce stochastic differential equations (SDEs) and the Itô calculus that helps solve such equations. As an example, we will model the stock market as a jump process and solve the equivalent SDE.
A process has the Markov property if the probability distribution of its future state is independent of past states and depends only on the current state. In other words, if each X_i represents the state at time step i:

Pr(X_{t+1} = y | X_0 = x_0, ..., X_t = x_t) = Pr(X_{t+1} = y | X_t = x_t)

or, equivalently,

X_{t+1} ⊥ (X_{t-1}, ..., X_1) | X_t

Given a finite state space Ω, a sequence of random variables X_i on Ω that has the Markov property is called a Markov chain. Markov chains are often represented graphically with a graph that includes a vertex for each state of Ω and a directed edge from state A to state B if Pr(X_t = B | X_{t-1} = A) is non-zero. Each edge is labeled with the transition probability.

Figure 1: An example Markov chain. Each vertex represents a city and each edge represents the probability of traveling from one city to another.

Figure 1 is an example Markov chain that represents the probability of a person's travel between three cities. A good way to understand Markov chains is the population perspective. Assume each state of the Markov chain is a city (Figure 1), and that a total population of 1 million people is distributed among these cities. At each step, each person can "migrate" to a neighboring city with some probability (according to the weight of the edge between those two cities). We will use this perspective throughout this note to understand many of the concepts.
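The population perspective can be sketched with a short simulation. The 3×3 matrix below is an illustrative assumption (the actual edge weights in Figure 1 are not reproduced here); row i gives the probabilities of moving from city i to each city, so every row sums to 1.

```python
import numpy as np

# Hypothetical transition probabilities between the three cities
# (illustrative values, not taken from Figure 1). Rows sum to 1.
K = np.array([
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.1, 0.4, 0.5],
])

# Start with 1 million people, all living in city 0.
population = np.array([1_000_000.0, 0.0, 0.0])

# At each step every resident of city i moves to city j with probability K[i, j].
for step in range(3):
    population = population @ K
    print(f"after step {step + 1}: {population.round()}")

# Migration only redistributes people, so the total is conserved.
assert abs(population.sum() - 1_000_000) < 1e-6
```

Note that the per-person migration rule turns into a simple vector-matrix product at the population level, which is exactly the transition-matrix machinery developed next.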
The conditional probabilities of a Markov chain can be summarized in a matrix, referred to as the transition matrix. Assuming that Ω has n states, the transition matrix is an n × n matrix whose elements k_{ij} are defined as follows:

k_{ij} = Pr(X_t = j | X_{t-1} = i)

For simpler notation let P_i^{(t)} = Pr(X_t = i). Then:

P_j^{(t+1)} = Σ_i P_i^{(t)} k_{ij}

Also let P^{(t)} = [P_1^{(t)}, P_2^{(t)}, ..., P_n^{(t)}]; then:

P^{(t+1)} = P^{(t)} × K
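The two forms of the update rule can be checked against each other numerically. The matrix values below are assumed for illustration.

```python
import numpy as np

# Illustrative 3-state transition matrix (assumed values).
K = np.array([
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.1, 0.4, 0.5],
])

# P^(0): the chain starts in state 0 with probability 1.
P0 = np.array([1.0, 0.0, 0.0])

# Matrix form of the update: P^(t+1) = P^(t) K (row vector times matrix).
P1 = P0 @ K

# Element-wise form: P^(t+1)_j = sum_i P^(t)_i * k_ij.
P1_explicit = np.array([sum(P0[i] * K[i, j] for i in range(3))
                        for j in range(3)])

assert np.allclose(P1, P1_explicit)
print(P1)  # starting from state 0, P^(1) is simply the first row of K
```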
Similar to the transition matrix, the 2-step transition matrix K^{(2)} is an n × n matrix such that:

k^{(2)}_{ij} = Pr(X_{t+2} = j | X_t = i)

The 2-step transition matrix can be derived from the transition matrix by summing over the intermediate state l:

Pr(X_{t+2} = j | X_t = i) = Σ_l Pr(X_{t+2} = j ∩ X_{t+1} = l | X_t = i)
                          = Σ_l Pr(X_{t+1} = l | X_t = i) Pr(X_{t+2} = j | X_{t+1} = l)

Therefore:

K^{(2)} = K^2

And in general:

K^{(n)} = K^n    (1)
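Equation (1) can be verified numerically: summing over the intermediate state by hand gives the same matrix as squaring K. The matrix values are again illustrative assumptions.

```python
import numpy as np

# Illustrative transition matrix (assumed values).
K = np.array([
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.1, 0.4, 0.5],
])
n = K.shape[0]

# 2-step probabilities by explicit summation over the intermediate state l...
K2_direct = np.array([[sum(K[i, l] * K[l, j] for l in range(n))
                       for j in range(n)] for i in range(n)])

# ...agree with the matrix square, K^(2) = K^2.
assert np.allclose(K2_direct, K @ K)

# And in general K^(n) = K^n, e.g. for n = 5:
K5 = np.linalg.matrix_power(K, 5)
assert np.allclose(K5, K @ K @ K @ K @ K)
```

Each K^{(n)} is itself a valid transition matrix, so its rows still sum to 1.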
In a Markov chain, a probability distribution is stationary if and only if it does not change at each step. In other words:

Π = Π × K

where Π = [π_1, π_2, ..., π_n].

Theorem 2.1. For a finite (irreducible and aperiodic) Markov chain there exists a unique stationary distribution such that: lim_{t→∞} P^{(t)} = Π

We will present the proof of this theorem later, but for now we can present the formal definition of a Markov chain.

Definition 2.2. A Markov chain is defined as (Ω, K, Π), where Ω is a finite state space, K is the transition matrix, and Π is the stationary distribution.

Proof. Consider the Kullback-Leibler distance the...