Why is Kemeny's constant a constant?

arXiv:1711.03313v1 [math.PR] 9 Nov 2017

Dario Bini, Jeffrey J. Hunter, Guy Latouche, Beatrice Meini, Peter G. Taylor

November 10, 2017

Abstract

In their 1960 book on finite Markov chains, Kemeny and Snell established that a certain sum is invariant. This sum has become known as Kemeny's constant. Various proofs have been given over time, some more technical than others. We give here a very simple physical justification, which extends without a hitch to continuous-time Markov chains on a finite state space. For a denumerably infinite state space, the physical argument holds but the constant may be infinite. We consider the special case of birth-and-death processes and determine the condition for Kemeny's constant to be finite.

Keywords: discrete Markov chains, passage times, renewal processes, deviation matrix

1 Introduction

Define a discrete-time, irreducible Markov chain $\{X_t : t = 0, 1, \ldots\}$ on a finite state space $S$, with transition matrix $P$ and stationary probability vector $\pi$:
\[
\pi^T P = \pi^T, \qquad \pi^T \mathbf{1} = 1. \tag{1}
\]
Define the first return times $\{T_i : i \in S\}$,
\[
T_i = \inf\{t \geq 1 : X_t = i\}.
\]
Kemeny and Snell [7, Theorem 4.4.10] proved that
\[
\sum_{j \in S} \pi_j E_i[T_j] = K, \tag{2}
\]
independently of the initial state $i$, where $E_i[\cdot]$ denotes the conditional expectation given that $X_0 = i$; $K$ is known as Kemeny's constant. A prize was offered to the first person to give an intuitively plausible reason for the above sum to be independent of $i$ (Grinstead and Snell [6, Page 469]); the prize was won by Doyle, with an argument given in the next section. We show in Section 3 that (2) results from the obvious fact that a discrete-time Markov chain takes $n$ steps during an interval of time of length $n$, independently of the initial state $i$. We extend our argument to finite continuous-time Markov chains in Section 4. If the state space is denumerably infinite, the situation becomes more complex, as the sum in (2) becomes a series and it might not converge. We consider in Section 5 the case of positive recurrent birth-and-death processes and determine conditions for $K$ to be finite. In short, the series always diverges for discrete-time birth-and-death processes, while in continuous time, $K$ is finite if transitions from state $i$ occur sufficiently fast as $i$ approaches infinity.
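As a quick illustration of (2) (not part of the original argument), the following minimal Python/numpy sketch builds an arbitrary 5-state chain, computes the mean return times $E_i[T_j]$ by conditioning on the first step, and checks that the weighted sum is the same for every starting state; the seed and the size of the chain are arbitrary choices made for the illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
P = rng.random((n, n))
P /= P.sum(axis=1, keepdims=True)           # random irreducible transition matrix

# stationary distribution: pi^T P = pi^T, pi^T 1 = 1
A = np.vstack([(P.T - np.eye(n))[:-1], np.ones(n)])
b = np.zeros(n); b[-1] = 1.0
pi = np.linalg.solve(A, b)

# mean return times: E_i[T_j] = 1 + sum_{k != j} P_ik E_k[T_j], valid for every i
M = np.empty((n, n))                        # M[i, j] = E_i[T_j]
for j in range(n):
    Pj = P.copy()
    Pj[:, j] = 0.0                          # cut the transitions into j
    M[:, j] = np.linalg.solve(np.eye(n) - Pj, np.ones(n))

K = M @ pi                                  # K_i = sum_j pi_j E_i[T_j]
print(K)                                    # all entries coincide: Kemeny's constant
assert np.allclose(K, K[0])
assert np.allclose(pi * np.diag(M), 1.0)    # pi_j = 1 / E_j[T_j]
```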

2 A simple algebraic proof

The simplest proof goes as follows: define $\omega_i = \sum_{j \in S} \pi_j E_i[T_j]$, condition on $X_1$ and write
\begin{align*}
\omega_i &= 1 + \sum_{j \in S} \pi_j \sum_{k \in S, k \neq j} P_{ik} E_k[T_j] \\
&= 1 + \sum_{j \in S} \pi_j \sum_{k \in S} P_{ik} E_k[T_j] - \sum_{j \in S} P_{ij}, \qquad \text{using } \pi_j = 1/E_j[T_j], \\
&= \sum_{k \in S} P_{ik} \omega_k,
\end{align*}
or $\omega = P\omega$ in vector notation; Doyle [4] argued from the maximum principle that all components of $\omega$ must be equal. Alternatively, one may conclude from the Perron-Frobenius Theorem that $\omega$ must be proportional to the eigenvector $\mathbf{1}$ of $P$.

Instead of the return times $T_j$, we shall use the first passage times $\{\theta_i : i \in S\}$ with
\[
\theta_i = \inf\{t \geq 0 : X_t = i\}. \tag{3}
\]
The only difference is that $\theta_i = 0 < T_i$ if $X_0 = i$; otherwise $\theta_i = T_i \geq 1$. Using $\theta_j$ instead of $T_j$, we obtain another version of Kemeny's constant:
\[
\sum_{j \in S} \pi_j E_i[\theta_j] = K',
\]


where $K' = K - 1$. We prefer to work with this version of Kemeny's constant because it helps us establish a direct connection with the deviation matrix $D$ of the Markov chain. We shall discuss this in the next section.
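Both observations are easy to test numerically. The sketch below (ours, not the authors'; the chain is again an arbitrary random example) verifies that the vector $\omega$ is a fixed point of $P$, and that replacing $T_j$ by $\theta_j$ merely removes the diagonal terms, so that $K' = K - 1$.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
P = rng.random((n, n)); P /= P.sum(axis=1, keepdims=True)

A = np.vstack([(P.T - np.eye(n))[:-1], np.ones(n)])
b = np.zeros(n); b[-1] = 1.0
pi = np.linalg.solve(A, b)

M = np.empty((n, n))                        # M[i, j] = E_i[T_j]
for j in range(n):
    Pj = P.copy(); Pj[:, j] = 0.0
    M[:, j] = np.linalg.solve(np.eye(n) - Pj, np.ones(n))

omega = M @ pi                              # omega_i = sum_j pi_j E_i[T_j]
assert np.allclose(P @ omega, omega)        # omega = P omega, as in the derivation above

M_theta = M.copy()
np.fill_diagonal(M_theta, 0.0)              # E_i[theta_j] = E_i[T_j] for i != j, 0 on the diagonal
Kprime = M_theta @ pi
assert np.allclose(Kprime, omega - 1.0)     # K' = K - 1, again independent of i
```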

3 Discrete time

Our physical justification is based on the following argument. We start from
\[
\sum_{j \in S} \pi_j E_i[\theta_j] = \sum_{j \in S} \frac{E_i[\theta_j]}{E_j[T_j]}, \tag{4}
\]
which we transform to
\[
= \sum_{j \in S} \lim_{n \to \infty} \big( E_j[N_j(n)] - E_i[N_j(n)] \big), \tag{5}
\]
where
\[
N_j(n) = \sum_{0 \leq t \leq n} \mathbf{1}\{X_t = j\}
\]
is the total number of visits to $j$ during the interval of time $[0, n]$. The formal justification for the transition from (4) to (5) is given in Lemma 3.1, but we give a heuristic argument first; it is explained with the help of the figure below.

[Figure: two parallel time lines marked with crosses at the successive visits to $j$; the upper line starts from $X_0 = j$, the lower one from $X_0 = i$ and is shifted to the right by the initial delay.]

The line on top is a representation of a trajectory of the renewal process $\{\theta_j^{(k)} : k \geq 0\}$ of successive visits to $j$, starting from $X_0 = j$; the $\theta_j^{(k)}$s are marked with a cross $\times$. The second line represents a trajectory of the delayed renewal process of visits to $j$, starting from $X_0 = i$. Now, the $j$th term in the right-hand side of (5) is the expected difference between the total number of events in both processes. We observe fewer visits to $j$ if the process starts from $i \neq j$ because of the initial delay; $E_i[\theta_j]$ is the expected length of that delay, $E_j[T_j]$ is the expected length of intervals between visits to $j$, and the ratio $E_i[\theta_j]/E_j[T_j]$ is the expected number of visits that are missed over the whole history of the process by starting from $i$ instead of $j$. The formal argument is given now.

Lemma 3.1  For a discrete-time Markov chain,
\[
\frac{E_i[\theta_j]}{E_j[T_j]} = \lim_{n \to \infty} \big( E_j[N_j(n)] - E_i[N_j(n)] \big)
\]
for all $i$ and $j$.

Proof  The statement is obvious if $i = j$, for then $E_j[\theta_j] = 0$ by definition of $\theta_j$. We assume now that $i$ and $j$ are different, arbitrary but fixed, and to simplify the notation we define $\widetilde N_i(n) = E_i[N_j(n)]$ and $f_i(t) = P_i[\theta_j = t]$, with $f_i(0) = 0$. We have
\[
\widetilde N_j(n) = \sum_{0 \leq \nu \leq n} P_j[X_\nu = j].
\]
Furthermore, we condition on the first visit to state $j$ and write
\begin{align*}
\widetilde N_i(n) &= \sum_{0 \leq t \leq n} f_i(t) \, \widetilde N_j(n - t) \\
&= \sum_{0 \leq t \leq n} f_i(t) \sum_{0 \leq \nu \leq n - t} P_j[X_\nu = j] \\
&= \sum_{0 \leq \nu \leq n} P_j[X_\nu = j] \sum_{0 \leq t \leq n - \nu} f_i(t)
\end{align*}
for $n \geq 0$, and so
\[
\widetilde N_j(n) - \widetilde N_i(n) = \sum_{0 \leq \nu \leq n} P_j[X_\nu = j] \, P_i[\theta_j > n - \nu].
\]
Finally,
\begin{align*}
\lim_{n \to \infty} \big( \widetilde N_j(n) - \widetilde N_i(n) \big) &= \lim_{n \to \infty} \sum_{0 \leq \nu \leq n} P_j[X_{n-\nu} = j] \, P_i[\theta_j > \nu] \\
&= \frac{1}{E_j[T_j]} \sum_{\nu \geq 0} P_i[\theta_j > \nu], \qquad \text{by the key renewal theorem (Resnick [9, Section 3.8]),} \\
&= \pi_j E_i[\theta_j].
\end{align*}
This completes the proof. □
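A numerical illustration of Lemma 3.1 (a sketch under the same arbitrary random-chain setup as before, not taken from the paper): since $E_i[N_j(n)] = \sum_{0 \le \nu \le n} (P^\nu)_{ij}$, the difference $E_j[N_j(n)] - E_i[N_j(n)]$ is a partial sum that should approach $\pi_j E_i[\theta_j]$ as $n$ grows.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
P = rng.random((n, n)); P /= P.sum(axis=1, keepdims=True)

A = np.vstack([(P.T - np.eye(n))[:-1], np.ones(n)])
b = np.zeros(n); b[-1] = 1.0
pi = np.linalg.solve(A, b)

i, j = 0, 3                                  # an arbitrary pair of distinct states
Pj = P.copy(); Pj[:, j] = 0.0
Ei_theta_j = np.linalg.solve(np.eye(n) - Pj, np.ones(n))[i]   # E_i[theta_j] (= E_i[T_j] for i != j)

# E_j[N_j(n)] - E_i[N_j(n)] = sum_{nu <= n} ((P^nu)_jj - (P^nu)_ij)
diff, Pnu = 0.0, np.eye(n)
for _ in range(2000):
    diff += Pnu[j, j] - Pnu[i, j]
    Pnu = Pnu @ P

print(diff, pi[j] * Ei_theta_j)              # the two values agree
assert abs(diff - pi[j] * Ei_theta_j) < 1e-6
```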



If $S$ is finite, we may immediately write Equation (5) as
\begin{align*}
\sum_{j \in S} \pi_j E_i[\theta_j] &= \lim_{n \to \infty} \Big( \sum_{j \in S} E_j[N_j(n)] - \sum_{j \in S} E_i[N_j(n)] \Big) \\
&= \lim_{n \to \infty} \Big( \sum_{j \in S} E_j[N_j(n)] - (n + 1) \Big) \tag{6}
\end{align*}
independently of $i$.

If $S$ is denumerably infinite, we need to be more careful, but the exchange of limits is justified because the differences $E_j[N_j(n)] - E_i[N_j(n)]$ are positive for all $i$, $j$ and all $n$. We have successively
\begin{align*}
&\sum_{j \in S} \lim_{n \to \infty} \big( E_j[N_j(n)] - E_i[N_j(n)] \big) \tag{7} \\
&= \lim_{K \to \infty} \sum_{j \in S_K} \lim_{n \to \infty} \big( E_j[N_j(n)] - E_i[N_j(n)] \big), \tag{8}
\end{align*}
where $\{S_K\}$ is a monotone sequence of finite subsets converging to $S$,
\begin{align*}
&= \lim_{K \to \infty} \lim_{n \to \infty} \sum_{j \in S_K} \big( E_j[N_j(n)] - E_i[N_j(n)] \big) \tag{9} \\
&= \lim_{n \to \infty} \lim_{K \to \infty} \sum_{j \in S_K} \big( E_j[N_j(n)] - E_i[N_j(n)] \big) \tag{10}
\end{align*}
by monotone convergence, with the possibility that both sides are infinite,
\begin{align*}
&= \lim_{n \to \infty} \lim_{K \to \infty} \Big( \sum_{j \in S_K} E_j[N_j(n)] - \sum_{j \in S_K} E_i[N_j(n)] \Big) \tag{11} \\
&= \lim_{n \to \infty} \Big( \sum_{j \in S} E_j[N_j(n)] - \sum_{j \in S} E_i[N_j(n)] \Big), \tag{12}
\end{align*}

since both series are bounded by $n + 1$.

Our physical interpretation is based on a change of perspective, from first passage times to numbers of visits. The deviation matrix lends itself beautifully to such a change of point of view. It is defined as
\[
D = \sum_{n \geq 0} (P^n - \mathbf{1} \cdot \pi^T) \tag{13}
\]
if the series converges. By Syski [10, Proposition 3.2] and Coolen-Schrijner and van Doorn [3, Theorem 4.1], the series converges if and only if $E_\pi[\theta_j] < \infty$ for some state $j \in S$, and then it is finite for every $j$. If $|S| < \infty$, then $D = (I - P)^\#$, the group inverse of $I - P$ (see Campbell and Meyer [2]). Obviously, $D_{ij} = \lim_{n \to \infty} \big( E_i[N_j(n)] - E_\pi[N_j(n)] \big)$, where $E_\pi[\cdot]$ denotes the conditional expectation given that $X_0$ has the distribution $\pi$. In addition, $D_{jj} = \pi_j E_\pi[\theta_j]$ by Coolen-Schrijner and van Doorn [3, Equation (5.7)], and so
\[
K' = \sum_{j \in S} D_{jj}. \tag{14}
\]
To see this, we write $K' = \sum_{i \in S} \pi_i \sum_{j \in S} \pi_j E_i[\theta_j]$ and interchange the order of summation.
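For a finite chain, (14) is straightforward to check numerically. The sketch below (ours, on an arbitrary random chain) computes the group inverse through the standard identity $(I - P)^\# = (I - P + \mathbf{1}\pi^T)^{-1} - \mathbf{1}\pi^T$, i.e. the fundamental matrix of Kemeny and Snell shifted by $\mathbf{1}\pi^T$, and compares its trace with $K'$ obtained from the first passage times.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
P = rng.random((n, n)); P /= P.sum(axis=1, keepdims=True)

A = np.vstack([(P.T - np.eye(n))[:-1], np.ones(n)])
b = np.zeros(n); b[-1] = 1.0
pi = np.linalg.solve(A, b)

one_pi = np.outer(np.ones(n), pi)
D = np.linalg.inv(np.eye(n) - P + one_pi) - one_pi   # deviation matrix (I - P)^#

# K' from first passage times, for comparison
M = np.empty((n, n))
for j in range(n):
    Pj = P.copy(); Pj[:, j] = 0.0
    M[:, j] = np.linalg.solve(np.eye(n) - Pj, np.ones(n))
M_theta = M.copy(); np.fill_diagonal(M_theta, 0.0)   # E_i[theta_j]
Kprime = float((M_theta @ pi)[0])

assert np.allclose(np.trace(D), Kprime)              # equation (14): K' = sum_j D_jj
assert np.allclose(np.diag(D), pi * (pi @ M_theta))  # D_jj = pi_j E_pi[theta_j]
assert np.allclose(D @ np.ones(n), 0.0)              # rows of D sum to zero
```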

4 Continuous time

Consider a continuous-time Markov chain with irreducible generator $Q$. Define the first passage time and the first return time as
\[
\theta_i = \inf\{t \geq 0 : X_t = i\}, \qquad T_i = \inf\{t \geq J_1 : X_t = i\},
\]
where $J_1$ is the first jump time of the Markov chain; if $X_0 = i$, then $\theta_i = 0 < T_i$, otherwise $\theta_i = T_i > 0$. Lemma 3.1 becomes

Lemma 4.1  For a continuous-time Markov chain, we have
\[
\pi_j E_i[\theta_j] = \lim_{t \to \infty} \big( E_j[M_j(t)] - E_i[M_j(t)] \big),
\]
where $M_j(t) = \int_0^t \mathbf{1}\{X(u) = j\} \, du$ is the total time spent in $j$ until time $t$.

Proof  We follow the same steps as in Lemma 3.1, with the only difference that here
\[
E_j[M_j(t)] = \int_0^t P_j[X(u) = j] \, du
\]
and
\[
E_i[M_j(t)] = \int_0^t dG_i(v) \, E_j[M_j(t - v)],
\]
where $G_i(t) = P_i[\theta_j \leq t]$. □

From this, we obtain
\begin{align*}
\sum_{j \in S} \pi_j E_i[\theta_j] &= \lim_{t \to \infty} \Big( \sum_{j \in S} E_j[M_j(t)] - t \Big), \qquad \text{independently of } i, \tag{15} \\
&= \sum_{j \in S} D_{jj} = \sum_{j \in S} \pi_j E_\pi[\theta_j].
\end{align*}

Here, the deviation matrix is $D = \int_0^\infty (e^{Qt} - \mathbf{1} \cdot \pi^T) \, dt$.
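As a numerical aside (ours, not the authors'): for a finite irreducible generator the integral above has the closed form $D = (\mathbf{1}\pi^T - Q)^{-1} - \mathbf{1}\pi^T$, which makes the continuous-time analogue of (14) easy to verify. The generator below is an arbitrary random example; the mean passage times are obtained from the linear systems $\sum_k Q_{ik} E_k[\theta_j] = -1$ for $i \neq j$, with $E_j[\theta_j] = 0$.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5
Q = rng.random((n, n))
np.fill_diagonal(Q, 0.0)
np.fill_diagonal(Q, -Q.sum(axis=1))          # random irreducible generator (rows sum to zero)

# stationary distribution: pi^T Q = 0, pi^T 1 = 1
A = np.vstack([Q.T[:-1], np.ones(n)])
b = np.zeros(n); b[-1] = 1.0
pi = np.linalg.solve(A, b)

one_pi = np.outer(np.ones(n), pi)
D = np.linalg.inv(one_pi - Q) - one_pi       # deviation matrix of the continuous-time chain

# mean first passage times E_i[theta_j], with the target state removed from the generator
M = np.zeros((n, n))                         # M[i, j] = E_i[theta_j], zero on the diagonal
for j in range(n):
    keep = [k for k in range(n) if k != j]
    M[keep, j] = np.linalg.solve(Q[np.ix_(keep, keep)], -np.ones(n - 1))

Kprime = M @ pi                              # K'_i = sum_j pi_j E_i[theta_j]
assert np.allclose(Kprime, Kprime[0])        # independent of the initial state
assert np.allclose(Kprime[0], np.trace(D))   # and equal to sum_j D_jj
```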

5 Birth-and-death processes

Let us assume now that $\{X\}$ is a birth-and-death process on the infinite state space $\{0, 1, \ldots\}$. Choosing $X_0 = 0$ without loss of generality, we have
\[
K' = \sum_{j \geq 0} \pi_j E_0[\theta_j]. \tag{16}
\]
We shall examine continuous- and discrete-time processes simultaneously. It will emerge that the series (16) always diverges in the discrete-time case. In continuous time, we denote by $\lambda_n$ and $\mu_n$ the transition rates from $n$ to $n+1$ and from $n$ to $n-1$, respectively; in discrete time these are the one-step transition probabilities. We assume that $\lambda_n > 0$ for all $n \geq 0$ and $\mu_n > 0$ for all $n \geq 1$, so that the process is irreducible. We further assume that the birth-and-death process is positive recurrent, so that $B = \sum_{n \geq 0} \beta_n$ is finite, where $\beta_0 = 1$,
\[
\beta_n = \frac{\lambda_0 \lambda_1 \cdots \lambda_{n-1}}{\mu_1 \mu_2 \cdots \mu_n} \qquad \text{for } n \geq 1, \tag{17}
\]
and the stationary distribution is given by $\pi_n = B^{-1} \beta_n$; see [3, Equation (6.3)].

Theorem 5.1  For an irreducible, positive recurrent, birth-and-death process on $\{0, 1, \ldots\}$, the constant $K'$ is finite if and only if $\Theta < \infty$, where
\[
\Theta = \sum_{k \geq 0} (\lambda_k \pi_k)^{-1} \sum_{j \geq k+1} \pi_j. \tag{18}
\]
In that case, $K' = \Theta - E_\pi[\theta_0]$, with
\[
E_\pi[\theta_0] = \sum_{k \geq 0} (\lambda_k \pi_k)^{-1} \Big( \sum_{j \geq k+1} \pi_j \Big)^2 < \Theta.
\]

Proof  We start from (16) and write
\begin{align*}
K' &= \sum_{j \geq 1} \pi_j \sum_{0 \leq k \leq j-1} (\lambda_k \pi_k)^{-1} \sum_{0 \leq \ell \leq k} \pi_\ell, \qquad \text{by [3, Equation (6.4)],} \tag{19} \\
&= \sum_{k \geq 0} (\lambda_k \pi_k)^{-1} \Big( \sum_{j \geq k+1} \pi_j \Big) \Big( 1 - \sum_{\ell \geq k+1} \pi_\ell \Big) \\
&= \sum_{k \geq 0} (\lambda_k \pi_k)^{-1} \sum_{j \geq k+1} \pi_j - \sum_{k \geq 0} (\lambda_k \pi_k)^{-1} \Big( \sum_{j \geq k+1} \pi_j \Big)^2 \tag{20}
\end{align*}
if both series converge. The first series is $\Theta$ by definition; the second is equal to $E_\pi[\theta_0]$ by [3, Equation (6.6)]. It is obvious that $E_\pi[\theta_0] \leq \Theta$. If $\Theta < \infty$, then $E_\pi[\theta_0] < \infty$, the deviation matrix is finite, and $K' = \Theta - E_\pi[\theta_0]$ by (20). If $\Theta = \infty$ and $E_\pi[\theta_0] < \infty$, then $K' = \infty$ by (20) again. Finally, if $\Theta = \infty$ and $E_\pi[\theta_0] = \infty$, then $E_\pi[\theta_j] = \infty$ for all $j$ and $K' = \infty$ by (15). □

Remark 5.2  The right-hand side of (18) is the 'D series' for the continuous-time birth-and-death process with birth rates $\lambda_n$ and death rates $\mu_n$; see Anderson [1, Page 261] or Kijima [8, Page 245]. A recurrent continuous-time birth-and-death process with a finite D series is said to have an entrance boundary at $\infty$, a classification that goes back to Feller [5].

Example 5.3  For the M/M/1 queue, $\lambda_n = \lambda$ and $\mu_n = \mu$, independently of $n$. The process is positive recurrent if and only if the ratio $\rho = \lambda/\mu$ is strictly less than 1, and $\pi_n = (1 - \rho)\rho^n$. Equation (18) becomes
\[
\Theta = \sum_{k \geq 0} \lambda^{-1} \rho^{-k} \sum_{j \geq k+1} \rho^j = \sum_{k \geq 0} \frac{1}{\mu - \lambda} = \infty,
\]

so that Kemeny's constant is infinite. However, it may be finite for a process that we name the speeded-up M/M/1 queue: we take an arbitrary sequence $\{\lambda_n\}$ and define $\mu_n = \rho^{-1}\lambda_{n-1}$, with $\rho < 1$. Here, $\beta_n = \rho^n$, so that the process is positive recurrent, with $\pi_n = (1 - \rho)\rho^n$ for any choice of the $\lambda$s, and
\[
\Theta = \sum_{k \geq 0} \lambda_k^{-1} \rho^{-k} \sum_{j \geq k+1} \rho^j = \frac{\rho}{1 - \rho} \sum_{k \geq 0} \lambda_k^{-1},
\]
which converges if $\lambda_n \to \infty$ sufficiently fast. In that case, $\mu_n$ tends to $\infty$ as well.

The speeded-up M/M/1 queue example suggests that, for Kemeny's constant to be finite, transitions have to occur faster as the process moves further away from 0. Actually, as we show in the next lemma, it is necessary that transitions from $n$ to $n - 1$ occur sufficiently fast; the transition rates from $n$ to $n + 1$ are less critical.

Lemma 5.4  For $\Theta$ to be finite, it is necessary, but not sufficient, that the series $\sum_{j \geq 1} 1/\mu_j$ converges.

Proof  We rewrite (18) as
\[
\Theta = \sum_{j \geq 1} f_j \tag{21}
\]

with
\begin{align*}
f_j &= \pi_j \sum_{0 \leq k \leq j-1} (\lambda_k \pi_k)^{-1} \\
&= \pi_{j-1} \frac{\lambda_{j-1}}{\mu_j} \Big( \sum_{0 \leq k \leq j-2} (\lambda_k \pi_k)^{-1} + (\lambda_{j-1} \pi_{j-1})^{-1} \Big) \qquad \text{for } j \geq 1, \\
&= (\lambda_{j-1} f_{j-1} + 1)/\mu_j \tag{22}
\end{align*}

if we define $f_0 = 0$. This shows that $f_j \geq 1/\mu_j$, so that the series (21) diverges if $\sum_{j \geq 1} 1/\mu_j$ diverges. The proof that this is not a sufficient condition is given by Example 5.7 below. □

Remark 5.5  From [3, Equation (6.6)], we see that $\Theta = \lim_{n \to \infty} E_n[\theta_0]$, and we may interpret Theorem 5.1 as saying that, for Kemeny's constant to be finite, it is necessary (and sufficient) that, having ventured to any state $n$, no matter how far from the origin, the process will reach state 0 in bounded expected time. In discrete time, every transition from a state to one of its neighbours requires at least one unit of time, so that $E_n[\theta_0] \geq n$ is unbounded. This tells us why (16) diverges in that case. Lemma 5.4 gives a formal justification: here, $\mu_n \leq 1 - \lambda_n < 1$ by assumption, and the series $\sum_{j \geq 1} 1/\mu_j$ diverges.

Example 5.6  A direct consequence of Lemma 5.4 is that $K'$ is infinite for the M/M/$\infty$ queue, for which $\mu_n = n\mu$: the transition rates from $n$ to $n - 1$ are not large enough. We may, however, use Lemma 5.4 to design processes for which Kemeny's constant is finite. To that end, we write (22) as
\[
\mu_j = (\lambda_{j-1} f_{j-1} + 1)/f_j, \tag{23}
\]
choose a sequence $\{f_j\}$ such that the series (21) converges, define $\{\mu_j\}$ by (23), and find a sequence $\{\lambda_j\}$ such that the process is positive recurrent, that is, such that $\sum_{n \geq 0} \beta_n$ converges, with $\beta_n$ defined in (17). Two such examples follow; the first is also illustrated numerically after the second construction.

For the first, $f_0 = 0$, $f_j = 1/j^2$ for $j \geq 1$, $\mu_1 = 1$, $\mu_j = j^2(1 + 1/(j-1)^2)$ for $j \geq 2$, and $\lambda_j = 1$ for all $j$. We easily see that
\[
\beta_n = 1 \Big/ \prod_{2 \leq j \leq n} j^2 \big(1 + 1/(j-1)^2\big) < 1/(n!)^2.
\]

For the second, $f_j = \gamma^j$ with $\gamma < 1$, $\mu_j = \gamma^{-j} + \lambda_{j-1}\gamma^{-1}$, and $\{\lambda_j\}$ is arbitrary. Here,
\[
\beta_n = \prod_{0 \leq j \leq n-1} \lambda_j \Big/ \prod_{0 \leq j \leq n-1} \big( \lambda_j \gamma^{-1} + \gamma^{-(j+1)} \big) < \gamma^n.
\]
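The first construction is easy to reproduce numerically. The sketch below (ours, with an arbitrary truncation level) applies recipe (23) to $f_0 = 0$, $f_j = 1/j^2$, $\lambda_j = 1$, recovers the stated $\mu_j$, and checks the bound $\beta_n < 1/(n!)^2$; by (21), $\Theta = \sum_j f_j$ is then finite by construction.

```python
import numpy as np

# Example 5.6, first construction: f_0 = 0, f_j = 1/j^2 and lambda_j = 1 for all j
N = 30                                       # truncation level, for illustration only
lam = np.ones(N + 1)
f = np.zeros(N + 1)
f[1:] = 1.0 / np.arange(1, N + 1) ** 2

mu = np.zeros(N + 1)                         # mu_0 is unused
for j in range(1, N + 1):
    mu[j] = (lam[j - 1] * f[j - 1] + 1.0) / f[j]   # recipe (23)

j = np.arange(2, N + 1)
assert np.isclose(mu[1], 1.0)                # mu_1 = 1
assert np.allclose(mu[2:], j ** 2 * (1.0 + 1.0 / (j - 1) ** 2))   # mu_j as stated

beta = np.ones(N + 1)                        # beta_n from equation (17)
for m in range(1, N + 1):
    beta[m] = beta[m - 1] * lam[m - 1] / mu[m]
fact = np.cumprod(np.arange(1, N + 1, dtype=float))   # fact[k] = (k+1)!
assert np.all(beta[2:] < 1.0 / fact[1:] ** 2)         # beta_n < 1/(n!)^2 for n >= 2

print("truncated Theta =", f.sum())          # Theta = sum_j f_j < pi^2/6: finite
```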

Example 5.7  This last example shows that the condition of Lemma 5.4 is necessary but not sufficient. Take $\mu_j = j^{1+\alpha}$ with $0 < \alpha < 1$, $\lambda_j = \mu_j$ for $j \geq 1$, and $\lambda_0 = 1$. With these parameters $\beta_n = 1/\mu_n$, so that both $\sum_{n \geq 1} 1/\mu_n$ and $\sum_{n \geq 1} \beta_n$ converge. By (22), we have $f_j \mu_j = 1 + f_{j-1}\mu_{j-1} = j$, so that
\[
\sum_{j \geq 1} f_j = \sum_{j \geq 1} j/\mu_j = \sum_{j \geq 1} 1/j^\alpha
\]
diverges.
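To see Theorem 5.1, Example 5.3 and Example 5.7 side by side, the sketch below (ours; all parameters and truncation levels are arbitrary illustrative choices) computes truncated partial sums of $\Theta$ directly from (18), using the $\beta_n$ in place of the $\pi_n$ since the normalising constant $B$ cancels. The M/M/1 partial sums grow without bound, the speeded-up M/M/1 with $\lambda_n = (n+1)^2$ levels off, and the process of Example 5.7 (here with $\alpha = 1/2$) keeps growing even though $\sum_j 1/\mu_j$ converges.

```python
import numpy as np

def theta_partial_sums(lam, mu, kmax):
    """Partial sums of Theta in (18) for a birth-and-death process, with the
    stationary probabilities replaced by beta_n (the constant B cancels)."""
    N = len(lam) - 1
    beta = np.ones(N + 1)
    for m in range(1, N + 1):
        beta[m] = beta[m - 1] * lam[m - 1] / mu[m]
    tail = np.cumsum(beta[::-1])[::-1]                 # tail[k] = sum_{j >= k} beta_j (truncated)
    terms = tail[1:kmax + 2] / (lam[:kmax + 1] * beta[:kmax + 1])
    return np.cumsum(terms)

N, kmax = 5000, 200
n = np.arange(N + 1, dtype=float)

# Example 5.3, M/M/1 with lambda = 1, mu = 2: each term equals 1/(mu - lambda) = 1
mm1 = theta_partial_sums(np.ones(N + 1), np.full(N + 1, 2.0), kmax)

# speeded-up M/M/1: lambda_n = (n+1)^2, mu_n = lambda_{n-1}/rho with rho = 1/2
lam2 = (n + 1.0) ** 2
mu2 = np.zeros(N + 1); mu2[1:] = lam2[:-1] / 0.5
sped = theta_partial_sums(lam2, mu2, kmax)

# Example 5.7 with alpha = 1/2: mu_j = j^{3/2}, lambda_j = mu_j for j >= 1, lambda_0 = 1
mu3 = n ** 1.5
lam3 = mu3.copy(); lam3[0] = 1.0
ex57 = theta_partial_sums(lam3, mu3, kmax)

print(mm1[[9, 49, 199]])    # about 10, 50, 200: diverges linearly
print(sped[[9, 49, 199]])   # about 1.55, 1.63, 1.64: converges (to pi^2/6 here)
print(ex57[[9, 49, 199]])   # keeps growing (like sqrt(k)): diverges despite sum 1/mu_j < inf
```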

6 Acknowledgment

This paper is an outgrowth of discussions that the second, third and fifth authors had during the International Workshop on Matrices and Statistics in Funchal, in June 2016. P.G. Taylor's research is supported by the Australian Research Council (ARC) Laureate Fellowship FL130100039 and the ARC Centre of Excellence for the Mathematical and Statistical Frontiers (ACEMS).


References

[1] W. J. Anderson. Continuous-Time Markov Chains: An Applications-Oriented Approach. Springer-Verlag, New York, 1991.

[2] S. L. Campbell and C. D. Meyer, Jr. Generalized Inverses of Linear Transformations. Dover Publications, New York, 1991. Republication.

[3] P. Coolen-Schrijner and E. A. van Doorn. The deviation matrix of a continuous-time Markov chain. Probab. Engrg. Informational Sci., 16:351–366, 2002.

[4] P. Doyle. The Kemeny constant of a Markov chain. ArXiv e-prints, arXiv:0909.2636v1 [math.PR], 2009.

[5] W. Feller. The birth and death processes as diffusion processes. J. Math. Pures Appl. (9), 38:301–345, 1959.

[6] C. M. Grinstead and J. L. Snell. Introduction to Probability. AMS, Providence, R.I., 1997.

[7] J. G. Kemeny and J. L. Snell. Finite Markov Chains. Van Nostrand, Princeton, NJ, 1960.

[8] M. Kijima. Markov Processes for Stochastic Modelling. Chapman and Hall, London, 1997.

[9] S. I. Resnick. Adventures in Stochastic Processes. Birkhäuser Boston, Cambridge, MA, 1992.

[10] R. Syski. Ergodic potential. Stoch. Proc. Appl., 7:311–336, 1978.
