DETERMINISTIC APPROXIMATION FOR STOCHASTIC CONTROL PROBLEMS

R. Sh. Liptser*, W. J. Runggaldier**, M. Taksar***

*Department of Electrical Engineering-Systems, Tel Aviv University, 69978 Ramat Aviv, Tel Aviv, Israel
**Dipartimento di Matematica Pura ed Applicata, Università di Padova, Via Belzoni 7, 35131 Padova, Italy
***Department of Applied Mathematics and Statistics, State University of New York at Stony Brook, Stony Brook, NY 11794-3600, USA

Abstract. We consider a class of stochastic control problems where uncertainty is due to driving noises of a general nature as well as to rapidly fluctuating processes affecting the drift. We show that, when the noise "intensity" is small and the fluctuations become fast, the stochastic problems can be approximated by a deterministic one. We also show that the optimal control of the deterministic problem is asymptotically optimal for the stochastic problems.

Key Words: Stochastic and deterministic control, Stochastic differential equations, Weak convergence, Asymptotic optimality.

AMS Subject Classification: 93E20, 93C15, 60B10, 60F17, 60G44, 49J15, 49K40, 49M45

Acknowledgement. This work was partially supported by GNAFA/CNR of the Italian National Research Council, which made possible a visit of the first and last authors to the University of Padova. The work of the last author was also supported by National Science Foundation Grant DMS 9301200 and NATO Scientific Exchange Grant CRG 900147.

1. Introduction

There are only a few stochastic control problems that can be solved in closed form. A lot of effort has therefore been put into developing approximation techniques for such problems. One approach in this direction is to consider, instead of the original model, a model where the underlying processes are replaced by simpler ones. This approach makes it possible to construct nearly optimal controls for the original model, based on the solution to the simpler model. The simpler model may involve underlying processes that are diffusions ("diffusion approximation"), but it may also simply be a deterministic model ("fluid approximation"). A general tool, especially for diffusion approximations, is provided by techniques of weak convergence of random processes ([1], [3], [6], [15]) combined with an averaging principle ([5]). This methodology is actively used in various practical problems of engineering, manufacturing, queueing, inventory and others, and is studied e.g. in [9], [10], [11], [12], [7], [8], [13]. The underlying idea of this methodology is actually rather simple, but the mathematics required for its implementation is in general quite sophisticated. Although there exist some general approaches (see e.g. [9]), in each particular case the rigorous verification of the convergence of the controlled systems requires specific technical tools and ideas.

In the present paper we apply "fluid approximation" techniques to a rather general stochastic control model with a convex control cost function. In this model the controlled process X is described by a stochastic differential equation with respect to a general (not necessarily continuous) martingale M. The control affects the drift of X; this drift is furthermore affected by a rapidly fluctuating exogenous process ξ. To implement the approximation approach, we embed the given model into a family of similar models, parametrized by a small parameter ε > 0. We consider the case when the "intensity" of the random noise disturbance M becomes small with ε while the "contaminating" process ξ fluctuates with increasing speed. For such a case the limiting model becomes deterministic and it is possible to obtain asymptotically (as ε ↓ 0) optimal controls for the prelimit models by using the optimal control of the limiting deterministic system. Although we consider explicitly only the case when the controlled state process X can be completely observed, our results hold in the same form when the state is only partially observed.

More formally, we have a family of controlled stochastic systems, parametrized by a small (positive) parameter ε (ε ↓ 0), with dynamics
$$dX_t^\varepsilon = \left[ a(X_t^\varepsilon, \xi_{t/\varepsilon}) + b(X_t^\varepsilon)\, u^\varepsilon(t) \right] dt + dM_t^\varepsilon \tag{1.1}$$
and initial condition $X_0^\varepsilon$. Here X^ε = (X_t^ε) is the controlled state (or signal) process, ξ = (ξ_t) is the "contamination" process affecting the drift of X^ε, while M^ε = (M_t^ε) is a process representing the noise in the system. The random function u^ε = (u^ε(t)) is the control that affects the drift of X^ε in a linear way and satisfies the usual requirements for admissibility (see Definition 2.1 below). Given a finite horizon T > 0, with each control u^ε we associate the cost
$$J^\varepsilon(u^\varepsilon) = E\left\{ \int_0^T [p(X_t^\varepsilon) + q(u^\varepsilon(t))]\, dt + r(X_T^\varepsilon) \right\}, \tag{1.2}$$

where p(x), q(u) and r(x) are nonnegative functions on the real line referred to as the holding cost, control cost, and terminal cost functions respectively. The objective is to find
$$V^\varepsilon = \inf_{u^\varepsilon} J^\varepsilon(u^\varepsilon) \tag{1.3}$$
and an optimal (minimizing) control.
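For instance (an illustrative choice of ours, not taken from the original text), one may think of p(x) = x^2, q(u) = u^2 and r(x) = |x|: the control cost is convex with superlinear growth and the holding and terminal costs grow polynomially, which is exactly what the assumptions of Section 2 below require.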

For practical purposes one may just as well be interested in finding a nearly optimal control or, as will be the case here, an asymptotically (as ε ↓ 0) optimal control.

To describe the limiting control model, we assume that the following ergodic properties hold:
$$P\text{-}\lim_{\varepsilon \to 0} X_0^\varepsilon = x_0, \qquad x_0 \in R, \tag{1.4}$$
$$a(x) = P\text{-}\lim_{t \to \infty} \frac{1}{t} \int_0^t a(x, \xi_s)\, ds, \qquad x \in R, \tag{1.5}$$
$$P\text{-}\lim_{\varepsilon \to 0} \sup_{t \le T} |M_t^\varepsilon| = 0. \tag{1.6}$$
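As a simple illustration of (1.5) (our example, not part of the original text): if ξ takes values in a bounded set and has invariant measure λ with mean $m = \int y\, \lambda(dy)$, and the drift is a(x, y) = y − x, then the averaged drift is a(x) = m − x, so the limiting dynamics below becomes a controlled linear ordinary differential equation.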

In the next section we formulate conditions under which (1.4)-(1.6) are valid.

The dynamics of the limiting system is given by the following ordinary differential equation
$$dx(t) = [\, a(x(t)) + b(x(t))\, u(t) \,]\, dt, \qquad x(0) = x_0. \tag{1.7}$$
Here x(t) is a (deterministic) controlled process and u(t) is a (deterministic) control. Define
$$j(u) := \int_0^T [p(x(t)) + q(u(t))]\, dt + r(x(T)) \tag{1.8}$$
and
$$v := \inf_u j(u), \tag{1.9}$$
where the infimum is taken over all (deterministic) measurable functions on [0, T].

Our main results are the following two theorems.

Theorem 1.1. The following relation holds:
$$\lim_{\varepsilon \to 0} V^\varepsilon = v.$$

Theorem 1.2. Let u*(t), 0 ≤ t ≤ T, be an optimal deterministic control for (1.7)-(1.9). Then u*(t) is asymptotically optimal for (1.1)-(1.3) in the sense that
$$\lim_{\varepsilon \to 0} |J^\varepsilon(u^*) - V^\varepsilon| = 0.$$
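To make the fluid approximation concrete, the following is a minimal numerical sketch (ours, not part of the paper; the two-state contamination, the Brownian noise $M^\varepsilon = \sqrt{\varepsilon}\, W$, and the fixed control u(t) = cos t are all illustrative assumptions). It integrates the prelimit dynamics (1.1) by an Euler scheme and compares the result with the solution of the limiting ODE (1.7); as ε decreases, sup_{t≤T} |X_t^ε − x(t)| shrinks, in line with the convergence (4.2) established in Section 4.

```python
import numpy as np

# Illustrative model choices (ours, not from the paper):
#   a(x, y) = y - x with a two-state contamination y in {-1, +1},
#   b(x) = 1, control u(t) = cos(t), noise M^eps = sqrt(eps) * Brownian motion.
# The invariant law of xi puts mass 1/2 on -1 and +1, so the averaged drift
# (1.5) is a_bar(x) = -x.
a     = lambda x, y: y - x
a_bar = lambda x: -x
b     = lambda x: 1.0
u     = lambda t: np.cos(t)

T, N = 1.0, 20000
dt = T / N
t_grid = np.linspace(0.0, T, N + 1)
rng = np.random.default_rng(0)

def prelimit_path(eps, x0=1.0):
    """Euler scheme for dX = [a(X, xi_{t/eps}) + b(X) u(t)] dt + dM_t^eps,
    where xi is a symmetric two-state Markov chain with unit switching rate,
    observed at the accelerated time t/eps, and M_t^eps = sqrt(eps) * W_t."""
    X = np.empty(N + 1)
    X[0], xi = x0, 1.0
    for k in range(N):
        # fast contamination: switching probability ~ dt/eps per step
        if rng.random() < dt / eps:
            xi = -xi
        dM = np.sqrt(eps * dt) * rng.standard_normal()
        X[k + 1] = X[k] + (a(X[k], xi) + b(X[k]) * u(t_grid[k])) * dt + dM
    return X

# Limiting deterministic dynamics (1.7), same Euler step and initial condition.
x = np.empty(N + 1)
x[0] = 1.0
for k in range(N):
    x[k + 1] = x[k] + (a_bar(x[k]) + b(x[k]) * u(t_grid[k])) * dt

for eps in (0.1, 0.01, 0.001):
    X = prelimit_path(eps)
    print(f"eps = {eps:5.3f}   sup_t |X^eps_t - x(t)| = {np.max(np.abs(X - x)):.4f}")
```

The control here is an arbitrary fixed one; replacing it by an optimal control u* of the limit problem gives a direct numerical illustration of Theorem 1.2.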

Remark 1. If for the limit model there exists a feedback control u*(t) = u°(t, x*(t)), where x*(t) is the controlled process defined by the differential equation (1.7) with u(t) = u*(t), and the function u°(t, x) is Lipschitz continuous in x uniformly in t ∈ [0, T], then the statement of Theorem 1.2 remains true with u°(t, X_t^ε) replacing u*(t), i.e. the feedback control u°(t, X_t^ε) is asymptotically optimal.

Remark 2. The results obtained here for the one-dimensional control problem can be extended to an n-dimensional problem. The motivation for considering just the scalar case is to present the main ideas in the simplest form.

The main contribution of this paper is twofold. From a theoretical point of view we obtain a stability result for the optimal control of a deterministic system, in the sense that this control is asymptotically optimal for a large class of stochastic control problems of a rather complicated nature. From a practical point of view our results allow one to compute an asymptotically optimal control for a variety of problems under quite general conditions, where a direct approach would be impossible.

The proof consists of two parts, carried out in Sections 3 and 4: first we show that v is an asymptotic lower bound for the optimal cost functions V^ε; then we show that the deterministic optimal control of the limiting problem can be applied to the prelimit models, yielding an asymptotically optimal cost. Results of a more technical nature, interesting in their own right, are moved to appendices (Sections 5, 6, and 7).

2. Main assumptions and notations

For simplicity we assume ε ∈ (0, 1]. For each ε let SB := (Ω, F, F^ε = (F_t^ε)_{t≥0}, P) be a fixed stochastic basis, where (Ω, F, P) is a complete probability space and F^ε is a filtration satisfying the "usual assumptions" (see [2]). The initial value X_0^ε of the state process is F_0^ε-measurable, while (ξ_{t/ε}) and (M_t^ε) are F^ε-adapted.

Definition 2.1. The control process u^ε = (u^ε(t))_{t≥0} is said to be admissible if it is F^ε-adapted and
$$\int_0^T |u^\varepsilon(t)|\, dt < \infty, \qquad P\text{-a.s.} \tag{2.1}$$

Throughout the paper we make the following assumptions:

(A.1) The control cost function q(u) is nonnegative, convex and satisfies q(u) ≥ c|u|^{1+γ} with c, γ > 0.

(A.2) The cost functions p(x) and r(x) are continuous, nonnegative and satisfy p(x), r(x) ≤ c_1(1 + |x|^{γ_1}) with c_1, γ_1 > 0.

(A.3) There exist x_0 ∈ R and positive constants c_2, γ_2 such that
  (i) P-lim_{ε→0} X_0^ε = x_0;
  (ii) E|X_0^ε|^{2n*} < c_2, where n* is the smallest integer such that γ_1 < n*.

(A.4) The function a(x, y) is measurable in (x, y) and satisfies the linear growth and Lipschitz conditions in x (uniformly in y), i.e., there exists ℓ > 0 such that
  (i) |a(x, y)| ≤ ℓ(1 + |x|), x, y ∈ R;
  (ii) |a(x', y) − a(x'', y)| ≤ ℓ|x' − x''|, x', x'', y ∈ R.

(A.5) The function b(x) is bounded and Lipschitz, i.e.,
  (i) |b(x)| ≤ ℓ;
  (ii) |b(x') − b(x'')| ≤ ℓ|x' − x''|, x', x'' ∈ R.

(A.6) The random process ξ = (ξ_t)_{t≥0} is ergodic, namely there exists a probability measure λ(dy) on R such that for any bounded and measurable function g(y)
$$P\text{-}\lim_{t \to \infty} \frac{1}{t} \int_0^t g(\xi_s)\, ds = \int_R g(y)\, \lambda(dy).$$

(A.7) The process M^ε = (M_t^ε)_{t≥0} is a square integrable martingale with paths in the Skorokhod space D[0, ∞) whose predictable quadratic variation ⟨M^ε⟩_t satisfies
  (i) ⟨M^ε⟩_t = ε ∫_0^t m_s^ε ds with bounded density m_s^ε; the latter means that there exists a constant c_3 such that
  (ii) m_t^ε ≤ c_3, t ≤ T, P-a.s.
The jumps ΔM_s^ε := M_s^ε − lim_{v↑s} M_v^ε are bounded, i.e., there exists a constant L > 0 such that
  (iii) |ΔM_t^ε| ≤ L, t ≤ T, ε ∈ (0, 1].

Notice that by assumptions (A.4) and (A.5) equation (1.1) has a unique strong solution X^ε for every admissible control u^ε. We shall refer to X^ε as the state process associated with u^ε. The only requirement for the "contamination" process ξ is its ergodicity; no stationarity of ξ or independence from other processes is required.

We furthermore remark that our results remain valid if M^ε is any process with paths in D satisfying
  i) sup_{t≤T} |M_t^ε| → 0 in probability, for all T > 0 (see the derivations below (4.5) and (6.3));
  ii) sup_ε E sup_{t≤T} |M_t^ε|^{2n} < ∞, n ≥ 1 (see Section 7).
In this more general case, a rigorous representation of the dynamics of the system should be made in the integral form below rather than in the differential form (1.1):
$$X_t^\varepsilon = X_0^\varepsilon + \int_0^t \left[ a(X_s^\varepsilon, \xi_{s/\varepsilon}) + b(X_s^\varepsilon)\, u^\varepsilon(s) \right] ds + M_t^\varepsilon.$$
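As simple examples satisfying these conditions (ours, not from the original text): a stationary two-state Markov chain ξ with values {y_1, y_2} and invariant probabilities (λ_1, λ_2) satisfies (A.6) with λ = λ_1 δ_{y_1} + λ_2 δ_{y_2}; and M_t^ε = √ε (N_t − t), with N a standard Poisson process, is a square integrable martingale with ⟨M^ε⟩_t = ε t (so m_s^ε ≡ 1 and c_3 = 1) and jumps of size √ε ≤ 1, hence it satisfies (A.7) with L = 1.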

Finally, notice that our assumptions on the cost functions are quite natural and represent a minimal set of assumptions for the problem to be meaningful: (A.1) guarantees that we stay within the class of classical control problems rather than having also to deal with singular controls (e.g., see [14]), while (A.2) is the usual polynomial growth condition.

3. Asymptotic lower bound for the optimal cost functions

Let v and V^ε be the optimal cost functions corresponding to the deterministic and the original control problems respectively (see (1.7)-(1.9) and (1.1)-(1.3)). The aim of this section is to prove the following theorem.

Theorem 3.1. Let the assumptions of Section 2 be satisfied. Then
$$\liminf_{\varepsilon \to 0} V^\varepsilon \ge v.$$

Proof: We may limit ourselves to the case where lim inf_{ε→0} J^ε(u^ε) = β < ∞. Take a subsequence ε_k → 0 (k → ∞) such that lim_k J^{ε_k}(u^{ε_k}) = lim inf_{ε→0} J^ε(u^ε). Then for k large enough
$$J^{\varepsilon_k}(u^{\varepsilon_k}) \le 2\beta \tag{3.1}$$
(for notational convenience we shall assume that (3.1) holds for all k). From (3.1) and (1.2) it follows that
$$E \int_0^T q(u^{\varepsilon_k}(t))\, dt \le 2\beta. \tag{3.2}$$
Let X^{ε_k} be the state process associated with u^{ε_k}. Given (3.2), we may apply Theorem 6.1 to conclude that the sequence (X^{ε_k}, U^{ε_k}, ||U^{ε_k}||), k ≥ 1, is relatively compact, where $U_t^{\varepsilon_k} = \int_0^t u^{\varepsilon_k}(s)\, ds$ and $\|U^{\varepsilon_k}\|_t = \int_0^t |u^{\varepsilon_k}(s)|\, ds$. Let (X^{ε_{k̃}}, U^{ε_{k̃}}, ||U^{ε_{k̃}}||) be a weakly converging subsequence with limit (X, U, ||U||). Then, by Theorem 6.1, we have
$$X_t = x_0 + \int_0^t [a(X_s) + b(X_s) u(s)]\, ds, \qquad U(t) = \int_0^t u(s)\, ds, \tag{3.3}$$

where x_0 is the "limit of X_0^ε" (see assumption (A.3)), a(x) is defined in (1.5), and b(x) is the same as in (1.1). Since
$$\liminf_{\varepsilon \to 0} J^\varepsilon(u^\varepsilon) = \lim_{\tilde k \to \infty} J^{\varepsilon_{\tilde k}}(u^{\varepsilon_{\tilde k}}), \tag{3.4}$$
where (ε_{k̃}) is any subsequence of (ε_k), we use (3.4) with (ε_{k̃}) corresponding to the weakly converging sequence (X^{ε_{k̃}}, U^{ε_{k̃}}, ||U^{ε_{k̃}}||). Then by Theorems 5.1 and 6.1 we get
$$\lim_{\tilde k} J^{\varepsilon_{\tilde k}}(u^{\varepsilon_{\tilde k}}) \ge E\left\{ \int_0^T [p(X_t) + q(u(t))]\, dt + r(X_T) \right\}. \tag{3.5}$$

From (3.4) and (3.5) we derive
$$\liminf_{\varepsilon \to 0} J^\varepsilon(u^\varepsilon) \ge v. \tag{3.6}$$
If an optimal control exists, then the statement of the theorem is a consequence of (3.6). Otherwise we approximate the optimal value function by the cost associated with δ-optimal controls.

4. Proofs of Theorems 1.1 and 1.2

It follows from Theorem 3.1 that the lower limit of the optimal costs is bounded from below by the optimal cost corresponding to the deterministic model (1.7)-(1.9). The existence of an optimal control u* for problem (1.7)-(1.9) can be shown by standard arguments (see the Remark at the end of Section 6 or the proof of Theorem III.4.1 in [4]). Notice also that assumption (A.1) implies
$$\int_0^T |u^*(t)|^{1+\gamma}\, dt < \infty. \tag{4.1}$$

Next let x*(t) be the (deterministic) solution of (1.7) corresponding to the control u*(t) and let X^{*,ε} = (X_t^{*,ε})_{0≤t≤T} be the (stochastic) state process associated with the control u^ε(t) ≡ u*(t) via (1.1). We first show that
$$P\text{-}\lim_{\varepsilon \to 0} \sup_{t \le T} |X_t^{*,\varepsilon} - x^*(t)| = 0. \tag{4.2}$$
Let
$$\Delta_t^\varepsilon := |X_t^{*,\varepsilon} - x^*(t)|. \tag{4.3}$$
Using (1.1) and (1.7), we get the inequality
$$\Delta_t^\varepsilon \le |X_0^\varepsilon - x_0| + \int_0^t \big[\, |a(X_s^{*,\varepsilon}) - a(x^*(s))| + |b(X_s^{*,\varepsilon}) - b(x^*(s))|\,|u^*(s)| \,\big]\, ds + \sup_{t \le T} \Big| \int_0^t [a(X_s^{*,\varepsilon}, \xi_{s/\varepsilon}) - a(X_s^{*,\varepsilon})]\, ds \Big| + \sup_{t \le T} |M_t^\varepsilon|.$$
By the Lipschitz continuity of a(x) and b(x) (see assumptions (A.4) and (A.5)) it follows that
$$\Delta_t^\varepsilon \le \left\{ |X_0^\varepsilon - x_0| + \sup_{t \le T} \Big| \int_0^t [a(X_s^{*,\varepsilon}, \xi_{s/\varepsilon}) - a(X_s^{*,\varepsilon})]\, ds \Big| + \sup_{t \le T} |M_t^\varepsilon| \right\} + \ell \int_0^t (1 + |u^*(s)|)\, |\Delta_s^\varepsilon|\, ds.$$
Therefore, by the Gronwall-Bellman inequality,
$$\sup_{t \le T} |\Delta_t^\varepsilon| \le \left\{ |X_0^\varepsilon - x_0| + \sup_{t \le T} \Big| \int_0^t [a(X_s^{*,\varepsilon}, \xi_{s/\varepsilon}) - a(X_s^{*,\varepsilon})]\, ds \Big| + \sup_{t \le T} |M_t^\varepsilon| \right\} \exp\!\left( \ell \int_0^T [1 + |u^*(s)|]\, ds \right). \tag{4.4}$$

Now, by assumption (A.3.i) we have P-lim_{ε→0} |X_0^ε − x_0| = 0; furthermore, using an argument similar to the proof of (6.8) below, we get
$$P\text{-}\lim_{\varepsilon \to 0} \sup_{t \le T} \Big| \int_0^t [a(X_s^{*,\varepsilon}, \xi_{s/\varepsilon}) - a(X_s^{*,\varepsilon})]\, ds \Big| = 0. \tag{4.5}$$
Finally, by assumption (A.7) and by Problem 1.9.2 in [15], P-lim_{ε→0} sup_{t≤T} |M_t^ε| = 0. Thus, (4.2) holds.

As a consequence of (4.2) we have
$$P\text{-}\lim_{\varepsilon \to 0} p(X_t^{*,\varepsilon}) = p(x^*(t)), \quad t \in [0, T], \qquad P\text{-}\lim_{\varepsilon \to 0} r(X_T^{*,\varepsilon}) = r(x^*(T)). \tag{4.6}$$

Next we need to prove that the family p(X_t^{*,ε}) of functions on [0, T] × Ω and the family of random variables r(X_T^{*,ε}) are uniformly integrable with respect to the measures dt × dP on [0, T] × Ω and dP on Ω respectively. To this end it is sufficient to show that there exists a constant c > 0 such that
$$E\,[p(X_t^{*,\varepsilon})]^2 \le c, \qquad E\,[r(X_T^{*,\varepsilon})]^2 \le c. \tag{4.7}$$
By assumption (A.2) we have p(x), r(x) ≤ c_1(1 + |x|^{γ_1}). Let n* be the smallest integer such that γ_1 < n*. Evidently, (4.7) holds if there exists a constant c' such that
$$E \sup_{t \le T} |X_t^{*,\varepsilon}|^{2n^*} \le c'. \tag{4.8}$$

Using (1.1) as well as assumptions (A.4) and (A.5), we get
$$\sup_{s \le t} |X_s^{*,\varepsilon}| \le |X_0^\varepsilon| + \ell \int_0^t \Big( 1 + \sup_{\tau \le s} |X_\tau^{*,\varepsilon}| \Big)\, ds + \ell \int_0^T |u^*(t)|\, dt + \sup_{t \le T} |M_t^\varepsilon|.$$
The Gronwall-Bellman inequality implies
$$\sup_{s \le T} |X_s^{*,\varepsilon}| \le e^{\ell T} \left\{ |X_0^\varepsilon| + \ell T + \ell \int_0^T |u^*(t)|\, dt + \sup_{s \le T} |M_s^\varepsilon| \right\}. \tag{4.9}$$
From (7.1) we have
$$E \sup_{t \le T} |M_t^\varepsilon|^{2n} \le \text{const}. \tag{4.10}$$
Inequality (4.8) is therefore a consequence of (4.9), (4.10) and assumption (A.3). By virtue of (4.8) and Theorem 5.4 in [1],
$$\lim_{\varepsilon \to 0} J^\varepsilon(u^*) = \lim_{\varepsilon \to 0} E\left\{ \int_0^T [p(X_t^{*,\varepsilon}) + q(u^*(t))]\, dt + r(X_T^{*,\varepsilon}) \right\} = \int_0^T [p(x^*(t)) + q(u^*(t))]\, dt + r(x^*(T)) = v.$$

Since V^ε ≤ J^ε(u*), we have lim sup_{ε→0} V^ε ≤ v. This inequality and Theorem 3.1 imply Theorem 1.1 and Theorem 1.2.

5. Relative compactness of (U^ε, ||U^ε||)

1. Let q(u) be the control cost function from (1.2). Assume
$$\sup_{\varepsilon \le 1} E \int_0^T q(u^\varepsilon(t))\, dt < \infty. \tag{5.1}$$
Recall that $U^\varepsilon(t) = \int_0^t u^\varepsilon(s)\, ds$ and denote its total variation on the time interval [0, t] by
$$\|U^\varepsilon\|_t = \int_0^t |u^\varepsilon(s)|\, ds. \tag{5.2}$$
The process ||U^ε||_t, 0 ≤ t ≤ T, has paths in the subset C^+_{[0,T]} of C_{[0,T]} consisting of continuous increasing functions. Also, ρ will be used to designate the uniform metric in C_{[0,T]}.

Theorem 5.1. Let assumption (A.1) and (5.1) be satisfied. Then the family of random processes (U^ε, ||U^ε||) = (U^ε(t), ||U^ε||_t)_{0≤t≤T}, ε ≤ 1, is relatively compact in the metric space $(C_{[0,T]} \times C^+_{[0,T]},\ \rho \times \rho)$.
If (U^{ε_k}, ||U^{ε_k}||) is any weakly converging sequence with limit (U, ||U||), then there exists a measurable process (u(t))_{0≤t≤T} such that
1. $E \int_0^T |u(t)|^{1+\gamma}\, dt < \infty$;
2. for any t ≤ T and P-a.s.
$$U(t) = \int_0^t u(s)\, ds, \qquad \|U\|_t = \int_0^t |u(s)|\, ds;$$
3.
$$\liminf_{k \to \infty} E \int_0^T q(u^{\varepsilon_k}(t))\, dt \ge E \int_0^T q(u(t))\, dt. \tag{5.3}$$

Proof: Since C^+_{[0,T]} is closed in C_{[0,T]} in the metric ρ, by virtue of Prokhorov's theorem (see e.g. [1]) only tightness of the family in C_{[0,T]} × C^+_{[0,T]} has to be checked. Due to Theorems 8.2 and 15.2 in [1], we verify the two conditions
$$\lim_{c \to \infty} \limsup_{\varepsilon \to 0} P\Big( \sup_{t \le T} \|U^\varepsilon\|_t > c \Big) = 0, \qquad \lim_{\delta \to 0} \limsup_{\varepsilon \to 0} P\Big( \sup_{t,s \le T:\, |t-s| \le \delta} \big|\, \|U^\varepsilon\|_t - \|U^\varepsilon\|_s \big| > \nu \Big) = 0 \quad \forall \nu > 0, \tag{5.4}$$
and the same conditions for U^ε. Conditions (A.1) and (5.1) imply
$$\sup_{\varepsilon \le 1} E \int_0^T |u^\varepsilon(t)|^{1+\gamma}\, dt < \infty. \tag{5.5}$$
Thereby, conditions (5.4) are verified by Hölder's inequality. Namely,
$$\sup_{t \le T} \|U^\varepsilon\|_t = \|U^\varepsilon\|_T \le T^{\gamma/(1+\gamma)} \left( \int_0^T |u^\varepsilon(t)|^{1+\gamma}\, dt \right)^{1/(1+\gamma)} \tag{5.6}$$
and for any random t, s ≤ T with |t − s| ≤ δ,
$$\big|\, \|U^\varepsilon\|_t - \|U^\varepsilon\|_s \big| \le \delta^{\gamma/(1+\gamma)} \left( \int_0^T |u^\varepsilon(t)|^{1+\gamma}\, dt \right)^{1/(1+\gamma)}. \tag{5.7}$$

We conclude by using Chebyshev's inequality. The validity of conditions of the type (5.4) for U^ε is proved analogously.

Let W(t) be any random process with paths in C_{[0,T]} and let $I^n = \{ s_i = \tfrac{iT}{2^n},\ i = 0, 1, \ldots, 2^n \}$, n ≥ 1, be the dyadic subdivisions of the time interval [0, T]. Put
$$w_n(t) = \frac{W_{s_i} - W_{s_{i-1}}}{s_i - s_{i-1}}, \qquad s_{i-1} \le t < s_i. \tag{5.8}$$

It is known (see [16]) that under the assumption
$$\sup_n E \int_0^T |w_n(t)|^2\, dt < \infty \tag{5.9}$$
the process W(t) is absolutely continuous (with respect to the Lebesgue measure Λ(dt) = dt), i.e. there exists a measurable process w(t) such that for any t ≤ T and P-a.s.
$$W(t) = \int_0^t w(s)\, ds, \qquad E \int_0^T |w(t)|^2\, dt < \infty, \tag{5.10}$$
and, what is more,
$$w(t, \omega) = \lim_n w_n(t, \omega), \qquad \Lambda \times P\text{-a.s.} \tag{5.11}$$
The same proof shows that under the assumption that, for some γ > 0,
$$\sup_n E \int_0^T |w_n(t)|^{1+\gamma}\, dt < \infty, \tag{5.12}$$
we have that (5.10) holds with $E \int_0^T |w(t)|^{1+\gamma}\, dt < \infty$, and (5.11) holds as well.

Let W(t) ≡ U(t) and correspondingly u_n(t) ≡ w_n(t). Therefore, statements 1 and 2 take place if, for the same γ as in (A.1),
$$\sup_n E \int_0^T |u_n(t)|^{1+\gamma}\, dt < \infty. \tag{5.13}$$

To this end, defining u_n^{ε_k}(t) in the same way as w_n(t) but with W(t) ≡ U^{ε_k}(t), we find
$$E_n(U^{\varepsilon_k}) = \int_0^T |u_n^{\varepsilon_k}(t)|^{1+\gamma}\, dt = \sum_{i=1}^{2^n} \left| \frac{\int_{s_{i-1}}^{s_i} u^{\varepsilon_k}(t)\, dt}{T 2^{-n}} \right|^{1+\gamma} T 2^{-n}.$$
On the other hand, due to Jensen's inequality and assumption (A.1),
$$\sum_{i=1}^{2^n} \left| \frac{\int_{s_{i-1}}^{s_i} u^{\varepsilon_k}(t)\, dt}{T 2^{-n}} \right|^{1+\gamma} T 2^{-n} \le \sum_{i=1}^{2^n} \int_{s_{i-1}}^{s_i} |u^{\varepsilon_k}(t)|^{1+\gamma}\, dt = \int_0^T |u^{\varepsilon_k}(t)|^{1+\gamma}\, dt \le \frac{1}{c} \int_0^T q(u^{\varepsilon_k}(t))\, dt. \tag{5.14}$$
By virtue of the weak convergence of U^{ε_k} and assumption (5.1), for any N ≥ 1 we get
$$E \min\big( N, E_n(U) \big) = \lim_k E \min\big( N, E_n(U^{\varepsilon_k}) \big) \le \sup_{\varepsilon \le 1} E \int_0^T |u^\varepsilon(t)|^{1+\gamma}\, dt < \infty.$$
By the monotone convergence theorem, sup_n E E_n(U) < ∞ and so, noticing that $E_n(U) = \int_0^T |u_n(t)|^{1+\gamma}\, dt$, we conclude that (5.13) holds.

To prove statement 3 of the Theorem, introduce
$$E_{n,q}(U^{\varepsilon_k}) = \int_0^T q(u_n^{\varepsilon_k}(t))\, dt = \sum_{i=1}^{2^n} q\!\left( \frac{\int_{s_{i-1}}^{s_i} u^{\varepsilon_k}(t)\, dt}{T 2^{-n}} \right) T 2^{-n}.$$
Since by Jensen's inequality
$$\int_0^T q(u_n^{\varepsilon_k}(t))\, dt = \sum_{i=1}^{2^n} q\!\left( \frac{\int_{s_{i-1}}^{s_i} u^{\varepsilon_k}(t)\, dt}{T 2^{-n}} \right) T 2^{-n} \le \int_0^T q(u^{\varepsilon_k}(t))\, dt,$$
we derive statement 3 by Fatou's lemma and by (5.11), reformulated for u_n(t):
$$\liminf_k E \int_0^T q(u^{\varepsilon_k}(t))\, dt \ge \liminf_n \lim_{N \to \infty} \lim_k E \min\big( N, E_{n,q}(U^{\varepsilon_k}) \big) = \liminf_n \lim_{N \to \infty} E \min\big( N, E_{n,q}(U) \big) = \liminf_n E\, E_{n,q}(U) = \liminf_n E \int_0^T q(u_n(t))\, dt \ge E \int_0^T \liminf_n q(u_n(t))\, dt = E \int_0^T q(u(t))\, dt.$$

6. Relative compactness of (X^ε, U^ε, ||U^ε||)

1. Let X^ε = (X_t^ε)_{t≥0} be defined as in (1.1) and ||U^ε||_t as in (5.2). We consider the triple (X^ε, U^ε, ||U^ε||) = (X_t^ε, U^ε(t), ||U^ε||_t)_{0≤t≤T} with values in D_{[0,T]} × C_{[0,T]} × C^+_{[0,T]}, where D_{[0,T]} is Skorokhod's space.

Theorem 6.1. Let the assumptions of Section 2 and (5.1) be satisfied. Then the family (X^ε, U^ε, ||U^ε||), ε ≤ 1, is relatively compact in the metric space $(D_{[0,T]} \times C_{[0,T]} \times C^+_{[0,T]},\ \rho \times \rho \times \rho)$. If (X^{ε_k}, U^{ε_k}, ||U^{ε_k}||) is any weakly converging sequence with limit (X, U, ||U||), then the statements of Theorem 5.1 hold and
$$X_t = x_0 + \int_0^t [\, a(X_s) + b(X_s) u(s) \,]\, ds, \qquad t \le T, \tag{6.1}$$
where a(x) is defined as in (1.5) and u(s) is the process from Theorem 5.1. For any continuous nonnegative functions p(x) and r(x),
$$\liminf_k E\left\{ \int_0^T p(X_t^{\varepsilon_k})\, dt + r(X_T^{\varepsilon_k}) \right\} \ge E\left\{ \int_0^T p(X_t)\, dt + r(X_T) \right\}. \tag{6.2}$$

Proof: Parallel to X_t^ε introduce a process X_t^{ε,◦} defined by (compare to (1.1))
$$X_t^{\varepsilon,\circ} = X_0^\varepsilon + \int_0^t \left[ a(X_s^{\varepsilon,\circ}, \xi_{s/\varepsilon}) + b(X_s^{\varepsilon,\circ})\, u^\varepsilon(s) \right] ds. \tag{6.3}$$
Due to (1.1), (6.3) and assumptions (A.4), (A.5), the process $Y_t^\varepsilon = \sup_{s \le t} |X_s^\varepsilon - X_s^{\varepsilon,\circ}|$ satisfies the inequality
$$Y_t^\varepsilon \le \ell \int_0^t Y_s^\varepsilon\, d[s + \|U^\varepsilon\|_s] + \sup_{s \le T} |M_s^\varepsilon|, \qquad t \le T,$$
and so by the Gronwall-Bellman inequality we get
$$Y_T^\varepsilon \le \sup_{s \le T} |M_s^\varepsilon|\, \exp\{ \ell [T + \|U^\varepsilon\|_T] \}.$$

By virtue of assumption (A.7) and Problem 1.9.2 in [15], sup_{t≤T} |M_t^ε| → 0 as ε → 0 in probability, and ||U^ε||_T satisfies (5.4). Consequently Y_T^ε → 0 as ε → 0 in probability and, by Theorem 4.1, Ch. 1 in [1], the result of the Theorem remains true if its statements are proved only for the triple (X^{ε,◦}, U^ε, ||U^ε||). By virtue of (5.4), it is sufficient to verify only the following two conditions (see Theorems 8.2 and 15.2 in [1]):
$$\lim_{c \to \infty} \limsup_{\varepsilon \to 0} P\Big( \sup_{t \le T} |X_t^{\varepsilon,\circ}| > c \Big) = 0, \qquad \lim_{\delta \to 0} \limsup_{\varepsilon \to 0} P\Big( \sup_{t,s \le T:\, |t-s| \le \delta} |X_t^{\varepsilon,\circ} - X_s^{\varepsilon,\circ}| > \nu \Big) = 0, \quad \forall \nu > 0. \tag{6.4}$$

It follows from (6.3) and assumptions (A.4) and (A.5) that for any t ≤ T
$$\sup_{s \le t} |X_s^{\varepsilon,\circ}| \le |X_0^\varepsilon| + \ell \int_0^t \Big[ 1 + \sup_{\tau \le s} |X_\tau^{\varepsilon,\circ}| \Big]\, ds + \ell \|U^\varepsilon\|_T,$$
and so, using the Gronwall-Bellman inequality, we get
$$\sup_{s \le T} |X_s^{\varepsilon,\circ}| \le e^{\ell T} \big( |X_0^\varepsilon| + \ell T + \ell \|U^\varepsilon\|_T \big).$$

Evidently, the first condition in (6.4) holds by the proof of Theorem 5.1 and by assumption (A.3.i). For any |t − s| < δ we can apply assumptions (A.4) and (A.5) to write
$$|X_t^{\varepsilon,\circ} - X_s^{\varepsilon,\circ}| \le \ell \int_{t \wedge s}^{t \vee s} \Big[ 1 + \sup_{\tau \le T} |X_\tau^{\varepsilon,\circ}| \Big]\, d\tau + \ell \big( \|U^\varepsilon\|_{t \vee s} - \|U^\varepsilon\|_{t \wedge s} \big) \le \ell \delta \Big[ 1 + \sup_{\tau \le T} |X_\tau^{\varepsilon,\circ}| \Big] + \ell \big( \|U^\varepsilon\|_{t \vee s} - \|U^\varepsilon\|_{t \wedge s} \big).$$
Therefore, the validity of the second condition in (6.4) follows from the proof of Theorem 5.1 and from the first condition in (6.4), which has already been proved.

Let (X^{ε_k,◦}, U^{ε_k}, ||U^{ε_k}||), k ≥ 1, be a weakly converging sequence with limit (X, U, ||U||). Denote by Q the distribution of the limit (X, U, ||U||), i.e. Q is a probability measure on C_{[0,T]} × C_{[0,T]} × C^+_{[0,T]}. For any element (X, U, ||U||) of C_{[0,T]} × C_{[0,T]} × C^+_{[0,T]}, put

$$\Phi_t(X, U, \|U\|) := X_t - x_0 - \int_0^t a(X_s)\, ds - \int_0^t b(X_s)\, dU(s), \tag{6.5}$$
where the function a(x) is defined by (1.5) and x_0 is the same as in assumption (A.3.i). The second statement of the Theorem holds if
$$\sup_{t \le T} |\, \Phi_t(X, U, \|U\|)\, | = 0 \qquad Q\text{-a.s.} \tag{6.6}$$

To prove the validity of (6.6), we show that the functional $\sup_{t \le T} |\Phi_t(X, U, \|U\|)|$ is continuous in the product metric ρ_3 = ρ × ρ × ρ. Let (X^0, U^0, ||U^0||) and (X^n, U^n, ||U^n||), n ≥ 1, be elements of C_{[0,T]} × C_{[0,T]} × C^+_{[0,T]} such that
$$\lim_n \rho_3\big( (X^0, U^0, \|U^0\|), (X^n, U^n, \|U^n\|) \big) = 0.$$
We show that $\lim_n \sup_{t \le T} |\Phi_t(X^n, U^n, \|U^n\|)| = \sup_{t \le T} |\Phi_t(X^0, U^0, \|U^0\|)|$. Taking into account (6.5), we get
$$L^n := \Big| \sup_{t \le T} |\Phi_t(X^n, U^n, \|U^n\|)| - \sup_{t \le T} |\Phi_t(X^0, U^0, \|U^0\|)| \Big| \le \sup_{t \le T} \big| \Phi_t(X^n, U^n, \|U^n\|) - \Phi_t(X^0, U^0, \|U^0\|) \big|$$
$$\le 2 \sup_{t \le T} |X_t^n - X_t^0| + \int_0^T |\, a(X_s^n) - a(X_s^0)\, |\, ds + \int_0^T |\, b(X_s^n) - b(X_s^0)\, |\, d\|U^n\|_s + \sup_{t \le T} \Big| \int_0^t b(X_s^0)\, d[U^n(s) - U^0(s)] \Big|.$$

Using the Lipschitz continuity of the functions a(x) (it is inherited from a(x, y), see (A.4.ii)) and b(x), we obtain the following upper bound for L^n:
$$L^n \le \rho(X^n, X^0)\, \big( 2 + \ell T + \ell \|U^0\|_T \big) + \ell\, \rho(\|U^n\|, \|U^0\|) + L_b^n,$$
where
$$L_b^n := \sup_{t \le T} \Big| \int_0^t b(X_s^0)\, d[U^n(s) - U^0(s)] \Big|.$$
The quantity $L_b^n$ can be evaluated from above in the following way (below [α] stands for the integer part of α):
$$L_b^n \le \sup_{t \le T} \Big| \int_0^t b\big( X^0_{[Ns]/N} \big)\, d[U^n(s) - U^0(s)] \Big| + \ell \sup_{|s'-s''| \le \frac{1}{N}} |X^0_{s'} - X^0_{s''}|\, \big( \|U^n\|_T + \|U^0\|_T \big).$$
Therefore, $\limsup_n L^n \le 2 \ell \|U^0\|_T \sup_{|s'-s''| \le \frac{1}{N}} |X^0_{s'} - X^0_{s''}| \to 0$ as N → ∞, i.e.

$\sup_{t \le T} |\Phi_t(X, U, \|U\|)|$ is a continuous functional. Using this fact, the equality
$$Q\Big( \sup_{t \le T} |\, \Phi_t(X, U, \|U\|)\, | \ge \nu \Big) = \lim_k P\Big( \sup_{t \le T} |\, \Phi_t(X^{\varepsilon_k,\circ}, U^{\varepsilon_k}, \|U^{\varepsilon_k}\|)\, | \ge \nu \Big), \qquad \nu > 0,$$
is implied by the weak convergence mentioned above, and from the estimate
$$\sup_{t \le T} |\, \Phi_t(X^{\varepsilon_k,\circ}, U^{\varepsilon_k}, \|U^{\varepsilon_k}\|)\, | \le |X_0^{\varepsilon_k} - x_0| + \sup_{t \le T} \Big| \int_0^t [a(X_s^{\varepsilon_k}, \xi_{s/\varepsilon_k}) - a(X_s^{\varepsilon_k})]\, ds \Big| \tag{6.7}$$

we can conclude that (6.6) holds if the right hand side of (6.7) goes to zero in probability as k → ∞. Taking into account assumption (A.3.i), for the validity of (6.6) only
$$P\text{-}\lim_k \sup_{t \le T} \Big| \int_0^t [a(X_s^{\varepsilon_k}, \xi_{s/\varepsilon_k}) - a(X_s^{\varepsilon_k})]\, ds \Big| = 0 \tag{6.8}$$
has to be checked.

Evidently, for a piecewise constant function φ with φ(t) = φ(i/n) for i/n ≤ t < (i+1)/n,
$$P\text{-}\lim_{k \to \infty} \int_0^t [\, a(\varphi(s), \xi_{s/\varepsilon_k}) - a(\varphi(s)) \,]\, ds = 0, \qquad \forall t \le T, \tag{6.9}$$
holds by virtue of assumption (A.6). Consequently, for processes X^{k,m,n} that are piecewise constant in t (on the mesh i/n) and take finitely many values, approximating X^{ε_k} (their construction is specified below), for c > 0, m ≥ 1, n ≥ 1 and putting ξ_s^k = ξ_{s/ε_k},
$$P\text{-}\lim_k\; I\Big( \sup_{t \le T} |X_t^{\varepsilon_k}| \le c \Big) \sup_{t \le T} \Big| \int_0^t [\, a(X_s^{k,m,n}, \xi_s^k) - a(X_s^{k,m,n}) \,]\, ds \Big| = 0. \tag{6.11}$$
On the other hand, taking into account the weak convergence of (X_t^{ε_k})_{0≤t≤T}, which implies $\lim_{c \to \infty} \limsup_k P\big( \sup_{t \le T} |X_t^{\varepsilon_k}| > c \big) = 0$, for the validity of (6.8) it remains to show that
$$P\text{-}\lim_{m,n \to \infty} \lim_k \int_0^T |\, a(X_s^{\varepsilon_k}, \xi_s^k) - a(X_s^{k,m,n}, \xi_s^k)\, |\, ds = 0,$$
$$P\text{-}\lim_{m,n \to \infty} \lim_k \int_0^T |\, a(X_s^{\varepsilon_k}) - a(X_s^{k,m,n})\, |\, ds = 0.$$
Taking into account the Lipschitz continuity of the function a(x, y) (see assumption (A.4)), which is also inherited by the function a(x), it is sufficient to show that
$$P\text{-}\lim_{m,n \to \infty} \lim_k \int_0^T |\, X_s^{\varepsilon_k} - X_s^{k,m,n}\, |\, ds = 0. \tag{6.12}$$
To this end put $X_t^{k,n} = X^{\varepsilon_k}_{[nt]/n}$, where [α] is the integer part of α. Then
$$|\, X_s^{\varepsilon_k} - X_s^{k,m,n}\, | \le |\, X_s^{\varepsilon_k} - X_s^{k,n}\, | + |\, X_s^{k,n} - X_s^{k,m,n}\, |.$$
Obviously $|X_s^{k,n} - X_s^{k,m,n}| \le \frac{1}{m}$. Consequently
$$\int_0^T |\, X_s^{\varepsilon_k} - X_s^{k,m,n}\, |\, ds \le \frac{T}{m} + \int_0^T |\, X_s^{\varepsilon_k} - X^{\varepsilon_k}_{[ns]/n}\, |\, ds \le \frac{T}{m} + T \sup_{s,t \le T:\, |s-t| \le 1/n} |X_t^{\varepsilon_k} - X_s^{\varepsilon_k}|.$$
Therefore, for any ν > 0,
$$P\Big( \int_0^T |\, X_s^{\varepsilon_k} - X_s^{k,m,n}\, |\, ds > \nu \Big) \le P\Big( \sup_{s,t \le T:\, |s-t| \le 1/n} |X_t^{\varepsilon_k} - X_s^{\varepsilon_k}| > \frac{\nu}{T} - \frac{1}{m} \Big). \tag{6.13}$$
As a result, (6.12) follows from the weak convergence of (X_t^{ε_k})_{0≤t≤T}, which implies the convergence to zero of the right hand side of (6.13).

It remains to prove (6.2). Due to the weak convergence of (X_t^{ε_k})_{0≤t≤T}, k ≥ 1, we find
$$\liminf_k E\left\{ \int_0^T p(X_t^{\varepsilon_k})\, dt + r(X_T^{\varepsilon_k}) \right\} \ge \lim_k E\left\{ \int_0^T \big( N \wedge p(X_t^{\varepsilon_k}) \big)\, dt + N \wedge r(X_T^{\varepsilon_k}) \right\} = E\left\{ \int_0^T \big( N \wedge p(X_t) \big)\, dt + N \wedge r(X_T) \right\} \quad \forall N \ge 1,$$
and conclude by using the monotone convergence theorem.

2. Remark. The method of proof of Theorem 6.1 can be adapted to the following deterministic problem. Let u_n(t), n ≥ 1, be a sequence of measurable functions satisfying
$$\sup_n \int_0^T |u_n(t)|^{1+\gamma}\, dt < \infty, \qquad \gamma > 0.$$
For each n consider the differential equation
$$dx_n(t) = [\, a(x_n(t)) + b(x_n(t))\, u_n(t) \,]\, dt$$
with the initial condition x_n(0) = x_0. Put
$$U^n(t) = \int_0^t u_n(s)\, ds, \qquad \|U\|^n_t = \int_0^t |u_n(s)|\, ds.$$
By the same technique as in the proof of Theorem 6.1, one can show that the family (x_n(t), U^n(t), ||U||^n_t)_{0≤t≤T}, n ≥ 1, is uniformly bounded and equicontinuous. Then by the Arzelà-Ascoli theorem this family is relatively compact and there exists a subsequence (x_{n_k}(t), U^{n_k}(t), ||U||^{n_k}_t)_{0≤t≤T} converging uniformly to a limit (x^0(t), U^0(t), ||U||^0_t)_{0≤t≤T}, with absolutely continuous U^0(t), i.e. there exists a measurable function u^0(t) such that $U^0(t) = \int_0^t u^0(s)\, ds$. Furthermore, x^0(t) is the unique solution of the differential equation
$$dx^0(t) = [\, a(x^0(t)) + b(x^0(t))\, u^0(t) \,]\, dt$$
with the initial condition x^0(0) = x_0.

7. Upper bound for E sup_{t≤T} |M_t^ε|^{2n}

In this section we prove, under assumption (A.7), that for any n > 1 and T > 0
$$\sup_{\varepsilon \le 1} E \sup_{t \le T} |M_t^\varepsilon|^{2n} < \infty. \tag{7.1}$$
If E |M_T^ε|^{2n} < ∞, we can apply Doob's inequality (see e.g. [15]) to obtain
$$E \sup_{t \le T} |M_t^\varepsilon|^{2n} \le \left( \frac{2n}{2n-1} \right)^{2n} E\, |M_T^\varepsilon|^{2n}.$$
Thus, it suffices to show that
$$\sup_{\varepsilon \le 1} E\, |M_T^\varepsilon|^{2n} < \infty. \tag{7.2}$$
We shall use the notations k, N_t, and V_t to denote a generic positive constant depending on (c_3, L, n), a local martingale, and a nondecreasing process (with paths in D_{[0,∞)}) respectively, where N_t and V_t are adapted to the filtration F^ε (all these objects may be different in different formulas). To check the validity of (7.2), we shall show that (M_t^ε)^{2n} admits the representation
$$(M_t^\varepsilon)^{2n} = k \int_0^t [1 + (M_s^\varepsilon)^{2n}]\, ds + N_t - V_t. \tag{7.3}$$
From (7.3) the desired result follows immediately. In fact, by Ito's formula we find
$$e^{-kt} (M_t^\varepsilon)^{2n} = 1 - e^{-kt} + \int_0^t e^{-ks}\, dN_s - \int_0^t e^{-ks}\, dV_s \le 1 + \int_0^t e^{-ks}\, dN_s. \tag{7.4}$$
The Ito integral $\int_0^t e^{-ks}\, dN_s$ is a local martingale. Denote its localizing sequence of stopping times by (τ_j)_{j≥1}, i.e. for any t > 0, $E \int_0^{t \wedge \tau_j} e^{-ks}\, dN_s = 0$, j ≥ 1. Thereby, from (7.4) it follows that
$$E\, e^{-k(T \wedge \tau_j)} (M_{T \wedge \tau_j}^\varepsilon)^{2n} \le 1, \qquad j \ge 1,$$
and so we conclude by using Fatou's lemma. Thus, only (7.3) has to be proved.

By Ito's formula
$$(M_t^\varepsilon)^{2n} = 2n \int_0^t (M_{s-}^\varepsilon)^{2n-1}\, dM_s^\varepsilon + n(2n-1) \int_0^t (M_{s-}^\varepsilon)^{2n-2}\, d\langle M^{\varepsilon,c} \rangle_s + \sum_{s \le t} \big[ (M_s^\varepsilon)^{2n} - (M_{s-}^\varepsilon)^{2n} - 2n (M_{s-}^\varepsilon)^{2n-1} \Delta M_s^\varepsilon \big], \tag{7.5}$$
where $\langle M^{\varepsilon,c} \rangle_t$ is the predictable quadratic variation of the continuous part of the martingale M_t^ε. The representation (7.5) is nothing but
$$(M_t^\varepsilon)^{2n} = N_t + B_t \tag{7.6}$$
with the local martingale
$$N_t = 2n \int_0^t (M_{s-}^\varepsilon)^{2n-1}\, dM_s^\varepsilon \tag{7.7}$$
and the nondecreasing process
$$B_t = n(2n-1) \int_0^t (M_{s-}^\varepsilon)^{2n-2}\, d\langle M^{\varepsilon,c} \rangle_s + \sum_{s \le t} \big[ (M_s^\varepsilon)^{2n} - (M_{s-}^\varepsilon)^{2n} - 2n (M_{s-}^\varepsilon)^{2n-1} \Delta M_s^\varepsilon \big].$$
Denote by µ^ε(dt, dz) the measure of jumps of the martingale M_t^ε and by ν^ε(dt, dz) its compensator. Since (with R_◦ = R \ {0})
$$\sum_{s \le t} \big[ (M_s^\varepsilon)^{2n} - (M_{s-}^\varepsilon)^{2n} - 2n (M_{s-}^\varepsilon)^{2n-1} \Delta M_s^\varepsilon \big] = \int_0^t \int_{R_\circ} \big[ (M_{s-}^\varepsilon + z)^{2n} - (M_{s-}^\varepsilon)^{2n} - 2n (M_{s-}^\varepsilon)^{2n-1} z \big]\, \mu^\varepsilon(ds, dz) \tag{7.8}$$
and the process
$$N_t = \int_0^t \int_{R_\circ} \big[ (M_{s-}^\varepsilon + z)^{2n} - (M_{s-}^\varepsilon)^{2n} - 2n (M_{s-}^\varepsilon)^{2n-1} z \big]\, \mu^\varepsilon(ds, dz) - \int_0^t \int_{R_\circ} \big[ (M_{s-}^\varepsilon + z)^{2n} - (M_{s-}^\varepsilon)^{2n} - 2n (M_{s-}^\varepsilon)^{2n-1} z \big]\, \nu^\varepsilon(ds, dz)$$
is a local martingale too, we arrive at a new decomposition of the type (7.6) with local martingale
$$N_t = 2n \int_0^t (M_{s-}^\varepsilon)^{2n-1}\, dM_s^\varepsilon + \int_0^t \int_{R_\circ} \big[ (M_{s-}^\varepsilon + z)^{2n} - (M_{s-}^\varepsilon)^{2n} - 2n (M_{s-}^\varepsilon)^{2n-1} z \big]\, [\mu^\varepsilon - \nu^\varepsilon](ds, dz) \tag{7.9}$$
and nondecreasing process
$$B_t = n(2n-1) \int_0^t (M_{s-}^\varepsilon)^{2n-2}\, d\langle M^{\varepsilon,c} \rangle_s + \int_0^t \int_{R_\circ} \big[ (M_{s-}^\varepsilon + z)^{2n} - (M_{s-}^\varepsilon)^{2n} - 2n (M_{s-}^\varepsilon)^{2n-1} z \big]\, \nu^\varepsilon(ds, dz). \tag{7.10}$$
Using the fact that |ΔM_s^ε| ≤ L, we get ν^ε(ds, dz) = I(|z| ≤ L) ν^ε(ds, dz). Therefore, by virtue of Taylor's expansion for the function f(x) = x^{2n} and Hölder's inequality, one can find a constant k such that
$$dB_t \le n(2n-1) (M_{t-}^\varepsilon)^{2n-2}\, d\langle M^{\varepsilon,c} \rangle_t + k \int_{|z| \le L} (M_{t-}^\varepsilon)^{2n-2} (1 + z^2)\, \nu^\varepsilon(dt, dz).$$
Recall that the quadratic variation [M^ε, M^ε]_t of M_t^ε is defined as
$$[M^\varepsilon, M^\varepsilon]_t = \langle M^{\varepsilon,c} \rangle_t + \sum_{s \le t} (\Delta M_s^\varepsilon)^2 = \langle M^{\varepsilon,c} \rangle_t + \int_0^t \int_{R_\circ} z^2\, \mu^\varepsilon(ds, dz).$$
Consequently, taking into account that x^{2n−2} ≤ 1 + x^{2n}, we obtain
$$dB_t \le 2[n(2n-1) + k]\, (M_{t-}^\varepsilon)^{2n-2}\, d[M^\varepsilon, M^\varepsilon]_t \le 2[n(2n-1) + k]\, \big( 1 + (M_{t-}^\varepsilon)^{2n} \big)\, d[M^\varepsilon, M^\varepsilon]_t.$$
Define a nondecreasing process
$$V_t = 2[n(2n-1) + k] \int_0^t \big( 1 + (M_{s-}^\varepsilon)^{2n} \big)\, d[M^\varepsilon, M^\varepsilon]_s - B_t.$$
Then, for (M_t^ε)^{2n} we have the following decomposition:
$$(M_t^\varepsilon)^{2n} = N_t + 2[n(2n-1) + k] \int_0^t \big( 1 + (M_{s-}^\varepsilon)^{2n} \big)\, d[M^\varepsilon, M^\varepsilon]_s - V_t, \tag{7.11}$$
where the local martingale N_t is defined in (7.9). Since $[M^\varepsilon, M^\varepsilon]_t - \langle M^\varepsilon \rangle_t$ is a local martingale, we arrive at a new representation for (M_t^ε)^{2n}:
$$(M_t^\varepsilon)^{2n} = N_t + 2[n(2n-1) + k] \int_0^t \big( 1 + (M_{s-}^\varepsilon)^{2n} \big)\, d\langle M^\varepsilon \rangle_s - V_t \tag{7.12}$$
with the same nondecreasing process V_t and a new local martingale N_t. Due to assumption (A.7) we have (for ε ≤ 1) $d\langle M^\varepsilon \rangle_t \le c_3\, dt$, i.e.
$$V_t' = 2[n(2n-1) + k] \left[ \int_0^t \big( 1 + (M_s^\varepsilon)^{2n} \big)\, c_3\, ds - \int_0^t \big( 1 + (M_{s-}^\varepsilon)^{2n} \big)\, d\langle M^\varepsilon \rangle_s \right] \tag{7.13}$$
is a nondecreasing process. Thus, (7.3) is implied by (7.12) and (7.13).
References

[1] P. Billingsley, Convergence of Probability Measures, John Wiley, New York, 1968.
[2] C. Dellacherie, Capacités et Processus Stochastiques, Springer, Berlin, 1972.
[3] S. N. Ethier and T. G. Kurtz, Markov Processes: Characterization and Convergence, John Wiley, New York, 1986.
[4] W. H. Fleming and R. W. Rishel, Deterministic and Stochastic Optimal Control, Springer, Berlin, 1975.
[5] M. I. Freidlin and A. D. Wentzell, Random Perturbations of Dynamical Systems, Springer, New York, 1984.
[6] J. Jacod and A. N. Shiryayev, Limit Theorems for Stochastic Processes, Springer, Berlin-New York, 1987.
[7] E. V. Krichagina and M. I. Taksar, Diffusion Approximation for GI/G/1 Controlled Queues, QUESTA, 12 (1992), pp. 333-368.
[8] E. V. Krichagina, S. X. C. Lou, S. P. Sethi and M. I. Taksar, Production Control in a Failure-Prone Manufacturing System: Diffusion Approximation and Asymptotic Optimality, Annals Appl. Probab., 3 (1993), pp. 421-453.
[9] H. J. Kushner, Approximation and Weak Convergence Methods for Random Processes, MIT Press, Cambridge, 1984.
[10] H. J. Kushner and W. J. Runggaldier, Nearly Optimal State Feedback Controls for Stochastic Systems with Wideband Noise Disturbances, SIAM J. Control and Optimiz., 25 (1987), pp. 298-315.
[11] H. J. Kushner and K. M. Ramachandran, Optimal and Approximately Optimal Control Policies for Queues in Heavy Traffic, SIAM J. Control and Optimiz., 27 (1989), pp. 1293-1318.
[12] H. J. Kushner and L. F. Martins, Routing and Singular Control for Queueing Networks in Heavy Traffic, SIAM J. Control and Optimiz., 28 (1990), pp. 1209-1233.
[13] J. Lehoczky, S. Sethi, M. Soner and M. Taksar, An Asymptotic Analysis of Hierarchical Control of Manufacturing Systems, Math. of Oper. Res., 16 (1991), pp. 596-608.
[14] J. Lehoczky and S. Shreve, Absolutely Continuous and Singular Stochastic Control, Stochastics, 17 (1986), pp. 91-109.
[15] R. S. Liptser and A. N. Shiryayev, Theory of Martingales, Kluwer Academic Publ., Dordrecht, 1989.
[16] A. D. Wentzell, Additive Functionals of a Multidimensional Wiener Process, Soviet Mathematics, 2 (1961), pp. 848-851.