
Production-Inventory Systems with Imperfect Advance Demand Information and Updating

Saif Benjaafar∗    William L. Cooper∗    Setareh Mardan†

October 25, 2010

Abstract

We consider a supplier with finite production capacity and stochastic production times. Customers provide advance demand information (ADI) to the supplier by announcing orders ahead of their due dates. However, this information is not perfect, and customers may request an order be fulfilled prior to or later than the expected due date. Customers update the status of their orders, but the time between consecutive updates is random. We formulate the production-control problem as a continuous-time Markov decision process and prove there is an optimal state-dependent base-stock policy, where the base-stock levels depend upon the numbers of orders at various stages of update. In addition, we derive results on the sensitivity of the state-dependent base-stock levels to the number of orders in each stage of update. In a numerical study, we examine the benefit of ADI, and find that it is most valuable to the supplier when the time between updates is moderate. We also consider the impact of holding and backorder costs, numbers of updates, and the fraction of customers that provide ADI. In addition, we find that while ADI is always beneficial to the supplier, this may not be the case for the customers who provide the ADI.

Keywords: Advance demand information, production-inventory systems, make-to-stock queues, continuous-time Markov decision processes



∗ Program in Industrial and Systems Engineering, University of Minnesota, 111 Church Street S.E., Minneapolis, MN 55455
† PROS, 3100 Main Street, Suite #900, Houston, TX 77002

1 Introduction

It is increasingly common for members of the same supply chain to share advance demand information (ADI). This practice has been facilitated by information technologies such as the Internet, electronic data interchange (EDI), and radio frequency identification (RFID). It has also been supported by initiatives such as the inter-industry consortium on Collaborative Planning, Forecasting and Replenishment (CPFR), which provides a framework for participating companies to share future demand projections and coordinate ordering decisions. Large manufacturers, such as Toyota and Boeing, are tightly integrated with their first tier suppliers with whom they share production status, inventory usage, and even future design plans. Large retailers, such as Wal-Mart and Best Buy, have invested in sophisticated information collecting and processing infrastructure that enables them to share real-time inventory usage and point-of-sale (POS) data with thousands of their suppliers. Several manufacturers that sell directly to the consumer, such as Dell, and online retailers, such as Amazon, encourage their customers to place their orders early by offering discounts to those that accept later delivery dates. In some industries, suppliers allow their long-term customers to place soft orders far ahead of due dates, which may then later be firmed up, modified, or canceled. Although ADI can take on different forms and may be enabled by a variety of technologies, it typically reduces to customers providing advance notice to their suppliers about the timing and size of future orders. This information can be perfect (exact information about future orders) or imperfect (estimates of timing or quantity of future orders). 
The information can also be explicit, with customers directly stating their intent about future orders, or implicit, with customers allowing suppliers to observe their internal operations and to determine estimates of future orders (the systems we consider in this paper are in part motivated by settings with such implicit information; we provide examples and further discussion later in this section). It is generally believed that ADI, even if imperfect, can improve supply chain performance. In particular, with information about future demand, a supplier may be able to reduce the need for inventory or excess capacity. Customers may also benefit through improved service quality or lower costs. However, the availability of ADI raises questions. How should a supplier use ADI to make decisions? How valuable is ADI to suppliers and customers and how is this value affected by operating characteristics of the supplier and the quality of information provided by customers? How significant are benefits from receiving information further in advance or increasing the portion of customers that provide ADI? Is ADI equally beneficial to all parties in the supply chain and could it be harmful, particularly to the customer who provides it?

We address these and other related questions for a supplier that produces a single product. Customers furnish the supplier with ADI by announcing orders ahead of their due dates. However, this information is not perfect, and orders may become due prior to or later than the announced expected due date or they can be canceled altogether. Hence, the demand leadtime (the time between when an order is announced and when it is requested or canceled) is random. Customers provide status updates as their orders progress towards becoming due, but the time between consecutive updates is also random and independent of updates for other orders. In this paper, we are primarily motivated by settings where customers implicitly provide ADI to the supplier by allowing the supplier to observe their internal operations (e.g., order fulfillment, manufacturing, inventory usage), thereby enabling the supplier to estimate when customers will eventually place orders. We refer to such internal operations as the demand leadtime system. For most of the paper, we assume that the actual due dates of different orders are independent of each other and orders that are announced later can become due before (or after) those that are announced earlier. Updates are also independent and do not follow a first-announced, first-updated rule. We refer to this as a system with independent due dates (IDD). In Section 5, we show how our treatment can be extended to systems where announced orders are updated and the orders become due in the sequence in which they are announced. We refer to this as a system with sequential due dates (SDD). The following examples illustrate the types of settings we model in this paper. Consider a supplier that provides a component to a manufacturer, such as Boeing, of a large and complex product (e.g., an aircraft). The manufacturer informs the supplier each time it initiates the production of a new product and each time it completes a stage of the production process. 
The component provided by the supplier is not immediately needed and is required only at a later stage of the production process. The manufacturer does not accept early deliveries, but wishes to receive the component as soon as it is needed in a just-in-time fashion. The supplier uses the information about the progression of the product through the manufacturer’s production process to estimate when it will need to make a delivery to the manufacturer. To make such estimates, the supplier uses its knowledge of the manufacturer’s operations and available data from past interactions. However, the estimates are imperfect and the manufacturer (due to inherent variability) may complete a production stage sooner or later than expected. The manufacturer may initiate, in response to its own demand, the production of multiple products simultaneously (e.g., an aircraft manufacturer may assemble multiple airplanes in parallel). The evolution of these products through the production process is largely independent, so that a product that enters a particular stage of production later than


another product may complete it sooner. This type of ADI may also arise in settings other than manufacturing. For example, van Donselaar et al. (2001) present a case study of how builders provide material suppliers with ADI about the start and progress of construction projects. The suppliers use the information to estimate when a builder will need materials. This estimation is not perfect because progress on a construction project can be variable and because design specifications may change over the course of a project, sometimes leading the builders ultimately not to place orders. In this paper, we provide a general framework for modeling systems with this type of ADI. The framework is broad enough to model a wide range of demand leadtime systems with various assumptions regarding due date updating. The demand leadtime system can be viewed in general as a queueing system composed of parallel servers, with service times consisting of multiple stages of random duration and the completion of a stage corresponding to an update. Arrivals to the demand leadtime system correspond to orders being announced; similar to service times, interarrival times are random and may have multiple stages, with the completion of a stage indicating an update. Departures from the demand leadtime system correspond to orders becoming due. In the systems we study, the supplier has finite capacity, producing items one at a time with stochastic production times. Hence, the supplier itself can be viewed as a single-server queue, with arrivals corresponding to orders becoming due (i.e., an arrival to the supplier is a departure from the demand leadtime system). The supplier has the ability to produce items ahead of their due dates in a make-to-stock fashion. However, items in inventory incur a holding cost. When an order becomes due and it cannot be immediately satisfied from inventory, it is backordered and incurs a backorder cost.
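The two-queue view described above (an infinite-server demand leadtime system feeding a single-server make-to-stock supplier) is easy to prototype. The following is a minimal Gillespie-style simulation sketch under a fixed base-stock policy; the state-dependent policies analyzed later in the paper would replace the constant `base_stock` with a level depending on the stage counts. All parameter values and the function name `simulate_idd` are our own illustrative choices, not the paper's.

```python
import random

def simulate_idd(lam=1.0, mu=1.5, nus=(0.8, 0.8), base_stock=3,
                 h=1.0, b=10.0, horizon=20_000.0, seed=0):
    """Simulate the make-to-stock supplier fed by a k-stage demand
    leadtime system, under the policy: produce iff net inventory <
    base_stock.  Returns the average cost per unit time."""
    rng = random.Random(seed)
    k = len(nus)
    x = 0                       # net inventory (negative = backorders)
    y = [0] * k                 # announced orders in each update stage
    t, cost = 0.0, 0.0
    while t < horizon:
        producing = x < base_stock
        rates = [lam, mu if producing else 0.0] + [y[i] * nus[i] for i in range(k)]
        total = sum(rates)
        dt = rng.expovariate(total)
        # accumulate holding/backorder cost over the sojourn in this state
        cost += (h * max(x, 0) + b * max(-x, 0)) * dt
        t += dt
        u, acc, idx = rng.random() * total, 0.0, 0
        for idx, r in enumerate(rates):
            acc += r
            if u < acc:
                break
        if idx == 0:
            y[0] += 1           # new order announced (enters stage 1)
        elif idx == 1:
            x += 1              # production completion
        else:
            i = idx - 2
            y[i] -= 1
            if i + 1 < k:
                y[i + 1] += 1   # order updated to the next stage
            else:
                x -= 1          # order became due; filled or backordered
    return cost / t

print(round(simulate_idd(), 2))
```

Because every clock is exponential, drawing one exponential with the total rate and then selecting an event in proportion to its rate reproduces the continuous-time Markov chain exactly.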
The supplier’s objective is to find a production control policy to minimize the expected total discounted cost or the expected average cost per unit time. We formulate the problem as a continuous-time Markov decision process (MDP). We show that there is an optimal production policy that is a state-dependent base-stock policy, wherein the supplier produces if and only if the net inventory is below a base-stock level, which depends only upon the numbers of orders in various stages of update. We also derive results on the sensitivity of the statedependent base-stock levels to the numbers of orders in each stage of update. For SDD systems, we obtain similar results. In our analysis, we develop a method for proving structural properties of optimal policies of continuous-time Markov decision processes (CTMDPs) with unbounded jump rates. The derived structure is useful as it allows one to compute and store an optimal policy in terms of just the base-stock levels, simplifying the policy’s implementation. The structure of the optimal policy can also guide construction of simpler heuristics if needed, or in assessing the


effectiveness of heuristics that may already be in use. We also conduct a numerical study to examine the benefits of ADI to both suppliers and customers by comparing systems with ADI, without ADI, and with partial ADI. The study yields several managerial insights, a few of which we now summarize. Increasing the average demand leadtime by increasing the number of updates always reduces the supplier’s cost. However, given a fixed number of updates, increasing the average time between updates may increase or decrease cost. ADI is most valuable when the average time between updates is moderate. ADI is less valuable when the average time between updates is short, because there is little time to react to information. It is also less valuable when the average time between updates is long, because the earlier notice comes with an increase in variability of the demand leadtime. This points out that obtaining earlier notice (on average) of orders is not necessarily desirable, and that when evaluating the benefit of ADI, it is important to account for the mechanism by which this ADI might be obtained. The incremental cost reduction from updating is often small compared to that from announcing orders ahead of their due dates. Typically, much of the benefit of ADI can be realized if customers provide just initial advance order announcements and few or no updates. Although ADI leads to an overall reduction in cost, in some cases it may be used by the supplier to reduce inventory at the expense of more backorders. Therefore, customers that provide ADI may witness a decline in service levels. However, in exchange for ADI, customers are in position to negotiate an increase in the backorder penalty they apply to the supplier. Higher backorder penalties can serve as a mechanism for customers to deter suppliers from reducing service levels, or as a mechanism to share, indirectly, the cost savings from ADI. The remainder of the paper is organized as follows. 
In Section 2, we give a brief literature review and summarize our contribution. In Section 3, we formulate the problem and describe the structure of an optimal policy. We also discuss extensions including systems with variable numbers of updates, order cancelations, multiple customer classes, and lost sales. In Section 4, we present numerical results. In Section 5, we extend our analysis to systems with sequential updating. In Section 6, we offer a summary and concluding comments. Proofs are in the Appendix and Online Supplement.

2 Literature Review and Summary of Contributions

There is a growing literature on inventory systems with ADI. A review of much of this work can be found in Gallego and Özer (2002). Models can be broadly classified into two categories based on whether inventory is reviewed periodically or continuously.

For systems with periodic review, ADI is typically modeled as information available about demand in future periods. Under varying assumptions, Gallego and Özer (2001), Özer and Wei (2004), and Schwarz et al. (1997) have shown the existence of optimal state-dependent base-stock policies for periodic-review problems with ADI. In these papers, the base-stock levels depend upon a vector of advance orders for future periods. Özer (2003) extends the analysis to distribution systems with multiple retailers, Gallego and Özer (2003) to serial systems, and Wang and Toktay (2008) to systems with flexible delivery. Other papers that consider periodic review systems with ADI include Thonemann (2002), Gavirneni et al. (1999), and DeCroix and Mookerjee (1997). For continuous-review inventory systems with ADI, Buzacott and Shanthikumar (1994) consider production-inventory systems with ADI and evaluate policies that use two parameters: a base-stock level and a release leadtime. Hariharan and Zipkin (1995) introduced the notion of demand leadtime in a system where orders are announced a fixed amount of time before they are due. For constant supply leadtimes and Poisson order arrivals, they show that there is an optimal base-stock policy with a fixed base-stock level. Karaesmen et al. (2002) analyze a discrete-time model with constant demand leadtimes that is similar to our SDD model with no due-date updating (see Section 5). They prove the optimality of state-dependent base-stock policies. Gallego and Özer (2002, Section 2.4) consider a system similar to a special case of our SDD setting, but with exogenous load-independent supply leadtimes. Gayon et al. (2009) study a system similar to our IDD scheme but with multiple demand classes, lost sales, and no due-date updates. Other papers that deal with continuous-review systems include Liberopoulos et al. (2003) and Karaesmen et al. (2004).
It is possible to view the demand leadtime system in our model as a Markovian demand-modulating process with transition probabilities between states determined by the dynamics of order announcements and due dates. Previous literature (see, e.g., Chen and Song 2001) has established the optimality of state-dependent base-stock policies for periodic-review inventory problems with Markov-modulated demand and exogenous leadtimes. The results of Chen and Song do not directly apply to our setting of endogenous leadtimes and continuous review. Nevertheless, it might be possible to develop an alternate analysis of the systems considered herein using techniques from the study of inventory models with Markov-modulated demand. Advance demand information can be viewed as a form of forecast updating. Examples of papers that deal with inventory systems with periodic forecast updates include Graves et al. (1986), Heath and Jackson (1994), Güllü (1996), Sethi et al. (2001), Zhu and Thonemann (2004), and references therein. The models we present in this paper can be viewed as dealing with forecast updates. However, in our case the updates are with respect to the timing of future demand.


Finally, there is a literature that deals with how a supplier should quote delivery leadtimes to its customers; see, for example, Duenyas and Hopp (1995), Hopp and Sturgis (2001), and references therein. The setting studied in this literature is quite different from ours and typically concerns make-to-order systems where no finished goods inventory is held in advance of customer orders. Relative to the above literature, we make the following contributions. Our paper is the first to consider imperfect ADI with updates for continuous-review production-inventory systems. It also appears to be the first to directly model stochastic demand leadtimes and distinguish between systems with independent and sequential due date updates, and to derive the structure of an optimal policy for each. The paper offers one of the most general models of ADI in the literature (e.g., systems with no updates or with a single update can be treated as special cases). The modeling framework is flexible and can accommodate additional features such as random numbers of updates, order cancelations, multiple demand classes, and lost sales. Moreover, the numerical study yields new insights on the benefit of ADI to suppliers, highlighting important effects due to capacity, demand leadtime, and cost parameters. It also contrasts the impact of (i) increasing the number of updates, (ii) increasing the fraction of customers who give ADI, and (iii) increasing the length of individual update stages. The numerical results also shed light on effects of ADI on customers. We show that customers may see their service quality deteriorate if they provide ADI to their suppliers. Beyond the context of ADI, our paper also contains an approach for proving structural properties of optimal policies of CTMDPs with unbounded jump rates. (The IDD model has unbounded jump rates.) 
The usual approach for proving structural properties for CTMDPs with bounded jump rates is to first uniformize (see, e.g., Lippman 1975) the CTMDP to get an equivalent discrete-time Markov decision process (DTMDP), and then to show that certain properties of functions are preserved by the DTMDP transition operator. Results then follow using induction and the convergence of value iteration. With unbounded jump rates, uniformization cannot be applied, and hence the “usual approach” does not work. Our method for CTMDPs with unbounded jump rates involves proving the desired structural properties for each of a sequence of problems with bounded jump rates, and then extending to the problem with unbounded jump rates by passing to a limit via a suitably chosen subsequence and appealing to results of Guo and Hernández-Lerma (2003). Although the method is somewhat intuitive, it involves resolving a number of non-trivial technical issues, some of which are problem specific. A variant of the approach was used in the paper by Gayon et al. (2009) cited above. The approach may prove useful in other problems with unbounded jump rates.


3 Problem Formulation and Structure of an Optimal Policy

Consider a supplier of a single product, who can produce at most one unit of the product at a time. The supplier may hold completed units of the product in inventory. Any such unit of inventory incurs a holding cost of h per unit time. We model ADI through the notion of a demand leadtime system. (As mentioned in the introduction, such a system may represent the internal operations of customers; ADI is provided implicitly by allowing the supplier to view these internal operations.) Orders for the product are announced before their due dates. Such announcements may be viewed as arrivals to the demand leadtime system. We assume that the announcements arrive continuously over time according to a Poisson process with rate λ. The amount of time between when an order is announced and when it becomes due is random. We refer to this random variable as the demand leadtime. The demand leadtime of an order is the amount of time it spends in the demand leadtime system. We assume orders are homogeneous in the sense that demand leadtimes have the same distribution for all orders, and hence the expected demand leadtime is the same for all orders. After an order is announced, it progresses through the demand leadtime system before becoming due. Specifically, it undergoes a series of k − 1 updates (k ≥ 1). For i = 1, . . . , k − 1, the time between the (i − 1)th and ith update is exponentially distributed with mean ν_i^{-1}. (The 0th update is the order’s initial announcement.) The time between the (k − 1)th update and when the order becomes due is exponentially distributed with mean ν_k^{-1}. Hence, each demand leadtime consists of k exponentially distributed stages, with the expected demand leadtime of each order equal to ν_1^{-1} + · · · + ν_k^{-1}. (The case with k = 1 represents a situation with no updates and exponential demand leadtimes.) When an order has undergone exactly i − 1 updates, we say that it is in stage i.
Viewed in this fashion, the ith update corresponds to an order moving from stage i to stage i + 1. When an order undergoes its ith update, the supplier learns that the order’s expected remaining demand leadtime has decreased from ν_i^{-1} + · · · + ν_k^{-1} to ν_{i+1}^{-1} + · · · + ν_k^{-1}. Equivalently, we may think of the demand leadtime as having a phase-type distribution with k phases in series. Information is provided each time the demand leadtime completes a phase. In the case where ν_i = ν for i = 1, . . . , k, demand leadtimes have an Erlang distribution. Note that the process by which orders are announced, updated, and become due can be viewed as an M/G/∞ queue. After an order becomes due (i.e., after it leaves the demand leadtime system), the supplier fills the order if it has inventory on hand. If the supplier does not have inventory on hand, then the order is backordered and incurs a backorder cost of b per unit time. Orders do not incur backorder costs before they are due; i.e., orders in the demand leadtime system do not incur backorder costs.
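The demand leadtime just described is hypoexponential (a sum of k independent exponential stages, reducing to an Erlang distribution when the rates are equal). A quick sanity check of the mean ν_1^{-1} + · · · + ν_k^{-1}, with illustrative stage rates of our own choosing:

```python
import random

def sample_demand_leadtime(nus, rng):
    """Demand leadtime = sum of k independent exponential stages;
    completing stage i corresponds to the i-th due-date update."""
    return sum(rng.expovariate(nu) for nu in nus)

rng = random.Random(42)
nus = [0.5, 0.5, 0.5]          # equal rates -> Erlang(3, 0.5), mean 3/0.5 = 6
samples = [sample_demand_leadtime(nus, rng) for _ in range(100_000)]
mean = sum(samples) / len(samples)
print(round(mean, 1))          # close to 6.0, up to Monte Carlo error
```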

As mentioned above, the supplier can produce one item at a time. We assume that production times are exponentially distributed with mean µ^{-1}. Hence, the production process can itself be viewed as a queue, whose input is provided by the output of the demand leadtime system. The assumptions of Poisson arrivals of announcements and exponential production and update times are made in part for mathematical tractability, as they allow the problem to be cast as a CTMDP. They are also appropriate for approximating systems with high variability. Such Markovian assumptions are consistent with previous studies of production-inventory systems; see, e.g., Buzacott and Shanthikumar (1993), Ha (1997), Zipkin (2000), de Véricourt et al. (2002), and others. Later, we partially relax these assumptions. In the remainder of this section, we develop the formulation and describe the structure of the optimal policy. This is done by first analyzing in Section 3.1 a simplified version (with a truncated state space) of the problem. We then extend the analysis in Section 3.2 to systems without the truncation and state our main result for this section in Theorem 2.

3.1 Bounded Jump Rates

In this section, we assume that the total number of announced orders (i.e., the number of orders in the demand leadtime system) at any instant remains bounded by a finite integer m < ∞, so that Σ_{i=1}^k y_i ≤ m, where y_i is the number of orders in stage i. Order announcements are rejected and leave without entering the demand leadtime system (and hence never become due) if Σ_{i=1}^k y_i = m. This assumption allows us to formulate the problem as a Markov decision process with bounded jump rates. From a queueing perspective, the introduction of the finite m means

that we approximate the M/G/∞ queue mentioned above by an M/G/m/m queue (an Erlang loss system). When m is chosen to be large, very few order announcements are rejected, and hence the arrival rate of due orders to the production facility will be roughly the same as the arrival rate of order announcements to the demand leadtime system. In fact, the precise arrival rate of due orders to the production facility will be λ(1 − B(m)), where B(m) is the probability that an M/G/m/m queue is full. The probability B(m) approaches 0 as m → ∞. The exact value of B(m) is given by the well-known Erlang loss formula. Using the results of this section as a building block, in Section 3.2 we extend our results to the case with no bound on the total announced orders.

Let Z and Z_+ be respectively the sets of integers and non-negative integers, and let Z^k and Z^k_+ be their k-dimensional cross products. Let R be the real numbers. Throughout, y = (y_1, . . . , y_k). The MDP has state space S_m := Z × Z^k_+(m), where Z^k_+(m) := {y ∈ Z^k_+ : Σ_{i=1}^k y_i ≤ m}. To keep notation clean, we will indicate the dependence on m only in notation that is used later for extending to the case without m. It is, however, important to keep in mind that most of the quantities in this section do depend upon m, even if this is not reflected in the notation. The state of the system is determined by X(t), which represents the net inventory at time t, and Y(t) = (Y_1(t), . . . , Y_k(t)), where Y_i(t) is the number of announced orders in stage i at time t. In each state, two actions are possible: produce or idle (do not produce). The objective is to find a production policy that minimizes the long-run expected discounted cost. Let the set of such production policies be denoted by Π. A deterministic stationary policy π := {π(x, y) : (x, y) ∈ S_m} specifies the action taken at any time as a function only of the state of the system, where π(x, y) = 1 means produce in state (x, y), and π(x, y) = 0 means idle in state (x, y). We will work with a uniformized version (see, e.g., Lippman, 1975) of the problem in which the transition rate in each state under any action is Λ := λ + µ + m Σ_{i=1}^k ν_i, so that the transition times

0 = τ_0 ≤ τ_1 ≤ τ_2 ≤ · · · are such that {τ_{n+1} − τ_n : n ≥ 0} is a sequence of i.i.d. exponential random variables, each with mean Λ^{-1}. Let {(X_n, Y_n) : n ≥ 0} denote the embedded Markov chain of states; that is, (X_n, Y_n) := (X(τ_n), Y(τ_n)) is the state immediately after the nth transition. For i = 1, . . . , k, let e_i be the k-dimensional vector with 1 in position i and zeros elsewhere. Let e_0 be the k-dimensional vector of zeros. If action a ∈ {0, 1} is selected in state (x, y), then, writing ȳ := Σ_{i=1}^k y_i, the next state of the embedded Markov chain is (x′, y′) with probability

p_{(x,y),(x′,y′)}(a) =
    Λ^{-1} µ I{a=1}     if (x′, y′) = (x + 1, y),
    Λ^{-1} λ I{ȳ < m}   if (x′, y′) = (x, y + e_1),
    Λ^{-1} y_i ν_i      if (x′, y′) = (x, y − e_i + e_{i+1}), i = 1, . . . , k − 1,
    Λ^{-1} y_k ν_k      if (x′, y′) = (x − 1, y − e_k),

with the remaining probability assigned to the fictitious self-transition (x′, y′) = (x, y). The cost rate in state (x, y) is c(x) := h x^+ + b x^−, where h, b > 0 are the per-unit holding and backorder cost rates, and x^+ = max{x, 0} and x^− = − min{x, 0}. Here, we again emphasize that backorder costs are incurred only when an order becomes due and is not immediately satisfied. Jobs inside the demand leadtime system (which have been announced, but which are not yet due) do not incur backorder costs. The value function, which specifies the optimal expected total discounted cost, is given by

v^∗_m(x, y) := inf_{π∈Π} E^π_{(x,y)} [ ∫_0^∞ e^{−βt} c(X(t)) dt ] = inf_{π∈Π} E^π_{(x,y)} [ Σ_{n=0}^∞ (Λ/γ)^n c(X_n)/γ ],   (1)

where β > 0 is the discount rate, γ := β + Λ, and E^π_{(x,y)} denotes expectation with respect to the probability measure determined by policy π and (X(0), Y(0)) = (x, y).

Let V be the set of real-valued functions on S_m and let v be an arbitrary element of V. Define T_λ, T_i, T_µ : V → V as follows: T_λ v(x, y) := v(x, y + e_1 I{ȳ < m}), . . .

(c) x > x^∗_{y+e_l} − 1 = x^∗_{y+e_j} − 1: ΔT_µ v(x + 1, y + e_l) = Δv(x + 1, y + e_l) ≥ Δv(x, y + e_j) = ΔT_µ v(x, y + e_j).

If x^∗_{y+e_l} = x^∗_{y+e_j} + 1, we distinguish three subcases:

(a) x < x^∗_{y+e_j} − 1: ΔT_µ v(x + 1, y + e_l) = Δv(x + 2, y + e_l) ≥ Δv(x + 1, y + e_j) = ΔT_µ v(x, y + e_j).

(b) x = x^∗_{y+e_l} − 2 = x^∗_{y+e_j} − 1: ΔT_µ v(x + 1, y + e_l) = ΔT_µ v(x, y + e_j) = 0.

(c) x > x^∗_{y+e_j} − 1: ΔT_µ v(x + 1, y + e_l) = Δv(x + 1, y + e_l) ≥ Δv(x, y + e_j) = ΔT_µ v(x, y + e_j).

Condition (C3): (i) For operator T_λ, it is straightforward to check that T_λ v satisfies condition (C3) when v ∈ U.

(ii) For operator T_i, we need to show that ΔT_i v(x, y + e_{j+1}) ≤ ΔT_i v(x, y + e_j) for j = 0, . . . , k − 1. Consider first i = 1, . . . , k − 1, and let J = I{i=j} and K = I{i=j+1}. The three possible combinations of J and K are (J, K) ∈ {(0, 0), (0, 1), (1, 0)}. The inequalities (S-8) and (S-9) below follow from the fact that v satisfies condition (C3). If y_i ≥ 1, we have

ΔT_i v(x, y + e_{j+1}) − ΔT_i v(x, y + e_j)
= (y_i + K)Δv(x, y + e_{j+1} + e_{i+1} − e_i) + (m − y_i − K)Δv(x, y + e_{j+1})
  − (y_i + J)Δv(x, y + e_j + e_{i+1} − e_i) − (m − y_i − J)Δv(x, y + e_j)
= y_i [Δv(x, y + e_{j+1} + e_{i+1} − e_i) − Δv(x, y + e_j + e_{i+1} − e_i)]
  + (m − y_i − K − J)[Δv(x, y + e_{j+1}) − Δv(x, y + e_j)]
  + K [Δv(x, y + e_{j+1} + e_{i+1} − e_i) − Δv(x, y + e_j)]
  + J [Δv(x, y + e_{j+1}) − Δv(x, y + e_j + e_{i+1} − e_i)]

≤ 0.    (S-8)

If y_i = 0, we have

ΔT_i v(x, y + e_{j+1}) − ΔT_i v(x, y + e_j)
= KΔv(x, y + e_{j+1} + K(e_{i+1} − e_i)) + (m − K)Δv(x, y + e_{j+1})
  − JΔv(x, y + e_j + J(e_{i+1} − e_i)) − (m − J)Δv(x, y + e_j)
= (m − K − J)[Δv(x, y + e_{j+1}) − Δv(x, y + e_j)]
  + K [Δv(x, y + e_{j+1} + K(e_{i+1} − e_i)) − Δv(x, y + e_j)]
  + J [Δv(x, y + e_{j+1}) − Δv(x, y + e_j + J(e_{i+1} − e_i))]
≤ 0.    (S-9)

Now we consider operator T_k. Let I = I{j=k−1}. The inequalities (S-10) and (S-11) below follow from the fact that v satisfies conditions (C2) and (C3). If y_k ≥ 1, we have

ΔT_k v(x, y + e_{j+1}) − ΔT_k v(x, y + e_j)
= (y_k + I)Δv(x − 1, y + e_{j+1} − e_k) + (m − y_k − I)Δv(x, y + e_{j+1})
  − y_k Δv(x − 1, y + e_j − e_k) − (m − y_k)Δv(x, y + e_j)
= y_k [Δv(x − 1, y + e_{j+1} − e_k) − Δv(x − 1, y + e_j − e_k)]
  + (m − y_k − I)[Δv(x, y + e_{j+1}) − Δv(x, y + e_j)]
  + I [Δv(x − 1, y + e_{j+1} − e_k) − Δv(x, y + e_j)]
≤ 0.    (S-10)

If y_k = 0, we have

ΔT_k v(x, y + e_{j+1}) − ΔT_k v(x, y + e_j)
= IΔv(x − I, y + e_{j+1} − Ie_k) + (m − I)Δv(x, y + e_{j+1}) − mΔv(x, y + e_j)
= (m − I)[Δv(x, y + e_{j+1}) − Δv(x, y + e_j)]
  + I [Δv(x − I, y + e_{j+1} − Ie_k) − Δv(x, y + e_j)]
≤ 0.    (S-11)

(iii) For T_µ, we need to show that ΔT_µ v(x, y + e_{j+1}) ≤ ΔT_µ v(x, y + e_j) for j = 0, . . . , k − 1. In part (iii) of the argument for condition (C2), we showed that x^∗_{y+e_{j+1}} is either x^∗_{y+e_j} or x^∗_{y+e_j} + 1. If x^∗_{y+e_{j+1}} = x^∗_{y+e_j}, we distinguish three subcases:

(a) x < x^∗_{y+e_j} − 1: ΔT_µ v(x, y + e_{j+1}) = Δv(x + 1, y + e_{j+1}) ≤ Δv(x + 1, y + e_j) = ΔT_µ v(x, y + e_j).

(b) x = x^∗_{y+e_j} − 1: ΔT_µ v(x, y + e_{j+1}) = ΔT_µ v(x, y + e_j) = 0.

(c) x > x^∗_{y+e_j} − 1: ΔT_µ v(x, y + e_{j+1}) = Δv(x, y + e_{j+1}) ≤ Δv(x, y + e_j) = ΔT_µ v(x, y + e_j).

If x^∗_{y+e_{j+1}} = x^∗_{y+e_j} + 1, we distinguish four subcases:

(a) x < x^∗_{y+e_j} − 1: ΔT_µ v(x, y + e_{j+1}) = Δv(x + 1, y + e_{j+1}) ≤ Δv(x + 1, y + e_j) = ΔT_µ v(x, y + e_j).

(b) x = x^∗_{y+e_j} − 1: ΔT_µ v(x, y + e_{j+1}) = Δv(x + 1, y + e_{j+1}) ≤ 0 = ΔT_µ v(x, y + e_j).

(c) x = x^∗_{y+e_j}: ΔT_µ v(x, y + e_{j+1}) = 0 ≤ Δv(x, y + e_j) = ΔT_µ v(x, y + e_j).

(d) x > x^∗_{y+e_j}: ΔT_µ v(x, y + e_{j+1}) = Δv(x, y + e_{j+1}) ≤ Δv(x, y + e_j) = ΔT_µ v(x, y + e_j).

Condition (C4): It is easy to verify that if v ∈ U, then T_λ v and T_i v, i = 1, . . . , k, satisfy condition (C4). For T_µ, when x < 0, we have T_µ v(x + 1, y) = min{v(x + 2, y), v(x + 1, y)} ≤ v(x + 1, y) ≤ v(x, y). Therefore, T_µ v(x + 1, y) ≤ min{v(x + 1, y), v(x, y)} = T_µ v(x, y). This completes the first main component of the proof.

To complete the proof of the proposition, let T^1 = T and define T^n = T ∘ T^{n−1} for n > 1. By Propositions 3.1.5 and 3.1.6 of Bertsekas (2001), v^∗_m = lim_{n→∞} T^n v for any bounded function

S-6

v ∈ V . Take v = v0 , where v0 is the function that is identically zero on Sm . It is simple to show by induction that 0 ≤ T n v0 (x, y) ≤

n−1 b+h X (|x| + j) αj , γ j=0

where α := γ −1 (λ+µ+m

Pk

i=1 νi )

∗ (x, y) = lim n ∈ [0, 1). Hence, we have 0 ≤ vm n→∞ T v0 (x, y) < ∞.

∗ is a real-valued function on S (i.e., v ∗ ∈ V ). Therefore, vm m m

Note that v0 ∈ U , and so it follows from the argument above that T v0 ∈ U , and consequently T n v0 ∈ U for each n. Moreover, it can readily be seen that if functions {vn } and u are such that vn ∈ U for all n and vn → u ∈ V pointwise, then u ∈ U . We have established that T n v0 ∈ U and ∗ ∈ V , and therefore it follows that v ∗ ∈ U . that T n v0 → vm m
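The limit argument above is ordinary value iteration. As an illustration — on a small hypothetical discounted MDP with random transition matrices and a generic cost, not the paper's operator $T$ — the sketch below iterates a Bellman operator starting from $v_0 \equiv 0$ and checks that the iterates reach a nonnegative fixed point:

```python
import numpy as np

# Generic value-iteration sketch: T^n v0 from v0 = 0 converges to the unique
# fixed point of the (contraction) Bellman operator when the discount is < 1.
# States, actions, discount, cost, and transitions below are all assumptions.
np.random.seed(0)
S, A, alpha = 6, 2, 0.9                            # state/action counts, discount
c = np.abs(np.arange(S) - 2.0)                     # cost c(x), minimized near x = 2
P = np.random.dirichlet(np.ones(S), size=(A, S))   # P[a, x, :] = transition row

def T(v):
    # Bellman operator: (Tv)(x) = min_a { c(x) + alpha * sum_x' P(x'|x,a) v(x') }
    return (c + alpha * (P @ v)).min(axis=0)

v = np.zeros(S)                                    # v0 identically zero
for _ in range(500):
    v = T(v)
v_star = T(v)
assert np.allclose(v, v_star, atol=1e-8)           # fixed point: T v* = v*
assert np.all(v >= 0)                              # iterates stay nonnegative
```

The monotone, nonnegative convergence from $v_0 \equiv 0$ mirrors the induction used in the proof; the operator itself is only a stand-in.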

S-2 The Average-Cost Optimality Criteria

Consider the IDD framework of Section 3.1 with $m < \infty$. A direct analog of Theorem 1 holds for the average-cost criteria. To set the stage, for any policy $\pi \in \Pi$, its average cost is given by

$$J_\pi(x, y) := \limsup_{n\to\infty} \frac{E^\pi_{(x,y)}\big[\sum_{l=0}^{n-1} c(X_l)\,[\tau_{l+1} - \tau_l]\big]}{E^\pi_{(x,y)}[\tau_n]} = \limsup_{n\to\infty} \frac{E^\pi_{(x,y)}\big[\sum_{l=0}^{n-1} c(X_l)\big]}{n}.$$

Let $J(x, y) := \inf_{\pi\in\Pi} J_\pi(x, y)$. A policy that gives average cost $J(x, y)$ for all $(x, y) \in S_m$ is said to be optimal for the average-cost problem.

Theorem S-1 Suppose $\lambda < \mu$. Then there exists a stationary state-dependent base-stock policy $\pi^A = \{\pi^A(x, y)\}$ that is optimal for the average-cost problem. Its base-stock levels $\{s^A_y\}$ satisfy the conditions in (a) and (b) in Theorem 1. In addition, the optimal average cost is finite and independent of the initial state; i.e., there is a finite constant $J$ such that $J(x, y) = J$ for all $(x, y) \in S_m$.

Proof. The main idea of the proof is to obtain the desired results for the average-cost problem by using Proposition 1 for the discounted-cost problem and letting $\beta \downarrow 0$. Given a discount rate, let $\check{v}(x, y) := \gamma v_m^*(x, y)$. The optimality equation (6) can be rewritten

as

$$\check{v}(x, y) = \min_{a\in\{0,1\}} \Big\{ c(x) + \frac{\Lambda}{\gamma} \sum_{(x',y')\in S} p_{(x,y),(x',y')}(a)\,\check{v}(x', y') \Big\}.$$

Let $\alpha = \Lambda/\gamma = \Lambda/(\beta + \Lambda)$, and define $h_\alpha(x, y) := \check{v}_\alpha(x, y) - \check{v}_\alpha(0, e_0)$, where we have appended a subscript $\alpha$ to indicate dependence on $\alpha$ (and hence on $\beta$). Parts (i) and (ii) of Theorem 7.2.3 in Sennott (1999) state that under conditions I, II, and III given below there exists a sequence $\{\alpha_n := \Lambda/(\beta_n + \Lambda)\}$ and a real-valued function $h(\cdot)$ such that

$\alpha_n \uparrow 1$ and $\lim_{n\to\infty} h_{\alpha_n}(x, y) = h(x, y)$. (Note that $\alpha_n \uparrow 1$ means $\beta_n \downarrow 0$.) Moreover, the function $h(\cdot)$ satisfies

$$J + h(x, y) \ge \min_{a\in\{0,1\}} \Big\{ c(x) + \sum_{(x',y')\in S} p_{(x,y),(x',y')}(a)\,h(x', y') \Big\}, \qquad \text{(S-12)}$$

where $J := \lim_{\alpha\uparrow 1}(1-\alpha)\check{v}_\alpha(x, y) = \lim_{\alpha\uparrow 1}(1-\alpha)\frac{\Lambda}{\alpha} v_\alpha(x, y)$ is a finite constant. By Theorem 7.2.3(ii), any stationary policy that for each $(x, y)$ selects an action that minimizes the right-hand side of (S-12) is optimal and yields constant average cost $J$. Hence, properties of the average-cost optimal policy are determined through the function $h(\cdot)$ in much the same way as properties of the discounted-cost optimal policy were determined through $v_m^*(\cdot)$.

In the following we show that $h(\cdot)$ satisfies conditions (C1)–(C4). For (C1), we need to show that $\Delta h(x, y) \le \Delta h(x+1, y)$ for $(x, y) \in S_m$. We have $\Delta h_\alpha(x, y) = \Delta \check{v}_\alpha(x, y) \le \Delta \check{v}_\alpha(x+1, y) = \Delta h_\alpha(x+1, y)$, so $\Delta h(x, y) = \lim_{n\to\infty} \Delta h_{\alpha_n}(x, y) \le \lim_{n\to\infty} \Delta h_{\alpha_n}(x+1, y) = \Delta h(x+1, y)$. Similar arguments show that $h(\cdot)$ also satisfies conditions (C2)–(C4). Hence, as in the proof of Theorem 1, it follows that the policy

$$\pi^A(x, y) := \begin{cases} 0 & \text{if } x \ge s^A_y \\ 1 & \text{if } x < s^A_y \end{cases}$$

with $s^A_y := \min\{x : h(x+1, y) - h(x, y) \ge 0\}$ is optimal for the average-cost problem, and that properties (a) and (b) described in Theorem 1 hold for $\{s^A_y\}$ and $\pi^A$. For additional background on the above approach, see Section 8.11 of Puterman (1994).

It remains to show that our problem satisfies conditions that allow application of Theorem 7.2.3. By Theorem 7.5.6 and Corollary 7.5.9 of Sennott (1999), the following conditions are sufficient to apply Theorem 7.2.3.

(I) There exists a stationary policy and a state $z \in S_m$ such that the induced Markov chain has a positive recurrent class $R \subseteq S_m$, and the expected first passage time and expected first passage cost (for definitions see Lemma S-1 below) from any state $(x, y) \in S_m$ to $z$ are finite.

(II) For each $u > 0$, the set $\{(x, y) \in S_m : c(x) \le u\}$ is finite.

(III) For each state $(x, y) \in S_m \setminus R$, there exists a policy that induces a Markov chain for which the expected first passage time and expected first passage cost from state $z$ to $(x, y)$ are finite.


Let $z = (0, e_0)$. Define $\varpi := \{\varpi(x, y) : (x, y) \in S_m\}$ to be the stationary policy that produces if the net inventory is less than zero and idles if the net inventory is at least zero; i.e., $\varpi(x, y) = \mathbb{I}\{x < 0\}$.

Lemma S-1 Suppose $\lambda < \mu$. Then under policy $\varpi$, for any $(x, y) \in S_m$ we have $E^\varpi_{(x,y)} T < \infty$ and $E^\varpi_{(x,y)} C < \infty$, where $T := \min\{n > 0 : (X_n, Y_n) = (0, e_0)\}$ and $C := \sum_{t=0}^{T-1} c(X_t)$; that is, the expected first passage time and expected first passage cost of going from any state $(x, y) \in S_m$ to state $(0, e_0)$ are finite.

Proof. Define set $G := \{(x, y) \in S_m : x = 0\}$. It is straightforward to show that under policy $\varpi$, starting from a state $(x, y) \in S_m \setminus R$, the expected first passage time and cost to enter the set $G$ are both finite. Observe also that under $\varpi$, a Markov chain that starts in $R$ remains in $R$; that is, if $(x, y) \in R$, then $p_{(x,y),(x',y')}(\varpi(x, y)) = 0$ for $(x', y') \notin R$. Since $G \subset R$ and $|G| < \infty$, to prove the lemma we need only show that $R$ is positive recurrent and that the expected first passage cost of going from any state $(x, y) \in R$ to state $(0, e_0)$ is finite (because, by Proposition C.1.4 of Sennott, positive recurrence of $R$ implies $E^\varpi_{(x,y)} T < \infty$ for all $(x, y) \in R$).

Let $V_R$ be the set of real-valued functions on $R$. To show $R$ is positive recurrent it suffices by Foster's criterion (see, e.g., Brémaud, 1999, page 167) to identify a nonnegative function $f \in V_R$, a finite set $H \subset R$, and $\epsilon > 0$ such that

$$Pf(x, y) \le f(x, y) - \epsilon \qquad \text{for all } (x, y) \in R \setminus H, \qquad \text{(S-13)}$$

where the operator $P : V_R \to V_R$ is defined by $Pf(x, y) := \sum_{(x',y')\in R} p_{(x,y),(x',y')}(\varpi(x, y))\,f(x', y')$. Define $f(x, y) := -x + \sum_{i=1}^k y_i$, $H := G$, and $\epsilon := (\mu - \lambda)/\Lambda$. Note that the condition $\lambda < \mu$ ensures $\epsilon > 0$. Let $\bar{y} := \sum_{i=1}^k y_i$ and $L = \mathbb{I}\{\bar{y} < m\}$. For $(x, y) \in R \setminus H$ we

have

$$\begin{aligned}
Pf(x, y) &= \frac{\mu}{\Lambda}\,f(x+1, y) + \frac{\lambda}{\Lambda}\,f(x, y+Le_1) + \sum_{i=1}^{k-1} \frac{\nu_i y_i}{\Lambda}\,f(x, y+I_i(e_{i+1}-e_i)) \\
&\quad + \frac{\nu_k y_k}{\Lambda}\,f(x-I_k, y-I_k e_k) + \sum_{i=1}^{k} \frac{\nu_i(m-y_i)}{\Lambda}\,f(x, y) \\
&\le \frac{\mu}{\Lambda}(-x-1+\bar{y}) + \frac{\lambda}{\Lambda}(-x+\bar{y}+1) + \sum_{i=1}^{k-1} \frac{\nu_i y_i}{\Lambda}(-x+\bar{y}) \\
&\quad + \frac{\nu_k y_k}{\Lambda}(-x+\bar{y}) + \sum_{i=1}^{k} \frac{(m-y_i)\nu_i}{\Lambda}(-x+\bar{y}) \\
&= -x + \bar{y} - \frac{\mu}{\Lambda} + \frac{\lambda}{\Lambda} = f(x, y) - \epsilon.
\end{aligned}$$
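As a sanity check, the drift computation above can be reproduced numerically. The parameter values below ($\lambda$, $\mu$, $\nu_i$, $k$, $m$) are illustrative assumptions, and exact rational arithmetic is used so the inequality can be asserted without rounding error:

```python
import itertools
from fractions import Fraction as F

# Assumed parameters with lam < mu, as the lemma requires.
lam, mu, k, m = F(1, 2), F(1), 2, 3
nu = [F(3, 10), F(2, 5)]                 # per-stage update rates (assumed)
Lam = lam + mu + m * sum(nu)             # uniformization rate Lambda
eps = (mu - lam) / Lam                   # drift margin eps = (mu - lam)/Lambda

def f(x, y):                             # Lyapunov function f(x, y) = -x + ybar
    return -x + sum(y)

def Pf(x, y):
    # One-step expected value of f under policy varpi, for x < 0 (so varpi produces).
    ybar = sum(y)
    total = (mu / Lam) * f(x + 1, y)                              # production
    e1 = (y[0] + 1,) + y[1:]
    total += (lam / Lam) * (f(x, e1) if ybar < m else f(x, y))    # arrival, L = 1{ybar < m}
    for i in range(k - 1):                                        # update stage i -> i+1
        y2 = list(y); y2[i] -= 1; y2[i + 1] += 1
        total += (nu[i] * y[i] / Lam) * f(x, tuple(y2))
    total += (nu[k-1] * y[k-1] / Lam) * f(x - 1, y[:-1] + (y[-1] - 1,))  # order due
    total += (sum(nu[i] * (m - y[i]) for i in range(k)) / Lam) * f(x, y)  # dummy jumps
    return total

# Verify the drift inequality Pf <= f - eps on states with x < 0 (i.e., off H).
for x in range(-10, 0):
    for y in itertools.product(range(m + 1), repeat=k):
        if sum(y) <= m:
            assert Pf(x, y) <= f(x, y) - eps
```

The assertion holds with equality whenever $\bar{y} < m$, matching the cancellation in the displayed derivation.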

̟ T < Therefore (S-13) holds, and R is positive recurrent. As mentioned above, this also yields E(x,y)

∞ for any state (x, y) ∈ R. Now we show that the expected first passage cost of going from any state (x, y) ∈ R to state ̟ C < ∞. The first step is to show that there exists a nonnegative (0, e0 ) is finite; that is E(x,y)

function g ∈ VR and a finite set H ⊂ R with (0, e0 ) ∈ H such that P g(x, y) ≤ g(x, y) − c(x)

for all (x, y) ∈ R \ H.

Define $g(x, y) := \theta\kappa^{-x+\bar{y}}$ and $H := G$. For $\kappa > 1$ and $\theta > 0$, a calculation as above shows that for $(x, y) \in R \setminus H$, we have

$$g(x, y) - Pg(x, y) \ge \frac{\theta}{\Lambda}\,\kappa^{-x+\bar{y}-1}\big[(\lambda+\mu)\kappa - \mu - \lambda\kappa^2\big]. \qquad \text{(S-14)}$$

For $\kappa \in (1, \mu/\lambda)$ the term in square brackets in (S-14) is strictly positive. Hence, for such $\kappa$ and with $\theta$ large enough, the right-hand side of (S-14) exceeds $c(x) = hx^+ + bx^-$ for all $(x, y) \in R \setminus H$. The assumption $\lambda < \mu$ ensures that $(1, \mu/\lambda)$ is non-empty. By Corollary C.2.4 of Sennott, the above yields $E^\varpi_{(x,y)} C < \infty$. […]

[…] $x > 0$ or $y_k = 0$, where $\widetilde{T}_k$ is defined in (S-1). Hence, the inequalities in conditions (C1)–(C3) hold when $x > 0$ or $y_k = 0$; this was shown in the proof of Proposition 1. Therefore, to show that $\widetilde{T}_k v$ satisfies conditions (C1)–(C3), we need only show that the inequalities hold when $x = 0$ and $y_k \ge 1$. That is, we just need to show that

$$\Delta \widetilde{T}_k v(0, y) \le \Delta \widetilde{T}_k v(1, y) \quad \text{for all } y \in \mathbb{Z}^k_+(m) \text{ with } y_k \ge 1, \qquad \text{(S-20)}$$

$$\Delta \widetilde{T}_k v(0, y+e_j) \le \Delta \widetilde{T}_k v(1, y+e_l) \quad \text{for all } y \in \mathbb{Z}^k_+(m-1) \text{ with } y_k \ge 1,\ j = 0, \dots, k-1,\ l = j+1, \dots, k, \qquad \text{(S-21)}$$

$$\Delta \widetilde{T}_k v(0, y+e_{j+1}) \le \Delta \widetilde{T}_k v(0, y+e_j) \quad \text{for all } y \in \mathbb{Z}^k_+(m-1) \text{ with } y_k \ge 1, \text{ and } j = 0, \dots, k-1. \qquad \text{(S-22)}$$

To check that (S-20) holds, suppose that $y$ is such that $y_k \ge 1$. We have

$$\begin{aligned}
\Delta \widetilde{T}_k v(0, y) - \Delta \widetilde{T}_k v(1, y)
&= (m-y_k)\,\Delta v(0, y) - y_k c_{LS} - y_k\,\Delta v(0, y-e_k) - (m-y_k)\,\Delta v(1, y) \\
&= (m-y_k)\big[\Delta v(0, y) - \Delta v(1, y)\big] - y_k\big[c_{LS} + \Delta v(0, y-e_k)\big] \\
&\le 0,
\end{aligned}$$

where the inequality holds because $v$ satisfies conditions (C1) and (C4)$'$. Hence, (S-20) holds.

Next we check that (S-21) holds. Fix $j \in \{0, \dots, k-1\}$ and $l \in \{j+1, \dots, k\}$ and let $I = \mathbb{I}\{l = k\}$. Then

$$\begin{aligned}
\Delta \widetilde{T}_k v(0, y+e_j) - \Delta \widetilde{T}_k v(1, y+e_l)
&= (m-y_k)\,\Delta v(0, y+e_j) - y_k c_{LS} - (y_k+I)\,\Delta v(0, y+e_l-e_k) - (m-y_k-I)\,\Delta v(1, y+e_l) \\
&= (m-y_k-I)\big[\Delta v(0, y+e_j) - \Delta v(1, y+e_l)\big] - y_k\big[c_{LS} + \Delta v(0, y+e_l-e_k)\big] \\
&\quad + I\big[\Delta v(0, y+e_j) - \Delta v(0, y+e_l-e_k)\big] \\
&\le 0,
\end{aligned}$$

where the inequality holds because $v$ satisfies conditions (C2), (C3), and (C4)$'$. Finally, we verify that (S-22) holds:

$$\begin{aligned}
\Delta \widetilde{T}_k v(0, y+e_{j+1}) - \Delta \widetilde{T}_k v(0, y+e_j)
&= (m-y_k-I)\,\Delta v(0, y+e_{j+1}) - (y_k+I)c_{LS} - (m-y_k)\,\Delta v(0, y+e_j) + y_k c_{LS} \\
&= (m-y_k-I)\big[\Delta v(0, y+e_{j+1}) - \Delta v(0, y+e_j)\big] - I\big[c_{LS} + \Delta v(0, y+e_j)\big] \\
&\le 0,
\end{aligned}$$

because $v$ satisfies conditions (C3) and (C4)$'$. We have now shown that $\widetilde{T}_k v$ satisfies conditions (C1)–(C3) and therefore $T v$ satisfies (C1)–(C3).

To complete the proof of the extension of Theorem 1, we next show that $T v$ satisfies condition (C4)$'$. This follows easily from (S-19) and the fact that $\gamma = \beta + \lambda + \mu + m\sum_{i=1}^k \nu_i$.

S-6 Additional Material for Section 3.2

In this section we describe results of GH (Guo and Hernández-Lerma 2003), and show how they may be applied in our setting. GH consider continuous-time MDPs with countable state space $\tilde{S}$ and discount rate $\tilde{\beta}$. The set of allowable actions in state $i \in \tilde{S}$ is $\tilde{A}(i)$. The conditional transition rate from state $i \in \tilde{S}$ to state $j \ne i$ under action $a \in \tilde{A}(i)$ is given by $\tilde{q}(j|i,a)$, where $\tilde{q}(j|i,a) \ge 0$ for $i \ne j$. Let $\tilde{q}(i) := \sup_{a\in\tilde{A}(i)} \tilde{q}_i(a) < \infty$, where $\tilde{q}_i(a) := -\tilde{q}(i|i,a) := \sum_{j\ne i} \tilde{q}(j|i,a)$. The reward rate is $\tilde{r}(i,a)$ in state $i \in \tilde{S}$ under action $a \in \tilde{A}(i)$.

To put our model into their framework, take $\tilde{S} = S$, $\tilde{\beta} = \beta$, $\tilde{A}(i) = \{0, 1\}$, $\tilde{r}((x,y), a) = -c(x)$, and for $(x', y') \ne (x, y)$ take

$$\tilde{q}((x', y')\,|\,(x, y), a) = \begin{cases}
\mu\,\mathbb{I}\{a=1\} & \text{if } (x', y') = (x+1, y) \\
\lambda & \text{if } (x', y') = (x, y+e_1) \\
\nu_i y_i & \text{if } (x', y') = (x, y+e_{i+1}-e_i),\ i = 1, \dots, k-1 \\
\nu_k y_k & \text{if } (x', y') = (x-1, y-e_k) \\
0 & \text{otherwise.}
\end{cases}$$

With the above, we have $\tilde{q}_{(x,y)}(a) = -\tilde{q}((x,y)|(x,y),a) = \lambda + \sum_{i=1}^k \nu_i y_i + \mu\,\mathbb{I}\{a=1\}$, and $\tilde{q}(x, y) = \lambda + \sum_{i=1}^k \nu_i y_i + \mu$. GH introduce the following assumptions.

Assumption A. There exist a sequence $\{\tilde{S}_m : m \ge 1\}$ of subsets of $\tilde{S}$, a nonnegative function $R$ on $\tilde{S}$, and constants $b \ge 0$ and $c_0 \in (-\infty, \infty)$ such that

(1) $\tilde{S}_m \uparrow \tilde{S}$ and $\sup_{i\in\tilde{S}_m} \tilde{q}(i) < \infty$ for each $m \ge 1$;

(2) $\lim_{m\to\infty} [\inf_{j\notin\tilde{S}_m} R(j)] = +\infty$; and

(3) $\sum_{j\in\tilde{S}} \tilde{q}(j|i,a)R(j) \le c_0 R(i) + b$ for all $i \in \tilde{S}$ and $a \in \tilde{A}(i)$.

Assumption B. With $c_0$ and $R$ as in Assumption A:

(1) either $c_0 \le 0$, or $c_0 - \tilde{\beta} < 0$ when $c_0 > 0$; and

(2) there exist nonnegative constants $M_1$ and $M_2$ such that $|\tilde{r}(i,a)| \le M_1 + M_2 R(i)$ for all $i \in \tilde{S}$ and $a \in \tilde{A}(i)$.

Assumption C.

(1) For each $i \in \tilde{S}$, $\tilde{A}(i)$ is compact;

(2) the functions $\tilde{r}(i,a)$, $\tilde{q}(j|i,a)$, and $\sum_{j\in\tilde{S}} \tilde{q}(j|i,a)R(j)$ are continuous in $a \in \tilde{A}(i)$ for each fixed $i, j \in \tilde{S}$;

(3) there exist a nonnegative function $w'$ on $\tilde{S}$ and constants $c' > 0$, $b' > 0$, and $M' > 0$ such that $\tilde{q}(i)R(i) \le M'w'(i)$ and $\sum_{j\in\tilde{S}} \tilde{q}(j|i,a)w'(j) \le c'w'(i) + b'$ for all $(i, a)$.

Parts (b) and (c) of Theorem 3.2 of GH state that if Assumptions A and B hold, then the value function $v^*$ is the unique solution in $B_R(\tilde{S}) := \{v : \text{there exist constants } c_1, c_2 \ge 0 \text{ so that } |v(i)| \le c_1 + c_2 R(i) \text{ for all } i \in \tilde{S}\}$ of the optimality equation

$$v(i) = \frac{1}{\tilde{\beta} + \tilde{q}(i)} \sup_{a\in\tilde{A}(i)} \Big\{ \tilde{r}(i,a) + \sum_{j\ne i} \tilde{q}(j|i,a)\,v(j) + \big[\tilde{q}(i) - \tilde{q}_i(a)\big]v(i) \Big\}, \qquad i \in \tilde{S}. \qquad \text{(S-23)}$$

Part (e) of Theorem 3.2 states that if, in addition, Assumption C holds, then there exists an optimal stationary policy. When Assumptions A, B, and C(3) hold, Theorem 3.3 of GH states that if $\{a^*(i)\}$ satisfy

$$v^*(i) = \frac{1}{\tilde{\beta} + \tilde{q}(i)} \Big\{ \tilde{r}(i, a^*(i)) + \sum_{j\ne i} \tilde{q}(j|i, a^*(i))\,v^*(j) + \big[\tilde{q}(i) - \tilde{q}_i(a^*(i))\big]v^*(i) \Big\}, \qquad i \in \tilde{S}, \qquad \text{(S-24)}$$

then $\{a^*(i)\}$ is an optimal (stationary) policy. Before we proceed, observe that the equation $v = Lv$ in Section 3.2 is simply (S-23) specialized to our context, and multiplied by $-1$ to express our cost minimization as a (negative) revenue maximization.

Lemma S-2 The system with IDD and unbounded jump rates satisfies Assumptions A, B, and C of GH.


Proof. Let $R(x, y) := |x| + \sum_{i=1}^k y_i = |x| + \bar{y}$, where $\bar{y}$ is as defined in (10), and let $\tilde{S}_m = S_m$. Then

Assumptions A(1) and A(2) hold trivially. For A(3) and B(1) we need to identify constants $c_0 \le 0$ and $b \ge 0$ so that

$$\lambda R(x, y+e_1) + \sum_{i=1}^{k-1} \nu_i y_i\,R(x, y+e_{i+1}-e_i) + \nu_k y_k\,R(x-1, y-e_k) + \mu\,\mathbb{I}\{a=1\}\,R(x+1, y) - \Big[\lambda + \sum_{i=1}^{k} \nu_i y_i + \mu\,\mathbb{I}\{a=1\}\Big] R(x, y) \le c_0 R(x, y) + b \qquad \text{(S-25)}$$

for all $(x, y) \in S$ and $a \in \{0, 1\}$. The expression on the left side above simplifies to $\lambda[R(x, y+e_1) - R(x, y)] + \nu_k y_k[R(x-1, y-e_k) - R(x, y)] + \mu\,\mathbb{I}\{a=1\}[R(x+1, y) - R(x, y)]$, which is bounded above by $\lambda + \mu$. Hence (S-25) holds with $c_0 = 0$ and $b = \lambda + \mu$. Thus, conditions A(3) and B(1) hold.

For B(2) we need constants $M_1, M_2 \ge 0$ so that $c(x) \le M_1 + M_2 R(x, y)$ for all $(x, y) \in S$. It is easy to see that B(2) holds with $M_1 = 0$ and $M_2 = b + h$. The compactness and continuity assumptions C(1) and C(2) hold trivially because our action space is finite. For C(3), we need to identify a nonnegative function $w'$ on $S$ and constants $b' \ge 0$, $c' > 0$, and $M' > 0$ such that

$$\Big(\lambda + \sum_{i=1}^k \nu_i y_i + \mu\Big) R(x, y) \le M' w'(x, y) \qquad \text{for all } (x, y) \in S \qquad \text{(S-26)}$$

and

$$\lambda w'(x, y+e_1) + \sum_{i=1}^{k-1} \nu_i y_i\,w'(x, y+e_{i+1}-e_i) + \nu_k y_k\,w'(x-1, y-e_k) + \mu\,\mathbb{I}\{a=1\}\,w'(x+1, y) - \Big[\lambda + \sum_{i=1}^{k} \nu_i y_i + \mu\,\mathbb{I}\{a=1\}\Big] w'(x, y) \le c' w'(x, y) + b'. \qquad \text{(S-27)}$$

Let $w'(x, y) := (|x| + \bar{y})^2$ and $M' := \lambda + \mu + \max\{\nu_1, \dots, \nu_k\}$. Note that (S-26) holds when $x = \bar{y} = 0$. Otherwise, we have either $|x| \ge 1$ or $\bar{y} \ge 1$ (or both), and hence

$$\lambda + \sum_{i=1}^k \nu_i y_i + \mu \le M'(|x| + \bar{y}).$$

Multiplying the above by $R(x, y)$ yields (S-26). The left-hand side of (S-27) simplifies to

$$\begin{aligned}
&\lambda\big[w'(x, y+e_1) - w'(x, y)\big] + \nu_k y_k\big[w'(x-1, y-e_k) - w'(x, y)\big] + \mu\,\mathbb{I}\{a=1\}\big[w'(x+1, y) - w'(x, y)\big] \\
&\qquad \le \lambda\big[w'(x, y+e_1) - w'(x, y)\big] + \mu\big[w'(x+1, y) - w'(x, y)\big] \\
&\qquad \le 2\lambda(|x| + \bar{y}) + \lambda + 2\mu(|x| + \bar{y}) + \mu \\
&\qquad = 2(\lambda+\mu)(|x| + \bar{y}) + (\lambda+\mu) \\
&\qquad \le 2(\lambda+\mu)\,w'(x, y) + (\lambda+\mu).
\end{aligned}$$

Therefore, C(3) holds with $c' := 2(\lambda+\mu)$ and $b' := \lambda+\mu$. This completes the proof.
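The final chain of inequalities admits a quick numerical spot-check, using assumed parameter values:

```python
# Spot-check: with the quadratic w'(x, y) = (|x| + ybar)^2, the left-hand side
# of (S-27) is at most 2(lam+mu) w'(x, y) + (lam+mu). Parameters are assumed.
lam, mu, k, nu = 0.7, 1.0, 2, [0.3, 0.5]

def wp(x, ybar):
    return (abs(x) + ybar) ** 2

def lhs_S27(x, y, a):
    ybar = sum(y)
    d = lam * (wp(x, ybar + 1) - wp(x, ybar))                      # arrival term
    # interior updates i -> i+1 leave |x| + ybar unchanged, so they contribute 0
    d += nu[-1] * y[-1] * (wp(x - 1, ybar - 1) - wp(x, ybar))      # order-due term
    d += mu * (a == 1) * (wp(x + 1, ybar) - wp(x, ybar))           # production term
    return d

for x in range(-8, 9):
    for y in [(i, j) for i in range(4) for j in range(4)]:
        for a in (0, 1):
            bound = 2 * (lam + mu) * wp(x, sum(y)) + (lam + mu)
            assert lhs_S27(x, y, a) <= bound + 1e-12
```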

S-7 Systems with IDD and Low Arrival Rate

For small enough $\lambda$, a system with no ADI will hold no inventory and produce to order, giving an average cost of approximately $C_1 := \lambda b/\mu$; this expression comes from ignoring the possibility of multiple orders being present at once (which is not unreasonable when $\lambda$ is very small) and noting that jobs arrive at rate $\lambda$ and incur cost at rate $b$ during the time it takes to produce one unit, which has mean $1/\mu$. For a system with ADI, it will again be best not to hold inventory when no jobs are in the demand leadtime system. When an order is announced, the decision is whether or not to produce one unit in advance of the order becoming due. Again ignoring the possibility of multiple orders being present simultaneously, if the decision is not to produce upon announcement of an order, then the long-run average cost will again be roughly $C_1 = \lambda b/\mu$. If the decision is to commence production upon announcement of an order, then we can derive an approximation for the average cost by conditioning on whether the production is completed prior to the order becoming due [which occurs with probability $\mu/(\mu+\nu)$] or not [which occurs with probability $\nu/(\mu+\nu)$]. By standard properties of exponential random variables, the order will generate an average holding cost of $h/\nu$ conditional upon the unit completing production prior to the order becoming due. Similarly, the order will generate an average backorder cost of $b/\mu$ conditional upon the unit not completing production prior to the order becoming due. Putting it together, the long-run average cost is approximately

$$C_2 := \lambda\Big[\frac{\mu}{\mu+\nu}\,\frac{h}{\nu} + \frac{\nu}{\mu+\nu}\,\frac{b}{\mu}\Big].$$

Hence, for the system with ADI, it will be better to produce upon announcement of an order provided $C_2 < C_1$. Rearranging terms, it follows that it is better to produce if $\nu > \nu^*(\mu) := h\mu/b$, and it is better to wait for the order to become due otherwise. It follows that if $\nu > \nu^*(\mu)$, then $\mathrm{PCR} = 100 \times (J_N - J_A)/J_N \approx 100 \times (C_1 - C_2)/C_1 = 100 \times (b\mu\nu - h\mu^2)/(b\mu\nu + b\nu^2)$ for $\lambda$ small.

Likewise, if ν ≤ ν ∗ (µ) then PCR ≈ 0 for λ small. A similar analysis is possible for general k.
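The approximation above is easy to check numerically. The sketch below uses assumed parameter values and verifies both the threshold $\nu^*(\mu) = h\mu/b$ and the PCR expression:

```python
# Low-lambda approximation: compare produce-to-order (C1) with
# produce-on-announcement (C2). All parameter values are assumed.
def C1(lam, b, mu):
    # each order backorders for a mean production time 1/mu
    return lam * b / mu

def C2(lam, b, h, mu, nu):
    # hold for mean h/nu if production finishes first (prob. mu/(mu+nu));
    # otherwise backorder for mean b/mu (prob. nu/(mu+nu))
    return lam * ((mu / (mu + nu)) * (h / nu) + (nu / (mu + nu)) * (b / mu))

lam, b, h, mu = 0.01, 10.0, 1.0, 1.0
nu_star = h * mu / b                          # threshold nu*(mu) = h*mu/b = 0.1
for nu in (0.05, 0.5, 2.0):
    # producing on announcement is better exactly when nu > nu*
    assert (C2(lam, b, h, mu, nu) < C1(lam, b, mu)) == (nu > nu_star)
    if nu > nu_star:
        pcr = 100 * (b * mu * nu - h * mu**2) / (b * mu * nu + b * nu**2)
        direct = 100 * (C1(lam, b, mu) - C2(lam, b, h, mu, nu)) / C1(lam, b, mu)
        assert abs(pcr - direct) < 1e-9       # closed form matches 100(C1-C2)/C1
```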

S-8 Extra Material for Section 5

Proof of Theorem 3. To prove (a), note that by Proposition S-1, the value function $\hat{v}^*$ satisfies conditions (Ĉ2) and (Ĉ3). Applying condition (Ĉ3) and using the definition of $\hat{s}_{y+e_{rw}}$, we have $\Delta\hat{v}^*(\hat{s}_{y+e_{rw}}, y+e_{pq}) \ge \Delta\hat{v}^*(\hat{s}_{y+e_{rw}}, y+e_{rw}) \ge 0$, which implies $\hat{s}_{y+e_{rw}} \ge \hat{s}_{y+e_{pq}}$. Also, by condition (Ĉ2) and the definition of $\hat{s}_{y+e_{pq}}$, we have $\Delta\hat{v}^*(\hat{s}_{y+e_{pq}}+1, y+e_{rw}) \ge \Delta\hat{v}^*(\hat{s}_{y+e_{pq}}, y+e_{pq}) \ge 0$, which implies $\hat{s}_{y+e_{pq}} + 1 \ge \hat{s}_{y+e_{rw}}$. Therefore, $\hat{s}_{y+e_{pq}} \le \hat{s}_{y+e_{rw}} \le \hat{s}_{y+e_{pq}} + 1$, and hence $\hat{s}_{y+e_{rw}}$ is equal to either $\hat{s}_{y+e_{pq}}$ or $\hat{s}_{y+e_{pq}} + 1$. Finally, part (b) is a consequence of the fact that $\hat{v}^*$ satisfies condition (Ĉ4).
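The sandwich property $\hat{s}_{y+e_{pq}} \le \hat{s}_{y+e_{rw}} \le \hat{s}_{y+e_{pq}} + 1$ can be illustrated with a toy function. The $v$ below is a hypothetical convex-in-$x$ stand-in (not the model's value function), chosen so that advancing an order by one stage raises the minimizer by one:

```python
# Base-stock extraction s_y = min{x : v(x+1, y) - v(x, y) >= 0}, applied to a
# toy convex function. The weighting (later stages count more) is an assumption.
def base_stock(v, y, x_lo=-50, x_hi=50):
    for x in range(x_lo, x_hi):
        if v(x + 1, y) - v(x, y) >= 0:        # first x with nonnegative difference
            return x
    raise ValueError("no base-stock level in search window")

v = lambda x, y: (x - (1 + y[0] + 2 * y[1])) ** 2   # minimizer grows as orders advance

s_pq = base_stock(v, (1, 0))                  # one order in the earlier stage
s_rw = base_stock(v, (0, 1))                  # same order, one stage closer to due
assert s_pq <= s_rw <= s_pq + 1               # sandwich property holds
```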

Proposition S-1 The value function $\hat{v}^*$ satisfies conditions (Ĉ1)–(Ĉ5).

c C5), c then it will follow by an argument identical to that Proof. If Tb preserves Conditions (C1)–(

c C5). c Hence, we need at the end of the proof of Proposition 1 that vb∗ satisfies Conditions (C1)–( c C5), c then so does Tbv. only prove that if v ∈ Vb satisfies Conditions (C1)–( c C5). c To this end, suppose hereafter that v satisfies Conditions (C1)–(

We will check that $\hat{T}v$ satisfies conditions (Ĉ1)–(Ĉ5). As before, it is sufficient to verify that each of $\{T_{ij}v\}$ and $T_\mu v$ individually satisfies the five conditions. For conditions (Ĉ1) and (Ĉ4) the verification is essentially identical to the approach used for conditions (C1) and (C4) in the proof of Proposition 1; hence we omit the details. Likewise, the verification that $T_\mu v$ satisfies the five conditions is virtually identical to the corresponding arguments in Proposition 1. Hence, we present only a few comments where there are minor differences in the verification for $T_\mu v$.

We begin by checking conditions (Ĉ2) and (Ĉ3). For this, suppose $(p, q) \prec (r, w)$ and $y+e_{pq}, y+e_{rw} \in \mathcal{Y}$, and let $J_{ij} = \mathbb{I}\{(p,q)=(i,j)\}$ and $L_{ij} = \mathbb{I}\{(r,w)=(i,j)\}$.

To verify that $T_\mu v$ satisfies conditions (Ĉ2) and (Ĉ3), let $x^*_y := \min\{x : \Delta v(x, y) \ge 0\}$ for $y \in \mathcal{Y}$. Similar to IDD systems, we have that $x^*_{y+e_{rw}}$ is either $x^*_{y+e_{pq}}$ or $x^*_{y+e_{pq}} + 1$, because $v$ satisfies conditions (Ĉ2) and (Ĉ3). Proof that $T_\mu v$ does indeed satisfy conditions (Ĉ2) and (Ĉ3) follows by considering cases directly analogous to those used for $T_\mu$ in the proof of Proposition 1.

Condition (Ĉ2): For operators $\{T_{ij}\}$, we need to show that $\Delta T_{ij}v(x, y+e_{pq}) \le \Delta T_{ij}v(x+1, y+e_{rw})$.

(a) For $i = 0$, $j = 1, \dots, k_0-1$, the operator $T_{ij} = T_{0j}$ is given by (12). If $y_{0j} = 1$ then

$$\Delta T_{0j}v(x, y+e_{pq}) = \Delta v(x, y+e_{pq}+e_{0,j+1}-e_{0j}) \le \Delta v(x+1, y+e_{rw}+e_{0,j+1}-e_{0j}) = \Delta T_{0j}v(x+1, y+e_{rw}),$$

where the inequality holds because $v$ is assumed to satisfy condition (Ĉ2). If $y_{0j} \ne 1$ then

$$\Delta T_{0j}v(x, y+e_{pq}) = \Delta v(x, y+e_{pq}+J_{0j}(e_{0,j+1}-e_{0j})) \le \Delta v(x+1, y+e_{rw}+L_{0j}(e_{0,j+1}-e_{0j})) = \Delta T_{0j}v(x+1, y+e_{rw}).$$

The inequality above can be checked by looking at the three possible values of $(J_{0j}, L_{0j})$. If $(J_{0j}, L_{0j}) \in \{(0,0), (1,0), (0,1)\}$, the inequality holds because $v$ satisfies condition (Ĉ2).

(b) For $i = 0$, $j = k_0$, the operator $T_{ij} = T_{0k_0}$ is given by (13). Let $I_1 = \mathbb{I}\{n=0\}$. If $y_{0k_0} = 1$ then

$$\Delta T_{0k_0}v(x, y+e_{pq}) = \Delta v(x-I_1, y+e_{pq}+e_{01}-e_{0k_0}+(1-I_1)e_{11}) \le \Delta v(x+1-I_1, y+e_{rw}+e_{01}-e_{0k_0}+(1-I_1)e_{11}) = \Delta T_{0k_0}v(x+1, y+e_{rw}).$$

If $y_{0k_0} \ne 1$ then

$$\Delta T_{0k_0}v(x, y+e_{pq}) = \Delta v(x-J_{0k_0}I_1, y+e_{pq}+J_{0k_0}(e_{01}-e_{0k_0}+(1-I_1)e_{11})) \le \Delta v(x+1-L_{0k_0}I_1, y+e_{rw}+L_{0k_0}(e_{01}-e_{0k_0}+(1-I_1)e_{11})) = \Delta T_{0k_0}v(x+1, y+e_{rw}).$$

The inequalities above follow from the fact that $v$ satisfies conditions (Ĉ2)–(Ĉ3).

(c) For $i = 1, \dots, n$, $j = 1$, the operator $T_{ij} = T_{i1}$ is given by (14) when $k_i \ge 2$. Let $I_2 = \mathbb{I}\{(p,q) \ne (i,l) \text{ for } l = 2, \dots, k_i\}$ and $I_3 = \mathbb{I}\{(r,w) \ne (i,l) \text{ for } l = 2, \dots, k_i\}$. If $y_{i1} \ge 1$ and $y_{il} = 0$ for $l = 2, \dots, k_i$, then

$$\Delta T_{i1}v(x, y+e_{pq}) = \Delta v(x, y+e_{pq}+I_2(e_{i2}-e_{i1})) \le \Delta v(x+1, y+e_{rw}+I_3(e_{i2}-e_{i1})) = \Delta T_{i1}v(x+1, y+e_{rw}).$$

If $y_{i1} = 0$ and $y_{il} = 0$ for $l = 2, \dots, k_i$, then

$$\Delta T_{i1}v(x, y+e_{pq}) = \Delta v(x, y+e_{pq}+J_{i1}(e_{i2}-e_{i1})) \le \Delta v(x+1, y+e_{rw}+L_{i1}(e_{i2}-e_{i1})) = \Delta T_{i1}v(x+1, y+e_{rw}).$$

If $y_{il} \ge 1$ for some $l = 2, \dots, k_i$, then

$$\Delta T_{i1}v(x, y+e_{pq}) = \Delta v(x, y+e_{pq}) \le \Delta v(x+1, y+e_{rw}) = \Delta T_{i1}v(x+1, y+e_{rw}).$$

The inequalities above follow from the fact that $v$ satisfies conditions (Ĉ1)–(Ĉ3).

(d) For $i = 1, \dots, n$, $j = 2, \dots, k_i-1$, the operator $T_{ij}$ is given by (15). If $y_{ij} = 1$ then

$$\Delta T_{ij}v(x, y+e_{pq}) = \Delta v(x, y+e_{pq}+e_{i,j+1}-e_{ij}) \le \Delta v(x+1, y+e_{rw}+e_{i,j+1}-e_{ij}) = \Delta T_{ij}v(x+1, y+e_{rw}).$$

If $y_{ij} = 0$ then

$$\Delta T_{ij}v(x, y+e_{pq}) = \Delta v(x, y+e_{pq}+J_{ij}(e_{i,j+1}-e_{ij})) \le \Delta v(x+1, y+e_{rw}+L_{ij}(e_{i,j+1}-e_{ij})) = \Delta T_{ij}v(x+1, y+e_{rw}).$$

The inequalities above follow from the fact that $v$ satisfies conditions (Ĉ1) and (Ĉ2).

(e) For $i = 1, \dots, n$ and $j = k_i$, the operator $T_{ik_i}$ is given by (16) and (17). Let $I_4 = \mathbb{I}\{i=n\}$. Using the fact that $v$ satisfies conditions (Ĉ1)–(Ĉ3), we have the following. If $y_{ik_i} \ge 1$ then

$$\Delta T_{ik_i}v(x, y+e_{pq}) = \Delta v(x-I_4, y+e_{pq}-e_{ik_i}+(1-I_4)e_{i+1,1}) \le \Delta v(x+1-I_4, y+e_{rw}-e_{ik_i}+(1-I_4)e_{i+1,1}) = \Delta T_{ik_i}v(x+1, y+e_{rw}).$$

If $y_{ik_i} = 0$ then

$$\Delta T_{ik_i}v(x, y+e_{pq}) = \Delta v(x-J_{ik_i}I_4, y+e_{pq}+J_{ik_i}(-e_{ik_i}+(1-I_4)e_{i+1,1})) \le \Delta v(x+1-L_{ik_i}I_4, y+e_{rw}+L_{ik_i}(-e_{ik_i}+(1-I_4)e_{i+1,1})) = \Delta T_{ik_i}v(x+1, y+e_{rw}).$$

This completes the verification that each $T_{ij}v$ satisfies condition (Ĉ2).

Condition (Ĉ3): For operators $\{T_{ij}\}$, we need to show that $\Delta T_{ij}v(x, y+e_{rw}) \le \Delta T_{ij}v(x, y+e_{pq})$.

(a) For $i = 0$, $j = 1, \dots, k_0-1$, the operator $T_{ij} = T_{0j}$ is given by (12). If $y_{0j} = 1$ then

$$\Delta T_{0j}v(x, y+e_{rw}) = \Delta v(x, y+e_{rw}+e_{0,j+1}-e_{0j}) \le \Delta v(x, y+e_{pq}+e_{0,j+1}-e_{0j}) = \Delta T_{0j}v(x, y+e_{pq}).$$

If $y_{0j} \ne 1$ then

$$\Delta T_{0j}v(x, y+e_{rw}) = \Delta v(x, y+e_{rw}+L_{0j}(e_{0,j+1}-e_{0j})) \le \Delta v(x, y+e_{pq}+J_{0j}(e_{0,j+1}-e_{0j})) = \Delta T_{0j}v(x, y+e_{pq}).$$

The inequalities above follow from the fact that $v$ satisfies condition (Ĉ3).

(b) For $i = 0$, $j = k_0$, the operator $T_{ij} = T_{0k_0}$ is given by (13). Let $I_1 = \mathbb{I}\{n=0\}$. If $y_{0k_0} = 1$ then

$$\Delta T_{0k_0}v(x, y+e_{rw}) = \Delta v(x-I_1, y+e_{rw}+e_{01}-e_{0k_0}+(1-I_1)e_{11}) \le \Delta v(x-I_1, y+e_{pq}+e_{01}-e_{0k_0}+(1-I_1)e_{11}) = \Delta T_{0k_0}v(x, y+e_{pq}).$$

If $y_{0k_0} \ne 1$ then

$$\Delta T_{0k_0}v(x, y+e_{rw}) = \Delta v(x-L_{0k_0}I_1, y+e_{rw}+L_{0k_0}(e_{01}-e_{0k_0}+(1-I_1)e_{11})) \le \Delta v(x-J_{0k_0}I_1, y+e_{pq}+J_{0k_0}(e_{01}-e_{0k_0}+(1-I_1)e_{11})) = \Delta T_{0k_0}v(x, y+e_{pq}).$$

The inequalities above hold because $v$ satisfies conditions (Ĉ1)–(Ĉ3) and (Ĉ5).

(c) For $i = 1, \dots, n$, $j = 1$, the operator $T_{ij} = T_{i1}$ is given by (14) when $k_i \ge 2$. Let $I_2 = \mathbb{I}\{(p,q) \ne (i,l) \text{ for } l = 2, \dots, k_i\}$ and $I_3 = \mathbb{I}\{(r,w) \ne (i,l) \text{ for } l = 2, \dots, k_i\}$. If $y_{i1} \ge 1$ and $y_{il} = 0$ for $l = 2, \dots, k_i$, then

$$\Delta T_{i1}v(x, y+e_{rw}) = \Delta v(x, y+e_{rw}+I_3(e_{i2}-e_{i1})) \le \Delta v(x, y+e_{pq}+I_2(e_{i2}-e_{i1})) = \Delta T_{i1}v(x, y+e_{pq}).$$

If $y_{i1} = 0$ and $y_{il} = 0$ for $l = 2, \dots, k_i$, then

$$\Delta T_{i1}v(x, y+e_{rw}) = \Delta v(x, y+e_{rw}+L_{i1}(e_{i2}-e_{i1})) \le \Delta v(x, y+e_{pq}+J_{i1}(e_{i2}-e_{i1})) = \Delta T_{i1}v(x, y+e_{pq}).$$

If $y_{il} \ge 1$ for some $l = 2, \dots, k_i$, then

$$\Delta T_{i1}v(x, y+e_{rw}) = \Delta v(x, y+e_{rw}) \le \Delta v(x, y+e_{pq}) = \Delta T_{i1}v(x, y+e_{pq}).$$

The inequalities hold because $v$ satisfies condition (Ĉ3).

(d) For $i = 1, \dots, n$, $j = 2, \dots, k_i-1$, the operator $T_{ij}$ is given by (15). If $y_{ij} = 1$ then

$$\Delta T_{ij}v(x, y+e_{rw}) = \Delta v(x, y+e_{rw}+e_{i,j+1}-e_{ij}) \le \Delta v(x, y+e_{pq}+e_{i,j+1}-e_{ij}) = \Delta T_{ij}v(x, y+e_{pq}).$$

If $y_{ij} = 0$ then

$$\Delta T_{ij}v(x, y+e_{rw}) = \Delta v(x, y+e_{rw}+L_{ij}(e_{i,j+1}-e_{ij})) \le \Delta v(x, y+e_{pq}+J_{ij}(e_{i,j+1}-e_{ij})) = \Delta T_{ij}v(x, y+e_{pq}).$$

The inequalities above follow from the fact that $v$ satisfies condition (Ĉ3).

(e) For $i = 1, \dots, n$, $j = k_i$, the operator $T_{ij} = T_{ik_i}$ is given by (16) and (17). Let $I_4 = \mathbb{I}\{i=n\}$. Using the fact that $v$ satisfies conditions (Ĉ1)–(Ĉ3), we have the following. If $y_{ik_i} \ge 1$ then

$$\Delta T_{ik_i}v(x, y+e_{rw}) = \Delta v(x-I_4, y+e_{rw}-e_{ik_i}+(1-I_4)e_{i+1,1}) \le \Delta v(x-I_4, y+e_{pq}-e_{ik_i}+(1-I_4)e_{i+1,1}) = \Delta T_{ik_i}v(x, y+e_{pq}).$$

If $y_{ik_i} = 0$ then

$$\Delta T_{ik_i}v(x, y+e_{rw}) = \Delta v(x-L_{ik_i}I_4, y+e_{rw}+L_{ik_i}(-e_{ik_i}+(1-I_4)e_{i+1,1})) \le \Delta v(x-J_{ik_i}I_4, y+e_{pq}+J_{ik_i}(-e_{ik_i}+(1-I_4)e_{i+1,1})) = \Delta T_{ik_i}v(x, y+e_{pq}).$$

This completes the verification that $T_{ij}v$ satisfies conditions (Ĉ2) and (Ĉ3) for all $(i, j)$.

Condition (Ĉ5): Turning to condition (Ĉ5), consider $y$ and $q \ge 2$ such that $y+e_{0q}, y+e_{01}+e_{11} \in \mathcal{Y}$. By the comments that precede condition (Ĉ5) in Section 5, it suffices for us to hereafter consider only cases with $k_0 \ge 2$ and $n \ge 1$.

To verify that $T_\mu v$ satisfies condition (Ĉ5), observe first that

$$\Delta v(x, y+e_{0q}) \le \Delta v(x+1, y+e_{01}+e_{11}). \qquad \text{(S-28)}$$

To see this, note that $\Delta v(x, y+e_{0q}) \le \Delta v(x, y+e_{01}) = \Delta v(x, y+e_{01}+e_{00}) \le \Delta v(x+1, y+e_{01}+e_{11})$, where the first and second inequalities follow because $v$ satisfies conditions (Ĉ3) and (Ĉ2), respectively. From (S-28) and the fact that $v$ satisfies condition (Ĉ5), it follows that $x^*_{y+e_{01}+e_{11}}$ is either $x^*_{y+e_{0q}}$ or $x^*_{y+e_{0q}}+1$. Proof that $T_\mu v$ satisfies condition (Ĉ5) follows by considering cases directly analogous to those used to show that $T_\mu v$ satisfies condition (C3) in the proof of Proposition 1.

For $\{T_{ij}\}$ we need to show that $\Delta T_{ij}v(x, y+e_{01}+e_{11}) \le \Delta T_{ij}v(x, y+e_{0q})$ for all $(x, y)$ and $q \ge 2$ such that $y+e_{0q}, y+e_{01}+e_{11} \in \mathcal{Y}$.

(a) For $i = 0$, $j = 1, \dots, k_0-1$, the operator $T_{ij} = T_{0j}$ is given by (12). Since $v$ satisfies conditions (Ĉ3) and (Ĉ5), we have that

$$\Delta T_{0j}v(x, y+e_{01}+e_{11}) = \Delta v(x, y+e_{01}\mathbb{I}\{j\ne 1\}+e_{02}\mathbb{I}\{j=1\}+e_{11}) \le \Delta v(x, y+e_{0j}\mathbb{I}\{j\ne q\}+e_{0,j+1}\mathbb{I}\{j=q\}) = \Delta T_{0j}v(x, y+e_{0q}).$$

(b) For $i = 0$, $j = k_0$, the operator $T_{ij} = T_{0k_0}$ is given by (13). Recall that we need only consider the case with $k_0 \ge 2$ and $n \ge 1$. Since $v$ satisfies condition (Ĉ5), we have

$$\Delta T_{0k_0}v(x, y+e_{01}+e_{11}) = \Delta v(x, y+e_{01}+e_{11}) \le \Delta v(x, y+[e_{01}+e_{11}]\mathbb{I}\{q=k_0\}+e_{0q}\mathbb{I}\{q\ne k_0\}) = \Delta T_{0k_0}v(x, y+e_{0q}).$$

(c) For $i = 1, \dots, n$, $j = 1$, the operator $T_{ij} = T_{i1}$ is given by (14) when $k_i \ge 2$. For $i \ne 1$ we have

$$\Delta T_{i1}v(x, y+e_{01}+e_{11}) = \Delta v(x, y+e_{01}+e_{11}+[e_{i2}-e_{i1}]\mathbb{I}\{y_{i1}\ge 1 \text{ and } y_{i\ell}=0 \text{ for } \ell=2,\dots,k_i\}) \le \Delta v(x, y+e_{0q}+[e_{i2}-e_{i1}]\mathbb{I}\{y_{i1}\ge 1 \text{ and } y_{i\ell}=0 \text{ for } \ell=2,\dots,k_i\}) = \Delta T_{i1}v(x, y+e_{0q}),$$

because $v$ satisfies condition (Ĉ5). For $i = 1$, let $I' = \mathbb{I}\{y_{1\ell}=0 \text{ for } \ell=2,\dots,k_1\}$. Then

$$\Delta T_{11}v(x, y+e_{01}+e_{11}) = \Delta v(x, y+e_{01}+e_{12}I'+e_{11}[1-I']) \le \Delta v(x, y+e_{0q}+[e_{12}-e_{11}]\mathbb{I}\{y_{11}\ge 1 \text{ and } y_{1\ell}=0 \text{ for } \ell=2,\dots,k_1\}) = \Delta T_{11}v(x, y+e_{0q}),$$

because $v$ satisfies conditions (Ĉ3) and (Ĉ5).

(d) For $i = 1, \dots, n$, $j = 2, \dots, k_i-1$, the operator $T_{ij}$ is given by (15). Since $v$ satisfies condition (Ĉ5), we have

$$\Delta T_{ij}v(x, y+e_{01}+e_{11}) = \Delta v(x, y+e_{01}+e_{11}+[e_{i,j+1}-e_{ij}]\mathbb{I}\{y_{ij}\ge 1\}) \le \Delta v(x, y+e_{0q}+[e_{i,j+1}-e_{ij}]\mathbb{I}\{y_{ij}\ge 1\}) = \Delta T_{ij}v(x, y+e_{0q}).$$

(e) For $i = 1, \dots, n-1$, $j = k_i$, the operator $T_{ij} = T_{ik_i}$ is given by (16). If $i > 1$, or if both $i = 1$ and $k_1 > 1$, then we have

$$\Delta T_{ik_i}v(x, y+e_{01}+e_{11}) = \Delta v(x, y+e_{01}+e_{11}+[e_{i+1,1}-e_{ik_i}]\mathbb{I}\{y_{ik_i}\ge 1\}) \le \Delta v(x, y+e_{0q}+[e_{i+1,1}-e_{ik_i}]\mathbb{I}\{y_{ik_i}\ge 1\}) = \Delta T_{ik_i}v(x, y+e_{0q}),$$

because $v$ satisfies condition (Ĉ5). If $i = 1$ and $k_1 = 1$, we have

$$\Delta T_{11}v(x, y+e_{01}+e_{11}) = \Delta v(x, y+e_{01}+e_{21}) \le \Delta v(x, y+e_{0q}+[e_{21}-e_{11}]\mathbb{I}\{y_{11}\ge 1\}) = \Delta T_{11}v(x, y+e_{0q}),$$

because $v$ satisfies conditions (Ĉ3) and (Ĉ5).

(f) For $i = n$, $j = k_n$, the operator $T_{ij} = T_{nk_n}$ is given by (17). If $n > 1$, or if both $n = 1$ and $k_1 > 1$, then we have

$$\Delta T_{nk_n}v(x, y+e_{01}+e_{11}) = \Delta v(x-\mathbb{I}\{y_{nk_n}\ge 1\}, y+e_{01}+e_{11}-e_{nk_n}\mathbb{I}\{y_{nk_n}\ge 1\}) \le \Delta v(x-\mathbb{I}\{y_{nk_n}\ge 1\}, y+e_{0q}-e_{nk_n}\mathbb{I}\{y_{nk_n}\ge 1\}) = \Delta T_{nk_n}v(x, y+e_{0q}),$$

because $v$ satisfies condition (Ĉ5). If $n = 1$ and $k_1 = 1$, we have

$$\Delta T_{11}v(x, y+e_{01}+e_{11}) = \Delta v(x-1, y+e_{01}) \le \Delta v(x-\mathbb{I}\{y_{11}\ge 1\}, y+e_{0q}-e_{11}\mathbb{I}\{y_{11}\ge 1\}) = \Delta T_{11}v(x, y+e_{0q}),$$

because $v$ satisfies conditions (Ĉ2) and (Ĉ5).

This completes the verification that $T_{ij}v$ satisfies condition (Ĉ5).


S-9 Additional Numerical Results

| ν1 = ν2 | λ=0.4: k=0 | k=1 | k=2 | λ=0.6: k=0 | k=1 | k=2 | λ=0.8: k=0 | k=1 | k=2 |
|---|---|---|---|---|---|---|---|---|---|
| 0.01 | 6.67 | 6.64 (0.51%) | 6.62 (0.78%) | 13.00 | 12.89 (0.84%) | 12.88 (0.87%) | 30.96 | 29.09 (6.04%) | 28.42 (8.20%) |
| 0.02 | 6.67 | 6.56 (1.66%) | 6.56 (1.67%) | 13.00 | 12.68 (2.45%) | 12.67 (2.50%) | 30.96 | 28.34 (8.47%) | 27.55 (11.01%) |
| 0.05 | 6.67 | 6.35 (4.84%) | 6.32 (5.25%) | 13.00 | 12.20 (6.13%) | 12.10 (6.92%) | 30.96 | 27.70 (10.54%) | 27.04 (12.68%) |
| 0.1 | 6.67 | 6.10 (8.52%) | 6.10 (8.55%) | 13.00 | 11.77 (9.46%) | 11.60 (10.77%) | 30.96 | 27.86 (10.61%) | 27.00 (12.79%) |
| 0.2 | 6.67 | 5.82 (12.73%) | 5.80 (13.04%) | 13.00 | 11.44 (12.03%) | 11.00 (15.38%) | 30.96 | 28.15 (9.06%) | 27.10 (12.47%) |
| 0.5 | 6.67 | 5.59 (16.21%) | 5.36 (19.57%) | 13.00 | 11.33 (12.82%) | 10.84 (16.62%) | 30.96 | 29.14 (5.87%) | 27.96 (9.68%) |
| 1.0 | 6.67 | 5.27 (20.97%) | 5.01 (24.89%) | 13.00 | 11.72 (9.81%) | 11.01 (15.31%) | 30.96 | 29.95 (3.26%) | 29.70 (4.07%) |
| 1.5 | 6.67 | 5.34 (19.98%) | 5.08 (23.78%) | 13.00 | 12.37 (4.85%) | 11.06 (14.89%) | 30.96 | 30.15 (2.60%) | 29.76 (3.89%) |
| 2.0 | 6.67 | 5.49 (17.74%) | 5.08 (23.81%) | 13.00 | 12.82 (1.38%) | 12.68 (2.46%) | 30.96 | 30.33 (2.02%) | 29.93 (3.34%) |

Table S-1: Average cost and percentage cost reduction (PCR, in parentheses) for systems with IDD (b = 10). The columns labeled “k = 0” show the average cost for systems without ADI.

| ν1 = ν2 | λ=0.4: k=0 | k=1 | k=2 | λ=0.6: k=0 | k=1 | k=2 | λ=0.8: k=0 | k=1 | k=2 |
|---|---|---|---|---|---|---|---|---|---|
| 0.01 | 19.33 | 18.34 (5.15%) | 18.22 (5.72%) | 34.44 | 32.98 (4.24%) | 32.84 (4.65%) | 80.26 | 71.28 (11.19%) | 70.01 (12.77%) |
| 0.02 | 19.33 | 17.87 (7.57%) | 17.86 (7.60%) | 34.44 | 31.83 (7.58%) | 31.77 (7.74%) | 80.26 | 69.21 (13.77%) | 67.40 (16.03%) |
| 0.05 | 19.33 | 16.93 (12.41%) | 16.89 (12.61%) | 34.44 | 30.00 (12.88%) | 29.63 (13.98%) | 80.26 | 69.60 (13.29%) | 65.61 (18.26%) |
| 0.10 | 19.33 | 15.93 (17.58%) | 15.80 (18.26%) | 34.44 | 29.12 (15.46%) | 28.10 (18.41%) | 80.26 | 72.34 (9.87%) | 67.00 (16.52%) |
| 0.20 | 19.33 | 14.98 (22.52%) | 14.60 (24.47%) | 34.44 | 29.35 (14.77%) | 27.00 (21.60%) | 80.26 | 75.48 (5.95%) | 71.00 (11.54%) |
| 0.50 | 19.33 | 15.92 (17.64%) | 13.65 (29.39%) | 34.44 | 31.76 (7.77%) | 28.81 (16.35%) | 80.26 | 78.18 (2.59%) | 76.20 (5.05%) |
| 1.0 | 19.33 | 16.37 (15.29%) | 15.01 (22.35%) | 34.44 | 32.80 (4.75%) | 32.00 (7.08%) | 80.26 | 79.21 (1.31%) | 78.50 (2.19%) |
| 1.50 | 19.33 | 16.86 (12.77%) | 16.02 (17.11%) | 34.44 | 33.85 (1.71%) | 32.06 (6.92%) | 80.26 | 79.38 (1.09%) | 78.86 (1.75%) |
| 2.00 | 19.33 | 17.27 (10.67%) | 16.23 (16.06%) | 34.44 | 33.92 (1.51%) | 33.75 (2.00%) | 80.26 | 79.54 (0.89%) | 79.50 (0.95%) |

Table S-2: Average cost and percentage cost reduction (PCR, in parentheses) for systems with IDD (b = 50). The columns labeled “k = 0” show the average cost for systems without ADI.

| ν11 = ν21 | ρ=0.4: n=0 | n=1 | n=2 | ρ=0.6: n=0 | n=1 | n=2 | ρ=0.8: n=0 | n=1 | n=2 |
|---|---|---|---|---|---|---|---|---|---|
| 0.41 | 25.07 | 24.19 (3.49%) | 23.97 (4.39%) | – | – | – | – | – | – |
| 0.61 | 25.07 | 21.10 (15.84%) | 20.21 (19.39%) | 46.38 | 43.51 (6.21%) | 42.69 (7.96%) | – | – | – |
| 0.81 | 25.07 | 20.35 (18.80%) | 19.06 (23.96%) | 46.38 | 39.29 (15.30%) | 37.69 (18.74%) | 107.24 | 98.73 (7.94%) | 96.74 (9.79%) |
| 1.00 | 25.07 | 20.95 (16.41%) | 19.10 (23.79%) | 46.38 | 40.12 (13.51%) | 36.40 (21.53%) | 107.24 | 93.82 (12.52%) | 87.33 (18.57%) |
| 2.00 | 25.07 | 23.55 (6.05%) | 21.68 (13.51%) | 46.38 | 44.74 (3.53%) | 42.39 (8.62%) | 107.24 | 105.91 (1.25%) | 104.40 (2.66%) |
| 5.00 | 25.07 | 24.42 (2.57%) | 23.98 (4.35%) | 46.38 | 45.81 (1.25%) | 45.40 (2.11%) | 107.24 | 106.90 (0.32%) | 106.66 (0.54%) |
| 10.00 | 25.07 | 24.75 (1.26%) | 24.48 (2.33%) | 46.38 | 46.11 (0.59%) | 45.88 (1.09%) | 107.24 | 107.09 (0.15%) | 106.95 (0.27%) |

Table S-3: The case of SDD: average cost and percentage cost reduction (PCR, in parentheses) between systems with ADI and systems with no ADI (b = 100), with ki = 1. When ρ > νi1 the leadtime system is not stable (indicated by dashes).

References

Bertsekas, D. P., 2001. Dynamic Programming and Optimal Control, Volume 2, 2nd Edition. Athena Scientific, Belmont, MA.

Brémaud, P., 1999. Markov Chains: Gibbs Fields, Monte Carlo Simulation, and Queues. Springer-Verlag, New York.

Guo, X., Hernández-Lerma, O., 2003. Continuous-time controlled Markov chains with discounted rewards. Acta Applicandae Mathematicae 79, 195–216.

Puterman, M. L., 1994. Markov Decision Processes: Discrete Stochastic Dynamic Programming. John Wiley & Sons, New York.

Sennott, L. I., 1999. Stochastic Dynamic Programming and the Control of Queueing Systems. John Wiley & Sons, New York.
