Partially Observed Inventory Systems: The Case of Zero Balance Walk∗

Alain Bensoussan
[email protected]
Phone: (972) 883-6117; Fax: (972) 883-5905
International Center for Decision and Risk Analysis
School of Management, P.O.Box 830688, SM 30
University of Texas at Dallas
Richardson, TX 75083-0688

Metin Çakanyıldırım
[email protected]
Phone: (972) 883-6361; Fax: (972) 883-2089
School of Management, P.O.Box 830688, SM 30
University of Texas at Dallas
Richardson, TX 75083-0688

Suresh P. Sethi
[email protected]
Phone: (972) 883-6245; Fax: (972) 883-2799
Center for Intelligent Supply Networks
School of Management, P.O.Box 830688, SM 30
University of Texas at Dallas
Richardson, TX 75083-0688

October 6, 2006

Abstract

In many inventory control contexts, inventory levels are only partially (i.e., not fully) observed. This may be due to non-observation of demand, spoilage, misplacement, or theft of inventory. We study a partially observed inventory system where the demand is not observed, inventory level is noticed when it reaches zero, the unmet demand is lost, and replenishment orders must be decided so as to minimize the total discounted costs over an infinite horizon. This problem has an infinite-dimensional state space, and for it we establish the existence of a feedback policy when single period costs are bounded or when the discount factor is sufficiently small. We also provide an approximately optimal feedback policy that uses a finite state representation.

Keywords: Stochastic inventory problem, partial observations, the Zakai equation, lost sales.

Abbreviated title: The Case of Zero Balance Walk.

AMS Subject Classifications: 90B05, 93E20, 93C41, 90C39.

∗

To appear in SIAM Journal on Control and Optimization. This material is based upon work supported by the National Science Foundation under Grant No. 0509278.

Electronic copy available at: http://ssrn.com/abstract=1082935

1 Introduction

Inventory control is among the most important topics in operations research because of the large investments in inventory and their effect on the profitability of firms. In 1999, for example, investment in inventory by U.S. businesses alone amounted to 1.1 trillion dollars [31]. Because of the importance of inventory control decisions, an extensive literature has developed on the topic [2, 31]. Relevant to the motivation of our research, one of the critical assumptions in this vast literature, dating back to at least the Harris lot size model of 1913 [17], has been that the level of inventory at any given time is fully observed. Some of the most celebrated results, such as the optimality of the base stock policy [1], have been obtained under the assumption of full observation. Yet the inventory level is often not fully observed in practice, as elaborated below. In such cases, most of the well-known inventory policies are not even admissible, let alone optimal.

The study of systems with partially observed inventories is important in many real-life situations. We introduce some of the possible instances in which inventories can only be partially observed by the inventory manager (IM).

Transaction errors: Unintentional mistakes happen from time to time during inventory transactions such as inventory counting, receiving, and checking out at the cash register. An example is checking out at a grocery store. If a customer buys two different types of soup, each at the same price, the sales clerk often scans one soup type twice. A similar example with different types of yogurt can be found in Raman et al. [25]. In such cases, the recorded inventory levels of the items involved will differ from their actual levels. When stock-keeping units are discrete, it may be possible to eliminate counting errors. On the other hand, when they are not discrete, such as oil in a refinery, exact measurements are difficult to obtain.
While measurement errors cause inventory to be not fully observed, it is often the mistakes in reporting transactions that lead to partial observation of inventories. Raman et al. [25] report a retailer who has inaccurate inventory records for 65% of its stock-keeping units. It is roughly estimated that the retailer loses 10% of its current profit due to these inaccuracies. They go on to say: “[this particular retailer] is not an isolated case; this [inaccuracy] problem is common at other retailers.” Common use of modern information technology tends to reduce transaction errors. However, as pointed out by Axsäter [2], deployment of big-ticket computer technology is not always economically feasible.

Misplaced inventory: When a part of the inventory on hand is misplaced, it is not available to meet demand until it is found. Often the misplaced inventories are not immediately found, and thus they remain unobserved by the IM. This causes the total inventory that is available to meet the demand to become partially observed. Misplaced inventory can be quite large and have a significant impact on the bottom line. It is reported in [25] that customers of a “leading retailer” cannot find 16% of the items in the stores because those items are misplaced. Misplacement of the items reduces the profit by roughly 25% at this retailer. Misplacement is more likely when the location of items in storage is altered dynamically. According to [2], “It is easier to keep the records accurate if the items have fixed locations. On the other hand, this can lead to inefficient space utilization. By dynamically locating items, the same item can be stored in more than one location.” Recent trends in supply chain management such as crossdocking (see e.g. p.412 of [13]) also cause dynamic locations. Misplaced inventories are eventually discovered either by inspection or by chance. When the misplaced items are placed on their proper shelves, they become available once again to meet customer demands. Thus, misplacements and their recoveries can cause the actual inventory to be respectively less and more than the recorded inventory.

Spoilage: Products can naturally lose their properties while they are held in inventory [24]. Examples with limited lifetime are drugs, chemicals, and food products. If the lifetime is limited and not immediately observed, then the actual inventory is less than the recorded inventory, and it is partially observed. If the lifetime is deterministic as in the case of drugs, an implementation of RFID (Radio Frequency IDentification) tags called SMC (Smart Medicine Cabinet) can be used to track the expired drugs [28]. Thus, the SMC can make drug spoilage fully observed. However, investments in technology such as SMC must be justified with an economic analysis, which requires the evaluation of the optimal cost under partial observations [9]. As an example of random lifetime, consider the number of batteries in a Sears retail store. Only when these batteries are inspected (say, by measuring their voltage one by one) does the inventory level of fully functioning batteries become known. When the spoiled inventory is observed immediately, the associated model (e.g. [23]), in spite of being challenging to work with, has full observations. In retail stores, customers can cause damage to products, making them unsuitable for sale. Some examples are tearing a package to try on the clothing item inside, wearing down a shoe by trying it on and walking, erasing software on demonstration computers, spilling food on clothes, and scratching a car during a test drive. So long as the damages are not detected by the IM, the actual inventory is not fully observed.
Product quality and yield: When the product quality is low or a production process has a low yield [29], the actual inventory is not known. Receipts at a warehouse can include products that are defective or that do not conform to quality standards. It is very often the case that nonconformance of a product is not immediately observed by the IM. Receipts are usually added to the inventory at the warehouse without full inspection. As a result, the inventory on record may consist of both nondefective products (available to meet customer demands) and defective products (not fit for sale). Since the defective products are not immediately observed, the actual (nondefective) inventory becomes partially observed. If production lead times are long, an IM may have to place a particular order before observing the yields from previous orders, so that the production of the particular order is completed by a given due date. Thus, partial observability of inventory can be caused by due dates and long production lead times as well as process yields.

Theft: The items in the inventory can be stolen by thieves who violently break into the inventory storage, by warehouse employees who calmly pilfer, or by customers who shoplift. Since violent break-ins are generally investigated, they are usually observed and therefore not relevant for our study. We focus more on continuous pilferage or shoplifting, because they are not always observed without inventory inspections. Instances of theft at furniture retailers and at food wholesalers have been documented in [12] and [26]. Axsäter [2] says: “. . . thefts may be a major problem. Apart from the loss in value, thefts . . . also lead to inaccurate inventory records.” Thus, the IM relying on inventory records ends up overestimating the available inventory until a stockout occurs. In this case, there are shortage costs in addition to costs of reordering, expediting, and re-receiving items to replace the stolen units. Typically, costs of the expedited items to urgently meet backlogged demands are much more than their regular costs.

When there is no physical inventory, i.e., the inventory is zero or negative, then none of the following would happen: transaction errors, misplaced inventories, spoilage, inventory level uncertainty due to yield and quality, or theft. Most companies pay utmost attention to an item when its inventory reaches zero. At these companies, employees walk around the shelves to identify the stocked-out items and verify the inventory levels for those items. This process is implemented at the office supplies store Staples and is called “zero-balance walk” in [16] and [25]. Thus, a model based on a zero-balance walk process can be built by assuming that the inventory levels are fully observed when they are zero. It is the purpose of this paper to formulate and analyze a zero-balance walk model. This paper is part of a greater effort to build a comprehensive theory of inventory control under partial observations. In related works, we study some of the other inventory models with partial observations [6, 7, 8].

There have been a few studies of partial observations in the inventory control context. In these studies, partial observations are about demands rather than inventories. Among these, a common assumption is that of unobserved demands in the periods when lost sales occur. That is, the demand is observed fully when it is smaller than the available inventory. Otherwise, only the event that it is larger than the inventory is observed. When the underlying demand distribution is not known but estimated from the demand observations, partial demand observations limit the data available for estimation. This is called estimation with censored (demand) data. Ding et al. [14] and Lu et al.
[22] have a multiperiod newsvendor model with censored demand. They assume the leftover inventory in a period to be salvaged entirely so that every period starts with zero inventory. This assumption decouples the periods from each other as far as the inventory evolution is concerned. However, the periods are still coupled together by the current estimate of the demand distribution. The demand distribution is updated in a Bayesian fashion in each period with that period’s demand or the observation of the lost sales event. Thus, there is an evolution equation that maps one period’s demand distribution to the next period’s. This evolution is affected by the choice of the order quantity. Before Ding et al., Lariviere and Porteus [21] treated a similar problem for the restricted case of exponential demand distributions with gamma conjugate priors.

Treharne and Sox [27] study a periodic review inventory model with Markov modulated demand. A simple example of such a demand occurs when there are two demand states (High and Low) of a Markov chain and there are two demand distributions, one for each state. Unlike the Markovian demand cases (treated e.g. in [10, 11]), it is not observed whether the demand state is High or Low. Instead, probabilities are used to represent the event that the demand state is High or Low. These probabilities along with the current level of inventory constitute the state of the system. The probabilities are updated in each period in accordance with that period’s demand. Neither the current level of inventory nor the order size affects the probability updates. Evolution of the probabilities that capture partial observability is totally independent of the order quantities. The evolution equations can be written down in the first period, and they will include the random demands in the forthcoming periods. To make the discussion simple, here we mention only two demand states, but Treharne and Sox consider finitely many demand states. Consequently, they have a finite dimensional state for their system.

The models we have described above make simplifying assumptions to end up with an easily workable setup. Ding et al. assume that the leftover inventory is salvaged every period, while Treharne and Sox have updates of the probabilities which are independent of the controls. Thus, they can capture only a limited amount of the dynamics associated with partial observations. Besides, they do not consider the issue of existence of optimal policies. Consequently, they do not require the methodology developed in this paper. Without such a methodology, however, inventory models with partial observations will remain largely unexplored.

A main reason why the analysis of inventory problems under partial observations has been neglected lies in its mathematical difficulty. Whereas one works with a finite dimensional state space in the full observation case, one usually has to deal with an infinite dimensional state space in the partial observation setting. More specifically, the inventory level at a given time is no longer a system state in […] 0 and $\pi(\cdot)$ is the probability distribution of $I_1$. We look for an admissible control $\tilde{q} = \{q_1, q_2, \dots\}$, with $q_t$ adapted to $\mathcal{Z}_t$, $t \ge 1$, such that $J(\zeta, \pi, \tilde{q})$ is minimized.

Special cases: To make the form of the single-period cost $c(I, q)$ concrete, we can consider $c(I, q) = c_1 q + hI + bE[(D - I - q)^+]$, which is often used in the inventory control literature [11]. The cost parameters $c_1$, $h$, and $b$ can be interpreted as the cost of purchasing an item, the cost of holding an item in the inventory charged at the beginning of a period, and the opportunity cost of not selling an item when there is demand for it. Since $bE[(D - I - q)^+] \le bE[D]$, $c(I, q)$ is of linear growth in $I$ and $q$. Another example includes a nonzero fixed cost of ordering.
These observations will inspire an assumption on the bounds of the general single-period cost c(I, q) in Section 3.
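As a small illustration of the special-case cost above, the following sketch evaluates $c(I, q) = c_1 q + hI + bE[(D - I - q)^+]$ under the assumption (not made in the text) that demand is exponential with a given rate, for which $E[(D - a)^+] = e^{-\text{rate}\cdot a}/\text{rate}$ when $a \ge 0$. The function names and parameter values are hypothetical.

```python
import math

def expected_lost_sales(stock, rate):
    """E[(D - stock)^+] for D ~ Exponential(rate) and stock >= 0.

    Assumption for illustration only: exponential demand gives the
    closed form exp(-rate * stock) / rate.
    """
    if stock < 0:
        raise ValueError("stock must be nonnegative")
    return math.exp(-rate * stock) / rate

def single_period_cost(I, q, c1, h, b, rate):
    """c(I, q) = c1*q + h*I + b*E[(D - I - q)^+] with exponential demand."""
    return c1 * q + h * I + b * expected_lost_sales(I + q, rate)

# Linear-growth bound from the text: b*E[(D - I - q)^+] <= b*E[D] = b / rate.
cost = single_period_cost(I=3.0, q=2.0, c1=1.0, h=0.5, b=4.0, rate=2.0)
bound = 1.0 * 2.0 + 0.5 * 3.0 + 4.0 / 2.0
```

Here `cost` never exceeds `bound`, which is the linear-growth property noted above.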

2.1 Evolution of state probabilities

We now develop the conditional probability density $\pi_t(\cdot)$ of $I_t$ given $\mathcal{Z}_{t-1}$ and $I_t > 0$. By definition,

$$\int_0^x \pi_t(y)\,dy = P(I_t \le x \mid \mathcal{Z}_{t-1},\, I_t > 0).$$
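The observation structure behind this definition can be illustrated with a short simulation. The sketch below assumes lost-sales dynamics $I_{t+1} = (I_t + q_t - D_t)^+$ with exponential demand and a signal $z_t = 1$ exactly when $I_t = 0$; the function name, the demand law, and the seed are illustrative choices rather than part of the model statement.

```python
import random

def simulate_zero_balance_walk(initial_inventory, orders, demand_rate, seed=0):
    """Simulate inventory levels and zero-balance signals.

    Assumed dynamics (illustrative): I_{t+1} = max(I_t + q_t - D_t, 0)
    with D_t ~ Exponential(demand_rate), and z_t = 1 iff I_t == 0.
    Only the signals would be visible to the inventory manager.
    """
    rng = random.Random(seed)
    inventory = float(initial_inventory)
    levels, signals = [], []
    for q in orders:
        levels.append(inventory)
        signals.append(1 if inventory == 0.0 else 0)
        demand = rng.expovariate(demand_rate)
        # Unmet demand is lost, so inventory never goes negative.
        inventory = max(inventory + q - demand, 0.0)
    return levels, signals

levels, signals = simulate_zero_balance_walk(0.0, [1.0] * 50, demand_rate=1.0, seed=1)
```

Between two consecutive periods with `signals[t] == 1`, the manager knows only the orders placed, which is why a conditional density of the inventory level is needed.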

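Since the event $[I_t = 0]$ is observable, conditional probabilities are needed only when $I_t > 0$. For any real and bounded test function $\varphi(\cdot)$, we can use the conditional Bayes theorem (e.g. [15]) to obtain

$$\int_0^\infty \varphi(x)\pi_t(x)\,dx = E[\varphi(I_t) \mid \mathcal{Z}_{t-1}, I_t > 0] = \frac{E[\varphi(I_t)\mathbf{1}_{I_t>0} \mid \mathcal{Z}_{t-1}]}{E[\mathbf{1}_{I_t>0} \mid \mathcal{Z}_{t-1}]} = \frac{E[\varphi(I_t)\mathbf{1}_{I_t>0} \mid \mathcal{Z}_{t-1}]}{P(I_t > 0 \mid \mathcal{Z}_{t-1})}. \tag{5}$$

In order to obtain a recursive expression for $\pi_t$ in terms of $\pi_{t-1}$, we begin by expressing $E(\varphi(I_t) \mid \mathcal{Z}_t)$ in terms of conditional expectations with respect to $\mathcal{Z}_{t-1}$ in the next lemma.

Lemma 1.

$$E(\varphi(I_t) \mid \mathcal{Z}_t) = \mathbf{1}_{I_t=0}\,\varphi(0) + \mathbf{1}_{I_t>0}\,\frac{E(\varphi(I_t)\mathbf{1}_{I_t>0} \mid \mathcal{Z}_{t-1})}{P(I_t > 0 \mid \mathcal{Z}_{t-1})} = \mathbf{1}_{I_t=0}\,\varphi(0) + \mathbf{1}_{I_t>0}\,E(\varphi(I_t) \mid \mathcal{Z}_{t-1}, I_t > 0). \tag{6}$$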

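Proof: Beginning with the left-hand side of (6), we have

$$E(\varphi(I_t) \mid \mathcal{Z}_t) = E[\varphi(I_t)(\mathbf{1}_{I_t=0} + \mathbf{1}_{I_t>0}) \mid \mathcal{Z}_t] = \varphi(0)\mathbf{1}_{I_t=0} + E[\varphi(I_t)\mathbf{1}_{I_t>0} \mid \mathcal{Z}_t]. \tag{7}$$

Now take the last term in (7) and obtain

$$E(\varphi(I_t)\mathbf{1}_{I_t>0} \mid \mathcal{Z}_t) = \mathbf{1}_{I_t>0}\,E(\varphi(I_t) \mid \mathcal{Z}_t) = \mathbf{1}_{I_t>0}\,\psi(z_1, \dots, z_{t-1}, z_t) = \mathbf{1}_{I_t>0}\,\psi(z_1, \dots, z_{t-1}, 0), \tag{8}$$

where the first equality follows from $\mathcal{Z}_t$-measurability of $\mathbf{1}_{I_t>0}$. The second equality merely expresses $E(\varphi(I_t) \mid \mathcal{Z}_t)$ as $\psi(z_1, \dots, z_{t-1}, z_t)$ for some measurable function $\psi$. The last equality follows from the fact that $I_t > 0 \Leftrightarrow z_t = 0$. We now take the conditional expectation of (8) with respect to $\mathcal{Z}_{t-1}$. Since $\mathcal{Z}_{t-1} \subseteq \mathcal{Z}_t$ and since $\psi(z_1, \dots, z_{t-1}, 0)$ is $\mathcal{Z}_{t-1}$-measurable, we obtain

$$E[\varphi(I_t)\mathbf{1}_{I_t>0} \mid \mathcal{Z}_{t-1}] = \psi(z_1, \dots, z_{t-1}, 0)\,E[\mathbf{1}_{I_t>0} \mid \mathcal{Z}_{t-1}] = \psi(z_1, \dots, z_{t-1}, 0)\,P(I_t > 0 \mid \mathcal{Z}_{t-1}) \tag{9}$$

or

$$\psi(z_1, \dots, z_{t-1}, 0) = \frac{E(\varphi(I_t)\mathbf{1}_{I_t>0} \mid \mathcal{Z}_{t-1})}{P(I_t > 0 \mid \mathcal{Z}_{t-1})}. \tag{10}$$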

Using (10) in (8) and substituting into (7) the resulting expression for $E[\varphi(I_t)\mathbf{1}_{I_t>0} \mid \mathcal{Z}_t]$, we obtain the first equality in (6). Using (5) gives the second equality. □

Instead of the conditional expectations in Lemma 1, the left-hand side in (6) can also be expressed by using the conditional density function $\pi_t$. Using (5) on the right-hand side of (6) gives

$$E(\varphi(I_t) \mid \mathcal{Z}_t) = \mathbf{1}_{I_t=0}\,\varphi(0) + \mathbf{1}_{I_t>0}\int_0^\infty \varphi(z)\pi_t(z)\,dz. \tag{11}$$

The density $\pi_t$ is obtained by equating (6) and (11). For $I_t = 0$, this equality yields $\pi_t = \delta$, which is the Dirac delta function taking the value of zero everywhere except at 0 where it is infinite. For the more interesting case of $I_t > 0$, the next lemma puts (6) into a form convenient for equating with (11) and solving for $\pi_t$.

Lemma 2.

$$E(\varphi(I_t) \mid \mathcal{Z}_t)\,\mathbf{1}_{I_t>0} = \mathbf{1}_{I_{t-1}=0}\,\frac{\int_0^\infty \varphi(z) f(q_{t-1} - z)\mathbf{1}_{q_{t-1} \ge z}\,dz}{F(q_{t-1})} + \mathbf{1}_{I_{t-1}>0}\,\frac{\int_0^\infty \varphi(z)\int_{(z-q_{t-1})^+}^\infty f(y + q_{t-1} - z)\pi_{t-1}(y)\,dy\,dz}{\int_0^\infty F(y + q_{t-1})\pi_{t-1}(y)\,dy}. \tag{12}$$

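Proof: Consider the numerator in the second term on the right-hand side of (6). We see that

$$\begin{aligned}
E(\varphi(I_t)\mathbf{1}_{I_t>0} \mid \mathcal{Z}_{t-1}) &= E\big(\varphi(I_{t-1} + q_{t-1} - D_{t-1})\mathbf{1}_{I_{t-1}+q_{t-1}-D_{t-1}>0} \mid \mathcal{Z}_{t-1}\big) \\
&= E\Big(E\big(\varphi(I_{t-1} + q_{t-1} - D_{t-1})\mathbf{1}_{I_{t-1}+q_{t-1}-D_{t-1}>0} \mid \mathcal{Z}_{t-1}, I_{t-1}\big) \,\Big|\, \mathcal{Z}_{t-1}\Big) \\
&= E\left(\int_0^\infty \varphi(I_{t-1} + q_{t-1} - y)\mathbf{1}_{I_{t-1}+q_{t-1}-y>0}\,f(y)\,dy \,\Big|\, \mathcal{Z}_{t-1}\right) \\
&= E\left(\int_0^{q_{t-1}+I_{t-1}} \varphi(I_{t-1} + q_{t-1} - y)f(y)\,dy \,\Big|\, \mathcal{Z}_{t-1}\right) \\
&= E\left(\int_0^{q_{t-1}+I_{t-1}} \varphi(x)f(I_{t-1} + q_{t-1} - x)\,dx \,\Big|\, \mathcal{Z}_{t-1}\right) \\
&= E\left(\int_0^\infty \varphi(x)f(I_{t-1} + q_{t-1} - x)\mathbf{1}_{I_{t-1}+q_{t-1}-x \ge 0}\,dx \,\Big|\, \mathcal{Z}_{t-1}\right) \\
&= \int_0^\infty \varphi(x)\,E\big(f(I_{t-1} + q_{t-1} - x)\mathbf{1}_{I_{t-1}+q_{t-1}-x \ge 0} \mid \mathcal{Z}_{t-1}\big)\,dx.
\end{aligned} \tag{13}$$

The second equality holds because $\mathcal{Z}_{t-1} = \sigma(\{z_1, \dots, z_{t-1}\}) \subseteq \sigma(\{z_1, \dots, z_{t-1}, I_{t-1}\})$, and the fifth follows by setting $x := I_{t-1} + q_{t-1} - y$.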

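Use (11) with the time index $t - 1$ instead of $t$ and replace $\varphi(I_{t-1})$ with $f(I_{t-1} + q_{t-1} - x)\mathbf{1}_{I_{t-1}+q_{t-1}-x \ge 0}$ to obtain

$$E\big(f(I_{t-1} + q_{t-1} - x)\mathbf{1}_{I_{t-1}+q_{t-1}-x \ge 0} \mid \mathcal{Z}_{t-1}\big) = \mathbf{1}_{I_{t-1}=0}\,f(q_{t-1} - x)\mathbf{1}_{q_{t-1}-x \ge 0} + \mathbf{1}_{I_{t-1}>0}\int_0^\infty f(y + q_{t-1} - x)\mathbf{1}_{y+q_{t-1}-x \ge 0}\,\pi_{t-1}(y)\,dy. \tag{14}$$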

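Inserting (14) into (13), we obtain

$$E(\varphi(I_t)\mathbf{1}_{I_t>0} \mid \mathcal{Z}_{t-1}) = \mathbf{1}_{I_{t-1}=0}\int_0^\infty \varphi(x)f(q_{t-1} - x)\mathbf{1}_{x \le q_{t-1}}\,dx + \mathbf{1}_{I_{t-1}>0}\int_0^\infty \varphi(x)\left(\int_{(x-q_{t-1})^+}^\infty f(y + q_{t-1} - x)\pi_{t-1}(y)\,dy\right)dx.$$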

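Now consider the denominator in the second term on the right-hand side of (6) to obtain

$$\begin{aligned}
P(I_t > 0 \mid \mathcal{Z}_{t-1}) &= E(\mathbf{1}_{I_{t-1}+q_{t-1}-D_{t-1}>0} \mid \mathcal{Z}_{t-1}) \\
&= E\{E(\mathbf{1}_{I_{t-1}+q_{t-1}-D_{t-1}>0} \mid \mathcal{Z}_{t-1}, I_{t-1}) \mid \mathcal{Z}_{t-1}\} \\
&= E\{F(I_{t-1} + q_{t-1}) \mid \mathcal{Z}_{t-1}\} \\
&= \mathbf{1}_{I_{t-1}=0}\,F(q_{t-1}) + \mathbf{1}_{I_{t-1}>0}\int_0^\infty F(y + q_{t-1})\pi_{t-1}(y)\,dy.
\end{aligned}$$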

Inserting the numerator and the denominator into (6) yields the desired result. □

Having obtained the conditional expectation in Lemma 2, we go back to the conditional probability $\pi_t$ as defined in (11) for $I_t > 0$. Setting the second term on the right-hand side of (11) equal to (12), we have

$$\pi_t(x) = \mathbf{1}_{I_{t-1}=0}\left\{\frac{f(q_{t-1} - x)\mathbf{1}_{x \le q_{t-1}}}{F(q_{t-1})}\right\} + \mathbf{1}_{I_{t-1}>0}\left\{\frac{\int_{(x-q_{t-1})^+}^\infty f(y + q_{t-1} - x)\pi_{t-1}(y)\,dy}{\int_0^\infty F(y + q_{t-1})\pi_{t-1}(y)\,dy}\right\}. \tag{15}$$

This expression specializes to the conditional probabilities stated in the next theorem.

Theorem 1. The conditional probability $\pi_t$ can be expressed recursively as follows:

$$\pi_t(x) = \begin{cases}
\dfrac{f(q_{t-1} - x)}{F(q_{t-1})}\,\mathbf{1}_{x \le q_{t-1}} & \text{if } I_{t-1} = 0, \\[2ex]
\dfrac{\int_{(x-q_{t-1})^+}^\infty \pi_{t-1}(y)f(y + q_{t-1} - x)\,dy}{\int_0^\infty \pi_{t-1}(y)F(y + q_{t-1})\,dy} & \text{if } I_{t-1} > 0.
\end{cases} \tag{16}$$
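To make the recursion in Theorem 1 concrete, the following sketch approximates one update step on a uniform grid with a simple Riemann sum. It assumes exponential demand with rate `RATE` purely for illustration; the helper names (`f`, `F`, `update_density`), the grid, and the quadrature are hypothetical choices, and the result integrates to one only up to discretization error.

```python
import numpy as np

RATE = 1.0  # assumed exponential demand rate (illustrative only)

def f(d):
    """Demand density: RATE * exp(-RATE * d) for d >= 0, zero otherwise."""
    d = np.asarray(d, dtype=float)
    return np.where(d >= 0.0, RATE * np.exp(-RATE * np.clip(d, 0.0, None)), 0.0)

def F(d):
    """Demand c.d.f.: 1 - exp(-RATE * d) for d >= 0, zero otherwise."""
    d = np.asarray(d, dtype=float)
    return np.where(d >= 0.0, 1.0 - np.exp(-RATE * np.clip(d, 0.0, None)), 0.0)

def update_density(pi_prev, q_prev, prev_was_zero, grid):
    """One step of the recursion for pi_t on a uniform grid.

    prev_was_zero plays the role of the event I_{t-1} = 0; pi_prev is
    ignored in that case. Requires q_prev > 0 so that F(q_prev) > 0.
    """
    dx = grid[1] - grid[0]
    if prev_was_zero:
        # f vanishes for negative arguments, which enforces x <= q_{t-1}.
        return f(q_prev - grid) / F(q_prev)
    # kernel[i, j] = f(y_j + q - x_i); zero whenever y_j < x_i - q,
    # which reproduces the lower integration limit (x - q)^+.
    kernel = f(grid[None, :] + q_prev - grid[:, None])
    numerator = kernel @ pi_prev * dx
    denominator = np.sum(pi_prev * F(grid + q_prev)) * dx
    return numerator / denominator

grid = np.linspace(0.0, 10.0, 1001)
pi_2 = update_density(None, 2.0, True, grid)    # after a period with I_1 = 0
pi_3 = update_density(pi_2, 1.5, False, grid)   # after a period with I_2 > 0
```

In this example `pi_2` is supported on $[0, 2]$ and `pi_3` on $[0, 3.5]$, in line with the growth of the support by the order quantity in each period without a stockout.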

Note that the denominators in (16) are $P(D_{t-1} < I_{t-1} + q_{t-1})$, which is $P(I_t > 0)$. When $I_t > 0$, $\pi_t$ is an absolutely continuous p.d.f. Note that the recursive equations for $I_{t-1} > 0$ and $I_{t-1} = 0$ coincide for $\pi_{t-1} = \delta$, so the equation for $I_{t-1} > 0$ applies even when $I_{t-1} = 0$. Since the largest value of $I_t$ is $I_{t-1} + q_{t-1}$, $\pi_t$ has a support of $[0, \sum_{i=1}^{t-1} q_i]$. If $I_{t'} = 0$ for some $t' < t$, then the support is $[0, \sum_{i=t'}^{t-1} q_i]$. Since $\pi_1$, $f$, and $F$ are all given, the evolution of $\pi_t$ can be controlled only by $\tilde{q} = \{q_1, q_2, \dots\}$. The conditional probability evolves according to a highly nonlinear equation

$$\pi_t(x) = z_{t-1}\,\frac{f(q_{t-1} - x)\mathbf{1}_{x \le q_{t-1}}}{F(q_{t-1})} + (1 - z_{t-1})\,\frac{\int_{(x-q_{t-1})^+}^\infty f(y + q_{t-1} - x)\pi_{t-1}(y)\,dy}{\int_0^\infty F(q_{t-1} + y)\pi_{t-1}(y)\,dy}, \quad t \ge 2, \qquad \pi_1(x) = \pi(x), \tag{17}$$

which corresponds to the Kushner equation [18] in our inventory context. We can linearize (17) as follows. Set

$$p_t(x) := \lambda_t \pi_t(x), \tag{18}$$

where $\lambda_t$ is a weighting factor to be defined shortly. On account of this weighting, $p_t(x)$ can be viewed as an unnormalized probability. Furthermore, it evolves according to the linear equation

$$p_t(x) = z_{t-1}\,f(q_{t-1} - x)\mathbf{1}_{x \le q_{t-1}} + (1 - z_{t-1})\int_{(x-q_{t-1})^+}^\infty f(y + q_{t-1} - x)p_{t-1}(y)\,dy, \qquad p_1(x) = \pi(x). \tag{19}$$

This equation corresponds to the Zakai equation for systems with diffusions in [30] and [5]. By integrating both sides of (18),

$$\begin{aligned}
\lambda_t &= \int_0^\infty p_t(x)\,dx \\
&\overset{(19)}{=} z_{t-1}F(q_{t-1}) + (1 - z_{t-1})\int_0^\infty F(q_{t-1} + y)p_{t-1}(y)\,dy \\
&\overset{(18)}{=} z_{t-1}F(q_{t-1}) + (1 - z_{t-1})\lambda_{t-1}\int_0^\infty F(q_{t-1} + y)\pi_{t-1}(y)\,dy.
\end{aligned}$$

The last equation defines $\lambda_t$ recursively starting with $\lambda_1 = 1$. However, note that $\lambda_t$ depends on $\pi_{t-1}$ on the right-hand side. The normalized probabilities can easily be computed from the unnormalized probabilities as follows:

$$\pi_t(x) = \frac{p_t(x)}{\int_0^\infty p_t(x)\,dx}. \tag{20}$$
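The practical appeal of the linear recursion (19) is that normalization can be postponed to the end via (20). The sketch below illustrates this on a grid, again assuming exponential demand with rate `RATE` and a Riemann-sum quadrature; `zakai_step` and `normalize` are hypothetical helper names introduced only for this illustration.

```python
import numpy as np

RATE = 1.0  # assumed exponential demand rate (illustrative only)

def f(d):
    """Demand density: RATE * exp(-RATE * d) for d >= 0, zero otherwise."""
    d = np.asarray(d, dtype=float)
    return np.where(d >= 0.0, RATE * np.exp(-RATE * np.clip(d, 0.0, None)), 0.0)

def zakai_step(p_prev, q_prev, z_prev, grid):
    """One step of the unnormalized (linear) recursion on a uniform grid.

    z_prev = 1 encodes I_{t-1} = 0, in which case p_prev is not used.
    """
    dx = grid[1] - grid[0]
    if z_prev == 1:
        return f(q_prev - grid)  # f(q - x) is zero for x > q
    # kernel[i, j] = f(y_j + q - x_i), zero below the limit (x_i - q)^+.
    kernel = f(grid[None, :] + q_prev - grid[:, None])
    return kernel @ p_prev * dx

def normalize(p, grid):
    """Return (pi, lam) with lam the grid integral of p and pi = p / lam."""
    dx = grid[1] - grid[0]
    lam = np.sum(p) * dx
    return p / lam, lam

grid = np.linspace(0.0, 10.0, 501)
p_2 = zakai_step(None, 2.0, 1, grid)
p_3 = zakai_step(p_2, 1.5, 0, grid)
p_3_scaled = zakai_step(5.0 * p_2, 1.5, 0, grid)  # linearity in p_prev
pi_3, lam_3 = normalize(p_3, grid)
pi_3_from_scaled, _ = normalize(p_3_scaled, grid)
```

Scaling the input by a constant scales the output by the same constant, so `pi_3` and `pi_3_from_scaled` agree after normalization; this is the sense in which the weighting factor can be carried along and divided out only when the normalized density is needed.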

These equations can be written in the operator form in the space

$$H := \left\{ p \in L^1(\Re^+) : \int_0^\infty x|p(x)|\,dx < \infty \right\},$$

where $L^1(\Re^+)$ […] 0.

Properties:

• Operator $\theta(q, p)$ is well defined if $\langle \rho(q, p), 1 \rangle > 0$, i.e.,

$$\langle \rho(q, p), 1 \rangle = \int_0^\infty p(y)F(y + q)\,dy \ne 0. \tag{30}$$

This property is satisfied if $p \ne 0$, $p \in H^+$, and $F(y) > 0$ for all $y > 0$. Then, $\int_0^\infty p(y)F(y + q)\,dy > 0$. Otherwise, $p(y)F(y + q) = 0$ is a.e. satisfied, which along with $F(y) > 0$ for all $y > 0$ implies $p(y) = 0$ a.e. This contradicts $p \ne 0$. Moreover, the operator $\rho$ preserves the nonzero property: $p \ne 0 \implies \rho(q, p) \ne 0$. We establish the contrapositive of this statement. If $\rho(q, p) = 0$, then $\langle \rho(q, p), 1 \rangle = 0$, which is possible only under $p = 0$. Furthermore, note that the equality in (30) specializes to $\langle \rho(q, \delta), 1 \rangle = F(q)$.

• Operator $\theta(q, p)(x)$ yields a valid p.d.f. if $p \in H^+$: Clearly, $\theta(q, p)(x) \ge 0$ and $\int_0^\infty \theta(q, p)(x)\,dx = 1$.

• Operator ρ(q, p) is a linear operator from L1 (