Maintenance staffing management - Springer Link

J Intell Manuf (2007) 18:351–360 DOI 10.1007/s10845-007-0027-7

Maintenance staffing management Qing Chang · Jun Ni · Pulak Bandyopadhyay · Stephan Biller · Guoxian Xiao

Published online: July 2007 © Springer Science+Business Media, LLC 2007

Abstract This paper addresses the maintenance staffing planning process and the role simulation plays in this process. Feedback control notion is utilized in this personnel planning problem. The research investigates the tradeoffs between maintenance personnel staffing levels and the throughput of a production line. The more reactive maintenance (RM) personnel on staff, the lower the probability that a given repair will be delayed waiting for an available personnel. On the other hand, increasing the number of personnel increases labor costs while reducing the utilization of RM labor resources. What makes this tradeoff complicated is that the magnitude of the impact of reactive maintenance delays varies considerably between bottleneck and non-bottleneck stations. The tradeoff is examined, factoring in both labor costs and the cost of lost throughput. An approach is developed for determining RM staffing levels that minimize overall costs.

Q. Chang (B)· P. Bandyopadhyay · S. Biller · G. Xiao Manufacturing Systems Research Lab, General Motors Research and Development Center, 30500 Mound Road, Warren, MI 480909055, USA e-mail: [email protected] J. Ni Department of Mechanical Engineering, University of Michigan – Ann Arbor, 1023 H. H. Dow, 2300 Hayward St., Ann Arbor, MI 48109-2136, USA e-mail: [email protected] P. Bandyopadhyay e-mail: [email protected] S. Biller e-mail: [email protected] G. Xiao e-mail: [email protected]

Keywords Labor estimation · Resource dispatching · Maintenance management · Dynamic simulation · Bottleneck identification · Opportunistic maintenance

Introduction Maintenance staffing management is an interesting and important problem in manufacturing system. Maintenance labor could contribute as high as 80% of the total maintenance cost. Therefore, accurate estimation of labor force becomes crucial. The questions of the maintenance workforce planning include how to decide the appropriate staffing level corresponding to the dynamic workforce load over time, and how to maintain absolute levels of staffing despite the constantly changing maintenance demands. The maintenance workforce planning has to satisfy long-term maintenance request on one hand, and also respond to the demand changes quickly to maximize the throughput of a production system on the other hand. Today’s commercial-off-the-shelf Enterprise Resource Planning (ERP) software packages provide Human Resource Management System (HRMS) capabilities. However, these systems fall short in analyzing the present workforce levels and associated capital expenditures as they change over time (Parker & Marriott, 1999). Parker developed a prescribed method for the strategic analyst to incorporate a personnel and cost simulation model, which was used to project personnel requirements and evaluate workforce at least cost through time. Barlas and Diker (1996) proposed an interesting interactive dynamic simulation model in the research of a university management system. The model was converted into an interactive dynamic simulation game. In the game, the player played the role of a university policy-maker, who was


trying to seek a delicate balance among the main academic functions of the university. Existing research in “call center” type of workforce management systems has addressed faster pace of change in business environments. The research used both modeling and simulation methods to capture the shortterm requirement behavior (Barlas & Diker, 1996; Klungle, 1999). However, the aforementioned works focused on business management aspects with well-defined policies for different scenarios. The works have limitation in identifying specific manufacturing constraints, e.g., long-term staffing constraints (cannot be frequently changed after it is determined), process bottleneck constraints and tradeoff between staffing and cost. This research will focus on a manufacturing plant Reactive Maintenance (RM) staffing planning problem. RM is still major maintenance activities in most current manufacturing systems. Due to increased production demand and random failure of stations, maintenance personnel are busy on reactive maintenance. In order to maintain a smooth production, quicker RM reaction is required to achieve less down time and high production throughput. The more RM personnel on staff, the lower the probability that a given repair will be delayed as the result of maintenance personnel shortages. On the other hand, increasing the number of personnel increases labor costs while reducing the utilization of RM labor resource. What makes this tradeoff complicated is that the magnitude of the impact of reactive maintenance delays varies considerably between bottleneck and non-bottleneck stations. This tradeoff is examined, factoring in both labor costs and the cost of lost throughput. An approach is needed for determining RM staffing levels that minimize overall costs. The research will first analyze the RM activity distribution and the duration of repair time over long-term periods, which will provide valuable information to determine the RM staffing level. 
Next, a sensitivity analysis is performed to justify the appropriate RM staffing level by considering throughput impact. Finally, a cost function is integrated to the analysis to justify the staffing level by considering total cost. Maintenance opportunity window (Chang, Ni, Bandyopadhyay, Biller, & Xiao, 2006b) is utilized to further improve maintenance efficiency.

Methodology Maintenance personnel planning can be treated as a dynamic process. The oscillating levels of personnel over a projected time period are non-linear and dynamic. Standard open loop simulation alone may not accurately capture the constantly changing variables or quantities under investigation. A feedback control framework as shown in Fig. 1 is built


to integrate two simulation models to capture the dynamic nature of the problem. One is the maintenance staffing dispatch simulation, which describes the workforce load over time. The other is the throughput model, which evaluates the throughput impact at various staffing levels. The objective is to minimize the total cost, which is defined as a function of production profit and maintenance cost. The controllable variables are the maintenance staffing level and the method used to dynamically allocate personnel to maintenance work. For example, a demand change in the plant status may trigger the recruitment of new personnel. To accomplish an effective feedback control procedure, three major steps are necessary:

1. Develop a simulation procedure to estimate the workforce level and utilization.
2. Based on the long-term study of the workforce load, perform a sensitivity analysis using throughput simulation to study the impact of workforce on throughput.
3. Incorporate a cost function to derive an optimal maintenance labor level.

The three steps are described in detail in the following sections.

Maintenance resource dispatch simulation

The goal of daily production operation is to reduce machine-failure-related downtime and improve production efficiency. To achieve this goal, a reasonable maintenance staffing level is required. Personnel maturation is a dynamic problem, concerning the full spectrum of personnel management and the events impacting it, such as relocation and the stochastic nature of random machine failures. Typical static approaches, such as linear programming, often cannot be used to solve such allocation problems when the problem scenario changes continuously over time. It is also advantageous to model fluctuation in demand, in order to determine how quickly the personnel levels return to steady state.
These and similar questions can only be answered efficiently with a simulation method that can cope with delays and flows of information, and that lends itself to the study of transient phenomena. Therefore, a maintenance resource dispatch simulation is developed to analyze the RM activity distribution and the duration of repair, which provides valuable information for improving maintenance efficiency. The simulation procedure dispatches maintenance labor to reactive maintenance jobs on a first-come, first-served basis. This rule is generally the approach used in industry. The maintenance personnel dispatching rules and assumptions are summarized as follows:


Fig. 1 Feedback control framework for optimal maintenance labor (block diagram labels: Reference input: total cost; Output: cost; Controller; Long-term control: increase/decrease RM staffing; Hourly/daily control: dynamically relocate RM staffing, short-term bottleneck analysis, maintenance opportunity; Maintenance staffing level; Maintenance dispatching simulation; Throughput simulation; Disturbance, e.g., 1. machine random failure, 2. lack of workforce due to absentees, etc.)



• Maintenance people stay idle at a central location of their responsible area, waiting for maintenance requests.
• The first idle maintenance person is dispatched to the incoming maintenance request without any waiting time.
• After finishing maintenance, a person can go to other machines for another corrective maintenance.
• The simulation uses the distribution of the number of simultaneous downtime events in a production line to set labor targets for that line.

To enable a quick response to these machine failures, zero waiting time is usually required. Therefore, the maximum number of simultaneous downtime events determines the maximum RM staffing level. The simulation can also take a limited number of maintenance people as input. At times when all RM people are busy, the simulation forces any incoming requests for RM to queue until someone becomes available to perform the repair, and logs the resulting waiting time.
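The dispatching rules above can be illustrated with a minimal sketch. This is not the authors' simulation code: the function names `simulate_fcfs` and `max_simultaneous`, the input format and the sample fault list are assumptions made for illustration. Each fault is a (request time, repair duration) pair in minutes; the person who became idle first takes the next request in arrival order, and a request waits only when everyone is busy.

```python
import heapq

def simulate_fcfs(faults, n_staff):
    """Waiting time (minutes) of each fault under first-come, first-served dispatch.

    faults: list of (request_time, repair_duration) tuples in minutes (hypothetical format).
    n_staff: number of RM personnel available.
    """
    free_at = [0.0] * n_staff            # time at which each person becomes idle
    heapq.heapify(free_at)
    waits = []
    for request_time, repair in sorted(faults):
        first_idle = heapq.heappop(free_at)       # person who became idle first
        start = max(request_time, first_idle)     # queue only if everyone is busy
        waits.append(start - request_time)
        heapq.heappush(free_at, start + repair)
    return waits

def max_simultaneous(faults):
    """Largest number of overlapping downtime events, i.e. the zero-wait staffing level."""
    events = []
    for request_time, repair in faults:
        events.append((request_time, 1))
        events.append((request_time + repair, -1))
    events.sort()                        # at equal times, an end (-1) sorts before a start (+1)
    active = peak = 0
    for _, delta in events:
        active += delta
        peak = max(peak, active)
    return peak

# Invented sample data, not taken from the paper.
faults = [(0.0, 10.0), (2.0, 5.0), (3.0, 4.0), (20.0, 5.0)]
print(max_simultaneous(faults))          # staffing needed for zero waiting
print(simulate_fcfs(faults, 2))          # waiting times with two people
```

With the sample data, three people are needed for zero waiting, while two people leave only the third request queued, which mirrors the tradeoff the section goes on to examine.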

The simulation does not break out the time required by RM personnel to prepare and walk to a breakdown. This information is included in the total repair time, which is recorded as the event breakdown duration in the machine controller. Therefore, the simulation does not require the physical layout of the production line, although it does require the logical station layout at a block diagram level of detail. Figure 2 demonstrates the simulation procedure.

RM staffing tradeoff

Next, a long-term study of the maintenance labor level is performed. In the simulation, downtime events may be generated by either “replaying” actual fault data or, when this data is not available, by randomly simulating events according to user-specified station reliabilities. In this research, the “replay” method is used. By doing this, day-by-day or shift-by-shift RM requests can be captured and the corresponding fluctuation can be analyzed. Usually, historical data over a 60 to 90 day period is utilized. Note that by using these data, the simulation results mimic actual real-world demands for maintenance specific to a production line. From this simulation, one may calculate the percent of labor utilization and the maximum number of concurrent failures on a shift or daily basis. The continuously changing staffing levels and utilizations are expressed as graphs of variables over time, as illustrated in Fig. 3. Figure 3 demonstrates the staffing level oscillations of a production line during a 2-month period. The vertical bars represent staffing level, and the lines represent labor utilization. The simulation procedure accurately reflects actual patterns of plant reactive maintenance events. Note that the average maintenance staff is calculated by Eq. 1. Based on the average maintenance staffing level, the simulation calculates the utilization for every day and the overall average utilization based on Eq. 2. In this example, the average maintenance staff is two people and the average utilization of maintenance staff is only 14.16%.

$$\text{average\_RM} = \frac{\sum_{i=1}^{N} S_i}{N} \tag{1}$$

$$\text{Utilization}_i = \frac{\sum_{j=1}^{\text{average\_RM}} \text{WH\_RM}_j}{\text{average\_RM} \times \text{Hours}}, \qquad \text{average\_utilization} = \frac{\sum_{i=1}^{N} \text{Utilization}_i}{N} \tag{2}$$

where
average_RM: Average number of RM staffing level over N days
S_i: RM staffing level for the ith day
N: Number of days
Utilization_i: Utilization of average staffing level at the ith day
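Eqs. 1 and 2 can be sketched in a few lines. The function names and the sample numbers below are invented for illustration; `rm_hours_per_day` is assumed to hold, for each day, the total RM hours worked (the sum of the per-worker hours in Eq. 2), and `hours` is the 7.5 h of working time per person (8 h per shift less the half-hour break).

```python
def average_rm(staff_per_day):
    """Eq. 1: mean daily RM staffing level over N days."""
    return sum(staff_per_day) / len(staff_per_day)

def daily_utilization(rm_hours_per_day, avg_rm, hours=7.5):
    """Eq. 2 (first part): RM hours worked on each day divided by available hours."""
    return [worked / (avg_rm * hours) for worked in rm_hours_per_day]

def average_utilization(utilizations):
    """Eq. 2 (second part): mean of the daily utilizations."""
    return sum(utilizations) / len(utilizations)

# Invented sample: four days of zero-wait staffing levels and RM hours worked.
staff_per_day = [1, 2, 3, 2]
rm_hours_per_day = [1.5, 3.0, 4.5, 3.0]

avg = average_rm(staff_per_day)
util = daily_utilization(rm_hours_per_day, avg)
print(avg, util, average_utilization(util))
```

The paper rounds the average staffing level to a whole number of people before reporting it; the sketch leaves rounding to the caller.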


Fig. 2 Flow chart of simulation procedure (flow chart: a sequence of faults generates reactive maintenance requests; if maintenance labor is available, a maintenance person is dispatched for the job; if not, the request waits and T = T' + ∆T, where T' is the original RM request time, ∆T is the waiting time, and T is the RM request time after waiting time ∆T; other label: Maintenance labor level)

Fig. 3 Labor level distribution for 2 months (sample size: 60; average_RM = 2, 95% CI on the mean: (1, 2.5); average utilization = 14.16%, 95% CI on the mean: (13.14%, 15.5%))

Fig. 4 Simultaneous RM personnel requirements on a daily basis (panel title: Breakdowns on day-day basis for zero wait time; x-axis: RM personnel, 1 to 5; y-axes: % of days, cumulated % of days)

Fig. 5 RM personnel distribution based on breakdown events (panel title: Simultaneous breakdowns vs. RM personnel; x-axis: RM personnel, 1 to 5; y-axes: % of events, cumulated % of events)

WH_RM_j: The total hours spent on RM by the jth worker
Hours: The total working hours (8 h per shift less half-hour break time) for each worker
average_utilization: Average utilization over N days

Corresponding to Fig. 3, Fig. 4 depicts the distribution of minimum daily RM labor requirements for zero-wait-time response to downtimes over a two-month period. The graph illustrates that about 40% of days required at most three maintenance personnel for zero response time to downtime events. Based on the same data, the graph in Fig. 5


illustrates that only 8% of the time were all three maintenance personnel simultaneously busy. So one can conclude from Figs. 4 and 5 that while a staffing level of four covered 90% of days with zero response time, less than 5% of the time during those days required simultaneous deployment of four maintenance personnel. Therefore, staffing level estimates based on zero response time may lead to over-conservative results. One could instead staff based on the average number of simultaneous deployments, but neglecting the extreme cases might have an unduly severe impact on throughput. To understand the impact of the maintenance staffing level on throughput, a throughput simulation is needed for a sensitivity study of various RM staffing scenarios. Note that


Fig. 6 RM personnel levels versus production (plot title: RM personnel vs. Production; x-axis: RM personnel, 1 to 5; y-axis: Production (units), 39000 to 51000)

when queuing of maintenance requests occurs, the equipment downtime becomes longer than the actual repair time (i.e., the waiting time increases); the difference increases as queuing becomes more severe. By manipulating downtime duration inputs into the throughput simulator, the impact of the maintenance queuing on throughput can be established. Figure 6 demonstrates the relation between RM staffing level and production. It is observed that beyond an RM labor level of three people, the production does not increase with increasing labor. Although downtimes are still lengthened by queuing with three people (about 10% of machine failure events need extra waiting time), near-zero throughput loss results relative to the more conservative “zero-wait time” staffing level of five people. Depending on the line layout and the dynamic production status (e.g., buffer status), repair waiting time on some stations may not have a significant impact on system throughput. One reason is that some stations may be starved or blocked by failures of other stations. An immediate response to the starved or blocked stations does not improve system throughput. Another reason is that the magnitude of the impact of reactive maintenance delays varies considerably between bottleneck and non-bottleneck stations. Through this sensitivity study, the staffing level with minimum throughput impact (relative to the throughput of the “zero-wait time” staffing level) can be derived.

Dynamic relocation of maintenance staffing

It is assumed that when a decision is made on the RM staffing level, the RM workforce remains at that level for a long time (several months) and is not changed frequently. Therefore, staffing level is a relatively long-term decision; however, it has a short-term impact on production. Realizing the tradeoff between RM staff and throughput, one may want to decide on a leaner RM staffing level without sacrificing too much production.
It is not unusual that a decision maker has a very limited budget and decides on a very limited RM staffing level (e.g., two people in the Fig. 6 scenario), considering that most of the time the staff can cover machine failure

events and realizing that some times throughput loss is inevitable. Is it possible to reduce the occasional impact to production with the limited resource? It is known that the magnitude of the impact of machine repair delays varies between bottleneck and non-bottleneck machines. Actually only downtime at the bottleneck station contributes the most to production loss. Therefore to reduce overall downtime should not be the goal of maintenance research, and only to reduce the downtime which causes the most production loss (which is usually related to the bottleneck station) is a much more meaningful and practical goal one should pursue (Chang, Ni, Bandyopadhyay, Biller, & Xiao, 2006a). If the limited resource is not utilized at the right place (such as bottleneck station), resulting downtime can cause significant production loss. Based on this logic, additional analysis can be performed in such a way: whenever a bottleneck station breaks down, a RM staff is sent at the earliest possible time, while non-bottleneck station may have longer waiting time. The throughput impact can be evaluated for this maintenance dispatch policy. This analysis provides a solution on how to achieve the repair time reduction on the bottleneck—through maintenance work prioritization and staffing the bottleneck stations in the highest priority. The analysis is performed on short-term basis (i.e., on a shift or daily basis), since the short-term bottleneck will switch over time. The long-term decision variable—staffing level, is a constraint to the analysis; and short-term decisions, such as how to dynamically relocate RM staffing, can be obtained. The case study of the analysis and assumptions will be demonstrated through a simple example in the experiment section.
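The bottleneck-first dispatch idea described above can be sketched as a small event-driven loop. This is an illustrative simplification, not the authors' procedure: it is non-preemptive (a person finishes the current repair before moving on), the bottleneck station is passed in as a fixed label rather than identified shift by shift, and the function name `simulate_priority` and the sample data are invented.

```python
import heapq

def simulate_priority(faults, n_staff, bottleneck):
    """Total waiting time per station when bottleneck requests jump the queue.

    faults: list of (request_time, repair_duration, station) tuples in minutes.
    bottleneck: label of the station treated as the current bottleneck.
    """
    pending = sorted(faults)               # requests in arrival order
    free_at = [0.0] * n_staff              # time at which each person becomes idle
    heapq.heapify(free_at)
    queue = []                             # waiting requests: (priority, request_time, repair, station)
    waits = {}
    i = 0
    while i < len(pending) or queue:
        now = heapq.heappop(free_at)       # next person to become idle
        if not queue:
            now = max(now, pending[i][0])  # nothing waiting: idle until the next request
        while i < len(pending) and pending[i][0] <= now:
            request_time, repair, station = pending[i]
            priority = 0 if station == bottleneck else 1
            heapq.heappush(queue, (priority, request_time, repair, station))
            i += 1
        _, request_time, repair, station = heapq.heappop(queue)
        start = max(now, request_time)
        waits[station] = waits.get(station, 0.0) + (start - request_time)
        heapq.heappush(free_at, start + repair)
    return waits

# Invented sample: one person, S2 is the bottleneck.
faults = [(0.0, 10.0, "S1"), (1.0, 5.0, "S3"), (2.0, 5.0, "S2")]
print(simulate_priority(faults, 1, "S2"))
```

In the sample, the S2 request arrives after the S3 request but is served first, so the bottleneck waits less and the non-bottleneck station waits longer, which is the behavior the policy intends.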

Total cost consideration

A higher RM staffing level means more hiring cost, but ensures that faults are responded to in a timely fashion. However, since the throughput is also bounded by a number of other factors, e.g., machine cycle time, line layout, etc., one cannot expect to increase the throughput simply by increasing the number of maintenance personnel. Understanding the tradeoff between RM staff and throughput, the research further justifies the optimal staffing level with a cost function that unifies labor and throughput cost. While there is a direct cost associated with each RM person on staff, there is also a cost due to throughput degradation. The latter usually consists of extra overtime labor costs to meet production goals, but for popular models where production capacity is constrained this may instead represent lost contribution margin. Thus, we may express total system costs as follows:


Fig. 7 The tradeoff between RM labor costs and cost of lost throughput (plot title: Total Cost for various maintenance staffing level; x-axis: Number of staff, 0 to 6; y-axis: cost ($); curves: Throughput Loss, Hiring Cost, Total Cost; annotations: 0 Throughput loss, Optimal level?)

$$\text{Total Cost} = \text{NST} \times \text{CST} + \text{PL} \times \text{CPL}, \quad \text{where } \text{CPL} = (\text{NBS} \times \text{WOT}) / \text{PS} \tag{3}$$

where
NST: Number of RM personnel
CST: Labor cost for each RM person
PL: Number of units of production lost
CPL: Contribution margin of production lost, estimated as overtime costs to meet the production goals
NBS: Number of labor needed for overtime production
WOT: Hourly wage rate for overtime = hourly wage rate × 150%
PS: Speed of the production line in jobs per hour (JPH)

The relationship can be represented graphically, as illustrated via an example in Fig. 7. It is obvious that the RM staffing level corresponding to zero production throughput loss is the “upper bound” for the optimal staffing level. Any more staffing numbers beyond that will only increase total cost. With the integration of the cost function to the simulations, RM staffing planning can be treated as a feedback control procedure. Over a projected time period, the staffing level is justified and control actions are triggered by demand fluctuations. For a short-term period, the staffing level becomes a constraint (it is assumed that the staffing level cannot be changed frequently), and dynamic relocation of staffing becomes control action to maximize throughput benefit and minimize the total cost without increasing staffing level. For a long-term period, the control action is to increase or decrease staffing level based on workload estimation.

Proactive maintenance opportunity

Another part of maintenance work is proactive maintenance (PM), which is sometimes not recorded in the failure history. Proactive maintenance is the practice of replacing components or subsystems before they fail in order to promote continuous system operation; it includes preventive maintenance and predictive maintenance. Usually, the proactive maintenance workforce is determined by machine vendors’ specifications. As busy production schedules are common, it is often difficult to shut down machines for proactive maintenance during production time. Proactive maintenance has to be completed during scheduled downtimes, which are usually in overtime shifts. It is not unusual that PM work is ignored when everyone is busy catching up with production demands. This situation significantly decreases machine reliability and impacts normal operation efficiency. Chang et al. (2006b) provided a method to identify maintenance opportunity windows in which machines can be shut down during production time for proactive maintenance with minimal impact on production throughput. Based on the previous example, it is seen that RM staffing utilization is very low. Therefore, if the opportunity windows can be utilized by RM staff during production time, the maintenance efficiency will be increased. Usually, more completed PM work means higher machine reliability and fewer unscheduled machine downtimes. Also, more PM work completed during production time means higher RM staffing utilization, fewer overtime PM hours and less PM staffing. All these effects can be directly translated into dollar savings, as each minute of PM work performed by RM personnel during an opportunity window reduces by one minute the personnel time that would otherwise be expended during planned downtime and/or overtime to accomplish the same tasks. This assumes, of course, that the PM tasks performed during an opportunity window would actually have been performed at other times. Therefore, the total cost after adding PM savings is defined as Total_Cost':

$$\text{Total\_Cost}' = \text{Total\_Cost} - \text{PM\_Saving}, \quad \text{where } \text{PM\_Saving} = \text{PM\_hours}_{\text{completion}} \times \text{WOT} \tag{4}$$

where
PM_Saving: PM overtime savings
PM_hours_completion: PM hours completed during opportunity window
WOT: Hourly wage rate (if the PM work has to be completed during overtime, hourly wage rate for overtime = hourly wage rate × 150%)

The problem is set up as follows: A sequence of faults is generated either from actual fault data or by simulating faults based on statistical properties of historical fault data. This fault sequence is used as the basis for the simulations used to evaluate the maintenance staffing level and the maintenance opportunity calculation. Whenever a fault occurs and its

Fig. 8 A simple serial production line (stations and buffers in line order: S1, S2, B1, S3, S4, B2, S5)

Station   MTBF (minutes)   MTTR (minutes)   Cycle time (seconds)
S1        20               4                19
S2        22               4                20.2
S3        23               4                19
S4        23               4                18
S5        22               4                19

estimated fault duration is longer than 5 min, the maintenance opportunity window is calculated. The 5 min threshold is used to obtain enough maintenance windows for PM work. A PM task list is also provided (from the vendor’s specification). Currently, a simplifying assumption is made that only one PM task is performed during a single fault, and the longest PM task in the list is always selected. This process is repeated for all faults in the sequence. It is assumed that PM can only be performed when a maintenance opportunity window is available and, at the same time, an RM staff member is available.

Experiments and results

The aforementioned method is tested on a simple production line consisting of five stations and two buffers, as described in Fig. 8. A sequence of faults generated from simulation is used as the basis for the simulations used to evaluate RM staffing level fluctuations and throughput impacts. The parameters for the stations are also listed in Fig. 8. Two months of fault data are used for the simulation. The results for various RM staffing levels are summarized in Table 1. The average staffing level rounds up to two people. The 95% confidence interval about the average RM personnel is (1, 2.5). The results of the sensitivity study on the throughput for various staffing levels are summarized in the last column of Table 1. It is noticed that with two RM people the total repair wait time is about 1,000 min, and the production loss is about 10% relative to the “zero-wait time” staffing level. By considering the tradeoff, one can easily select a staffing level of three personnel, since it yields a negligible loss of only 40 units over 2 months relative to the more conservative “zero-wait time” staffing level of four personnel. A further analysis discovered that during the 2 months period nearly 90% of events can be covered by two RM personnel.
Therefore, it is possible to determine to have two RM personnel while realizing 10 ∼ 15% of events need simultaneously three or more personnel and thus cause some throughput losses. The next experiment is set up to examine with only two RM personnel, if the throughput impact can be decreased

Buffer   Capacity
B1       100
B2       100

by dynamically prioritizing maintenance work such that the bottleneck has the highest priority. For the purpose of the experiment, simplifying assumptions are made to keep the problem tractable:

• An RM staff member can leave the station being worked on to attend to higher-priority maintenance work.
• The station keeps its breakdown status until an RM staff member returns from the higher-priority work.

First, day-by-day bottlenecks are identified. Then RM staff are dispatched in such a way that whenever the bottleneck station breaks down, an RM staff member is sent to it immediately, even if that person is working on another station. Therefore, while the bottleneck station is served immediately, other stations will encounter longer waiting times. The simulation results for this prioritization policy are shown in Table 2. It is observed that with two RM people and prioritized maintenance work for the bottlenecks, the production loss is about 5%. A throughput improvement of about 5% is thus obtained by prioritizing maintenance for the bottlenecks without increasing the RM staffing level. Finally, the leaner RM staffing level of two is further justified by utilizing the cost function. Assuming that any throughput losses can be made up via overtime, we set the cost of lost throughput equal to the direct and indirect labor overtime wages required to operate the system long enough to make up any losses. It is assumed:

• NBS = number of needed labor = 10
• PS = Production line speed = 150 JPH
• WOT = Hourly Wage Rate for Overtime = $45 × 150% = $67.50 ≈ $68
• Lost Production (PL) = Max Throughput − Current Throughput
• Throughput Loss = PL × CPL
• Maintenance staff cost, CST = $100,000/year.

Also recall,

• CPL = (NBS × WOT)/PS
• Total Cost = NST × CST + PL × CPL.

Table 1 Throughput result for various staffing level

RM staff level   Total number of breakdowns   Total wait time (minutes)   Production
1                3120                         4078                        39480
2                3120                         1019                        46000
3                3120                         122                         51040
4                3120                         0                           51080

Table 2 Result for prioritizing bottleneck stations

RM staff dispatch                           Total wait time (minutes)   Production
2 people without prioritizing bottlenecks   1019                        46000
2 people with prioritizing bottlenecks      1019                        48440
4 people with 0 wait time                   0                           51080

Table 3 Summary of various staffing level for 2 months period

RM staff level              Production   Labor cost   Throughput loss cost   Total cost
1                           39480        $16,667      $52,586                $69,253
2                           46000        $33,333      $23,029                $56,362
2 (prioritize bottleneck)   48440        $33,333      $11,968                $45,301
3                           51040        $50,000      $181                   $50,181
4                           51080        $66,667      0                      $66,667
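The figures in Table 3 can be reproduced from Eq. 3 and the stated assumptions. The following is a minimal sketch rather than the authors' tooling: the function and constant names are invented, the labor cost is prorated to the 2-month study period, and WOT is taken as the rounded $68 given in the text. Results agree with the table to within a dollar of rounding.

```python
NBS = 10                  # labor needed for overtime production
PS = 150                  # line speed, jobs per hour
WOT = 68                  # overtime hourly wage in dollars ($45 x 150%, rounded as in the text)
CST_PER_YEAR = 100_000    # cost of one RM person per year
PERIOD_MONTHS = 2         # length of the study period
MAX_THROUGHPUT = 51080    # production at the zero-wait staffing level (Table 1)

CPL = NBS * WOT / PS      # cost per unit of lost production

# (number of RM staff, production over 2 months) from Tables 1 and 2
SCENARIOS = {
    "1": (1, 39480),
    "2": (2, 46000),
    "2 (prioritize bottleneck)": (2, 48440),
    "3": (3, 51040),
    "4": (4, 51080),
}

def cost_breakdown(n_staff, production):
    """Return (labor cost, throughput loss cost, total cost) for the study period."""
    labor = n_staff * CST_PER_YEAR * PERIOD_MONTHS / 12
    throughput_loss = (MAX_THROUGHPUT - production) * CPL
    return labor, throughput_loss, labor + throughput_loss

def cheapest(scenarios):
    """Name of the scenario with the lowest total cost."""
    return min(scenarios, key=lambda name: cost_breakdown(*scenarios[name])[2])

for name, (n_staff, production) in SCENARIOS.items():
    labor, loss, total = cost_breakdown(n_staff, production)
    print(f"{name:28s} {labor:10.0f} {loss:10.0f} {total:10.0f}")
print("lowest total cost:", cheapest(SCENARIOS))
```

Dropping the prioritized scenario from `SCENARIOS` makes three people the cheapest option, which matches the conclusion drawn for the case without bottleneck prioritization.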

Table 3 summarizes the results for different scenarios. This tradeoff study suggests three RM people when no maintenance prioritization applied to bottlenecks. When bottlenecks are set higher priority and RM staffs are dynamically relocated according to the priority settings, the optimal staffing level is suggested to be two, although in this case the throughput impact is 5%. Next, to evaluate the potential savings by utilizing maintenance opportunity, a proof-of-concept study is developed. The study examined the amount of PM works completion through utilizing maintenance opportunity window. A PM task sample is shown in Table 4. It can be seen that the total PM works are 106.8 min for one day. From the fault sequence used in the earlier experiments, one day faults are selected. Whenever a fault with duration of 5 min or more incurred, maintenance opportunity calculation is invoked. The calculation results of maintenance windows are shown in Table 5. The numbers inside the table are opportunity window duration (in minutes) for each station at different time. The PM works at one station can only be completed when long enough opportunity windows exist at that station as well as there is an available staff. Therefore with the increasing RM staffing level, the PM completion hours will increase. The PM works completion results are demonstrated in Fig. 9 for various RM staffing level. In order to integrate the results to cost function, the PM dispatching procedure were applied by using 2 months of


fault data. It is assumed that all PM hours completed during opportunity windows would otherwise have to be expended during overtime. Therefore, the same overtime hourly wage rate used in the earlier example is applied. The final results are described in Fig. 10. It can be observed that as the staffing level increases, the RM and PM work completion increases and the throughput impact decreases. The final cost (Total_Cost') is evaluated for various staffing levels (shown as vertical bars in the graph). When the cost function is utilized, the tradeoff suggests a relatively lean staffing level of two personnel.
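The PM saving in Eq. 4 can be sketched under the stated simplification of one PM task per fault. The selection rule below (take the longest remaining task that fits an open window) is one reading of that simplification, and the code assumes an RM staff member is free at every window; both are assumptions, as are the function names and the shortening of station labels from S0001..S0005 to S1..S5. Task durations come from Table 4 and the three windows are the first three rows of Table 5.

```python
WOT = 68.0   # overtime hourly wage rate in dollars, from the cost assumptions

# PM task durations in minutes per station (Table 4).
PM_TASKS = {
    "S1": [7.8, 6.6],
    "S2": [7.8, 6.6, 6.6, 6.6],
    "S3": [7.8, 6.6, 6.6],
    "S4": [10.8, 10.8, 7.8],
    "S5": [7.8, 6.6],
}

# Opportunity window durations in minutes per station, one dict per fault (Table 5, rows 1 to 3).
WINDOWS = [
    {"S1": 5.45, "S2": 5.70, "S3": 0.0, "S4": 5.70, "S5": 5.45},
    {"S1": 9.17, "S2": 9.42, "S3": 9.43, "S4": 0.0, "S5": 0.0},
    {"S1": 7.75, "S2": 8.00, "S3": 8.03, "S4": 0.0, "S5": 8.03},
]

def pm_minutes_completed(tasks, windows):
    """Minutes of PM work completed when one task is done per fault."""
    remaining = {s: sorted(d, reverse=True) for s, d in tasks.items()}
    done = 0.0
    for window in windows:
        best = None                              # (duration, station) of the chosen task
        for station, length in window.items():
            for duration in remaining.get(station, []):
                if duration <= length:           # longest remaining task that fits here
                    if best is None or duration > best[0]:
                        best = (duration, station)
                    break
        if best is not None:
            remaining[best[1]].remove(best[0])
            done += best[0]
    return done

def pm_saving(pm_minutes, wot=WOT):
    """Eq. 4: PM hours completed during opportunity windows times the overtime rate."""
    return pm_minutes / 60.0 * wot

minutes = pm_minutes_completed(PM_TASKS, WINDOWS)
print(minutes, pm_saving(minutes))
```

No task fits the first fault's windows, since the shortest task takes 6.6 min, so only the second and third faults contribute PM work in this sample.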

Discussions The appropriate RM staffing level is determined by the combined result of maintenance dispatch simulation, throughput simulation and the cost function. It is important to notice that when the staffing level is fixed, the real-time decision on maintenance staffing allocation and maintenance job priority are based on dynamic system bottlenecks. It can be seen that giving the bottleneck station the highest maintenance priority is an effective maintenance staffing dispatch strategy. Prior study of Yang and Chang (Yang, Chang, Djurdjanovic, Ni, & Lee, 2004) developed a method for assigning priority to reactive maintenance works. A Genetic Algorithm (GA) based search for the optimal maintenance priority was studied. The procedure can be used to find maintenance priority


Table 4 PM task list

| Station # | Equipment type | Description | Expected repair time (minutes) |
| --- | --- | --- | --- |
| S0001 | Stationary Clamp Fixture | Clamp & Sealing Fixture | 7.8 |
| S0001 | Accumulator (Sys-T-Mation) | Roller Chain Conv. | 6.6 |
| S0002 | Stationary Clamp Fixture | Clamp & Sealing Fixture | 7.8 |
| S0002 | Accumulator (Sys-T-Mation) 1 | Roller Chain Conv. | 6.6 |
| S0002 | Accumulator (Sys-T-Mation) 2 | Roller Chain Conv. | 6.6 |
| S0002 | Accumulator (Sys-T-Mation) 3 | Roller Chain Conv. | 6.6 |
| S0003 | Stationary Clamp Fixture | Clamp & Sealing Fixture | 7.8 |
| S0003 | Accumulator (Sys-T-Mation) 1 | Roller Chain Conv. | 6.6 |
| S0003 | Accumulator (Sys-T-Mation) 2 | Roller Chain Conv. | 6.6 |
| S0004 | Exchange end effector – Sealing 1 | Sealing end effector | 10.8 |
| S0004 | Exchange end effector – Sealing 2 | Sealing end effector | 10.8 |
| S0004 | Stationary Clamp Fixture | Clamp & Sealing Fixture | 7.8 |
| S0005 | Stationary Clamp Fixture | Clamp & Sealing Fixture | 7.8 |
| S0005 | Accumulator (Sys-T-Mation) | Roller Chain Conv. | 6.6 |
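As a quick check on the stated daily PM workload of 106.8 min, the expected repair times in Table 4 can be summed directly. The sketch below is illustrative: the names `PM_TASKS`, `total_pm_minutes` and `pm_minutes_by_station` are hypothetical, and assigning the two sealing end effector tasks to S0004 is an assumption where the source layout is ambiguous (the total does not depend on it).

```python
from collections import defaultdict

# (station, equipment type, expected repair time in minutes), read from Table 4.
PM_TASKS = [
    ("S0001", "Stationary Clamp Fixture", 7.8),
    ("S0001", "Accumulator (Sys-T-Mation)", 6.6),
    ("S0002", "Stationary Clamp Fixture", 7.8),
    ("S0002", "Accumulator (Sys-T-Mation) 1", 6.6),
    ("S0002", "Accumulator (Sys-T-Mation) 2", 6.6),
    ("S0002", "Accumulator (Sys-T-Mation) 3", 6.6),
    ("S0003", "Stationary Clamp Fixture", 7.8),
    ("S0003", "Accumulator (Sys-T-Mation) 1", 6.6),
    ("S0003", "Accumulator (Sys-T-Mation) 2", 6.6),
    ("S0004", "Exchange end effector - Sealing 1", 10.8),  # station is assumed
    ("S0004", "Exchange end effector - Sealing 2", 10.8),  # station is assumed
    ("S0004", "Stationary Clamp Fixture", 7.8),
    ("S0005", "Stationary Clamp Fixture", 7.8),
    ("S0005", "Accumulator (Sys-T-Mation)", 6.6),
]


def total_pm_minutes(tasks):
    """Total expected PM time in minutes, rounded to one decimal."""
    return round(sum(minutes for _, _, minutes in tasks), 1)


def pm_minutes_by_station(tasks):
    """Expected PM time per station in minutes, rounded to one decimal."""
    totals = defaultdict(float)
    for station, _, minutes in tasks:
        totals[station] += minutes
    return {station: round(value, 1) for station, value in totals.items()}


if __name__ == "__main__":
    print(total_pm_minutes(PM_TASKS))
    print(pm_minutes_by_station(PM_TASKS))
```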

Table 5 Maintenance opportunity

| Window opportunity | S1 | S2 | S3 | S4 | S5 |
| --- | --- | --- | --- | --- | --- |
| 1 | 5.45 | 5.70 | 0 | 5.70 | 5.45 |
| 2 | 9.17 | 9.42 | 9.43 | 0 | 0 |
| 3 | 7.75 | 8.00 | 8.03 | 0 | 8.03 |
| 4 | 0 | 12.25 | 12.00 | 11.75 | 11.58 |
| 5 | 0 | 8.38 | 8.13 | 7.88 | 7.63 |
| 6 | 0 | 16.33 | 16.08 | 15.83 | 15.58 |
| 7 | 7.03 | 0 | 6.95 | 6.70 | 6.45 |
| 8 | 13.08 | 13.32 | 13.60 | 13.85 | 0 |
| 9 | 10.23 | 0 | 10.23 | 9.98 | 9.73 |
| 10 | 7.93 | 8.18 | 8.43 | 0 | 8.43 |
| 11 | 9.27 | 9.52 | 9.77 | 0 | 9.77 |
| 12 | 6.20 | 6.45 | 0 | 6.45 | 6.20 |
| 13 | 19.23 | 19.48 | 19.73 | 0 | 19.48 |

that intrudes minimally on the production system and thus maximizes productivity. However, the GA-based search looked for an optimal maintenance priority among all possible priority combinations without considering the relationship between the maintenance staffing level and the production process. This research provides the rationale behind the priority settings. It explains why the optimal priority gives maximum productivity: it shortens the repair time of the bottlenecks. The simulation-based analysis in this research takes real events as input, instead of average statistical values with assumed distributions. It is more like a “replay” of the current and

Fig. 9 PM completion percentage versus RM personnel (plot title: PM completion vs. staffing level; x-axis: RM personnel, 0 to 5; y-axis: PM completion %, 0% to 100%)
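The PM dispatching rule described in the text (a PM task is completed only when a long enough opportunity window exists at its station and a staff member is available) can be sketched as a simple greedy procedure. This is a minimal illustration under stated assumptions, not the dispatching procedure used in the study: each free person works at a single station per window event, shorter tasks are tried first, and stations offering the most doable work are served first. The function name `dispatch_pm` and the sample data are hypothetical.

```python
def dispatch_pm(window_events, tasks, free_staff):
    """Greedy PM dispatch sketch.

    window_events: list of dicts mapping station -> window length (minutes)
    tasks:         list of (station, minutes) PM tasks
    free_staff:    people available for PM during each window event
    Returns the total PM minutes completed.
    """
    pending = {}
    for station, minutes in tasks:
        pending.setdefault(station, []).append(minutes)
    for queue in pending.values():
        queue.sort()  # try shorter tasks first (assumption)

    completed = 0.0
    for event in window_events:
        plans = []
        for station, window in event.items():
            remaining = window
            doable = []
            for minutes in pending.get(station, []):
                if minutes <= remaining:
                    doable.append(minutes)
                    remaining -= minutes
            if doable:
                plans.append((sum(doable), station, doable))
        # One person per station; serve the stations with the most doable work.
        plans.sort(reverse=True)
        for work, station, doable in plans[:free_staff]:
            for minutes in doable:
                pending[station].remove(minutes)
            completed += work
    return completed


# Made-up example: three tasks, two window events.
TASKS = [("S1", 7.8), ("S1", 6.6), ("S2", 7.8)]
EVENTS = [{"S1": 9.0, "S2": 8.0}, {"S1": 8.0, "S2": 0.0}]

if __name__ == "__main__":
    for n in range(3):
        print(n, "free staff ->", round(dispatch_pm(EVENTS, TASKS, n), 1), "min")
```

In this made-up example, a second free person lets the remaining S1 task be completed in the second window, which mirrors the trend that PM completion grows with staffing level.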

Fig. 10 Integrated total cost for various maintenance staffing levels (plot title: Impact of various maintenance staffing level; x-axis: RM personnel, 1 to 4; axes: RM/PM completion % or throughput impact %, 0% to 100%, and Cost ($), 0 to 70000; series: cost, RM completion, PM completion, throughput impact)

historical events, tracking the results and other useful information. The simulation model and its values are validated not only against throughput but also against line starve/block trends, in order to capture the dynamic behavior of the line. With this “replay,” the “real-time” bottleneck can be captured effectively (rather than the long-term statistical bottleneck).

From the experiments, one can conclude that this research provides a decision support method for justifying the RM staffing level. When maintenance opportunity windows are utilized, more savings can be achieved. Combining the calculation results, a graph such as Fig. 10 can help the decision maker choose the right maintenance staffing level. For each staffing level, the RM completion percentage, PM completion percentage and throughput impact percentage are clearly presented. Finally, the total cost can be evaluated for each staffing level according to user-specified cost parameters (e.g., labor cost, overtime cost, etc.).
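The idea of replaying recorded faults under a fixed staffing level, with bottleneck stations served first, can be illustrated with a small event-driven sketch. It is a simplified stand-in for the maintenance dispatch simulation, under stated assumptions: repairs are not preempted, the bottleneck set is given and fixed (the study identifies bottlenecks dynamically), and the function name `replay_dispatch` and the fault data are hypothetical.

```python
import heapq


def replay_dispatch(faults, staff, bottlenecks=frozenset()):
    """Replay faults with `staff` repair people; return wait minutes per station.

    faults: list of (arrival_min, station, repair_min)
    Waiting faults at bottleneck stations are served first, then by arrival.
    """
    if staff < 1:
        raise ValueError("at least one repair person is required")
    pending = sorted(faults)          # ordered by arrival time
    free_at = [0.0] * staff           # min-heap: when each person becomes free
    heapq.heapify(free_at)
    waiting = []                      # min-heap of (rank, arrival, station, repair)
    waits = {}
    clock = 0.0
    i = 0
    while i < len(pending) or waiting:
        now = max(heapq.heappop(free_at), clock)
        if not waiting and i < len(pending):
            now = max(now, pending[i][0])   # idle until the next fault arrives
        while i < len(pending) and pending[i][0] <= now:
            arrival, station, repair = pending[i]
            rank = 0 if station in bottlenecks else 1
            heapq.heappush(waiting, (rank, arrival, station, repair))
            i += 1
        rank, arrival, station, repair = heapq.heappop(waiting)
        waits[station] = waits.get(station, 0.0) + (now - arrival)
        heapq.heappush(free_at, now + repair)
        clock = now
    return waits


# Made-up fault sequence: three 10-minute repairs arriving close together.
FAULTS = [(0, "S1", 10), (1, "S2", 10), (2, "S3", 10)]

if __name__ == "__main__":
    print(replay_dispatch(FAULTS, staff=1))
    print(replay_dispatch(FAULTS, staff=1, bottlenecks={"S3"}))
    print(replay_dispatch(FAULTS, staff=2))
```

With one person and no prioritization, the hypothetical bottleneck S3 waits longest; giving S3 priority shifts that delay to the non-bottleneck S2, and a second person removes most of the waiting. Which option costs less depends on the cost parameters.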

Conclusions

In this research, the maintenance staffing planning problem is framed as a feedback control procedure. Using maintenance staffing dispatch simulation, the maintenance workforce oscillations are captured and expressed as graphs of variables over time. The tradeoff between RM staffing levels and the cost of lost throughput is quantified, and a methodology for determining the RM staffing level that minimizes overall costs is presented. In addition, the research links the long-term decision (the RM staffing level) to short-term bottleneck analysis. The short-term analysis is used to dynamically prioritize maintenance work and relocate maintenance staff to achieve greater throughput benefits. Maintenance efficiency is further improved by utilizing maintenance opportunity windows. The experiments show additional savings in terms of PM work and personnel. Although it may seem counterintuitive to accept delays in repairing failed equipment given a historical emphasis on maximizing throughput, the results of the case studies suggest that, at plants that are not capacity-constrained, the labor cost of current RM staffing may outweigh the throughput benefits it affords.

References

Barlas, Y., & Diker, G. V. (1996). An interactive dynamic simulation model of a university management system. 1996 ACM 0-89791-820-7.

Chang, Q., Ni, J., Bandyopadhyay, P., Biller, S., & Xiao, G. (2006a). Short-term system bottleneck identification. In Proceedings of the 7th international conference on frontiers of design and manufacturing, June 19–22, 2006, Guangzhou, China, pp. 203–208.

Chang, Q., Ni, J., Bandyopadhyay, P., Biller, S., & Xiao, G. (2006b). Maintenance opportunity planning system. In Proceedings of 2006 international symposium on flexible automation, July 10–12, 2006, Osaka, Japan, pp. 244–251.

Klungle, R. (1999). Simulation of a claims call center: A success and a failure. In P. A. Farrington, H. B. Nembhard, D. T. Sturrock, & G. W. Evans (Eds.), Proceedings of the 1999 winter simulation conference.

Parker, R. S., & Marriott, A. J. (1999). Personnel forecasting strategic workforce planning: A proposed simulation cost modeling methodology. In P. A. Farrington, H. B. Nembhard, D. T. Sturrock, & G. W. Evans (Eds.), Proceedings of the 1999 winter simulation conference.

Yang, Z., Chang, Q., Djurdjanovic, D., Ni, J., & Lee, J. (2004). Maintenance priority assignment utilizing online production information. In Proceedings of 2004 Japan – USA symposium on flexible automation, Denver, Colorado.