
IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT, VOL. 41, NO. 2, APRIL 1992

Process Data Chemometrics

Michael J. Piovoso, Senior Member, IEEE, Karlene A. Kosanovich, and James P. Yuk

Abstract-Data rich but information poor is an excellent way to characterize most chemical processes today. The lack of data analysis tools and adequate fundamental and experimental models makes it difficult to pursue product quality and improved understanding of a process. One data analysis technique successfully applied in spectroscopy to reduce a large quantity of data into meaningful information is Chemometrics. Data, when properly interpreted by statistical data analysis tools and fundamental and heuristic models, yield meaningful information. In this paper we discuss the use of Chemometrics as a multivariate analyzer to provide a composite measurement of the state of a chemical process operation. An application of this analyzer at a Du Pont plant is presented, and we introduce two measures to detect and identify important process shifts.

NOMENCLATURE

A          number of principal components in the calibration model, A <= m
B          r x m matrix of coefficients
E          n x m matrix of residuals of X
e_i        error of the ith column of residuals E
||e_i||^2  norm squared of the error of the ith column of residuals E
F          n x r matrix of residuals of Y
F*         estimated residuals of Y from the calibration model
g          linear operator
h          Mahalanobis distance
M          centroid of the calibration model
m          number of sensor variables
n          number of sample data points
P          m x A matrix of loadings of X
p_i        m x 1 vector of the ith column of loadings
Q          r x A matrix of loadings of Y
r          number of quality variables
S          variance-covariance matrix of X
s_i^2      SIMCA residual variance of the ith sample data x_i
s_aa^2     diagonal elements of the variance-covariance matrix of T
s_i*^2     variance of the ith estimated data from the calibration model
T          n x A matrix of scores
t_i        n x 1 column vector of the ith column of scores
X          n x m matrix of sensor variables
x_i*       ith estimated data from the calibration model
Y          n x r matrix of quality variables
Y*         estimated predictions of Y from the calibration model
F_s        variance ratio
'          transpose of a matrix

Manuscript received May 14, 1991; revised October 8, 1991. The authors are with E. I. Du Pont de Nemours & Company, Inc., Newark, DE 19714-6090. IEEE Log Number 9106705.

I. INTRODUCTION

ADVANCEMENTS in automation and distributed control systems make possible the collection of large quantities of data. But without the corresponding adequate tools, it is not possible to interpret the data. Every modern industrial site believes that this data bank is a gold mine of information, if only the "important" and relevant information could be extracted painlessly and quickly. Timely interpretation of data would improve quality and safety, reduce waste, and improve business profits. This interpretation is possible except for the following dilemmas: undetected sensor failures, uncalibrated and misplaced sensors, lack of integrity of the data historian, lack of data compression techniques used to store the data, and general human errors. It is no wonder that data analysis methods, in the face of these serious problems, appear to be inadequate. Meanwhile, the data bank continues to grow without garnering any useful information.

There have been several approaches to this general problem of interpreting sensor data, including statistical techniques to model relationships between sensor measurements and quality variables [1]-[4], and to identify process upsets and sensor failures [5], [6]. Other approaches are the use of expert system methods to detect sensor failures and sensor drifts [7], [8], fuzzy logic to interpret sensor state [9], [10], and neural nets to construct a nonlinear mapping between sensor measurements and quality variables [11], [12].

This paper discusses the use of two multivariate statistical techniques to interpret sensor data in the following manner. We use the technique known as Partial Least Squares or Projection to Latent Structures (PLS) to build a correlation between the dominant effect in the process (demand) and its relationship to other process sensors. We then apply Principal Component Analysis (PCA) to build a model of the process behavior unaccounted for by the demand. It is our intention to use the PCA model to calibrate the process in the same sense as calibrating an instrument. Our concept is that the process data can be viewed as an instrument which provides a measure of the process. If this instrument is calibrated correctly, that is, it provides the same measure for all periods in which the outputs or the quality variables are on-aim, then it can


provide the kind of information that process engineers and operators need. If the process is drifting away from the target, then this should easily be detected on-line and adjusted appropriately to prevent a poorer quality product and to reduce losses and downtime. We show the development of this particular concept and its use as a process analyzer to enhance the control of the process in an on-line, real-time mode of operation. We also discuss two measures to identify and classify significant deviations in process behavior.

II. BACKGROUND

The choice of PLS and PCA comes from the need to transform collinear, noisy measurements into informative results. The process control practitioner intuitively realizes that univariate techniques, while simpler to understand and implement, cannot account for the underlying multi-interactions that are inherent in a chemical process. To properly understand a process and implement sophisticated control requires a model. First-principles models that accurately and consistently describe the varied and complicated behavior of a complex process are generally difficult to develop. Neural nets do offer a hope of modeling the input-output behavior including nonlinearities, but they suffer from a lack of sound data pretreatment methods, selection of appropriate exemplars and inner nodes, long training times, and definition of adequate training sets.

The theoretical foundations of PLS and PCA cannot be presented here in their entirety. The interested reader is referred to the texts by Martens and Naes [13], and Kowalski and Beebe [14] for more detailed information on the subject. For a comparison of several analytical techniques for quantitative spectral analysis see Thomas and Haaland [15]. Here we will provide only what is necessary to the objectives of this work.

A. Statistical Methods

PLS and PCA are multivariate statistical methods that attempt to identify the underlying phenomena that relate sensor measurements to quality variables in a reduced dimensional subspace. They have the advantage that collinearity in the data does not present statistical estimation problems. Furthermore, the decomposition of the data into a reduced set often is interpretable in terms of physically meaningful phenomena.

1) Principal Component Analysis: PCA was originally developed by Pearson [16] and is related to Singular Value Decomposition. PCA is used to explain the variance in a single data set X = {x_1, x_2, ..., x_m} by a lower number of variables T = {t_1, t_2, ..., t_A}, A <= m. Each x_i and t_i is an n x 1 column vector. The PCA method involves the calculation of vectors that are a linear combination of the columns of X and that describe the amount of variability in the data. These vectors are the eigenvectors of the covariance matrix of X. Hence X can be written as a linear combination

X = t_1 p_1' + t_2 p_2' + ... + t_A p_A' + E,   (1)

where p_i' is the transpose of the ith column of the m x A matrix P of loadings, with A <= m. More generally we can write

X = TP' + E,   (2)

where the columns of T are the scores for each sample x_i, the columns of P are the principal component loadings, and the E values are the residuals of X. The loadings P are the new basis set for the space spanned by X. If the data are mean-centered and standard-deviation scaled, then the elements of the loadings range from -1 to 1, with high absolute loadings corresponding to high correlations and small loadings to low correlations. The first loading is the first eigenvector and corresponds to the scores with the largest variance. The first eigenvector describes the direction of greatest variation in X; the second describes the next dominant direction of variation, and so on. The number of eigenvectors can be established by cross validation or leverage correction techniques. The ith score t_i is the projection of the X data onto the eigenvector, or ith column of P, p_i:

t_i = X p_i,   (3)

where t_i is the ith principal component for the columns of X because it can be written as a linear combination of the columns of X. The maximum possible number of principal components equals the rank of X, but if there is significant collinearity, which is quite often the case, PCA is able to describe the significant variations in fewer dimensions. In fact, when A = m, the technique converges to Multiple Linear Regression (MLR). The residuals E are an indication of the fit of the model to new samples. Analogous to the discussion of Wise, Ricker, and Veltkamp [5], given a model of A principal components with A <= m, the residual matrix E is computed as

E = X(I - P_A P_A'),   (4)

where P_A is an m x A matrix with each of the A columns corresponding to the first A principal components. P_A P_A' = I, the identity matrix, when A = m; otherwise P_A P_A' != I. The magnitude of any column e_i is computed as

||e_i||^2 = e_i' e_i.   (5)

||e_i||^2 may be used to determine if a significant shift has occurred between new samples and the operating space as defined by the calibration model, equation (2). A large ||e_i||^2 indicates a poor fit of new samples to the calibration model. A poor fit is probably due to systematic changes in the normal operations of the process and/or sensors that have become uncorrelated. Simple detection of the major source of variability can be done by inspection of the elements of E. Rigorous tests on the classification of new samples to the calibration model using the Mahalanobis distance test [17] and the Soft Independent Modeling of Class Analogy (SIMCA) residual variance [18] are used in this work. A discussion of these two measures will subsequently follow.
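To make equations (1)-(5) concrete, the following sketch (Python with NumPy; not part of the original paper) fits an A-component PCA model to a mean-centered, scaled data matrix and computes the residual matrix and the column-wise residual norms. The function and variable names mirror the nomenclature but are otherwise illustrative.

```python
import numpy as np

def pca_model(X, A):
    """Fit an A-component PCA model to a mean-centered, scaled n x m matrix X (eqs. (2)-(4))."""
    # The loadings are eigenvectors of the covariance of X; the SVD of X gives them
    # directly as rows of Vt, ordered by decreasing explained variance.
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    P_A = Vt[:A].T                                  # m x A matrix of loadings
    T = X @ P_A                                     # n x A matrix of scores, eq. (3)
    E = X @ (np.eye(X.shape[1]) - P_A @ P_A.T)      # residual matrix, eq. (4)
    return T, P_A, E

def residual_norms(E):
    """Norm squared of each column of the residual matrix, eq. (5)."""
    return np.sum(E**2, axis=0)
```

A large entry returned by residual_norms, for data projected onto a previously built P_A, would flag the kind of shift away from the calibration model discussed above.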


2) Partial Least Squares: We are not able to give a thorough background on PLS in this paper. A good theoretical treatment of PLS can be found in Martens and Naes [13] and Hoskuldsson [19]. Consider X, an n x m matrix whose rows are samples and whose columns are measurements, and Y, an n x r matrix whose columns are quality variables. In general, we want to find a function g that relates X to Y:

g: X -> Y.   (6)

g is most likely a complex relationship that is usually not clearly articulated or exactly known in a complex process. What PLS attempts to do is to define g as a bilinear model of the form, for mean-centered and scaled X and Y,

X = TP' + E
Y = TQ' + F,   (7)

such that the relationships

Y* = XB'   (8)

B' = P(P'P)^{-1} Q'   (9)

hold. B is a matrix of coefficients that relate Y to X, T is the scores matrix representing the variability in X that is correlated to the variability in Y, Q is the basis set of vectors that span the space of Y, and F is the residual in Y. The model is bilinear because X and Y are the product of two sets of estimated linear terms (T, P) and (T, Q), respectively. The following should be noted: T, the matrix of scores, is not optimal for estimating the columns of X as was the case in PCA, because T must simultaneously describe Y. Equation (8) is the prediction model, because given X we can determine Y, and the calculations and implications of the residual F are analogous to the discussion of the PCA residuals E.
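The sketch below (Python with NumPy; an assumption on our part, not taken from the paper) extracts the PLS scores and loadings with a NIPALS-style inner loop and then forms the regression coefficients of equation (9). It is one common way to compute the bilinear model of equation (7); the paper does not say which algorithm its software uses.

```python
import numpy as np

def pls_model(X, Y, A, tol=1e-10, max_iter=500):
    """NIPALS-style PLS for mean-centered, scaled X (n x m) and Y (n x r), eqs. (7)-(9)."""
    Xk, Yk = X.copy(), Y.copy()
    n, m = X.shape
    r = Y.shape[1]
    T, P, Q = np.zeros((n, A)), np.zeros((m, A)), np.zeros((r, A))
    for a in range(A):
        u = Yk[:, [np.argmax(Yk.var(axis=0))]]    # start from the Y column with most variance
        for _ in range(max_iter):
            w = Xk.T @ u
            w /= np.linalg.norm(w)                # X weights
            t = Xk @ w                            # X scores
            q = Yk.T @ t / (t.T @ t)              # Y loadings
            u_new = Yk @ q / (q.T @ q)            # Y scores
            converged = np.linalg.norm(u_new - u) < tol * np.linalg.norm(u_new)
            u = u_new
            if converged:
                break
        p = Xk.T @ t / (t.T @ t)                  # X loadings
        Xk -= t @ p.T                             # deflate X; what remains builds E of eq. (7)
        Yk -= t @ q.T                             # deflate Y; what remains builds F of eq. (7)
        T[:, [a]], P[:, [a]], Q[:, [a]] = t, p, q
    Bt = P @ np.linalg.inv(P.T @ P) @ Q.T         # B' of eq. (9), so that Y* = X @ Bt
    return T, P, Q, Bt
```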

B. Pretreatment and Outliers

Because both PCA and PLS are data-driven techniques, an essential part of the procedure is the pretreatment of the data. In order to effectively derive a bilinear model, the relation between the variables should be sufficiently linear, and/or the calibration sample set should span a narrow enough portion of the space for a linear approximation. It is important that the variables be scaled relative to one another, so as to avoid having important process variables whose magnitudes are small being overshadowed by less important but larger-magnitude variables. Also, since the underlying algorithm is a least-squares fit, the eigenvalue/eigenvector determination in PCA will tend to be biased toward variables with larger numerical values. In this work, scaling is done by first subtracting the mean and then standardizing by the standard deviation of each variable (see the sketch at the end of this subsection).

A second treatment of the data, albeit not an automated one, involves the very important detection and removal of outliers. This may not be a universal problem, but in many cases the data base is subject to human error, machine interlocks, and unusual spikes. A method for detecting outliers is clearly necessary. Detection was accomplished both by calibrating and visually inspecting the portraits of scores and loadings and by the outlier analysis provided by the chemometrics package used. Removal was performed manually. Failure to remove outliers results in a model which does not give a true picture of the correlations in the X and Y variables.
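A minimal sketch of the autoscaling described above (Python with NumPy; illustrative, not the authors' code). The returned statistics are kept so that new on-line samples can be scaled exactly as the calibration set was.

```python
import numpy as np

def autoscale(X, mean=None, std=None):
    """Mean-center each variable and scale it by its standard deviation.
    Pass the stored mean/std to scale new samples with the calibration statistics."""
    mean = X.mean(axis=0) if mean is None else mean
    std = X.std(axis=0, ddof=1) if std is None else std
    return (X - mean) / std, mean, std
```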

C. Membership and Classification

In this work one of our objectives is to alert the operators when the process has shifted away from normal operations. We can measure significant shifts of new samples away from the operating space defined by the calibration model by determining if the new sample is a legitimate member of the set. Two measures of membership used in this work are the Mahalanobis distance h_i and the SIMCA residual variance s_i^2. The Mahalanobis distance is calculated in the reduced dimensional subspace defined by the primary eigenvectors, those that span the significantly correlated behavior in the original data set, while the SIMCA residual variance is calculated in the subspace that is spanned by the residual variation left in the original data set. The two complement each other by providing a measure of "inside the model space" and "outside the model space" classification [20]. This is analogous to univariate methods such as Shewhart charts and Cusum that rely on detecting process shifts based on being within +/- n standard deviations.

The Mahalanobis distance is calculated as the distance between the ith sample x_i and the centroid M of the calibration model

h_i = (x_i - M)' S^{-1} (x_i - M),   (10)

where S is the variance-covariance matrix of the data set X, computed as

S = (1/(m - 1)) sum_{i=1}^{n} (x_i - M)(x_i - M)'.   (11)

Since t_i is the projection of x_i onto the reduced subspace defined by the principal component loadings, and because the mean of the principal component scores for each principal component is zero due to mean centering, S and h_i simplify when calculated in the score space of the data to

h_i = sum_{a=1}^{A} t_ia^2 / s_aa^2,   (12)

where

s_aa^2 = (1/(m - 1)) sum_{i=1}^{n} t_ia^2,   (13)

for an A principal component model, A <= m. In the scores space, the variance-covariance matrix is diagonal with ath element s_aa^2. We refer to this as "inside the model space" classification because the primary eigenvectors are used to calculate h_i.
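The following sketch (Python with NumPy; illustrative) evaluates equation (12) for a new sample's score vector. The paper's equations (11) and (13) normalize by (m - 1); here the ordinary sample variance of the calibration scores is used, which is an assumption on our part.

```python
import numpy as np

def mahalanobis_in_scores(t_new, T_cal):
    """Mahalanobis distance of a new sample in the score space, eq. (12).
    t_new: score vector of length A; T_cal: n x A matrix of calibration scores.
    The scores are mean-centered, so the centroid is zero and the covariance is diagonal."""
    s2 = T_cal.var(axis=0, ddof=1)   # diagonal elements s_aa^2 (sample variance assumed)
    return float(np.sum(t_new**2 / s2))
```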


The SIMCA residual variance s_i^2 measures the similarity between new samples and the calibration model using the secondary eigenvectors. s_i^2 is calculated as the sum of squared differences between the sample data x_i and the estimated sample data x_i*, regenerated from the principal component model,

s_i^2 = sum_{j=1}^{m} (x_ij - x_ij*)^2 / (m - A),   (14)

or, in terms of the principal component scores,

s_i^2 = sum_{j=A+1}^{m} t_ij^2 / (m - A),   (15)

where m is the number of variables and A is as before. Since the secondary eigenvectors are used to calculate the residual variance, we refer to this as "outside the model space" classification. The F-ratio, F_s, can be calculated to estimate the probability levels for new samples,

F_s = (s_i^2 / s_i*^2) * 1/(m - A - 1),   (16)

where s_i^2 is the variance for the raw sample data, s_i*^2 is the estimated variance, and 1/(m - A - 1) accounts for the degrees of freedom. Classification based upon the Mahalanobis distance is done by calculating the probabilistic confidence levels p(.) obtained from the chi-squared distributed random variable h. Classification based upon the SIMCA residual variance is based upon probability levels calculated from a Fisher F test. We use the ranges as given in [19]. That is, for a sample, if p(.) is in the range 0.0 < p(.) < 0.01, then the sample is a nonmember of the class. If 0.01 < p(.) < 0.05, then the sample is an outlier, or if 0.05 < p(.) < 1.0, then the sample is a member of the calibration model. The Fisher F test gives a measure of the quality of fit of the model to the observed data.

III. DESCRIPTION OF THE CHEMICAL PROCESS

The chemical process for which this multivariate analyzer was developed is one step in a multistage process. The first two stages, where the chemical reaction occurs, have the greatest impact on the operations of the downstream stages. Critical properties such as viscosity and density, if altered significantly, would affect the final product, resulting in a loss of revenue or creating operability problems for customers. Furthermore, it is difficult to determine which stage is truly responsible for the loss of yields. A lack of on-line sensors to continuously measure the critical properties also makes it impossible to relate any specific changes to a particular stage. Indeed, changes in properties are detected by laboratory measurements. These measurements are sometimes taken as infrequently as once every 8 h. The analysis, when returned, reflects past information; consequently, the current state of operations may not reflect the state that is associated with the lab results.

Such infrequent measurements make the control of the product quality difficult. At best, the operators have learned a set of heuristics which, if adhered to, usually produce a good product yield. However, unforeseen disturbances and undetected machine degradations may occur which will also cause a yield loss. There are periods of operation, however, when the final process step produces a degraded product in spite of near-perfect upstream operations. What also contributes to the control problem is a lack of "just in time" knowledge about the current state of the process, and instrumentation that is inadequate.

Fig. 1(a) illustrates the major details of the second stage of the multistage process. It begins with the feed from the first stage, combined with a solvent in a mixer at a carefully controlled speed and temperature. A sample of the mixture is taken at the exit for lab analysis. From there the mixture is sent to a blender under level and speed control. The mixture is then pushed through a series of pumps and filters before reaching the third stage in the process. Pumps and filters need to be replaced frequently; thus they are installed in pairs. The load on one can be temporarily increased while the other is being serviced. Likewise, the filters must be changed routinely to avoid pluggage and downtime.

The frequent maintenance on the equipment is not the primary source of control problems. Rather, the majority of the abrupt control changes occur due to the demands of the third stage. Whenever there is a decrease in demand, the second stage must reduce throughput because the third stage has no storage capacity. In the current control scheme, a change in the third stage demand is indirectly coupled to the feed flowrate control valves. When the demand increases, the second stage must ramp up to meet the demand and do so quickly. This causes the process to move around significantly, and it never reaches an equilibrium. Clearly, demand is the dominant effect on the variability in the sensor values and process performance.

IV. MODEL DEVELOPMENT

The purpose of the model is to improve the process operations of the second stage. Intuitively, if the second stage could anticipate shifts away from the process aim, then corrective action could be taken to minimize the impact on the performance of the downstream stages, resulting in tighter control. This solution, however, does not address the effect of demand on control. It would appear on the surface that a redesign of the control scheme to feed forward feed rate changes in the final stage to the controlling valves would allow for proper adjustment of the setpoints. This upgrade is outside the scope of this paper.

To further define this concept, consider the following. Suppose a PLS model that relates changes in demand to the other measurements could be realized such that its effect on the other measurements could be predicted. In notation, let X be the demand as measured by feed flowrate variables, and Y be the set of all other measurements.


Fig. 1. (a) Process schematic of Stage 2. (b) System configuration.

Then, accordingly, we could develop a predictive model of the form

Y = XB' + F,

where Y and X are mean-centered and scaled. Some of the variability in the measurements Y is accounted for by the variability of the feed flowrate measurements X. The residuals in Y will be a measure of the amount of unexplained variance left by feed flowrate effects. We use only A of a total of m possible principal components in our PLS and PCA models. Our presumption is that only A principal components are necessary to explain the significant variability, that is,

F = Y - X P_A (P_A' P_A)^{-1} Q_A'.   (17)

If the residuals in F are large, then they represent significant information not attributed to feed flowrate changes, and confidence limits on these may be established. A PCA analysis of the residuals is computed to investigate the remaining variability. Recall that a PCA explains the variability in a single data set in a reduced dimensional space defined as

F = TP' + E,   (18)

where F here replaces X in the specific case. In effect, we have developed two calibration models. The first, model 1, is a PLS model of all the measurements, excluding the set representing demand, regressed upon the demand measurements. The second, model 2, is a PCA model of the residuals of model 1, to model the variability of the process not accounted for by model 1. Both are reduced-order models but, more importantly, they capture the dominant directions of variability in the data. What we have produced from the PCA analysis is a portrait of the second stage "setpoint," or a "fingerprint" defined in the new basis set which in turn describes the second stage process operations. From our intuitive understanding of setpoints in the univariate sense we may use this portrait to measure and detect significant shifts away from the "desired" setpoint. Significance or membership of a new sample to the calibration model will be determined by the Mahalanobis distance test and the SIMCA residual variance.
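The calibration chain of this section, model 1 (PLS of the demand measurements against all other measurements) followed by model 2 (PCA of the residuals F), can be sketched in Python as below, reusing the hypothetical helpers pls_model, pca_model, and autoscale sketched earlier; the names and structure are illustrative, not the authors' implementation.

```python
def build_calibration(X_demand, Y_other, A_pls, A_pca):
    """Build model 1 (PLS) and model 2 (PCA of the PLS residuals), eqs. (17)-(18)."""
    Xs, x_mean, x_std = autoscale(X_demand)       # demand (feed flowrate) measurements
    Ys, y_mean, y_std = autoscale(Y_other)        # all other process measurements
    T1, P1, Q1, Bt = pls_model(Xs, Ys, A_pls)     # model 1
    F = Ys - Xs @ Bt                              # residuals not explained by demand, eq. (17)
    T2, P2, E2 = pca_model(F, A_pca)              # model 2: PCA fingerprint of F, eq. (18)
    return {"Bt": Bt, "P2": P2, "T2": T2,
            "x_stats": (x_mean, x_std), "y_stats": (y_mean, y_std)}
```

A new hourly sample would then be scaled with the stored calibration statistics, pushed through model 1 to remove the demand effect, and its residual scored against model 2 with the Mahalanobis and SIMCA measures of Section II-C.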

V. IMPLEMENTATION

A system was designed to implement a prototype of the multivariate analyzer in the solution area. The system is composed of three components: the user interface, the statistical engine, and the expert system rule base. The requirements of the system are as follows. It must be interactive, capable of displaying data in several graphical forms, user-friendly, integrated with the process data base, statistical engine, and rule base, capable of communicating from/to peripheral devices, and capable of scheduling real-time tasks both internally and externally.

The platform we settled on for the user interface, the expert system rule base, and task-to-task and device-to-device communication is the ACA® computer marketed by Advanced Computer Applications Inc., Newtown, PA. The application package STATEST®, marketed by Consensus Analysis, Norway, is used on-line in real time as the statistical engine of the system. The chemometrics application package Unscrambler®, marketed by CAMO, Norway, is used to develop the models off-line. The ACA® platform controls the task-to-task communication between the data historian running on a Digital VAX platform and the PC running STATEST®. It also serves as the interactive display for the operators, the engineers, the manufacturing supervisor, and the model developer. The integration between the system and the other platforms was done using Ethernet connections with the DECnet® protocol. See Fig. 1(b) for a simplified view of the system.

Every hour, process data is taken from the historian and displayed on the ACA® computer as a foreground task. As a background task it relays the data, after doing some simple data validation, to the PC platform. There STATEST® applies model 1 and model 2. STATEST® also contains robust algorithms to handle missing sensor values. The fit of a new sample to the models is determined by calculating the magnitude of the residuals and the probability levels based upon the Mahalanobis distance


test and the SIMCA residual variance. Estimates of what the sensor measurements of the sample data should have been, to be within the confidence levels of the process setpoint, are computed as

Y* = F* + (XB' + B_0).   (19)

The results are visually displayed to the operators on the ACA® platform.
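A sketch of how equation (19) might be evaluated for a single new sample, using the calibration dictionary from the earlier build_calibration sketch (Python; treating F* as the model-2 projection of the sample's residual and the means removed during scaling as the intercept B_0 are our assumptions, not statements from the paper).

```python
def expected_measurements(x_new, y_new, model):
    """Estimate what the measurements should have been, eq. (19): Y* = F* + (XB' + B_0)."""
    x_mean, x_std = model["x_stats"]
    y_mean, y_std = model["y_stats"]
    xs = (x_new - x_mean) / x_std                  # scale with the calibration statistics
    ys = (y_new - y_mean) / y_std
    y_demand = xs @ model["Bt"]                    # demand-explained part, XB'
    f = ys - y_demand                              # residual of the new sample
    P2 = model["P2"]
    f_star = (f @ P2) @ P2.T                       # F*: residual projected onto model 2's space
    return (f_star + y_demand) * y_std + y_mean    # undo scaling; restoring the mean acts as B_0
```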

VI. RESULTS AND DISCUSSION

Two models were developed for the second stage of a multistage process based on process sensor data. The Unscrambler® package was used off-line to build the PLS and PCA models, models 1 and 2, respectively. Proper scaling and outlier detection were done prior to model building. Model 1 was developed with the measurements representing demand, X, regressed upon all other measurements, Y, and leads to a principal component model that explained over 50% of the variance in X and Y. Fig. 2(a) shows this result. Model 2 was developed using a PCA analysis of the residuals of model 1. Fig. 2(b) shows the added explained variance of model 2, resulting in a total explained variance of approximately 80%. The other 20% is probably due to the nonlinearities inherent in the process. Additionally, note that model 2 has A = 6 significant principal components; components 7 and 8 appear to be modeling measurement noise. The number of principal components was adjusted until optimal classification of the independent measures of the test samples was obtained. The metrics used in this selection process were the percentage of correct classification of acceptable samples and the percentage of rejection of unacceptable samples.

The identification of what effect the principal components of model 2 represent may be gleaned from a loadings plot of the original variables. For example, for the loadings of principal component 1 versus the sensor variables, the operating staff has preliminarily identified this to be a viscosity effect. Similar identification was done for principal component 2 of model 2.

Fig. 3 shows a plot of the scores of principal component 1 versus the scores of principal component 2 of model 2. Principal components 1 and 2 account for 40% of the explained variance. The numbers represent the hour-by-hour composite "measurement" of where the process is or has been. We use this fingerprint as our process setpoint. Inspection of this fingerprint shows that the process shifts between two overlapping regions; that is, it starts in one region and due to "natural" events it shifts to another region and stays there for a while until other "natural" events move it back to the first region. Stage 2 personnel consider this normal operation. It could be argued that two fingerprint models of the process should be developed, one for each natural event.

Fig. 3 also illustrates the case when a new sample was observed to have questionable membership. The Mahalanobis distance and the probability level of membership both indicate a poor fit. The classification of this sample data according to the ranges discussed earlier is as an outlier.

Fig. 2. (a) Explained variance of model 1. (b) Explained variance of model 2.

Fig. 3. Scores of factor 1 vs. scores of factor 2 of model 2 (* is a new process data point, h is the Mahalanobis distance, and p(.) is the probability of membership in the calibration set).

As it turns out, this was a case of the manufacturing supervisor changing a setpoint to a value outside the calibration range of the model. This change was done to extend the life of one of the pumps and is outside of routine operations. The system flagged this shift and brought it to the attention of the operators. Identification of the troublesome sensor(s) was done by visual inspection of the residual plots of the sample data. The correct control actions to take in this case were a scheduling of the pump for replacement and an adjustment of the manipulated variables to maintain demand and correct product quality. In the future an on-line expert system will assist in providing open-loop corrective action.


VII. SUMMARY

Data is gathered in many chemical processes at a very high rate. Unfortunately, much of that data is not often used unless a major problem has occurred. This paper presents a technique in which data can first be analyzed to determine what is normal variability in the process, and a model or models developed which define that variability in a compact way. The information in the models is then used to measure the state of the process as to whether it represents a likely sample from the model of normal variations. If it is not, information about why the data is not of the form expected is made available to the operators in a readily understandable manner.

This technique was researched and implemented on the second stage of a multistage process at a Du Pont plant. The approach taken is to first separate out the major variation effect, which is normal and a major contributor to the total variability of the data. On the residuals of that model, a PCA model is developed that determines if new data fit the pattern of normal variations. If not, then a determination is made of what went wrong, why the data is different, and what changes are needed to get the operation into the acceptable region.

ACKNOWLEDGMENT

The authors would like to thank Harald Martens and his crew at Consensus Analysis, and the team of R. O. Caullwine, J. R. Tate, B. W. Wells, and M. A. Forte for their tireless patience and input into the development, design, and commissioning of this multivariate analyzer.

REFERENCES

[1] J. M. Lucas, "Exponentially weighted moving average control schemes: Properties and enhancements," Technometrics, vol. 32, no. 1, pp. 1-12, 1990.
[2] K. A. Kosanovich and M. J. Piovoso, "Process data analysis using multivariate statistical methods," presented at the 1991 American Control Conference, Boston, MA, 1991.
[3] J. S. Hunter, "The exponentially weighted moving average," J. Quality Tech., vol. 18, no. 4, pp. 203-210, 1986.
[4] J. F. MacGregor, T. E. Marlin, J. Kresta, and B. Skagerberg, "Multivariate statistical methods in process analysis and control," presented at the Fourth International Conf. on Chemical Process Control, South Padre Island, TX, Feb. 17-22, 1991.
[5] B. M. Wise, L. Ricker, and D. J. Veltkamp, "Upset and sensor failure detection in multivariate processes," presented at the 1989 AIChE Annual Meeting, 1989.
[6] J. Kresta, J. F. MacGregor, and T. E. Marlin, "Multivariate statistical monitoring of process operating performance," personal communication.
[7] M. Kramer and B. L. Palowitch, "Expert system and knowledge-based approaches to process malfunction diagnosis," presented at the 1985 AIChE National Meeting, Chicago, IL, 1985.
[8] Y. Ikuta, K. Hamanaka, R. Funakoshi, and A. Kako, "Operation instruction system," presented at the 1987 AIChE National Meeting, Nov. 1987.
[9] P. J. King and E. H. Mamdani, "The application of fuzzy control systems to industrial processes," Automatica, vol. 13, pp. 235-242, 1977.
[10] R. M. Tong, M. B. Beck, and A. Latten, "Fuzzy control of the activated sludge wastewater treatment process," Automatica, vol. 16, pp. 659-701, 1980.
[11] K. Watanabe, I. Matsuura, M. Abe, M. Kubota, and D. M. Himmelblau, "Incipient fault diagnosis of chemical processes via artificial neural networks," AIChE J., vol. 35, no. 11, pp. 1803-1812, 1989.
[12] Y. H. Pao, Adaptive Pattern Recognition and Neural Networks. New York: Addison-Wesley, 1990.
[13] H. Martens and T. Naes, Multivariate Calibration. New York: John Wiley & Sons, 1989.
[14] K. R. Beebe and B. R. Kowalski, "An introduction to multivariate calibration and analysis," Analytical Chemistry, vol. 59, no. 17, pp. 1007-1017, 1987.
[15] E. V. Thomas and D. M. Haaland, "Comparison of multivariate calibration methods for quantitative spectral analysis," Anal. Chem., vol. 62, no. 10, pp. 1091-1099, 1990.
[16] K. Pearson, "On lines and planes of closest fit to systems of points in space," Philosophical Magazine, vol. 2, 1901.
[17] R. D. Cook and S. Weisberg, Residuals and Influence in Regression. London: Chapman & Hall, 1982.
[18] S. Wold and M. Sjöström, in Chemometrics: Theory and Application, B. R. Kowalski, Ed. Washington, DC: Amer. Chem. Soc., 1977.
[19] A. Höskuldsson, "PLS regression methods," J. Chemometrics, vol. 2, 1988.
[20] N. K. Shah and P. J. Gemperline, "Combination of the Mahalanobis distance and residual variance pattern recognition techniques for classification of near-infrared reflectance spectra," Anal. Chem., vol. 62, no. 5, pp. 465-470, 1990.