Fault Diagnosis in Finite-State Automata and Timed Discrete-Event Systems

S. Hashtrudi Zad, R.H. Kwong and W.M. Wonham
Department of Electrical & Computer Eng., University of Toronto, Toronto, ON, Canada M5S 3G4
hashtrud, kwong, [email protected]

Abstract

In this paper, we propose a framework for fault diagnosis in discrete-event systems. In this approach, the system and the diagnoser (the fault detection system) do not have to be initialized at the same time. Furthermore, no information about the state or even the condition (failure status) of the system before the initiation of diagnosis is required. First, a state-based approach for on-line passive fault diagnosis in finite-state automata is presented. The design of the fault detection system has, in the worst case, exponential time complexity. A model reduction scheme with polynomial time complexity is introduced to reduce the computational complexity of the design. Next we consider the use of timing information to improve the accuracy of diagnosis. Instead of directly extending our framework to timed discrete-event systems, we take an alternative approach which leads to a significant reduction in on-line computing power requirements and, in many cases, in the size of the diagnoser, at the expense of more off-line design calculations. We also discuss the issue of diagnosability of failures in our framework.

1 Introduction

Fault detection systems are of paramount importance in aerospace, manufacturing and process industries. This is due to the crucial role they play in protecting life and property, and in increasing operational time and productivity. Solving diagnostic problems for complex systems is a complicated task requiring a reliable, systematic approach. As a result, fault diagnosis has been the subject of extensive research (see, e.g., [15], [11]).

* This work was supported in part by AlliedSignal Aerospace Canada.

In this paper, we examine fault diagnosis in systems that, for diagnostic purposes, can be modelled as discrete-event systems (DES). For this problem, different methodologies have been proposed in the literature. By far, fault-tree analysis [9] seems to be the most popular approach. Fault trees can be synthesized automatically. They provide a pictorial display of the system which can be easily read and understood. However, they have certain limitations: it is difficult to incorporate information about ordering and timing of events in a fault tree [11]; moreover, problems arise in the analysis of systems that go through more than one phase of operation [10]; also, there seems to be no way of treating common-cause failures resulting from fault propagation (domino effects) using fault trees [10]. Other approaches to fault detection proposed by researchers include methods based on artificial intelligence [5] and, for manufacturing processes, templates (see [14] and references therein), event signatures [2] and Petri nets [19].

Finite-state automata have also been used in the study of diagnostic problems. In [12], a state-based approach is utilized for the study of off-line diagnosis. Here, the state of the system is assumed fixed during diagnostic tests, and output measurements are used for failure detection and isolation. The concept of testability is also introduced and studied. An active on-line diagnosis problem has been studied in [12]. In this case, input test sequences for fault diagnosis are computed. Furthermore, the notion of on-line diagnosability is introduced. In [16], [17] passive on-line diagnosis is studied using an event-based framework: based on observed events, inference is made about the occurrence of (unobservable) failure events. Two different notions of diagnosability are defined and examined. This framework has been extended to utilize information about the timing of events [3]. Integrated approaches to fault detection and supervisory control have also been studied in an event-based framework in [4] and [18].

In this paper, we first present a state-based approach for passive on-line diagnosis of finite-state automata. Engineers have found that the use of finite-state automata and the corresponding state transition diagrams makes the design and maintenance of complex control and diagnostic systems easier (see, e.g., Ch. 14 of [11]). It is also straightforward to capture the ordering of events using finite-state automata. There are similarities between our work and the event-based approach of [16]. However, our framework is simpler and the diagnoser is easier to compute. Moreover, we do not assume that the plant and the diagnoser are initialized at the same time. Knowledge of the state and even the condition of the plant (normal/faulty) at the time the diagnoser is initialized is not required. It can be shown [6] that any problem in an event-based framework can be recast as a problem in the state-based framework presented here. We also introduce a model reduction scheme with polynomial time complexity to reduce the computational complexity of designing the diagnoser, which has exponential time complexity in the worst case. To our knowledge, the use of model reduction in the design of the diagnoser has not been studied previously in the DES literature.

Next we discuss the extension of our framework for incorporating timing information. In this case, it is assumed that the system is modelled as a timed discrete-event system (TDES). First introduced in [1] in the study of supervisory control, timed discrete-event systems can be used for capturing timing information in a useful range of problems in control engineering in a reasonably simple fashion. For the purpose of fault diagnosis in TDES, it is possible to adopt our diagnosis theory for finite-state automata simply by treating the clock tick as an extra output signal. In this work, however, we have taken an alternative approach in which the process of updating the estimate of the system's condition is performed only when a new output symbol is generated. The update process uses the generated output symbols and the number of clock ticks between them. No update at clock ticks is required in this method. This results in a significant reduction in on-line computing power requirements and, in many cases, in the size of the diagnoser, at the expense of extra off-line design calculations.

We note that testing of finite-state machines [8] is related to fault diagnosis. However, the framework used for that purpose is different: the finite-state machines are usually assumed to be deterministic with a fixed condition (failure status); also it is assumed that transitions can always be observed even if they do not result in a change in output. These assumptions often do not hold in fault diagnosis of control systems.

In this paper, we shall present an overview of our framework and illustrate it using examples. Some of the technical details can be found in [6]. A complete account is given in [7]. An outline of the paper is as follows. In section 2 we study failure modelling, introduce our framework for the design of the fault diagnosis system and discuss failure diagnosability in discrete-event systems. We propose an extension of our approach for diagnosis in timed discrete-event systems in section 3. Section 4 presents the conclusion.

2 Fault Diagnosis in Finite-State Automata

In this section, we study fault diagnosis in finite-state automata.

2.1 Plant Model

We assume that the plant under control, i.e., the plant along with low-level continuous controllers and DES supervisors, can be modelled as a finite-state Moore automaton $G = (X, \Sigma, \delta, x_0, Y, \lambda)$, where $X$, $\Sigma$ and $Y$ are the finite state, event and output sets; $x_0$ is the initial state, $\delta : X \times \Sigma \to 2^X$ the transition function and $\lambda : X \to Y$ the output map ($2^X$ denotes the power set of $X$). The model describes the behaviour of the system in both normal (system functioning properly) and faulty situations. Suppose there are $p$ failure modes $F_1, \ldots, F_p$. Each failure mode corresponds to some kind of failure in an instrument (valve, sensor, etc.). The event set $\Sigma$ includes failure events. In this paper, for brevity, we assume that at most one failure mode may occur at a time. Simultaneous occurrence of two (or more) failure modes is discussed in [7].

Let $K := \{N, F_1, \ldots, F_p\}$ denote the condition set of the system. It is assumed that the state set $X$ can be partitioned according to the condition of the system: $X = X_N \mathbin{\dot{\cup}} X_{F_1} \mathbin{\dot{\cup}} \cdots \mathbin{\dot{\cup}} X_{F_p}$ ($\dot{\cup}$ denotes disjoint union). Define $\kappa : X \to K$ such that for every $x \in X$, $\kappa(x)$ is the condition of the system at the state $x$: $\kappa(x) = N$ if $x \in X_N$, and $\kappa(x) = F_i$ if $x \in X_{F_i}$ ($i \in \{1, \ldots, p\}$). Also (abusing notation) extend the definition of $\kappa$ to the subsets of $X$: $\kappa(z) = \bigcup \{\kappa(x) \mid x \in z\}$, for any $z \subseteq X$.

In failure detection and isolation, given the output sequence $(y_1, y_2, y_3, \ldots)$, we want to find the condition of the system. In general, some of the events in $\Sigma$ are observable; it is assumed that information about the occurrence of these events has been transferred and included in the output map. Also note that only changes in the output are assumed to be observable. This means that a transition of the system from one state to another state having the same output will not be noticed, i.e., the transition will be unobservable. So in the output sequence: $y_i \neq y_{i+1}$ for $i \geq 1$.
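The two conventions above (only output changes are observed, and the condition map extends to sets of states) can be illustrated with a minimal Python sketch. The four-state plant, the dictionaries `lam` and `kappa`, and the function names are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
from itertools import groupby


def observed_outputs(path, lam):
    """Collapse a state path into the output sequence an observer would see.

    Consecutive states with the same output produce a single symbol,
    so the result satisfies y_i != y_{i+1}.
    """
    return [y for y, _ in groupby(lam[x] for x in path)]


def condition_of(z, kappa):
    """Extend the condition map to a set of states: the set of all kappa(x), x in z."""
    return {kappa[x] for x in z}


# Hypothetical plant: states 1 and 2 are normal, states 3 and 4 are faulty.
lam = {1: "low", 2: "low", 3: "low", 4: "high"}
kappa = {1: "N", 2: "N", 3: "F", 4: "F"}

# The transitions 1 -> 2 -> 3 do not change the output, so they are unobservable.
outputs = observed_outputs([1, 2, 3, 4], lam)
estimate = condition_of({2, 3}, kappa)
```

In this toy case the failure (the move from state 2 to state 3) is invisible until the output changes at state 4, which is the situation the diagnoser has to cope with.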

Example 1 - Heating System

A heating system uses a heater, a temperature sensor and an ON/OFF controller to regulate the temperature of a room about a set-point. The DES model is shown in Fig. 1. In this figure, each dashed arc represents a heater-failure event. "Load" models the effect of disturbances such as the temperature of the adjoining room and the ambient temperature, and is supposed to have two states, "normal (n)" and "above normal (a)". It is also assumed that even when the load is above normal, the heater can keep the temperature close to the set-point. In the heating system: $X = \{1, 2, \ldots, 24\}$, $Y = \{\mathrm{ld}, \mathrm{le}, \mathrm{bd}, \mathrm{be}, \mathrm{ad}, \mathrm{ae}\}$, $K = \{N, F\}$, $\kappa(i) = N$ for $1 \leq i \leq 12$, and $\kappa(i) = F$ for $13 \leq i \leq 24$. The output symbols are explained below:

    ld   temperature low, heater OFF (disabled)
    le   temperature low, heater ON (enabled)
    bd   temp. below set-point, heater OFF (disabled)
    be   temp. below set-point, heater ON (enabled)
    ad   temp. above set-point, heater OFF (disabled)
    ae   temp. above set-point, heater ON (enabled)

Note that the output symbol may contain information about the commands issued by the DES supervisor (in this case, the ON/OFF controller), in addition to sensor readings. □

Figure 1: Example 1. DES model of the heating system. (States 1 to 24 are arranged by condition N/F, heater status d/e, temperature l/b/a and load n/a.)

2.2 Diagnoser Construction and Diagnosability

Before studying diagnoser design, we introduce the concept of output-adjacency.

Definition 1 For any two states $x, x' \in X$, we say $x'$ is output-adjacent to $x$ and write $x \Rightarrow x'$ if $\lambda(x) \neq \lambda(x')$ and there exist $l \geq 2$, $x_2, \ldots, x_{l-1}$, $\sigma_1, \ldots, \sigma_{l-1}$ such that $x_{i+1} \in \delta(x_i, \sigma_i)$ and $\lambda(x_i) = \lambda(x)$, for all $1 \leq i \leq l-1$, with $x_1 = x$ and $x_l = x'$. □

A diagnoser is a system that detects and isolates failures. In our framework, it is a finite-state Moore machine that takes the output sequence of the system $(y_1, y_2, \ldots, y_k)$ as input and generates at its output an estimate of the condition of the system at the time that $y_k$ was generated. Specifically, based on the output sequence up to $y_k$, a set $z_k \in 2^X - \{\emptyset\}$ is calculated to which $x$ can belong at the time that $y_k$ was generated. $\kappa(z_k)$ will be the estimate of the system's condition (Fig. 2). Upon observing $y_{k+1}$, $z_k$ will be updated to $z_{k+1}$. Therefore $z_{k+1}$ (for $k \geq 1$), as depicted in Fig. 3, will be the set of states having output $y_{k+1}$ that are reachable from the states in $z_k$ using paths along which the output is $y_k$.

Figure 2: System and diagnoser. (The plant and controller (DES) send the output sequence $y_1 y_2 \ldots$ to the diagnoser (DES), which produces a sequence of estimates of the system's condition.)

We denote the initial state estimate by $z_0$. It contains the information available about the state of the system at the time that the diagnoser is initialized, before the reading of sensors begins. Usually, $z_0 = X$, because the diagnoser may be initialized at any time while the system is in operation and in this situation the state of the system is not known exactly. If the system and the diagnoser are initialized at the same time, then $z_0 = \{x_0\}$. If the system is only known to be normal at the time that the diagnoser is initialized, then $z_0 = X_N$. $z_1 = z_0 \cap \lambda^{-1}(\{y_1\})$ is the first estimate of the system's state immediately after observing the first output symbol $y_1$. For $k \geq 1$,

$$z_{k+1} = \zeta(z_k, y_{k+1}) = \{x \mid \lambda(x) = y_{k+1} \ \&\ (\exists x' \in z_k : x' \Rightarrow x)\}.$$

The transition map of the diagnoser is denoted by $\zeta$.

Figure 3: Computing $z_{k+1}$ from $z_k$. (States with output $y_{k+1}$ reached from $z_k$ along paths with output $y_k$.)

Now we introduce the reachability transition system (RTS). The RTS will be useful in the computation of the diagnoser, in on-line implementation of diagnostic computations and in model reduction. Suppose $z^1, z^2 \in Z$ (the state set of the diagnoser) and $\zeta(z^1, y) = z^2$ for some $y \in Y$. Given $z^1$ and in order to compute $z^2$, we have to find for every $x_1 \in z^1$, all $x_2$ such that $\lambda(x_2) = y$ and $x_1 \Rightarrow x_2$. Since every $x \in X$ typically belongs to several states of the diagnoser, it is computationally economical to compute the set of output-adjacent states of every $x \in X$. This can be done in $O(|X|^2 + |X||T|)$ time because a breadth-first search reachability analysis for each $x \in X$ can be done in $O(|X| + |T|)$ time. Here $|X|$ and $|T|$ are the cardinalities of $X$ and $T$ (the set of transitions of $G$). Having the sets of output-adjacent states, we can construct the reachability transition system $\tilde{G} = (X, R, Y, \lambda)$, which has $X$, $Y$ and $\lambda$ as the state set, output set and output map. $R \subseteq X \times X$ is a binary relation and $(x_1, x_2) \in R$ if and only if $x_1 \Rightarrow x_2$. As mentioned above, $\tilde{G}$ can be computed in $O(|X|^2 + |X||T|)$ time. With $\tilde{G}$ available, one can compute the diagnoser.
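The breadth-first computation of output-adjacent states and the diagnoser update can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the transition function is assumed to be stored as a nested dictionary `delta[state][event] -> set of states`, and the four-state automaton and all function names are hypothetical.

```python
from collections import deque


def output_adjacent(x, delta, lam):
    """Return every x2 with x => x2 in the sense of Definition 1.

    The search only expands states that share the output of x; a state with
    a different output is recorded as output-adjacent and not expanded.
    """
    seen = {x}
    queue = deque([x])
    adjacent = set()
    while queue:
        current = queue.popleft()
        for targets in delta.get(current, {}).values():
            for nxt in targets:
                if lam[nxt] != lam[x]:
                    adjacent.add(nxt)
                elif nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
    return adjacent


def build_rts(states, delta, lam):
    """Relation R of the reachability transition system, as a successor map."""
    return {x: output_adjacent(x, delta, lam) for x in states}


def zeta(z, y, rts, lam):
    """Diagnoser update: states with output y that are output-adjacent to z."""
    return {x2 for x1 in z for x2 in rts[x1] if lam[x2] == y}


# Hypothetical automaton: the move 1 -> 2 is silent, the move 2 -> 3 changes the output.
lam = {1: "a", 2: "a", 3: "b", 4: "b"}
delta = {1: {"u": {2}}, 2: {"f": {3}, "v": {1}}, 3: {"w": {4}}, 4: {}}

rts = build_rts(lam.keys(), delta, lam)
z1 = {x for x in lam if lam[x] == "a"}   # z_0 = X, first observed output "a"
z2 = zeta(z1, "b", rts, lam)
```

One search per state visits each state and transition at most once, which is where the per-state bound of $O(|X| + |T|)$ comes from.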

Example 1 - Heating System (Cont'd)

The RTS (in the form of a table) and the diagnoser for the heating system are given in Table 1 and Fig. 4 (assuming $z_0 = X$). In this example, heater failure can be detected using output observations unless it occurs while the temperature is low. For example, suppose that the diagnoser is initialized when the system is at the state 10. The output at this state is `ad'. Therefore $z_1 = \{9, 10, 21, 22\}$ and $\kappa(z_1) = \{N, F\}$. Now assume that the state $x$ moves to 12, and then to 6; at this point a heater failure occurs following which the state moves to 18 and finally to 16. While the system evolves along this path, the output sequence `bd, be, le' will be generated which will take the diagnoser to the state $z_4 = \{15, 16\}$. $\kappa(z_4) = \{F\}$; thus the diagnoser eventually detects the failure. To see why heater failure cannot be detected while the temperature is low, suppose the system is at the state 3 when the diagnoser is started. At this state the output is `le'; thus $z_1 = \{3, 4, 15, 16\}$ and $\kappa(z_1) = \{N, F\}$. A heater failure at this point takes the state $x$ to 15. No new output symbol will be generated following the failure or afterwards. Therefore, $\{N, F\}$ will remain as the estimate of the system's condition. As a result, the failure will not be detected with certainty. □

    State  Output-adjacent states (output)    State  Output-adjacent states (output)
    1      3, 4, 15, 16 (le)                  13     15, 16 (le)
    2      3, 4, 15, 16 (le)                  14     15, 16 (le)
    3      5, 6 (be)                          15     (none)
    4      5, 6 (be)                          16     (none)
    5      8 (ae) / 15, 16 (le)               17     15, 16 (le)
    6      8 (ae) / 15, 16 (le)               18     15, 16 (le)
    7      9, 10, 21, 22 (ad)                 19     21, 22 (ad)
    8      9, 10, 21, 22 (ad)                 20     21, 22 (ad)
    9      11, 12, 23, 24 (bd)                21     23, 24 (bd)
    10     11, 12, 23, 24 (bd)                22     23, 24 (bd)
    11     5, 6, 17, 18 (be)                  23     17, 18 (be)
    12     5, 6, 17, 18 (be)                  24     17, 18 (be)

Table 1: Example 1. Reachability transition system.

Figure 4: Example 1. Diagnoser. (Each node shows the observed output $y_k$, the state estimate $z_k$ and the condition estimate $\kappa(z_k)$; the initial node is $z_0 = X$ with estimate N,F. Nodes: le {15,16} F; be {5,6,17,18} N,F; bd {11,12,23,24} N,F; ae {7,8,19,20} N,F; ld {1,2,13,14} N,F; le {3,4,15,16} N,F; be {5,6} N; ae {7,8} N; ad {9,10,21,22} N,F.)

In the example, after the occurrence of failure, the heating system does not return to the normal condition, i.e., the failure is permanent or hard. In control, we are usually interested in permanent failures. In the remainder of this subsection, we assume the failure modes are all permanent; non-permanent failures are discussed in [7].

Consider a permanent failure mode $F_i$. A state of the diagnoser $z$ is called $F_i$-certain if $\kappa(z) = \{F_i\}$, i.e., $z \subseteq X_{F_i}$. If the diagnoser enters an $F_i$-certain state, it means that $F_i$ has been detected and isolated. If $F_i \in \kappa(z)$ but $\kappa(z) \neq \{F_i\}$, then $z$ is called $F_i$-uncertain. If for some $k_0 \geq 0$ and a failure $F_i$, $\kappa(z_{k_0}) = \{F_i\}$, then $\kappa(z_k) = \{F_i\}$ for all $k \geq k_0$ (because $X_N \cup (\bigcup_{j \neq i} X_{F_j})$ is not reachable from $X_{F_i}$).

Definition 2 A permanent failure mode $F_i$ is diagnosable if there exists an integer $N \geq 0$ such that following both the occurrence of the failure and initialization of the diagnoser, $F_i$ can be detected and isolated (i.e., the diagnoser reaches an $F_i$-certain state) after the occurrence of at most $N$ events in the system. □

In the above definition, no assumption is made about the system's condition (normal/faulty) at the time that the diagnoser is initialized, i.e., the diagnoser might have been initialized either before or after the occurrence of the failure. According to Def. 2, the heater failure in Example 1 is not diagnosable. Necessary and sufficient conditions for diagnosability of permanent failures are obtained in [7]. Intuitively, a failure would not be diagnosable if after it occurs, no new output symbols are generated. Also a failure mode $F_i$ would be undiagnosable if in this mode, the system can generate a periodic output sequence that throws the diagnoser into a cycle of $F_i$-uncertain states.

Since the number of states of the diagnoser is in the worst case exponential in $|X|$, the whole diagnoser may occupy a large space in computer memory. In this case, it might be better to store the reachability transition system in memory and use it to perform the diagnostic computations on-line: having $z_k$ and the observation $y_{k+1}$, use $\tilde{G}$ to compute $z_{k+1}$ and update the estimate of the system's condition. $\tilde{G}$ contains $|X|$ states and at most $|X|(|X| - 1)$ transitions.

There are similarities between our framework and that of [16], especially in the use of an observer for diagnosis. However, while we try to determine the condition of the system for fault detection, in [16] the authors attempt to detect the unobservable failure events. This leads to some differences.

(i) In our approach, the state and even the condition of the system do not have to be known at the time that the diagnoser is started. The diagnoser can be initialized at any time while the system is in operation, not necessarily when the system is started. If, say, a failure occurs before the diagnoser is initialized, then our diagnoser can eventually detect the faulty condition and isolate the failure (assuming the failure is diagnosable). However, an event-based diagnoser cannot detect the failure because the failure event has already happened when the diagnoser is initialized.

(ii) Our approach is simpler and in this framework, computation of the diagnoser is less complex. This is because at each step, after observing a new output symbol, we only have to update our estimate of the system's state $z_k$. In the event-based approach of [16], after the occurrence of an observable event, in addition to updating the state estimate, all of the paths that the state of the system might have evolved along since the occurrence of the previous observable event have to be checked for the occurrence of failure events.

We note that a diagnosis problem in an event-based framework can always be transformed to an equivalent problem in the state-based framework presented in this paper [6].
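The on-line use of the RTS described above can be tried on the heating system of Example 1. The following Python sketch is an illustration only: the successor sets are transcribed from Table 1 as printed, the output and condition maps follow the description of Example 1, and the helper names (`zeta`, `run_diagnoser`) are hypothetical.

```python
OUTPUTS = ["ld", "le", "be", "ae", "ad", "bd"]
STATES = range(1, 25)

# States 2i-1 and 2i share an output; states 1..12 are normal, 13..24 faulty.
lam = {x: OUTPUTS[((x - 1) // 2) % 6] for x in STATES}
kappa = {x: "N" if x <= 12 else "F" for x in STATES}

# Output-adjacent states from Table 1; each entry holds for states k and k+1.
PAIR_SUCCESSORS = {
    1: {3, 4, 15, 16}, 3: {5, 6}, 5: {8, 15, 16},
    7: {9, 10, 21, 22}, 9: {11, 12, 23, 24}, 11: {5, 6, 17, 18},
    13: {15, 16}, 15: set(), 17: {15, 16},
    19: {21, 22}, 21: {23, 24}, 23: {17, 18},
}
rts = {}
for odd, successors in PAIR_SUCCESSORS.items():
    rts[odd] = rts[odd + 1] = successors


def zeta(z, y):
    """One diagnoser step: output-adjacent successors of z that have output y."""
    return {x2 for x1 in z for x2 in rts[x1] if lam[x2] == y}


def run_diagnoser(z0, outputs):
    """Return the list of (state estimate, condition estimate) pairs."""
    z = {x for x in z0 if lam[x] == outputs[0]}
    estimates = [(z, {kappa[x] for x in z})]
    for y in outputs[1:]:
        z = zeta(z, y)
        estimates.append((z, {kappa[x] for x in z}))
    return estimates


# Diagnoser started at state 10; the plant then follows 12, 6, 18, 16.
detected = run_diagnoser(set(STATES), ["ad", "bd", "be", "le"])
# Diagnoser started at state 3; the failure produces no further output.
undetected = run_diagnoser(set(STATES), ["le"])
```

Only the 24 successor sets are stored here, rather than the diagnoser itself, which is the memory trade-off discussed above.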

2.3 Model Reduction

The computational complexity of designing the diagnoser in the worst case is exponential in the number of system states. To mitigate this, in this section we introduce a model reduction scheme with polynomial time complexity. The equivalence relation used for model reduction is based on the solution of the relational coarsest partition (RCP) problem for the reachability transition system. We will show that the diagnoser built using the reduced RTS will be equivalent to the original diagnoser in the following sense.

Definition 3 Two diagnosers for a system are equivalent if for any given output sequence, they produce identical sequences of estimates for the system's condition. □

Consider the RTS $\tilde{G} = (X, R, Y, \lambda)$. For every $x_1, x_2 \in X$, let $x_2 \in R(x_1)$ iff $(x_1, x_2) \in R$, i.e., $x_1 \Rightarrow x_2$. Let $\pi = \{B_1, \ldots, B_{|\pi|}\}$ be a partition of $X$, with $B_i$ denoting the blocks of $\pi$. Then $\pi$ is compatible with $R$ iff whenever $x$ and $x'$ are in the same block $B_i$, then for any block $B_j$, $R(x) \cap B_j \neq \emptyset$ iff $R(x') \cap B_j \neq \emptyset$. Let $\Pi$ be the set of partitions compatible with $R$.

Suppose $\pi$ is compatible with $R$ and $\pi \leq \ker\lambda \wedge \ker\kappa$, where $\ker$ refers to the equivalence kernel of the corresponding map and $\wedge$ denotes the meet operation in the lattice of equivalence relations. If two states $x$ and $x'$ are in the same block of $\pi$, and hence of $\ker\lambda \wedge \ker\kappa$, then the system has the same output and condition at these states. Moreover, since $\pi$ is compatible with $R$, the set of output sequences generated by the system, starting at either of these states, will be identical. Also, starting at any of these states and for any feasible output sequence generated, the sequence of condition estimates will be the same. This shows that the three statements $x \in z$, $x' \in z$ and $x, x' \in z$ contain the same information about the present and future estimates of the system's condition. Hence, for the purpose of estimating the system's condition, $x$ and $x'$ are equivalent.

Let $P : X \to X/\pi$ be the canonical projection. For every $x \in X$, $[x] := Px$ denotes the block $x$ belongs to. Also for simplicity, instead of $x \in P^{-1}\bar{x}_1$, we write $x \in \bar{x}_1$. We define the reduced RTS $\bar{G} = (\bar{X}, \bar{R}, Y, \bar{\lambda})$ (corresponding to $\pi$) according to: (i) $\bar{X} = X/\pi$; (ii) for all $\bar{x}_1, \bar{x}_2 \in \bar{X}$:

$$(\bar{x}_1, \bar{x}_2) \in \bar{R} \iff (\forall x \in \bar{x}_1 \ \exists x' \in \bar{x}_2 : (x, x') \in R);$$

and (iii) for all $\bar{x} \in \bar{X}$: $\bar{\lambda}(\bar{x}) = \lambda(x)$ for any $x \in \bar{x}$. Similarly we define $\bar{\kappa} : \bar{X} \to K$ according to $\bar{\kappa}(\bar{x}) = \kappa(x)$ for any $x \in \bar{x}$. Since $\pi \leq \ker\lambda \wedge \ker\kappa$, $\bar{\lambda}$ and $\bar{\kappa}$ are well-defined. We define the canonical projection of a subset $z \subseteq X$ to be $P(z) := \{[x] \mid x \in z\}$. We refer to the diagnoser designed based on the reduced RTS $\bar{G}$, with $\bar{z}_0 := P z_0$ as its initial state estimate, as the high-level diagnoser (corresponding to $\pi$) and call it $\bar{D}$.

Theorem [7] The original diagnoser and the high-level diagnoser are equivalent. □

Let $\bar{z}_k$ and $\bar{\zeta}$ denote the state and transition function of $\bar{D}$.

Figure 5: $P z_k = \bar{z}_k$. (Commutative diagram: $\zeta(\cdot, y_{k+1})$ maps $z_k$ to $z_{k+1}$ and $\bar{\zeta}(\cdot, y_{k+1})$ maps $P z_k$ to $P z_{k+1}$.)

Remark 1 It can be shown [7] that $P z_k = \bar{z}_k$ (Fig. 5). Therefore since $P$ is onto, the number of states of $\bar{D}$ is less than or equal to that of $D$.

Remark 2 Obviously the coarser the partition $\pi$ is, the fewer states the reduced RTS will have. The set $\{\pi \in \Pi,\ \pi \leq \ker\lambda \wedge \ker\kappa\}$ has a unique supremal element which is the coarsest partition compatible with $R$ and finer than $\ker\lambda \wedge \ker\kappa$. The problem of computing this supremal element is called the relational coarsest partition (RCP) problem. The efficient algorithm of [13] solves the RCP problem in $O(|R| \log |X|)$ time ($|R|$ is the cardinality of the set $R$). This gives an $O(|X|^2 \log |X|)$ time complexity for model reduction since $|R| \leq |X|(|X| - 1)$.

Remark 3 Sometimes the output map $\lambda$ provides redundant information, i.e., fault diagnosis can be accomplished using a coarser output map. The use of a coarser output map, in general, results in further aggregation in model reduction, hence fewer states in the reduced RTS.

    State  Output-adjacent states (output)    State  Output-adjacent states (output)
    1'     2', 8' (le)                        7'     8' (le)
    2'     3' (be)                            8'     (none)
    3'     4' (ae) / 8' (le)                  9'     8' (le)
    4'     5', 11' (ad)                       10'    11' (ad)
    5'     6', 12' (bd)                       11'    12' (bd)
    6'     3', 9' (be)                        12'    9' (be)

Table 2: Example 1. Reduced reachability transition system.

Remark 4 The high-level diagnoser produces an accurate estimate of the system's condition, with an accuracy proportionate to the amount of information available in the reduced RTS, iff the relation $P z_k = \bar{z}_k$ holds. It can be shown [7] that $P z_k = \bar{z}_k$ holds for a partition $\pi \leq \ker\lambda \wedge \ker\kappa$ if $\pi$ is compatible with $R$. If $\pi$ is not compatible with $R$, then the high-level state estimates will become either conservative, i.e., $P z_k \subseteq \bar{z}_k$, or risk-accepting, i.e., $\bar{z}_k \subseteq P z_k$ [7]. Thus the use of compatible partitions is necessary for guaranteeing $P z_k = \bar{z}_k$ and in this case, the coarsest partition compatible with $R$ and finer than $\ker\lambda \wedge \ker\kappa$ will be the optimal partition for model reduction, i.e., it will give the reduced RTS with the minimum number of states.

Example 1 - Heating System (Cont'd)

Using Table 1, the reader can verify that the partition $\ker\lambda \wedge \ker\kappa = \{\{1, 2\}, \{3, 4\}, \{5, 6\}, \ldots, \{23, 24\}\}$ is compatible with the transition relation $R$ of the RTS. Hence it is the solution to the RCP problem for this example. The reduced RTS is given in Table 2, where each pair of states $2i - 1$ and $2i$ ($1 \leq i \leq 12$) in the original RTS is replaced with $i'$ in the reduced RTS. Therefore the number of states of the RTS has been reduced to half. The high-level diagnoser is identical to the original one (after replacing each pair of $2i - 1$ and $2i$ with $i'$).

The fact that the states $2i$ and $2i - 1$ are equivalent and can be replaced by the single state $i'$ shows that incorporating the model of the load has not added any useful information to the heating system's model for the purpose of fault diagnosis, and therefore the load model can be removed. This is an interesting point and is useful for decentralized fault diagnosis. Think of the load as subsystem 1 and of the rest of the heating system as subsystem 2. Our model reduction scheme has shown that in the absence of sensor readings from subsystem 1 (which is typical in a decentralized fault detection scheme), the model of subsystem 2 (the local model) has the same amount of information for fault diagnosis as the combined model of the subsystems (the centralized model). Thus the design of the diagnoser for subsystem 2 can be done based on the local model only. □
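The compatibility claim for the heating system can also be checked mechanically. The sketch below uses a naive signature-refinement loop, which is a simple stand-in and not the $O(|R| \log |X|)$ algorithm of [13]: it starts from the blocks of equal (output, condition) and splits blocks until all states in a block reach the same set of blocks. The successor sets are transcribed from Table 1 as printed; the function name and data layout are hypothetical.

```python
def coarsest_compatible_partition(states, rts, initial_key):
    """Coarsest partition that refines initial_key and is compatible with R."""
    states = list(states)
    ids, block = {}, {}
    for x in states:
        block[x] = ids.setdefault(initial_key(x), len(ids))
    while True:
        ids, refined = {}, {}
        for x in states:
            # A state's signature: its own block and the set of blocks it can reach.
            signature = (block[x], frozenset(block[s] for s in rts[x]))
            refined[x] = ids.setdefault(signature, len(ids))
        if len(set(refined.values())) == len(set(block.values())):
            break  # no block was split, so the partition is compatible with R
        block = refined
    blocks = {}
    for x in states:
        blocks.setdefault(block[x], set()).add(x)
    return sorted(sorted(b) for b in blocks.values())


OUTPUTS = ["ld", "le", "be", "ae", "ad", "bd"]
STATES = range(1, 25)
lam = {x: OUTPUTS[((x - 1) // 2) % 6] for x in STATES}
kappa = {x: "N" if x <= 12 else "F" for x in STATES}

# Output-adjacent states from Table 1; each entry holds for states k and k+1.
PAIR_SUCCESSORS = {
    1: {3, 4, 15, 16}, 3: {5, 6}, 5: {8, 15, 16},
    7: {9, 10, 21, 22}, 9: {11, 12, 23, 24}, 11: {5, 6, 17, 18},
    13: {15, 16}, 15: set(), 17: {15, 16},
    19: {21, 22}, 21: {23, 24}, 23: {17, 18},
}
rts = {}
for odd, successors in PAIR_SUCCESSORS.items():
    rts[odd] = rts[odd + 1] = successors

blocks = coarsest_compatible_partition(
    STATES, rts, lambda x: (lam[x], kappa[x])
)
```

For this data the loop stops after its first pass without splitting any block, so the twelve pairs $\{2i - 1, 2i\}$ already form a compatible partition, in agreement with the example.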

Figure 6: Example 2. A neutralization process. (Labels: acid inlet valve V1, base inlet valve V2, drain valve V3; tank levels L0, L1, L2, L3; level mark l1.)

3 Fault Diagnosis in Timed Discrete-Event Systems

In section 2, changes in the output sequence resulting from failures were used for fault diagnosis in finite-state Moore automata. In many cases, however, a failure does not always change the output sequence; it only changes the timing of the output sequence. In other cases, as a result of failure, the system stops generating new output symbols (e.g., when the temperature is low in Example 1). In these situations, timing information may help us to detect and isolate failures. The use of timing information increases the accuracy of the fault detection system at the expense of additional computational complexity.

3.1 Plant Model

In this section, we assume that the plant under control can be modelled as a timed discrete-event system [1]. A TDES is a finite-state automaton. One of the events (in the event set of the TDES) is the tick of a global clock. The tick event is assumed to be observable. The TDES model can be used to describe the sequence of events occurring in the system with respect to the ticks of the global clock. The TDES model used in this paper is similar to that in [1] except that our model is nondeterministic and it includes an output and a condition map defined on its state set. For further details, the reader is referred to [7]. Here we explain the model using the following example.

Example 2 - Neutralization Process

In a (simplified) neutralization process (Fig. 6) all valves are closed and the tank is almost empty initially. The process starts by opening valve V1 and filling the reaction tank up to level l1 with the chemical to be treated (here acid). Next, V1 is closed and the neutralizer (base) is added by opening valve V2. When the alkalinity (pH) of the solution reaches the normal range, V2 is closed and the tank contents are drained through valve V3. This completes a cycle of the process. Following this, another cycle will be started. The events and the corresponding time bounds (to be explained later) are given in Table 3.

    Event     Description                                 Time bounds
    $o_i$     valve i open                                [1, 1]
    $c_i$     valve i closed                              [1, 1]
    $L_{ij}$  level change from Li to Lj                  [l', u'] for $L_{23}$; [l, u] otherwise
    a2n       pH change from a (acid) to n (neutral)      [l, u]
    n2a       pH change from n (neutral) to a (acid)      [1, 1]
    W2        wait for 2 clock ticks                      2
    F1        valve V1 stuck open                         [0, ∞)
    F2        valve V1 stuck closed                       [0, ∞)

Table 3: Example 2. Events and their time bounds.

Two failure modes are considered here: valve V1 stuck open ($F_1$) and valve V1 stuck closed ($F_2$). For simplicity, it is assumed that V1 gets stuck open (resp. closed) only when it is open (resp. closed). Sensor measurements are `pH' and `Level', with pH $\in$ {a (acid), n (neutral), b (base)} and Level $\in$ {L0, L1, L2, L3}. The control sequence consists of 8 steps:

    C1: Order V1 open; WAIT UNTIL pH=a OR Level=L1; IF pH=a GO TO C2; ELSE GO TO C3
    C2: WHEN Level=L1 GO TO C4
    C3: WHEN pH=a GO TO C4
    C4: WHEN Level=L2 GO TO C5
    C5: Order V1 closed; Order V2 open; WHEN pH=n GO TO C6
    C6: Order V2 closed; Order V3 open; WHEN Level=L1 GO TO C7
    C7: WHEN Level=L0 GO TO C8
    C8: Order V3 closed; WAIT for 2 clock ticks; GO TO C1
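The eight-step control sequence can be read as a small state machine driven by the sensor readings. The following Python sketch is one hypothetical encoding of it, not code from the paper: the function `next_step`, the `ORDERS` table and the convention that the caller supplies the number of clock ticks spent in the current step are all assumptions made for illustration.

```python
# Valve orders issued on entering a step (steps not listed issue no order).
ORDERS = {
    "C1": ["open V1"],
    "C5": ["close V1", "open V2"],
    "C6": ["close V2", "open V3"],
    "C8": ["close V3"],
}


def next_step(step, ph, level, ticks_in_step):
    """Return the control step that follows `step` for the given readings."""
    if step == "C1":
        # pH is tested first, mirroring "IF pH=a GO TO C2; ELSE GO TO C3".
        if ph == "a":
            return "C2"
        if level == "L1":
            return "C3"
        return "C1"
    guards = {
        "C2": (level == "L1", "C4"),
        "C3": (ph == "a", "C4"),
        "C4": (level == "L2", "C5"),
        "C5": (ph == "n", "C6"),
        "C6": (level == "L1", "C7"),
        "C7": (level == "L0", "C8"),
        "C8": (ticks_in_step >= 2, "C1"),
    }
    fired, target = guards[step]
    return target if fired else step


# A fault-free cycle: (pH, Level, ticks spent in the current step).
readings = [
    ("n", "L0", 0), ("a", "L0", 0), ("a", "L1", 0), ("a", "L2", 0),
    ("n", "L2", 0), ("n", "L1", 0), ("n", "L0", 0), ("n", "L0", 1),
    ("n", "L0", 2),
]
step = "C1"
visited = [step]
for ph, level, ticks in readings:
    step = next_step(step, ph, level, ticks)
    visited.append(step)
```

In this trace the solution turns acid before the level reaches L1, so the controller takes the C2 branch and step C3 is never visited.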

Note that by assumption, the controller does not have a way of knowing whether a valve is open or closed. It is also assumed that in addition to Level and pH, the current step in the control sequence is known to the diagnoser. Therefore $Y \subseteq \{C1, \ldots, C8\} \times \{L0, L1, L2, L3\} \times \{a, n, b\}$. The output and the condition map are given in Table 4. □

The TDES describing the neutralization process is depicted in Fig. 7 (assuming $l = 2$, $u = 3$, $l' = 1$ and $u' = 2$ for the time bounds of the $L_{ij}$ as given in Table 3). It consists of three subautomata, each describing the behaviour of the system in the $N$, $F_1$ or $F_2$ condition. Note that, to avoid cluttering the graph, we have depicted the transitions from the normal mode $N$ to the faulty conditions $F_1$ and $F_2$ (i.e., failure events) only on the subautomata corresponding to the faulty conditions. In Fig. 7, $\tau$ is the tick event. Initially, the system is at $x_0 = 1$ where the controller enables the event $o_1$ (open V1). The lower and upper bounds of $o_1$ are both 1, which means that following its enablement, $o_1$ cannot occur before the first clock tick but if it remains enabled, it will certainly happen before the second clock tick. In general, a time bound $[a, b]$ for an event $\sigma$ means that after $\sigma$ is enabled, it cannot occur before $a$ ticks of the clock but if it remains enabled, it will occur before the $(b + 1)$-th tick (following its enablement). Reverting to the TDES, the event $o_1$ occurs after the first tick but before the second. Following this, the level reaches the L1 range and the solution becomes acid within the time bounds specified in Table 3. Then the level reaches L2 in $l$ (= 2) to $u$ (= 3) clock ticks, and so on. Note that at state 15, when the controller orders V3 closed, it waits two ticks of the clock before starting a new cycle to make sure V3 is closed at the start of the new cycle. While V1 is open, it may fail stuck-open, which ultimately results in the occurrence of $L_{23}$. Also it may fail stuck-closed while it is closed.
In this case, at the beginning of the new cycle when the controller orders V1 open, neither L01 or n2a will happen. A TDES is usually obtained from an activity transition graph (ATG) [1]. ATG models are more convenient for describing discrete-event processes. In this paper, we do not discuss the ATG of the neutralization process for the sake of brevity. The details are given in [7]. For diagnoser design, it is useful to project out the clock ticks 1 from the TDES. We refer to the resulting system which describes the order and duration (in ticks) of occurrence of the non-tick events in the plant as the timed nitestate Moore automaton (corresponding to the TDES model). Let Q ,  ,  , q0 , X , ,  and x0 denote the state set, event set, transition function and initial state of the TDES and the timed FSMA, respectively. Then X

= fx 2 Q j 9 2  ? f g; x0 2 Q :

x

2  (x0 ; )g [ fq0 g;

$^{1}$ In the process of projecting out the $\tau$ transitions, it is assumed that no $\tau$ transition results in a change in the output (because if it does, the information about the output change will be lost after projection). If this is not the case, then, before projection, any transition $x_1 \xrightarrow{\tau} x_2$ with $\lambda(x_1) \neq \lambda(x_2)$ should be replaced with the two transitions $x_1 \xrightarrow{\tau} x_1' \xrightarrow{\sigma_{12}} x_2$ with $\lambda(x_1') = \lambda(x_1)$, where $x_1'$ and $\sigma_{12}$ are a new state and a new event that have to be added to the state and event sets of the TDES.

| States | Output | Condition |
|---|---|---|
| 1, 1.1, 2, 2.1 | (C1,L0,n) | N |
| 3 | (C2,L0,a) | N |
| 4 | (C3,L1,n) | N |
| 5, 5.1, 5.2, 5.3 | (C4,L1,a) | N |
| 6, 6.1, 7, 8, 9, 9.1, 9.2, 9.3 | (C5,L2,a) | N |
| 10, 10.1, 11, 12, 13, 13.1, 13.2, 13.3 | (C6,L2,n) | N |
| 14, 14.1, 14.2, 14.3 | (C7,L1,n) | N |
| 15, 15.1, 16, 16.1 | (C8,L0,n) | N |
| 2.1', 2.2' | (C1,L0,n) | F1 |
| 3' | (C2,L0,a) | F1 |
| 4' | (C3,L1,n) | F1 |
| 5', 5.1', 5.2', 5.3' | (C4,L1,a) | F1 |
| 6', 6.1', 8', 8.1', 8.2' | (C5,L2,a) | F1 |
| 17' | (C5,L3,a) | F1 |
| 7'', 9'', 9.1'', 9.2'', 9.3'' | (C5,L2,a) | F2 |
| 10'', 10.1'', 11'', 12'', 13'', 13.1'', 13.2'', 13.3'' | (C6,L2,n) | F2 |
| 14'', 14.1'', 14.2'', 14.3'' | (C7,L1,n) | F2 |
| 15'', 15.1'', 16'', 16.1'' | (C8,L0,n) | F2 |
| 1'', 1.1'' | (C1,L0,n) | F2 |

Table 4: Example 2. Output and condition map.

Figure 7: Example 2. TDES model.

$$\Sigma = \Sigma_{\mathrm{TDES}} - \{\tau\}, \qquad x_0 = q_0.$$

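Notation. For every $x, x' \in X$ and $\sigma \in \Sigma$, $x \stackrel{\sigma}{\Rightarrow} x'$ iff there exist $l \geq 2$ and $q_1, \ldots, q_l$, with $q_1 = x$ and $q_l = x'$, such that $q_{i+1} \in \delta_{\mathrm{TDES}}(q_i, \tau)$ for $1 \leq i \leq l - 2$ and $q_l \in \delta_{\mathrm{TDES}}(q_{l-1}, \sigma)$. $\Box$

The transition function $\delta : X \times \Sigma \to 2^X$ is defined according to

$$\forall\, x_1, x_2 \in X,\ \sigma \in \Sigma : \quad x_2 \in \delta(x_1, \sigma) \iff x_1 \stackrel{\sigma}{\Rightarrow} x_2.$$

By definition, for any $x \in X$, the output $\lambda(x)$ and condition $\kappa(x)$ in the timed FSMA are the same as those in the TDES. We also define the transition time function $T : X \times \Sigma \times X \to 2^{\mathbb{N}}$ ($\mathbb{N} = \{0, 1, 2, \ldots\}$) as follows: for all $x_1, x_2 \in X$ and $\sigma \in \Sigma$,

$$n \in T(x_1, \sigma, x_2) \iff \begin{cases} \exists\, q_1, \ldots, q_n : q_1 \in \delta_{\mathrm{TDES}}(x_1, \tau)\ \&\ \big(q_{i+1} \in \delta_{\mathrm{TDES}}(q_i, \tau)\ \ \forall\, 1 \leq i \leq n-1\big)\ \&\ x_2 \in \delta_{\mathrm{TDES}}(q_n, \sigma) & \text{if } n \neq 0, \\ x_2 \in \delta_{\mathrm{TDES}}(x_1, \sigma) & \text{if } n = 0. \end{cases}$$

Obviously, if $x_2 \notin \delta(x_1, \sigma)$, then $T(x_1, \sigma, x_2) = \emptyset$. $T(x_1, \sigma, x_2)$ is the set of times (in ticks) that a $\sigma$-transition from $x_1$ to $x_2$ may take in the timed FSMA. The timed FSMA of the neutralization process is shown in Figures 8 and 9. According to the timed FSMA, for example, the transition from state 5 to state 6 takes 2 to 3 ticks. This is in accordance with the TDES (Fig. 7).

3.2 Diagnoser Construction and Diagnosability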

For diagnoser design, it is possible to start with the TDES model of the system, treat the clock tick as an extra output signal (i.e., extend the output map to include the ticks), and design a diagnoser using the methodology presented in the previous section. This diagnoser, which we will refer to as the standard diagnoser, provides updates of the system's condition after the generation of any new output symbol in the output set $Y$ and after any clock tick. This is a straightforward approach. However, the number of states of the corresponding RTS and diagnoser will be very large due to the incorporation of timing information.

In this paper, we propose an alternative approach in which the estimate of the system's condition is updated only when a new output symbol $y \in Y$ is generated. The update is based on the generated output symbols and the number of clock ticks occurring between them; no updating at clock ticks is required. This results in a significant reduction in on-line computing power requirements and, in many cases, in the size of the diagnoser, at the expense of extra off-line design calculations.

Consider the plant under control, which, in the previous section, was modelled as a timed FSMA. This timed FSMA generates an output sequence

Figure 8: Example 2. Timed FSMA (Part 1).

Figure 9: Example 2. Timed FSMA (Part 2).

$y_1, y_2, \ldots, y_k$. Let $t_k$ denote the number of clock ticks that occurred between the observations $y_{k-1}$ and $y_k$. The diagnoser proposed in this paper generates a state estimate $z_k \in 2^X - \{\emptyset\}$ for the timed FSMA based on the output sequence $y_1, y_2, \ldots, y_k$ and the timing sequence $t_2, \ldots, t_k$. $\kappa(z_k)$ will be the estimate of the system's condition at the time that $y_k$ was generated. After $t_{k+1}$ ticks and upon observing $y_{k+1}$, $z_k$ will be updated to $z_{k+1}$.

As with fault diagnosis in finite-state automata, it is computationally economical to construct a transition system to summarize and store the information about the output-adjacent states of the timed FSMA (the definition of output-adjacent states in the timed FSMA is the same as Def. 1). But first define $\tilde{T} : X \times X \to 2^{\mathbb{N}}$ according to

$$\tilde{T}(x, x') = \begin{cases} \{\, n \mid n \text{ is the time (in ticks) it can take the timed FSMA to go from } x \text{ to } x' \text{ using a path along which the output is } \lambda(x) \text{ (except at } x') \,\} & \text{if } x \Rightarrow x', \\ \emptyset & \text{otherwise.} \end{cases}$$
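The definition of $\tilde{T}$ can be read as a search over same-output paths of the timed FSMA. The sketch below is one possible way to enumerate the accumulated tick counts; the function name `output_adjacent_times`, the edge encoding and the `max_ticks` cut-off are assumptions made for illustration and do not come from the paper. The four-state fragment is chosen so that the result agrees with the rows for states 1 and 2 of Table 5; the individual edge times in it are assumed.

```python
def output_adjacent_times(x, edges, output, max_ticks=100):
    """Return {x': set of tick counts} for states x' whose output differs from
    output[x] and that are reachable from x through states showing output[x].

    edges:     dict mapping a state to a list of (next_state, set_of_tick_counts)
    output:    dict mapping a state to its output symbol
    max_ticks: hypothetical cut-off so that same-output cycles terminate
    """
    result = {}
    seen = set()
    stack = [(x, 0)]
    while stack:
        state, elapsed = stack.pop()
        if (state, elapsed) in seen:
            continue
        seen.add((state, elapsed))
        for nxt, times in edges.get(state, []):
            for t in times:
                total = elapsed + t
                if total > max_ticks:
                    continue
                if output[nxt] != output[x]:
                    # Output changes here: nxt is output-adjacent to x.
                    result.setdefault(nxt, set()).add(total)
                else:
                    # Same output: keep following the path.
                    stack.append((nxt, total))
    return result


# Assumed fragment: 1 -> 2 keeps output (C1,L0,n); 2 -> 3 and 2 -> 4 change it.
edges = {"1": [("2", {1})], "2": [("3", {1}), ("4", {1})]}
output = {"1": "C1,L0,n", "2": "C1,L0,n", "3": "C2,L0,a", "4": "C3,L1,n"}

times_from_1 = output_adjacent_times("1", edges, output)
times_from_2 = output_adjacent_times("2", edges, output)
```

Tracking `(state, elapsed)` pairs rather than states alone lets the search revisit a state at a different accumulated time, which is what produces multi-element time sets when several same-output paths exist.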

The timed RTS corresponding to the timed FSMA of the neutralization process is given in Table 5, assuming $l = 2$, $u = 3$, $l' = 1$ and $u' = 2$. Note that in this table, the transition time sets $\tilde{T}(\cdot, \cdot)$ are also given for output-adjacent states. The outputs were provided in Table 4. The diagnoser for the neutralization process is shown in Fig. 10 (assuming $z_0 = X$). Note that there are four parameters in the diagnoser: $l$, $u$, $l'$ and $u'$.

The diagnoser operates as follows. If, for example, the first output symbol is $(C5, L_2, a)$, then $z_1 = \{6, 7, 8, 9, 6', 6.1', 8', 7'', 9'', 9.1'', \ldots, 9.u''\}$. If the next output symbol $y_2$ is $(C5, L_3, a)$, then $z_2 = \{17'\}$ and $\kappa(z_2) = \{F_1\}$; hence, the diagnoser will reach an $F_1$-certain state. According to the timed RTS, $(C5, L_3, a)$ can be generated within $l'$ to $u' + 1$ ticks of the clock after $y_1$ is generated. Therefore $y_2$ can take anywhere from 0 to $u' + 1$ ticks to occur following the initialization of the diagnoser, i.e., $0 \leq t_2 \leq u' + 1$. The time interval $[0, u' + 1]$ is written on the transition from $z_1$ to $\{17'\}$ to indicate this. Similarly, if instead of $(C5, L_3, a)$, $y_2 = (C6, L_2, n)$, then $z_2 = \{10, 10''\}$. This transition should occur in 0 to $u + 1$ ticks. After this, $y_3 = (C7, L_1, n)$ will be generated in $l + 1$ to $u + 1$ ticks ($l + 1 \leq t_3 \leq u + 1$). The rest of Fig. 10 can be interpreted similarly.

The diagnoser provides estimates of the system's condition at the instants when new output symbols are generated. For diagnosis, we also need condition estimates between output symbols, after every clock tick. In the following, we discuss how, for diagnosing permanent failures, these estimates can be replaced with predictions for the system's condition.

Let us assume that at some point an output symbol $y$ is generated. Suppose $z$ and $\kappa(z)$ are the estimates of the system's state and condition following the generation of $y$ (Fig. 11). Let $\mathrm{Rch}(z)$ denote the set of states of the timed FSMA that have output $y$ and are reachable from a state in $z$ using a path

| State | Output-adjacent states {time} |
|---|---|
| 1 | 3, 3' {2} / 4, 4' {2} |
| 2 | 3, 3' {1} / 4, 4' {1} |
| 3 | 5, 5' {0} |
| 4 | 5, 5' {0} |
| 5 | 6, 6' {2,3} |
| 6 | 10, 10'' {3,4} / 17' {2,3} |
| 7 | 10, 10'' {2,3} |
| 8 | 10 {2,3} / 17' {1,2} |
| 9 | 10, 10'' {2,3} |
| 10 | 14, 14'' {3,4} |
| 11 | 14, 14'' {2,3} |
| 12 | 14, 14'' {2,3} |
| 13 | 14, 14'' {2,3} |
| 14 | 15, 15'' {2,3} |
| 15 | 1, 1'' {2} |
| 16 | 1, 1'' {1} |
| 2.1' | 3' {1} / 4' {1} |
| 2.2' | 3' {0} / 4' {0} |
| 3' | 5' {0} |
| 4' | 5' {0} |
| 5' | 6' {2,3} |
| 5.1' | 6' {1,2} |
| 5.2' | 6' {0,1} |
| 5.3' | 6' {0} |
| 6' | 17' {2,3} |
| 6.1' | 17' {1,2} |
| 8' | 17' {1,2} |
| 17' | - |
| 1'' | - |
| 1.1'' | - |
| 7'' | 10'' {2,3} |
| 9'' | 10'' {2,3} |
| 9.1'' | 10'' {1,2} |
| 9.2'' | 10'' {0,1} |
| 9.3'' | 10'' {0} |
| 10'' | 14'' {3,4} |
| 10.1'' | 14'' {2,3} |
| 11'' | 14'' {2,3} |
| 12'' | 14'' {2,3} |
| 13'' | 14'' {2,3} |
| 13.1'' | 14'' {1,2} |
| 13.2'' | 14'' {0,1} |
| 13.3'' | 14'' {0} |
| 14'' | 15'' {2,3} |
| 14.1'' | 15'' {1,2} |
| 14.2'' | 15'' {0,1} |
| 14.3'' | 15'' {0} |
| 15'' | 1'' {2} |
| 15.1'' | 1'' {1} |
| 16'' | 1'' {1} |
| 16.1'' | 1'' {0} |

Table 5: Example 2. Timed reachability transition system.
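To illustrate how a timed RTS of this kind could drive the estimate update from $z_k$ to $z_{k+1}$, here is a minimal sketch. The update rule in `update` is an assumed, simplified reading (keep the output-adjacent states that show the new output and whose time set contains the observed tick count); it is not claimed to be the construction of [7], and it does not model the relaxed interval that applies to the first transition after the diagnoser is initialized. The names `TIMED_RTS`, `OUTPUT`, `CONDITION`, `update` and `conditions` are hypothetical; the data are a few rows taken from Tables 4 and 5.

```python
# Rows taken from Table 5 (l = 2, u = 3): state -> {output-adjacent state: times}
TIMED_RTS = {
    "10": {"14": {3, 4}, "14''": {3, 4}},
    "10''": {"14''": {3, 4}},
    "14": {"15": {2, 3}, "15''": {2, 3}},
    "14''": {"15''": {2, 3}},
}
# Outputs and conditions taken from Table 4 for the states used here.
OUTPUT = {"14": "C7,L1,n", "14''": "C7,L1,n", "15": "C8,L0,n", "15''": "C8,L0,n"}
CONDITION = {"10": "N", "10''": "F2", "14": "N", "14''": "F2",
             "15": "N", "15''": "F2"}


def update(z, y, ticks):
    """Assumed update rule: keep every output-adjacent state that shows
    output y and can be reached in exactly `ticks` ticks from a state in z."""
    return {
        nxt
        for x in z
        for nxt, times in TIMED_RTS.get(x, {}).items()
        if OUTPUT[nxt] == y and ticks in times
    }


def conditions(z):
    """Condition estimate kappa(z) as a set of condition labels."""
    return {CONDITION[x] for x in z}


z2 = {"10", "10''"}
z3 = update(z2, "C7,L1,n", 3)         # y3 observed after 3 ticks
z4 = update(z3, "C8,L0,n", 2)         # next output observed after 2 ticks
too_early = update(z2, "C7,L1,n", 2)  # 2 ticks is outside {3, 4}
```

An empty result, as in `too_early`, means the observed timing is inconsistent with the model under this simplified rule.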

Figure 10: Example 2. Diagnoser.

Figure 11: Using predictions in fault diagnosis.

along which the output is $y$. Also define

$$\mathrm{Lim}(z) := \{x \in X \mid x \in \mathrm{Rch}(z)\ \&\ (\delta(x, \sigma) = \emptyset\ \ \forall\, \sigma \in \Sigma)\} \cup \{x \in X \mid x \in \mathrm{Rch}(z)\ \&\ (\exists\, x' \in X,\ \sigma \in \Sigma : |T(x, \sigma, x')| = \infty)\} \cup \{x \in X \mid x \text{ belongs to a cycle in } \mathrm{Rch}(z)\}.$$

$\mathrm{Lim}(z)$ is the set of states and cycles in $\mathrm{Rch}(z)$ in which the system can be trapped for an arbitrarily long time without generating a new output symbol. As illustrated in Fig. 11, following $y$, either a new output symbol $y_i'$ ($i \in \{1, 2, 3\}$) is generated after some time, in which case $x \in z_i'$ at the time $y_i'$ is observed, or no new output is generated, in which case $x$ must eventually enter $\mathrm{Lim}(z)$ after a finite time. For this reason, we refer to $\mathrm{Lim}(z) \cup (\bigcup_{i=1}^{3} z_i')$ as the prediction for the system's state at the time $y$ was generated. If four ticks occur without the generation of any new output symbol, we can conclude that the next output symbol cannot be $y_1'$. Therefore the transition to $z_1'$ can be ruled out and the prediction can be updated to $\mathrm{Lim}(z) \cup z_2' \cup z_3'$. More generally, we denote the prediction for the system's state after $t$ clock ticks following the generation of $y$, assuming the generation of no new output symbol, by $\mathrm{Pred}(z, t)$ and define it according to

$$\mathrm{Pred}(z, t) := \mathrm{Lim}(z) \cup \Big(\bigcup_i \{\, z_i' \mid t \leq \sup \hat{T}_i(z, z_i') \,\}\Big),$$

where $\hat{T}_i(z, z_i')$ denotes the set of transition times from $z$ to $z_i'$ in the diagnoser. In Fig. 11, $\hat{T}_1(z, z_1') = \{2, 3\}$. Naturally, $\kappa(\mathrm{Pred}(z, t))$ will be the prediction for the system's condition.

Now suppose the failures are permanent, so that once a failure occurs, it remains permanently. Then we can expect to be able to use the predictions for fault detection and isolation. In fact, it can be shown [7] that after $t$ ticks of the clock following the occurrence of an observation $y_k$ and before the (possible) generation of $y_{k+1}$, if the standard diagnoser is in an $F_i$-certain state, then $\mathrm{Pred}(z_k, t)$ will also be $F_i$-certain (recall that $z_k$ is the state estimate provided by the diagnoser described in this section upon observing $y_k$). Therefore, our diagnoser is at least as fast as the standard diagnoser in detecting and isolating failures. In some cases, our diagnoser could even be faster than the standard diagnoser. For instance, if a failure $F_2$ is caused by another failure $F_1$, then our diagnoser may predict $F_2$ (when it detects $F_1$) even before $F_2$ occurs. Obviously the standard diagnoser cannot detect $F_2$ before it happens.

In summary, the diagnoser provides an estimate of the system's condition ($\kappa(z_k)$) every time a new output symbol is generated. Between consecutive output symbols, the predictions ($\mathrm{Pred}(z_k, t)$) can be used for diagnosing permanent failures.

Going back to the diagnoser for the neutralization process (Fig. 10), we can easily verify that for $z = \{1, 1''\}$, $\mathrm{Lim}(z) = \{1'', 1.1''\}$, and

$$\mathrm{Pred}(z, t) = \begin{cases} \{3, 3', 4, 4'\} \cup \{1'', 1.1''\} & \text{for } 0 \leq t \leq 2, \\ \{1'', 1.1''\} & \text{for } t \geq 3. \end{cases}$$
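The prediction rule can be sketched in a few lines. In this illustration the function name `pred`, the `(successor, sup_time)` encoding and the `CONDITION` dictionary are assumptions; the data reproduce the case $z = \{1, 1''\}$ above, with $\mathrm{Lim}(z)$ and the successor estimates supplied by hand rather than computed from the model.

```python
# Conditions of the states involved, taken from Table 4.
CONDITION = {"3": "N", "3'": "F1", "4": "N", "4'": "F1",
             "1''": "F2", "1.1''": "F2"}


def pred(lim, successors, t):
    """Pred(z, t): Lim(z) plus every successor estimate whose largest
    transition time has not yet been exceeded after t silent ticks.

    successors: list of (successor_estimate, sup_of_transition_times);
                float("inf") can stand for the sup of an unbounded interval.
    """
    result = set(lim)
    for z_next, sup_time in successors:
        if t <= sup_time:
            result |= z_next
    return result


lim = {"1''", "1.1''"}                                  # Lim({1, 1''})
successors = [({"3", "3'"}, 2), ({"4", "4'"}, 2)]       # both with time set {2}

pred_at_2 = pred(lim, successors, 2)
pred_at_3 = pred(lim, successors, 3)
condition_at_3 = {CONDITION[x] for x in pred_at_3}
```

Once `t` exceeds every supremum, only `lim` remains, which is why three silent ticks leave the single condition label `F2` in this example.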

Therefore, when the diagnoser is at $z = \{1, 1''\}$, if no new output symbol is generated for three ticks of the clock, it can be concluded that $F_2$ has occurred. This deduction is shown in Fig. 10 by a transition (dashed line) from $z = \{1, 1''\}$ to $\mathrm{Pred}(z, t) = \{1'', 1.1''\}$.

We define diagnosability in fault diagnosis with timing information as follows: a permanent failure mode $F_i$ is time-diagnosable if there exists an integer $T \geq 0$ such that, following both the occurrence of the failure and the initialization of the diagnoser, $F_i$ can be detected and isolated (i.e., either $z_k$ or $\mathrm{Pred}(z_k, t)$ becomes $F_i$-certain) in at most $T$ clock ticks. If the TDES is activity-loop-free, i.e., it contains no cycles of non-tick events, then the notions of diagnosability (Section 2) and time-diagnosability become equivalent for the TDES [7]. In [7], necessary and sufficient conditions for time-diagnosability of failures are obtained. Also note that, unlike the event-based version [3], our definition of time-diagnosability makes no assumption about the system's state or condition at the time the diagnoser is started.

For the neutralization process, the reader can verify that if V1 gets stuck-open, then in at most $u + u' + 2$ clock ticks the output symbol $(C5, L_3, a)$ will be generated and the diagnoser will enter the $F_1$-certain state $z = \{17'\}$. On the other hand, if V1 gets stuck-closed, then in at most $3u + 3$ ticks the diagnoser will enter $z = \{1, 1''\}$, and $\mathrm{Pred}(\{1, 1''\}, 3) = \{1'', 1.1''\}$ will indicate the occurrence of $F_2$. Hence, $F_2$ will be diagnosed in at most $3u + 6$ clock ticks. Therefore, both $F_1$ and $F_2$ are time-diagnosable. Note that without the timing information, $F_2$ would not have been diagnosable, because once the tank is emptied and $(C1, L_0, n)$ is generated, the system stops generating new output symbols and the estimate of the system's condition remains ambiguous.
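As a quick numerical check, the bounds above can be evaluated with the Example 2 values $u = 3$ and $u' = 2$ used for Table 5. This is plain substitution into the expressions stated in the text, not an independent derivation of them:

$$u + u' + 2 = 3 + 2 + 2 = 7, \qquad 3u + 3 = 12, \qquad 3u + 6 = (3u + 3) + 3 = 15.$$

For these parameter values, $F_1$ is therefore diagnosed within 7 ticks and $F_2$ within 15 ticks; the additional 3 ticks for $F_2$ are the silent ticks after which $\mathrm{Pred}(\{1, 1''\}, 3)$ becomes $F_2$-certain.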

The fact that, in our approach, no update of the estimate of the system's state is required at clock ticks results in a significant reduction in on-line computing power requirements. In many cases, the diagnoser will also have fewer states than the standard diagnoser. For example, in the neutralization process, filling the tank takes much longer than opening a valve (assumed here to take 1 clock tick). Therefore $u$ and $u'$ are considerably larger than 1. If we assume $u = 200$ and $u' = 100$, then the TDES will have about 1700 states. As a result, while both the standard diagnoser and the diagnoser based on the methodology of [3] will have at least a few hundred states, our diagnoser (Fig. 10) has only 19 states and, except for some of the transition time sets ($\hat{T}(\cdot, \cdot)$), the structure of our diagnoser does not depend on the parameters $l$, $u$, $l'$ and $u'$.

In general, a small clock period is necessary for accurate timekeeping in a TDES. This results in a large state set for the TDES. In these cases, our approach can lead to a significant reduction in the size of the diagnoser and in the on-line computing power requirement. This improvement is obtained at the expense of more off-line design calculations, in particular, the computation of the transition time sets $T$ and $\tilde{T}$. For further details, the reader is referred to [7].

4 Conclusion

In this paper, we have proposed a state-based framework for fault diagnosis in finite-state automata and timed discrete-event systems. In this approach, the system and the diagnoser (the fault detection system) do not have to be started at the same time, and no information about the state or even the condition of the system before the initialization of the diagnoser is required. Furthermore, any problem in an event-based framework can be recast as a problem in our state-based framework. We have shown how model complexity can be reduced using a polynomial-time algorithm. We have also presented a new approach to incorporating timing information which does not require estimate updates at clock ticks, thereby significantly reducing on-line computing power requirements and, in many cases, the size of the diagnoser. We have focused on the main ideas in this paper and have illustrated them with examples. A more detailed account of our work, including necessary and sufficient conditions for diagnosability and extension to hybrid systems, can be found in [7].

References

[1] B.A. Brandin and W.M. Wonham, "Supervisory control of timed discrete-event systems," IEEE Trans. Automat. Contr., vol. 39, no. 2, pp. 329-342, 1994.

[2] S. Chand, "Discrete-event based monitoring and diagnosis of manufacturing processes," in Proc. Amer. Contr. Conf., San Francisco, CA, June 1993, pp. 1508-1512.

[3] Y. Chen and G. Provan, "Modeling and diagnosis of timed discrete event systems - A factory automation example," in Proc. Amer. Contr. Conf., Albuquerque, NM, June 1997, pp. 31-36.

[4] T.Y.L. Chun, "Diagnostic supervisory control: A DES approach," Master's thesis, Dept. of ECE, University of Toronto, Canada, Aug. 1996.

[5] W. Hamscher, L. Console and J. de Kleer, Eds., Readings in Model-Based Diagnosis. San Mateo, CA: Morgan Kaufmann, 1992.

[6] S. Hashtrudi Zad, R.H. Kwong and W.M. Wonham, "Fault diagnosis in discrete-event systems: Framework and model reduction," to be presented at the 37th IEEE Conf. Decision Contr., Tampa, FL, USA.

[7] S. Hashtrudi Zad, Ph.D. thesis, Dept. of ECE, University of Toronto, Canada, in preparation.

[8] D. Lee and M. Yannakakis, "Principles and methods of testing finite state machines - A survey," Proc. IEEE, vol. 84, no. 8, pp. 1090-1123, 1996.

[9] W.S. Lee, D.L. Grosh, F.A. Tillman and C.H. Lie, "Fault tree analysis, methods, and applications - A review," IEEE Trans. Reliability, vol. R-34, no. 3, pp. 194-203, 1985.

[10] F.P. Lees, Loss Prevention in the Process Industries, Volume 1. London: Butterworths, 1980.

[11] N.G. Leveson, Safeware: System Safety and Computers. Reading, Mass.: Addison-Wesley, 1995.

[12] F. Lin, "Diagnosability of discrete event systems and its applications," Discrete Event Dynamic Systems, vol. 4, pp. 197-212, 1994.

[13] R. Paige and R.E. Tarjan, "Three partition refinement algorithms," SIAM J. Comput., vol. 16, pp. 973-989, 1987.

[14] D.N. Pandalai and L.E. Holloway, "Template languages for fault monitoring of single-instance and multiple-instance discrete event processes," in Proc. 36th Conf. Decision Contr., San Diego, USA, Dec. 1997, pp. 4619-4625.

[15] T. Ruokonen, Ed., Proc. of IFAC Symp. SAFEPROCESS'94, Espoo, Finland, 1994.

[16] M. Sampath, R. Sengupta, S. Lafortune, K. Sinnamohideen and D. Teneketzis, "Diagnosability of discrete-event systems," IEEE Trans. Automat. Contr., vol. 40, no. 9, pp. 1555-1575, 1995.

[17] M. Sampath, R. Sengupta, S. Lafortune, K. Sinnamohideen and D. Teneketzis, "Failure diagnosis using discrete-event models," IEEE Trans. Contr. Syst. Technology, vol. 4, no. 2, pp. 105-124, 1996.

[18] M. Sampath, S. Lafortune and D. Teneketzis, "Active diagnosis of discrete-event systems," in Proc. 36th Conf. Decision Contr., San Diego, USA, Dec. 1997, pp. 2976-2983.

[19] N. Viswanadham and T.L. Johnson, "Fault detection and diagnosis of automated manufacturing systems," in Proc. 27th Conf. Decision Contr., Austin, USA, Dec. 1988, pp. 2301-2306.