

IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS—PART B: CYBERNETICS, VOL. 35, NO. 6, DECEMBER 2005

Modeling Uncertainty in Clinical Diagnosis Using Fuzzy Logic

R. I. John and P. R. Innocent

Abstract—This paper describes a fuzzy approach to computer-aided medical diagnosis in a clinical context. It introduces a formal view of diagnosis in clinical settings and shows the relevance and possible uses of fuzzy cognitive maps. A constraint satisfaction method is introduced that uses the temporal uncertainty in symptom durations that may occur with particular diseases. The method results in an estimate of the stage of the disease if the temporal constraints of the disease in relation to the occurrence of the symptoms are satisfied. A lightweight fuzzy process is described and evaluated in the context of diagnosis of two confusable diseases. The process is based on the idea of an incremental simple additive model for fuzzy sets supporting and negating particular diseases. These are combined to produce an index of support for a particular disease. The process is developed to allow fuzzy symptom information on the intensity and duration of symptoms. Results are presented showing the effectiveness of the method for supporting differential diagnosis. Index Terms—Diagnosis, fuzzy cognitive map (FCM), fuzzy sets, fuzzy temporal reasoning.

I. INTRODUCTION

Fuzzy approaches to medical diagnosis have been reviewed in [1] and shown to be effective in this domain. Early work by Adlassnig [2] has been particularly influential. Other researchers (e.g., [3]–[5]) have considered temporal approaches in the medical domain. However, the work presented in this paper differs in that we are considering the role, specifically, of linguistic classification and, in particular, the role of type-2 fuzzy sets. The body of research in temporal classification is large, and the purpose of this paper is to provide a new approach. This paper considers using fuzzy methods to address the specific problem of disease classification in the presence of uncertain or vague knowledge of a linguistic nature. Of course, diseases can be considered fuzzy in that it is possible to have a disease to some degree. Here, we are interested in fuzzy symptoms.

The problem context is typified by clinical diagnosis, whereby an expert is attempting to classify a patient into one or more disease categories using limited vague knowledge consisting primarily of elicited linguistic information. This context arises in many problem domains. For example, in the early stages of many classification problems, it is necessary to decide what measurements can and should be made based on a preliminary diagnosis using primarily linguistic information. The medical context is an extreme case, whereby most or even all information available is of a linguistic nature when patients are first seen, and the diagnostic problem is severe because of the possibility of confusion between different diseases in their early developmental stages.

Our paper adopts the “Computing with Words” approach advocated by Zadeh [6] and our particular representation of language in fuzzy sets rather than that adopted by, for example, Wainer and Sandri [7], which uses network models. In this paper, we extend recent work from Innocent and John [8]–[10] in the soft computing [11] domain of artificial intelligence that has been applied in the area of clinical diagnosis of confusable diseases in their early stages using vague linguistic knowledge. We first consider the aetiology of diseases and show how fuzzy cognitive maps (FCMs) [12] can encode fuzzy causal structures and so support symptom information elicitation and diagnostic reasoning. Second, we show how fuzzy temporal reasoning and constraints can be used to support diagnosis by providing a quantitative decision support index for diseases that satisfy constraints. The paper uses the ideas of fuzzy combination [13]. This information is then used together with the known disease/symptom profiles to provide clinical decision support information. Furthermore, we consider other sources of uncertainty relating to the observed symptom information and show a method for using this in fuzzy sets.

Manuscript received October 4, 2004. This paper was recommended by Associate Editor C. M. Helgason. The authors are with the Centre for Computational Intelligence, School of Computing, De Montfort University, Leicester LE1 9BH, U.K. (e-mail: [email protected]). Digital Object Identifier 10.1109/TSMCB.2005.855588

II. KNOWLEDGE REPRESENTATION AND FCMs

It is clear from the aetiology of some (but not all) diseases [14] that clinicians’ knowledge of disease–symptom relationships can be encoded as statements of causal relationships in the following form.

Disease A causes symptom S1 in context X.
Disease A causes symptom S2 in context X.
Disease B causes condition C in context X.
Condition C causes symptom S2 in context X.
Disease D negative causes symptom S3 in context X.

FCMs [12] can encode such causal relationships between concepts into a directed graph representation where nodes are concepts and links are causes (e.g., [15]). Encoding can be more difficult than simply taking each rule above and transferring into a graph since, as Kosko points out, “A causes B” cannot always be considered equivalent to “A implies B” since the representation of negative concepts and causes becomes problematic. Other methods use fuzzy hierarchical approaches [16] but suffer from similar problems. However, causal maps from one context may be combined appropriately with others

1083-4419/$20.00 © 2005 IEEE

JOHN AND INNOCENT: MODELLING UNCERTAINTY IN CLINICAL DIAGNOSIS USING FUZZY LOGIC

in other contexts, and this makes the representation flexible and useful. In the simplest representation, we are allowed only positive and negative causal links that represent sets of statements like those above. Kosko shows how these links can be extended to fuzzy sets in a particular context using statements like the following: A is a strong cause of B and A is a weak negative cause of B. He then shows how the fuzzy set operators, e.g., min-max, can be applied to particular sequences of fuzzy values to infer how such a set of concepts can affect other concepts in the FCM. In our example, a clinician might reason about what symptoms would be present and absent given that some diseases are present from a knowledge base such as the following:

Disease A is a positive cause of symptom S1.
Disease A is a weak negative cause of symptom S2.
Disease B is a weak positive cause of symptom S2.
Disease C is a weak positive cause of symptom not-S2.
Disease C is a positive cause of condition D.
Condition D is the negative cause of symptom S2.

One use for such an approach would be to confirm (or otherwise) a provisional diagnosis made on the basis of how well the symptom set observed matches the fuzzy strength of the symptom set predicted by the FCM. This is a test of “categorical consistency” [7]. Another use is to remind the clinician of the possible range and variation of symptoms in the disease set information. While the latter use can be achieved relatively easily and reliably, the former use requires further refinement of the fuzzy knowledge base and how it is processed. In Kosko’s example, the degree of causality was selected from the ordered set {a little, some, usually, much, very much, a lot}. In our work, we defined a set of appropriate linguistic terms and ordered them on the basis of an analysis of the terms used by a clinician during diagnosis.
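The min-max reasoning over ordered linguistic strengths mentioned above can be illustrated with a minimal sketch. It assumes only positive causal links, takes the effect along one causal path to be its weakest link (min) and the overall effect to be the strongest path (max), and uses the ordered set quoted from Kosko's example. The map `EDGES`, its labels, and all function names are hypothetical and are not taken from the authors' prototype.

```python
# Ordered linguistic scale quoted in the text (weakest to strongest).
SCALE = ["a little", "some", "usually", "much", "very much", "a lot"]
RANK = {term: position for position, term in enumerate(SCALE)}

# Hypothetical causal map: EDGES[source][target] = linguistic strength.
# Negative causes are deliberately left out of this sketch.
EDGES = {
    "Disease C": {"Condition D": "much", "Symptom S2": "a little"},
    "Condition D": {"Symptom S2": "usually"},
}


def simple_paths(edges, start, goal, seen=()):
    """Yield every cycle-free path from start to goal as a list of nodes."""
    if start == goal:
        yield [goal]
        return
    for nxt in edges.get(start, {}):
        if nxt not in seen:
            for rest in simple_paths(edges, nxt, goal, seen + (start,)):
                yield [start] + rest


def indirect_effect(edges, path):
    """Effect along one path: the weakest link (min over its edge labels)."""
    labels = [edges[a][b] for a, b in zip(path, path[1:])]
    return min(labels, key=RANK.__getitem__)


def total_effect(edges, start, goal):
    """Overall effect: the strongest path (max over per-path minima)."""
    effects = [indirect_effect(edges, path)
               for path in simple_paths(edges, start, goal)
               if len(path) > 1]
    return max(effects, key=RANK.__getitem__) if effects else None
```

In this hypothetical map, the direct link from Disease C to Symptom S2 is only "a little", while the path through Condition D is limited by its weaker link, "usually", so the strongest overall effect is "usually".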
In eliciting knowledge from a clinician about chest infections, asthma, and lung cancer, we found that many terms were used to describe the relation between the condition and the symptoms. Among these were “always excludes (never),” “sometimes,” “occasionally,” “often (usually),” and “always.” For example: “Chest infection always causes coughing.” In our knowledge elicitation, we found that some terms were related. “Always,” for example, can be construed to be synonymous with “certainly.” We modified causal representations by attaching a fuzzy label to the causal link to represent these elicited facts. For example: Disease A is (always) a (very) positive cause of symptom S1. However, the clinician was also providing knowledge gained from practical experience (learned implications) in these sessions in the form of “symptom-relation-disease.” For example: “Absence of cough” often indicates “asthma” in a particular context (e.g., nonsmoker). These indicators are clearly important and posed a problem for our chosen representation since we required knowledge in the form “A causes B.” We chose to adopt a weak causal representation of these indicators.


For example: Disease A is (sometimes) a (little) weak-negative cause of symptom S2. Occasionally, the clinician indicated comparative information. For example: “Chest pain” is less likely for asthma than lung cancer. Again, we have chosen to adopt a weak causal representation of this information in the following form. Disease A is (frequently) a stronger negative cause of symptom S1 than Disease B. The comparative information allows some ordering of the linguistic terms, and it is clear that some of the other linguistic terms (e.g., “very”) can be treated like linguistic hedges that modify the fuzzy label of the causal link. Apart from uses of the semantic network as a simple guide through knowledge structures, the power of the representation is that it is amenable to further processing so that hidden patterns of causal states may be revealed. Typically, we can discover instability as information flows through the network [17]. In our work, we have developed a small prototype to test these ideas using respiratory diseases: influenza, asthma, and lung cancer. In test mode, the clinician is presented with a graphical display of the causal network and is allowed to set symptom nodes to be on or off. The FCM interpreter then cycles through the network showing the intermediate node states and the disease node states. In development mode, the clinician can highlight the symptoms causally associated with a particular disease. While FCMs are useful for diagnostic purposes by clinicians using a hypotheticodeductive method, they suffer from being “shallow” in that the causal statements are simple representations of deeper knowledge. Thus, confidence in them can be low, unless suitable explanations are provided. A second limitation is that FCMs deal with essentially static descriptions of dynamic processes. This is a major deficiency since we would like to know the stage of a disease rather than just diagnose the disease itself.
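To make the idea of an interpreter cycling through the network concrete, the following is a minimal sketch of a generic synchronous FCM update with clamped nodes. The update rule of the authors' prototype is not given in the text, so the three-valued sign threshold, the numeric weights, the node names, and the choice to clamp disease nodes (rather than symptom nodes) are all assumptions made for illustration.

```python
# Hypothetical signed causal weights: WEIGHTS[source][target].
WEIGHTS = {
    "Disease C": {"Condition D": 1.0},     # positive cause
    "Condition D": {"Symptom S2": -1.0},   # negative cause
    "Disease B": {"Symptom S2": 0.5},      # weak positive cause
}
NODES = ["Disease B", "Disease C", "Condition D", "Symptom S2"]


def sign(x):
    """Three-valued threshold: +1 (on), 0 (neutral), -1 (suppressed)."""
    return (x > 0) - (x < 0)


def step(state, clamped):
    """One synchronous update; clamped nodes keep the value set by the user."""
    nxt = {}
    for node in NODES:
        if node in clamped:
            nxt[node] = clamped[node]
        else:
            total = sum(state[src] * targets.get(node, 0.0)
                        for src, targets in WEIGHTS.items())
            nxt[node] = sign(total)
    return nxt


def run(clamped, max_cycles=20):
    """Cycle until a state repeats (fixed point or limit cycle); return all states."""
    state = {node: clamped.get(node, 0) for node in NODES}
    history = [state]
    for _ in range(max_cycles):
        state = step(state, clamped)
        repeated = state in history
        history.append(state)
        if repeated:
            break
    return history
```

With both hypothetical diseases clamped on, `run({"Disease B": 1, "Disease C": 1})` first switches Symptom S2 on through the weak direct link and then suppresses it once Condition D becomes active, which is the kind of intermediate node behaviour a display of the cycling network would expose.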
Diseases are sometimes misdiagnosed because they are very similar to each other at different stages. We now address each of these limitations. The first limitation can be addressed in many ways; for example, FCMs can be extended by adding embedded qualitative models that involve “deeper” knowledge at a detailed, high granularity. Models can be added that encode what has been learned about human physiology and that are both descriptive and procedural in nature. If an explanation is required for a particular causal link, such as “Diabetes is (always) a (very) positive cause of high blood sugar,” then an appropriate deeper model is used (note here that diabetes is not a single disease but a constellation; this is for illustrative purposes). There is a range of possibilities for the depth of these models, from a qualitative to a highly quantitative representation. Given the training of clinicians in the hypotheticodeductive method of diagnosis, we have adopted the principle of parsimony in selecting an appropriate level, which implies going from one level to the next deeper level as required for explanation or for supporting confidence. Examples of this approach are given in [10].


There are clear limitations to this approach since diseases usually progress through stages, but the FCM and the qualitative models assume stationary values of measured symptoms that do not alter over time. While causal knowledge is useful to reason about many diseases using symptom profiles and qualitative measures alone, there are many diseases that can be easily confused in the early stages because symptom profiles and static measures are not sufficient. There are a number of approaches to deal with this, with the principal ones being to make use of local experiential knowledge (e.g., relative frequencies of diseases) and global knowledge about the progression of a disease through different stages. We present our work in overcoming the important temporal limitations by using fuzzy sets and inference as one method of approaching these problems. Other approaches [16] are related but adopt different principles and representations, such as network search and parsimonious covering theory.

III. INCORPORATING EXPERIENCE AND TIME RELATIONSHIPS

In our shallow FCM, we have allowed representations of static information relating to the frequency of occurrence in the causal relationship: e.g.,

Disease A is (frequently) a strong negative cause of symptom S1.
Disease B is (occasionally) a strong negative cause of symptom S1.

The following knowledge may be formally deduced from these statements:

Disease A is (always) a stronger negative cause of symptom S1 than Disease B.

Such a shallow deduction can be explained through the causal links and used to support differential diagnosis. However, a statistical approach may compute the relative likelihood of a disease using a method for handling uncertainty such as Bayes’ rule [18], given a profile of symptoms and a set of conditional probabilities. Ross [19] describes how a fuzzy Bayes’ rule can be derived and used. These methods are knowledge intensive but have been shown to be successful in a limited domain.
They use information about both the presence and absence of symptoms. However, information about the duration of symptoms (here called “temporal” knowledge) is not explicitly used, and we propose that this is important in making a differential diagnosis in the early stages of diseases. In the examples above, temporal knowledge acquired from experience could be incorporated into the representation as follows: e.g.,

Disease A (always) causes symptom S1 in time interval t1 to t2.
Disease B (sometimes) causes symptom S2 in time interval t3 to t4.

The first statement means that the clinician is expecting to see a particular symptom (S1) in the interval t1 to t2 (measured from presumed onset of a disease), and the “always” indicates that if S1 is not observed in this time, the disease cannot be Disease A. The “always” indicates a causal relationship. The second statement indicates that the clinician may expect to occasionally (“sometimes”) see a particular symptom (S2), but its absence will not eliminate Disease B. Other linguistic labels for CR we

have found are “often,” meaning that a symptom occurrence is not exceptional, and “normally,” which indicates an expectation of a symptom but not a causal relationship. It follows that if a clinician observes a complete set (all symptoms possible) of symptoms whose causal relationship is “always,” then there is an end point from which a disease state can be inferred with maximum confidence. The crisp temporal intervals t1 to t2 allow logical temporal reasoning within, for example, formal logical procedures [20], where logical closure can be used. These can be implemented in expert systems using appropriate reasoning methods [21] and uncertainty measures of rule inference strengths. This supports clinicians in using inductive logic: “given a particular disease(s) onset at time t, what are the expected symptom profiles to time t1, and do any of these match a current patient symptom profile?” FCMs have been extended in a number of ways to allow for some temporal processing [22]. In the simplest case, the time domain is made discrete and extra nodes are introduced into the FCM that correspond to different time points, with rules given for how such nodes are to be introduced and processed. Tsadiras and Margaritis [23] extended FCMs (EFCMs) in a more general case, where nodes are allowed to take continuous truth values, and these values are allowed to decay with time unless reinforced by positive causal links (the converse is also true). Thus, we would be able to use the EFCM to provide symptom profiles as lists of real numbers for given sets of diseases that are developing over time. Both of these techniques may be combined to provide information that would enable a clinician to see the dynamics of multiple related diseases in terms of symptom changes over time.
In terms of diagnostic efficiency, however, clinicians do not only use a combination of deductive and inductive logic in their hypotheticodeductive approach; they also use constraint satisfaction knowledge. For example, if a symptom does not occur at a certain stage when it is expected for a specific disease, then the disease is unlikely. We, therefore, consider a constraint satisfaction approach using fuzzy temporal logic as a complementary approach to support diagnosis. Such an approach should also enable questions such as “given S1 at time t1 until t2, and S2 at time t3 until t4, S3 absent, what is the most likely disease from a given range of possible diseases?” There is usually vagueness associated with temporal knowledge of the symptoms with respect to a given disease. A symptom occurrence over time for a particular disease is, therefore, better represented as a fuzzy set, and we reason by using fuzzy inferences on these sets [24]. A successful nontemporal example of this approach is described by Kovalerchuk [25] in the general issue edited by Steimann [26] on fuzzy set theory in medicine. There is also a case for using this approach more generally to model a clinician’s reasoning, given by Esogbue [27], which gives some support for continuing the approach further. This is presented in the following section of this paper.
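The crisp form of the temporal statements in this section, where an “always” statement can eliminate a disease but a “sometimes” statement cannot, can be sketched as follows. The class and function names are hypothetical, and the rule that nothing is concluded until the interval t1 to t2 has fully elapsed is an assumption added for the sketch rather than something stated in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CausalStatement:
    """'Disease (cr) causes symptom in time interval t1 to t2' (days since onset)."""
    disease: str
    symptom: str
    cr: str   # linguistic label, e.g. "always" or "sometimes"
    t1: int
    t2: int


def is_eliminated(statement, day_since_onset, observed_days):
    """Return True if the statement rules the disease out.

    observed_days is the set of days (since presumed onset) on which the
    symptom was seen. Only an "always" statement can eliminate a disease,
    and (an assumption of this sketch) only once the window has closed.
    """
    if statement.cr != "always":
        return False
    if day_since_onset < statement.t2:
        return False
    return not any(statement.t1 <= day <= statement.t2 for day in observed_days)


# Hypothetical instances of the two statement forms used in the text.
a_s1 = CausalStatement("Disease A", "S1", "always", 1, 3)
b_s2 = CausalStatement("Disease B", "S2", "sometimes", 2, 6)
```

For example, `is_eliminated(a_s1, 5, set())` is true because S1 was never seen in its required window, whereas the unobserved S2 leaves Disease B possible.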

IV. FUZZY CAUSATION AND TIME RELATIONSHIPS

Temporal knowledge acquired from experience or written sources could be incorporated into the FCM causal representation statements [28] as follows:


Disease D (CR) causes symptom S in time interval t1 to t2.

The statement means that the clinician is expecting to see symptom (S) with CR in the interval t1 to t2 (measured from disease onset) given that the hypothesis that the patient has disease D is true. Temporal constraint reasoning is then possible. The following example makes this clear: e.g.,

Influenza (always) causes fever in time one day from onset to three days from onset.

The “always” indicates that if fever is not observed in the interval of days (1, 3) measured from onset (day 0), the disease cannot be influenza. Thus, decision making is within a more constrained temporal context. Notice that it is possible to extend the granularity of this representation to include expectations of the duration of a symptom, e.g.,

Influenza (always) causes fever (for duration one to two days) at time one day from onset to three days from onset.

In our approach, the temporal knowledge base relating the class of disease D to the general symptom set S uses fuzzy representations of symptom time duration and onset information [24] since, in medical literature, we commonly find linguistic descriptions such as “fever usually develops between the third and fourth day of influenza and lasts for approximately three days.” By using a linguistic translation, we use the CR term “always” to be synonymous with “usually” and use fuzzy sets for the time information. This may be represented using fuzzy set notation as, e.g.,

Influenza (always) causes fever at time fever set

where fever set is a fuzzy set such as those shown in Fig. 1 and designated by the symbol Q(t). The interpretation of these sets is that they define the possibility distribution of the intervals of time within which a symptom may be observed when it has a possibility of different durations. They should not be interpreted to mean that a symptom occurs to a particular degree at a particular time.
Thus, at this point, a symptom observation is assumed to be crisp in its duration and starting time. A graphic representation of the fuzzy sets for an example disease with example common symptoms is shown in Fig. 1. These have been constructed after interaction with medical expertise as an exercise to demonstrate our approach.

V. TEMPORAL INFERENCE USING FUZZY SETS

In order to test the feasibility of using fuzzy sets for temporal reasoning, we have used constraint satisfaction using fuzzy set representations of temporal knowledge for diseases that have a subset of temporally overlapping common symptom sets. We chose chest infection, measles, and scarlet fever since the disease histories are well known and readily available in, for example, [29]. This results in about 60 different symptoms and about 90 different symptom descriptions in a database where each symptom is related to a temporal fuzzy set description for each disease. Two examples of the discrete fuzzy set are shown in Fig. 1, where the granularity of the time base T is 14

Fig. 1. Fuzzy sets for two symptoms of influenza.

days. The granularity reflects the clinician’s use of “days” when relating to symptoms for these diseases. Diseases (which are not fuzzy) are said to start on day 0 (onset) although a symptom may not be observable until after this day. The first and last indices of the nonzero membership grade of the discrete fuzzy set have special significance for the computation of our support index (e.g., day 1 and day 11, respectively, in Fig. 1 for vertigo). The basic goal of the inference engine is to establish the stage of a disease at the current day given observations of symptoms in the crisp form “Symptom occurred from M days ago up to and including N days ago,” where M ≥ N. A value of 0 is interpreted to mean the current day; e.g., “fever started three days ago and is still present” is represented as “fever occurred during time interval (3, 0).” The (3, 0) notation indicates that the symptom started three days ago and is still present. We place the observed symptom into the history and estimation of the current disease by checking whether the observed symptom duration is possible in the fuzzy set and, if so, determining the lower and upper boundaries of the position of the current day in the set for each symptom expected for each disease. The lower boundary is set by the first day of nonzero membership. The upper boundary is the result of a simple search for the best fit of the observed symptom interval into the full range of possible intervals. Observation of another causally relevant symptom for the disease under test produces another set of boundaries, and these are combined with the older estimate using fuzzy min/max operators. If this combination results in a null interval, then we assume that the temporal dependencies for that disease are not fulfilled. Otherwise, we have a revised estimate of the upper and lower boundaries of the current day for that disease. This is repeated for all observed symptoms for each disease.
The outcome of this process is an estimate of the stage of each disease, presented as an interval whose lower and upper bounds delimit the current day of that disease since onset. This information can then be used together with


Fig. 2. Support for influenza as a function of different symptom observabilities.

TABLE I
DISEASE BOUNDARIES FOR INFLUENZA GIVEN A RANGE OF SYMPTOMS

the known disease/symptom profiles to provide clinical decision support information.

VI. EXAMPLE OF TEMPORAL INFERENCE

In this example, we assume that the patient is presenting information on each symptom in the order provided by the index in Table I at a specific clinical interview. For simplicity, we assume that every symptom started two days ago and is still present, i.e., each symptom occurred during time interval (2, 0). According to the procedure, we first check for a particular symptom whether the duration is possible in the time set. If we consider “fever” for the disease “influenza” and the time set shown in Fig. 1, it is possible for this set to accommodate a two-day duration since this lies between the minimum duration of one day and the maximum duration allowed. Now, we must estimate the upper and lower boundaries for the stage of the disease given fever. From Fig. 2, it can be seen that it is first possible for fever to show on day 1 after onset of influenza. This is the lower bound for the placement of the symptom start. To estimate the upper bound, we use the interpretation of the fuzzy set that the symptom must be observed

on or after the membership is unity. This occurs at day 3 from Fig. 2. This is true as long as the symptom duration does not force the end of the symptom occurrence outside the allowed range in the set (day 10 from Fig. 2). If this does occur, then the boundary must be adjusted if possible so that the possible upper boundary is greater than or equal to the lowest possible value but still allows the symptom to occur in the allowable range. If this is not possible, then the constraint is broken. For example, if the fever lasted 12 days, it would not be possible to fit into the set of possibilities. Hence, our interval for possible influenza given fever for the past two days is that onset was possibly three to five days ago. Table I shows that this interval remains the same for all of the symptoms until we get to coughing. The fuzzy set distribution for coughing given influenza results in a set of possible boundaries of [4, 5]. We adjust the possible interval for the disease stage by taking the maximum of the two lower boundaries (3, 4) to arrive at a new lower boundary (4) and then take the minimum of the two upper boundaries (5, 5) to arrive at a new upper boundary (5). Further symptoms do not further restrict this interval, and we can conclude that influenza is possible because the boundary constraints are satisfied. If, however, a particular further symptom had resulted in an interval such as [1, 2], it is clear that this would result in a new boundary of [3, 2], which violates the boundary definitions, and this constraint would not be satisfied. The violation of this constraint would infer that the disease is eliminated. In collecting temporally valid symptom information, the clinician would achieve logical closure after the first six symptoms in Table I and, at that stage of the consultation, could have some confidence in inferring that the patient has influenza. Collection of further symptom information adds support to this conclusion. 
There are no contraindications such as the absence of necessary symptoms either because the observations are absent (not observed) or because temporal constraints have been violated.
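The boundary estimation and combination walked through in this example can be sketched in a few lines. The membership values in `FEVER` and `COUGH` are hypothetical, chosen only so that the sketch reproduces the intervals quoted in the text; the rule that the symptom starts no later than the first day of full membership is our reading of the upper-bound description and should be treated as an assumption.

```python
def stage_bounds(q, started_ago, ended_ago):
    """Bounds on the current day (since onset) implied by one observed symptom.

    q           -- discrete temporal fuzzy set; q[d] is the possibility of the
                   symptom on day d since disease onset
    started_ago -- M: the symptom started M days ago
    ended_ago   -- N: the symptom was last present N days ago (0 = today)
    Returns (lower, upper), or None if the temporal constraint is broken.
    """
    nonzero = [day for day, grade in enumerate(q) if grade > 0]
    if not nonzero:
        return None
    first, last = nonzero[0], nonzero[-1]
    peak = q.index(max(q))            # first day of maximum membership
    lower = first + started_ago       # the symptom cannot start before `first`
    upper = min(peak + started_ago,   # assumed: it starts no later than the peak
                last + ended_ago)     # the symptom cannot end after `last`
    return (lower, upper) if lower <= upper else None


def combine(a, b):
    """Intersect two estimates: max of lower bounds, min of upper bounds."""
    if a is None or b is None:
        return None
    lower, upper = max(a[0], b[0]), min(a[1], b[1])
    return (lower, upper) if lower <= upper else None


# Hypothetical 14-day sets for influenza (index = day since onset).
FEVER = [0, 0.5, 0.8, 1, 1, 1, 0.8, 0.6, 0.4, 0.2, 0.1, 0, 0, 0]
COUGH = [0, 0, 0.5, 1, 1, 1, 1, 0.5, 0, 0, 0, 0, 0, 0]

fever_bounds = stage_bounds(FEVER, 2, 0)        # (3, 5)
cough_bounds = stage_bounds(COUGH, 2, 0)        # (4, 5)
stage = combine(fever_bounds, cough_bounds)     # (4, 5)
```

A further symptom implying an interval such as (1, 2) makes `combine` return `None`, which corresponds to the broken constraint, and a fever lasting 12 days cannot be placed in `FEVER` at all.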


VII. DECISION SUPPORT INFORMATION

In order to provide diagnostic information, we compute an estimate of “goodness of fit” called an index of support (Support) for each disease, following the use of this term in the CADIAG system (referred to by Adlassnig [30]). In our system, support may be seen as a measure of compatibility that expresses the degree to which a particular diagnosis logically follows from the medical evidence.

$\text{Support}^{+}_{ij}$ is defined as the evidence provided by the observation of the causally related $i$th symptom for the $j$th disease. $\text{Support}^{+}_{j}$ is the accumulated evidence for the $j$th disease $D_j$ from all the causally related and observed symptoms $S_i$. $\text{Support}^{-}_{kj}$ is defined as the evidence provided by the nonobservation of the causally related $k$th symptom for the $j$th disease. $\text{Support}^{-}_{j}$ is the accumulated evidence against disease $D_j$ by the nonobservation of all the causally related symptoms $S_k$. The following rule definition shows the rules used for each symptom/disease to compute the evidence for and against a disease:

Rule #$i$: if $S_i$ is Observed
then $\text{Support}^{+}_{j} = \text{Support}^{+}_{j} + \text{Support}^{+}_{ij}$
else $\text{Support}^{-}_{j} = \text{Support}^{-}_{j} + \text{Support}^{-}_{ij}$

Rule Definition: Rule base for the computation of the support index.

At any stage of symptom elicitation, let $E^{+}_{j}$ represent the average of the accumulated evidence from observed symptoms for $D_j$, and $E^{-}_{j}$ represent the average of the accumulated evidence from unobserved (but expected from causal relationships) symptoms against $D_j$. If we assume that the absence or presence of a causally relevant symptom are equally valuable, we may define $E^{+}_{j}$ and $E^{-}_{j}$ in the form of a simple additive model (SAM, as defined in [31]) as in (1) and (2), where $n$ and $m$ are the numbers of observed and unobserved causally related symptoms, respectively:

$$E^{+}_{j} = \frac{1}{n}\,\text{Support}^{+}_{j} \tag{1}$$

$$E^{-}_{j} = \frac{1}{m}\,\text{Support}^{-}_{j} \tag{2}$$

However, while the presence of a causally relevant symptom is always of maximum value to a diagnosis, the absence of a causally relevant symptom depends on its level of CR. For example, if a symptom is sometimes expected but not always, and we have not observed it, we would wish to weight the negative evidence accordingly. Let the weight $w_k$ determine the effect of negative evidence depending on the CR of the $k$th symptom. Then, we modify the computation of the negative evidence in (2) to become

$$E^{-}_{j} = \frac{1}{m}\sum_{k} w_{k}\,\text{Support}^{-}_{kj} \tag{3}$$

Now, we must consider what we mean by evidence. The manifestation of a symptom is clearly supportive, but this will vary in its degree of support according to the stage (i.e., temporal development) of a disease. This information requires us to use the temporal knowledge about the duration of the symptom as described in the previous section. A short duration symptom in relation to the expected range of duration possibilities should be interpreted as low support for a disease. Conversely, if a symptom duration is a good match to the possibility set, then it should be interpreted as high support for a disease. A simple method to compute the level of support based on this idea is to use the area of the temporal possibility set covered by the duration of the symptom. Suppose the $i$th symptom was observed in time interval $T_i$ and is causally related to the disease $D_j$. We now use a bias-corrected $T_i$ in the appropriate temporal fuzzy set in discrete form, $Q_{ij}(t)$, for calculating the support for the particular disease $D_j$. $\text{Support}^{+}_{ij}$ is computed as the area under the temporal fuzzy set using (4), where $T_i$ is the interval between its lower and upper bounds, and “i” is a symptom index:

$$\text{Support}^{+}_{ij} = \sum_{t \in T_i} Q_{ij}(t) \tag{4}$$

In the absence of symptom information, it may be computed using (5), where $T_k$ is the interval from its lower bound up to its upper bound, and “k” is the symptom index:

$$\text{Support}^{-}_{kj} = \sum_{t \in T_k} Q_{kj}(t) \tag{5}$$

The support index for the $j$th disease, $\text{Support}_{j}$, is then computed as the weighted aggregate of the evidence over all causally relevant symptoms, as given in (6). We use the ideas of Kosko [13] to combine the knowledge.
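The incremental accumulation of evidence described in this section can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: evidence is taken as the plain sum of membership grades over an interval (the area under the discrete set, with no bias correction or normalization), unobserved symptoms are scored over the current stage estimate, the CR weights are hypothetical numbers, and the final combination into a single support index is deliberately left out.

```python
def area(q, first_day, last_day):
    """Area under discrete fuzzy set q over first_day..last_day inclusive."""
    return sum(q[first_day:last_day + 1])


def mean(values):
    """Average of a list, taken as 0.0 when the list is empty."""
    return sum(values) / len(values) if values else 0.0


class IncrementalEvidence:
    """Average evidence for and against one disease, updated per observation."""

    def __init__(self, temporal_sets, cr_weights, stage):
        self.temporal_sets = temporal_sets  # symptom -> discrete fuzzy set
        self.cr_weights = cr_weights        # symptom -> weight (hypothetical)
        self.stage = stage                  # (lower, upper) days since onset
        self.observed = {}                  # symptom -> (first_day, last_day)

    def observe(self, symptom, first_day, last_day):
        """Record that a symptom was seen over the given days since onset."""
        self.observed[symptom] = (first_day, last_day)

    def averages(self):
        """Return (average positive evidence, average weighted negative evidence)."""
        positive = [area(self.temporal_sets[s], a, b)
                    for s, (a, b) in self.observed.items()]
        negative = [self.cr_weights[s] * area(q, *self.stage)
                    for s, q in self.temporal_sets.items()
                    if s not in self.observed]
        return mean(positive), mean(negative)


# Hypothetical data: every expected symptom starts out as "not observed".
sets = {"fever": [0.0, 0.5, 1.0, 1.0, 0.5, 0.0],
        "cough": [0.0, 0.0, 0.5, 1.0, 0.5, 0.0]}
weights = {"fever": 1.0, "cough": 0.5}
evidence = IncrementalEvidence(sets, weights, stage=(2, 3))
before = evidence.averages()
evidence.observe("fever", 1, 2)
after = evidence.averages()
```

Before any observation all evidence counts against the disease; once fever is recorded, it moves from the negative to the positive side, which is the update-on-each-observation behaviour of the closed symptom set.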

The average positive evidence is related to the conditional probability for the disease given the observation of the set of symptoms. The average negative evidence is related to the corresponding conditional probability given the absence of the set of symptoms. The two quantities do not obey the laws of probability, however, in that they do not sum to unity.

We use the idea of an “incremental SAM” where computation of the support index is performed every time new information on symptoms is obtained. The idea is that there is a closed set of symptoms expected for a particular disease. These are assumed to be not present unless information from observations is obtained to the contrary. As information is acquired from observations for each symptom, the support index is updated. We now consider how we should interpret the various forms of uncertainty inherent in this computation. First, we consider the interpretation of CR.

VIII. CR INTERPRETATION

By considering how the term is used in this domain, we can define the CR of the symptom to the disease as a linguistic term from the set {never, sometimes, often, normally, always}.


The supporting evidence is now computed using (7), where the interval runs from null to null, and “i” is a symptom index

(7)

We translate the linguistic term into a numeric weight for symptoms that have not been observed (see Fig. 1). We use an “S” curve to map from the linguistic CR term to a numeric weighting so that the absence of normally expected symptoms has maximum effect. The negative evidence may now be computed for a disease using (8), where the interval runs from null up to null, and “k” is the symptom index

(8)

Using the idea of weighted combinations of fuzzy information from Kosko [8], computation of the support index using crisp weights is

Support
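As a minimal sketch of how a linguistic CR term might be mapped to a crisp weight and used in a running support index, the following Python fragment may help. It is an illustration under stated assumptions, not the authors' implementation: the positions assigned to the linguistic terms, the particular S-function, the names `CR_POSITION`, `s_curve`, `cr_weight`, and `support_index`, and the simple "positive minus negative" combination are all hypothetical, and the temporal possibility sets are ignored.

```python
# Hypothetical positions of the linguistic CR terms on a [0, 1] axis.
CR_POSITION = {
    "never": 0.0,
    "sometimes": 0.25,
    "often": 0.5,
    "normally": 0.75,
    "always": 1.0,
}


def s_curve(x, a=0.0, c=0.75):
    """Piecewise-quadratic S-function rising from 0 at a to 1 at c.

    With c = 0.75 (an assumption), "normally" and "always" both saturate
    at weight 1, so their absence has maximum effect.
    """
    b = (a + c) / 2.0
    if x <= a:
        return 0.0
    if x >= c:
        return 1.0
    if x <= b:
        return 2.0 * ((x - a) / (c - a)) ** 2
    return 1.0 - 2.0 * ((x - c) / (c - a)) ** 2


def cr_weight(term):
    """Crisp weight applied when a symptom with this CR term is NOT observed."""
    return s_curve(CR_POSITION[term])


def support_index(expected, observed):
    """Assumed additive index: observed symptoms count fully, absent ones
    subtract their CR weight.

    expected: dict mapping symptom name -> linguistic CR term
    observed: set of symptom names observed so far
    """
    positive = sum(1.0 for s, t in expected.items()
                   if s in observed and t != "never")
    negative = sum(cr_weight(t) for s, t in expected.items()
                   if s not in observed)
    return positive - negative


# Incremental use: every expected symptom starts as "not present",
# and the index is recomputed each time an observation arrives.
expected = {"fever": "always", "headache": "normally",
            "cough": "often", "rash": "sometimes"}
observed = set()
history = [support_index(expected, observed)]
for symptom in ["fever", "headache", "cough"]:
    observed.add(symptom)
    history.append(support_index(expected, observed))
```

The list `history` then holds one index value per update, which increases as expected symptoms are confirmed.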

IX. INCORPORATING FUZZY SYMPTOMS

Within definition 1, there are a number of possible ways to introduce the idea of fuzzy observed symptoms into our paper. The simplest method is to allow the symptoms to be defined at a higher linguistic granularity, such as “(high) fever occurred during (approximate) time interval (3, 0).” This must then be treated as a different symptom from “(mild) fever occurred during (exact) time interval (3, 0).” If we use this approach, then for every disease, and for all variations of fever and duration, there must be a set of temporal possibilities [Q(t)] and CRs generated, as in Fig. 1. This is a major knowledge acquisition and user interface problem, and we propose an alternative approach.

We propose introducing the “strength” of the ith observed symptom. The strength of a particular observed manifestation is similar to the manifestation “intensity” defined by Wainer and Sandri [7] and could be used in a similar way, i.e., we can include intensity information to match against strength values. We allow our strength values to be in the range [0, 1] and use particular strength values of particular observed symptoms to modify the computation of

(9)

In foregoing work, the CR was linguistically defined but modeled with singular numeric weights, which implies an inappropriate level of accuracy in the disease database specification. It is more sensible to assume a degree of uncertainty in the encoding of the CR. We capture this using a set whose members are defined by Gaussian fuzzy sets. However, when a symptom is observed, it is either associated with a disease (normally or often or sometimes) or not (never). Therefore, there is no uncertainty in how it should contribute to the evidence in favor of that disease. This means that the numeric interpretation of CR used in the previous sections is unchanged, and (7) holds. However, the relevance of an expected but as-yet-unobserved symptom is important, and the variation in the relevance needs to be taken into account. In our computation of the negative evidence using Gaussian fuzzy sets, we normalize the result by dividing by the sum of the means less the standard deviation (sd) of the fuzzy sets (mean(fuzzy CR) - sd(fuzzy CR)). This is a normalization approximation, which is reasonable for Gaussian sets. We, therefore, modify the equation for the counter evidence index

(10)

In (10), note that the sum over the time possibility sets Q over the appropriate range results in a scalar value. Hence, we are scaling the fuzzy set by multiplication. This operation is defined later in the paper. The computation of the support index is then unchanged from (9), except that

(11)

The computation of the index of support then continues as per (9). The strength of an observed symptom depends on a number of factors. In previous work [9], we defined strength as a crisp value determined by an assumed relationship with the crisp value of observability (O) of a symptom. Thus, weakly observable symptoms have a low strength. This assumption is not warranted since precision is not usual in patient recollections. Observability is assumed to be dependent on the intensity (I) and the certainty of the duration (D) of the observed symptom. We now assume that both of these attributes are fuzzy in nature since they are usually reported from the recollections of patients. If crisp information from test results were known, it would have a maximum strength value. We may define the strength of a symptom directly in terms of a first-order (type-1) fuzzy set (Gaussian) whose membership values are determined by the second-order (type-2) fuzzy set of observability. Observability is now represented as a fuzzy set that is determined by two related second-order (type-2) fuzzy sets (Gaussian) associated with intensity and duration certainty. Since we wish to represent a second order of uncertainty in the symptoms using type-2 fuzzy sets, we briefly review type-2 fuzzy sets in the following section.

X. TYPE-2 FUZZY SETS

Zadeh [32] introduced type-2 fuzzy sets. They can be considered as a fuzzification of a type-1 fuzzy set. Any fuzzy logic application using type-1 fuzzy sets requires the developer to describe the membership function by numbers, in the discrete case,
or by a function, where the fuzzy set has a continuous membership function. So, any system employing fuzzy sets represents the fuzziness of the particular problem using a “nonfuzzy” (or crisp) representation. Dubois and Prade [33, p. 256], when discussing the problem of determining membership functions of type-1 fuzzy sets, say, “To take into account the imprecision of membership functions, we may think of using type-2 fuzzy sets.” As Klir and Folger [34] also point out, “it may seem problematical, if not paradoxical, that a representation of fuzziness is made using membership grades that are themselves precise real numbers.” This paradox leads us to consider the role of type-2 fuzzy sets as an alternative to the type-1 paradigm. So, type-2 fuzzy sets appear to be potentially very useful and important since the need for a “crisp” measure of fuzziness (by a number in [0, 1]) is removed, and linguistic grades are allowed. Yager [35] summarizes this in the following way: “The usefulness of fuzzy subsets of type II is that it enables us to extend membership grades to linguistic values.” It is this ability to represent linguistic grades using type-2 fuzzy sets that we exploit in this paper.

Other researchers have investigated theoretical properties of type-2 fuzzy sets [36]–[40] and used them in applications [35], [40], [42]. Mendel and John [39] have established a full set of terms for communicating about type-2 fuzzy sets. Using the definitions there, we define a type-2 fuzzy set in the following way.

Definition 1: A type-2 fuzzy set $\tilde{A}$ is characterized by a type-2 membership function $\mu_{\tilde{A}}(x, u)$, where $x \in X$ and $u \in J_x \subseteq [0, 1]$, i.e.,

$$\tilde{A} = \{((x, u), \mu_{\tilde{A}}(x, u)) \mid \forall x \in X,\ \forall u \in J_x \subseteq [0, 1]\}$$

in which $0 \le \mu_{\tilde{A}}(x, u) \le 1$. The $J_x$ are known as the primary memberships. So, a type-2 fuzzy set has membership grades that are type-1 fuzzy sets, which are referred to as secondary membership functions.

Definition 2: At each value of $x$ (say, $x = x'$), $\mu_{\tilde{A}}(x', u)$ is a secondary membership function of $\mu_{\tilde{A}}(x, u)$. A secondary membership function is also called a vertical slice (of the type-2 membership function). The type-2 fuzzy set is the union of all the secondary membership functions.

The join and meet of two type-2 fuzzy sets are defined in the following way. Suppose, for a given $x \in X$, there are two type-2 fuzzy sets $\tilde{A}$ and $\tilde{B}$ in $X$, and $\mu_{\tilde{A}}(x)$ and $\mu_{\tilde{B}}(x)$ are two secondary membership functions of $\tilde{A}$ and $\tilde{B}$, respectively, represented as $\mu_{\tilde{A}}(x) = \int_u f(u)/u$ and $\mu_{\tilde{B}}(x) = \int_w g(w)/w$.

Definition 3: The union or join of two type-2 fuzzy sets is given by

$$\mu_{\tilde{A} \cup \tilde{B}}(x) = \mu_{\tilde{A}}(x) \sqcup \mu_{\tilde{B}}(x) = \int_u \int_w \bigl(f(u) \star g(w)\bigr) / (u \vee w)$$

Definition 4: The intersection, or meet, of two type-2 fuzzy sets is given by

$$\mu_{\tilde{A} \cap \tilde{B}}(x) = \mu_{\tilde{A}}(x) \sqcap \mu_{\tilde{B}}(x) = \int_u \int_w \bigl(f(u) \star g(w)\bigr) / (u \star w)$$

where $\star$ is an appropriate t-norm (e.g., min), and $\vee$ is a t-conorm (e.g., max). Thus, the join and meet allow us to combine type-2 fuzzy sets for the situation where it is required to “OR” or “AND” two type-2 fuzzy sets. Join and meet are the building blocks for type-2 fuzzy relations and type-2 fuzzy inferencing with type-2 if-then rules. So, then, we are able to model linguistic labels using type-2 fuzzy sets and combine them using join and meet. The next section explores how we have used type-2 fuzzy sets to aid decision support in clinical diagnosis.

XI. SYMPTOM FUZZINESS USING TYPE-2 SETS

In our use of type-2 sets for symptom duration certainty and intensity, we need to combine these sets to estimate the second-order set called observability. We propose using the type-2 fuzzy relation “meet” of the form shown below, where symptom intensity I and symptom duration certainty D belong to a set that contains type-2 Gaussian fuzzy sets. Observability is computed as follows:

$$\text{Observability} = \text{Type-2 meet}(I, D)$$

The computation of Observability results in a Gaussian set that would normally be “joined” in a type-2 manner with all the other type-2 sets contributing to the type-1 set for observability. However, in this case, there are no other contributory sets to be “joined” with to determine the single range of uncertainty associated with a particular observed value of a particular symptom. Hence, we make a direct link from the “meet” of the sets to the observability of a symptom so that

$$\text{Symptom Strength} = \text{Type-2 meet}(I, D)$$

Thus, for a given symptom, we require the user to provide an estimate of the intensity certainty and duration certainty as well as the duration. For example, the natural English statement “I had a high fever about three days ago that lasted about a day” more formally becomes (high intensity) fever occurred during (low certainty) interval (3, 2). This is translated as fever occurred during interval (3, 2) with strength given by the type-2 meet of (high) intensity and (low) duration certainty. The computational method (which indicates how the supporting evidence for disease j of m observed symptoms is accumulated into the index) is modified, as shown in (12), so that we use the type-2 meet relation instead of the crisp strength value to take the intensity I and duration certainty D of observed symptom i into account

(12)
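The meet and join used above can be illustrated on discretized secondary membership functions. The sketch below is an assumption-laden illustration rather than the authors' code: it represents each secondary membership function as a Python dict from primary membership value to secondary grade, uses the minimum t-norm, keeps the largest grade when several pairs map to the same primary value, and the sets `high_intensity` and `low_certainty` are made-up stand-ins for the Gaussian sets used in the paper.

```python
from itertools import product


def _combine(f, g, primary_op):
    """Combine two discretized secondary membership functions.

    f, g: dicts mapping primary membership u (or w) -> secondary grade.
    primary_op: min for the meet, max for the join.
    Secondary grades are combined with the minimum t-norm; when several
    (u, w) pairs give the same primary value, the largest grade is kept.
    """
    result = {}
    for (u, fu), (w, gw) in product(f.items(), g.items()):
        key = primary_op(u, w)
        grade = min(fu, gw)
        result[key] = max(result.get(key, 0.0), grade)
    return result


def meet(f, g):
    """Type-2 meet (AND) at one point of the domain."""
    return _combine(f, g, min)


def join(f, g):
    """Type-2 join (OR) at one point of the domain."""
    return _combine(f, g, max)


# Hypothetical discretized secondary membership functions for the
# "(high intensity) fever ... (low certainty) interval" example.
high_intensity = {0.6: 0.5, 0.8: 1.0, 1.0: 0.5}
low_certainty = {0.2: 0.7, 0.4: 1.0, 0.6: 0.7}

strength = meet(high_intensity, low_certainty)
# strength == {0.2: 0.7, 0.4: 1.0, 0.6: 0.7}
```

With these made-up inputs the meet is dominated by the low duration certainty, which matches the intuition that a vaguely recalled symptom should contribute weakly even when it was intense.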


Note that we could also use the type-2 “join” operator instead of sigma as the aggregation operator in computing F as the sets being aggregated are type-2 fuzzy sets.

TABLE II SYMPTOM DESCRIPTIONS AND ORDERING

XII. FUZZY SET OPERATIONS

In (13) and (14), we are required to define what is meant by the product of a fuzzy set and the scalar quantity computed by the integral of the time set. We define P, the result of the product of a scalar S and the fuzzy set T over the range of X, as

(13)

where

(14)

Similarly, (13) and (14) specify the computation as an aggregation [15] S of scaled fuzzy sets P. The set aggregation operator Op is defined as

(15)
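The scaling and aggregation operations can be sketched as follows. Because the bodies of (13)-(15) are not reproduced in this copy, the fragment assumes the simplest reading of the surrounding text: the product is a pointwise multiplication of each membership value by the scalar, and the aggregation is a pointwise sum. The function names `scale` and `aggregate` and the sample sets are hypothetical.

```python
def scale(s, t):
    """Product of scalar s and a discretized fuzzy set t (dict x -> membership)."""
    return {x: s * mu for x, mu in t.items()}


def aggregate(scaled_sets):
    """Pointwise sum of scaled sets; values may exceed 1 (the "mass" view)."""
    total = {}
    for p in scaled_sets:
        for x, mu in p.items():
            total[x] = total.get(x, 0.0) + mu
    return total


# Two made-up fuzzy sets over the same small domain, with made-up scalars.
t_first = {0.0: 0.0, 0.5: 1.0, 1.0: 0.0}
t_second = {0.5: 0.5, 1.0: 1.0}

mass = aggregate([scale(2.0, t_first), scale(1.0, t_second)])
# mass == {0.0: 0.0, 0.5: 2.5, 1.0: 1.0}
```

The value 2.5 at x = 0.5 shows why the aggregated result is no longer a fuzzy set in the strict sense.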

This is a departure from the SAM model since the use of (15) results in sets with accumulated membership values greater than 1. The sets S are strictly not fuzzy sets but are related to the concept of “mass,” wherein each set is considered as being composed of distinct entities that are not unified on aggregation. The intuitive reason for the “mass” assumption is that we are balancing evidence accumulated from symptom information that is both for and against a disease. After aggregation over all the symptoms for a disease, the set S can then be defuzzified in one operation to compute the revised support index.

XIII. RESULTS

A series of simple tests was made with a MATLAB prototype, which implemented the ideas presented above for the disease of influenza since the disease histories are well known and readily available in [10]. Advice from a practicing general medical practitioner was sought in interpreting the data. This resulted in about 60 different symptoms and about 90 different symptom descriptions in a database, where each symptom is related to a temporal fuzzy set description for each disease. We wished to show how the support index changes as symptom information is collected in the early stages (i.e., the first few days since onset) of a disease and how the symptom strength affects the index. We represented the secondary grades of the type-2 fuzzy sets using Gaussian functions in this paper. In all of the tests, we entered symptom observations in the same order and with the same temporal details (all symptoms started three days ago and are still present). The symptoms and ordering were based on what would be ideally expected for the disease of influenza, as shown in Table II for that disease.

It is more realistic to assume that each symptom will vary in its observability. Fig. 2 shows a sample of this case, where different values for each symptom’s duration certainty and intensity are taken. This shows that tuning of the fuzzy set parameters is nontrivial, although in a differential diagnosis, this problem is less severe. The results here are similar but not identical to those presented in [2]–[4], which used crisp rather than fuzzy estimates of symptom strength. The experimental conditions here are similar to those in [3], apart from the use of crisp symptom value patterns. The main difference in results arises from using type-2 rather than crisp approaches. The type-2 context requires further tuning of the type-2 sets and evaluation in a diagnostic setting to provide sensible guidance on diagnosis using the index. At present, we would expect the index to be of use in the differential diagnosis situation, where some tolerance of index normalization error can be acceptable.

Fig. 3 shows how the support index varies for two diseases as the symptom information is collected, where each symptom has different observabilities, as given in Fig. 2. This shows that the support index is able to differentiate between the diseases, although the differences in support are small in some cases. This suggests that the parameters should be tuned in the light of experience to reflect the actual difference and that further sources of uncertainty should be investigated to provide an estimate of the range of tolerance.

XIV. CONCLUSION AND FUTURE WORK

Although this work is promising, there are additional sources of uncertainty to incorporate into the method we propose. One such source occurs in the knowledge base provided by clinicians, in the fuzzy sets that describe the temporal possibility sets of each symptom for every disease. We would like to explore the further use of type-2 fuzzy sets to deal with these second-order uncertainties by placing a “footprint of uncertainty” [39], [40] on these sets (see Fig. 4) and to process them to take vagueness into account.
The result of this work will be an estimate of support for each disease that includes upper and lower bounds. Given the solutions proposed for handling various sources of uncertainty, it is clear that we need to develop a theoretically sound method for combining these and presenting a comprehensible index of support for the user.

Fig. 3. Support for influenza and scarlet fever for varying symptoms (see Fig. 2).

Fig. 4. “Footprint of uncertainty” in the knowledge base.

ACKNOWLEDGMENT

The authors would like to thank Mr. T. Goeckler for developing prototype temporal fuzzy reasoning programs and Dr. A. Sharpe for valuable medical information in general practice.

REFERENCES

[1] F. Steimann and K.-P. Adlassnig. (2000) Fuzzy Medical Diagnosis. [Online]. Available: http://www.citeseer.nj.nec.com/160 037.html.
[2] K.-P. Adlassnig, “A fuzzy logical model of computer assisted medical diagnosis,” Meth. Inf. Med., vol. 19, pp. 141–148, 1980.
[3] N. Belacel, N. Vincke, J. M. Scheiff, and M. R. Boulassel, “Acute leukemia diagnosis aid using multicriteria fuzzy assignment methodology,” Comput. Meth. Prog. Med., vol. 64, no. 2, pp. 145–151, 2001.
[4] D. Kopecky, M. Hayde, A. R. Prusa, and K. P. Adlassnig, “Knowledge-based interpretation of toxoplasmosis serology results including fuzzy temporal concepts—The ToxoNet system,” Medinfo, vol. 10, no. 1, pp. 484–488, 2001.
[5] E. Santos, Jr., “On modeling time and uncertainty for diagnosis through linear constraint satisfaction,” in Proc. Int. Congr. Computer Systems Applied Mathematics Workshop Constraint Processing, St. Petersburg, Russia, 1993, pp. 93–106.
[6] L. A. Zadeh, “Fuzzy logic = computing with words,” IEEE Trans. Fuzzy Syst., vol. 4, no. 2, pp. 103–111, Apr. 1996.
[7] J. Wainer and S. Sandri, “Fuzzy temporal/categorical information in diagnosis,” J. Intell. Inf. Syst., vol. 13, no. 1–2, pp. 9–26, 1996.
[8] P. R. Innocent and R. I. John, “A fuzzy symptoms and a decision support index for the early diagnosis of confusable diseases,” in Proc. RASC Conf., Leicester, U.K., Jul. 2000.
[9] P. R. Innocent and R. I. John, “A lightweight fuzzy process to support early diagnosis of confusable diseases using causation and time relationships,” in Proc. Conf. FUZZ-IEEE2000, vol. 2, 2000, pp. 516–521.
[10] P. R. Innocent, “Clinical diagnosis using FCM and fuzzy temporal reasoning,” in Proc. Conf. SCB99, Rochester, NY, 1999, pp. 490–496.
[11] L. A. Zadeh, “Fuzzy logic, neural networks and soft computing,” One Page Course Announcement of CS-294-4, Spring 1993, Nov. 1992.
[12] B. Kosko, “Fuzzy cognitive maps,” Int. J. Man-Machine Stud., vol. 24, pp. 65–75, 1986.
[13] B. Kosko, “Fuzzy knowledge combination,” Int. J. Intell. Syst., vol. 1, no. 4, pp. 293–320, 1986.
[14] K. Sadegh-Zadeh, “Fundamentals of clinical methodology: Aetiology,” Artif. Intell. Med., vol. 12, pp. 227–270, 1998.
[15] W. R. Taber and M. A. Siegel, “Estimation of expert credibility weights using FCM,” in Proc. IEEE 1st Int. Conf. Neural Networks, vol. 2, San Diego, CA, 1987, pp. 319–326.
[16] S. Zahan, C. Michael, and S. Nikolakeas, “A fuzzy hierarchical approach to medical diagnosis,” in Proc. FUZZ-IEEE, 1997, pp. 319–324.
[17] R. Taber, “Knowledge processing with FCM,” Expert Syst. Appl., vol. 2, pp. 83–87, 1991.


[18] R. A. Frost, “Theories for dealing with uncertainty,” in Introduction to Knowledge Based Systems. New York: Harper-Collins, 1996, ch. 7, p. 399.
[19] T. J. Ross, Fuzzy Logic with Engineering Applications. New York: McGraw-Hill, 1995.
[20] E. T. Keravnou, “Temporal reasoning in medicine,” Artif. Intell. Med., vol. 8, pp. 187–191, 1996.
[21] P. Jackson, Introduction to Expert Systems. Reading, MA: Addison-Wesley, 1990.
[22] K. S. Park and S. H. Kim, “Fuzzy cognitive maps considering time relationships,” Int. J. Human Comput. Stud., vol. 42, pp. 157–168, 1995.
[23] A. K. Tsadiras and K. G. Margaritis, “Cognitive mapping and certainty neuron FCM,” Inf. Sci., vol. 101, pp. 109–130, 1997.
[24] H. Thiele and S. Kalenka, “On fuzzy temporal logic,” in Proc. 2nd Int. Conf. Fuzzy Systems, vol. II, San Francisco, CA, 1993, pp. 1027–1032.
[25] B. Kovalerchuk, E. Triantaphyllou, J. F. Ruiz, and J. Clayton, “Fuzzy logic in computer aided breast cancer diagnosis: Analysis of lobulation,” Artif. Intell. Med., vol. 11, pp. 75–85, 1997.
[26] F. Steimann, “Fuzzy set theory in medicine,” Artif. Intell. Med., vol. 11, no. 1–7, 1997.
[27] A. O. Esogbue and R. C. Elder, “Fuzzy sets and the modeling of physician decision processes, part (II): Fuzzy diagnosis decision models,” Fuzzy Sets Syst., vol. 3, no. 1, pp. 1–9, 1980.
[28] J. Gamper and W. Nejdl, “Abstract temporal diagnosis in medical domains,” Artif. Intell. Med., vol. 10, pp. 209–234, 1997.
[29] E. J. Condon, Ed., Virtues Family Physician. London, U.K.: Virtue and Company, London and Coulsdon, 1977, vol. 2.
[30] K.-P. Adlassnig, “A fuzzy logical model of computer assisted medical diagnosis,” Meth. Inf. Med., vol. 19, pp. 141–148, 1980.
[31] B. Kosko, Fuzzy Engineering. Englewood Cliffs, NJ: Prentice-Hall, 1997.
[32] L. A. Zadeh, “The concept of a linguistic variable and its application to approximate reasoning—I,” Inf. Sci., vol. 8, pp. 199–249, 1975.
[33] D. Dubois and H. Prade, Fuzzy Sets and Systems: Theory and Applications. New York: Academic, 1980.
[34] G. J. Klir and T. A. Folger, Fuzzy Sets, Uncertainty and Information. Englewood Cliffs, NJ: Prentice-Hall, 1988.
[35] R. R. Yager, “Fuzzy subsets of type II in decisions,” J. Cybern., vol. 10, pp. 137–159, 1980.
[36] M. Mizumoto and K. Tanaka, “Some properties of fuzzy sets of type-2,” Inf. Control, vol. 31, pp. 312–340, 1976.
[37] N. N. Karnik and J. M. Mendel, “Applications of type-2 fuzzy logic systems: Handling the uncertainty associated with surveys,” in Proc. 8th Int. Conf. Fuzzy Systems FUZZ-IEEE’99, 1999, pp. 1546–1551.
[38] N. N. Karnik, J. M. Mendel, and Q. L. Liang, “Type-2 fuzzy logic systems,” IEEE Trans. Fuzzy Syst., vol. 7, no. 6, pp. 643–658, Dec. 1999.
[39] J. M. Mendel and R. I. John, “Type-2 fuzzy sets made simple,” IEEE Trans. Fuzzy Syst., vol. 10, no. 2, pp. 117–127, Apr. 2002.
[40] N. N. Karnik and J. M. Mendel, “Introduction to type-2 fuzzy logic systems,” in Proc. 7th Int. Conf. Fuzzy Systems FUZZ-IEEE’98, 1998, pp. 915–920.
[41] J. M. Mendel, Uncertain Rule Based Fuzzy Logic Systems. Englewood Cliffs, NJ: Prentice-Hall, 2001.
[42] R. I. John and C. Czarnecki, “An adaptive type-2 fuzzy system for learning linguistic membership grades,” in Proc. 8th Int. Conf. Fuzzy Systems FUZZ-IEEE’99, 1999, pp. 1552–1556.

R. I. John is a Reader in Computer Science with De Montfort University, Leicester, U.K. He is the Director of the Centre for Computational Intelligence at De Montfort. His primary research interests lie in the role of fuzzy logic in modeling uncertainty in medical applications. He is particularly concerned with type-2 fuzzy logic in that context and has published extensively on type-2 fuzzy logic theory and applications.

P. R. Innocent is a Principal Lecturer in Computer Science with De Montfort University, Leicester, U.K. His primary research interests lie in the role of fuzzy logic, neural networks, and neurofuzzy systems in modeling uncertainty and managing decisions in medical applications. He has published extensively on medical decision making.