RedSoar-A System for Red Blood Cell Antibody Identification

Kathy A. Johnson, Todd R. Johnson, Jack W. Smith, Jr., Matt DeJongh, Olivier Fischer, Nasir K. Amra, and Ayse Bayazitoglu
Division of Medical Informatics, The Ohio State University
571 Health Sciences Library, 376 W. 10th Ave., Columbus, Ohio 43210
Email: [email protected], [email protected], [email protected]

* This research is supported by National Heart Lung and Blood Institute grant HL-38776, and National Library of Medicine grant LM-04298.

0195-4210/91/$5.00 © 1992 AMIA, Inc.

Abstract

The primary goal of our research is to build an intelligent tutoring system for red blood cell antibody identification. In this paper, we describe the basis for a tutoring system, an expert system called RedSoar. RedSoar is built from two task-specific architectures that were designed for building flexible systems (i.e., systems that can use a variety of problem-solving strategies, and to which knowledge can easily be added). RedSoar solves the antibody identification task correctly 81% of the time and new knowledge can be added in a straightforward manner. The system is capable of exhibiting human-like behavior, which we believe is a necessary condition for building a successful tutoring system.

Introduction

The domain of red blood cell antibody identification has been a focus of work at The Ohio State University for quite some time [9]. The primary goal of our current research is to build an intelligent tutoring system for the domain. To accomplish this goal, we need an underlying expert system which is capable of expressing the various behaviors observed in human experts solving the red blood cell antibody identification problem. Such a system can be built if a framework exists to support a flexible implementation to which additional knowledge and strategies can be added. Our work on task specific architectures supports just this type of system building and it serves as the basis for our current research. This paper describes RedSoar, a flexible expert system for doing red blood cell antibody identification. We give a brief introduction to the domain followed by a description of our methodology for designing flexible systems. The RedSoar system is then described and evaluated.

Red Blood Cell Antibody Identification

When a patient requires a blood transfusion, their blood must be typed so that compatible units of donor blood can be selected. One type of incompatibility arises when the patient's blood contains antibodies to antigens on the donor's red cells. This results in a transfusion reaction. To identify the antibodies in a patient's blood, red cells of known antigenic make-up are mixed with the patient's serum. If the patient has an antibody to an antigen on the red cells, agglutination occurs (see Figure 1 and Figure 2). For example, if the red cells are known to have the K, c, and M antigens and the patient's serum reacts with the red cells, then the patient's blood must have an antibody to at least one of the antigens on the red cells.

Figure 1: Antibodies (AB) attach to their corresponding antigens on red cells (RC).

Figure 2: The antigenic reaction results in agglutination (clumping) of red cells.

To identify the antibodies, several independent tests are run using red cells of different antigenic make-up. For example, the antigram in Figure 3 lists 8 different red cells. Samples of each red cell are usually referred to in the singular, as in "The patient's blood reacts with red cell 1." The labels at the top of the antigram list red cell antigens grouped according to family. A "+" in the matrix means that the red cell has the antigen; a "0" means that the red cell lacks the antigen.

Certain test conditions can affect the strength of some antigenic reactions. For example, antigens that do not normally react at room temperature might react when heated to 37°C. Also, some chemicals can enhance or destroy an antigen's ability to bind to its antibody. Initial testing is usually done under three conditions or phases known as albumin-is, albumin-37, and Coombs. The reaction matrix in Figure 3 shows these test conditions as well as two other phases. This matrix records the reaction strength for each phase and red cell being tested. Reaction strength is recorded using a scale of 0 to 4+, with 4+ being the strongest. Based on the reactions in this matrix and the antigens on the cells, the blood bank technologist must attempt to identify the antibodies present in the patient's blood. If the information is insufficient to positively identify the antibodies, additional tests using different sets of red cells must be ordered.

Figure 3: Data from an antibody identification case (an observed-reactions matrix by test phase and an antigram of red cells by antigen).

Motivation

We would like to build a system that will interact with a student solving the red blood cell antibody identification problem. Expert human tutors have two kinds of knowledge: how to solve the problem and how to teach a student to solve the problem. The first kind of knowledge can be encapsulated in an expert system for solving the red blood cell antibody identification problem. The second kind of knowledge could then be integrated with the expert system, resulting in an expert tutor. Underlying this methodology is the belief that the expert system should solve the problem in as human-like a manner as possible. Analysis of human protocols shows that experts employ a wide variety of problem-solving methods; thus the expert system should also be able to take advantage of different methods and knowledge.

A further constraint on the design of an expert tutor is that a tutorial situation will be unpredictable. It is necessary for the tutoring portion of the system to handle all the ways a student might solve the problem, whether they are right or not. If the student's method is correct but merely differs in step ordering, the system should accept the answer; however, if the student is wrong, the system should respond appropriately. This gives further indication that we need to build a robust, flexible system to which we can easily add alternate strategies and knowledge.

In the past we have built systems that followed a fixed strategy. Such systems are often brittle. This was true of our previous systems: they worked well for the particular problem situation they were designed to handle, but failed in slightly unusual situations. Much of the brittleness arises from the fact that a single fixed method can only respond appropriately in a limited range of situations and can only make use of a subset of the potentially relevant knowledge. It has become clear that we need to develop a framework that allows the opportunistic construction of a problem-solving method based on the particular situation and the knowledge available [2].

Building Flexible Systems

In order to build a flexible system, we must first have a flexible representation of the problem-solving methods involved. The standard way to specify a problem-solving method is by writing an algorithm that lists each operator in the order in which the operators are to be done, along with conditionals and loops. To implement the method, a procedural language such as LISP can be used to encode the algorithm. The operators are either built-in LISP functions or subprocedures. In a procedural description of a method, the order of the operators must be completely specified. In other words, the procedure must encode enough knowledge so that the next operator can always be determined. Furthermore, a procedural language makes it difficult to encode operators that generate control knowledge.

Since we desire a flexible and thus opportunistic system, we need to be able to specify a set of operators without necessarily pre-specifying a complete ordering of those operators. An opportunistic system works by enumerating applicable operators for the immediate situation and then selecting one of those operators based on the current goal and situation. The system is capable of generating or using additional control knowledge. Thus, when the system must decide between several operators, it is possible for the system to engage in complex problem solving to determine which operator is best.

To achieve these results we have been using the problem-space computational model (PSCM) [7] to specify methods. In the PSCM, all problem solving is viewed as search for a goal state in a problem space. Knowledge about when operators are applicable to a state can be specified independent of knowledge about which operator to select. Operator selection knowledge, called search-control knowledge, is expressed in terms of preferences for or against applicable operators. For example, we can encode knowledge like "If operators A and B are applicable and X is true of the state, then B is better than A." If at any time during problem solving the search-control knowledge is insufficient to indicate which operator to select, a subgoal is set up to generate additional knowledge so that a single operator can be selected. This subgoal is achieved by searching another problem space. Operators can either be implemented by directly available knowledge or by using an operator-specific problem space. Implementation in a problem space is similar to using a subprocedure to implement an operator in LISP.

To adequately describe problem spaces, the knowledge content of their states must be specified. In this work we use annotated models [6] to describe and implement problem-space states. A model consists of objects with properties and relations, along with the assumption that every object in the model must represent an object in the referent (called the correspondence principle). An annotated model allows the correspondence principle to be modified by annotating an object. For instance, if an object is annotated with not, it means the object is not in the referent. Examples of other annotations are some, many, uncertain, and so on. Annotations can be task-independent, such as not, or task-specific, such as explain (an annotation used in RedSoar).

RedSoar

To solve the red blood cell antibody identification problem, a series of systems (RED1, RED2, RED3) were built at OSU using the generic task toolset [9]. As a result of the flexibility issues discussed above, a set of Task Specific Architectures (TSAs) corresponding to the generic tasks were developed in Soar, an architecture supporting the PSCM [3, 5].
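Because Soar supports the PSCM, the preference-based operator selection described above can be illustrated with a small sketch. The Python below is an illustration under stated assumptions, not Soar's or RedSoar's actual mechanism: the names `Better` and `select_operator`, and the representation of a state as a set of facts, are hypothetical. It encodes the example rule "if A and B are applicable and X is true of the state, then B is better than A" and returns no choice when the preferences fail to single out one operator, which is the point at which the PSCM would set up a subgoal.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, List, Optional, Set

# Hypothetical representation: a state is just the set of facts true in it.
State = FrozenSet[str]


@dataclass(frozen=True)
class Better:
    """Search-control rule: when `condition` holds, `winner` beats `loser`."""
    winner: str
    loser: str
    condition: Callable[[State], bool]


def select_operator(applicable: Set[str], rules: List[Better],
                    state: State) -> Optional[str]:
    """Return the single undominated operator, or None if none stands out."""
    dominated = {
        rule.loser
        for rule in rules
        if rule.winner in applicable
        and rule.loser in applicable
        and rule.condition(state)
    }
    remaining = applicable - dominated
    if len(remaining) == 1:
        return next(iter(remaining))
    # Preferences are insufficient: the PSCM would set up a subgoal here.
    return None


# "If A and B are applicable and X is true of the state, B is better than A."
rules = [Better(winner="B", loser="A", condition=lambda s: "X" in s)]

choice_with_x = select_operator({"A", "B"}, rules, frozenset({"X"}))
choice_without_x = select_operator({"A", "B"}, rules, frozenset())
```

With X in the state the rule applies and B is selected; without X neither operator is preferred, so the sketch returns `None` to stand in for the subgoal that would generate more search-control knowledge.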
We then began a re-implementation of the RED systems, RedSoar, which incorporated the original problem-solving strategy as well as variations and additional knowledge based on concurrent verbal protocols taken of experts solving the task.

Implementation

As was stated in a previous section, we view problem solving as search for a goal state in a problem space and we use an annotated model representation for the state. This leads to the following task description for red blood cell antibody identification:

Initial State: a set of reactions to be explained. Each positive reaction on the master panel is annotated with explain to indicate that it must be explained by an antibody hypothesis.

Goal State: a set of antibodies that explain the reactions. There are six conditions for this model:
1) The explanation must be complete, i.e., all reactions explained.
2) The hypotheses in the model must be at the desired level of detail, i.e., antibody specificity (antibody name) determined. This is necessary because RedSoar can form abstract hypotheses that do not indicate the antibody's specificity.
3) No part of the model can be redundant.
4) No part of the model can be inconsistent.
5) All parts must be certain.
6) All parts of the model must be processed, i.e., the implications of the object in terms of its effect on the rest of the model must have been considered.

In addition, the goal state must specify which antibodies are present, which are absent, and which ones require additional testing to positively rule out or confirm.

In RedSoar each of the criteria for an acceptable goal state corresponds to a subgoal. When one of the subgoals is not met, RedSoar proposes an operator to achieve the goal. These operators are listed in Figure 4. Note that many of the goals have several operators listed below them. This is done for one of two reasons: 1) Many of the subgoals can be achieved using different methods, in which case an operator exists for each technique; or 2) All the operators listed for the goal must be applied before the goal can be met (such is the case with processed). The operators cover, resolve-redundancy, resolve-inconsistency, determine-certainty, determine-accounts-for, mark-redundancies and mark-inconsistencies are instantiations of operators from the TSA for abduction, ABD-Soar [4]. The operator refine is from the TSA for hierarchical classification, ER-Soar [2]. Unlike earlier versions of RED, RedSoar tightly integrates abduction and classification problem solving.

Figure 4: RedSoar's Operators Organized by Goal
Complete: Make-abstract-hypotheses; Rule-out; Cover reaction
Specificity Known: Match-hypothesis-to-antigram antibody-hypothesis; Refine antibody-hypothesis dimension
Irredundant: Resolve-redundancy object1 object2...
Consistent: Resolve-inconsistency object1 object2...
Certain: Determine-certainty object
Processed: Determine-accounts-for antibody-hypothesis; Mark-redundancies object; Mark-inconsistencies object
Final Results Determined: Determine-final-results

Make-abstract-hypotheses is proposed once at the beginning of problem solving. Its goal is to form abstract hypotheses by looking only at the master panel for patterns of reactions (ignoring the antigram). These hypotheses specify what reactions they could be used to explain, but do not give a specificity. For example, for the case shown in Figure 3, RedSoar would hypothesize that two antibodies are present: one that can explain all the reactions on red cells 1, 3 and 6, and another that can explain the reaction on cell 7. Once the abstract hypotheses are made, the system can look for a single antibody specificity for the entire group of reactions, instead of looking for specificities for each separate reaction. The match-hypothesis-to-antigram operator described later in this section provides one method for finding such explanations. Experiments have shown that the abstract hypotheses can lead to more efficient problem solving and that humans use these abstractions [1, 10].

Rule-out is proposed once at the beginning of problem solving. When selected, it determines which antibodies are not likely to be present in the patient's blood. This information is recorded in the state so that it can later be used to make decisions about what antibody hypotheses should be preferred in the composite. Rule-out is usually taught to students as an initial method for interpreting the data. The basic reasoning behind rule-out is: 1) Antibodies cause cells to agglutinate if the cell has the antigen which caused the creation of the antibody. 2) If no reaction is present, then the patient's blood must not contain an antibody to any of the antigens on the non-reacting cell.

Cover is proposed for each reaction that needs to be explained, but is not yet explained by an antibody hypothesis. The goal is to add to the model an antibody hypothesis that explains the finding. RedSoar does this by determining candidate specificities and creating a new hypothesis for each specificity. Candidates are all the antibodies that: 1) have antigens on the red cell on which the reaction appears; 2) have not been previously ruled out; 3) offer to explain the reaction; and 4) are not already in the model. If more than one is added, they will be marked redundant and resolve-redundancy will decide which to keep.
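The rule-out reasoning and the candidate filter used by cover can be sketched in a few lines of Python. This is a simplified illustration, not RedSoar's implementation: the function names, the dictionary layout, and the small antigram are all hypothetical, a cell counts as non-reacting only when its recorded strength is 0, and condition 3 (the antibody must offer to explain the reaction) is left out because it depends on phase-specific knowledge the text does not spell out.

```python
from typing import Dict, Set


def rule_out(antigram: Dict[int, Set[str]],
             reactions: Dict[int, int]) -> Set[str]:
    """Antigens on any non-reacting cell: their antibodies are ruled out."""
    ruled_out: Set[str] = set()
    for cell, antigens in antigram.items():
        if reactions.get(cell, 0) == 0:
            ruled_out |= antigens
    return ruled_out


def cover_candidates(cell: int, antigram: Dict[int, Set[str]],
                     ruled_out: Set[str], in_model: Set[str]) -> Set[str]:
    """Antigens on the reacting cell, not ruled out, not already in the model."""
    return antigram[cell] - ruled_out - in_model


# Hypothetical data (not the case in Figure 3): cell number -> antigens present.
antigram = {1: {"K", "c", "M"}, 2: {"c", "M"}, 3: {"K", "E"}, 4: {"E", "M"}}
# Hypothetical reaction strengths on the 0 to 4+ scale, one value per cell.
reactions = {1: 2, 2: 0, 3: 2, 4: 0}

ruled_out = rule_out(antigram, reactions)
candidates = cover_candidates(1, antigram, ruled_out, in_model=set())
```

In this toy case cells 2 and 4 do not react, so c, M and E are ruled out and the only candidate left to cover the reaction on cell 1 is K.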
Match-hypothesis-to-antigram is proposed for each abstract hypothesis that does not have a specificity. Because abstract hypotheses indicate which reactions they explain, a scan of the antigram will reveal all possible candidate specificities. The antigram is searched for antigens that are on all the red cells containing reactions that the antibody offers to explain and not on any of the other red cells. The antibodies to such antigens are potential specificities. If no specificity can be found for a hypothesis, then that hypothesis is incorrect and RedSoar removes it from the model. If the technique does not result in a unique specificity for a hypothesis, then the candidates will be marked redundant and the decision as to which to keep will be made by resolve-redundancy.

Refine seeks to add information to an antibody hypothesis based on the master panel data. An antibody hypothesis has four dimensions: specificity, molecular-type, enzyme-effect, and coombs-only-reactor. The values of a hypothesized antibody along these dimensions can be compared with those of a typical antibody in order to determine the plausibility of the hypothesis.
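The antigram scan performed by match-hypothesis-to-antigram amounts to a set computation, sketched below in Python. The function name and the eight-cell antigram are hypothetical and are not the data of Figure 3; the sketch only shows the stated test (the antigen is on every cell the hypothesis explains and on no other cell).

```python
from typing import Dict, Set


def match_hypothesis_to_antigram(explained_cells: Set[int],
                                 antigram: Dict[int, Set[str]]) -> Set[str]:
    """Antigens on all explained cells and on none of the remaining cells."""
    if not explained_cells:
        return set()
    on_all_explained = set.intersection(
        *(antigram[cell] for cell in explained_cells)
    )
    on_other_cells: Set[str] = set()
    for cell, antigens in antigram.items():
        if cell not in explained_cells:
            on_other_cells |= antigens
    return on_all_explained - on_other_cells


# Hypothetical antigram: cell number -> antigens present on that cell.
antigram = {
    1: {"K", "c"}, 2: {"c", "E"}, 3: {"K", "E"}, 4: {"c"},
    5: {"E", "M"}, 6: {"K", "M"}, 7: {"Fya", "c"}, 8: {"M"},
}

group_specificities = match_hypothesis_to_antigram({1, 3, 6}, antigram)
single_specificities = match_hypothesis_to_antigram({7}, antigram)
no_specificity = match_hypothesis_to_antigram({1, 2}, antigram)
```

Here the hypothesis covering cells 1, 3 and 6 yields K and the one covering cell 7 yields Fya. The last call returns an empty set, the situation in which RedSoar would remove the hypothesis from the model; a result with more than one antigen would instead be handed to resolve-redundancy.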


Resolve-redundancy is proposed for each set of objects that are mutually redundant. The default method is to propose a remove operator for each redundant object. If an antibody is the only explanation for some reaction, then that antibody is marked essential and no remove operator is generated for it. If there are two essential antibodies that are also redundant, then the redundancy is considered acceptable. If more than one remove operator is proposed, then a space that implements lookahead is entered to try each one and see which remove operator is best. During lookahead, antibodies are removed until the explanation is no longer redundant, at which point the resulting state is evaluated. The remove operator that generated the best final state is marked best. If there is still a tie, additional information generated by the refine operator can be used to determine the plausibility of the antibodies (based on the conditional probability of such an antibody).

Resolve-inconsistency is proposed for each set of objects that are mutually inconsistent, i.e., that participate in an inconsistent-with relation. The default method is like that of resolve-redundancy.
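The first step of the default resolve-redundancy method, marking essential antibodies and proposing remove operators only for the rest, can be sketched as follows. This Python is a hypothetical illustration (the function names, the `explains` mapping and the sample data are assumptions), and it deliberately stops before the lookahead search and the plausibility tie-break described above.

```python
from typing import Dict, Set


def essential_antibodies(explains: Dict[str, Set[str]]) -> Set[str]:
    """Antibodies that are the only explanation for at least one reaction."""
    essential: Set[str] = set()
    all_reactions: Set[str] = set().union(*explains.values())
    for reaction in all_reactions:
        explainers = [ab for ab, found in explains.items() if reaction in found]
        if len(explainers) == 1:
            essential.add(explainers[0])
    return essential


def propose_removals(redundant: Set[str],
                     explains: Dict[str, Set[str]]) -> Set[str]:
    """One remove operator per redundant antibody that is not essential."""
    return redundant - essential_antibodies(explains)


# Hypothetical model: antibody hypothesis -> reactions it offers to explain.
explains = {
    "anti-K": {"cell1", "cell3", "cell6"},
    "anti-E": {"cell3"},
    "anti-Fya": {"cell7"},
}

essential = essential_antibodies(explains)
removals = propose_removals({"anti-K", "anti-E"}, explains)
```

In this example anti-K and anti-E are redundant because both explain the reaction on cell 3, but anti-K is essential, so only anti-E is proposed for removal and no lookahead is needed. With two or more proposals, the lookahead space would be entered to compare them.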

Determine-certainty is proposed for each object annotated uncertain. RedSoar only questions the certainty of antibody hypotheses; all other objects are assumed to be certain. RedSoar rates an antibody hypothesis as certain if it has a high plausibility. This plausibility is determined by combining antibody prevalence and conditional frequency (how frequently the antibody matches the dimensions determined by the data). If the antibody cannot be made certain, it is removed from the model.

Determine-accounts-for is proposed for each antibody hypothesis that is annotated with new and has the specificity and/or molecular type specified. The goal is to determine what reactions the hypothesis can account for. Each such reaction is indicated using explains and explained-by relations.

Mark-redundancies is proposed for each object annotated with new. The goal is to add redundant-with relations to the model to indicate any objects with which the object is redundant. Antibody hypotheses that offer to explain an identical finding (or findings) are considered redundant, as are multiple values for an object property.

Mark-inconsistencies is proposed for each object annotated with new. The goal is to add inconsistent-with relations to the model to indicate any objects with which the object is logically inconsistent.

Determine-final-results is proposed once rule-out has been applied and a best explanation has been found. Its goal is to determine which antibodies are present, which are not present, and which require additional tests to rule out or confirm. There is a domain test called rule-of-three, a breadth-of-evidence test that is commonly used by blood bank technicians. An antibody hypothesis passes the rule-of-three if there are three red cells that have the antigen and react and three red cells that don't have the antigen and don't react. Determine-final-results produces a report that classifies the antibodies as either confirmed, likely-present, likely-absent, or ruled-out, according to the following conditions:
1) Confirmed: The antibody is part of the best explanation and passes the rule-of-three.
2) Likely-present: The antibody is part of the best explanation, but does not pass the rule-of-three.
3) Likely-absent: The antibody is not part of the best explanation, but it has not been ruled out.
4) Ruled-out: The antibody has been ruled out.

Search-Control Knowledge

Whenever multiple operators are applicable to the same state, RedSoar must select which operator to apply. The knowledge about which operator to select is stated in the form of rules that prefer certain operators over others given specific conditions. Three examples are: 1) Make-abstract-hypotheses is better than rule-out and cover; 2) If reaction1 is greater than reaction2, then cover reaction1 is better than cover reaction2; and 3) Mark-redundancies object1 is indifferent to mark-redundancies object2 (i.e., pick one at random). Whenever the search-control knowledge is insufficient to allow a single operator to be chosen, a goal is set up to decide what to do next and the various options are tried.

Evaluation

There are two grounds on which we must evaluate RedSoar. The first is to determine how well it solves problems and the second is to see how well we have accomplished our goals of a robust, flexible, human-like problem solver. As for how well RedSoar solves problems, it produces an answer for every case given to it; however, it has reached the correct answer on 39 out of 48 medium to difficult cases from the blood bank at OSU (81%).
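Looking back at the report produced by determine-final-results, the rule-of-three and the four report categories can be expressed compactly. The Python below is a sketch under assumptions: the function names and the sample data are hypothetical, the set of ruled-out antigens is simply given, "three red cells" is read as "at least three", and a cell counts as reacting when its recorded strength is above 0.

```python
from typing import Dict, Set


def passes_rule_of_three(antigen: str, antigram: Dict[int, Set[str]],
                         reactions: Dict[int, int]) -> bool:
    """At least 3 reacting cells with the antigen and 3 non-reacting without."""
    positive_reacting = sum(
        1 for cell, antigens in antigram.items()
        if antigen in antigens and reactions.get(cell, 0) > 0
    )
    negative_nonreacting = sum(
        1 for cell, antigens in antigram.items()
        if antigen not in antigens and reactions.get(cell, 0) == 0
    )
    return positive_reacting >= 3 and negative_nonreacting >= 3


def classify(antigen: str, best_explanation: Set[str], ruled_out: Set[str],
             antigram: Dict[int, Set[str]], reactions: Dict[int, int]) -> str:
    """Assign one of the four report categories listed in the text."""
    if antigen in best_explanation:
        if passes_rule_of_three(antigen, antigram, reactions):
            return "confirmed"
        return "likely-present"
    if antigen in ruled_out:
        return "ruled-out"
    return "likely-absent"


# Hypothetical case data (not Figure 3).
antigram = {
    1: {"K", "c", "Jka"}, 2: {"c", "E"}, 3: {"K", "E"}, 4: {"c"},
    5: {"E", "M"}, 6: {"K", "M"}, 7: {"Fya", "Jka"}, 8: {"M"},
}
reactions = {1: 2, 2: 0, 3: 2, 4: 0, 5: 0, 6: 2, 7: 1, 8: 0}
ruled_out = {"c", "E", "M"}          # assumed output of rule-out
best_explanation = {"K", "Fya"}      # assumed best explanation

report = {
    antigen: classify(antigen, best_explanation, ruled_out, antigram, reactions)
    for antigen in ["K", "Fya", "Jka", "c"]
}
```

In this toy case K is confirmed (three reacting K-positive cells, four non-reacting K-negative cells), Fya is only likely-present because a single cell supports it, Jka is likely-absent, and c is ruled-out.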
There are several reasons why RedSoar does not always reach the correct answer: 1) RedSoar does not use patient background information; 2) The system does not order additional tests; and 3) RedSoar does not know when to allow variance in the amount of data explained (e.g., if Anti-K will explain 2+ of a 3+ reaction, it might be acceptable for Anti-K to explain the entire 3+ due to slight variations in antigen-antibody reactivity). All of these limitations fall under the category of future work; we hope to incorporate each of the three kinds of knowledge listed above.

We have just begun to test the flexibility of RedSoar. Our initial experience shows that the behavior of the system can easily be changed by adding search-control knowledge and additional domain knowledge. Furthermore, the flexibility of the re-implementation of the original method, combined with the alternate techniques determined by protocol analysis, results in RedSoar solving problems in a human-like way. These are important first steps toward our ultimate goal of a tutorial system.

Conclusion

We have succeeded in building a system that is capable of solving the red blood cell antibody identification task in a more robust and flexible manner than previous systems. There are still several kinds of knowledge to be added to make it complete, but that is a focus of our current effort. The task specific architectures we developed have proved to be useful and easy to integrate with domain knowledge. We believe that a system that is capable of human-like behavior is a necessary basis for an expert tutoring system and RedSoar is just such a system. Our next step will be to design and implement the tutorial component of the system. Only when that is done will we know if RedSoar is an adequate foundation.

Acknowledgments

We thank B. Chandrasekaran, John Josephson, and the OSU AIM group for their assistance and comments on this paper and the work it reports. We also thank the members of the Soar community for their intellectual and technical support.

References

1. Olivier Fischer, Cognitively Plausible Heuristics to Tackle the Computational Complexity of Abductive Reasoning, Ph.D. Dissertation, The Ohio State University (1991).
2. Todd R. Johnson, Generic Tasks in the Problem-Space Paradigm: Building Flexible Knowledge Systems While Using Task-Level Constraints, Ph.D. Dissertation, The Ohio State University (1991).
3. Todd R. Johnson and Jack W. Smith, Generic tasks and Soar, Working Notes of the AAAI-89 Spring Symposium (AAAI, Stanford University, 1989).
4. Todd R. Johnson and Jack W. Smith, A Framework for Opportunistic Abductive Strategies, in Proceedings of the Thirteenth Annual Conference of the Cognitive Science Society (Chicago, 1991) pp. 760-764.
5. John E. Laird, Allen Newell and Paul S. Rosenbloom, SOAR: An architecture for general intelligence, Artificial Intelligence 33, 1-64 (1987).
6. Allen Newell, Unified Theories of Cognition (Harvard University Press, Cambridge, 1990).
7. Allen Newell, Gregg Yost, John E. Laird, Paul S. Rosenbloom and Erik Altmann, Formulating the problem space computational model, in Carnegie Mellon Computer Science: A 25-Year Commemorative, R. F. Rashid, Ed. (ACM-Press: Addison-Wesley, Reading, MA, 1991).
8. W. F. Punch III, M. C. Tanner, J. R. Josephson and J. W. Smith, Peirce: A tool for experimenting with abduction, IEEE Expert 5, 34-44 (1990).
9. Jack W. Smith, John Svirbely, C. Evans, Pat Strohm, John Josephson and Michael Tanner, RED: A red-cell antibody identification expert module, Journal of Medical Systems 9, 121-138 (1985).
10. P. Smith, J. W. Smith, J. R. Svirbely, D. Krawczak, J. M. Fraser, S. Rudmann, T. E. Miller and J. Blazina, Coping with the Complexities of Multiple-Solution Problems: A Case Study, International Journal of Man-Machine Studies (1989).