Visual Interactions with Web Database Content

Xia Lin, Lewis Hassell, Il-Yeol Song
College of Information Science and Technology, Drexel University
{xlin, hassell, song}@drexel.edu

Tamas E. Doszkocs
Specialized Information Services Division, National Library of Medicine
[email protected]

ABSTRACT

In this paper, we describe an experimental web application, Visual MeSH, that supports dynamic user interaction with medical vocabulary databases and literature databases. Visual MeSH is a client-server application based on Java RMI technology. It provides multiple views of MeSH term relationships and allows the user to explore term relationships visually. Visual MeSH helps the user improve both search precision and recall by making it easy to convert the user’s natural language queries to controlled vocabulary-based queries.

Keywords

Visual access to databases, visual displays, medical databases, client-server applications, Java applications.

1. INTRODUCTION

The MEDLINE database is the National Library of Medicine’s (NLM) premier bibliographic database covering the fields of medicine, nursing, dentistry, veterinary medicine, the health care system, and the pre-clinical sciences. It consists of more than 9 million references to articles published in over 3900 biomedical journals. MEDLINE records are indexed by professional indexers using the Medical Subject Headings (MeSH) thesaurus. These MeSH terms describe the subject content of articles, provide a consistent way to retrieve information, and are extremely useful for finding relevant articles in MEDLINE. However, they are also very difficult for the end user to use because of the limits of controlled vocabulary [3]. Users may not be able to find and use exact MeSH terms in their search queries without help.

This paper describes an experimental web application, Visual MeSH, which helps the user to find and use MeSH terms in MEDLINE searching. Currently, MEDLINE is available free of charge on the Web through the PubMed Retrieval System [6]. The National Library of Medicine also developed a Unified Medical Language System (UMLS) to facilitate the integration and retrieval of biomedical information databases and controlled vocabularies [4]. The MetaThesaurus is a key vocabulary component of the UMLS. It provides meanings, hierarchical connections, and other term relationships for many source vocabularies, including the MeSH Vocabulary [5]. The purpose of Visual MeSH is to provide a graphical interface (a Java applet) to let the user interact with both the MetaThesaurus and PubMed, and to link the two together for increased precision in searching.

2. SYSTEM DESCRIPTION

Visual MeSH is based on a three-tier architecture. The data server tier, including the MetaThesaurus and MEDLINE, is provided by servers at the National Library of Medicine. The middle tier is a Java RMI server, which includes the search logic and communication protocols and methods. The RMI server directs all the network traffic between clients and database servers. The client tier is a Java applet, which provides visual displays and interaction functions (Figure 1).

[Figure 1 graphic: diagram with parts labeled User Interface, Data Communication, and Database.]

Figure 1. Visual MeSH Architecture

2.1 RMI MeSH Server

Java RMI (Remote Method Invocation) technology is designed for the development of distributed Java-to-Java applications. RMI allows registered clients to invoke the methods of remote Java objects in servers that reside on different hosts on the network. RMI also enables the transmission of objects through object serialization [1]. These features prove to be useful for the design of Visual MeSH.

The RMI MeSH Server provides the following functions:

• Register a client so that the client can invoke methods provided by the server.
• Provide a series of methods for querying and searching the two databases.
• Provide parsing methods for converting search results into objects.
• Transmit the processed objects (search results) back to the client.
• Maintain communication with registered clients.

Using the RMI server also overcomes a Java applet security restriction: an applet can only communicate with servers on the same machine from which the applet was downloaded. Through the RMI server, the MeSH Client can access the MetaThesaurus and MEDLINE, which are on different machines. If needed, it can also be connected to many other databases on different host machines.
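The server functions listed above could be exposed through a remote interface along the following lines. This is only a minimal sketch, not the Visual MeSH code: the names MeshRmiSketch, MeshServer, MeshServerImpl, LookupResult, the binding name "MeshServer", and the in-memory vocabulary map that stands in for the database tier are all assumptions made for illustration.

```java
import java.io.Serializable;
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;
import java.rmi.server.UnicastRemoteObject;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: every name below is an assumption, not the paper's API.
class MeshRmiSketch {

    /** Parsed search result; Serializable so RMI can transmit it to the client. */
    public static class LookupResult implements Serializable {
        private static final long serialVersionUID = 1L;
        public final String query;
        public final String concept; // null when nothing matched

        public LookupResult(String query, String concept) {
            this.query = query;
            this.concept = concept;
        }
    }

    /** Remote interface: each remotely callable method declares RemoteException. */
    public interface MeshServer extends Remote {
        boolean register(String clientId) throws RemoteException;
        LookupResult lookup(String clientId, String term) throws RemoteException;
    }

    /** Server side, with an in-memory map standing in for the database tier. */
    static class MeshServerImpl implements MeshServer {
        private final Set<String> clients = new HashSet<>();
        private final Map<String, String> vocabulary = new HashMap<>();

        MeshServerImpl() {
            vocabulary.put("hair loss", "Alopecia"); // the example shown in Figure 2
        }

        @Override
        public synchronized boolean register(String clientId) {
            return clients.add(clientId); // false if the client was already registered
        }

        @Override
        public synchronized LookupResult lookup(String clientId, String term) {
            if (!clients.contains(clientId)) {
                throw new IllegalStateException("client not registered: " + clientId);
            }
            return new LookupResult(term, vocabulary.get(term.trim().toLowerCase()));
        }
    }

    public static void main(String[] args) throws Exception {
        MeshServerImpl impl = new MeshServerImpl();
        // Export the object on an anonymous port and bind its stub under an assumed name.
        MeshServer stub = (MeshServer) UnicastRemoteObject.exportObject(impl, 0);
        Registry registry = LocateRegistry.createRegistry(Registry.REGISTRY_PORT);
        registry.rebind("MeshServer", stub);
        System.out.println("MeshServer bound; waiting for applet clients");
    }
}
```

A client would obtain the stub from the registry and call register before lookup. Because only the middle tier talks to the databases, the applet needs a network connection to just one host.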

2.2 MeSH Client

The client for Visual MeSH is designed as a Java applet, instead of a Java application, because this allows any computer with a Web browser to access the system, and the user doesn’t need to download the application first. To make the client lighter, processes are moved to the RMI server whenever possible. The focus of the client is to let the user interact with content, including MeSH terms and documents retrieved from MEDLINE. When the user sends a query term to the MetaThesaurus, the RMI server creates an object called a MeSH concept, which includes the term’s synonyms, narrower terms, broader terms, and related terms as defined in the MetaThesaurus, and passes the object to the client. The client then provides four different views of such a MeSH concept: text view, tree view, neighbor view, and map view, each of which offers more interactive functions than the previous one. Users can choose simple or complex interaction styles to fit their needs and have different options for finding the terms that best match their queries. In any of the views, the user can double-click on a term to add it to a list of search terms (the lower left list, right above the search button). When the user clicks the Search button, the system performs a Boolean AND search with all the terms in the search list and opens a new browser window with the search results. The user can continue to interact with the terms and the retrieved documents in many different ways, as described in the next section.
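A MeSH concept object and the Boolean AND query built from the search list might look roughly like the sketch below. The class and method names (MeshConceptSketch, MeshConcept, roundTrip, buildAndQuery) are hypothetical. Quoting multi-word terms is an assumption about query syntax, and treating "Hair Loss" as a synonym of "Alopecia" is illustrative data based on the Figure 2 example. The round trip through Java serialization only mimics how RMI would pass the object from server to client.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.StringJoiner;

// Hypothetical sketch: names and query syntax are assumptions, not the paper's code.
class MeshConceptSketch {

    /** A term plus its four relationship lists, as described for the MeSH concept. */
    public static class MeshConcept implements Serializable {
        private static final long serialVersionUID = 1L;
        public final String term;
        public final List<String> synonyms = new ArrayList<>();
        public final List<String> narrower = new ArrayList<>();
        public final List<String> broader = new ArrayList<>();
        public final List<String> related = new ArrayList<>();

        public MeshConcept(String term) {
            this.term = term;
        }
    }

    /** Serializes and deserializes a concept, as RMI would when passing it to the client. */
    static MeshConcept roundTrip(MeshConcept concept) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(concept);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (MeshConcept) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("serialization failed", e);
        }
    }

    /** Joins the search list with AND, skipping blanks; quoting phrases is an assumption. */
    static String buildAndQuery(List<String> searchList) {
        StringJoiner query = new StringJoiner(" AND ");
        for (String raw : searchList) {
            String term = raw.trim();
            if (term.isEmpty()) {
                continue;
            }
            query.add(term.contains(" ") ? "\"" + term + "\"" : term);
        }
        return query.toString();
    }

    public static void main(String[] args) {
        MeshConcept concept = new MeshConcept("Alopecia");
        concept.synonyms.add("Hair Loss"); // illustrative data only
        MeshConcept received = roundTrip(concept);
        List<String> searchList = Arrays.asList(received.term, received.synonyms.get(0));
        System.out.println(buildAndQuery(searchList));
    }
}
```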

3. THE VISUAL INTERFACE

The graphical interface of Visual MeSH is designed to make it easy for the user to interact with the displayed information (Figure 2). Major components of the interface are described below.

The Lookup Field. This is where the user can type in terms or queries to look up in the MetaThesaurus or MEDLINE. From the MetaThesaurus, the system will return a MeSH concept object with synonyms, narrower terms, broader terms, and related terms. From MEDLINE, the system will display a list of up to 100 MeSH terms used in the documents retrieved by the query. The user can then select any of the MeSH terms to perform a lookup in the MetaThesaurus.

The Co-occurrence List. One very useful type of information in the MetaThesaurus is the co-occurrence of concepts in MEDLINE. For each MeSH term, a list of MeSH terms that co-occur with it in MEDLINE documents can be retrieved from the MetaThesaurus, along with their co-occurrence numbers (the number of times two MeSH terms co-occur in a document in MEDLINE). Visual MeSH retrieves this information and puts the co-occurrence terms in a list ordered by their co-occurrence numbers. Browsing the top of the list will often find terms highly related to the lookup term because they have co-occurred often in documents. Double-clicking a term in the list will move the term to the search list. Single-clicking a term will highlight the term, and if the map view is shown, a click on any of the four corners will put the highlighted term into the clicked corner (see Map View below).

The Tree View. The Tree View displays each MeSH concept in a tree with branches of synonyms, broader terms (BT), narrower terms (NT), and related terms (RT) (Figure 2(a)). Each branch can be expanded or contracted to help the user browse one branch at a time and reduce scrolling. The terms on the tree are also clickable and can be moved to the search list by clicking on them.

The Neighbor View. The Neighbor View has a center and four neighbor lists, which are the center term’s synonyms, broader terms, narrower terms, and related terms (Figure 2(b)). A main feature of this view is its dynamics. Clicking on any term in the neighbor lists will move that term to the center and update all the neighbor lists to the new center term’s synonyms, broader terms, narrower terms, and related terms.
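The re-centering behavior of the Neighbor View can be illustrated with a small sketch. It assumes a hypothetical in-memory Thesaurus class in place of the MetaThesaurus, and the two toy relationships in main are illustrative data rather than values taken from the MetaThesaurus. The names NeighborViewSketch, Relation, link, neighbors, and click are all invented for this example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.EnumMap;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of Neighbor View navigation; not the applet's actual code.
class NeighborViewSketch {

    enum Relation { SYNONYM, BROADER, NARROWER, RELATED }

    /** Toy in-memory stand-in for the term relationships held in the MetaThesaurus. */
    static class Thesaurus {
        private final Map<String, Map<Relation, List<String>>> links = new HashMap<>();

        void link(String from, Relation relation, String to) {
            links.computeIfAbsent(from, k -> new EnumMap<Relation, List<String>>(Relation.class))
                 .computeIfAbsent(relation, k -> new ArrayList<String>())
                 .add(to);
        }

        List<String> neighbors(String term, Relation relation) {
            Map<Relation, List<String>> byRelation = links.get(term);
            if (byRelation == null) {
                return Collections.emptyList();
            }
            List<String> terms = byRelation.get(relation);
            return terms == null ? Collections.<String>emptyList() : terms;
        }
    }

    /** Holds the center term; the four lists are always derived from the current center. */
    static class NeighborView {
        private final Thesaurus thesaurus;
        private String center;

        NeighborView(Thesaurus thesaurus, String center) {
            this.thesaurus = thesaurus;
            this.center = center;
        }

        String center() {
            return center;
        }

        List<String> list(Relation relation) {
            return thesaurus.neighbors(center, relation);
        }

        /** Re-centers on a term only if it appears in one of the current neighbor lists. */
        boolean click(String term) {
            for (Relation relation : Relation.values()) {
                if (list(relation).contains(term)) {
                    center = term;
                    return true;
                }
            }
            return false;
        }
    }

    public static void main(String[] args) {
        Thesaurus thesaurus = new Thesaurus();
        thesaurus.link("Alopecia", Relation.BROADER, "Hair Diseases");  // toy data
        thesaurus.link("Hair Diseases", Relation.NARROWER, "Alopecia"); // toy data
        NeighborView view = new NeighborView(thesaurus, "Alopecia");
        view.click("Hair Diseases");
        System.out.println(view.center() + " -> narrower: " + view.list(Relation.NARROWER));
    }
}
```

Deriving the lists from the center on every call, rather than caching them, means a single assignment to the center is enough to update all four lists.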
With this function, the user can easily move up and down, left and right in a complex multihierarchical structure. The Neighbor View also enables the user to read the definition of the center term. Moving the mouse over the center term in the neighbor view will pop up a window with the definition of the term, if a definition is provided in the MetaThesaurus. Moving the mouse out of the popup window will hide it.

The Map View. The Map View shows two types of maps, the term map and the document map. The term map roughly positions synonyms, broader terms, narrower terms, and related terms in the same directions as in the neighbor view. When the TERMS button is pressed, the system retrieves each term’s Semantic Type from the MetaThesaurus and represents it by color (Figure 2(c)). For simplicity, the 135 semantic types (as defined in the MetaThesaurus) are grouped into 13 groups and labeled with colors, e.g. “Diseases & Pathologic process” is represented in RED, “procedures or treatment” in GREEN, and “Concepts & Ideas” in WHITE. Moving the mouse over the list of colored diamonds on the right will reveal what each color represents.
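The ordering used by the Co-occurrence List described earlier in this section can be sketched as a sort by descending co-occurrence number followed by a cut-off. The class and method names are hypothetical, the alphabetical tie-break is an assumption, and the counts in main are invented for illustration. The limit of 100 follows the Figure 2 caption.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of ranking co-occurring MeSH terms; not the applet's actual code.
class CooccurrenceSketch {

    /** Returns up to `limit` terms, highest co-occurrence number first. */
    static List<String> rank(Map<String, Integer> counts, int limit) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            // Assumed tie-break: alphabetical order, so the result is deterministic.
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        List<String> ranked = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : entries) {
            if (ranked.size() == limit) {
                break;
            }
            ranked.add(entry.getKey());
        }
        return ranked;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new HashMap<>();
        counts.put("Term X", 12); // invented counts, for illustration only
        counts.put("Term Y", 40);
        counts.put("Term Z", 12);
        System.out.println(rank(counts, 100));
    }
}
```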

Figure 2(a). The Tree View -- an expandable/collapsible tree that displays the concept’s broader terms, narrower terms, related terms, and synonyms.

Figure 2(b). The Neighbor View -- it displays the same information as the tree view and allows traversal along the hierarchy (if a term in any of the lists is clicked, the term will be moved to the center and its narrower, broader, and related term lists will be updated automatically).

Figure 2(c). The Map View -- it displays hierarchical relationships of terms as well as the semantic type of each term by color.

Figure 2. The Visual Interface -- When the user searches the term “hair loss,” the concept “Alopecia” is found. The interface then helps the user to explore the concept’s broader terms, narrower terms, related terms, and synonyms. A list of the top 100 co-occurrence terms (terms that co-occur with alopecia in MEDLINE documents) is also retrieved and displayed on the interface.
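The document map (Figure 3) is built from nine Boolean AND searches over four selected terms: one over all four terms, four over three terms each, and four over two terms each. The sketch below generates those nine queries as a 3 x 3 grid. The center cell and the upper-left cell follow the Figure 3 caption. The placement of the remaining cells, and the assumption that Terms A, B, C, and D sit at the upper-left, upper-right, lower-left, and lower-right corners, are inferred from the figure labels and may not match the actual layout. The names DocumentMapSketch, and, and buildQueries are hypothetical.

```java
// Hypothetical sketch of the document map's query grid; the cell layout is partly assumed.
class DocumentMapSketch {

    /** Joins terms into a Boolean AND query string. */
    static String and(String... terms) {
        return String.join(" AND ", terms);
    }

    /**
     * Builds the nine queries as a 3 x 3 grid, row 0 being the top row.
     * Assumed corners: a = upper left, b = upper right, c = lower left, d = lower right.
     * Each corner cell omits the diagonally opposite term, each edge cell uses the
     * two terms on that edge, and the center cell uses all four terms.
     */
    static String[][] buildQueries(String a, String b, String c, String d) {
        return new String[][] {
            { and(a, b, c), and(a, b),       and(a, b, d) },
            { and(a, c),    and(a, b, c, d), and(b, d)    },
            { and(a, c, d), and(c, d),       and(b, c, d) }
        };
    }

    public static void main(String[] args) {
        String[][] grid = buildQueries("Term A", "Term B", "Term C", "Term D");
        for (String[] row : grid) {
            System.out.println(String.join(" | ", row));
        }
    }
}
```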

Terms on the term map can be dragged into the four corners reserved for selected terms. After selecting four terms, the user can click on the DOCS button to get a document map. The document map represents the MEDLINE search results for the four selected terms. The system automatically constructs MEDLINE queries and conducts 9 searches before representing the results on the map: the result of AND-ing all four terms is in the middle, the results of AND-ing three terms together in different combinations are in the four corners, etc. Figure 3 illustrates the distribution of search results.

[Figure 3 graphic: Term A, Term B, Term C, and Term D label the four corners of a grid whose cells are labeled ABC, AB, ABD, AC, ABCD, BD, ACD, CD, and BCD.]

Figure 3. The search logic for the document map -- when the user drags four terms to the four corners, the search result of (Term A AND Term B AND Term C AND Term D) will be displayed in the center, the result of (Term A AND Term B AND Term C) will be displayed in the upper-left corner, etc.

4. CONCLUSIONS AND FUTURE RESEARCH

Visual MeSH is a research prototype that supports dynamic user interaction with content in the MetaThesaurus and MEDLINE. It provides several ways to retrieve information related to medical concepts, including definitions, semantic types, synonyms, narrower terms, broader terms, related terms, and co-occurrence terms. Visual MeSH displays the information in several different ways to facilitate the user’s interaction with the information. It allows the user to select MeSH terms to search MEDLINE databases. The ultimate goal of information visualization in retrieval systems is to provide better results to the user in terms of search recall and precision [2]. Visual MeSH helps the user to achieve better search precision and recall by making it easy to convert the user’s natural language queries to controlled vocabulary-based queries. Currently, we are continuing to test the system and implement additional enhancements. We are also planning to solicit user feedback and to conduct an evaluation of the retrieval effectiveness of Visual MeSH in the near future.

The Visual MeSH prototype can be accessed at: http://research.cis.drexel.edu/mesh/index.html

5. REFERENCES

[1] Downing, T. B. (1998). Java RMI: Remote Method Invocation. Foster City, CA: IDG Books Worldwide.

[2] Egghe, L.; Rousseau, R. (1998). A theoretical study of recall and precision using a topological approach to information retrieval. Information Processing and Management, 34(2-3), 191-218.

[3] Lancaster, F. W. (1986). Vocabulary Control for Information Retrieval. 2nd ed. Arlington, VA: Information Resource Press.

[4] Lindberg, D.; Humphreys, B. L.; & McCray, A. T. (1993). The Unified Medical Language System. Methods of Information in Medicine, 32(4), 281-291.

[5] National Library of Medicine (1997). UMLS Knowledge Source: 8th Edition Documentation. US Department of Health and Human Services, National Library of Medicine.

[6] National Library of Medicine (1998). The NLM PubMed Project. [available: http://www.ncbi.nlm.nih.gov/PubMed/overview.html].

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

NPIV 98 Bethesda MD USA. Copyright ACM 2000 1-58113-179-8/00/1...$5.00