Original Research

Identifying Patients With Ischemic Heart Disease in an Electronic Medical Record

Journal of Primary Care & Community Health 2(1) 49–53
© The Author(s) 2011
Reprints and permission: sagepub.com/journalsPermissions.nav
DOI: 10.1177/2150131910382251
http://jpc.sagepub.com

Noah Ivers, MD, CCFP1-3, Bogdan Pylypenko, BScN2, and Karen Tu, MD, MSc, CCFP, FCFP2-4

Abstract

Purpose: Increasing utilization of electronic medical records (EMRs) presents an opportunity to efficiently measure quality indicators in primary care. Achieving this goal requires the development of accurate patient-disease registries. This study aimed to develop and validate an algorithm for identifying patients with ischemic heart disease (IHD) within the EMR.

Methods: An algorithm was developed to search the unstructured text within the medical history fields in the EMR for IHD-related terminology. This algorithm was applied to a 5% random sample of adult patient charts (n = 969) drawn from a convenience sample of 17 Ontario family physicians. The accuracy of the algorithm for identifying patients with IHD was compared to the results of 3 trained chart abstractors.

Results: The manual chart abstraction identified 87 patients with IHD in the random sample (prevalence = 8.98%). The accuracy of the algorithm for identifying patients with IHD was as follows: sensitivity = 72.4% (95% confidence interval [CI]: 61.8-81.5); specificity = 99.3% (95% CI: 98.5-99.8); positive predictive value = 91.3% (95% CI: 82.0-96.7); negative predictive value = 97.3% (95% CI: 96.1-98.3); and kappa = 0.79 (95% CI: 0.72-0.86).

Conclusions: Patients with IHD can be accurately identified by applying a search algorithm to the medical history fields in the EMR of primary care providers who were not using standardized approaches to code diagnoses. The accuracy compares favorably to other methods for identifying patients with IHD. The results of this study may aid policy makers, researchers, and clinicians to develop registries and to examine quality indicators for IHD in primary care.

Keywords: electronic medical record, administrative data, ischemic heart disease, chart audit

Ischemic heart disease (IHD) is a leading cause of morbidity and mortality that is often managed in primary care;1 however, recent evidence indicates a wide variability in the management of such patients.2,3 The increasing use of electronic medical records (EMRs) could facilitate improved monitoring of patients with IHD. However, unlike the case of diabetes, where structured fields in the EMR (such as laboratory results and treatments) can be used to accurately identify patients,5 using the EMR to develop a registry of IHD patients is difficult.6 Therefore, in jurisdictions where providers are not incentivised to systematically code diagnoses, approaches to help identify patients with IHD are needed to facilitate quality improvement efforts. With this goal in mind, we developed and tested a novel EMR search algorithm focusing on the unstructured text in the medical history fields. We also examined whether searching for nitrate prescriptions in the EMR would increase the accuracy of identification of patients with IHD as has been proposed in similar studies.8

Methods

Setting

As part of a study for the Canadian Cardiovascular Outcomes Research Team (CCORT), selected Ontario family physicians using Practice Solutions® EMR share their data with researchers at the Institute for Clinical Evaluative Sciences (ICES) to create an Electronic Medical Record Administrative Data Linked Database (EMRALD). ICES is a prescribed entity under the province of Ontario’s Personal Health Information Protection Act, which allows for the collection of individual level health information for use in planning and managing the health care system. Practice Solutions is Ontario’s most used EMR system, with 46.2% of the market as of November 2009.

Table 1. Initial Search Algorithm Applied to Medical History Fields in Electronic Medical Records
IHD, ischemic heart disease
ASHD, arteriosclerotic heart disease
CAD, coronary artery disease
Angina
ACS, acute coronary syndrome
MI, myocardial infarction, STEMI, NSTEMI, non-STEMI, AMI
PCI, percutaneous coronary intervention, coronary stent, coronary angioplasty
CABG, coronary bypass graft surgery

1 Women’s College Hospital, Toronto, Canada
2 Institute for Clinical Evaluative Sciences, Toronto, Canada
3 Department of Family and Community Medicine, University of Toronto, Canada
4 Toronto Western Hospital Family Health Team, University Health Network, Toronto, Canada

Corresponding Author: Noah Ivers, Institute for Clinical Evaluative Sciences, G1 06, 2075 Bayview Avenue, Toronto, Ontario M4N 3M5 Email: [email protected]

Algorithm Development

The search algorithm was developed between August 2009 and January 2010 using the EMR charts of all 38 097 active, rostered, adult patients in the EMRALD database at that time. Active was defined as having at least 2 visits within 3 years. These patients belonged to 39 community-based family physicians from across Ontario who had used the EMR for at least 2 years. The algorithm was developed using the SQL database management system, starting by applying the clinically derived list of terms in Table 1 to the free-text entries in the medical history fields (“problem list” and “past medical history”) of the EMR. This initial IHD-related dictionary was developed based on the clinical experience of the investigators. It was refined iteratively by applying the terms in a search of the medical history fields and then adding any new IHD-related terms discovered. Physicians rarely used structured entries (eg, ICD-9 code 413 for angina), and usually entered data into the medical history fields using free text. Idiosyncratic descriptors were common in the medical history fields, such as “stent-LAD” (meaning angioplasty for the left anterior descending coronary artery). Therefore, the search algorithm was designed to find exact matches to phrases/words/acronyms and also allow for some variation in syntax or spelling (eg, LAD-stent vs stent-LAD, AMI vs A.M.I.).

Many patients had multiple IHD-relevant descriptors. To illustrate, some patients had a history of myocardial infarction (MI) and a prior stent. Because it seemed likely that “stent” was another IHD-related term, it was added to the algorithm. In the next iteration, new patients were identified with “stent” as their only IHD descriptor. Search-term exclusions were also added to the algorithm to remove terms when they were not related to IHD. For instance, “stent” would be excluded if adjacent to “aneurysm” because the stent was not necessarily related to IHD. Similarly, “bypass” (as in coronary artery bypass) was included as a search term, but was excluded from the algorithm if near the term “gastric.” We tried to increase specificity by excluding words preceded by “?” or “query” or “possible.” We also excluded IHD-related words occurring next to “mother/father” and occurring next to the phrase “no known.” All such exclusions were developed inductively. The algorithm was refined iteratively until no new search terms or exclusions were uncovered. The refinement of the search algorithm was conducted by 2 of the investigators (Noah Ivers and Bogdan Pylypenko) and any disagreements were settled through discussion with all investigators. The final search algorithm has 317 search terms and automatically removes 52 exclusion terms.

Our methods were very similar to Natural Language Processing approaches recently reported in the medical informatics literature.9,10 In these papers, it is argued that using “text-dictionaries” with “modifier detection” is a simple but efficient approach, and that clinical expertise is needed to adjust the search terms used to find exact matches, similar matches, and exclusion terms for specific clinical domains and contexts. Although there are open-source software options to assist medical informatics professionals to develop algorithms for searching free-text (such as “NegEx”11 for finding exclusions), we chose to avoid such software to ensure that our methods were accessible to those with a clinical or policy focus.

We also developed an algorithm to search for nitrate prescriptions in the treatments field of the EMR. We counted patients as “nitrate positive” if they had 1 long-acting nitrate prescription and/or 2 or more prescriptions for short-acting nitrates. The requirement for 2 short-acting nitrates was implemented to try to eliminate patients who were prescribed a nitro-spray only once prior to a normal stress test.
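The dictionary search with exclusions and the nitrate rule described above can be illustrated with a minimal Python sketch. It is not the study's implementation, which was written in SQL and used 317 search terms and 52 exclusions that are not listed in this article. The term lists below are a small hypothetical subset drawn from Table 1 and the examples in the text, the two-word context window is an assumption standing in for "adjacent" and "near," and the function names and the "long"/"short" prescription labels are invented for illustration.

```python
import re
from typing import List, Sequence

# Hypothetical, heavily abbreviated dictionaries (assumption): the study's
# final lists of 317 search terms and 52 exclusions are not reproduced here.
SEARCH_TERMS = {"ihd", "ashd", "cad", "angina", "acs", "mi", "ami",
                "stemi", "nstemi", "pci", "cabg", "stent", "bypass"}
UNCERTAINTY_MARKERS = {"?", "query", "possible"}
FAMILY_WORDS = {"mother", "father"}
CONTEXT_EXCLUSIONS = {"stent": {"aneurysm"}, "bypass": {"gastric"}}


def tokenize(note: str) -> List[str]:
    """Lowercase, collapse dotted acronyms (A.M.I. -> ami), split into tokens."""
    text = note.lower()
    text = re.sub(r"\b(?:[a-z]\.){2,}",
                  lambda m: m.group(0).replace(".", ""), text)
    # "?" is kept as its own token; hyphens and slashes act as separators,
    # so "stent-LAD" and "LAD-stent" both yield a "stent" token.
    return re.findall(r"\?|[a-z0-9]+", text)


def is_ihd_positive(note: str) -> bool:
    """True if any search term survives all exclusion checks."""
    tokens = tokenize(note)
    for i, tok in enumerate(tokens):
        if tok not in SEARCH_TERMS:
            continue
        before = tokens[max(0, i - 2):i]   # up to 2 tokens before (assumed window)
        after = tokens[i + 1:i + 3]        # up to 2 tokens after (assumed window)
        if before and before[-1] in UNCERTAINTY_MARKERS:
            continue                       # "?angina", "possible MI"
        if before == ["no", "known"]:
            continue                       # "no known CAD"
        if set(before[-1:] + after[:1]) & FAMILY_WORDS:
            continue                       # "mother MI"
        if set(before + after) & CONTEXT_EXCLUSIONS.get(tok, set()):
            continue                       # "gastric bypass", "aneurysm stent"
        return True
    return False


def is_nitrate_positive(prescriptions: Sequence[str]) -> bool:
    """Each item is 'long' or 'short' (hypothetical pre-classified labels)."""
    long_acting = sum(1 for p in prescriptions if p == "long")
    short_acting = sum(1 for p in prescriptions if p == "short")
    return long_acting >= 1 or short_acting >= 2
```

For example, `is_ihd_positive("Stent-LAD 2005")` and `is_ihd_positive("A.M.I. 1999")` are flagged, whereas `"?angina"`, `"gastric bypass 2001"`, and `"no known CAD"` are not. Keeping the dictionaries as plain data mirrors the iterative refinement described in the text: each new term or exclusion is added to a list rather than to the matching logic.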

Validation Set

The validation set was based on a 5% random sample of patients from the first 17 physicians to contribute data to EMRALD. The 5% random sample resulted in 969 patients. As described in a related paper,12 3 trained abstractors reviewed the full text from the entire patient chart. Patients that had documented evidence of a MI, percutaneous coronary intervention (PCI), or coronary artery bypass graft (CABG), or had IHD documented in a specialist letter, in the medical history fields, or in the family physician’s progress notes were “IHD positive.” The intraobserver reliability was high, with kappa values exceeding 0.80.12

Table 2. Accuracy of Four Different Search Algorithms for Identifying Patients With IHD Compared to Manual Abstracted Occurrences (N = 87) From a Random Sample of Primary Care Electronic Medical Records (a)

| Search Technique | Sensitivity (%) | Specificity (%) | PPV (%) | NPV (%) | Kappa |
| --- | --- | --- | --- | --- | --- |
| Initial search, medical history fields ONLY | 26.4 (17.6-36.9) | 99.6 (98.8-99.9) | 85.2 (66.3-95.8) | 93.2 (91.4-94.7) | 0.37 (0.26-0.49) |
| Initial search PLUS Rx nitrate | 40.2 (29.9-51.3) | 99.3 (98.5-99.8) | 85.4 (70.1-94.4) | 94.4 (92.7-95.8) | 0.52 (0.41-0.63) |
| Refined search, medical history fields ONLY | 72.4 (61.8-81.5) | 99.6 (98.9-99.9) | 94.0 (85.4-98.4) | 97.3 (96.1-98.3) | 0.80 (0.73-0.87) |
| Refined search PLUS Rx nitrate | 74.7 (64.3-83.4) | 99.3 (98.5-99.8) | 91.2 (85.1-96.8) | 97.6 (96.3-98.5) | 0.81 (0.74-0.88) |

Abbreviations: IHD, ischemic heart disease; PPV, positive predictive value; NPV, negative predictive value.
(a) Values are given, followed by exact 95% confidence intervals.

Analysis

We tested the initial, clinical terms only (Table 1) and the final, refined search algorithm, with and without a nitrate prescription, against the results from the manual abstraction. Sensitivity was calculated as the proportion of patients identified by the manual abstraction as having IHD (used as the reference standard) who also had IHD according to the algorithm. Specificity was calculated in the same manner except that it was based on individuals who were not identified as having IHD. Positive predictive value (PPV) was defined as the proportion of IHD patients identified by the algorithm who were also identified as IHD by the manual abstraction. Negative predictive value (NPV) was defined similarly for patients who did not have IHD. Kappa statistics for agreement between the results from the manual abstraction and from the algorithm data were also calculated. Exact confidence intervals (CIs) were calculated for all proportions and tests of accuracy. All analyses were conducted using SAS version 9.2 (SAS Institute, Cary, NC).

In a secondary analysis, we evaluated the algorithms in a practice of 6 physicians who were new to the EMRALD database. One researcher (Dr Ivers) manually audited the complete electronic charts of all patients identified by each algorithm as having IHD. If the manual abstraction suggested that the patient did not have IHD, that patient was considered a false positive.

This project received ethics approval through Sunnybrook Health Sciences Centre Research Ethics Board.
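The accuracy measures defined in the Analysis section can be written out as a short Python sketch. The study itself used SAS version 9.2, so this is only an illustration. The function names are hypothetical, the kappa shown is the standard Cohen's kappa for a 2 × 2 table, and the exact interval is the Clopper-Pearson interval, which is assumed here to be what "exact" refers to. The cell counts in the usage lines are illustrative values chosen to sum to 969 charts with 87 reference-positive patients; the article does not report the cell counts, so the output should not be read as the study's results.

```python
from math import comb


def accuracy_measures(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Accuracy of an algorithm against a reference standard (2 x 2 table)."""
    n = tp + fp + fn + tn
    observed = (tp + tn) / n  # observed agreement
    # Agreement expected by chance, from the row and column totals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "kappa": (observed - expected) / (1 - expected),
    }


def _binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p); adequate for modest n."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k + 1))


def _bisect(f, lo: float = 0.0, hi: float = 1.0, iterations: int = 60) -> float:
    """Root of an increasing function f on [lo, hi] by bisection."""
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_ci(x: int, n: int, alpha: float = 0.05) -> tuple:
    """Clopper-Pearson (exact) confidence interval for the proportion x / n."""
    lower = 0.0 if x == 0 else _bisect(
        lambda p: 1 - _binom_cdf(x - 1, n, p) - alpha / 2)
    upper = 1.0 if x == n else _bisect(
        lambda p: alpha / 2 - _binom_cdf(x, n, p))
    return lower, upper


# Illustrative counts only (assumption): not reported in the article.
measures = accuracy_measures(tp=63, fp=4, fn=24, tn=878)
sensitivity_ci = exact_ci(63, 87)
```

Sensitivity and specificity condition on the reference standard, whereas PPV and NPV condition on the algorithm's output, which is why the latter two depend on the prevalence of IHD in the sample.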

Results

The manual chart abstraction of 969 randomly selected charts identified 87 patients meeting criteria for IHD (prevalence = 8.98%). Table 2 shows that the refined search algorithm outperforms the initial search terms, greatly improving sensitivity with limited change to specificity. The addition of nitrate prescriptions substantially increased sensitivity compared to the initial search terms, but had little impact on the diagnostic accuracy of the refined search algorithm.

When the algorithms were applied to a new EMRALD practice, the prevalence of IHD in this practice was estimated to be 9.68% to 12.66%, depending on the algorithm used. Table 3 shows that the false positive rate ranged from 4.14% to 12.54%. Compared to the initial search terms, the refined search resulted in a lower false positive rate, while identifying more IHD patients. Of the extra patients identified by adding a search for nitrate prescriptions, 51.92% were false positives. Of these false positives, 62.96% represented patients who had a negative work-up (eg, investigations revealed normal coronary arteries), while 25.93% seemed to have received nitrates for either congestive heart failure or for refractory hypertension, rather than angina. Using the refined algorithm, 85.71% of the false positives had a negative workup and the others had no further evidence in the chart of IHD.

Discussion

The refined algorithm vastly outperformed a set of clinically derived search terms for identifying patients with IHD. Using the same set of manually abstracted charts, algorithms using administrative databases identify patients with IHD with sensitivity of 77.0% but a PPV of only 78.8%.12 The approach described here has slightly lower sensitivity but much better PPV. The PPV is also superior to methods described in the UK.7,8 When the refined algorithm was applied to a new set of over 3000 patient charts, the false positive rate was