Annals
Established in 1927 by the American College of Physicians

THE DATABASES

Database Standardization, Linkage, and the Protection of Privacy

Emmanuel N. Lazaridis, PhD

15 October 1997 | Volume 127 Issue 5 Part 2 | Page 696


In writing this summary, I was tempted to call on one of the mantras of the current generation: "reduce, reuse, recycle." The authors of the preceding three articles all seek to reduce the need for expensive and time-consuming clinical trials that may be difficult or impossible to conduct by reusing data already available in existing databases and by recycling them into products not envisioned when the data were originally collected. The views expressed in the paper by Gostin suggest that perhaps this mantra should be expanded from three Rs to four in the context of health care data: "reduce, reuse, recycle responsibly." Because of the need to address pressing health care issues, data collected for one reason can and should be used in ways not originally intended; however, researchers must consider the legal and ethical implications of such use. Irresponsible reuse and recycling are a recipe for failure; they will inevitably reduce the means by which researchers can address important health-related questions.

McDonald and colleagues concentrate on barriers to reusing and recycling clinical data. Distinguishing between operational data (patient information "gathered in direct support of patient care") and analytic data, they identify two major barriers to the use of operational data in health care research: differences in structure between operational and analytic databases and variation in coding across different database systems. They propose that the first barrier can be overcome by selective and standardized definitions of analytic variables based on operational data. Overcoming the second barrier requires substantially more investment; the authors recommend that nonstandard coding systems be mapped to standard ones, such as LOINC (Logical Observation Identifiers Names and Codes) and SNOMED (Systematized Nomenclature of Medicine). Standardization in both these areas would increase our ability to effectively reuse and recycle readily available clinical data.
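
As a rough illustration of the mapping step, and not the method McDonald and colleagues describe, the sketch below translates local laboratory codes from two hypothetical source systems into a shared code. The source names, local codes, and the "STD-" codes are all invented placeholders; they are not real LOINC or SNOMED entries.

```python
# Hypothetical map from (source system, local code) to a standard code.
# The "STD-..." values are placeholders, not real LOINC or SNOMED codes.
LOCAL_TO_STANDARD = {
    ("lab_a", "GLU"): "STD-GLUCOSE",
    ("lab_b", "glucose_serum"): "STD-GLUCOSE",
    ("lab_a", "HGB"): "STD-HEMOGLOBIN",
}


def standardize(records):
    """Return (mapped, unmapped): mapped records gain a 'standard_code' field."""
    mapped, unmapped = [], []
    for rec in records:
        key = (rec["source"], rec["local_code"])
        std = LOCAL_TO_STANDARD.get(key)
        if std is None:
            # Keep unmapped records separate so they can be reviewed by hand.
            unmapped.append(rec)
        else:
            mapped.append({**rec, "standard_code": std})
    return mapped, unmapped


# Invented example records from two systems that code glucose differently.
records = [
    {"source": "lab_a", "local_code": "GLU", "value": 5.4},
    {"source": "lab_b", "local_code": "glucose_serum", "value": 6.1},
    {"source": "lab_b", "local_code": "XYZ", "value": 1.0},
]
mapped, unmapped = standardize(records)
```

Setting unmapped records aside rather than discarding them reflects the point that this barrier requires continuing investment: every local code without a standard equivalent needs someone to decide how it should be mapped.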

The power of recycling to address important health-related questions can also be enhanced by database linkage, whereby databases with different characteristics are connected to leverage the strengths of each. Lillard and Farmer discuss issues related to the linking of Medicare data with data obtained in health or demographic surveys in the context of research on older persons. For example, data on uncovered medical services and important dimensions of cost are missing from Medicare claims, but this information can be acquired by surveys. Conversely, it is difficult for the survey mechanism to obtain accurate information on health care utilization rates and the cost of reimbursable health services; this information is better obtained from Medicare records. The two data sources together can provide a more comprehensive picture of health and health care costs than can either individually. In general, linked data are more powerful tools for research.
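
The idea of linkage can be sketched in a few lines, under strong simplifying assumptions. The example below joins invented claims records to invented survey responses on a shared key; the field names (person_id, reimbursed, out_of_pocket) are hypothetical and do not reflect the actual Medicare or survey file layouts discussed by Lillard and Farmer.

```python
def link(claims, survey):
    """Join per-person claims totals to survey responses on person_id."""
    # Sum reimbursed amounts per person across all claims.
    totals = {}
    for claim in claims:
        pid = claim["person_id"]
        totals[pid] = totals.get(pid, 0.0) + claim["reimbursed"]

    # Keep only persons present in both sources (an inner join).
    linked = []
    for resp in survey:
        pid = resp["person_id"]
        if pid in totals:
            linked.append({
                "person_id": pid,
                "reimbursed": totals[pid],
                "out_of_pocket": resp["out_of_pocket"],
                "total_cost": totals[pid] + resp["out_of_pocket"],
            })
    return linked


# Invented data: claims hold reimbursed costs, the survey holds uncovered costs.
claims = [
    {"person_id": "P1", "reimbursed": 100.0},
    {"person_id": "P1", "reimbursed": 50.0},
    {"person_id": "P2", "reimbursed": 200.0},
]
survey = [
    {"person_id": "P1", "out_of_pocket": 30.0},
    {"person_id": "P3", "out_of_pocket": 10.0},
]
linked = link(claims, survey)
```

Only the person who appears in both sources yields a linked record, and only for that person can a total cost be computed. Neither source alone supports that calculation, which is the sense in which linked data are more powerful.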

Increased power must have its drawbacks, and Gostin sounds the cautionary note. He points out that systematic and standardized collection of health-related data comes at a substantial cost in lost privacy. An extensive health information infrastructure leads to increased opportunities for inappropriate use by "authorized" users, as well as the potential for access and exploitation by "unauthorized" parties. Automation is no panacea: it can be used to improve the security of computerized data, but the increasing ease with which electronic data can be disseminated and linked also increases the potential for abuse. Gostin argues that current law is inadequate to protect against misuse of increasingly comprehensive electronic medical records.

Considered together, these three papers suggest that the path to the future of health care research using computerized data is paved, but some sizable potholes remain. Unless severely restricted by new laws or privacy safeguards, standardization and linkage will result in ever more powerful tools for health care research. In turn, these tools must be treated ever more responsibly by their end users. My experience as a biostatistician suggests that researchers who work with health care data hold varying attitudes toward these data, spanning the spectrum from cautious to cavalier. Even among biostatisticians, little attention has historically been paid to issues of data integrity and security [1]. The competing and complementary principles of "reduce, reuse, recycle responsibly" are still being negotiated among the many stakeholders in health care: patients, physicians, health care organizations, insurers, pharmaceutical companies, government agencies, and legislatures. How this tension will be resolved is still a matter for speculation.

Emmanuel N. Lazaridis, PhD

The Regenstrief Institute for Health Care; Indiana University Medical Center; Indianapolis, IN 46202


Author and Article Information
Note: This article is one of a series of articles comprising an Annals of Internal Medicine supplement entitled "Measuring Quality, Outcomes, and Cost of Care Using Large Databases: The Sixth Regenstrief Conference." To see a complete list of the articles included in this supplement, please view its Table of Contents.


References

1. Lazaridis EN, Carey MA. Security concerns for statisticians in a networked world. The American Statistician. In press.


