Friday, October 9, 2009

Data Sharing

Andrew Vickers is a biostatistician working on cancer. Throughout his career, he has requested underlying datasets from several researchers at different times to help advance scientific studies, and has been refused more often than not. He was perturbed enough to write an essay in the New York Times about it, in which he advocated making clinical trial data freely available.

Vickers’ piece discussed the various difficulties he has encountered trying to obtain data that could “make an immediate and important impact on the lives of cancer patients”. The reasons that researchers were reluctant to share data included:
• the potential that analyses could be undermined
• the original research team might “consider a similar analysis at some point in the future”
• privacy concerns
• unwillingness to co-operate
• the “difficulty of putting together a dataset”
• potential for misinterpretation or misrepresentation

“Given the enormous physical, emotional and financial toll of cancer, one might expect researchers to promote the free and open exchange of information,” Dr. Vickers wrote. “The patients who volunteer for ... trials often suffer through painful procedures and harsh experimental treatments in the hope of hastening a cure. The data they provide ought to belong to all of us. Yet ... researchers typically treat it as their personal property.”

Dr. Vickers cited the work of Dr. John Kirwan, a rheumatologist from the University of Bristol, who has studied researchers’ attitudes on sharing clinical trials data. Dr. Kirwan found that three-quarters of researchers he surveyed, as well as a major industry group, opposed making original trial data available.

In an article published in Nature magazine on the same topic (“Making Clinical Data Widely Available”), Jocelyn Kaiser writes about the difficulties of sharing raw data among scientists. The issue is complex: researchers often embrace the idea of sharing, which can open channels for independent scrutiny, foster collaborations and encourage new discoveries in old data. In practice, however, those advantages often fail to outweigh researchers' concerns.

Some researchers quoted in her article were concerned that open access could lead to mistaken interpretations. Epidemiologist Bruce Psaty warned that investigators could conduct erroneous risk analyses, or that young researchers could become too dependent on data mining and neglect designing their own rigorous studies.

Kaiser also notes that “clinical investigators are understandably reluctant to hand over datasets in which they have invested years and that they hope will generate many papers.”

“They don’t want to be scooped,” says Christine Laine, senior deputy editor of the Annals of Internal Medicine.

Dr. Vickers frames it differently. “This is exactly what...patients need. They want new results to be published as quickly as possible and to encourage a robust debate on the merits of key research findings.” Reidpath and Allotey agree, writing that this “highlights an unfortunate situation where researchers are more concerned with losing an advantage than advancing science.”

“It is worth restating this finding: most scientists doing research on how best to help those in pain, or at risk of death, want to keep their data a secret,” Dr. Vickers concluded.

However, the trend of keeping data private is perhaps beginning to change. Data-sharing advocates say that the power to encourage data sharing rests largely with those who have always had the most clout in science: the funding agencies, which can require data sharing in return for support, or journals, which can make sharing a condition of publication. Many such agencies have begun to encourage data sharing through those mechanisms.

In her article, Kaiser points out that “A law enacted in 2007 will require sponsors of all clinical trials of drugs and devices that were subsequently approved in the United States to post summary results in a federal database. The aim is to ensure that all findings see the light of day, including negative results that often get buried. At the same time, the National Institutes of Health (NIH), which ramped up its data-sharing efforts with genome data more than a decade ago, has been advancing these policies into clinical research.”

This trend appears to be spreading slowly across different parts of the scientific landscape. ResearchGate recently launched a Self-Archiving Repository, which provides members with free access to potentially millions of research papers without the obstacle of library subscriptions or the financial barrier of pay-per-view.

In a broader mandate for sharing data, the NIH policy states that “NIH reaffirms its support for the concept of data sharing. We believe that data sharing is essential for expedited translation of research results into knowledge, products, and procedures to improve human health. The NIH endorses the sharing of final research data to serve these and other important scientific goals. The NIH expects and supports the timely release and sharing of final research data from NIH-supported studies for use by other researchers.”

Nature magazine’s policy is similar. “An inherent principle of publication is that others should be able to replicate and build upon the authors' published claims. Therefore, a condition of publication in a Nature journal is that authors are required to make materials, data and associated protocols promptly available to readers without preconditions,” says the website.

Similarly, the Annals of Internal Medicine asks that its technical studies include a reproducible research statement explaining the study protocol, dataset, and statistical code. Researchers are not required to share these materials, and many simply say that data are not available, Kaiser reports. However, Christine Laine “views this policy as a tiny, baby step towards changing the scientific culture.”

by Meghan Kallman


