Top Clinical Investigators Seek To Dampen Impact Of Data Sharing

Despite earlier concerns by its editors about “data parasites,” the New England Journal of Medicine has now published four articles offering support in some form for data sharing. But two of the articles, written by many of the most prominent clinical trial researchers in the United States and Canada, express grave concerns about data sharing and propose limitations and safeguards that could significantly reduce its impact.

The NEJM papers will likely receive considerable public attention, if only because the author of one of the papers is US Senator Elizabeth Warren. Warren offers a strong endorsement of data sharing, writing that “over the long run, data sharing may help reduce costs by allowing researchers to avoid” performing duplicate trials and that data sharing “can also help to address concerns about conflicts of interest.”

The articles respond to a recent proposal in support of data sharing from the International Committee of Medical Journal Editors (ICMJE). The ICMJE proposal would require authors to share deidentified participant-level data within 6 months of publication and would require authors to prospectively include a data-sharing plan upon trial registration.

The strongest case for open data is made by Harlan Krumholz (Yale) and Joanne Waldstreicher (Johnson & Johnson). They acknowledge that data sharing is more complex and difficult to implement than trial preregistration. They cite patient privacy and giving proper credit to the original investigators as two issues that must be addressed. The Yale Open Data Access (YODA) project, which is run by Krumholz and which is used by Johnson & Johnson, employs “a ‘trusted intermediary’ approach, in which an independent partner provides support, accountability, fairness, and transparency,” they write. YODA scientists “conduct a blinded review of proposals to ensure that the scientific purpose is clearly described, that the data requested will be used to create or materially enhance generalizable scientific or medical knowledge to inform science and public health, and that the proposed research can be reasonably addressed using the requested data.”

In an email message Krumholz said that “our obligation to society and scientific progress and the people we seek science to benefit must outweigh efforts to maintain data as proprietary assets to be mined at will. Our goal must be to properly credit those who generate data, but to make as much useful data available to the research community as we can – with the meta-data that is essential to understand it and use it properly. There are many practical issues to overcome, but our commitment must be to share as quickly as we can while protecting privacy of the participants. As scientists, we are stronger together – working on common datasets – and learning from each other – even contesting interpretations together – and, ultimately, fully harvesting as much as we can from what is produced.”

Investigators Defend Their Turf

The paper that is most critical of data sharing comes from the International Consortium of Investigators for Fairness in Trial Data Sharing (ICIFTDS), led by Salim Yusuf and other clinical trialists at McMaster University. Although they acknowledge the “potential benefits” of data sharing, they focus on what they perceive to be significant limitations. The chief risk they cite is the spectre of “misleading or inaccurate analyses and analyses aimed at unfairly discrediting or undermining the original publication.” They also express concern about the “enormous direct costs” of the plan and the diversion of “resources, both financial and human, from the actual conduct of trials.”

The McMaster authors write that they would like to see the interested parties “explore alternatives that will achieve the same goals efficiently.” They then propose specific modifications to the ICMJE proposal, including extending the deadline for data sharing from 6 months to 2 years after first publication. The deadline could be extended further, to as long as 5 years, by adding 6 months for every year it took to complete the study. They also call for journals to arrange for independent analyses of trials and for financial compensation to “the original investigators for their efforts and investments in the trial and the costs of making the data available.”

Finally, a group of leading cardiovascular researchers, ACCESS CV (Academic Research Organization Consortium for Continuing Evaluation of Scientific Studies — Cardiovascular), including top researchers from the TIMI, Duke, and CRF groups, express support for “the concept of data sharing” and propose a strategy to “thoughtfully operationalize” the ICMJE recommendations. But they also focus on the potential dangers of data sharing and recommend weakening some of the stronger provisions in the ICMJE proposal. Like the McMaster group, they propose extending the deadline for sharing to 2 years, though they don’t mention additional extensions. For cardiovascular trials the ACCESS CV group proposes a formal process of submission and review managed by ACCESS CV members.

Their strategy also seeks to address potential “unintended consequences” of data sharing. They cite several “challenges: complexity of the data and metadata, publication bias or selection bias in proposed new analyses, increased risk of type I error (from multiple unplanned secondary analyses), and patient privacy.” A poorly performed reanalysis “could create apparent discrepancies where none exist, potentially alarming the public and hindering rather than advancing science.” They also express the concern, raised by the original NEJM “data parasite” article, over the “unresolved issue… of how to provide meaningful academic credit (typically authorship) to the team that designed and conducted the trial.”

Who Owns The Data?

I asked Milton Packer (Baylor University) for his thoughts about the articles. He expressed concerns about the proposals from the two groups of clinical trialists, stating that these proposals appear to seriously undermine the principles behind data sharing. Here are his remarks:

“I have concerns about the proposals put forth by the ICIFTDS and ACCESS-CV. Both mention financial compensation, but it is really hard to understand how that would work. Who would get paid? The original sponsor? The research organization? The investigators? I cannot imagine how the contracts for the transference of payments would read. My humorous side thinks it would take at least 2-5 years before legal agreement would ever be reached. It is hard to support a financial requirement for data access. Proposing payments makes it sound as if the investigators really want to create a disincentive for data sharing.

“There is another (more important) issue. The proposals put forth by ICIFTDS and ACCESS-CV give great weight to the concept that — in some way and for some reason — investigators ‘own’ the data and that their proprietary rights need to be respected. Really? Should all clinical trial data not be considered an asset of the entire clinical and research community, regardless of who generated them? Why is there a proprietary consideration at all? Is a trialist’s need to publish so important? And if it is, what price do we need to pay to protect the publication interests of the original investigators?

“What we really need to collectively understand is that no one possesses data. By virtue of the fact that it was collected as a result of observations in human beings other than the investigators ethically makes it a shared resource without proprietary considerations. No one is trying to deprive the original investigators of their ability to present the major findings of a study first. But we should think of the data presentation as a responsibility and a privilege; it is not a possession, which needs to be guarded and to which access needs to be minimized.”

  1. What is medical research for? What’s the purpose of data sharing? The well-being of patients?
    How much less harm would have been done to them over the decades had sharing been the norm?
    Virtually half of what physicians had been doing is now recognized as wrong.
    Would sharing indeed have lessened it?

  2. Roger Bumgarner says:

    The proposal from “The International Consortium of Investigators for Fairness in Trial Data Sharing” is a step backwards. In brief, their proposal would allow unverified results to permeate medical thinking for up to five years. During this time, patients may be at increased risk and may be subject to treatments that have not been and indeed cannot be independently verified.

    The authors state that “A key motivation for investigators to conduct RCTs is the ability to publish not only the primary trial report, but also major secondary articles based on the trial data.”

    This is used as a justification for a lengthy delay in data release. THE primary motivation MUST BE to improve patient health. As long as medical investigators continue to prioritize publication rights over what is best for the patients, progress will be slowed. We owe the patients who participate in trials rapid release of the data so that it can be independently analyzed and verified. The literature is replete with results from clinical trials that did not stand up to scrutiny including some in which this was obvious from a reanalysis of the data alone.

  3. I am all in favor of sharing data obtained by experiments. At the same time, the US must take a much stricter approach to protecting the privacy of medical records of patients who get treatment outside of experiments. The HIPAA law makes it far too easy for the government to get at your medical records from a clinic. Furthermore, clinics ask patients to communicate through other companies, whose privacy policies are in some cases a screen of misleading words.

    Dr Richard Stallman
    President, Free Software Foundation
    Internet Hall-of-Famer
    MacArthur Fellow
