Despite earlier concerns by its editors about “data parasites,” the New England Journal of Medicine has now published four articles offering support in some form for data sharing. But two of the articles, written by many of the most prominent clinical trial researchers in the United States and Canada, express grave concerns about data sharing and propose restrictions and safeguards that could significantly limit its impact.
The NEJM papers will likely receive considerable public attention, if only because the author of one of the papers is US Senator Elizabeth Warren. Warren offers a strong endorsement of data sharing, writing that “over the long run, data sharing may help reduce costs by allowing researchers to avoid” performing duplicate trials and that data sharing “can also help to address concerns about conflicts of interest.”
The articles respond to a recent proposal in support of data sharing from the International Committee of Medical Journal Editors (ICMJE). The ICMJE proposal would require authors to share deidentified participant-level data within 6 months of publication and would require authors to prospectively include a data-sharing plan upon trial registration.
The strongest case for open data is made by Harlan Krumholz (Yale) and Joanne Waldstreicher (Johnson & Johnson). They acknowledge that data sharing is more complex and difficult to implement than trial preregistration. They cite patient privacy and giving proper credit to the original investigators as two issues that must be addressed. The Yale Open Data Access (YODA) Project, which Krumholz runs and which Johnson & Johnson uses, employs “a ‘trusted intermediary’ approach, in which an independent partner provides support, accountability, fairness, and transparency,” they write. YODA scientists “conduct a blinded review of proposals to ensure that the scientific purpose is clearly described, that the data requested will be used to create or materially enhance generalizable scientific or medical knowledge to inform science and public health, and that the proposed research can be reasonably addressed using the requested data.”
In an email message Krumholz said that “our obligation to society and scientific progress and the people we seek science to benefit must outweigh efforts to maintain data as proprietary assets to be mined at will. Our goal must be to properly credit those who generate data, but to make as much useful data available to the research community as we can – with the meta-data that is essential to understand it and use it properly. There are many practical issues to overcome, but our commitment must be to share as quickly as we can while protecting privacy of the participants. As scientists, we are stronger together – working on common datasets – and learning from each other – even contesting interpretations together – and, ultimately, fully harvesting as much as we can from what is produced.”
Investigators Defend Their Turf
The paper that is most critical of data sharing comes from the International Consortium of Investigators for Fairness in Trial Data Sharing (ICIFTDS), led by Salim Yusuf and other clinical trialists at McMaster University. Although they acknowledge the “potential benefits” of data sharing, they focus on what they perceive to be significant limitations. The chief risk they cite is the spectre of “misleading or inaccurate analyses and analyses aimed at unfairly discrediting or undermining the original publication.” Further, they express concern about the “enormous direct costs” of the plan and the diversion of “resources, both financial and human, from the actual conduct of trials.”
The McMaster authors write that they would like to see the interested parties “explore alternatives that will achieve the same goals efficiently.” But they then propose specific modifications to the ICMJE proposal. These include extending the deadline for data sharing from 6 months to 2 years after first publication; the deadline could be extended further, to as long as 5 years, by adding 6 months for every year it took to complete the study. They also call for journals to arrange for independent analyses of trials and for financial compensation to “the original investigators for their efforts and investments in the trial and the costs of making the data available.”
Finally, a group of leading cardiovascular researchers, ACCESS CV (Academic Research Organization Consortium for Continuing Evaluation of Scientific Studies – Cardiovascular), including top researchers from the TIMI, Duke, and CRF groups, express support for “the concept of data sharing” and propose a strategy to “thoughtfully operationalize” the ICMJE recommendations. But they also focus on the potential dangers of data sharing and recommend weakening some of the stronger provisions in the ICMJE proposal. Like the McMaster group, they propose extending the deadline for sharing to 2 years, though they don’t mention additional extensions. For cardiovascular trials, the ACCESS CV group proposes a formal process of submission and review managed by ACCESS CV members.
Their strategy also seeks to address potential “unintended consequences” of data sharing. They cite several “challenges: complexity of the data and metadata, publication bias or selection bias in proposed new analyses, increased risk of type I error (from multiple unplanned secondary analyses), and patient privacy.” A poorly performed reanalysis “could create apparent discrepancies where none exist, potentially alarming the public and hindering rather than advancing science.” They also express the concern, raised by the original NEJM “data parasite” article, over the “unresolved issue… of how to provide meaningful academic credit (typically authorship) to the team that designed and conducted the trial.”
Who Owns The Data?
I asked Milton Packer (Baylor University) for his thoughts about the articles. He expressed concerns about the proposals from the two groups of clinical trialists, stating that these proposals appear to seriously undermine the principles behind data sharing. Here are his remarks:
“I have concerns about the proposals put forth by the ICIFTDS and ACCESS-CV. Both mention financial compensation, but it is really hard to understand how that would work. Who would get paid? The original sponsor? The research organization? The investigators? I cannot imagine how the contracts for the transference of payments would read. My humorous side thinks it would take at least 2-5 years before legal agreement would ever be reached. It is hard to support a financial requirement for data access. Proposing payments makes it sound as if the investigators really want to create a disincentive for data sharing.
“There is another (more important) issue. The proposals put forth by ICIFTDS and ACCESS-CV give great weight to the concept that — in some way and for some reason — investigators ‘own’ the data and that their proprietary rights need to be respected. Really? Should all clinical trial data not be considered an asset of the entire clinical and research community, regardless of who generated them? Why is there a proprietary consideration at all? Is a trialist’s need to publish so important? And if it is, what price do we need to pay to protect the publication interests of the original investigators?
“What we really need to collectively understand is that no one possesses data. By virtue of the fact that it was collected as a result of observations in human beings other than the investigators ethically makes it a shared resource without proprietary considerations. No one is trying to deprive the original investigators of their ability to present the major findings of a study first. But we should think of the data presentation as a responsibility and a privilege; it is not a possession, which needs to be guarded and to which access needs to be minimized.”