It is widely acknowledged that real-world data (RWD) has the potential to unlock value for researchers and organizations seeking to bring more effective, affordable and accessible health solutions to patients in need. However, as is often the case with new paradigms, the evolving structures and provenance of RWD can make its analysis quite complex and challenging.

More than 80 percent of pharmaceutical companies surveyed in a recent market analysis indicated that they were entering into strategic partnerships to access new sources of RWD. This surge in sources, along with inconsistency and variability in what are considered the best methodologies to apply to RWD, underscores a growing need to establish trust and transparency in the analytical methods, insights, and conclusions drawn from studies that use RWD.

Key thought leaders from groups like The International Society for Pharmacoeconomics and Outcomes Research (ISPOR) have stated that the “growing interest (in RWD) has created an urgency to develop processes that promote trust in the evidence-generation process and to enable decision-makers to evaluate the quality of the methods used in real-world studies.”

Trust in real-world evidence (RWE) is usually tied to concepts like pre-specification of an analytic plan, but it should also extend to detailed discussion of RWD source characteristics and data quality. Going further, we can build trust in RWE through transparency into how analytic methods are implemented for a particular study, as well as the ability to reproduce that study.

The quote from ISPOR highlights a growing call to action for those who generate and use RWD to commit to demonstrating the trustworthiness of RWE. In order to do that, we all need to ensure our analyses are both reproducible and transparent.

Let’s pause here for a moment.

Reproducibility and transparency are often used interchangeably; it is common to assume that, if you have one, you automatically have the other. In fact, the two terms are more nuanced, and there are variations in how they are understood and practiced across the scientific community. We at Flatiron define them as follows:

Reproducibility: An analysis is considered reproducible if an independent researcher can generate identical results and insights using the original data and code.

Transparency: An analysis is considered transparent if an independent researcher can interrogate and understand the specific methods, individual analytic steps, and processes applied for a particular analysis well enough to generate identical results using the original data, but without the benefit of the original code.
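To make the reproducibility definition concrete, here is a toy sketch (not Flatiron's actual tooling; the analysis function and data are invented for illustration) of how an independent rerun of the same code on the same data can be checked for identical results, by fingerprinting the output:

```python
import hashlib
import json
import random

def run_analysis(data, seed=42):
    """A stand-in for a real analysis: sample and summarize with a fixed seed."""
    rng = random.Random(seed)  # seeded RNG makes the sampling deterministic
    sample = rng.sample(data, k=3)
    return {"n": len(data), "sample_mean": sum(sample) / len(sample)}

def result_fingerprint(result):
    """Hash a canonical serialization so independent runs compare byte-for-byte."""
    canonical = json.dumps(result, sort_keys=True).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()

data = [4, 8, 15, 16, 23, 42]
first = result_fingerprint(run_analysis(data))
second = result_fingerprint(run_analysis(data))  # an "independent" rerun
assert first == second  # same data + same code + same seed -> identical results
```

The key design point is that every source of nondeterminism (here, the random seed) is pinned; without that, even an exact rerun of the original code can fail the reproducibility test.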

It’s important to emphasize that an analysis that’s reproducible isn’t, by extension, necessarily correct. Reproducible code may have bugs or errors, or the methods used may be inappropriate. While reproducibility is likely correlated with correctness, in today’s rapidly evolving RWD landscape it’s not enough for an analysis to simply be reproducible; it should also be transparent.

Many “thought leadership” pieces already call for more reproducibility and transparency. The question is: How?

Creating a reproducible and transparent RWD analytic workflow requires careful consideration, not only of the high-level analytical methodology (what are the best practices, and why), but also of the technical application of that methodology (how are best practices implemented).

In this moment, the RWD community has a tremendous opportunity to leverage widespread momentum to collaboratively establish new standards that will both define and accelerate robust analysis of RWD.

To that end, Flatiron engineers, data scientists and clinical experts have joined with peers and researchers in life sciences companies, academic institutions and other organizations that work with Flatiron RWD, in a collaboration dedicated to actualizing methodological reproducibility and transparency. The more we all collaborate on the tools and techniques needed for analyzing Flatiron RWD, the sooner the promise of RWE will come into focus.

These Flatiron authors are collaboratively exploring these issues with peers from Huntsman Cancer Institute, Moffitt Cancer Center, Foundation Medicine and the Roche Group (special thanks to Jinjoo Shim). Click here to learn more and join the conversation.

Director, Data Insights Engineering
Josh builds tools to improve the research and analysis process for scientists using real-world data, as well as point-of-care solutions for physicians across Flatiron’s network.
Product Manager
Conal focuses on building solutions to enable rapid, transparent, reproducible analyses using Flatiron data.
Senior Quantitative Scientist
Daniel co-leads the documentation and dissemination of best practices for analyzing Flatiron real-world data for regulatory and non-regulatory uses.