Natural language processing-based detection of transgender and gender non-conforming patients in electronic health record-derived data
Authors: Ian J. Hooley et al.
Studying transgender and gender non-conforming (GNC) individuals using EHR-derived RWD is difficult as gender identity indicators are not reliably populated in structured data fields. Researchers from Flatiron Health looked to develop a natural language processing-based approach to detect transgender and GNC patients based on patient charts within a real-world dataset.
Why this matters
Studying health outcomes in rare patient populations continues to be an unmet need. This is particularly true for populations that are known to have inferior healthcare outcomes relative to the broader population, such as transgender and gender non-conforming patients.
While real-world data sources represent promising tools to find and study these populations, variations in documentation workflows (e.g., lack of structured gender identity information) continue to limit research. This study presents an approach to overcome that problem in the context of transgender and gender non-conforming patient research by deploying natural language processing to scan unstructured EHR documents. Strategies like this can enable research geared towards tackling disparities in care and outcomes for rare and underserved populations.
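As a rough illustration of what scanning unstructured EHR documents can look like, the sketch below flags charts whose notes contain candidate terms and returns the surrounding text for manual review. It is a minimal sketch, not the study's method: the term list, the function names `find_mentions` and `flag_charts`, and the chart structure are all assumptions made for this example, and a simple pattern match like this would need clinical review to handle negation, family history, and other false positives.

```python
import re

# Hypothetical term list for illustration only; not the study's actual lexicon.
CANDIDATE_TERMS = [
    r"transgender",
    r"trans(?:\s|-)?(?:man|woman|male|female)",
    r"gender\s+non-?conforming",
    r"non-?binary",
    r"gender\s+dysphoria",
]
PATTERN = re.compile(r"\b(?:" + "|".join(CANDIDATE_TERMS) + r")\b", re.IGNORECASE)


def find_mentions(note_text, window=40):
    """Return (matched term, surrounding snippet) pairs for manual review."""
    mentions = []
    for match in PATTERN.finditer(note_text):
        start = max(0, match.start() - window)
        end = min(len(note_text), match.end() + window)
        mentions.append((match.group(0), note_text[start:end]))
    return mentions


def flag_charts(charts):
    """Map chart id -> mentions, keeping only charts with at least one hit.

    `charts` is assumed to be a dict of chart id -> list of note strings.
    """
    flagged = {}
    for chart_id, notes in charts.items():
        hits = [mention for note in notes for mention in find_mentions(note)]
        if hits:
            flagged[chart_id] = hits
    return flagged
```

A call such as `flag_charts({"chart-1": ["Patient identifies as a transgender woman."]})` would return the matched term together with its context, which a reviewer could then confirm or reject.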
Can early US adoption of cancer drugs inform HTA decision making?
Authors: Blythe Adamson et al.
Variation in time-to-approval of new drugs internationally could enable the use of RWD from the US to inform health technology appraisals (HTAs). This research conducted by NICE, Flatiron Health and Fred Hutchinson Cancer Research Center explored whether time from FDA approval to NICE guidance might provide an opportunity to inform reimbursement decisions with real-world US patients.
Why this matters
Health Technology Appraisals (HTAs) are quickly becoming a critical component of population-level decision processes in healthcare. Yet, they often have to be conducted based on information generated within the limited boundaries of clinical trials.
Real-world data can be effectively incorporated in these processes to vastly expand their scope, depth and insight on the anticipated impact of new technologies. One potential approach can be to leverage the learnings gleaned from regions with early drug approvals to inform subsequent appraisals as they roll out. By documenting the extent to which US approvals of oncology drugs precede European ones, this study opens the door to a fertile research area for the generation of valuable insights relevant to HTAs.
Statistical methods for pantumor analysis: models to account for tumor-level heterogeneity
Authors: Akshay Swaminathan et al.
Interest is growing in the development of tumor-agnostic treatments, but pan-tumor analyses can be difficult because of tumor-level differences. In this poster, researchers compared the performance of six Cox models for estimating simulated tumor-specific and pantumor effects of a biomarker on overall survival. Across these models, they saw similar performance for estimating pantumor effects, with a random-effects Cox model performing most favorably.
Why this matters
One of the most striking developments in oncology is the refined understanding of malignancies driven by distinct genetic or genomic alterations, which can be present across multiple ‘traditional’ tumor types, i.e., in pan-tumor settings. While these settings have seen multiple drug approvals in recent years, research in this area continues to be challenging.
Conducting studies and interpreting results generated across disparate clinical contexts is a laborious task, often carried out on small patient samples. This study provides a framework to support these studies and proposes another analytical tool that could further a nascent research field.
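For readers unfamiliar with the random-effects Cox model mentioned in this poster, one common way to write such a model is shown below. The summary does not give the poster's exact specification, so the notation here is an assumption: $i$ indexes patients, $j$ indexes tumor types, $x_{ij}$ is biomarker status, and the random effect $b_j$ is taken to be normally distributed.

```latex
h_{ij}(t) = h_{0}(t)\,\exp\{(\beta + b_j)\,x_{ij}\},
\qquad b_j \sim \mathcal{N}(0, \tau^{2})
```

Under this formulation, $\exp(\beta)$ is the pantumor hazard ratio for the biomarker, $\exp(\beta + b_j)$ is the hazard ratio for tumor type $j$, and $\tau^{2}$ measures how much the effect varies across tumor types. Estimates for tumor types with few patients are pulled toward the pantumor effect, which is one reason this kind of model can suit small samples.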
Enhanced cost-effectiveness analysis using EHR data for real-world value
Authors: Akshay Swaminathan et al.
EHR-derived RWE has been shown to be more relevant, timely, and representative for health technology appraisal (HTA) decision-making compared to evidence from clinical trials. This study found that RWE can reduce uncertainty in cost-effectiveness estimates due to larger sample sizes and longer duration of follow-up times compared to published trial data.
Why this matters
While clinical trials are the gold standard for establishing the efficacy of new drugs, their use as a source for health technology appraisals (HTAs) is limited when it comes to generating reliable estimates of effectiveness and value at the population level.
This study characterizes how two core real-world evidence (RWE) attributes, the ability to accrue large patient cohorts and the ability to aggregate longitudinal data over relatively long follow-up periods, can address the shortcomings found in clinical trial evidence and potentially lead to more precise evaluations of the impact of new technologies.
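The link between cohort size and precision can be illustrated with a simple calculation. The sketch below is not taken from the study: it uses a normal-approximation confidence interval for a proportion (for example, the share of patients alive at a fixed time point) and a hypothetical helper, `ci_half_width`, to show how the interval narrows as the number of patients grows. It addresses sample size only, not the separate benefit of longer follow-up.

```python
from math import sqrt
from statistics import NormalDist


def ci_half_width(p, n, level=0.95):
    """Half-width of a normal-approximation confidence interval for a proportion.

    p     : estimated proportion (between 0 and 1)
    n     : number of patients contributing to the estimate
    level : confidence level, 0.95 by default
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if n <= 0:
        raise ValueError("n must be positive")
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% interval
    return z * sqrt(p * (1 - p) / n)


# Hypothetical cohort sizes chosen only to show the trend.
for n in (100, 400, 1600):
    print(n, round(ci_half_width(0.5, n), 4))
```

Because the half-width scales with one over the square root of the sample size, quadrupling the cohort halves the width of the interval.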
Quantifying bias in Flatiron ML-extracted variables for inference in clinical oncology
Authors: Jaron Lee et al.
Machine learning (ML) can be used to extract clinically relevant information from EHRs for the purposes of conducting analyses using real-world data. Recently, there has been increased discussion of biases in machine learning models, which calls for further exploration. This study assessed the effects of misclassification error in ML-extracted clinical variables when used in statistical analyses.
Why this matters
Real-world data sources have well-known limitations, such as the challenges associated with extracting information suitable for analysis from unstructured documents (notes, reports). This extraction has traditionally been carried out by manual abstraction, a burdensome and resource-intensive process that is hard to scale.
Machine learning (ML) tools have emerged as a valuable approach to automate the extraction of information from medical documents. The rigorous deployment of ML-extracted information for clinical research, however, demands analytic transparency. It is important to evaluate ML-extracted variables in the context of data quality, to understand their reliability and their potential for error, to ultimately control for the biases that these variables may introduce in subsequent analyses. This study takes two ML-extracted variables in an EHR-derived dataset to characterize their quality and estimate the associated risk for downstream analytic biases, in a seminal approach to the investigation of data quality in the realm of ML-extracted variables.
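To make the idea of misclassification bias concrete, the sketch below shows how an imperfect binary classifier distorts an observed prevalence, and how the standard Rogan-Gladen correction recovers the true value when sensitivity and specificity are known. This is a generic textbook illustration, not the method used in the study; the function names and the example numbers are assumptions.

```python
def observed_prevalence(true_prev, sensitivity, specificity):
    """Prevalence a classifier would report, given its error rates.

    True positives contribute sensitivity * true_prev; false positives
    contribute (1 - specificity) * (1 - true_prev).
    """
    return sensitivity * true_prev + (1 - specificity) * (1 - true_prev)


def corrected_prevalence(observed_prev, sensitivity, specificity):
    """Rogan-Gladen estimate of the true prevalence, clipped to [0, 1]."""
    denom = sensitivity + specificity - 1
    if denom <= 0:
        raise ValueError("classifier must perform better than chance")
    estimate = (observed_prev + specificity - 1) / denom
    return min(1.0, max(0.0, estimate))


# Hypothetical values: 20% true prevalence, 90% sensitivity, 95% specificity.
observed = observed_prevalence(0.20, 0.90, 0.95)
recovered = corrected_prevalence(observed, 0.90, 0.95)
print(round(observed, 4), round(recovered, 4))
```

With these assumed values, the classifier overstates the prevalence slightly, and the correction recovers the original figure. In practice, sensitivity and specificity are themselves estimated from a labeled validation set, so their uncertainty carries over into any corrected estimate.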