The following is from the speech “AI at the Intersection of Data Science and Life Science,” delivered to the Intelligent Health AI conference in Amsterdam on Oct. 12, 2021. 

My father was a physician who ran a small practice south of San Francisco. As a child, I loved to tag along to his office. I remember gazing up at the big rainbow of paper folders that lined the walls, each housing someone’s health story, told across hundreds of pages of visit notes and test results. And I remember doing my homework while he read through chart after chart, trying to identify patterns and correlations and any lessons they held.

Today, these paper charts are largely replaced by electronic health records (EHRs), which should in theory facilitate sharing and learning. But Flatiron works with more than 2,000 doctors in 800 cancer clinics across the U.S. – and we see, every day, that too much critically important patient information is still siloed, locked within electronic walls in systems that don’t speak to each other. Much of it is still unstructured data – like the notes a doctor types into an EHR’s open field – that exists in prose and is difficult to query. Many clinic teams still use ordinary whiteboards and markers to match patients to the clinical trials and experimental treatments that could be transformative for their care. And most trial sites still use paper logs to capture important data like how a patient’s tumor is responding to treatment.

Flatiron’s 2,500 experts – including hundreds of physicians, data scientists, machine learning (ML) engineers, and product designers, and over 1,500 nurses and tumor registrars – transform real-world data (RWD) gathered at the point of care into real-world evidence (RWE) to improve treatment, inform policy, and advance research. We partner across sectors and borders with other clinicians and researchers, with life science companies and academic health centers, and with bodies like the U.S. Food and Drug Administration and the National Institute for Health and Care Excellence (NICE). Over the past decade, we have helped bring important new treatments to cancer patients around the world.

Artificial intelligence (AI) – specifically, ML and natural language processing (NLP) – is core to achieving Flatiron’s mission to improve lives by learning from the experience of every cancer patient. Cancer is not a single disease; in fact, the more we learn, the more diseases it becomes. Cancer drug development and treatments change constantly. This is why oncology has been an important proving ground for RWE and precision medicine broadly. The more we all learn about the biology of tumors, the more personalized we can make cancer care – selecting the right treatment and approach for each unique individual. AI amplifies the potential in the data we curate. It helps surface characteristics of patients with a particularly rare form of cancer. It identifies patterns across millions of data points, driving learning at unprecedented scale. And, crucially, it helps bring innovative and potentially life-saving therapies to more patients, more quickly. 

But creating datasets and technology that can impact the lives of human beings needs human experts – always. Those of us who employ ML and other breakthrough technology in the service of people’s lives are guided by a very simple and important truth: Healthcare is a human endeavor.

Most of what we know today about which drugs work for which cancer patients comes from the relatively few patients who enroll in clinical trials. Clinical trials are still the gold standard for medical evidence. But the relatively few patients in trials tend to be younger, better educated, and less diverse, and to have fewer comorbidities, than the overall population of people with cancer. In a field like oncology – where it’s all about the n of one and where each patient’s disease and response to treatment are so personal – learning from a relatively small and unrepresentative portion of the population just isn’t good enough.

Flatiron uses ML to help fill this knowledge gap, integrating traditional and emerging sources of evidence about the experiences of more cancer patients, including the majority who aren’t in trials. Our datasets currently unlock knowledge from EHRs that hold the stories of almost 3 million cancer patients. This knowledge has helped lead to treatment alternatives for men with breast cancer, for patients with gastric cancer, those with colorectal cancer, and those with head and neck cancer, to name a few. 

ML is a critical tool in our work

ML and NLP help us create more valuable research datasets. Without additional manipulation, the point-of-care data we can consider and curate is only as good as the data that has been accurately and consistently entered into EHRs. Unfortunately, important data – like biomarkers that can power observational research in targeted populations or a patient’s functional performance status or smoking status – aren’t consistently collected or logged into structured fields that we can query. ML and NLP help us extract EHR data that is incomplete or difficult to access. Of course, algorithms can’t do this alone. We rely on trained clinical experts for validation and interpretation, particularly for data points that require judgment like diagnosis, stage, and progression. But ML makes all of this – and all of us – much more efficient.

ML helps pre-screen patients for clinical trials. It is critically important to identify patients who might be eligible for trials before they begin a standard-of-care therapy, in what is often a very short window of time. Most clinical trials happen in big hospitals, where dedicated staff can sift through mountains of data to determine who might be a good match for which trial. But most U.S. cancer patients get their care at smaller, more local clinics that aren’t resourced to do this time-intensive screening. We use ML, employed at the point of care, to comb through millions of data points and reduce the time it takes for research teams to screen for eligibility criteria across their patients and trials. Research coordinators can then directly surface potential trial matches in the EHR for physicians’ consideration. This is so much faster, more accurate, and more efficient than whiteboards and markers! And the faster trials are enrolled, the faster the patients in them – and many others – might have access to innovative, life-extending treatments.

ML supports this screening by inferring data that isn’t easy to find in the EHR. Metastatic status, for example, is often a key consideration in matching patients to trials but isn’t always logged in a structured EHR field. Our ML models can infer a patient’s metastatic status by analyzing unstructured EHR documents and generate accurate predictions in more than 90 percent of cases. But even that’s not good enough when patients’ lives are at stake, so we surface the predictions for review by clinic staff. Over the past two years, clinical research teams across our U.S. network have used this model to identify patients eligible for trials and have validated the ML-inferred metastatic status of more than 13,000 patients.  

Some complexities to consider

All models have statistical error based on the quality and quantity of training data, and all datasets come with bias. Data from patients treated at big hospitals may reflect different characteristics and treatment patterns than data from patients treated in smaller clinics. These are persistent challenges for those who work with RWD. If we train an ML model from data that reflect some underlying bias, we risk perpetuating that bias. For example, we are piloting a model in the U.S. – with algorithms applied to EHR data – that can predict if a patient is at risk of an acute event like an emergency hospital visit in the next 60 days. Knowing this risk helps physicians take proactive steps to avoid acute events, such as augmenting a patient’s treatment with IV fluids or changing their pain management. If we calibrated that model with a training dataset that used only patients who live near a major hospital, it might not perform well when applied to patients treated in environments with fewer resources. Similarly, a model trained in an urban environment might not perform well in a rural setting. Even more concerning is the potential for any model to perpetuate racial or ethnic biases. Flatiron has a longstanding commitment to advancing inclusive research and reducing disparities in cancer treatment and outcomes. In our ML work, we proactively measure our algorithms for bias, ensuring our models are calibrated to predict risk beyond the diversity of the patients being treated at the institutions represented in the training data. We use sensitivity analyses and calibration curves stratified by demographics, and we choose outcomes with deliberation, avoiding proxies like costs that are likely to be biased against under-represented groups.
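As an illustration only, the idea of a calibration curve stratified by demographics can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not a description of Flatiron’s actual tooling: the function names, the (group, predicted risk, outcome) record layout, the equal-width bins, and the toy records are all hypothetical. For each group, it compares the average predicted risk in a bin with the rate at which the event was actually observed.

```python
from collections import defaultdict


def calibration_by_group(records, n_bins=5):
    """Return {group: [(mean_predicted, observed_rate, count), ...]}.

    `records` is an iterable of (group, predicted_risk, outcome) tuples,
    with predicted_risk in [0, 1] and outcome equal to 0 or 1.
    All names and the record layout are hypothetical.
    """
    # bins[group][i] holds [sum of predictions, sum of outcomes, count]
    bins = defaultdict(lambda: [[0.0, 0, 0] for _ in range(n_bins)])
    for group, predicted, outcome in records:
        # Equal-width bins; clamp so a prediction of 1.0 lands in the last bin.
        index = min(int(predicted * n_bins), n_bins - 1)
        cell = bins[group][index]
        cell[0] += predicted
        cell[1] += outcome
        cell[2] += 1

    curves = {}
    for group, cells in bins.items():
        curves[group] = [
            (pred_sum / count, outcome_sum / count, count)
            for pred_sum, outcome_sum, count in cells
            if count > 0  # skip empty bins
        ]
    return curves


def max_calibration_gap(curve):
    """Largest absolute gap between mean predicted risk and observed rate."""
    return max(abs(mean_pred - observed) for mean_pred, observed, _ in curve)


# Toy, made-up records: (group, predicted 60-day risk, event observed?)
records = [
    ("urban", 0.1, 0), ("urban", 0.1, 0), ("urban", 0.9, 1), ("urban", 0.9, 1),
    ("rural", 0.1, 1), ("rural", 0.1, 0), ("rural", 0.9, 1), ("rural", 0.9, 1),
]

for group, curve in calibration_by_group(records).items():
    print(group, round(max_calibration_gap(curve), 3))
```

A larger gap in one group than another suggests the model’s predicted risks are less trustworthy for that group. In practice the bins would need enough patients in each stratum to be meaningful, which is one reason expert review of the training data matters.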

People turn ML’s promise into power 

ML is really about scaling human experts’ understanding. When lives are at stake, we can’t depend on algorithms alone. Many aspects of a patient’s status and treatment can be easily captured as variables to build ML models, but educated guesswork and human intuition are also critical building blocks. ML models infer missing data variables and surface patients for clinical trials, but you need human experts to weigh whether ML is even suited for the use case in question and to evaluate the quality, completeness, and representativeness of the data used to train the algorithms. Moreover, it takes humans to consider which algorithms and what data outputs will be most useful to clinical staff and researchers, to calibrate and continuously validate the models, and to use the tools and apply the outputs to help patients.

At Flatiron, algorithms don’t dictate what we do. People – teams of clinically trained nurses and tumor registrars – curate often-messy EHR information into the golden datasets we use to train ML models and to answer previously unanswerable research questions. Clinicians – who have the deepest understanding of the patient journey – work side by side with data scientists and engineers through every step of building our ML models, policies, and procedures, because you need both math and medical judgment to ensure that a model works as planned. People make sure our models are applied with great care to well-understood, carefully curated – and top-quality – data, and that any algorithms we build are scientifically, clinically, and statistically validated for meaningful use cases. People validate what our ML models infer, ensure the correct information is included in each patient’s chart, and determine how this information will impact that patient’s care. We spend a lot of time figuring out how to make ML models work for care teams because even the best tools must align with existing workflows or clinical teams won’t use them.

Looking ahead

The future of healthcare demands innovation that is solidly grounded in science and data, and holistic solutions that take into account every stage and stakeholder – from patients to doctors to clinics to researchers to developers to regulators and back to patients. We work with the top global developers of cancer treatments. And we know that the core problems in oncology persist – that R&D still takes too long, that it’s still exclusive, that costs are unsustainable. Flatiron continues to push the envelope to transform cancer research and care. Our software and methods help to more seamlessly integrate research into everyday care, to synthesize previously disparate and disconnected data, further closing the gap between clinical trials and RWE, and to broaden access to clinical trials, making research and care more inclusive.

Insights from Flatiron’s U.S.-derived RWD have led to new treatment alternatives for cancer patients across the globe. Now, we are also building ML-powered RWD partnerships with hospitals and health networks in Europe and in Asia. Each partnership is designed in alignment with local legal, regulatory, data-privacy, and compliance standards and requirements to support the generation and use of high-quality RWD so that we can learn from, and bring value to, more patients with cancer.

The world increasingly sees that RWD and RWE have roles to play in every stage of drug development. This year, we were proud to be a founding member of the RWE Alliance in the U.S., raising the collective, expert voice of those who actually work with RWD and RWE to help inform standards and guidelines. ML in healthcare is still a frontier field. We’re all learning as we go and keenly aware of the increasing need for standards and benchmarks for what constitutes acceptable performance for ML tools and applications. Regulators in the U.S. and Europe have recognized AI’s potential to transform healthcare by deriving important new insights from the vast amount of data generated during routine care, but they haven’t yet issued specific guidance on uses for drug evaluation. We’re also closely watching the evolving landscape governing software as a medical device to ensure that we follow best practices. And we look forward to what the ISPOR Machine Learning Methods in HEOR Task Force has to say about important topics like cohort selection, causal inference, and economic evaluation. Flatiron supports transparency in the development and application of standards and in reporting performance and validation of tools once standards are established. We know that different use cases may require different performance standards. 

We also know that the stakes are higher in healthcare than almost anywhere else ML is used. People’s lives are at stake. Patients are the reason we do this work. Cancer is personal, as well as professional, for so many of us at Flatiron. 

The value of RWD and RWE is much better understood and appreciated today than when Flatiron was founded almost a decade ago. And today’s pandemic has driven much wider awareness of how our individual medical experiences can contribute to high-quality research for the greater good. At the same time, our health experiences are steeped in intensely personal and cultural issues. Flatiron’s utmost respect for and commitment to patient privacy guides all of our work and partnerships. We work hard to help patients understand how their data – safely aggregated and anonymized – can unlock transformative insights. In the UK, for example, in addition to our research partnership with NICE, we’ve been working closely with DATA-CAN, the Health Data Research Hub for Cancer, to ensure that the voices of UK cancer patients are front and center in our developing work. 

We are headed toward a future in which expanded interoperability and evidence generation will mean greater access to even more of the data my dad dreamed about. As we continue along this exciting road, we need careful, deliberate work to integrate evidence sources in ways that help bring more treatments to more patients, more quickly. Algorithms can’t do this important work alone. At Flatiron, we’ve pioneered the use of AI to accelerate learning about cancer – and we know the future also relies on the care that human experts give and the decisions that human experts make. We can’t do this important work alone, or in silos. We look forward to working with many of you. Remember: patients are counting on us.

Author(s)
Carolyn Starrett oversees Flatiron Health’s community oncology and research businesses and all corporate functions.