
Assessing the contribution of scanned outside documents to the completeness of real-world data abstraction

Published

February 2023

Citation

Zhao Y, Howard R, Amorrortu RP, Stewart SC, Wang X, Calip GS, Rollison DE. Assessing the Contribution of Scanned Outside Documents to the Completeness of Real-World Data Abstraction. JCO Clin Cancer Inform. 2023. https://doi.org/10.1200/cci.22.00118

Our summary

Electronic health records (EHRs) show promise for clinical decision support, precision medicine, quality improvement, disease surveillance, and population health management; however, how to collect, curate, and use this data efficiently remains a subject of ongoing debate. Up to 80% of EHR data can be stored in unstructured formats, including narrative text notes and scanned documents, which complicates secondary use of the data.

In this study, researchers from Moffitt Cancer Center and Flatiron Health aimed to evaluate how much the clinical information available in unstructured data contributes to the completeness of a patient record, and to understand whether this information affects the use of real-world data for cancer research.

Why this matters

Compared to structured data, unstructured data is typically stored in different, less accessible formats and in backend databases. However, studies show that accessing even a small, targeted subset of this data can be of significant value when integrated with other readily available clinical information. It is therefore important to determine the extent to which unstructured data may affect the development of real-world datasets that support quality reporting, cancer research, and overall progress toward personalized, value-based care.

