Not All MCED Tests are Created Equal: The Realities of MCED Test Development and Validation
July 31, 2025

By Joshua Ofman, MD, MSHS, GRAIL President; Megan P. Hall, Ph.D., GRAIL Medical Affairs; Christina Clarke Dur, Ph.D., GRAIL Cancer Epidemiology

Multi-cancer early detection (MCED) tests hold enormous promise, but will only achieve their public health and clinical impact if the tests are validated rigorously in the appropriate “intended use” population: adults at elevated risk with no clinical suspicion of cancer.

While there are several MCED tests in the discovery and development phase, we believe that no cancer screening test should be introduced into clinical practice until its performance has been prospectively validated in the intended use population. The Food and Drug Administration (FDA) has announced a similar approach for how safety and effectiveness are established for diagnostic tests (Note 1). GRAIL’s Galleri MCED test was not launched until the PATHFINDER study, which was conducted under an FDA-approved investigational device exemption application, confirmed Galleri’s performance in the intended use population: adults aged 50 and above with no clinical suspicion of cancer (Schrag 2023). Clinical validation from an interventional study in the intended use population must not be confused with analytical or basic validation from confirmatory sample sets in discovery and development studies.

Why is clinical validation in the intended use population so important for cancer screening? 
For certain MCED tests in development, promising performance in retrospective case-control studies has not consistently been confirmed by trials in the intended use population evaluating performance in clinical practice. Retrospective case-control studies may have significant study design flaws. For example, studies may be small, have cases and controls that are highly selected and not representative of cancer prevalence in the general population, or not be appropriately “matched” – e.g., the samples are from different times, different clinics or health systems, and/or patients of different ages. These limitations can lead to non-reproducible results, for example when a test detects study artifacts rather than cancer, when training and validation sample sets are not clearly distinguished (or are even combined), or when batch effects (differences in sample handling and machine conditions) or other technical artifacts are present. This has been observed previously with ovarian cancer tests (e.g., Petricoin 2002, Baggerly 2005) and other early MCED technologies (e.g., Cohen 2018, Lennon 2020). For example, the original CancerSEEK assay was evaluated in a case-control study that reported a specificity of greater than 99% (Cohen 2018). However, when studied in a clinical trial in the intended use population, the specificity of the first blood test was 95.3% (a false-positive rate of 4.7%, at least 4.7 times higher), with a positive predictive value (PPV) of 5.9% (Lennon 2020).
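The arithmetic behind this comparison can be sketched in a few lines of Python. The two specificity values are the ones quoted above; the sensitivity and prevalence passed to `ppv` are hypothetical placeholders chosen only to show how strongly PPV depends on specificity in a low-prevalence screening population. They are not taken from Cohen 2018 or Lennon 2020, so the sketch does not reproduce the reported 5.9% PPV.

```python
def false_positive_rate(specificity):
    """False-positive rate is the complement of specificity."""
    return 1.0 - specificity


def ppv(sensitivity, specificity, prevalence):
    """Positive predictive value from Bayes' rule.

    PPV = true positives / (true positives + false positives),
    expressed per person screened.
    """
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


# Specificities quoted in the text (99% is a lower bound for the case-control study).
case_control_fpr = false_positive_rate(0.99)
intended_use_fpr = false_positive_rate(0.953)
fold_increase = intended_use_fpr / case_control_fpr
print(f"Fold increase in false-positive rate: {fold_increase:.1f}")

# Hypothetical inputs (assumptions for illustration only).
ASSUMED_SENSITIVITY = 0.30
ASSUMED_PREVALENCE = 0.01

ppv_high_spec = ppv(ASSUMED_SENSITIVITY, 0.99, ASSUMED_PREVALENCE)
ppv_low_spec = ppv(ASSUMED_SENSITIVITY, 0.953, ASSUMED_PREVALENCE)
print(f"PPV at 99.0% specificity: {ppv_high_spec:.1%}")
print(f"PPV at 95.3% specificity: {ppv_low_spec:.1%}")
```

Under these assumed inputs, the same sensitivity yields a much lower PPV at the lower specificity, because false positives from the large cancer-free majority dominate the positive results.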

Unfortunately for the MCED field, some test developers are claiming results from small, retrospective case-control studies as “validation” with no reported plans for prospective studies in the intended use population (Abraham 2025, Seeking Alpha). The results of such studies may appear promising at first glance, but they should not be considered validation for real-world screening readiness. There is simply no way to establish a test’s safety or benefits until clinical validation performance has been established in the intended use population.

If tests without sufficient validation are prematurely offered and result in patient harm, the entire field of MCED could be set back, which has the potential to dramatically impact public health. Harms to individual patients may result from missing deadly cancers, the risks of excessive diagnostic follow-ups resulting from tests with high false-positive rates, and the risk of over-diagnosing indolent cancers. The only way to understand, quantify, and minimize these patient risks is by ensuring that any test introduced into clinical practice has been adequately and rigorously validated with strong performance characteristics in the intended use population. 

The first clinically introduced MCED test – Galleri – is supported by strong, published results from large and well-designed case-control studies, interventional trials, and real-world studies in the intended use population (Klein 2021, Schrag 2023, Atwood 2024). Specifically, the Galleri test’s robust specificity and cancer signal origin (CSO) accuracy in two of the largest and most diverse trials ever conducted in cancer screening, including the first and only MCED randomized controlled trial, have confirmed what was observed in earlier studies (Klein 2021, Schrag 2023, Giridhar 2024, Neal 2022, GRAIL 2025 [1], GRAIL 2025 [2]). Importantly, the PPV substantially increased in these larger and more representative trials (GRAIL 2025 [1], GRAIL 2025 [2]) and cancer detection rates were substantially higher when added to standard of care (SOC) screening in PATHFINDER 2 compared to PATHFINDER (GRAIL 2025 [2]). The results of this breakthrough technology and the education of the clinical community have helped catalyze the entire MCED field. However, the MCED field is at risk if other MCED tests are launched without the same approach to validation and without rigorously demonstrating a favorable benefit-risk profile.

Cancer is soon to become the leading killer worldwide, and the current status quo of cancer screening is unacceptable. We only look for three cancers in women (breast, colon, and cervical), two in men (colon and prostate), and leverage an additional screen (lung) for heavy smokers (Nicholson 2024, Davidson 2021, Krist 2021, Curry 2018, Grossman 2018). While these screening tests are saving lives, we recently estimated with the American Cancer Society that nearly 80% of cancer deaths result from all of the other cancers we are not screening for today (Ofman 2025) – there is no mechanism to detect those cancers before symptoms appear, when cancers are more treatable and outcomes are better. The SOC screening paradigm identifies only 14% of cancers in the U.S. population (Ofman 2025). Adding the Galleri test to SOC screening could dramatically increase the cancer detection rate in the population (Hackshaw 2021) without increasing the risk of over-diagnosis (Chen 2020, Swanton 2025). In the PATHFINDER trial in the intended use population, adding Galleri to SOC more than doubled the number of cancers detected, with half being in stages 1 or 2 (Schrag 2023).

It is quite apparent that not all MCED tests are created equal. For this reason, any comparison of test performance must be evidence-driven and based on the study design, as we describe below (Note 2). For example, it would be clinically inappropriate to compare the results from a case-control study against those from an interventional study. 

To realize the tremendous promise of MCED tests, and continue to develop this important field, all test developers must be held to high standards. Our families, our loved ones, and patients are depending on us. 

Note 1: FDA Guidance Documents Supporting Methodologies and Expectations for Clinical Validation 

Design Considerations for Pivotal Clinical Investigations for Medical Devices

“Sites from which subjects or samples are chosen for studies that support the intended use of the device should be representative of the types of sites where the device is intended to be used. Subjects or samples should also represent the proposed target population. Estimates of overall performance from non-representative sites or subjects may suffer from selection bias.” 

Statistical Guidance on Reporting Results from Studies Evaluating Diagnostic Tests

“We note at the outset that evaluation of a new diagnostic test should compare a new product’s outcome (test results) to an appropriate and relevant diagnostic benchmark using subjects/patients from the intended use population; that is, those subjects/patients for whom the test is intended to be used.” 

In Vitro Diagnostic (IVD) Device Studies – FAQs

“Studies should be performed in a representative sample of the intended use population (i.e., representation of both diseased and non-diseased cases, and controlling for subject demographics and morbidity factors that may affect the level of device performance).”

Note 2: A Checklist for Careful Evaluation of Blood-Based MCED Studies

There are important considerations when evaluating the results of an MCED study or comparing results between studies. 

  1. What was the study design? Well-designed case-control studies, in which cancer patients are tested after diagnosis, can provide estimates of “test sensitivity.” Interventional studies in the intended use population provide estimates of “episode sensitivity,” because patients are followed for a defined time period to understand what cancers may have been missed by the test, and whether apparent false positive test results actually precede a cancer diagnosis. Performance that is applicable to actual clinical use should be assessed in the intended use population and comparisons should not be made across different study designs. 
  2. In an interventional study, what is the length of the episode? Interventional studies define an episode duration (e.g., 12 months to define the cancer status) to estimate episode sensitivity. Studies with different episode durations may have different sensitivity estimates and are difficult to compare with one another. They cannot be compared to estimates of test sensitivity from case-control studies.
  3. Were the sensitivity estimates reported at the same specificity? Any estimates of test sensitivity must be compared in relation to the reported levels of specificity. For example, a 98.5% specificity has a 3x higher false positive rate than a specificity of 99.5%. With a lower specificity, a test would be expected to have a higher sensitivity estimate, all other factors held constant.  
  4. What is the overall cancer incidence and case mix in the study population? Overall cancer incidence rate over the episode may be influenced by the percentage of cancer survivors or other high-risk groups. Cancer case mix is one of the most important study characteristics for interpreting performance. If a population is rife with indolent forms of breast and prostate cancer or late-stage cancers versus more deadly cancers across all stages, the performance characteristics will be very different. Studies that exclude certain cancer types should not be directly compared to studies with a different case mix of cancers, as this will impact the results. 
  5. In the interventional studies, what was the intensity and timing of guideline-based screening or imaging? This will have an impact on the results. 
  6. What is the extent of the healthy volunteer effect? In screening trials, it is usual to have participants who are healthier than the general population, with lower overall cancer incidence rates and higher adherence to guideline-based screening. This can impact the cancer case mix. It is generally appropriate to standardize the cancer case mix to that of a standard population (e.g., SEER) to generate estimates of performance that can be more easily compared across different study populations.
  7. Finally, are the results reported from the MCED blood test itself, or from a combination of tests (e.g., blood test + PET-CT)?
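Two of the checklist items involve simple arithmetic that can be sketched in Python. The first part reproduces the false-positive-rate comparison in item 3 using the specificities quoted there. The second part illustrates the case-mix standardization mentioned in item 6 as direct standardization, where per-cancer-type sensitivities are reweighted by a reference population's case mix. This is a minimal sketch under stated assumptions, not any study's actual method: the cancer type labels, sensitivities, and case-mix shares are hypothetical placeholders, not SEER data or trial results.

```python
def false_positive_rate(specificity):
    """False-positive rate is the complement of specificity."""
    return 1.0 - specificity


def weighted_sensitivity(sensitivity_by_type, case_mix):
    """Overall sensitivity as a case-mix-weighted average.

    case_mix maps each cancer type to its share of all cancers;
    shares are normalized here so they need not sum to exactly 1.
    """
    total = sum(case_mix.values())
    return sum(
        sensitivity_by_type[cancer_type] * share
        for cancer_type, share in case_mix.items()
    ) / total


# Item 3: 98.5% vs 99.5% specificity (values quoted in the checklist).
fpr_ratio = false_positive_rate(0.985) / false_positive_rate(0.995)
print(f"False-positive rate ratio: {fpr_ratio:.1f}")

# Item 6: hypothetical per-type sensitivities and case mixes (illustration only).
sensitivity_by_type = {"type_a": 0.80, "type_b": 0.20}
study_case_mix = {"type_a": 0.25, "type_b": 0.75}      # mix observed in a study
reference_case_mix = {"type_a": 0.50, "type_b": 0.50}  # mix in a reference population

crude = weighted_sensitivity(sensitivity_by_type, study_case_mix)
standardized = weighted_sensitivity(sensitivity_by_type, reference_case_mix)
print(f"Crude sensitivity in study case mix: {crude:.2f}")
print(f"Sensitivity standardized to reference mix: {standardized:.2f}")
```

With identical per-type sensitivities, the overall figure moves from 0.35 to 0.50 purely because the case mix changes, which is why unstandardized results from different study populations are hard to compare.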

References:

Abraham, J., Domenyuk, V., Perdigones, N. et al. Validation of an AI-enabled exome/transcriptome liquid biopsy platform for early detection, MRD, disease monitoring, and therapy selection for solid tumors. Sci Rep 2025;15:21173 doi:10.1038/s41598-025-08986-0.

Atwood C, Moy S, Kindy MS, et al. REFLECTION: Initial Findings From a Real-World Evidence Study of Multi-Cancer Early Detection (MCED) and Toxic Exposures Among Veterans in the Veterans Affairs Healthcare System (VA). Poster and Presentation at the Early Detection of Cancer Conference (EDCC); October 22-24, 2024; San Francisco, CA.

Baggerly KA, Coombes KR, Morris JS. Bias, Randomization, and Ovarian Proteomic Data: A Reply to “Producers and Consumers”. Cancer Informatics 2005;1. doi:10.1177/117693510500100101.

Chen X, Dong Z, Hubbell E, et al. Prognostic Significance of Blood-Based Multi-cancer Detection in Plasma Cell-Free DNA. Clin Cancer Res 2020;27(15):4221–4229. doi:10.1126/science.abb9.

Cohen JD, Li L, Wang Y, et al. Detection and localization of surgically resectable cancers with a multi-analyte blood test. Science 2018;359(6378):926–930. 

Curry SJ, Krist AH, Owens DK, et al. Screening for Cervical Cancer: US Preventive Services Task Force Recommendation Statement. JAMA 2018;320(7):674-686. doi:10.1001/jama.2018.10897.

Davidson KW, Barry MJ, Mangione CM, et al. Screening for Colorectal Cancer: US Preventive Services Task Force Recommendation Statement. JAMA 2021;325(19):1965-1977. doi:10.1001/jama.2021.6238.

Giridhar KV, Demeure MJ, Kim RH, et al. PATHFINDER 2: A Prospective Study to Evaluate Safety and Performance of a Multi-Cancer Early Detection Test in a Population Setting. Cancer Res 2024;84(6_Supplement):4784. doi:10.1158/1538-7445.AM2024-4784.

GRAIL, Inc. [1] (2025, May 13). GRAIL Reports First Quarter 2025 Financial Results [Press release]. https://grail.com/press-releases/grail-reports-first-quarter-2025-financial-results/

GRAIL, Inc. [2] (2025, June 18). GRAIL Announces Positive Top-Line Results From The Galleri® PATHFINDER 2 Registrational Study [Press release]. https://grail.com/press-releases/grail-announces-positive-top-line-results-from-the-galleri%e2%93%a1-pathfinder-2-registrational-study/

Grossman DC, Curry SJ, Owens DK, et al. Screening for Prostate Cancer: US Preventive Services Task Force Recommendation Statement. JAMA 2018;319(18):1901-1913. doi:10.1001/jama.2018.3710.

Hackshaw A, Cohen SS, Reichert H, Kansal AR, Chung KC, Ofman JJ. Estimating the population health impact of a multi-cancer early detection genomic blood test to complement existing screening in the US and UK. Br J Cancer 2021;125(10):1432-1442. doi: 10.1038/s41416-021-01498-4.

Klein EA, Richards D, Cohn A, et al. Clinical validation of a targeted methylation-based multi-cancer early detection test using an independent validation set. Ann Oncol 2021 Sep;32(9):1167-1177. doi: 10.1016/j.annonc.2021.05.806.

Krist AH, Davidson KW, Mangione CM, et al. Screening for Lung Cancer: US Preventive Services Task Force Recommendation Statement. JAMA 2021;325(10):962-970. doi:10.1001/jama.2021.1117.

Lennon AM, Buchanan AH, Kinde I, et al. Feasibility of blood testing combined with PET-CT to screen for cancer and guide intervention. Science. 2020 Jul 3;369(6499):eabb9601. doi: 10.1126/science.abb9601. Epub 2020 Apr 28.

Neal RD, Johnson P, Clarke CA, et al. Cell-Free DNA-Based Multi-Cancer Early Detection Test in an Asymptomatic Screening Population (NHS-Galleri): Design of a Pragmatic, Prospective Randomised Controlled Trial. Cancers 2022;14(19):4818. doi: 10.3390/cancers14194818.

Nicholson WK, Silverstein M, Wong JB, et al. Screening for Breast Cancer: US Preventive Services Task Force Recommendation Statement. JAMA 2024;331(22):1918-1930. doi:10.1001/jama.2024.5534.

Ofman JJ, Dahut W, Jemal A, Chang ET, Clarke CA, Hubbell E, Kansal AR, Kurian AW, Colditz GA, Patel AV. Estimated proportion of cancer deaths not addressed by current cancer screening efforts in the United States. Cancer Biomark 2025 Jan;42(1):18758592241308754. doi: 10.1177/18758592241308754. Epub 2025 Mar 20. PMID: 40109213.

Petricoin EF, Ardekani AM, Hitt BA, et al. Use of proteomic patterns in serum to identify ovarian cancer. Lancet 2002;359(9306):572-7. doi: 10.1016/S0140-6736(02)07746-2.

Schrag D, Beer TM, McDonnell CH, et al. Blood-based tests for multi-cancer early detection (PATHFINDER): a prospective cohort study. Lancet 2023;402:1251-1260. doi: 10.1016/S0140-6736(23)01700-2.

Swanton C, Cohn A, Margolis M, et al. Prognostic significance of blood-based multi-cancer detection in circulating tumor DNA (ctDNA): 5-year outcomes analysis. J Clin Oncol 2025;43(suppl 16). doi: 10.1200/JCO.2025.43.16_suppl.101.
