Machine learning in pathology and clinical laboratory medicine: a narrative review
Introduction
Background
Machine learning (ML) refers to computational methods that enable algorithms to learn patterns from data and improve their performance on specific tasks without being explicitly programmed. ML includes a broad range of approaches, spanning conventional statistical models like logistic regression and decision trees to modern deep learning (DL) and neural network architectures. Selecting an appropriate method depends on the structure of the data, the complexity of the task, and the clinical context of the application (1,2).
In recent years, ML has begun to transform the field of pathology by enabling automated, accurate, and scalable interpretation of complex medical images and laboratory data. Its integration into diagnostic workflows represents a paradigm shift, supporting applications such as histopathologic image classification, diagnostic decision support, and prognostic risk modeling. These tools have been shown to enhance diagnostic precision, improve consistency, accelerate turnaround times, and reduce manual workload.
Early ML applications in pathology relied on handcrafted feature extraction guided by field expertise, which limited generalizability and scalability (3). The advent of DL, however, has enabled automated, hierarchical feature learning directly from raw image data, substantially improving performance in complex tasks such as tumor classification, margin detection, and metastasis identification (4,5).
While much of the early literature has focused on histopathologic image analysis, the potential of ML extends far beyond morphologic interpretation. Increasingly, ML is being applied to all phases of the total testing process, from pre-analytical error detection and analytical quality monitoring to post-analytical interpretation and reference interval (RI) optimization. These developments highlight a growing role for ML in strengthening laboratory quality systems, reducing human error, and enhancing diagnostic safety.
Rationale and knowledge gap
Despite advances in diagnostic technologies, pathology and clinical laboratory medicine continue to face common and persistent challenges throughout disease screening and diagnosis. These include steadily increasing test volumes, growing case complexity, and workforce shortages, all of which place pressure on diagnostic accuracy, consistency, and turnaround time. In pathology, manual interpretation of complex histologic patterns is time-intensive and subject to interobserver variability, particularly for borderline or heterogeneous lesions. In clinical laboratories, high-throughput testing across the total testing process introduces multiple opportunities for pre-analytical, analytical, and post-analytical errors, while traditional rule-based quality control (QC) systems may lack sensitivity to subtle or multivariate error patterns.
These shared obstacles highlight the need for scalable, data-driven approaches capable of supporting diagnostic decision-making, reducing variability, and enhancing quality assurance across diagnostic disciplines. ML offers such a framework by enabling automated pattern recognition in high-dimensional data, whether derived from whole-slide images (WSIs) or longitudinal laboratory results, thereby providing a unifying methodological approach to address challenges common to both pathology and clinical laboratory medicine.
Several recent reviews have examined applications of ML in either diagnostic pathology or routine clinical laboratory medicine, often with a focus on image analysis, automation, or biomarker discovery (6-9). In contrast, the present review is distinguished by its integrated scope and outcome-oriented perspective. We jointly examine ML applications across pathology and clinical laboratory medicine within the framework of the total testing process, emphasizing not only diagnostic performance but also quality assurance, error detection, and operational reliability. In addition, this review highlights emerging multimodal learning approaches that unify histopathologic, laboratory, and clinical data, as well as practical challenges related to interpretability, generalizability, and regulatory implementation. By bridging traditionally siloed diagnostic disciplines, this narrative review provides a cohesive and practice-relevant synthesis that complements existing literature and reflects the evolving role of ML in diagnostic medicine.
Objective
The objective of this narrative review is to synthesize current evidence on ML applications across pathology and clinical laboratory medicine within a unified framework. We examine ML applications in diagnostic image analysis, biomarker and outcome prediction, workflow automation, and laboratory error detection, with particular emphasis on pre-analytical, analytical, and post-analytical vulnerabilities. In addition, we discuss emerging multimodal learning approaches, implementation challenges, and regulatory considerations relevant to clinical deployment. By integrating perspectives from pathology and laboratory medicine, this review aims to highlight the potential of ML to enable proactive quality assurance and more reliable, efficient, and patient-centered diagnostic care. We present this article in accordance with the Narrative Review reporting checklist (available at https://jlpm.amegroups.com/article/view/10.21037/jlpm-2025-1-62/rc).
Methods
This narrative review was informed by a structured literature search conducted in PubMed, Scopus, Web of Science, and IEEE Xplore to identify relevant studies on ML applications in pathology and clinical laboratory medicine. The search covered publications from January 2010 through December 2025, reflecting the modern development and clinical translation of ML and DL methods.
Search terms included combinations of keywords such as “machine learning”, “deep learning”, “artificial intelligence”, “pathology”, “histopathology”, “clinical laboratory medicine”, “laboratory errors”, “delta check”, “patient-based real-time quality control”, and “reference interval estimation”.
Original research articles, multicenter studies, and high-quality reviews published in English were included if they described ML applications with relevance to diagnostic interpretation, laboratory workflow, or quality assurance across the total testing process. Conference abstracts without peer-reviewed full manuscripts and studies lacking clinical or operational relevance were excluded.
Articles were screened based on title and abstract for relevance, followed by full-text review. Studies were selected to represent both well-established and emerging applications across pathology image analysis and clinical laboratory operations, with emphasis on methodological diversity, clinical impact, and real-world implementation considerations (Table 1).
Table 1
| Item | Specification |
|---|---|
| Date of search | The structured literature search was completed in October 2025 |
| Databases and other sources searched | PubMed, Scopus, Web of Science, and IEEE Xplore were searched to identify relevant studies on ML applications in pathology and clinical laboratory medicine |
| Search terms used | Search terms included combinations of free-text keywords and controlled vocabulary where applicable, such as: “machine learning”, “deep learning”, “artificial intelligence”, “pathology”, “histopathology”, “clinical laboratory medicine”, “laboratory errors”, “delta check”, “patient-based real-time quality control”, and “reference interval estimation”. Boolean operators (AND/OR) were applied as appropriate |
| Timeframe | Publications from January 2010 through December 2025 were included to capture the modern development and clinical translation of ML and DL methods |
| Inclusion and exclusion criteria | Inclusion criteria: English-language original research articles, multicenter studies, and high-quality reviews describing ML applications relevant to diagnostic interpretation, laboratory workflow, or quality assurance across the total testing process. Exclusion criteria: conference abstracts without peer-reviewed full manuscripts and studies lacking clinical or operational relevance |
| Selection process | Articles were initially screened based on title and abstract for relevance, followed by full-text review where appropriate. Studies were selected to represent both established and emerging applications across pathology image analysis and clinical laboratory operations, with emphasis on methodological diversity, clinical impact, and real-world implementation considerations |
| Additional considerations | As a narrative review, no formal meta-analysis or quantitative quality scoring was performed. Study selection prioritized representative and clinically meaningful applications rather than exhaustive coverage |
DL, deep learning; ML, machine learning.
Applications of ML in pathology and clinical laboratory medicine
Applications of ML across pathology and clinical laboratory medicine share common objectives of improving diagnostic accuracy, efficiency, and quality, but differ in data structure, workflow, and points of clinical interaction. Pathology applications predominantly focus on image-based interpretation and spatial pattern recognition, whereas laboratory medicine applications rely on high-frequency, structured numerical data generated across the total testing process. Despite these differences, both disciplines operate within interconnected diagnostic workflows, and ML increasingly provides opportunities for integration, feedback, and coordinated decision support across traditionally siloed domains.
Image analysis and diagnosis
A major application of ML in pathology lies in image-based diagnosis. In tumor detection, grading, and metastasis identification across cancers, including breast, prostate, lung, and colorectal malignancies, ML models have achieved diagnostic performance that matches, and in controlled settings sometimes surpasses, that of human pathologists (10-18).
Large-scale studies have demonstrated the diagnostic capability of ML in histopathologic image analysis. In one investigation involving more than 44,000 WSIs from over 15,000 patients, ML models achieved area under the curve (AUC) values of 0.991 and 0.966 for prostate cancer and breast cancer metastasis detection, respectively, without requiring manual data curation (12). Another multicenter analysis reported 97.1% accuracy, 93.5% positive predictive value (PPV), and 98.0% negative predictive value (NPV) for metastatic lymph node identification (11). Collectively, these studies underscore the diagnostic precision and scalability of ML in well-annotated, image-based cancer diagnostics.
Despite these impressive performance metrics, most landmark pathology studies report results derived primarily from internal validation cohorts, and model performance frequently declines when applied to external datasets (12,19). This performance degradation is largely attributable to domain shift—often referred to as batch effects—introduced by variability in tissue processing, staining protocols, scanner manufacturers, image resolution, and image compression, all of which can substantially impair generalizability in real-world clinical settings (19,20). These challenges are particularly relevant for image-based DL models, where domain shift, annotation burden, and limited external validation remain key barriers to routine clinical deployment.
Consistent with this observation, several studies have demonstrated that DL models trained on WSIs from a single institution experience significant reductions in accuracy or AUC when deployed across centers using different staining or scanning workflows (9,12,21). To mitigate these effects, stain normalization techniques (e.g., color deconvolution and histogram matching), along with domain adaptation strategies such as adversarial training and multicenter model development, have become essential components of clinically robust pathology artificial intelligence (AI) pipelines (19). Without such approaches and robust external validation, the exceptionally high AUC values reported in internal testing may overestimate real-world diagnostic performance.
From a technical perspective, most early pathology applications relied on convolutional neural networks (CNNs) trained in a fully supervised manner, requiring extensive pixel-level or region-level annotations by expert pathologists (3,21,22). While effective, this approach is limited by the substantial annotation burden and associated interobserver variability.
More recent studies have increasingly adopted weakly supervised multiple instance learning (MIL) frameworks, which enable model training using slide-level diagnostic labels without precise spatial annotation. MIL-based approaches have proven particularly well-suited for WSI analysis, allowing scalable model development while substantially reducing annotation requirements (23-25). In parallel, newer architectures such as Vision Transformers (ViTs) have begun to complement or replace traditional CNNs by capturing long-range spatial relationships within histologic images (9,26). These advances represent a critical methodological shift that addresses practical barriers to clinical implementation, including annotation scalability, workflow integration, and model adaptability across diverse laboratory environments.
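The MIL setting described above can be made concrete with a minimal sketch. The tile embeddings, scorer weights, and max-pooling aggregation below are illustrative assumptions (production systems typically use learned attention pooling over CNN or ViT embeddings), not a reference to any specific published model.

```python
import numpy as np

rng = np.random.default_rng(0)

def instance_scores(tile_features, w, b):
    # Per-tile probability of tumor from a hypothetical tile-level scorer.
    return 1.0 / (1.0 + np.exp(-(tile_features @ w + b)))

def mil_slide_score(tile_features, w, b):
    # Weakly supervised MIL aggregation: the slide-level prediction is
    # driven by its most suspicious tile (max-pooling over instances),
    # so only a slide-level diagnostic label is needed for training.
    return float(np.max(instance_scores(tile_features, w, b)))

# Toy slide: 100 tiles, each represented by an 8-dimensional embedding.
tiles = rng.normal(size=(100, 8))
w, b = rng.normal(size=8), 0.0
score = mil_slide_score(tiles, w, b)
```

The key practical point survives even in this toy form: no pixel- or region-level annotation enters the pipeline, only one label per slide.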
Biomarker and prognostic prediction
Beyond image classification, ML, particularly DL, has transformed biomarker discovery and prognostic modeling in computational pathology. Recent studies demonstrate that DL can extract clinically meaningful features from routine hematoxylin and eosin (H&E) stained WSIs, enabling prediction of molecular biomarkers and patient outcomes, and thus supporting personalized therapeutic decisions (10,16,27,28). DL models have identified morphological patterns, such as tissue texture, spatial architecture, tumor microenvironment, and stromal organization, that correlate with genetic mutations, protein expression, and survival outcomes (29-32).
A notable application is the prediction of TP53 mutation status in breast cancer (33). In addition, DL-based models have demonstrated the ability to infer molecular subtypes, microsatellite instability (MSI), tumor mutational burden, and protein biomarker expression [e.g., human epidermal growth factor receptor 2 (HER2), estrogen receptor status] directly from histology (34,35).
Regarding prognostic modeling, DL and multimodal models integrating histologic and genomic data have shown promise in forecasting overall survival, disease recurrence, and treatment response (36-38). The integration of multi-omic data enhances prognostic accuracy and provides insights into biologically relevant histologic correlates, such as tumor-infiltrating lymphocytes, vascular proliferation, and stromal remodeling (26,39).
In breast cancer, DL models have also been developed to predict pathological complete response (pCR) to neoadjuvant chemotherapy directly from H&E slides, outperforming or complementing existing biomarkers (40,41). These studies highlight the growing utility of DL as a tool for precision oncology, bridging morphology with molecular and clinical outcomes.
Multimodal ML: integrating histopathology, laboratory, and clinical data
While much of the current literature treats histopathologic image analysis and laboratory data modeling as separate domains, an emerging frontier in AI is multimodal ML, which integrates complementary data types to improve diagnostic and prognostic performance. In oncology and precision medicine, models that jointly analyze WSIs with molecular, laboratory, and clinical variables have demonstrated superior predictive accuracy compared with unimodal approaches (9,26,42-44).
Multimodal frameworks leverage the strengths of each modality: WSIs capture spatial architecture, tumor microenvironment, and morphologic heterogeneity, while laboratory biomarkers and genomic data provide quantitative and molecular context reflecting tumor burden, biology, and systemic response. For example, DL models that combine histopathologic features with genomic alterations or circulating tumor markers have shown improved performance in predicting survival, recurrence risk, and therapeutic response, outperforming image-only or molecular-only models (9,26,39,43).
Importantly, this paradigm naturally bridges pathology and clinical laboratory medicine. Laboratory data, including tumor markers, inflammatory indices, metabolic profiles, and longitudinal chemistry trends, represent structured, high-frequency measurements that complement static tissue morphology. Integrating these data streams enables context-aware modeling of disease progression, capturing both spatial and temporal dimensions of pathology. As multimodal learning frameworks mature, they offer a unifying approach that aligns image-based diagnostics with laboratory medicine, supporting more accurate risk stratification, personalized treatment planning, and outcome-oriented evaluation of diagnostic quality.
Workflow automation and efficiency
In clinical laboratory medicine, workflow automation has benefited substantially from ML-driven decision support systems. For example, ML-based automated verification approaches have been developed to evaluate analytical quality and flag unacceptable results in high-throughput testing environments, including mass spectrometry workflows (45-47). Such systems have detected analytically invalid results with high sensitivity while reducing manual review burden, thereby supporting scalable laboratory operations without compromising analytical quality. These approaches parallel slide triage and case prioritization in pathology, highlighting a shared role for ML in optimizing diagnostic workflows across disciplines.
Another important contribution of ML in pathology is workflow automation. ML algorithms enable automated slide triage, tissue segmentation, and quantitative analysis, thereby reducing manual workload and improving consistency. Automated systems accelerate common diagnostic tasks, such as tumor detection, grading, and subtyping, allowing pathologists to focus on complex or ambiguous cases and minimizing interobserver variability (48-50).
Moreover, ML-driven quantification enhances reproducibility and standardization in pathology reporting by providing objective, quantitative metrics such as tumor area, cellular density, and architectural patterns (48,51). Collectively, these automated processes are key to improving diagnostic throughput while maintaining high-quality, reproducible results across laboratories.
Beyond parallel applications, ML also enables cross-domain workflow integration. For instance, abnormal laboratory results from sputum cultures could trigger ML-driven alerts prompting targeted histopathologic review or additional acid-fast staining in pathology, reflecting established diagnostic workflows that may be increasingly coordinated through AI-enabled decision-support systems (7,52). Conversely, confirmed pathology findings may feed back into laboratory ML systems as validated outcomes, enabling continuous refinement of alert thresholds and feature representations. Such closed-loop feedback mechanisms illustrate how ML systems can evolve through outcome validation, allowing both pathology and laboratory workflows to benefit from shared diagnostic signals while maintaining appropriate clinical oversight.
ML for detection of laboratory errors
ML-based approaches to laboratory error detection represent one of the most mature and operationally relevant applications of AI in clinical laboratory medicine (7,53-55). Unlike diagnostic prediction tasks that focus on disease classification, these methods are designed to monitor the integrity of the total testing process by identifying anomalous patterns indicative of pre-analytical, analytical, or post-analytical errors (56,57). Recent studies collectively demonstrate that ML offers advantages over traditional rule-based systems, but they also raise important questions regarding generalizability, interpretability, and implementation.
Laboratory testing results support the vast majority of medical decisions, guiding diagnosis, prognosis, and treatment. Consequently, laboratory errors, defined as any defect occurring at any point in the total testing process that compromises result integrity, can have profound clinical and operational consequences. These errors span inappropriate test ordering and stewardship, specimen collection and transportation problems, analytical measurement inaccuracies, and post-analytical failures during auto-verification, reporting, or abnormal result flagging.
Among these, preanalytical errors are the most frequent and diverse. They include incorrect test requests, use of inappropriate tube types, specimen misidentification or mislabeling, contamination from intravenous fluids, hemolysis due to poor venipuncture technique, sample clotting or degradation during delayed transport, and inadequate fill volumes (58). Each error mechanism can produce distinct and often predictable measurement patterns across multiple analytes once the specimen is analyzed—patterns that are well-suited for detection using ML approaches.
Recent studies have demonstrated the growing capability of ML in detecting laboratory errors across the total testing process. Support vector machines (SVMs) have been successfully applied to chemistry panels to identify result patterns consistent with preanalytical problems. Gradient boosting models (XGBoost) have been used to detect sample misidentification events in tumor marker testing, achieving strong performance and generalizability to external patient populations (56,59). Similarly, deep belief networks have been implemented to identify specimen-related errors by analyzing changes from patients’ prior hematology results, while XGBoost-based systems have been developed to recognize simulated wrong tube errors in both chemistry and hematology testing (57,60-62).
Further advances have been made through the use of neural network-based systems trained on simulated laboratory errors and verified through type-and-screen prediction. These models have demonstrated higher sensitivity and specificity than human reviewers and show improved robustness when designed to accommodate missing input data (62-64). Collectively, these ML-based approaches represent a valuable augmentation to operational safeguards such as positive patient identification, offering scalable, data-driven solutions for early detection of laboratory errors and improved diagnostic reliability.
Delta-check protocols have long served as a patient-centric QC measure in clinical laboratories, flagging instances where consecutive test results from the same patient deviate beyond a predetermined threshold, thereby prompting investigation for possible specimen misidentification, contamination, or preanalytical issues (65). However, studies have repeatedly shown that conventional univariate delta-check approaches yield low hit rates for true errors and generate many false alarms that burden laboratory workflow; in some reports, only ~0.3% of delta-check alerts were related to specimen misidentification (65). Against this backdrop, applying ML to delta-check algorithms offers a way to overcome the intrinsic limitations of the traditional approach. Unlike traditional delta checks, which typically evaluate only the magnitude of change between two consecutive results for a single analyte, ML-based delta-check models incorporate a broader set of contextual features. These may include the absolute current and prior values, the time interval between measurements, concurrent results from physiologically related analytes, and patient-level metadata such as care setting (e.g., inpatient versus outpatient or intensive care unit status) (57,59,66-69). By integrating these features, ML models can learn clinically plausible change patterns, such as rapid biomarker shifts in critically ill patients, while recognizing implausible combinations indicative of specimen misidentification or preanalytical error (57,59,68,69). This contextual awareness substantially improves specificity, reducing false-positive alerts and unnecessary manual review compared with conventional rule-based delta checks.
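The contrast between a conventional univariate delta check and the richer inputs an ML model consumes can be sketched as follows. The 50% limit, feature names, and analytes are illustrative assumptions, not values drawn from the cited studies.

```python
def univariate_delta_flag(current, prior, limit_pct=50.0):
    # Conventional delta check: flag when the percent change between two
    # consecutive results for one analyte exceeds a fixed threshold.
    if prior == 0:
        return False
    return abs(current - prior) / abs(prior) * 100.0 > limit_pct

def delta_check_features(current, prior, hours_between, related_results, inpatient):
    # Candidate inputs for an ML delta-check model: absolute values, time
    # interval, physiologically related analytes, and care setting, rather
    # than a single percent-change magnitude.
    pct = abs(current - prior) / abs(prior) * 100.0 if prior else 0.0
    features = {
        "current": current,
        "prior": prior,
        "abs_pct_change": pct,
        "hours_between": hours_between,
        "inpatient": int(inpatient),
    }
    features.update({f"related_{name}": v for name, v in related_results.items()})
    return features

# A potassium jump from 4.0 to 6.4 mmol/L trips the fixed 50% rule, whether
# or not the change is clinically plausible for this patient's context.
flagged = univariate_delta_flag(6.4, 4.0)
```

A trained model receiving the full feature dictionary can learn, for example, that such a shift is plausible in an intensive care patient but suspicious in an outpatient screening sample.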
Notably, a recent study in tumor-marker testing developed a random forest (RF) and deep neural network (DNN) delta-check model using more than 246,000 results and simulated misidentification events; the DNN achieved balanced accuracies of about 0.79–0.84 across five tumor markers, outperforming traditional percent-change and reference change value (RCV) methods (59). By learning non-linear relationships between current and prior results, as well as incorporating auxiliary features, ML-based systems demonstrate superior sensitivity and specificity compared with static threshold rules, effectively reducing unnecessary alerts while improving true error detection.
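As context for the RCV baseline mentioned above: the conventional reference change value combines analytical and within-subject biological variation, and a result pair is flagged only when the observed percent difference exceeds it. The CV values below are purely illustrative, not assay-specific claims.

```python
import math

def reference_change_value(cv_analytical, cv_within_subject, z=1.96):
    # RCV (%) = sqrt(2) * z * sqrt(CV_A^2 + CV_I^2): the smallest percent
    # difference between serial results that exceeds combined analytical
    # and within-subject biological variation at confidence level z.
    return math.sqrt(2.0) * z * math.sqrt(cv_analytical**2 + cv_within_subject**2)

def rcv_delta_flag(current, prior, cv_analytical, cv_within_subject):
    # Flag a result pair whose percent change exceeds the RCV.
    pct_change = abs(current - prior) / abs(prior) * 100.0
    return pct_change > reference_change_value(cv_analytical, cv_within_subject)

# Illustrative CVs: CV_A = 3%, CV_I = 10% give an RCV of roughly 28.9%.
rcv = reference_change_value(3.0, 10.0)
```

The static nature of this limit is precisely what ML-based delta checks relax: the same percent change is judged differently depending on patient context and concurrent analytes.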
A complementary study in Clinical Chemistry (56) further substantiated the utility of ML for detecting sample misidentification in tumor-marker testing. Using nearly 400,000 test results for training and an additional 215,000 for external validation across multiple centers, the authors simulated misidentification events to benchmark ML performance against traditional delta-check methods. Their DNN and XGBoost models achieved AUC values of 0.83–0.90, with balanced accuracies up to 0.84, significantly outperforming conventional rule-based approaches (AUC 0.70–0.82). The study highlights that ML-enabled auto-verification systems can improve detection of mislabeled or mismatched samples without substantially increasing manual review burden. Importantly, the approach generalizes across laboratories and provides a scalable framework for integrating ML-driven quality assurance into routine testing workflows.
Beyond misidentification detection, ML-driven delta-check frameworks align with the broader trend of patient-based real-time QC in clinical laboratories. A recent narrative review outlined how supervised and unsupervised ML models have been successfully applied to detect analytical shifts, contamination, delayed processing, and other error types by leveraging large streams of laboratory data (57). In this context, ML-enhanced delta checks can dynamically adjust to laboratory-specific data distributions, account for patient-class heterogeneity (inpatients, outpatients, screening populations), and accommodate method changes or instrument drift more effectively than rigid threshold systems. In particular, in the tumor marker study, the DNN model maintained stable performance across patient classes and showed less performance decline between health screening and inpatient groups than conventional methods (59). The promise for clinical laboratories is thus two-fold: improved error-detection sensitivity (and specificity) and decreased manual review workload, which may significantly contribute to the downstream goal of reducing laboratory-related errors and enhancing diagnostic quality.
Analytical errors, though relatively uncommon compared to preanalytical or postanalytical events, can be among the most insidious and difficult to identify. Such errors may arise from instrument malfunctions, reagent instability, calibration drift, or degradation of analytical precision, all of which may escape conventional QC checks. Because they often emerge gradually and inconsistently across patient samples, their detection requires methods capable of discerning small but systematic deviations from expected analytical behavior.
Traditional patient-based real-time QC (PBRTQC) systems have long been used to identify analytical shifts or bias through statistical monitoring of patient data trends. However, these approaches typically rely on univariate thresholds or fixed statistical rules (e.g., Westgard rules) and may struggle to differentiate normal biological variation from true analytical abnormalities. ML offers a powerful enhancement by learning complex, multidimensional relationships within patient result data and adapting dynamically to laboratory-specific patterns. Rather than evaluating one analyte at a time, ML-based PBRTQC systems assess multivariate patterns across analytes and time. For example, an ML model may detect an emerging analytical issue by identifying a subtle but coordinated downward drift in serum sodium results while chloride and anion gap distributions remain stable, a pattern suggestive of an instrument-specific fluidics or dilution issue rather than a true shift in the patient population (57,68,70). Such multidimensional pattern recognition allows ML systems to distinguish analytical artifacts from biological variation, enabling earlier detection of instrument drift that may not trigger conventional rule-based alerts.
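A minimal patient-based real-time QC sketch illustrates the underlying idea: an exponentially weighted moving average (EWMA) of patient sodium results, with control limits derived from an in-control period, surfaces a small systematic drift that no individual result would reveal. The shift size, smoothing weight, and limit choice are illustrative assumptions.

```python
import numpy as np

def ewma(values, lam=0.05):
    # Exponentially weighted moving average of a stream of patient results,
    # the core tracking statistic of many patient-based real-time QC schemes.
    out, m = [], float(values[0])
    for v in values:
        m = lam * float(v) + (1.0 - lam) * m
        out.append(m)
    return np.array(out)

rng = np.random.default_rng(1)
baseline = rng.normal(140.0, 3.0, size=500)   # in-control sodium, mmol/L
drifted = rng.normal(137.0, 3.0, size=500)    # systematic -3 mmol/L shift
stream = np.concatenate([baseline, drifted])

track = ewma(stream)
# Control limits estimated from the in-control segment (illustrative choice;
# real deployments tune limits against simulated error magnitudes).
mu, sd = track[:500].mean(), track[:500].std()
alarm = int(np.argmax(track < mu - 3.0 * sd))  # first breach (0 if none)
```

An ML-based system generalizes this single-analyte tracker to joint, multivariate patterns, which is what allows it to separate instrument drift from shifts in the patient mix.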
Recent studies have demonstrated the promise of ML algorithms such as isolation forests, RFs, and classification and regression trees (CART) for detecting analytical drift and imprecision in real-time testing (66,71-73). When applied to common analytes, including complete blood count parameters as well as liver enzymes, these models identified performance changes that traditional QC systems often missed. Similarly, the application of SVMs and neural networks to high-volume assays such as total prostate-specific antigen (PSA) testing has demonstrated improved sensitivity for identifying small analytical biases or shifts compared with conventional QC or delta-based methods (71,73).
By capturing both result-based and delta-based trends across large datasets, ML-driven patient-based real-time QC frameworks can detect minute changes in analytical performance before they become clinically significant. Moreover, these models can continuously adapt to instrument updates, reagent lot changes, and population-specific variations, providing a level of contextual awareness that static control charts cannot achieve. Integrating ML into analytical QC systems therefore represents a critical step toward proactive and predictive quality management, enabling laboratories to transition from reactive error detection to real-time process surveillance and early intervention.
Post-analytical errors arise after laboratory testing is complete and typically involve the interpretation, communication, or clinical application of results. These errors can occur when test outcomes are misinterpreted by clinicians, applied outside their intended clinical context, or compared against inappropriate RIs. The accuracy of RIs is therefore fundamental to the correct interpretation of laboratory data and directly influences clinical decision-making.
Traditionally, establishing RIs involves recruiting at least 120 healthy individuals, measuring the analyte of interest, and defining the central 95% of values as the RI (74). Although this direct approach is considered the gold standard, it carries significant practical and ethical limitations. The definition of a “healthy” population is often ambiguous, especially in hospital settings where truly disease-free individuals are rare. Moreover, this process is resource-intensive, requiring substantial time, labor, and cost, and may not be feasible for vulnerable or underrepresented populations such as children, pregnant patients, or individuals with limited access to healthcare. The relatively small sample sizes used in traditional studies also limit statistical robustness, particularly for analytes with high biological variability or for subgroups stratified by age, sex, or ethnicity.
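The direct approach described above reduces, in its simplest nonparametric form, to taking the central 95% of results from a verified healthy cohort. The simulated hemoglobin-like values below are an illustrative assumption.

```python
import numpy as np

def direct_reference_interval(healthy_values):
    # Nonparametric direct RI: the central 95% (2.5th to 97.5th percentile)
    # of results from a healthy reference population of at least 120 subjects.
    v = np.asarray(healthy_values, dtype=float)
    if v.size < 120:
        raise ValueError("direct RI estimation expects >= 120 reference subjects")
    return float(np.percentile(v, 2.5)), float(np.percentile(v, 97.5))

rng = np.random.default_rng(42)
# Simulated healthy cohort; a real study must verify health status and
# partition by age and sex where indicated.
cohort = rng.normal(loc=14.0, scale=1.2, size=120)
lower, upper = direct_reference_interval(cohort)
```

The fragility of the method is visible even here: with only 120 subjects, the extreme percentiles are estimated from a handful of observations, which is one motivation for the indirect, data-driven approaches discussed next.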
To address these challenges, laboratories have increasingly explored computational and ML-based approaches that leverage the vast amounts of real-world data generated through routine clinical testing. Instead of manually recruiting healthy volunteers, these methods infer RIs indirectly from large patient datasets by identifying and modeling the latent “healthy” component within mixed clinical populations. Such indirect methods can utilize routine laboratory data to estimate RIs that are both population-specific and dynamically adaptable to evolving demographics or analytical systems.
Earlier statistical approaches, including the Hoffmann, Bhattacharya, and truncated minimum chi-square methods (75-77), laid the foundation for indirect RI estimation by transforming or deconvoluting mixed distributions. More recently, advanced algorithms such as refineR, kosmic, and LIMIT (78-80) have integrated ML concepts to enhance precision, automate data cleaning, and optimize model selection. These systems can efficiently process millions of test results, account for covariates such as age or sex, and identify outliers or pathological subgroups without manual exclusion. Beyond improving accuracy, ML-enabled RI estimation offers scalability and adaptability, supporting continuous recalibration as laboratory instruments, assay methods, or patient populations evolve.
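To make the mixed-distribution idea concrete, a deliberately simplified illustration (not the refineR or kosmic algorithms themselves, and with simulated population parameters) fits a two-component Gaussian mixture by expectation-maximization and derives an RI from the dominant, presumptively “healthy” component:

```python
import numpy as np

def two_component_em(x, n_iter=200):
    """Fit a two-component 1-D Gaussian mixture by expectation-
    maximization; returns (weights, means, standard deviations).
    """
    x = np.asarray(x, dtype=float)
    xs = np.sort(x)
    # Deterministic initialization from the lower and upper halves
    mu = np.array([xs[: x.size // 2].mean(), xs[x.size // 2:].mean()])
    sd = np.array([x.std(), x.std()])
    w = np.array([0.5, 0.5])
    for _ in range(n_iter):
        # E-step: posterior probability that each result belongs
        # to each component
        dens = (w / (sd * np.sqrt(2 * np.pi))
                * np.exp(-0.5 * ((x[:, None] - mu) / sd) ** 2))
        resp = dens / dens.sum(axis=1, keepdims=True)
        # M-step: re-estimate mixture parameters from responsibilities
        nk = resp.sum(axis=0)
        w = nk / x.size
        mu = (resp * x[:, None]).sum(axis=0) / nk
        sd = np.sqrt((resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk)
    return w, mu, sd

# Mixed clinical population: 85% latent "healthy", 15% pathological
rng = np.random.default_rng(1)
mixed = np.concatenate([rng.normal(100, 10, 1700),
                        rng.normal(160, 20, 300)])
w, mu, sd = two_component_em(mixed)
healthy = int(np.argmax(w))                 # dominant component
ri = (mu[healthy] - 1.96 * sd[healthy],     # parametric central 95%
      mu[healthy] + 1.96 * sd[healthy])
```

Production tools replace this sketch with more robust formulations, including truncation, transformation to normality, and automated model selection.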
By harnessing real-world data at scale, ML-based indirect methods not only reduce the logistical and ethical burden of traditional RI studies but also enable personalized and context-aware RI estimation. This evolution marks a key step toward a more responsive post-analytical framework, where laboratory interpretation becomes dynamically aligned with patient populations and real-time clinical data.
Challenges and limitations
Despite these advances, several challenges remain. Most ML-based error detection systems have been developed and validated within single institutions, raising concerns about generalizability across laboratories with different instruments, workflows, and patient populations. In addition, many models function as “black boxes”, limiting transparency and complicating root-cause analysis when errors are flagged. Finally, defining appropriate ground truth for laboratory errors, particularly pre-analytical events, remains inherently difficult and may bias model performance estimates. These limitations underscore the need for multicenter validation, improved interpretability, and standardized evaluation frameworks.
Data availability and annotation remain among the most fundamental barriers. ML algorithms require large, well-curated datasets to achieve reliable performance. Generating such datasets in pathology, particularly for WSI analysis, demands extensive manual annotation by expert pathologists, a process that is both time-consuming and costly (25,81). The complexity of histopathologic interpretation, variability in diagnostic criteria, and inter-observer differences further complicate data labeling. Moreover, the storage and processing requirements for high-resolution WSIs are substantial, often necessitating specialized computational infrastructure, data servers, and advanced image management systems. These resource demands can limit accessibility, especially for smaller institutions or research groups without dedicated bioinformatics support.
Another major challenge is model generalizability. ML models often perform well when tested on data from the same institution or under similar imaging conditions, but may degrade in accuracy when applied to external datasets. Variability in staining protocols, imaging equipment, scanning resolutions, and patient demographics can all contribute to performance drift across sites (10,25). This lack of robustness highlights the importance of multicenter datasets, cross-platform validation, and standardized data formats to improve model transferability.
Workflow integration and user acceptance represent additional hurdles. Incorporating ML tools seamlessly into existing laboratory information systems (LIS) and digital pathology platforms remains technically complex. These systems must handle data flow, version control, and real-time feedback without disrupting established diagnostic workflows (30,34). Furthermore, clinical adoption depends heavily on user trust and interpretability. Many pathologists express skepticism toward algorithmic recommendations, citing concerns about reliability, potential job displacement, loss of interpretive autonomy, and increased documentation burden (40,81). To address these issues, user-centered design, explainable AI (XAI) frameworks, and transparent validation studies are essential for fostering confidence and ensuring that ML systems augment rather than replace human expertise.
Finally, regulatory and ethical considerations present emerging challenges for the clinical deployment of ML in pathology and laboratory medicine. Current regulatory frameworks were not originally designed for adaptive learning algorithms that continuously evolve with data input. In the United States, many AI-enabled diagnostic tools are regulated as Software as a Medical Device (SaMD) under the oversight of the U.S. Food and Drug Administration, requiring premarket evaluation of analytical validity and clinical performance, as well as lifecycle-based risk management. Unlike traditional static software or laboratory assays, AI-based SaMD may continue to evolve after deployment, creating regulatory challenges related to model updates, performance monitoring, and version control (82-86).
To address these challenges, regulatory concepts such as Predetermined Change Control Plans (PCCPs) have been introduced to permit predefined algorithm updates under regulatory oversight. By delineating allowable model modifications and associated validation strategies in advance, PCCPs aim to balance innovation with patient safety across the AI lifecycle. However, effective governance of AI systems also requires parallel attention to ethical considerations, including algorithmic bias, data privacy, informed consent, and auditability. Consequently, the development of consistent standards for ML validation, regulatory approval, and post-market performance monitoring specific to the operational realities of pathology and clinical laboratory medicine will be essential for translating these tools into routine clinical use (84,87,88).
Conclusions
ML has rapidly emerged as a transformative force in pathology and laboratory medicine. It enables automated image interpretation, enhances diagnostic precision, supports prognostic modeling, and improves operational efficiency through quality assurance and error detection. From preanalytical error recognition to analytical performance monitoring and post-analytical interpretation, ML offers new capabilities that extend well beyond traditional laboratory methods.
However, the full clinical potential of ML remains constrained by challenges in data quality, interpretability, generalizability, workflow integration, and regulatory oversight. Overcoming these barriers will require collaborative efforts among laboratories, data scientists, clinicians, and regulatory agencies to establish robust data standards, open-access repositories, and validation frameworks.
Despite these obstacles, progress continues at an accelerating pace. Advances in federated learning, self-supervised training, and model explainability are reducing dependence on large annotated datasets and improving model transparency. Similarly, the integration of ML into digital pathology and LIS is becoming increasingly feasible through standardized interfaces and automation pipelines. As these innovations mature, ML is poised to evolve from experimental use into a trusted, interpretable, and integral component of diagnostic practice, enhancing both the efficiency and accuracy of modern laboratory medicine.
This narrative review has several limitations. First, although a structured literature search strategy was employed, the review is not a systematic meta-analysis and may be subject to selection bias or omission of relevant studies. Second, the rapidly evolving nature of ML research means that newly published methodologies and applications may not be fully captured. Third, heterogeneity in study design, datasets, performance metrics, and validation strategies across the included literature limits direct comparison of reported results. Finally, many cited studies are based on retrospective or single-institution data, which may overestimate real-world performance and generalizability; differences in laboratory workflows, instrumentation, and digital pathology platforms may further limit the transferability of reported ML performance across institutions. These limitations should be considered when interpreting the findings and conclusions of this review.
Acknowledgments
None.
Footnote
Reporting Checklist: The authors have completed the Narrative Review reporting checklist. Available at https://jlpm.amegroups.com/article/view/10.21037/jlpm-2025-1-62/rc
Peer Review File: Available at https://jlpm.amegroups.com/article/view/10.21037/jlpm-2025-1-62/prf
Funding: None.
Conflicts of Interest: Both authors have completed the ICMJE uniform disclosure form (available at https://jlpm.amegroups.com/article/view/10.21037/jlpm-2025-1-62/coif). J.L. serves as an unpaid editorial board member of Journal of Laboratory and Precision Medicine from March 2025 to February 2027. The other author has no conflicts of interest to declare.
Ethical Statement: The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.
Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0/.
References
- Rajagopal A, Ayanian S, Ryu AJ, et al. Machine Learning Operations in Health Care: A Scoping Review. Mayo Clin Proc Digit Health 2024;2:421-37. [Crossref] [PubMed]
- Andersen ES, Birk-Korch JB, Hansen RS, et al. Monitoring performance of clinical artificial intelligence in health care: a scoping review. JBI Evid Synth 2024;22:2423-46. [Crossref] [PubMed]
- Litjens G, Kooi T, Bejnordi BE, et al. A survey on deep learning in medical image analysis. Med Image Anal 2017;42:60-88. [Crossref] [PubMed]
- Esteva A, Robicquet A, Ramsundar B, et al. A guide to deep learning in healthcare. Nat Med 2019;25:24-9. [Crossref] [PubMed]
- Coudray N, Ocampo PS, Sakellaropoulos T, et al. Classification and mutation prediction from non-small cell lung cancer histopathology images using deep learning. Nat Med 2018;24:1559-67. [Crossref] [PubMed]
- Pantanowitz J, Manko CD, Pantanowitz L, et al. Synthetic Data and Its Utility in Pathology and Laboratory Medicine. Lab Invest 2024;104:102095. [Crossref] [PubMed]
- Rabbani N, Kim GYE, Suarez CJ, et al. Applications of machine learning in routine laboratory medicine: Current state and future directions. Clin Biochem 2022;103:1-7. [Crossref] [PubMed]
- Jariyapan P, Pora W, Kasamsumran N, et al. Digital pathology and artificial intelligence in diagnostic pathology. Malays J Pathol 2025;47:3-12.
- Echle A, Rindtorff NT, Brinker TJ, et al. Deep learning in cancer pathology: a new generation of clinical biomarkers. Br J Cancer 2021;124:686-96. [Crossref] [PubMed]
- Acs B, Rantalainen M, Hartman J. Artificial intelligence as the next step towards precision pathology. J Intern Med 2020;288:62-81. [Crossref] [PubMed]
- Hu Y, Su F, Dong K, et al. Deep learning system for lymph node quantification and metastatic cancer identification from whole-slide pathology images. Gastric Cancer 2021;24:868-77. [Crossref] [PubMed]
- Campanella G, Hanna MG, Geneslaw L, et al. Clinical-grade computational pathology using weakly supervised deep learning on whole slide images. Nat Med 2019;25:1301-9. [Crossref] [PubMed]
- Bulten W, Pinckaers H, van Boven H, et al. Automated deep-learning system for Gleason grading of prostate cancer using biopsies: a diagnostic study. Lancet Oncol 2020;21:233-41. [Crossref] [PubMed]
- van der Laak J, Litjens G, Ciompi F. Deep learning in histopathology: the path to the clinic. Nat Med 2021;27:775-84. [Crossref] [PubMed]
- Wang J, Wang T, Han R, et al. Artificial intelligence in cancer pathology: Applications, challenges, and future directions. Cytojournal 2025;22:45. [Crossref] [PubMed]
- Rathore S, Niazi T, Iftikhar MA, et al. Glioma Grading via Analysis of Digital Pathology Images Using Machine Learning. Cancers (Basel) 2020;12:578. [Crossref] [PubMed]
- Yu G, Sun K, Xu C, et al. Accurate recognition of colorectal cancer with semi-supervised deep learning on pathological images. Nat Commun 2021;12:6311. [Crossref] [PubMed]
- Benning L, Peintner A, Peintner L. Advances in and the Applicability of Machine Learning-Based Screening and Early Detection Approaches for Cancer: A Primer. Cancers (Basel) 2022;14:623. [Crossref] [PubMed]
- Tellez D, Litjens G, Bándi P, et al. Quantifying the effects of data augmentation and stain color normalization in convolutional neural networks for computational pathology. Med Image Anal 2019;58:101544. [Crossref] [PubMed]
- Stacke K, Eilertsen G, Unger J, et al. Measuring Domain Shift for Deep Learning in Histopathology. IEEE J Biomed Health Inform 2021;25:325-36. [Crossref] [PubMed]
- Ehteshami Bejnordi B, Veta M, Johannes van Diest P, et al. Diagnostic Assessment of Deep Learning Algorithms for Detection of Lymph Node Metastases in Women With Breast Cancer. JAMA 2017;318:2199-210. [Crossref] [PubMed]
- Sirinukunwattana K, Ahmed Raza SE. Locality Sensitive Deep Learning for Detection and Classification of Nuclei in Routine Colon Cancer Histology Images. IEEE Trans Med Imaging 2016;35:1196-206. [Crossref] [PubMed]
- Lu MY, Williamson DFK, Chen TY, et al. Data-efficient and weakly supervised computational pathology on whole-slide images. Nat Biomed Eng 2021;5:555-70. [Crossref] [PubMed]
- Moranguinho J, Pereira T, Ramos B, et al. Attention Based Deep Multiple Instance Learning Approach for Lung Cancer Prediction using Histopathological Images. Annu Int Conf IEEE Eng Med Biol Soc 2021;2021:2852-5. [Crossref] [PubMed]
- Komura D, Ochi M, Ishikawa S. Machine learning methods for histopathological image analysis: Updates in 2024. Comput Struct Biotechnol J 2025;27:383-400. [Crossref] [PubMed]
- Chen RJ, Lu MY, Williamson DFK, et al. Pan-cancer integrative histology-genomic analysis via multimodal deep learning. Cancer Cell 2022;40:865-878.e6. [Crossref] [PubMed]
- Jiang Y, Yang M, Wang S, et al. Emerging role of deep learning-based artificial intelligence in tumor pathology. Cancer Commun (Lond) 2020;40:154-66. [Crossref] [PubMed]
- Blilie A, Mulliqi N, Ji X, et al. Artificial intelligence-assisted prostate cancer diagnosis for reduced use of immunohistochemistry. Commun Med (Lond) 2025;5:425. [Crossref] [PubMed]
- Loeffler CML, Gaisa NT, Muti HS, et al. Predicting Mutational Status of Driver and Suppressor Genes Directly from Histopathology With Deep Learning: A Systematic Study Across 23 Solid Tumor Types. Front Genet 2021;12:806386. [Crossref] [PubMed]
- Dernbach G, Kazdal D, Ruff L, et al. Dissecting AI-based mutation prediction in lung adenocarcinoma: A comprehensive real-world study. Eur J Cancer 2024;211:114292. [Crossref] [PubMed]
- Ren W, Zhu Y, Wang Q, et al. Deep Learning-Based Classification and Targeted Gene Alteration Prediction from Pleural Effusion Cell Block Whole-Slide Images. Cancers (Basel) 2023;15:752. [Crossref] [PubMed]
- Lazard T, Bataillon G, Naylor P, et al. Deep learning identifies morphological patterns of homologous recombination deficiency in luminal breast cancers from whole slide images. Cell Rep Med 2022;3:100872. [Crossref] [PubMed]
- Frascarelli C, Venetis K, Marra A, et al. Deep learning algorithm on H&E whole slide images to characterize TP53 alterations frequency and spatial distribution in breast cancer. Comput Struct Biotechnol J 2024;23:4252-9. [Crossref] [PubMed]
- Ziegler J, Hechtman JF, Rana S, et al. A deep multiple instance learning framework improves microsatellite instability detection from tumor next generation sequencing. Nat Commun 2025;16:136. [Crossref] [PubMed]
- Couture HD. Deep Learning-Based Prediction of Molecular Tumor Biomarkers from H&E: A Practical Review. J Pers Med 2022;12:2022. [Crossref] [PubMed]
- Baidoo TG, Rodrigo H. Data-driven survival modeling for breast cancer prognostics: A comparative study with machine learning and traditional survival modeling methods. PLoS One 2025;20:e0318167. [Crossref] [PubMed]
- Poirion OB, Jing Z, Chaudhary K, et al. DeepProg: an ensemble of deep-learning and machine-learning models for prognosis prediction using multi-omics data. Genome Med 2021;13:112. [Crossref] [PubMed]
- Yin Q, Chen W, Zhang C, et al. A convolutional neural network model for survival prediction based on prognosis-related cascaded Wx feature selection. Lab Invest 2022;102:1064-74. [Crossref] [PubMed]
- Mobadersany P, Yousefi S, Amgad M, et al. Predicting cancer outcomes from histology and genomics using convolutional networks. Proc Natl Acad Sci U S A 2018;115:E2970-9. [Crossref] [PubMed]
- Jiang M, Li CL, Luo XM, et al. Ultrasound-based deep learning radiomics in the assessment of pathological complete response to neoadjuvant chemotherapy in locally advanced breast cancer. Eur J Cancer 2021;147:95-105. [Crossref] [PubMed]
- Jing B, Wang K, Schmitz E, et al. Prediction of pathological complete response to chemotherapy for breast cancer using deep neural network with uncertainty quantification. Med Phys 2024;51:9385-93. [Crossref] [PubMed]
- Kather JN, Pearson AT, Halama N, et al. Deep learning can predict microsatellite instability directly from histology in gastrointestinal cancer. Nat Med 2019;25:1054-6. [Crossref] [PubMed]
- Hou J, Zhang R, Xie Y, et al. Multimodal deep learning for cancer prognosis prediction with clinical information prompts integration. NPJ Digit Med 2025;9:76. [Crossref] [PubMed]
- Vale-Silva LA, Rohr K. Long-term cancer survival prediction using multimodal deep learning. Sci Rep 2021;11:13505. [Crossref] [PubMed]
- Yu M, Bazydlo LAL, Bruns DE, et al. Streamlining Quality Review of Mass Spectrometry Data in the Clinical Laboratory by Use of Machine Learning. Arch Pathol Lab Med 2019;143:990-8. [Crossref] [PubMed]
- Lee ES, Durant TJS. Supervised machine learning in the mass spectrometry laboratory: A tutorial. J Mass Spectrom Adv Clin Lab 2022;23:1-6. [Crossref] [PubMed]
- Yang HS, Rhoads DD, Sepulveda J, et al. Building the Model. Arch Pathol Lab Med 2023;147:826-36. [Crossref] [PubMed]
- Serag A, Ion-Margineanu A, Qureshi H, et al. Translational AI and Deep Learning in Diagnostic Pathology. Front Med (Lausanne) 2019;6:185. [Crossref] [PubMed]
- Faa G, Fraschini M, Barberini L. Reproducibility and explainability in digital pathology: The need to make black-box artificial intelligence systems more transparent. J Public Health Res 2024;13:22799036241284898. [Crossref] [PubMed]
- Cooper M, Ji Z, Krishnan RG. Machine learning in computational histopathology: Challenges and opportunities. Genes Chromosomes Cancer 2023;62:540-56. [Crossref] [PubMed]
- Cui M, Zhang DY. Artificial intelligence and computational pathology. Lab Invest 2021;101:412-22. [Crossref] [PubMed]
- Pai M, Behr MA, Dowdy D, et al. Tuberculosis. Nat Rev Dis Primers 2016;2:16076. [Crossref] [PubMed]
- Lin Y, Mensah IK, Doering M, et al. Machine learning-based error detection in the clinical laboratory: a critical review. Crit Rev Clin Lab Sci 2025;62:535-47. [Crossref] [PubMed]
- Yang HS. Implementing Machine Learning in the Clinical Laboratory: Opportunities and Challenges. EJIFCC 2025;36:615-7.
- Master SR, Badrick TC, Bietenbeck A, et al. Machine Learning in Laboratory Medicine: Recommendations of the IFCC Working Group. Clin Chem 2023;69:690-8. [Crossref] [PubMed]
- Seok HS, Yu S, Shin KH, et al. Machine Learning-Based Sample Misidentification Error Detection in Clinical Laboratory Tests: A Retrospective Multicenter Study. Clin Chem 2024;70:1256-67. [Crossref] [PubMed]
- Lorde N, Mahapatra S, Kalaria T. Machine Learning for Patient-Based Real-Time Quality Control (PBRTQC), Analytical and Preanalytical Error Detection in Clinical Laboratory. Diagnostics (Basel) 2024;14:1808. [Crossref] [PubMed]
- Clinical & Laboratory Standards Institute. CLSI PRE02: Collection of Diagnostic Venous Blood Specimens. 2024. Available online: https://clsi.org/shop/standards/pre02/
- Seok HS, Choi Y, Yu S, et al. Machine learning-based delta check method for detecting misidentification errors in tumor marker tests. Clin Chem Lab Med 2024;62:1421-32. [Crossref] [PubMed]
- Mitani T, Doi S, Yokota S, et al. Highly accurate and explainable detection of specimen mix-up using a machine learning model. Clin Chem Lab Med 2020;58:375-83. [Crossref] [PubMed]
- Surian D, Wang Y, Coiera E, et al. Using automated methods to detect safety problems with health information technology: a scoping review. J Am Med Inform Assoc 2023;30:382-92. [Crossref] [PubMed]
- Sürmeli BG, Staritzbichler R, Ringel C, et al. Analyte Importance Analysis in Machine Learning-Based Detection of Wrong-Blood-in-Tube Errors Using Complete Blood Count Data. J Pers Med 2025;15:404. [Crossref] [PubMed]
- Farrell CL, Giannoutsos J. Machine learning models outperform manual result review for the identification of wrong blood in tube errors in complete blood count results. Int J Lab Hematol 2022;44:497-503. [Crossref] [PubMed]
- Farrell CJ, Makuni C, Keenan A, et al. A Machine Learning Model for the Routine Detection of "Wrong Blood in Complete Blood Count Tube" Errors. Clin Chem 2023;69:1031-7. [Crossref] [PubMed]
- Randell EW, Yenice S. Delta Checks in the clinical laboratory. Crit Rev Clin Lab Sci 2019;56:75-97. [Crossref] [PubMed]
- Liang YF, Padoan A, Wang Z, et al. Machine learning-based nonlinear regression-adjusted real-time quality control modeling: a multi-center study. Clin Chem Lab Med 2024;62:635-45. [Crossref] [PubMed]
- Yang X, Chen Q, Pan Z, et al. Application of Patient-Based Real-Time Quality Control Based on Artificial Intelligence Monitoring Platform in Continuously Quality Risk Monitoring of Down Syndrome Serum Screening. J Clin Lab Anal 2024;38:e25019. [Crossref] [PubMed]
- Badrick T, Bietenbeck A, Cervinski MA, et al. Patient-Based Real-Time Quality Control: Review and Recommendations. Clin Chem 2019;65:962-71. [Crossref] [PubMed]
- Albahra S, Gorbett T, Robertson S, et al. Artificial intelligence and machine learning overview in pathology & laboratory medicine: A general review of data preprocessing and basic supervised concepts. Semin Diagn Pathol 2023;40:71-87. [Crossref] [PubMed]
- Hou H, Zhang R, Li J. Artificial intelligence in the clinical laboratory. Clin Chim Acta 2024;559:119724. [Crossref] [PubMed]
- Zhou R, Liang YF, Cheng HL, et al. A multi-model fusion algorithm as a real-time quality control tool for small shift detection. Comput Biol Med 2022;148:105866. [Crossref] [PubMed]
- Liang Y, Wang Z, Huang D, et al. A study on quality control using delta data with machine learning technique. Heliyon 2022;8:e09935. [Crossref] [PubMed]
- Min WK, Park H. Optimization of the Total Testing Process Within the Big Data-to-Big Data Loop. Ann Lab Med 2025;45:558-61. [Crossref] [PubMed]
- Clinical & Laboratory Standards Institute. CLSI EP28: Defining, Establishing, and Verifying Reference Intervals in the Clinical Laboratory. 2010. Available online: https://clsi.org/shop/standards/ep28/
- Limpert E, Stahel WA. Problems with using the normal distribution--and ways to improve quality and efficiency of data analysis. PLoS One 2011;6:e21403. [Crossref] [PubMed]
- Wosniok W, Haeckel R. A new indirect estimation of reference intervals: truncated minimum chi-square (TMC) approach. Clin Chem Lab Med 2019;57:1933-47. [Crossref] [PubMed]
- Ammer T, Schützenmeister A, Prokosch HU, et al. RIbench: A Proposed Benchmark for the Standardized Evaluation of Indirect Methods for Reference Interval Estimation. Clin Chem 2022;68:1410-24. [Crossref] [PubMed]
- Ammer T, Schützenmeister A, Prokosch HU, et al. refineR: A Novel Algorithm for Reference Interval Estimation from Real-World Data. Sci Rep 2021;11:16023. [Crossref] [PubMed]
- Zierk J, Arzideh F, Kapsner LA, et al. Reference Interval Estimation from Mixed Distributions using Truncation Points and the Kolmogorov-Smirnov Distance (kosmic). Sci Rep 2020;10:1704. [Crossref] [PubMed]
- Velev J, LeBien J, Roche-Lima A. Unsupervised machine learning method for indirect estimation of reference intervals for chronic kidney disease in the Puerto Rican population. Sci Rep 2023;13:17198. [Crossref] [PubMed]
- Klauschen F, Dippel J, Keyl P, et al. Toward Explainable Artificial Intelligence for Precision Pathology. Annu Rev Pathol 2024;19:541-70. [Crossref] [PubMed]
- U.S. Food and Drug Administration. Software as a Medical Device (SaMD). 2018. Available online: https://www.fda.gov/medical-devices/digital-health-center-excellence/software-medical-device-samd
- Yang SR, Chien JT, Lee CY. Advancements in Clinical Evaluation and Regulatory Frameworks for AI-Driven Software as a Medical Device (SaMD). IEEE Open J Eng Med Biol 2025;6:147-51. [Crossref] [PubMed]
- Singh V, Cheng S, Kwan AC, et al. United States Food and Drug Administration Regulation of Clinical Software in the Era of Artificial Intelligence and Machine Learning. Mayo Clin Proc Digit Health 2025;3:100231. [Crossref] [PubMed]
- Ebad SA, Alhashmi A, Amara M, et al. Artificial Intelligence-Based Software as a Medical Device (AI-SaMD): A Systematic Review. Healthcare (Basel) 2025;13:817. [Crossref] [PubMed]
- Vidal DE, Loufek B, Kim YH, et al. Navigating US Regulation of Artificial Intelligence in Medicine-A Primer for Physicians. Mayo Clin Proc Digit Health 2023;1:31-9. [Crossref] [PubMed]
- Smith JA, Abhari RE, Hussain Z, et al. Industry ties and evidence in public comments on the FDA framework for modifications to artificial intelligence/machine learning-based medical devices: a cross sectional study. BMJ Open 2020;10:e039969. [Crossref] [PubMed]
- Medicines and Healthcare Products Regulatory Agency. Predetermined change control plans for machine learning-enabled medical devices: guiding principles. 2023. Available online: https://www.gov.uk/government/publications/predetermined-change-control-plans-for-machine-learning-enabled-medical-devices-guiding-principles/predetermined-change-control-plans-for-machine-learning-enabled-medical-devices-guiding-principles
Cite this article as: Wang J, Li J. Machine learning in pathology and clinical laboratory medicine: a narrative review. J Lab Precis Med 2026;11:16.

