Original Article


Artificial Intelligence Models May Aid in Predicting Lymph Node Metastasis in Patients with T1 Colorectal Cancer

Ji Eun Baek1,2, Hahn Yi3, Seung Wook Hong1, Subin Song1, Ji Young Lee4, Sung Wook Hwang1, Sang Hyoung Park1, Dong-Hoon Yang1, Byong Duk Ye1, Seung-Jae Myung1, Suk-Kyun Yang1, Namkug Kim3, Jeong-Sik Byeon1

1Department of Gastroenterology, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Korea; 2Department of Gastroenterology, St. Vincent's Hospital, College of Medicine, The Catholic University of Korea, Suwon, Korea; 3Department of Convergence Medicine, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Korea; 4Health Screening and Promotion Center, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Korea

Correspondence to: Namkug Kim
ORCID https://orcid.org/0000-0002-3438-2217
E-mail namkugkim@gmail.com

Jeong-Sik Byeon
ORCID https://orcid.org/0000-0002-9793-6379
E-mail jsbyeon@amc.seoul.kr

Ji Eun Baek and Hahn Yi contributed equally to this work as first authors.

Received: June 19, 2024; Revised: September 6, 2024; Accepted: September 8, 2024

This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

Gut Liver 2025;19(1):69-76. https://doi.org/10.5009/gnl240273

Published online January 8, 2025, Published date January 15, 2025

Copyright © Gut and Liver.

Background/Aims: Inaccurate prediction of lymph node metastasis (LNM) may lead to unnecessary surgery following endoscopic resection of T1 colorectal cancer (CRC). We aimed to validate the usefulness of artificial intelligence (AI) models for predicting LNM in patients with T1 CRC.
Methods: We analyzed the clinical data, laboratory results, pathological reports, and endoscopic findings of patients who underwent radical surgery for T1 CRC. We developed AI models to predict LNM using four algorithms: regularized logistic regression classifier (RLRC), random forest classifier (RFC), CatBoost classifier (CBC), and voting classifier (VC). Four histological factors and four endoscopic findings were used to develop the AI models. Areas under the receiver operating characteristics curves (AUROCs) were measured to compare the performance of the AI models with that of the Japanese Society for Cancer of the Colon and Rectum guidelines.
Results: Among 1,386 patients with T1 CRC, 173 patients (12.5%) had LNM. The AUROC values of the RLRC, RFC, CBC, and VC models for LNM prediction (0.673, 0.640, 0.679, and 0.677, respectively) were significantly higher than the 0.525 obtained with the Japanese Society for Cancer of the Colon and Rectum guidelines (vs RLRC, p<0.001; vs RFC, p=0.001; vs CBC, p<0.001; vs VC, p<0.001). The AUROC value did not differ significantly between T1 colon and T1 rectal cancers (0.718 vs 0.615, p=0.700) or between the initial endoscopic resection and initial surgery groups (0.581 vs 0.746, p=0.845).
Conclusions: AI models trained on the basis of endoscopic findings and pathological features performed well in predicting LNM in patients with T1 CRC regardless of tumor location and initial treatment method.

Keywords: Artificial intelligence, T1 colorectal cancer, Lymph node metastasis

With the development of a well-organized colorectal cancer (CRC) screening program and an increasing population adhering to it, the number of patients diagnosed with early-stage CRC has risen.1 Early CRC is defined as the invasion of cancer cells into the mucosal or submucosal layer, regardless of lymph node metastasis (LNM). It can be treated using endoscopic methods instead of radical surgery if the risk of LNM is negligible.2-4 Nevertheless, when early CRC is treated by endoscopic resection, there is a pitfall: the presence or absence of LNM cannot be histologically verified. Hence, if there are one or more high-risk pathological features suggestive of LNM, additional surgery with lymph node dissection is recommended after endoscopic resection. The guidelines of the Japanese Society for Cancer of the Colon and Rectum (JSCCR) list the following pathological findings as high-risk features for LNM: (1) depth of submucosal invasion ≥1,000 μm; (2) positive lymphovascular invasion; (3) poorly differentiated adenocarcinoma, signet-ring cell carcinoma, or mucinous carcinoma; and (4) budding grade of 2 or 3.
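The JSCCR decision rule described above can be written as a simple check. The following Python sketch is illustrative only: the record type `T1Pathology`, its field names, and the histology codes are hypothetical and are not taken from the study or from the guideline text.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class T1Pathology:
    """Hypothetical record of the four pathological findings (names are illustrative)."""
    sm_invasion_depth_um: float    # depth of submucosal invasion, micrometres
    lymphovascular_invasion: bool  # True if lymphovascular invasion is positive
    histology: str                 # assumed codes: "WD", "MD", "PD", "SRCC", "MUC"
    budding_grade: int             # tumor budding grade: 1, 2, or 3


# Poorly differentiated adenocarcinoma, signet-ring cell carcinoma, mucinous carcinoma.
UNFAVORABLE_HISTOLOGY = {"PD", "SRCC", "MUC"}


def jsccr_high_risk_features(p: T1Pathology) -> List[str]:
    """Return the names of the high-risk features that are present."""
    present = []
    if p.sm_invasion_depth_um >= 1000:
        present.append("deep submucosal invasion")
    if p.lymphovascular_invasion:
        present.append("lymphovascular invasion")
    if p.histology in UNFAVORABLE_HISTOLOGY:
        present.append("unfavorable histology")
    if p.budding_grade >= 2:
        present.append("budding grade 2 or 3")
    return present


def additional_surgery_recommended(p: T1Pathology) -> bool:
    """One or more high-risk features leads to a recommendation for additional surgery."""
    return len(jsccr_high_risk_features(p)) > 0
```

Because a single positive feature is enough to trigger the recommendation, the rule is sensitive but not specific, which is the limitation the following paragraphs discuss.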

However, when an additional surgery was performed based on these recommendations and the surgically resected specimens were analyzed, LNM was confirmed only in 8.8% to 14.3% of cases.6-9 In other words, about nine out of 10 patients underwent unnecessary surgery. Furthermore, surgery-related mortality and morbidity are still not negligible despite advances in surgical techniques. Therefore, there is a need to develop a high-accuracy preoperative predictive model for LNM in early CRC.10 To meet this requirement, several attempts have been made to develop an artificial intelligence (AI)-based decision-making model for predicting LNM in early CRC.11,12 These models demonstrated superior accuracy in predicting LNM in early CRC compared to the current guidelines. However, most of these models were created based on the patient’s clinical and pathological data, excluding endoscopic findings.

In this study, we aimed to develop an AI-based LNM predictive model that focuses on both the histological features and the endoscopic characteristics of early CRC. We also evaluated whether the new AI model demonstrated good accuracy in LNM prediction compared to the JSCCR guidelines.

1. Study design and population

This was a retrospective single-center study with the development and cross-validation of an AI model. In this study, patients who underwent surgical resection of T1 CRC, with or without prior endoscopic resection at Asan Medical Center, Seoul, Korea, from 2011 to 2018, were included. Furthermore, patients who had undergone endoscopic resection at other institutions followed by surgery at Asan Medical Center were also included. Among these, the following patients were excluded: (1) those with synchronous CRC; (2) those with non-radical surgery; and (3) those with missing values, such as poor-quality endoscopic images. We gathered demographic variables and clinical information, including age, sex, body mass index, family history of CRC, medication, smoking and alcohol history, laboratory data (fasting blood glucose, total cholesterol, and serum carcinoembryonic antigen levels), pathological findings of the surgical specimen, and endoscopic images.

2. Pathological findings

The pathological findings of surgical specimens were evaluated by board-certified pathologists in our institution. The following pathological factors were included in developing an AI model to predict LNM for T1 CRC: (1) tumor histology, (2) lymphovascular invasion, (3) tumor budding, and (4) invasion depth of submucosa (SM). We defined deep SM invasion as the absolute depth of SM invasion >1,000 μm, Haggitt level 4, or Kikuchi SM level 2−3.13,14

3. Endoscopic findings

The endoscopic images were evaluated independently by two board-certified, experienced endoscopists (S.W.H. and J.S.B.). All the analyzed images were conventional white-light endoscopy images. The typical endoscopic findings suggesting deep SM invasion included in the analysis were as follows: (1) ulceration/depression, (2) expansion, (3) white spots, and (4) hardness.15 In instances of disagreement between the two endoscopists, a consensus was reached through team discussion. Other endoscopic findings, such as tumor size, location (colon or rectum), morphology (pedunculated or non-pedunculated), and pit pattern by Kudo classification, were also analyzed.16

4. Outcome definitions

The primary outcome of this study was the clinical validation following the development of an AI model for predicting LNM in T1 CRC. The stratified 5-fold cross-validation was implemented to validate the model’s performance under generalized circumstances. Receiver operating characteristics curves were plotted and the area under the receiver operating characteristics curve (AUROC) was measured to compare our model’s discriminating power with that of the Japanese guidelines.
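As an illustration of the validation scheme, the sketch below shows stratified k-fold splitting and a rank-based AUROC in plain Python. It is a minimal sketch under stated assumptions, not the study's code: the function names are hypothetical, `fit` stands for any training routine that returns a scoring function, and the sketch reports one AUROC per fold because the text does not say how fold-level results were combined.

```python
import random
from typing import Callable, Dict, List, Sequence


def stratified_k_folds(labels: Sequence[int], k: int = 5, seed: int = 0) -> List[List[int]]:
    """Split sample indices into k folds with a similar class ratio in each fold."""
    rng = random.Random(seed)
    by_class: Dict[int, List[int]] = {}
    for idx, y in enumerate(labels):
        by_class.setdefault(y, []).append(idx)
    folds: List[List[int]] = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the shuffled indices of each class round-robin across the folds.
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def cross_validated_auroc(features: Sequence, labels: Sequence[int],
                          fit: Callable, k: int = 5, seed: int = 0) -> List[float]:
    """Train on k-1 folds, score the held-out fold, and return one AUROC per fold."""
    results = []
    for held_out in stratified_k_folds(labels, k, seed):
        held = set(held_out)
        train = [i for i in range(len(labels)) if i not in held]
        # `fit` is a hypothetical callable: (X, y) -> function mapping one row to a score.
        predict = fit([features[i] for i in train], [labels[i] for i in train])
        scores = [predict(features[i]) for i in held_out]
        results.append(auroc([labels[i] for i in held_out], scores))
    return results
```

With a cohort of 173 positive and 1,213 negative cases, this round-robin assignment places 34 or 35 positive cases in every fold, so each held-out fold keeps roughly the 12.5% LNM prevalence of the whole cohort.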

5. Subanalysis

We compared the diagnostic performance according to the tumor location (colon vs rectum) because rectal cancer was reported to have a higher risk of LNM as compared to colon cancer.17 Furthermore, as the initial endoscopic resection group serves as an actual prediction target for our model in clinical practice, we assessed the diagnostic performance within patients who underwent endoscopic resection followed by surgical resection.

6. AI model development

The predictive models were developed using four supervised AI algorithms: a regularized logistic regression classifier (RLRC), a random forest classifier (RFC), a CatBoost classifier (CBC), and a voting classifier (VC) that combines the other three models with votes weighted in a 2:1:4 ratio. RLRC, which adds a penalty to the loss function to reduce the possibility of overfitting to the development dataset, is expected to perform well on unseen data.18 The RFC method uses the diversity of decision trees to improve classification performance while preventing overfitting to the development set, creating a model that also works well with unseen data.19 CBC exhibits outstanding performance among gradient-boosting algorithms, especially in tasks involving numerous categorical variables.20 VC tends to generate robust predictions, even in cases where one or two classifiers make errors. The classification probability of VC with well-calibrated classifiers also approaches the true probability.21
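A weighted vote of this kind can be sketched in a few lines. Two assumptions are made here that the text does not confirm: that the 2:1:4 ratio maps to RLRC, RFC, and CBC in the order the models are listed, and that the vote is "soft" (a weighted average of predicted probabilities) rather than a weighted count of class labels. The names `VOTE_WEIGHTS` and `soft_vote` are hypothetical.

```python
from typing import Dict

# Assumed mapping of the 2:1:4 ratio to the three base models (order is an assumption).
VOTE_WEIGHTS: Dict[str, float] = {"RLRC": 2.0, "RFC": 1.0, "CBC": 4.0}


def soft_vote(probabilities: Dict[str, float],
              weights: Dict[str, float] = VOTE_WEIGHTS) -> float:
    """Weighted average of each base model's predicted LNM probability."""
    missing = set(weights) - set(probabilities)
    if missing:
        raise KeyError(f"missing model outputs: {sorted(missing)}")
    total = sum(weights.values())
    return sum(weights[name] * probabilities[name] for name in weights) / total


# Example: the base models disagree; the CBC output carries the most weight.
example = soft_vote({"RLRC": 0.30, "RFC": 0.60, "CBC": 0.10})
```

In this example the combined probability is (2 × 0.30 + 1 × 0.60 + 4 × 0.10) / 7, so a single dissenting model with a low weight moves the result only slightly.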

We utilized the previously mentioned four pathological and four endoscopic findings for developing the prediction model: tumor histology, lymphovascular invasion, tumor budding, depth of SM invasion, hardness, expansion, white spot, and ulceration/depression. The best-performing model among the four algorithms was chosen as the final model for subanalysis.

7. Statistical analysis

Continuous variables were described as mean±standard deviation or median (interquartile range), while categorical variables were expressed as frequency and percentage. The Mann-Whitney U test was used to compare continuous variables, and the chi-square or Fisher exact test was used to compare categorical variables. AUROCs were compared using DeLong’s test.22 All statistical tests were two-sided, and a p-value <0.05 was considered statistically significant. Missing values were imputed using Multivariate Imputation by Chained Equations.23 All analyses were performed using R software (version 4.0.2; R Foundation for Statistical Computing, Vienna, Austria).
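To make the categorical comparison concrete, the following Python sketch computes a Pearson chi-square test for a 2×2 table using the lymphovascular invasion counts reported in Table 2. It is a hedged illustration rather than a reproduction of the authors' analysis, which was run in R: the function name is hypothetical, and no continuity correction is applied, so the statistic may differ slightly from software that applies Yates' correction by default.

```python
import math
from typing import Sequence, Tuple


def chi_square_2x2(table: Sequence[Sequence[int]]) -> Tuple[float, float]:
    """Pearson chi-square statistic and p-value for a 2x2 table (no continuity correction)."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    # With 1 degree of freedom, the chi-square survival function is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Rows: patients with LNM, patients without LNM.
# Columns: lymphovascular invasion positive, negative (counts from Table 2).
lvi_table = [[76, 97], [222, 991]]
statistic, p_value = chi_square_2x2(lvi_table)
```

For these counts the p-value is far below 0.001, in line with the "<0.001" reported for lymphovascular invasion in Table 2.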

8. Ethical statements

This study was approved by the Institutional Review Board at Asan Medical Center (IRB number: 2020−1676). The study was conducted in accordance with the Declaration of Helsinki. Because of the retrospective nature of the study, the patient consent requirement was waived.

1. Baseline characteristics of the patients

We initially identified 1,635 patients with T1 CRC who underwent surgical resection at Asan Medical Center. After applying the exclusion criteria, 1,386 patients were selected as the study population, 173 of whom (12.5%) had LNM (Fig. 1). Baseline characteristics according to LNM status are shown in Table 1. In the LNM group, 64 patients (37.0%) underwent endoscopic plus surgical resection, and 109 patients (63.0%) underwent initial surgical resection. Endoscopic and pathological features of T1 CRC patients with and without LNM are summarized in Table 2. To develop the prediction model for LNM, we used four pathological variables (tumor histology, lymphovascular invasion, tumor budding, and depth of invasion) and three endoscopic variables (hardness, white spot, and ulceration/depression) that exhibited statistically significant differences between the two groups. Although not statistically significant, another endoscopic factor, expansion, was included in the development of the prediction model based on its clinical significance in previous studies.15

Figure 1. Flowchart of the included patients. CRC, colorectal cancer; LNM, lymph node metastasis.

Table 1. Baseline Characteristics

| Variable | Patients with LNM (n=173) | Patients without LNM (n=1,213) | p-value |
|---|---|---|---|
| Age, yr | 59.9±10.3 | 60.7±10.5 | 0.375 |
| Sex | | | 0.301 |
|   Male | 98 (56.6) | 741 (61.1) | |
|   Female | 75 (43.4) | 472 (38.9) | |
| BMI, kg/m² | 24.3 (22.6–26.2) | 24.3 (22.3–26.3) | 0.651 |
| Family history of CRC | | | 0.754 |
|   No | 159 (91.9) | 1,102 (90.8) | |
|   Yes | 14 (8.1) | 111 (9.2) | |
| Use of medications | | | |
|   Aspirin | 11 (6.36) | 158 (13.0) | 0.017 |
|   Statin | 13 (7.51) | 150 (12.4) | 0.084 |
| Alcohol consumption | | | 0.350 |
|   Current or ex-drinker | 85 (49.1) | 646 (53.3) | |
|   Nonuser | 88 (50.9) | 567 (46.7) | |
| Smoking history | | | 0.011 |
|   Never | 113 (65.3) | 665 (54.8) | |
|   Ex-smoker | 53 (30.6) | 434 (35.8) | |
|   Current smoker | 7 (4.05) | 114 (9.4) | |
| Fasting blood glucose, mg/dL | 127 (104–154) | 130 (107–157) | 0.495 |
| Total cholesterol, mg/dL | 162 (140–186) | 165 (140–189) | 0.469 |
| Serum CEA level, mg/dL | 1.4 (0.91–2.0) | 1.5 (0.99–2.1) | 0.348 |
| Location of T1 cancer | | | 0.458 |
|   Colon | 121 (69.9) | 810 (66.8) | |
|   Rectum | 52 (30.1) | 403 (33.2) | |
| Treatment modality | | | 0.005 |
|   Surgery only | 109 (63.0) | 623 (51.4) | |
|   Endoscopic resection followed by surgery | 64 (37.0) | 590 (48.6) | |

Data are presented as mean±SD, number (%), or median (interquartile range).

LNM, lymph node metastasis; BMI, body mass index; CRC, colorectal cancer; CEA, carcinoembryonic antigen.



Table 2. Endoscopic and Pathological Features

| Variable | Patients with LNM (n=173) | Patients without LNM (n=1,213) | p-value |
|---|---|---|---|
| Tumor size, mm | 18 (13–24) | 18 (12–25) | 0.938 |
| Morphology | | | 0.809 |
|   Nonpedunculated | 125 (72.3) | 891 (73.5) | |
|   Pedunculated | 48 (27.7) | 322 (26.5) | |
| Hardness | 152 (87.9) | 925 (76.3) | <0.001 |
| Expansion | 99 (57.2) | 685 (56.5) | 0.925 |
| White spot | 86 (49.7) | 466 (38.4) | 0.006 |
| Ulceration/depression | 101 (58.4) | 543 (44.8) | 0.001 |
| Tumor histology | | | <0.001 |
|   WD | 35 (20.2) | 474 (39.1) | |
|   MD | 122 (70.5) | 702 (57.9) | |
|   PD | 9 (5.2) | 21 (1.73) | |
|   SRCC | 3 (1.7) | 1 (0.1) | |
|   Mucinous adenocarcinoma | 3 (1.7) | 12 (1.0) | |
|   Others | 1 (0.6) | 3 (0.2) | |
| Tumor budding | | | 0.031 |
|   Grade 1 | 126 (72.8) | 976 (80.5) | |
|   Grade 2 or 3 | 44 (25.4) | 222 (18.3) | |
| Depth of SM invasion | | | 0.01 |
|   Deep | 136 (78.6) | 833 (68.7) | |
|   Superficial | 37 (21.4) | 380 (31.3) | |
| Lymphovascular invasion | | | <0.001 |
|   Negative | 97 (56.1) | 991 (81.7) | |
|   Positive | 76 (43.9) | 222 (18.3) | |

Data are presented as median (interquartile range) or number (%).

LNM, lymph node metastasis; WD, well differentiated; MD, moderately differentiated; PD, poorly differentiated; SRCC, signet-ring cell carcinoma; SM, submucosa.



2. Clinical validation of the developed model

The confusion matrix obtained through the 5-fold cross-validation is presented in Supplementary Fig. 1. The AUROCs of the RLRC, RFC, CBC, and VC models for LNM prediction were 0.673 (95% confidence interval [CI], 0.574 to 0.772), 0.639 (95% CI, 0.542 to 0.736), 0.679 (95% CI, 0.582 to 0.775), and 0.677 (95% CI, 0.580 to 0.774), respectively. All models demonstrated superior performance compared to the JSCCR guidelines (AUROC, 0.525; 95% CI, 0.490 to 0.560) (p<0.001 between RLRC and JSCCR, p=0.001 between RFC and JSCCR, p<0.001 between CBC and JSCCR, and p<0.001 between VC and JSCCR) (Fig. 2). Other performance indicators, such as accuracy, sensitivity, specificity, and positive and negative predictive values, are presented in Supplementary Table 1. Lymphovascular invasion was the most influential factor for LNM prediction in the SHapley Additive exPlanations (SHAP) beeswarm plot (Supplementary Fig. 2).

Figure 2. Receiver operating characteristics (ROC) curves for the artificial intelligence models and Japanese guidelines in predicting lymph node metastasis. AUROC, area under the ROC curve; CI, confidence interval; JpnGL, Japanese guideline; RLRC, regularized logistic regression classifier; RFC, random forest classifier; CBC, CatBoost classifier; VC, voting classifier.

3. Subanalysis according to the tumor location and treatment modality

We used the CBC model, which showed the best performance, as a tool for subgroup analysis. In the subgroups of T1 colon and rectal cancer, the AUROC for LNM prediction was 0.718 (95% CI, 0.600 to 0.836) and 0.615 (95% CI, 0.498 to 0.731) (Fig. 3). Moreover, there was no significant difference in the AUROC for LNM prediction between the patients who underwent initial endoscopic resection followed by surgery and those who underwent initial surgery (0.581 [95% CI, 0.473 to 0.688] vs 0.746 [95% CI, 0.636 to 0.855], p=0.845) (Fig. 4).

Figure 3. Receiver operating characteristics (ROC) curves for the CatBoost classifier model in the colon and rectum subgroups in predicting lymph node metastasis. AUROC, area under the ROC curve; CI, confidence interval.

Figure 4. Receiver operating characteristics (ROC) curves for the CatBoost classifier model in the initial endoscopic resection and initial surgery subgroups in predicting lymph node metastasis. AUROC, area under the ROC curve; CI, confidence interval.

Accurate prediction of the presence of LNM in patients who have undergone endoscopic resection of T1 CRC is a clinically important issue for deciding on additional surgery. Although current guidelines recommend additional surgery if the endoscopic resection specimen exhibits unfavorable pathological features, 85.7% to 91.2% of the patients who underwent surgery showed no LNM in the surgical resection specimen.6-9,11,24 In this study, we found that AI models incorporating both endoscopic and pathological findings of T1 CRC were superior to the JSCCR guidelines in predicting LNM.

Several studies have attempted to accurately predict LNM in T1 CRC using AI algorithms. A Japanese study developed an AI model using the patient’s age, gender, tumor size, location, morphology, lymphovascular invasion, and histological grade.12 This model demonstrated superior predictive accuracy compared to decisions based on the current U.S. guidelines.12 In another study, a prediction model was proposed based on five histological findings, demonstrating a high predictive capability for LNM.25 The JSCCR guidelines are the most commonly used and currently recommended for decision-making in CRC. Nevertheless, our study demonstrated that AI models can predict LNM in T1 CRC more accurately than the JSCCR guidelines. An important aspect of our AI models is that the algorithms incorporate not only pathological features but also endoscopic findings, which distinguishes them from the models in previous studies. While we did not directly compare an AI model trained with endoscopic findings to one without, we suggest that AI models benefiting from diverse training data may exhibit superior performance compared to those with limited training data. However, this assumption requires further validation, as one study reported an AUROC of 0.83 based solely on histologic risk factors, which is higher than the AUROC of our models.25 It should be noted that that study developed a predictive model for the specific group of pedunculated T1 CRC, making direct comparison with our study, which included the entire T1 CRC population, difficult. Therefore, further research, encompassing not only pathological data but also demographic and clinical data such as medication history including aspirin, should be conducted to enhance the quality of AI models predicting LNM in T1 CRC. Additionally, a more comprehensive AI algorithm should be developed by including other endoscopic characteristics such as tumor size and location, in addition to the four endoscopic features (hardness, white spot, ulceration/depression, and expansion) used in our study.

In subgroup analysis, our AI model demonstrated high predictive power for LNM regardless of the location of T1 CRC. A previous study reported a significantly higher local recurrence rate in patients with T1 rectal cancer compared to those with T1 cancer in the colon proximal to the rectum when treated with only endoscopic resection.26 Furthermore, bowel dysfunction after radical surgery for rectal cancer significantly impacts the quality of life.27 Therefore, precise prediction of LNM in T1 rectal cancer is crucial. The performance of our AI model in the T1 rectal cancer subgroup did not differ significantly from that in the T1 colon cancer subgroup (Fig. 3). Thus, the AI model can be used for predicting LNM not only in T1 colon cancer but also in T1 rectal cancer, where precise LNM prediction is critical.

Reliable prediction of LNM in T1 CRC is necessary for determining the need for additional surgery after initial endoscopic resection in daily clinical practice. In our study, the AI model exhibited no significant difference in the performance of LNM prediction in T1 CRC between the initial endoscopic resection and initial surgery groups. Hence, we propose that the AI model may be readily applicable to patients who initially underwent endoscopic resection of T1 CRC to determine the necessity of additional surgery with greater confidence compared to the JSCCR guidelines. Nevertheless, although the difference was not statistically significant, the AUROC of 0.581 for the initial endoscopic resection group was numerically lower than the 0.746 for the initial surgery group. Additionally, an AUROC of 0.581 indicates that the AI algorithm is insufficient for practical clinical application. Therefore, efforts are needed to improve performance by using larger training data and carefully considering reliable risk features, aiming to develop an AI algorithm that can be practically applied in determining the need for additional surgery in the initial endoscopic resection group.

We developed and evaluated four AI models. Among these, the CBC model showed the best performance. The CBC model has an advantage in tasks with many categorical variables.20 The data of our study consisted mostly of categorical variables, which may explain why the CBC model outperformed the other AI models in our study. Nevertheless, given the complexity of LNM occurrence in T1 CRC, additional efforts should be undertaken to explore the possibility of combining and employing ensembles of various AI models to improve the predictability of LNM in T1 CRC.

Our study has several limitations. First, it is a single-center, retrospective cohort study. Second, we did not train the AI algorithms directly with the colonoscopy images but relied on the interpretation of two endoscopists. Hence, there might be interobserver or intraobserver bias. Finally, validation was performed only in a relatively small cohort and without temporal validation, although separating the patient cohort into training and validation groups according to the sample collection period could have produced more robust results. To address this limitation, we performed stratified 5-fold cross-validation, ensuring that the patients in the training and validation sets did not overlap. Additionally, to further verify the performance of our models, we performed 1,000 bootstrap iterations, another reliable validation method.28 The results from bootstrapping were consistent with those from the 5-fold cross-validation (Supplementary Fig. 3), supporting the reliability of our AI models. Hence, we believe that our study holds significance, as it represents, to the best of our knowledge, the first clinical validation of AI models for predicting LNM incorporating not only pathological features but also endoscopic findings. Nevertheless, in this study, the diagnostic accuracy of the AI algorithms was modest because of the small sample size and number of events in the training data, and only internal validation was performed. Therefore, to improve the diagnostic performance and objectively establish the utility of our model, a further large-scale study with external validation is essential.
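The text does not describe how the 1,000 bootstrap iterations were applied. One common use, shown in the Python sketch below purely as an assumption, is a percentile confidence interval for the AUROC obtained by resampling patients with replacement. The function names and default parameters are hypothetical.

```python
import random
from typing import List, Sequence, Tuple


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_auroc_ci(labels: Sequence[int], scores: Sequence[float],
                       n_iter: int = 1000, alpha: float = 0.05,
                       seed: int = 0) -> Tuple[float, float]:
    """Percentile bootstrap confidence interval for the AUROC."""
    if len(set(labels)) < 2:
        raise ValueError("both classes must be present")
    rng = random.Random(seed)
    n = len(labels)
    estimates: List[float] = []
    while len(estimates) < n_iter:
        # Resample patients with replacement.
        sample = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in sample]
        if len(set(ys)) < 2:
            continue  # redraw when a resample contains only one class
        estimates.append(auroc(ys, [scores[i] for i in sample]))
    estimates.sort()
    lower = estimates[int((alpha / 2) * n_iter)]
    upper = estimates[min(n_iter - 1, int((1 - alpha / 2) * n_iter))]
    return lower, upper
```

The width of such an interval shrinks as the number of events grows, which is one way to see why a cohort with relatively few LNM cases yields wide confidence intervals.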

In conclusion, our AI models outperformed the current JSCCR guidelines for predicting LNM, suggesting the applicability of AI models in clinical practice to more accurately identify patients who require additional surgery after the initial endoscopic resection of T1 CRC. Further, larger studies are warranted to enhance the performance of AI models for predicting LNM in T1 CRC by developing deep learning models like convolutional neural networks or transformers. Additionally, integrating new AI technologies such as large language models and large multi-modal models with electronic health record systems, high-resolution endoscopic and pathologic image data could improve clinical decision-making regarding the necessity of additional surgery after endoscopic resection of T1 CRC. These advancements in AI technology could significantly impact the future of CRC management.

This research was supported by a grant of the Korea Health Technology R&D Project through the Korea Health Industry Development Institute (KHIDI), funded by the Ministry of Health & Welfare, Republic of Korea (HR20C0026).

Study concept and design: S.W.H., N.K., J.S.B. Data acquisition: J.E.B., S.W.H. Data analysis and interpretation: J.E.B., H.Y., S.W.H. Drafting of the manuscript: J.E.B., H.Y., S.W.H., S.S., J.S.B. Critical revision of the manuscript for important intellectual content: J.E.B., H.Y., S.W.H., S.S., J.Y.L., S.W.H., S.H.P., D.H.Y., B.D.Y., S.J.M., S.K.Y., N.K., J.S.B. Statistical analysis: J.E.B., H.Y., S.W.H. Obtained funding: N.K., J.S.B. Administrative, technical, or material support; study supervision: J.E.B., H.Y., S.W.H., S.S., J.Y.L., S.W.H., S.H.P., D.H.Y., B.D.Y., S.J.M., S.K.Y., N.K., J.S.B. Approval of final manuscript: all authors.

The data that support the findings of this study are available from the corresponding author upon reasonable request.

  1. Bretthauer M, Kaminski MF, Løberg M, et al. Population-based colonoscopy screening for colorectal cancer: a randomized clinical trial. JAMA Intern Med 2016;176:894-902.
    Pubmed KoreaMed CrossRef
  2. Jeon MH, Jang SW, Lee CM, Kim SB. Early colon cancer recurring as liver metastasis without local recurrence three years after complete endoscopic mucosal resection. Case Rep Gastroenterol 2019;13:403-409.
    Pubmed KoreaMed CrossRef
  3. Labianca R, Nordlinger B, Beretta GD, et al. Early colon cancer: ESMO clinical practice guidelines for diagnosis, treatment and follow-up. Ann Oncol 2013;24 Suppl 6:vi64-vi72.
    Pubmed CrossRef
  4. Hong SW, Byeon JS. Endoscopic diagnosis and treatment of early colorectal cancer. Intest Res 2022;20:281-290.
    Pubmed KoreaMed CrossRef
  5. Ohata K, Kobayashi N, Sakai E, et al. Long-term outcomes after endoscopic submucosal dissection for large colorectal epithelial neoplasms: a prospective, multicenter, cohort trial from Japan. Gastroenterology 2022;163:1423-1434.
    Pubmed CrossRef
  6. Tateishi Y, Nakanishi Y, Taniguchi H, Shimoda T, Umemura S. Pathological prognostic factors predicting lymph node metastasis in submucosal invasive (T1) colorectal carcinoma. Mod Pathol 2010;23:1068-1072.
    Pubmed CrossRef
  7. Yoda Y, Ikematsu H, Matsuda T, et al. A large-scale multicenter study of long-term outcomes after endoscopic resection for submucosal invasive colorectal cancer. Endoscopy 2013;45:718-724.
    Pubmed CrossRef
  8. Hashiguchi Y, Muro K, Saito Y, et al. Japanese Society for Cancer of the Colon and Rectum (JSCCR) guidelines 2019 for the treatment of colorectal cancer. Int J Clin Oncol 2020;25:1-42.
    Pubmed KoreaMed CrossRef
  9. Choi YS, Kim WS, Hwang SW, et al. Clinical outcomes of submucosal colorectal cancer diagnosed after endoscopic resection: a focus on the need for surgery. Intest Res 2020;18:96-106.
    Pubmed KoreaMed CrossRef
  10. Vermeer NC, Backes Y, Snijders HS, et al. National cohort study on postoperative risks after surgery for submucosal invasive colorectal cancer. BJS Open 2018;3:210-217.
    Pubmed KoreaMed CrossRef
  11. Ichimasa K, Kudo SE, Mori Y, et al. Artificial intelligence may help in predicting the need for additional surgery after endoscopic resection of T1 colorectal cancer. Endoscopy 2018;50:230-240.
    Pubmed CrossRef
  12. Kudo SE, Ichimasa K, Villard B, et al. Artificial intelligence system to determine risk of T1 colorectal cancer metastasis to lymph node. Gastroenterology 2021;160:1075-1084.
    Pubmed CrossRef
  13. Haggitt RC, Glotzbach RE, Soffer EE, Wruble LD. Prognostic factors in colorectal carcinomas arising in adenomas: implications for lesions removed by endoscopic polypectomy. Gastroenterology 1985;89:328-336.
    Pubmed CrossRef
  14. Kikuchi R, Takano M, Takagi K, et al. Management of early invasive colorectal cancer: risk of recurrence and clinical guidelines. Dis Colon Rectum 1995;38:1286-1295.
    Pubmed CrossRef
  15. Matsuda T, Parra-Blanco A, Saito Y, Sakamoto T, Nakajima T. Assessment of likelihood of submucosal invasion in non-polypoid colorectal neoplasms. Gastrointest Endosc Clin N Am 2010;20:487-496.
    Pubmed CrossRef
  16. Kudo S, Hirota S, Nakajima T, et al. Colorectal tumours and pit pattern. J Clin Pathol 1994;47:880-885.
    Pubmed KoreaMed CrossRef
  17. Wang H, Wei XZ, Fu CG, Zhao RH, Cao FA. Patterns of lymph node metastasis are different in colon and rectal carcinomas. World J Gastroenterol 2010;16:5375-5379.
    Pubmed KoreaMed CrossRef
  18. Salehi F, Abbasi E, Hassibi B. The impact of regularization on high-dimensional logistic regression. arXiv:1906.03761 [Preprint]. 2019 [cited 2024 Sep 9].
    Available from: https://doi.org/10.48550/arXiv.1906.03761
  19. Breiman L. Random forests. Mach Learn 2001;45:5-32.
    CrossRef
  20. Prokhorenkova L, Gusev G, Vorobev A, Dorogush AV, Gulin A. CatBoost: unbiased boosting with categorical features. arXiv:1706.09516 [Preprint]. 2019 [cited 2024 Sep 9].
    Available from: https://doi.org/10.48550/arXiv.1706.09516
    CrossRef
  21. Opitz D, Maclin R. Popular ensemble methods: an empirical study. J Artif Intell Res 1999;11:169-198.
    CrossRef
  22. DeLong ER, DeLong DM, Clarke-Pearson DL. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics 1988;44:837-845.
    Pubmed CrossRef
  23. Azur MJ, Stuart EA, Frangakis C, Leaf PJ. Multiple imputation by chained equations: what is it and how does it work? Int J Methods Psychiatr Res 2011;20:40-49.
    Pubmed KoreaMed CrossRef
  24. Yasue C, Chino A, Takamatsu M, et al. Pathological risk factors and predictive endoscopic factors for lymph node metastasis of T1 colorectal cancer: a single-center study of 846 lesions. J Gastroenterol 2019;54:708-717.
    Pubmed CrossRef
  25. Backes Y, Elias SG, Groen JN, et al. Histologic factors associated with need for surgery in patients with pedunculated T1 colorectal carcinomas. Gastroenterology 2018;154:1647-1659.
    Pubmed CrossRef
  26. Ikematsu H, Yoda Y, Matsuda T, et al. Long-term outcomes after resection for submucosal invasive colorectal cancers. Gastroenterology 2013;144:551-559.
    Pubmed CrossRef
  27. Trenti L, Galvez A, Biondo S, et al. Quality of life and anterior resection syndrome after surgery for mid to low rectal cancer: a cross-sectional study. Eur J Surg Oncol 2018;44:1031-1039.
    Pubmed CrossRef
  28. Efron B, Tibshirani RJ. An introduction to the bootstrap. New York: Chapman & Hall, 1993.
    CrossRef

Original Article

Gut and Liver 2025; 19(1): 69-76

Published online January 15, 2025 https://doi.org/10.5009/gnl240273

Copyright © Gut and Liver.

Artificial Intelligence Models May Aid in Predicting Lymph Node Metastasis in Patients with T1 Colorectal Cancer

Ji Eun Baek1,2, Hahn Yi3, Seung Wook Hong1, Subin Song1, Ji Young Lee4, Sung Wook Hwang1, Sang Hyoung Park1, Dong-Hoon Yang1, Byong Duk Ye1, Seung-Jae Myung1, Suk-Kyun Yang1, Namkug Kim3, Jeong-Sik Byeon1

1Department of Gastroenterology, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Korea; 2Department of Gastroenterology, St. Vincent's Hospital, College of Medicine, The Catholic University of Korea, Suwon, Korea; 3Department of Convergence Medicine, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Korea; 4Health Screening and Promotion Center, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Korea

Correspondence to: Namkug Kim
ORCID https://orcid.org/0000-0002-3438-2217
E-mail namkugkim@gmail.com

Jeong-Sik Byeon
ORCID https://orcid.org/0000-0002-9793-6379
E-mail jsbyeon@amc.seoul.kr

Ji Eun Baek and Hahn Yi contributed equally to this work as first authors.

Received: June 19, 2024; Revised: September 6, 2024; Accepted: September 8, 2024

This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

Background/Aims: Inaccurate prediction of lymph node metastasis (LNM) may lead to unnecessary surgery following endoscopic resection of T1 colorectal cancer (CRC). We aimed to validate the usefulness of artificial intelligence (AI) models for predicting LNM in patients with T1 CRC.
Methods: We analyzed the clinical data, laboratory results, pathological reports, and endoscopic findings of patients who underwent radical surgery for T1 CRC. We developed AI models to predict LNM using four algorithms: regularized logistic regression classifier (RLRC), random forest classifier (RFC), CatBoost classifier (CBC), and the voting classifier (VC). Four histological factors and four endoscopic findings were included to develop AI models. Areas under the receiver operating characteristics curves (AUROCs) were measured to compare the performance of the AI models with that of the Japanese Society for Cancer of the Colon and Rectum guidelines.
Results: Among 1,386 patients with T1 CRC, 173 patients (12.5%) had LNM. The AUROC values of the RLRC, RFC, CBC, and VC models for LNM prediction (0.673, 0.640, 0.679, and 0.677, respectively) were significantly higher than the 0.525 obtained with the Japanese Society for Cancer of the Colon and Rectum guidelines (vs RLRC, p<0.001; vs RFC, p=0.001; vs CBC, p<0.001; vs VC, p<0.001). The AUROC value was similar between T1 colon and T1 rectal cancers (0.718 vs 0.615, p=0.700). The AUROC value was also similar between the initial endoscopic resection and initial surgery groups (0.581 vs 0.746, p=0.845).
Conclusions: AI models trained on the basis of endoscopic findings and pathological features performed well in predicting LNM in patients with T1 CRC regardless of tumor location and initial treatment method.

Keywords: Artificial intelligence, T1 colorectal cancer, Lymph node metastasis

INTRODUCTION

With the development of a well-organized colorectal cancer (CRC) screening program and an increasing population adhering to it, the number of patients diagnosed with early-stage CRC has risen.1 Early CRC is defined as the invasion of cancer cells into the mucosal or submucosal layer, regardless of lymph node metastasis (LNM). It can be treated using endoscopic methods instead of radical surgery if the risk of LNM is negligible.2-4 Nevertheless, endoscopic resection of early CRC has a pitfall: the presence or absence of LNM cannot be histologically verified. Hence, if there are one or more high-risk pathological features suggestive of LNM, additional surgery with lymph node dissection is recommended after endoscopic resection. The guidelines suggested by the Japanese Society for Cancer of the Colon and Rectum (JSCCR) list the following pathological findings as the high-risk features for LNM: (1) depth of submucosal invasion ≥1,000 μm; (2) positive lymphovascular invasion; (3) poorly differentiated adenocarcinoma, signet-ring cell carcinoma, or mucinous carcinoma; and (4) budding grade of 2 or 3.
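The four JSCCR high-risk criteria listed above amount to an "any feature present" rule. The following is a minimal sketch of that rule; the class, field names, and histology labels are hypothetical and are not taken from the study's dataset.

```python
from dataclasses import dataclass

# Histological types treated as high risk in the criteria quoted above.
# The string labels themselves are illustrative assumptions.
HIGH_RISK_HISTOLOGY = {"poorly differentiated", "signet-ring cell", "mucinous"}


@dataclass
class Lesion:
    # Hypothetical field names, chosen only for this illustration.
    sm_invasion_depth_um: float
    lymphovascular_invasion: bool
    histology: str
    budding_grade: int


def jsccr_high_risk(lesion: Lesion) -> bool:
    """Return True if at least one of the four high-risk features is present."""
    return (
        lesion.sm_invasion_depth_um >= 1000
        or lesion.lymphovascular_invasion
        or lesion.histology in HIGH_RISK_HISTOLOGY
        or lesion.budding_grade >= 2
    )


shallow = Lesion(600, False, "well differentiated", 1)
deep = Lesion(1500, False, "well differentiated", 1)
print(jsccr_high_risk(shallow), jsccr_high_risk(deep))
```

Because a single positive feature is enough to trigger the recommendation, a rule of this form tends to flag many lesions, which is consistent with the low LNM yield after additional surgery discussed in the next paragraph.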

However, when an additional surgery was performed based on these recommendations and the surgically resected specimens were analyzed, LNM was confirmed only in 8.8% to 14.3% of cases.6-9 In other words, about nine out of 10 patients underwent unnecessary surgery. Furthermore, surgery-related mortality and morbidity are still not negligible despite advances in surgical techniques. Therefore, there is a need to develop a high-accuracy preoperative predictive model for LNM in early CRC.10 To meet this requirement, several attempts have been made to develop an artificial intelligence (AI)-based decision-making model for predicting LNM in early CRC.11,12 These models demonstrated superior accuracy in predicting LNM in early CRC compared to the current guidelines. However, most of these models were created based on the patient’s clinical and pathological data, excluding endoscopic findings.

In this study, we aimed to develop an AI-based LNM predictive model that focuses on both the histological features and the endoscopic characteristics of early CRC. We also evaluated whether the new AI model demonstrated good accuracy in LNM prediction compared to the JSCCR guidelines.

MATERIALS AND METHODS

1. Study design and population

This was a retrospective single-center study with the development and cross-validation of an AI model. In this study, patients who underwent surgical resection of T1 CRC, with or without prior endoscopic resection at Asan Medical Center, Seoul, Korea, from 2011 to 2018, were included. Furthermore, patients who had undergone endoscopic resection at other institutions followed by surgery at Asan Medical Center were also included. Among these, the following patients were excluded: (1) those with synchronous CRC; (2) those with non-radical surgery; and (3) those with missing values, such as poor-quality endoscopic images. We gathered demographic variables and clinical information, including age, sex, body mass index, family history of CRC, medication, smoking and alcohol history, laboratory data (fasting blood glucose, total cholesterol, and serum carcinoembryonic antigen levels), pathological findings of the surgical specimen, and endoscopic images.

2. Pathological findings

The pathological findings of surgical specimens were evaluated by board-certified pathologists in our institution. The following pathological factors were included in developing an AI model to predict LNM for T1 CRC: (1) tumor histology, (2) lymphovascular invasion, (3) tumor budding, and (4) invasion depth of submucosa (SM). We defined deep SM invasion as the absolute depth of SM invasion >1,000 μm, Haggitt level 4, or Kikuchi SM level 2−3.13,14

3. Endoscopic findings

The endoscopic images were evaluated independently by two board-certified, experienced endoscopists (S.W.H. and J.S.B.). All the analyzed images were conventional white-light endoscopy images. The typical endoscopic findings suggesting deep SM invasion included in the analysis were as follows: (1) ulceration/depression, (2) expansion, (3) white spots, and (4) hardness.15 In instances of disagreement between the two endoscopists, a consensus was reached through team discussion. Other endoscopic findings, such as tumor size, location (colon or rectum), morphology (pedunculated or non-pedunculated), and pit pattern by Kudo classification, were also analyzed.16

4. Outcome definitions

The primary outcome of this study was the clinical validation following the development of an AI model for predicting LNM in T1 CRC. Stratified 5-fold cross-validation was used to estimate how the model would perform on unseen data. Receiver operating characteristics curves were plotted and the area under the receiver operating characteristics curve (AUROC) was measured to compare our model’s discriminating power with that of the Japanese guidelines.
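As an illustration of what stratification means here, the sketch below deals the sample indices of each class round-robin into five folds so that every fold keeps roughly the cohort's LNM prevalence. It is a simplified stand-in written for this explanation, not the procedure used in the study; the function name and seed are assumptions.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k=5, seed=0):
    """Assign each sample index to one of k folds, keeping class ratios similar."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal this class round-robin so every fold receives its share.
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds


# Toy labels sized like the study cohort: 173 LNM-positive among 1,386 patients.
labels = [1] * 173 + [0] * 1213
folds = stratified_folds(labels)
for i, test_idx in enumerate(folds):
    positives = sum(labels[j] for j in test_idx)
    print(f"fold {i}: n={len(test_idx)}, LNM-positive={positives}")
```

Each fold serves once as the held-out set while the remaining four are used for training, so no patient appears in both sets within a given split.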

5. Subanalysis

We compared the diagnostic performance according to the tumor location (colon vs rectum) because rectal cancer was reported to have a higher risk of LNM as compared to colon cancer.17 Furthermore, as the initial endoscopic resection group serves as an actual prediction target for our model in clinical practice, we assessed the diagnostic performance within patients who underwent endoscopic resection followed by surgical resection.

6. AI model development

The predictive models were developed using four supervised AI algorithms: a regularized logistic regression classifier (RLRC), a random forest classifier (RFC), a CatBoost classifier (CBC), and a voting classifier (VC) that combined the three other models with a 2:1:4 ratio of votes. RLRC adds a penalty to the loss function to reduce the possibility of overfitting to the development dataset and is therefore expected to perform well on unseen data.18 The RFC method uses the diversity of its decision trees to improve classification performance while preventing overfitting to the development set, creating a model that also works well with unseen data.19 CBC exhibits outstanding performance among gradient-boosting algorithms, especially in tasks involving numerous categorical variables.20 VC tends to generate robust predictions, even in cases where one or two classifiers make errors. The classification probability of VC with well-calibrated classifiers also approaches the true probability.21
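One way to read the 2:1:4 vote ratio is as weights in a weighted average of the three models' predicted probabilities (soft voting). The sketch below assumes soft voting and assumes the weights apply to RLRC, RFC, and CBC in that order; neither detail is stated explicitly in the text, and the probabilities are invented for illustration.

```python
def weighted_soft_vote(probabilities, weights):
    """Weighted average of per-model predicted probabilities for one patient."""
    if len(probabilities) != len(weights):
        raise ValueError("one weight is needed per model")
    total = sum(weights)
    return sum(p * w for p, w in zip(probabilities, weights)) / total


# Assumed mapping of the 2:1:4 ratio to (RLRC, RFC, CBC).
WEIGHTS = (2, 1, 4)

# Hypothetical predicted LNM probabilities from RLRC, RFC, and CBC.
example = (0.20, 0.60, 0.10)
combined = weighted_soft_vote(example, WEIGHTS)
print(round(combined, 3))
```

With these weights the outlying 0.60 from the lowest-weighted model moves the combined probability only slightly, which illustrates why an ensemble can stay stable when one classifier errs.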

We utilized the previously mentioned four pathological and four endoscopic findings for developing the prediction model: tumor histology, lymphovascular invasion, tumor budding, depth of SM invasion, hardness, expansion, white spot, and ulceration/depression. The best-performing model among the four algorithms was chosen as the final model for subanalysis.

7. Statistical analysis

Continuous variables were described as mean±standard deviation or median (interquartile range), while categorical variables were expressed as frequency and percentage. Mann-Whitney U tests were employed to compare continuous variables, and chi-square or Fisher exact tests were utilized for comparing categorical variables. AUROC was compared using DeLong’s test.22 All statistical tests were two-sided, and a p-value <0.05 was considered statistically significant. Missing values were imputed using multivariate imputation by chained equations.23 All analyses were performed using R software (version 4.0.2; R Foundation for Statistical Computing, Vienna, Austria).
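For a concrete sense of the categorical comparisons, the sketch below applies a Pearson chi-square test to the lymphovascular invasion counts reported in Table 2. It omits the continuity correction and is written in Python rather than R, so it is an approximation of the kind of test described and not a reproduction of the authors' analysis.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    observed = [[a, b], [c, d]]
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    statistic = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed[i][j] - expected) ** 2 / expected
    # With 1 degree of freedom, the upper-tail probability is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Rows: patients with LNM, patients without LNM.
# Columns: lymphovascular invasion positive, negative (counts from Table 2).
stat, p = chi_square_2x2(76, 97, 222, 991)
print(f"chi-square = {stat:.1f}, p < 0.001: {p < 0.001}")
```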

8. Ethical statements

This study was approved by the Institutional Review Board at Asan Medical Center (IRB number: 2020-1676). The study was conducted in accordance with the Declaration of Helsinki. Because of the retrospective nature of the study, the requirement for patient consent was waived.

RESULTS

1. Baseline characteristics of the patients

We initially identified 1,635 patients with T1 CRC who underwent surgical resection at Asan Medical Center. After applying the exclusion criteria, 1,386 patients were selected as the study population, of whom 173 (12.5%) had LNM (Fig. 1). Baseline characteristics according to LNM status are shown in Table 1. In the LNM group, 64 patients (37.0%) underwent endoscopic plus surgical resection, and 109 patients (63.0%) underwent initial surgical resection. Endoscopic and pathological features of T1 CRC patients with and without LNM are summarized in Table 2. We used four pathological variables (tumor histology, lymphovascular invasion, tumor budding, and depth of invasion) and three endoscopic variables (hardness, white spot, and ulceration/depression) that exhibited statistically significant differences between the two groups to develop the prediction model for LNM. Although not statistically significant, another endoscopic factor, expansion, was included in the development of the prediction model based on its clinical significance in previous studies.15

Figure 1. Flowchart of the included patients. CRC, colorectal cancer; LNM, lymph node metastasis.

Table 1. Baseline Characteristics

| Variable | Patients with LNM (n=173) | Patients without LNM (n=1,213) | p-value |
|---|---|---|---|
| Age, yr | 59.9±10.3 | 60.7±10.5 | 0.375 |
| Sex | | | 0.301 |
| Male | 98 (56.6) | 741 (61.1) | |
| Female | 75 (43.4) | 472 (38.9) | |
| BMI, kg/m2 | 24.3 (22.6–26.2) | 24.3 (22.3–26.3) | 0.651 |
| Family history of CRC | | | 0.754 |
| No | 159 (91.9) | 1,102 (90.8) | |
| Yes | 14 (8.1) | 111 (9.2) | |
| Use of medications | | | |
| Aspirin | 11 (6.36) | 158 (13.0) | 0.017 |
| Statin | 13 (7.51) | 150 (12.4) | 0.084 |
| Alcohol consumption | | | 0.350 |
| Current or ex-drinker | 85 (49.1) | 646 (53.3) | |
| Nonuser | 88 (50.9) | 567 (46.7) | |
| Smoking history | | | 0.011 |
| Never | 113 (65.3) | 665 (54.8) | |
| Ex-smoker | 53 (30.6) | 434 (35.8) | |
| Current smoker | 7 (4.05) | 114 (9.4) | |
| Fasting blood glucose, mg/dL | 127 (104–154) | 130 (107–157) | 0.495 |
| Total cholesterol, mg/dL | 162 (140–186) | 165 (140–189) | 0.469 |
| Serum CEA level, mg/dL | 1.4 (0.91–2.0) | 1.5 (0.99–2.1) | 0.348 |
| Location of T1 cancer | | | 0.458 |
| Colon | 121 (69.9) | 810 (66.8) | |
| Rectum | 52 (30.1) | 403 (33.2) | |
| Treatment modality | | | 0.005 |
| Surgery only | 109 (63.0) | 623 (51.4) | |
| Endoscopic resection followed by surgery | 64 (37.0) | 590 (48.6) | |

Data are presented as mean±SD, number (%), or median (interquartile range).

LNM, lymph node metastasis; BMI, body mass index; CRC, colorectal cancer; CEA, carcinoembryonic antigen.



Table 2. Endoscopic and Pathological Features

| Variable | Patients with LNM (n=173) | Patients without LNM (n=1,213) | p-value |
|---|---|---|---|
| Tumor size, mm | 18 (13–24) | 18 (12–25) | 0.938 |
| Morphology | | | 0.809 |
| Nonpedunculated | 125 (72.3) | 891 (73.5) | |
| Pedunculated | 48 (27.7) | 322 (26.5) | |
| Hardness | 152 (87.9) | 925 (76.3) | <0.001 |
| Expansion | 99 (57.2) | 685 (56.5) | 0.925 |
| White spot | 86 (49.7) | 466 (38.4) | 0.006 |
| Ulceration/depression | 101 (58.4) | 543 (44.8) | 0.001 |
| Tumor histology | | | <0.001 |
| WD | 35 (20.2) | 474 (39.1) | |
| MD | 122 (70.5) | 702 (57.9) | |
| PD | 9 (5.2) | 21 (1.73) | |
| SRCC | 3 (1.7) | 1 (0.1) | |
| Mucinous adenocarcinoma | 3 (1.7) | 12 (1.0) | |
| Others | 1 (0.6) | 3 (0.2) | |
| Tumor budding | | | 0.031 |
| Grade 1 | 126 (72.8) | 976 (80.5) | |
| Grade 2 or 3 | 44 (25.4) | 222 (18.3) | |
| Depth of SM invasion | | | 0.01 |
| Deep | 136 (78.6) | 833 (68.7) | |
| Superficial | 37 (21.4) | 380 (31.3) | |
| Lymphovascular invasion | | | <0.001 |
| Negative | 97 (56.1) | 991 (81.7) | |
| Positive | 76 (43.9) | 222 (18.3) | |

Data are presented as median (interquartile range) or number (%).

LNM, lymph node metastasis; WD, well differentiated; MD, moderately differentiated; PD, poorly differentiated; SRCC, signet-ring cell carcinoma; SM, submucosa.



2. Clinical validation of the developed model

The confusion matrix obtained through the 5-fold cross-validation is presented in Supplementary Fig. 1. The AUROCs of the RLRC, RFC, CBC, and VC models for LNM prediction were 0.673 (95% confidence interval [CI], 0.574 to 0.772), 0.639 (95% CI, 0.542 to 0.736), 0.679 (95% CI, 0.582 to 0.775), and 0.677 (95% CI, 0.580 to 0.774), respectively. All models demonstrated superior performance compared to the AUROC of the JSCCR guidelines (0.525; 95% CI, 0.490 to 0.560) (p<0.001 between RLRC and JSCCR, p=0.001 between RFC and JSCCR, p<0.001 between CBC and JSCCR, and p<0.001 between VC and JSCCR) (Fig. 2). Other performance indicators, such as accuracy, sensitivity, specificity, and positive and negative predictive values, are presented in Supplementary Table 1. Lymphovascular invasion was the most influential factor for LNM in the SHAP beeswarm plot (Supplementary Fig. 2).
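To help interpret the guideline AUROC, note that a rule producing only a yes/no output has a single operating point on the ROC curve. Assuming the JSCCR criteria were evaluated in this binary way (an assumption; the text does not describe how the guideline AUROC was computed), the area under the two-segment ROC curve through (0,0), (FPR, TPR), and (1,1) reduces to the mean of sensitivity and specificity:

```latex
\mathrm{AUROC}_{\text{binary}}
  = \tfrac{1}{2}\,\mathrm{FPR}\cdot\mathrm{TPR}
  + \tfrac{1}{2}\,(1-\mathrm{FPR})\,(1+\mathrm{TPR})
  = \frac{\mathrm{TPR} + (1-\mathrm{FPR})}{2}
  = \frac{\text{sensitivity} + \text{specificity}}{2}
```

Under that assumption, an AUROC of 0.525 would correspond to sensitivity and specificity summing to about 1.05, that is, discrimination only slightly better than chance.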

Figure 2. Receiver operating characteristics (ROC) curves for the artificial intelligence models and Japanese guidelines in predicting lymph node metastasis. AUROC, area under the ROC curve; CI, confidence interval; JpnGL, Japanese guideline; RLRC, regularized logistic regression classifier; RFC, random forest classifier; CBC, CatBoost classifier; VC, voting classifier.

3. Subanalysis according to the tumor location and treatment modality

We used the CBC model, which showed the best performance, for subgroup analysis. In the subgroups of T1 colon and rectal cancer, the AUROC for LNM prediction was 0.718 (95% CI, 0.600 to 0.836) and 0.615 (95% CI, 0.498 to 0.731), respectively (p=0.700) (Fig. 3). Moreover, there was no significant difference in the AUROC for LNM prediction between the patients who underwent initial endoscopic resection followed by surgery and those who underwent initial surgery (0.581 [95% CI, 0.473 to 0.688] vs 0.746 [95% CI, 0.636 to 0.855], p=0.845) (Fig. 4).

Figure 3. Receiver operating characteristics (ROC) curves for the CatBoost classifier model in the colon and rectum subgroups in predicting lymph node metastasis. AUROC, area under the ROC curve; CI, confidence interval.

Figure 4. Receiver operating characteristics (ROC) curves for the CatBoost classifier model in the initial endoscopic resection and initial surgery subgroups in predicting lymph node metastasis. AUROC, area under the ROC curve; CI, confidence interval.

DISCUSSION

Accurate prediction of the presence of LNM in patients who have undergone endoscopic resection of T1 CRC is a clinically important issue for deciding on additional surgery. Although current guidelines recommend additional surgery if the endoscopic resection specimen exhibits unfavorable pathological features, 85.7% to 91.2% of the patients who underwent surgery showed no LNM in the surgical resection specimen.6-9,11,24 In this study, we found that AI models incorporating both endoscopic and pathological findings of T1 CRC were superior to the JSCCR guidelines in predicting LNM.

Several studies have attempted to accurately predict LNM in T1 CRC using AI algorithms. A Japanese study developed an AI model using the patient’s age, gender, tumor size, location, morphology, lymphovascular invasion, and histological grade.12 This model demonstrated superior predictive accuracy compared to decisions based on the current U.S. guidelines.12 In another study, a prediction model was proposed based on five histological findings, demonstrating a high predictive capability for LNM.25 The JSCCR guidelines are the most commonly used and currently recommended for decision-making in CRC. Nevertheless, our study demonstrated that AI models can predict LNM in T1 CRC more accurately than the JSCCR guidelines. An important aspect of our AI models is that the algorithms incorporate not only pathological features but also endoscopic findings. This feature distinguishes our models from those of previous studies. While we did not directly compare an AI model trained with endoscopic findings to one without, we suggest that AI models benefiting from diverse training data may exhibit superior performance compared to those with limited training data. However, such assumptions require further validation, as one study reported an AUROC of 0.83 based solely on histologic risk factors, which is higher than the AUROC of our models.25 It should be noted that that study developed a predictive model targeting the specific group of pedunculated T1 CRC, making direct comparison with our study, which included the entire T1 CRC population, difficult. Therefore, further research encompassing not only pathological data but also demographic and clinical data, such as medication history including aspirin, should be conducted to enhance the quality of AI models predicting LNM in T1 CRC. Additionally, a more comprehensive AI algorithm should be developed by including other endoscopic characteristics such as tumor size and location, in addition to the four endoscopic features (hardness, white spot, ulceration/depression, and expansion) used in our study.

In subgroup analysis, the predictive performance of our AI model for LNM did not differ significantly according to the location of T1 CRC. A previous study reported a significantly higher local recurrence rate in patients with T1 rectal cancer compared to those with T1 cancer in the colon proximal to the rectum when treated with only endoscopic resection.26 Furthermore, bowel dysfunction after radical surgery for rectal cancer significantly impacts the quality of life.27 Therefore, precise prediction of LNM in T1 rectal cancer is crucial. The performance of our AI model in the subgroup of T1 rectal cancer was similar to its performance in T1 colon cancer (Fig. 3). Thus, the AI model can be used for predicting LNM not only in T1 colon cancer but also in T1 rectal cancer, where precise LNM prediction is critical.

Reliable prediction of LNM in T1 CRC is necessary for determining the need for additional surgery after initial endoscopic resection in daily clinical practice. In our study, the AI model exhibited no significant difference in the performance of LNM prediction in T1 CRC between the initial endoscopic resection and initial surgery groups. Hence, we propose that the AI model may be readily applicable to patients who initially underwent endoscopic resection of T1 CRC to determine the necessity of additional surgery with greater confidence compared to the JSCCR guidelines. Nevertheless, although the difference was not statistically significant, the AUROC of 0.581 for the initial endoscopic resection group was numerically lower than the 0.746 for the initial surgery group. Additionally, an AUROC of 0.581 indicates that the AI algorithm is insufficient for practical clinical application. Therefore, efforts are needed to improve performance by using larger training data and carefully considering reliable risk features, aiming to develop an AI algorithm that can be practically applied in determining the need for additional surgery in the initial endoscopic resection group.

We developed and evaluated four AI models. Among these, the CBC model showed the best performance. The CBC model has its advantage in tasks with many categorical variables.20 The data of our study consisted mostly of categorical variables. Hence, in our study, the CBC model might outperform other AI models. Nevertheless, given the complexity of LNM occurrence in T1 CRC, additional efforts should be undertaken to explore the possibility of combining and employing ensembles of various AI models to improve the predictability of LNM in T1 CRC.

Our study has several limitations. First, it is a single-center, retrospective cohort study. Second, we did not train the AI algorithms directly with the colonoscopy images but employed the interpretation of two endoscopists. Hence, there might be interobserver or intraobserver bias. Finally, validation was performed only in a relatively small cohort without temporal validation, although separating the patient cohort into training and validation groups according to the sample collection period could have produced more robust results. To address this limitation, we performed stratified 5-fold cross-validation, ensuring that the patients in the training and validation sets did not overlap. Additionally, to further verify the performance of our models, we performed 1,000 bootstrap iterations, another reliable validation method.28 The results from bootstrapping were consistent with those from the 5-fold cross-validation (Supplementary Fig. 3), supporting the reliability of our AI models. Hence, we believe that our study holds significance, as it represents, to the best of our knowledge, the first clinical validation of AI models for predicting LNM incorporating not only pathological features but also endoscopic findings. Nevertheless, in this study, the diagnostic accuracy of the AI algorithms was modest because of the small sample size and number of events in the training data, and only internal validation was performed. Therefore, to improve the diagnostic performance and objectively establish the utility of our model, a further large-scale study with external validation is essential.
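The bootstrap check mentioned above can be sketched as follows. The example resamples patients with replacement 1,000 times and reports a percentile interval for the AUROC; the percentile method, the seed, and the synthetic data are assumptions made for illustration and do not reproduce the study's analysis.

```python
import random


def auroc(labels, scores):
    """AUROC as the probability that a positive outranks a negative (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_iter=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the AUROC."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_iter:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # AUROC is undefined unless both classes are present
        stats.append(auroc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[round((alpha / 2) * n_iter)]
    upper = stats[round((1 - alpha / 2) * n_iter) - 1]
    return lower, upper


# Synthetic data for illustration only: 30 positives, 170 negatives,
# with scores that are only weakly informative.
rng = random.Random(42)
labels = [1] * 30 + [0] * 170
scores = [rng.gauss(0.5 if y else 0.0, 1.0) for y in labels]
point = auroc(labels, scores)
low, high = bootstrap_ci(labels, scores)
print(f"AUROC = {point:.3f}, 95% CI {low:.3f} to {high:.3f}")
```

With few positive cases the interval is wide, which mirrors the broad confidence intervals reported for the models in the Results.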

In conclusion, our AI models outperformed the current JSCCR guidelines for predicting LNM, suggesting the applicability of AI models in clinical practice to more accurately identify patients who require additional surgery after the initial endoscopic resection of T1 CRC. Further, larger studies are warranted to enhance the performance of AI models for predicting LNM in T1 CRC by developing deep learning models like convolutional neural networks or transformers. Additionally, integrating new AI technologies such as large language models and large multi-modal models with electronic health record systems, high-resolution endoscopic and pathologic image data could improve clinical decision-making regarding the necessity of additional surgery after endoscopic resection of T1 CRC. These advancements in AI technology could significantly impact the future of CRC management.

ACKNOWLEDGEMENTS

This research was supported by a grant of the Korea Health Technology R&D Project through the Korea Health Industry Development Institute (KHIDI), funded by the Ministry of Health & Welfare, Republic of Korea (HR20C0026).

CONFLICTS OF INTEREST

No potential conflict of interest relevant to this article was reported.

AUTHOR CONTRIBUTIONS

Study concept and design: S.W.H., N.K., J.S.B. Data acquisition: J.E.B., S.W.H. Data analysis and interpretation: J.E.B., H.Y., S.W.H. Drafting of the manuscript: J.E.B., H.Y., S.W.H., S.S., J.S.B. Critical revision of the manuscript for important intellectual content: J.E.B., H.Y., S.W.H., S.S., J.Y.L., S.W.H., S.H.P., D.H.Y., B.D.Y., S.J.M., S.K.Y., N.K., J.S.B. Statistical analysis: J.E.B., H.Y., S.W.H. Obtained funding: N.K., J.S.B. Administrative, technical, or material support; study supervision: J.E.B., H.Y., S.W.H., S.S., J.Y.L., S.W.H., S.H.P., D.H.Y., B.D.Y., S.J.M., S.K.Y., N.K., J.S.B. Approval of final manuscript: all authors.

DATA AVAILABILITY STATEMENT

The data that support the findings of this study are available from the corresponding author upon reasonable request.

SUPPLEMENTARY MATERIALS

Supplementary materials can be accessed at https://doi.org/10.5009/gnl240273.

Gut and Liver

Vol.19 No.1
January, 2025

pISSN 1976-2283
eISSN 2005-1212


Editorial Office