Purpose
This study aims to develop and evaluate a multimodal, knowledge graph–guided retrieval-augmented generation (RAG) framework for clinical decision support in pediatric acute leukemia.
Materials and Methods
Authoritative pediatric hematology-oncology textbooks were decomposed into text, tables, and figures. Visual and tabular elements were converted into structured textual descriptions using a multimodal large language model (LLM). A biomedical knowledge graph was constructed using LightRAG with gpt-oss-20b and Qwen3 embeddings. System performance was evaluated using 10 clinical questions, with responses generated by the RAG system and GPT-4.5. Nine medical experts (4 pediatric hematology-oncology specialists, 3 nurse specialists, and 2 medical students) conducted blind evaluations, complemented by two LLM evaluators (Claude Sonnet 4.5 and Gemini 3).
Results
The knowledge graph comprised 10,062 nodes and 15,876 edges. In expert evaluation, RAG was preferred in 47.8% of 90 paired comparisons versus 35.6% for GPT-4.5, with higher completeness scores (3.84 vs 3.51, p = 0.016). RAG showed a significant advantage for the ETP-ALL immunophenotype definition (p = 0.016). LLM-based evaluation consistently favored RAG: Claude Sonnet 4.5 preferred RAG in 6 of 10 questions, and Gemini 3 in 9 of 10 (Fast mode) and 7 of 10 (Thinking mode).
Conclusion
Multimodal graph-based RAG is feasible for clinical decision support in pediatric leukemia. RAG showed strengths complementary to those of foundation LLMs, adding value for questions that depend on specific evidence. Unlike LLMs, whose knowledge is fixed at training time, RAG can incorporate updated guidelines and protocols without model retraining, which is particularly relevant in rapidly evolving fields. Further validation regarding privacy and regulatory issues is required before clinical deployment.
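The paired preference shares reported in the Results can be turned back into counts and checked with a simple test. The sketch below is illustrative only: the counts are inferred by rounding the reported percentages, the exact sign test is a choice made here (the abstract does not name the test used for preferences), and the function name `sign_test_two_sided` is hypothetical. The 90 comparisons come from 9 raters answering 10 questions, so they are not independent and the p-value should not be read as the study's result.

```python
from math import comb


def sign_test_two_sided(wins_a: int, wins_b: int) -> float:
    """Exact two-sided sign test p-value for paired preferences, ignoring ties."""
    n = wins_a + wins_b
    k = max(wins_a, wins_b)
    # Upper tail P(X >= k) under Binomial(n, 0.5), doubled and capped at 1.
    tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * tail)


total = 90                            # paired comparisons reported in the abstract
rag_wins = round(0.478 * total)       # inferred from the reported 47.8%
gpt_wins = round(0.356 * total)       # inferred from the reported 35.6%
ties = total - rag_wins - gpt_wins    # comparisons with no stated preference

p_value = sign_test_two_sided(rag_wins, gpt_wins)
print(f"RAG={rag_wins}, GPT-4.5={gpt_wins}, ties={ties}, sign-test p={p_value:.3f}")
```

Dropping ties before testing is the usual convention for a sign test; a model that accounts for rater and question clustering would be needed for a defensible inference.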
Purpose
Acute myeloid leukemia (AML) shows significant heterogeneity in therapeutic response. We aimed to develop a gene signature for the stratification of high-risk pediatric AML using publicly available AML datasets, with a focus on literature-based prognostic gene sets.
Materials and Methods
We identified 300 genes from 12 well-validated studies on AML-related gene signatures. Clinical and gene expression data were obtained from three datasets: TCGA-LAML, TARGET-AML, and BeatAML. Least absolute shrinkage and selection operator (LASSO)-Cox regression analysis was used to perform the initial gene selection and to construct a prognostic model using The Cancer Genome Atlas (TCGA) database (n=132). The final gene signature was validated with two independent cohorts: BeatAML (n=411) and TARGET-AML (n=187).
Results
We identified a six-gene signature (ETFB, ARL6IP5, PTP4A3, CSK, HS3ST3B1, PLA2G4A), referred to as the literature-based signature 6 (LBS6), that was significantly associated with poorer overall survival across the TCGA (hazard ratio [HR], 4.2; 95% confidence interval [CI], 2.59 to 6.81; p < 0.001), BeatAML (HR, 1.52; 95% CI, 1.17 to 1.96; p=0.001), and TARGET (HR, 2.05; 95% CI, 1.36 to 3.08; p < 0.001) datasets. The high-LBS6 score group exhibited significantly poorer five-year event-free survival than the low-LBS6 score group (HR, 2.09; 95% CI, 1.38 to 3.15; p < 0.001). After adjusting for key risk factors, including gene mutations (WT1, FLT3, and NPM1), protocol-based risk group, white blood cell count, and age, the LBS6 score remained independently associated with worse survival in the validation cohorts.
Conclusion
Our literature-driven approach identified a robust gene signature that stratifies AML patients into distinct risk groups. The LBS6 score shows promise in refining initial risk stratification and identifying high-risk AML patients.
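The hazard ratios and confidence intervals reported above can be cross-checked against their p-values. The following is a minimal sketch, assuming the intervals are Wald intervals that are symmetric on the log scale (the abstract does not say how they were computed); the helper name `wald_from_ci` is hypothetical.

```python
from math import erfc, log, sqrt


def wald_from_ci(hr: float, lower: float, upper: float, z_crit: float = 1.959964):
    """Recover log(HR), its standard error, the Wald z, and a two-sided p-value."""
    log_hr = log(hr)
    # A 95% Wald interval spans 2 * z_crit standard errors on the log scale.
    se = (log(upper) - log(lower)) / (2 * z_crit)
    z = log_hr / se
    p = erfc(abs(z) / sqrt(2))  # two-sided p-value from the standard normal
    return log_hr, se, z, p


# HR and 95% CI for overall survival as reported in the abstract.
reported = {
    "TCGA": (4.2, 2.59, 6.81),
    "BeatAML": (1.52, 1.17, 1.96),
    "TARGET": (2.05, 1.36, 3.08),
}

results = {name: wald_from_ci(*values) for name, values in reported.items()}
for name, (log_hr, se, z, p) in results.items():
    print(f"{name}: log(HR)={log_hr:.3f}, SE={se:.3f}, z={z:.2f}, p={p:.4f}")
```

Small differences from the published p-values are expected, because the inputs are rounded to two decimals and the original analysis may have used a different test.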
Jungnam Joo, Kyong-Ah Yoon, Tomonori Hayashi, Sun-Young Kong, Hye-Jin Shin, Boram Park, Young Min Kim, Sang-Hyun Hwang, Jeongseon Kim, Aesun Shin, Joo-Young Kim
Cancer Res Treat. 2016;48(2):708-714. Published online June 22, 2015
Purpose
Defects in the DNA damage repair process can cause genomic instability and play an important role in cervical carcinogenesis. The purpose of this study was to analyze the association of 29 candidate single nucleotide polymorphisms (SNPs) in genes in the DNA repair pathway, TP53, and TP53BP1 with the risk of cervical cancer.
Materials and Methods
Twenty-nine SNPs in four genes in the DNA repair pathway (ERCC2, ERCC5, NBS1, and XRCC1), TP53, and TP53BP1 were genotyped for 478 cervical cancer patients and 922 healthy control subjects, and their effects on cervical carcinogenesis were analyzed.
Results
The most significant association was found for rs17655 in ERCC5, with an age-adjusted p-value < 0.0001, for which a strong additive effect of the risk allele C was observed (odds ratio, 2.01 for CC to GG). On the other hand, another significant polymorphism rs454421 in ERCC2 showed a dominant effect (odds ratio, 1.68 for GA+AA to GG) with an age-adjusted p-value of 0.0009. The association of these polymorphisms remained significant regardless of the age of onset. The significant result for rs17655 was also consistent for subgroups of patients defined by histology and human papillomavirus (HPV) types. However, for rs454421, the association was observed only in patients with squamous cell carcinoma and non-HPV 18 type.
Conclusion
The results of this study show a novel association between cervical cancer and genes involved in the nucleotide excision repair pathway in the Korean population.
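The additive effect reported for rs17655 can be made concrete with a short calculation. This sketch assumes that "additive" means additive on the log-odds scale, so each copy of the risk allele multiplies the odds by the same factor. The heterozygote value it prints is implied by that assumption and is not a figure reported in the abstract; the function name is hypothetical.

```python
from math import sqrt


def genotype_odds_ratios(per_allele_or: float) -> dict:
    """Odds ratios versus the reference genotype for 0, 1, and 2 risk alleles."""
    # Log-additive model: the odds scale by per_allele_or for every risk allele.
    return {copies: per_allele_or ** copies for copies in (0, 1, 2)}


or_cc_vs_gg = 2.01                 # reported odds ratio, CC versus GG
per_allele = sqrt(or_cc_vs_gg)     # implied per-allele odds ratio (assumption)
implied = genotype_odds_ratios(per_allele)

print(f"per-allele OR = {per_allele:.2f}")
for copies, label in ((0, "GG"), (1, "GC"), (2, "CC")):
    print(f"{label}: OR = {implied[copies]:.2f}")
```

By contrast, the dominant effect reported for rs454421 pools carriers of one or two copies (GA+AA) into a single group compared with GG, so no per-allele scaling applies there.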
Purpose
In some countries with high smoking prevalence, smoke-free legislation has only been implemented in specific public places, as opposed to a comprehensive ban on smoking in all public places. The purpose of this study was to provide valid data on second-hand smoke (SHS) exposure that reflect the consequences of incomplete smoke-free legislation, and provide a rationale for expanding this legislation.
Materials and Methods
Indoor and outdoor environmental exposure (fine particulate matter [PM2.5], air nicotine, and dust 4-(methylnitrosamino)-1-(3-pyridyl)-1-butanone [NNK]) was monitored in 35 public places where smoking is prohibited by law in Goyang, Republic of Korea. Biomarkers of SHS exposure (urinary cotinine, hair nicotine, and urinary 4-(methylnitrosamino)-1-(3-pyridyl)-1-butanol) were measured in 37 non-smoking employees. Geometric means and standard deviations were used to compare each measure.
Results
Considerable exposure to SHS was detected at all indoor monitoring sites (PM2.5, 95.5 μg/m³ in private educational institutions; air nicotine, 0.77 μg/m³ in large buildings; and dust NNK, 160.3 pg/mg in large buildings); environmental measures were higher in private or closed locations, such as restrooms. Outdoor measures of SHS exposure were lowest in nurseries and highest in government buildings. Biochemical measures revealed a pattern of SHS exposure by monitoring site, and were highest in private educational institutions.
Conclusion
The evidence of SHS exposure in legally smoke-free places in Korea suggests that incomplete smoke-free legislation and a lack of enforcement might not protect people from exposure to smoke. Therefore, active steps should be taken toward a comprehensive ban on smoking in all public places and its enforcement.
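The Methods summarize each exposure measure with geometric means and standard deviations, which suit right-skewed concentration data. A minimal sketch of that summary follows; the sample values are hypothetical placeholders and not data from the study, and the function name is an assumption made for illustration.

```python
from math import exp, log
from statistics import geometric_mean, stdev


def geometric_summary(values):
    """Return (geometric mean, geometric standard deviation) of positive values."""
    if any(v <= 0 for v in values):
        raise ValueError("geometric statistics require strictly positive values")
    gm = geometric_mean(values)
    # The geometric SD is the exponential of the sample SD of the log values.
    gsd = exp(stdev([log(v) for v in values]))
    return gm, gsd


# Hypothetical PM2.5 readings in μg/m³, for illustration only.
pm25_readings = [42.0, 60.5, 88.0, 95.5, 130.2]
gm, gsd = geometric_summary(pm25_readings)
print(f"geometric mean = {gm:.1f} μg/m³, geometric SD = {gsd:.2f}")
```

The geometric SD is a unitless multiplicative factor: under a log-normal assumption, roughly two thirds of readings fall between the geometric mean divided by it and the geometric mean multiplied by it.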