Integrating machine learning and single-cell analysis to uncover lung adenocarcinoma progression and prognostic biomarkers

J Cell Mol Med. 2024 Jul;28(13):e18516. doi: 10.1111/jcmm.18516.

Abstract

The progression of lung adenocarcinoma (LUAD) from atypical adenomatous hyperplasia (AAH) to invasive adenocarcinoma (IAC) involves a complex evolution of tumour cell clusters, the mechanisms of which remain largely unknown. By integrating single-cell datasets and using inferCNV, we identified and analysed tumour cell clusters to explore their heterogeneity and changes in abundance throughout LUAD progression. We applied gene set variation analysis (GSVA), pseudotime analysis, scMetabolism and CytoTRACE scores to study biological functions, metabolic profiles and stemness traits. A prognostic model based on key cluster marker genes was developed using CoxBoost and plsRcox (CPM) and validated across multiple cohorts for prognostic prediction, tumour microenvironment characterization, mutation landscape and immunotherapy response. We identified nine distinct tumour cell clusters; Cluster 6 showed features of an early developmental stage, with high stemness and proliferative potential. The abundance of Clusters 0 and 6 increased from AAH to IAC and correlated with prognosis. The CPM model effectively distinguished prognostic groups in immunotherapy cohorts and predicted genomic alterations, chemotherapy drug sensitivity and immunotherapy responsiveness. S100A16, a key gene in the CPM model, was validated as an oncogene that enhances LUAD cell proliferation, invasion and migration. The CPM model emerges as a novel biomarker for predicting prognosis and immunotherapy response in LUAD patients, with S100A16 identified as a potential therapeutic target.
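
As a rough illustration of how a gene-based prognostic score of this kind is typically applied, the Python sketch below computes a linear risk score per patient, splits a cohort at the median score, and evaluates ranking quality with Harrell's concordance index. It is not the authors' pipeline: the abstract names CoxBoost and plsRcox (both R packages) but does not give the fitted model, the cut-off rule, or the evaluation metric, so the median split and the C-index are assumptions. Only S100A16 is named in the abstract; `GENE_B`, `GENE_C`, all coefficients, and the four-patient cohort are made-up placeholders.

```python
from statistics import median

# Hypothetical coefficients. Only S100A16 appears in the abstract; the other
# gene names and every weight here are invented for illustration.
COEFS = {"S100A16": 0.8, "GENE_B": -0.5, "GENE_C": 0.3}


def risk_score(expr, coefs=COEFS):
    """Linear predictor: sum of coefficient * expression over the model genes."""
    return sum(weight * expr[gene] for gene, weight in coefs.items())


def split_by_median(scores):
    """Label each patient 'high' or 'low' risk relative to the cohort median."""
    cut = median(scores)
    return ["high" if s > cut else "low" for s in scores]


def harrell_c(times, events, scores):
    """Harrell's concordance index for right-censored survival data.

    A pair (i, j) is comparable when patient i has the shorter follow-up time
    and experienced the event. The pair is concordant when patient i also has
    the higher risk score; tied scores count as half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if times[i] < times[j] and events[i]:
                comparable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs in this cohort")
    return concordant / comparable


# Toy cohort (fabricated values, for demonstration only).
patients = [
    {"S100A16": 2.0, "GENE_B": 1.0, "GENE_C": 1.0},
    {"S100A16": 1.0, "GENE_B": 2.0, "GENE_C": 0.0},
    {"S100A16": 3.0, "GENE_B": 0.0, "GENE_C": 2.0},
    {"S100A16": 0.5, "GENE_B": 1.0, "GENE_C": 1.0},
]
times = [12, 40, 5, 30]   # follow-up in months
events = [1, 0, 1, 1]     # 1 = death observed, 0 = censored

scores = [risk_score(p) for p in patients]
groups = split_by_median(scores)
c_index = harrell_c(times, events, scores)
```

A C-index of 0.5 corresponds to random ranking and 1.0 to perfect ranking of survival times by risk score; in practice the high- and low-risk groups would also be compared with Kaplan-Meier curves and a log-rank test.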

Keywords: S100A16; immunotherapy response; lung adenocarcinoma; machine learning; single‐cell analysis.

MeSH terms

  • Adenocarcinoma of Lung* / genetics
  • Adenocarcinoma of Lung* / pathology
  • Biomarkers, Tumor* / genetics
  • Biomarkers, Tumor* / metabolism
  • Disease Progression*
  • Gene Expression Profiling
  • Gene Expression Regulation, Neoplastic
  • Humans
  • Immunotherapy / methods
  • Lung Neoplasms* / genetics
  • Lung Neoplasms* / pathology
  • Machine Learning*
  • Prognosis
  • Single-Cell Analysis* / methods
  • Tumor Microenvironment* / genetics

Substances

  • Biomarkers, Tumor