Risk Prediction Models for Gastric Cancer: A Scoping Review
Abstract
Background
Gastric cancer is a significant contributor to the global cancer burden. Risk prediction models aim to estimate future risk based on current and past information, and can be utilized for risk stratification in population screening programs for gastric cancer. This review aims to explore the research design of existing models, as well as the methods, variables, and performance of model construction.
Methods
Six databases were searched through to November 4, 2023 to identify appropriate studies. PRISMA extension for scoping reviews and the Arksey and O’Malley framework were followed. Data sources included PubMed, Embase, Web of Science, CNKI, Wanfang, and VIP, focusing on gastric cancer risk prediction model studies.
Results
A total of 29 articles met the inclusion criteria, from which 28 original risk prediction models were identified that met the analysis criteria. Data were extracted on study characteristics, predictor selection, model construction methods, and evaluation indicators. The area under the curve (AUC) of the models ranged from 0.560 to 0.989, while the C-statistics varied between 0.684 and 0.940. Most models included between 5 and 11 predictor variables. The most frequently included variables were age, Helicobacter pylori (Hp), precancerous lesions, pepsinogen (PG), sex, and smoking. Age and Hp were the most consistently included variables.
Conclusion
This review enhances understanding of current gastric cancer risk prediction research and its future directions. The findings provide a strong scientific basis and technical support for developing more accurate gastric cancer risk models. We expect that these conclusions will point the way for future research and clinical practice in this area to assist in the early prevention and treatment of gastric cancer.
Introduction
Gastric cancer is the fifth most common cancer worldwide and the third most common cause of cancer death.1 According to the latest global cancer burden statistics released by the International Agency for Research on Cancer (IARC),2 there were approximately 1,089,100 new cases of gastric cancer and 769,000 deaths worldwide in 2020, accounting for 5.64% and 7.69% of all new malignant tumor cases and deaths, respectively.
Given the large number of patients with gastric cancer, differences in treatment outcomes have become a focus of attention.3 In particular, treatment effectiveness differs markedly between early and advanced gastric cancer.4,5 According to the American Cancer Society, the 5-year survival rate of localized stomach cancer (cancer confined to the stomach) can be as high as 75%.6 At this stage the tumor has not spread to surrounding tissues or organs, early screening and surgical removal are more likely to succeed, and patient survival is significantly higher.7 In contrast, the 5-year survival rate of metastatic stomach cancer (cancer that has spread beyond the stomach to a distant part of the body) is only 7%.6 By this time the cancer has spread and treatment is much more difficult.8 Because the wall of the stomach, like the wall of the colon, is divided into five layers, early gastric cancer rarely metastasizes, and direct local resection of the lesion offers a good chance of recovery. Therefore, early screening is crucial for the prevention and treatment of gastric cancer, especially for people at higher risk of disease.4 Improving the detection rate of lesions through early screening can not only significantly improve the cure rate and reduce the difficulty and cost of treatment, but also reduce patient suffering and significantly improve quality of life.9
In order to detect gastric cancer early, screening has become an important means.10,11 Endoscopy and biopsy are considered the gold standard for diagnosing gastric cancer and are widely recommended for routine screening.12,13 However, due to its high cost, invasiveness, and high technical requirements, its widespread use is greatly limited, especially in countries with low incidence or limited medical resources.3,14 Therefore, there is an urgent need to develop more economical and convenient methods to effectively identify high-risk groups during follow-up endoscopy.15
The cost, invasiveness, and technical demands of endoscopy and biopsy limit their use most in regions with limited medical resources.16,17 These limitations have prompted researchers to explore new and more promising diagnostic technologies to improve early gastric cancer detection. Recently, research attention has increasingly turned to multi-omics analysis and machine learning methods, which have shown significant potential in the diagnosis of early gastric cancer.17–25 Multi-omics analyses, such as next-generation sequencing (NGS), metabolomics, and proteomics, have achieved significant breakthroughs in the medical field, providing direct microscopic evidence for understanding the heterogeneity of gastric cancer.26 Machine learning algorithms are also gaining attention in the analysis of complex datasets: they can enhance pattern recognition, improve the accuracy of risk stratification, and potentially offer more personalized and precise diagnostic strategies.27–29 These innovations open the door to earlier and more accurate gastric cancer detection, possibly overcoming the limitations of traditional methods.
The goal of risk prediction models is to estimate future risk based on current and past information,30 that is, to predict the likelihood of an outcome before it happens, so that the estimate can be used for risk stratification in population screening programs. Their methodological advance lies in a deeper understanding of clinical problems, which represents a major change in how such problems are approached.31 Compared with traditional multivariable regression analysis, which stops at identifying independent influencing factors, a risk prediction model uses the identified factors to predict the probability of the outcome and can therefore guide clinical practice more directly.32 At the same time, given limited health resources, it is difficult to implement a broad preventive strategy for the whole population. Precise individual prevention through risk stratification is therefore not only more effective but also more cost-effective, especially in the prevention of chronic tumor diseases such as gastric cancer.
Gastric cancer risk prediction models, as quantitative tools for assessing risk and benefit, are becoming increasingly popular. A large number of studies have explored the risk factors of gastric cancer, such as age,1,5 gender,1,33 body mass index (BMI),34–36 smoking,37–40 drinking,41 Helicobacter pylori (Hp) infection,42,43 a history of gastric cancer in first-degree relatives,44,45 and dietary factors.46–50 These studies laid a foundation for the construction of gastric cancer risk prediction models. However, despite the potential of these models, their clinical application and impact in gastric cancer lag far behind other areas of medicine. For example, the risk prediction models constructed by Charvat et al51 and Iida et al52 rely mainly on internal validation and lack the necessary external validation. Although these models are based on long-term cohort studies with large sample sizes, relying only on internal validation limits the wide applicability and reliability of their results. In addition, the samples for these models are all drawn from a single country's population, and whether they can be applied to other ethnic groups needs further research. Consequently, only a few models have been applied in clinical practice.
At present, there is some uncertainty about which predictive models are available for people at risk of gastric cancer, which predictive variables they include, and how well they perform. In view of this, we conducted this scoping review.
Materials and Methods
A scoping review methodology was chosen given the broad scope of the review question.53 The scoping review was developed based on the five steps of the Arksey and O’Malley framework54 and the latest guidance from the Joanna Briggs Institute.55 The following activities were conducted: identifying the research question; identifying the relevant studies; selecting records; charting the data; and collating, summarizing, and reporting the results.
We reported according to the PRISMA-ScR (Preferred Reporting Items for Systematic Reviews and Meta-Analyses Extension for Scoping Reviews) recommendations.56
Bias Risk Assessment: Although this study is a scoping review, we incorporated a bias risk assessment to enhance research transparency and depth. The PROBAST tool was utilized to evaluate bias risks related to participant selection, predictors, outcome measurement, and analytical methods. This assessment helped identify patterns in research quality and provided additional insights into each study’s contribution to the overall analysis.
Identifying the Research Question
Our research questions were:
What are the types and performance of gastric cancer risk prediction models?
What are the construction method and samples of gastric cancer risk prediction models?
What predictors are included in gastric cancer risk prediction models?
Inclusion Criteria
The study population includes individuals aged ≥18 years.
The study relates to gastric cancer prevention or prediction.
The study presents a newly developed algorithm or risk prediction model for the general population.
The study is an original study designed to construct or validate a model, such as a cross-sectional, cohort, or case-control study.
Exclusion Criteria
Animal studies, reviews, protocols, and meta-analyses.
Models that included only a single predictor, test, or marker.
Studies whose main goal was a prognostic model.
Identifying Relevant Studies
An initial limited search of the peer-reviewed literature was conducted to identify studies reporting models for gastric cancer risk prediction. PubMed, Embase, Web of Science, the China National Knowledge Infrastructure (CNKI), the Wanfang database, and the Chinese Science and Technology Periodicals (VIP) database were then searched on November 4, 2023, to identify relevant studies. The search terms “gastric cancer”, “risk score”, and “tool” were combined using Medical Subject Headings (MeSH) and free words. The search was limited to original research published in Chinese and English. The reference lists of the included studies were manually searched as a supplement. The full electronic search strategy is contained in the Supplementary Materials.
Study Selection
All duplicates were removed using the EndNote 20 deduplication function. Two reviewers (LY and JX) independently performed the screening and full-text reviews. Disagreements were resolved by consensus with a third reviewer (XT).
Bias Risk Assessment
To enhance the transparency and depth of our review, we have incorporated an additional bias risk assessment. We employed the Prediction model Risk of Bias Assessment Tool (PROBAST)57 to evaluate risk of bias (ROB) and applicability. PROBAST covers four domains: participants, predictors, outcomes, and statistical analysis. The ROB is evaluated across all four domains, while the applicability assessment is limited to the first three.
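The way domain-level ratings roll up into an overall judgement can be sketched as follows. This is a minimal illustration that assumes the usual PROBAST convention (any domain rated high gives an overall high rating; all domains rated low gives an overall low rating; anything else is unclear). The function name and the example ratings are hypothetical and are not taken from the included studies.

```python
from typing import Dict, Iterable

# Domains named in the text: ROB uses all four, applicability the first three.
ROB_DOMAINS = ("participants", "predictors", "outcome", "analysis")
APPLICABILITY_DOMAINS = ("participants", "predictors", "outcome")


def overall_judgement(ratings: Dict[str, str], domains: Iterable[str]) -> str:
    """Combine per-domain ratings ('low', 'high', 'unclear') into one label."""
    values = [ratings[d] for d in domains]
    if any(v == "high" for v in values):
        return "high"      # one high-risk domain is enough
    if all(v == "low" for v in values):
        return "low"       # every domain must be low
    return "unclear"       # at least one unclear domain, none high


# Hypothetical ratings for a single model (illustration only).
example_rob = {
    "participants": "low",
    "predictors": "low",
    "outcome": "unclear",
    "analysis": "high",
}
example_applicability = {
    "participants": "low",
    "predictors": "low",
    "outcome": "low",
}

print(overall_judgement(example_rob, ROB_DOMAINS))
print(overall_judgement(example_applicability, APPLICABILITY_DOMAINS))
```

Under this rule a single weak domain, most often the statistical analysis, is enough to rate a model as high ROB overall.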
Charting the Data
The fields for data extraction were adapted from the Joanna Briggs Institute template found in the JBI Manual for Evidence Synthesis.55 All studies reviewed for inclusion were obtained in full text. Data extraction included: author, year of publication, country, study design, sample size, number of events, model-related information (statistical methods, model performance, modeling building strategies, validation method, and predictors in final analysis).
According to the Transparent Reporting of a multivariable prediction model for Individual Prognosis or Diagnosis (TRIPOD) guidelines58 and methodology of clinical prediction model construction,30 we classified the methods of eligible studies published for each risk prediction model. The data extraction form was piloted and refined by reviewers LY and JX to ensure the comprehensive and accurate capture of all necessary data.
Collating, Summarizing and Reporting the Results
Because the data were heterogeneous, a meta-analysis of the included studies was not feasible. The results were therefore reported as a narrative synthesis of the data extracted from all included full-text publications, complemented by graphical displays.
Results
The systematic literature search described above yielded 2772 studies; hand searches identified another 10. After 803 duplicates were removed, 1979 studies remained. Of these, 1930 were excluded after title and abstract review and 20 after full-text review. Ultimately, 29 studies51,52,59–85 were included in this scoping review; see Figure 1.
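The study-flow counts can be checked with simple arithmetic. The sketch below assumes that 803 is the number of duplicate records removed, since that is the only reading under which the reported figures reconcile to 29 included studies; the variable names are illustrative.

```python
# Counts reported in the text.
database_hits = 2772
hand_search_hits = 10
duplicates_removed = 803          # assumption: 803 duplicates were removed
excluded_title_abstract = 1930
excluded_full_text = 20

identified = database_hits + hand_search_hits       # all records found
screened = identified - duplicates_removed          # records after deduplication
full_text = screened - excluded_title_abstract      # records read in full
included = full_text - excluded_full_text           # studies in the review

print(identified, screened, full_text, included)    # 2782 1979 49 29
```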
The results of this review are presented in a narrative form.
Overview of the Studies
Among the included studies, except for one published in 2009, the remaining 28 (96.6%) were published during or after 2014. This reflects the relative novelty of the conceptualization of the gastric cancer risk prediction model.
The highest number of studies, totaling 13, originated from China. Other studies included 8 from Japan, 3 from South Korea, and 1 each from the United States, Europe, Iran, and India. Additionally, there was a multicenter study involving populations from Singapore and South Korea. See Figure 2.
What are the Types and Performance of Gastric Cancer Risk Prediction Models?
A total of 29 studies were included in this review, encompassing 16 cohort studies, 9 case-control studies, and 4 cross-sectional studies.
The model evaluation focused primarily on discrimination and calibration. Discrimination was reported as either the area under the curve (AUC) or the C-statistic, which are equivalent for binary outcomes. The AUC of the models ranged from 0.560 to 0.989, while the C-statistics varied between 0.684 and 0.940. Among the included models, one model69 had a C-statistic below 0.7, and three models63,68,73 had an AUC below 0.7. The performance of the remaining models was generally good; however, only four studies52,59,75,82 had a relatively comprehensive evaluation. See Table 1.
Table 1
Author | Year | Country | Study Design | Study Period | AUC | C-statistics |
---|---|---|---|---|---|---|
Lee T et al60 | 2015 | China | Cohort study | 1997–2004 | / | 0.780 |
Zhou R et al59 | 2021 | China | Cohort study | 2017–2021 | 0.763(EV1) 0.706(EV2) 0.696(EV3) | / |
Wang X et al74 | 2022 | China | Cohort study | / | 0.75(EV) | / |
Zhu X et al79 | 2023 | China | Cohort study | 2004–2022 | / | 0.754 0.736(EV) |
Wong M et al82 | 2023 | China | Cohort study | 1997–2018 | 0.834(EV) | 0.834 |
Ikeda F et al62 | 2016 | Japan | Cohort study | 1988–2008 | / | 0.773 |
Charvat H et al51 | 2016 | Japan | Cohort study | 1993–2009 | / | 0.768 |
Park C et al63 | 2016 | Japan | Cohort study | 2012–2014 | 0.600 | / |
Iida M et al52 | 2018 | Japan | Cohort study | 1988–2007 | 0.790 | 0.790 0.760(EV) |
Charvat H et al83 | 2020 | Japan | Cohort study | 1990–1993 | / | 0.798(EV) |
Kawamura M et al72 | 2022 | Japan | Cohort study | 2017–2019 | 0.750 | 0.749 |
Arai J et al77 | 2022 | Japan | Cohort study | 1996–2017 | / | 0.840 |
Park B et al73 | 2021 | Korea | Cohort study | / | 0.607 | / |
So Jimmy et al71 | 2021 | Korea, Singapore | Cohort study | / | 0.930 0.920(EV) | / |
Choi J et al68 | 2020 | Europe | Cohort study | / | 0.560 | / |
Afrash M et al80 | 2023 | Iran | Cohort study | 2015–2021 | 0.849 | / |
Tu H et al76 | 2017 | China | Cross-sectional study | 1997–2012 | / | 0.803 |
Cai Q et al64 | 2019 | China | Cross-sectional study | 2016–2017 | / | 0.760 0.730(EV) |
Zhang P et al75 | 2023 | China | Cross-sectional study | 2021–2022 | 0.760 | / |
Eom BW et al (MEN)61 | 2015 | Korea | Cross-sectional study | 1996–2007 | / | 0.768 0.782(EV) |
Eom BW et al (WOMEN)61 | 2015 | Korea | Cross-sectional study | 1996–2007 | / | 0.706 0.714(EV) |
Zhu C et al84 | 2014 | China | Case–control | 2007.01–2011.11 | 0.989 0.812(IV) | / |
Wang S et al85 | 2018 | China | Case–control | 2013–2015 | 0.841 0.856(EV) | / |
Tao W et al65 | 2020 | China | Case–control | 2017–2019 | 0.875 | / |
Qiu L et al69 | 2020 | China | Case–control | 2009–2011 | / | 0.684 |
Duan F et al81 | 2023 | China | Case–control | 2015–2019 | 0.779 | / |
Taninaga J et al66 | 2019 | Japan | Case–control | 2006–2017 | 0.870 0.900(EV) | / |
Lee D et al78 | 2009 | Korea | Case–control | 2005.03–08 | 0.888 | 0.900 |
In H et al67 | 2020 | America | Case–control | / | / | 0.940 |
Chakraborty P et al70 | 2021 | India | Case–control | 2016–2019 | 0.940 | / |
Abbreviations: AUC, The area under the curve; IV, Independent validation; EV, external validation; EV1, the Southern outpatient cohort; EV2, the Northern outpatient cohort; EV3, the Endoscopic Screening for Esophageal Cancer in China cohort.
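To make the discrimination values in Table 1 concrete, the following sketch computes a C-statistic for a binary outcome as the proportion of case/control pairs in which the case receives the higher predicted risk, with ties counted as one half. For a binary outcome this equals the AUC. The function name and the predicted risks are invented for illustration and do not come from any included study.

```python
from itertools import product
from typing import Sequence


def c_statistic(case_scores: Sequence[float], control_scores: Sequence[float]) -> float:
    """Share of case/control pairs ranked correctly (ties count as 0.5)."""
    concordant = 0.0
    for case, control in product(case_scores, control_scores):
        if case > control:
            concordant += 1.0
        elif case == control:
            concordant += 0.5
    return concordant / (len(case_scores) * len(control_scores))


# Hypothetical predicted risks (illustration only).
cases = [0.9, 0.7, 0.6]
controls = [0.8, 0.4, 0.3, 0.6]

print(round(c_statistic(cases, controls), 3))  # 0.792
```

A value of 0.5 corresponds to ranking no better than chance and 1.0 to perfect separation, which is why values below 0.7 are singled out in the text.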
What are the Construction Method and Samples of Gastric Cancer Risk Prediction Models?
A total of 29 studies were included. Among them, Charvat H (2020)83 was an external validation of Charvat H (2016),51 which was not part of model development. Additionally, Eom BW (2015)61 developed two models based on gender. Therefore, 29 models were eventually included.
The 29 models can be classified into three categories based on construction method:
16 models59,63–65,67,69,70,72,74–76,78,81,82,84,85 were parametric models built with logistic regression;
11 models51,52,60–62,68,71,73,77,79 were semi-parametric models built with Cox regression;
2 models66,80 were non-parametric models built with machine learning algorithms.
The construction and validation scenarios for the models are as follows: 13 studies included both construction and validation phases, 15 studies51,59,62–64,68–70,73–76,78,80,81 solely included the construction phase, and 1 study83 exclusively focused on the validation phase.
For model construction, sample sizes ranged from 40 to 4,347,224, with positive cases varying from 40 to 19,465. For model validation, sample sizes ranged from 102 to 1,862,473, with positive cases varying from 4 to 6628. See Table 2.
Table 2
Authors | Method | Development samples | Development cases | Validation samples | Validation cases | Age (mean ± SD), years | Variables |
---|---|---|---|---|---|---|---|
Lee D et al78 | Logistic | 382 | 183 | / | / | / | Age, Personal history of gastric ulcer, Family history of gastric ulcer, Family history of gastric cancer, Water source, Rapid eating, Health status, Financial status, Occupation |
Park C et al63 | Logistic | 562 | 182 | / | / | 58.5±12.5 | Age, Sex, HP, PG I/PG II |
Tu H et al76 | Logistic | 9002 | 94 | / | / | P:61.2±11.4 C:50.7±10.1 | HP, PGI, PGII, PGI/II, Gastrin-17 |
Cai Q et al64 | Logistic | 9383 | 267 | 5091 | 138 | P:62.9±9.5 C:56.1±9.5 | Age, Sex, HP, Pickled food, Fried food, PG I/II, Gastrin-17 |
Tao W et al65 | Logistic | 383 | 99 | 26 | 4 | / | Age, Sex, HP, Family history of gastric cancer, PGI, PGI/II |
In H et al67 | Logistic | 140 | 40 | / | / | / | Age, Salt preference, Family history of gastric cancer, Alcohol, cultural food at ages 15–18 years, Education, Country (America), Ethnicity |
Qiu L et al69 | Logistic | 2287 | 1115 | / | / | / | BMI, Genetic risk factors (SNPs) |
Zhou R et al59 | Logistic | / | / | 48,079 | 125 | / | Age, Salt preference, Sex, Family history of gastric cancer, BMI, Smoking, Alcohol, Pickled food, Meal regularity |
Chakraborty P et al70 | Logistic | 240 | 80 | / | / | / | Salt preference, BMI, Alcohol, Smoking, Smoked food |
Kawamura M et al72 | Logistic | 380 | 115 | / | / | P:69±8 C:64±12 | OLGIM, EGGIM, Kimura-Takemoto stage |
Wang X et al74 | Logistic | 1022 | 253 | / | / | / | Genetic risk factors (SNPs) |
Zhang P et al75 | Logistic | 240 | 102 | / | / | / | Surveillance endoscopy (atrophy, map-like redness, xanthelasma) |
Duan F et al81 | Logistic | 1320 | 660 | / | / | P: 57.64±12.08 C: 57.88±11.50 | HP, Smoking, Alcohol, Genetic risk factors (SNPs+lncRNA) |
Wong M et al82 | Logistic | 4,347,224 | 4402 | 1,862,473 | 1899 | D:44.52±14.49 V:44.50±14.48 | Age, Sex, HP, Medication history (Proton pump inhibitors, Aspirin, NSAID, Statins) |
Zhu C et al84 | Logistic | 40 | 40 | 102 | 48 | D: P: 53.83± 10.34; C: 53.55± 10.11 V: P: 56.63± 10.37; C: 54.03±10.45 | miRNA(miR-16, miR-25, miR-92a, miR-451, miR-486-5p) |
Wang S et al85 | Logistic | 279 | 279 | 141 | 186 | D: 58.7±12.0 V: 58.8±11.6 | Autoantibodies against tumor-associated antigens(p62, c-Myc, NPM1, 14-3-3ξ, MDM2 and p16) |
Lee T et al60 | COX | 278,898 | 1269 | 17,247 | / | 64.8±13.1 | Age, Sex, HP, Peptic ulcer sites, Peptic ulcer complications, Medication history (NSAID), Surveillance endoscopy |
Eom BW et al (MEN)61 | COX | 1,372,424 | 19,465 | 484,335 | 6628 | 45.08±10.47 | Age, Salt preference, Family history, BMI, Smoking, Alcohol, Meal regularity, Physical activity |
Eom BW et al (WOMEN)61 | COX | 804,077 | 5579 | 466,013 | 2920 | 48.74±11.01 | Age, Salt preference, Family history, BMI, Smoking, Alcohol |
Ikeda F et al62 | COX | 2446 | 123 | / | / | 58.3±11.4 | Age, Salt preference, Sex, HP, BMI, Smoking, PGII, HbA1c, Cholesterol, Physical activity |
Charvat H et al51 | COX | 19,028 | 412 | / | / | P: 63.3±4.9 C: 59.3±6.8 | Age, Salt preference, Sex, HP, Family history of gastric cancer, Smoking, PGI, PGII |
Iida M et al52 | COX | 2444 | 90 | 3204 | 35 | D:58±11 V:62±13 | Age, Sex, HP, Smoking, HbA1c, the combination of HP and PG |
Choi J et al68 | COX | 400,807 | 272 | / | / | / | Genetic risk factors (PRS) |
So Jimmy et al71 | COX | 472 | 236 | 210 | 94 | P:61.2±8.4 C:68.0±10.9 | MicroRNA(serum 12-miRNA biomarker assay) |
Park B et al73 | COX | 1586 | 450 | / | / | P:55.4±10.7 C:52.1±8.5 | Age, Salt preference, Sex, HP, Alcohol, Smoking, Meal preference, Meat consumption frequency, Meal regularity, Physical activity, Genetic risk factors (SNPs) |
Arai J et al77 | COX | 879 | 77 | 220 | 17 | D:63.49±10.32 V:61.75±10.68 | Age, OLGIM/OLGA stage, endoscopic atrophy, history of malignant tumors other than gastric cancer |
Zhu X et al79 | COX | 416,343 | 3089 | 13,982 | 329 | / | Age, Sex, BMI, Smoking, Alcohol, Vegetables and fruits, Pickled food, Education, Family history of cancer in first-degree relatives, History of peptic ulcer, Family history of gastric cancer |
Taninaga J et al66 | Machine learning | 1144 | 89 | 287 | / | P: 56.7±8.8 C: 46.2±1.0 | Age, HP, BMI, Chronic atrophic gastritis, Post-gastrectomy, HbA1c, MCV, Lymphocyte ratio |
Afrash M et al80 | Machine learning | 2029 | 429 | / | / | / | Salt preference, HP, Chronic atrophic gastritis, Gastric or duodenal ulcer, Weight loss, Smoking, Fruits consumption, High fat foods, Education, Stress |
Charvat H et al83 | / | / | / | 1292 | 33 | 56.52±5.78 | Age, Salt preference, Sex, HP, Family history of gastric cancer, Smoking, PGI, PGII |
Abbreviations: P, gastric cancer patients; C, controls; D, development; V, validation; HP, Helicobacter pylori; PG, pepsinogen; BMI, body mass index; OLGIM, the operative link on gastric intestinal metaplasia assessment; EGGIM, the endoscopic grading of gastric intestinal metaplasia; OLGA, the operative link on gastritis assessment; HbA1c, haemoglobin A1c; MCV, mean corpuscular volume; NSAID, non-steroidal anti-inflammatory drug; SNP, single nucleotide polymorphism; PRS, polygenic risk score.
What Predictors are Included in Gastric Cancer Risk Prediction Models?
Across the included models, the number of predictor variables was mainly between 5 and 11. The research team categorized the variables into five groups, plus a residual category, based on the variable collection method and guideline recommendations.1,11,86 These categories are demographic factors, gastric cancer disease-related factors, dietary factors, lifestyle factors, laboratory tests, and other factors such as polygenic risk scores and single nucleotide polymorphisms. The predictors of each model are shown in Table 2. The most frequently included predictors among the 28 models were age, Hp, precancerous lesions (eg, atrophic gastritis, intestinal metaplasia, and benign gastric polyps), pepsinogen (PG), sex, and smoking. Considering individual variables, age was the most frequently incorporated, appearing in 18 models, 11 of which also included Hp. The number of times each variable was included is detailed in Table 3, and more variables are analyzed in Figure 3.
Table 3
Predictor Classification | Number of Inclusions |
---|---|
Demographic factors | |
Age | 18 |
Gender | 12 |
BMI | 8 |
Educational attainment | 3 |
Gastric cancer disease-related factors | |
Precancerous lesion | 13 |
Family History | 9 |
Medication history | 2 |
Dietary factors | |
Salt intake | 10 |
Pickled/fried/smoked | 5 |
Vegetable and fruit intake | 2 |
Eating speed and regularity | 4 |
Lifestyle factors | |
Smoking | 12 |
Drinking | 8 |
Physical activity | 3 |
Laboratory tests | |
HP | 14 |
PG | 13 |
G-17 | 2 |
HbA1c | 2 |
Others | 7 |
Abbreviations: BMI, Body Mass Index; HP, Helicobacter pylori; PG, pepsinogen. Others: Genetic risk factors, MicroRNA, Autoantibodies against tumor-associated antigens.
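A tally like the one in Table 3 can be produced by counting how many models list each predictor. The sketch below uses only three rows of Table 2, so its counts do not reproduce Table 3. Collapsing PG I, PG II, and PG I/II into a single "PG" label is an assumption made for the illustration.

```python
from collections import Counter

# Predictor sets for three models from Table 2, with simplified labels.
# Assumption: all pepsinogen measures are collapsed into "PG".
model_predictors = {
    "Park C et al": {"Age", "Sex", "HP", "PG"},
    "Tao W et al": {"Age", "Sex", "HP", "Family history of gastric cancer", "PG"},
    "Chakraborty P et al": {"Salt preference", "BMI", "Alcohol", "Smoking", "Smoked food"},
}

# Each model contributes at most one count per predictor.
inclusion_counts = Counter(
    predictor
    for predictors in model_predictors.values()
    for predictor in predictors
)

# Sort by descending count, then alphabetically for a stable listing.
for predictor, count in sorted(inclusion_counts.items(), key=lambda kv: (-kv[1], kv[0])):
    print(f"{predictor}: {count}")
```

Using sets rather than lists keeps a predictor that is measured in several ways within one model from being counted more than once.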
ROB and Applicability
The evaluation of ROB and applicability, according to PROBAST, is illustrated in the Supplementary Table. Based on the PROBAST assessment, all included models were evaluated as having a high ROB. Specifically, most diagnostic models lacked calibration reports; some model development studies converted continuous variables into two or more categories, used different definitions and transformations, or applied different cut-off points for categorical variables, such as age and dietary habits. Additionally, some models were affected by insufficient sample sizes.
Among the 29 diagnostic models, only 17 are considered applicable, indicating an overall low applicability. This is primarily due to the fact that only a subset of the studies utilized registry data. Additionally, the included studies employed different definitions, assessment methods, and evaluation timelines, which may result in variations in the predictive performance of certain models compared to other research outcomes. This is particularly evident in areas such as endoscopic detection and gastric cancer classification. See Figure 4.
Discussion
This scoping review provides an overview of the methods, variables, and performance of gastric cancer risk prediction model construction, as well as the different study design types used to construct them. Despite intense interest in risk prediction models, research surrounding gastric cancer risk prediction remains limited, as only 29 studies were included in the scoping review. Notably, nearly two-thirds of the studies identified were published within the last five years, indicating growing interest in and recognition of gastric cancer risk prediction. Given the highly diverse nature of these studies, which makes it difficult to reach conclusive judgments, we extracted several key themes through this scoping review.
The regional disparities in the incidence and mortality of gastric cancer worldwide are crucial topics for discussion.87,88 Hotspots are concentrated primarily in East Asia, Eastern Europe, and South America.1,89 Gastric cancer incidence differs significantly among regions and ethnic groups, with a 15- to 20-fold variation between high-incidence and low-incidence regions.10 Despite a substantial decline in gastric cancer incidence over the past few decades in regions like North America and Western Europe,90,91 gastric cancer remains a major global health concern, particularly in East Asian countries.92–95 In this study, more than four-fifths of the included articles (24 of 29) originated from China, Japan, and South Korea. This indicates that these countries are currently major hotspots for gastric cancer research, possibly because of the high incidence rates observed in these regions. A thorough analysis of these regional disparities may provide a better understanding of the results observed in our study. These regional differences may be influenced by various factors, including genetics, environment, lifestyle, and diet.6,96 Future research could further explore these aspects to uncover specific reasons behind the incidence and mortality of gastric cancer in these hotspot regions.
For early-stage gastric cancer patients identified through risk models, developing structured follow-up and diagnostic treatment strategies is crucial for improving patient outcomes. Early identification offers a valuable opportunity to implement comprehensive interventions to reduce incidence rates and increase survival rates. These strategies not only emphasize the importance of prevention and advance intervention to reduce mortality rates but also optimize the treatment pathways once early-stage gastric cancer is detected. For patients diagnosed with early-stage gastric cancer, implementing personalized follow-up strategies is critical; this includes regular monitoring and assessments to promptly identify any signs of disease progression. Through close monitoring and tailored treatment plans, timely intervention and precise management can be ensured, significantly enhancing patient experiences and long-term health outcomes.
In the course of variable selection for the gastric cancer risk prediction model, our primary focus lies in the exploration of risk factors associated with gastric cancer. All variables incorporated into the model stand independently; nevertheless, the absence of standardized classification for these variables may present challenges in subsequent gastric cancer prevention and treatment. To furnish a guide for the development of forthcoming models, we opted to categorize the variables into five domains. This classification is anticipated to facilitate comprehension of the model’s structure and ensure a thorough consideration of diverse factors. Further considerations should involve the etiological prevention and precision-targeted treatment of gastric cancer. For instance, modifiable factors like diet and lifestyle can be perceived as entry points for preventing gastric cancer, thereby contributing to a reduction in the incidence risk. Conversely, ostensibly non-modifiable factors such as laboratory tests (eg, endoscopy) may still play a role in diminishing the mortality rate of gastric cancer through precision-targeted treatment.97
The design types and construction methods of gastric cancer risk prediction models differ considerably, indicating a lack of consensus in the literature on overall design and construction methods in this field. Categorizing articles by study design, we distinguished among cohort, cross-sectional, and case-control studies. Cohort studies, as an observational design, offer a notable advantage for causal inference: they facilitate assessment of the relationship between exposure and outcome and, consequently, inference about the future risk of gastric cancer.98 In this review, three out of five articles employed cohort studies, with over half originating from Japan and the remainder, constituting 41.7% of the total, from China. Concerning construction methods, early papers (2009–2018) predominantly explored risk factors and time-to-event information related to gastric cancer, which led to the predominant use of the Cox proportional hazards model. As a semi-parametric model, the Cox model does not require assumptions about the specific form of the baseline hazard, but it does not directly provide a disease probability.30 As research on gastric cancer risk factors and five-year survival rates has deepened, relevant results have accumulated and a consensus has gradually formed. Recent articles have tended to shift towards logistic regression, which directly predicts whether a patient has a specific disease. Although our aim was to explore approaches to model building and types of research design, the included articles differ noticeably in how they treat these topics. Some examine these changes in more depth than others, reflecting each article's focus and its authors' perspective.
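The contrast between the two model families can be made concrete with their standard textbook forms. The notation below is an illustrative assumption and is not taken from any included study: $x$ is a vector of predictors, $\beta$ the regression coefficients, $\beta_0$ an intercept, $h_0(t)$ the unspecified baseline hazard, and $S_0(t)$ the corresponding baseline survival function.

```latex
\begin{align}
h(t \mid x) &= h_0(t)\,\exp\!\left(\beta^{\top} x\right)
  && \text{(Cox proportional hazards model)} \\
S(t \mid x) &= S_0(t)^{\exp\left(\beta^{\top} x\right)}
  && \text{(survival implied by the Cox model)} \\
P(Y = 1 \mid x) &= \frac{1}{1 + \exp\!\left\{-\left(\beta_0 + \beta^{\top} x\right)\right\}}
  && \text{(logistic regression)}
\end{align}
```

In the Cox model, $\beta$ is estimated without specifying $h_0(t)$, so the fitted coefficients yield hazard ratios $\exp(\beta_j)$ rather than probabilities; an absolute risk $1 - S(t \mid x)$ by time $t$ additionally requires an estimate of $S_0(t)$. Logistic regression returns a probability directly but does not use the timing of events.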
This review has some important limitations. First, the included publications may not fully represent the breadth of gastric cancer risk prediction models used in clinical practice, which may place too much emphasis on studies conducted in academic or highly resourced centers. This limitation is likely exacerbated by the decision to restrict the search to Chinese and English publications, resulting in under-representation of studies from low- and middle-income countries. This is of particular concern because a large proportion of areas with high rates of gastric cancer are located in low- and middle-income countries. As a result, these predictive models may not adequately reflect all regions with high gastric cancer incidence. Lastly, we acknowledge that not analyzing cohort studies and cross-sectional studies separately may affect the interpretation of results. These two designs differ in methodology, temporal dimension, and capacity for causal inference, so combining them without distinction could limit in-depth understanding of the study results.
Conclusion
Research on gastric cancer prediction models has grown significantly over the last five years, underscoring the ongoing academic interest in the field. This review provides a comprehensive overview of research on gastric cancer risk prediction models, including study design, model construction methods, variables, and performance. It highlights both the achievements to date and the problems that remain. To predict the risk of gastric cancer in the population more accurately, more in-depth studies are needed, and these studies need to be more practically oriented towards clinical work. Ultimately, these efforts will allow patients to benefit from the research findings.
Funding Statement
We would like to express our gratitude for the support from the Liaoning Province 2022 “Open Competition Mechanism to Select the Best Candidates” Key Science and Technology Project (Project Number 2022JH1/10800072). Please note that this funding did not influence the design, execution, statistical analysis, or interpretation of the data in our study.
Disclosure
The authors report no conflicts of interest in this work.
References
Articles from Journal of Multidisciplinary Healthcare are provided here courtesy of Dove Press