Identifying Latent Subgroups of High-Risk Patients Using Risk Score Trajectories

J Gen Intern Med. 2018 Dec;33(12):2120-2126. doi: 10.1007/s11606-018-4653-x. Epub 2018 Sep 17.

Abstract

Objective: Many healthcare systems employ population-based risk scores to prospectively identify patients at high risk of poor outcomes, but it is unclear whether single point-in-time scores adequately represent future risk. We sought to identify and characterize latent subgroups of high-risk patients based on risk score trajectories.

Study design: Observational study of 7289 patients who were discharged from Veterans Health Administration (VA) hospitals during a 1-week period in November 2012 and whose predicted risk of hospitalization was in the top 5% of the population.

Methods: Using VA administrative data, we calculated weekly risk scores using the validated Care Assessment Needs model, reflecting the predicted probability of hospitalization. We applied the non-parametric k-means algorithm to identify latent subgroups of patients based on the trajectory of patients' hospitalization probability over a 2-year period. We then compared baseline sociodemographic characteristics, comorbidities, health service use, and social instability markers between identified latent subgroups.
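The clustering step described above can be illustrated with a minimal sketch. It is not the authors' software: it is a plain Lloyd's k-means written in NumPy, applied to synthetic trajectories of 104 weekly probabilities. The function name kmeans_trajectories, the noise level, the sample of 100 patients, and the fixed choice of k = 2 are all assumptions made for illustration, and the toy trajectory shapes only loosely mimic the subgroup curves reported in the Results. The study itself selected the number of subgroups by model fit, which this sketch does not attempt.

```python
import numpy as np

def kmeans_trajectories(X, k, n_iter=100, seed=0):
    """Plain Lloyd's k-means on rows of X (one row = one patient's weekly scores).

    Hypothetical helper for illustration; not the study's implementation.
    """
    rng = np.random.default_rng(seed)
    # Initialize centroids from k distinct, randomly chosen trajectories.
    centroids = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(n_iter):
        # Squared Euclidean distance from every trajectory to every centroid.
        dists = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        new_centroids = centroids.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster empties
                new_centroids[j] = members.mean(axis=0)
        converged = np.allclose(new_centroids, centroids)
        centroids = new_centroids
        if converged:
            break
    # Final assignment against the final centroids.
    dists = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1), centroids

# Synthetic data (assumption): 104 weekly probabilities per patient.
weeks = np.arange(104)
moderate_mean = np.interp(weeks, [0, 65, 103], [0.22, 0.10, 0.10])
persistent_mean = np.interp(weeks, [0, 51, 103], [0.38, 0.41, 0.30])

data_rng = np.random.default_rng(42)
moderate = moderate_mean + data_rng.normal(0, 0.02, size=(65, 104))
persistent = persistent_mean + data_rng.normal(0, 0.02, size=(35, 104))
X = np.clip(np.vstack([moderate, persistent]), 0.0, 1.0)

labels, centroids = kmeans_trajectories(X, k=2)
sizes = np.bincount(labels, minlength=2)
print("cluster sizes:", sizes)
print("mean probability per cluster:", centroids.mean(axis=1).round(3))
```

Each centroid is itself a 104-week curve, so plotting the rows of centroids against weeks gives the subgroup-level trajectory for each cluster. Treating each week as one coordinate ignores the time ordering of the scores, which is a simplification of this sketch.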

Results: The best-fitting model identified two subgroups: moderately high risk and persistently high risk. The moderately high subgroup included 65% of patients; its subgroup-level hospitalization probability decreased from 0.22 to 0.10 between weeks 1 and 66 and then remained constant through the study end. The persistently high subgroup, comprising the remaining 35% of patients, had a subgroup-level probability that increased from 0.38 to 0.41 between weeks 1 and 52 and then declined to 0.30 at study end. Persistently high-risk patients were older, had a higher prevalence of social instability and comorbidities, and used more health services.

Conclusions: About one third of patients initially identified as high risk remained at very high risk over the 2-year follow-up period, while risk for the other two thirds decreased to a moderately high level. This suggests that multiple approaches may be needed to address high-risk patient needs longitudinally or intermittently.

Keywords: high risk; latent subgroups; machine learning; patient-centered medical home; risk stratification; trajectory.

Publication types

  • Observational Study
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Aged
  • Female
  • Follow-Up Studies
  • Hospitalization / trends*
  • Hospitals, Veterans / standards
  • Hospitals, Veterans / trends*
  • Humans
  • Machine Learning / standards
  • Machine Learning / trends*
  • Male
  • Middle Aged
  • Prospective Studies
  • Risk Factors
  • United States / epidemiology
  • United States Department of Veterans Affairs / standards
  • United States Department of Veterans Affairs / trends*