Testing calibration of risk models at extremes of disease risk

Biostatistics. 2015 Jan;16(1):143-54. doi: 10.1093/biostatistics/kxu034. Epub 2014 Jul 14.

Abstract

Risk-prediction models need careful calibration to ensure that they produce unbiased estimates of risk for subjects in the underlying population given their risk-factor profiles. Because subjects at extremely high or low risk may be the most affected by knowledge of their risk estimates, checking the adequacy of risk models at the extremes of risk is particularly important for clinical applications. We propose a new approach to testing model calibration that targets the extremes of the disease-risk distribution, where standard goodness-of-fit tests may lack power because data are sparse. We construct a test statistic based on model residuals summed over only those individuals who pass high and/or low risk thresholds, and then maximize the test statistic over different risk thresholds. We derive an asymptotic distribution for the max-test statistic based on an analytic derivation of the variance-covariance function of the underlying Gaussian process. The method is applied to a large case-control study of breast cancer to examine the joint effects of common single nucleotide polymorphisms (SNPs) discovered through recent genome-wide association studies. The analysis clearly indicates a non-additive effect of the SNPs on the scale of absolute risk, but an excellent fit for the linear-logistic model even at the extremes of risk.
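
The idea of summing residuals over a risk tail and maximizing over thresholds can be illustrated with a minimal Python sketch. It is not the authors' implementation. It makes several simplifying assumptions that are not in the abstract: a cohort-style sample rather than a case-control design, predicted risks treated as fixed (no adjustment for parameter estimation), only the high-risk tail, and a Monte Carlo null distribution swapped in for the asymptotic Gaussian-process result. The function names `max_tail_statistic` and `max_tail_test` are hypothetical.

```python
import numpy as np


def max_tail_statistic(y, p, thresholds):
    """Largest absolute standardized residual sum over the high-risk tails.

    For each threshold c, residuals (y - p) are summed over subjects with
    predicted risk p >= c and divided by the binomial standard deviation
    of that sum under the hypothesis that p is the true risk.
    """
    y = np.asarray(y, dtype=float)
    p = np.asarray(p, dtype=float)
    best = 0.0
    for c in thresholds:
        tail = p >= c
        variance = np.sum(p[tail] * (1.0 - p[tail]))
        if variance <= 0.0:
            continue  # empty or degenerate tail: nothing to test here
        stat = abs(np.sum(y[tail] - p[tail])) / np.sqrt(variance)
        best = max(best, stat)
    return best


def max_tail_test(y, p, thresholds, n_sim=2000, seed=0):
    """Monte Carlo p-value for the max statistic (an assumed stand-in for
    the asymptotic distribution derived in the paper).

    Outcomes are redrawn as Bernoulli(p), i.e. under perfect calibration,
    and the max statistic is recomputed for each simulated data set.
    """
    rng = np.random.default_rng(seed)
    p = np.asarray(p, dtype=float)
    observed = max_tail_statistic(y, p, thresholds)
    null = np.array([
        max_tail_statistic(rng.binomial(1, p), p, thresholds)
        for _ in range(n_sim)
    ])
    p_value = (1 + np.sum(null >= observed)) / (n_sim + 1)
    return observed, p_value


if __name__ == "__main__":
    # Illustrative data only: predicted risks and calibrated outcomes.
    rng = np.random.default_rng(1)
    risk = rng.uniform(0.05, 0.5, size=5000)
    outcome = rng.binomial(1, risk)
    cutoffs = np.quantile(risk, [0.90, 0.95, 0.99])  # high-risk thresholds
    print(max_tail_test(outcome, risk, cutoffs, n_sim=500))
```

Taking the maximum over several thresholds avoids committing to a single cutoff in advance, and simulating the whole maximum under the null accounts for the correlation between the overlapping tails.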

Keywords: Case–control studies; Gene–gene and gene–environment interactions; Genome-wide association studies; Goodness-of-fit tests; Polygenic score; Risk stratification.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, N.I.H., Intramural

MeSH terms

  • Breast Neoplasms / genetics
  • Calibration
  • Genetic Predisposition to Disease*
  • Genome-Wide Association Study / statistics & numerical data*
  • Humans
  • Models, Genetic*
  • Models, Statistical*
  • Risk Assessment