It is well-known that designing a cluster randomized trial (CRT) requires an advance estimate of the intra-cluster correlation coefficient (ICC). In the case of longitudinal CRTs, where outcomes are assessed repeatedly in each cluster over time, estimates for more complex correlation structures are required. Three common types of correlation structures for longitudinal CRTs are exchangeable, nested/block exchangeable and exponential decay correlations-the latter two allow the strength of the correlation to weaken over time. Determining sample sizes under these latter two structures requires advance specification of the within-period ICC and cluster autocorrelation coefficient as well as the intra-individual autocorrelation coefficient in the case of a cohort design. How to estimate these coefficients is a common challenge for investigators. When appropriate estimates from previously published longitudinal CRTs are not available, one possibility is to re-analyse data from an available trial dataset or to access observational data to estimate these parameters in advance of a trial. In this tutorial, we demonstrate how to estimate correlation parameters under these correlation structures for continuous and binary outcomes. We first introduce the correlation structures and their underlying model assumptions under a mixed-effects regression framework. With practical advice for implementation, we then demonstrate how the correlation parameters can be estimated using examples and we provide programming code in R, SAS, and Stata. An Rshiny app is available that allows investigators to upload an existing dataset and obtain the estimated correlation parameters. We conclude by identifying some gaps in the literature.
Keywords: Sample size; clinical trials; cluster autocorrelation coefficient; statistical power; stepped wedge.
© The Author(s) 2023; all rights reserved. Published by Oxford University Press on behalf of the International Epidemiological Association.