Orphanet J Rare Dis. 2023; 18: 65.
Published online 2023 Mar 22. https://doi.org/10.1186/s13023-023-02662-0
PMCID: PMC10031951
PMID: 36949506

Sources of variation in estimates of Duchenne and Becker muscular dystrophy prevalence in the United States

Nedra Whitehead (corresponding author),1 Stephen W. Erickson,2 Bo Cai,3 Suzanne McDermott,4 Holly Peay,2 James F. Howard,5 Lijing Ouyang,6 and the Muscular Dystrophy Surveillance, Tracking and Research Network


Abstract

Background

Direct estimates of rare disease prevalence from public health surveillance may only be available in a few catchment areas. Understanding variation among observed prevalence can inform estimates of prevalence in other locations. The Muscular Dystrophy Surveillance, Tracking, and Research Network (MD STARnet) conducts population-based surveillance of major muscular dystrophies in selected areas of the United States. We identified sources of variation in prevalence estimates of Duchenne and Becker muscular dystrophy (DBMD) within MD STARnet from published literature and a survey of MD STARnet investigators, then developed a logic model of the relationships between the sources of variation and estimated prevalence.

Results

The 17 identified sources of variability fell into four categories: (1) inherent in surveillance systems, (2) particular to rare diseases, (3) particular to medical-records-based surveillance, and (4) resulting from extrapolation. For the sources of uncertainty measured by MD STARnet, we estimated each source’s contribution to the total variance in DBMD prevalence. Based on the logic model we fit a multivariable Poisson regression model to 96 age–site–race/ethnicity strata. Age accounted for 74% of the variation between strata, surveillance site for 6%, race/ethnicity for 3%, and 17% remained unexplained.

Conclusion

Variation in estimates derived from a non-random sample of states or counties may not be explained by demographic differences alone. Applying these estimates to other populations requires caution.

Supplementary Information

The online version contains supplementary material available at 10.1186/s13023-023-02662-0.

Keywords: Epidemiology, Public health surveillance, Epidemiological monitoring, Epidemiologic methods, Muscular dystrophy Duchenne, Muscular dystrophy Becker, Prevalence

Background

Public health surveillance, defined as the "systematic and continuous collection, analysis, and interpretation of data" [1] is foundational to public health practice [2]. Public health surveillance provides accurate, representative information on the occurrence of a disease in the population from which the data is collected but is not usually designed to be generalizable to other populations. Resources and logistics may limit surveillance programs to a few catchment areas that may not be representative of the entire population. In the absence of other data, prevalence and other epidemiologic measures from these few catchment areas are often generalized to the population, which is valid only if the epidemiology of the disease of interest is consistent across the population.

Significant variation in epidemiologic measures among catchment areas suggests the underlying epidemiology of the disease differs among geographic areas. However, rare diseases are vulnerable to random fluctuation in prevalence estimates, which can be difficult to distinguish from true differences among populations. Structured uncertainty analysis can be an important tool for assessing such differences. Taruscio and Mantovani recently demonstrated the value of uncertainty analysis to identify gaps in our knowledge of the epidemiology of rare diseases and assess their impact [3]. They categorize the sources of uncertainty into epistemic (uncertainty due to lack of knowledge), sampling uncertainty (uncertainty associated with data and disparate methods), and variability (uncertainty due to heterogeneity within a population).

The Muscular Dystrophy Surveillance, Tracking and Research Network (MD STARnet), which conducts population-based surveillance of muscular dystrophies in selected areas of the US, is the sole source of population-based prevalence estimates in the country [4, 5]. The 2007 MD STARnet estimated prevalence of Duchenne/Becker muscular dystrophy (DBMD) among males age 5 to 24 was 1.47 cases per 10,000 males (calculated from data in the article) [6, 7]. The range among the four individual catchment areas was 1.3 to 1.8 cases per 10,000 males ages 5 to 24 years, a variance of 12% [6]. Among the three catchment areas with estimates for 2007 and 2014–2019, the same catchment areas had higher prevalence in both time periods, indicating that the differences between catchment areas are likely not random (Personal communication, Suzanne McDermott, DBMD Ascertainment Progress Presented: Fall 2017 MD STARnet Principal Investigators Meeting. Atlanta, GA, 2017).

Variation across catchment areas could be due to true differences in the population frequency of pathogenic alleles of the dystrophin gene; the population distribution of sex, age or ancestry; or migration among individuals with DBMD. It could also be due to random or systematic error. Our aim was to understand what factors explain the observed differences in DBMD prevalence among catchment areas and the implications for the generalizability of the prevalence estimates. Our analysis examined sources of sampling uncertainty and population variability. If population demographics or regional differences in diagnosis or surveillance practices explain the variation among catchment areas, adjustment for these differences would allow MD STARnet estimates to be extrapolated to the broader U.S. population. Unexplained variation between catchment areas indicates that MD STARnet prevalence estimates may not be an accurate estimate of DBMD prevalence in the broader U.S. population.

Results

Literature review and investigator survey

After abstract and title review, we identified 52 unique citations, of which 12 advanced to full text review (Additional file 1: Fig. S2, Additional file 2). We included findings from five articles, from which we identified 12 potential sources of variation (Table 1) [8–12]. None of the minor discrepancies in abstraction required adjudication. Most information on sources of variation was in surveillance or registry methodological articles. These articles examined rare disease cluster identification [8], drug registries for treatments of lysosomal storage disorders [9], a cancer registry [11], and surveillance based on multiple data sources [12]. The fifth article was an epidemiological report from a registry of arthritis, musculoskeletal and skin diseases [10].

Table 1

Sources of variation in estimating the national prevalence of muscular dystrophies

Source of Variation | Identified from (Literature / Survey)
--- | ---
In All Surveillance |
Unidentified or unavailable data sources | x x
Unidentified cases at known data sources | x
Misclassified disease status | x
Migration into and out of surveillance system | x
Time period for case capture | x x
Time between diagnosis and ascertainment | x
Regional differences in disease incidence | x
Unreliable, non-specific coding in screening databases | x x
Migration into and out of surveillance region | x x
Demographic changes due to rapid population change | x x
Specific to Rare Disease Surveillance |
Unstable estimates due to small number of cases | x
Misclassification of muscular dystrophy type | x x
Specific to Medical Records-Based Surveillance |
Lack of standardized data in medical records | x
Underreported and incomplete data in medical records | x x
Number and proportion of treatment centers within the study area | x
Specific to Extrapolation to National Estimates |
Differences between study population and national population | x
Differential ascertainment between areas or groups of patients | x

Twenty investigators from six sites completed our survey on sources and magnitude of bias in MD STARnet. The investigators included six analysts, four abstractors, three clinicians, three study coordinators, two data managers, and two people with unspecified roles. The survey identified 12 sources of variation, five of which had not been identified by the literature review (Table 1). The average investigator estimate of bias in DMD prevalence from a given source ranged from 5% (for residents obtaining care outside the study region and demographic changes in the population) to 12% (for differences between the MD STARnet population and the U.S. population) (Additional file 3: Table S1).

In total, we identified 17 sources of variation in national estimates from the literature review or the investigator survey (Table 1). We grouped the sources of variation into four categories comprising sources of variation that are:

  1. Inherent to all surveillance systems, including case ascertainment, misclassification of disease status, and migration;

  2. Specific to rare disease surveillance, including small case numbers, regional differences in incidence, the relatively large impact of a few misclassified cases, and biases in care-seeking behaviors and diagnostic practices;

  3. Specific to medical records-based surveillance, including lack of standardization and incomplete data; and

  4. Due to extrapolation from local to national estimates, including differences between the local and national populations.

Sources and magnitude of variation

The expanded MD STARnet data set included 720 cases from a surveilled male population of 8 million (Table 2). Of these cases, 249 (34%) were identified in Arizona, 193 (27%) in Colorado, 152 (21%) in Iowa, and 126 (17%) in western New York. The cases were mostly non-Hispanic and white (67%). The racial and ethnic distribution of the cases was similar to that of the surveilled populations, although individuals of Black or Other race were slightly underrepresented among the cases.

Table 2

Sample and Population Characteristics, MD STARnet Expanded Surveillance Pilot, 2007–2011

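Characteristic | DBMD cases, number | DBMD cases, percent | Surveilled population, number | Surveilled population, percent | US male standard population, number | US male standard population, percent
--- | --- | --- | --- | --- | --- | ---
Male | 720 | 100 | 8,037,535 | 100.0 | 152,082,993 | 100.0
Age (years) | | | | | |
Under 5 | 61 | 8.5 | 553,842 | 6.9 | 10,312,641 | 6.8
5 to 9 | 116 | 16.1 | 565,909 | 7.0 | 10,380,281 | 6.8
10 to 14 | 131 | 18.2 | 560,581 | 7.0 | 10,578,235 | 7.0
15 to 19 | 139 | 19.3 | 592,868 | 7.4 | 11,278,027 | 7.4
20 to 24 | 101 | 14.0 | 585,417 | 7.3 | 11,072,538 | 7.3
25+ | 172 | 23.9 | 5,178,918 | 64.4 | 98,461,271 | 64.7
Race/Ethnicity | | | | | |
Black | 19 | 2.6 | 347,436 | 4.3 | 18,116,746 | 11.9
Hispanic | 146 | 20.3 | 1,602,343 | 19.9 | 25,749,686 | 16.9
Other¹ | 69 | 9.6 | 481,365 | 6.0 | 11,151,601 | 7.3
White | 486 | 67.5 | 5,606,391 | 69.8 | 97,064,960 | 63.8
State | | | | | |
Arizona | 249 | 34.6 | 3,175,823 | 39.5 | NA |
Colorado | 193 | 26.8 | 2,520,662 | 31.4 | |
Iowa | 152 | 21.1 | 1,508,319 | 18.8 | |
New York | 126 | 17.5 | 832,731 | 10.4 | |

¹ Includes any race other than Black, Hispanic, or White, including multiple races and missing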

DBMD, Duchenne/Becker Muscular Dystrophy; US, United States

Age and ethnicity distributions were significantly associated with prevalence. Age group explained the majority of the variability between strata, accounting for 74% of the deviance in the model. However, the similarity of unadjusted, standardized, and adjusted prevalence estimates indicates that population differences in age and ethnicity or differences in the surveillance process account for very little of the variation between catchment areas (Table 3). Catchment area accounted for the second largest proportion of the variability between strata, 6% of the total variance (Table 4). Arizona was the reference site due to alphabetical coding order. Prevalence in Colorado and Iowa did not differ significantly from that in Arizona (Table 5). However, the prevalence in the New York catchment area was approximately twice that of Arizona (prevalence ratio 2.2, p < 0.001). Seventeen percent of the variation in prevalence across strata remained unexplained after controlling for the demographic and process factors in the model.

Table 3

Unadjusted, Standardized and Adjusted Duchenne and Becker Muscular Dystrophy Prevalence by Participant Characteristics, MD STARnet Expanded Surveillance Pilot, 2007–2011

Characteristic | Unadjusted prevalence³ | 95% CI | Standardized¹ prevalence³ | 95% CI | Adjusted² prevalence³ | 95% CI
--- | --- | --- | --- | --- | --- | ---
All US males | 8.96 | 8.33, 9.64 | 8.68 | 8.03, 9.38 | 8.64 | 7.97, 9.33
Age (years) | | | | | |
Under 5 | 11.01 | 8.58, 14.15 | 10.17 | 7.77, 13.08 | 10.73 | 8.04, 13.59
5 to 9 | 20.50 | 17.09, 24.58 | 20.59 | 16.81, 24.97 | 19.93 | 16.26, 23.80
10 to 14 | 23.37 | 19.70, 27.73 | 22.77 | 18.82, 27.31 | 22.69 | 18.72, 26.84
15 to 19 | 23.45 | 19.86, 27.68 | 23.59 | 19.51, 28.27 | 22.65 | 18.80, 26.65
20 to 24 | 17.25 | 14.20, 20.96 | 15.95 | 12.95, 19.45 | 16.61 | 13.38, 20.04
25+ | 3.32 | 2.86, 3.86 | 3.23 | 2.76, 3.77 | 3.23 | 2.74, 3.74
Race/Ethnicity | | | | | |
Black | 5.47 | 3.50, 8.54 | 5.49 | 3.31, 8.58 | 5.47 | 3.14, 8.10
Hispanic | 9.11 | 7.75, 10.71 | 8.79 | 7.42, 10.34 | 8.73 | 7.29, 10.22
Other⁴ | 14.33 | 11.33, 18.14 | 13.49 | 10.48, 17.09 | 13.11 | 10.05, 16.37
White | 8.67 | 7.93, 9.47 | 8.70 | 7.94, 9.51 | 8.70 | 7.91, 9.51
State | | | | | |
Arizona | 7.84 | 6.93, 8.88 | 7.22 | 6.29, 8.25 | 7.46 | 6.52, 9.26
Colorado | 7.66 | 6.65, 8.82 | 7.64 | 6.51, 8.92 | 7.36 | 6.19, 9.24
Iowa | 10.08 | 8.60, 11.81 | 9.62 | 7.82, 11.71 | 9.89 | 7.89, 12.46
New York | 15.13 | 12.71, 18.01 | 13.46 | 10.90, 16.45 | 14.30 | 11.53, 18.97

MD STARnet, Muscular Dystrophy Surveillance, Tracking, and Research Network; US, United States; CI, confidence interval

¹ Standardized to US male population by age and race/ethnicity

² Adjusted by age, race/ethnicity, site, number of reporting sources, and proportion of cases seen at a neuromuscular clinic. Based on multivariable Poisson model, with confidence intervals obtained from 100,000 random simulations

³ Per 100,000 individuals

⁴ Includes any race other than Black, Hispanic, or White, including multiple races and missing
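As a minimal sketch, the unadjusted site-level prevalences in Table 3 can be recomputed from the case and population counts in Table 2. The function and variable names below are illustrative and are not taken from the authors' analysis code; the confidence intervals are not reproduced because the interval method for the unadjusted estimates is not stated.

```python
# Counts transcribed from Table 2: (DBMD cases, surveilled male population).
SITE_COUNTS = {
    "Arizona": (249, 3_175_823),
    "Colorado": (193, 2_520_662),
    "Iowa": (152, 1_508_319),
    "New York": (126, 832_731),
}


def prevalence_per_100k(cases: int, population: int) -> float:
    """Crude point prevalence expressed per 100,000 individuals."""
    return cases / population * 100_000


# Site-specific unadjusted prevalence.
unadjusted = {
    site: prevalence_per_100k(cases, pop)
    for site, (cases, pop) in SITE_COUNTS.items()
}

# Pooled prevalence across the four catchment areas.
total_cases = sum(cases for cases, _ in SITE_COUNTS.values())
total_pop = sum(pop for _, pop in SITE_COUNTS.values())
overall = prevalence_per_100k(total_cases, total_pop)

for site, value in unadjusted.items():
    print(f"{site}: {value:.2f} per 100,000")
print(f"All sites: {overall:.2f} per 100,000")
```

Rounded to two decimals, these values agree with the unadjusted column of Table 3.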

Table 4

Analysis of deviance, MD STARnet Expanded Surveillance Pilot, 2007–2011

Variable | Degrees of freedom | Deviance | Percent of deviance
--- | --- | --- | ---
Age | 5 | 527.0 | 73.9%
State | 3 | 41.5 | 5.8%
Race/ethnicity | 3 | 19.8 | 2.8%
Proportion treated at MD clinic¹ | 1 | 3.3 | 0.4%
Average number of ascertainment sources² | 1 | 0.1 | < 0.1%
Proportion diagnosed by genetic testing³ | 1 | 0.0 | < 0.1%
Residuals | 81 | 121.4 | 17.0%

MD STARnet, Muscular Dystrophy Surveillance, Tracking, and Research Network

¹ Proportion of patients within stratum who were treated at a neuromuscular clinic

² Average number of reporting sources at which each patient in the stratum was identified

³ Proportion of cases diagnosed by genetic testing in the index case or a family member
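The percent-of-deviance column can be checked by dividing each term's deviance by the sum of all deviances in Table 4. The sketch below uses only the published, rounded values, and the dictionary layout and names are illustrative assumptions. It also checks that the residual degrees of freedom equal the 96 strata minus one intercept and the 14 model degrees of freedom.

```python
# (degrees of freedom, deviance) transcribed from Table 4.
DEVIANCE_TABLE = {
    "Age": (5, 527.0),
    "State": (3, 41.5),
    "Race/ethnicity": (3, 19.8),
    "Proportion treated at MD clinic": (1, 3.3),
    "Average number of ascertainment sources": (1, 0.1),
    "Proportion diagnosed by genetic testing": (1, 0.0),
    "Residuals": (81, 121.4),
}
N_STRATA = 96  # age x site x race/ethnicity strata

# Total deviance is the sum over model terms plus the residual.
total_deviance = sum(dev for _, dev in DEVIANCE_TABLE.values())

# Share of the total deviance attributed to each row, in percent.
percent = {
    name: 100 * dev / total_deviance
    for name, (_, dev) in DEVIANCE_TABLE.items()
}

# Residual df = number of strata - intercept - df used by model terms.
model_df = sum(df for name, (df, _) in DEVIANCE_TABLE.items() if name != "Residuals")
residual_df = N_STRATA - 1 - model_df

for name, value in percent.items():
    print(f"{name}: {value:.1f}%")
print(f"Residual degrees of freedom: {residual_df}")
```

Because the inputs are rounded to one decimal, the smallest terms may differ from the published percentages in the last digit.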

Table 5

Association of Population Characteristics with Prevalence of Duchenne/Becker Muscular Dystrophy¹, MD STARnet Expanded Surveillance Pilot, 2007–2011

Characteristic | Prevalence Rate Ratio | 95% Confidence Interval | P-value
--- | --- | --- | ---
Age (years) | | |
Under 5 | 0.504 | 0.366–0.686 | < 0.001
5 to 9 | 0.901 | 0.690–1.174 | 0.427
10 to 14 | Ref. | |
15 to 19 | 0.994 | 0.775–1.275 | 0.959
20 to 24 | 0.732 | 0.558–0.957 | 0.020
25+ | 0.170 | 0.115–0.250 | < 0.001
State | | |
Arizona | Ref. | |
Colorado | 1.164 | 0.786–1.721 | 0.444
Iowa | 1.368 | 0.930–2.001 | 0.086
New York | 2.164 | 1.620–2.875 | < 0.001
Race/Ethnicity | | |
Black | 0.501 | 0.301–0.785 | 0.004
Hispanic | 0.882 | 0.716–1.081 | 0.233
Other | 1.424 | 1.077–1.856 | 0.009
White | Ref. | |
Average number of ascertainment sources² | 1.095 | 0.735–1.629 | 0.650
Proportion diagnosed by genetic testing³ | 1.009 | 0.533–1.915 | 0.993
Proportion treated at MD clinic⁴ | 2.696 | 0.905–8.164 | 0.073

¹ The dependent variable was number of Duchenne and Becker muscular dystrophy cases, with the logarithm of stratum population used as an offset variable

² Average number of reporting sources for each patient in stratum

³ Proportion of cases diagnosed by genetic testing in the index case or a family member

⁴ Proportion of patients within stratum that were treated at a muscular dystrophy clinic
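For comparison with the adjusted ratios in Table 5, an unadjusted prevalence ratio can be computed directly from the Table 2 counts. When site is the only predictor, a Poisson model with a log-population offset reduces to this ratio of crude rates. The sketch below is not the authors' multivariable model: the function name is hypothetical, and the Wald interval on the log scale is an assumption made here for illustration.

```python
import math


def crude_rate_ratio(cases, pop, ref_cases, ref_pop, z=1.96):
    """Ratio of two crude prevalences with a Wald interval on the log scale.

    The standard error sqrt(1/cases + 1/ref_cases) treats both case counts
    as independent Poisson variables and the populations as fixed.
    """
    ratio = (cases / pop) / (ref_cases / ref_pop)
    se_log = math.sqrt(1 / cases + 1 / ref_cases)
    lower = math.exp(math.log(ratio) - z * se_log)
    upper = math.exp(math.log(ratio) + z * se_log)
    return ratio, lower, upper


# Western New York versus Arizona (reference), counts from Table 2.
ratio, lower, upper = crude_rate_ratio(126, 832_731, 249, 3_175_823)
print(f"Unadjusted prevalence ratio: {ratio:.2f} ({lower:.2f} to {upper:.2f})")
```

The unadjusted ratio is close to 1.9, somewhat below the adjusted ratio of 2.164 reported in Table 5.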

Discussion

Our primary goal was to determine whether adjusting for sources of variability in site-specific prevalence estimates would reduce differences among catchment areas, increasing confidence that findings are generalizable beyond the areas included within the surveillance system. Unfortunately, adjusting for known and potential sources of variability by standardization or multivariable modeling did not substantially reduce between-site differences. Surveillance site accounted for 6% of the deviance between prevalence rates, and 17% of the deviance was unexplained after adjusting for age, race/ethnicity, and ascertainment details. The large proportion (74%) of the deviance explained by age group is expected given the natural history of DBMD. In this progressive disorder, prevalence is low in children younger than the usual age of diagnosis (approximately 5 years) and highest among children age 5–19 years, when most affected boys have been diagnosed and mortality is still low. The prevalence declines among adults age 20 years and older, when mortality increases.

Our analysis complements the article by Taruscio and Mantovani [3] by providing an example of a structured analysis to evaluate the uncertainty in prevalence estimates of rare diseases. We experienced several challenges in analyzing the sources of variability. Population-level data on potential sources of variation, such as the number of unsurveilled health care providers within a catchment area, were unavailable. We could not evaluate how well our proxy measures, the mean number of sources at which cases were ascertained and the proportion of cases seen at a neuromuscular clinic, estimated the completeness of coverage of health care facilities treating muscular dystrophy for each stratum. Socioeconomic status was unavailable at the case level. The limited data on potential sources of variability and the relatively small number of strata limited our ability to explain the sources and magnitude of variation in DBMD prevalence rates.

Our analysis is strengthened by factors that reduce process variability in case ascertainment. MD STARnet sites use a standard protocol [4]. Cases are actively sought using multiple data sources, and identifying information allows duplicate cases to be identified and consolidated. For the pilot, case eligibility was reviewed by a local clinician experienced in treating muscular dystrophy cases, with additional review of uncertain cases by a committee of clinicians [4, 13].

Our findings suggest that the estimated prevalence of muscular dystrophy may be dependent on which sites are included in MD STARnet. More generally, they suggest that estimates derived from a non-random sample of states or counties cannot be assumed to represent national rates. Although not all the factors that impact MD STARnet estimates are generalizable to other surveillance systems, our study illustrates a valuable approach for evaluating the sources and impact of uncertainty that is applicable to rare disease surveillance systems generally. This analysis provides an example of one methodology for such an evaluation. The Poisson model we used provided estimates of the magnitude and relative contribution of each potential source of variability of DBMD prevalence across demographic strata within the limitations of our data.

Conclusions

Estimating sources of variability in the extrapolation of the prevalence of DBMD from a local to a national scale requires attention to surveillance methodology, the characteristics of the condition under surveillance, and differences and similarities between the local and national populations. In this study, 17% of the variation was not explained by the model.

Methods

Our objectives were to identify sources of variation in MD STARnet prevalence estimates between sites and to estimate the magnitude of the total variation in DBMD prevalence estimates and the relative contribution of each source of variation.

Sources of variation

We identified potential sources of variation in prevalence estimates from the scientific literature and expert opinion. We synthesized the findings into a theoretical model of how the sources contributed to potential bias in generalizing the estimates to the US population (Fig. 1).

Fig. 1 Sources of variation in prevalence estimates - conceptual model

Literature review. Two analysts independently searched PubMed and Google Scholar and reviewed the retrieved citations for eligibility. Our original criteria for inclusion were methodological studies of the types, sources, or magnitude of bias in surveillance or research studies. PubMed and Google Scholar were chosen because they were available to both analysts and were expected to capture most articles on public health surveillance methods. The search terms included surveillance, rare disease, prevalence, error, limitations, uncertainty, epidemiology, estimation, MD STARnet, muscular dystrophy, US Census, and variations of these terms. Details on the search strategies are provided in Additional file 4. The last search was conducted on November 3, 2016 and included all articles published prior to that date. The search was not updated after the final logic model was constructed.

We adhered to a rigorous search methodology to the extent possible but deviated from a full systematic review methodology in two regards. First, we could not develop a complete, deduplicated count of identified citations because Google Scholar results cannot be exported, making it impossible to identify duplicates. Second, we found very few studies that met our pre-determined eligibility criteria of being designed explicitly to study the sources or magnitude of bias in surveillance systems. Instead, information on sources of bias was more commonly found in reports about surveillance or research study design. We therefore included articles that discussed possible sources of bias in their surveillance system or data even if they did not estimate the magnitude of the bias. The placement of the information within the article and the depth of detail varied greatly among studies. This variability made the use of structured abstraction or a data extraction tool impossible. Instead, relevant information was manually extracted into Word.

Both analysts reviewed the combined list of eligible citations and classified each as included or excluded. Included articles were abstracted by both analysts independently and reviewed for discrepancies.

Investigator survey. We surveyed MD STARnet investigators to explore their experiences and perceptions of different sources of variation that may affect MD STARnet prevalence estimates, and the approximate magnitude of bias that may be introduced by each source (Additional file 5: Fig. S1). Due to the small number of eligible sites, instead of formally piloting the survey, it was reviewed by North Carolina investigators who did not participate in developing the survey. We emailed the link to the Survey Gizmo [14] survey to the principal investigators of six sites (Colorado, Iowa, western New York, central North Carolina, South Carolina, and Utah) funded from 2014 to 2019 and asked them to distribute it to the MD STARnet investigators at their site. Because staff roles and responsibilities vary across MD STARnet sites, we relied on the principal investigators to distribute the survey to appropriate site colleagues. The survey was anonymous; investigators who responded online could not be identified or linked to a specific site, and a formal response rate could not be calculated. There was at least one response from all sites. Four sites submitted responses through the link, and two sites submitted aggregate responses for their site by email. The institutional review board (IRB) at RTI International, employer of the primary analysts, determined the survey was program evaluation, not human subjects research as defined by 45 CFR 46.102. Due to the small sample size and the aggregate responses obtained from two sites, all data were analyzed descriptively.

MD STARnet data

The analytic data were from MD STARnet’s pilot expanded muscular dystrophy surveillance (EMDS) [4]. Four geographically defined surveillance sites (Arizona, Colorado, Iowa, and 12 counties in western New York) conducted retrospective active surveillance of nine muscular dystrophies (MD) (Duchenne, Becker, congenital, distal, Emery-Dreifuss, facioscapulohumeral, limb-girdle, and oculopharyngeal MD, MD not otherwise specified, and myotonic dystrophy) from 2011 to 2014. All four sites had authority to conduct public health surveillance by the legal authority of their state department of health and/or institutional review board approval or exemption [4]. Informed consent was waived because the project was public health surveillance. Trained medical coders reviewed electronic or paper medical records of eligible cases to abstract information about signs and symptoms, diagnostic tests, treatment and follow-up care. Eligible individuals had evidence of a physician’s diagnosis of a specific MD type within their medical record, resided within a MD STARnet catchment area, and had at least one healthcare encounter from 2007 to 2011 inclusive [4]. Case ascertainment sources varied between sites but included physician and other provider medical records, hospital records, vital statistics, and administrative data. Cases were ascertained using International Classification of Diseases, Ninth Revision, Clinical Modification codes (359.0: congenital hereditary MD, 359.1: hereditary progressive MD, 359.21: myotonic dystrophy) in medical and administrative records and International Classification of Diseases, Tenth Revision mortality codes (G71.0: MD, G71.1: myotonic dystrophy) in death certificates. At each site, a clinician who treated patients with muscular dystrophy reviewed the abstracted case notes and decided if the MD type specified was consistent with standard diagnostic practice. If the diagnosis was in question, a panel of 5 neuromuscular experts made the final determination about MD type. The muscular dystrophies differ in inheritance pattern, age and sex of individuals affected, and prevalence of the disorders. Therefore, we limited our analyses to DBMD. Because we estimated the point prevalence of DBMD, we only included individuals with DBMD who were alive on July 1, 2010, leaving a total of 720 cases.

To determine if the variability in site-specific prevalence was within expected random variation, controlling for site population demographics and surveillance procedures, we constructed a dataset with one record for each age-race/ethnicity-site stratum, with a total of 96 strata. The dataset variables were number of DBMD cases, total population, age category (5-year intervals as shown in Table 2), surveillance site, race/ethnicity (White, Black, Hispanic and Other, which included Asian, Pacific Islander, American Indian, and unknown or unspecified race), method of diagnosis (proxy for diagnostic certainty; defined as genetic diagnosis in case or family member, family history of MD, or clinical diagnosis), the average number of reporting sources per patient (proxy for likelihood of identification at surveilled facilities), and the proportion of patients within the stratum treated at an MD clinic (proxy for likelihood of being treated at surveilled facilities). Data were too sparse to include zip code in the strata definition, which would have allowed us to use Census data as a proxy for socioeconomic status. We defined age and vital status as of July 1, 2010.
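The layout of the stratum-level dataset can be sketched as follows. The category labels come from Tables 2 and 5, while the field names and the empty placeholder values are hypothetical and stand in for data that are not published at the stratum level.

```python
from itertools import product

# Category levels as reported in Table 2.
AGE_GROUPS = ["Under 5", "5 to 9", "10 to 14", "15 to 19", "20 to 24", "25+"]
SITES = ["Arizona", "Colorado", "Iowa", "New York"]
RACE_ETHNICITY = ["Black", "Hispanic", "Other", "White"]

# One record per age x site x race/ethnicity stratum.
# Field names are illustrative; None marks values not available here.
strata = [
    {
        "age_group": age,
        "site": site,
        "race_ethnicity": race,
        "cases": None,                  # number of DBMD cases
        "population": None,             # July 1, 2010 Census estimate
        "prop_genetic_diagnosis": None, # proxy for diagnostic certainty
        "mean_reporting_sources": None, # proxy for identification likelihood
        "prop_md_clinic": None,         # proxy for treatment at surveilled sites
    }
    for age, site, race in product(AGE_GROUPS, SITES, RACE_ETHNICITY)
]

print(len(strata))  # 6 age groups x 4 sites x 4 race/ethnicity groups
```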

Sources of variation in calculated prevalence

We calculated the unadjusted prevalence of DBMD overall and by site, age, and race/ethnicity. We calculated standardized prevalence for the US population using standard methods [15]. Briefly, we analyzed the prevalence for each age-race/ethnicity stratum, calculated the expected number of cases for the US based on the US population for equivalently-defined strata, then assessed the prevalence using the projected number of cases. Similar methods were used for standardized prevalence for subpopulations. We used the July 1, 2010 US Census estimated population of the surveillance catchment areas and the United States for all prevalence calculations and statistical models.
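The standardization steps can be sketched with the age-specific counts in Table 2. This simplified example standardizes by age only, because age-by-race/ethnicity counts are not published here, so its result is not expected to match the standardized estimate in Table 3. The function and variable names are illustrative.

```python
# age group: (DBMD cases, surveilled population, US male population), Table 2.
AGE_STRATA = {
    "Under 5": (61, 553_842, 10_312_641),
    "5 to 9": (116, 565_909, 10_380_281),
    "10 to 14": (131, 560_581, 10_578_235),
    "15 to 19": (139, 592_868, 11_278_027),
    "20 to 24": (101, 585_417, 11_072_538),
    "25+": (172, 5_178_918, 98_461_271),
}


def directly_standardized_prevalence(strata):
    """Direct standardization, returned per 100,000.

    Each stratum's observed prevalence is applied to the standard
    population of that stratum to obtain expected cases; the expected
    cases are summed and divided by the total standard population.
    """
    expected_cases = sum(
        cases / pop * std_pop for cases, pop, std_pop in strata.values()
    )
    standard_total = sum(std_pop for _, _, std_pop in strata.values())
    return expected_cases / standard_total * 100_000


age_standardized = directly_standardized_prevalence(AGE_STRATA)
print(f"Age-standardized prevalence: {age_standardized:.2f} per 100,000")
```

Standardizing on age alone gives a value of about 8.9 per 100,000, close to the unadjusted 8.96, which is consistent with the similar age distributions of the surveilled and US male populations in Table 2.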

We used our theoretical model to develop a multivariable Poisson regression model to quantify the contribution of each measured source of variation to the total variance and how much variation remained unexplained. The Poisson model, fit to the stratum-level dataset, controlled for the potential sources of uncertainty for which we had data. The MD STARnet data did not include a measure of socioeconomic status. The dependent variable was the number of DBMD cases in each stratum. Independent variables were age group, surveillance site, race/ethnicity, method of diagnosis, average number of reporting sources per patient, and the proportion of patients treated at a specialized neuromuscular clinic. The natural log of the total stratum population was used as an offset variable to adjust for the differences in opportunity for the outcome. Analysis of deviance, which partitions the discrepancy between the observed and model-predicted case counts among the model terms, was used to quantify the contribution of each variable to the variation in prevalence among the 96 strata.
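In symbols, a model of this kind can be written as below. The notation is an assumption introduced here for illustration and does not appear in the article: $Y_s$ is the number of DBMD cases in stratum $s$, $N_s$ is the stratum population, $\mathbf{x}_s$ collects the indicator variables for age group, site, and race/ethnicity together with the three stratum-level proxy measures, and $\hat{\mu}_s$ is the fitted case count.

```latex
\begin{align}
Y_s &\sim \mathrm{Poisson}(\mu_s), \qquad s = 1, \dots, 96, \\
\log \mu_s &= \log N_s + \beta_0 + \mathbf{x}_s^{\top} \boldsymbol{\beta}, \\
\mathrm{PRR}_j &= \exp(\beta_j), \\
D &= 2 \sum_{s=1}^{96} \left[ y_s \log\!\left( \frac{y_s}{\hat{\mu}_s} \right)
     - \left( y_s - \hat{\mu}_s \right) \right].
\end{align}
```

The offset $\log N_s$ enters with a coefficient fixed at one, so the linear predictor models the log prevalence $\log(\mu_s / N_s)$ and each exponentiated coefficient is a prevalence rate ratio relative to the reference category. $D$ is the standard Poisson deviance, with the term $y_s \log(y_s/\hat{\mu}_s)$ taken as zero when $y_s = 0$. Adding terms to the model one at a time and recording the drop in $D$ yields a table of the form shown in Table 4.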

We compared the unadjusted, standardized and modeled estimates of prevalence to assess the extent to which controlling for age, race/ethnicity and differences in surveillance process explained prevalence differences between sites. Primary analyses were conducted in R software, version 3.4.3 [16]. The secondary analyst used R software, version 3.6.0 [17] and SAS/STAT software, version 9.4 [18].

Acknowledgements

We acknowledge and appreciate the contributions of the MD STARnet network members to data collection and case classification. The analysts for the sources of variability analyses were Nedra Whitehead (primary) and Suzanne McDermott (secondary). The analysts for the magnitude of variation were Stephen Erickson (primary) and Bo Cai (secondary).

Author contributions

NW developed the goals and concept for the analysis, conducted the investigator survey and was the primary analyst for the literature review, and was the primary writer of the manuscript. She reviewed and provided approval of the final manuscript. SE developed the modeling approach, conducted the statistical analyses, and wrote sections of the manuscript. He reviewed and provided approval of the final manuscript. BC contributed to the modeling approach and replicated and confirmed the statistical analysis. He reviewed and provided approval of the final manuscript. SM was the secondary analyst for the investigator survey and the literature review. She identified and abstracted articles and contributed to the development of the conceptual model of the sources of variation. She reviewed and provided approval of the final manuscript. HP contributed to the development of the conceptual model of the sources of variation. She reviewed and provided approval of the final manuscript. JH provided clinical expertise in muscular dystrophy and contributed to the development of the conceptual model of the sources of variation. He reviewed and provided approval of the final manuscript. LO contributed to the development of the conceptual model of the sources of variation. She reviewed and provided approval of the final manuscript. The Muscular Dystrophy Surveillance, Tracking and Research Network collected the data used for this analysis.

Funding

This analysis was supported by CDC cooperative agreements 5U01DD00116 and 1U01DD001255 (North Carolina) and 6U01DD00117 and 6U01DD00145 (South Carolina). The Expanded Muscular Dystrophy Surveillance pilot was supported by the following CDC cooperative agreements: DD000830 (Arizona), DD000835 (Colorado), DD000831 (Iowa), DD000836 (Western New York), DD000832 (coordinating center), DD000834 (data coordinating center), and DD000837 (Abstractor QA Center). The findings and conclusions in this report are those of the authors and do not necessarily represent the official position of the Centers for Disease Control and Prevention or the Department of Health and Human Services.

Availability of data and materials

Because of state policies governing access to public health surveillance data, MD STARnet data are available only through collaboration with an MD STARnet principal investigator. For more information on access to MD STARnet data, please contact the Centers for Disease Control and Prevention at [email protected].

Declarations

Ethics approval and consent to participate

This study complies with the guidelines for human studies and was conducted ethically in accordance with the World Medical Association Declaration of Helsinki. As described in the manuscript, all four sites had authority to conduct public health surveillance by the legal authority of their state department of health and/or institutional review board approval or exemption [3]. Informed consent was waived because the project was public health surveillance.

Consent for publication

No individual data included.

Competing interests

The authors have no competing interests to declare.

Footnotes

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References

1. Porta M, ed. A Dictionary of Epidemiology. 6th ed. New York: Oxford University Press. ISBN: 978-0-19-997673-7.
2. Thacker SB, Qualters JR, Lee LM, et al. Public health surveillance in the United States: evolution and challenges. MMWR Suppl. 2012;61(3):3–9.
3. Taruscio D, Mantovani A. Multifactorial rare diseases: can uncertainty analysis bring added value to the search for risk factors and etiopathogenesis. Medicina. 2021. 10.3390/medicina57020119.
4. Do TN, Street N, Donnelly J, et al. Muscular dystrophy surveillance, tracking, and research network pilot: population-based surveillance of major muscular dystrophies at four US sites, 2007–2011. Birth Defects Res. 2018;110(19):1404–11. 10.1002/bdr2.1371.
5. Miller LA, Romitti PA, Cunniff C, et al. The muscular dystrophy surveillance tracking and research network (MD STARnet): surveillance methodology. Birth Defects Res A Clin Mol Teratol. 2006;76(11):793–797. 10.1002/bdra.20279.
6. Centers for Disease Control and Prevention. Prevalence of Duchenne/Becker muscular dystrophy among males aged 5–24 years - four states, 2007. MMWR Morb Mortal Wkly Rep. 2009;58(40):1119–22.
7. Romitti PA, Zhu Y, Puzhankara S, et al. Prevalence of Duchenne and Becker muscular dystrophies in the United States. Pediatrics. 2015;135(3):513–521. 10.1542/peds.2014-2044.
8. Besag J, Newell J. The detection of clusters in rare disease. J R Stat Soc A Stat Soc. 1991;154(1):143–155. 10.2307/2982708.
9. Hollak CE, Aerts JM, Ayme S, et al. Limitations of drug registries to evaluate orphan medicinal products for the treatment of lysosomal storage disorders. Orphanet J Rare Dis. 2011;6:16. 10.1186/1750-1172-6-16.
10. Mendez EP, Lipton R, Ramsey-Goldman R, et al. US incidence of juvenile dermatomyositis, 1995–1998: results from the national institute of arthritis and musculoskeletal and skin diseases registry. Arthritis Rheum. 2003;49(3):300–305. 10.1002/art.11122.
11. Yu JB, Gross CP, Wilson LD, et al. NCI SEER public-use data: applications and limitations in oncology research. Oncology. 2009;23(3):288–295.
12. Papoz L, Balkau B, Lellouch J. Case counting in epidemiology: limitations of methods based on multiple data sources. Int J Epidemiol. 1996;25(3):474–478. 10.1093/ije/25.3.474.
13. Mathews KD, Cunniff C, Kantamneni JR, et al. Muscular dystrophy surveillance tracking and research network (MD STARnet): case definition in surveillance for childhood-onset Duchenne/Becker muscular dystrophy. J Child Neurol. 2010;25(9):1098–1102. 10.1177/0883073810371001.
14. SurveyGizmo [program]. https://www.alchemer.com/survey/. Alchemer. Louisville, CO.
15. Rothman KJ. Standardization of rates. Modern Epidemiology. Boston, MA: Little, Brown and Company 1986:42–44.
16. Bates D et al. R: A language and environment for statistical computing. [program]. 3.4.3 version. Vienna, Austria: R Foundation for Statistical Computing, 2017. https://www.R-project.org.
17. Bates D et al. R: A language and environment for statistical computing. [program]. 3.6.0 version. Vienna, Austria: R Foundation for Statistical Computing, 2019. https://www.R-project.org.
18. Base SAS® 9.4 Procedures Guide, Seventh Edition. Cary, NC: SAS Institute Inc. https://www.sas.com.

Articles from Orphanet Journal of Rare Diseases are provided here courtesy of BMC
