Comparison of Automated and Manual MRI Volumetry of Hippocampus in Normal Aging and Dementia
Abstract
Purpose
To determine whether automatic and manual measurements of hippocampal volume differences on MRI between normal aging, cognitive impairment (CI), and Alzheimer’s disease (AD) yield similar results.
Materials and Methods
Reliability was determined for an automatic and a manual method on nine volunteers (22–83 years old) who underwent MRI twice in 1 day. Hippocampal volumes of 20 cognitively normal subjects (mean age 74.0 ± 6.2 years) and age-matched patients (20 CI and 20 AD) were compared.
Results
The intraclass correlation for automatic calculations of hippocampal volume was 0.94; for manual tracing it was 0.99. Volume differences between cognitively normal, CI, and AD subjects from the automatic and manual methods were similar.
Conclusion
Because the automatic calculations were faster and less susceptible to rater bias than manual tracing, this automated method is expected to be very useful for analyzing hippocampal changes in studies of aging and dementia.
Several neurodegenerative and psychiatric disorders, including Alzheimer’s disease (AD) and schizophrenia, show hippocampal atrophy on magnetic resonance imaging (MRI) (1,2). Most studies have used manual tracing of the hippocampus to determine volume changes; however, this approach requires extensive rater training and carries a risk of rater bias. High-dimensional brain warping algorithms have been developed that can automatically mask the hippocampus on MRI data (3–5). Although such algorithms largely reduce dependency on raters, automated marking can be more variable than manual procedures because of computational errors due to image noise, which compromise the reliability of the measurements. This could be a particular problem for MRI studies of the aging brain and dementia, because image contrast is often reduced in older brains. Therefore, the aim of this study was to compare the ability of automated and manual hippocampal volumetry to differentiate between normal aging, cognitive impairment (CI) (which could be an early stage of AD), and AD.
AD is the most common cause of dementia in the elderly; it is characterized by neuron loss, especially involving the hippocampus (6). In accordance with pathological findings, a number of MRI studies have shown significant atrophy of the hippocampus in patients with AD (1,7,8). Hippocampal volume loss was also found in subjects with CI (9–12) in the absence of clinical symptoms of dementia, and in cognitively normal subjects with genetic risks for AD (13,14). Furthermore, longitudinal studies suggest that hippocampal volume loss predicts cognitive decline (11,15). In this study, differences in hippocampal volumes among cognitively normal, CI, and AD subjects are determined using both manual and automatic procedures.
MATERIALS AND METHODS
Subjects
To determine the reliability of manual and automated measurements of hippocampal volumes, nine volunteers (two males and seven females, 22–84 years of age, mean age 44 ± 18 years) were recruited, and MR data were acquired twice from each subject on the same day in separate sessions.
To compare manual and automated measurements of hippocampal volume loss in aging and dementia, MRI data from 20 healthy elderly subjects, 20 subjects with CI, and 20 patients with AD were selected from a large database of MRI data, matched for age and sex. The elderly control subjects were recruited from the community, and the CI and AD patients were recruited from the University of California–San Francisco and –Davis Alzheimer centers, where they received a comprehensive clinical evaluation that included a neurological exam and neuropsychological testing. A diagnosis of AD was made according to the criteria established by the National Institute of Neurological and Communicative Disorders and Stroke-Alzheimer’s Disease and Related Disorders Association (NINCDS-ADRDA) (16). A diagnosis of CI required complaints of memory loss, with a Clinical Dementia Rating score of 0.5, but without meeting the DSM-IV criteria for dementia (17). The general cognitive functioning of all subjects, including the healthy controls, was assessed using the Mini-Mental State Examination (MMSE) (18). Furthermore, a neuroradiologist read the MR images of each subject to exclude other major neuropathologies, such as cerebral infarcts or tumors. However, white matter lesions and brain atrophy were not a reason for exclusion. Other exclusion criteria were a clinical history of alcoholism, psychiatric illness, epilepsy, hypertension, diabetes, major heart disease, or head-trauma with temporary loss of consciousness. All subjects (or their caregiver) gave written informed consent after the nature of the procedure had been fully explained to them.
MRI Acquisition
MRI studies were performed on a 1.5-T system (Magnetom VISION; Siemens Inc., Iselin, NJ) equipped with a standard head coil. The MRI protocol consisted of volumetric magnetization-prepared rapid acquisition gradient-echo scans (MP-RAGE: TR/TI/TE = 10/300/4 msec, flip angle = 15°), yielding T1-weighted coronal images, aligned approximately perpendicular to the long axis of the hippocampus with 1.5 × 1.0 × 1.0 mm³ resolution. In addition, double spin-echo images, yielding proton density- and T2-weighted data (DSE: TR/TE1/TE2 = 5000/20/80 msec, 1.0 × 1.4 mm² resolution, 3-mm-thick slices) were acquired for clinical evaluation and MRI tissue segmentation.
Volume Measurements
Volume measurements of the hippocampus were performed using both a manual and an automated method. In the manual method, boundaries of the hippocampus were drawn on the T1-weighted MR images using in-house-written software (19), following previously published guidelines that include the hippocampus proper, dentate gyrus, subiculum, fimbria, and alveus (20). Volumes were calculated by counting all pixels within the traced area and multiplying the count by the nominal MRI resolution. Manual tracing of both hippocampi required about 30 minutes for each subject. Automated hippocampal volumetry was carried out using a commercially available high-dimensional brain-mapping tool (Medtronic Surgical Navigation Technologies, Louisville, CO), which combined a coarse and then a fine transformation to match cerebral MR images with a template brain (21,22). Global landmarks were placed at external boundaries of the target brain by manual adjustment of the angle and dimension of a three-dimensional box in orthogonal MR images. The next step was manual selection of 22 control points as local landmarks for hippocampal segmentation: one at the hippocampal head, one at the tail, and four per image (i.e., at the superior, inferior, medial, and lateral boundaries) on five equally spaced images perpendicular to the long axis of the ipsilateral hippocampus. This step was repeated for the contralateral hippocampus. Marking both hippocampi required less than 10 minutes for each subject. Using both the global and local landmarks, a coarse transformation was computed using landmark matching. Automated hippocampal morphometry was then performed by a fluid image matching transformation (3). In contrast to the manual method, automated volumetry did not include the alveus and fimbria as parts of the hippocampus.
Despite the difference, we expected that manual tracing based on the protocol by Watson et al (20) would be an excellent reference for testing the performance of the automatic calculations because this protocol is often referenced (195 citations since 1992), and high rater reliability has been consistently achieved (23,34). The raters performed manual tracing and placement of landmarks for automated warping independently.
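The pixel-counting step of the manual method can be illustrated with a minimal sketch. It assumes the traced region is available as a binary mask and uses the 1.5 × 1.0 × 1.0 mm voxel size given for the MP-RAGE images. The function name, the nested-list mask layout, and the toy mask are hypothetical and do not represent the in-house software.

```python
def volume_from_mask(mask, voxel_dims_mm=(1.5, 1.0, 1.0)):
    """Return the volume in mm^3 of a binary mask laid out as [slice][row][col].

    voxel_dims_mm defaults to the nominal MP-RAGE resolution quoted in the
    MRI Acquisition section; the mask layout itself is an assumption.
    """
    voxel_volume = voxel_dims_mm[0] * voxel_dims_mm[1] * voxel_dims_mm[2]
    # Count every voxel that lies inside the traced region.
    n_voxels = sum(1 for sl in mask for row in sl for v in row if v)
    return n_voxels * voxel_volume


# Synthetic toy mask (not real data): 2 slices of 3 x 3 pixels, 7 voxels set.
toy_mask = [
    [[0, 1, 0], [1, 1, 1], [0, 1, 0]],
    [[0, 0, 0], [0, 1, 1], [0, 0, 0]],
]
toy_volume = volume_from_mask(toy_mask)
print(toy_volume)  # 7 voxels * 1.5 mm^3 each
```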
Statistical Analysis
To examine the reliability of the manual and automated volume measurements, a model was built with effects describing the influence of each subject, so as to separate out subject-to-subject effects from within-subject effects (test-retest variation). With the assumption that the effects have random distributions, the overall measurement variance can be expressed as
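The expression itself is not reproduced in this text. Under a standard one-way random-effects model, which fits the description above but is an assumption rather than the authors' exact notation, the variance decomposition and the resulting intraclass correlation (ICC) reported in the Results would read:

```latex
\sigma^2_{\mathrm{total}} = \sigma^2_{\mathrm{subject}} + \sigma^2_{\mathrm{error}},
\qquad
\mathrm{ICC} = \frac{\sigma^2_{\mathrm{subject}}}{\sigma^2_{\mathrm{subject}} + \sigma^2_{\mathrm{error}}}
```

Here \sigma^2_{\mathrm{subject}} denotes the subject-to-subject variance and \sigma^2_{\mathrm{error}} the within-subject (test-retest) variance; both symbols are introduced only for this illustration.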
Since rater judgment was involved in selecting initial landmarks for warping, the reproducibility of the automated method was also determined. Two raters (A.T.D. and Y.Y.H.) independently placed landmark points twice on 10 randomly selected MRI datasets from the group of elderly subjects. Reproducibility was expressed as a coefficient of variation (CoV), defined as the standard deviation of the difference between two measurements divided by the mean value of the measurements.
Differences in hippocampal volumes between manual and automated volumetry were evaluated using paired t-tests. The ability of the two methods to separate the groups by hippocampal volume was measured using effect sizes. The effect size of a measure, such as hippocampal volume, to separate two groups a and b can be defined as
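The defining expression is not reproduced in this text. A common form, assumed here and not confirmed as the authors' exact definition, is the difference in group means divided by the pooled standard deviation of two equally sized groups:

```latex
d_{ab} = \frac{\mu_a - \mu_b}{\sqrt{\left(\sigma_a^2 + \sigma_b^2\right)/2}}
```

With the left manual volumes in Table 2 (2945 ± 503 for cognitively normal and 2471 ± 484 for CI subjects), this form gives about 0.96, in line with the listed effect size of 1.0. The symbols \mu and \sigma (group mean and standard deviation) are introduced only for this illustration.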
RESULTS
Manual tracing achieved an ICC of 0.99, while automated warping yielded a lower ICC of 0.94. The difference in ICC was significant (F[1,14] = 4.8, P = 0.04). Since automated warping—in contrast to manual tracing—did not include the alveus and fimbria, the volumes by automated warping were smaller (P < 0.0001) than the volumes by manual tracing. Table 1 lists the reproducibility of the volume measurements using automated warping by different raters, who independently placed landmarks for segmentation of the hippocampus twice on MRI data from 10 subjects. Reproducibility within and between raters ranged from 4.2% to 5.2% CoV. Reproducibility did not differ significantly between the left and right hippocampus.
Table 1
Subject | Age | Sex | Rater A 1st volume | Rater A 2nd volume | %Δa | Rater B 1st volume | Rater B 2nd volume | %Δa |
---|---|---|---|---|---|---|---|---|
a | 73 | F | 1967 | 1969 | −0.2 | 1843 | 1894 | −2.8 |
b | 70 | F | 2240 | 2288 | −2.2 | 2183 | 2149 | 1.6 |
c | 80 | F | 2386 | 2220 | 7.3 | 2482 | 2280 | 8.5 |
d | 74 | F | 2239 | 2275 | −1.6 | 2407 | 2329 | 3.3 |
e | 71 | F | 1839 | 1774 | 3.6 | 1819 | 1724 | 5.4 |
f | 68 | F | 2064 | 1979 | 4.3 | 1953 | 1953 | 0 |
g | 70 | M | 2678 | 2830 | −5.6 | 2667 | 2787 | −4.5 |
h | 80 | F | 1929 | 1831 | 5.3 | 1792 | 1821 | −1.7 |
i | 77 | M | 2150 | 2183 | −1.6 | 2103 | 2288 | −8.5 |
j | 74 | F | 2215 | 2224 | −0.5 | 2323 | 2379 | −2.4 |
Mean | | | 2170.7 | 2157.3 | 0.9 | 2157.2 | 2160.4 | −0.2 |
SDc | | | 244.3 | 300.6 | 4.1 | 306.8 | 319.6 | 5 |
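The coefficient of variation defined in the Statistical Analysis section (standard deviation of the difference between two measurements divided by the mean of the measurements) can be checked against the Rater A columns of Table 1. The sketch below assumes the sample standard deviation (n − 1 denominator) and the grand mean of all 20 measurements. The function and variable names are hypothetical.

```python
from statistics import mean, stdev

# Rater A volumes for subjects a-j, copied from Table 1.
rater_a_first = [1967, 2240, 2386, 2239, 1839, 2064, 2678, 1929, 2150, 2215]
rater_a_second = [1969, 2288, 2220, 2275, 1774, 1979, 2830, 1831, 2183, 2224]


def coefficient_of_variation(first, second):
    """CoV in percent: SD of paired differences over the mean of all measurements."""
    diffs = [a - b for a, b in zip(first, second)]
    grand_mean = mean(first + second)
    # stdev() uses the n - 1 denominator; this choice is an assumption.
    return 100.0 * stdev(diffs) / grand_mean


cov_a = coefficient_of_variation(rater_a_first, rater_a_second)
print(f"Within-rater CoV for Rater A: {cov_a:.1f}%")
```

Under these assumptions the result for Rater A is about 4.2%, matching the lower end of the reported 4.2% to 5.2% range.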
Table 2 lists demographic data of the cognitively normal, CI, and AD subjects, indicating that the groups differed significantly in cognitive function as measured by the MMSE, but were similar in age and sex distribution (P > 0.3 by χ² for CI compared to controls or AD). Also listed in Table 2 are hippocampal volumes by group, obtained using manual tracing and automated warping, and volume differences between the groups, expressed as effect sizes. Figure 1 shows the same data as boxplots. As in the test-retest data, hippocampal volumes by automated warping were smaller (P < 0.001) than by manual tracing. However, the volume difference between automated warping and manual tracing was independent of group (F[2,228] = 0.4, P > 0.6), and thus did not modify the effects of group on the hippocampal volume changes. This is also reflected in Table 2 by the similarity of the effect sizes derived from manual tracing and automated warping. Using manual tracing, differences in hippocampal volumes between cognitively normal and CI subjects were significant on both the left (F[1,38] = 9.2, P = 0.004) and right side (F[1,38] = 23.3, P < 0.001), equivalent to effect sizes of 1.0 and 1.2, respectively. Using automated warping, these differences remained significant (left: F[1,38] = 7.8, P = 0.008; right: F[1,38] = 9.3, P = 0.004), while effect sizes dropped insignificantly to 0.9 and 1.0, respectively. Similarly, differences in hippocampal volumes between cognitively normal and AD subjects using manual tracing were significant on both the left (F[1,38] = 48.4, P < 0.001) and the right side (F[1,38] = 54.0, P < 0.001), equivalent to effect sizes of 2.2 and 2.4, respectively. Using automated warping, these differences also remained significant (left: F[1,38] = 29.8, P < 0.001; right: F[1,38] = 45.2, P < 0.001), while effect sizes again dropped insignificantly to 1.8 and 2.2, respectively.
Finally, differences in hippocampal volumes between CI and AD subjects using manual tracing were also significant for both the left (F[1,38] = 17.4, P < 0.001) and the right side (F[1,38] = 13.8, P < 0.001), equivalent to effect sizes of 1.4 and 1.2, respectively. Using automated warping, these differences remained significant (left: F[1,38] = 9.1, P = 0.004; right: F[1,38] = 10.0, P = 0.003), while effect sizes again dropped insignificantly to 1.0 and 1.1, respectively, as listed in Table 2.
Table 2
Measure | CN | CI | AD | CN vs. CIa | CN vs. ADa | CI vs. ADa |
---|---|---|---|---|---|---|
Men/women | 10/10 | 16/4 | 10/10 | | | |
Age (year) | 74.0 ± 6.2 | 74.2 ± 6.7 | 74.5 ± 6.2 | 0 | 0 | 0 |
MMSEb | 29.0 ± 1.3 | 27.7 ± 2.7 | 22.7 ± 3.5 | 0.7‡ | 2.4† | 1.6† |
Hippocampal volumec | | | | | | |
Left (manual) | 2945 ± 503 | 2471 ± 484 | 1791 ± 546 | 1.0† | 2.2† | 1.4† |
Right (manual) | 3103 ± 505 | 2487 ± 538 | 1821 ± 595 | 1.2† | 2.4† | 1.2† |
Left (automated) | 2323 ± 326 | 2009 ± 386 | 1573 ± 521 | 0.9† | 1.8† | 1.0† |
Right (automated) | 2276 ± 253 | 1945 ± 414 | 1519 ± 435 | 1.0† | 2.2† | 1.1† |
CN = Cognitive normal; CI = Cognitive impaired; AD = Alzheimer’s disease.
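The effect sizes in Table 2 can be approximated from the listed group means and standard deviations. The sketch below assumes a pooled-standard-deviation form for two equally sized groups (20 subjects each), which is not confirmed as the authors' exact definition. The function name and dictionary layout are hypothetical.

```python
from math import sqrt


def effect_size(mean_a, sd_a, mean_b, sd_b):
    """Difference in means over the pooled SD of two equally sized groups (assumed form)."""
    pooled_sd = sqrt((sd_a ** 2 + sd_b ** 2) / 2.0)
    return (mean_a - mean_b) / pooled_sd


# Group mean and SD of hippocampal volume, copied from Table 2.
volumes = {
    "Left (manual)": {"CN": (2945, 503), "CI": (2471, 484), "AD": (1791, 546)},
    "Right (manual)": {"CN": (3103, 505), "CI": (2487, 538), "AD": (1821, 595)},
    "Left (automated)": {"CN": (2323, 326), "CI": (2009, 386), "AD": (1573, 521)},
    "Right (automated)": {"CN": (2276, 253), "CI": (1945, 414), "AD": (1519, 435)},
}

results = {}
for label, groups in volumes.items():
    for a, b in (("CN", "CI"), ("CN", "AD"), ("CI", "AD")):
        results[(label, a, b)] = effect_size(*groups[a], *groups[b])
        print(f"{label:18s} {a} vs. {b}: {results[(label, a, b)]:.2f}")
```

Under this assumption the computed values agree with the tabulated effect sizes to within about 0.1. The small residual differences may reflect rounding of the listed means and standard deviations or a slightly different definition.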
Figure 2 depicts the correlation between manual and automated measurements of hippocampal volumes in the elderly subjects, independent of their cognitive status. Results obtained by the two methods were strongly correlated for both hippocampi (left: r = 0.92, P < 0.001; right: r = 0.91, P < 0.001) without a significant difference between the sides. Finally, Figure 3 shows that measurements of hippocampal asymmetry (defined as 2*[left – right]/mean[left + right]) by manual tracing and automated warping were strongly correlated (r = 0.79, P < 0.001).
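The asymmetry measure can be sketched as follows. The printed definition is ambiguous about its normalization, so this example assumes the conventional form in which the left-right difference is divided by the average of the two volumes, that is, 2 × (left − right) / (left + right). The function name is hypothetical. The inputs are the group mean manual volumes of the cognitively normal subjects from Table 2, used only for illustration; the study's correlation was presumably computed from per-subject values.

```python
def asymmetry_index(left, right):
    """Left-right difference normalized by the mean of the two volumes (assumed form)."""
    return 2.0 * (left - right) / (left + right)


# Group mean manual volumes for cognitively normal subjects (Table 2), illustration only.
cn_asymmetry = asymmetry_index(2945, 3103)
print(f"{cn_asymmetry:.3f}")  # negative: the right hippocampus is larger on average
```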
DISCUSSION
This study demonstrates that automated volumetry of the hippocampus can be performed with high reliability in older subjects, including patients with AD, who may present small and irregularly shaped hippocampi that are difficult to measure. Extending previous MRI studies of young adults (4), this study showed that measurements of hippocampal volumes by automated warping remained strongly correlated with the “gold standard” of manual tracing in elderly subjects. Furthermore, automated volumetry yielded differences in hippocampal volumes between cognitively normal, CI, and AD subjects that were comparable to those obtained by manual tracing, implying that the two methods have similar power to differentiate these groups. Because automated warping is less time-consuming and less susceptible to rater bias than manual tracing, this method should therefore find widespread use in MRI studies of hippocampal changes in aging and dementia. Although reliability was slightly better with manual tracing than with automated warping, the difference is likely of little consequence in cross-sectional studies, where differences between subjects dominate variability in hippocampal volumes. In longitudinal studies, however, where variability between subjects is eliminated, lower reliability might compromise sensitivity in detecting volume change over time. Less effective compensation for image noise is one possible explanation for the lower reliability obtained by automated warping as compared to manual tracing. Another possibility is that limited MRI resolution induced additional variability for automated warping, because the automated method was segmenting hippocampal structures about 10% smaller than the manual method, which included the alveus and fimbria. Currently, we are exploring methods to improve the reliability of automated warping for longitudinal studies, e.g., by utilizing the hippocampal boundaries obtained in the first study as an initial estimate of the hippocampus in subsequent studies.
The reproducibility (as opposed to reliability) of automated volumetry in this study was similar to that found in previous studies on younger subjects. Haller et al (4) reported 3.1% reproducibility with automated warping in young adults and schizophrenia patients, compared to 4.1% and 5.2% reproducibility within and between raters in this study. In another study with more subjects, Haller et al (5) found better reproducibility for automated warping than for manual tracing. However, the evaluations by Haller et al did not involve test-retest MRI data, and therefore effects on automated warping from instrumental and physiological noise were ignored. In contrast, our analysis, which included test-retest MRI data, revealed a slightly lower reliability for automated warping than for manual tracing, presumably because mathematical algorithms are less effective in compensating for image noise than judgment by an expert rater.
The reliability of automated volumetry also depends on the selection of an appropriate hippocampal reference template. Volume measurements in older subjects that are based on a hippocampal template from a young subject could be inaccurate and might introduce a bias for age. On the other hand, the use of different templates, each appropriate for a specific group of patients, may complicate the interpretation of findings. Alternatively, a probabilistic template that represents the most common hippocampal structure of a large group of subjects could be used instead (26). Finally, several other approaches for nonlinear image registration have been proposed in recent years to measure brain atrophy, including intensity-based algorithms such as statistical parametric mapping (SPM) (27) and voxel compression (28,29), and model-based algorithms such as elastic shape deformations (30), parametric mesh deformations (31), and tensor mapping (32). Results of hippocampal atrophy in the CI and AD subjects of this study were similar to those from other MRI studies that used manual tracing (1,11,14,33). However, hippocampal atrophy is not specific to AD and has also been found in other neurodegenerative diseases, such as ischemic vascular dementia in the absence of AD pathology (34). Therefore, measurements of hippocampal volume may be of limited use for a differential diagnosis of AD. Compared to manual tracing, an advantage of high-dimensional warping is that shape information of the hippocampus is provided in addition to volume, which might add specificity (22). It remains to be determined whether the alveus and fimbria, which were excluded in this automated method, are important in detecting hippocampal atrophy in normal aging and AD. The alveus and fimbria are white matter regions, covering mainly the intraventricular surface of the cornu ammonis. The alveus contains axons of the hippocampal and subicular neurons, which are the main efferent pathways of the hippocampus. 
These axons enter the fimbria and then extend through a polysynaptic pathway to the cortex (35). Histopathological studies showed that neuronal loss in the cornu ammonis, dentate gyrus, and subiculum is more extensive in AD than in normal aging (36,37). Axonal degeneration secondary to neuronal loss in the cornu ammonis, dentate gyrus, and subiculum could result in atrophy of the alveus and fimbria, which might be missed by automated warping. In this study, however, automated warping yielded similar differences in hippocampal volumes between normal aging, CI, and AD as compared with manual tracing, implying that the subjects differed little with respect to atrophy in the alveus and fimbria.
In summary, automated measurements of hippocampal volume can be performed with high reliability in normal aging and dementia. Furthermore, automated measurements yield hippocampal volume losses in CI and AD that are similar to results obtained by manual tracing. Automated warping should be advantageous in MRI studies of hippocampal atrophy in normal aging and dementia.
Acknowledgments
We are grateful to Dr. John Csernansky, Washington University, St. Louis, and to Dr. Sarang Joshi, University of North Carolina, Chapel Hill, for providing an image template and for helpful discussions.
Footnotes
Contract grant sponsor: NIH; Contract grant numbers: AG10897; AG12435.
Full text links
Read article at publisher's site: https://doi.org/10.1002/jmri.10163
Read article for free, from open access legal sources, via Unpaywall: https://onlinelibrary.wiley.com/doi/pdfdirect/10.1002/jmri.10163