Mild cognitive impairment can be detected by multiple assessments in a single day

Abstract
Background: Reliable detection of mild cognitive impairment (MCI), in many cases preceding AD, is important in determining the efficacy of emerging treatments. The operational definition of MCI is currently imprecise and would be improved by objective criteria. Inherent in the transition from MCI to AD is cognitive decline, which can be detected using multiple assessments over several years.
Objective: To determine whether multiple assessments on the same day could also differentiate well-studied subjects with very mild MCI from normal control subjects.
Methods: This study utilized a novel 15- to 18-minute computerized cognitive battery designed for frequent serial use, administered four times within 3 hours. Subjects were participants in a longitudinal healthy aging study (20 with MCI, 40 control subjects matched for age, gender, and education).
Results: The MCI group showed significantly attenuated learning performance with repetition on accuracy and reaction time tasks. Discriminant function analysis correctly classified 95% of control subjects and 80% of those with MCI.
Conclusions: Multiple assessment with standardized, repeatable cognitive measures is a promising method for reliably differentiating patients with early MCI in a single test session and deserves further study for refining patient selection in trials of therapeutic agents for MCI.
Mild cognitive impairment (MCI) develops prior to AD with a rate of conversion of approximately 10 to 15% per year.1-3 MCI is a promising therapeutic target for putative disease-modifying agents with the aim of preventing transition to AD and subsequent personal and societal morbidity. However, detection of MCI is currently imprecise, being based on behavioral features derived from cognitive testing4 and with no currently accepted standard identification method.3,5,6 Memory complaints and deficits on neuropsychological testing are considered important, but their relative contributions to the syndrome of MCI are debated.3,5,7 In addition, the characteristics of memory deficit regarded as necessary and sufficient for a diagnosis of MCI are left unspecified and could be as varied as difficulty making associations,8 increased rate of forgetting, impaired paragraph recall,4 or memory disorder secondary to attentional dysfunction, as is common in mood disorder.9 Without objective criteria, it is hardly surprising that divergent results are reported for rates of transition of MCI to AD10-13 or whether nonamnestic subtypes should be included.14
A recent serial assessment approach emphasized the early identification of very mild impairment while performance was still within the normal range.7 This study used multiple assessments every 6 months over a 2-year period in healthy community-dwelling volunteers. Such multiple assessment approaches have distinct advantages over single assessments for improving the reliability of identification of individual patients with MCI. In particular, in the same aging study, 25% of the cohort met criteria for MCI on any single assessment, but only 13% met the criteria on three or more consecutive sessions.15 Thus, up to 50% of those classified as MCI based on a single assessment were classified incorrectly, presumably for non-neurologic reasons such as test anxiety, fatigue, or lower motivation levels. Only repeated testing of the same individuals was able to reduce the high false-positive classification rates. These findings raise a previously untested hypothesis: whether multiple assessments performed a few minutes apart could also reliably classify individuals with MCI. If so, this might provide a rapid screening technique for fine-tuning patient selection for clinical diagnosis and therapeutic trials.
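The "up to 50%" figure can be checked from the two percentages quoted above. This is only a worked sketch, and it assumes that the 13% who met criteria on three or more consecutive sessions are a subset of the 25% who met criteria on a single assessment:

```latex
\[
\frac{25\% - 13\%}{25\%} = \frac{12}{25} = 48\% \approx 50\%
\]
```

Under that assumption, roughly half of the single-assessment classifications were not sustained across sessions.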
One major limitation with the rapid reassessment of cognition in older people is that the standard neuropsychological tests cannot be used because of their lack of alternative forms, poor reliability, and lengthy administration time. Tests should be brief and sufficiently motivating to minimize fatigue. Alternative forms must be matched for equivalence and have high test–retest reliability to ensure that differences in results are due to changed performance rather than varying task difficulty. We therefore selected a computerized cognitive test battery designed and validated for rapid and repeated use, with ideal psychometric properties for this experiment.16,17 We sought to determine whether the previously identified MCI patients classified on the basis of the prior 2-year study could be identified by multiple assessments conducted in a single 3-hour testing session.
Subjects and methods.
Subjects.
Subjects were drawn from a longitudinal healthy aging study initially recruited in 1996. The recruitment, selection, and measurement of biochemical, neurologic, cognitive, and psychiatric functions in this study have been described previously.7,18-20 The MCI group consisted of 20 healthy Caucasian older people whose cognitive performance had been measured at 6-month intervals since 1996 and who had met clinical criteria for MCI5 on their three most recent visits to ensure inclusion only of those with persistent impairment. There were no significant differences between the MCI and control groups in demographic, premorbid intelligence estimates, Mini-Mental State Examination, anxiety, or mood measures (see table E-1 on the Neurology Web site; go to www.neurology.org and scroll down the Table of Contents to find the title link for this article). The specific deficits of the MCI group were in episodic memory tasks (Consortium to Establish a Registry for Alzheimer’s Disease word list delayed recall score below 6) with subtle impairment not below normative ranges in executive tasks. A group of 40 healthy Caucasian older people from the same longitudinal study without decline were selected as control subjects. This group was selected so that two control subjects matched each MCI patient on age, education, and gender. The inclusion criterion for these subjects was that at no time since 1996 had they met the clinical criteria for MCI.
Materials.
All participants completed a battery of cognitive tests (CogState) on four occasions within a 3-hour period. Tests within this battery were chosen to sample from a range of cognitive domains, including psychomotor speed, attention, working memory, and episodic learning and memory. The test has been described previously.16,17,21 In brief, for each test, the stimuli consisted of playing cards that assisted in reducing test anxiety owing to a game-like context. The playing cards were also chosen to lessen test dependence on specific languages due to their universal familiarity, and pilot data confirmed that most participants were familiar with playing card stimuli, could differentiate the cards without additional training, and perceived their presentation to represent a game. The test battery required approximately 15 to 20 minutes to complete depending upon the speed of the individual subject’s responses. Stimuli were randomly chosen for each response trial. Responses were indicated by pressing one of three keys on the computer keyboard (D, K, or spacebar). The D key was designated as the button to indicate “false,” and the K key was designated to indicate “true.” A single button press (spacebar) was required for simple reaction time (SRT) tasks. A binary decision (false or true) was required for the matching, one-back working memory, and associative learning tasks. Each task consisted of two parts: 1) an initial interactive demonstration with simulated trials and feedback about response accuracy using a visual representation of their computer’s keyboard and 2) similar trials with minimal or no visual clues as to required responses. Hence, no textual instructions were presented and the subjects were required to abstract the principles of each task using their own interpretation of the task requirements. The dependent variables collected for each task included the participants’ response times and accuracy (i.e., the percentage of correct responses or hit rate).
Incorrect responses, failures to respond, or anticipatory responses faster than 100 milliseconds elicited a buzzer, and the data associated with these trials were omitted from the analysis. Correct responses received no auditory feedback. All tasks in the battery were always presented in the same order of increasing difficulty. This work was part of a prospective longitudinal study utilizing all of the above tasks with wide sampling of cognitive abilities. After data collection, three of the nine tasks with high test–retest reliability (0.64 to 0.77), as determined in the whole group, were used in this analysis. For all tasks, the trial design included an initial fixed interval of 1,500 milliseconds followed by presentation of the stimulus. Stimulus duration was a maximum of 3,500 milliseconds or abbreviated by subject response. If a response was detected, feedback was provided followed by a random interval from 0 to 1,000 milliseconds before commencing the next trial. The individual tasks were as follows:
- SRT. A single card was presented face-down in the center of the computer screen. Participants were required to press the spacebar whenever the card was turned face-up. The task continued until at least 15 correct responses were recorded or total task time was 60 seconds. This was repeated in the middle and at the end of the battery. Scores are derived from all three SRT tasks.
- Color-matching reaction time (CoRT). Two cards were presented simultaneously in the center of the computer screen. The participant had to indicate whether the color of the two cards was the same (right key) or different (left key). These keys were reversed in meaning for left-handed responders to ensure the dominant hand responded with the canonical response. The task terminated if 17 correct responses were detected or total task time was 60 seconds.
- One-back working memory task (WM). Participants were required to decide whether a new face-up card was the same as (right key) or different from (left key) the prior card presented. The response key designations were reversed for left-handed subjects as for the CoRT. The task terminated if 17 correct responses were recorded or total task time was 60 seconds.
Procedure.
All subjects were tested in a quiet computer laboratory. They were given instructions on how to perform the tasks but were not given any practice before they commenced the first assessment. Two test sessions (T1, T2) were completed 5 minutes apart prior to a lunch break (1 hour), and the final two test sessions (T3, T4) were then completed separated by 5 minutes. Hence, total testing time was about 2.5 to 3 hours with sufficient breaks to ensure maintenance of motivation and minimize fatigue. All subjects gave informed consent prior to the trial, and the test protocol was approved by the institutional ethics committee.
Data analysis.
For each subject on each trial, the number of correct responses on each task was calculated and expressed as a percentage of the total trials. This yielded a hit rate measure. For each subject in test session T1 to T4, the mean reaction time (RT) for correct responses on each task was calculated. Initial normalization of RT data was performed using a logarithmic base 10 transformation. Statistical evaluation used the Statistical Package for the Social Sciences (SPSS Inc., Chicago, IL). For each task, hit rate and log 10 RT data were submitted to a 2 (group) × 4 (trial) repeated measures analysis of variance (ANOVA) implemented using the multivariate ANOVA procedure in SPSS to prevent violation of the homogeneity of covariance assumption. The η2 statistics were computed as measures of effect size. To protect against Type I error, the results of analyses were considered significant only if the probability was <0.01. Performance measures that yielded a significant difference between groups were then submitted to a discriminant function analysis to determine whether a rule for the classification of individual participants could be developed on the basis of a combination of the performance measures sensitive to MCI.
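The per-session summary measures described above (hit rate and log 10 mean RT for correct responses) might be computed along the following lines. This is a minimal sketch, not the authors' software: the function name, the data layout, and the choices to count anticipatory responses as non-hits and to take the logarithm of the mean RT (rather than the mean of log RTs) are all assumptions made for illustration.

```python
import math

# Responses faster than 100 ms were treated as anticipatory in the study.
ANTICIPATION_MS = 100.0


def summarize_session(trials):
    """Summarize one task in one session.

    trials: list of (correct, rt_ms) tuples; rt_ms is None when the
    subject failed to respond. (Hypothetical layout, for illustration.)

    Returns (hit_rate_percent, log10_mean_rt). log10_mean_rt is None
    when no valid correct response exists.
    """
    if not trials:
        raise ValueError("at least one trial is required")

    # Keep RTs only for correct, non-anticipatory responses.
    valid_rts = [
        rt for correct, rt in trials
        if correct and rt is not None and rt >= ANTICIPATION_MS
    ]

    # Assumption: hit rate is valid correct responses over all trials.
    hit_rate = 100.0 * len(valid_rts) / len(trials)

    if not valid_rts:
        return hit_rate, None

    mean_rt = sum(valid_rts) / len(valid_rts)
    return hit_rate, math.log10(mean_rt)


# Made-up example data: two valid hits, one error, one anticipation,
# and one failure to respond.
example = [(True, 400.0), (True, 600.0), (False, 550.0),
           (True, 80.0), (False, None)]
hit_rate, log_rt = summarize_session(example)
print(hit_rate, round(log_rt, 3))
```

The example data are invented; only the 100-millisecond anticipation threshold and the log 10 transformation come from the text.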
Results.
A summary of the effect of repeated assessment on the speed and accuracy of performance on the four cognitive trial sessions (T1 to T4) is shown in table E-2 on the Neurology Web site (go to www.neurology.org). For each performance measure, the group main effect (F[1,35]) and the group × trial interaction (F[3,33]) from the analyses are also summarized. Significant group effects were found for the speed of responses on the SRT task (η2 = 0.25) and the CoRT task (η2 = 0.25). Significant group × trial interactions were observed on the WM task for both the speed (η2 = 0.15) and the accuracy (η2 = 0.21) of responses. To reduce the speed and accuracy data for the one-back memory task to a single performance index, a memory accuracy/speed ratio, or throughput, was created for each trial. For each subject on each trial, the hit rate was divided by the log 10 mean RT. As the first analysis indicated that control subjects’ performance on this task became more accurate (i.e., hit rate increased) and faster (i.e., RT decreased) with repetition, the ratio of these two variables should maximize differences between groups on any trial. The group × trial ANOVA was repeated. As expected, this yielded a significant interaction between group and trial (figure), and the magnitude of the effect size for the interaction increased to 0.32. Post hoc t-tests indicated that significant differences between groups occurred on trial 3 (η2 = 0.21) and trial 4 (η2 = 0.51).
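The accuracy/speed ratio described above can be illustrated with a short sketch. The function name and the sample values are hypothetical, and the units (hit rate as a percentage, RT in milliseconds) are an assumption, since the text does not state the scale used for the ratio.

```python
import math


def throughput(hit_rate_percent, mean_rt_ms):
    """Accuracy/speed ratio: hit rate divided by log10 of the mean RT.

    Units are assumed (percent and milliseconds); the source does not
    specify them.
    """
    if mean_rt_ms <= 1.0:
        # log10 would be zero or negative; not meaningful for real RTs.
        raise ValueError("mean RT must exceed 1 ms")
    return hit_rate_percent / math.log10(mean_rt_ms)


# Hypothetical subject who becomes both more accurate and faster.
first_session = throughput(80.0, 1000.0)
last_session = throughput(90.0, 800.0)
print(round(first_session, 2), round(last_session, 2))
```

Because a rising hit rate increases the numerator while a falling RT shrinks the denominator, both kinds of practice-related improvement push the ratio in the same direction, which is why it can separate groups that differ in learning across sessions.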
Figure. Mean accuracy/speed ratio (±SE) across trials for control and mild cognitive impairment (MCI) groups. Ratios are different at the third (T3) and fourth (T4) trials (p < 0.01).
To determine the usefulness of these measures in identifying MCI, a discriminant function analysis was conducted on the variables that separated the control and MCI groups with effect sizes of ≥0.25. These included RT (log 10 transformed) for the SRT averaged over trials, RT (log 10 transformed) for the color-matching RT task (CoRT) averaged over the four trials, and the accuracy/speed ratio for the one-back memory task on T4. The discriminant function analysis yielded a single significant eigenvalue (1.81, χ2[3] = 39.4, p < 0.0001) with a discriminant function:
where SRT-RTlog10 is the log 10-transformed mean SRT, CoRT-RTlog10 is the log 10-transformed mean CoRT, and ACC-SPD-WMT4 is the accuracy/speed ratio for the WM task at the fourth trial. This function correctly classified 95% of the control subjects and 80% of the MCI subjects.
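The classification percentages can be restated as sensitivity, specificity, and overall accuracy. The sketch below is illustrative only: the counts of 16 MCI and 38 control subjects are not reported directly but are derived here from 80% of 20 and 95% of 40, and the function name is hypothetical.

```python
def classification_summary(n_mci, n_control, mci_correct, control_correct):
    """Return (sensitivity, specificity, overall accuracy) as fractions."""
    sensitivity = mci_correct / n_mci              # MCI correctly detected
    specificity = control_correct / n_control      # controls correctly rejected
    accuracy = (mci_correct + control_correct) / (n_mci + n_control)
    return sensitivity, specificity, accuracy


# Counts derived from the reported percentages (an inference, not
# reported data): 80% of 20 = 16 and 95% of 40 = 38.
sens, spec, acc = classification_summary(20, 40, 16, 38)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} accuracy={acc:.2f}")
```

On these derived counts, 54 of the 60 subjects would be classified correctly overall.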
Discussion.
When assessed on four occasions in 3 hours using a brief, repeatable, computerized battery, older people with mild MCI and control subjects displayed a different pattern of serial test performance. Whereas the normal older people demonstrated practice effects over the course of the four tests, the MCI patients tested here displayed only minor effects of practice with a performance plateau. The groups were well matched with respect to age, sex, and education and were all independent community-dwelling volunteers. Differentiation of these groups was not possible on the first assessment, when both groups were unfamiliar with the tasks, but became significant at the third and final test. Although the study MCI and normal groups were relatively small in number, they were well characterized with respect to their cognitive status owing to longitudinal recruitment and evaluations over 5 years. Optimization of these measures, using a composite of several speed- and accuracy-derived indexes, allowed high sensitivity for detection of MCI subjects (80%) and specificity for rejection of normal older subjects (95%).
This study is the first report of the use of a multiple assessment approach on a single day to successfully differentiate MCI patients from normal subjects. The results support the hypothesis that multiple assessments using appropriate tools can leverage a lack of normal practice effects in impaired populations.22,23 In addition, given appropriate instruments, these impairments can be detected much faster than has been previously possible.7 The subjects used in this study were highly selected on the basis of years of longitudinal evaluation, ensuring both that abnormal performances were persistent in the MCI group and that control subjects were proven not to be declining. Cross-sectional selection of patients with presumed MCI based on single assessments is less reliable, raising the possibility that trials using current inclusion criteria may be less than optimally powered to detect significant treatment efficacy. Screening with multiple baselines may improve MCI therapeutic trial patient selection by reducing false-positive rates and improving sample homogeneity. Furthermore, it is possible that attenuation or normalization of impaired serial performance curves in MCI patients may be a useful rapid endpoint for screening of putative therapeutic agents. A potential role also exists for multiple assessments to be used in the general population to reduce the uncertainty of diagnosis where time and specialist assessment resources are scarce or unavailable.
A limitation of the current study is the lack of pathologic confirmation of the disease process underlying the MCI patients’ deficits. However, there is suggestive evidence that the pattern of declarative memory impairment is consistent with hippocampal involvement8 and that this pattern of memory and milder executive impairment is associated with transition to AD.14 In addition, these patients will need to be followed to determine whether their impairment is also progressive. If so, this would strengthen the contention that they are representative of the earliest form of AD detected while these people are independent and asymptomatic. Finally, careful replication of this result using the same measures in comparably well-studied subjects would be desirable.
The cognitive instrument used in the current study was specifically designed for frequent serial assessment, but it remains possible that standard tests could also be used in this manner. For example, alternative forms of a verbal memory test repeated within minutes may cause disproportionately greater interference effects for patients with MCI, which may provide similar diagnostic specificity. This has not been investigated, to our knowledge, though practical problems with standard tests might be expected to confound any discernible effects through exaggerated random intertest fluctuations. It may also be possible to reduce the total time for the four complete tests to much less than the 2 to 3 hours used in this study. This timing was necessary to accommodate all tasks in the battery for prospective longitudinal evaluations, but many were not used in this discrimination. Hence, in theory, limiting a diagnostic test to the SRT, CoRT, and WM tasks alone might show similar differentiation in naive subjects in only about 5 minutes per session and 20 to 30 minutes overall. On the other hand, practical issues including subject fatigue might interfere with this effect if the tasks that are relatively repetitive were presented more frequently. An alternative means of reducing total task time might be design improvements to more rapidly increase task familiarity. Such manipulations are worthy of further investigation. Furthermore, the tasks found to be sensitive to impairment here were not specific for episodic memory deficits, suggesting that test–retest reliability may be more important than task-specific impairment. Our findings suggest that key features of the current tasks that reduce these sources of variation included multiple sampling of performance allowing valid statistical analyses, equivalence of alternative forms promoting test–retest reliability, and brief duration, maintaining interest and minimizing fatigue. 
Further studies of the use of different stimuli and tasks loaded for specific impairments presented multiple times may allow these questions to be answered.
Acknowledgments
Supported by CogState Ltd., Carlton, and an Australian Federal Government START grant.
Acknowledgment
The authors thank the Mental Health Research Institute of Victoria for providing facilities and CogState Ltd. for supporting this work.
Footnotes
- All authors are currently employees of and hold equity in CogState Ltd.
- Additional material related to this article can be found on the Neurology Web site. Go to www.neurology.org and scroll down the Table of Contents for the October 8 issue to find the link for this article.
- Received March 11, 2002.
- Accepted June 27, 2002.
References
2. Tierney MC, Szalai JP, Snow WG, et al. Prediction of probable Alzheimer’s disease in memory-impaired patients: a prospective longitudinal study. Neurology. 1996; 46: 661–665.
5. Petersen RC, Stevens JC, Ganguli M, Tangalos EG, Cummings JL, DeKosky ST. Practice parameter: early detection of dementia: mild cognitive impairment (an evidence-based review): report of the Quality Standards Subcommittee of the American Academy of Neurology. Neurology. 2001; 56: 1133–1142.
7. Collie A, Maruff P, Shafiq-Antonacci R, et al. Memory decline in healthy older people: implications for identifying mild cognitive impairment. Neurology. 2001; 56: 1533–1538.
10. Ritchie K, Artero S, Touchon J. Classification criteria for mild cognitive impairment: a population-based validation study. Neurology. 2001; 56: 37–42.
11. Petersen RC, Smith GE, Waring SC, Ivnik RJ, Kokmen E, Tangalos EG. Aging, memory, and mild cognitive impairment. Int Psychogeriatr. 1997; 9: 65–69.
15. Collie A, Maruff P, Currie J. Behavioural characterization of mild cognitive impairment. J Clin Exp Neuropsychol (in press).
16. Collie A, Darby DG, Maruff P. Computerised cognitive assessment of athletes with sports related head injury. Br J Sports Med. 2001; 35: 297–302.
17. Makdissi M, Collie A, Maruff P, et al. Computerized cognitive assessment of concussed Australian rules footballers. Br J Sports Med. 2001; 35: 354–360.
18. Shafiq-Antonacci R, Maruff P, Whyte S, Tyler P, Dudgeon P, Currie J. The effects of age and mood on saccadic function in older individuals. J Gerontol B Psychol Sci Soc Sci. 1999; 54: 361–368.
21. Westermann R, Darby DG, Maruff P, Collie A. Cognitive assessment of pilots: why and how? Aust Defence Forces J Health. 2001; 2: 29–36.