Measuring Friedreich ataxia: Interrater reliability of a neurologic rating scale

Abstract
Measuring the severity of neurologic dysfunction in patients with inherited ataxias, including Friedreich ataxia (FA), is difficult because of the variable rate of progression, the variable age at onset, and the variety of neural systems that may be affected. The authors discuss the problems related to rating scales in the ataxias, report a neurologic rating scale for FA, and demonstrate acceptable interrater reliability of the instrument.
There have been few attempts at validating neurologic assessment methods to measure disease severity in ataxias, with the exception of an interrater reliability study of the International Cooperative Ataxia Rating Scale (ICARS).1 There are many problems in the development of such scales. Ataxias are rare disorders with variable onset age and progression rate. The phenotype can be multifaceted, with many non-cerebellar signs in some patients, even within the same genotype. Clinical signs evolve in a way that can be difficult to measure: hyperreflexia may give way to areflexia, or nystagmus may disappear with progression of disease. Thus “better” performance on these measures may not mean less severe disease. Some quantifiable abnormalities, such as oculomotor abnormalities, may have minimal impact on patient performance. We describe a rating scale for Friedreich ataxia (FA) and report on the interrater reliability of the instrument.
Methods.
Development of the scale
The FA scale was developed from a longer scale devised by the Cooperative Ataxia Group (CAG) to evaluate functional and neurologic deficits, with greater weight given to gait and stance; it also included elements of other scales.2,3 (Appendix E-1 on the Neurology Web site is a copy of the scale.) The neurologic examination assessed bulbar, upper limb, lower limb, peripheral nerve, and upright stability/gait functions (maximum scores of 11, 36, 16, 26, and 28; maximum deficit = 117). We incorporated a functional staging (0 to 6, assessing overall mobility) and an activities of daily living (ADL) assessment (0 to 36). Timed activities included the number of repetitions of “PATA” in a 10-second interval, recorded with a tape recorder, to examine speech, and the time taken to place and retrieve pegs in a 9-hole pegboard to test hand coordination. Gait was assessed using a timed walk of 50 feet (25 feet one way, turn, and walk back) with or without a device. The scale was not compared to any existing scale.
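The arithmetic behind the maximum deficit can be checked with a short Python sketch. The subscale maxima are those stated above; the dictionary keys, the function name, and the range check are illustrative assumptions and are not taken from the published scoring sheet.

```python
# Subscale maxima as stated in the text; key names are illustrative only.
SUBSCALE_MAX = {
    "bulbar": 11,
    "upper_limb": 36,
    "lower_limb": 16,
    "peripheral_nerve": 26,
    "upright_stability_gait": 28,
}

# Maximum deficit on the neurologic examination: the sum of the subscale maxima.
MAX_DEFICIT = sum(SUBSCALE_MAX.values())


def total_exam_score(scores):
    """Sum the subscale scores after checking each lies within 0..maximum."""
    total = 0
    for name, maximum in SUBSCALE_MAX.items():
        value = scores[name]
        if not 0 <= value <= maximum:
            raise ValueError(f"{name} score {value} is outside 0..{maximum}")
        total += value
    return total


print(MAX_DEFICIT)
```

The functional stage (0 to 6) and the ADL score (0 to 36) are reported separately in the text and are therefore left out of this total.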
Study population.
Fourteen patients (8 males, 6 females, age 14 to 47 years) with molecularly proven FA were studied. Two patients (brothers) were heterozygous for the FA GAA expansion with a point mutation in the unexpanded allele. Twelve patients were ambulatory with or without walking aids; two were chair-bound.
Evaluators.
Seven neurologists, experienced in the evaluation of patients with ataxia, participated in the study. Each patient was examined by each of the seven examiners in a specific sequence within a day. The examination was preceded by a discussion of the scale methodology employed and the scoring sheet had explicit instructions on the procedures.
Statistics.
To assess agreement between raters, we calculated the intraclass correlation coefficients (ICC) for each variable under study. An analysis of variance (ANOVA) approach using SPSS statistical software (SPSS, V.12.0, Chicago, IL) was used. Average scores on each subscale for each subject were used in repeated-measures ANOVA models to estimate the variance components necessary to compute the ICC for the various subscales.4 The subscales were analyzed separately. An ICC of over 0.75 was considered evidence of good interrater reliability. Additionally, ANOVA models provide a test of whether some raters score consistently higher or lower than others: a significant ANOVA p value indicates that some raters consistently tend to score high or low. Post hoc paired comparisons with Bonferroni correction were performed to identify raters who appeared to score higher or lower than others. We also performed factor analysis for each of the subscales to investigate whether the raters agreed, assuming a single-factor model based on the rater scores for each subject. If all raters are measuring a similar construct, the factor loadings for each rater should be relatively high. We assumed that a loading of ≥0.9 indicates that the rater is measuring a common trait with all other raters.
To determine which sections of the examination were measuring common constructs, we used exploratory factor analysis with Promax rotation using the nine measures (stage of disease, bulbar score, upper limb score, lower limb score, peripheral nerve score, upright stability score, PATA rates, and right and left pegboard scores). This approach examines correlations among all the variables at once. In addition, correlations of each subscale with ADL and mobility were determined as a reflection of the overall ability of the subscale to measure the disease process.
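As an illustration of the ANOVA approach to the ICC, the following Python sketch derives the coefficient from the mean squares of a subjects-by-raters layout. The text does not state which ICC form was computed, so the two-way random-effects, single-rater form (often written ICC(2,1)) is an assumption here, as are the function names and the 0.05 family-wise alpha in the Bonferroni helper. The ratings are made-up numbers, not study data.

```python
def icc_two_way_random(ratings):
    """ICC(2,1) from a table with one row per subject and one column per rater."""
    n = len(ratings)      # number of subjects
    k = len(ratings[0])   # number of raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares: subjects, raters, and residual error.
    ss_subjects = k * sum((m - grand) ** 2 for m in row_means)
    ss_raters = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_subjects - ss_raters

    ms_subjects = ss_subjects / (n - 1)
    ms_raters = ss_raters / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_subjects - ms_error) / (
        ms_subjects + (k - 1) * ms_error + k * (ms_raters - ms_error) / n
    )


def pairwise_comparisons(k):
    """Number of distinct rater pairs among k raters."""
    return k * (k - 1) // 2


def bonferroni_alpha(k, alpha=0.05):
    """Per-comparison threshold; alpha = 0.05 is an assumed family-wise level."""
    return alpha / pairwise_comparisons(k)


# Hypothetical scores: the second rater scores every subject one point higher.
biased = [[1, 2], [2, 3], [3, 4]]
print(icc_two_way_random(biased))
print(pairwise_comparisons(7), bonferroni_alpha(7))
```

In this form a rater who scores consistently higher lowers the coefficient even when the ranking of subjects is identical, which is why a separate test for rater effects is informative alongside the ICC.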
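The correlation of a subscale with ADL or mobility can be sketched as follows. The text does not say which correlation coefficient was used, so Pearson's r is an assumption, and the two score lists are hypothetical values chosen only to make the example run.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Cross-product and squared deviations about the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical per-patient values (not study data).
upper_limb = [4, 9, 12, 18, 25, 30]
adl = [3, 8, 10, 17, 22, 31]
print(round(pearson_r(upper_limb, adl), 3))
```

A rank-based coefficient such as Spearman's would be a reasonable alternative for ordinal scores; the choice made here is purely illustrative.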
Results.
The mean scores and SDs for each subset of the scale for the 14 patients are shown in table E-1 (on the Neurology Web site at www.neurology.org). The patients ranged from those with nearly normal ambulation to those who were severely affected.
Excellent interrater reliability, as shown by high ICC, was demonstrated for the following scores: disease stage, ADL, upper limb coordination, lower limb coordination, upright stability/gait, total neurologic examination, PATA rate, and pegboard and gait times (table E-2). Bulbar and peripheral nerve scores were less reliable among the raters. Significant ANOVA p values, indicating some rater bias, were found for disease stage, ADL, upper limb coordination, peripheral nerve scores, and total neurologic examination. Factor loadings were high for most of the measures, with more scatter for bulbar, peripheral nerve, and lower limb scores.
The Promax rotation analysis (table E-3) indicated that just two factors explained 87% of the variance of the data. Stage of disease, lower limb scores, peripheral nerve scores, and upright stability scores were highly correlated, implying that they measure a common underlying construct; similarly, bulbar, upper limb, PATA, and pegboard scores were significantly correlated with one another. Additionally, the total neurologic examination score and most of the subscores correlated substantially with ADL and mobility measures (table).
Table Correlation of subscores with ADLs and disability stage (n = 14)
To address the issue of fatigue and learning, the 1st and the 7th examinations for the patients tested over the course of the day were compared. The examiners involved were different for each occasion. Paired t tests of these comparisons revealed no significant trends over the course of the day. The time to complete one examination typically did not exceed 30 minutes.
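The paired comparison of first and seventh examinations can be sketched in Python as below. The function name and the scores are hypothetical, and only the t statistic and its degrees of freedom are computed; a p value would additionally require the t distribution, which is left out here.

```python
import math


def paired_t(first, seventh):
    """Paired t statistic and degrees of freedom for per-patient score pairs."""
    if len(first) != len(seventh) or len(first) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    diffs = [b - a for a, b in zip(first, seventh)]
    n = len(diffs)
    mean_d = sum(diffs) / n
    # Sample variance of the within-patient differences.
    var_d = sum((d - mean_d) ** 2 for d in diffs) / (n - 1)
    t = mean_d / math.sqrt(var_d / n)
    return t, n - 1


# Hypothetical total scores for four patients at the 1st and 7th examination.
first_exam = [10, 12, 14, 16]
seventh_exam = [11, 12, 15, 18]
t_stat, df = paired_t(first_exam, seventh_exam)
print(t_stat, df)
```

A statistic close to zero would be consistent with no systematic drift from fatigue or learning across the day.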
Discussion.
Previous scales used for quantifying ataxic diseases were designed for small-scale studies and relied on subjective estimates of clinical abnormalities.5,6,7 An early scale to undergo reliability testing was not specifically designed for ataxia, and most of the ataxia measures did not meet reliability criteria.8 Recently, ICARS has been shown to have good interrater reliability.1
The scale reported here included neurologic signs reflecting specific neural substrates that are affected in FA. The scale was found to have high ICC values for most of its components, indicating acceptable interrater reliability. We have not yet tested its intrarater reliability. The scale also incorporated a functional stage to denote overall mobility and an ADL score to examine functional status, neither of which was addressed in previous ataxia instruments. Components of the neurologic examination correlated well with ADL and mobility scores, supporting the construct validity of the scale as a measure of FA, though further studies are needed to demonstrate validity and sensitivity to change.
Timed activities tested had high interrater reliability and less rater bias, as indicated by nonsignificant ANOVA p values, and may be especially useful to detect mild to moderate changes in disease severity. The correlations of individual timed measures with ADLs and mobility were lower, suggesting that a composite of both timed and examination measures may be needed to accurately reflect the global dysfunction in FA.
The correlations found using Promax rotation analysis appear to support the global nature of the scale, with one set of measures reflecting lower body dysfunction and the other, upper body dysfunction. Because the factor analysis of the subscales indicated possibly two underlying constructs at work, we did not compute Cronbach’s α.
Larger scale studies should provide evidence for the validity of the scale.
Footnotes
Additional material related to this article can be found on the Neurology Web site. Go to www.neurology.org and scroll down the Table of Contents for the April 12 issue to find the title link for this article.
Supported by Friedreich’s Ataxia Research Alliance (FARA).
Received May 3, 2004. Accepted in final form December 7, 2004.
References