Measuring the impact of MS on walking ability
The 12-Item MS Walking Scale (MSWS-12)
Abstract
Objective: To develop a patient-based measure of walking ability in MS.
Methods: Twelve items describing the impact of MS on walking (12-Item MS Walking Scale [MSWS-12]) were generated from 30 patient interviews, expert opinion, and literature review. Preliminary psychometric evaluation (data quality, scaling assumptions, acceptability, reliability, validity) was undertaken in the data generated by 602 people from the MS Society membership database. Further psychometric evaluation (including comprehensive validity assessment, responsiveness, and relative efficiency) was conducted in two hospital-based samples: people with primary progressive MS (PPMS; n = 78) and people with relapses admitted for IV steroid treatment (n = 54).
Results: In all samples, missing data were low (≤3.8%), item test–retest reproducibility was high (≥0.78), scaling assumptions were satisfied, and reliability was high (≥0.94). Correlations between the MSWS-12 and other scales were consistent with a priori hypotheses. The MSWS-12 (relative efficiency = 1.0) was more responsive than the Functional Assessment of Multiple Sclerosis mobility scale (0.72), the 36-Item Short Form Health Survey physical functioning scale (0.33), the Expanded Disability Status Scale (0.03), the 25-ft Timed Walk Test (0.44), and Guy’s Neurologic Disability Scale lower limb disability item (0.10).
Conclusions: The MSWS-12 satisfies standard criteria as a reliable and valid patient-based measure of the impact of MS on walking. In these samples, the MSWS-12 was more responsive than other walking-based scales.
The Office of Population Censuses and Surveys survey of disability in the United Kingdom identified poor mobility as the most common single disability in the community.1 A survey of stroke patients identified ability to walk as the most important activity of daily living.2 In MS, two prevalence studies demonstrated that 75% of people experienced mobility problems,3,4 and focus groups of people with mild and moderate MS, conducted earlier this year on behalf of the Chronic Care Collaborative Center at the Royal College of Physicians, identified gait, mobility, and balance as key physical problems.
Rigorous measures of walking ability are needed for clinical trials and clinical practice to ensure that results are reliable and valid. In MS, a number of generic and disease-specific walking measures have been used. These include Timed Walk Tests (TWT)5 and the Ambulation Index.6 Other measures of physical function are biased toward walking. For example, the Expanded Disability Status Scale (EDSS)7 and Impairment Scale of the European Database for MS (EDMUS)8 assess walking ability in the ranges 3.5 to 7.5 and 2 to 7, respectively. The 10-item physical functioning dimension of the Medical Outcomes Study 36-Item Short Form Health Survey (SF-36-PF)9 has 6 items evaluating walking. The seven-item mobility scale of the Functional Assessment of Multiple Sclerosis (FAMS)10 has one walking item. Guy’s Neurologic Disability Scale (GNDS)11 has one item that addresses lower limb disability. Only three of these scales (SF-36, FAMS, GNDS) incorporate patients’ own evaluations.
All these scales have limitations as measures of walking ability. Some concentrate on specific aspects. For example, although TWT may be simple,12 they reflect only the ability to walk a given distance, the stride length, and walking speed. The EDSS, EDMUS Impairment Scale, and walking items of the SF-36-PF focus on walking distance, whereas the lower limb disability item of the GNDS focuses on the use of walking aids. The walking item of the FAMS mobility scale asks about the extent of trouble walking. The EDSS13 and SF-36-PF14 have limited measurement properties in MS.
As walking is a complex motor activity consisting of a number of components, the most reliable and valid method of measuring it is to combine multiple items that address these different components.15 We sought to develop a multi-item rating scale of walking that combines patients’ perspectives with psychometric methods and is suitable for epidemiologic studies, clinical trials, and routine data collection for audit purposes.
Methods.
Overview.
The walking scale described in this article was developed during a study to construct patient-based outcome measures for MS. This study generated a pool of 141 items concerning the health impact of MS from 30 patient interviews, expert opinion, and literature review. A total of 12 items described the impact of MS on walking; these form the 12-Item MS Walking Scale (MSWS-12; Appendix). Items are summed to generate a total score and transformed to a scale with a range of 0 to 100. High scores indicate greater impact on walking. For respondents with missing data but where at least 50% of the items (n ≥ 6) in the scale had been completed, a respondent-specific mean score was imputed from the completed items so that total scores could be computed.9 The analysis of item responses did not include imputed data.
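The scoring rule described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' scoring program: the function name `msws12_score` is hypothetical, and the item response range (1 to 5) is an assumption, because the text does not state the response options.

```python
from typing import Optional, Sequence

N_ITEMS = 12
ITEM_MIN, ITEM_MAX = 1, 5  # ASSUMED response range; not stated in the text


def msws12_score(responses: Sequence[Optional[int]]) -> Optional[float]:
    """Return the 0-100 transformed total, or None if fewer than 6 items are answered."""
    if len(responses) != N_ITEMS:
        raise ValueError("expected 12 item responses")
    answered = [r for r in responses if r is not None]
    if len(answered) < N_ITEMS // 2:
        return None  # fewer than 50% of items completed: no score is computed
    # Respondent-specific mean imputation: each missing item takes the mean
    # of that respondent's completed items.
    person_mean = sum(answered) / len(answered)
    total = sum(answered) + person_mean * (N_ITEMS - len(answered))
    # Linear transformation of the summed score to the 0-100 range.
    lowest, highest = N_ITEMS * ITEM_MIN, N_ITEMS * ITEM_MAX
    return 100.0 * (total - lowest) / (highest - lowest)
```

Under these assumptions, a respondent who answers only six items is scored as if the six missing items equalled the mean of the completed ones, and a respondent who answers five or fewer receives no score.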
Evaluation was undertaken in two stages. The initial psychometric evaluation of the MSWS-12 was undertaken in a postal survey of members of the MS Society of Great Britain and Northern Ireland. The aim of this stage was to determine whether the 12 items constituted a summed rating scale. Subsequently, more comprehensive psychometric evaluations were conducted in two hospital-based samples: a postal survey of people from the primary progressive MS (PPMS) database of the National Hospital for Neurology and Neurosurgery in London (NHNN) and consecutive admissions to NHNN for IV steroid treatment of MS relapses. The Ethics Committee of the NHNN approved all the studies.
Preliminary psychometric evaluation.
All 141 items were administered by postal survey to 1,530 people, randomly selected and geographically stratified, from the membership database of the MS Society. A subsample of 400 people was randomly selected to study item and scale test–retest reproducibility. These people completed two questionnaires 10 days apart. Full details of the survey methods have been reported.4
MSWS-12 data generated by this survey were examined for five psychometric properties: data quality, scaling assumptions, acceptability, reliability, and validity. These analyses were undertaken on Time 1 data. Full accounts of the statistical methods and criteria used can be found elsewhere.4,13,14 Data quality16 was determined by calculating percentage missing data for items, percentage computable scale scores, and item test–retest reproducibility (intraclass correlation coefficient [ICC]).17 The extent to which the MSWS-12 satisfied scaling assumptions as a summed rating scale was determined by examining item response option frequency distributions, the magnitude and equivalence of item total correlations,18 and the extent to which factor analysis (cross-validated in random half-samples) supported the unidimensionality of the items.
Acceptability was determined by examining MSWS-12 score distributions, floor and ceiling effects, and skewness statistics. Two types of reliability were examined: internal consistency (Cronbach’s α coefficients)19 and scale test–retest reproducibility (ICC). Convergent and discriminant construct validity20 of the MSWS-12 was determined by examining correlations with the physical and psychological scales of the Multiple Sclerosis Impact Scale (MSIS-29)4 and other sociodemographic variables. Correlations with the MSIS-29 physical scale were predicted to be high (approximately r = 0.70) and substantially greater than correlations with the MSIS-29 psychological scale, age, and duration of MS.
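Two of the statistics named above, Cronbach’s α and floor and ceiling effects, are simple enough to illustrate directly. The following is a minimal sketch assuming complete item data (no missing responses) and scores already transformed to 0 to 100; the function names are hypothetical and the code is not the analysis software used in the study.

```python
from statistics import pvariance
from typing import Sequence, Tuple


def cronbach_alpha(rows: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha for complete data: rows are respondents, columns are items."""
    k = len(rows[0])
    # Variance of each item across respondents.
    item_vars = [pvariance([row[j] for row in rows]) for j in range(k)]
    # Variance of the summed (total) score across respondents.
    total_var = pvariance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


def floor_ceiling(scores: Sequence[float], lo: float = 0.0,
                  hi: float = 100.0) -> Tuple[float, float]:
    """Percentage of respondents at the minimum (floor) and maximum (ceiling) score."""
    n = len(scores)
    floor = 100.0 * sum(s == lo for s in scores) / n
    ceiling = 100.0 * sum(s == hi for s in scores) / n
    return floor, ceiling
```

Because α depends only on the ratio of the summed item variances to the total-score variance, using population or sample variances gives the same value as long as the same estimator is used for both.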
Further psychometric evaluation.
More comprehensive psychometric evaluations were undertaken in two hospital-based samples: a postal survey of people with PPMS and consecutive receivers of IV steroids for relapses. Steroid patients completed measures immediately before starting treatment and 6 weeks later.
All patients completed a booklet of rating scales that included the MSWS-12, MSIS-29, two self-report measures of mobility (SF-36-PF, FAMS mobility scale), two rating scales of psychological well-being (SF-36 mental health scale [SF-36-MH]; FAMS emotional well-being scale [FAMS-EWB]), and demographic questions. In addition, steroid patients were rated on the EDSS by a neurologist, had a timed 25-ft walk test, and completed the GNDS.
Time 1 data were examined for data quality, scaling assumptions, acceptability, reliability, and convergent and discriminant validity. Convergent and discriminant validity20 was determined by examining correlations between the MSWS-12 and the other rating scales and demographic variables. We predicted that correlations between the MSWS-12 and MSIS-29 physical scale, SF-36-PF, FAMS mobility, and EDSS would be high (r > 0.70) and substantially exceed correlations with the TWT, GNDS lower limb disability item, SF-36-MH, FAMS-EWB, and demographic variables.
Responsiveness was determined from Time 1 and 2 data by calculating effect sizes21 (mean change score divided by SD of admission scores) and standardized response means22 (mean change score divided by SD of change scores). The relative responsiveness of the walking-based measures (MSWS-12, SF-36-PF, FAMS mobility, EDSS, GNDS lower limb disability item, 25-ft TWT) was determined by computing their relative efficiency.23 This estimates the extent to which one scale is more or less efficient at detecting change over time relative to another scale. Relative efficiency and the significance of change scores24 are typically computed from paired-samples t-tests as pairwise squared t-values (t² for scale 1 / t² for scale 2). However, results generated by this method confound “responsiveness” with the effects of nonnormality such that scales with more normally distributed outcomes are favored. Therefore, we examined the significance of change scores using Wilcoxon’s matched-pairs signed-ranks test and computed relative efficiency as pairwise squared z-values (z² for scale 1 / z² for scale 2) generated by this test. The z² value for the MSWS-12 was chosen as the denominator so that values for the other scales estimate their relative efficiency as a percentage of the MSWS-12.
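The responsiveness statistics defined above can be sketched as follows. This is an illustration under stated assumptions rather than the authors' analysis code: the function names are hypothetical, and the z-value is taken from the usual large-sample normal approximation to the Wilcoxon signed-ranks test (zero differences dropped, tie-corrected variance, no continuity correction), which may differ in detail from the statistical package used in the study.

```python
from math import sqrt
from statistics import mean, stdev
from typing import Sequence


def effect_size(before: Sequence[float], after: Sequence[float]) -> float:
    """Mean change score divided by the SD of baseline (admission) scores."""
    change = [a - b for a, b in zip(after, before)]
    return mean(change) / stdev(before)


def standardized_response_mean(before: Sequence[float], after: Sequence[float]) -> float:
    """Mean change score divided by the SD of the change scores."""
    change = [a - b for a, b in zip(after, before)]
    return mean(change) / stdev(change)


def wilcoxon_z(before: Sequence[float], after: Sequence[float]) -> float:
    """z-value from the normal approximation to the Wilcoxon signed-ranks test."""
    diffs = [a - b for a, b in zip(after, before) if a != b]  # drop zero differences
    n = len(diffs)
    ordered = sorted(diffs, key=abs)
    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and abs(ordered[j]) == abs(ordered[i]):
            j += 1
        avg_rank = (i + 1 + j) / 2  # average of ranks i+1 .. j for tied values
        for k in range(i, j):
            ranks[k] = avg_rank
        t = j - i
        tie_term += t**3 - t
        i = j
    w_plus = sum(r for r, d in zip(ranks, ordered) if d > 0)
    expected = n * (n + 1) / 4
    variance = n * (n + 1) * (2 * n + 1) / 24 - tie_term / 48
    return (w_plus - expected) / sqrt(variance)


def relative_efficiency(z_scale: float, z_reference: float) -> float:
    """Squared z of a scale divided by squared z of the reference scale."""
    return z_scale**2 / z_reference**2
```

With the MSWS-12 as the reference scale, `relative_efficiency(z_other, z_msws12)` returns a value below 1 whenever the other scale detects change less efficiently. The sketch does not guard against the degenerate case in which every difference is zero.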
Results.
Patients.
In the postal survey of the preliminary psychometric evaluation, 766 completed questionnaires were returned, and 602 people (79%) indicated that they could walk and completed the MSWS-12. In the further psychometric evaluations, questionnaires were sent to 119 people with PPMS; 104 (87%) returned completed questionnaires, and 78 of these (75%) indicated they could walk and completed the MSWS-12. In the steroid sample, a total of 70 people were recruited; 65 of these (93%) were able to walk on admission, and 54 people (83%) completed both Time 1 and Time 2 questionnaires. Table 1 shows the sample characteristics. Differences between samples are predictable on clinical grounds. For example, it is expected that patients in the steroid sample will be younger, more likely to be employed, more likely to walk unaided, and to have a higher probability of relapsing–remitting disease than the other samples.
Table 1 Characteristics of samples
Data quality.
Across the three samples, missing data for items ranged from 0 to 3.8%, and total scores could be computed for at least 98% of patients (table 2). Item test–retest reproducibility, studied only in the community sample, ranged from 0.75 to 0.85.
Table 2 Data quality, scaling assumptions, acceptability, and reliability of 12-Item MS Walking Scale
Scaling assumptions.
In all samples, frequency distributions for item response scales were relatively symmetric and not unduly skewed (see table 2). Item total scale correlations exceeded the criterion of 0.40. Principal-components analysis of the 12 items consistently extracted one component with an eigenvalue exceeding unity. This accounted for 66% (steroids), 74% (PPMS), and 80% (community) of the variance. Item factor loadings (item-to-component correlations) ranged from 0.62 to 0.93. Almost identical results were obtained using principal-axis factoring. These findings suggest that it is legitimate to report an MSWS-12 total score.
Acceptability and reliability.
In all three samples, scale scores spanned the entire scale range and were not notably skewed, mean scores were above the scale midpoint, and floor and ceiling effects were less than the recommended maximum of 20% (see table 2).25 Internal consistency and test–retest reproducibility estimates exceeded recommended criteria. MSWS-12 scores at Time 1 and Time 2 for the test–retest reproducibility subsample were not significantly different.
Validity.
As predicted, the MSWS-12 was highly correlated with the MSIS-29 physical scale, SF-36-PF, FAMS mobility scale, and EDSS and had low to moderate correlations with the TWT, MSIS-29 psychological scale, GNDS legs item, SF-36-MH, FAMS-EWB, age, sex, and years since diagnosis (table 3).
Table 3 Convergent and discriminant construct validity of MSWS-12: Pearson’s product–moment correlations with other scales and variables at Time 1
Responsiveness and relative efficiency.
Change scores for most of the six scales demonstrated significant improvement following IV steroid treatment (table 4). Effect sizes and standardized response means suggested that the MSWS-12 was more responsive than the other scales in this sample. Relative efficiency statistics suggested that the other scales were estimated to be 12% (GNDS legs item) to 76% (FAMS mobility) as efficient at detecting change as the MSWS-12 in this sample.
Table 4 Relative responsiveness of MSWS-12, SF-36-PF, FAMS mobility, EDSS, and TWT in subsample of patients who received steroids for relapses
Discussion.
The aim of this study was to develop a patient-based measure of an important clinical problem for a disease with considerable public health impact. Twelve items were generated from patient interviews, expert opinion, and literature review. Results from a two-stage evaluation provided supportive evidence that the MSWS-12 was acceptable, reliable, valid, and responsive. In addition, the self-report MSWS-12 perhaps offers more flexible and simple measurement in clinical practice and clinical trials than observer-rated scales. It has the potential to be used in cross-sectional studies evaluating the impact of MS on walking, in longitudinal studies to monitor change in walking over time, and most importantly, in clinical trials to evaluate therapeutic effectiveness from the patients’ perspective.
Correlations between the MSWS-12 and other scales provide useful information. The high correlations between the MSWS-12 and the physical scales of the MSIS-29 and SF-36 emphasize the strong relationship between walking and physical function. Nevertheless, at least 45% of the variance in MSWS-12 scores remained unexplained by these two scales, implying that walking and physical function are related but not adequate surrogate markers of each other. The moderate correlations of the MSWS-12 with the TWT and GNDS lower limb disability item suggest that time to walk a specified distance and the use of a walking aid are limited indicators of patients’ perceptions of their walking limitations. Two reasons may contribute to this finding. First, the TWT and GNDS lower limb item address very specific aspects of walking, which is a complex motor task. Second, patients and clinicians have been demonstrated to differ in their perceptions of the impact of illness.26,27 These two reasons, and the suggestion from this study that different methods of evaluating walking ability produce different results, highlight the importance of clearly defining the outcomes to be measured and the perspectives to be included.
Another important factor to consider is the extent to which nonambulatory symptoms, such as mood and emotional disturbances, influence self-ratings of ambulatory impairment. Correlations between the MSWS-12 and three scales measuring mood and emotional disturbances (MSIS-29 psychological scale, SF-36-MH scale, FAMS-EWB scale) ranged from 0.09 to −0.46. Squaring these correlations suggests only 1 to 21% shared variance between these scales and implies that the influence of mood and emotional disturbance on ambulatory impairment is not strong.
The psychometric properties of the MSWS-12 were similar in three different MS samples. Although this supports the generalizability of the results, further studies of the MSWS-12 are essential as the psychometric properties of rating scales cannot be established in a single study.28 Critical evaluations of the MSWS-12 in different settings are needed to define its strengths and weaknesses and its role in clinical practice and research and are the basis of evidence-based selection of outcome measures.29 More comparisons of the MSWS-12 with other walking measures will determine their relative performance (especially relative responsiveness, which could differ across settings), advantages and disadvantages, and how they might complement each other. In particular, the most relevant setting to study relative responsiveness may be worsening over time, as would be observed in patients participating in a trial of course-modifying therapy. Further evaluations should also consider using new psychometric methods (Rasch Item Analysis30 and Item Response Theory31). New psychometric methods take a different approach to examining a group of items. They examine the extent to which the data generated by a rating scale satisfy a stringent mathematical definition of measurement. When the data fit the model (mathematical definition), a measurement instrument is defined.
The results provide another demonstration that the responsiveness of rating scales can influence the interpretation of clinical trials. It is our experience that this fact is not fully appreciated by clinicians. The EDSS has been consistently used to measure disability outcomes in clinical trials of MS. Its limited responsiveness13 is supported by this study, which shows that, using the EDSS, IV steroid treatment was not associated with a significant change in disability. In the same sample, the TWT, a component of the recently developed MS Functional Composite (MSFC),32 a proposed successor to the EDSS, demonstrated a significant change. Yet neither scale appears to be as responsive as the MSWS-12, which, in this sample, had a larger effect size associated with IV steroid treatment. These findings suggest it is possible that the EDSS and MSFC might fail to detect small but clinically significant changes in walking.
As the responsiveness study was unblinded, the evaluators may have expected a change in walking ability to occur with steroids. This raises the question of whether the better responsiveness of the MSWS-12 in this study may simply reflect the fact that patients are more prone to the power of suggestion than clinical evaluators. Two facts may be against this suggestion. First, patients were asked to complete multiple questionnaires 6 weeks apart without reference to their responses at Time 1. Second, the three patient-based measures (MSWS-12, SF-36, FAMS mobility) demonstrated variable responsiveness, suggesting patients made different value judgments. Nevertheless, further evaluations of the responsiveness of the MSWS-12 are essential as responsiveness to steroid treatment in the absence of a blinded control may not reflect its responsiveness to a more modest treatment in a double-blind, appropriately controlled trial.
Appendix Multiple Sclerosis Walking Scale (MSWS-12)
• These questions ask about limitations to your walking due to MS during the past 2 weeks.
• For each statement, please circle the one number that best describes your degree of limitation.
• Please answer all questions even if some seem rather similar to others, or seem irrelevant to you.
• If you cannot walk at all, please tick this box. □
Acknowledgments
Supported by a grant from the NHS Health Technology Assessment Programme.
Acknowledgment
The authors thank the patients who participated in this study, the MS Society of Great Britain and Northern Ireland, Mr. Peter Cardy and Dr. Iain Smith (Advisory Committee), the staff at the NHNN for their assistance during scale development, Dr. G.T. Ingle for help with the PPMS database, Ms. Irene Richardson for undertaking the patient interviews and for research assistance during the first field test, and one of the reviewers for educational comments.
Footnotes
- The views and opinions expressed herein do not necessarily reflect those of the NHS Executive.
- Received January 24, 2002.
- Accepted September 12, 2002.
References
- ↵
Martin J, Meltzer M, Elliot D. OPCS surveys of disability in Great Britain. Report 1: the prevalence of disability among adults. London: Her Majesty’s Stationery Office, 1988.
- ↵
Chiou IL, Burnett CN. Values of activities of daily living: a survey of stroke patients and their home therapists. Arch Phys Med Rehabil. 1985; 65: 901–906.
- ↵
Swingler R, Compston DAS. The morbidity of multiple sclerosis. Q J Med. 1992; 83: 325–337.
- ↵
Hobart JC, Lamping DL, Fitzpatrick R, Riazi A, Thompson AJ. The Multiple Sclerosis Impact Scale (MSIS-29): a new patient-based outcome measure. Brain. 2001; 124: 962–973.
- ↵
Wade DT. Measurement in neurological rehabilitation. Oxford: Oxford University Press, 1992.
- ↵
- ↵
Kurtzke JF. Rating neurological impairment in multiple sclerosis: an Expanded Disability Status Scale (EDSS). Neurology. 1983; 33: 1444–1452.
- ↵
Confavreux C, Compston DAS, Hommes OR, McDonald WI, Thompson AJ. EDMUS, a European database for multiple sclerosis. J Neurol Neurosurg Psychiatry. 1992; 55: 671–676.
- ↵
Ware JE Jr, Snow KK, Kosinski M, Gandek B. SF-36 Health Survey manual and interpretation guide. Boston: Nimrod Press, 1993.
- ↵
Cella DF, Dineen K, Arnason B, et al. Validation of the Functional Assessment of Multiple Sclerosis quality of life instrument. Neurology. 1996; 47: 129–139.
- ↵
Sharrack B, Hughes RAC. The Guy’s Neurological Disability Scale (GNDS): a new disability measure for multiple sclerosis. Mult Scler. 1999; 5: 223–233.
- ↵
- ↵
Hobart JC, Freeman JA, Thompson AJ. Kurtzke scales revisited: the application of psychometric methods to clinical intuition. Brain. 2000; 123: 1027–1040.
- ↵
Hobart JC, Freeman JA, Lamping DL, Fitzpatrick R, Thompson AJ. The SF-36 in multiple sclerosis (MS): why basic assumptions must be tested. J Neurol Neurosurg Psychiatry. 2001; 71: 363–370.
- ↵
Nunnally JC, Bernstein IH. Psychometric theory. 3rd ed. New York: McGraw-Hill, 1994.
- ↵
- ↵
- ↵
Likert RA. A technique for the measurement of attitudes. Arch Psychol. 1932; 140: 5–55.
- ↵
- ↵
- ↵
Kazis LE, Anderson JJ, Meenan RF. Effect sizes for interpreting changes in health status. Med Care. 1989; 27: S178–S189.
- ↵
- ↵
- ↵
- ↵
- ↵
- ↵
Rothwell PM, McDowell Z, Wong CK, Dorman PJ. Doctors and patients don’t agree: cross sectional study of patients’ and doctors’ perceptions and assessments of disability in multiple sclerosis. Br Med J. 1997; 314: 1580–1583.
- ↵
- ↵
Hobart JC, Lamping DL, Freeman JA, et al. Evidence-based measurement: which disability scale for neurological rehabilitation? Neurology. 2001; 57: 639–644.
- ↵
Rasch G. Probabilistic models for some intelligence and attainment tests. Chicago: University of Chicago Press, 1960.
- ↵
Lord FM, Novick MR. Statistical theories of mental test scores. Reading, MA: Addison-Wesley, 1968.
- ↵