J Clin Pharmacol
Journal of Clinical Pharmacology, 2004; 44:368-372
© 2004 the American College of Clinical Pharmacology


ANALGESIA

Assessment of Visual Analog versus Categorical Scale for Measurement of Osteoarthritis Pain

Mordechai Averbuch, MD and Meyer Katzper, PhD

From the Division of Analgesic, Anti-inflammatory, and Ophthalmic Drug Products, Center for Drug Evaluation and Research, Food and Drug Administration (FDA), Rockville, Maryland.

Address for reprints: Dr. Meyer Katzper, Division of Analgesic, Anti-inflammatory and Ophthalmic Drug Products, HFD-550, Center for Drug Evaluation and Research, Food and Drug Administration, 5600 Fisher's Lane, Rockville, MD 20857.


    ABSTRACT
There is disagreement in the literature regarding which scales to use in pain measurement. The disagreement has usually been between nonverbal scales, such as visual analog scales, and verbal ones, which usually provide only a limited number of response categories. A 12-week, randomized, double-blind, naproxen sodium (500 mg bid) and placebo-controlled trial using the hip osteoarthritis (OA) flare-up pain model, in which pain was measured on both visual analog and categorical scales simultaneously, was analyzed. The authors found a good correlation (> 0.995) between the time-series averages of the unconstrained visual analog scale (VAS) and a 5-point categorical scale pain measurement in the osteoarthritis pain model in both the active and placebo treatment arms. However, for individuals, there is a wide range of VAS responses for each categorical score, with overlaps between categories. The visual analog and categorical scales appear equally effective in determining average osteoarthritis pain. However, a combined metric scale for pain measurement that provides the subject with multiple cues may improve communication and concordance between scales for individual pain determination.

Key Words: Visual analog, categorical, scale, pain


One of the most frequently used pain rating scales is the visual analog scale (VAS).1 The VAS is a unidimensional scale with several appealing characteristics. It is easy to use, requires no verbal or reading skills, and is sufficiently versatile to be employed in a variety of settings.2-4 All pain rating scales depend on the subjective response of the patient, as pain is a subjective experience. In clinical trials of analgesia, there is no simple, ethical way to quantify pain objectively. As a result, validity tests of pain are not done in clinical research. It is therefore necessary to determine the extent to which a given scale captures the subjective response correctly, within the limitations of the method's existing errors. There are studies regarding the precision and sensitivity of different pain rating scales in other clinical conditions.1-7 This study examines the VAS versus a 5-point categorical (CAT) pain scale with regard to their degree of coincidence and describes changes in symptoms during treatment in a population of osteoarthritis patients. The CAT scale is often referred to as a Likert-type scale.

To explore the relation between pain measurement with VAS and CAT scales in response to analgesia, we analyzed the results of an osteoarthritis clinical trial included in a new drug application submitted to the Food and Drug Administration (FDA). This trial had the standard placebo- and active-controlled, parallel-group design and measured pain on both VAS and CAT scales simultaneously.


    MATERIALS AND METHODS
The study was a 12-week, randomized, double-blind naproxen sodium (500 mg bid) and placebo-controlled trial using the hip osteoarthritis (OA) flare-up pain model. The study design met the FDA standards and was approved by the responsible institutional review board. Eligible subjects were hip OA patients, males or nonpregnant females, older than 35 years of age. Patients were enrolled in the study if they had OA of the hip (hip pain plus at least two of the following: Westergren erythrocyte sedimentation rate < 20 mm/h, radiographic femoral or acetabular osteophytes, and radiographic joint space narrowing). All subjects selected were on therapy with a nonsteroidal anti-inflammatory drug. Subjects must have been experiencing moderate to severe pain at 48 hours or more following withdrawal of therapy (patient's assessment of arthritis pain at least 40 mm on VAS and patient's and physician's global assessment of arthritis = poor or very poor). Subjects were excluded if they used any analgesic within 48 hours prior to taking the study medication or if they had a recent history of chronic analgesic use. Subjects with obesity also were excluded from the study.

Efficacy end points were pain scores measured by a 5-point categorical scale (1 = none, 2 = mild, 3 = moderate, 4 = severe, 5 = extreme) and pain scores measured by an unconstrained VAS on a 0- to 100-mm scale. These scores were recorded just prior to drug administration and at weeks 2, 6, and 12 postdose.


    RESULTS
Ninety-eight patients with osteoarthritis of the hip were included in the naproxen 500-mg bid treatment arm at screening. Seventy-three were females and 25 were males with an average age (± SD) of 63 ± 12 years (range: 39-87). The placebo arm had 108 patients; 75 were females and 33 were males. The average age (± SD) was 62 ± 12 years (range: 33-87). Pain observations were taken at screening, at baseline, and at weeks 2, 6, and 12.

Figure 1 shows the relationship between pain scores, as measured at screening by CAT and VAS scales, for the OA hip pain average. For the naproxen 500-mg arm, categorical values are on the left axis, and VAS values are on the right axis. This graph shows the coincidence of the average measure by both scales. The data demonstrate the flare-up effect of this pain model: pain levels increased sharply following discontinuation of analgesic medications at screening. The time period covers screening, flare-up at baseline, and the analgesic effect of naproxen over time at weeks 2 through 12. These values show a good correlation (> 0.995) between the time-series averages of the unconstrained visual analog scale and the 5-point categorical scale pain responses. Similarly, for the OA hip pain average of the placebo arm, there is a remarkable coincidence between CAT and VAS over the entire time range. Here, too, the correlation is greater than 0.995. This demonstrates the average equivalence of CAT and VAS measures with both the placebo and the active drug.



 
Figure 1. Relation between osteoarthritis pain scores as measured by categorical (left axis) and visual analog (right axis) scales for the naproxen 500-mg bid treatment arm.

 

A comparison of screening CAT and VAS measurements for individual responses in the naproxen treatment arm shows a wide range of VAS responses for each CAT score (Figure 2). The ranges of response overlap, and some of the responses are even contradictory. Patients in the placebo treatment arm showed similar results. One might expect that, as they are tested multiple times, people would eventually give consistent responses on both scales. This is not the case, as similar results were found at the other observation times as well.



 
Figure 2. A comparison of screening categorical (x-axis) and visual analog (y-axis) scale measurements in the naproxen treatment arm for individual responses.

 

A comparison of CAT and VAS for individuals shows all varieties of convergent and divergent responses.

A linear regression analysis of CAT versus VAS measurements gives a highly significant result, with P < 0.0001 for the F test (Figure 3). However, R² is only 0.503905 because of the spread of values. This is in contrast to the correlation of the average values.



 
Figure 3. Relationship and least squares linear fit between pain scores as measured at screening on categorical and visual analog scales.
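
The contrast between a modest individual-level R² and a near-perfect correlation of averages can be illustrated with a small sketch. The numbers below are made up for illustration and are not trial data; the helper name `linear_fit` and the three hypothetical visits are assumptions. Each made-up patient scatters 10 mm above or below an underlying line, and the scatter cancels once responses are averaged per visit.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares fit of ys on xs.

    Returns (slope, intercept, r_squared).
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("both variables must vary")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


# Hypothetical (CAT, VAS mm) pairs per visit; illustrative only, not trial data.
visits = {
    "visit_1": [(2, 20), (2, 40), (3, 40), (3, 60)],
    "visit_2": [(4, 60), (4, 80), (5, 80), (5, 100)],
    "visit_3": [(3, 40), (3, 60), (4, 60), (4, 80)],
}

# Individual-level fit: pool every patient response.
all_pairs = [pair for pairs in visits.values() for pair in pairs]
ind_slope, ind_intercept, ind_r2 = linear_fit(
    [c for c, _ in all_pairs], [v for _, v in all_pairs]
)

# Average-level fit: one (mean CAT, mean VAS) point per visit.
mean_cat = [sum(c for c, _ in pairs) / len(pairs) for pairs in visits.values()]
mean_vas = [sum(v for _, v in pairs) / len(pairs) for pairs in visits.values()]
avg_slope, avg_intercept, avg_r2 = linear_fit(mean_cat, mean_vas)

print(f"individual: slope={ind_slope:.1f}, R^2={ind_r2:.3f}")
print(f"averages:   slope={avg_slope:.1f}, R^2={avg_r2:.3f}")
```

In this toy data set both fits recover the same slope, but only the fit on the averages explains all of the variance, which mirrors the qualitative pattern reported above without reproducing its values.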

 

If the relationship of VAS to CAT pain intensity is linear and the categories in CAT are nonoverlapping, a theoretical linear fit for the expected relationship can be calculated. The data points are then constrained to lie within the two lines bounding the central line, as shown in Figure 4.



 
Figure 4. Theoretical linear fit for the expected relationship if the pain intensity relationship of the visual analog to the categorical scale is linear and categories in the categorical scale are nonoverlapping.

 

At each time (screening, baseline, and weeks 2, 6, and 12), if the relationship between VAS and CAT is linear, the slope should equal 20. The slope should remain constant independent of distribution of pain in the population. Figure 5 verifies that the slope is approximately 20 for all observation times, confirming our theoretical construct.



 
Figure 5. Verification of the linear relationship between visual analog and categorical scales.
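
The value of 20 follows from the geometry of the two scales. As a sketch, assume (this is not spelled out in the text) that the five CAT scores divide the 0- to 100-mm VAS line into equal, nonoverlapping 20-mm bands, with $c$ denoting the CAT score and $v$ the VAS reading in millimeters:

```latex
\[
  \text{slope} = \frac{100\ \text{mm}}{5\ \text{categories}} = 20\ \text{mm per category},
\]
\[
  20\,(c - 1) \;\le\; v \;\le\; 20\,c, \qquad c \in \{1, 2, 3, 4, 5\},
\]
\[
  v_{\text{central}}(c) = 20\,c - 10 .
\]
```

Under these assumptions, the two bounding lines are $v = 20(c - 1)$ and $v = 20c$, so a score of 3 (moderate), for example, corresponds to 40 to 60 mm. The intercepts are illustrative; only the slope of 20 is stated in the text.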

 

We propose an instrument intended to eliminate the observed individual inconsistencies. There are logical arguments for introducing such an instrument. We suggest a combined metric scale for pain measurement that may better keep individual responses within the lines bounding the central line, as shown in Figure 4. Our suggested scale has not been tested, so we leave further consideration for the discussion section.

To ascertain how severe the deficiency in current scales is, we made use of the screening data associated with the entire osteoarthritis trial. There were a total of 419 screened subjects with CAT and VAS values. Using our calculated range of VAS values for each CAT value, we counted the number of subjects meeting the constraint. Only 45% (189) were within the expected bounds. For the 55% out of bounds, 27.5% were above and 27.5% below expected bounds. These data and this analysis demonstrate the need for an improved way to ascertain pain intensity.
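
The counting step described above can be sketched as follows. The band limits are an assumption (five equal 20-mm bands on the 0- to 100-mm line, so that a CAT score of c maps to 20(c - 1) to 20c mm); the text does not list the calculated ranges. The function names and the example pairs are hypothetical and are not trial data.

```python
from collections import Counter

BAND_MM = 20  # assumed width of each category band on the 0-100 mm VAS


def expected_vas_range(cat):
    """Return the assumed (low, high) VAS bounds in mm for a CAT score of 1-5."""
    if cat not in (1, 2, 3, 4, 5):
        raise ValueError("CAT score must be an integer from 1 to 5")
    return BAND_MM * (cat - 1), BAND_MM * cat


def classify(cat, vas_mm):
    """Label a (CAT, VAS) pair as 'below', 'within', or 'above' the assumed bounds."""
    low, high = expected_vas_range(cat)
    if vas_mm < low:
        return "below"
    if vas_mm > high:
        return "above"
    return "within"


def tally(pairs):
    """Count how many (CAT, VAS) pairs fall below, within, or above bounds."""
    return Counter(classify(cat, vas) for cat, vas in pairs)


# Made-up (CAT, VAS mm) pairs for illustration only; not trial data.
example_pairs = [(3, 50), (3, 72), (4, 55), (2, 30), (5, 61)]
counts = tally(example_pairs)
for label in ("below", "within", "above"):
    share = 100 * counts[label] / len(example_pairs)
    print(f"{label}: {counts[label]} ({share:.0f}%)")
```

Applied to real screening records, the same tally would yield the within, above, and below proportions; the exact figures depend on the band limits actually used, which are assumed here.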


    DISCUSSION
The VAS entered the realm of pain research in the 1980s, demonstrating a greater sensitivity to increments or decrements in pain than other instruments.5 In contrast with verbal rating scales, which require a patient to choose a categorical pain descriptor (e.g., mild, moderate, severe), the VAS provides a smooth continuum of choices. If a patient's pain decreases slightly over time, a verbal scale may lack the sensitivity to detect it because a qualitative shift in degree of pain would be required to trigger a change in choice of descriptor. In contrast, the VAS allows patients to record small changes in pain severity.

This ability to record small changes in the pain experienced means that the patient may record a change in VAS when a change in category is not warranted. On the downside, this can be problematic when using the VAS to compare the effectiveness of different analgesic strategies. With large samples, small differences in mean VAS score can be declared "statistically significant," even though they may be of little clinical significance to the patient.8

Self-rating scales may be influenced by factors in the subject's life that are irrelevant to the essence of the question asked. However, in determining pain, there is no alternative to asking the subject. Another problem is that the scales used here assume pain to be a unidimensional experience that can only vary in intensity. There are many instruments that try to capture the multidimensional aspects of pain.9-13 Despite this problem, rating scales are commonly employed procedures because they are simple, economic, and easy for subjects to comprehend.14 Our data deal with pain in this simple manner.

There is controversy in the literature regarding which scales are most sensitive. The conflict has usually been between nonverbal scales,15-17 such as VAS, which can provide different pain reports between two extremes of pain, and verbal ones,18,19 which usually provide only four to five response categories. Because verbal scales have few steps, they are usually considered to be less sensitive than VAS. However, although scales with more steps are supposed to be more sensitive, this is not always true. In one study,7 all scales used (the VAS; a behavior rating scale with points described by sentences that use no pain, painful, very strong pain, and totally handicapped; a numerical scale with 11 points that run from 0 [no pain] to 10 [worst imaginable pain]; and a verbal 5-point scale described by sentences that use no pain, mild pain, and very severe pain) were shown to be sensitive, showing an improvement of 30% to 50% in symptoms after a 6-month follow-up in patients with chronic temporomandibular pain. Other investigators also found VAS and CAT measurement to be of a similar sensitivity.20-23 Our focus has been on pain status rather than change of pain status. To that extent, we do not deal with sensitivity to change per se. Our motivation is that improved measurement of pain status is the necessary precursor to improved measurements of pain change. Sensitivity to change by itself does not yield improved discrimination between treatments. A more sensitive measure yields a stronger response to all treatments.

There is confirmation for the linear relationship between the VAS and CAT pain scales.24 As noted, this has no implications for the actual functional form of the response to pain.

A recently published report has also demonstrated good correlation between visual and categorical pain scales in osteoarthritis.25 It concludes that the results are similar enough for both scales that the CAT may be preferred due to its ease of administration and interpretation.

We suggest that as a result of improved communication, individual pain determination may be improved. Our recommendation is use of an anchored VAS line with labeled sectors. Figure 6 shows the proposed instrument. The proposed multiple cues are designed to reduce error. The instrument constrains CAT and VAS scores to be consistent.

The response to pain as a function of the intensity of the pain stimulus has not been established. Whatever its form may be, it affects the response to both the categorical and VAS scales. For this reason, we find that the regression between the scales for the population average is quite good. The variability we see between the scales for the responding individuals cannot be attributed to different individual sensitivities, for in such a case, responses to both scales must differ. A component of the difference that we can correct is that of nonuniform understanding. If a group of individuals were asked, for example, to mark the position for moderately severe pain on a VAS scale, we would expect to get a wide divergence of answers. This aspect of variation can be remedied by using the combined scale. Another aspect of variation is inherent in the nature of the scales. When one feels a bit more (or less) pain, it is a simple matter to move one's mark slightly on the VAS scale. On the CAT scale, one is confronted with the decision as to whether the difference one feels warrants a change in category. This difficulty is also avoided in the combined scale. We may improve the sensitivity of pain determination by having the VAS score as a modifier of the chosen CAT. This is somewhat like the use of a vernier caliper in the measurement of length. With explanatory instruction to the subject, we can get a refined measurement. A uniformity of interpretation is placed on the line. Then, within any category, the subject can respond with his or her nuanced feeling of more or less pain.

Random variability will be constrained, and the variability found can be mainly attributed to real differences in pain perception. Our claim is that by using our proposed scale, we will achieve a greater uniformity of interpretation combined with the flexibility to identify small changes. The question may be raised as to why we desire the ability to identify small changes. They are not needed for regulatory purposes. They are not of obvious use for medical practice. When we seek to compare efficacy of medications competitively, then use of the best measure possible will definitely be a necessity.



 
Figure 6. A suggested combined metric scale for pain measurement.
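
The vernier analogy can be made concrete with a short sketch of how one mark on a combined scale might be read as a coarse category plus a fine position within it. The layout is an assumption: five equal 20-mm labeled sectors on a 0- to 100-mm line, using the category labels from the Materials and Methods section. The actual sector widths of the proposed instrument are not given in the text, and the function name is hypothetical.

```python
SECTOR_MM = 20  # assumed sector width; not specified in the text
LABELS = ("none", "mild", "moderate", "severe", "extreme")


def read_combined_mark(mark_mm):
    """Split a mark on a 0-100 mm line into (CAT score, label, position in sector).

    The position is a fraction from 0.0 (start of the sector) to 1.0 (its end).
    """
    if not 0 <= mark_mm <= 100:
        raise ValueError("mark must lie between 0 and 100 mm")
    # A mark exactly at 100 mm is assigned to the top sector.
    index = min(int(mark_mm // SECTOR_MM), len(LABELS) - 1)
    position = (mark_mm - index * SECTOR_MM) / SECTOR_MM
    return index + 1, LABELS[index], position


category, label, position = read_combined_mark(47)
print(category, label, round(position, 2))
```

Because the category is derived from the same mark as the millimeter reading, the two scores cannot contradict each other, which is the consistency property claimed for the combined scale.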

 


    CONCLUSIONS
Looking at the OA pain model as an exemplar for chronic pain generally, we found a good correspondence between unconstrained VAS and 5-point CAT scale pain measurements. At the same time, we found a large degree of variability, which is at least partially due to individual judgment differences as to how to relate to the VAS line. We propose that a combined scale anchored by multiple cues may provide the patient with greater clarity and yield a more accurate pain measurement. The cues obviate differences due to individual judgments. We thus eliminate a major source of error variance. Of necessity, we continue to rely on subjective pain judgment. We expect increased consistency and precision with the new instrument. With this instrument, we may expect to get not only averages suitable for regulatory purposes but also reliable values for individual pain.


    FOOTNOTES
 
The views expressed in this article are those of the authors and do not necessarily represent those of the FDA or imply endorsement by the FDA or U.S. government.

DOI: 10.1177/0091270004263995

Submitted for publication July 1, 2002; Revised version accepted January 18, 2004.


    REFERENCES

1. Price D, Bush F, Long S, et al: A comparison of pain measurement characteristics of mechanical visual analogue and simple numerical rating scales. Pain 1994;56:217-226.

2. Jensen MP, Karoly P: The measurement of clinical pain intensity: a comparison of six methods. Pain 1986;27:117-126.

3. Collins S, Moore A, McQuay H: The visual analog pain intensity scale: what is moderate pain in millimeters? Pain 1997;72:95-97.

4. Ho K, Spence J, Murphy M: Review of pain measurement tools. Ann Emerg Med 1996;27:427-431.

5. Littman GS, Walker BR, Schneider BE: Reassessment of verbal and visual analog ratings in analgesic studies. Clin Pharmacol Ther 1985;38:16-23.

6. Magnusson T, List T, Helkimo M: Self-assessment of pain and discomfort in patients with temporomandibular disorders: a comparison of five different scales with respect to their precision and sensitivity as well as their capacity to register memory of pain and discomfort. J Oral Rehabil 1995;22:549-554.

7. Conti PC, de Azevedo LR, de Souza NV, Ferreira FV: Pain measurement in TMD patients: evaluation of precision and sensitivity of different scales. J Oral Rehabil 2001;28:534-539.

8. Feinstein AR: Clinical Epidemiology: The Architecture of Clinical Research. Philadelphia: W. B. Saunders, 1985; 396-406.

9. McGuire D: Comprehensive and multidimensional assessment and measurement of pain. J Pain Symptom Manag 1992;7:312-319.

10. Melzack R: The McGill Pain Questionnaire: major properties and scoring methods. Pain 1975;1:277-299.

11. Kerns RD, Turk DC, Rudy TE: The West Haven-Yale Multidimensional Pain Inventory (WHYMPI). Pain 1985;23:345-356.

12. Fishman B, Pasternak S, Wallenstein SL, Houde RW, Holland JC, Foley KM: The Memorial Pain Assessment Card: a valid instrument for the evaluation of cancer pain. Cancer 1987;60:1151-1158.

13. Daut RL, Cleeland CS, Flanery RC: Development of the Wisconsin Brief Pain Questionnaire to assess pain in cancer and other diseases. Pain 1983;17:197-210.

14. Chapman CR, Casey KL, Dubner R, Foley KM, Gracely RH, Reading AE: Pain measurement: an overview. Pain 1985;22:1-31.

15. Johnson JE, Rice VH: Sensory and distress components of pain: implications for the study of clinical pain. Nurs Res 1974;23:203-220.

16. Seymour RA: The use of pain scales in assessing the efficacy of analgesics in post-operative dental pain. Eur J Clin Pharmacol 1982;23(5):441-444.

17. Sriwatanakul K, Kelvie W, Lasagna L, Calimlim JF, Weis OF, Mehta G: Studies with different types of visual analog scales for measurement of pain. Clin Pharmacol Ther 1983;34:234-239.

18. Gracely RH, McGrath P, Dubner R: Ratio scales of sensory and affective verbal pain descriptors. Pain 1978;5:5-18.

19. Gracely RH, McGrath P, Dubner R: Validity and sensitivity of ratio scales of sensory and affective verbal pain descriptors: manipulation of affect by diazepam. Pain 1978;5:19-29.

20. Bijur PE, Silver W, Gallagher EJ: Reliability of the visual analog scale for measurement of acute pain. Acad Emerg Med 2000;8:1153-1157.

21. Freeman K, Smyth C, Dallam L, Jackson B: Pain measurement scales: a comparison of the visual analogue and faces rating scales in measuring pressure ulcer pain. J Wound Ostomy Continence Nurs 2001;28:290-296.

22. Le Resche L, Burgess J, Dworkin SF: Reliability of visual analog and verbal descriptor scales for "objective" measurement of temporomandibular disorder pain. J Dent Res 1988;67:33-36.

23. Banos JE, Bosch F, Canellas M, Bassols A, Ortega F, Bigorra J: Acceptability of visual analogue scales in the clinical setting: a comparison with verbal rating scales in postoperative pain. Methods Find Exp Clin Pharmacol 1989;11:123-127.

24. Wallenstein SL, Heidrich G, Kaiko R, Houde RW: Clinical evaluation of mild analgesics: the measurement of clinical pain. Br J Clin Pharmacol 1982;10(Suppl.):319-327.

25. Bolognese JA, Schnitzer TJ, Ehrich EW: Response relationship of VAS and Likert scales in osteoarthritis efficacy measurement. Osteoarthritis Cartilage 2003;11:499-507.



