This article was downloaded by: [George Washington University] On: 31 August 2011, At: 08:26 Publisher: Psychology Press Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK

Aphasiology Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/paph20

The social validity of script training related to the treatment of apraxia of speech

Scott R. Youmansᵃ, Gina L. Youmansᵃ & Adrienne B. Hancockᵇ

ᵃ Department of Communication Sciences and Disorders, Long Island University, Brooklyn Campus, New York, NY, USA
ᵇ Department of Speech and Hearing Sciences, George Washington University, Washington, DC, USA

Available online: 12 Aug 2011

To cite this article: Scott R. Youmans, Gina L. Youmans & Adrienne B. Hancock (2011): The social validity of script training related to the treatment of apraxia of speech, Aphasiology, 25:9, 1078-1089 To link to this article: http://dx.doi.org/10.1080/02687038.2011.577205


APHASIOLOGY, 2011, 25 (9), 1078–1089

The social validity of script training related to the treatment of apraxia of speech

Scott R. Youmans¹, Gina L. Youmans¹, and Adrienne B. Hancock²

¹ Department of Communication Sciences and Disorders, Long Island University, Brooklyn Campus, New York, NY, USA
² Department of Speech and Hearing Sciences, George Washington University, Washington, DC, USA

Background: Social validity is an important yet under-examined aspect of treatment efficacy that determines how the effects of treatment are perceived by people other than the clinician/researcher. This is particularly true of treatments to improve the speech of adults with acquired neurogenic disorders.

Aims: The purposes of this investigation were to evaluate the social validity of a modified script training treatment protocol, to explore how aspects of a client’s speech correspond to varying rater judgements, and to determine which of the listener ratings were most predictive of ratings of overall quality.

Methods & Procedures: A total of 124 young, naïve listeners were asked to rate the quality of speech of an 81-year-old woman with moderate–severe apraxia of speech during utterances with varying levels of script correctness (low, medium, high), number of errors (low, high), and rate (slow, faster). Judgements were made on the understandability, ease of production, naturalness, and overall quality of speech.

Outcomes & Results: All main effects and interactions were statistically significant. As script correctness, speaking rate, and number of errors increased, listener ratings were significantly more favourable. Interactions demonstrated increasing correctness, error, and rate with significantly more favourable listener ratings. The listener ratings that predicted improved perceptions of overall quality, weighted highest to lowest, were: understandability, naturalness, and ease of production. Error types were analysed and revealed that phrase repetitions appeared to be perceived positively and that unintelligible words and interjections appeared to be perceived negatively.

Conclusions: The modified script training protocol applied to a woman with marked apraxia of speech appeared to be socially valid with these naïve raters. Listeners appeared to be sensitive to the amount and the quality of speech output generated by the speaker. Listeners appeared to perceive struggle behaviours as negative, increased speech output (including repetitions and empty speech) as positive, faster speech (closer to a normal speaker’s average words per minute) as preferable to slow speech, and more understandable speech and more natural speech as better quality speech. Based on these data, the accuracy and fluency of script production are important treatment goals.

Keywords: Script training; Treatment efficacy; Social validity; Motor speech disorders; Apraxia.

Address correspondence to: Scott Youmans, Department of Communication Sciences and Disorders, Long Island University, Brooklyn Campus, 1 University Plaza, Brooklyn, New York, NY 11201-5372, USA. E-mail: [email protected]
© 2011 Psychology Press, an imprint of the Taylor & Francis Group, an Informa business
http://www.psypress.com/aphasiology DOI: 10.1080/02687038.2011.577205


Treatment efficacy is an important component of evidence-based practice that has received substantial attention in the recent literature (Goldstein, 2002; Holland, Fromm, DeRuyter, & Stein, 1996; Schiavetti & Metz, 2006). The efficacy of speech-language treatment is typically evaluated by clinical researchers who measure subjective or objective changes in a client’s communication that are attributable to a particular treatment. In the case of treatments for improving speech output, perceptual or instrumental measurements of the client’s phonemic production, acoustic output, speech intelligibility, comprehensibility, and/or physiology are typically used to determine whether improvement has occurred. These methods, which measure the change in the client’s communication over the course of therapy, are certainly crucial for establishing the effectiveness of treatments. However, to be truly meaningful, such treatment effects should be detectable, relevant, and valued not only by clinical researchers, but by other relevant individuals (Goldstein, 1990). Therefore, the change in the client’s communication, as measured by the clinician, is only one component of evidence-based practice. Other factors are also important for evaluating the success of a treatment, such as an assessment of the relevance of the goals of treatment, the appropriateness of the treatment methods, and reactions to the client’s communicative changes by relevant persons; relevant persons include the client and members of the client’s family, community, and society at large (Goldstein, 1990; Wolf, 1978). Such components of evidence-based practice, which speak to the question of whether a behavioural change attributed to treatment is socially meaningful, are known as measures of social validity (LaPointe, Katz, & Braden, 1999; Wolf, 1978).
Social validity measures provide an effective avenue for the speech-language pathologist to assess the relevance and social significance of various speech and language treatments. Such measures could be used to validate the appropriateness of different treatment methods at different stages of recovery, both for the individual with an acquired disorder and for relevant consumers such as family members and conversational partners. Although Wolf first introduced the importance of social validity when conducting behavioural research in 1978, few researchers have established social validity for treatments of adults with neurogenic communication disorders. Researchers in this area who have addressed social validity also commented on the paucity of social validity components in evidence-based treatment studies (Carr, Austin, Britton, Kellum, & Bailey, 1999; Hickey & Rondeau, 2005; LaPointe et al., 1999). One treatment used to improve the linguistic output of persons with neurogenic communication disorders, script training, did include an analysis of social validity when it was described initially for two patients with non-fluent aphasia (Youmans, Holland, Munoz, & Bourgeois, 2005). In script training, a patient selects three or four scripts, each several phrases in length, that are personally meaningful to him/her at the time of treatment. The participant then practises each script at the phrase level, until it becomes automatic. The practice consists of repeating the phrase together with the clinician, choral production of the phrase, initial phonemic cueing of the phrase, and then independent production of the phrase. When the participant masters the phrase and can generate it spontaneously, a new phrase is selected and learned. Individuals in this initial script training investigation acquired all scripts, and demonstrated the ability to use these scripts flexibly in conversation. 
The social validity of this script training approach was established by comparing baseline/pre-treatment performances in each script to the generalisation/maintenance phase performances. These pre- and post-treatment examples of script productions were rated by nine independent raters who marked the performance on a 7-inch unscaled line with descriptive adjectives at either end. Raters assessed each script excerpt on three aspects: informativeness, speaking rate, and naturalness of speech. All judges consistently rated post-treatment excerpts as more informative, more normal in speaking rate, and more natural sounding, thus suggesting some degree of social validity for this treatment approach. Subsequent to this initial investigation, a computerised version of script training has been effective with three individuals with both fluent and non-fluent aphasia profiles (Cherney, Halper, Holland, & Cole, 2008), and most recently a modified version of script training has been successful for three adults with primary apraxia of speech and co-existing aphasia (Youmans, Youmans, & Hancock, 2011).

As with the initial script training publication of Youmans and colleagues (2005), those studies that do include social validity measures almost exclusively focus on social validation of the effects of treatment. To date, this has generally been accomplished by having raters with little or no knowledge of communication disorders (and as such representing general society) rate communication examples pre- and post-treatment (Hickey, Bourgeois, & Olswang, 2004; Hopper, Holland, & Rewega, 2002; LaPointe et al., 1999; Lustig & Tompkins, 2002; Youmans et al., 2005), although Hickey and Rondeau (2005) also examined differences among raters with different levels of expertise. Certainly it is important for investigators to validate the social relevance of the effects of treatment for individuals with neurogenic disorders such as aphasia and apraxia of speech, particularly because of the widespread neglect of social validation in clinical speech-language research. As mentioned, researchers tend to determine the social validity of the aggregate effects of a treatment. This method allows the investigators to gauge the overall social benefit of a treatment compared to baseline.
However, this approach does not enlighten us as to the relative weight of treatment components or how the specific changes in a client’s speech individually changed the raters’ perceptions. In his seminal work, which introduced issues of social validity in behavioural research, Wolf (1978) proposed that, in addition to validating the effects of treatment, the goals of treatment should be systematically validated by individuals other than clinical researchers. He proposed that the goals of treatment, as determined by researchers, should match the wants and needs of clients and their communities, funding agencies, and general society for treatments to be maximally successful. Social validation of treatment goals requires examining whether the specific goals of a treatment are truly what society values as important. Speech-language researchers, for example, might ask whether aspects of communication commonly targeted for treatment, such as understandability, speech naturalness, and high narrative quality, are truly important to lay listeners and/or to those involved in treatment. In addition, investigations into social validity can be used to more objectively determine which specific, measurable behaviours (such as error counts, speaking rate, and information units) contribute most to subjective rater judgements of behaviour (Wolf, 1978). For example, behavioural researchers have correlated counts of specific verbal and non-verbal behaviours, such as the amount of eye contact and the number of conversational questions asked, with rater judgements of affection and conversational quality (Hasse & Tepper, 1972; Minkin et al., 1976). In the area of neurogenic speech-language treatment such an analysis could help define the meaning of complex, subjective labels such as naturalness of speech, communication effectiveness, or conversational ease.


In summary, treatment efficacy has been recognised as an important component of evidence-based practice. Social validity is an important component of treatment efficacy. Unfortunately the social validity of treatments for rehabilitating the communication of persons with neurogenic communication disorders has rarely been analysed, and thus the complex and comprehensive impact of these treatments is not yet understood. Furthermore, in the few cases where social validity has been analysed for given treatments, blind perceptual ratings of pre- versus post-treatment audio clips have been used to evaluate social validity. This serves to determine the overall perceptual changes from baseline to post-treatment, but this type of investigation does not allow us to rank the contributions of various treatment components, nor does it highlight the most influential speech characteristics related to consumers’ perceptions.

The purpose of this investigation was to determine whether a script training protocol, modified to include principles of motor learning (Youmans et al., 2011) and applied to a woman with apraxia of speech, was socially valid. In other words, as the speaker learned to produce scripts correctly, did listeners change their judgements of understandability, ease, naturalness, and overall quality? Additionally, this investigation served to determine how specific aspects of the woman’s speech (script correctness, number of errors, and speaking rate) corresponded to rater judgements about the quality of her speech. Finally, we investigated how the three specific rater judgements (understandability, naturalness, ease of production) factored into judgements of overall quality of speech.

METHOD

Participants

All of the work described was approved by Long Island University’s Institutional Review Board. A total of 124 adults (92 female, 32 male) served as raters in the investigation. All of the participants were young adults, aged 19–38 (M = 25.23, SD = 4.18). They were considered naïve raters due to their self-reported absence of experience interacting with adults with neurogenic communication disorders. All of the participants completed a brief questionnaire that established their naïveté of adult neurogenic disorders, demographic information, and hearing status. All of the participants reported 2 to 6 years of higher education, middle socioeconomic status, and normal hearing.

Stimuli

A total of 12 audio clips comprised the stimuli. The clips were extracted samples of recordings made during baseline and treatment phases of script training with an 81-year-old woman with apraxia of speech (AoS). The woman’s scores on the Apraxia Battery for Adults (Dabul, 1979) categorised her as moderate to severe; she scored in the mild to moderate range on the diadochokinetic rate and limb/oral apraxia subtests and in the severe to profound range for two-syllable word production and utterance time. Additionally, she exhibited all of the hallmark features of AoS identified by McNeil, Robin, and Schmidt (1997): slow rate, prolonged segment/intersegment durations, distortions/distorted sound substitutions, errors consistent in type, and prosodic abnormalities.


The clips were selected based on the percentage of the script the woman produced correctly (PSC), the errors produced while attempting to produce the script (word or phrase repetitions deemed as non-communicative, pauses greater than 3 seconds, interjections, and unintelligible words), and speech rate. Three levels of PSC were included: low (0–32% correct), medium (33–66% correct), and high (67–100% correct). Two levels of errors were included: relatively low (0–15 errors) and high (16–30 errors). Two levels of rate were included: slow (0–26 words per minute) and faster (26+ words per minute). Then 30% of the measurements were re-measured by a second examiner to ensure reliability (r = 0.99; p < .0001 for all measurements; Youmans et al., 2011). Table 1 illustrates the ranges used to categorise the three stimulus variables and each variable’s mean in each category.

The 12 audio clips represented a sample from each of the conditions. For example, one clip included an excerpt with a low PSC, low number of errors, and a slow rate. Another sample included an excerpt with a high PSC, high number of errors, and a faster rate. Each combination of the three PSC conditions (low, medium, high), two error conditions (low, high), and two rate conditions (slow, faster) was represented.

The 12 audio clips were then randomised and put onto a compact disk (CD) with a space between the clips. A second disk was created using the same clips in a different, randomised order so that an order-effect analysis could be completed. An independent t-test was conducted for each of the four dependent variables to determine whether or not there were rater differences due to stimulus order. None of the differences was statistically significant: understandability, t(1486) = 1.35, p = .18; ease of production, t(1486) = 0.88, p = .38; naturalness, t(1486) = 0.35, p = .73; overall quality, t(1486) = 1.77, p = .08. Therefore there was no order effect and ratings were pooled.
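The factorial structure of the stimulus set can be sketched in a few lines of Python. This is an illustrative sketch only and was not part of the study materials; the names `PSC_LEVELS`, `ERROR_LEVELS`, `RATE_LEVELS`, `build_conditions`, and `pooled_t_test_df` are hypothetical. It enumerates the 3 × 2 × 2 = 12 conditions and shows how the reported degrees of freedom (1486) would arise if each of the 124 raters' 12 ratings were treated as a separate observation, which is an assumption inferred from the reported values rather than something the text states.

```python
from itertools import product

# Level labels taken from the Stimuli description; variable names are hypothetical.
PSC_LEVELS = ("low", "medium", "high")
ERROR_LEVELS = ("low", "high")
RATE_LEVELS = ("slow", "faster")


def build_conditions():
    """Return every PSC x error x rate combination, one per audio clip."""
    return list(product(PSC_LEVELS, ERROR_LEVELS, RATE_LEVELS))


def pooled_t_test_df(n_raters, n_clips):
    """Degrees of freedom for a two-group independent t-test when every
    rating is counted as its own observation (assumed pooling scheme)."""
    return n_raters * n_clips - 2


conditions = build_conditions()
print(len(conditions))            # number of clips needed to cover the design
print(pooled_t_test_df(124, 12))  # compare with the df reported for the t-tests
```

Under that assumption, 124 raters × 12 clips gives 1488 ratings per dependent variable, and subtracting 2 for the two presentation orders matches the reported df.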

Procedure

All of the research was conducted in a well-lit, quiet environment free from auditory and visual distractions. Participants were seen individually or in groups of up to three. Following informed consent and administration of the questionnaire, the participants were given standard instructions. The participants were then familiarised with the behaviours and the scales that comprised the rating forms. Once the raters indicated understanding of their task, they were presented with the 12 pre-recorded audio excerpts. Following each audio clip, the CD was paused and the raters were asked to subjectively rate four different aspects of the woman’s speech. The behaviours of interest (dependent variables) were measured by the raters’ responses to four questions: How understandable was the person? How easily did the person produce the words? How natural was the person’s speech? How would you rate her overall speech quality? Under each question was a 129-mm continuous line. For the first three questions, the words “Not at All” appeared at the left end of the line and the word “Very” at the right end; for the question regarding overall quality, the anchors were “Not Good at All” and “Very Good”. The participants were asked to rate their impressions of the woman’s speech by making a mark somewhere along the unscaled line. The farther to the right along each line that the rater made his/her mark, the higher his/her rating.

TABLE 1
Ranges used to categorise the three stimulus variables with means

Variable      Level            Range      Mean
PSC           Low              0–32%      16.67%
PSC           Medium           33–66%     33.33%
PSC           High             67–100%    95.84%
Errors        Relatively Low   0–15       9.83
Errors        High             16–30      19.67
Rate (WPM)    Slow             0–26       20.56
Rate (WPM)    Faster           26+        36.55

PSC = percent script correct. WPM = words per minute.

Data analysis

The subjective ratings by each participant for each audio clip were then quantified. The line was measured (in millimetres) from the beginning of the line to the mark made by the rater. Therefore, a number from 0 to 129 was recorded for each of the four dependent variables for each of the 12 stimuli (see Bourgeois, 1993, for an overview/defence of the use of this procedure called a visual analogue scale). A repeated-measures multivariate analysis of variance (RM-MANOVA) was computed to determine whether significant main effects or interactions existed for PSC, errors, and/or rate for each of the dependent variables. Effect size estimates were made for all significant pairwise comparisons using Cohen’s d. Effect sizes were estimated using the following criteria: small = 0.20, medium = 0.50, and large = 0.80 (Cohen, 1988). Additionally, partial correlations and a linear regression analysis were conducted to determine which of the variables (understandability, naturalness, ease of production) contributed most to overall quality ratings. SPSS version 13.0 was used for all statistical computations. An alpha of 0.05 was used for determining statistical significance. Bonferroni adjustments were made for multiple comparisons to protect against family-wise error.
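The effect-size and multiple-comparison steps described above can be illustrated with a short Python sketch. It is not the SPSS procedure used in the study. It assumes the pooled-standard-deviation form of Cohen's d (the text does not say which variant was computed), the ratings are invented values on the 0–129 mm scale, the three-comparison Bonferroni example is illustrative only, and all function names are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def cohens_d(a, b):
    """Cohen's d with a pooled standard deviation (assumed variant)."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var)


def label_effect(d):
    """Label |d| against the benchmarks cited in the text (0.20, 0.50, 0.80).
    The 'negligible' label for values below 0.20 is an addition for this sketch."""
    size = abs(d)
    if size >= 0.80:
        return "large"
    if size >= 0.50:
        return "medium"
    if size >= 0.20:
        return "small"
    return "negligible"


def bonferroni_alpha(alpha, n_comparisons):
    """Per-comparison alpha after a Bonferroni adjustment."""
    return alpha / n_comparisons


# Invented visual analogue scale ratings in millimetres (0-129), not study data.
low_psc = [30.0, 42.0, 38.0, 25.0, 35.0]
high_psc = [70.0, 66.0, 81.0, 59.0, 74.0]

d = cohens_d(high_psc, low_psc)
print(label_effect(d))
print(bonferroni_alpha(0.05, 3))  # e.g. three pairwise PSC comparisons
```

With three pairwise comparisons among the PSC levels, for instance, the adjusted threshold would be 0.05 / 3 per comparison; the number of comparisons actually adjusted for in the study is not stated in this section.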

Reliability

Following all of the measurements, a second examiner measured all of the lines recorded as responses by the participants to determine consistency of measurement. Bivariate correlations were computed to determine the consistency between the two examiners’ measurements for each of the variables. The measurements for each of the variables were considered to be reliable: understandability (r = 1.0; p < .0001); ease of production (r = 1.0; p < .0001); naturalness (r = 1.0; p < .0001); overall quality (r = 1.0; p < .0001).
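A bivariate (Pearson) correlation of the kind reported here can be sketched as follows. This is a minimal illustration with invented measurements, not the study's data or its SPSS output, and the function name `pearson_r` is hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Invented line measurements in millimetres from two examiners (not study data).
first_examiner = [12.0, 57.0, 88.0, 103.0, 129.0]
second_examiner = [12.0, 58.0, 88.0, 102.0, 129.0]

r = pearson_r(first_examiner, second_examiner)
print(round(r, 2))
```

Because both examiners measure the same marks on the same lines with a ruler, small discrepancies of a millimetre or so leave the correlation very close to 1, which is consistent with the values reported above.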

RESULTS

RM-MANOVA

The results of all of the multivariate omnibus tests for main effects and interactions (using Wilks’ Λ) were statistically significant at the p < .0001 level. Univariate comparisons (using the Greenhouse-Geisser correction) and pairwise comparisons were also all statistically significant at the p ≤ .001 level with one exception: the univariate test for the interaction between error and rate (p = .16). Tables 2 and 3 provide mean differences, standard errors, significance levels, and effect size estimates for the pairwise comparisons. As PSC increased, the ratings for the dependent variables (understandability, ease of production, naturalness, and overall quality) increased significantly. Significant differences were found between low script correct, medium script correct, and high script correct. As errors increased, the dependent variables increased significantly. As rate increased, the dependent variables increased significantly. An interaction was observed between increasing PSC and increased error that resulted in significantly increased dependent variables. An interaction was seen between increasing PSC and increased rate that resulted in significantly increased dependent variables. Finally an interaction was seen between increasing PSC, increased error, and increased rate that resulted in

TABLE 2
Mean differences, standard errors, significance levels, and effect size estimates for the pairwise comparisons for PSC

Measure              Level     Level     MD        SE
Understandability    Low       Medium    −24.04    1.46
Understandability    Low       High      −31.99    1.52
Understandability    Medium    High      −7.94     1.65
Ease of Production   Low       Medium    −14.74    1.19
Ease of Production   Low       High      −20.99    1.49
Ease of Production   Medium    High      −6.25     1.35
Naturalness          Low       Medium    −14.47    1.24
Naturalness          Low       High      −19.06    1.67
Naturalness          Medium    High      −4.59     1.47
Overall Quality      Low       Medium    −18.48    1.16
Overall Quality      Low       High      −25.61    1.45
Overall Quality      Medium    High      −7.13     1.28

p Value

ES

ES