Psychotherapy 2014, Vol. 51, No. 3, 434 – 442

© 2013 American Psychological Association 0033-3204/14/$12.00 DOI: 10.1037/a0032171

Comparing Two Methods of Identifying Alliance Rupture Events

Joana Coutinho and Eugénia Ribeiro
University of Minho

Inês Sousa
University of Minho

Jeremy D. Safran

This document is copyrighted by the American Psychological Association or one of its allied publishers. This article is intended solely for the personal use of the individual user and is not to be disseminated broadly.

New School for Social Research and Beth Israel Medical Center, New York, New York

This study compared two methods of detecting ruptures in therapy sessions, a procedure based on a self-report measure, the Working Alliance Inventory (WAI), and an observational Rupture Resolution Rating System (3RS). We anticipated that the 3RS would detect more ruptures than the WAI. We examined the longitudinal data of 38 patient–therapist dyads in a cognitive–behavioral therapy condition. The sample included cases that did not complete treatment (dropped cases) as well as good-outcome and poor-outcome cases. At the end of each session, patients completed the WAI self-report questionnaire. Six judges were trained to observe and detect the occurrence of ruptures, and then rated 201 videotaped sessions. Longitudinal statistical models were applied to the data retrieved from the WAI questionnaires completed by patients. We found discrepancies in the ability of the two methods to detect rupture events, with the observational 3RS detecting more ruptures than the WAI. Thus, the use of observational systems for the detection of alliance ruptures is crucial for effectively assessing the quality of the therapeutic alliance over the course of treatment. Furthermore, observational systems proven to detect ruptures can be used to improve clinical practice and training of new clinicians.

Keywords: therapeutic alliance, alliance rupture events, self-report measures, observational systems

This article was published Online First May 13, 2013. Joana Coutinho and Eugénia Ribeiro, Department of Applied Psychology, School of Psychology, University of Minho, Braga, Portugal; Inês Sousa, Department of Mathematics and Applications, University of Minho, Guimarães, Portugal; Jeremy D. Safran, Department of Psychology, New School for Social Research, New York, New York, and Beth Israel Medical Center, New York, New York. This article was supported by the Portuguese Foundation for Science and Technology (FCT), PhD Grant: SFRH/BD/27654/2006. The authors want to thank Ariel Westermen for her support on the editing of this paper and revision of the English writing quality. Correspondence concerning this article should be addressed to Joana Coutinho, School of Psychology, University of Minho, Campus de Gualtar, 4710-057 Braga, Portugal. E-mail: [email protected]

Previous research has shown that the alliance plays an important role in therapy. For example, the therapeutic alliance has been shown to be a consistent predictor of therapy outcome (e.g., Horvath & Bedi, 2002; Horvath, Del Re, Fluckiger, & Symonds, 2011; Martin, Garske & Davis, 2000) as well as one of the most important common factors across various therapy modalities (Wampold, 2001; Horvath, 2011). More recently, however, Safran, Muran, and Eubanks-Carter (2011, p. 80) clarified the factors that contribute to the therapeutic alliance in what they refer to as “the second generation” of alliance research. This line of research on the alliance investigates the processes of alliance rupture and resolution. For example, two meta-analyses by Safran et al. (2011) provided support for the clinical relevance of the identification and repair of alliance rupture events. One of the meta-analyses showed that there is a relationship between the frequency of rupture-repair episodes and outcome. The other meta-analysis indicated that rupture resolution training had a significant impact on patient outcome.

According to Safran and Muran (2006), a rupture in the therapeutic alliance can be defined as a tension, conflict, or misunderstanding in the collaborative relationship between patient and therapist. By conceptualizing rupture events in this manner, the authors departed from Bordin’s (1979) model of the therapeutic alliance. In Bordin’s theoretical framework, a rupture is defined as a disagreement between therapist and client with regard to therapeutic goals and tasks. For example, a disagreement in goals could occur when the patient seeks to improve his or her social skills, whereas the therapist strives to understand the relationship between the patient’s social anxiety and his or her experiences as a child. With respect to tasks, a rupture could occur when the patient expects a didactic strategy using role-plays and modeling exercises, whereas the therapist adopts an experiential strategy, such as the empty-chair technique. Finally, within Bordin’s framework, a rupture can also consist of a strain in the bond between patient and therapist (e.g., the patient feels the therapist is critical and unsupportive).

Safran and Muran (2000) explained that there are two types of alliance ruptures: withdrawal ruptures and confrontation ruptures. During withdrawal ruptures, the patient avoids or resists the therapist. The patient may provide minimal responses to the therapist’s questions or may defer to or appease the therapist to an excessive degree. Rather than addressing his or her dissatisfaction or discomfort, the patient withdraws in an effort to “protect” the therapeutic relationship. In confrontation ruptures, on the other hand, the patient expresses his or her anger and dissatisfaction in a direct, often hostile, manner. The patient might complain about the therapist as a person or might try to pressure or control the therapist by telling the therapist what to do. Confrontation ruptures may lead to a cycle of hostility between patient and therapist, which, if not recognized and worked through, may prevent the treatment from helping the patient move forward in his or her change process.

Ruptures are common in therapy cases, including cases that are successful (Eubanks-Carter, Muran, Safran, & Hayes, 2011; Safran, Muran, & Samstag, 1994). The frequency and intensity of ruptures depend on several factors, including therapist’s theoretical orientation as well as other therapist characteristics (e.g., Castonguay, Goldfried, Wiser, Raue, & Hayes, 1996; Constantino et al., 2008). The patient’s presenting problem, as well as his or her personality organization, also contributes to the occurrence of ruptures (e.g., Eames & Roth, 2000). In most cases, ruptures are difficult to recognize by either the therapist or the patient. An undetected rupture results in a missed opportunity to reveal the patient’s dysfunctional interpersonal pattern, and without detecting a rupture, the therapist and patient cannot work to resolve it (Regan & Hill, 1992; Rhodes, Hill, Thompson, & Elliot, 1994). Because rupture events are difficult to detect, the purpose of this study was to compare the capacity of two different methods to identify ruptures in the therapeutic alliance.

Two Methods of Detecting Alliance Rupture Events

Several methods have been used in empirical research on the therapeutic alliance as well as alliance rupture events. Horvath (2011) explained that Bordin’s conceptualization of the therapeutic alliance has had a strong influence on research about rupture events, and that the measures used to detect rupture events are based on Safran and Muran’s theories on ruptures. Researchers have attempted to detect ruptures in a number of ways. Ruptures can be detected as a single utterance (e.g., Colli & Lingiardi, 2009), as sequences of events within a session (e.g., Safran, Muran, & Samstag, 1994), or as fluctuations in alliance scores between sessions (e.g., Stiles et al., 2004; Strauss et al., 2006).

Furthermore, several procedures have been used based on changes in alliance scores. One simple method of rupture identification defines a rupture as a decrease in the alliance score of at least the mean standard deviation (Strauss et al., 2006). In another method, there must be a decrease of at least one point on the Working Alliance Inventory (WAI) to qualify as a rupture event (Safran et al., 2006). More complex methods have been imported from economics and epidemiology (Eubanks-Carter, Gorman, & Muran, 2009, cited by Eubanks-Carter, Muran, & Safran, 2010, p. 96). For example, the Control Chart method establishes upper and lower limits of two standard deviations above and below the mean and considers a rupture to be a score outside the lower limit. The Regime Shift method applies a t test to each new data point to see if the data point represents a statistically significant deviation from the mean (see Eubanks-Carter, Gorman, & Muran, 2009, for a review of different methods of identifying ruptures or shifts in alliance). According to Eubanks-Carter, Muran, and Safran (2011), both self-report methods and observer-based methods can be used to detect alliance rupture events.
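
As an illustration of the score-based criteria just described, the following is a minimal Python sketch of a drop-based rule and a control-chart rule. The function names, the choice to compute the mean and standard deviation from a single patient's own series, and the example scores are assumptions made for illustration; they are not taken from the cited studies.

```python
from statistics import mean, stdev


def ruptures_by_drop(scores, min_drop):
    """Return indices of sessions whose score fell by at least
    `min_drop` relative to the previous session."""
    return [i for i in range(1, len(scores))
            if scores[i - 1] - scores[i] >= min_drop]


def ruptures_by_control_chart(scores, n_sd=2.0):
    """Return indices of sessions scoring below mean - n_sd * SD.

    Assumption: the limits are computed from the same series that is
    being screened; the cited work may define them differently.
    """
    lower_limit = mean(scores) - n_sd * stdev(scores)
    return [i for i, s in enumerate(scores) if s < lower_limit]


# Hypothetical mean item scores (1-7 scale) for one patient.
wai = [5.8, 6.0, 5.9, 6.1, 4.2, 6.0, 6.1, 5.9, 6.0, 6.0]
print(ruptures_by_drop(wai, min_drop=1.0))   # one-point-decrease rule
print(ruptures_by_control_chart(wai))        # two-SD lower limit
```

With these made-up scores, both rules flag only the fifth session (index 4). With shorter or noisier series the two rules can disagree, because a single low score also inflates the standard deviation used for the control limit.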
Previous research identified ruptures by examining overall alliance scores on the WAI over the course of treatment (e.g., Safran et al., 1994). Other research focused on the alliance and the occurrence of rupture events in therapy used
indirect methods, requiring a range of statistical criteria for the detection of rupture episodes. Other longitudinal studies (e.g., Golden & Robbins, 1990; Patton, Kivlighan, & Multon, 1997; Kivlighan & Shaughnessy, 2000; Stiles et al., 2004) have tracked the strength of the alliance between patient and therapist by using statistical methods to identify distinct patterns associated with outcome. In general, the aforementioned studies suggest that positive linear increase and quadratic high-low-high patterns of strength in the alliance are related to good outcome. Stiles et al. (2004) did not find this quadratic pattern; however, they identified a subset of patients with better outcomes who experienced rupture-resolution sequences, signaled by brief V-shaped deflections in the alliance scores. Strauss et al. (2006) replicated these results in a sample of patients with either obsessive–compulsive or avoidant personality disorder who received cognitive–behavioral therapy (CBT): they found that rupture-resolution sequences were significantly related to the relief of depressive and disordered personality symptoms, assessed as the pretreatment-to-posttreatment difference in the Beck Depression Inventory and the Wisconsin Personality Disorders Inventory. Despite the benefits of self-report questionnaires (such as convenience and data reduction), this method is not always the most effective for evaluating the therapeutic alliance. As with any self-report measure, the participant’s response depends on several factors including, but not limited to, his or her emotional state while completing the questionnaire as well as his or her degree of commitment to answering the questions truthfully (Podsakoff, MacKenzie, Lee, & Podsakoff, 2003).
The aforementioned limitations are also present when identifying alliance rupture events by using direct self-report methods, such as the use of direct questions regarding ruptures and their resolution in a Postsession Questionnaire (e.g., Muran, Safran, Samstag & Winston, 1992). Studies on patient deference to the therapist indicated that these factors are particularly relevant with regard to alliance evaluation (Rennie, 1994). Often, patients are unable or unwilling to reveal their discomfort or dissatisfaction to their therapist (Regan & Hill, 1992; Gonçalves, 2009). Thus, the same type of defensive behavior that occurs in withdrawal ruptures may also occur when patients complete a postsession self-report measure; that is, the patient may try to protect the therapist or the therapeutic relationship by evaluating the quality of the alliance positively, thus omitting the rupture. Therefore, as Westen and Shedler (as cited in Colli & Lingiardi, 2009) noted, self-report measures of alliance are flawed owing to patient bias and poor self-reflection. Furthermore, these measures rely on session recall, which is problematic because alliance rupture events tend to evoke anxiety, anger, and guilt in the patient (Coutinho, Ribeiro, Hill, & Safran, 2011). The limitations of self-report methods suggest that other methods may differ in how many alliance rupture events they detect. In studies that detected ruptures using direct self-report measures (e.g., Nagy, Safran, Muran, & Winston, 1998; see also Eubanks-Carter, Muran, & Safran, 2010, for a review), the frequency of rupture events ranged from 11% to 42% according to patient accounts and from 25% to 56% according to therapist accounts. In addition, studies that detected rupture and repair sequences by indirect self-report, that is, by assessing the overall WAI score across sessions, reported a frequency ranging from 21.5% to 56% of the cases (e.g., Stiles et al., 2004; Strauss et al., 2006).
Muran, Safran, Gorman, Eubanks-Carter, and Banthin (2008) found that 20 patients receiving CBT
reported ruptures in 11.2% of sessions and in 60% of cases, whereas an indirect measure of ruptures based on an analysis of patient WAI ratings using control charting identified them in 8.67% of sessions but in 100% of cases. There is evidence suggesting that observational systems of detecting ruptures may be more efficient because these systems detect a greater frequency of ruptures (e.g., 77%; Sommerfeld, Orbach, Zim, & Mikulincer, 2008). Eubanks-Carter, Mitchell, Muran, and Safran (2010, cited by Eubanks-Carter et al., 2010, p. 94) found that 20 patients receiving CBT reported ruptures in 35% of sessions toward the beginning of treatment, whereas an observer-based method detected withdrawal ruptures in 100% of the sessions and confrontation ruptures in 75% of the sessions. Observer-based methods have been used to examine the role of patient deference in self-report measures and its effect on overall alliance scores. Furthermore, observer-based methods have also explored whether a patient’s difficulty in reporting ruptures is due to lack of awareness or to discomfort in acknowledging them (Eubanks-Carter et al., 2010). Some examples of observer-based methods used to detect alliance rupture events are as follows: the Harper’s Coding System (Harper, 1989a; Harper, 1989b) and the Collaborative Interaction Scale (Colli & Lingiardi, 2009), both of which use transcripts of sessions, and the Rupture Resolution Rating System (3RS; Eubanks-Carter, Mitchell, Muran, & Safran, 2009), which uses video data of sessions. The goal of this longitudinal study was to explore the discrepancy between using a self-report measure on the alliance and using an observational system to measure rupture events. We intended to determine whether one of the methods detects more ruptures and whether these methods detect rupture episodes in the same therapeutic sessions. We anticipated that the observational 3RS will be more sensitive than the WAI, a self-report measure. 
Moreover, we expected that the method based on fluctuations in self-report WAI measures would detect fewer ruptures than using the observational 3RS, as the former defines ruptures as events that occur across sessions (from one session to the next) as opposed to during sessions.

Method

Participants

The sample consisted of 38 patient–therapist dyads in a CBT treatment condition. The initial sample consisted of 50 dyads, but we eliminated 12 dyads owing to incomplete data (e.g., missing videotapes or WAI scores). Patients presented with a variety of psychopathological symptoms from Axis I (30 cases with depression and anxiety disorders) and Axis II disorders (eight cases with personality disorders from Clusters B and C). Therapy was conducted at a university clinic. All patients who underwent treatment at the clinic during the 2-year period of data collection were included in the study if they signed an informed consent contract, except for patients with psychotic symptoms, who were not included. The sample of participants was evenly composed of university students (50%) and community members (50%). Participants ranged in age from 18 to 56 years with a mean age of 29 years (SD = 8.24 years); 68% (n = 26) were female. All participants were Caucasian Portuguese.

Treatment

The treatment consisted of weekly CBT sessions, which is the therapeutic approach most frequently used in the clinic. The treatment process for patients with personality disorders also incorporated principles of cognitive interpersonal therapy (Safran & Segal, 1990). Fifteen therapists participated in this study. Nine therapists each treated three patients; five therapists treated two patients; and one therapist treated one patient. Therapist level of experience ranged from 2 to 8 years of clinical practice. All therapists were Caucasian and either master’s or doctoral students at the university. Therapists received weekly group supervision to monitor adherence to CBT protocols. Supervisors were senior therapists and faculty members at the university.

Measures

The client version of the Working Alliance Inventory (WAI; Horvath & Greenberg, 1989) was given to patients in the study. The WAI measures three aspects of the therapeutic alliance (goals, tasks, and bond) independent of the therapist’s theoretical orientation. The internal consistency estimates for the WAI have ranged from α = .88 to α = .93 (Horvath & Greenberg, 1989; Kokotovic & Tracey, 1990). Considerable evidence has been obtained to support the validity of the WAI (Horvath & Symonds, 1991). The Portuguese version of the WAI has high levels of internal consistency and reliability for the overall scale and for each subscale (Machado & Horvath, 1999). The short form of the WAI, which was used in the present study, includes 17 seven-point Likert scale items anchored by 1 (never) and 7 (always); the items reflect judgments about the quality of the collaboration between patient and therapist. Higher scores reflect stronger therapeutic alliances. This questionnaire was administered at the end of each session. The internal consistency estimate for the global WAI in this sample was α = .78.

Alliance rupture events were measured by using the 3RS (Eubanks-Carter, Mitchell, Muran, & Safran, 2009). The 3RS is an observer-based system for detecting ruptures and rupture resolutions. While observing a therapeutic session, raters watch for a lack of collaboration and tension between patient and therapist. If either is present, raters determine if a confrontation rupture (when the patient moves against the therapist by expressing anger or dissatisfaction) or a withdrawal rupture (when the patient either moves away from the therapist or the patient moves toward the therapist, but in a way that denies an aspect of his or her experience) has occurred in the session. For each detected confrontation or withdrawal rupture event, raters choose a specific subtype of rupture event from a list (e.g., denial, complains about the progress of therapy). Once raters have defined the rupture, they rate its clarity and intensity on a 5-point Likert scale. Raters then assign an overall withdrawal and confrontation score to the session. The score ranges from 1 [withdrawal/confrontation rupture(s) did not occur; not significant for the alliance] to 5 [withdrawal/confrontation rupture(s) occurred; significant for the alliance].

A team of six judges rated 201 sessions using the 3RS. Each judge rated approximately the same number of sessions (n = 33). Thirty percent of the 201 sessions (60 sessions) were rated by more than one judge to assess interrater reliability. Thus, each of these 60 sessions was rated by three judges to calculate the intraclass correlation coefficient (ICC). We used the single-rater ICC because 70% of our data was coded by only one rater. Considering recommendations from previous studies (Shrout & Fleiss, 1979; Colli & Lingiardi, 2009; Dimaggio et al., 2008), the interrater reliability values were adequate (ICC = 0.73 for withdrawal global ratings, ICC = 0.96 for confrontation global ratings).
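
To make the single-rater ICC concrete, the following is a minimal Python sketch of one common form, the two-way random-effects ICC(2,1) described by Shrout and Fleiss (1979). The text does not say which single-rater form was used, so the choice of ICC(2,1), the function name, and the toy ratings (four hypothetical judges, six sessions) are all assumptions for illustration.

```python
def icc_2_1(ratings):
    """Single-rater, two-way random-effects ICC(2,1).

    `ratings` is a list of rows, one per rated session, each row
    holding one score per judge (no missing values).
    """
    n = len(ratings)        # number of sessions (targets)
    k = len(ratings[0])     # number of judges
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)               # between-session mean square
    ms_cols = ss_cols / (k - 1)               # between-judge mean square
    ms_error = ss_error / ((n - 1) * (k - 1)) # residual mean square

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Toy ratings, not data from this study.
toy = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
print(round(icc_2_1(toy), 2))
```

For these toy ratings the coefficient is low (about 0.29) because the judges differ systematically in level, which ICC(2,1) counts as disagreement.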


Procedure

The study is part of a research project approved by the Scientific Council of the University of Minho. We obtained permission from the University Clinical Centre to sample its patients. Participants were informed of the requirements of their participation in the study and then signed a consent form. Research clinicians who were in charge of intake administered the Structured Clinical Interview for DSM–IV, Axis I and Axis II, to determine patient diagnoses (First, Spitzer, Gibbon, & Williams, 1995). At the end of each session, patients completed the WAI and put the questionnaire in an envelope to ensure confidentiality of answers. Participants were informed that only researchers would have access to the evaluations in an attempt to reduce social desirability effects.

Successful and unsuccessful cases were determined by the Global Assessment of Functioning (GAF) of the DSM–IV. We defined an unsuccessful case as one in which there was no increase on the GAF, and patient and therapist agreed on termination. In successful cases, the therapist considered that the patient had made significant clinical improvement, which was reflected in an increase on the GAF [e.g., from a score of 51–60 (moderate symptoms) at the beginning of therapy to 61–70 (some mild symptoms) at termination]. Dropped cases were defined as cases in which the patients decided to terminate treatment without discussing the decision with the therapist. In many of the dropped cases, patients missed a scheduled therapy session without informing the clinic.

Six judges received 2 months of weekly training sessions on how to use the observation-based 3RS system. The judges were master’s and PhD-level students at the university. The training of judges included reading the manual, independently coding a different session each week, and discussing scores in team meetings. The process continued until raters achieved high reliability in ratings. Once reliable, the judges then rated 201 videotaped sessions. We rated the first session, last session, and alternating sessions for all cases, so that raters scored sessions from the beginning, middle, and concluding phases of treatment. For example, for a case with four sessions, raters coded the first and fourth session; for a case with 20 sessions, raters coded the 1st, 3rd, 4th, 7th, 11th, 13th, 15th, and so forth, and 20th sessions. The number of rated sessions per case varied because (1) duration of treatment varied between cases (mean duration of treatment = 13.2 sessions; minimum = 4 sessions; maximum = 30 sessions), and (2) 10 sessions in the sample were not recorded properly due to technical problems. The number of rated sessions per case ranged from 2 (for cases with four sessions) to 15 (for cases with 30 sessions).

Data Analyses

To determine the global consistency between the two methods (the self-report measure on the alliance and the observational system), we first performed an exploratory analysis. For that we used nonparametric regression and a spline smoothing model as exploratory tools to understand the progression of the variables
(WAI, withdrawal, and confrontation) as a function of therapy termination type. The advantage of this technique is that it does not impose a rigid function on the data. In the next stage of the analysis, we used a longitudinal parametric statistical model on which we based our criterion of rupture detection. This parametric model included a subject-specific random intercept, as well as a serial correlation component with an exponential correlation structure (Diggle, Heagerty, Liang, & Zeger, 2002). This model is also known as a mixed-effects model because it parametrically models the expected values and the correlation structure in the data. This model allows us to separate the different sources of variability (i.e., the variability between participants, within participants, and measurement error). For a detailed description of the statistical formulas of the parametric model, please see the Appendix. As mentioned above, we defined ruptures using a criterion based on the parametric model that we fitted to the data; that is, we used the fitted model to detect sessions that had a greater decrease in WAI alliance scores compared with the sample average. We used the estimated variance of the serial correlation process to detect alliance ruptures. Compared with criteria established for other methods of rupture detection that use self-report (Strauss et al., 2006; Eubanks-Carter, Gorman, & Muran, 2009), the criterion we established is more conservative because we not only included the mean and standard deviation of the WAI across sessions (as previous studies had done), but we also examined the correlation between two WAI measurements of the same subject. The data analysis was developed with R (http://cran.r-project.org).
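
For orientation, a model with a subject-specific random intercept, an exponentially correlated serial process, and measurement error is usually written in the following general form in Diggle et al. (2002). The Appendix is not reproduced in this section, so the symbols below (mean \mu_{ij}, variances \nu^2, \sigma^2, \tau^2, and range parameter \phi) are assumptions and may not match the authors' notation.

```latex
\begin{aligned}
Y_{ij} &= \mu_{ij} + U_i + W_i(t_{ij}) + Z_{ij}, \\
U_i &\sim N(0, \nu^2)
  && \text{(between-participant variability)}, \\
\operatorname{Cov}\{W_i(t), W_i(t-u)\} &= \sigma^2 \exp(-|u|/\phi)
  && \text{(within-participant serial correlation)}, \\
Z_{ij} &\sim N(0, \tau^2)
  && \text{(measurement error)}.
\end{aligned}
```

Here Y_{ij} would be the WAI score of participant i at session time t_{ij}. Under this reading, the "estimated variance of the serial correlation process" mentioned above corresponds to \sigma^2.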
Because the criterion that we established to determine the presence of ruptures using the self-reported WAI is new in the literature on alliance ruptures, we tested its reliability against another, better-known and previously tested method of analyzing the WAI for ruptures. From the several existing methods of this type (see Eubanks-Carter et al., 2012, for a review), we selected Tukey’s control chart, which is a type of control chart method for tracking changes in time series data. Tukey’s control chart was constructed using procedures described by Tukey (Hoaglin, Mosteller, & Tukey, 2000) for calculating confidence intervals based on medians and interquartile ranges, which are more robust to outliers and departures from normality than are means. This method can be applied to data sets with as few as seven data points. To do this analysis, we excluded from the sample the eight cases that abandoned therapy before the seventh session. Thus, the sample that was used for this analysis of reliability between both methods of WAI self-report rupture analysis was composed of 30 cases (400 sessions). By applying both methods to each case, we found that Tukey’s control chart detected 12 ruptures, whereas our criterion detected 20 ruptures. There was a significant association between the two methods, χ²(2) = 35.17, p < .001. This seems to reflect the fact that ruptures detected by Tukey’s control chart were also likely to be detected by our criterion of WAI decrease (42%). Both methods are conservative, given that in 90% of the sessions neither of the self-report methods detected any ruptures. Regarding the observational system, to identify sessions with rupture events, we used the 3RS criterion of withdrawal or confrontation scores ≥3 [3 = withdrawal/confrontation rupture(s) occurred, significant to the alliance].
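
The idea behind a median-and-interquartile-range control chart can be sketched in a few lines of Python. The text does not give the exact limits used, so this sketch assumes the conventional Tukey fence of 1.5 interquartile ranges below the first quartile; the function name, the multiplier, and the example scores are illustrative assumptions only.

```python
from statistics import quantiles


def tukey_low_outliers(scores, fence=1.5, min_points=7):
    """Return indices of sessions whose score falls below
    Q1 - fence * IQR of the same case's series.

    Assumption: 1.5 * IQR is the conventional Tukey fence; the study
    may have used different limits. The seven-point minimum follows
    the requirement stated in the text.
    """
    if len(scores) < min_points:
        raise ValueError("need at least %d data points" % min_points)
    q1, _median, q3 = quantiles(scores, n=4, method="inclusive")
    lower_limit = q1 - fence * (q3 - q1)
    return [i for i, s in enumerate(scores) if s < lower_limit]


# Hypothetical total WAI scores for one case with eight sessions.
wai_totals = [100, 102, 101, 99, 103, 60, 100, 101]
print(tukey_low_outliers(wai_totals))
```

With these made-up scores only the sixth session (index 5) falls below the lower limit. Because quartiles are barely affected by a single extreme value, the limit stays tight, which is the robustness property the text refers to.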


Results

We will present the results comparing alliance-rupture detection between the observational 3RS and the self-reported WAI. We will first present the results of the global comparison between the two methods and then the degree of agreement between methods at the session level. Our sample included 14 successful cases, 17 dropped cases, and 7 unsuccessful cases. Table 1 presents quantitative descriptions of the WAI and the confrontation and withdrawal scores.

At first glance, the two methods seemed to capture similar findings (see Figure 1). The WAI scores for dropped cases increased in the beginning phase of therapy and then decreased. The decrease in WAI score occurred at the same time that confrontation and withdrawal scores increased. In addition, we found a consistency among measures for successful cases: a pattern of high and stable WAI scores that gradually increased, as well as low and stable levels of confrontation and withdrawal scores. We observed the same consistency when we examined data from the concluding phase of treatment. As mentioned above, patients who dropped out of treatment differed from those who completed therapy; specifically, their WAI score decreased and their confrontation and withdrawal scores increased just before they left therapy.

Although the aforementioned findings illustrate a global consistency between the two methods, those results do not allow us to determine whether they were able to detect the rupture events during the same sessions. To address this question, we examined the WAI scores to identify sessions with rupture events. We defined a rupture episode as a time point during which the WAI decreased and the variability between the previous and the current time point was higher than the estimated sample average. To identify sessions with rupture episodes with the 3RS, we used the 3RS criterion of withdrawal or confrontation scores ≥3 [3 = withdrawal/confrontation rupture(s) occurred, significant to the alliance].
As Table 2 shows, only 2 of the 13 (15%) sessions in which a rupture was detected using the WAI criterion for a significant decrease in alliance scores had confrontation scores ≥3. Conversely, only 2 of the 25 (8%) sessions that contained a confrontation score ≥3 also met the WAI criterion for rupture detection. The Cohen’s kappa value that indicates the agreement between systems was low (κ = .02), suggesting a poor agreement between the two systems of rupture detection. We compared the decrease in WAI with the scores that indicated withdrawal ruptures (as scored by the raters), and we found that only 5 of the 34 (15%) sessions with rupture withdrawal markers also had a rupture detected by the WAI (see Table 3). Conversely, only 5 of the 13 (38%) sessions with WAI score decreases also met the criterion for an alliance rupture according to the withdrawal scale. Also, in this case, the Cohen’s kappa value of .12 indicated a poor agreement between the two systems of rupture detection. The data suggest that there are discrepancies between the self-report and observation-based measures. Only two sessions with high confrontation scores and only five sessions with high withdrawal scores had been marked as sessions with rupture events by the WAI. That is, the 3RS detected more rupture episodes than the WAI.

Table 1
Summary of the WAI, Confrontation, and Withdrawal Score Rupture Indicators

Variable        Minimum   Maximum   Mean    Standard deviation   Number of ruptures detected
WAI             31        119       97.8    15.5                 13
Confrontation   1         5         1.55    0.92                 25
Withdrawal      1         5         1.67    0.88                 34
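
The agreement statistic reported above can be illustrated with a short Python sketch that computes Cohen's kappa from a 2 × 2 table. Tables 2 and 3 are not reproduced in this section, so the cell counts below are derived from the counts in the text under the assumption that all 201 rated sessions form the denominator; that assumption, and the function name, are not confirmed by the source.

```python
def cohen_kappa_2x2(both, only_a, only_b, neither):
    """Cohen's kappa for two binary classifications of the same sessions."""
    total = both + only_a + only_b + neither
    p_observed = (both + neither) / total
    p_a = (both + only_a) / total   # proportion flagged by method A
    p_b = (both + only_b) / total   # proportion flagged by method B
    p_expected = p_a * p_b + (1 - p_a) * (1 - p_b)
    return (p_observed - p_expected) / (1 - p_expected)


N_SESSIONS = 201  # assumed denominator: all rated sessions

# WAI criterion (13 sessions) vs. confrontation score >= 3 (25), 2 shared.
k_confrontation = cohen_kappa_2x2(
    both=2, only_a=13 - 2, only_b=25 - 2,
    neither=N_SESSIONS - (13 + 25 - 2),
)

# WAI criterion (13 sessions) vs. withdrawal score >= 3 (34), 5 shared.
k_withdrawal = cohen_kappa_2x2(
    both=5, only_a=13 - 5, only_b=34 - 5,
    neither=N_SESSIONS - (13 + 34 - 5),
)

print(round(k_confrontation, 2), round(k_withdrawal, 2))
```

Under this assumption the confrontation comparison reproduces the reported κ of about .02, and the withdrawal comparison lands in the same low range as the reported .12. Any small difference would be expected if the actual tables used a slightly different set of sessions.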

Discussion

The main goal of this study was to compare the abilities of observational and self-report methods to detect alliance ruptures. When we first considered the progression of the WAI versus the withdrawal and confrontation scores throughout treatment, there seemed to be an overall consistency between the observational 3RS and the WAI self-report methods. However, when we more closely examined the data to determine if the two methods detected ruptures during the same sessions, we found discrepancies between the decreases in the WAI scores and the confrontation or withdrawal rupture events detected by the 3RS. In 92% of the sessions in which the 3RS detected ruptures (as confrontations), the decrease in the WAI score did not identify ruptures. Furthermore, in 84% of the sessions in which the WAI score decrease detected ruptures, the 3RS did not detect ruptures based on the confrontation scores. The same occurred for withdrawal ruptures per the 3RS: the WAI decrease detected ruptures in only 15% of the sessions in which the 3RS found them. In only 38% (5 of 13) of the sessions in which the WAI decrease detected rupture events did the 3RS withdrawal method also detect ruptures. In both cases, the low kappa values that we found indicate a poor agreement between systems regarding the presence of a rupture in a session. The discrepancies between the self-report and the observational method suggest that ruptures may not be as accurately identified when patients assess the quality of their therapeutic alliance through a self-report questionnaire compared with judges who use an observational system like the 3RS. We found that the 3RS identified more ruptures than the WAI score decrease method.
Our results are consistent with a previous study (Sommerfeld, Orbach, Zim, & Mikulincer, 2008) that found a difference between a self-report measure of ruptures, in which clients were asked directly about the occurrence of these events in the Postsession Questionnaire (42% of ruptures reported), and an observer-based measure based on Harper’s coding system (77% of ruptures reported). The differences among WAI score decreases, withdrawal rupture scores, and confrontation rupture scores raise the question of whether these methods actually examine different phenomena (the broad construct of the alliance vs. the construct of alliance rupture events) or different levels of the same phenomenon. The point is that the same phenomenon (alliance ruptures) is evaluated at different levels of analysis: the 3RS assesses ruptures at the segment level (moment-to-moment interaction), whereas the WAI criteria assess them at the session level. In addition, the 3RS reflects an observer perspective and the WAI a self-report perspective, as pointed out by Eubanks-Carter et al. (2012). Moreover, discrepancies may also be the result of a delay in time between the emergence of a rupture

Figure 1. Nonparametric estimates of the population averages of WAI, confrontation and withdrawal scores for dropouts, successful cases, and unsuccessful cases. Note. The solid black line represents the nonparametric estimate for population average, and the black dashes represent the 95% confidence intervals.

and the time at which a patient rates the alliance on the WAI. There were 11 sessions in which the WAI detected ruptures but the confrontation scale did not, probably because the negative events that the client experienced during those sessions led to a decrease in his or her WAI ratings but were not intense enough to be captured by the 3RS. Thus, the accumulation of ruptures of lower intensity may have resulted in a decrease in the quality of the alliance as evaluated by the client at the end of the session, although they were not intense enough to meet our criteria for a “rupture session” in the observational system. Finally, measures like the WAI evaluate the therapeutic alliance and not alliance ruptures per se, which means that they can only detect ruptures via alliance fluctuations from session to session. It is on the basis of these fluctuations that researchers infer the presence of a rupture from one session to the next; that is, the scores reported on the WAI are not direct measures of alliance ruptures. Moreover, because this self-report method does not examine data at a microlevel (as the 3RS does), it cannot identify changes in the quality of the therapeutic alliance within sessions; it can only detect changes that occur across sessions. Our data suggest that the two methods may be sensitive to slightly different manifestations and different levels of intensity of the same phenomenon. Different methods of measurement, each deriving from a specific way of operationalizing the construct, give us different pictures of the

same phenomenon. As Eubanks-Carter et al. (2012) argue, when we rely on the alliance score of a session, we cannot identify the ruptures that occur within the session, as observational systems can. Finally, self-report measures may be subject to bias because they are based on patients’ recall of sessions. This is consistent with the argument made by Westen and Shedler (as cited in Colli & Lingiardi, 2009) that ruptures might evoke negative feelings in the patient, making it difficult to acknowledge or recall events associated with those feelings. The discrepancy between the two methods has important implications for clinical training as well as for research. It should alert clinicians and supervisors to the importance of using observational systems to assess the quality of the therapeutic alliance. Our results suggest that the training of psychotherapists should draw on systems like the 3RS to better help trainees detect the emergence of ruptures in session. Our findings show that ruptures can easily go undetected when only self-report measures are used, which deprives therapists of the opportunity to resolve them. Observer-based methods can be useful in clinical supervision, even with experienced therapists as trainees. However, our data may also stress the importance of using both methods in a complementary fashion. We must remember that the patient’s evaluation of the quality of the alliance through self-report is a good predictor of outcome and of remaining in therapy (Horvath & Bedi, 2002). Therefore, we cannot rule out the hypothesis that the observers and the patients may simply have different opinions about what constitutes a rupture; that is, the 3RS may rate as ruptures events that patients do not consider ruptures. For example, observers would rate moments of withdrawal as ruptures, whereas patients might consider only moments of more intense negative affect to be ruptures.
In this sense, another possible explanation for the greater number of ruptures detected by the 3RS is that it may be overestimating alliance ruptures. Therefore, both methods may be useful. The clinician’s and supervisor’s attention to fluctuations in the alliance ratings may create the opportunity to review the sessions, go back over the interaction between therapist and patient, and repair undetected low-intensity but important ruptures, thus fostering the repair process at both levels of analysis. In conclusion, our study suggests that researchers and clinicians should adapt each of these methods to their own clinical goals or research questions and to the specificities of their data. As Eubanks-Carter et al. (2012) argue, using multiple methods and looking for points where the findings converge or diverge may be informative in itself.

Table 2
Cross Tabulation: WAI Score Decrease With Confrontation

                      Confrontation scale
WAI score decrease    Without ruptures    With ruptures    Total
Without ruptures      156                 23               179
With ruptures         11                  2                13
Total                 167                 25               192

Limitations

The majority of studies examining the therapeutic alliance have limited their analyses to the initial period of alliance formation. Other studies, such as Kramer, De Roten, Beretta, Michel, and Despland (2009), limited their analyses to a predefined number of sessions per case. Our study, however, included cases with different termination types (including dropped cases), thus extending the period of treatment examined past the initial stages of alliance formation. The methodology used in our study has greater ecological validity than that of other studies of the alliance. Whereas other studies only include cases with the same treatment length (in which all patients completed treatment), our study included patients who completed treatment as well as patients who terminated treatment prematurely. However, we must stress that we cannot generalize our findings to other forms of psychotherapy, as our sample only included cases of CBT. This is important because the therapeutic alliance may take different forms in different therapeutic orientations (Bordin, 1979). One important methodological weakness of our study is the way in which we defined successful versus unsuccessful cases. We acknowledge that the distinction between successful and unsuccessful therapies was based only on clinical criteria, which prevents us from comparing our study with previous research, as most other studies use a standardized measure to indicate clinically significant change in patients. In addition, the absence of a common symptom measure administered to all cases prevents us from controlling for the effect that symptom improvement may have had on the WAI ratings over time. Another limitation of our work is that we did not account for therapist variables such as level of clinical experience or adherence to CBT, both of which might influence the establishment of a working alliance between patient and therapist.
Most importantly, we also did not evaluate the therapist’s contribution to the emergence of rupture events or to rupture resolution. Another important limitation of this study is that it compared patient and observer perspectives on the therapeutic alliance and on rupture events but did not include therapist perspectives. We intend to include the therapist perspective in future studies on the therapeutic alliance and alliance rupture events. Finally, the discrepancy found between the WAI and the observational system seems to suggest that these two methods are measuring different levels of the same phenomenon. Thus, in future studies, researchers should try to address this issue by comparing these methods at the same level of analysis. This can be done by using ratings of the alliance made every 5 min of the session using the Segmented Working Alliance Inventory-Observer Form

Table 3
Cross Tabulation: WAI Score Decrease With Withdrawal

                      Withdrawal scale
WAI score decrease    Without ruptures    With ruptures    Total
Without ruptures      150                 29               179
With ruptures         8                   5                13
Total                 158                 34               192

(S-WAI-O; Berk, Safran, Muran, & Eubanks-Carter, 2010). Although this is not a self-report measure of alliance ruptures, it would allow a comparison between methods that both operate at the segment level.

References

Bordin, E. S. (1979). The generalizability of the psychoanalytic concept of the working alliance. Psychotherapy: Theory, Research and Practice, 16, 252–260. doi:10.1037/h0085885

Castonguay, L. G., Goldfried, M. R., Wiser, S., Raue, P. J., & Hayes, A. M. (1996). Predicting the effect of cognitive therapy for depression: A study of unique and common factors. Journal of Consulting and Clinical Psychology, 64, 497–504. doi:10.1037/0022-006X.64.3.497

Colli, A., & Lingiardi, V. (2009). The Collaborative Interactions Scale: A new transcript-based method for the assessment of therapeutic alliance ruptures and resolutions in psychotherapy. Psychotherapy Research, 19, 718–734. doi:10.1080/10503300903121098

Constantino, M. J., Marnell, M., Haile, A. J., Kanther-Sista, S. N., Wolman, K., Zappert, L., & Arnow, B. A. (2008). Integrative cognitive therapy for depression: A randomized pilot comparison. Psychotherapy: Theory, Research, Practice, Training, 45, 122–134. doi:10.1037/0033-3204.45.2.122

Coutinho, J., Ribeiro, E., Hill, C., & Safran, J. (2011). Therapists’ and clients’ experiences of alliance ruptures: A qualitative study. Psychotherapy Research, 21, 525–540. doi:10.1080/10503307.2011.587469

Diggle, P. J., Heagerty, P., Liang, K. Y., & Zeger, S. L. (2002). Analysis of longitudinal data (2nd ed.). Oxford: Statistical Science Series.

Dimaggio, G., Nicolò, G., Fiore, D., Centenero, E., Semerari, A., Carcione, A., & Pedone, R. (2008). States of minds in narcissistic personality disorder: Three psychotherapies analyzed using the grid of problematic states. Psychotherapy Research, 18, 466–480. doi:10.1080/10503300701881877

Eames, V., & Roth, A. (2000). Patient attachment orientation and the early working alliance–A study of patient and therapist reports of alliance quality and ruptures. Psychotherapy Research, 10, 421–434. doi:10.1093/ptr/10.4.421

Eubanks-Carter, C., Gorman, B. S., & Muran, J. C. (2012). Quantitative naturalistic methods for detecting change points in psychotherapy research: An illustration with alliance ruptures. Psychotherapy Research, 22, 621–637.

Eubanks-Carter, C., Mitchell, A., Muran, J. C., & Safran, J. D. (2009). Rupture resolution rating system (3RS): Manual. Unpublished manuscript.

Eubanks-Carter, C., Muran, J. C., & Safran, J. D. (2010). Alliance ruptures and resolution. In J. C. Muran & J. P. Barber (Eds.), The therapeutic alliance: An evidence-based guide to practice. New York: Guilford Press.

Eubanks-Carter, C., Muran, J. C., Safran, J. D., & Hayes, J. A. (2011). Interpersonal interventions for maintaining an alliance. In L. M. Horowitz & S. Strack (Eds.), Handbook of interpersonal psychology: Theory, research, assessment, and therapeutic interventions (pp. 519–531). Hoboken, NJ: John Wiley & Sons.

First, M. B., Spitzer, R. L., Gibbon, M., & Williams, J. B. W. (1995). Structured clinical interview for DSM–IV. New York: Biometrics Research Department, New York Psychiatric Institute.

Golden, B., & Robbins, S. (1990). The working alliance within time-limited therapy: A case analysis. Professional Psychology: Research and Practice, 21, 476–481. doi:10.1037/0735-7028.21.6.476

Gonçalves, A. (2009). Compreensão da mudança terapêutica a partir da co-construção de episódios terapêuticos significativos (Unpublished dissertation). University of Minho, Braga.

Harper, H. (1989a). Coding guide I: Identification of confrontation challenges in exploratory therapy. Sheffield, England: University of Sheffield.

Harper, H. (1989b). Coding guide II: Identification of withdrawal challenges in exploratory therapy. Sheffield, England: University of Sheffield.

Hoaglin, D. C., Mosteller, F., & Tukey, J. W. (Eds.). (2000). Understanding robust and exploratory data analysis. New York: Wiley-Interscience.

Horvath, A. O. (2011). Alliance in common factor land: A view through the research lens. Research in Psychotherapy, 14, 121–135. Retrieved from http://www.researchinpsychotherapy.net

Horvath, A. O., & Bedi, R. P. (2002). The alliance. In J. C. Norcross (Ed.), Psychotherapy relationships that work (pp. 37–70). New York: Oxford University Press.

Horvath, A. O., Del Re, A. C., Fluckiger, C., & Symonds, D. (2011). Alliance in individual psychotherapy. Psychotherapy (Chicago, Ill.), 48, 9–16. doi:10.1037/a0022186

Horvath, A. O., & Greenberg, L. S. (1989). Development and validation of the Working Alliance Inventory. Journal of Counseling Psychology, 36, 223–233. doi:10.1037/0022-0167.36.2.223

Horvath, A. O., & Symonds, B. D. (1991). Relation between working alliance and outcome in psychotherapy: A meta-analysis. Journal of Counseling Psychology, 38, 139–149. doi:10.1037/0022-0167.38.2.139

Kivlighan, D. M., & Shaughnessy, P. (2000). Patterns of working alliance development: A typology of client’s working alliance ratings. Journal of Counseling Psychology, 47, 362–371. doi:10.1037/0022-0167.47.3.362

Kokotovic, A., & Tracey, T. (1990). Working alliance in the early phase of counseling. Journal of Counseling Psychology, 37, 16–21. doi:10.1037/0022-0167.37.1.16

Kramer, U., De Roten, Y., Beretta, V., Michel, L., & Despland, J. (2009). Alliance patterns over the course of short dynamic psychotherapy: The shape of productive relationships. Psychotherapy Research, 19, 699–706. doi:10.1080/10503300902956742

Little, R., & Rubin, D. (1987). Statistical analysis with missing data. London: John Wiley & Sons.

Machado, P. P., & Horvath, A. (1999). Inventário da Aliança Terapêutica – W. A. I. In M. R. Simões, M. M. Gonçalves, & L. S. Almeida (Eds.), Testes e provas psicológicas em Portugal (Vol. 2, pp. 87–94). Braga: APPORT/SHO.

Martin, D. J., Garske, J. P., & Davis, M. K. (2000). Relation of the therapeutic alliance with outcome and other variables: A meta-analytic review. Journal of Consulting and Clinical Psychology, 68, 438–450. doi:10.1037/0022-006X.68.3.438

Muran, J. C., Safran, J. D., Gorman, B. S., Eubanks-Carter, C., & Banthin, D. (2008, June). Identifying ruptures & their resolution from post session self-report measures. Paper presented at the annual meeting of the Society for Psychotherapy Research, Barcelona, Spain.

Muran, J. C., Safran, J. D., Samstag, L. W., & Winston, A. (1992). Patient and therapist postsession questionnaires, Version 1992. New York: Beth Israel Medical Center.

Nagy, J., Safran, J. D., Muran, J. C., & Winston, A. (1998, June). A comparative analysis of treatment process and therapeutic ruptures. Paper presented at the international meeting of the Society for Psychotherapy Research, Snowbird, UT.

Patton, M. J., Kivlighan, D. M., & Multon, K. D. (1997). The Missouri Psychoanalytic Counseling Research Project: Relation of changes in counseling process to client outcomes. Journal of Counseling Psychology, 44, 189–208. doi:10.1037/0022-0167.44.2.189

Podsakoff, P. M., MacKenzie, S. B., Lee, J., & Podsakoff, N. P. (2003). Common method biases in behavioral research: A critical review of the literature and recommended remedies. Journal of Applied Psychology, 88, 879–903. doi:10.1037/0021-9010.88.5.879

Regan, A. M., & Hill, C. E. (1992). Investigation of what clients and counsellors do not say in brief therapy. Journal of Counseling Psychology, 39, 168–174. doi:10.1037/0022-0167.39.2.168

Rennie, D. L. (1994). Client’s deference in psychotherapy. Journal of Counseling Psychology, 41, 427–437. doi:10.1037/0022-0167.41.4.427

Rhodes, R. H., Hill, C. E., Thompson, B. J., & Elliot, R. (1994). Client retrospective recall of resolved and unresolved misunderstanding events. Journal of Counseling Psychology, 41, 473–483. doi:10.1037/0022-0167.41.4.473

Safran, J. D., & Muran, J. C. (2000). Negotiating the therapeutic alliance: A relational treatment guide. New York: Guilford Press.

Safran, J. D., & Muran, J. C. (2006). Has the concept of the alliance outlived its usefulness? Psychotherapy: Theory, Research, Practice, Training, 43, 286–291. doi:10.1037/0033-3204.43.3.286

Safran, J. D., Muran, J. C., & Eubanks-Carter, C. (2011). Repairing alliance ruptures. Psychotherapy, 48, 80–87. doi:10.1037/a0022140

Safran, J. D., Muran, J. C., & Samstag, L. W. (1994). Resolving therapeutic alliance ruptures: A task analytic investigation. In A. O. Horvath & L. S. Greenberg (Eds.), The working alliance: Theory, research and practice (pp. 225–255). New York: Wiley.

Safran, J. D., & Segal, Z. V. (1990). Interpersonal process in cognitive therapy. New York: Basic Books.

Shrout, P. E., & Fleiss, J. L. (1979). Intraclass correlations: Uses in assessing rater reliability. Psychological Bulletin, 86, 420–428. doi:10.1037/0033-2909.86.2.420

Sommerfeld, E., Orbach, I., Zim, S., & Mikulincer, M. (2008). An in-session exploration of ruptures in working alliance and their associations with clients’ core conflictual relationship themes, alliance-related discourse, and clients’ postsession evaluations. Psychotherapy Research, 18, 377–388. doi:10.1080/10503300701675873

Stiles, W. B., Glick, M. J., Osatuke, K., Hardy, G. E., Shapiro, D. A., Agnew-Davies, R., Rees, A., & Barkham, M. (2004). Patterns of alliance development and the rupture–repair hypothesis: Are productive relationships U-shaped or V-shaped? Journal of Counseling Psychology, 51, 81–92. doi:10.1037/0022-0167.51.1.81

Strauss, J. L., Hayes, A. M., Johnson, S. L., Newman, C. F., Brown, G. K., Barber, J., Laurenceau, J., & Beck, A. T. (2006). Early alliance ruptures and symptom change in a nonrandomized trial of cognitive therapy for avoidant and obsessive–compulsive personality disorders. Journal of Consulting and Clinical Psychology, 74, 337–345. doi:10.1037/0022-006X.74.2.337

Wampold, B. E. (2001). The great psychotherapy debate. Mahwah, NJ: Erlbaum.

Appendix

Statistical Formulas of the Parametric Model

Let $Y_{ij}$ be the WAI score measured for patient $i$ in session $j$. Recall that $j$ is the session number, so consecutive measurement times are, by design, one unit apart. However, there are situations of intermittent missing data, which are assumed to be missing completely at random (Little & Rubin, 1987). The model is

$$Y_{ij} = \mu_{ij} + U_i + W_i(t_{ij}) + \varepsilon_{ij},$$

where $U_i$ is the patient-specific random intercept with distribution $N(0, \pi^2)$; $W_i(t_{ij})$ is the stochastic process that represents the correlation between measurements within the same patient, with variance $\sigma^2$ and correlation structure

$$\mathrm{Corr}[W_i(t_{ij}), W_i(t_{ik})] = \exp(-\varphi\,|t_{ij} - t_{ik}|);$$

and $\varepsilon_{ij}$ is the measurement error that cannot be explained by the other terms, with distribution $N(0, \pi^2)$. The component $\mu_{ij}$ represents the expected value of the WAI (i.e., $\mu_{ij} = E[Y_{ij}]$) and can be interpreted as the average WAI score for patient $i$ at time $j$. In this case, the specified model is

$$\mu_{ij} = E[Y_{ij}] = (\beta_0 + \beta_2 t_{ij})\, I(i = \text{Axis I}) + (\beta_1 + \beta_3 t_{ij})\, I(i = \text{Axis II}).$$

Received January 27, 2012
Revision received January 24, 2013
Accepted February 4, 2013
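
For the parametric model in the Appendix, it can help to see the within-patient covariance that the three random terms imply. The sketch below is illustrative only and is not the authors' code. It assumes that the random intercept, the stochastic process, and the measurement error are mutually independent, and it treats the intercept and error variances as separate parameters. All function names, parameter names, and numeric values are hypothetical.

```python
import math


def covariance(t_j, t_k, var_intercept, var_process, phi, var_error):
    """Covariance between two WAI scores of the same patient at times t_j and t_k."""
    # Random intercept: shared by every pair of sessions of the same patient.
    # Stochastic process: contribution decays exponentially with the time lag.
    cov = var_intercept + var_process * math.exp(-phi * abs(t_j - t_k))
    if t_j == t_k:
        cov += var_error  # measurement error adds only to the variance
    return cov


def correlation(lag, var_intercept, var_process, phi, var_error):
    """Correlation between two scores of the same patient that are `lag` sessions apart."""
    args = (var_intercept, var_process, phi, var_error)
    return covariance(0, lag, *args) / covariance(0, 0, *args)


# Hypothetical parameter values, chosen only for illustration
params = dict(var_intercept=1.0, var_process=2.0, phi=0.5, var_error=1.0)

for lag in range(5):
    print(lag, round(correlation(lag, **params), 3))
```

Under these assumptions the correlation between sessions falls as the lag grows and levels off at the share of the total variance that is due to the random intercept.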