difference is less than zero. Since Figure 2 is a probability distribution, it is easy to determine the probability that the true difference in the effect of the treatment is greater than, less than, or equal to any selected value. The standard deviation suggests the shape of the distribution; distributions with small standard deviations relative to their mean are tall and narrow, indicating a high degree of certainty regarding the result.
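
As a sketch of how such probabilities follow from a summary distribution, assume (purely for illustration; the distributional form is not stated here) that the difference in treatment effect is approximately normal. The symbols δ (the true difference), μ (its mean), σ (its standard deviation), and c (the selected value) are notation introduced for this example, and Φ is the standard normal cumulative distribution function:

```latex
P(\delta < c) = \Phi\!\left(\frac{c - \mu}{\sigma}\right),
\qquad
P(\delta > c) = 1 - \Phi\!\left(\frac{c - \mu}{\sigma}\right)
```

Setting c = 0 gives the probability that the difference is less than zero. Under this assumption, a small σ relative to μ makes that probability small, which is the "tall and narrow" case described above.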

Several factors can compromise the internal validity of the meta-analyses. First, while the random effects model accounts for among-study variations and, therefore, for random bias, it cannot account for any systematic biases that occurred in all of the studies. Second, to be included in the meta-analysis, studies had to include sufficient data to permit calculation of the percent response for each treatment, based on a modified intent-to-treat analysis that used either the HAM-D or the CGI (medication trials) or the BDI (psychotherapy trials). If studies without sufficient data to permit inclusion were fundamentally different from those that were included, summary statistics may be biased. Similarly, a variety of publication biases (particularly the tendency to publish only those studies with positive findings) may result in biased summary statistics.

On the other hand, the hierarchical random effects model is robust. Sensitivity analyses reveal that it would take a huge number of very large studies to change the results in any important way.

Threats to the external validity of the meta-analysis relate primarily to the generalizability of the study populations. While most studies enrolled a well-characterized group of patients with major depressive disorder, others included small, but unspecified, numbers of patients with bipolar disorder or other psychiatric co-morbidities. It is, therefore, difficult to state with certainty the patient populations that the meta-analyses describe. Evidence suggests that patients with different types of depression have different prognoses and react differently to treatment. For example, the placebo response rate for nonpsychotic major depressive disorder is on the order of 25 percent, while the placebo response rate for psychotic depressions is only about 10 percent (Schatzberg and Rothschild, in press). In the absence of knowledge about the exact case mix included in the meta-analysis, caution is necessary to avoid overstating the degree of certainty that is assigned to the results.

Limitations in Comparisons Across Trials. After the meta-analyses have been performed, there is a great temptation to make direct comparisons between summary statistics for medication and psychotherapy trials. Such comparisons should be approached with caution because of the following limitations:

■ Patients who agree to participate in studies that provide both medication and psychotherapy treatment cells may differ from patients who participate in medication trials with two active medication cells or in psychotherapy trials that do not offer a medication cell.

■ Psychotherapy studies typically last longer, which, on the one hand, increases the chances of a spontaneous remission and prolongs exposure to treatment but, on the other hand, increases the opportunity for dropout.

■ The pill placebo controls used in medication studies are not equivalent to the wait-list controls employed in psychotherapy trials. The pill placebo controls receive the same support and case management that the active medication group receives. Many medication trials contain such a pill placebo arm. What is the appropriate control for a talking therapy? Only one psychotherapy trial containing a pill placebo control has been conducted to date (Elkin, Shea, Watkins, et al., 1989). More commonly, patients kept on a wait-list constitute the control group. Most proponents of talking therapies would argue that it is not just the talking, but the content of the talking, that makes the therapy effective. Therefore, the appropriate control should be some type of nonspecific talking that is devoid of the "active" content of the therapy in question so that controls still receive the support and case management that active psychotherapy patients receive. The appropriateness of the placebo becomes important in attempting to compare the treatment-placebo difference across treatment types. A medication that is 30 percent better than its pill placebo control may be more effective than, less effective than, or as effective as a psychotherapy that is 40 percent better than its wait-list control. However, one must suspect that those in the pill placebo group received more support, guidance, and reassurance than did those placed on a wait-list, making the comparison between medication and placebo more stringent than that between psychotherapy and wait-list.

■ The BDI and other self-reports commonly used in psychotherapy trials may be less sensitive (or slower to show improvement) than the clinician ratings (i.e., HAM-D, CGI) typically used in medication trials. Medication trials usually require greater symptom severity for study entry than do psychotherapy trials. Thus, less severely ill patients are typically studied with psychotherapy, while more severely ill patients are studied in medication trials.

Given these limitations, comparisons of medication and psychotherapy trials should not be given undue importance and should be considered hypothesis-generating rather than hypothesis-testing.

Generalizability Issues. Although randomized trials provide the best evidence for the efficacy of a treatment in a specific type of patient, their applicability to the community at large may be limited by the trials' stringent enrollment criteria, unique treatment settings, and unrepresentative clinical procedures. While researchers are working to design methods that address this problem (see Cross Design Synthesis: A New Strategy for Medical Effectiveness Research, US GAO B244808, 1992), these methods were not yet sufficiently developed for use in this project. For these reasons, the panel restricted its analysis to clinical trials.

Two caveats must be emphasized, however: one related to patient populations and the other to treatment quality.

The purpose of these guidelines is to make recommendations regarding the treatment of major depressive disorder in primary care settings. Analyses rely on the results of clinical trials that were typically performed in academic psychiatric settings and involved patients without other medical problems. This methodology raises concerns. First, clinical trials require patient consent and participation. Many patients decline a protocol for various reasons (e.g., too severely ill to accept the additional measurement procedures, fear of being placed on a placebo or wait-list, desire for medication, desire for psychotherapy). Thus, patients who enroll in trials may not be representative of the population of interest. Second, clinical trials of medications are often sponsored by pharmaceutical companies seeking Food and Drug Administration (FDA) approval for their products. To optimize the chances of a good result and to avoid problems with human subject protection committees, these studies typically exclude patients with important medical co-morbidities. Because many patients treated in primary care settings have such co-morbidities, the population of interest is different from the population summarized by the meta-analysis. It is not possible to determine, from the available evidence, how depressed patients with medical co-morbidities will respond to the various therapies. While the few studies done in primary care settings demonstrate response rates similar to those found in psychiatric settings, they are few in number, and they, too, have generally excluded patients with medical co-morbidities.

Randomized controlled trials are often conducted in research settings and always follow a prespecified protocol. Psychotherapy is often specified in a written "manual," with therapists restricted to particular procedures in research trials. For these reasons, the type of treatment provided as part of a clinical trial may differ substantially from that provided in routine practice. Thus, inferences from research trials to day-to-day practice remain tentative, as protocol-driven treatments may perform better or worse than do their community practice counterparts.

Potential Problems with Meta-Analysis Interpretation. Any time a new method for generating numerical output (in this case, meta-analysis) becomes available, there is potential for misunderstanding and abuse of the numbers. The panel offers some warnings regarding the meta-analysis presented in this Clinical Practice Guideline.

It is important not to attach undue significance to small differences. Figure 3 depicts the results of two meta-analyses, one for treatment A, with a success rate of 34 percent (SD 12), and one for treatment B, with a success rate of 28 percent (SD 11). While comparison of the means reveals that A is 6 percentage points better than B, there is about a 34 percent chance that B is actually better than A. Therefore, it would be improper to conclude with any certainty that A is superior.

Figure 3. Comparison of meta-analysis results for treatments A and B

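
The overlap between the two distributions can be checked roughly with a short calculation. The sketch below is not the panel's method: it assumes the two success rates are independent and normally distributed (neither is stated in the text), and the function names are hypothetical. Under those assumptions the difference A minus B is normal with mean 34 - 28 = 6 and standard deviation sqrt(12² + 11²), and the chance that B exceeds A is the probability that this difference falls below zero.

```python
from math import erf, sqrt


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def prob_b_exceeds_a(mean_a: float, sd_a: float,
                     mean_b: float, sd_b: float) -> float:
    """P(B > A), ASSUMING A and B are independent and normally distributed."""
    diff_mean = mean_a - mean_b              # mean of (A - B)
    diff_sd = sqrt(sd_a ** 2 + sd_b ** 2)    # SD of (A - B) under independence
    return normal_cdf(-diff_mean / diff_sd)  # P(A - B < 0)


# Success rates and SDs for treatments A and B, taken from the text above.
p = prob_b_exceeds_a(34.0, 12.0, 28.0, 11.0)
print(f"P(B better than A) = {p:.2f}")
```

With these inputs the approximation gives a probability of roughly 0.36, in the same range as the figure of about 34 percent quoted above. The small gap may reflect the shape of the actual distributions, which this sketch does not reproduce.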

It is also important not to make improper inferences regarding numbers that are the same. In many cases, the data show that several treatments have similar response rates. These data could lead to the assertion that it is necessary only to use the least expensive agent. This assertion is true if the same patients respond to each treatment. However, there is strong evidence for biologic and psychological heterogeneity among patients with major depressive disorder, which is evident in the differential response of patients to medication (Goodwin and Jamison, 1990; Rush, Cain, Raese, et al., 1991). Thus, for a particular patient, drug A may be ineffective, while drug B may be quite effective; in another patient, the opposite may be true. Furthermore, some research suggests that certain medications may be effective earlier in the longitudinal course of recurrent mood disorders, while others may be better in more longstanding cases (Post, 1992). Similarly, clinical pharmacology studies and clinical experience provide evidence that patients differ in the nature, likelihood, and severity of the side effects that they experience with a medication. One patient may become sedated on a drug, another may develop insomnia, while others have no sleep difficulties. This heterogeneity suggests that more than one agent must be available to ensure adequate treatment of all patients.
