
ness from one year to another; Sydenham's observations on the undetermined or unknown influence of what he calls the epidemic genius prove it. To be valid, comparative experiments have therefore to be made at the same time and on as comparable patients as possible. In spite of that, such comparisons still bristle with immense difficulties which physicians must strive to lessen; for comparative experiment is the sine qua non of scientific experimental medicine; without it a physician walks at random and becomes the plaything of endless illusions. A physician who tries a remedy and cures his patients is inclined to believe that the cure is due to his treatment. Physicians often pride themselves on curing all their patients with a remedy that they use. But the first thing to ask them is whether they have tried doing nothing, i.e., not treating other patients; for how can they otherwise know whether the remedy or nature cured them? Gall wrote a little known book on the question as to what is nature's share and what is the share of medicine in healing disease, and he very naturally concludes that their respective shares are quite hard to assign. We may be subject daily to the greatest illusions about the value of treatment, if we do not have recourse to comparative experiment. I shall recall only one recent example concerning the treatment of pneumonia. Comparative experiment showed, in fact, that treatment of pneumonia by bleeding, which was believed most efficacious, is a mere therapeutic illusion.

From all this, I conclude that comparative observation and experiment are the only solid foundation for experimental medicine, and that physiology, pathology and therapeutics must be subject to this criticism in common.

Influenced by such teaching, modern doctors have developed certain habits of thought with regard to the evaluation of new therapeutic agents. Some of these ways of thinking are positive and some negative. One of the latter is manifest by a distrust of the unorganized clinical experience, of the impressions one gets and retains while busy with the practice of medicine. The human mind is so likely to play tricks on us; we remember the spectacular case rather than the general average, and, humanly enough, we are more prone to remember our successes than our failures. Too often in medical school 40 years ago, I heard opinions loudly stated and vigorously defended by practitioners of wide experience, only to learn later how flimsy was their foundation.

Turning to the positive side, let us now consider the means by which we hope to avoid the many pitfalls surrounding the judgment of new remedies, for we have methods that our medical ancestors did not possess. Let us discuss these methods and give special attention to their deficiencies, for no method is perfect. And we must do everything we can to avoid the predicament in which Rush found himself.

In any discussion of the means available to determine the clinical effectiveness of new therapeutic agents, one fact is self-evident: the method needed will depend on how effective the agent is. If very effective, a simple method will suffice; if of little benefit, a much more elaborate method will be needed to detect that little benefit.

When I first saw insulin given to a child in diabetic coma and saw the child recover, little further study was needed. My confreres and I had seen many patients in diabetic coma, but we had never before seen one recover.

When the administration of liver, first given to a patient with pernicious anemia, was followed by rapid improvement in the blood count, we were much impressed, but since all of us had seen spontaneous remissions in that disease, it took a small series of such observations to convince us.

When the early antibiotics began to shorten the course and cut down the mortality of the "captain of the men of death," lobar pneumonia, we were all delighted, but we knew well that, for a reason altogether unknown, pneumonia was less severe and less lethal in some years than in others. Thus, studies of longer duration were required before definitive evidence could be secured.

These familiar facts, drawn from developments which occurred during my medical lifetime (a lifetime which has witnessed an advance in medical therapeutics greater than that of the preceding 500 years), are mentioned here to emphasize an important point. If the therapeutic action of a new drug is spectacular, the time-honored subjective clinical methods, relying on observation, experience, clinical judgment, and impression, will suffice to establish its usefulness. The modern refinements of these methods find their greatest usefulness in evaluating the group of remedies whose effectiveness is not of the first order. This is not to decry these less spectacular remedies; a bit of aid may tip the balance of any patient in the direction of life and health, and everything of any value must be identified. The difficulty with the methods relying on clinical impressions has been not so much that important remedies were missed, but that useless remedies have been considered important.

Indeed, in the days when clinical impression was depended on for such judgments, the opinion of experienced doctors concerning the value of many remedies differed widely. Hobart Hare, in his book on therapeutics published in 1904, clearly implied, if he did not actually state, that the hundreds of ancient remedies described therein were all of real value; on the other hand, Richard Cabot, Osler, and others of the pathological school were equally convinced that all but a handful of remedies were utterly worthless. It was to bring light into this darkness that the newer methods were devised. And before discussing them, let us recall that doctors may differ in what may be called their therapeutic philosophy. Thus, one may hold that it is perfectly proper to try any supposed remedy which does not do demonstrable harm, while others may hold that no remedy should be given that does not do demonstrable good. The majority of doctors will doubtless be found with a philosophical position somewhere between these two extremes.

Let us, therefore, consider the ideas underlying the methods on which we now rely to determine the importance of new therapeutic agents. Our chief reliance is on the "experimental method" which ideally requires: (1) the production of a steady state so that interfering factors will be absent; (2) the introduction of one extraneous factor, such as the administration of a drug; and (3) the observation of any deviations from the steady state which follow, for these changes are to be attributed to the factor introduced.

Animal Experiments

We think that no study of a new drug is complete without testing it by animal experiments. Yet this method has many deficiencies. Drug action is not necessarily similar in different species of animals; morphine depresses the dog, but excites the cat. The lethal dosage may vary greatly from one species to another. Drug action in men may differ from that in animals. The anesthetics and operative procedures used in so many animal experiments may affect the result. Isolated organs may react differently than those subjected to normal nervous and endocrine influences. As clinicians, our interest centers about the action of drugs in disease; however, many human diseases cannot be reproduced in animals, and other diseases take forms which are very different in animals than in man. Also, our knowledge of drug action in sick animals is very limited.

But such difficulties, while real enough, are a minor matter in comparison with the advantages gained by careful animal experiments. With very few exceptions, drug action demonstrated in animal experiments has proved to be similar to that found in man. Such experiments also provide opportunity for an analysis of drug action far more exact than that which is possible in man.

Indeed, results secured in animal experiments have often pointed the way to revisions of clinical judgment. A conspicuous example of this is the case of strychnine. For a long time clinicians were in agreement that this drug was a valuable heart stimulant; this belief was still current among the older and less well-informed members of our profession at the beginning of my medical career. But, after animal experiments had failed to demonstrate any effect of this drug on the heart, the clinical evidence was reviewed, found wanting, and the drug is no longer used for that purpose. Countless other examples could be cited to show that results obtained in animal experiments have often pointed the way to better clinical use of drugs.

In short, animal experiments are invaluable to us, and no new drug should be given to any patient before it has had extensive trial on animals. Such trials can never prove that any drug will benefit our patient, Mrs. Smith, but they may clearly indicate that benefit may reasonably be expected to follow and that there is little chance of Mrs. Smith being harmed by such therapy.

Experiments on Man

It was during my medical lifetime that experiments on man became feasible. I well recall the reception accorded to one of my earliest experiments of this kind. I had become interested in the production of albuminuria by renal vasoconstriction, which I was easily able to demonstrate in animal experiments. Wishing to extend these observations, I gave myself and several others the renal vasoconstrictor, ephedrine, a new drug which had been studied in China and just introduced to this country by Chen and Schmidt. Using delicate methods of examination, I found that this drug did produce transient albuminuria in most of us. However, since the protein was not present in sufficient quantity to be detected by the clinical tests in common use, the albuminuria I produced experimentally was less conspicuous than that which commonly follows severe exercise in many healthy persons. This experience, written up for publication and sent to the editor of a leading American journal, was received with horror. I had been experimenting on my patients! The editor was mollified when he learned that ephedrine, though new to this country, had been used in Chinese medicine for many centuries, but his reaction was typical of the time.

Nowadays, imbued with the ideals of the experimental method and conscious of the great advances which have resulted from it, we do not hesitate to experiment on ourselves and on others. In a sense, the best therapeutics is always experimental. The doctor gives his patient, Mrs. Smith, the drug he believes most likely will help her, but he has no positive knowledge that it will. If the experiment fails to benefit Mrs. Smith, the doctor tries something else. This experimental attitude was possessed by the teachers who had most influence on me, and it is the way I hope to be treated when I am ill. Indeed, I myself believe that in the future drugs will be given to the sick, not only to attack the cause of their disability, but also for diagnosis in order to evaluate the cause of the disability. When people react differently to the same drug, it is because they are different; and it is my belief that such differences will be found to be important. An advance in this direction would require the doctor to spend much more time with each patient than he now does. At present only one group of physicians, the anesthetists, stay with their patients to determine and control the action of the drugs they give. I would like to predict that such close observation of drug action in our patients would be profitable if extended more widely. As soon as one graduates from the conception that disease is altogether anatomical and realizes its physiological aspects, drugs become diagnostic agents of infinite possibilities.

While much of this is for the future, giving new drugs to human volunteers has become a recognized part of the study of drugs. Obtaining the volunteers has not proved as great a problem as one might suppose. Medical students have always been conspicuous in coming forward. Large numbers of conscientious objectors, laudably anxious to demonstrate that their scruples against military duty were not due to lack of courage, volunteered for painful and sometimes dangerous experiments during the last war and have continued to do so since that time. Prisoners in penal institutions have often been active in volunteering for such experiments, to their great credit.

Thus, the testing of new therapeutic agents on normal volunteers has become a recognized procedure of great value. Needless to say, the observations that can be made on volunteers are more limited than those that can be obtained in animal experiments, but difficulties due to species differences can be avoided. Although drugs may act differently on the sick than on those who are well, it seems evident that no drug should be tried on the sick that has not been tried on the well, and our opportunities for such testing are steadily increasing. The testing of drugs on volunteers consenting to be made ill has as yet been limited, as far as I know, to studies on malaria and certain infections of the skin.

The information that may be obtained from such experiments on volunteers, like that obtained from animal experiments, provides us with knowledge of the basic action of the drugs tested. If we fully understand the mechanism that has gone wrong in our patient, Mrs. Smith, and as fully understand the action of our drugs, we should be able to select from our armamentarium those drugs that would benefit Mrs. Smith. Unhappily, our knowledge of the basic mechanism of Mrs. Smith's disability is so likely to be deficient that such a reasoned attack may fall far short of success. So the next logical step is the clinical trial of the new drug on Mrs. Smith and similar patients. But here we must pause to ask how we are to avoid the pitfalls that beset the clinical trials of Dr. Rush. The new methods of thinking, by which we hope to overcome such difficulties, will be discussed next. These methods can be used with advantage in interpreting results obtained in animal experiments and on healthy men, as well as in clinical trials.

Statistics

Medicine, like philosophy, may be described as the sum total of all knowledge, in the sense that it is our duty to draw from every body of knowledge anything which promises to help our patients. Medicine is far older than the sciences. Of the developing sciences, chemistry was drawn on first, as for analysis of urine and blood, and physics a little later; roentgenology is its greatest contribution so far. Much more recently we have begun to draw on mathematics, and we have done so with great success.

Training in mathematics has never been a necessary or conspicuous part of the doctor's education, and one wonders how well the average practitioner is equipped to interpret the statistical analysis which accompanies the best medical papers of today. The danger is that, not fully understanding the mathematical background, doctors will attribute to such analysis virtues that it does not possess. So it seems wise to point out the difficulties and dangers as well as the advantages inherent in the statistical methods on which we have placed so much reliance in our testing of therapeutic agents in recent years.

The validity of the results secured by statistical methods depends on an assumption. For the results to be trustworthy, the variation in the data subjected to such analysis should obey the law of probability, the law which defines the expected frequency of any result when dice are cast repeatedly. Needless to say, this law does not always accurately describe the variability found in medical data; for example, it would not apply to a situation in which the magnitude of the errors in one direction exceeds that in the other. Often, when we must draw conclusions from medical data, we do not know enough about the inherent errors to make a firm judgment whether they do or do not conform to the law of probability. We usually assume that they do and proceed to draw our conclusions on that basis. This risk has proved well worth while.

I pause to re-emphasize that statistics can never prove anything. The results of a statistical analysis demonstrate one thing only, namely, the likelihood that chance will explain the phenomena under study. If the odds are large that chance will explain what happens after giving a new drug, the drug is not worthy of further study. If the odds are large that the results cannot be so explained, the drug may, or may not, be worthy of further study. The important fact for doctors to remember is that the value of statistics is negative; it helps us to eliminate from our thoughts many things which doctors in the past thought worthy of consideration in order to concentrate our efforts in directions with greater promise of success.
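To make "the likelihood that chance will explain the phenomena" concrete, the following is a minimal sketch in Python. It is not a method named in the text: it assumes a one-sided Fisher exact test as one common way of computing such a likelihood for two groups, and the function name `chance_probability` and the patient counts are hypothetical, invented only for illustration.

```python
from math import comb


def chance_probability(deaths_treated, n_treated, deaths_control, n_control):
    """Probability, if the drug made no difference at all, of seeing this few
    (or fewer) deaths among the treated, given the total number of deaths.

    Assumption: a one-sided Fisher exact test (hypergeometric distribution),
    chosen here for illustration only.
    """
    total = n_treated + n_control
    total_deaths = deaths_treated + deaths_control
    ways_to_pick_treated = comb(total, n_treated)

    p = 0.0
    for k in range(deaths_treated + 1):
        # k of the deaths and (n_treated - k) of the survivors fall, by luck
        # alone, in the treated group. math.comb returns 0 for impossible splits.
        ways = comb(total_deaths, k) * comb(total - total_deaths, n_treated - k)
        p += ways / ways_to_pick_treated
    return p


# Hypothetical counts: 2 of 10 treated patients died, 8 of 10 untreated died.
p_striking = chance_probability(2, 10, 8, 10)

# Hypothetical counts: 5 of 10 died in each group.
p_no_difference = chance_probability(5, 10, 5, 10)

print(f"striking difference: chance explains it with probability {p_striking:.4f}")
print(f"no difference:       chance explains it with probability {p_no_difference:.4f}")
```

A small probability says only that luck is an unlikely explanation. It does not say that the drug, rather than some other difference between the two groups, produced the result.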

There is a danger that doctors, unfamiliar with mathematical thought and methods, will expect too much of statistics. A difference between the mortality from pneumonia in patients who received and those who did not receive a new drug may be statistically significant and therefore should not be attributed to chance; however, the cause of the difference may be something other than the drug. To use as illustration a matter keenly debated at this moment, the relation between the increasing frequency of lung cancer and the increasing consumption of tobacco has proved to be significant, and one has the right to wonder whether the one is a factor in the causation of the other. This might indeed be true and, as the statistics indicate, the matter is well worthy of further study. But the yearly increase in frequency of lung cancer is also significantly correlated with the yearly increase in the number of automobiles manufactured in Detroit, with the number of nylon stockings sold yearly, and, indeed, with everything else in this growing country which is increasing yearly. The demonstration that the relation between two variables cannot be explained by chance is not valid evidence that the one causes the other. Statistics never prove a causal relationship. The judgment that tobacco smoking may be a factor in lung cancer, and that the sale of stockings is not likely to be, is not based on the statistics, but is based on reasoning of another kind. I myself have stopped smoking, but I continue to buy stockings.
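The point that anything increasing yearly will correlate with anything else increasing yearly can be checked with a few lines of arithmetic. The sketch below is an illustration only: both series are invented numbers standing for two unrelated quantities that happen to grow over ten years, and the detrending step (removing a fitted straight line) is an added assumption, not a procedure described in the text.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Ordinary product-moment correlation coefficient."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def detrend(values):
    """Residuals left after removing a least-squares straight line in time."""
    n = len(values)
    times = list(range(n))
    mean_t, mean_v = sum(times) / n, sum(values) / n
    slope = (sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
             / sum((t - mean_t) ** 2 for t in times))
    return [v - (mean_v + slope * (t - mean_t)) for t, v in zip(times, values)]


# Hypothetical yearly figures: each series is a steady rise plus small
# year-to-year wiggles that have nothing to do with the other series.
years = range(10)
series_a = [100 + 10 * t + w for t, w in zip(years, [2, -1, 3, -2, 1, -3, 2, 0, -1, 1])]
series_b = [5 + 3 * t + w for t, w in zip(years, [-1, 1, 0, -1, 2, 0, -2, 1, 1, -1])]

r_raw = pearson_r(series_a, series_b)
r_detrended = pearson_r(detrend(series_a), detrend(series_b))

print(f"correlation of the raw yearly figures:      {r_raw:.2f}")
print(f"correlation once the steady rise is removed: {r_detrended:.2f}")
```

The raw figures correlate strongly simply because both rise with the calendar. Whether any causal link exists has to be decided, as the text says, by reasoning of another kind.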

Nevertheless, the value of statistics to modern medicine has been very great indeed, and no modern study of therapeutic agents can be considered complete without the criticism involved in such an analysis. This rigorous criticism has caused us to prune out a great body of data which many doctors had thought worthy of attention, to the great benefit of the tree of knowledge. And the rigorous mathematical thought, which has taught us to apply the statistical method to our problem, has led to many advances.

The Elimination of Bias

We doctors are desperately anxious for our patients to get well. And human nature being what it is, all of us would like our patients to get well because of something we had done for them. Therefore, we have no right to expect that the average doctor will face the results of his therapy without a certain bias. He will tend to attribute improvement to something he did for the patient; if the patient's condition deteriorates, he will tend to attribute this unhappy event not to something he has done, but to the increasing severity of the disease.

Our patients, desperately anxious to get well, are biased also, especially when a new remedy is suggested by a confident doctor.

It is to avoid such difficulties in our mental make-up that "blind" studies have been designed in recent years. In such studies, some patients are given the drug to be tested, and others are given the placebo with which it is to be compared, without knowing which they are taking. In "double blind" studies the observer also is not aware whether the patient is receiving the drug or the placebo. These are interesting and important types of study, and doctors have often found that they were much more subject to bias than they had supposed.

Very few theoretical objections can be made to "blind" tests, but I would like to emphasize one point. Long experience with the administration of active drugs and placebos to medical students who are ignorant of which is which has convinced me that most intelligent subjects can readily distinguish between the two by the sensations the drugs produce. Doubtless, many patients can distinguish between the two in the same way, and, by answering the observer's questions, convey the same knowledge to him. So the object of the "blind" experiment is probably not always as fully realized as some statisticians seem to think. Nevertheless, we now make strenuous efforts to guard against unconscious bias and this is a great gain.

The Paired Experiment

We could give a new drug to 10 people and a placebo to 10 other people, and, by comparing the average effect produced on each group, try to determine the effect of the new drug. If the drug's effect was striking, this rough experiment would suffice. But important drug effects might be missed if the subjects of the control group were not sufficiently similar to those who received the drug. Unforeseen factors, such as differences in age or in conditions of disease, might have caused so much variability themselves that the effect of the drug was masked.

Many of the difficulties inherent in such a test disappear if we set up "paired experiments" to test each of our subjects, not only with the drug but also with the placebo. This experiment could best be conducted by tossing a coin to see which test is to be made first. After testing all our subjects in this manner, we could demonstrate the action of the drug on each one by subtracting the effect found after the administration of the placebo from that which followed the administration of the drug. Thus, for each subject we obtain a difference which represents the action of the drug on him, and the average of such differences represents the action of the drug on the group. If the drug was without effect, this average difference would approach zero, although it is not likely to be exactly zero because of the unavoidable errors and the inherent variability present.
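The bookkeeping of such a paired experiment can be sketched as follows. This is a minimal illustration in Python, not a prescribed procedure: the function names, the subject labels, and the measured effects are all hypothetical, and the coin is simulated with a seeded random number generator so that the assignment can be reproduced.

```python
import random
from statistics import mean


def toss_for_order(subject_ids, seed=0):
    """Toss a (simulated) coin for each subject to decide which test comes first."""
    rng = random.Random(seed)  # the seed is an assumption, for reproducibility only
    order = {}
    for subject in subject_ids:
        heads = rng.random() < 0.5
        order[subject] = ("drug", "placebo") if heads else ("placebo", "drug")
    return order


def paired_differences(effect_after_drug, effect_after_placebo):
    """For each subject: effect after the drug minus effect after the placebo."""
    return [d - p for d, p in zip(effect_after_drug, effect_after_placebo)]


# Hypothetical subjects and hypothetical measurements in arbitrary units.
subjects = ["A", "B", "C", "D", "E"]
after_drug = [12, 15, 9, 20, 14]
after_placebo = [10, 11, 8, 15, 12]

test_order = toss_for_order(subjects)
differences = paired_differences(after_drug, after_placebo)
average_difference = mean(differences)

for subject, diff in zip(subjects, differences):
    first, second = test_order[subject]
    print(f"subject {subject}: {first} first, then {second}; difference {diff}")
print("average difference for the group:", average_difference)
```

If the drug were without effect, the average difference would be expected to lie near zero, though not exactly at zero.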

Obviously in such a "paired experiment" a host of causes of variation, such as differences due to age, sex, or race of the subjects, are filtered out by the experimental design. The size of the variability not due to the drug, assuming it obeys the law of probability, may be estimated by mathematical means and the effect of the drug is revealed. Unhappily, in many types of studies comparison of the effects of the drug and placebo on the same subject is impossible. In such cases, if one makes pairs by matching two subjects as closely as possible, much of the advantage of this experimental design will be retained. Unfortunately, however, no two patients are exactly the same, and the question of how much latitude should be allowed in arranging pairs is likely to be hard to answer.
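How much the pairing filters out can be seen by treating the same figures in two ways: once as two separate groups, and once as pairs. The sketch below is an illustration under stated assumptions. The measurements are invented, with subjects who differ greatly from one another, and the standard error of the mean is used as the yardstick of variability not due to the drug; the text itself does not name a particular formula.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical measurements on five very dissimilar subjects (arbitrary units).
after_placebo = [10, 30, 50, 70, 90]
after_drug = [12, 34, 51, 75, 92]


def unpaired_standard_error(group_a, group_b):
    """Uncertainty of the difference between two group averages when the
    figures are treated as coming from two separate sets of people."""
    return sqrt(stdev(group_a) ** 2 / len(group_a) + stdev(group_b) ** 2 / len(group_b))


def paired_standard_error(group_a, group_b):
    """Uncertainty of the average within-subject difference."""
    differences = [a - b for a, b in zip(group_a, group_b)]
    return stdev(differences) / sqrt(len(differences))


effect = mean(after_drug) - mean(after_placebo)
se_unpaired = unpaired_standard_error(after_drug, after_placebo)
se_paired = paired_standard_error(after_drug, after_placebo)

print(f"estimated drug effect:               {effect:.2f}")
print(f"uncertainty, two separate groups:    {se_unpaired:.2f}")
print(f"uncertainty, each subject as a pair: {se_paired:.2f}")
```

The estimated effect is the same either way. What changes is the background variability against which it must be judged: the large differences between subjects swamp the effect in the group comparison but cancel out within each pair.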

More Elaborate Experimental Designs

The hard mathematical thinking which has been the chief contribution of the statisticians toward solving our medical problems has also been used to design experiments of a far more elaborate type, much too elaborate to describe in detail here. In these, the action of a new drug under study may be compared with that of the placebo, with that of a well-known drug, or perhaps with both. These comparisons are made under circumstances as similar as possible, and in such a manner that bias is eliminated as far as possible. Large numbers of patients are often necessary in order to find subjects similar enough to make pairs and also to obtain enough pairs. Often enough patients cannot be secured in any one place, and a number of clinics must pool their material. To carry out such work, trained teams are necessary. These large experiments, usually designed and presided over by a statistician, have become important in the study of new drugs in recent years.

Such designed experiments provide far more exact information than could possibly be obtained by the older clinical methods of observation and impression. For example, these methods have given us far more exact estimates of subjective drug effects, such as the relief of pain, than any we have had before.

The more exact data obtained in this way have already begun to throw doubt on some widely held clinical conceptions. For example, the effect of the drugs commonly used to suppress the frequency and violence of coughing (if demonstrable at all), has proved to be of a much smaller magnitude than the textbooks would have one believe. Such a finding leads to the reflection that, as medicine is usually practiced, the doctor hands the patient a prescription or gives an order, but he seldom waits to observe the effect of the drug he has prescribed. Too often the necessary circumstances of practice cause the doctor to rely on what the patient tells him about the efficiency of the drugs given, and the patient, anxious to get well, is a biased observer. Most cough medicines are sedatives; doubtless many patients have told their doctors that their coughs were better, not because they coughed less frequently, but because, under the influence of the sedation, they were less annoyed by the coughing.

The increased knowledge of therapeutic activity, or its lack, that such designed experiments can contribute to clinical medicine seems without limit. Unfortunately, the practical difficulties are formidable; such studies are very expensive, very difficult to organize and carry out, and very time-consuming. A major difficulty is concerned with securing proper patients, for these experiments require a number of very similar patients whose disease is relatively stable and who can remain under close observation for the duration of the trials. Since such favorable conditions are found chiefly in hospitals for chronic diseases, most of these studies have been made on patients with tuberculosis or cancer. But certain studies of this type can also be made in general hospitals. For example, there have been excellent studies on the effect of certain analgesics on postoperative pain. But patients found in general hospital wards, in outpatient departments, or in office practice are likely to be so dissimilar from one another that satisfactory pairs are difficult or impossible to obtain.

A second practical difficulty lies in the personnel who are making the observations. To my way of thinking, the task of making observations in a large "blind" experiment, without any knowledge of what is going on, would be unbelievably dull. I find it hard to imagine that any doctor with a keen mind would be attracted to a job so unrewarding and of such little educational value. Doubtless, it is for reasons such as this that the personnel making the clinical observations in these large experiments do not usually consist of doctors at all, but of nurses and technicians hired especially for this purpose. But while some observations, like temperature and blood pressure, can be made objectively, others are subjective, and the proper evaluation of a patient's statements may be a most difficult matter, even for those with years of experience. And if, in a "blind" experiment, some of the observers "went wrong," it would be difficult or impossible to detect it. In addition, if the basic observations are not well made, no brilliance of mathematical analysis will suffice to draw the correct conclusion from the results.

Therefore, I am concerned about the distance between the top and the bottom in such experiments. At the top we are likely to have a statistician, often untrained or without experience in clinical medicine, who may even live in a different city from that in which the observations are being made. At the bottom we often find personnel without medical training. The doctors are likely to occupy a position between these two, but they neither make the observations nor write them up. One wonders whether there may not be distortion of the clinical data on the way up and whether the top man, the man who writes the paper, is sufficiently aware of the deficiencies of the personnel collecting the data, and so of the data they send him.

Another practical disadvantage in the elaborately planned experiment is the time required to complete it. One expert in this field has recently advised that, before even starting a multi-clinic experiment, the group leaders meet once a month for a year to make plans. Thus, in the study of any new drug made by this method, the observations could hardly be completed in less than one year, and it would probably take a third year to complete the
