scientific statistics, and a thorough training is essential for their proper use. But in the first place there should be a clear understanding of what is necessary to be taught. We read many chapters on the theory and practice of statistics. What is the theory of statistics? The use of the word theory, in connection with statistical science, is to my mind unfortunate, for the word theory, when used in connection with positive information, antagonizes the public mind. When you speak of the theory of statistics, the word theory meaning speculation, the popular feeling is that theoretical statistics are not wanted, but facts. Theory may be fact; statistics may substantiate theory or controvert it. All this we know, and yet I feel that the word is used unfortunately in this connection. If I understand it correctly, the theory of statistics is simply a statement of what it is desired to accomplish by statistics.
Every branch of social science serves to explain the facts of human life. There are some facts which can be explained only by statistics. For instance, it is asserted that there is an alarming amount of illiteracy in Massachusetts. Statistical inquiry shows that by far the greater number of these illiterates are of foreign birth, so that the fault is not with the public school system, but the evil is due to a temporary cause, namely, immigration.
Again, it has been freely asserted that in the United States women of native birth do not have as many children as women of foreign birth. The Census of Massachusetts will show that although American women do have a less number of children, on the average, yet a larger number survive. Common observation would never have shown these things, or would not have shown them accurately.
So everywhere statistics attempt to explain the facts of human life, which can be explained in no other way, as for instance, the effect of scarcity of food on births, on marriages, or crime; the effect of marriage laws on the frequency of divorce, etc. The theory of statistics points out where the statistical method is applicable, and what it can and cannot accomplish. In my opinion, however, it would be better to avoid the use of the word theory entirely, and adopt a concrete term like statistical science, which has three branches: collection, presentation, and analysis. Statistics is a science in its nature, and practical in its working.
The science of statistics, practically considered, comprehends the gathering of original data in the most complete and accurate manner; the tabulation of the information gathered by the most approved methods, and the presentation of the results in compact and easily understood tables, with the necessary text explanations. It is the application of statistics which gives them their chief popular value, and this application may, therefore, legitimately be called a part of the science of statistics. The theoretical statistician is satisfied if his truth is the result of statistical investigation, or if his theory is sustained. The practical statistician is satisfied only when the absolute truth is shown, or, if this is impossible, when the nearest approximation to it is reached. But the belief that theory must be sustained by the statistics collected, or else the statistics be condemned, is an idea which gets into the popular mind when the expression, theory of statistics, is used. I would, therefore, avoid it, and I hope that should our colleges adopt courses in statistical science, they will agree upon a nomenclature which shall be expressive, easily understood, and comprehensive in its nature.
The necessity of the study of statistical science would not be so thoroughly apparent if the science was confined to the simple enumeration and presentation of things, or primitive facts, like the number of the people; to tables showing crops, exports, imports, immigration, quantities, values, valuation, and such elementary statements, involving only the skill of the arithmetician to present and deal with them. The moment the combinations essential for comparison are made, there is needed something beyond the arithmetician, for with the production of averages, percentages, and ratios, for securing correct results, there must come in play mathematical genius, and a genius in the exercise of which there should be discernible no influence from preconceived ideas. The science of statistics has been handled too often without statistical science, and without the skill of the mathematician. Many illustrations of this point involving the statistics of this country could be given.
In collating statistics relating to the cost of production, the best mathematical skill is essential, even the skill which would employ algebraic formulæ. So with relation to statistics of capital invested in production. To illustrate, the question may be asked, what elements of capital are involved in the census question of "capital invested?" Is it simply the cash capital invested by the concern under consideration, or is it all the money which is used to produce a given quantity of goods? If the members of a firm contribute the sum of $10,000, and they have a line of discounts of $100,000, the avails of which are used in producing $200,000 worth of completed goods, what is the capital invested? What is the capital invested which should be returned in the census? If a man has $5,000 invested in his business as a manufacturer, and he buys his goods on 90 days, or four months, and sells for cash, or 30 days, what is his capital invested? This question is one among many of the practical problems that arise in a statistical bureau, but which has not yet been treated scientifically. What has been the result of the reported statistics relating to capital invested? Simply that calculations, deductions, and arguments based on such statistics have been, and are, vicious, and will be until all the elements involved in the term are scientifically classified. Another illustration in point arises in connection with the presentation of divorce statistics, especially when it is desired to compare such statistics with marriages, or to make comparisons to show the progress, or the movement of divorces. Shall the number of divorces be compared with the number of marriages celebrated in the year in which the divorces are granted, or with the population, or with the number of married couples living at the time? I need not multiply illustrations. The lies of statistics are unscientific lies.
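The two illustrations above can be put in figures. The following is only a sketch, not a settled classification: the function names are hypothetical, the firm's figures ($10,000 contributed, $100,000 in discounts, $200,000 of goods) are those given in the text, and the divorce figures are invented solely to show how the choice of base alters the ratio.

```python
# Illustrative sketch only. Function names are hypothetical, and the divorce
# figures below are invented for demonstration; they are not census data.

def capital_invested(cash_contributed, borrowed_funds):
    """Return two candidate answers to the census question 'capital invested'."""
    return {
        "cash_only": cash_contributed,
        "cash_plus_credit": cash_contributed + borrowed_funds,
    }

def divorce_ratios(divorces, marriages_in_year, population, married_couples):
    """Return the same count of divorces measured against three different bases."""
    return {
        "per_100_marriages_in_year": 100 * divorces / marriages_in_year,
        "per_1000_population": 1000 * divorces / population,
        "per_1000_married_couples": 1000 * divorces / married_couples,
    }

# The firm described in the text.
firm = capital_invested(cash_contributed=10_000, borrowed_funds=100_000)
product_value = 200_000
product_per_dollar = {
    name: round(product_value / capital, 2) for name, capital in firm.items()
}

# Assumed (invented) figures for a single year.
ratios = divorce_ratios(
    divorces=500,
    marriages_in_year=20_000,
    population=2_000_000,
    married_couples=400_000,
)

for name, value in product_per_dollar.items():
    print(f"product per dollar of capital ({name}): {value}")
for name, value in ratios.items():
    print(f"divorces {name}: {value}")
```

Under one definition the firm shows $20 of product for every dollar of capital, and under the other less than $2, though nothing in the business has changed. The divorce ratio likewise differs by a factor of ten according to the base chosen, which is why the base must be fixed and stated before any comparison is drawn.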
The conditions of this country necessitate knowledge as to the parent nativity of the population, a feature which is not included in any foreign census, and need not be. Such features lead to what may be called correlated statistics; for instance, where there are presented three or more facts relating to each person in the population, the facts being coördinate in their nature. In this class of work skill beyond that which belongs to the simple operations in arithmetic becomes necessary. There must be employed some knowledge of statistical science beyond elementary statistical tables, or the correlations will be faulty, all the conclusions drawn from them false, and harm done to the public. While the scientific statistician does not care to reach conclusions from insufficient data, he much less desires to be misled by the unscientific use of correct data, or by data the presentation of which has been burdened with disturbing causes. The analytical work of statistical science demands the mathematical man. While this is true, it is also true that the man who casts a schedule (for instance, to comprehend the various economic facts associated with production), should have the ability to analyze the tabulated results of the answers to the inquiries borne upon the schedule. In other words, the man who casts the schedule should not only be able to foresee the work of the enumerator, or the gatherer of the answers desired, but he should foresee the actual form in which the completed facts should be presented. Furthermore, he should foresee the analysis which such facts stimulate and not only foresee the detail, but foresee in a comprehensive way the whole superstructure which grows from the foundation laid in the schedule. He should comprehend his completed report before he gathers the needed information.
How can these elements in one's statistical education be secured? The difficulties in the way of the best statistical work are not slight. Dr. Dewey, in a recent address upon average prices, before the American Statistical Association, gave an exceedingly valuable, and a very clear explanation of the difficulties which underlie all efforts to secure average prices ranging over a period of years; he pointed out the