
errors made in processing the data, errors made in estimating values for missing data, and failure to represent all units with the sample (undercoverage).

Undercoverage in the CPS results from missed housing units and missed persons within sample households. Overall undercoverage, as compared to the level of the 1980 decennial census, is about 7 percent. It is known that CPS undercoverage varies with age, sex, and race. Generally, undercoverage is larger for males than for females and larger for Blacks and other races than for Whites. Ratio estimation to independent age-sex-race population controls, as described previously, partially corrects for the bias due to survey undercoverage. However, biases exist in the estimates to the extent that missed persons in missed households or missed persons in interviewed households have different characteristics than interviewed persons in the same age-sex-race group. Further, the independent population controls used have not been adjusted for undercoverage in the decennial census.

Comparability with other data. In using metropolitan and nonmetropolitan data, caution should be used in comparing estimates for 1977 and 1978 to each other or to any other years. Methodological and sample design changes occurred in these years resulting in relatively large differences in the metropolitan and nonmetropolitan area estimates. However, estimates for 1979 and later are comparable as are estimates for 1976 and earlier.

Caution should also be used when comparing estimates for 1980 and later, which reflect 1980 census-based population controls, to those for 1971 through 1979, which reflect 1970 census-based population controls. This change in population controls had relatively little impact on summary measures such as means, medians, and percent distributions, but did have a significant impact on levels. For example, use of the 1980-based population controls resulted in about a 2-percent increase in the civilian noninstitutional population and in the number of families and households. Thus, estimates of levels for 1980 and later will differ from those for earlier years more than what could be attributed to actual changes in the population and these differences could be disproportionately greater for certain subpopulation groups than for the total population.

Sampling variability. The standard errors given in the following tables are primarily measures of sampling variability, that is, of the variations that occurred by chance because a sample rather than the entire population was surveyed. The sample estimate and its standard error enable one to construct confidence intervals, ranges that would include the average result of all possible samples with a known probability. For example, if all possible samples were selected, each of these being surveyed under essentially the same general conditions and using the same sample design, and if an estimate and its standard error were calculated from each sample, then:

1. Approximately 68 percent of the intervals from one standard error below the estimate to one standard error above the estimate would include the average result of all possible samples.

2. Approximately 90 percent of the intervals from 1.6 standard errors below the estimate to 1.6 standard errors above the estimate would include the average result of all possible samples.

3. Approximately 95 percent of the intervals from two standard errors below the estimate to two standard errors above the estimate would include the average result of all possible samples.
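The three interval rules above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `confidence_interval` and the `MULTIPLIERS` table are hypothetical names introduced here, while the multipliers (1, 1.6, and 2) and the sample figures (an estimate of 22,508,000 with a standard error of 162,000) are the ones quoted in this report.

```python
def confidence_interval(estimate, standard_error, multiplier):
    """Return (lower, upper) as estimate -/+ multiplier * standard_error."""
    half_width = multiplier * standard_error
    return estimate - half_width, estimate + half_width


# Multipliers quoted in the text: 1 (about 68 percent),
# 1.6 (about 90 percent), and 2 (about 95 percent).
MULTIPLIERS = {68: 1.0, 90: 1.6, 95: 2.0}

# Figures taken from the report's illustration for nonfamily households.
estimate, se = 22_508_000, 162_000

for level, k in MULTIPLIERS.items():
    low, high = confidence_interval(estimate, se, k)
    print(f"{level}-percent interval: {low:,.0f} to {high:,.0f}")
```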

The average estimate derived from all possible samples is or is not contained in any particular computed interval. However, for a particular sample, one can say with a specified confidence that the average estimate derived from all possible samples is included in the confidence interval.

Standard errors may also be used to perform hypothesis testing, a procedure for distinguishing between population parameters using sample estimates. The most common types of hypotheses are (1) the population parameters are identical, and (2) they are different. An example of this would be comparing the average size of households maintained by male householders with that of households maintained by female householders. Tests may be performed at various levels of significance, where a level of significance is the probability of concluding that the parameters are different when, in fact, they are identical.

All statements of comparison in the text have passed a hypothesis test at the 0.10 level of significance or better, and most have passed a hypothesis test at the 0.05 level of significance or better. This means that, for most differences cited in the text, the estimated difference between parameters is greater than twice the standard error of the difference. For the other differences mentioned, the estimated difference between parameters is between 1.6 and 2.0 times the standard error of the difference. When this is the case, the statement of comparison will be qualified in some way; e.g., by use of the phrase "some evidence."
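The decision rule described in the preceding paragraph can be sketched as follows. The function name `compare` and the returned labels are hypothetical; only the thresholds (more than 2.0 standard errors for the 0.05 level, 1.6 to 2.0 for the 0.10 level) come from the text, and the treatment of a ratio of exactly 2.0 as falling in the lower band is an assumption.

```python
def compare(difference, se_difference):
    """Classify an estimated difference by its size relative to its standard error."""
    ratio = abs(difference) / se_difference
    if ratio > 2.0:
        # More than twice the standard error of the difference.
        return "significant at 0.05"
    if ratio >= 1.6:
        # Between 1.6 and 2.0 standard errors: the text qualifies such
        # statements, e.g. with the phrase "some evidence".
        return "significant at 0.10 (qualify the statement)"
    return "not significant at 0.10"


# 11.1 and 0.3 are the difference and standard error from the report's
# illustration of married-couple households; the other pair is made up.
print(compare(11.1, 0.3))
print(compare(0.5, 0.3))
```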

Note when using small estimates. Summary measures (such as averages and percent distribution) are shown when the base is 75,000 or greater. Because of the large standard errors involved, there is little chance that summary measures would reveal useful information when computed on a smaller base. Estimated numbers are shown, however, even though the relative standard errors of these numbers are larger than those for the corresponding percentages. These smaller estimates are provided primarily to permit such combinations of the categories as serve each user's need.

[graphic]

give an indication of the order of magnitude of the standard error of an estimate rather than the precise standard error. The figures in tables B-1 and B-2 provide approximations to standard errors of estimated numbers and estimated percentages. Estimated standard errors for specific characteristics cannot be obtained from tables B-1 and B-2 without the use of factors in table B-3. These factors must be applied to the generalized standard errors in order to adjust for the combined effect of sample design and estimating procedure on the value of the characteristic. Standard errors for intermediate values not shown in the generalized tables of standard errors may be approximated by linear interpolation. The factors for all household members should be used for characteristics pertaining to all persons in a household. For characteristics which include only some household members, such as "children under 18 years of age," the factor for some household members should be used.

Two parameters (denoted "a" and "b") are used to calculate standard errors for each type of characteristic; they are presented in table B-4. These parameters were used to calculate the standard errors in tables B-1 and B-2, and to calculate the factors in table B-3. They also may be used to directly calculate the standard errors for estimated numbers and

[blocks in formation]
[blocks in formation]

Note: For a particular characteristic see table B-3 for the appropriate factor to apply to the above standard errors.

Table B-2. Standard Errors of Estimated Percentages

[graphic]

from cross-tabulations involving different characteristics, use the factor or set of parameters for the characteristic which will give the largest standard error.

Illustration of the computation of the standard error of an estimated number. Table A of this report shows that in 1982 there were 22,508,000 nonfamily households. Using formula (2) with a = -0.000010 and b = 1389 from table B-4, the approximate standard error is

$$\sqrt{(-0.000010)(22{,}508{,}000)^2 + (1389)(22{,}508{,}000)} \approx 162{,}000$$

The 68-percent confidence interval for the number of nonfamily households is from 22,346,000 to 22,670,000. The 95-percent confidence interval is from 22,184,000 to 22,832,000 (using twice the standard error). Therefore, a conclusion that the average estimate derived from all possible samples lies within a range computed in this way would be correct for roughly 95 percent of all possible samples.
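The arithmetic of this illustration can be checked with a short Python sketch. Formula (2) itself is not reproduced on this page, so the form used here (the square root of a times x squared plus b times x) is inferred from the worked numbers and should be read as an assumption; the function name `se_estimated_number` is hypothetical.

```python
import math


def se_estimated_number(x, a, b):
    """Approximate standard error of an estimated number x.

    Assumed form, inferred from the illustration: sqrt(a * x**2 + b * x).
    """
    return math.sqrt(a * x**2 + b * x)


x = 22_508_000  # nonfamily households in 1982 (table A)
se = se_estimated_number(x, a=-0.000010, b=1389)
se_rounded = round(se, -3)  # the report rounds to the nearest thousand

print(f"standard error: about {se_rounded:,.0f}")
print(f"68-percent interval: {x - se_rounded:,.0f} to {x + se_rounded:,.0f}")
print(f"95-percent interval: {x - 2 * se_rounded:,.0f} to {x + 2 * se_rounded:,.0f}")
```

Rounding the standard error before building the intervals mirrors the report, which quotes 162,000 and then adds and subtracts one or two multiples of that rounded figure.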

Standard errors of estimated percentages. The reliability of an estimated percentage, computed using sample data for both numerator and denominator, depends upon both the size of the percentage and the size of the total upon which the percentage is based. Estimated percentages are relatively more reliable than the corresponding estimates of the numerators of the percentages, particularly if the percentages are 50 percent or more. When the numerator and denominator of the percentage are in different categories, use the factor or parameters from table B-3 or B-4 indicated by the numerator. The approximate standard error, σ(x,p), of an estimated percentage can be obtained by use of the formula
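The formula for the standard error of a percentage did not survive on this page, so the sketch below rests on an assumption: that it takes the usual generalized form, the square root of (b / x) times p times (100 - p), where x is the base of the percentage and p the percentage itself. The function name `se_estimated_percentage` is hypothetical, and b = 1389 is borrowed from the nonfamily-household illustration on the assumption that it also applies to percentages of households. Under those assumptions the sketch reproduces the 0.2 percent that the report quotes for the 59.4 percent of 83,527,000 households maintained by married couples in 1982.

```python
import math


def se_estimated_percentage(p, base, b):
    """Approximate standard error of a percentage p (0-100) on a given base.

    Assumed form: sqrt((b / base) * p * (100 - p)).
    """
    return math.sqrt((b / base) * p * (100.0 - p))


# 59.4 percent of 83,527,000 households (table A, 1982).
# b = 1389 is an assumed parameter, taken from the earlier illustration.
se_1982 = se_estimated_percentage(59.4, 83_527_000, b=1389)
print(f"standard error: about {se_1982:.1f} percent")
```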

[blocks in formation]

Table B-3. Factors to be Applied to Generalized Standard Errors in Tables B-1 and B-2

[blocks in formation]

'Apply this factor to table B-2 to obtain standard errors of estimated percentages; for standard

where σx and σy are the standard errors of the estimates x and y; the estimates can be of numbers, percents, ratios, etc.

This will represent the actual standard errors quite accurately for the difference between two estimates of the same characteristic in two different areas, or for the difference between separate and uncorrelated characteristics in the same area. If, however, there is a high positive (negative) correlation between the two characteristics, the formula will overestimate (underestimate) the true standard error.

Table B-4. "a" and "b" Parameters for Estimated Numbers and Percentages of Persons, Families and Unrelated Individuals, Households, or Householders

Use this parameter to calculate standard errors of estimated percentages only.

Illustration of the computation of the standard error of a difference. As stated earlier, table A shows that in 1982, 59.4 percent of all households (83,527,000) were maintained by married couples. Table A also shows that in 1970, 70.5 percent of all households (63,401,000) were maintained by married couples. Thus, the apparent difference between the percentage of households maintained by married couples in 1982 and 1970 is 11.1 percent. The standard error (σx) of 59.4 percent is 0.2 percent, as shown above. Using formula (4) and the appropriate parameter, the standard error (σy) of 70.5 percent is also 0.2 percent. Therefore, using formula (5), the standard error of the estimated difference of 11.1 percent is about

$$\sqrt{(0.2)^2 + (0.2)^2} \approx 0.3 \text{ percent}$$

This means that the 68-percent confidence interval on the difference between the percentages of households maintained by married couples in 1982 and 1970 is from 10.8 to 11.4 percent. The 95-percent confidence interval on the difference of 11.1 percent is from 10.5 to 11.7 percent. Therefore, a conclusion that the average estimate derived from all possible samples lies within a range computed in this way would be correct for roughly 95 percent of all possible samples. Since this interval does not contain zero, we can conclude with 95-percent confidence that the percent of households maintained by married couples in 1982 decreased from 1970.
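The same computation can be sketched in Python. The function name `se_difference` is hypothetical; the formula (the square root of the sum of the two squared standard errors) is the one the illustration's arithmetic follows, and the inputs are the report's own figures.

```python
import math


def se_difference(se_x, se_y):
    """Standard error of a difference between two uncorrelated estimates."""
    return math.sqrt(se_x**2 + se_y**2)


difference = 70.5 - 59.4  # 1970 percentage minus 1982 percentage
se = round(se_difference(0.2, 0.2), 1)  # the report rounds to 0.3

low, high = difference - 2 * se, difference + 2 * se
print(f"difference: {difference:.1f}, standard error: {se:.1f}")
print(f"95-percent interval: {low:.1f} to {high:.1f}")

# The interval excludes zero, which is the basis for the conclusion
# that the percentage was lower in 1982 than in 1970.
print("interval excludes zero:", low > 0)
```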


The standard error of the estimated number of families or households, σy, and the standard error of the estimated number of persons with the characteristics in those families or households, σx, may be calculated by the methods described above. In formula (6), ρ represents the correlation coefficient between the numerator and the denominator of the estimate. In the above example, and for other ratios of this kind, use 0.7 as an estimate of ρ.

Case 2: The number of persons having the characteristic in a given family or household may be 0, 1, 2, 3, or more: for example, the mean number of persons under 18 years of age per household. For ratios of this kind the standard error is approximated by formula (6), but ρ is assumed to be zero. If ρ is actually positive (negative), then this procedure will provide an overestimate (underestimate) of the standard error of the ratio.
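Formula (6) is not legible on this page, so the following Python sketch is based on an assumption: that it has the usual form for the standard error of a ratio x/y, namely (x/y) times the square root of (σx/x)² + (σy/y)² − 2ρ·σx·σy/(x·y). The function name `se_ratio` and all input numbers are hypothetical and chosen only for illustration. The sketch compares a correlation of 0.7 with a correlation of zero, as the two cases in the text suggest.

```python
import math


def se_ratio(x, y, se_x, se_y, rho):
    """Approximate standard error of the ratio x / y.

    Assumed form:
    (x / y) * sqrt((se_x / x)**2 + (se_y / y)**2 - 2 * rho * se_x * se_y / (x * y))
    """
    rel_x = se_x / x
    rel_y = se_y / y
    return (x / y) * math.sqrt(rel_x**2 + rel_y**2 - 2.0 * rho * rel_x * rel_y)


# Hypothetical inputs: persons with a characteristic (x) and households (y).
x, y = 1_200_000, 6_000_000
se_x, se_y = 40_000, 90_000

with_correlation = se_ratio(x, y, se_x, se_y, rho=0.7)
without_correlation = se_ratio(x, y, se_x, se_y, rho=0.0)

print(f"rho = 0.7: {with_correlation:.5f}")
print(f"rho = 0.0: {without_correlation:.5f}")
```

Under this assumed form, setting ρ to zero when the true correlation is positive gives a larger standard error, which is consistent with the statement that the procedure then overestimates the standard error of the ratio.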

Standard error of a median. The sampling variability of an estimated median depends upon the form of the distribution as well as the size of its base. An approximate method for measuring the reliability of an estimated median is to determine a confidence interval about it. (See the section on sampling variability for a general discussion of confidence intervals.) The following procedure may be used to estimate the 68-percent confidence limits of a median based on sample data.

1. Determine, using the standard error tables and factors or formula (4), the standard error of the estimate of 50 percent from the distribution.
