
Appendix C. Source and Reliability of Estimates

SOURCE OF DATA

The estimates in this report are based on data obtained from the Bureau of the Census collected in the Current Population Survey (CPS). The sources of data in each text table and for each figure can be found at the bottom of that table or figure. Brief descriptions of the sources of data and the procedures by which data were obtained are presented below.

Current Population Survey (CPS). The CPS estimates in this report are based on data obtained in the June surveys of 1976 and of 1980 to 1984. The monthly CPS deals mainly with labor force data for the civilian noninstitutional population. Questions relating to labor force participation are asked about each member 14 years old and over in each sample household. In addition, supplementary questions are asked each June about the fertility of American women.

The present CPS sample was initially selected from the 1970 census files and is continuously updated to reflect new construction. The current CPS sample is located in 629 areas comprising 1,148 counties, independent cities, and minor civil divisions in the Nation. In June 1984, approximately 60,500 occupied households were eligible for interview. Of this number, about 2,500 occupied units were visited but interviews were not obtained because the occupants were not found at home after repeated calls or were unavailable for some other reason.

The redesigned CPS sample was selected from the 1980 census files. Phase-in of the new sample began in April 1984. Continue to use the parameters given in table C-6 at this time. The following table provides a description of some aspects of the CPS sample designs in use during the referenced data collection periods.

[Table describing the CPS sample designs not reproduced]

The estimation procedure used for the monthly CPS data involved the inflation of the weighted sample results to independent estimates of the total civilian noninstitutional population of the United States by age, race, and sex. These independent estimates were based on statistics from decennial censuses; statistics on births, deaths, immigration, and emigration; and statistics on the strength of the Armed Forces. The independent population estimates used in this report to obtain data for June 1980 and later are based on the 1980 decennial census. Data for 1976 were obtained using independent population estimates based on the 1970 decennial census.
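The inflation of weighted sample results to independent population estimates can be pictured with a small sketch. It is a simplified, single-step ratio adjustment, not the full CPS estimation procedure; the function name, the cell labels, and all numbers are made up for illustration.

```python
from collections import defaultdict


def ratio_adjust(records, controls):
    """Scale weights so weighted totals in each cell match the controls.

    records:  list of (cell, base_weight) pairs; a cell is a hypothetical
              (age_group, race, sex) tuple.
    controls: dict mapping each cell to its independent population estimate.
    """
    # Sum the base weights within each age-race-sex cell.
    weighted_totals = defaultdict(float)
    for cell, weight in records:
        weighted_totals[cell] += weight

    # Multiply every weight by (control total / weighted sample total).
    adjusted = []
    for cell, weight in records:
        factor = controls[cell] / weighted_totals[cell]
        adjusted.append((cell, weight * factor))
    return adjusted


# Made-up illustration values, not CPS figures.
records = [
    (("18-24", "White", "F"), 1500.0),
    (("18-24", "White", "F"), 1700.0),
    (("18-24", "Black", "F"), 1600.0),
]
controls = {
    ("18-24", "White", "F"): 4000.0,
    ("18-24", "Black", "F"): 2000.0,
}
adjusted = ratio_adjust(records, controls)
```

After the adjustment, the weights in each cell sum to that cell's control total, which is the sense in which the sample results are inflated to the independent estimates.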

RELIABILITY OF ESTIMATES

Since the CPS estimates in this report are based on samples, they may differ somewhat from the figures that would have been obtained if a complete census had been taken using the same questionnaires, instructions, and enumerators. There are two types of errors possible in an estimate based on a sample survey: sampling and nonsampling. The standard errors provided for this report primarily indicate the magnitude of the sampling error. They also partially measure the effect of some nonsampling errors in response and enumeration, but do not measure any systematic biases in the data. The full extent of the nonsampling error is unknown. Consequently, particular care should be exercised in the interpretation of figures based on a relatively small number of cases or on small differences between estimates.

Nonsampling variability. Nonsampling errors can be attributed to many sources, e.g., inability to obtain information about all cases in the sample, definitional difficulties, differences in the interpretation of questions, inability or unwillingness on the part of respondents to provide correct information, inability to recall information, errors made in collection such as in recording or coding the data, errors made in processing the data, errors made in estimating values for missing data, and failure to represent all units with the sample (undercoverage).

Undercoverage in the CPS results from missed housing units and missed persons within sample households. Overall undercoverage, as compared to the level of the 1980 decennial census, is about 7 percent. It is known that CPS undercoverage varies with age, sex, and race. Generally, undercoverage is larger for males than for females and larger for Blacks and other races combined than for Whites. Ratio estimation to independent age-sex-race population controls partially corrects for the bias due to survey undercoverage. However, biases exist in the estimates to the extent that missed persons in missed households or missed persons in interviewed households have different characteristics than interviewed persons in the same age-sex-race group. Further, the independent population controls used have not been adjusted for undercoverage in the decennial census.

In addition to the basic CPS noninterview and the above-mentioned sources of undercoverage in the CPS, several sources of response error with respect to fertility of American women have been identified.

Regarding the question on children ever born, 2.3 percent of the ever-married women and 2.7 percent of the single women were counted as "not reporting" (Appendix A). These percentages include women who were not contacted by the interviewer or who refused to answer the questions.

In these instances, the number of children ever born was imputed, based on a match of these women with other women of similar characteristics who did report children ever born. Tables presenting rates of children ever born are based on all women in the sample, including women with an imputed value for number of children ever born.

The June 1984 CPS included single women 18 to 44 years old among those asked about previous childbearing. Consequently, there is the likelihood of some deliberate misreporting of the facts, especially among women who perceive out-of-wedlock childbearing as bearing a social, moral, or legal stigma. It is also quite possible that the level of misreporting may differ systematically according to various demographic and social characteristics.

For additional information on nonsampling error, including the possible impact on CPS data, refer to Statistical Policy Working Paper 3, An Error Profile: Employment as Measured by the Current Population Survey, Office of Federal Statistical Policy and Standards, U.S. Department of Commerce, 1978 and Technical Paper 40, The Current Population Survey: Design and Methodology, Bureau of the Census, U.S. Department of Commerce.

Sampling variability. The standard errors given in the following tables are primarily measures of sampling variability, that is, of the variation that occurred by chance because a sample rather than the entire population was surveyed. The sample estimate and its standard error enable one to construct confidence intervals, ranges that would include the average result of all possible samples with a known probability. For example, if all possible samples were selected, each of these being surveyed under essentially the same general conditions and using the same sample design, and if an estimate and its standard error were calculated from each sample, then:

1. Approximately 68 percent of the intervals from one standard error below the estimate to one standard error above the estimate would include the average result of all possible samples.

2. Approximately 90 percent of the intervals from 1.6 standard errors below the estimate to 1.6 standard errors above the estimate would include the average result of all possible samples.

3. Approximately 95 percent of the intervals from two standard errors below the estimate to two standard errors above the estimate would include the average result of all possible samples.

The average estimate derived from all possible samples is or is not contained in any particular computed interval. However, for a particular sample, one can say with a specified confidence that the average estimate derived from all possible samples is included in the confidence interval.
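The three interval widths listed above can be computed directly from an estimate and its standard error. The following is a minimal sketch; the function name and the input values are illustrative and are not taken from the report's tables.

```python
def confidence_interval(estimate, standard_error, multiplier):
    """Return (lower, upper) for estimate +/- multiplier * standard_error."""
    half_width = multiplier * standard_error
    return (estimate - half_width, estimate + half_width)


# Multipliers quoted in the text for each approximate confidence level.
MULTIPLIERS = {"68 percent": 1.0, "90 percent": 1.6, "95 percent": 2.0}

# Illustrative inputs only.
estimate = 2_387_000
standard_error = 71_000

intervals = {
    level: confidence_interval(estimate, standard_error, k)
    for level, k in MULTIPLIERS.items()
}
for level, (low, high) in intervals.items():
    print(f"{level}: {low:,.0f} to {high:,.0f}")
```

With these inputs, the two-standard-error interval runs from 2,245,000 to 2,529,000.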

Standard errors may also be used to perform hypothesis testing, a procedure for distinguishing between population parameters using sample estimates. The most common types of hypotheses appearing in this report are: 1) the population parameters are identical, versus 2) they are different. An example of this would be comparing the fertility ratio of White women versus the fertility ratio of Black women 18 to 44 years old. Tests may be performed at various levels of significance, where a level of significance is the probability of concluding that the parameters are different when, in fact, they are identical. All statements of comparison in the text have passed an hypothesis test at the 0.10 level of significance or better, and most have passed an hypothesis test at the 0.05 level of significance or better. This means that, for most differences cited in the text, the estimated difference between parameters is greater than twice the standard error of the difference. For the other differences, where the estimated difference between parameters is between 1.6 and 2.0 times the standard error of the difference, the statement of comparison is qualified in some way; e.g., by use of the phrase "some evidence."
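The decision rule described in this paragraph can be sketched as a small function. The standard error of the difference is taken as an input here, since its computation is not given in this passage; the function name, the return labels, and the sample values are illustrative assumptions.

```python
def classify_difference(estimate_a, estimate_b, se_difference):
    """Classify a difference by its size relative to its standard error.

    Thresholds follow the text: more than 2.0 standard errors corresponds
    to the 0.05 level; between 1.6 and 2.0 corresponds to the 0.10 level,
    where the report qualifies the statement (e.g., "some evidence").
    """
    ratio = abs(estimate_a - estimate_b) / se_difference
    if ratio > 2.0:
        return "significant at the 0.05 level"
    if ratio >= 1.6:
        return "some evidence (0.10 level)"
    return "not significant at the 0.10 level"


# Hypothetical fertility ratios and standard error of their difference.
result_large = classify_difference(70.0, 60.0, 4.0)   # ratio 2.5
result_middle = classify_difference(70.0, 63.0, 4.0)  # ratio 1.75
result_small = classify_difference(70.0, 66.0, 4.0)   # ratio 1.0
```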

Note when using small estimates. Percent distributions and ratios are shown in this report only when the base of the statistic is greater than 75,000 for any data collected in the June 1976 or June 1980 through 1984 CPS. Because of the large standard errors involved, there is little chance that summary measures would reveal useful information when computed on a smaller base. Estimated numbers are shown, however, even though the relative standard errors of these numbers are larger than those for the corresponding percentages. These smaller estimates are provided primarily to permit such combinations of the categories as may serve each user's needs. Similarly, estimated numbers of children ever born per 1,000 women and birth expectations data are shown in the report only when the associated number of women is greater than 75,000.

Comparability with other data. Data from sources other than the Census Bureau may be subject to both higher sampling and nonsampling variability. In addition, data obtained from the CPS are not entirely comparable with data obtained from other sources. This is due in large part to differences in interviewer training and experience and in differing survey processes. This is an additional component of error not reflected in the standard error tables. Therefore, caution should be used in comparing results between these different sources.

Caution should also be used when comparing CPS estimates for 1980 and later, which reflect 1980 census-based population controls, to those for 1976, which reflect 1970 census-based population controls. This change in population controls had relatively little impact on summary measures such as means, medians, and percent distributions, but did have a significant impact on levels. For example, use of 1980-based population controls resulted in about a 2-percent increase in the civilian noninstitutional population and in the number of families and households. Thus, estimates of levels for 1980 and later will differ from those for earlier years by more than what could be attributed to actual changes in the population, and these differences could be disproportionately greater for certain subpopulation groups than for the total population.

Standard errors for data based on the CPS. In order to derive standard errors that would be applicable to a large number of estimates and could be prepared at a moderate cost, a number of approximations were required. Therefore, instead of providing an individual standard error for each estimate, generalized sets of standard errors are provided for various types of characteristics. As a result, the sets of standard errors provided give an indication of the order of magnitude of the standard error of an estimate rather than the precise standard error.

Table C-2. Standard Errors of Estimated Percentages

[table not reproduced]

The figures presented in tables C-1 and C-2 are approximations to standard errors of estimated numbers and estimated percentages. The figures presented in table C-3 are approximations to standard errors of estimated fertility ratios. Estimated standard errors for specific characteristics cannot be obtained from tables C-1, C-2, and C-3 without the use of the factors in table C-5. These factors must be applied to the standard errors in order to adjust for the combined effect of sample design and estimating procedure on the value of the characteristic. Standard errors for intermediate values not shown in the generalized tables for standard errors may be approximated by interpolation.
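As a sketch of how a table lookup might proceed, the following interpolates between two neighboring table entries and then applies a factor. Linear interpolation is an assumption (the text says only "interpolation"), and the table entries and the factor shown are hypothetical, not values from tables C-1 through C-5.

```python
def interpolate_standard_error(x, x_low, se_low, x_high, se_high):
    """Linearly interpolate a standard error between two table entries."""
    fraction = (x - x_low) / (x_high - x_low)
    return se_low + fraction * (se_high - se_low)


def adjusted_standard_error(generalized_se, factor):
    """Apply a characteristic-specific factor to a generalized standard error."""
    return generalized_se * factor


# Hypothetical table rows: estimates of 500,000 and 1,000,000 with
# generalized standard errors of 30,000 and 42,000.
generalized = interpolate_standard_error(750_000, 500_000, 30_000, 1_000_000, 42_000)

# Hypothetical factor for the characteristic of interest.
adjusted = adjusted_standard_error(generalized, 1.2)
```

The interpolation step comes first because the factor scales whatever generalized standard error the tables yield for the estimate's size.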

Two parameters are used (denoted "a" and "b") to calculate standard errors for each type of characteristic; they are presented in table C-6. These parameters were used to calculate the standard errors in tables C-1 and C-2 and to calculate the factors in table C-5. They also may be used to calculate directly the standard errors for estimated numbers and percentages. Methods for direct computation are given in the following sections.


¹It should be noted that for data involving one event per woman (e.g., one child ever born), table C-2, the table of standard errors of percentages, should be used, after dividing the number of children per 1,000 women by 10 to obtain the number of children per 100 women.


women 18 to 34 years old reporting on children ever born in 1983 lies within the interval from 2,245,000 to 2,529,000 (using twice the standard error).

Standard errors of estimated percentages. The reliability of an estimated percentage, computed using sample data for both numerator and denominator, depends on both the size of the percentage and the size of the total upon which the percentage is based. Estimated percentages are relatively more reliable than the corresponding estimates of the numerators of the percentages, particularly if the percentages are 50 percent or more. When the numerator and denominator of the percentage are in different categories, use the factor or parameters indicated by the numerator. The approximate standard error, σ(x,p), of an estimated percentage, p, on a base of size x can be obtained by use of the formula

[blocks in formation]
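The formula itself is not legible in this copy. As a hedged sketch, the code below assumes the generalized form commonly published with CPS reports, σ(x,p) = sqrt((b/x) · p · (100 − p)), where b is the parameter from table C-6 for the characteristic in question. The function name and the value of b used here are hypothetical.

```python
import math


def standard_error_of_percentage(p, x, b):
    """Approximate standard error of an estimated percentage.

    p: estimated percentage (0 to 100)
    x: base of the percentage (number of persons)
    b: the "b" parameter for the characteristic (assumed form; see lead-in)
    """
    return math.sqrt((b / x) * p * (100.0 - p))


# Hypothetical inputs: 50 percent on a base of 1,000,000 with b = 2,500.
se = standard_error_of_percentage(50.0, 1_000_000, 2_500)
```

Under this form the standard error is largest at 50 percent for a given base and shrinks as the base grows, consistent with the text's remark that reliability depends on both the percentage and its base.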