
Table B-3 displays the factors associated with annual Puerto Rican population change (births, deaths, and migration) together with the resulting component estimates. The component estimates were developed by taking the 1980 census count of Puerto Ricans and, for each year since the census, adding the natural increase (mainland births minus deaths) and the number of net arrivals from the island (inmigrants minus outmigrants).
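The component method described above can be sketched in a few lines of Python. This is only an illustration: the input figures are placeholders invented for the example, not values from table B-3, and the names `YearComponents` and `component_estimates` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class YearComponents:
    year: int
    births: int        # mainland births
    deaths: int        # mainland deaths
    inmigrants: int    # arrivals from the island
    outmigrants: int   # departures to the island


def component_estimates(census_count, components):
    """Return {year: estimate}, cumulating natural increase and net migration."""
    estimates = {}
    population = census_count
    for c in sorted(components, key=lambda item: item.year):
        natural_increase = c.births - c.deaths
        net_migration = c.inmigrants - c.outmigrants
        population += natural_increase + net_migration
        estimates[c.year] = population
    return estimates


# Placeholder inputs (NOT taken from table B-3).
example = [
    YearComponents(1981, births=50_000, deaths=8_000, inmigrants=30_000, outmigrants=20_000),
    YearComponents(1982, births=51_000, deaths=8_500, inmigrants=25_000, outmigrants=27_000),
]
result = component_estimates(2_000_000, example)
print(result)
```

Each year's estimate builds on the previous year's, so an error in any one year's components carries forward into every later estimate.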

The footnotes in table B-3 explain how the components for the individual annual estimates of the Puerto Rican population were developed. Because the actual number of births, deaths, and migrants cannot be measured until after the year in question, established rates based on observed trends have been used to project the number of births and deaths for the years 1988 and 1989, and migrants in 1989. By 1991, the final tallies of 1988 and 1989 births and deaths will be known. The actual migration flow for 1989 will never be known. The passenger statistics, used by the Puerto Rican Planning Board to compute migration, measure all movement, without regard for the status of the traveler. Not all persons who move between the mainland and the island are Puerto Rican or permanent migrants.

Confidence intervals and independent estimates. The independent estimates in tables B-2 and B-3 are not significantly different from the CPS estimates in 1982, 1986, and 1988. They are significantly different in all the other years. In figure B-2, the independent estimates have been plotted with the CPS confidence intervals from table B-2. This graphic comparison of the two sets of estimates indicates that although the independent series does not exactly coincide with the CPS series, it has approximated the CPS confidence interval boundary during much of the decade. Both series indicate that an overall increase has occurred in the Puerto Rican population since 1980.

Figure B-2.

The independent series suggests that the Puerto Rican population may have increased more rapidly than the comparable CPS estimates indicate. Given the limitations associated with either approach, we cannot be certain of the true size of the Puerto Rican population. It appears, however, to have increased by between 18 and 29 percent since 1980. If the increase is real, the growth spurt could make the Puerto Rican population one of the fastest growing ethnic groups on the U.S. mainland. In 1990, the decennial census will provide a more accurate measure of the change in this population subgroup since 1980 than either the CPS or the independent estimation procedure.

[blocks in formation]

Appendix C. Source and Accuracy of Estimates

SOURCE OF DATA

Most estimates in this report come from data obtained in March of 1989 in the Current Population Survey (CPS). The Bureau of the Census conducts the survey every month, although this report uses only March data for its estimates. Also, some estimates come from 1980 decennial census data. The March survey uses two sets of questions, the basic CPS and the supplement.

Basic CPS. The basic CPS collects primarily labor force data about the civilian noninstitutional population. Interviewers ask questions concerning labor force participation about each member 14 years old and over in every sample household.

The March 1989 CPS sample was selected from the 1980 decennial census files with coverage in all 50 States and the District of Columbia. The sample is continually updated to account for new residential construction. It is located in 729 areas comprising 1,973 counties, independent cities, and minor civil divisions. About 56,100 occupied households are eligible for interview every month. Interviewers are unable to obtain interviews at about 2,500 of these units because the occupants are not home after repeated calls or are unavailable for some other reason.
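As a quick arithmetic check on the figures above, about 2,500 noninterviews out of about 56,100 eligible households implies a noninterview rate of roughly 4.5 percent. The variable names below are illustrative only.

```python
eligible_households = 56_100  # occupied households eligible each month (approximate)
noninterviews = 2_500         # units where no interview is obtained (approximate)

interviewed = eligible_households - noninterviews
noninterview_rate = noninterviews / eligible_households

print(f"Interviewed households: about {interviewed:,}")
print(f"Noninterview rate: about {noninterview_rate:.1%}")
```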

Since the introduction of the CPS, the Bureau of the Census has redesigned the CPS sample several times to improve the quality and reliability of the data and to satisfy changing data needs. The most recent changes were completely implemented in July 1985.

The following table summarizes changes in the CPS designs for the years for which data appear in this report.

Description of the March Current Population Survey

[blocks in formation]

March supplement. In addition to the basic CPS questions, interviewers asked supplementary questions in March about the economic situation of persons and families for the previous year.

To obtain more reliable data for the Hispanic population, the March CPS sample was increased by about 2,500 eligible housing units, interviewed the previous November, that contained at least one sample person of Hispanic origin. In addition, the sample included persons in the Armed Forces living off post or with their families on post.

Estimation procedure. This survey's estimation procedure inflates weighted sample results to independent estimates of the civilian noninstitutional population of the United States by age, sex, race and Hispanic/nonHispanic categories. The independent estimates were based on statistics from decennial censuses of population; statistics on births, deaths, immigration and emigration; and statistics on the size of the Armed Forces. The independent population estimates used for 1981 (1980 for income estimates) to present were based on updates to controls established by the 1980 decennial census. Data previous to 1981 were based on independent population estimates from the most recent decennial census. For more details on the change in independent estimates, see the section entitled "Introduction of 1980 Census Population Controls" in an earlier report (Series P-60, No. 133). The estimation procedure for the March supplement included a further adjustment so husband and wife of a household received the same weight.
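The idea of inflating weighted sample results to independent population estimates can be illustrated with a single-step ratio adjustment. This is a simplified sketch, not the Bureau's actual multi-stage procedure: the function name `ratio_adjust`, the cell labels, and all numbers are hypothetical.

```python
def ratio_adjust(weights, cells, controls):
    """Scale each weight so the weighted total in every cell matches its control.

    weights  : base weights, one per sample person
    cells    : cell label for each person (e.g. an age-sex-race-origin group)
    controls : dict mapping cell label -> independent population estimate
    """
    # Weighted sample total in each cell before adjustment.
    totals = {}
    for weight, cell in zip(weights, cells):
        totals[cell] = totals.get(cell, 0.0) + weight

    # Multiply each weight by (control / weighted total) for its cell.
    return [w * controls[c] / totals[c] for w, c in zip(weights, cells)]


# Hypothetical toy data: two cells, five sample persons.
weights = [1000.0, 1200.0, 800.0, 1500.0, 1500.0]
cells = ["A", "A", "A", "B", "B"]
controls = {"A": 3300.0, "B": 2700.0}

adjusted = ratio_adjust(weights, cells, controls)
for cell, weight in zip(cells, adjusted):
    print(cell, round(weight, 1))
```

In this toy case cell A is under-represented, so its weights are scaled up, and cell B is over-represented, so its weights are scaled down. After adjustment the weighted total in each cell equals its control.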

The estimates in this report for 1982 and later also employ a revised survey weighting procedure for persons of Hispanic origin. In previous years, weighted sample results were inflated to independent estimates of the noninstitutional population by age, sex, and race. There was no specific control of the survey estimates for the Hispanic population. Since then, the Bureau of the Census developed independent population controls for the Hispanic population by sex and detailed age groups. Revised weighting procedures incorporate these new controls. The independent population estimates include some, but not all, undocumented immigrants.

ACCURACY OF ESTIMATES

Since the CPS estimates come from a sample, they may differ from figures from a complete census using the same questionnaires, instructions, and enumerators. A sample survey estimate has two possible types of error: sampling and nonsampling. The accuracy of an estimate depends on both types of error, but the full extent of the nonsampling error is unknown. Consequently, one should be particularly careful when interpreting results based on a relatively small number of cases or on small differences between estimates. The standard errors for CPS estimates primarily indicate the magnitude of sampling error. They also partially measure the effect of some nonsampling errors in responses and enumeration, but do not measure systematic biases in the data. (Bias is the average over all possible samples of the differences between the sample estimates and the desired value.)

Nonsampling variability. Nonsampling errors can be attributed to many sources. These sources include the inability to obtain information about all cases in the sample, definitional difficulties, differences in the interpretation of questions, respondents' inability or unwillingness to provide correct information or to recall information, errors made in data collection such as in recording or coding the data, errors made in processing the data, errors made in estimating values for missing data, and failure to represent all units with the sample (undercoverage).

CPS undercoverage results from missed housing units and missed persons within sample households. Compared to the level of the 1980 decennial census, overall CPS undercoverage is about 7 percent. CPS undercoverage varies with age, sex, and race. Generally, undercoverage is larger for males than for females and larger for Blacks and other races combined than for Whites. As described previously, ratio estimation to independent age-sex-race-Hispanic population controls partially corrects for the bias due to undercoverage. However, biases exist in the estimates to the extent that missed persons in missed households or missed persons in interviewed households have different characteristics from those of interviewed persons in the same age-sex-race-Hispanic group. Furthermore, the independent population controls have not been adjusted for undercoverage in the 1980 census.

For additional information on nonsampling error including the possible impact on CPS data when known, refer to Statistical Policy Working Paper 3, An Error Profile: Employment as Measured by the Current Population Survey, Office of Federal Statistical Policy and Standards, U.S. Department of Commerce, 1978 and Technical Paper 40, The Current Population Survey: Design and Methodology, Bureau of the Census, U.S. Department of Commerce.

Comparability of data. Data obtained from the CPS and other sources are not entirely comparable. This results from differences in interviewer training and experience and in differing survey processes. This is an example of nonsampling variability not reflected in the standard errors. Use caution when comparing results from different sources.

Caution should also be used when comparing estimates in this report, which reflect 1980 census-based population controls, with estimates for 1980 (1979 for income estimates) and earlier years, which reflect 1970 census-based population controls. This change in population controls had relatively little impact on summary measures such as means, medians, and percentage distributions, but did have a significant impact on levels. For example, use of 1980 based population controls results in about a 2-percent increase in the civilian noninstitutional population and in the number of families and households. Thus, estimates of levels for data collected in 1981 and later years will differ from those for earlier years by more than what could be attributed to actual changes in the population. These differences could be disproportionately greater for certain subpopulation groups than for the total population.

Since no independent population control totals for persons of Hispanic origin were used before 1982, compare Hispanic estimates over time cautiously.

Note when using small estimates. Summary measures (such as medians and percentage distributions) are shown only when the base is 75,000 or greater. Because of the large standard errors involved, summary measures would probably not reveal useful information when computed on a smaller base. However, estimated numbers are shown even though the relative standard errors of these numbers are larger than those for corresponding percentages. These smaller estimates permit combinations of the categories to suit data users' needs. Take care in the interpretation of small differences. For instance, even a small amount of nonsampling error can cause a borderline difference to appear significant or not, thus distorting a seemingly valid hypothesis test.
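The display rule for summary measures can be expressed as a simple threshold check. The sketch below assumes only the 75,000 base stated above; the function name and the sample inputs are hypothetical.

```python
MIN_BASE = 75_000  # summary measures are shown only when the base is this large


def percentage_or_suppressed(numerator, base):
    """Return the percentage, or None when the base is below the threshold."""
    if base < MIN_BASE:
        return None
    return 100.0 * numerator / base


shown = percentage_or_suppressed(30_000, 120_000)      # base is large enough
suppressed = percentage_or_suppressed(30_000, 60_000)  # base is too small
print(shown, suppressed)
```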

Sampling variability. Sampling variability is variation that occurred by chance because a sample was surveyed rather than the entire population. Standard errors, as calculated by methods described later in "Standard Errors and Their Use," are primarily measures of sampling variability, although they may include some nonsampling error.

Standard errors and their use. A number of approximations are required to derive, at a moderate cost, standard errors applicable to all the estimates in this report. Instead of providing an individual standard error for each estimate, generalized sets of standard errors are provided for various types of characteristics. Thus, the tables show levels of magnitude of standard errors rather than the precise standard errors.

The sample estimate and its standard error enable one to construct a confidence interval, a range that would include the average result of all possible samples with a known probability. For example, if all possible samples were surveyed under essentially the same general conditions and using the same sample design, and if an estimate and its standard error were calculated from each sample, then approximately 90 percent of the intervals from 1.6 standard errors below the estimate to 1.6 standard errors above the estimate would include the average result of all possible samples.

A particular confidence interval may or may not contain the average estimate derived from all possible samples. However, one can say with specified confidence that the interval includes the average estimate calculated from all possible samples.

Some statements in the report may contain estimates followed by a number in parentheses. This number can be added to and subtracted from the estimate to calculate upper and lower bounds of the 90-percent confidence interval. For example, if a statement contains the phrase "grew by 1.7 percent (±1.0)," the 90 percent confidence interval for the estimate, 1.7 percent, is 0.7 percent to 2.7 percent.

Standard errors may also be used to perform hypothesis testing, a procedure for distinguishing between population parameters using sample estimates. The most common type of hypothesis appearing in this report is that the population parameters are different. An example of this would be comparing the average size of Hispanic families in 1989 to the average size of Hispanic families in 1988.

Tests may be performed at various levels of significance, where a significance level is the probability of concluding that the characteristics are different when, in fact, they are the same. All statements of comparison in the text have passed a hypothesis test at the 0.10 level of significance or better. This means that the absolute value of the estimated difference between characteristics is greater than or equal to 1.6 times the standard error of the difference.
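The decision rule in the preceding paragraph amounts to one comparison. The sketch below uses the 1.6 multiplier stated in the text; the function name and the input values are hypothetical and were chosen only to show both outcomes.

```python
Z_90 = 1.6  # multiplier the report uses for the 0.10 significance level


def is_significant(estimate_a, estimate_b, se_difference, z=Z_90):
    """True when |a - b| is at least z times the standard error of the difference."""
    return abs(estimate_a - estimate_b) >= z * se_difference


# Hypothetical inputs: the threshold is 1.6 * 0.05 = 0.08 in both cases.
larger_gap = is_significant(3.80, 3.70, 0.05)   # difference of about 0.10
smaller_gap = is_significant(3.80, 3.75, 0.05)  # difference of about 0.05
print(larger_gap, smaller_gap)
```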

[blocks in formation]
[blocks in formation]

The standard error of this estimate is about 73,000, so the 90-percent confidence interval for the number of Hispanic families in the United States in 1989 is from 4,706,000 to 4,940,000, i.e., 4,823,000 ± 1.6(73,000). Therefore, a conclusion that the average estimate derived from all possible samples lies within a range computed in this way would be correct for roughly 90 percent of all possible samples.
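The interval above can be checked with a short calculation. The estimate (4,823,000), the standard error (73,000), and the 1.6 multiplier come from the text; the function name is illustrative.

```python
Z_90 = 1.6  # multiplier the report uses for a 90-percent confidence interval


def confidence_interval(estimate, standard_error, z=Z_90):
    """Return (lower, upper) bounds: estimate -/+ z standard errors."""
    half_width = z * standard_error
    return estimate - half_width, estimate + half_width


lower, upper = confidence_interval(4_823_000, 73_000)

# Round to the nearest thousand, as the report does.
print(f"{round(lower, -3):,.0f} to {round(upper, -3):,.0f}")
```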

Standard errors of estimated percentages. The reliability of an estimated percentage, computed using

Table C-2. Standard Errors of Estimated Numbers: Total or Non-Hispanic

(Numbers in thousands)

Size of estimate

[blocks in formation]
[blocks in formation]

Note: For a particular characteristic, see table C-5 for the appropriate factor to apply to the above standard errors.

[blocks in formation]
[blocks in formation]

Illustration. Table 4 of this report shows that in 1989, 23.1 percent of the 4,823,000 Hispanic families were maintained by female householders. Table 4 also shows that 16.0 percent of all non-Hispanic families (61,013,000) were maintained by female householders. The apparent difference between the percentage of Hispanic and non-Hispanic families maintained by female householders in 1989 is 7.1 percentage points. Using formula (4) with b = 1,906 from table C-5, the approximate standard error, sx, for Hispanic female householders is 0.8. The standard error, sy, for non-Hispanic female householders is 0.2 (b = 2,110). Using formula (5), the standard error of the estimated difference of 7.1 percentage points is about

Table C-3. Standard Errors of Estimated Percentages: Hispanic

[blocks in formation]
[blocks in formation]

Note: For a particular characteristic, see table C-5 for the appropriate factor to apply to the above standard errors.
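The illustration above, which compares Hispanic and non-Hispanic families maintained by female householders, can be retraced numerically. Formulas (4) and (5) are not legible in this text, so the sketch assumes the forms commonly used for generalized standard errors: sqrt((b / x) * p * (100 - p)) for a percentage p on base x, and sqrt(sx^2 + sy^2) for a difference between two independent estimates. Both forms and the function names are assumptions. The inputs (percentages, bases, and b parameters) are taken from the illustration.

```python
import math


def se_percentage(p, base, b):
    """Assumed form of formula (4): standard error of a percentage p on a base."""
    return math.sqrt((b / base) * p * (100.0 - p))


def se_difference(se_x, se_y):
    """Assumed form of formula (5): standard error of a difference."""
    return math.sqrt(se_x ** 2 + se_y ** 2)


se_hispanic = se_percentage(23.1, 4_823_000, 1_906)
se_non_hispanic = se_percentage(16.0, 61_013_000, 2_110)
se_diff = se_difference(se_hispanic, se_non_hispanic)

print(f"Hispanic: {se_hispanic:.1f}")
print(f"Non-Hispanic: {se_non_hispanic:.1f}")
print(f"Difference: {se_diff:.2f}")
```

Under these assumptions the two percentage standard errors round to the 0.8 and 0.2 quoted in the illustration, which suggests the assumed form of formula (4) is at least consistent with the report's figures.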
