Introduction

for all the 1950 Housing reports except Volume II, for which electronic equipment was used for the tabulations; (3) the tables (except in the reports on block statistics) were typed manually on sheets with preprinted stubs and partially preprinted captions, and the tables were reproduced by offset printing.

The extensive use of electronic equipment in the 1960 Census insured a more uniform and more flexible edit than could have been accomplished manually or by less intricate mechanical equipment. In the editing operations, improved techniques of allocation for nonresponses and inconsistencies were feasible. Moreover, the use of FOSDIC completely eliminated the cardpunching operation and thereby removed one important source of error in the published statistics; the new types of error introduced by the use of FOSDIC were probably minor by comparison.

The electronic computer made it possible to do much more complex editing and coding than in earlier censuses and to assure consistency among a larger number of interrelated items. For example, the computer assigned a code to each housing unit for one of seven categories of condition and plumbing facilities; to determine this code in some instances required the scanning of entries in four items, where a full cross-classification of the items would involve approximately 36 combinations of categories. At the same time, the greater capacity of the computer permitted the keeping of a detailed record of the extent of computer editing.

In 1960, practically all the editing and coding operations on the housing schedules were accomplished by electronic equipment. The only schedules examined manually (after the field review and inspection) were those flagged by the computer for clerical review because the number of corrections required exceeded the tolerances that were established. In 1950 also, much of the editing and coding was accomplished by mechanical equipment, including electronic equipment for some tabulations. A few specified items on the housing schedules in 1950 were examined manually, and corrected when necessary, before the schedules were processed mechanically.

Editing. In a mass statistical operation, such as a national census, human and mechanical errors occasionally arise in one form or another, such as failure to obtain or record the required information, recording information in the wrong place, misreading position markings, and skipping pages. These were kept to a tolerable level by means of operational control systems. Nonresponses and inconsistencies were eliminated by using the computer to assign entries and correct inconsistencies. In some cases, missing and inconsistent entries resulted from poor markings which were unreadable or were misread by FOSDIC. In general, few assignments or corrections were required, although the amount varied by subject and by enumerator.

Whenever information was missing, an allocation procedure was used to assign an acceptable entry, thereby eliminating the need for a "not reported" category in the tabulations. The assignment was based on related information reported for the housing unit or on information reported for a similar unit in the immediate neighborhood. For example, if tenure for an occupied unit was omitted but a rental amount was reported, the computer automatically edited tenure to "rented." On the other hand, if the unit was reported as "rented" but the amount of rent was missing, the computer automatically assigned the rent that was reported for the preceding renter-occupied unit.
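The two allocation rules in the example above can be sketched as follows. This is only an illustration under stated assumptions: the record layout (dictionaries with hypothetical `tenure` and `rent` keys, `None` meaning "not reported") and the function name are invented here, and the actual computer program handled many more items and rules.

```python
def allocate_tenure_and_rent(units):
    """Fill missing tenure and rent for occupied units, in enumeration order.

    Hypothetical record layout: each unit is a dict with 'tenure' and 'rent';
    None stands for a missing entry.
    """
    last_rent = None  # rent of the preceding renter-occupied unit
    edited = []
    for unit in units:
        unit = dict(unit)  # leave the caller's records untouched
        # Rule 1: tenure omitted but a rental amount reported -> "rented".
        if unit["tenure"] is None and unit["rent"] is not None:
            unit["tenure"] = "rented"
        if unit["tenure"] == "rented":
            # Rule 2: rent missing -> rent of the preceding renter-occupied unit.
            # Assumption: if no such unit has been seen yet, rent stays missing.
            if unit["rent"] is None:
                unit["rent"] = last_rent
            if unit["rent"] is not None:
                last_rent = unit["rent"]
        edited.append(unit)
    return edited


schedule = [
    {"tenure": "rented", "rent": 60},
    {"tenure": None, "rent": 75},      # tenure will be edited to "rented"
    {"tenure": "rented", "rent": None},  # rent will be taken from the unit above
    {"tenure": "owned", "rent": None},
]
result = allocate_tenure_and_rent(schedule)
```

Because each missing value is taken from a nearby unit in the same enumeration sequence, the assigned entries follow the distribution of the data being tabulated rather than a distribution fixed in advance.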

A similar procedure was used when the information reported for an item was inconsistent with other information reported for the unit. For example, if a housing unit was enumerated as having "no running water" but having both a bathtub (or shower) and flush toilet for the exclusive use of the occupants of the unit, the computer edited water supply to "hot and cold water," a category considered to be consistent with the reported bathing and toilet facilities.

Specific tolerances were established for the number of computer allocations acceptable for a given area. If the number was beyond tolerance, the data were rejected and the original schedules were re-examined to determine the source of the error. Correction and reprocessing were undertaken as necessary and feasible. In some cases, the corrective action consisted simply of making darker shadings in the code circles. If the high number of allocations resulted from faulty entries or absence of entries on the schedules, the appropriateness of the computer allocations was considered and, in some instances, a manual allocation was substituted.

The extent of the allocations for nonresponses or for inconsistencies, including those resulting from poor markings, is shown for each item in appendix table A-1 in the individual chapters for States. The percentages reflect only the allocations made by the computer; they exclude any that were made in the field review of the census schedules and those that were made manually after they had been rejected by the computer. The table presents totals for the State, by inside and outside SMSA's, and totals for places of various population size groups. The base on which the percentage is computed is shown for each item. For most items, the percentages are based on all housing units or occupied housing units. In some instances, the base is a specific group of units. For example, a figure of 2.5 for "duration of vacancy" for places of 50,000 inhabitants or more means that answers to this question were supplied or edited for 2.5 percent of the vacant units available for rent or sale; the percent is a combined figure for all places of 50,000 inhabitants or more in the State. Percentages are not shown if the item is not published for the specified area.

In earlier censuses, assignments of acceptable entries for nonresponses and inconsistencies also were based on related information given for the units. In the absence of related information for the unit, either an acceptable code was assigned or the item was "not reported." If a code was assigned, it was made on the basis of distributions of characteristics from previous censuses or surveys. The use of electronic equipment in 1960 improved upon the procedure by making feasible the use of information implicit in the 1960 data being tabulated.

ACCURACY OF DATA

As explained above, information was obtained through self-enumeration and direct-interview procedures. The forms used by household members for self-enumeration were necessarily different from those used by the enumerator in direct interview, although the intent of the two types of forms was the same. The use of self-enumeration forms allowed household members to see the questions as worded and to consult household records to ascertain the correct answers. Furthermore, the self-enumeration forms provided brief but uniform explanations for some of the items and called attention to the response categories in a uniform manner. The less detailed wording of some items on the FOSDIC schedules was supplemented by the training and instructions given to the enumerators.

The enumerators received standardized and formal training in canvassing their districts, in interviewing, and in filling out the schedules. During training, they used a workbook which contained practice exercises and illustrations. Filmstrips with accompanying narratives and recorded interviews were also used. The fine distinctions made in the instructions, however, were probably not ordinarily conveyed to the respondents, unless they asked the enumerator for clarification of a particular point.

Some of the areas for which separate statistics are provided in Volume I are areas with relatively small numbers of housing units, and the enumeration represents the work of only a few enumerators. Moreover, such items as the delineation of living quarters and the classification of the condition of a housing unit were always determined by the enumerator. To the extent that answers to other census questions were obtained by direct inter

the part of the enumerator. Therefore, misinterpretation of the instructions or variation in interpretation of responses may have led to a wider margin of relative error and response variability in data for small areas (places with relatively small population, or the rural-nonfarm and rural-farm parts of counties) than for large areas. The systematic field review early in the enumeration corrected some of the errors arising from misunderstandings by the enumerator.

In the processing of the data, careful efforts were made at each step to reduce the effects of errors. Errors occurred through failure to obtain complete and consistent information, incorrect recording of information on the FOSDIC schedules or incorrectly transferring it from the self-enumeration forms, faulty marking of the FOSDIC schedules, and the like.

Some of the innovations in the 1960 Census reduced errors and others produced a more consistent quality of results. It is believed that the innovations have improved the quality of the results compared with those of earlier censuses but, at the same time, have introduced an element of difference in the statistics. According to present plans, one or more reports evaluating the statistics of the 1960 Census of Housing will be published later.

Statistics such as the number of owner-occupied and renter-occupied units usually appear in more than one table for a given area. These figures may differ between tables, or in the same table, when characteristics of these units were tabulated at different sample rates; for example, the number of units tabulated by condition and plumbing facilities may differ from the number tabulated by bathrooms (see table I and the section on "Ratio estimation"). In the case of financial characteristics, certain types of units were excluded from the tabulations; therefore, differences between the counts obtained from the value and rent distributions and corresponding counts from distributions for other characteristics may reflect the exclusion of these units.

Statistics in this report may differ from those in other reports from the 1960 Census of Housing where different sample rates were used for the same item. Moreover, in some cases, differences caused by errors in enumeration or processing were discovered after the publication of the early reports and were corrected in subsequent reports.

SAMPLE DESIGN AND SAMPLING VARIABILITY

SAMPLE DESIGN

Although some information was collected for all housing units in 1960, information for most of the items was collected for samples of housing units. The enumerator was instructed to assign a Sample Key letter (A, B, C, or D) to each housing unit sequentially in the order in which he first visited the unit, whether or not he completed the interview. Each enumerator was given a random key letter to start his assignment, and the order of canvassing was indicated in advance, although the instructions allowed some latitude in the order of visiting individual units at an address. Each housing unit which was assigned the key letter "A" was designated as a sample unit.
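The key-letter procedure can be pictured with a short sketch. It assumes the letters cycle A, B, C, D in strict visiting order from the enumerator's assigned starting letter; the function names are hypothetical and the latitude allowed in visiting units at an address is ignored.

```python
KEY_LETTERS = "ABCD"


def assign_sample_keys(n_units, start_letter):
    """Return the Sample Key letter for each unit in order of first visit."""
    start = KEY_LETTERS.index(start_letter)
    return [KEY_LETTERS[(start + i) % len(KEY_LETTERS)] for i in range(n_units)]


def sample_unit_positions(keys):
    """Positions (0-based visiting order) of units designated as sample units."""
    return [i for i, key in enumerate(keys) if key == "A"]


keys = assign_sample_keys(6, "C")        # enumerator's starting letter is "C"
in_sample = sample_unit_positions(keys)  # every fourth unit carries "A"
```

With a randomly chosen starting letter, each unit has about one chance in four of receiving the letter "A", which is the source of the 25-percent sample.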

Information for each housing unit in the sample was recorded on a sample FOSDIC schedule. The schedules were bound in books which were so arranged that every fifth sample FOSDIC schedule carried housing questions comprising the 5-percent sample items; the other four-fifths carried questions comprising the 20-percent sample items. Items which appeared on both types of schedules comprised the 25-percent sample items. Thus, sample items were based on 5, 20, or 25 percent of the housing units; for these items the tabulations were based on the full 5-, 20-, or 25-percent sample, respectively. For items enumerated for all housing units, however, the tabulations were not always based on the complete count; data for some of these items were tabulated from a sample of units, particularly for areas with large population. Furthermore, the same item may be tabulated at different rates within this volume. The use of different rates was determined largely by the amount of detail to be tabulated.

Although the sampling procedure did not automatically insure an exact 25-, 20-, or 5-percent sample of housing units in each area, the sample design was unbiased if carried through according to instructions. Generally, for large areas, the deviation from the estimated sample size was found to be quite small. Small biases arose, however, when the enumerator failed to follow the listing and sampling instructions exactly. The 25-percent sample as finally processed comprised 24.53 percent of the total occupied housing units and 24.71 percent of the total population in the United States as a whole.

Sample rate for tabulation. The rate at which an item was

in the United States Summary chapter, condition and plumbing facilities and number of rooms were tabulated from the complete count (100 percent) for vacant units and from the 25-percent sample for the distributions of "owner occupied," "renter occupied," and "all" units. The rates given in table I apply to the 1960 data in the text tables as well as in the detailed tables.

In the State chapters, condition and plumbing facilities for renter-occupied units were tabulated from the 25-percent sample for table 12 and from the complete count (100 percent) of units for table 25; value for owner-occupied units was tabulated from the 25-percent sample for each table in which it is presented; and the distribution and median number of persons for "all" occupied units in table 26 were based on the 100-percent count, whereas the medians for owner- and renter-occupied units were computed from the 25-percent sample. Data on number of units in structure were tabulated from the 20-percent sample for owner-occupied, renter-occupied, and vacant units. Data on number of rooms for vacant units in table 3 in the State chapters were tabulated from the 100-percent count of vacant units; data on rooms for all units in table 3 were tabulated from the 25-percent sample of occupied and vacant units. Unless otherwise specified, the sample rate for the subject is applicable to the medians and averages as well as the distributions.

Medians were computed from distributions based on weighted samples tabulated at the rate indicated for the subject in table I. Medians, averages, and percentages are not shown where the base is smaller than the required minimum. For items tabulated from the complete count, the minimum base is 50 units; for the 25-percent sample, the minimum base is 200 units; and for the 20-percent sample, the minimum base is 250 units. For population per occupied unit, the population figure is considered the base.
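The minimum-base rule amounts to a simple lookup. The sketch below is illustrative only; the names are hypothetical, and no minimum is given in this passage for the 5-percent sample, so that rate is deliberately left out.

```python
# Minimum base (number of units) by tabulation rate in percent, as stated above.
MINIMUM_BASE = {100: 50, 25: 200, 20: 250}


def derived_figure_is_shown(base, sample_rate):
    """True if a median, average, or percentage would be shown for this base.

    Raises KeyError for a rate whose minimum is not stated in the text.
    """
    return base >= MINIMUM_BASE[sample_rate]


shown = derived_figure_is_shown(200, 25)       # base exactly at the minimum
suppressed = derived_figure_is_shown(199, 25)  # base just below the minimum
```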

In 1950, information was collected on a complete-count basis except for information on heating equipment, electric lighting, refrigeration, kitchen sink, year built, radio, television, heating fuel, and cooking fuel. In the text tables in the United States Summary chapter, the 1950 data for the specified items are based on 20-percent samples of units for conterminous United States and on the complete count for Alaska and Hawaii; the 1950 data for the remaining items and all data from earlier censuses are based on the complete count (see section on "Description of

TABLE I-SAMPLE RATE FOR TABULATION

[Rate shown in percent. Rates applicable to 1960 data; see note below regarding data for 1950 and earlier]

Subject

Inventory: 1

All housing units..

By farm-nonfarm residence.

Occupied units..

By color.

By farm-nonfarm residence.

Owner occupied; renter occupied. By color.

By farm-nonfarm residence.

Vacant units..

Occupancy characteristics:

Color or ethnic group.

Persons..

Persons per room.


Bedrooms

Elevator in structure..

Rooms:

All units; owner; renter.

25

25

3 100

25

Vacant.

100

100

Units in structure.

20

20

20

20

Trailers..

25

Year structure built..

25


1 Refers to counts of units under the subject "Tenure, color, and vacancy status." Under the subject "Tenure" or "Tenure, color, and vacancy status," the counts of vacant units are based on the 100-percent enumeration; the counts of owner- and renter-occupied units by color are based on the 25-percent sample and, because they are major components in the ratio estimation, are essentially in agreement with the complete count except when tabulated by farm-nonfarm residence. These counts appear as control totals in various tables for the area; totals of distributions for characteristics based on samples of different size may not agree precisely with these counts (see section on "Ratio estimation").

For owner- and renter-occupied units, median number of persons and median number of rooms were computed from the 25-percent sample of units. Based on the 100-percent enumeration in text tables D and F.

For automobiles, 20-percent sample in places of 50,000 inhabitants or more in 1950 or in an interim census prior to 1960; 5-percent sample elsewhere. For rent and value of vacant units, 100-percent in places of 50,000 inhabitants or more in 1960 for which statistics are published by blocks in 1960 Census of Housing, Volume III, City Blocks; 25-percent sample elsewhere. For an area including a place of 50,000 inhabitants or more (e.g., a region, division, or SMSA), the parts were tabulated at their respective rates and then combined.

NOTE. The 1950 data in the text tables of the United States Summary chapter are based on the complete count, except in conterminous United States data on age of

For each of the seven groups, the ratio of the complete count to the sample count of housing units in the group was determined. Each sample housing unit in the group was assigned an integral weight so the sum of the weights would equal the complete count for the group. For example, if the ratio for a group was 4.2, one-fifth of the housing units (selected at random) within the group were assigned a weight of 5, and the remaining four-fifths, a weight of 4. The use of such a combination of integral weights rather than a single fractional weight was adopted to avoid the complications involved in rounding. For the 25-percent sample tabulations, where there were fewer than 50 housing units in the complete count in a group or where the resulting weight would be over 16, groups were, in general, combined in a specific order to satisfy these two conditions. Similar procedures with appropriate values were used for the 20- and 5-percent sample tabulations.

The ratio estimates achieve some of the gains of stratification which would have been obtained if the sample had been stratified by the groups for which separate ratio estimates were computed. The net effect is a reduction in the sampling variability and in the bias of many statistics below that which would be obtained by weighting the results of the 25-percent sample by a uniform factor of 4 (the 20-percent sample by 5 or the 5-percent sample by 20). The reduction in sampling variability is trivial for some items and substantial for others.
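The integral-weight assignment for one group can be sketched as below. This is a minimal illustration, not the Bureau's program: the function name is hypothetical, the random selection uses Python's standard `random` module, and the combining of small groups described above is not shown.

```python
import random


def integral_weights(complete_count, sample_count, rng=None):
    """Assign whole-number weights to sample units so they sum to the complete count.

    The lower weight is the ratio rounded down; just enough units, chosen at
    random, receive the next higher weight to make up the difference.
    """
    if sample_count <= 0:
        raise ValueError("group must contain at least one sample unit")
    rng = rng or random.Random()
    low, n_high = divmod(complete_count, sample_count)
    weights = [low + 1] * n_high + [low] * (sample_count - n_high)
    rng.shuffle(weights)  # random choice of which units get the higher weight
    return weights


# A group with 42 units in the complete count and 10 in the sample: ratio 4.2.
weights = integral_weights(42, 10, random.Random(0))
```

In this example two of the ten units (one-fifth) receive a weight of 5 and the other eight a weight of 4, so the weighted sample reproduces the complete count of 42 exactly.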

The ratio estimation procedure was generally applied to the smallest complete geographic area for which any data were to be published. Thus, the area may be a city, tract within a city, county, SMSA, urbanized area, or the rural part of a county. The rural-farm and rural-nonfarm units in a county, however, do not represent complete areas; therefore, data by rural-farm and rural-nonfarm residence are not subject to the reduction in sampling variability which is achieved by the ratio estimation procedure. Distributions of characteristics which were tabulated at different sample rates may not add to the same total.

The inventory of housing units (counts of all units, owner occupied, renter occupied, and vacant) is provided under the subject "Tenure, color, and vacancy status." In the detailed tables in the United States Summary chapter and in tables 1 to 24 and 28 to 35 in the State chapters, as a byproduct of the ratio estimation procedure, estimates of owner- and renter-occupied units by color of head of household (except when tabulated by farm-nonfarm residence) are essentially in agreement with the total numbers of units from the 100-percent counts in the respective groups in each area (occupied units were tabulated from the 25-percent sample and vacant units were tabulated from the 100-percent counts). However, where some of the groups in the ratio estimation procedure were combined, the estimates for owner- and renter-occupied units by color are subject to a relatively small sampling variability. The counts of units which are shown under the subject "Tenure, color, and vacancy status" in the first table for a given area appear as control totals in subsequent tables for the area. For subjects tabulated from the 20-percent or 5-percent sample, the distributions may not add precisely to these control totals.

8 Estimates of characteristics of the housing units from the sample for a given area are produced using the formula:

$$x' = \sum_{i=1}^{7} \frac{x_i}{y_i}\, Y_i$$

where $x'$ is the estimate of the characteristic for the area, $x_i$ is the count of sample housing units with the characteristic in group $i$, $y_i$ is the count of all sample housing units in group $i$, and $Y_i$ is the complete count of housing units in group $i$.

In tables 25 and 36 to 38 in the State chapters, the counts of owner- and renter-occupied units by color and the counts of vacant units, when presented under the subject "Tenure" or "Tenure, color, and vacancy status," are the 100-percent counts and therefore are not subject to sampling variability. In State table 27, the counts of owner-occupied, renter-occupied, and available vacant units also are the 100-percent counts. In State tables 40 to 42, all the data are subject to sampling variability.

Farm residence was based on the 25-percent sample of units, and estimates of owner- and renter-occupied units by color were inflated to the 100-percent counts for the entire rural portion of a county. The separate counts of rural-nonfarm and rural-farm units, therefore, are subject to sampling variability.

In the text tables in the United States Summary chapter, the 1960 inventory counts are essentially in agreement with the 100-percent counts, as specified in the headnotes. For these figures, the counts are based partly on a sample; figures for owner-occupied and renter-occupied units are based on the 25-percent sample subject to ratio estimation, and the counts of vacant units are based on the 100-percent enumeration. Distributions of characteristics based on samples of units may not add precisely to the inventory counts.

SAMPLING VARIABILITY

Standard error of numbers and percentages.-Figures from sample tabulations are subject to sampling variability. For the 1960 data based on samples, the sampling variability can be estimated by using factors from table IV in conjunction with table II for absolute numbers and with table III for percentages. These tables do not reflect the effect of response variance, processing variance, or bias arising in the collection, processing, and estimation steps; estimates of the magnitude of some of these factors in the total error are being prepared and will be published at a later date. The chances are about two out of three that the difference due to sampling variability between an estimate based on a sample and the figure that would have been obtained from a complete count is less than the standard error. The chances are about 19 out of 20 that the difference is less than twice the standard error and about 99 out of 100 that it is less than 2½ times the standard error. The amount by which the estimated standard error must be multiplied to obtain other odds deemed more appropriate can be found in most statistical textbooks.

Table II shows estimates proportionate to the standard errors of estimated numbers of housing units. Table III shows estimates proportionate to the standard errors of estimated percentages of housing units. Table IV provides a factor by which the

These estimates of sampling variability are based on partial informa

To estimate a standard error for a given characteristic, locate in table I the sample rate used in the tabulation, and in table IV the factor applying to the item according to the sample rate used; multiply this factor by the estimate proportionate to the standard error given for the number shown in table II. The product of this multiplication is the approximate standard error. Similarly, to obtain an estimate of the standard error of a percentage, multiply the figure as shown in table III by the factor from table IV. For most estimates, linear interpolation in tables II and III will provide reasonably accurate results.

Illustration: Let us assume table 13 in a State chapter shows that in a given city there are an estimated 2,500 housing units with two or more bathrooms. According to table I, data on number of bathrooms in table 13 were tabulated from the 20-percent sample of housing units. Table IV shows that when number of bathrooms is tabulated from the 20-percent sample, the appropriate number in table II should be multiplied by a factor of 1.2. Table II shows that the estimate proportionate to the standard error for an estimate of 2,500 is about 80. The factor of 1.2 times 80, or 96, means that the chances are approximately 2 out of 3 that the results of a complete count would not differ by more than 96 from the estimated 2,500. It also follows that there is only about 1 chance in 100 that the results of a complete count would differ by as much as 240, that is, by 2½ times the standard error.

Assume also that table 28 for a State shows there are an estimated 300 dilapidated housing units in a given county. According to table I, the sample rate of tabulation for condition and plumbing is 25 percent, and according to table IV the factor is 1.2. Table II shows that the estimate proportionate to the standard error for an estimate of 300 is about 32. The factor of 1.2 times 32, or 38, means that the chances are approximately 2 out of 3 that the results of a complete count would not differ by more than 38 from the estimated 300. In table 25 for the State, however, the estimated number of dilapidated units was tabulated from the 100-percent count, and, therefore, is not subject to sampling variability.
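The arithmetic of the illustration can be written out as a short sketch. The table II values (80 and 32) and the table IV factor (1.2) are taken from the illustration itself; the function names are hypothetical, and looking up or interpolating in the tables is left to the reader.

```python
def approximate_standard_error(table_value, factor):
    """Table II (or III) entry multiplied by the table IV factor."""
    return table_value * factor


def error_bounds(standard_error):
    """Bounds on the sampling difference for the three sets of odds cited above."""
    return {
        "2 in 3": 1.0 * standard_error,
        "19 in 20": 2.0 * standard_error,
        "99 in 100": 2.5 * standard_error,
    }


# 2,500 units with two or more bathrooms, 20-percent sample.
se_bathrooms = approximate_standard_error(80, 1.2)
bounds_bathrooms = error_bounds(se_bathrooms)

# 300 dilapidated units, 25-percent sample.
se_dilapidated = approximate_standard_error(32, 1.2)
```

Here `se_bathrooms` is about 96 and the "99 in 100" bound about 240, matching the figures in the text; `se_dilapidated` is 38.4, which the text rounds to 38.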

Homeowner and rental vacancy rates, which are given in tables

tables 1, 2, 12, 18, and 22 in the State chapters, are subject to relatively small sampling variability in most cases since they are computed by using the 100-percent count of vacant units and the estimates of owner-occupied and renter-occupied units.

For a characteristic tabulated by color or tenure, the factor for the characteristic in table IV approximates the factor that applies to the data in the cross-tabulation. For example, to obtain the approximate standard error of the estimated number of owner-occupied units built in the period 1950 to 1954, apply the factor in table IV for "year structure built" to the estimate in table II. In the text tables in the United States Summary chapter, 1950 data based on a sample also are subject to sampling variability. Estimates of the standard errors are given in 1950 Census of Housing, Volume I, General Characteristics.

Standard error of differences.-The standard errors estimated from tables II and III (using factors from table IV) are not directly applicable to differences between two estimates. The estimates of sampling errors are to be applied differently in the following three situations:

1. For a difference between a sample estimate and one based on a complete count (e.g., a difference arising from comparison between condition and plumbing facilities based on the 25-percent sample for one area, and condition and plumbing facilities from the 100-percent tabulations in another area), the standard error of the difference is identical with the standard error of the estimate based on the sample.

2. For a difference between two sample estimates (e.g., one from 1960 and the other from 1950, or both from the same census year), the standard error is approximately the square root of the sum of the squares of the standard error of each estimate considered separately. This formula will represent the standard error quite accurately for the difference between estimates of the same characteristic in two different areas, or for the difference between separate and uncorrelated characteristics in the same area. If, however, there is a high positive correlation between the two characteristics, the formula will overestimate the true standard error.

3. For a difference between two sample estimates one of which represents a subclass of the other (e.g., units in sound condition and having all plumbing facilities as a subclass of all units in sound condition), the difference should be considered as the sample estimate; the standard error of this difference may be obtained directly.
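The first two situations can be expressed as a brief sketch. The function names are hypothetical, and the second function assumes the two estimates are uncorrelated, as the text requires for the formula to hold closely. In the third situation no combining formula is needed, since the difference is treated as a sample estimate in its own right.

```python
import math


def se_difference_sample_vs_complete(se_sample):
    """Situation 1: the complete-count figure contributes no sampling error."""
    return se_sample


def se_difference_two_samples(se_a, se_b):
    """Situation 2: square root of the sum of the squared standard errors."""
    return math.hypot(se_a, se_b)


# Two sample estimates with standard errors of 96 and 38 units.
se_diff = se_difference_two_samples(96, 38)
```

The combined standard error is always at least as large as the larger of the two inputs and never larger than their sum.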
