## Determining Sample Size and Measuring Degree of Error

In most cases, it is not realistic for a library to survey every item in its collection, so a part of the collection that represents the whole, called the sample, is selected. To be representative of the entire population, sample titles are selected randomly: every item in the collection has an equal chance of being selected for the sample.
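A minimal sketch of simple random sampling, assuming the collection can be listed as a sequence of item identifiers. The function name `draw_sample` and the numbered collection are hypothetical and serve only as illustration. Python's `random.sample` draws without replacement, so every item has the same chance of being selected.

```python
import random

def draw_sample(item_ids, sample_size, seed=None):
    """Return `sample_size` distinct items chosen uniformly at random."""
    rng = random.Random(seed)  # a fixed seed makes the draw repeatable
    return rng.sample(list(item_ids), sample_size)

# Hypothetical collection of 100,000 items identified by a running number.
collection = range(1, 100_001)
sample = draw_sample(collection, 384, seed=1)
print(len(sample), len(set(sample)))  # 384 items, all distinct
```

In practice the identifiers might be barcodes or shelf positions; the draw itself does not change.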

The sample size, or the number of items chosen for the sample, determines the degree of error in the results of the survey. Confidence level and tolerance, also called precision of reliability, are measures of this degree of error. Tolerance is defined as "a measure of the accuracy of our result", and confidence level as "a measure of how certain one is that the true answer lies within the limits stated in this tolerance" (Drott, 119).

As an example, suppose a survey reports that 57 percent of the books sampled are in moderate condition, and that all results are based on a 95 percent confidence level with a tolerance of +/- 4%. This means there is only one chance in twenty (5%) that the actual percentage of books in the collection that are in moderate condition is greater than 61% or less than 53%.
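The relationship between an observed percentage, the sample size, and the tolerance can be sketched as follows. This assumes the usual normal approximation for a proportion and a 95 percent confidence level (z = 1.96). The sample of 600 books is a hypothetical figure chosen for illustration, since the example above does not state a sample size.

```python
import math

def tolerance_pct(observed_pct, sample_size, z=1.96):
    """Tolerance in percentage points for an observed percentage (normal approximation)."""
    p = observed_pct / 100
    return 100 * z * math.sqrt(p * (1 - p) / sample_size)

# Hypothetical survey: 57% of 600 sampled books are in moderate condition.
tol = tolerance_pct(57, 600)
low, high = 57 - tol, 57 + tol
print(f"+/- {tol:.1f} points, interval {low:.0f}% to {high:.0f}%")
```

Under these assumptions the tolerance comes out at roughly 4 percentage points, giving the 53% to 61% interval used in the example.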

The higher the desired confidence level, the larger the sample size needed for a given population and tolerance. Statistical formulas relate sample size and degree of error to the size of the surveyed population, and tables based on such formulas relate confidence level and tolerance to sample size. One such table is reproduced below (Drott, 120).

| Tolerance (%) | 99% confidence | 95% confidence | 90% confidence | 80% confidence |
|---|---|---|---|---|
| +/- 0.5 | 66,358 | 38,416 | 27,060 | 16,435 |
| +/- 1.0 | 16,590 | 9,604 | 6,765 | 4,109 |
| +/- 2 | 4,149 | 2,401 | 1,691 | 1,027 |
| +/- 3 | 1,843 | 1,067 | 752 | 457 |
| +/- 5 | 664 | 384 | 271 | 164 |
| +/- 7 | 339 | 196 | 138 | 84 |
| +/- 10 | 166 | 96 | 68 | 41 |
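The source does not state which formula underlies the table. As a sketch, the standard large-population formula n = z² · p · (1 − p) / e², with the worst-case proportion p = 0.5 and rounded z-scores, comes close to the tabulated sample sizes. The function names, the `Z_SCORES` lookup, and the finite population correction shown here are assumptions added for illustration, not part of Drott's text.

```python
# Rounded z-scores for two-sided confidence levels (standard normal table values).
Z_SCORES = {0.99: 2.576, 0.95: 1.96, 0.90: 1.645, 0.80: 1.282}

def sample_size(confidence, tolerance, proportion=0.5):
    """Sample size for a large population: n = z^2 * p * (1 - p) / e^2."""
    z = Z_SCORES[confidence]
    return round(z ** 2 * proportion * (1 - proportion) / tolerance ** 2)

def adjust_for_population(n, population):
    """Finite population correction: n / (1 + (n - 1) / N)."""
    return round(n / (1 + (n - 1) / population))

print(sample_size(0.95, 0.05))            # 384
print(sample_size(0.90, 0.03))            # 752
print(adjust_for_population(384, 5_000))  # 357
```

The correction only matters for small collections: for a hypothetical collection of 5,000 items, the 384 items needed at 95 percent confidence and +/- 5% tolerance drop to 357.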

Another factor to consider when deciding on sample size is the level of detail wanted from the results of the survey. Larger sample sizes are needed to obtain information about subsets of a collection with differing characteristics. The number of samples to be collected and analyzed also bears upon the cost of carrying out a survey.

Breaking down the percentages of books classified as brittle by library subunit, publication date, and country of publication, as was done in the Yale Survey (Walker, "The Yale Survey", 124), requires a fairly large sample size. The Yale sample size was 36,500 books, requiring a large number of staff to carry out the survey. Small sample sizes, on the other hand, offer valid results for homogeneous populations and describe the condition of a collection as a whole, but will not convey information about special parts of the collection.
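A rough sketch of why subset breakdowns need larger samples, assuming a simple random sample, a 95 percent confidence level (z = 1.96), and the worst-case proportion of 0.5. The subunit holding 10 percent of the collection is a hypothetical figure, and `subset_tolerance` is an illustrative name. A subset receives only its share of the sampled items, so its tolerance is much wider than that of the collection as a whole.

```python
import math

def subset_tolerance(total_sample, subset_share, z=1.96):
    """Worst-case (p = 0.5) tolerance, in percentage points, for a subset."""
    subset_n = total_sample * subset_share  # expected items sampled from the subset
    return 100 * z * math.sqrt(0.25 / subset_n)

# Compare a small sample with a sample the size of the Yale survey.
for n in (384, 36_500):
    print(n, round(subset_tolerance(n, 0.10), 1))
```

Under these assumptions, a sample of 384 gives about +/- 16 points for the hypothetical subunit, while a sample of 36,500 gives under +/- 2 points.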

When considering sample size, "a balance needs to be struck between the accuracy of the results, the degree of information needed on subsets of the collection, and availability of staff to carry out the data collection" (Reed-Scott, 93).