Sample size determination is the process of calculating the number of units to include in a sample so that it represents the population characteristics as appropriately as possible.
The word "appropriately" is used because no sample can exactly replicate the population characteristics. The interviewer therefore has to decide on the level of confidence required of the sample, the margin of error that is acceptable, and the expected standard deviation in order to determine the sample size.


The formula for finding the sample size is:
\[n={\left(z\frac{\sigma }{M.E}\right)}^2\]
\(n\) is the sample size that has to be calculated.
\(M.E\) refers to the margin of error. The margin of error is the largest deviation of the sample result from the true population value that the interviewer is ready to accept, expressed as a percentage. For example, if a margin of error of \(5\ \%\) is selected, the results obtained are expected to lie within \(\pm 5\ \%\) of the true value. Requiring a very low margin of error means that the sample size must be very large. Conversely, the results from a small sample have a high margin of error.
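
The dependence of the sample size on the margin of error can be made explicit with a short worked step. Holding \(z\) and \(\sigma\) fixed and halving the margin of error (here \(n'\) is just a label introduced for the new sample size) gives:
\[n'={\left(z\frac{\sigma }{M.E/2}\right)}^2=4{\left(z\frac{\sigma }{M.E}\right)}^2=4n\]
In other words, under this formula, halving the margin of error requires four times as many units in the sample.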

\(z\) is the z-score that corresponds to the level of confidence desired by the interviewer. For example, if the desired confidence level is \(95\ \%\), then over a large number of sample surveys the results would fall within the required margin of error \(95\ \%\) of the time. Requiring a higher confidence level means that the sample size has to be larger. Generally, \(95\ \%\) or \(99\ \%\) confidence levels are used. Other values such as \(80\ \%\), \(85\ \%\) or \(90\ \%\) can be used too. A low confidence level requires only a small sample, but the results may then fall outside the margin of error more often than is acceptable.

The z-scores for various confidence levels are as follows:
\(80\ \%\): \(1.2816\)
\(85\ \%\): \(1.44\)
\(90\ \%\): \(1.6449\)
\(95\ \%\): \(1.96\)
\(99\ \%\): \(2.5758\)
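
For confidence levels not listed above, the z-score can be computed rather than looked up. The following is a minimal sketch, assuming a two-sided interval under the standard normal distribution; the function name `z_score` is chosen here for illustration, and `NormalDist` comes from Python's standard `statistics` module.

```python
from statistics import NormalDist


def z_score(confidence_level: float) -> float:
    """Two-sided z-score for a confidence level given as a fraction, e.g. 0.95."""
    if not 0.0 < confidence_level < 1.0:
        raise ValueError("confidence_level must be strictly between 0 and 1")
    # A two-sided interval leaves (1 - confidence_level) / 2 in each tail,
    # so the z-score is the standard normal quantile at (1 + confidence_level) / 2.
    return NormalDist().inv_cdf((1.0 + confidence_level) / 2.0)


# Print the z-score for each confidence level listed in the text.
for level in (0.80, 0.85, 0.90, 0.95, 0.99):
    print(f"{level:.0%}: {z_score(level):.4f}")
```

To four decimal places the \(85\ \%\) entry comes out as \(1.4395\), which the list above rounds to \(1.44\).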

\(\sigma\) is the standard deviation of the population. Since the population standard deviation is usually not known in advance, a value from a pilot survey or some other estimate can be used. \(\sigma\) and \(M.E\) must be expressed in the same units.
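
With all three inputs defined, the formula can be sketched as a small function. This is only an illustration of the formula above, not a reference implementation: the name `sample_size`, its parameter names, and the input values in the usage line are assumptions made for the example. The result is rounded up because a sample cannot contain a fraction of a unit.

```python
import math


def sample_size(z: float, sigma: float, margin_of_error: float) -> int:
    """Return n = (z * sigma / margin_of_error)^2, rounded up to a whole unit.

    sigma and margin_of_error must be in the same units
    (for example both as fractions: 0.30 for 30 %, 0.02 for 2 %).
    """
    if z <= 0 or sigma <= 0 or margin_of_error <= 0:
        raise ValueError("z, sigma and margin_of_error must all be positive")
    n = (z * sigma / margin_of_error) ** 2
    # Round up: rounding down would give a margin of error larger than requested.
    return math.ceil(n)


# Hypothetical inputs: 95 % confidence, sigma = 50 %, margin of error = 5 %.
print(sample_size(1.96, 0.50, 0.05))  # 385
```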

Example 1

An interviewer needs to determine the sample size required to make estimates about the upcoming polls in his town. He needs a very small margin of error of \(2\ \%\) and desires a confidence level of \(99\ \%\). From the results of previous elections, he estimates that the population standard deviation would be \(30\ \%\).

\(z=2.5758\) (for a confidence level of \(99\ \%\))
\(M.E=2\ \%\)
\(\sigma =30\ \%\)

\[n={\left(z\frac{\sigma }{M.E}\right)}^2\]
\[\ \ \ =\ {\left(2.5758\times \frac{0.3}{0.02}\right)}^2\]
\[\ \ \ =\ {\left(38.637\right)}^2\approx 1492.8\]
Rounding up to the next whole unit, this means that a sample size of \(1493\) would be required to have a margin of error of only \(2\ \%\) with a confidence level of \(99\ \%\).
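
As a cross-check on Example 1, the calculation can be repeated in a few lines, and the same inputs can be rerun at other confidence levels to see how the requirement changes. This is a sketch under the same assumptions as the example (\(\sigma =30\ \%\), \(M.E=2\ \%\), two-sided normal z-score); the helper name `required_sample` is hypothetical.

```python
import math
from statistics import NormalDist

SIGMA = 0.30            # estimated population standard deviation (30 %)
MARGIN_OF_ERROR = 0.02  # accepted margin of error (2 %)


def required_sample(confidence_level: float) -> int:
    """Sample size for Example 1's sigma and margin at a given confidence level."""
    # Two-sided z-score from the standard normal distribution.
    z = NormalDist().inv_cdf((1.0 + confidence_level) / 2.0)
    return math.ceil((z * SIGMA / MARGIN_OF_ERROR) ** 2)


print(required_sample(0.99))  # 1493, matching the worked example

# The same inputs at lower confidence levels need smaller samples.
for level in (0.80, 0.90, 0.95):
    print(f"{level:.0%}: {required_sample(level)}")
```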