
Section 5.1 Size of Samples (Categorical)

We saw in Remark 4.2.5 that we can increase the likelihood the true proportion falls within our intervals, so long as we sacrifice precision and widen our intervals. But what if we demand both dependability and precision?

In this section, we identify the sample sizes necessary to achieve confidence intervals with a given margin of error.

Subsection 5.1.1 Finding the Size of a Sample

Exploration 5.1.1. Course Modality.

A professor is deciding whether to offer a course face-to-face, or online, and wishes to find a 99% confidence interval for the proportion of students who would prefer it to be online.

Recall that the pieces necessary to compute a 99% confidence interval are: \(\hat{p}\text{,}\) the sample proportion; \(n\text{,}\) the sample size; \(SE_{\hat{p}}\text{,}\) the standard error; and \(z^*\text{,}\) the \(z\)-score associated with the 99% interval.

(a)

Which one of the following factors is directly under the professor's control?

  1. \(\hat{p}\text{.}\)

  2. \(n\text{.}\)

  3. \(SE_{\hat{p}}\text{.}\)

  4. \(z^*\text{.}\)

(b)

Suppose they let \(n=40\) and found 16 students who preferred it be online. Compute:

  • \(\hat{p}\text{.}\)

  • \(SE_{\hat{p}}\text{.}\)

  • \(z^*\text{.}\)

  • The 99% Confidence Interval.

Remark 5.1.1.

We note that the interval we found in Exploration 5.1.1 is quite wide! If the professor wanted to narrow it down, the only thing under their control is how many students they survey.

Definition 5.1.2.

Given a confidence interval \([LB, UB]\text{,}\) the point estimate (in this chapter \(\hat{p}\)) is halfway between the lower bound and upper bound: \(\frac{LB+UB}{2}\text{.}\) The margin of error is the distance from the point estimate to either bound:

\begin{equation*} ME=UB-\hat{p}=\hat{p}-LB. \end{equation*}

Figure 5.1.3. Margin of Error.

For proportion intervals, this can be computed as:

\begin{equation*} ME=z^*SE_{\hat{p}}=z^*\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}. \end{equation*}
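To make the formula concrete, here is a minimal Python sketch of the margin of error computation. The helper names `z_star` and `margin_of_error` and the survey numbers (60 "yes" responses out of 200, at 95% confidence) are hypothetical values chosen for illustration; they are not taken from the activities in this section.

```python
from math import sqrt
from statistics import NormalDist


def z_star(confidence):
    """Critical z-score for a two-sided interval at the given confidence level."""
    # A 95% interval leaves 2.5% in each tail, so we look up the 97.5th percentile.
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)


def margin_of_error(p_hat, n, confidence):
    """ME = z* times the standard error of the sample proportion."""
    standard_error = sqrt(p_hat * (1 - p_hat) / n)
    return z_star(confidence) * standard_error


# Hypothetical survey: 60 of 200 respondents say yes.
p_hat = 60 / 200
me = margin_of_error(p_hat, 200, 0.95)
print(f"ME = {me:.4f}")
print(f"interval = [{p_hat - me:.4f}, {p_hat + me:.4f}]")
```

Rerunning with a larger `n` and the same `p_hat` shows the margin of error shrinking, which is the lever this section relies on.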

Activity 5.1.2. Course Modality Revisited.

Suppose the professor from Exploration 5.1.1 wanted to conduct a new survey, and this time they want the margin of error to be less than 5%.

(b)

The professor is still fairly confident that \(\hat{p}=\frac{16}{40}\) is a reasonable estimate for the proportion of students who want to take an online class.

Use this value for \(\hat{p}\) and \(z^*\) for a 99% interval to solve the inequality:

\begin{equation*} z^*\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}\lt 0.05. \end{equation*}
(c)

What's the minimum number of students the professor needs to survey to get a 99% confidence interval with margin of error less than 5%? (This needs to be a whole number.)
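In general, squaring both sides of the inequality \(z^*\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}\lt ME\) and rearranging gives \(n\gt \left(\frac{z^*}{ME}\right)^2\hat{p}(1-\hat{p})\text{.}\) The following Python sketch applies that rearrangement. The function names and the inputs (\(\hat{p}=0.3\text{,}\) 95% confidence, margin of error below 4%) are hypothetical and deliberately different from the activity, so the activity is still left to work by hand.

```python
from math import floor, sqrt
from statistics import NormalDist


def z_star(confidence):
    """Critical z-score for a two-sided interval at the given confidence level."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)


def margin_of_error(p_hat, n, confidence):
    """ME = z* * sqrt(p_hat * (1 - p_hat) / n)."""
    return z_star(confidence) * sqrt(p_hat * (1 - p_hat) / n)


def min_sample_size(p_hat, max_margin, confidence):
    """Smallest whole n whose margin of error is strictly below max_margin."""
    # Rearranged inequality: n > (z* / ME)^2 * p_hat * (1 - p_hat).
    threshold = (z_star(confidence) / max_margin) ** 2 * p_hat * (1 - p_hat)
    # n must be a whole number strictly greater than the threshold.
    return floor(threshold) + 1


# Hypothetical inputs, not the values from the activity.
n = min_sample_size(p_hat=0.3, max_margin=0.04, confidence=0.95)
print("minimum n:", n)
print("ME at n:    ", margin_of_error(0.3, n, 0.95))
print("ME at n - 1:", margin_of_error(0.3, n - 1, 0.95))
```

Printing the margin of error at both `n` and `n - 1` is a quick sanity check: the first should fall below the target and the second should not.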

Activity 5.1.3. Minimum \(n\) with no Info.

In Activity 5.1.2 we were able to find the minimum number of students needed to be surveyed to find a 99% confidence interval with 5% margin of error. We did this using an existing estimate for \(\hat{p}\text{.}\) But suppose we are conducting a study, and we want to know the size of our sample before we have any information, so that we can properly plan the data collection process. What should we use for \(\hat{p}\) if we have no data?

(a)

Note that the margin of error \(ME=z^*\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}\) is maximized when \(\hat{p}(1-\hat{p})\) is maximized.

Consider this graph of \(y=\hat{p}(1-\hat{p})\text{.}\) When is \(y\) maximized?

Figure 5.1.4. \(y=\hat{p}(1-\hat{p})\text{.}\)

From this we can conclude that no matter what \(\hat{p}\) is:

\begin{equation*} z^*\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}\leq z^*\sqrt{\frac{\frac{1}{2}(1-\frac{1}{2})}{n}}. \end{equation*}

Thus, if we have no information on \(\hat{p}\) we should use \(\hat{p}=0.5\) to find our sample size.
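The claim that \(\hat{p}(1-\hat{p})\) peaks at \(\hat{p}=0.5\) can also be checked numerically. This short Python sketch evaluates the product on a grid of candidate values from 0 to 1 in steps of 0.01; the grid spacing and variable names are our own choices for illustration.

```python
# Candidate values for p-hat: 0.00, 0.01, ..., 1.00.
candidates = [i / 100 for i in range(101)]

# Pick the candidate that makes p * (1 - p) as large as possible.
best = max(candidates, key=lambda p: p * (1 - p))
largest_product = best * (1 - best)

print("p-hat maximizing p(1-p):", best)
print("largest value of p(1-p):", largest_product)

# Every other candidate gives a product no larger than this one,
# so using this p-hat yields the most cautious (largest) sample size.
assert all(p * (1 - p) <= largest_product for p in candidates)
```

A grid search is not a proof, but it agrees with the graph: the parabola is symmetric about \(\hat{p}=0.5\) and never exceeds \(0.25\text{.}\)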

Activity 5.1.4. Curing Zombification.

In the zombie apocalypse, a geneticist believes she has found a prototype cure for zombification. She knows the effectiveness rate is not 100%, but otherwise she is not sure what it is. She wants to find \(p\text{:}\) the proportion of zombies it cures.

Suppose she wanted to find a 95% confidence interval for her cure with margin of error at most 2%. Naturally, securing zombies safely is a dangerous endeavor, so she would like to find the smallest possible number of zombies she would need to do this.

(a)

Find the \(z^*\) associated with 95%.

(b)

Following Activity 5.1.3 let \(\hat{p}=0.5\) and solve the inequality for \(n\text{:}\)

\begin{equation*} z^*\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}\lt 0.02. \end{equation*}
(c)

What is the smallest number of zombies which must be captured and tested to achieve a 95% confidence interval with 2% margin of error? (A whole number)

(d)

Fix and run the following code to see how many of the n you captured are cured:

(e)

Using this sample, compute a 95% confidence interval for the proportion of zombies cured by this prototype. Is the margin of error less than 2%?

(f)

Repeat this activity, but for a 99% confidence interval instead.
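The simulate-then-estimate loop in parts (d) and (e) might look something like the following Python sketch. It is not the code referred to in part (d): the function names, the sample size `n = 1000`, the true cure rate of 0.6, and the random seed are all hypothetical placeholders. Substitute your own \(n\) from part (c) when working the activity.

```python
import random
from math import sqrt
from statistics import NormalDist


def simulate_cured(n, true_rate, seed):
    """Count how many of n simulated zombies are cured at a given true cure rate."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    return sum(rng.random() < true_rate for _ in range(n))


def proportion_interval(successes, n, confidence):
    """Return (lower bound, upper bound, margin of error) for a proportion."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    me = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - me, p_hat + me, me


# Hypothetical values: replace n with the sample size you found in part (c).
n = 1000
cured = simulate_cured(n, true_rate=0.6, seed=1)
lb, ub, me = proportion_interval(cured, n, 0.95)

print(f"cured: {cured} of {n}")
print(f"95% interval: [{lb:.4f}, {ub:.4f}]")
print(f"margin of error: {me:.4f}")
```

Because the sample size is planned using \(\hat{p}=0.5\text{,}\) the worst case, the margin of error computed from the actual sample should come out at or below the planned target whenever \(n\) is large enough.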