Section 3.2 General Normal Distribution (D2)

We began our discussion with the standard normal distribution to build intuition about normal distributions, but most naturally occurring normally distributed quantities do not have mean 0 and standard deviation 1. A normally distributed quantity may have any mean and any positive standard deviation; what makes it normal is that its distribution has the same fundamental shape as the standard normal: a symmetric bell curve. In the standard normal curve, the standard deviation is 1, so the units of \(z\) are standard deviations. We can translate values of any normally distributed random variable \(X\) to the standard normal distribution through \(z\)-scores.

In this section, we will compute probabilities of general normal random variables given bounds, and bounds given probabilities.

Exploration 3.2.1. Pine Trees.

Suppose the heights of pine trees are normally distributed with mean 33 m and standard deviation 3 m.

(a)

How many meters above or below the mean is a height of 39 m?

(b)

How many standard deviations above or below the mean is a height of 39 m? (A standard deviation is 3 m.)

(c)

How many meters above or below the mean is a height of 30 m?

(d)

How many standard deviations above or below the mean is a height of 30 m?

(e)

How many meters above or below the mean is a height of 38 m?

(f)

How many standard deviations above or below the mean is a height of 38 m?

Definition 3.2.1.

Given a random variable \(X\) with mean \(\mu\) and standard deviation \(\sigma\text{,}\) the \(z\)-score of a value \(x\) of \(X\) is the number of standard deviations that \(x\) lies above or below the mean. It is computed as

\begin{equation*} z=\frac{x-\mu}{\sigma}. \end{equation*}
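As a quick illustration with hypothetical numbers (not taken from the activities in this section), suppose \(\mu=100\) and \(\sigma=15\text{.}\) The value \(x=130\) then has \(z\)-score

\begin{equation*} z=\frac{130-100}{15}=2, \end{equation*}

so it lies two standard deviations above the mean. Similarly, \(x=85\) gives

\begin{equation*} z=\frac{85-100}{15}=-1, \end{equation*}

one standard deviation below the mean. A positive \(z\)-score always means the value is above the mean, and a negative one means it is below.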

Remark 3.2.2.

Consider a normal distribution with mean \(\mu\) and standard deviation \(\sigma\text{,}\) often denoted \(N(\mu, \sigma)\text{.}\) Every value \(x\) in \(N(\mu, \sigma)\) has a corresponding \(z\)-score in \(N(0,1)\text{.}\) So for a random variable \(X\) with distribution \(N(\mu, \sigma)\text{,}\) we can find \(P(a\lt X\lt b)\) by finding \(P(z_a\lt Z\lt z_b)\) in \(N(0,1)\text{,}\) where \(z_a\) and \(z_b\) are the \(z\)-scores of \(a\) and \(b\text{.}\)

This is because all normal curves are transformations of the same curve, so every point on one curve corresponds to a point on the other, as do regions and their areas.
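The correspondence in this remark can be checked numerically. The following is a minimal sketch in Python, assuming the standard library's `statistics.NormalDist` in place of the R tools used in the activities (in R, `pnorm` plays the role of `cdf`). The parameters \(\mu=100\text{,}\) \(\sigma=15\) and the function name `interval_probability` are hypothetical, chosen only for illustration.

```python
from statistics import NormalDist


def interval_probability(mu, sigma, a, b):
    """Return P(a < X < b) for X ~ N(mu, sigma), computed two ways."""
    # Way 1: work directly with the general normal distribution.
    general = NormalDist(mu, sigma)
    direct = general.cdf(b) - general.cdf(a)

    # Way 2: convert the bounds to z-scores and use N(0, 1).
    z_a = (a - mu) / sigma
    z_b = (b - mu) / sigma
    standard = NormalDist(0, 1)
    via_z = standard.cdf(z_b) - standard.cdf(z_a)

    return direct, via_z


# Hypothetical example: X ~ N(100, 15), bounds one sd either side of the mean.
direct, via_z = interval_probability(mu=100, sigma=15, a=85, b=115)
print(f"direct: {direct:.4f}, via z-scores: {via_z:.4f}")
```

Both computations should agree up to rounding, since the bounds 85 and 115 have \(z\)-scores \(-1\) and \(1\text{,}\) and the area between them is the same on either curve.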

Activity 3.2.2. Ahi Tuna: Computing Probabilities.

Suppose Ahi tuna weights are normally distributed with mean 120 lbs and standard deviation 25 lbs. Suppose a fisherman goes out fishing. Let \(X\) denote the weight of a random fish.

(a)

Consider a tuna fish weighing 150 lbs. What is the \(z\)-score of 150 lbs? Call this value \(z_0\text{.}\)

(b)

Consider a tuna fish weighing 150 lbs. What is the \(z\)-score of 150 lbs? Call this value \(z_0\text{.}\)

(c)

Find the probability that a random tuna fish weighs more than 150 lbs by computing \(P(Z>z_0)\text{.}\)

(d)

Set m=120, s=25, a=150, b=infinity to compute \(P(X>150)\) directly.

(e)

What is the probability that a tuna the fisherman catches weighs between 50 and 100 lbs?

Activity 3.2.3. Ahi Tuna part 2: Computing Weights.

Suppose Ahi tuna weights are normally distributed with mean 120 lbs and standard deviation 25 lbs as in Activity 3.2.2. Let \(X\) denote the weight of a random fish.

(a)

Suppose our fisherman wants to know whether a fish they catch is in the top 25% of Ahi tuna by weight. For what value \(z_0\) is \(P(Z>z_0)=0.25\) in the standard normal distribution?

(b)

What is the \(X\) value (\(x_0\)) corresponding to this \(z\)-score? This is the cutoff weight for the top 25%.

(c)

Set m=120, s=25, P=0.25, and select From the right to compute \(x_0\) directly.

(d)

Find a value \(k\) so that \(P(120-k\lt X\lt 120+k)=0.8\text{.}\) (Remember Remark 3.1.3)

Activity 3.2.4. Ahi Tuna part 3: R.

Suppose Ahi tuna weights are normally distributed with mean 120 lbs and standard deviation 25 lbs as in Activity 3.2.2. Let \(X\) denote the weight of a random fish.

(a)

Fix and run the following code to compute the probability that a tuna weighs less than 130 lbs, \(P(X\lt 130)\text{:}\)

(b)

Use (a) to compute \(P(X\geq 130)\text{.}\)

(c)

Adjust and run the code in (a) to compute \(P(X\lt 100)\text{.}\)

(d)

Use (a) and (c) to compute \(P(100\lt X\lt 130)\text{.}\)

(e)

Fix and run the following code to compute the value \(x_0\) such that \(P(X\lt x_0)=0.1\text{:}\)

(f)

Suppose we want to find a \(k\) such that \(P(120-k\lt X\lt 120+k)=0.90\text{.}\) What is the area of each tail? What is \(P(X\lt 120+k)\text{?}\) (Remember Remark 3.1.3)

(g)

Adjust and run the above code to compute \(120+k\text{,}\) and then find \(k\text{.}\)

Activity 3.2.5. Ahi Tuna part 4: Simulation.

Suppose Ahi tuna weights are normally distributed with mean 120 lbs and standard deviation 25 lbs as in Activity 3.2.2. Let \(X\) denote the weight of a random fish.

(a)

Run the following code to simulate the weights of 1000 random Ahi Tuna:

(b)

Run the following code to see what proportion of tuna weigh over 150 lbs:

How does this value compare to what you found in Activity 3.2.2 (c)?

(c)

Run the following code setting X0 to be the \(x_0\) found in Activity 3.2.3 (b).

How does this compare to what you observed in Activity 3.2.3 (b)?

(d)

Run the following code to see what proportion of tuna weigh between 50 and 100 pounds:

How does this value compare to what you found in Activity 3.2.2 (e)?

(e)

Run the following code setting k to be the \(k\) found in Activity 3.2.3 (d).

How does this compare to what you observed in Activity 3.2.3 (d)?
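The kind of comparison this activity asks for, a simulated proportion against an exact probability, can be sketched as follows. This is a minimal illustration in Python rather than the activity's own code, and it deliberately uses hypothetical parameters (\(\mu=100\text{,}\) \(\sigma=15\text{,}\) cutoff 115, seed 2024) instead of the tuna values. In R, `rnorm` would play the role of `random.gauss`.

```python
import random
from statistics import NormalDist

random.seed(2024)  # fixed seed so the run is repeatable

MU, SIGMA = 100, 15  # hypothetical parameters, not from the activities
N = 10_000           # number of simulated values

# Draw N simulated values from N(MU, SIGMA).
sample = [random.gauss(MU, SIGMA) for _ in range(N)]

# Proportion of simulated values above the cutoff.
cutoff = 115
simulated = sum(x > cutoff for x in sample) / N

# Exact upper-tail probability P(X > cutoff) for comparison.
exact = 1 - NormalDist(MU, SIGMA).cdf(cutoff)

print(f"simulated: {simulated:.4f}, exact: {exact:.4f}")
```

The two numbers should be close but not identical. The simulated proportion varies from run to run (or seed to seed), and it tends to settle nearer the exact value as the number of simulated values grows.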

Remark 3.2.3.

When doing a computation, always keep in mind which direction you are going: are you given bounds and finding an area (probability), or given an area (probability) and finding bounds?

Figure 3.2.4. Which direction are you going? (Diagram: arrows from values to probabilities and vice versa.)
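The two directions can be illustrated side by side. The sketch below is in Python, assuming the standard library's `statistics.NormalDist`. Its `cdf` method goes from a bound to a probability and its `inv_cdf` method goes from a probability to a bound, much as `pnorm` and `qnorm` do in R. The distribution \(N(100, 15)\) and the variable names are hypothetical.

```python
from statistics import NormalDist

# Hypothetical distribution for illustration: X ~ N(100, 15).
X = NormalDist(mu=100, sigma=15)

# Direction 1: bound -> probability.
# P(X < 115) is the area to the left of 115.
p_left = X.cdf(115)

# Direction 2: probability -> bound.
# Find x0 such that P(X < x0) = p_left; this should recover 115.
x0 = X.inv_cdf(p_left)

# Right-tail version of direction 2: the cutoff for the top 10%
# is the value with 90% of the area to its left.
top10_cutoff = X.inv_cdf(1 - 0.10)

print(f"P(X < 115) = {p_left:.4f}")
print(f"x0 = {x0:.4f}")
print(f"top 10% cutoff = {top10_cutoff:.2f}")
```

Note that both methods work with the area to the left of a value, so a right-tail question has to be rephrased in terms of the left tail first, using the fact that the total area is 1.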

Activity 3.2.6. Marathon.

Suppose marathon running times are normally distributed with mean 4.5 hours and standard deviation 1 hour. Let \(X\) denote the time in hours for a random runner to finish.

(a)

What proportion of runners would take over 6 hours?

(b)

Suppose a special award is given to the fastest 5% of runners. What is the predicted cutoff time for this award?

(c)

What is the probability a random runner finishes between 3 and 4 hours?

(d)

Find \(k\) so that 90% of runners finish between \(4.5-k\) and \(4.5+k\) hours.