Section 3.1 Standard Normal Distribution (D1)

In this section, we study one of the most ubiquitous and fundamental probability distributions in nature and mathematics: the normal distribution. You may know it as the “bell curve”.

We will compute probabilities for the standard normal random variable given bounds, and find bounds given probabilities.

Definition 3.1.1.

The standard normal curve is the curve generated by

\begin{equation*} y=\frac{e^{-x^2/2}}{\sqrt{2\pi}}. \end{equation*}

This may be seen below. It is a probability distribution with mean of 0 and standard deviation of 1.

Exploration 3.1.1. Shape of the Normal Curve.

We observe some things about the shape of the normal curve.

(a)

Where is the peak of this curve?

(b)

How does the behavior of the curve for positive \(x\) compare with its behavior for negative \(x\text{?}\)

(c)

What happens to the height of the curve as \(x\) moves away from 0?

(d)

Where is the bulk of the area under the curve?

(e)

Run the following code to simulate 10,000 standard normal variables and plot a proportion histogram of the outcomes.
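A minimal sketch of such a simulation in R, assuming base R's rnorm and hist. The seed, the number of bins, and the variable name z are illustrative choices, not part of the original activity:

```r
# Simulate 10,000 draws from the standard normal distribution.
set.seed(1)            # assumed seed, only so the run is reproducible
z <- rnorm(10000)      # defaults: mean = 0, sd = 1

# freq = FALSE rescales the bars so the total area is 1,
# which makes the histogram comparable to a probability curve.
hist(z, freq = FALSE, breaks = 40,
     main = "10,000 simulated standard normal values", xlab = "z")

# Overlay the standard normal curve for comparison.
curve(dnorm(x), add = TRUE, lwd = 2)
```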

How does the shape of this histogram compare to the standard normal curve?

Remark 3.1.2.

Given the standard normal curve, the probability that a standard normal variable \(Z\) falls between \(a\) and \(b\) is written \(P(a\lt Z\lt b)\) and is equal to the area under the curve between \(a\) and \(b\text{.}\)
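
For readers who have seen integrals, this area can be written with the curve from Definition 3.1.1 as

\begin{equation*} P(a\lt Z\lt b)=\int_a^b \frac{e^{-x^2/2}}{\sqrt{2\pi}}\,dx. \end{equation*}

Taking \(a\) to \(-\infty\) and \(b\) to \(\infty\) gives a total area of 1, as it must for a probability distribution.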

Remark 3.1.3.

ALWAYS ALWAYS ALWAYS sketch a curve and label any values/areas that you can before doing any computations when working with normal curves.

Remark 3.1.4.

Seriously, always sketch a curve first. The geometric intuition will tell you so much more than just doing arithmetic blindly.

Activity 3.1.2. Computing Probabilities on the Normal Curve.

Areas under this curve are difficult to compute directly, so technology of some sort is generally used to find these values. Some texts use “\(z\)-score tables” for this purpose, which list pre-computed values. Here, we will show and visualize the areas more directly.

In the following Desmos interactive, by setting a=-1.2 and b=2 we can see that \(P(-1.2\lt Z \lt 2)\approx 0.8622\text{.}\)
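
As a cross-check on this value, the same area can be approximated numerically. The sketch below assumes base R's integrate function and uses a hypothetical helper f for the curve from Definition 3.1.1; it is one way to confirm the number, not the method the interactive uses:

```r
# The standard normal curve from Definition 3.1.1 (f is an illustrative name).
f <- function(x) exp(-x^2 / 2) / sqrt(2 * pi)

# Numerically approximate the area under the curve between -1.2 and 2.
area <- integrate(f, lower = -1.2, upper = 2)$value
round(area, 4)  # 0.8622
```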

(a)

Adjust a and b to find \(P(0\lt Z\lt 1)\text{.}\)

(b)

Adjust a and b to find \(P(-1\lt Z\lt 0)\text{.}\) How does this compare to what you found in (a)?

(c)

Adjust a and set b=infinity to find \(P(0\lt Z)\text{.}\) How does this reflect what we saw in Exploration 3.1.1?

(d)

Adjust a and set b=infinity to find \(P(-1\lt Z)\text{.}\) How does this value compare to what we found in (b) and (c)?

(e)

Without using technology, use (a) and (c) to find \(P(1\lt Z)\text{.}\)

(f)

Adjust a and set b=infinity to find \(P(1\lt Z)\text{.}\) How does this value compare to what we found in (e)?

(g)

Adjust a and b to find \(P(Z\lt 1)\text{.}\) How does this value compare to what we found in (d)?

Activity 3.1.3. Finding bounds for Areas/Probabilities.

It stands to reason that if we can find an area given the bounds of a region, then given an area we should be able to find the bounds of a region with that area.

(a)

For what value \(Z_0\) of b is \(P(Z\lt Z_0)=0.7\text{?}\) Get \(Z_0\) as close as you can.

(b)

In the interactive below, let P=0.7 and click From the left. How close is the value you found in (a)?

(c)

Without using technology, what is \(P(Z>Z_0)\text{?}\) (Remember Remark 3.1.3)

Activity 3.1.4. Finding bounds for Areas/Probabilities.

Sketch a normal curve with a shaded region from \(-k\) to \(k\text{.}\)

(a)

Consider the tail regions \(Z>k\) and \(Z\lt -k\text{.}\) How do we know these regions have the same area without knowing \(k\text{?}\)

(b)

Suppose the region \(-k\lt Z\lt k\) has an area of 0.8. What is the area of each tail? Label the areas on your sketch.

(c)

Using just the sketch, what is \(P(Z>-k)\text{?}\)

(d)

In the interactive below, let P be the area of the left tail and click From the left. What does this tell you about \(-k\text{?}\)

(e)

In the interactive above, let P be the area of the right tail and click From the right. What does this tell you about \(k\text{?}\)

(f)

In the interactive above, let P=0.8 and click From the center. What does this tell you about \(k\) and \(-k\text{?}\)

(g)

In the interactive below, let a=-k, b=k where \(k\) is the value you found above. What do we notice about this area?

Activity 3.1.5. Computing standard normal values using R.

We can also use R to compute both probabilities and \(z\)-values. Remember Remark 3.1.3.

The command pnorm takes a value as input and returns the area under the standard normal curve to the left of that value. The command qnorm takes an area as input and returns the value that has that much area to its left.
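
Because the two commands undo each other, a quick round trip is one way to see how they relate. This is a small sketch using an arbitrary value of 1.3, chosen only for illustration:

```r
# pnorm: value -> area to its left; qnorm: area -> value with that area to its left.
p <- pnorm(1.3)   # area under the standard normal curve to the left of 1.3
z <- qnorm(p)     # the value that has area p to its left
z                 # recovers 1.3, up to floating-point rounding
```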

(a)

Run the following code to compute \(P(Z\lt 2)\text{:}\)
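
Since pnorm returns the area to the left of its input, a one-line call is enough here. The sketch below relies on pnorm's default mean of 0 and standard deviation of 1; the variable name is an illustrative choice:

```r
# Area under the standard normal curve to the left of 2.
p_left_of_2 <- pnorm(2)   # defaults: mean = 0, sd = 1
p_left_of_2               # approximately 0.9772
```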

(b)

Adjust and run the above code to find \(P(Z\lt 0.5)\text{.}\)

(c)

Use (a) and (b) to find \(P(0.5\lt Z\lt 2)\text{.}\)

(d)

Adjust and run the above code to find \(P(Z\lt 1.5)\text{.}\)

(e)

Use (d) to find \(P(Z>1.5)\text{.}\)

(f)

Run the following code to compute a \(z_0\) such that \(P(Z\lt z_0)=0.35\text{:}\)
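
Since qnorm takes a left-hand area and returns the matching value, a sketch for this part could look as follows, again assuming the default mean of 0 and standard deviation of 1. The variable name z0 is illustrative:

```r
# Value z0 with 35% of the area under the standard normal curve to its left.
z0 <- qnorm(0.35)   # defaults: mean = 0, sd = 1
z0                  # approximately -0.3853 (negative, since 0.35 < 0.5)
```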

(g)

Suppose I want to find a \(z_1\) such that \(P(Z>z_1)=0.12\text{.}\) What is \(P(Z\lt z_1)\text{?}\) (Remember Remark 3.1.3)

(h)

Adjust and run the above code to compute \(z_1\text{.}\)

(i)

Suppose I want to find a \(z_2\) such that \(P(-z_2\lt Z\lt z_2)=0.6\text{.}\) What is the area of each tail? What is \(P(Z\lt z_2)\text{?}\) (Remember Remark 3.1.3)

(j)

Adjust and run the above code to compute \(z_2\text{.}\)

Remark 3.1.5. Some things to keep in mind.

Here are some quick facts and heuristics about the normal curve to give you an intuitive sense on how it behaves.

Each half contains 50% of the area under the curve, so half of the values are greater than the mean and half are less than the mean.
For any tail of the distribution, the corresponding tail on the other side has the same area.
Roughly 68% of the area under the curve lies within \(\pm 1\) standard deviation of the mean, 95% within \(\pm 2\) standard deviations of the mean, and 99.7% within \(\pm 3\) standard deviations of the mean.
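
These central areas can be checked with pnorm by subtracting the area to the left of \(-k\) from the area to the left of \(k\text{.}\) The following is a short sketch with illustrative variable names:

```r
# Area within k standard deviations of the mean, for k = 1, 2, 3.
k <- 1:3
within_k <- pnorm(k) - pnorm(-k)   # pnorm is vectorized over k
round(within_k, 4)                 # 0.6827 0.9545 0.9973
```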