Section 2.4 Random Variables (P4)

In Section 1.4 and Section 1.5, we learned how to find the mean, variance, and standard deviation of a finite population. Earlier in this chapter, we studied the probability of individual events either happening or not happening. In this section, we look at the collection of outcomes of a random process and measure the centrality and spread of these processes.

Definition 2.4.1.

A random variable is a random process or variable with a numerical outcome.

Remark 2.4.2.

Given a random variable \(X\) with a finite number of outcomes, we denote them \(X_1,\ldots,X_n\text{,}\) and the probability that \(X\) takes on the value \(X_i\) is \(P(X=X_i)\text{.}\)

Since \(X_1,\ldots,X_n\) are all of the possible outcomes, these probabilities satisfy:

  1. \(\displaystyle 0\leq P(X=X_i)\leq 1\)

  2. \(\displaystyle \sum P(X=X_i)=1\)

Exploration 2.4.1. Textbooks for a class.

Two books are assigned for a statistics class: a textbook and its corresponding study guide. The university bookstore determined 20% of enrolled students do not buy either book, 55% buy the textbook only, and 25% buy both books, and these percentages are relatively constant from one term to another.

(a)

If there are 100 students enrolled, how many books should the bookstore expect to sell to this class?

(b)

The textbook costs $137 and the study guide $33. How much revenue should the bookstore expect from this class of 100 students?

(c)

What is the average revenue per student for this course?

Subsection 2.4.1 Expectation

Definition 2.4.3.

Given a random variable \(X\text{,}\) the expected value of \(X\text{,}\) denoted \(E(X)\text{,}\) is the mean outcome of \(X\text{.}\)

Note that \(E(X)\) need not be an outcome of \(X\text{.}\)

Remark 2.4.4.

For a random variable \(X\) with a finite number of outcomes \(X_1, \ldots, X_n\text{,}\) the expectation of \(X\) is computed

\begin{equation*} E(X)=\sum P(X=X_i)\cdot X_i. \end{equation*}
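
As a small illustration of this formula, here is a minimal Python sketch. The function name `expected_value` and the coin-flip game are hypothetical and do not come from the text; they are only meant to show the weighted sum in action.

```python
def expected_value(outcomes, probabilities):
    """Return E(X), the sum of P(X = X_i) * X_i over all outcomes."""
    # The probabilities of all outcomes must add up to 1.
    if abs(sum(probabilities) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(p * x for x, p in zip(outcomes, probabilities))


# Hypothetical game: win $0 with probability 0.5, win $10 with probability 0.5.
game_mean = expected_value([0, 10], [0.5, 0.5])
print(game_mean)
```

In this made-up game the expected value is 5, which is not itself a possible outcome of the game.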

Activity 2.4.2. Simulating Textbook Purchases.

Suppose from Exploration 2.4.1 that each student has a 20% chance of not purchasing either book, a 55% chance of purchasing a textbook for $137, and a 25% chance of purchasing a textbook and study guide for $170. Let \(X\) denote the amount of money a student spends.

(a)

What is the probability that a student spends $0? That is, find \(P(X=0)\text{.}\)

(b)

What is the probability that a student spends $137? That is, find \(P(X=137)\text{.}\)

(c)

What is the probability that a student spends $170? That is, find \(P(X=170)\text{.}\)

(d)

Compute \(E(X)\text{.}\)

(e)

Run the following code to simulate 100 students and their purchases, and display a histogram of their purchases.
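
The page's interactive code cell is not reproduced here, so the following is a stand-in sketch in Python using only the standard library. It draws 100 purchases with the probabilities stated above and prints a text histogram. The fixed seed and the variable names are assumptions made for repeatability and readability.

```python
import random
from collections import Counter

random.seed(0)  # assumption: fixed seed so repeated runs give the same sample

amounts = [0, 137, 170]        # dollars spent, from the activity statement
weights = [0.20, 0.55, 0.25]   # matching probabilities

# Draw one purchase amount for each of 100 students.
purchases = random.choices(amounts, weights=weights, k=100)

# Text histogram: one '#' per student who spent that amount.
counts = Counter(purchases)
for amount in amounts:
    print(f"${amount:>3}: {'#' * counts[amount]} ({counts[amount]})")
```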

(f)

Run the following code to show the mean amount spent by students.
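
One possible version of this step, sketched in standard-library Python, is below. It simulates 100 purchases itself so that it runs on its own; the seed and variable names are assumptions.

```python
import random
from statistics import mean

random.seed(0)  # assumption: fixed seed for a repeatable sample

# Simulate 100 students with the probabilities from the activity statement.
purchases = random.choices([0, 137, 170], weights=[0.20, 0.55, 0.25], k=100)

# Sample mean: total spent divided by the number of students.
sample_mean = mean(purchases)
print(f"Mean amount spent: ${sample_mean:.2f}")
```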

How does this value compare to what you found in (d)?

(g)

Modify the following code to simulate 10,000 students and their purchases, and display a histogram of their purchases along with the mean purchase price.
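
A starting point for this part might look like the sketch below, written in standard-library Python. The helper `simulate_purchases` and the `num_students` variable are hypothetical names; the bars are scaled by the share of students so the printout stays readable for larger classes.

```python
import random
from collections import Counter
from statistics import mean


def simulate_purchases(num_students, seed=0):
    """Simulate purchase amounts with the probabilities from the activity."""
    rng = random.Random(seed)  # assumption: seeded generator for repeatable runs
    return rng.choices([0, 137, 170], weights=[0.20, 0.55, 0.25], k=num_students)


num_students = 100  # change this value to 10_000 for this part
purchases = simulate_purchases(num_students)

counts = Counter(purchases)
for amount in (0, 137, 170):
    share = counts[amount] / num_students
    bar = "#" * round(50 * share)  # bar length is proportional to the share of students
    print(f"${amount:>3}: {bar} ({counts[amount]})")

print(f"Mean purchase price: ${mean(purchases):.2f}")
```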

How does this value compare to what you found in (d)? In (f)?

Subsection 2.4.2 Variance

Exploration 2.4.3. Outcomes of dice.

Consider the equally likely outcomes of a die roll: 1, 2, 3, 4, 5, 6.

(a)

Find the population mean and variance of this set.

(b)

Run the following code to simulate 1,000 die rolls and plot a histogram of their outputs.
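
A minimal stand-in for the missing code cell, assuming a fair six-sided die and using only Python's standard library, is sketched here. The seed and the scaling of one `#` per 5 rolls are arbitrary choices.

```python
import random
from collections import Counter

random.seed(1)  # assumption: fixed seed for a repeatable sample

# Roll a fair six-sided die 1,000 times.
rolls = [random.randint(1, 6) for _ in range(1000)]

# Text histogram: one '#' for every 5 rolls showing that face.
counts = Counter(rolls)
for face in range(1, 7):
    bar = "#" * (counts[face] // 5)
    print(f"{face}: {bar} ({counts[face]})")
```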

(c)

Run the following code to find the mean and variance of these die rolls.

How do these values compare to what you found in (a)?
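
One way to carry out this step is sketched below in standard-library Python. It re-simulates the 1,000 rolls so that it runs on its own. It uses `statistics.pvariance`, which divides by the number of rolls; that matches the population variance asked for in (a), but it is an assumption about what the original cell computed.

```python
import random
from statistics import mean, pvariance

random.seed(1)  # assumption: fixed seed for a repeatable sample

# Roll a fair six-sided die 1,000 times.
rolls = [random.randint(1, 6) for _ in range(1000)]

roll_mean = mean(rolls)
roll_variance = pvariance(rolls)  # population variance: divides by the number of rolls

print(f"Mean of rolls:     {roll_mean:.3f}")
print(f"Variance of rolls: {roll_variance:.3f}")
```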

Remark 2.4.5.

For a random variable \(X\) with a finite number of outcomes \(X_1, \ldots, X_n\text{,}\) the variance of \(X\text{,}\) denoted \(Var(X)\text{,}\) is computed

\begin{equation*} Var(X)=\sum P(X=X_i)\cdot (X_i-E(X))^2. \end{equation*}

Then the standard deviation of \(X\) is computed

\begin{equation*} \sigma_X=\sqrt{Var(X)}=\sqrt{\sum P(X=X_i)\cdot (X_i-E(X))^2}. \end{equation*}
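
These two formulas can be sketched in Python as follows. The function names and the coin-flip game (win $0 or $10, each with probability 0.5) are hypothetical and serve only as an illustration.

```python
from math import sqrt


def expected_value(outcomes, probabilities):
    """Return E(X), the sum of P(X = X_i) * X_i."""
    return sum(p * x for x, p in zip(outcomes, probabilities))


def variance(outcomes, probabilities):
    """Return Var(X), the sum of P(X = X_i) * (X_i - E(X))^2."""
    mu = expected_value(outcomes, probabilities)
    return sum(p * (x - mu) ** 2 for x, p in zip(outcomes, probabilities))


# Hypothetical game: win $0 with probability 0.5, win $10 with probability 0.5.
outcomes = [0, 10]
probabilities = [0.5, 0.5]

game_variance = variance(outcomes, probabilities)
game_sd = sqrt(game_variance)  # the standard deviation is the square root of the variance
print(game_variance, game_sd)
```

For this game each outcome lies 5 away from the mean, so the variance is 25 and the standard deviation is 5.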

Activity 2.4.4. Comparing to Data Sets.

Activity 2.4.5. Die Variance.

Let \(X\) denote the outcome of a die roll.

(a)

For each outcome of a die roll (\(X_i=1,\ldots,6\)), find the probability that a die will give that value (\(P(X=X_i)\)).

\begin{equation*} \begin{array}{|c|c|c|c|c|c|c|} \hline X_i \amp \hspace{1in} \amp \hspace{1in} \amp \hspace{1in} \amp \hspace{1in} \amp \hspace{1in} \amp \hspace{1in} \\ \hline P(X=X_i) \amp \amp \amp \amp \amp \amp \\ \hline \end{array} \end{equation*}

(b)

Compute \(E(X)\text{.}\)

(c)

Compute \(Var(X)\text{.}\)

Activity 2.4.6. Fast Food Order.

Suppose customers at a fast food chain have a 40% chance of ordering the regular combo for $5, a 25% chance of ordering the large for $7.50, and a 35% chance of ordering the jumbo for $12. Let \(X\) denote the amount of money a customer spends.

(a)

Write down a table recording each of the \(X_i\) and their associated probability:

\begin{equation*} \begin{array}{|c|c|c|c|} \hline X_i \amp 5 \amp 7.50 \amp 12 \\ \hline P(X=X_i) \amp \hspace{1in} \amp \hspace{1in} \amp \hspace{1in} \\ \hline \end{array} \end{equation*}

(b)

Compute \(E(X)\) and \(Var(X)\text{.}\)

(c)

Run the following code to simulate 200 customers and plot a histogram of their purchase amounts.
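
As a stand-in for the missing code cell, this standard-library Python sketch draws 200 orders with the stated probabilities and prints a text histogram. The seed and the scaling of one `#` per 2 customers are assumptions.

```python
import random
from collections import Counter

random.seed(2)  # assumption: fixed seed for a repeatable sample

prices = [5.00, 7.50, 12.00]   # regular, large, jumbo
weights = [0.40, 0.25, 0.35]   # matching probabilities

# Draw one order for each of 200 customers.
orders = random.choices(prices, weights=weights, k=200)

# Text histogram: one '#' for every 2 customers at that price.
counts = Counter(orders)
for price in prices:
    print(f"${price:>5.2f}: {'#' * (counts[price] // 2)} ({counts[price]})")
```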

(d)

Run the following code to find the mean and variance of the purchase amounts.

How do these values compare to what you found in (b)?
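
A possible version of this step is sketched below in standard-library Python. It re-simulates the 200 orders so that it runs on its own. The choice of `statistics.pvariance`, which divides by the number of orders, is an assumption.

```python
import random
from statistics import mean, pvariance

random.seed(2)  # assumption: fixed seed for a repeatable sample

# Simulate 200 customers with the probabilities from the activity statement.
orders = random.choices([5.00, 7.50, 12.00], weights=[0.40, 0.25, 0.35], k=200)

order_mean = mean(orders)
order_variance = pvariance(orders)  # population variance: divides by the number of orders

print(f"Mean purchase amount:     ${order_mean:.2f}")
print(f"Variance of the amounts:  {order_variance:.2f}")
```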

Activity 2.4.7. Lottery Winnings.

Suppose a lottery ticket that costs $2 has a 20% chance of winning $5, a 0.1% chance of winning $100, and a 0.001% chance of winning $10,000. Let \(X\) denote the net winnings of a ticket.

(a)

What are the possible net winnings (\(X_i\)) of a ticket? (Remember to account for the $2 purchase cost and for the scenario where you do not win.)

(b)

Write down a table recording each of the \(X_i\) and their associated probability:

\begin{equation*} \begin{array}{|c|c|c|c|c|} \hline X_i \amp \hspace{1in} \amp \hspace{1in} \amp \hspace{1in} \amp \hspace{1in} \\ \hline P(X=X_i) \amp \amp \amp \amp \\ \hline \end{array} \end{equation*}

(c)

Compute \(E(X)\) and \(Var(X)\text{.}\)

(d)

Overall, do you think the lottery ticket sellers make or lose money?

(e)

Run the following code to simulate 1,000,000 lottery ticket winnings and plot a histogram of their outputs.
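
The sketch below is a stand-in for the missing code cell, in standard-library Python. It reads net winnings as the prize minus the $2 ticket cost and treats every ticket that wins no prize as the remaining probability; both are interpretations of the activity statement. Because the counts differ by several orders of magnitude, it prints a table of counts rather than bars.

```python
import random
from collections import Counter

random.seed(3)  # assumption: fixed seed for a repeatable sample

ticket_cost = 2
prizes = [0, 5, 100, 10_000]        # the first entry is "no prize"
win_probs = [0.20, 0.001, 0.00001]  # 20%, 0.1%, and 0.001%

# The probability of winning nothing is whatever remains after the three prizes.
weights = [1 - sum(win_probs)] + win_probs

# Net winnings: prize minus the cost of the ticket.
net_outcomes = [prize - ticket_cost for prize in prizes]

winnings = random.choices(net_outcomes, weights=weights, k=1_000_000)

counts = Counter(winnings)
for outcome in net_outcomes:
    print(f"{outcome:>6}: {counts[outcome]:>9,} tickets")
```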

(f)

Run the following code to find the mean and variance of these winnings.

How do these values compare to what you found in (c)?
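
One way to do this is sketched below in standard-library Python. It re-simulates the tickets so that it runs on its own, again reading net winnings as the prize minus the $2 cost. The mean and variance are computed directly from their definitions, dividing by the number of tickets, which is an assumption about what the original cell did.

```python
import random

random.seed(3)  # assumption: fixed seed for a repeatable sample

net_outcomes = [-2, 3, 98, 9998]    # prize minus the $2 ticket cost
win_probs = [0.20, 0.001, 0.00001]  # 20%, 0.1%, and 0.001%
weights = [1 - sum(win_probs)] + win_probs  # first entry: no prize

winnings = random.choices(net_outcomes, weights=weights, k=1_000_000)

n = len(winnings)
winnings_mean = sum(winnings) / n
# Population variance: average squared distance from the mean.
winnings_variance = sum((w - winnings_mean) ** 2 for w in winnings) / n

print(f"Mean net winnings:     {winnings_mean:.4f}")
print(f"Variance of winnings:  {winnings_variance:.2f}")
```

With an outcome as rare as the top prize, the sample variance can shift noticeably from one seed to another, so trying a few different seeds is worthwhile before comparing.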