
Section 2.2 Conditional Probability (P2)

Let's say you know the probability of getting into a car accident while driving. If it starts snowing, would that change how likely you think an accident is? Lung cancer occurs at a certain rate, but if you knew that someone smoked, would that affect how likely you think it is that they develop lung cancer?

We define conditional probability to be the likelihood of some event \(A\) given (or conditioned on, or assuming) some other event \(B\text{.}\) This extra information can raise, lower, or leave unchanged our understanding of the likelihood of \(A\text{.}\)

In this section we compute and interpret the probabilities of events involving conditional probabilities.

Run the following code to download the loans_full_schema data set, and display the variables it records:

This data set represents thousands of loans made through Lending Club, a platform that allows individuals to lend to other individuals.

Subsection 2.2.1 Defining Conditional Probability

Exploration 2.2.1.

Consider the experiment of rolling a die. Let \(A\) denote the event “the outcome is even”, and let \(B\) denote “the outcome is greater than 3”.

(a)

Run the following code to show the results of 20 die rolls.

(b)

Out of the 20 die rolls, how many satisfy \(A\text{?}\) List them:

(c)

Out of the die rolls which satisfy \(A\text{,}\) how many also satisfy \(B\text{?}\) List them:

(d)

Out of the 20 die rolls, how many satisfy \(B\text{?}\) List them:

(e)

Out of the die rolls which satisfy \(B\text{,}\) how many also satisfy \(A\text{?}\) List them:

(f)

Out of the 20 die rolls, what proportion of them satisfy \(A\text{?}\)

(g)

Out of the die rolls that satisfy \(B\text{,}\) what proportion of them satisfy \(A\text{?}\)

(h)

Out of the 20 die rolls, what proportion of them satisfy \(B\text{?}\)

(i)

Out of the die rolls that satisfy \(A\text{,}\) what proportion of them satisfy \(B\text{?}\)

(j)

Out of the 20 die rolls, how many satisfy both \(A\) and \(B\text{?}\) List them:
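The counts and proportions asked for in this exploration can be sketched in Python as follows. This is only an illustrative sketch, not the text's own code cell: the choice of Python, the seed, and the variable names (`rolls`, `in_A`, `in_B`, `in_A_and_B`) are all assumptions.

```python
import random

rng = random.Random(2024)  # arbitrary fixed seed (an assumption) so reruns match

rolls = [rng.randint(1, 6) for _ in range(20)]  # 20 fair die rolls

in_A = [r for r in rolls if r % 2 == 0]   # A: the outcome is even
in_B = [r for r in rolls if r > 3]        # B: the outcome is greater than 3
in_A_and_B = [r for r in in_A if r > 3]   # rolls satisfying both A and B

print("rolls:    ", rolls)
print("satisfy A:", in_A)
print("satisfy B:", in_B)
print("A and B:  ", in_A_and_B)

# Proportions out of all 20 rolls
print("proportion A:", len(in_A) / len(rolls))
print("proportion B:", len(in_B) / len(rolls))

# Proportions inside a restricted set of rolls (skipped if that set is empty)
if in_B:
    print("proportion of B-rolls that satisfy A:", len(in_A_and_B) / len(in_B))
if in_A:
    print("proportion of A-rolls that satisfy B:", len(in_A_and_B) / len(in_A))
```

Note that the last two proportions share a numerator but divide by different totals, which is why they generally differ.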

Activity 2.2.2. Conditional Probability and Dice.

Consider rolling a die, and let \(A\) denote the event “the outcome is even”, and let \(B\) denote “the outcome is greater than 3”, as in Exploration 2.2.1.

(a)

Out of the possible outcomes 1 through 6, which satisfy \(B\text{?}\)

(b)

Out of the outcomes which satisfy \(B\text{,}\) how many also satisfy \(A\text{?}\) What are they?

(c)

If you picked an outcome at random which satisfied \(B\text{,}\) what is the probability that it satisfies \(A\) as well?

(d)

How does this compare to \(P(A)\text{,}\) the probability that \(A\) occurs in general?

Definition 2.2.1.

We say the probability of \(A\) given \(B\) is the probability that \(A\) occurs assuming that \(B\) occurs, denoted \(P(A|B)\text{.}\)

We can picture this by first taking the part of the sample space that is \(B\text{,}\) which has probability \(P(B)\text{,}\) and then, inside of that, finding the part that also satisfies \(A\text{,}\) which has probability \(P(A\ \text{and}\ B)\text{.}\)

Figure 2.2.2. \(B\text{,}\) the new sample space.
Figure 2.2.3. \(A\) when restricted to \(B\text{.}\)

Since we now treat \(B\) as the entire sample space, we take \(P(A\ \text{and}\ B)\) as a proportion of \(P(B)\text{,}\) so, provided \(P(B)>0\text{,}\)

\begin{equation*} P(A|B)=\frac{P(A\ \text{and}\ B)}{P(B)}. \end{equation*}
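As a quick check of this formula, here is a minimal Python sketch on a fair six-sided die. The two events are not from the text and are chosen only for illustration: \(E\) is “the outcome is a multiple of 3” and \(F\) is “the outcome is at least 3”. The helper name `prob` is likewise hypothetical.

```python
from fractions import Fraction

sample_space = range(1, 7)  # faces of a fair six-sided die


def prob(event):
    """Probability of an event, given as a true/false test on one outcome."""
    favorable = [x for x in sample_space if event(x)]
    return Fraction(len(favorable), len(sample_space))


def in_E(x):
    return x % 3 == 0   # E: multiple of 3 (illustrative event)


def in_F(x):
    return x >= 3       # F: at least 3 (illustrative event)


p_E = prob(in_E)                                # 2/6 = 1/3
p_F = prob(in_F)                                # 4/6 = 2/3
p_E_and_F = prob(lambda x: in_E(x) and in_F(x)) # 2/6 = 1/3

# Definition: P(E|F) = P(E and F) / P(F), valid because P(F) > 0
p_E_given_F = p_E_and_F / p_F

print("P(E)   =", p_E)
print("P(E|F) =", p_E_given_F)
```

Here knowing \(F\) raises the probability of \(E\) from \(1/3\) to \(1/2\text{,}\) because both multiples of 3 survive when the sample space shrinks from six outcomes to four.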

Activity 2.2.3. Conditional Probability and Dice using the Definition.

Consider rolling a die, and let \(A\) denote the event “the outcome is even”, and let \(B\) denote “the outcome is greater than 3” as in Exploration 2.2.1, Activity 2.2.2.

(a)

Find \(P(A), P(B), P(A\ \text{and}\ B )\text{.}\)

Activity 2.2.4. Conditional Probability and Dice Simulation.

Consider rolling a die, and let \(A\) denote the event “the outcome is even”, and let \(B\) denote “the outcome is greater than 3” as in Exploration 2.2.1, Activity 2.2.2, Activity 2.2.3. We will simulate 1000 die rolls and observe the results.

(a)

Run the following code to show the results of 1000 die rolls as a histogram.

(b)

Run the following code to separate out the die rolls which are greater than 3 as their own vector B.

(c)

Run the following code to count the length of B.

(d)

Run the following code to count the elements of B that are even (i.e. satisfy \(A\)).

(e)

What proportion of the die rolls that satisfy \(B\) also satisfy \(A\text{?}\)
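The steps of this activity can be sketched in Python as below. This is a sketch under stated assumptions rather than the text's own code: the language, the seed, the text-based tally used in place of a plotted histogram, and the names `rolls`, `B`, `n_B`, and `n_A_in_B` are all illustrative choices.

```python
import random
from collections import Counter

rng = random.Random(7)  # arbitrary fixed seed (an assumption)
rolls = [rng.randint(1, 6) for _ in range(1000)]  # 1000 fair die rolls

# Text stand-in for a histogram: how often each face appeared
counts = Counter(rolls)
for face in range(1, 7):
    print(face, counts[face])

# Separate out the rolls greater than 3 as their own list B
B = [r for r in rolls if r > 3]

n_B = len(B)                                 # number of rolls satisfying B
n_A_in_B = sum(1 for r in B if r % 2 == 0)   # of those, how many are even (A)
print("rolls in B:", n_B)
print("even rolls in B:", n_A_in_B)

# Simulated estimate of P(A|B): proportion of B-rolls that also satisfy A
if n_B:
    print("proportion:", n_A_in_B / n_B)
```

With 1000 rolls the final proportion is only an estimate, so it will vary a little from seed to seed around the exact value found in Activity 2.2.2.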

Activity 2.2.5. Conditional Probability Students.

Suppose that in a Mathematics department, 20% of the students have done undergraduate research and 25% have gone to conferences. We also know that 30% have done either one or the other.

Let \(R\) denote the event “has done undergraduate research” and \(C\) denote “has gone to a conference”.

(a)

Fill out the proportion of students that fit in each piece of this Venn Diagram:

Figure 2.2.4. \(R\) research, \(C\) conferences.

(b)

If a student doesn't do undergraduate research, what's the probability that they go to a conference?

(c)

If a student goes to a conference, what is the probability that they did undergrad research?
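One possible way to organize the computation is sketched below. It assumes that “either one or the other” is the inclusive “or”, so that \(P(R\ \text{or}\ C)=0.30\text{;}\) if the exclusive reading were intended, the numbers would change.

\begin{align*} P(R\ \text{and}\ C)\amp= P(R)+P(C)-P(R\ \text{or}\ C)=0.20+0.25-0.30=0.15\\ P(C|\text{not}\ R)\amp= \frac{P(C\ \text{and not}\ R)}{P(\text{not}\ R)}=\frac{0.25-0.15}{1-0.20}=\frac{0.10}{0.80}=0.125\\ P(R|C)\amp= \frac{P(R\ \text{and}\ C)}{P(C)}=\frac{0.15}{0.25}=0.6. \end{align*}

Under this reading the four regions of the Venn diagram are 0.05 (research only), 0.15 (both), 0.10 (conference only), and 0.70 (neither), which sum to 1.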

Activity 2.2.6. Conditional Probability and Loan Data.

We show how conditioning on one variable affects (if at all) another variable.

Recall that there are 10000 loans in the loans data set.

(a)

Run the following code to show how many loans were from applicants who owned their homes.

(b)

What proportion of the loans went to home owners?

(c)

Run the following code to subset the loans with grade “A”, and show how many loans this is.

(d)

Run the following code to show how many grade “A” loans were from applicants who were home owners.

(e)

What proportion of grade “A” loans went to home owners? How does this answer compare to what you found in (b)?

(f)

Run the following code to take a sample of 1000 loans and display a contingency table comparing loan grades to home-ownership from this sample. What proportion of the homeowners got grade “A” loans?
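The following Python sketch shows how a contingency table turns into conditional proportions. The ten records are a hypothetical toy list made up for illustration and are NOT drawn from the loans data; the labels `OWN`, `MORTGAGE`, and `RENT` and all variable names are assumptions as well.

```python
from collections import Counter

# Hypothetical toy records (grade, homeownership); NOT the real loans data
loans = [
    ("A", "OWN"), ("A", "MORTGAGE"), ("A", "RENT"), ("B", "OWN"),
    ("B", "RENT"), ("B", "RENT"), ("C", "MORTGAGE"), ("C", "RENT"),
    ("A", "OWN"), ("C", "OWN"),
]

table = Counter(loans)                    # count for each (grade, status) cell
grades = sorted({g for g, _ in loans})    # row labels
statuses = sorted({h for _, h in loans})  # column labels

# Print the contingency table: rows are grades, columns are ownership status
print("grade".ljust(6) + "".join(s.rjust(10) for s in statuses))
for g in grades:
    print(g.ljust(6) + "".join(str(table[(g, s)]).rjust(10) for s in statuses))

n_own = sum(table[(g, "OWN")] for g in grades)   # column total for OWN
n_A = sum(table[("A", s)] for s in statuses)     # row total for grade A
n_A_and_own = table[("A", "OWN")]                # single cell

p_own = n_own / len(loans)          # proportion of all loans to home owners
p_own_given_A = n_A_and_own / n_A   # proportion of grade A loans to home owners
p_A_given_own = n_A_and_own / n_own # proportion of home owners with grade A

print(p_own, p_own_given_A, p_A_given_own)
```

The same cell count is divided by a row total in one case and a column total in the other, which is the difference between conditioning on the grade and conditioning on home ownership.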

Subsection 2.2.2 Independence

Activity 2.2.7.

Suppose you roll a pair of dice, a blue die and a green die. Consider the possible blue-green outcomes:

\begin{equation*} \begin{array}{cccccc} (1,1)\amp(2,1)\amp(3,1)\amp(4,1)\amp(5,1)\amp(6,1)\\ (1,2)\amp(2,2)\amp(3,2)\amp(4,2)\amp(5,2)\amp(6,2)\\ (1,3)\amp(2,3)\amp(3,3)\amp(4,3)\amp(5,3)\amp(6,3)\\ (1,4)\amp(2,4)\amp(3,4)\amp(4,4)\amp(5,4)\amp(6,4)\\ (1,5)\amp(2,5)\amp(3,5)\amp(4,5)\amp(5,5)\amp(6,5)\\ (1,6)\amp(2,6)\amp(3,6)\amp(4,6)\amp(5,6)\amp(6,6)\\ \end{array} \end{equation*}

Let \(A\) denote the event “The blue die comes up a 5”, and \(B\) denote the event “The green die comes up even”.

(a)

Which of these outcomes correspond to \(A\text{?}\)

(b)

What is \(P(A)\text{?}\)

(c)

Which of these outcomes correspond to \(B\text{?}\)

(d)

What is \(P(B)\text{?}\)

(e)

Out of the outcomes which correspond to \(B\text{,}\) which of them correspond to \(A\text{?}\)

(f)

What is \(P(A|B)\text{?}\)

(g)

What do you notice about \(P(A)\) compared to \(P(A|B)\text{?}\) How does restricting to even green outcomes change (if at all) the likelihood of blue coming up a 5?
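The 36 outcomes can also be enumerated by machine. The Python sketch below is one way to do so; the variable names are illustrative assumptions, and each pair is read as (blue, green), matching the “blue-green” ordering above.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely (blue, green) outcomes
outcomes = list(product(range(1, 7), repeat=2))

A = [o for o in outcomes if o[0] == 5]       # A: the blue die comes up a 5
B = [o for o in outcomes if o[1] % 2 == 0]   # B: the green die comes up even
A_and_B = [o for o in B if o[0] == 5]        # outcomes in B that are also in A

p_A = Fraction(len(A), len(outcomes))          # share of all outcomes in A
p_B = Fraction(len(B), len(outcomes))          # share of all outcomes in B
p_A_given_B = Fraction(len(A_and_B), len(B))   # share of B's outcomes in A

print("P(A)   =", p_A)
print("P(B)   =", p_B)
print("P(A|B) =", p_A_given_B)
```

Comparing the first and last printed values answers the question of whether restricting to even green outcomes changes the likelihood of a blue 5.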

Activity 2.2.8. Simulating the double die.

We simulate 1000 rolls of a pair of dice.

(a)

Run the following code to simulate 1000 blue die rolls and 1000 green die rolls, and show the first 10 pairs.

(b)

Run the following code to see how many of these rolls satisfied \(A\text{:}\) “the blue die comes up a 5”.

In what proportion of the die rolls was \(A\) satisfied?

(c)

Run the following code to see how many of these rolls satisfied \(B\text{:}\) “the green die comes up even”.

In what proportion of the die rolls was \(B\) satisfied?

(d)

Run the following code to subset out the pairs satisfying \(B\) i.e. where the green die was even, and display the first 10 rows.

(e)

Run the following code to see how many of the rolls satisfying \(B\) also satisfied \(A\text{.}\)

In what proportion of the die rolls satisfying \(B\) was \(A\) also satisfied?
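A Python sketch of this simulation might look like the following. The language, the seed, and the names `blue`, `green`, `pairs`, and `pairs_B` are assumptions made for illustration, not the text's own code.

```python
import random

rng = random.Random(11)  # arbitrary fixed seed (an assumption)
n = 1000

blue = [rng.randint(1, 6) for _ in range(n)]    # 1000 blue die rolls
green = [rng.randint(1, 6) for _ in range(n)]   # 1000 green die rolls
pairs = list(zip(blue, green))                  # (blue, green) pairs
print(pairs[:10])                               # first 10 pairs

n_A = sum(1 for b, g in pairs if b == 5)        # A: blue die comes up a 5
n_B = sum(1 for b, g in pairs if g % 2 == 0)    # B: green die comes up even
print("proportion A:", n_A / n)
print("proportion B:", n_B / n)

# Subset the pairs satisfying B, then count those that also satisfy A
pairs_B = [(b, g) for b, g in pairs if g % 2 == 0]
print(pairs_B[:10])
n_A_in_B = sum(1 for b, g in pairs_B if b == 5)

if pairs_B:
    print("proportion of B-pairs satisfying A:", n_A_in_B / len(pairs_B))
```

Because these are simulated proportions, the first and last printed values will usually be close to each other without being exactly equal.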

Definition 2.2.5.

Events \(A, B\) are independent of each other if the outcome of one event does not change the probability of the other occurring. That is,

\begin{equation*} P(A|B)=P(A). \end{equation*}

Remark 2.2.6.

Assuming \(P(B)>0\text{,}\) events \(A, B\) are independent if and only if any one of the following equivalent equations holds:

\begin{align*} P(A|B)\amp= P(A)\\ \frac{P(A\ \text{and}\ B)}{P(B)}\amp= P(A)\\ P(A\ \text{and}\ B)\amp= P(A)P(B). \end{align*}

Remark 2.2.7.

We think of two events as “independent” if knowing the outcome of one event will not change how we think of the outcome of another event.

For example, “It rains today” and “I flip a coin and get a heads”. The chance of getting that heads remains unchanged whether or not it rains!

In general, it's hard to know if two events are independent or not! Things may seem unrelated, but share hidden connections and causes. The world is a complicated place!

Activity 2.2.9. Which are Independent?

(c)

In Activity 2.2.6, are “applicant owns a home” and “applicant receives a grade “A” loan” independent? Why or why not?

Activity 2.2.10. Cards and Independence.

Consider the standard 52 card deck, with 4 suits (Diamonds, Clubs, Hearts, Spades) and 13 values (2-10, Jack, Queen, King, Ace). Suppose we draw one card from this deck.

(a)

Suppose

(b)

What is the probability that you draw a Spade?

(c)

What is the probability that you draw an Ace?

(d)

What is the probability that you draw an Ace of Spades?

(e)

Are the events “drawing an Ace” and “drawing a Spade” independent? Why or why not?

(f)

Suppose I draw a second card without replacement. Is the event that this second card is an Ace independent of the event that the original card was an Ace? (Hint: Does whether or not the first card is an Ace affect the probability that the second card is one?)
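Both independence questions in this activity can be checked against the product rule \(P(A\ \text{and}\ B)=P(A)P(B)\) by enumerating the deck. The Python sketch below is one way to do that; the card representation as (value, suit) pairs and all variable names are assumptions chosen for illustration.

```python
from fractions import Fraction
from itertools import permutations, product

suits = ["Diamonds", "Clubs", "Hearts", "Spades"]
values = [str(v) for v in range(2, 11)] + ["Jack", "Queen", "King", "Ace"]
deck = list(product(values, suits))  # 52 (value, suit) cards

# One draw: compare P(Ace and Spade) with P(Ace) * P(Spade)
p_ace = Fraction(sum(1 for v, s in deck if v == "Ace"), len(deck))
p_spade = Fraction(sum(1 for v, s in deck if s == "Spades"), len(deck))
p_ace_of_spades = Fraction(
    sum(1 for v, s in deck if v == "Ace" and s == "Spades"), len(deck)
)
single_draw_independent = p_ace_of_spades == p_ace * p_spade

# Two draws without replacement: every ordered pair of distinct cards
draws = list(permutations(deck, 2))
p_first_ace = Fraction(sum(1 for c1, c2 in draws if c1[0] == "Ace"), len(draws))
p_second_ace = Fraction(sum(1 for c1, c2 in draws if c2[0] == "Ace"), len(draws))
p_both_aces = Fraction(
    sum(1 for c1, c2 in draws if c1[0] == "Ace" and c2[0] == "Ace"), len(draws)
)
two_draw_independent = p_both_aces == p_first_ace * p_second_ace

print("Ace and Spade independent:", single_draw_independent)
print("First Ace and second Ace independent:", two_draw_independent)
```

The second check fails because removing an Ace leaves only 3 Aces among the remaining 51 cards, so the product rule no longer holds.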

Activity 2.2.11. Cards and Independence part 2.

Consider the standard 52 card deck, with 4 suits (Diamonds, Clubs, Hearts, Spades) and 13 values (2-10, Jack, Queen, King, Ace). Suppose we draw a card from this deck, record the value, put it back, reshuffle, and draw again.

(a)

Using the meaning of independence (without using arithmetic), explain why the outcomes of these draws are independent.

(b)

What is the probability the first card is a Heart?

(c)

What is the probability the second card is a Heart?

(d)

Using the fact that the draws are independent, what is the probability the cards are both Hearts?

(e)

What is the probability that the first card is a Queen and the second is not a Queen?
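For draws with replacement, the product rule for independent events can be compared against a direct count of all ordered pairs of draws. The Python sketch below does both; the card representation and variable names are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product

suits = ["Diamonds", "Clubs", "Hearts", "Spades"]
values = [str(v) for v in range(2, 11)] + ["Jack", "Queen", "King", "Ace"]
deck = list(product(values, suits))  # 52 (value, suit) cards

# With replacement, the second draw sees the full deck again
draws = list(product(deck, repeat=2))  # 52 * 52 ordered pairs

p_heart = Fraction(13, 52)  # probability one draw is a Heart
p_queen = Fraction(4, 52)   # probability one draw is a Queen

# Product rule for independent draws
p_both_hearts = p_heart * p_heart
p_queen_then_not_queen = p_queen * (1 - p_queen)

# Direct counts over all ordered pairs, for comparison
count_both_hearts = sum(
    1 for c1, c2 in draws if c1[1] == "Hearts" and c2[1] == "Hearts"
)
count_queen_then_not = sum(
    1 for c1, c2 in draws if c1[0] == "Queen" and c2[0] != "Queen"
)

print(p_both_hearts, Fraction(count_both_hearts, len(draws)))
print(p_queen_then_not_queen, Fraction(count_queen_then_not, len(draws)))
```

Each printed line shows the product-rule value next to the counted value, and the two agree because replacing the card and reshuffling makes the second draw independent of the first.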