Probability Theory

In the final lectures of this class, we will outline basic probability theory and emphasize the importance of probabilistic modelling in biology.

Up until now, we have studied only deterministic models, in which future states are entirely specified by the current state of the system.

In the real world, however, chance plays a major role in the dynamics of a population. Lightning may strike an individual. A fire may decimate a population. Individuals may fail to reproduce or produce a bonanza crop of offspring. New beneficial mutations may, by happenstance, occur in individuals that leave no children. Drought and famine may occur, or rains and excess.

Probabilistic models include chance events and outcomes and can lead to results that differ from purely deterministic models.

In this lecture, we'll begin with some basic definitions and rules from probability theory.

For further information, read Theory of Probability by Joe Romano, from which I have drawn several of the following definitions.

What is a probability?

Frequency interpretation: "Probabilities are understood as mathematically convenient approximations to long run relative frequencies."
Subjective interpretation: "A probability statement expresses the opinion of some individual regarding how certain an event is to occur."
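
The frequency interpretation can be illustrated with a small simulation. The sketch below is only an illustration: the function name relative_frequency, the event probability 0.3, and the trial counts are all assumptions chosen for the example, not values from the lecture. It counts how often an event of known probability occurs in repeated independent trials, so that the relative frequency can be compared with the probability as the number of trials grows.

```python
import random

def relative_frequency(p, n_trials, seed=0):
    """Fraction of n_trials independent trials in which an event of probability p occurs."""
    rng = random.Random(seed)  # seeded so that a run can be repeated exactly
    hits = sum(1 for _ in range(n_trials) if rng.random() < p)
    return hits / n_trials

# Hypothetical event with probability 0.3, observed over more and more trials.
for n in (10, 1000, 100000):
    print(n, relative_frequency(0.3, n))
```

With few trials the relative frequency can sit far from 0.3; with many trials it should settle close to it.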

There is certain terminology that is useful in discussing probabilities:

Complement Rule: The probability that A does not occur is equal to the probability that the complement of event A occurs. P(A^c) = 1 - P(A).

Difference Rule: If A is a subset of B, then the probability of B occurring but not A is P(B) - P(A) = P(B A^c).

Inclusion-Exclusion Rule: The probability of either A or B (or both) occurring is P(A U B) = P(A) + P(B) - P(AB).

Example: Let A be the event of having green eyes and B the event of having brown hair. If the probability of having green eyes is 10%, the probability of having brown hair is 75%, and the probability of being a green-eyed, brown-haired person is 9%, what is the probability of:

not having green eyes? [find P(A^c)]
having green eyes but not brown hair? [find P(A) - P(AB)]
having green eyes and/or brown hair? [find P(A U B)]
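
One way to check answers to these three questions is to apply the rules directly in a few lines of Python. This is a minimal sketch that takes A to be "green eyes" and B to be "brown hair"; the variable names are chosen only for illustration.

```python
# Values from the example: A = green eyes, B = brown hair.
p_a = 0.10    # P(A)
p_b = 0.75    # P(B)
p_ab = 0.09   # P(AB), green eyes and brown hair

p_not_a = 1 - p_a             # complement rule: P(A^c)
p_a_not_b = p_a - p_ab        # difference rule, since AB is a subset of A
p_a_or_b = p_a + p_b - p_ab   # inclusion-exclusion rule: P(A U B)

print(round(p_not_a, 4), round(p_a_not_b, 4), round(p_a_or_b, 4))
# prints: 0.9 0.01 0.76
```

The rounding only hides floating-point noise; the underlying values are 90%, 1%, and 76%.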

Conditional Probability: The probability that A occurs given that B has occurred = P(A|B). In other words, among those cases where B has occurred, P(A|B) is the proportion of cases in which event A occurs.

Multiplication Rule: The probability of both A and B occurring is equal to the probability of B times the probability that A occurs given that B has occurred: P(AB) = P(B) P(A|B).

Consequently, the conditional probability is given by P(A|B) = P(AB)/P(B).

Similarly, the probability of both A and B occurring is equal to the probability of A times the probability that B occurs given that A has occurred: P(AB) = P(A) P(B|A), so P(B|A) = P(AB)/P(A).

Example: What is the probability that you will have brown hair if you have green eyes? [find P(B|A)]

What is the probability that you will have green eyes if you have brown hair? [find P(A|B)]
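
Both questions use the same joint probability but divide by a different marginal. The sketch below assumes the eye and hair colour values from the earlier example (A = green eyes, B = brown hair); the helper name conditional is hypothetical.

```python
def conditional(p_joint, p_condition):
    """P(X|Y) = P(XY) / P(Y), defined only when P(Y) > 0."""
    if p_condition <= 0:
        raise ValueError("the conditioning event must have positive probability")
    return p_joint / p_condition

p_green, p_brown, p_both = 0.10, 0.75, 0.09

p_brown_given_green = conditional(p_both, p_green)  # P(B|A)
p_green_given_brown = conditional(p_both, p_brown)  # P(A|B)

print(round(p_brown_given_green, 4), round(p_green_given_brown, 4))
# prints: 0.9 0.12
```

The two conditional probabilities differ greatly (90% versus 12%) because green eyes are much rarer than brown hair.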

Bayes' Rule:

P(B|A) = P(B) P(A|B) / P(A)

This formula relates the conditional probability of B given A to the conditional probability of A given B.

Example: Ability to taste phenylthiocarbamide (PTC) is thought to be determined by a single dominant gene with incomplete penetrance. Among North American whites, there is a 70% chance of being able to taste PTC [P(taster) = 0.7]. If everybody who tastes PTC is a carrier [P(carrier|taster) = 1] and if 80% of the population carries the gene [P(carrier) = 0.8], what is the penetrance of the gene? That is, what is the probability of tasting PTC if you are a carrier, P(taster|carrier)?
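
The PTC question is a direct application of Bayes' Rule, with B = taster and A = carrier. A minimal sketch follows; the function name bayes and its argument names are assumptions made for this illustration.

```python
def bayes(p_b, p_a_given_b, p_a):
    """Bayes' Rule: P(B|A) = P(B) P(A|B) / P(A), defined only when P(A) > 0."""
    if p_a <= 0:
        raise ValueError("P(A) must be positive")
    return p_b * p_a_given_b / p_a

# B = taster, A = carrier, using the values given in the example.
penetrance = bayes(p_b=0.7, p_a_given_b=1.0, p_a=0.8)

print(round(penetrance, 3))
# prints: 0.875
```

Under the stated assumptions, a carrier has an 87.5% chance of being able to taste PTC.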

Average Formula: Say that all possible outcomes can be completely partitioned into n mutually exclusive events B₁, B₂, ..., B_n. Then the overall probability of A is equal to the average probability of A within each of these events, weighted by the probability of that event:

P(A) = P(A|B₁) P(B₁) + P(A|B₂) P(B₂) + ... + P(A|B_n) P(B_n)

Example: What is the overall probability of dying of malaria in a region where the chance of dying of malaria is 15% in individuals that do not carry the sickle cell allele and 1% in carriers? Assume that the frequency of carriers is 0.25.

P(dying) = P(dying|non-carrier) P(non-carrier) + P(dying|carrier) P(carrier) = 0.15 * 0.75 + 0.01 * 0.25 = 0.115

Thus the overall probability of dying of malaria is 11.5%, which is substantially lower than the 15% expected if the sickle cell allele were absent from the population.
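
The same weighted average can be written as a short function that works for any number of groups. This is a sketch only; the name total_probability and the check that the group probabilities sum to one are choices made for this illustration.

```python
def total_probability(cond_probs, group_probs):
    """Average formula: the sum over groups of P(A|B_i) * P(B_i)."""
    if len(cond_probs) != len(group_probs):
        raise ValueError("need one conditional probability per group")
    if abs(sum(group_probs) - 1.0) > 1e-9:
        raise ValueError("group probabilities must sum to 1")
    return sum(c * g for c, g in zip(cond_probs, group_probs))

# Malaria example: groups are non-carriers and carriers of the sickle cell allele.
p_dying = total_probability(cond_probs=[0.15, 0.01], group_probs=[0.75, 0.25])

print(round(p_dying, 4))
# prints: 0.115
```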

Independence: If the chance of A does not depend on whether or not B occurs, then we say that A and B are independent.

For independent events ONLY,

P(A|B) = P(A)
P(AB) = P(A) P(B)

Example: In the US, the frequency of the O blood type is about 0.45 and the frequency of Rh+ is about 0.86. What proportion of the population would have an O+ blood type if these were independent genes?
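
Under the independence assumption the blood type question reduces to a single multiplication. The sketch below also includes a small check, is_independent (a hypothetical helper with an arbitrary tolerance), applied to the green eyes and brown hair values from the earlier example to show a pair of events that is not independent.

```python
def is_independent(p_a, p_b, p_ab, tol=1e-9):
    """True when P(AB) equals P(A) P(B) to within the tolerance tol."""
    return abs(p_ab - p_a * p_b) <= tol

# Blood types: assuming independence, P(O and Rh+) = P(O) * P(Rh+).
p_o, p_rh_pos = 0.45, 0.86
p_o_pos = p_o * p_rh_pos
print(round(p_o_pos, 4))
# prints: 0.387

# Green eyes (0.10) and brown hair (0.75): the product is 0.075, not the observed 0.09.
print(is_independent(0.10, 0.75, 0.09))
# prints: False
```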

If two objects are drawn randomly from a pool with replacement, then the two draws are independent, since the first observation has no impact on the second.

Example: If there are N males in a population and a particular female chooses a mate randomly from among the males, what is the probability that, if she mates twice in the season, she will mate with a particular male both times?
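
Treating the two matings as independent draws with replacement, with each of the N males equally likely to be chosen each time, the multiplication rule for independent events gives (1/N) * (1/N). The sketch below uses exact fractions; the function name and the value N = 10 are assumptions chosen only for illustration.

```python
from fractions import Fraction

def prob_particular_male_twice(n_males):
    """Probability that one particular male is chosen in both of two independent, random choices."""
    if n_males < 1:
        raise ValueError("there must be at least one male")
    p_single = Fraction(1, n_males)   # chance of choosing him on any one occasion
    return p_single * p_single        # independence: multiply the two chances

print(prob_particular_male_twice(10))
# prints: 1/100
```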
