Knowlet

Unit 1: Random Variables and Distributions

1.1 Univariate Random Variables

Definition of a Random Variable

A random variable (often denoted as X, Y, etc.) is a function that assigns a unique numerical value to each possible outcome in the sample space (S) of a random experiment.

It's a "variable" because it can take different values, and "random" because the specific value it takes is determined by the outcome of a random phenomenon.

Example: Consider tossing two fair coins.
  • The sample space is S = {HH, HT, TH, TT}.
  • Let the random variable X be "the number of heads."
  • X is a function that maps outcomes to numbers:
    • X(HH) = 2
    • X(HT) = 1
    • X(TH) = 1
    • X(TT) = 0
  • The possible values for X are {0, 1, 2}.
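
The mapping above can be sketched in a few lines of Python. This is only an illustration: the names `sample_space`, `X`, and `mapping` are chosen here and do not come from the original notes.

```python
from itertools import product

# Sample space for two coin tosses: every ordered pair of H/T.
sample_space = ["".join(toss) for toss in product("HT", repeat=2)]

def X(outcome: str) -> int:
    """Random variable: the number of heads in an outcome such as 'HT'."""
    return outcome.count("H")

# Apply X to each outcome, mirroring X(HH) = 2, X(HT) = 1, and so on.
mapping = {outcome: X(outcome) for outcome in sample_space}
possible_values = sorted(set(mapping.values()))

print(mapping)          # {'HH': 2, 'HT': 1, 'TH': 1, 'TT': 0}
print(possible_values)  # [0, 1, 2]
```

Note that X itself is deterministic: the randomness lives entirely in which outcome of the experiment occurs.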

Discrete vs. Continuous Random Variables

Random variables are broadly classified into two types:

Discrete
  • Definition: A random variable that can take on a finite or countably infinite number of distinct values.
  • Values are: Counted. There are "gaps" between possible values.
  • Examples:
    • Number of heads in 3 coin flips (0, 1, 2, 3)
    • Number of cars passing a toll booth in an hour (0, 1, 2, ...)
    • Number of defective items in a batch (0, 1, ..., n)

Continuous
  • Definition: A random variable that can take on any value within a given range or interval.
  • Values are: Measured. There are no gaps between values.
  • Examples:
    • Height of a student (e.g., any value between 150cm and 190cm)
    • Temperature of a room
    • Time until a light bulb burns out

1.2 PMF, PDF, and CDF

We describe the probability of different values of a random variable using a distribution function.

Probability Mass Function (p.m.f.)

  • Used for: Discrete Random Variables.
  • Definition: A function p(x) that gives the probability that the discrete random variable X is exactly equal to some value x.
  • p(x) = P(X = x)
  • Properties:
    1. p(x) ≥ 0 for all x (Probabilities can't be negative).
    2. Σ p(x) = 1 (The sum of probabilities for all possible outcomes must be 1).
  • Example (Two Coin Toss): For X = number of heads, the p.m.f. is:
    • p(0) = P(X=0) = P(TT) = 1/4
    • p(1) = P(X=1) = P(HT or TH) = 2/4 = 1/2
    • p(2) = P(X=2) = P(HH) = 1/4
    • Check: (1/4) + (1/2) + (1/4) = 1.
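
As a rough check of the two-coin p.m.f., the probabilities can be computed by counting equally likely outcomes. The sketch below assumes fair coins (as in the example) and uses exact fractions; the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

outcomes = ["".join(t) for t in product("HT", repeat=2)]
# With fair coins, each of the 4 outcomes has the same probability.
prob_per_outcome = Fraction(1, len(outcomes))

# Count how many outcomes give each number of heads, then scale.
counts = Counter(o.count("H") for o in outcomes)
pmf = {x: n * prob_per_outcome for x, n in sorted(counts.items())}

for x, p in pmf.items():
    print(f"p({x}) = {p}")
# p(0) = 1/4
# p(1) = 1/2
# p(2) = 1/4

# The two defining properties of a p.m.f.
assert all(p >= 0 for p in pmf.values())
assert sum(pmf.values()) == 1
```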

Probability Density Function (p.d.f.)

  • Used for: Continuous Random Variables.
  • Definition: A function f(x) where the area under the curve between two points (a, b) gives the probability that X falls within that interval.
  • P(a ≤ X ≤ b) = ∫_a^b f(x) dx
  • Properties:
    1. f(x) ≥ 0 for all x (The density curve cannot dip below the x-axis).
    2. ∫_{-∞}^{+∞} f(x) dx = 1 (The total area under the entire curve must be 1).
Common Mistake: Reading f(a) as the probability that X = a. For a continuous random variable, the probability of any single, exact value is zero:

P(X = a) = ∫_a^a f(x) dx = 0.

There is no "area" under a single point, so probability accumulates only over intervals. This also means P(a ≤ X ≤ b) is the same as P(a < X < b).
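
A small numerical sketch can make the "probability is area" idea concrete. The density f(x) = 2x on (0, 1) is an assumption picked for illustration, and `prob` is a hypothetical helper that approximates the integral with the midpoint rule.

```python
def f(x: float) -> float:
    """Illustrative density: f(x) = 2x on (0, 1), zero elsewhere."""
    return 2.0 * x if 0.0 < x < 1.0 else 0.0

def prob(a: float, b: float, n: int = 10_000) -> float:
    """Approximate P(a <= X <= b) as the area under f, using the midpoint rule."""
    width = (b - a) / n
    return sum(f(a + (i + 0.5) * width) for i in range(n)) * width

print(round(prob(0.0, 1.0), 6))    # total area, approximately 1
print(round(prob(0.25, 0.5), 6))   # area over an interval, approximately 0.1875
print(prob(0.5, 0.5))              # a single point has zero width, so 0.0
print(f(0.9))                      # 1.8: a density value, not a probability
```

The last line shows why f(x) should not be read as a probability: a density may exceed 1 as long as the total area is 1.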

Cumulative Distribution Function (c.d.f.)

  • Used for: Both Discrete and Continuous Random Variables.
  • Definition: A function F(x) that gives the probability that the random variable X is less than or equal to a specific value x.
  • F(x) = P(X ≤ x)
  • For Discrete X: F(x) = Σ_{t ≤ x} p(t)
    • The C.D.F. is a "step function" that jumps up at each possible value of X.
  • For Continuous X: F(x) = ∫_{-∞}^{x} f(t) dt
    • The C.D.F. is a continuous, non-decreasing function.
    • Relationship to p.d.f.: You can get the p.d.f. by differentiating the c.d.f.: f(x) = d/dx F(x).
  • Universal Properties of C.D.F.:
    1. 0 ≤ F(x) ≤ 1 (It is a probability).
    2. F(x) is non-decreasing (i.e., if a < b, then F(a) ≤ F(b)).
    3. lim_{x→-∞} F(x) = 0 (As x decreases without bound, the event X ≤ x becomes impossible).
    4. lim_{x→+∞} F(x) = 1 (As x increases without bound, the event X ≤ x becomes certain).
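
For the discrete case, the step-function behaviour can be seen by evaluating F at a few points. This sketch reuses the two-coin p.m.f. from earlier; the function name `cdf` is illustrative.

```python
from fractions import Fraction

# p.m.f. of X = number of heads in two fair coin tosses.
pmf = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

def cdf(x: float) -> Fraction:
    """F(x) = P(X <= x): add up p(t) for every possible value t <= x."""
    return sum((p for t, p in pmf.items() if t <= x), Fraction(0))

for x in (-1, 0, 0.5, 1, 1.5, 2, 3):
    print(f"F({x}) = {cdf(x)}")
# F(-1) = 0
# F(0) = 1/4
# F(0.5) = 1/4
# F(1) = 3/4
# F(1.5) = 3/4
# F(2) = 1
# F(3) = 1
```

F stays flat between possible values of X and jumps by p(x) at each one, which matches the four properties listed above.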

1.3 Univariate Transformations

Often, we are interested in a function of a random variable. If we know the distribution of X, can we find the distribution of Y = g(X)?

Discrete Case

This is straightforward. The p.m.f. of Y is found by summing the probabilities of all x values that map to a given y value.

pY(y) = P(Y=y) = P(g(X) = y) = Σ_{x : g(x) = y} pX(x)

Example: Let X have p.m.f. p(-1)=0.1, p(0)=0.3, p(1)=0.4, p(2)=0.2.
Find the p.m.f. of Y = X².
  • Y can take the values {(-1)², 0², 1², 2²} = {1, 0, 1, 4}, i.e. the distinct values {0, 1, 4}.
  • pY(0) = P(Y=0) = P(X²=0) = P(X=0) = 0.3
  • pY(1) = P(Y=1) = P(X²=1) = P(X=-1 or X=1) = pX(-1) + pX(1) = 0.1 + 0.4 = 0.5
  • pY(4) = P(Y=4) = P(X²=4) = P(X=2) = 0.2
  • The p.m.f. for Y is: pY(0)=0.3, pY(1)=0.5, pY(4)=0.2. (Check: 0.3+0.5+0.2 = 1).
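
The same "pool the probability of every x that maps to y" rule can be written as a short function. This is a minimal sketch; `transform_pmf` is a hypothetical helper name, and exact fractions are used to avoid rounding noise.

```python
from collections import defaultdict
from fractions import Fraction

# p.m.f. of X from the example: p(-1)=0.1, p(0)=0.3, p(1)=0.4, p(2)=0.2.
p_x = {
    -1: Fraction(1, 10),
    0: Fraction(3, 10),
    1: Fraction(4, 10),
    2: Fraction(2, 10),
}

def transform_pmf(pmf, g):
    """Return the p.m.f. of Y = g(X) by summing p(x) over all x with g(x) = y."""
    p_y = defaultdict(Fraction)  # missing keys start at Fraction(0)
    for x, p in pmf.items():
        p_y[g(x)] += p
    return dict(sorted(p_y.items()))

p_y = transform_pmf(p_x, lambda x: x ** 2)
print({y: float(p) for y, p in p_y.items()})  # {0: 0.3, 1: 0.5, 4: 0.2}
```

Because x = -1 and x = 1 both map to y = 1, their probabilities are added, exactly as in the hand calculation.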

Continuous Case (Change of Variable Technique)

This is more complex and requires calculus. If Y = g(X), where g is differentiable and strictly monotonic (strictly increasing or strictly decreasing) on the range of X, we can find the p.d.f. of Y directly.

  1. Find the inverse function: x = g⁻¹(y).
  2. Find the derivative of the inverse function: dx/dy.
  3. The p.d.f. of Y is given by the formula:
fY(y) = fX(g⁻¹(y)) * |dx/dy|

The |dx/dy| term is the absolute value of the Jacobian of the inverse transformation. It rescales the density so that the total area under fY remains 1.

Example: Let X be a continuous RV with p.d.f. fX(x) = 2x, for 0 < x < 1.
Find the p.d.f. of Y = 8X³.
  1. Find range of Y: If 0 < x < 1, then 0 < 8x³ < 8. So, 0 < y < 8.
  2. Find inverse: y = 8x³ => x³ = y/8 => x = (y/8)^(1/3) = y^(1/3) / 2. So, g⁻¹(y) = y^(1/3) / 2.
  3. Find derivative: dx/dy = d/dy ( (1/2) * y^(1/3) ) = (1/2) * (1/3) * y^(-2/3) = 1 / (6y^(2/3)).
  4. Apply formula:
    • fY(y) = fX(g⁻¹(y)) * |dx/dy|
    • fY(y) = 2(g⁻¹(y)) * |1 / (6y^(2/3))|
    • fY(y) = 2(y^(1/3) / 2) * (1 / (6y^(2/3))) (Since y > 0, absolute value is not needed)
    • fY(y) = (y^(1/3)) * (1 / (6y^(2/3))) = y^(1/3 - 2/3) / 6 = y^(-1/3) / 6
  5. Final PDF: fY(y) = 1 / (6y^(1/3)), for 0 < y < 8.
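
One way to sanity-check a derived density is by simulation. The sketch below is not part of the original derivation and makes two assumptions. First, X is sampled by inverse-transform sampling: FX(x) = x² on (0, 1), so X = sqrt(U) with U uniform. Second, the derived fY is compared through its integral FY(y) = y^(2/3) / 4. The names `ys`, `cdf_y`, and `empirical_cdf` are illustrative.

```python
import random

random.seed(0)
n = 200_000

# F_X(x) = x^2 on (0, 1), so X = sqrt(U) with U ~ Uniform(0, 1) has density 2x.
# Each sample of X is then pushed through Y = 8 X^3.
ys = [8.0 * (random.random() ** 0.5) ** 3 for _ in range(n)]

def cdf_y(y: float) -> float:
    """Integral of f_Y(t) = t^(-1/3) / 6 from 0 to y, which is y^(2/3) / 4."""
    return y ** (2.0 / 3.0) / 4.0

def empirical_cdf(y: float) -> float:
    """Fraction of simulated Y values that are <= y."""
    return sum(v <= y for v in ys) / n

for y in (1.0, 4.0, 8.0):
    print(f"P(Y <= {y}): simulated {empirical_cdf(y):.3f}, formula {cdf_y(y):.3f}")
```

If the change-of-variable result is right, the simulated and formula columns should agree to about two decimal places at this sample size.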

1.4 Two-Dimensional (Bivariate) Random Variables

We often need to study two or more random variables simultaneously. A bivariate random variable is an ordered pair (X, Y) that maps each outcome in a sample space S to a point in the 2D plane.

  • (Discrete, Discrete): (Number of heads, Number of tails).
  • (Continuous, Continuous): (Height, Weight).
  • (Discrete, Continuous): (Number of children in a family, Annual income).

1.5 Joint, Marginal, and Conditional Distributions

Joint p.m.f. and p.d.f.

This is the 2D equivalent of a p.m.f./p.d.f. It describes the probability of X and Y *simultaneously* taking on certain values.

  • Joint p.m.f. (Discrete): p(x, y) = P(X=x, Y=y)
    • Properties: 1. p(x,y) ≥ 0, 2. Σ_x Σ_y p(x,y) = 1
  • Joint p.d.f. (Continuous): f(x, y)
    • Properties: 1. f(x,y) ≥ 0, 2. ∫_{-∞}^{+∞} ∫_{-∞}^{+∞} f(x,y) dx dy = 1
    • Probability is volume: P(a ≤ X ≤ b, c ≤ Y ≤ d) = ∫_c^d ∫_a^b f(x,y) dx dy.

Joint c.d.f.

F(x, y) = P(X ≤ x, Y ≤ y)

  • Discrete: F(x, y) = Σ_{s ≤ x} Σ_{t ≤ y} p(s, t)
  • Continuous: F(x, y) = ∫_{-∞}^{y} ∫_{-∞}^{x} f(s, t) ds dt
  • We can get the joint p.d.f. from the c.d.f.: f(x,y) = ∂²F(x,y) / (∂x ∂y).

Marginal Distributions

The marginal distribution of X is the individual probability distribution of X, "ignoring" Y. We get it by "summing out" or "integrating out" the other variable from the joint distribution.

  • Marginal p.m.f. for X (Discrete):
    pX(x) = P(X=x) = Σ_y p(x, y)

    Think of this as summing across the rows in a joint probability table.

  • Marginal p.d.f. for X (Continuous):
    fX(x) = ∫_{-∞}^{+∞} f(x, y) dy

    This gives the individual p.d.f. for X. The same logic applies for finding the marginal distribution of Y (sum/integrate over x).
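
"Summing out" the other variable is easy to see on a small joint table. The joint p.m.f. below is a hypothetical example built from two fair coin tosses (X = 1 if the first toss is heads, Y = total number of heads); it is not taken from the original notes, and `marginal` is an illustrative helper.

```python
from collections import defaultdict
from fractions import Fraction

# Hypothetical joint p.m.f. for two fair coin tosses:
# X = 1 if the first toss is heads (else 0), Y = total number of heads.
joint = {
    (0, 0): Fraction(1, 4),  # TT
    (0, 1): Fraction(1, 4),  # TH
    (1, 1): Fraction(1, 4),  # HT
    (1, 2): Fraction(1, 4),  # HH
}

def marginal(joint_pmf, axis):
    """Sum the joint p.m.f. over the other variable (axis 0 keeps X, axis 1 keeps Y)."""
    out = defaultdict(Fraction)
    for pair, p in joint_pmf.items():
        out[pair[axis]] += p
    return dict(sorted(out.items()))

p_x = marginal(joint, 0)
p_y = marginal(joint, 1)
print({x: str(p) for x, p in p_x.items()})  # {0: '1/2', 1: '1/2'}
print({y: str(p) for y, p in p_y.items()})  # {0: '1/4', 1: '1/2', 2: '1/4'}
```

The marginal of Y recovers the "number of heads" p.m.f. from Section 1.2, as it should.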

Conditional Distributions

The conditional distribution describes the probability of one variable *given that we know* the value of the other. It's like taking a "slice" of the joint distribution.

  • Conditional p.m.f. of Y given X=x:
    p(y | x) = P(Y=y | X=x) = P(X=x, Y=y) / P(X=x) = p(x, y) / pX(x), defined when pX(x) > 0
  • Conditional p.d.f. of Y given X=x:
    f(y | x) = f(x, y) / fX(x), defined when fX(x) > 0
Key Relationship: The joint distribution is the product of the marginal and the conditional.

f(x, y) = f(y | x) * fX(x) and f(x, y) = f(x | y) * fY(y)

This is just a rearrangement of the conditional formula and is extremely useful in proofs.
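
The "slice and renormalize" idea behind a conditional p.m.f. can be sketched as follows. The joint table is a hypothetical example (X = 1 if the first of two fair tosses is heads, Y = total number of heads), and `conditional_y_given_x` is an illustrative helper name.

```python
from fractions import Fraction

# Hypothetical joint p.m.f.: X = first toss is heads (0/1), Y = total heads.
joint = {
    (0, 0): Fraction(1, 4),
    (0, 1): Fraction(1, 4),
    (1, 1): Fraction(1, 4),
    (1, 2): Fraction(1, 4),
}

def conditional_y_given_x(joint_pmf, x):
    """Return p(y | x) = p(x, y) / pX(x) for the given x."""
    p_x = sum((p for (xi, _), p in joint_pmf.items() if xi == x), Fraction(0))
    if p_x == 0:
        raise ValueError("cannot condition on an event with probability zero")
    return {y: p / p_x for (xi, y), p in joint_pmf.items() if xi == x}

cond = conditional_y_given_x(joint, 1)
print({y: str(p) for y, p in cond.items()})  # {1: '1/2', 2: '1/2'}

# Key relationship: p(x, y) = p(y | x) * pX(x), here with pX(1) = 1/2.
assert all(cond[y] * Fraction(1, 2) == joint[(1, y)] for y in cond)
```

Given that the first toss was heads, the total is either 1 or 2 with equal probability, so the slice sums to 1 after dividing by pX(1).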

1.6 Independence of Variables

Definition: Two random variables X and Y are independent if and only if their joint distribution function factors into the product of their individual marginal distribution functions.
  • For all (x, y):
    • Joint c.d.f.: F(x, y) = FX(x) * FY(y)
    • Joint p.m.f./p.d.f.: f(x, y) = fX(x) * fY(y)

If X and Y are independent, then the conditional distribution is equal to the marginal distribution:

f(y | x) = f(x, y) / fX(x) = (fX(x) * fY(y)) / fX(x) = fY(y)

This makes intuitive sense: if the variables are independent, knowing the value of X gives you no new information about Y.
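
For a finite joint p.m.f., the factorization test can be checked cell by cell. Both joint tables below are hypothetical examples based on two fair coin tosses, and `is_independent` is an illustrative helper, not a library function.

```python
from fractions import Fraction
from itertools import product

def is_independent(joint_pmf):
    """Check p(x, y) == pX(x) * pY(y) for every (x, y) pair, including zero cells."""
    xs = sorted({x for x, _ in joint_pmf})
    ys = sorted({y for _, y in joint_pmf})
    zero = Fraction(0)
    p_x = {x: sum(joint_pmf.get((x, y), zero) for y in ys) for x in xs}
    p_y = {y: sum(joint_pmf.get((x, y), zero) for x in xs) for y in ys}
    return all(
        joint_pmf.get((x, y), zero) == p_x[x] * p_y[y]
        for x, y in product(xs, ys)
    )

# (first toss, second toss), each coded 0/1: all four cells equal 1/4.
two_tosses = {(x, y): Fraction(1, 4) for x, y in product((0, 1), repeat=2)}

# (first toss, total number of heads): the total depends on the first toss.
first_and_total = {
    (0, 0): Fraction(1, 4),
    (0, 1): Fraction(1, 4),
    (1, 1): Fraction(1, 4),
    (1, 2): Fraction(1, 4),
}

print(is_independent(two_tosses))       # True
print(is_independent(first_and_total))  # False
```

In the second table, p(0, 0) = 1/4 while pX(0) * pY(0) = 1/2 * 1/4 = 1/8, so the factorization fails in at least one cell and the variables are dependent.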

1.7 Bivariate Transformations

This extends the univariate case. We have (X, Y) and want to find the joint p.d.f. of new variables, U and V, where:

U = g₁(X, Y) and V = g₂(X, Y)

The Jacobian Method (Change of Variables)

  1. Define the transformations: U = g₁(X, Y) and V = g₂(X, Y).
  2. Solve for the inverse functions: X = h₁(U, V) and Y = h₂(U, V).
  3. Calculate the Jacobian determinant (J) of the inverse transformation. This is the determinant of a matrix of partial derivatives:
J = | ∂x/∂u  ∂x/∂v |
    | ∂y/∂u  ∂y/∂v | = (∂x/∂u)(∂y/∂v) - (∂x/∂v)(∂y/∂u)
  4. The new joint p.d.f. for U and V is:
fU,V(u, v) = fX,Y(h₁(u, v), h₂(u, v)) * |J|

You must also transform the domain (the range of possible x, y values) into the new domain for u, v.

Classic Exam Problem: Let X and Y be independent Exponential(λ) variables. Find the joint distribution of U = X+Y and V = X/(X+Y).

You will find that U and V are independent, where U is a Gamma variable with shape parameter 2 (and the same λ) and V is a Beta(1, 1) variable, i.e. Uniform on (0, 1). This is a very common and important transformation.
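
A worked sketch of this problem using the Jacobian steps follows. It assumes the rate parameterization fX(x) = λe^(-λx) for x > 0, which the notes do not state explicitly; under a scale parameterization the algebra is the same with λ replaced by 1/λ.

```latex
\begin{aligned}
&\text{Inverse: } x = uv, \quad y = u(1 - v), \qquad u > 0,\ 0 < v < 1 \\[4pt]
&J = \det\begin{pmatrix}
      \partial x/\partial u & \partial x/\partial v \\
      \partial y/\partial u & \partial y/\partial v
    \end{pmatrix}
   = \det\begin{pmatrix} v & u \\ 1 - v & -u \end{pmatrix}
   = -uv - u(1 - v) = -u, \qquad |J| = u \\[4pt]
&f_{U,V}(u, v) = f_X(uv)\, f_Y\big(u(1 - v)\big)\, |J|
   = \lambda e^{-\lambda uv} \cdot \lambda e^{-\lambda u(1 - v)} \cdot u
   = \lambda^2 u\, e^{-\lambda u} \\[4pt]
&f_{U,V}(u, v) = \underbrace{\lambda^2 u\, e^{-\lambda u}}_{\text{Gamma}(2,\lambda)\ \text{density in } u}
   \cdot \underbrace{1}_{\text{Beta}(1,1)\ \text{density in } v},
   \qquad u > 0,\ 0 < v < 1
\end{aligned}
```

Because the joint density factors into a function of u times a function of v, and the new domain is a rectangle in (u, v), U and V are independent.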
