Knowlet

Unit 2: Mathematical Expectation

2.1 Mathematical Expectation of a Random Variable

Definition of Expectation (E[X])

The mathematical expectation (or expected value, mean) of a random variable X is the weighted average of all possible values that X can take, with the weights being their respective probabilities.

It represents the long-run average value of X if the experiment were repeated many times. It is often denoted as μ (mu).

  • For a Discrete RV (X):
    E[X] = Σ_x x * p(x)

    Sum of (value * probability of value) for all possible values x.

  • For a Continuous RV (X):
    E[X] = ∫_{-∞}^{+∞} x * f(x) dx

    Integral of (value * density at value) over the entire range of x.

Expectation of a Function g(X)

We often need the expectation of a function of X, say Y = g(X). We can find E[Y] *without* first finding the distribution (p.m.f. or p.d.f.) of Y, using the Law of the Unconscious Statistician (LOTUS).

  • Discrete:
    E[g(X)] = Σ_x g(x) * p(x)
  • Continuous:
    E[g(X)] = ∫_{-∞}^{+∞} g(x) * f(x) dx

Example: Let X be a discrete RV with p.m.f. p(0)=0.5, p(1)=0.3, p(2)=0.2.
Find E[X] and E[X²].
  • E[X] = (0 * p(0)) + (1 * p(1)) + (2 * p(2)) = (0 * 0.5) + (1 * 0.3) + (2 * 0.2) = 0 + 0.3 + 0.4 = 0.7
  • E[X²] = (0² * p(0)) + (1² * p(1)) + (2² * p(2)) = (0 * 0.5) + (1 * 0.3) + (4 * 0.2) = 0 + 0.3 + 0.8 = 1.1
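
The arithmetic above can be checked with a few lines of Python. This is only a sketch: the helper name `expectation` is made up for illustration, and exact fractions are used to avoid floating-point rounding.

```python
from fractions import Fraction

# p.m.f. from the example above, stored as exact fractions.
pmf = {0: Fraction(5, 10), 1: Fraction(3, 10), 2: Fraction(2, 10)}

def expectation(pmf, g=lambda x: x):
    """Return E[g(X)] = sum of g(x) * p(x) over the support (LOTUS)."""
    return sum(g(x) * p for x, p in pmf.items())

mean = expectation(pmf)                              # E[X]   = 7/10
second_moment = expectation(pmf, lambda x: x ** 2)   # E[X^2] = 11/10
print(float(mean), float(second_moment))
```

Note that E[X²] = 1.1 is not (E[X])² = 0.49; in general E[g(X)] differs from g(E[X]).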

2.2 Properties of Expectation (Addition & Multiplication Theorems)

Expectation has several key properties that make it a powerful tool.

  1. E[c] = c

    The expected value of a constant (c) is just the constant itself. (e.g., E[5] = 5).

  2. E[c * X] = c * E[X]

    Constants can be factored out of an expectation.

  3. E[aX + b] = a * E[X] + b (follows from properties 1 and 2 together with the Addition Theorem, property 4)

    Expectation is a linear operator.

  4. Addition Theorem: E[X + Y] = E[X] + E[Y]

    The expectation of a sum is the sum of the expectations. This is known as the Linearity of Expectation.

    Crucial Exam Point: The Addition Theorem E[X + Y] = E[X] + E[Y] holds whether or not X and Y are independent. This is a very powerful property and a common exam question.

  5. Multiplication Theorem: E[X * Y] = E[X] * E[Y] (if X and Y are independent)

    The expectation of a product is the product of the expectations *only if* the variables are independent.

    Warning: The reverse is not true! If E[XY] = E[X]E[Y], it does not necessarily mean X and Y are independent. It only means they are uncorrelated (a weaker condition, which we'll see next).
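
A small counterexample makes the warning concrete. The sketch below uses a hypothetical pair that is not from the notes: X is uniform on {-1, 0, 1} and Y = X², so Y is fully determined by X. The helper `expect` is an illustrative name.

```python
from fractions import Fraction

# Hypothetical example: X uniform on {-1, 0, 1}, Y = X**2 (so Y depends on X).
third = Fraction(1, 3)
joint = {(-1, 1): third, (0, 0): third, (1, 1): third}

def expect(g):
    """E[g(X, Y)] computed from the joint p.m.f."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

e_x = expect(lambda x, y: x)         # 0
e_y = expect(lambda x, y: y)         # 2/3
e_sum = expect(lambda x, y: x + y)   # 2/3
e_prod = expect(lambda x, y: x * y)  # 0

# Addition theorem holds despite the dependence.
print(e_sum == e_x + e_y)            # True
# E[XY] = E[X]E[Y] also holds here, yet X and Y are NOT independent:
print(e_prod == e_x * e_y)           # True
# P(X=0, Y=0) = 1/3, but P(X=0) * P(Y=0) = 1/3 * 1/3 = 1/9.
p_x0 = sum(p for (x, y), p in joint.items() if x == 0)
p_y0 = sum(p for (x, y), p in joint.items() if y == 0)
print(joint[(0, 0)] == p_x0 * p_y0)  # False
```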

2.3 Variance and Covariance

Variance (Var(X) or σ²)

While expectation tells us the "center" of a distribution, variance tells us about its "spread" or "dispersion." A small variance means data points are clustered tightly around the mean. A large variance means they are spread out.

  • Definition: Variance is the expected value of the squared deviation from the mean (μ = E[X]):
    Var(X) = E[ (X - μ)² ]
  • Computational Formula (Very Important): This is almost always used for calculations:
    Var(X) = E[X²] - (E[X])²
  • Standard Deviation (σ): The square root of the variance: σ = sqrt(Var(X)). It is in the same units as X, making it more interpretable.
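
The computational formula follows from the definition by expanding the square and applying linearity of expectation (properties 1, 2, and 4 from Section 2.2), with μ = E[X] treated as a constant. A short derivation:

```latex
\begin{aligned}
\operatorname{Var}(X) &= E\big[(X-\mu)^2\big] \\
                      &= E\big[X^2 - 2\mu X + \mu^2\big] \\
                      &= E[X^2] - 2\mu\,E[X] + \mu^2 \\
                      &= E[X^2] - 2\mu^2 + \mu^2 \\
                      &= E[X^2] - (E[X])^2 .
\end{aligned}
```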

Properties of Variance

  1. Var(X) ≥ 0 (Variance can never be negative, as it's an expectation of a squared value).
  2. Var(c) = 0 (A constant has no spread, so its variance is zero).
  3. Var(aX + b) = a² * Var(X)
    • Adding a constant 'b' shifts the distribution but doesn't change its spread (Var(X+b) = Var(X)).
    • Multiplying by 'a' scales the variance by a² (Var(aX) = a²Var(X)); the standard deviation scales by |a|.
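
These properties can be checked on the p.m.f. from Section 2.1 (p(0)=0.5, p(1)=0.3, p(2)=0.2). The sketch below is illustrative only: the helper names and the constants a = 3, b = 5 are arbitrary choices, not part of the notes.

```python
from fractions import Fraction

pmf = {0: Fraction(1, 2), 1: Fraction(3, 10), 2: Fraction(1, 5)}

def expectation(pmf, g=lambda x: x):
    """E[g(X)] for a discrete p.m.f."""
    return sum(g(x) * p for x, p in pmf.items())

def variance(pmf, g=lambda x: x):
    """Var(g(X)) via the computational formula E[g(X)^2] - (E[g(X)])^2."""
    m = expectation(pmf, g)
    return expectation(pmf, lambda x: g(x) ** 2) - m ** 2

a, b = 3, 5  # arbitrary illustrative constants
mu = expectation(pmf)
var_x = variance(pmf)                                  # computational formula
var_def = expectation(pmf, lambda x: (x - mu) ** 2)    # definition
var_lin = variance(pmf, lambda x: a * x + b)           # Var(aX + b)
print(var_x, var_def, var_lin)
```

Both formulas give Var(X) = 61/100, and Var(3X + 5) = 9 * 61/100 = 549/100: the shift b has no effect.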

Covariance (Cov(X, Y))

Covariance measures the joint variability of two random variables, (X, Y). It describes the direction of the linear relationship between them.

  • Positive Covariance: X and Y tend to move in the same direction (when X is high, Y tends to be high).
  • Negative Covariance: X and Y tend to move in opposite directions (when X is high, Y tends to be low).
  • Zero Covariance: No linear relationship (they are uncorrelated).
  • Definition:
    Cov(X, Y) = E[ (X - E[X]) * (Y - E[Y]) ]
  • Computational Formula:
    Cov(X, Y) = E[X * Y] - E[X] * E[Y]

Independence vs. Uncorrelated:
  • If X and Y are independent, then E[XY] = E[X]E[Y]. This means Cov(X, Y) = 0.
  • So, Independence implies Uncorrelated.
  • The reverse is NOT true. Cov(X, Y) = 0 does NOT imply independence (except for special cases like the Bivariate Normal distribution).

Variance of a Sum (General Case)

Using covariance, we can state the general formula for the variance of a sum:

Var(X + Y) = Var(X) + Var(Y) + 2 * Cov(X, Y)
Var(X - Y) = Var(X) + Var(Y) - 2 * Cov(X, Y)

If X and Y are independent, then Cov(X, Y) = 0, and the formulas simplify:

Var(X + Y) = Var(X) + Var(Y) (if independent)

Var(X - Y) = Var(X) + Var(Y) (if independent)

Warning: A very common mistake is to think Var(X - Y) = Var(X) - Var(Y). This is WRONG. Since Var(-Y) = (-1)² * Var(Y) = Var(Y), the variances of independent variables add for both a sum and a difference; only the sign of the covariance term changes.
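
As a sanity check, the general formulas can be verified on a small joint p.m.f. The table below is a made-up example of two dependent 0/1 variables (not from the notes), and `expect` and `var` are illustrative helper names.

```python
from fractions import Fraction

# Hypothetical joint p.m.f. p(x, y) of two dependent 0/1 variables.
joint = {
    (0, 0): Fraction(2, 5), (0, 1): Fraction(1, 10),
    (1, 0): Fraction(1, 10), (1, 1): Fraction(2, 5),
}

def expect(g):
    """E[g(X, Y)] from the joint p.m.f."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

def var(g):
    """Var(g(X, Y)) = E[g^2] - (E[g])^2."""
    return expect(lambda x, y: g(x, y) ** 2) - expect(g) ** 2

var_x = var(lambda x, y: x)                       # 1/4
var_y = var(lambda x, y: y)                       # 1/4
cov_xy = (expect(lambda x, y: x * y)
          - expect(lambda x, y: x) * expect(lambda x, y: y))  # 3/20
var_sum = var(lambda x, y: x + y)                 # 4/5
var_diff = var(lambda x, y: x - y)                # 1/5

print(var_sum == var_x + var_y + 2 * cov_xy)      # True
print(var_diff == var_x + var_y - 2 * cov_xy)     # True
print(var_diff == var_x - var_y)                  # False: the common mistake
```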

2.4 Expectation of a Bivariate Random Variable

This is simply an application of the Law of the Unconscious Statistician (LOTUS) for a function of two variables, g(X, Y).

  • Discrete:
    E[g(X, Y)] = Σ_x Σ_y g(x, y) * p(x, y)
  • Continuous:
    E[g(X, Y)] = ∫_{-∞}^{+∞} ∫_{-∞}^{+∞} g(x, y) * f(x, y) dx dy

The formulas for E[X+Y], E[XY], and Cov(X,Y) are all special cases of this.

  • To find E[X+Y], let g(x, y) = x + y.
  • To find E[XY], let g(x, y) = x * y.

2.5 Solved Examples

Example: Let (X, Y) have the joint p.d.f. f(x, y) = 2 for 0 < x < y < 1, and 0 otherwise.

Find E[X] and E[Y].

1. Find Marginal PDFs first:

  • fX(x): We integrate over y. The bounds for y are from x to 1.
    fX(x) = ∫_x^1 2 dy = [2y] from x to 1 = 2(1) - 2(x) = 2(1-x), for 0 < x < 1.
  • fY(y): We integrate over x. The bounds for x are from 0 to y.
    fY(y) = ∫_0^y 2 dx = [2x] from 0 to y = 2(y) - 2(0) = 2y, for 0 < y < 1.

2. Calculate Expectations using Marginals:

  • E[X] = ∫_0^1 x * fX(x) dx = ∫_0^1 x * 2(1-x) dx = 2 ∫_0^1 (x - x²) dx
    = 2 [x²/2 - x³/3] from 0 to 1 = 2 * ( (1/2 - 1/3) - 0 ) = 2 * (1/6) = 1/3.
  • E[Y] = ∫_0^1 y * fY(y) dy = ∫_0^1 y * (2y) dy = 2 ∫_0^1 y² dy
    = 2 [y³/3] from 0 to 1 = 2 * ( 1/3 - 0 ) = 2/3.

Alternative (using joint p.d.f.):

  • E[X] = ∫_0^1 ∫_0^y x * (2) dx dy = ∫_0^1 [x²] from 0 to y dy = ∫_0^1 y² dy = [y³/3] from 0 to 1 = 1/3.
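
The results E[X] = 1/3 and E[Y] = 2/3 can also be checked numerically. The sketch below approximates the double integral over the triangle 0 < x < y < 1 with a simple midpoint rule; the function name `expect_on_triangle` and the grid size n = 400 are arbitrary choices for illustration.

```python
def expect_on_triangle(g, n=400):
    """Approximate E[g(X, Y)] for f(x, y) = 2 on 0 < x < y < 1.

    Midpoint rule: the outer sum runs over y in (0, 1),
    the inner sum over x in (0, y).
    """
    total = 0.0
    dy = 1.0 / n
    for j in range(n):
        y = (j + 0.5) * dy          # midpoint of the j-th y-slice
        dx = y / n                  # inner step: x ranges over (0, y)
        inner = sum(g((i + 0.5) * dx, y) * 2.0 for i in range(n)) * dx
        total += inner * dy
    return total

e_x = expect_on_triangle(lambda x, y: x)   # close to 1/3
e_y = expect_on_triangle(lambda x, y: y)   # close to 2/3
print(round(e_x, 4), round(e_y, 4))
```

Passing `lambda x, y: 1.0` should return a value close to 1, which confirms that f(x, y) = 2 is a valid density on this region.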
