Unit 2: Mathematical Expectation

Table of Contents

2.1 Mathematical Expectation of a Random Variable
2.2 Properties of Expectation (Addition & Multiplication Theorems)
2.3 Variance and Covariance
2.4 Expectation of a Bivariate Random Variable
2.5 Solved Examples

Definition of Expectation (E[X])

The mathematical expectation (or expected value, or mean) of a random variable X is the weighted average of all possible values that X can take, with the weights being their respective probabilities:

E[X] = Σ x * p(x)         (discrete case)
E[X] = ∫ x * f(x) dx      (continuous case)

It represents the long-run average value of X if the experiment were repeated many times. It is often denoted as μ (mu).

Expectation of a Function g(X)

We often need the expectation of a function of X, say Y = g(X). We can find E[Y] *without* first finding the p.d.f. of Y, using the Law of the Unconscious Statistician (LOTUS):

E[g(X)] = Σ g(x) * p(x)         (discrete case)
E[g(X)] = ∫ g(x) * f(x) dx      (continuous case)

Example: Let X be a discrete RV with p.m.f. p(0)=0.5, p(1)=0.3, p(2)=0.2.
Find E[X] and E[X²].

E[X] = 0(0.5) + 1(0.3) + 2(0.2) = 0.7
E[X²] = 0²(0.5) + 1²(0.3) + 2²(0.2) = 1.1
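As a quick sanity check, this example can be computed in a few lines of plain Python (no external libraries; the dictionary simply encodes the p.m.f. given above):

```python
# p.m.f. from the example: p(0)=0.5, p(1)=0.3, p(2)=0.2
pmf = {0: 0.5, 1: 0.3, 2: 0.2}

# E[X] = sum of x * p(x) over the support
E_X = sum(x * p for x, p in pmf.items())

# LOTUS: E[g(X)] = sum of g(x) * p(x); here g(x) = x^2
E_X2 = sum(x**2 * p for x, p in pmf.items())

print(round(E_X, 6))   # 0.7
print(round(E_X2, 6))  # 1.1
```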

2.2 Properties of Expectation (Addition & Multiplication Theorems)

Expectation has several key properties that make it a powerful tool.

  1. E[c] = c

    The expected value of a constant (c) is just the constant itself. (e.g., E[5] = 5).

  2. E[c * X] = c * E[X]

    Constants can be factored out of an expectation.

  3. E[aX + b] = a * E[X] + b (from properties 1 and 2)

    Expectation is a linear operator.

  4. Addition Theorem: E[X + Y] = E[X] + E[Y]

    The expectation of a sum is the sum of the expectations. This is known as the Linearity of Expectation.

    Crucial Exam Point: The Addition Theorem E[X + Y] = E[X] + E[Y] holds whether or not X and Y are independent. This is a very powerful property and a common exam question.

  5. Multiplication Theorem: E[X * Y] = E[X] * E[Y] (if X and Y are independent)

    The expectation of a product is the product of the expectations *only if* the variables are independent.

    Warning: The reverse is not true! If E[XY] = E[X]E[Y], it does not necessarily mean X and Y are independent. It only means they are uncorrelated (a weaker condition, which we'll see next).
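Both points can be illustrated numerically on a deliberately dependent pair. The joint p.m.f. below is hypothetical, chosen so that Y always equals X: linearity of expectation still holds, while the multiplication theorem fails.

```python
# Joint p.m.f. with Y = X: the pair is as dependent as possible.
joint = {(0, 0): 0.5, (1, 1): 0.5}  # P(X=x, Y=y)

E_X   = sum(x * p for (x, y), p in joint.items())
E_Y   = sum(y * p for (x, y), p in joint.items())
E_sum = sum((x + y) * p for (x, y), p in joint.items())
E_XY  = sum(x * y * p for (x, y), p in joint.items())

assert E_sum == E_X + E_Y   # addition theorem holds despite dependence
assert E_XY != E_X * E_Y    # 0.5 vs 0.25: product rule fails here
```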

2.3 Variance and Covariance

Variance (Var(X) or σ²)

While expectation tells us the "center" of a distribution, variance tells us about its "spread" or "dispersion." It is defined as:

Var(X) = E[(X - μ)²] = E[X²] - (E[X])²

A small variance means data points are clustered tightly around the mean. A large variance means they are spread out.

Properties of Variance

  1. Var(X) ≥ 0 (Variance can never be negative, as it's an expectation of a squared value).
  2. Var(c) = 0 (A constant has no spread, so its variance is zero).
  3. Var(aX + b) = a² * Var(X)
    • Adding a constant 'b' shifts the distribution but doesn't change its spread (Var(X+b) = Var(X)).
    • Multiplying by 'a' scales the spread by a² (Var(aX) = a²Var(X)).
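The scaling property can be verified numerically. The sketch below reuses the p.m.f. from the example in 2.1, with the arbitrary choice a = 3, b = 5:

```python
# p.m.f. from the example in 2.1
pmf = {0: 0.5, 1: 0.3, 2: 0.2}

def expect(g):
    """E[g(X)] by LOTUS: sum of g(x) * p(x) over the support."""
    return sum(g(x) * p for x, p in pmf.items())

# Var(X) = E[X^2] - (E[X])^2, and the same formula applied to Y = 3X + 5
var_X = expect(lambda x: x**2) - expect(lambda x: x)**2
var_Y = expect(lambda x: (3*x + 5)**2) - expect(lambda x: 3*x + 5)**2

assert abs(var_Y - 9 * var_X) < 1e-9   # a^2 = 9; the +5 shift drops out
```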

Covariance (Cov(X, Y))

Covariance measures the joint variability of two random variables, (X, Y). It describes the direction of the linear relationship between them:

Cov(X, Y) = E[(X - E[X])(Y - E[Y])] = E[XY] - E[X] * E[Y]

A positive covariance means X and Y tend to move in the same direction; a negative covariance means they tend to move in opposite directions.

Independence vs. Uncorrelated:

If X and Y are independent, then Cov(X, Y) = 0 (they are uncorrelated). The converse is false: uncorrelated variables need not be independent. Classic counterexample: let X take the values -1, 0, 1 with equal probability and let Y = X². Then E[XY] = E[X³] = 0 = E[X]E[Y], so Cov(X, Y) = 0, yet Y is completely determined by X.

Variance of a Sum (General Case)

Using covariance, we can state the general formula for the variance of a sum:

Var(X + Y) = Var(X) + Var(Y) + 2 * Cov(X, Y)
Var(X - Y) = Var(X) + Var(Y) - 2 * Cov(X, Y)
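These formulas can be checked by direct enumeration on a small dependent joint p.m.f. (the values below are made up for illustration), computing every term via bivariate LOTUS:

```python
# Hypothetical dependent joint p.m.f. on {0,1} x {0,1}
joint = {(0, 0): 0.3, (0, 1): 0.2, (1, 0): 0.1, (1, 1): 0.4}

def E(g):
    """E[g(X, Y)] by bivariate LOTUS: sum of g(x, y) * p(x, y)."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

var_X = E(lambda x, y: x**2) - E(lambda x, y: x)**2
var_Y = E(lambda x, y: y**2) - E(lambda x, y: y)**2
cov   = E(lambda x, y: x * y) - E(lambda x, y: x) * E(lambda x, y: y)
var_S = E(lambda x, y: (x + y)**2) - E(lambda x, y: x + y)**2

# Var(X + Y) = Var(X) + Var(Y) + 2 Cov(X, Y)
assert abs(var_S - (var_X + var_Y + 2 * cov)) < 1e-9
```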

If X and Y are independent, then Cov(X, Y) = 0, and the formulas simplify:

Var(X + Y) = Var(X) + Var(Y) (if independent)

Var(X - Y) = Var(X) + Var(Y) (if independent)

Warning: A very common mistake is to write Var(X - Y) = Var(X) - Var(Y). This is WRONG. Variance measures spread (a squared quantity), and the uncertainty in a difference is just as large as in a sum, so the variances always add.
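To see that the variances add even for a difference, the sketch below builds an independent pair by taking the joint p.m.f. as a product of made-up marginals, then enumerates the distribution of X - Y:

```python
from collections import defaultdict

# Independence built in: joint p.m.f. is the product of the marginals
px = {0: 0.5, 1: 0.5}
py = {0: 0.3, 1: 0.4, 2: 0.3}
joint = {(x, y): px[x] * py[y] for x in px for y in py}

def var(dist):
    """Variance of a distribution given as {value: probability}."""
    m = sum(v * p for v, p in dist.items())
    return sum((v - m)**2 * p for v, p in dist.items())

# Distribution of D = X - Y by enumeration
diff = defaultdict(float)
for (x, y), p in joint.items():
    diff[x - y] += p

# Var(X - Y) = Var(X) + Var(Y): the variances add, never subtract
assert abs(var(dict(diff)) - (var(px) + var(py))) < 1e-9
```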

2.4 Expectation of a Bivariate Random Variable

This is simply an application of the Law of the Unconscious Statistician (LOTUS) for a function of two variables, g(X, Y):

E[g(X, Y)] = Σ Σ g(x, y) * p(x, y)            (discrete case)
E[g(X, Y)] = ∫ ∫ g(x, y) * f(x, y) dx dy      (continuous case)

The formulas for E[X+Y], E[XY], and Cov(X,Y) are all special cases of this.

2.5 Solved Examples

Example: Let (X, Y) have the joint p.d.f. f(x, y) = 2 for 0 < x < y < 1, and 0 otherwise.

Find E[X] and E[Y].

1. Find Marginal PDFs first:

f_X(x) = ∫[y = x to 1] 2 dy = 2(1 - x),  for 0 < x < 1
f_Y(y) = ∫[x = 0 to y] 2 dx = 2y,        for 0 < y < 1

2. Calculate Expectations using Marginals:

E[X] = ∫[0 to 1] x * 2(1 - x) dx = 2(1/2 - 1/3) = 1/3
E[Y] = ∫[0 to 1] y * 2y dy = 2/3

Alternative (using the joint p.d.f. directly):

E[X] = ∫[x = 0 to 1] ∫[y = x to 1] x * 2 dy dx = ∫[0 to 1] 2x(1 - x) dx = 1/3
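A Monte Carlo sanity check for this example (the true values work out to E[X] = 1/3 and E[Y] = 2/3): a uniform point on the unit square, kept only when its first coordinate is smaller than its second, is uniform on the triangle 0 < x < y < 1 and therefore has exactly the joint density f(x, y) = 2.

```python
import random
random.seed(0)

# Accept/reject sampling: keep (u, v) only when u < v, i.e. the point
# lands in the triangle 0 < x < y < 1 (area 1/2, so density 2 on it).
xs, ys = [], []
while len(xs) < 100_000:
    u, v = random.random(), random.random()
    if u < v:
        xs.append(u)
        ys.append(v)

print(round(sum(xs) / len(xs), 2))   # ≈ 0.33 (true value 1/3)
print(round(sum(ys) / len(ys), 2))   # ≈ 0.67 (true value 2/3)
```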