A random variable (often denoted as X, Y, etc.) is a function that assigns a unique numerical value to each possible outcome in the sample space (S) of a random experiment.
It's a "variable" because it can take different values, and "random" because the specific value it takes is determined by the outcome of a random phenomenon.
Example: Consider tossing two fair coins.
- The sample space is S = {HH, HT, TH, TT}.
- Let the random variable X be "the number of heads."
- X is a function that maps outcomes to numbers:
- X(HH) = 2
- X(HT) = 1
- X(TH) = 1
- X(TT) = 0
- The possible values for X are {0, 1, 2}.
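The mapping above is small enough to enumerate directly. A minimal Python sketch (the names `sample_space` and `X` are my own for illustration):

```python
from itertools import product

# Enumerate the sample space of two fair coin tosses: {HH, HT, TH, TT}.
sample_space = ["".join(toss) for toss in product("HT", repeat=2)]

# The random variable X maps each outcome to its number of heads.
X = {outcome: outcome.count("H") for outcome in sample_space}

print(X)                        # {'HH': 2, 'HT': 1, 'TH': 1, 'TT': 0}
print(sorted(set(X.values())))  # possible values: [0, 1, 2]
```

Note that X here is literally a function on outcomes (implemented as a dictionary), which is exactly the formal definition.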
Random variables are broadly classified into two types:
| Type | Definition | Values are... | Examples |
|---|---|---|---|
| Discrete | A random variable that can take on a finite or countably infinite number of distinct values. | Counted. There are "gaps" between possible values. | Number of heads in two coin tosses; number of defective items in a batch. |
| Continuous | A random variable that can take on any value within a given range or interval. | Measured. There are no gaps between values. | A person's height; the time until a component fails. |
We describe the probabilities of a random variable's values using a distribution function: a p.m.f. in the discrete case, and a p.d.f. f(x) in the continuous case. For a continuous random variable, the probability of any single exact value is zero:

P(X = a) = ∫ₐᵃ f(x) dx = 0

This is because there is no "area" under a single point; probabilities are only defined over intervals. It also means P(a ≤ X ≤ b) is the same as P(a < X < b).
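To make the interval idea concrete, here is a hedged numerical sketch. It assumes the density f(x) = 2x on (0, 1) (the same density used in a later example in these notes) and approximates the integral with a midpoint Riemann sum:

```python
# Approximate P(a <= X <= b) = integral of f over [a, b] numerically.
# Illustrative density (an assumption for this sketch): f(x) = 2x on (0, 1).
def f(x):
    return 2 * x

def prob_interval(a, b, n=100_000):
    """Midpoint Riemann sum of f over [a, b]."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

# Exact answer: integral of 2x from 0.2 to 0.5 is 0.5**2 - 0.2**2 = 0.21.
print(round(prob_interval(0.2, 0.5), 4))   # ≈ 0.21
# A degenerate interval [a, a] carries zero probability:
print(prob_interval(0.3, 0.3))             # 0.0
```

The degenerate call `prob_interval(0.3, 0.3)` returns exactly 0 because the interval has zero width, which is the "no area under a single point" statement in code.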
Often, we are interested in a function of a random variable. If we know the distribution of X, can we find the distribution of Y = g(X)?
For a discrete random variable, this is straightforward: the p.m.f. of Y is found by summing the probabilities of all x values that map to a given y value.
pY(y) = P(Y=y) = P(g(X) = y) = Σ{x | g(x)=y} pX(x)
Example: Let X have p.m.f. p(-1)=0.1, p(0)=0.3, p(1)=0.4, p(2)=0.2.
Find the p.m.f. of Y = X².
- Y can take the values {(-1)², 0², 1², 2²} = {0, 1, 4}.
- pY(0) = P(Y=0) = P(X²=0) = P(X=0) = 0.3
- pY(1) = P(Y=1) = P(X²=1) = P(X=-1 or X=1) = pX(-1) + pX(1) = 0.1 + 0.4 = 0.5
- pY(4) = P(Y=4) = P(X²=4) = P(X=2) = 0.2
- The p.m.f. for Y is: pY(0)=0.3, pY(1)=0.5, pY(4)=0.2. (Check: 0.3+0.5+0.2 = 1).
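The summation formula above translates directly into code. A short sketch (the helper name `pmf_of_function` is my own):

```python
from collections import defaultdict

def pmf_of_function(p_X, g):
    """p_Y(y) = sum of p_X(x) over all x with g(x) = y."""
    p_Y = defaultdict(float)
    for x, p in p_X.items():
        p_Y[g(x)] += p   # all x mapping to the same y accumulate here
    return dict(p_Y)

# The example from the notes: X has p(-1)=0.1, p(0)=0.3, p(1)=0.4, p(2)=0.2.
p_X = {-1: 0.1, 0: 0.3, 1: 0.4, 2: 0.2}
p_Y = pmf_of_function(p_X, lambda x: x ** 2)
# p_Y[0] == 0.3, p_Y[1] == 0.5 (from x = -1 and x = 1), p_Y[4] == 0.2
print(p_Y)
```

The `defaultdict` accumulation is exactly the Σ over {x | g(x) = y}: both x = -1 and x = 1 land in the same bucket y = 1.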
For a continuous random variable, this is more complex and requires calculus. If Y = g(X) is a monotonic (strictly increasing or decreasing) function with inverse x = g⁻¹(y), the p.d.f. of Y is

fY(y) = fX(g⁻¹(y)) * |dx/dy|

The |dx/dy| term is called the Jacobian of the transformation. It rescales the density to ensure the total area under fY remains 1.
Example: Let X be a continuous RV with p.d.f. fX(x) = 2x, for 0 < x < 1.
Find the p.d.f. of Y = 8X³.
- Find range of Y: If 0 < x < 1, then 0 < 8x³ < 8. So, 0 < y < 8.
- Find inverse: y = 8x³ => x³ = y/8 => x = (y/8)¹/³ = y¹/³ / 2. So, g⁻¹(y) = y¹/³ / 2.
- Find derivative: dx/dy = d/dy ( (1/2) * y¹/³ ) = (1/2) * (1/3) * y⁻²/³ = 1 / (6y²/³).
- Apply formula:
- fY(y) = fX(g⁻¹(y)) * |dx/dy|
- fY(y) = 2(g⁻¹(y)) * |1 / (6y²/³)|
- fY(y) = 2(y¹/³ / 2) * (1 / (6y²/³)) (since y > 0, the absolute value can be dropped)
- fY(y) = y¹/³ / (6y²/³) = y^(1/3 − 2/3) / 6 = y⁻¹/³ / 6
- Final PDF: fY(y) = 1 / (6y¹/³), for 0 < y < 8.
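A Monte Carlo spot-check of this result. The sketch samples X by inverse transform (F_X(x) = x² on (0, 1), so X = √U) and compares the empirical P(Y ≤ 4) against the CDF implied by the derived pdf, F_Y(y) = y²ᐟ³ / 4:

```python
import random
import math

random.seed(42)

# Sample X with pdf f_X(x) = 2x on (0, 1) via inverse transform:
# F_X(x) = x^2, so X = sqrt(U) with U uniform on (0, 1).
n = 200_000
xs = [math.sqrt(random.random()) for _ in range(n)]
ys = [8 * x ** 3 for x in xs]

# The derived pdf f_Y(y) = 1/(6 y^(1/3)) integrates to F_Y(y) = y^(2/3) / 4.
empirical = sum(y <= 4 for y in ys) / n
theoretical = 4 ** (2 / 3) / 4
print(round(empirical, 3), round(theoretical, 3))  # both ≈ 0.63
```

If the Jacobian step had been done wrong, the empirical and theoretical probabilities would visibly disagree.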
We often need to study two or more random variables simultaneously. A bivariate random variable is an ordered pair (X, Y) that maps each outcome in a sample space S to a point in the 2D plane.
The joint distribution is the 2D equivalent of a p.m.f./p.d.f.: it describes the probability of X and Y *simultaneously* taking on certain values. In the discrete case this is the joint p.m.f. p(x, y) = P(X = x, Y = y); in the continuous case, the joint p.d.f. f(x, y). The joint cumulative distribution function is

F(x, y) = P(X ≤ x, Y ≤ y)
The marginal distribution of X is the individual probability distribution of X, "ignoring" Y. We get it by "summing out" or "integrating out" the other variable from the joint distribution:

pX(x) = Σy p(x, y) (discrete)
fX(x) = ∫ f(x, y) dy (continuous)

In a joint probability table, think of this as summing across a row. The same logic gives the marginal distribution of Y (sum or integrate over x).
The conditional distribution describes the probability of one variable *given that we know* the value of the other; it is like taking a "slice" of the joint distribution:

f(y | x) = f(x, y) / fX(x), provided fX(x) > 0

Rearranging this gives the multiplication rule, which is extremely useful in proofs:

f(x, y) = f(y | x) * fX(x) and f(x, y) = f(x | y) * fY(y)
Definition: Two random variables X and Y are independent if and only if their joint distribution factors into the product of the marginals: f(x, y) = fX(x) * fY(y) for all x and y (and likewise p(x, y) = pX(x) * pY(y) in the discrete case).
If X and Y are independent, then the conditional distribution is equal to the marginal distribution:
f(y | x) = f(x, y) / fX(x) = (fX(x) * fY(y)) / fX(x) = fY(y)
This makes intuitive sense: if the variables are independent, knowing the value of X gives you no new information about Y.
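The factorization criterion can be checked cell by cell on a discrete joint table. A sketch (the helper name `is_independent` and both tables are my own assumptions):

```python
def is_independent(joint, tol=1e-9):
    """Check whether p(x, y) == p_X(x) * p_Y(y) for every cell."""
    p_X, p_Y = {}, {}
    for (x, y), p in joint.items():
        p_X[x] = p_X.get(x, 0.0) + p
        p_Y[y] = p_Y.get(y, 0.0) + p
    return all(abs(p - p_X[x] * p_Y[y]) < tol for (x, y), p in joint.items())

# Independent: every cell is the product of its marginals (0.3/0.7 by 0.4/0.6).
indep = {(0, 0): 0.12, (0, 1): 0.18, (1, 0): 0.28, (1, 1): 0.42}
# Dependent: cell (0, 0) is 0.10, but the marginals give 0.3 * 0.4 = 0.12.
dep = {(0, 0): 0.10, (0, 1): 0.20, (1, 0): 0.30, (1, 1): 0.40}

print(is_independent(indep))  # True
print(is_independent(dep))    # False
```

One mismatched cell is enough to break independence, which is why the check quantifies over every (x, y) pair.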
This extends the univariate case. We have (X, Y) and want to find the joint p.d.f. of new variables U and V, where:

U = g₁(X, Y) and V = g₂(X, Y)

If the transformation is invertible, with X = h₁(U, V) and Y = h₂(U, V), then

fU,V(u, v) = fX,Y(h₁(u, v), h₂(u, v)) * |J|

where |J| is the absolute value of the Jacobian determinant of (x, y) with respect to (u, v). You must also transform the domain (the range of possible x, y values) into the new domain for u, v.
In the classic example (X and Y independent Gamma variables with a common rate, U = X + Y and V = X / (X + Y)), you will find that U and V are independent, where U is a Gamma variable and V is a Beta variable. This is a very common and important transformation.
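A hedged Monte Carlo sketch of the classic Gamma-to-Beta split. It assumes X ~ Gamma(a) and Y ~ Gamma(b), independent with scale 1, and spot-checks the theoretical means of U = X + Y (Gamma(a + b), mean a + b) and V = X / (X + Y) (Beta(a, b), mean a / (a + b)):

```python
import random

random.seed(0)

# Assumed setup for this sketch: X ~ Gamma(a), Y ~ Gamma(b), scale 1, independent.
a, b, n = 2.0, 3.0, 200_000
us, vs = [], []
for _ in range(n):
    x = random.gammavariate(a, 1.0)
    y = random.gammavariate(b, 1.0)
    us.append(x + y)        # U = X + Y
    vs.append(x / (x + y))  # V = X / (X + Y)

mean_u = sum(us) / n   # theory: Gamma(a + b) has mean a + b = 5
mean_v = sum(vs) / n   # theory: Beta(a, b) has mean a / (a + b) = 0.4
print(round(mean_u, 2), round(mean_v, 2))
```

A full check of independence would compare joint and product-of-marginal histograms; matching the marginal means is only a quick sanity check, not a proof.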