Unit 2: Measures of Central Tendency

1. Criteria for an Ideal Measure of Central Tendency
2. Arithmetic Mean (AM)
3. Median
4. Mode (Ungrouped, Grouped, and Graphical)
5. Geometric Mean (GM)
6. Harmonic Mean (HM)
7. Merits and Demerits (Summary)

1. Criteria for an Ideal Measure of Central Tendency

An "ideal" average should have the following properties:

Rigidly Defined: It should have a clear, unambiguous formula.
Easy to Understand and Calculate: It should be simple for a common person to grasp.
Based on All Observations: It should use every data point in the set.
Suitable for Further Algebraic Treatment: We should be able to use it in more advanced formulas (e.g., combining means).
Not Unduly Affected by Extreme Values: A single very large or very small number shouldn't completely change the result.
Sampling Stability: If we take multiple samples, the average for each sample should be roughly the same.

2. Arithmetic Mean (AM)

The arithmetic mean is the "average" as most people know it: the sum of all observations divided by the number of observations.

Ungrouped Data: x̄ = (x1 + x2 + ... + xn) / n = (Σx) / n

Grouped Data: x̄ = Σ(f * x) / Σf = Σ(f * x) / N
(where x = midpoint of class, f = frequency, N = total frequency)
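The two formulas above can be sketched in Python as follows. The function names and the sample data are illustrative assumptions, not part of the notes.

```python
def mean_ungrouped(values):
    """Arithmetic mean of raw observations: (sum of x) / n."""
    if not values:
        raise ValueError("need at least one observation")
    return sum(values) / len(values)


def mean_grouped(midpoints, frequencies):
    """Arithmetic mean of a frequency distribution: sum(f * x) / sum(f)."""
    if len(midpoints) != len(frequencies):
        raise ValueError("midpoints and frequencies must have the same length")
    total_frequency = sum(frequencies)  # N in the notes
    if total_frequency == 0:
        raise ValueError("total frequency must be positive")
    weighted_sum = sum(f * x for x, f in zip(midpoints, frequencies))
    return weighted_sum / total_frequency


# Illustrative data (assumed for this sketch).
print(mean_ungrouped([2, 4, 6, 8, 10]))      # 6.0
# Classes 0-10, 10-20, 20-30 have midpoints 5, 15, 25.
print(mean_grouped([5, 15, 25], [2, 5, 3]))  # 16.0
```

In the grouped case the result is an approximation, since every observation in a class is treated as if it sat at the class midpoint.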

Merits: Rigidly defined, easy to calculate, uses all data, good for algebra.
Demerits: Highly affected by extreme values (outliers). Cannot be calculated exactly for distributions with open-ended classes, because the midpoints of those classes are unknown.

3. Median

The middle value of a dataset that has been arranged in order (ascending or descending). It divides the data into two equal halves.

Ungrouped Data:
1. Order the data.
2. Find the position: (n+1)/2.
- If n=7, position is (7+1)/2 = 4th item.
- If n=8, position is (8+1)/2 = 4.5th item (average of 4th and 5th items).

Grouped Data:
1. Find the Median Class (the class where the (N/2)-th item falls, i.e. the first class whose cumulative frequency is at least N/2).
2. Apply formula: Median = L + [ ( (N/2) - cf ) / f ] * h
(L = lower boundary of median class, N = total frequency, cf = cumulative frequency *before* median class, f = frequency of median class, h = class width)
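A minimal Python sketch of both median procedures above, assuming equal class widths in the grouped case. The function names and sample data are hypothetical.

```python
def median_ungrouped(values):
    """Middle value of the ordered data (average of the two middle values if n is even)."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("need at least one observation")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                       # the (n+1)/2-th item
    return (ordered[mid - 1] + ordered[mid]) / 2  # average of the two middle items


def median_grouped(lower_boundaries, frequencies, width):
    """Median = L + ((N/2 - cf) / f) * h, using the first class whose
    cumulative frequency reaches N/2 as the median class."""
    half = sum(frequencies) / 2  # N/2
    cumulative = 0               # cf: cumulative frequency before the current class
    for lower, f in zip(lower_boundaries, frequencies):
        if f > 0 and cumulative + f >= half:
            return lower + (half - cumulative) / f * width
        cumulative += f
    raise ValueError("distribution has no observations")


# Illustrative data (assumed for this sketch).
print(median_ungrouped([3, 1, 4, 1, 5, 9, 2]))     # 3   (n = 7, 4th item)
print(median_ungrouped([1, 2, 3, 4, 5, 6, 7, 8]))  # 4.5 (n = 8, average of 4th and 5th)
# Classes 0-10, 10-20, 20-30, 30-40 with N = 25; the median class is 20-30.
print(median_grouped([0, 10, 20, 30], [4, 6, 10, 5], 10))  # 22.5
```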

Merits: Not affected by outliers, can be found for open-ended classes, can be found graphically (from ogives).
Demerits: Not based on all observations, not good for algebraic treatment.

4. Mode (Ungrouped, Grouped, and Graphical)

The value that appears most frequently in a dataset.

1. Mode for Ungrouped Data

Simply find the value that repeats the most.

Example: Data: {2, 5, 6, 6, 6, 7, 8, 9}. Mode = 6.
Example: Data: {1, 1, 2, 3, 3}. Bimodal: 1 and 3.
Example: Data: {1, 2, 3, 4, 5}. No Mode.
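The three examples above can be reproduced with a short Python sketch. The helper name `modes` is hypothetical, and the "no mode when nothing repeats" rule is the convention these notes use, not a universal one.

```python
from collections import Counter


def modes(values):
    """Return every most-frequent value, or an empty list if no value repeats."""
    counts = Counter(values)
    if not counts:
        return []
    highest = max(counts.values())
    if highest == 1:
        return []  # convention from the notes: nothing repeats, so no mode
    return sorted(value for value, count in counts.items() if count == highest)


print(modes([2, 5, 6, 6, 6, 7, 8, 9]))  # [6]
print(modes([1, 1, 2, 3, 3]))           # [1, 3]  (bimodal)
print(modes([1, 2, 3, 4, 5]))           # []      (no mode)
```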

2. Mode for Grouped Data (Formula)

Find the Modal Class (the class with the highest frequency).
Apply the formula:
Mode = L + [ (f1 - f0) / (2*f1 - f0 - f2) ] * h
Where:
- L = lower boundary of modal class
- f1 = frequency of the modal class
- f0 = frequency of the class *before* the modal class
- f2 = frequency of the class *after* the modal class
- h = class width
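As a rough sketch, the grouped-mode formula can be written in Python as below. It assumes equal class widths, takes the first class with the highest frequency as the modal class, and treats a missing neighbouring class as having frequency 0; all three are assumptions of this example rather than rules stated in the notes.

```python
def mode_grouped(lower_boundaries, frequencies, width):
    """Mode = L + ((f1 - f0) / (2*f1 - f0 - f2)) * h for equal-width classes."""
    if not frequencies or len(lower_boundaries) != len(frequencies):
        raise ValueError("need one lower boundary per frequency")
    i = frequencies.index(max(frequencies))  # first class with the highest frequency
    f1 = frequencies[i]
    f0 = frequencies[i - 1] if i > 0 else 0                     # class before (assumed 0 if absent)
    f2 = frequencies[i + 1] if i < len(frequencies) - 1 else 0  # class after (assumed 0 if absent)
    denominator = 2 * f1 - f0 - f2
    if denominator == 0:
        raise ValueError("formula is undefined when 2*f1 - f0 - f2 is zero")
    return lower_boundaries[i] + (f1 - f0) / denominator * width


# Illustrative data (assumed): classes 0-10, 10-20, 20-30, 30-40.
# Modal class is 20-30 with f1 = 12, f0 = 7, f2 = 7.
print(mode_grouped([0, 10, 20, 30], [3, 7, 12, 7], 10))  # 25.0
```

Because the two neighbouring frequencies are equal in this example, the estimate falls exactly at the middle of the modal class.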

3. Mode by Graphical Method

The mode can be estimated graphically from a Histogram.

Draw a histogram of the data.
Identify the tallest bar (the modal class).
Draw two lines:
- From the top-right corner of the modal bar to the top-right corner of the *previous* bar.
- From the top-left corner of the modal bar to the top-left corner of the *next* bar.
Draw a vertical line from the intersection of these two lines down to the X-axis.
The value on the X-axis is the estimated Mode.
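The construction above can be checked numerically. The sketch below, with hypothetical names and illustrative frequencies, places the three bars on a coordinate grid, intersects the two diagonal lines, and compares the x-coordinate with the grouped-mode formula. Equal class widths are assumed.

```python
def line_intersection(p1, p2, p3, p4):
    """Intersection of the line through p1, p2 with the line through p3, p4."""
    (x1, y1), (x2, y2), (x3, y3), (x4, y4) = p1, p2, p3, p4
    denominator = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4)
    if denominator == 0:
        raise ValueError("lines are parallel")
    det_a = x1 * y2 - y1 * x2
    det_b = x3 * y4 - y3 * x4
    x = (det_a * (x3 - x4) - (x1 - x2) * det_b) / denominator
    y = (det_a * (y3 - y4) - (y1 - y2) * det_b) / denominator
    return x, y


def graphical_mode(lower, width, f0, f1, f2):
    """x-coordinate where the two diagonals of the histogram construction cross."""
    # Line 1: top-right of the modal bar to top-right of the previous bar.
    line1 = ((lower + width, f1), (lower, f0))
    # Line 2: top-left of the modal bar to top-left of the next bar.
    line2 = ((lower, f1), (lower + width, f2))
    x, _ = line_intersection(*line1, *line2)
    return x


# Illustrative values (assumed): modal class 20-30, f0 = 7, f1 = 12, f2 = 7.
lower, width, f0, f1, f2 = 20, 10, 7, 12, 7
from_graph = graphical_mode(lower, width, f0, f1, f2)
from_formula = lower + (f1 - f0) / (2 * f1 - f0 - f2) * width
print(from_graph, from_formula)  # both 25.0
```

Solving the two line equations by hand gives the same expression, (f1 - f0) / (2*f1 - f0 - f2), for the fraction of the class width, which is why the graphical estimate and the formula agree.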

Merits: Not affected by outliers, easy to understand, the only average for nominal data.
Demerits: May not exist or be unique, ill-defined, not based on all observations.

5. Geometric Mean (GM)

The n-th root of the product of n positive observations. It is best used for averaging ratios, percentages, or growth rates.

Ungrouped Data: GM = (x1 * x2 * ... * xn)^(1/n)

Calculation Tip: Multiplying many values and taking the n-th root directly is tedious. Use logarithms (any base works, as long as the antilog uses the same base):
log(GM) = (1/n) * [log(x1) + log(x2) + ... + log(xn)] = (Σ log(x)) / n
GM = Antilog [ (Σ log(x)) / n ]
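The logarithm route above translates directly into Python. This is a minimal sketch with a hypothetical function name; it uses natural logarithms, so `math.exp` plays the role of the antilog.

```python
import math


def geometric_mean(values):
    """GM = antilog(mean of logs); defined only for positive observations."""
    if not values:
        raise ValueError("need at least one observation")
    if any(v <= 0 for v in values):
        raise ValueError("geometric mean requires strictly positive values")
    mean_log = sum(math.log(v) for v in values) / len(values)  # (sum of log x) / n
    return math.exp(mean_log)                                  # antilog


# Illustrative data (assumed): the GM of 2 and 8 is sqrt(16) = 4.
print(round(geometric_mean([2, 8]), 6))     # 4.0
print(round(geometric_mean([1, 3, 9]), 6))  # 3.0
```

Python 3.8 and later also ship `statistics.geometric_mean`, which could be used instead of a hand-written version.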

Merits: Based on all observations, less affected by outliers than AM, good for rates of change.
Demerits: Difficult to calculate, cannot be used if any value is zero or negative.

6. Harmonic Mean (HM)

The reciprocal of the arithmetic mean of the reciprocals of the observations. It is best used for averaging rates such as speeds, when the same distance (or the same amount of work) is covered at each rate.

Ungrouped Data: HM = n / ( (1/x1) + (1/x2) + ... + (1/xn) ) = n / (Σ (1/x))
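A short Python sketch of the formula above, using a hypothetical function name and an assumed example: a trip made at 60 km/h one way and 40 km/h back over the same distance.

```python
def harmonic_mean(values):
    """HM = n / sum(1/x); undefined if any observation is zero."""
    if not values:
        raise ValueError("need at least one observation")
    if any(v == 0 for v in values):
        raise ValueError("harmonic mean is undefined when a value is zero")
    return len(values) / sum(1 / v for v in values)


speeds = [60, 40]  # km/h, same distance each way (illustrative values)
print(round(harmonic_mean(speeds), 6))  # 48.0
print(sum(speeds) / len(speeds))        # 50.0 (arithmetic mean, for comparison)
```

The harmonic mean of 48 km/h is the true average speed for the round trip, because more time is spent travelling at the slower speed; the arithmetic mean of 50 km/h overstates it.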

Merits: Based on all observations, gives more weight to smaller values.
Demerits: Difficult to calculate, cannot be used if any value is zero.

7. Merits and Demerits (Summary)

Exam Tip: "Which is the 'best' average?" The Arithmetic Mean is generally the best as it satisfies most properties of an ideal average. However, if there are outliers, the Median is better. If dealing with rates/speeds, the Harmonic Mean is better. If dealing with growth rates, the Geometric Mean is better.

Measure	Based on all data?	Affected by Outliers?	Best for...
Mean (AM)	Yes	Yes, very much	General purpose, algebra
Median	No (only position)	No	Data with outliers, open-ended classes
Mode	No	No	Categorical (Nominal) data, finding most popular item
GM	Yes	Yes, but less than AM	Growth rates, ratios
HM	Yes	Less than AM for large values, but sensitive to very small values	Speeds, rates
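To see the "Affected by Outliers?" column in action, the sketch below computes all five averages for a small illustrative data set (assumed, not from the notes) before and after adding one extreme value. It relies on Python's standard `statistics` module; `geometric_mean` needs Python 3.8 or later.

```python
import statistics


def summarize(values):
    """Return the five averages covered in this unit for positive data."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "gm": statistics.geometric_mean(values),
        "hm": statistics.harmonic_mean(values),
    }


# Illustrative data; the second set adds a single outlier (100).
data = [10, 12, 12, 13, 15]
with_outlier = data + [100]

for label, values in (("original", data), ("with outlier", with_outlier)):
    summary = summarize(values)
    print(label, {name: round(value, 2) for name, value in summary.items()})
```

In this example the arithmetic mean jumps from 12.4 to 27, while the median only moves from 12 to 12.5 and the mode stays at 12. The geometric and harmonic means shift less than the arithmetic mean.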
