FYUG Even Semester Exam, 2024
STATISTICS (2nd Semester)
Statistical Methods and Probability
Course No.: STADSM-151T
Full Marks: 70 | Time: 3 Hours
Instructions: Answer any ten from Section-A and five from Section-B. (Full paper solved here).
SECTION-A (2 x 10 = 20 Marks)
1. Define 'statistics' and mention some limitations.
2 Marks
Definition: Statistics is the science concerned with collecting, organizing, analyzing, interpreting, and presenting data.
Limitations:
- It does not study qualitative phenomena directly (like honesty or beauty) unless they are quantified.
- It deals with aggregates (groups) rather than individual items.
- Statistical results are true only on average.
2. Define qualitative and quantitative data with examples.
2 Marks
Qualitative Data: Data describing attributes or characteristics that cannot be measured numerically. Example: Eye color, Gender.
Quantitative Data: Data that can be expressed numerically and measured. Example: Height (cm), Income (dollars).
3. What are discrete and continuous data? Give examples.
2 Marks
Discrete Data: Data that takes only specific, isolated values (usually integers). Example: Number of children in a family.
Continuous Data: Data that can take any value within a given range. Example: Temperature of a room.
4. Define mode and give its formula for continuous distribution.
2 Marks
Mode: The value that appears most frequently in a data set.
Formula:
Mode = l + [(f1 - f0) / (2f1 - f0 - f2)] * h
Where: l = lower limit of modal class, f1 = frequency of modal class, f0 = frequency of preceding class, f2 = frequency of succeeding class, h = class width.
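As an illustration, the formula can be applied to the frequency table of Question 17(a), whose modal class is 39.5-44.5. This is a minimal sketch; the function name grouped_mode is a hypothetical helper, not part of the original answer.

```python
def grouped_mode(l, f0, f1, f2, h):
    """Mode of a grouped distribution: l + (f1 - f0) / (2*f1 - f0 - f2) * h."""
    return l + (f1 - f0) / (2 * f1 - f0 - f2) * h

# Modal class 39.5-44.5 (frequency 54) from the Question 17(a) table;
# the neighbouring frequencies are 45 and 37, and the class width is 5.
mode = grouped_mode(l=39.5, f0=45, f1=54, f2=37, h=5)
print(round(mode, 2))  # 41.23
```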
5. What are measures of dispersion? Name them.
2 Marks
Definition: Measures that describe the spread or variability of data around a central value.
Measures: Range, Quartile Deviation, Mean Deviation, and Standard Deviation.
6. What are raw and central moments of the rth order?
2 Marks
Raw Moment (mu'_r): The expected value of the rth power of the variable X about the origin. Formula: mu'_r = E(X^r).
Central Moment (mu_r): The expected value of the rth power of the deviation of X from its mean. Formula: mu_r = E[(X - Mean)^r].
7. Write properties of Karl Pearson's correlation coefficient (r).
2 Marks
- The value of r lies between -1 and +1 inclusive (-1 <= r <= 1).
- It is independent of change of origin and scale.
- It is a pure number (dimensionless).
8. Prove: If bXY > 1 then bYX < 1.
2 Marks
We know that r^2 = bXY * bYX.
Since r^2 <= 1, it follows that bXY * bYX <= 1.
If bXY > 1, then bXY is positive, so dividing both sides by bXY gives bYX <= 1 / bXY < 1. Hence bYX < 1.
9. Mention two properties of regression coefficients.
2 Marks
- Regression coefficients are independent of change of origin but not of scale.
- The correlation coefficient is the geometric mean of the two regression coefficients: r = +/- sqrt(bXY * bYX), where r takes the common sign of bXY and bYX.
10. Define an attribute with an example.
2 Marks
Definition: A qualitative characteristic that cannot be measured quantitatively but can only be categorized. Example: Marital status (Married, Single, Divorced).
11. Define 'trial' and 'event'.
2 Marks
Trial: A performance of a random experiment. Example: Tossing a coin.
Event: A possible outcome or a collection of outcomes of a trial. Example: Getting a 'Head'.
12. Give the axiomatic definition of probability.
2 Marks
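Let S be the sample space. Probability P is a real-valued function defined on the events of S satisfying:
- Non-negativity: P(A) >= 0 for every event A.
- Certainty: P(S) = 1.
- Additivity: For mutually exclusive events A1, A2, ..., P(A1 union A2 union ...) = P(A1) + P(A2) + ... In particular, for mutually exclusive events A and B, P(A union B) = P(A) + P(B).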
13. State the addition theorem of probability.
2 Marks
For any two events A and B:
P(A union B) = P(A) + P(B) - P(A intersect B)
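The theorem can be checked by direct enumeration. The sketch below assumes a single roll of a fair die and two illustrative events chosen here (they are not from the question paper).

```python
from fractions import Fraction

sample_space = set(range(1, 7))   # one roll of a fair die (assumed example)
A = {2, 4, 6}                     # event: an even number
B = {4, 5, 6}                     # event: a number greater than 3

def prob(event):
    """Classical probability: favourable outcomes / total outcomes."""
    return Fraction(len(event), len(sample_space))

lhs = prob(A | B)                       # P(A union B)
rhs = prob(A) + prob(B) - prob(A & B)   # P(A) + P(B) - P(A intersect B)
print(lhs, rhs)  # 2/3 2/3
```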
14. If A and B are independent, P(A intersect B) = 0.25, P(A) = 0.5, find P(B).
2 Marks
For independent events, P(A intersect B) = P(A) * P(B).
0.25 = 0.5 * P(B) => P(B) = 0.25 / 0.5 = 0.5.
15. If A and B are mutually exclusive, what is P(A intersect B)?
2 Marks
By definition, mutually exclusive events cannot occur simultaneously, so P(A intersect B) = 0.
SECTION-B (10 x 5 = 50 Marks)
16. (a) Types of data used in statistics with examples.
5 Marks
Data is primarily categorized into:
- Primary Data: Collected for the first time by the investigator. Example: Census surveys.
- Secondary Data: Data already collected by others. Example: Using data from a WHO report.
- Qualitative Data: Descriptive data. Example: Blood groups (A, B, AB, O).
- Quantitative Data: Numerical data. Example: Weight of students.
16. (b) Note on graphical representation of statistical data.
5 Marks
Graphical representation simplifies complex data and makes comparisons easier. Key types include:
- Bar Charts: Used for qualitative or discrete data.
- Histograms: Used for continuous frequency distributions.
- Pie Charts: Show the proportions of a whole.
- Ogive: Represents cumulative frequencies.
17. (a) Represent the given data with a histogram.
5 Marks
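The data provided is in inclusive form (e.g., 15-19, 20-24). First, convert to exclusive class boundaries by subtracting 0.5 from each lower limit and adding 0.5 to each upper limit: 14.5-19.5, 19.5-24.5, etc. Then draw adjacent rectangles with the class boundaries on the x-axis and heights equal to the frequencies (all class widths are equal).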
| Class Boundaries | Frequency |
| 14.5 - 19.5 | 9 |
| 19.5 - 24.5 | 11 |
| 24.5 - 29.5 | 10 |
| 29.5 - 34.5 | 44 |
| 34.5 - 39.5 | 45 |
| 39.5 - 44.5 | 54 |
| 44.5 - 49.5 | 37 |
| 49.5 - 54.5 | 26 |
| 54.5 - 59.5 | 8 |
| 59.5 - 64.5 | 5 |
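The conversion from inclusive limits to class boundaries can be sketched as follows. The inclusive limits 15-19, ..., 60-64 are inferred from the boundaries in the table above, and the printed text layout is only illustrative.

```python
# Inclusive class limits 15-19, 20-24, ..., 60-64 (inferred from the table).
limits = [(lo, lo + 4) for lo in range(15, 65, 5)]
frequencies = [9, 11, 10, 44, 45, 54, 37, 26, 8, 5]

# Half the gap between one upper limit and the next lower limit (here 0.5)
# is subtracted from lower limits and added to upper limits.
gap = limits[1][0] - limits[0][1]
boundaries = [(lo - gap / 2, hi + gap / 2) for lo, hi in limits]

for (lo, hi), f in zip(boundaries, frequencies):
    print(f"{lo} - {hi} | {f}")
```

Each printed pair gives the base of one histogram rectangle, and the frequency gives its height.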
17. (b) 'Less than' and 'More than' cumulative frequency curves.
5 Marks
Less than Ogive: Points are plotted as (upper class boundary, less-than cumulative frequency). It is a rising curve.
More than Ogive: Points are plotted as (lower class boundary, more-than cumulative frequency). It is a falling curve.
Construction: 1. Prepare the cumulative frequency table. 2. Plot the points on a graph. 3. Join the points with a smooth curve. The x-coordinate of the point where the two curves intersect gives the Median.
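A minimal sketch of the two cumulative frequency tables, using the Question 17(a) data. The median is located by linear interpolation within the median class, which corresponds to joining the ogive points with straight segments rather than a freehand curve; that choice is an assumption made here.

```python
from itertools import accumulate

boundaries = [14.5 + 5 * i for i in range(11)]   # 14.5, 19.5, ..., 64.5
frequencies = [9, 11, 10, 44, 45, 54, 37, 26, 8, 5]

# Less-than cumulative frequencies, plotted against upper boundaries.
less_than = list(accumulate(frequencies))
total = less_than[-1]

# More-than cumulative frequencies, plotted against lower boundaries.
more_than = [total - c for c in [0] + less_than[:-1]]

# Both ogives reach the height total/2 at the median.
half = total / 2
k = next(i for i, c in enumerate(less_than) if c >= half)   # median class index
cf_before = less_than[k - 1] if k > 0 else 0
width = boundaries[k + 1] - boundaries[k]
median = boundaries[k] + (half - cf_before) / frequencies[k] * width
print(round(median, 2))  # 40.01
```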
18. (a) Measures of central tendency and requisites of a good measure.
5 Marks
Measures: Arithmetic Mean (AM), Median, Mode, Geometric Mean, Harmonic Mean.
Requisites: It should be rigidly defined, based on all observations, easy to calculate, and stable under sampling fluctuations.
Mathematical treatment: Arithmetic Mean is the most amenable to further mathematical treatment.
18. (b) Relationship between raw and central moments.
5 Marks
Central moments (mu_r) in terms of raw moments (mu'_r) about the origin:
- mu_2 = mu'_2 - (mu'_1)^2
- mu_3 = mu'_3 - 3*mu'_2*mu'_1 + 2*(mu'_1)^3
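The two relations can be verified numerically. The sketch below uses a small made-up data set and compares the central moments obtained from raw moments with those computed directly from deviations about the mean. The helper names are hypothetical.

```python
data = [2, 4, 4, 4, 5, 5, 7, 9]   # illustrative data, not from the paper
n = len(data)

def raw_moment(r):
    """mu'_r: mean of x^r (moment about the origin)."""
    return sum(x ** r for x in data) / n

def central_moment(r):
    """mu_r: mean of (x - mean)^r, computed directly."""
    mean = raw_moment(1)
    return sum((x - mean) ** r for x in data) / n

m1, m2, m3 = raw_moment(1), raw_moment(2), raw_moment(3)

# Central moments expressed through raw moments, as in the relations above.
mu2 = m2 - m1 ** 2
mu3 = m3 - 3 * m2 * m1 + 2 * m1 ** 3

print(mu2, central_moment(2))  # 4.0 4.0
print(mu3, central_moment(3))  # 5.25 5.25
```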
19. (a) Definitions of Range, QD, MD, SD, and Variance.
5 Marks
- Range: Difference between the maximum and minimum values.
- Quartile Deviation (QD): (Q3 - Q1) / 2.
- Mean Deviation (MD): Average of absolute deviations from a central value.
- Standard Deviation (SD): Root mean square deviation from the arithmetic mean.
- Variance: Square of the Standard Deviation.
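The five measures can be computed for a small made-up data set as a check on the definitions. Quartiles depend on the convention used; this sketch relies on the default method of Python's statistics.quantiles, which is one of several textbook conventions, and takes the mean deviation about the arithmetic mean.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]   # illustrative data, not from the paper

data_range = max(data) - min(data)

# Quartiles under the default ("exclusive") convention of statistics.quantiles.
q1, _, q3 = statistics.quantiles(data, n=4)
qd = (q3 - q1) / 2

# Mean deviation, taken here about the arithmetic mean.
mean = statistics.fmean(data)
md = sum(abs(x - mean) for x in data) / len(data)

# Population (divide-by-n) standard deviation and variance.
sd = statistics.pstdev(data)
variance = statistics.pvariance(data)

print(data_range, qd, md, sd, variance)
```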
19. (b) Define Skewness and Kurtosis with formulas.
5 Marks
Skewness: Measure of asymmetry. Formula (Karl Pearson): (Mean - Mode) / SD.
Kurtosis: Measure of "peakedness" or flatness of a distribution. Formula: beta_2 = mu_4 / (mu_2)^2.
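Both formulas can be evaluated on a small made-up data set. This is a sketch only: the data are illustrative, the standard deviation uses the divide-by-n form, and statistics.mode is used because the data have a single most frequent value.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]   # illustrative data, not from the paper
n = len(data)
mean = sum(data) / n

def central_moment(r):
    """mu_r: mean of (x - mean)^r."""
    return sum((x - mean) ** r for x in data) / n

sd = central_moment(2) ** 0.5

# Karl Pearson's coefficient of skewness: (Mean - Mode) / SD.
pearson_skewness = (mean - statistics.mode(data)) / sd

# Kurtosis coefficient: beta_2 = mu_4 / (mu_2)^2.
beta_2 = central_moment(4) / central_moment(2) ** 2

print(pearson_skewness, beta_2)  # 0.5 2.78125
```

A positive skewness value indicates a longer right tail for these data.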
20. (a) Scatter Diagram and Correlation.
5 Marks
Scatter Diagram: A graph plotting pairs of (x, y) values as points.
Correlation Idea: If the points cluster along a rising line, the correlation is positive. If they cluster along a falling line, it is negative. If the points are scattered with no pattern, the correlation is approximately zero.
20. (b) Correlation Coefficients.
5 Marks
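Karl Pearson's r: Measure of the linear relationship between two variables.
Spearman's rank correlation (rho): Measure of the relationship between two sets of ranks.
Formula: rho = 1 - [6 * Sum(d^2) / (n(n^2 - 1))], where d = difference between the two ranks of an item and n = number of pairs.
Range of both coefficients: [-1, +1].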
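Both coefficients can be computed from their definitions. The data are made up, and the ranks helper is a hypothetical function that assumes no tied values (ties need averaged ranks and a corrected formula). Pearson's r is written in its standard product-moment form, which the answer above does not spell out.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Product-moment correlation: Sxy / sqrt(Sxx * Syy)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def ranks(values):
    """Rank 1 = smallest value; assumes there are no ties."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]

def spearman_rho(xs, ys):
    """1 - 6 * Sum(d^2) / (n * (n^2 - 1)) with d = rank difference."""
    n = len(xs)
    d2 = sum((rx - ry) ** 2 for rx, ry in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

x = [1, 2, 3, 4, 5]          # illustrative data, not from the paper
y = [10, 30, 20, 50, 40]
r = pearson_r(x, y)
rho = spearman_rho(x, y)
print(round(r, 4), round(rho, 4))  # 0.8 0.8
```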
21. (a) Regression lines of X on Y and Y on X.
5 Marks
Y on X: (y - y_bar) = bYX * (x - x_bar).
X on Y: (x - x_bar) = bXY * (y - y_bar).
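A short sketch of both lines for a made-up data set. It uses the standard expressions bYX = Sxy / Sxx and bXY = Sxy / Syy (with S denoting sums of products of deviations from the means), which are not stated in the answer above, and confirms that bXY * bYX = r^2.

```python
x = [1, 2, 3, 4, 5]   # illustrative data, not from the paper
y = [2, 4, 5, 4, 5]
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n

sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
sxx = sum((a - x_bar) ** 2 for a in x)
syy = sum((b - y_bar) ** 2 for b in y)

b_yx = sxy / sxx                      # regression coefficient of Y on X
b_xy = sxy / syy                      # regression coefficient of X on Y
r_squared = sxy ** 2 / (sxx * syy)    # square of the correlation coefficient

def y_on_x(x_value):
    """Estimate of y from the line (y - y_bar) = bYX * (x - x_bar)."""
    return y_bar + b_yx * (x_value - x_bar)

def x_on_y(y_value):
    """Estimate of x from the line (x - x_bar) = bXY * (y - y_bar)."""
    return x_bar + b_xy * (y_value - y_bar)

print(b_yx, b_xy, r_squared)
```

Both lines pass through the point (x_bar, y_bar), so they intersect at the means.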
21. (b) Principle of Least Squares for fitting y = a + bx.
5 Marks
Minimize the sum of squared residuals: S = Sum[(yi - a - b*xi)^2]. To find 'a' and 'b', differentiate S partially with respect to 'a' and 'b' and equate each derivative to zero.
This leads to the Normal Equations:
- Sum(y) = na + b*Sum(x)
- Sum(xy) = a*Sum(x) + b*Sum(x^2)
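Solving the two normal equations simultaneously gives closed-form values of a and b. The sketch below does this for a made-up data set and then checks that both normal equations hold; fit_line is a hypothetical helper name.

```python
def fit_line(xs, ys):
    """Solve the normal equations for y = a + b*x."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    # Eliminating a from the two normal equations gives b, then a follows.
    b = (n * sxy - sx * sy) / (n * sxx - sx ** 2)
    a = (sy - b * sx) / n
    return a, b

x = [1, 2, 3, 4, 5]   # illustrative data, not from the paper
y = [2, 4, 5, 4, 5]
a, b = fit_line(x, y)
print(round(a, 4), round(b, 4))  # 2.2 0.6
```

The formula requires at least two distinct x values; otherwise the denominator is zero.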
22. (a) Probability Definitions.
5 Marks
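- Exhaustive: Events that together cover all possible outcomes of the experiment.
- Mutually Exclusive: Events that cannot occur together in the same trial.
- Equally Likely: Outcomes none of which is expected to occur in preference to the others.
- Independent: Events for which the occurrence of one does not affect the probability of the other.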
22. (b) Set theoretic expressions for events.
5 Marks
- Only A: A intersect B' intersect C'
- All three: A intersect B intersect C
- At least two: (A intersect B) union (B intersect C) union (A intersect C)
- A and B but not C: A intersect B intersect C'
- None: A' intersect B' intersect C'
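These expressions map directly onto Python set operations. The sample space and the three events below are assumptions chosen only for illustration.

```python
S = set(range(1, 13))             # assumed sample space: the integers 1 to 12
A = {2, 4, 6, 8, 10, 12}          # multiples of 2
B = {3, 6, 9, 12}                 # multiples of 3
C = {4, 8, 12}                    # multiples of 4

def complement(event):
    """E' = outcomes of S that are not in E."""
    return S - event

only_a = A & complement(B) & complement(C)
all_three = A & B & C
at_least_two = (A & B) | (B & C) | (A & C)
a_and_b_not_c = A & B & complement(C)
none = complement(A) & complement(B) & complement(C)

print(sorted(only_a), sorted(all_three), sorted(at_least_two))
print(sorted(a_and_b_not_c), sorted(none))
```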
25. (a) Bayes' Theorem and Manager Appointment Problem.
5 Marks
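Bayes' Theorem: Gives the posterior probability of a cause given that an event has been observed. If A1, A2, ..., An are mutually exclusive and exhaustive events with P(Ai) > 0, and B is any event with P(B) > 0, then P(Ai|B) = P(Ai)*P(B|Ai) / Sum[P(Aj)*P(B|Aj)].
Let X, Y, Z denote the events that X, Y, Z respectively is appointed manager.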
P(X)=4/9, P(Y)=2/9, P(Z)=3/9. Let B = Bonus Scheme.
P(B|X)=3/10, P(B|Y)=5/10, P(B|Z)=8/10.
P(B) = (4/9 * 3/10) + (2/9 * 5/10) + (3/9 * 8/10) = (12+10+24)/90 = 46/90.
P(X|B) = [P(X)*P(B|X)] / P(B) = (12/90) / (46/90) = 12/46 = 6/23.
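The arithmetic can be checked with exact fractions. The sketch below uses the priors and conditional probabilities stated above; the dictionary names are illustrative.

```python
from fractions import Fraction as F

priors = {"X": F(4, 9), "Y": F(2, 9), "Z": F(3, 9)}           # P(X), P(Y), P(Z)
likelihoods = {"X": F(3, 10), "Y": F(5, 10), "Z": F(8, 10)}   # P(B | manager)

# Total probability of the bonus scheme being introduced.
p_bonus = sum(priors[m] * likelihoods[m] for m in priors)

# Posterior probability of each appointment given the bonus scheme.
posteriors = {m: priors[m] * likelihoods[m] / p_bonus for m in priors}

print(p_bonus)          # 23/45
print(posteriors["X"])  # 6/23
```

The three posteriors (6/23, 5/23, 12/23) add up to 1, which is a quick check on the calculation.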