FYUG Even Semester Exam, 2024
STATISTICS (2nd Semester)
Statistical Methods and Probability

Course No.: STADSM-151T
Full Marks: 70 | Time: 3 Hours
Instructions: Answer any ten from Section-A and five from Section-B. (Full paper solved here).

SECTION-A (2 x 10 = 20 Marks)

1. Define 'statistics' and mention some limitations.

2 Marks

Definition: Statistics is the science concerned with collecting, organizing, analyzing, interpreting, and presenting data.

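Limitations:
  • It deals with aggregates of facts, not with individual observations.
  • It applies only to characteristics that can be expressed numerically.
  • Statistical laws hold on average, not exactly in every case.
  • Results can be misused or misinterpreted when taken out of context.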

2. Define qualitative and quantitative data with examples.

2 Marks

Qualitative Data: Data describing attributes or characteristics that cannot be measured numerically. Example: Eye color, Gender.

Quantitative Data: Data that can be expressed numerically and measured. Example: Height (cm), Income (dollars).

3. What are discrete and continuous data? Give examples.

2 Marks

Discrete Data: Data that takes only specific, isolated values (usually integers). Example: Number of children in a family.

Continuous Data: Data that can take any value within a given range. Example: Temperature of a room.

4. Define mode and give its formula for continuous distribution.

2 Marks

Mode: The value that appears most frequently in a data set.

Formula: Mode = l + [(f1 - f0) / (2f1 - f0 - f2)] * h
where l = lower limit of the modal class, f1 = frequency of the modal class, f0 = frequency of the preceding class, f2 = frequency of the succeeding class, h = class width.
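
As a quick check of the formula, the sketch below evaluates it in Python. The function name grouped_mode and the class frequencies are hypothetical, chosen only for illustration.

```python
def grouped_mode(l, f1, f0, f2, h):
    """Mode = l + (f1 - f0) / (2*f1 - f0 - f2) * h for a grouped distribution."""
    denominator = 2 * f1 - f0 - f2
    if denominator == 0:
        raise ValueError("formula undefined: 2*f1 - f0 - f2 is zero")
    return l + (f1 - f0) / denominator * h

# Hypothetical modal class 20-30 with frequency 12; neighbouring classes have 8 and 6.
mode = grouped_mode(l=20, f1=12, f0=8, f2=6, h=10)
print(mode)  # 20 + (4 / 10) * 10 = 24.0
```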

5. What are measures of dispersion? Name them.

2 Marks

Definition: Measures that describe the spread or variability of data around a central value.

Measures: Range, Quartile Deviation, Mean Deviation, and Standard Deviation.

6. What are raw and central moments of the rth order?

2 Marks

Raw Moment (mu'_r): The expected value of the rth power of the variable X about the origin. Formula: mu'_r = E(X^r).

Central Moment (mu_r): The expected value of the rth power of the deviation of X from its mean. Formula: mu_r = E[(X - Mean)^r].

7. Write properties of Karl Pearson's correlation coefficient (r).

2 Marks
  • The value of r lies between -1 and +1 inclusive (-1 <= r <= 1).
  • It is independent of change of origin and scale.
  • It is a pure number (dimensionless).

8. Prove: If bXY > 1 then bYX < 1.

2 Marks

We know that r^2 = bXY * bYX. Since -1 <= r <= 1, r^2 <= 1, so bXY * bYX <= 1. If bXY > 1, then bXY is positive and dividing both sides by it gives bYX <= 1 / bXY < 1. Hence bYX < 1.

9. Mention two properties of regression coefficients.

2 Marks
  • Regression coefficients are independent of change of origin but not scale.
  • The correlation coefficient is the geometric mean of the two regression coefficients: r = ± sqrt(bXY * bYX), taking the common sign of the two coefficients.

10. Define an attribute with an example.

2 Marks

Definition: A qualitative characteristic that cannot be measured quantitatively but can only be categorized. Example: Marital status (Married, Single, Divorced).

11. Define 'trial' and 'event'.

2 Marks

Trial: A performance of a random experiment. Example: Tossing a coin.

Event: A possible outcome or a collection of outcomes of a trial. Example: Getting a 'Head'.

12. Give the axiomatic definition of probability.

2 Marks

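Probability P is a real-valued function defined on the events A of a sample space S, satisfying:

  1. Non-negativity: P(A) >= 0 for every event A.
  2. Certainty: P(S) = 1.
  3. Additivity: For mutually exclusive events A1, A2, ..., P(A1 union A2 union ...) = P(A1) + P(A2) + ...; in particular, P(A union B) = P(A) + P(B) for mutually exclusive A and B.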

13. State the addition theorem of probability.

2 Marks

For any two events A and B: P(A union B) = P(A) + P(B) - P(A intersect B)
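
The theorem can be checked on a small example. The sketch below assumes one roll of a fair die with two hypothetical events (an even number, and a number greater than 3); neither event comes from the question paper.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}   # one roll of a fair die (assumed example)
A = {2, 4, 6}                       # even number
B = {4, 5, 6}                       # number greater than 3

def prob(event):
    """Classical probability: favourable outcomes / total outcomes."""
    return Fraction(len(event), len(sample_space))

lhs = prob(A | B)                        # P(A union B)
rhs = prob(A) + prob(B) - prob(A & B)    # P(A) + P(B) - P(A intersect B)
print(lhs, rhs)
```

Both sides evaluate to 2/3; the subtraction removes the outcomes 4 and 6, which would otherwise be counted twice.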

14. If A and B are independent, P(A intersect B) = 0.25, P(A) = 0.5, find P(B).

2 Marks

For independent events, P(A intersect B) = P(A) * P(B).
0.25 = 0.5 * P(B) => P(B) = 0.25 / 0.5 = 0.5.

15. If A and B are mutually exclusive, what is P(A intersect B)?

2 Marks

By definition, mutually exclusive events cannot occur simultaneously, so P(A intersect B) = 0.

SECTION-B (10 x 5 = 50 Marks)

16. (a) Types of data used in statistics with examples.

5 Marks

Data is primarily categorized into:

  • Primary Data: Collected for the first time by the investigator. Example: Census surveys.
  • Secondary Data: Data already collected by others. Example: Using data from a WHO report.
  • Qualitative Data: Descriptive data. Example: Blood groups (A, B, AB, O).
  • Quantitative Data: Numerical data. Example: Weight of students.

16. (b) Note on graphical representation of statistical data.

5 Marks

Graphical representation simplifies complex data and makes comparisons easier. Key types include:

  • Bar Charts: Used for qualitative or discrete data.
  • Histograms: Used for continuous frequency distributions.
  • Pie Charts: Show the proportions of a whole.
  • Ogives: Represent cumulative frequencies.

17. (a) Represent the given data with a histogram.

5 Marks

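The data provided is in inclusive form (e.g., 15-19, 20-24). First, convert to exclusive class boundaries by subtracting 0.5 from each lower limit and adding 0.5 to each upper limit: 14.5-19.5, 19.5-24.5, etc. Then draw adjacent rectangles with the class boundaries on the x-axis and heights equal to the frequencies (all classes have the same width).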

Class Boundaries    Frequency
14.5 - 19.5         9
19.5 - 24.5         11
24.5 - 29.5         10
29.5 - 34.5         44
34.5 - 39.5         45
39.5 - 44.5         54
44.5 - 49.5         37
49.5 - 54.5         26
54.5 - 59.5         8
59.5 - 64.5         5
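
The conversion from inclusive limits to class boundaries can be sketched in Python as follows. The inclusive limits beyond 20-24 are inferred from the boundaries in the table, and the text bars are only a rough stand-in for a drawn histogram.

```python
# Inclusive class limits 15-19, 20-24, ..., 60-64 (later classes inferred from the table).
limits = [(lo, lo + 4) for lo in range(15, 61, 5)]
frequencies = [9, 11, 10, 44, 45, 54, 37, 26, 8, 5]

# Half the gap between one class's upper limit and the next class's lower limit.
correction = (limits[1][0] - limits[0][1]) / 2   # (20 - 19) / 2 = 0.5

boundaries = [(lo - correction, hi + correction) for lo, hi in limits]

# Text histogram: one '#' per observation, one row per class.
for (lo, hi), f in zip(boundaries, frequencies):
    print(f"{lo:4.1f} - {hi:4.1f} | {'#' * f} ({f})")
```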

17. (b) 'Less than' and 'More than' cumulative frequency curves.

5 Marks

Less than Ogive: Points are plotted as (upper class boundary, less-than cumulative frequency). It is a rising curve.

More than Ogive: Points are plotted as (lower class boundary, more-than cumulative frequency). It is a falling curve.

Construction:

  1. Prepare the cumulative frequency table.
  2. Plot the points on a graph.
  3. Join the points with a smooth curve.

The x-coordinate of the point where the two curves intersect gives the Median.
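
The points for both ogives can be computed as in the sketch below. It reuses the frequencies from Question 17(a) purely as illustrative data (an assumption; the question may refer to a different table) and joins points with straight lines rather than a smooth curve, so the median is a linear interpolation.

```python
boundaries = [14.5 + 5 * i for i in range(11)]       # 14.5, 19.5, ..., 64.5
frequencies = [9, 11, 10, 44, 45, 54, 37, 26, 8, 5]
total = sum(frequencies)

# Less-than ogive: (upper boundary, number of observations below it).
less_than = []
running = 0
for upper, f in zip(boundaries[1:], frequencies):
    running += f
    less_than.append((upper, running))

# More-than ogive: (lower boundary, number of observations at or above it).
more_than = []
remaining = total
for lower, f in zip(boundaries[:-1], frequencies):
    more_than.append((lower, remaining))
    remaining -= f

# With straight-line segments, the curves cross where the less-than curve reaches N/2.
half = total / 2
cum_before = 0
for lower, f in zip(boundaries[:-1], frequencies):
    if cum_before + f >= half:
        median = lower + (half - cum_before) / f * 5   # 5 = class width
        break
    cum_before += f

print(less_than)
print(more_than)
print(round(median, 2))
```

For this data N = 249, the median class is 39.5 - 44.5, and the interpolated median is about 40.01.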

18. (a) Measures of central tendency and requisites of a good measure.

5 Marks

Measures: Arithmetic Mean (AM), Median, Mode, Geometric Mean, Harmonic Mean.

Requisites: It should be rigidly defined, based on all observations, easy to calculate, and stable under sampling fluctuations.

Mathematical treatment: Arithmetic Mean is the most amenable to further mathematical treatment.

18. (b) Relationship between raw and central moments.

5 Marks

Central moments (mu_r) in terms of raw moments (mu'_r) about the origin:

  • mu_1 = 0
  • mu_2 = mu'_2 - (mu'_1)^2
  • mu_3 = mu'_3 - 3*mu'_2*mu'_1 + 2*(mu'_1)^3
  • mu_4 = mu'_4 - 4*mu'_3*mu'_1 + 6*mu'_2*(mu'_1)^2 - 3*(mu'_1)^4
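
A numerical check of the second- and third-order relations, as a minimal sketch on a small hypothetical data set (the values are not from the question paper):

```python
data = [2, 3, 5, 7, 11]   # hypothetical observations
n = len(data)

def raw_moment(r):
    """r-th moment about the origin."""
    return sum(x ** r for x in data) / n

mean = raw_moment(1)

def central_moment(r):
    """r-th moment about the mean."""
    return sum((x - mean) ** r for x in data) / n

m1, m2, m3 = raw_moment(1), raw_moment(2), raw_moment(3)

mu2_from_raw = m2 - m1 ** 2
mu3_from_raw = m3 - 3 * m2 * m1 + 2 * m1 ** 3

print(mu2_from_raw, central_moment(2))
print(mu3_from_raw, central_moment(3))
```

Each printed pair should agree up to floating-point rounding.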

19. (a) Definitions of Range, QD, MD, SD, and Variance.

5 Marks
  • Range: Difference between maximum and minimum value.
  • Quartile Deviation (QD): (Q3 - Q1) / 2.
  • Mean Deviation (MD): Average of absolute deviations from a central value.
  • Standard Deviation (SD): Root mean square deviation from the arithmetic mean.
  • Variance: Square of Standard Deviation.
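
These measures can be computed with Python's standard statistics module, as sketched below on hypothetical data. Quartile conventions differ between textbooks, so the quartile deviation from the library's default method may not match a hand calculation exactly; SD and variance here use the divisor n.

```python
import statistics

data = [4, 8, 15, 16, 23, 42]   # hypothetical observations

data_range = max(data) - min(data)

# Quartiles from the library's default method (other conventions give other values).
q1, _, q3 = statistics.quantiles(data, n=4)
quartile_deviation = (q3 - q1) / 2

# Mean deviation taken about the arithmetic mean.
mean = statistics.fmean(data)
mean_deviation = sum(abs(x - mean) for x in data) / len(data)

variance = statistics.pvariance(data)   # population variance (divisor n)
sd = statistics.pstdev(data)            # population standard deviation

print(data_range, quartile_deviation, mean_deviation, sd, variance)
```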

19. (b) Define Skewness and Kurtosis with formulas.

5 Marks

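Skewness: Measure of the asymmetry of a distribution about its central value. Formula (Karl Pearson's coefficient): Sk = (Mean - Mode) / SD. Sk > 0 indicates positive (right) skew, Sk < 0 indicates negative (left) skew, and Sk = 0 for a symmetric distribution.

Kurtosis: Measure of the "peakedness" or flatness of a distribution relative to the normal curve. Formula: beta_2 = mu_4 / (mu_2)^2. beta_2 = 3 for the normal curve (mesokurtic), beta_2 > 3 is leptokurtic, and beta_2 < 3 is platykurtic.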
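
Both formulas can be evaluated directly, as in this sketch. The data set is hypothetical and chosen to have a single mode and a long right tail.

```python
import statistics

data = [1, 2, 2, 2, 3, 4, 7]   # hypothetical, right-skewed observations
n = len(data)

mean = statistics.fmean(data)
mode = statistics.mode(data)
sd = statistics.pstdev(data)    # divisor n

def central_moment(r):
    """r-th moment about the mean."""
    return sum((x - mean) ** r for x in data) / n

pearson_skewness = (mean - mode) / sd
beta_2 = central_moment(4) / central_moment(2) ** 2

print(round(pearson_skewness, 3), round(beta_2, 3))
```

Here the mean (3) exceeds the mode (2), so the skewness coefficient is positive.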

20. (a) Scatter Diagram and Correlation.

5 Marks

Scatter Diagram: A graph plotting pairs of (x, y) values as points.

Correlation Idea: If the points cluster along a rising line, the correlation is positive. If they cluster along a falling line, it is negative. If the points are scattered with no discernible pattern, there is little or no linear correlation.

20. (b) Correlation Coefficients.

5 Marks

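Karl Pearson's r: Measure of the linear relationship between two variables.
Spearman's rank correlation (rho): Measure of the relationship between ranked data. Formula: rho = 1 - [6 * Sum(d^2) / (n(n^2 - 1))], where d = difference between the two ranks of an observation and n = number of pairs.
Range: both r and rho lie in [-1, +1].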
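
A minimal sketch of both coefficients in plain Python. The paired observations are hypothetical, the helper names pearson and ranks are illustrative, and the ranking helper assumes there are no tied values (the rank formula above needs a correction when ties occur).

```python
import math

x = [1, 2, 3, 4, 5]
y = [10, 8, 25, 20, 60]   # hypothetical paired observations
n = len(x)

def pearson(u, v):
    """Karl Pearson's r from sums of products of deviations."""
    mean_u, mean_v = sum(u) / n, sum(v) / n
    s_uv = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    s_uu = sum((a - mean_u) ** 2 for a in u)
    s_vv = sum((b - mean_v) ** 2 for b in v)
    return s_uv / math.sqrt(s_uu * s_vv)

def ranks(values):
    """Rank 1 = smallest value; assumes no ties."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]

d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
rho = 1 - 6 * d_squared / (n * (n ** 2 - 1))
r = pearson(x, y)

print(round(r, 4), rho)
```

With no ties, rho equals Pearson's r computed on the ranks themselves, which gives a convenient cross-check.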

21. (a) Regression lines of X on Y and Y on X.

5 Marks

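Y on X: (y - y_bar) = bYX * (x - x_bar), where bYX = r * (sigma_y / sigma_x) is the regression coefficient of Y on X.
X on Y: (x - x_bar) = bXY * (y - y_bar), where bXY = r * (sigma_x / sigma_y) is the regression coefficient of X on Y.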
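
The two lines can be computed as in the sketch below, using hypothetical data and the standard identities bYX = Sxy / Sxx and bXY = Sxy / Syy, where Sxy, Sxx, Syy are sums of products of deviations from the means.

```python
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]   # hypothetical paired observations
n = len(x)

x_bar, y_bar = sum(x) / n, sum(y) / n
s_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
s_xx = sum((a - x_bar) ** 2 for a in x)
s_yy = sum((b - y_bar) ** 2 for b in y)

b_yx = s_xy / s_xx                    # regression coefficient of Y on X
b_xy = s_xy / s_yy                    # regression coefficient of X on Y
r_squared = s_xy ** 2 / (s_xx * s_yy)

def y_on_x(x_value):
    """Predicted y from the regression line of Y on X."""
    return y_bar + b_yx * (x_value - x_bar)

def x_on_y(y_value):
    """Predicted x from the regression line of X on Y."""
    return x_bar + b_xy * (y_value - y_bar)

print(b_yx, b_xy, r_squared)
```

Both lines pass through the point (x_bar, y_bar), and the product of the two coefficients equals r^2.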

21. (b) Principle of Least Squares for fitting y = a + bx.

5 Marks

Minimize the sum of squared residuals: S = Sum[(yi - a - b*xi)^2]. To find 'a' and 'b', differentiate S partially with respect to 'a' and 'b' and equate each derivative to zero. This leads to the Normal Equations:

  1. Sum(y) = na + b*Sum(x)
  2. Sum(xy) = a*Sum(x) + b*Sum(x^2)
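
Solving the two normal equations by elimination gives closed forms for a and b. The sketch below applies them to hypothetical data; the variable names are illustrative.

```python
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]   # hypothetical observations
n = len(x)

sum_x, sum_y = sum(x), sum(y)
sum_xy = sum(a * b for a, b in zip(x, y))
sum_x2 = sum(a * a for a in x)

# Eliminate a from the two normal equations to get b, then back-substitute for a.
b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
a = (sum_y - b * sum_x) / n

print(a, b)   # fitted line: y = a + b*x
```

For this data the fitted line is approximately y = 2.2 + 0.6x, and substituting a and b back into either normal equation reproduces its left-hand side.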

22. (a) Probability Definitions.

5 Marks
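  • Exhaustive: Events that together include all possible outcomes of the experiment.
  • Mutually Exclusive: Events that cannot occur at the same time in a single trial.
  • Equally Likely: Outcomes none of which is expected to occur in preference to the others.
  • Independent: Events for which the occurrence of one does not affect the probability of the other.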

22. (b) Set theoretic expressions for events.

5 Marks
  1. Only A: A intersect B' intersect C'
  2. All three: A intersect B intersect C
  3. At least two: (A intersect B) union (B intersect C) union (A intersect C)
  4. A and B but not C: A intersect B intersect C'
  5. None: A' intersect B' intersect C'
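
These expressions map directly onto Python set operations. The sample space and the events A, B, C below are hypothetical, chosen so that every expression is non-empty.

```python
S = set(range(1, 11))   # hypothetical sample space {1, ..., 10}
A = {1, 2, 3, 4}
B = {3, 4, 5, 6}
C = {4, 6, 7, 8}

def complement(event):
    """Outcomes in S that are not in the event."""
    return S - event

only_a = A & complement(B) & complement(C)
all_three = A & B & C
at_least_two = (A & B) | (B & C) | (A & C)
a_and_b_not_c = A & B & complement(C)
none = complement(A) & complement(B) & complement(C)

print(only_a, all_three, at_least_two, a_and_b_not_c, none)
```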

25. (a) Bayes' Theorem and Manager Appointment Problem.

5 Marks

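Bayes' Theorem: If A1, A2, ..., An are mutually exclusive and exhaustive events and B is any event with P(B) > 0, then P(Ai|B) = [P(Ai)*P(B|Ai)] / Sum[P(Aj)*P(B|Aj)]. It gives the posterior probability of Ai once B has been observed.
Let X, Y, Z denote the events that X, Y, Z respectively is appointed manager: P(X)=4/9, P(Y)=2/9, P(Z)=3/9. Let B = the bonus scheme is introduced: P(B|X)=3/10, P(B|Y)=5/10, P(B|Z)=8/10.
P(B) = (4/9 * 3/10) + (2/9 * 5/10) + (3/9 * 8/10) = (12+10+24)/90 = 46/90 = 23/45.
P(X|B) = [P(X)*P(B|X)] / P(B) = (12/90) / (46/90) = 12/46 = 6/23.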
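
The arithmetic can be verified with exact fractions. This sketch uses the probabilities given in the solution; the dictionary names are illustrative.

```python
from fractions import Fraction

# Prior probabilities of each appointment and P(bonus scheme | appointment).
prior = {"X": Fraction(4, 9), "Y": Fraction(2, 9), "Z": Fraction(3, 9)}
likelihood = {"X": Fraction(3, 10), "Y": Fraction(5, 10), "Z": Fraction(8, 10)}

# Total probability of the bonus scheme, then Bayes' theorem for each manager.
p_bonus = sum(prior[m] * likelihood[m] for m in prior)
posterior = {m: prior[m] * likelihood[m] / p_bonus for m in prior}

print(p_bonus, posterior["X"])   # 23/45 6/23
```

The three posterior probabilities (6/23, 5/23, 12/23) sum to 1, as they must.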