FYUG Even Semester Exam, 2025
STATISTICS (2nd Semester)
Course No.: STASEC-151
UNIT-I
Question 1 [1 x 3 = 3]
- (a) Who developed R-programming? R was developed by Ross Ihaka and Robert Gentleman at the University of Auckland, New Zealand.
- (b) Feature of R: R is an open-source language with extensive packages for statistical modeling and high-quality graphics.
- (c) Note on CRAN: The Comprehensive R Archive Network (CRAN) is a network of FTP and web servers around the world that store identical, up-to-date versions of code and documentation for R.
- (d) Remove duplicates: The unique() function is used to remove duplicate observations.
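As a small illustration of part (d), the sketch below applies unique() to a vector containing repeated values. The vector name and its values are hypothetical and chosen only for the example; duplicated() is shown as a related base R function.

```r
# Hypothetical data with repeated observations (illustration only)
obs <- c(2, 4, 4, 7, 2)

# unique() keeps the first occurrence of each value, in original order
obs_unique <- unique(obs)
print(obs_unique)

# duplicated() flags the repeats, so !duplicated() gives the same result
obs_unique2 <- obs[!duplicated(obs)]
print(obs_unique2)
```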
Question 2(a) [2]
R-code for min and max of: 2, 4, 7, 9, 5, 1, 15, 20
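data <- c(2, 4, 7, 9, 5, 1, 15, 20)  # store the observations in a vector
min_val <- min(data)  # smallest value: 1
max_val <- max(data)  # largest value: 20
print(c(min_val, max_val))  # display both results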
Question 3(b) [5]
Define R. Mention three advantages and disadvantages.
Definition: R is a programming language and free software environment designed specifically for statistical computing and graphics.
Advantages:
- Open-source and free to use.
- Strong graphical capabilities for data visualization.
- Large community support and more than 15,000 packages.
Disadvantages:
- High memory consumption as it stores all objects in RAM.
- Steeper learning curve compared to some other tools.
- Slower execution speed for some complex computations compared to C++ or Python.
UNIT-II
Question 4 [1 x 3 = 3]
- (a) xlim=c(0,20): This sets the limits of the x-axis, so the scale runs from 0 to 20.
- (c) Logical operators: The logical operators are & (AND), | (OR), and ! (NOT).
- (d) xlab and ylab: These arguments provide the labels for the x-axis and y-axis respectively.
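The arguments and operators in Question 4 can be tried together in a short sketch. It reuses the data from Question 2(a); the axis label text and the variable names in_range, outside and not_large are assumptions made for illustration.

```r
# Data reused from Question 2(a)
x <- c(2, 4, 7, 9, 5, 1, 15, 20)

# xlim fixes the x-axis scale; xlab and ylab label the two axes
plot(x, seq_along(x), xlim = c(0, 20),
     xlab = "Observed value", ylab = "Position in vector")

# Element-wise logical operators
in_range  <- x > 3 & x < 10    # AND: both conditions must hold
outside   <- x < 2 | x > 10    # OR: at least one condition holds
not_large <- !(x > 5)          # NOT: reverses TRUE and FALSE

print(x[in_range])
print(x[outside])
print(x[not_large])
```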
Question 6(b) [5]
How to draw various diagrams in R:
- (i) Bar diagram: Use barplot(height). Labels and colors can be added using names.arg and col.
- (ii) Histogram: Use hist(x) to visualize the distribution of a continuous variable.
- (iii) Pie diagram: Use pie(x) to show proportions of a whole.
- (iv) Line diagram: Use plot(x, y, type="l"), where "l" specifies a line plot.
- (v) Frequency polygon: Connect the class midpoints of a histogram at their frequencies, for example h <- hist(x) followed by lines(h$mids, h$counts). Note that plot(density(x)) draws a smoothed density curve, which only approximates a frequency polygon.
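A minimal sketch of the five diagrams follows. The category counts and the numeric sample are hypothetical values chosen only for illustration, and the frequency polygon is built from the midpoints and counts that hist() returns.

```r
# Hypothetical data (illustration only)
counts <- c(A = 12, B = 7, C = 9)
x <- c(31, 27, 29, 45, 27, 19, 25, 33, 38, 22)

# (i) Bar diagram with labels and a fill color
barplot(counts, names.arg = names(counts), col = "steelblue")

# (ii) Histogram of a continuous variable
hist(x)

# (iii) Pie diagram of proportions
pie(counts)

# (iv) Line diagram: values joined in order of observation
plot(seq_along(x), x, type = "l")

# (v) Frequency polygon: class midpoints joined at their frequencies
h <- hist(x, plot = FALSE)   # compute the classes without drawing
plot(h$mids, h$counts, type = "b",
     xlab = "Class midpoint", ylab = "Frequency")
```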
UNIT-III
Question 7(d) [1]
R-code to compute square root of 29:
sqrt(29)
Question 8(a) [2]
Relationship between mean, median, and mode:
- Symmetrical: Mean = Median = Mode.
- Positively Skewed: Mean > Median > Mode.
- Negatively Skewed: Mean < Median < Mode.
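The ordering for a positively skewed distribution can be checked on a small sample. The data below are hypothetical, and because base R has no built-in function for the statistical mode, the sketch assumes a simple approach that takes the most frequent value from table().

```r
# Hypothetical right-skewed sample (illustration only)
x <- c(1, 2, 2, 2, 3, 3, 4, 5, 9, 14)

mean_val   <- mean(x)
median_val <- median(x)

# Mode taken as the most frequent value; base R has no mode function
freq     <- table(x)
mode_val <- as.numeric(names(freq)[which.max(freq)])

# For this positively skewed sample the three values are in decreasing order
print(c(mean = mean_val, median = median_val, mode = mode_val))
```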
Question 9(b) [5]
R-code for mean, median, variance, and SD of: 31, 27, 29, 45, 27, 19, 25
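x <- c(31, 27, 29, 45, 27, 19, 25)
mean_val <- mean(x)  # arithmetic mean: 29
median_val <- median(x)  # middle value of the sorted data: 27
var_val <- var(x)  # sample variance (divisor n - 1): 64
sd_val <- sd(x)  # sample standard deviation, sqrt(var_val): 8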
UNIT-IV
Question 10 [1 x 3 = 3]
- (a) Package for skewness/kurtosis: The moments package is commonly used.
- (c) Transpose of a matrix: Use the t(matrix_name) function.
- (d) Correlation: Use cor(x, y).
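The functions named above can be tried on small inputs, as in the sketch below. The matrix and the two vectors are hypothetical; the moments calls are left commented out because that package is not part of base R and is assumed to need installing first.

```r
# Hypothetical 2 x 3 matrix (illustration only)
m <- matrix(1:6, nrow = 2)
m_t <- t(m)          # transpose: rows become columns
print(dim(m_t))

# Hypothetical vectors with an exact linear relationship
x <- c(1, 2, 3, 4, 5)
y <- 2 * x + 1
r <- cor(x, y)       # Pearson correlation by default
print(r)

# Skewness and kurtosis with the moments package (assumed installed):
# library(moments)
# skewness(x)
# kurtosis(x)
```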
Question 11(a) [2]
Note on Kurtosis:
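Kurtosis measures the "peakedness" of a distribution, together with the heaviness of its tails, relative to a normal distribution. It is measured by β₂ = μ₄/μ₂².
- Leptokurtic: Sharply peaked with heavy tails (β₂ > 3).
- Mesokurtic: Same kurtosis as the normal distribution (β₂ = 3).
- Platykurtic: Flat-topped with thin tails (β₂ < 3).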
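Kurtosis can also be computed in base R from the central moments, as in the sketch below, which divides the fourth central moment by the square of the second. It reuses the data from Question 9(b); the names m2, m4 and b2 are assumptions for illustration, and the result is read against 3, the value for a normal distribution.

```r
# Data reused from Question 9(b)
x <- c(31, 27, 29, 45, 27, 19, 25)

d  <- x - mean(x)     # deviations from the mean
m2 <- mean(d^2)       # second central moment (divisor n)
m4 <- mean(d^4)       # fourth central moment (divisor n)

b2 <- m4 / m2^2       # moment-based kurtosis coefficient
print(b2)

# Classify relative to the normal-distribution value of 3
shape <- if (b2 > 3) "leptokurtic" else if (b2 < 3) "platykurtic" else "mesokurtic"
print(shape)
```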
Question 12(b) [5]
R-code for Spearman's rank correlation:
X <- c(10, 15, 17, 21, 37, 41, 45)
Y <- c(9, 12, 21, 11, 17, 23, 41)
cor(X, Y, method = "spearman")
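As a check on the result, the same coefficient can be obtained from the rank-difference formula 1 - 6Σd²/(n(n² - 1)), which applies here because neither variable has tied values. The sketch below is one way to do this; the names d, n and rho_formula are introduced only for illustration.

```r
X <- c(10, 15, 17, 21, 37, 41, 45)
Y <- c(9, 12, 21, 11, 17, 23, 41)

# Rank each variable, then take the rank differences
d <- rank(X) - rank(Y)
n <- length(X)

# Rank-difference formula (valid when there are no ties)
rho_formula <- 1 - 6 * sum(d^2) / (n * (n^2 - 1))

# Built-in computation for comparison
rho_builtin <- cor(X, Y, method = "spearman")

print(c(formula = rho_formula, builtin = rho_builtin))
```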
UNIT-V
Question 13 [1 x 3 = 3]
- (a) R²: The coefficient of determination, representing the proportion of variance in the dependent variable that is explained by the independent variable(s).
- (c) Polynomial degree 2: model <- lm(y ~ poly(x, 2)). By default poly() uses orthogonal polynomials; use poly(x, 2, raw = TRUE) or y ~ x + I(x^2) to obtain the coefficients of x and x² directly.
- (d) summary() command: It provides a comprehensive statistical summary of an object, such as the coefficients and p-values of a regression model.
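The three items above can be seen together in a short sketch that fits a second-degree polynomial and reads R² from summary(). The x and y values are hypothetical, made up only to give a roughly quadratic pattern. The raw-polynomial fit is included to show that it gives the same fitted values as the default orthogonal form.

```r
# Hypothetical data with a roughly quadratic trend (illustration only)
x <- 1:10
y <- c(3.8, 6.8, 11.6, 16.6, 23.7, 31.3, 39.4, 48.7, 59.9, 70.8)

# Degree-2 fit using orthogonal polynomials (the poly() default)
model <- lm(y ~ poly(x, 2))

# Same curve with raw powers, so coefficients refer to x and x^2
model_raw <- lm(y ~ poly(x, 2, raw = TRUE))

# summary() reports coefficients, p-values and R-squared
s  <- summary(model)
r2 <- s$r.squared
print(r2)
print(coef(model_raw))
```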
Question 15(a) [5]
Simple Linear Regression Equation and R-code:
Equation: Y = β₀ + β₁X + ε
R-code for provided data:
matches <- 1:10
runs <- c(7, 11, 23, 27, 31, 45, 55, 67, 69, 71)
fit <- lm(runs ~ matches)
summary(fit)
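Once the model is fitted, the estimates of β₀ and β₁ can be extracted and used for prediction. The sketch below repeats the fit so that it runs on its own; predicting for an 11th match is a hypothetical use added only for illustration.

```r
matches <- 1:10
runs <- c(7, 11, 23, 27, 31, 45, 55, 67, 69, 71)
fit <- lm(runs ~ matches)

# Estimated intercept (beta0) and slope (beta1)
b <- coef(fit)
print(b)

# The least-squares slope equals cov(x, y) / var(x)
slope_check <- cov(matches, runs) / var(matches)

# Predicted runs for a hypothetical 11th match
pred <- predict(fit, newdata = data.frame(matches = 11))
print(pred)
```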