Unit 2: Sampling
1. Sampling: Meaning
In survey research, it is usually impossible to talk to everyone in a large group (the **"Population"**). For example, you cannot interview all 1.4 billion people in India.
Definition: Sampling is the scientific process of selecting a small, representative subset of a population (the **"Sample"**) to be studied. The findings from the sample are then used to make generalizations about the entire population.
- Population (or Universe): The entire group you want to study (e.g., "All voters in India").
- Sample: The small group you actually survey (e.g., "5,000 voters selected from across India").
- The Goal: To select a sample that is a "miniature version" of the population, so your findings are accurate and unbiased.
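The idea that a sample can stand in for the population can be illustrated with a small simulation. This is only a sketch: the population of 100,000 voters, the 45% support figure, and the variable names are made up for illustration.

```python
import random

rng = random.Random(7)  # fixed seed so the run is repeatable

# Hypothetical population: 100,000 voters, 45% of whom support Party A (coded as 1).
population = [1] * 45_000 + [0] * 55_000

# Draw a sample of 5,000 voters at random, without replacement.
sample = rng.sample(population, 5_000)

population_share = sum(population) / len(population)
sample_share = sum(sample) / len(sample)

print(f"Population support: {population_share:.1%}")
print(f"Sample support:     {sample_share:.1%}")
```

The two percentages should be close but rarely identical; the small gap between them is what Section 5 calls sampling error.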
2. Types of Sampling
There are two main branches of sampling methods: **Probability Sampling** (which relies on random selection) and **Non-Probability Sampling** (which relies on convenience or judgment).
3. Probability (Random) Sampling
In this method, every member of the population has a known, non-zero chance of being selected. This is the "gold standard" for scientific surveys.
Types of Probability Sampling:
- Simple Random Sampling (SRS):
- The most basic method. Each person has an equal chance of being picked.
- Method: Like pulling names out of a hat or using a random number generator.
- Pro: Very simple. Con: Requires a complete list of the population and can be inefficient; you might miss small groups by chance.
- Systematic Sampling:
- Picking every "nth" person from a list.
- Method: Get a list of the population (e.g., 10,000 people). You need a sample of 100, so the sampling interval is 10,000/100 = 100. Pick a random starting point between 1 and 100 (e.g., #27) and then select every 100th person (#27, #127, #227...).
- Pro: Easier than SRS. Con: Risk of bias if the list has a hidden pattern that repeats at the same interval as the sampling.
- Stratified Sampling:
- Used when the population has distinct sub-groups (**"strata"**) that you want to represent accurately.
- Method: First, divide the population into groups (e.g., by gender, by state, by age). Then, do Simple Random Sampling *within* each group.
- Example: To survey a college, you first divide by department (Arts, Science, Commerce) and then randomly select 50 students *from each* department. This ensures all departments are represented.
- Pro: Highly representative. Con: Complex and requires prior knowledge of the population.
- Cluster Sampling:
- Used when the population is geographically widespread.
- Method: Divide the population into "clusters" (e.g., districts, city blocks). Randomly select a few *clusters*. Then, survey *everyone* within those selected clusters.
- Example: To survey a city, randomly pick 10 city blocks (clusters) and then interview every household in those 10 blocks.
- Pro: Saves time and money. Con: Usually less precise than stratified sampling, because people within the same cluster tend to resemble one another.
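The four probability methods above can be sketched in a few lines of Python using the standard `random` module. This is a minimal illustration, not a survey-grade implementation: the function names, the list of 10,000 people, the department sizes, and the 200 city blocks are all hypothetical.

```python
import random

def simple_random_sample(population, n, rng):
    """SRS: every unit has the same chance of being picked."""
    return rng.sample(population, n)

def systematic_sample(population, n, rng):
    """Random start inside the first interval, then every k-th unit."""
    k = len(population) // n       # sampling interval, e.g. 10,000 // 100 = 100
    start = rng.randrange(k)       # 0-based random start in [0, k)
    return [population[start + i * k] for i in range(n)]

def stratified_sample(strata, n_per_stratum, rng):
    """Simple random sample of a fixed size *within* each stratum."""
    return {name: rng.sample(units, n_per_stratum)
            for name, units in strata.items()}

def cluster_sample(clusters, n_clusters, rng):
    """Randomly pick whole clusters, then keep everyone inside them."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [unit for name in chosen for unit in clusters[name]]

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical list of 10,000 people, numbered 1 to 10,000.
people = list(range(1, 10_001))
srs = simple_random_sample(people, 100, rng)
every_100th = systematic_sample(people, 100, rng)

# Hypothetical college with three departments of different sizes.
departments = {
    "Arts": [f"arts-{i}" for i in range(400)],
    "Science": [f"science-{i}" for i in range(300)],
    "Commerce": [f"commerce-{i}" for i in range(250)],
}
by_department = stratified_sample(departments, 50, rng)

# Hypothetical city of 200 blocks with 20 households each.
blocks = {f"block-{b}": [f"block-{b}/house-{h}" for h in range(20)]
          for b in range(200)}
households = cluster_sample(blocks, 10, rng)

print(every_100th[:3])                                    # three units spaced 100 apart
print({d: len(s) for d, s in by_department.items()})      # 50 students per department
print(len(households))                                    # 10 blocks x 20 households
```

Note how the stratified version guarantees 50 students from every department, while the cluster version surveys only 10 of the 200 blocks but takes every household in them.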
4. Non-Probability Sampling
In this method, selection is not random. It is based on the researcher's convenience, judgment, or a quota. It's cheaper and faster, but the results cannot be reliably generalized to the population because each person's chance of selection is unknown.
Types of Non-Probability Sampling:
- Convenience Sampling:
- Selecting whoever is easiest to reach.
- Example: A TV reporter interviewing people at a mall.
- Pro: Fast, cheap. Con: Highly biased, not representative.
- Quota Sampling:
- The non-probability version of stratified sampling.
- Method: The researcher has a "quota" to fill (e.g., "I need 20 men and 30 women"). They then find these people using convenience methods.
- Example: An interviewer stands on a street and stops people until they have interviewed 20 men and 30 women.
- Pro: Ensures key groups are included. Con: Still biased, as the selection is not random.
- Purposive (or Judgmental) Sampling:
- The researcher uses their "judgment" to select people who are "experts" or typical of a group.
- Example: If studying a village, you "purposively" select the village headman and the schoolteacher to get an expert view.
- Snowball Sampling:
- Used for hard-to-reach populations (e.g., homeless people, drug users).
- Method: You find one or two people and then ask them to "refer" you to their friends, who then refer you to *their* friends. The sample "snowballs."
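Two of the non-probability methods above, quota and snowball sampling, follow simple procedures that can be sketched in Python. This is an illustrative sketch only: the function names, the stream of passers-by, and the referral network are invented for the example.

```python
from collections import deque

def quota_sample(passersby, quotas):
    """Take people in the order they walk by until every quota is full."""
    remaining = dict(quotas)
    sample = []
    for person in passersby:
        group = person["gender"]
        if remaining.get(group, 0) > 0:
            sample.append(person)
            remaining[group] -= 1
        if not any(remaining.values()):
            break  # all quotas filled, stop interviewing
    return sample

def snowball_sample(referrals, seeds, max_size):
    """Start from a few seed respondents and follow their referrals outward."""
    sample, queue, seen = [], deque(seeds), set(seeds)
    while queue and len(sample) < max_size:
        person = queue.popleft()
        sample.append(person)
        for friend in referrals.get(person, []):
            if friend not in seen:
                seen.add(friend)
                queue.append(friend)
    return sample

# Hypothetical street: 200 passers-by, roughly one in three is a woman.
street = [{"id": i, "gender": "woman" if i % 3 == 0 else "man"}
          for i in range(200)]
interviewed = quota_sample(street, {"man": 20, "woman": 30})

# Hypothetical referral network: each person names the friends they can introduce.
referrals = {"A": ["B", "C"], "B": ["D"], "C": ["D", "E"], "D": [], "E": ["F"]}
chain = snowball_sample(referrals, seeds=["A"], max_size=5)

print(len(interviewed))  # 50 people: 20 men and 30 women
print(chain)             # ['A', 'B', 'C', 'D', 'E']
```

Neither function involves a random draw: who ends up in the sample depends on who happens to walk by or who knows whom, which is why these methods are non-probability.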
5. Sampling Error
Definition: Sampling error is the unavoidable difference between a sample's result and the true population value that arises from surveying a sample instead of the entire population. A poll's "margin of error" is a calculated estimate of how large this error is likely to be.
- What it means: A poll might report "45% support for Party A, with a margin of error of 3 percentage points." This means the *real* support in the population is likely (by convention, with 95% confidence) between 42% (45-3) and 48% (45+3).
- Cause: It's the natural result of "chance." By pure luck, your random sample might have slightly more men, or slightly more old people, than the population.
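- Key Fact: Sampling error can be *reduced* by increasing the **sample size**. A larger sample (e.g., 20,000) will have a smaller error than a small sample (e.g., 500), but with diminishing returns: the error shrinks roughly with the square root of the sample size, so halving it requires about four times as many respondents.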
- Note: This is different from "Non-Sampling Error" (e.g., bad questions, biased interviewers), which is not covered by the margin of error, is hard to measure, and can make a poll completely wrong however large the sample.
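The link between sample size and margin of error can be made concrete with the standard textbook approximation for a proportion, z × √(p(1 − p)/n). The sketch below assumes a simple random sample and a 95% confidence level (z = 1.96); the function name and the sample sizes are chosen only for illustration.

```python
import math

def margin_of_error(p, n, z=1.96):
    """Approximate margin of error for a proportion p from a simple random sample of size n."""
    return z * math.sqrt(p * (1 - p) / n)

# 45% support for Party A, measured with samples of different sizes.
for n in (500, 2_000, 20_000):
    moe = margin_of_error(0.45, n)
    print(f"n = {n:>6}: 45% +/- {100 * moe:.1f} points")

# n =    500: 45% +/- 4.4 points
# n =   2000: 45% +/- 2.2 points
# n =  20000: 45% +/- 0.7 points
```

Quadrupling the sample from 500 to 2,000 only halves the margin of error, which is why most national polls settle for a few thousand respondents rather than tens of thousands.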
6. Exam Corner: Key Distinctions
Common Exam Questions:
- "What is sampling? Explain the key differences between probability and non-probability sampling."
- "Write short notes on: (a) Stratified Sampling (b) Quota Sampling."
- "What is 'Sampling Error'? How can it be reduced?"
How to Answer:
- For the "Difference" question: The single most important difference is RANDOMNESS.
- Probability = Random. Every unit has a *known*, non-zero chance. Allows for scientific generalization.
- Non-Probability = Not Random. Chance of selection is *unknown*. Does not allow for scientific generalization.
- Stratified vs. Quota: This is a classic comparison.
- Both divide the population into groups.
- Stratified selects *randomly* from within the groups (Probability).
- Quota selects *conveniently* from within the groups (Non-Probability).