Bivariate Analysis: The analysis of two variables (bi-variate) to determine the empirical relationship between them.
It helps us answer questions like:
The main tools for this are Correlation (to measure strength) and Regression (to model the relationship).
The Pearson correlation coefficient (r) measures the strength and direction of a *linear* relationship between two quantitative variables. It ranges from -1 to +1.
Assume X-data (e.g., 'Study Hours') is in A2:A51 and Y-data (e.g., 'Exam Score') is in B2:B51.
A1:B51, including headers).This will produce a correlation matrix, which is useful if you have many variables.
This finds the best-fitting straight line (y = a + bx) to describe the relationship. This is the most powerful tool in the Data Analysis ToolPak.
B1:B51).A1:A51).Excel will produce a detailed "Summary Output" table.
The "Summary Output" table from the Regression tool is the most important part of the analysis.
| R Square | This is your R² value (see Unit 4). Interpretation: "85% of the variation in Sales is explained by Advertising." |
| Adjusted R Square | A modified R² used in *multiple* regression (with many X variables). For simple regression, just use R Square. |
This table tests the overall significance of the model.
| Significance F | This is the p-value for the whole model. If Significance F < 0.05, it means your model is statistically significant (i.e., it's not useless). |
This table gives you the equation of your line.
| Coefficients | P-value | |
|---|---|---|
| Intercept | 1000 | 0.002 |
| X Variable (e.g., 'Advertising') | 50 | 0.0001 |
Sales = 1000 + 50 * AdvertisingPivot Table: An interactive tool to quickly summarize, "pivot," and analyze large amounts of data. It lets you slice and dice data without using complex formulas.
Sum of Sales. You can click it to change to Average, Count, etc.The table automatically updates as you drag and drop fields, allowing for rapid, interactive analysis.
VLOOKUP (Vertical Lookup): An Excel function that searches for a value in the first column of a table and returns a corresponding value from a different column in the same row.
It is the "join" of Excel, used to combine data from two different tables that share a common key.
Scenario:
- Table 1 (on Sheet1): StudentID (A), SaleAmount (B)
- Table 2 (on Sheet2): StudentID (A), StudentName (B), Region (C)
Goal: Add the 'StudentName' and 'Region' to Table 1.
A2 in Sheet1, which contains the StudentID).Sheet2!A:C). Important: Use dollar signs () to "lock" the range so it doesn't move when you copy the formula down.StudentName is column 2 and Region is column 3).In cell C2 of Sheet1 (to get the name):
In cell D2 of Sheet1 (to get the region):
You can then drag these formulas down for all rows.