Practice Pearson's Correlation Coefficient
Step-by-step explanation, worked examples, and unlimited practice.
📈 Pearson Correlation and Linear Regression
Pearson's correlation coefficient (r) measures the strength and direction of the linear relationship between two interval/ratio variables.
Range of values: −1 ≤ r ≤ +1
📐 Covariance
sₓᵧ = Σ(xᵢ − x̄)(yᵢ − ȳ) / (n−1)
Shortcut formula: \(s_{xy} = \frac{\sum x_i y_i - n\bar{x}\bar{y}}{n-1}\)
sₓᵧ = [Σxᵢyᵢ − (Σxᵢ·Σyᵢ)/n] / (n−1)
Covariance measures the direction of the relationship:
- sₓᵧ > 0 → positive relationship
- sₓᵧ < 0 → negative relationship
- sₓᵧ = 0 → no linear relationship
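As a quick check that the definition and the shortcut formula agree, here is a minimal Python sketch. The data set and the function names are made up for illustration; they are not part of the original lesson.

```python
def covariance_definition(xs, ys):
    """Sample covariance from deviations about the means."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)


def covariance_shortcut(xs, ys):
    """Sample covariance from raw sums (shortcut formula)."""
    n = len(xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    return (sum_xy - sum(xs) * sum(ys) / n) / (n - 1)


# Hypothetical data, chosen only for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

print(covariance_definition(xs, ys))  # 1.5
print(covariance_shortcut(xs, ys))    # 1.5
```

Both results are positive, which matches the upward trend in the data.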
Pearson's Formula for r
r = sₓᵧ / (sₓ · sᵧ)
where sₓ and sᵧ are the standard deviations of X and Y: \(r = \frac{s_{xy}}{s_x \cdot s_y}\)
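The formula can be sketched in a few lines of Python. The data and the name `pearson_r` are illustrative assumptions. The covariance and both standard deviations use the same n−1 denominator.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r as covariance divided by the product of standard deviations."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)
    s_x = math.sqrt(sum((x - x_bar) ** 2 for x in xs) / (n - 1))
    s_y = math.sqrt(sum((y - y_bar) ** 2 for y in ys) / (n - 1))
    return s_xy / (s_x * s_y)


# Hypothetical data, chosen only for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r = pearson_r(xs, ys)
print(round(r, 4))  # 0.7746
```

Here sₓᵧ = 1.5, sₓ² = 2.5 and sᵧ² = 1.5, so r = 1.5 / √3.75 ≈ 0.77, a fairly strong positive linear relationship.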
📏 Regression Line
The regression line is the straight line that best describes the linear relationship between X and Y.
Line equation: ŷ = a + bx
Slope: b = r · (sᵧ / sₓ)
Intercept: a = ȳ − b·x̄
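A small Python sketch of the slope and intercept formulas follows. The helper `fit_line` and the data are hypothetical and serve only to show the arithmetic.

```python
import math


def fit_line(xs, ys):
    """Return (a, b) for y_hat = a + b*x, using b = r * (s_y / s_x)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)
    s_x = math.sqrt(sum((x - x_bar) ** 2 for x in xs) / (n - 1))
    s_y = math.sqrt(sum((y - y_bar) ** 2 for y in ys) / (n - 1))
    r = s_xy / (s_x * s_y)
    b = r * (s_y / s_x)      # slope
    a = y_bar - b * x_bar    # intercept
    return a, b


# Hypothetical data, chosen only for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b = fit_line(xs, ys)
print(f"y_hat = {a:.2f} + {b:.2f}x")  # y_hat = 2.20 + 0.60x
```

Because a = ȳ − b·x̄, the fitted line always passes through the point (x̄, ȳ), here (3, 4).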
📊 Coefficient of Determination (R²)
R² = r²
R² represents the proportion of the variance in Y that is explained by the linear relationship with X.
Example: If r = 0.8, then R² = 0.64
Interpretation: 64% of the variance in Y is explained by X.
The remaining 36% is not explained by X; it reflects other factors and random variation.
🔢 Explained Variance and Residual Variance
Total variance = explained variance + residual variance: \(SS_{total} = SS_{regression} + SS_{residual}\)
sᵧ² = s²ŷ + s²ₑ
| Component | Formula | Meaning |
|---|---|---|
| Explained variance | s²ŷ = r² · sᵧ² | Part explained by X |
| Residual variance | s²ₑ = (1 − r²) · sᵧ² | Part not explained |
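The decomposition can be verified numerically. The sketch below uses a made-up data set and divides every variance by n−1, so that the identity sᵧ² = s²ŷ + s²ₑ holds exactly as written above. The names are illustrative.

```python
def sample_variance(values):
    """Variance with an n-1 denominator."""
    n = len(values)
    m = sum(values) / n
    return sum((v - m) ** 2 for v in values) / (n - 1)


# Hypothetical data, chosen only for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)

# Slope as s_xy / s_x^2, which is algebraically equal to r * (s_y / s_x).
b = s_xy / sample_variance(xs)
a = y_bar - b * x_bar

y_hat = [a + b * x for x in xs]                    # predicted values
residuals = [y - yh for y, yh in zip(ys, y_hat)]   # prediction errors

var_y = sample_variance(ys)
var_y_hat = sample_variance(y_hat)
var_e = sample_variance(residuals)
r_squared = var_y_hat / var_y

print(round(var_y, 4), round(var_y_hat, 4), round(var_e, 4))  # 1.5 0.9 0.6
print(round(r_squared, 4))                                    # 0.6
```

For this data 60% of the variance in Y is explained (0.9 of 1.5) and 40% is residual (0.6 of 1.5).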
⚠️ Important Points
- Correlation ≠ causation: A strong relationship does not prove X causes Y!
- r measures only linear association: A strong non-linear relationship can give r = 0
- Sensitive to outliers: Outliers can change r substantially
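- Normal distribution: The usual significance test for r assumes X and Y are approximately bivariate normal; calculating r itself does not require normality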
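Two of the points above, the linear-only nature of r and its sensitivity to outliers, can be illustrated with a short Python sketch. Both data sets are invented for the demonstration, and `pearson_r` is an illustrative helper.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r as covariance divided by the product of standard deviations."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)
    s_x = math.sqrt(sum((x - x_bar) ** 2 for x in xs) / (n - 1))
    s_y = math.sqrt(sum((y - y_bar) ** 2 for y in ys) / (n - 1))
    return s_xy / (s_x * s_y)


# 1) A perfect but non-linear relationship: y = x^2 on x values symmetric about 0.
xs_curve = [-2, -1, 0, 1, 2]
ys_curve = [x ** 2 for x in xs_curve]
r_curve = pearson_r(xs_curve, ys_curve)

# 2) A perfectly linear data set, then the same data with one extreme point added.
xs_line = [1, 2, 3, 4, 5]
ys_line = [1, 2, 3, 4, 5]
r_clean = pearson_r(xs_line, ys_line)
r_outlier = pearson_r(xs_line + [6], ys_line + [-10])

print(r_curve)    # zero, although y is fully determined by x
print(r_clean)    # essentially 1
print(r_outlier)  # negative: a single outlier reverses the sign
```

Plotting the data before interpreting r is the simplest safeguard against both problems.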
📋 Comparison of Association Measures
| Measure | Variable type | Relationship type | Range |
|---|---|---|---|
| Lambda (λ) | Nominal | General | 0 to 1 |
| Cramér's V | Nominal | General | 0 to 1 |
| Spearman (rₛ) | Ordinal | Monotonic | −1 to +1 |
| Pearson (r) | Interval/ratio | Linear | −1 to +1 |
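To make the "Monotonic" versus "Linear" distinction in the table concrete, the sketch below computes Spearman's rₛ as Pearson's r applied to ranks. It assumes there are no tied values; the data and helper names are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r as covariance divided by the product of standard deviations."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)
    s_x = math.sqrt(sum((x - x_bar) ** 2 for x in xs) / (n - 1))
    s_y = math.sqrt(sum((y - y_bar) ** 2 for y in ys) / (n - 1))
    return s_xy / (s_x * s_y)


def ranks(values):
    """Rank values from 1 (smallest) upward; assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


def spearman_rs(xs, ys):
    """Spearman's rank correlation: Pearson's r computed on the ranks."""
    return pearson_r(ranks(xs), ranks(ys))


# Hypothetical data: y = x^3 is strictly increasing but not linear.
xs = [1, 2, 3, 4, 5]
ys = [x ** 3 for x in xs]

r = pearson_r(xs, ys)
rs = spearman_rs(xs, ys)
print(r < 1)         # True: the relationship is not perfectly linear
print(round(rs, 4))  # 1.0: the relationship is perfectly monotonic
```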