Practice Pearson's Correlation Coefficient

Step-by-step explanation, worked examples, and unlimited practice.

📈 Pearson Correlation and Linear Regression

Pearson's correlation coefficient (r) measures the strength and direction of the linear relationship between two interval/ratio variables.

Range of values: −1 ≤ r ≤ +1

  • Positive linear: r ≈ +1
  • Negative linear: r ≈ −1
  • No linear relationship: r ≈ 0
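
As a quick illustration of the three cases, the sketch below computes r for three small made-up data sets. The helper name `pearson_r` and all data values are assumptions chosen for illustration; the calculation follows the covariance and standard deviation formulas given in the sections that follow.

```python
import math

def pearson_r(xs, ys):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sums of cross-products and squared deviations (the n-1 factors cancel in r).
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

xs = [1, 2, 3, 4, 5]                      # illustrative data only
r_pos = pearson_r(xs, [3, 5, 7, 9, 11])   # y = 2x + 1, a perfectly increasing line
r_neg = pearson_r(xs, [9, 7, 5, 3, 1])    # y = 11 - 2x, a perfectly decreasing line
r_zero = pearson_r(xs, [3, 1, 4, 1, 3])   # zig-zag values with no linear trend

print(round(r_pos, 3), round(r_neg, 3), round(r_zero, 3))
```

Real data rarely reach exactly +1 or −1; values close to the ends of the range indicate a strong linear pattern.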

📐 Covariance

sₓᵧ = Σ(xᵢ − x̄)(yᵢ − ȳ) / (n−1)

Shortcut formula: \(s_{xy} = \frac{\sum x_i y_i - n\bar{x}\bar{y}}{n-1}\)
sₓᵧ = [Σxᵢyᵢ − (Σxᵢ·Σyᵢ)/n] / (n−1)
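
A minimal sketch comparing the definition with the shortcut formula. The function names and the five data points are assumptions made up for illustration; both functions should return the same value for any data set with n ≥ 2.

```python
def covariance_definition(xs, ys):
    """Sample covariance from deviations about the means."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)

def covariance_shortcut(xs, ys):
    """Sample covariance from raw sums (the shortcut formula)."""
    n = len(xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    return (sum_xy - sum(xs) * sum(ys) / n) / (n - 1)

xs = [1, 2, 3, 4, 5]   # illustrative data only
ys = [2, 4, 5, 4, 5]

print(covariance_definition(xs, ys))  # 1.5
print(covariance_shortcut(xs, ys))    # 1.5
```

The shortcut avoids computing each deviation, which is convenient by hand. With very large values it can lose precision in floating-point arithmetic, so the deviation form is usually the safer choice in code.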

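The sign of the covariance indicates the direction of the linear relationship (its size depends on the units of X and Y, so on its own it does not measure strength):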

  • sₓᵧ > 0 → positive relationship
  • sₓᵧ < 0 → negative relationship
  • sₓᵧ = 0 → no linear relationship

Pearson's Formula for r

r = sₓᵧ / (sₓ · sᵧ)

where sₓ and sᵧ are the sample standard deviations of X and Y, computed with the same n−1 denominator as the covariance.
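
The formula can be sketched in a few lines of Python. The function name `pearson_r` and the data are assumptions for illustration; `statistics.stdev` from the standard library returns the sample standard deviation with the n−1 denominator.

```python
import statistics

def pearson_r(xs, ys):
    """Pearson's r as covariance divided by the product of standard deviations."""
    n = len(xs)
    x_bar = statistics.mean(xs)
    y_bar = statistics.mean(ys)
    # Sample covariance with the n-1 denominator.
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)
    return s_xy / (statistics.stdev(xs) * statistics.stdev(ys))

xs = [1, 2, 3, 4, 5]   # illustrative data only
ys = [2, 4, 5, 4, 5]

print(round(pearson_r(xs, ys), 4))  # 0.7746
```

Here sₓᵧ = 1.5, sₓ² = 2.5 and sᵧ² = 1.5, so r = 1.5 / √(2.5 · 1.5) ≈ 0.7746.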

📏 Regression Line

The regression line is the straight line that best describes the linear relationship between X and Y.

Line equation: ŷ = a + bx

Slope: b = r · (sᵧ / sₓ)

Intercept: a = ȳ − b·x̄

[Figure: scatter plot of Y against X with the regression line ŷ = a + bx and a residual (e) marked]
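
The slope and intercept formulas can be applied as in the sketch below. The data are made up for illustration, and the residual is taken to be e = y − ŷ (observed minus predicted), which is the usual definition but is an assumption here since the page does not state it.

```python
import statistics

xs = [1, 2, 3, 4, 5]   # illustrative data only
ys = [2, 4, 5, 4, 5]
n = len(xs)

x_bar = statistics.mean(xs)
y_bar = statistics.mean(ys)
s_x = statistics.stdev(xs)
s_y = statistics.stdev(ys)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)
r = s_xy / (s_x * s_y)

b = r * (s_y / s_x)      # slope
a = y_bar - b * x_bar    # intercept

predictions = [a + b * x for x in xs]
residuals = [y - y_hat for y, y_hat in zip(ys, predictions)]  # e = y - y_hat

print(f"y_hat = {a:.2f} + {b:.2f}x")        # y_hat = 2.20 + 0.60x
print([round(e, 2) for e in residuals])      # [-0.8, 0.6, 1.0, -0.6, -0.2]
```

The residuals of a least-squares line sum to zero (up to rounding), which is a handy check on hand calculations.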

📊 Coefficient of Determination (R²)

R² = r²

R² is the proportion of the variance in Y that is explained by the linear relationship with X.
Example: If r = 0.8, then R² = 0.8² = 0.64
Interpretation: 64% of the variance in Y is explained by X.
The remaining 36% is left unexplained by X (other factors and random variation).
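
To see that R² = r² really is the "explained share", the sketch below computes it two ways on made-up data: once by squaring r, and once as 1 − SS_residual / SS_total from the fitted line. The variable names are assumptions for illustration.

```python
xs = [1, 2, 3, 4, 5]   # illustrative data only
ys = [2, 4, 5, 4, 5]
n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
sxx = sum((x - x_bar) ** 2 for x in xs)
syy = sum((y - y_bar) ** 2 for y in ys)   # total sum of squares of Y

r_squared = sxy ** 2 / (sxx * syy)        # r squared; the n-1 factors cancel

# Least-squares line; b = sxy / sxx is algebraically the same as r * (s_y / s_x).
b = sxy / sxx
a = y_bar - b * x_bar
ss_residual = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
r_squared_from_fit = 1 - ss_residual / syy

print(round(r_squared, 3), round(r_squared_from_fit, 3))  # 0.6 0.6
```

For these data, 60% of the variance in Y is explained by the line and 40% remains in the residuals.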

🔢 Explained Variance and Residual Variance

Total variance = explained variance + residual variance: \(SS_{total} = SS_{regression} + SS_{residual}\)

sᵧ² = s²ŷ + s²ₑ

| Component | Formula | Meaning |
|---|---|---|
| Explained variance | s²ŷ = r² · sᵧ² | Part explained by X |
| Residual variance | s²ₑ = (1 − r²) · sᵧ² | Part not explained |
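
The two component formulas follow from the regression line ŷ = a + bx with a = ȳ − b·x̄ and b = r · (sᵧ / sₓ). A short derivation, assuming all variances use the same n−1 denominator:

\[
\hat{y}_i - \bar{y} = b\,(x_i - \bar{x})
\quad\Rightarrow\quad
s_{\hat{y}}^2 = b^2 s_x^2 = \left(r\,\frac{s_y}{s_x}\right)^2 s_x^2 = r^2 s_y^2
\]

\[
s_e^2 = s_y^2 - s_{\hat{y}}^2 = (1 - r^2)\, s_y^2
\]

For example, with r² = 0.6 and sᵧ² = 1.5, the explained variance is 0.9 and the residual variance is 0.6.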

⚠️ Important Points

  • Correlation ≠ causation: A strong relationship does not prove X causes Y!
  • r measures only linear association: A strong non-linear relationship can give r = 0
  • Sensitive to outliers: Outliers can change r substantially
  • Normal distribution: The usual significance test for r assumes the data are approximately (bivariate) normal; computing r itself does not require it
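
Two of the points above, the limitation to linear association and the sensitivity to outliers, can be checked numerically. This is a minimal sketch on made-up data; the helper `pearson_r` and the chosen outlier (6, −10) are assumptions for illustration.

```python
import math

def pearson_r(xs, ys):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# 1) A perfect but non-linear relationship: y = x^2 on values symmetric around 0.
curve_x = [-3, -2, -1, 0, 1, 2, 3]
curve_y = [x ** 2 for x in curve_x]
r_curve = pearson_r(curve_x, curve_y)

# 2) The same data set with and without a single extreme point.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r_clean = pearson_r(xs, ys)
r_outlier = pearson_r(xs + [6], ys + [-10])   # one added outlier at (6, -10)

print(round(r_curve, 3))     # Y is fully determined by X, yet r is 0
print(round(r_clean, 3))     # clearly positive
print(round(r_outlier, 3))   # a single point turns the correlation negative
```

Plotting the data before interpreting r is the simplest way to catch both situations.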

📋 Comparison of Association Measures

| Measure | Variable type | Relationship type | Range |
|---|---|---|---|
| Lambda (λ) | Nominal | General | 0 to 1 |
| Cramér's V | Nominal | General | 0 to 1 |
| Spearman (rₛ) | Ordinal | Monotonic | −1 to +1 |
| Pearson (r) | Interval/ratio | Linear | −1 to +1 |
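
The difference between "monotonic" and "linear" in the last two rows can be illustrated with a small sketch. It relies on the fact that Spearman's rₛ equals Pearson's r computed on the ranks of the data. The helpers `pearson_r` and `ranks` are assumptions for illustration, and `ranks` assumes there are no tied values.

```python
import math

def pearson_r(xs, ys):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """Rank each value from 1 (smallest) upward; assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result

xs = [1, 2, 3, 4, 5]          # illustrative data only
ys = [x ** 3 for x in xs]     # strictly increasing, but curved

r = pearson_r(xs, ys)                     # linear association
r_s = pearson_r(ranks(xs), ranks(ys))     # Spearman: Pearson on the ranks

print(round(r, 3), round(r_s, 3))
```

Because Y always increases with X, rₛ is exactly 1, while r stays below 1 because the points do not lie on a straight line.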
