In SSC CGL Tier 2, Correlation and Regression is an important topic in the Statistics paper for the JSO post. These concepts describe the relationship between two or more variables, that is, how one variable changes when another changes.
1. What is Correlation?
Definition
Correlation measures the degree and direction of the relationship between two variables. If two variables change together, either in the same direction or in opposite directions, they are said to be correlated.
Type of Correlation | Direction | Example |
Positive Correlation | Both variables move in the same direction. | Height and Weight – as height increases, weight increases. |
Negative Correlation | Variables move in opposite directions. | Price and Demand – as price increases, demand decreases. |
Zero Correlation | No relationship between variables. | Shoe size and intelligence. |
2. Methods to Study Correlation
(a) Scatter Diagram (Graphical Method)
- It is a simple visual method to show correlation between two variables.
- Values of one variable are plotted on the X-axis and the other on the Y-axis.
Pattern | Type of Correlation | Diagram Description |
Points lie close to an upward-sloping straight line | Positive Correlation | Both variables increase together |
Points lie close to a downward-sloping straight line | Negative Correlation | One increases, the other decreases |
Points scattered randomly | No Correlation | No visible relationship |
(b) Karl Pearson’s Coefficient of Correlation (r)
It gives a quantitative measure of correlation between two variables.
$$r = \frac{\Sigma (x - \bar{x})(y - \bar{y})}{\sqrt{\Sigma (x - \bar{x})^2 \, \Sigma (y - \bar{y})^2}}$$
Where:
- $x, y$ = Variables
- $\bar{x}, \bar{y}$ = Mean of x and y
- $r$ ranges between −1 and +1
Value of r | Interpretation |
+1 | Perfect positive correlation |
−1 | Perfect negative correlation |
0 | No linear correlation |
The closer |r| is to 1, the stronger the relationship.
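The formula for r can be sketched in a few lines of Python. This is only an illustration: the function name `pearson_r` and the height and weight figures are made up for the example and do not come from any exam paper.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Karl Pearson's r computed from deviations about the means."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    dx = [x - x_bar for x in xs]          # deviations (x - x_bar)
    dy = [y - y_bar for y in ys]          # deviations (y - y_bar)
    numerator = sum(a * b for a, b in zip(dx, dy))
    denominator = sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if denominator == 0:
        raise ValueError("r is undefined when one variable is constant")
    return numerator / denominator

# Hypothetical data: heights (cm) and weights (kg) of five people.
heights = [150, 160, 170, 180, 190]
weights = [50, 58, 65, 72, 80]
r = pearson_r(heights, weights)
print(round(r, 4))  # a value close to +1 indicates strong positive correlation
```

Reversing the order of one list flips the sign of r, which matches the positive and negative cases in the table above.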
(c) Spearman’s Rank Correlation (rₛ)
Used when data are in ranks or qualitative form (like preference, performance, etc.).
$$r_s = 1 - \frac{6 \Sigma d^2}{n(n^2 - 1)}$$
Where:
- $d$ = Difference between ranks of each pair
- $n$ = Number of observations
rₛ Value | Meaning |
+1 | Perfect positive rank correlation |
−1 | Perfect negative rank correlation |
0 | No rank correlation |
Example:
If 5 students’ marks in Maths and English are ranked and $\Sigma d^2 = 10$, then
$$r_s = 1 - \frac{6(10)}{5(5^2 - 1)} = 1 - \frac{60}{120} = 0.5$$
→ Moderate positive correlation.
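The same calculation can be checked with a short Python sketch. The function name `spearman_rs` and the two rank lists are hypothetical. The ranks were chosen only so that Σd² comes to 10, as in the example above, and the formula assumes there are no tied ranks.

```python
def spearman_rs(rank_x, rank_y):
    """Spearman's rank correlation for untied ranks."""
    n = len(rank_x)
    if n != len(rank_y) or n < 2:
        raise ValueError("need two equal-length rank lists with at least 2 items")
    # d is the difference between the two ranks of each pair
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - (6 * sum_d2) / (n * (n * n - 1))

# Hypothetical ranks of 5 students; the squared differences are 4, 0, 4, 1, 1.
maths_rank = [1, 2, 3, 4, 5]
english_rank = [3, 2, 1, 5, 4]
rs = spearman_rs(maths_rank, english_rank)
print(rs)  # 0.5
```

Identical rank lists give Σd² = 0 and hence rₛ = +1, while fully reversed ranks give rₛ = −1.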
3. What is Regression?
Definition
Regression expresses the functional relationship between two variables. It is used to predict the value of one variable from the value of another.
Concept | Explanation |
Dependent Variable (Y) | The variable to be predicted |
Independent Variable (X) | The variable used for prediction |
4. Regression Lines
There are two regression lines:
- Regression Line of Y on X: $Y - \bar{Y} = b_{yx}(X - \bar{X})$
- Regression Line of X on Y: $X - \bar{X} = b_{xy}(Y - \bar{Y})$
Where,
- $b_{yx}$ and $b_{xy}$ are regression coefficients
Formulas:
$$b_{yx} = r \times \frac{\sigma_y}{\sigma_x}, \quad b_{xy} = r \times \frac{\sigma_x}{\sigma_y}$$
Properties:
- Both lines intersect at $(\bar{X}, \bar{Y})$.
- $b_{yx} \times b_{xy} = r^2$
- If r = 0 → lines are perpendicular.
- If r = ±1 → both lines coincide.
Use: Regression helps in forecasting, for example, predicting sales based on advertisement spend.
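A minimal Python sketch of the regression formulas is given here. The helper names `fit_regression` and `predict_y`, and the advertisement and sales figures, are assumptions made for illustration only. The sketch computes both regression coefficients from r and the standard deviations and then forecasts Y from the line of Y on X.

```python
from math import sqrt

def fit_regression(xs, ys):
    """Return (x_bar, y_bar, b_yx, b_xy, r) for paired samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("a constant variable has no regression coefficient")
    sigma_x = sqrt(sxx / n)              # standard deviation of X
    sigma_y = sqrt(syy / n)              # standard deviation of Y
    r = sxy / sqrt(sxx * syy)            # Pearson's r
    b_yx = r * sigma_y / sigma_x         # slope of the line of Y on X
    b_xy = r * sigma_x / sigma_y         # slope of the line of X on Y
    return x_bar, y_bar, b_yx, b_xy, r

def predict_y(x, x_bar, y_bar, b_yx):
    """Predict Y from the line Y - y_bar = b_yx (X - x_bar)."""
    return y_bar + b_yx * (x - x_bar)

# Hypothetical data: advertisement spend (X) and sales (Y).
ad_spend = [1, 2, 3, 4, 5]
sales = [2, 4, 5, 4, 5]
x_bar, y_bar, b_yx, b_xy, r = fit_regression(ad_spend, sales)
forecast = predict_y(6, x_bar, y_bar, b_yx)
print(round(b_yx * b_xy, 6), round(r ** 2, 6))  # the two values agree
```

Calling `predict_y` with X equal to the mean of X returns the mean of Y, which is the property that the line passes through the point of means.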
5. Multiple Correlation
Definition
When we study the relationship between one dependent variable and two or more independent variables, it is called multiple correlation.
Example:
Predicting a student’s performance (Y) based on study hours (X₁) and attendance (X₂).
Multiple Correlation Coefficient (R):
$$R = \sqrt{\frac{r_{yx_1}^2 + r_{yx_2}^2 - 2\, r_{yx_1} r_{yx_2} r_{x_1 x_2}}{1 - r_{x_1 x_2}^2}}$$
Where:
- $r_{yx_1}, r_{yx_2}$ = Correlation of Y with X₁ and X₂
- $r_{x_1 x_2}$ = Correlation between X₁ and X₂
Range: 0 ≤ R ≤ 1
- R close to 1 → strong relationship
- R close to 0 → weak relationship
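The formula for R can be evaluated with a small Python sketch. The function name `multiple_r` and the three correlation values are hypothetical and serve only to show the arithmetic.

```python
from math import sqrt

def multiple_r(r_yx1, r_yx2, r_x1x2):
    """Multiple correlation of Y on X1 and X2 from the three pairwise r values."""
    if abs(r_x1x2) >= 1:
        raise ValueError("R is undefined when X1 and X2 are perfectly correlated")
    numerator = r_yx1 ** 2 + r_yx2 ** 2 - 2 * r_yx1 * r_yx2 * r_x1x2
    denominator = 1 - r_x1x2 ** 2
    # Assumes the three inputs are mutually consistent correlations,
    # so that the ratio lies between 0 and 1.
    return sqrt(numerator / denominator)

# Hypothetical values: Y with study hours, Y with attendance, hours with attendance.
R = multiple_r(0.6, 0.5, 0.4)
print(round(R, 3))
```

With these inputs R comes out larger than either of the two individual correlations with Y, which is typical when a second predictor adds information.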
6. Key Differences Between Correlation and Regression
Basis | Correlation | Regression |
Meaning | Measures degree of relationship between variables | Expresses the relationship mathematically |
Purpose | To find strength & direction | To predict one variable from another |
Symmetry | One coefficient (the same for X with Y and Y with X) | Two lines (Y on X, X on Y) |
Interchangeability | No dependent or independent variable | One variable is dependent on another |
Value Range | −1 to +1 | Any real value |
Key Takeaways
Below are the key takeaways:
- Correlation → Measures relationship strength.
- Regression → Provides predictive equations.
- Karl Pearson’s coefficient (r) and Spearman’s rank (rₛ) are most common in exams.
- Regression lines always pass through means (𝑋̄, 𝑌̄).
- Multiple correlation deals with more than two variables.
- Formula shortcuts and properties are frequently asked in Paper II (Statistics).
FAQs on Correlation and Regression
Q1. What is the difference between correlation and regression?
Correlation measures the degree of relationship, while regression shows how one variable predicts another.
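Q2. What is the range of the correlation coefficient r?
The value of r always lies between −1 and +1.
Q3. When is Spearman’s rank correlation used?
It is used when data are ranked or qualitative in nature.
Q4. What do regression coefficients represent?
They represent the rate of change in one variable due to a unit change in another variable.
Q5. What does multiple correlation show?
It shows how a dependent variable is influenced by two or more independent variables.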

I’m Mahima Khurana, a writer with a strong passion for creating meaningful, learner-focused content, especially in the field of competitive exam preparation. From authoring books and developing thousands of practice questions to crafting articles and study material, I specialize in transforming complex exam-related topics into clear, engaging, and accessible content. I have first-hand experience of 5+ months in SSC Exams. Writing, for me, is not just a skill but a way to support and guide aspirants through their preparation journey, one well-written explanation at a time.