Use Scatter Plots to Identify a Linear Relationship in Simple Regression Analysis

Updated: 2016-03-26 08:13:24

A scatter plot is a special type of graph designed to show the relationship between two variables. With regression analysis, you can use a scatter plot to visually inspect the data to see whether X and Y are linearly related. The following are some examples.

This figure shows a scatter plot for two variables that have a nonlinear relationship between them.

Scatter plot of a nonlinear relationship.

Each point on the graph represents a single (X, Y) pair. Because the points trace a curve rather than a straight line, the relationship between X and Y is nonlinear. Notice that starting with the most negative values of X, as X increases, Y at first decreases; then as X continues to increase, Y increases. The graph clearly shows that the slope is continually changing; it isn't a constant. With a linear relationship, the slope never changes.
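To see what a changing slope looks like in numbers, here is a minimal sketch that computes the slope between each pair of neighboring points. The data and the `segment_slopes` helper are hypothetical, made up for illustration: one set follows a U-shaped curve (Y = X²) and the other follows a straight line (Y = 2X + 1).

```python
def segment_slopes(points):
    """Return the slope between each pair of consecutive (x, y) points.

    Assumes every point has a distinct x value.
    """
    points = sorted(points)
    return [
        (y2 - y1) / (x2 - x1)
        for (x1, y1), (x2, y2) in zip(points, points[1:])
    ]


# Hypothetical data, not taken from the figures in this article.
xs = [-3, -2, -1, 0, 1, 2, 3]
curved = [(x, x ** 2) for x in xs]      # U-shaped: Y falls, then rises
linear = [(x, 2 * x + 1) for x in xs]   # straight line

curved_slopes = segment_slopes(curved)
linear_slopes = segment_slopes(linear)

print(curved_slopes)  # [-5.0, -3.0, -1.0, 1.0, 3.0, 5.0]
print(linear_slopes)  # [2.0, 2.0, 2.0, 2.0, 2.0, 2.0]
```

For the curved data the slope starts negative and ends positive, while for the linear data it is the same everywhere. Real data is noisy, so neighboring-point slopes will jump around; this sketch only illustrates the idea on clean values.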

In this example, one of the fundamental assumptions of simple regression analysis is violated, and you need another approach to estimate the relationship between X and Y. One possibility is to transform the variables; for example, you could run a simple regression between ln(X) and ln(Y). ("ln" stands for the natural logarithm, which is defined only for positive values, so this transformation works only when X and Y are both greater than zero.) This often helps eliminate nonlinearities in the relationship between X and Y. Another possibility is to use a more advanced type of regression analysis, which can incorporate nonlinear relationships.
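As a rough illustration of the log transformation, the sketch below fits a straight line to ln(X) and ln(Y). The data is hypothetical and strictly positive: it follows Y = 3X², which is a curve in the original units but a straight line after taking logs, because ln(Y) = ln(3) + 2 ln(X). The `fit_line` helper is an assumption of this example; it applies the standard least-squares formulas for a single explanatory variable.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical positive-valued data following Y = 3 * X^2.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.0 * x ** 2 for x in xs]

# Regress ln(Y) on ln(X) instead of Y on X.
log_xs = [math.log(x) for x in xs]
log_ys = [math.log(y) for y in ys]
intercept, slope = fit_line(log_xs, log_ys)

# The slope estimates the exponent; exp(intercept) estimates the multiplier.
print(round(slope, 6), round(math.exp(intercept), 6))
```

Here the fitted slope recovers the exponent (about 2) and the exponentiated intercept recovers the multiplier (about 3). A log transformation of this kind suits power-type curves with positive data; it won't straighten every nonlinear pattern.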

This figure shows a scatter plot for two variables that have a strongly positive linear relationship between them. The correlation between X and Y equals 0.9.

Scatter plot of a strongly positive linear relationship.

The figure shows a very strong tendency for X and Y to both rise above their means or fall below their means at the same time. The straight line is a trend line, designed to come as close as possible to all the data points. The trend line has a positive slope, which shows a positive relationship between X and Y. The points in the graph are tightly clustered about the trend line due to the strength of the relationship between X and Y. (Note: The slope of the line is not 0.9; 0.9 is the correlation between X and Y.)
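The difference between the slope and the correlation is easy to check with a small calculation. In this sketch, which uses made-up data and a hypothetical `correlation_and_slope` helper, Y is measured in two different units (the second is 10 times the first). The trend line's slope changes with the units, but the correlation does not.

```python
import math


def correlation_and_slope(xs, ys):
    """Return (correlation, least-squares slope) for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    r = sxy / math.sqrt(sxx * syy)
    slope = sxy / sxx
    return r, slope


# Hypothetical data, not taken from the figure.
xs = [1, 2, 3, 4, 5]
ys = [2, 1, 4, 3, 6]

r1, slope1 = correlation_and_slope(xs, ys)
# Same data with Y expressed in units 10 times smaller (values 10 times larger).
r2, slope2 = correlation_and_slope(xs, [10 * y for y in ys])

print(f"original: r={r1:.3f}, slope={slope1:.3f}")
print(f"rescaled: r={r2:.3f}, slope={slope2:.3f}")
```

Rescaling Y multiplies the slope by 10 and leaves the correlation unchanged, which is why the two numbers should never be read as the same thing.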

The next figure shows a scatter plot for two variables that have a weakly positive linear relationship between them; the correlation between X and Y equals 0.2.

Scatter plot of a weakly positive linear relationship.

This figure shows a weaker connection between X and Y. Note that the points on the graph are more scattered about the trend line than in the previous figure, due to the weaker relationship between X and Y.
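You can reproduce this contrast with simulated data. The following sketch is an illustration under stated assumptions, not the method used to draw the figures: it generates X and a noise term as independent standard normal values and sets Y = rX + √(1 − r²) × noise, which gives X and Y a correlation of r. It then measures how far the points sit from the least-squares trend line for r = 0.9 and r = 0.2. The function names and the seed are arbitrary choices for this example.

```python
import math
import random


def simulate_pairs(target_r, n, rng):
    """Draw n (x, y) pairs whose underlying correlation is target_r."""
    pairs = []
    for _ in range(n):
        x = rng.gauss(0, 1)
        noise = rng.gauss(0, 1)
        y = target_r * x + math.sqrt(1 - target_r ** 2) * noise
        pairs.append((x, y))
    return pairs


def summarize(pairs):
    """Return (sample correlation, residual spread about the trend line)."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    r = sxy / math.sqrt(sxx * syy)
    # Sum of squared vertical distances from the least-squares line.
    residual_ss = syy - sxy ** 2 / sxx
    return r, math.sqrt(residual_ss / n)


rng = random.Random(42)  # fixed seed so the run is repeatable
strong_r, strong_spread = summarize(simulate_pairs(0.9, 20000, rng))
weak_r, weak_spread = summarize(simulate_pairs(0.2, 20000, rng))

print(f"strong: r={strong_r:.2f}, spread={strong_spread:.2f}")
print(f"weak:   r={weak_r:.2f}, spread={weak_spread:.2f}")
```

The weakly correlated sample shows a noticeably larger spread about its trend line, which is the numerical counterpart of the looser cloud of points in the scatter plot.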

The next figure is a scatter plot for two variables that have a strongly negative linear relationship between them; the correlation between X and Y equals –0.9.

Scatter plot of a strongly negative linear relationship.

This figure shows a very strong tendency for X and Y to move in opposite directions; that is, when one rises above its mean, the other tends to fall below its mean. The trend line has a negative slope, which shows a negative relationship between X and Y. The points in the graph are tightly clustered about the trend line due to the strength of the relationship between X and Y.
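The link between "opposite sides of the mean" and a negative slope can be sketched with a few made-up numbers. For each point, the code below multiplies X's deviation from its mean by Y's deviation from its mean. When the two variables sit on opposite sides of their means, the product is negative, and the sum of those products determines the sign of the least-squares slope. The data is hypothetical.

```python
# Hypothetical data in which Y falls as X rises.
xs = [1, 2, 3, 4, 5]
ys = [10, 8, 7, 4, 1]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Product of the two deviations from the mean, one per point.
products = [(x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)]

# Least-squares slope: sum of products over sum of squared X deviations.
slope = sum(products) / sum((x - mean_x) ** 2 for x in xs)

print(products)  # [-8.0, -2.0, 0.0, -2.0, -10.0]
print(slope)     # -2.2
```

No product is positive here, so the sum is negative and the trend line slopes downward. With a positive relationship, most of the products would be positive instead.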

The next figure is a scatter plot for two variables that have a weakly negative linear relationship between them. The correlation between X and Y equals –0.2.

Scatter plot of a weakly negative linear relationship.

This figure shows a very weak connection between X and Y. Note that the points on the graph are more scattered about the trend line than in the previous figure due to the weaker relationship between X and Y.

About This Article

About the book author:

Alan Anderson, PhD, is a teacher of finance, economics, statistics, and math at Fordham and Fairfield universities as well as at Manhattanville and Purchase colleges. Outside of the academic environment, he has many years of experience working as an economist, risk manager, and fixed income analyst. Alan received his PhD in economics from Fordham University and an M.S. in financial engineering from Polytechnic University.