### Question & Answer

## Question

I would like to produce a scatterplot for 2 variables with a linear regression fit line, i.e. a line of predicted values of Y across the domain of X. I have designated a split file variable with a large number of values, so there will be many scatterplots produced by one command. I don't want to edit each chart in a chart editor to add the fit line. Can I directly request the fit line from the scatterplot dialogs or subcommands for any of the graph procedures in SPSS Statistics?

## Answer

GGRAPH/GPL and Interactive Graphs (the IGRAPH command) are the only graph procedures in SPSS Statistics that let you request the regression fit line directly, in the scatterplot dialogs or in the syntax command, so that you don't have to open each graph in the Chart Editor to add it. With one or more split file variables designated, a single IGRAPH or GGRAPH/GPL command produces a scatterplot for each split. IGRAPH also places the R-squared value (squared multiple correlation) for the regression line to the right of the scatterplot. GPL does not print the R-squared with the fit line, although a chart template can be devised to add it to the graphs. The template approach is described in more detail below.

GGRAPH/GPL Commands

GGRAPH/GPL commands can be generated by the Graph>Chart Builder dialogs, but the dialogs don't have a feature to request a fit line in a scatterplot. However, you can request a fit line in a GPL syntax subcommand, which still saves you from having to edit each chart in a chart editor to add the fit line.

Here is an example GGRAPH and GPL command set to produce a scatterplot with a fit line.

    GGRAPH
      /GRAPHDATASET NAME="graphdataset" VARIABLES=edlevel salnow MISSING=LISTWISE REPORTMISSING=NO
      /GRAPHSPEC SOURCE=INLINE.
    BEGIN GPL
      SOURCE: s=userSource(id("graphdataset"))
      DATA: edlevel=col(source(s), name("edlevel"))
      DATA: salnow=col(source(s), name("salnow"))
      GUIDE: axis(dim(1), label("Educational level"))
      GUIDE: axis(dim(2), label("Current salary"))
      ELEMENT: point(position(edlevel*salnow))
      ELEMENT: line(position(smooth.linear(edlevel*salnow)))
    END GPL.

These commands are produced by requesting a scatterplot for EDLEVEL (X axis) and SALNOW (Y axis) in the Chart Builder, pasting the command to a syntax window, then adding the subcommand:

ELEMENT: line(position(smooth.linear(edlevel*salnow)))

in the syntax window. This added subcommand requests the fit line.
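To get one chart per group from this single command, split file processing must be in effect when it runs. The following is a minimal sketch, assuming a hypothetical grouping variable named jobcat (it is not part of the example above); SORT CASES and SPLIT FILE are standard SPSS Statistics commands.

```spss
* Assumption: jobcat is a hypothetical grouping variable. Substitute your own.
SORT CASES BY jobcat.
SPLIT FILE SEPARATE BY jobcat.

* Run the GGRAPH and GPL commands shown above here.
* One scatterplot with its fit line is produced per value of jobcat.

SPLIT FILE OFF.
```

SPLIT FILE OFF restores normal processing so that later procedures are not split unintentionally.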

Suppose that you wished to plot separate fit lines for cases on the basis of a categorical third variable, such as minority status (MINORITY). The following GGRAPH and GPL commands will perform this task.

    GGRAPH
      /GRAPHDATASET NAME="graphdataset" VARIABLES=edlevel salnow minority MISSING=LISTWISE REPORTMISSING=NO
      /GRAPHSPEC SOURCE=INLINE.
    BEGIN GPL
      SOURCE: s=userSource(id("graphdataset"))
      DATA: edlevel=col(source(s), name("edlevel"))
      DATA: salnow=col(source(s), name("salnow"))
      DATA: minority=col(source(s), name("minority"), unit.category())
      GUIDE: axis(dim(1), label("Education"))
      GUIDE: axis(dim(2), label("Current Salary"))
      GUIDE: legend(aesthetic(aesthetic.color.exterior), label("Minority Status"))
      ELEMENT: point(position(edlevel*salnow), color(minority))
      ELEMENT: line(position(smooth.linear(edlevel*salnow)), color(minority))
    END GPL.

Note the references to MINORITY in the GGRAPH variable list, in the separate DATA subcommand for MINORITY, in the GUIDE subcommand (which identifies it as a legend variable), and in the ELEMENT subcommands. The first ELEMENT subcommand assigns different colors to the observed data points, i.e., the scatter, for the different minority groups. The second ELEMENT subcommand requests separate fit lines in unique colors for the minority groups.

As noted in the first paragraph, a chart template can be built to add the R-squared value to the graph. Building the template requires you to edit one scatterplot in the Chart Editor, but subsequent scatterplots can call the template in the GGRAPH command. After building a scatterplot, open it in the Chart Editor and use the Elements menu to add "Fit Line at Total" or "Fit Line at Subgroups", as appropriate for your graph. Save a template from the File menu of the Chart Editor. You can find more details on saving and applying templates in the SPSS Statistics online help at Help->Topics; enter "template" as the search topic. The key point for this application is to check both "Optional lines" and "Legend display" in the "Save Chart Template" dialog. You can reference the template for later scatterplots in the Options dialog of the Chart Builder when building the graph, or in the GGRAPH command as follows:

    * Chart Builder.
    GGRAPH
      /GRAPHDATASET NAME="graphdataset" VARIABLES=edlevel salnow MISSING=LISTWISE REPORTMISSING=NO
      /GRAPHSPEC SOURCE=INLINE
        TEMPLATE=["C:\Program Files\IBM\SPSS\Statistics\20\Looks\scatter_fit.sgt"].
    BEGIN GPL
      SOURCE: s=userSource(id("graphdataset"))
      DATA: edlevel=col(source(s), name("edlevel"))
      DATA: salnow=col(source(s), name("salnow"))
      GUIDE: axis(dim(1), label("Educational level"))
      GUIDE: axis(dim(2), label("Current salary"))
      ELEMENT: point(position(edlevel*salnow))
    END GPL.

Here, scatter_fit.sgt is the template file that was saved with the fit line and legend. The R-squared is saved as part of the legend, although it is displayed inside the graph. Because the template supplies the fit line, an ELEMENT: line subcommand was not added to the GPL.
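If the R-squared is needed only as a number rather than printed on the chart, one alternative that is not part of the template approach is to run the REGRESSION procedure on the same two variables. Its default output includes a Model Summary table with R Square, and with split file processing in effect the table is reported for each split. A minimal sketch, assuming the same edlevel and salnow variables as in the examples:

```spss
* Alternative sketch, not part of the GGRAPH examples.
* The default Model Summary output includes R Square.
REGRESSION
  /DEPENDENT salnow
  /METHOD=ENTER edlevel.
```

With a single predictor, this R Square should correspond to the linear fit line drawn in the scatterplot.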

Interactive Graphs

Interactive Graphs are available from the Graphs menu up to version 16 and from syntax commands in later versions. In SPSS 16, open the Graphs menu and choose Legacy Dialogs. From the side menu that appears, choose Interactive and then Scatterplot. The X-axis and Y-axis variables are chosen in the "Assign Variables" tab of the Create Scatterplot dialog. The fit line is requested from the Fit tab in that dialog: choose Regression from the Method list. With a split file variable assigned, you will likely want to keep the default "Fit lines for" Total. If you do have subgroups within each plot, check the Subgroups box to get a separate line for each. The "Prediction Lines" boxes request confidence bounds around the regression line.

Here is a syntax command for a scatterplot with a fit line. SALNOW is the Y variable and EDLEVEL is the X variable. The command was generated in SPSS 16 but also runs in later versions (see the note below regarding the exception for Statistics 19.0.0 on the Mac). Replace the variable names in the /X1 and /Y subcommands and try it with your own data.

    IGRAPH
      /VIEWNAME='Scatterplot'
      /X1=VAR(edlevel) TYPE=SCALE
      /Y=VAR(salnow) TYPE=SCALE
      /COORDINATE=VERTICAL
      /FITLINE METHOD=REGRESSION LINEAR LINE=TOTAL SPIKE=OFF
      /YLENGTH=5.2
      /X1LENGTH=6.5
      /CHARTLOOK='NONE'
      /SCATTER COINCIDENT=NONE.
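For separate fit lines per subgroup (the Subgroups box described above), the syntax would look roughly like the sketch below. It rests on assumptions: MINORITY is borrowed from the GGRAPH example as the color variable, and the /COLOR subcommand and the LINE=MEFFECT keyword are written from the IGRAPH syntax chart rather than pasted from the dialog, so verify them against the Command Syntax Reference for your version before relying on them.

```spss
* Sketch: subgroup fit lines, with minority as an assumed color variable.
IGRAPH
  /VIEWNAME='Scatterplot'
  /X1=VAR(edlevel) TYPE=SCALE
  /Y=VAR(salnow) TYPE=SCALE
  /COLOR=VAR(minority) TYPE=CATEGORICAL
  /COORDINATE=VERTICAL
  /FITLINE METHOD=REGRESSION LINEAR LINE=MEFFECT SPIKE=OFF
  /SCATTER COINCIDENT=NONE.
```

The safest way to confirm the exact keywords is to check the Subgroups box in the SPSS 16 dialog and paste the resulting command.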

If you run the IGRAPH command in Statistics 18 or later versions, the following warning message is printed, but the command still runs and the graphs are produced.

"The IGRAPH procedure has been deprecated and will be removed in the next release. Use the syntax converter to convert IGRAPH syntax to GGRAPH. See the online help for information about the syntax converter."

However, the IGRAPH procedure was not removed in version 19, although it must still be run from a syntax command. The exception is Statistics 19.0.0 for Mac, where the IGRAPH command fails and the following message is printed:

"Warning # 144. Command name: IGRAPH

The IGRAPH command is obsolete and no longer supported."

The IGRAPH command functionality is restored in Statistics 19.0.0.1 for the Mac. The 19.0.0.1 fixpack (19.0.1 patch) is available from the IBM Support Portal.

## Internal Use Only

Resolution Status at Transfer: Published - External; Products: Statistics; Versions: 16.0.1, 18.0, 18.0.1, 16.0.2, 18.0.2, 17.0

## Historical Number

87369

### Document Information

**Modified date:**

16 June 2018

## UID

swg21485480