# The Bland-Altman Plot

## Problem

What is a Bland-Altman plot, and can one be produced in SPSS?

## Resolving The Problem

The Bland-Altman plot (Bland & Altman, 1986) is most often seen in the medical statistics literature. Suppose there are two techniques for measuring some continuously scaled variable, each subject to some error, and we want a graphical means of assessing whether they are comparable. For example, to compare two techniques for measuring some blood factor, you would gather a number of blood samples, split each in two, and measure the factor in each sample with both methods.

The Bland-Altman chart is a scatterplot with the difference of the two measurements for each sample on the vertical axis and the average of the two measurements on the horizontal axis. Three horizontal reference lines are superimposed on the scatterplot: one at the average difference between the measurements, and one each at the upper and lower limits, which lie at the average difference plus and minus 1.96*sigma, respectively, where sigma is the standard deviation of the measurement differences. Bland and Altman call these the limits of agreement. (Bland and Altman also discuss the option of using confidence interval bounds, based on the standard error of the mean, for the upper and lower reference lines.)

If the two methods are comparable, then the differences should be small, with the mean of the differences close to 0, and should show no systematic variation with the mean of the two measurements. 'Small' here means an amount that would be clinically insignificant for the factor being measured. The reference is:

Bland, J.M., & Altman, D.G. (1986). Statistical methods for assessing agreement between two methods of clinical measurement. Lancet, 327 (8476), 307-310.
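In symbols, the three reference lines described above can be written as follows. The notation ($a_i$ and $b_i$ for the two measurements on sample $i$, $d_i$, $\bar{d}$, $s_d$, and $n$ for the number of samples) is introduced here for illustration only and is not taken from the article; $s_d$ is assumed to be the sample standard deviation, which is the value SPSS reports.

$$
d_i = a_i - b_i, \qquad
\bar{d} = \frac{1}{n}\sum_{i=1}^{n} d_i, \qquad
s_d = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(d_i - \bar{d}\right)^2}
$$

$$
y = \bar{d}, \qquad y = \bar{d} + 1.96\, s_d, \qquad y = \bar{d} - 1.96\, s_d
$$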

While SPSS does not have facilities specifically for producing Bland-Altman charts, they can be produced in SPSS with help from the Chart Editor. If the measurements are stored in variables A and B, then the difference between A and B can be computed and stored as a new variable (DIFF, for example) in the Transform>Compute dialog, with DIFF as the target variable and "A-B" (without the quotation marks) as the Numeric Expression. Likewise, the average of the two measurements (MMEAN, for example) can be computed in the Transform>Compute dialog with MMEAN as the target variable and "(A + B)/2" as the Numeric Expression.

To print descriptive statistics on DIFF, as well as a test of whether DIFF has a mean of 0, run the One-Sample T Test procedure (Analyze>Compare Means>One-Sample T Test) with DIFF in the "Test Variable(s)" box and 0 in the "Test Value" box. The output for the One-Sample T Test includes the mean and standard deviation of DIFF, along with the standard error of the mean, a confidence interval for the mean (95% by default), and the significance level for the test that the mean of DIFF equals 0.

The basic scatterplot can be produced with either the Graph procedure (Graphs>Legacy Dialogs>Scatter/Dot) or the Chart Builder. In the Graph procedure dialogs, choose a "Simple Scatter" with DIFF in the Y-Axis box and MMEAN in the X-Axis box. In the Chart Builder dialog, choose Scatter/Dot from the Gallery, then drag the icon for the Simple Scatter plot from the Gallery to the Chart Preview area and drag DIFF and MMEAN to the Y-Axis and X-Axis boxes, respectively. Note that both DIFF and MMEAN should be designated as Scale in the Measure column of the Data Editor Variable View.

To add the three reference lines to either scatterplot, first double-click on the chart in the Viewer window in order to open it in the Chart Editor. Choose "Edit Content>In separate window" in the pop-up menu that appears. Within the Chart Editor window, choose Options>Y-axis Reference Line to add each line.

Suppose that the average difference (the mean of DIFF) equals 1.5 and sigma (the standard deviation of DIFF) equals 2.1. You would add reference lines on the Y axis at 1.5, at 1.5 + 1.96*2.1 = 5.616, and at 1.5 - 1.96*2.1 = -2.616. If you prefer to use confidence interval bounds for the upper and lower reference lines, you can place those lines at the lower and upper confidence limits for the difference as printed in the One-Sample T Test output.
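Rather than working out the line positions by hand, the same arithmetic can be done in syntax. The following is a minimal sketch, assuming DIFF has already been computed; the variable names mdiff, sddiff, upper_lim, and lower_lim are hypothetical and were chosen only for illustration.

```
* Add the mean and standard deviation of DIFF to every case as constants.
AGGREGATE
  /OUTFILE=* MODE=ADDVARIABLES
  /mdiff = MEAN(diff)
  /sddiff = SD(diff).
* Compute the positions of the upper and lower reference lines.
COMPUTE upper_lim = mdiff + 1.96*sddiff.
COMPUTE lower_lim = mdiff - 1.96*sddiff.
* These variables are constant across cases, so each listed mean is a line position.
DESCRIPTIVES VARIABLES=mdiff lower_lim upper_lim
  /STATISTICS=MEAN.
```

The values printed by DESCRIPTIVES can then be typed into the reference line dialog or pasted into GPL statements. The helper variables can be deleted afterwards if you do not want them in the data file.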

For the y=1.5 line, follow these steps:
1. Choose Options>Y-axis Reference Line.
2. Type 1.5 in the Position field.
3. Click Apply.
4. Click Close.

Repeat steps 1 to 4 to add the upper and lower limits at Y = mean + 1.96*sigma and at Y = mean - 1.96*sigma, where mean is the mean of DIFF. For this example, enter 5.616 and -2.616, respectively, in the Position field. The Options>Y-axis Reference Line dialog box must be opened anew each time a line is added.

Here are the SPSS syntax commands to compute the new variables, find the descriptive statistics for DIFF, and draw the simple scatterplots.

COMPUTE diff = A-B .
COMPUTE mmean = (A + B)/2 .
* run One-Sample T Test to get descriptive statistics for DIFF and test that Mean(DIFF) = 0 .
T-TEST
/TESTVAL=0
/MISSING=ANALYSIS
/VARIABLES=diff
/CRITERIA=CI(.95).
* Legacy Graph .
GRAPH
/SCATTERPLOT(BIVAR)=mmean WITH diff
/MISSING=LISTWISE .

* Chart Builder.
GGRAPH
/GRAPHDATASET NAME="graphdataset" VARIABLES=mmean diff MISSING=LISTWISE REPORTMISSING=NO
/GRAPHSPEC SOURCE=INLINE.
BEGIN GPL
SOURCE: s=userSource(id("graphdataset"))
DATA: mmean=col(source(s), name("mmean"))
DATA: diff=col(source(s), name("diff"))
GUIDE: axis(dim(1), label("mmean"))
GUIDE: axis(dim(2), label("diff"))
ELEMENT: point(position(mmean*diff))
END GPL.

The Chart Editor operations for these graphs are the same steps as were provided for the menu-generated graphs. Alternatively, you can use GUIDE statements in the GPL block to request the three reference lines directly and avoid the need for the Chart Editor.

* Chart Builder with reference lines.
GGRAPH
/GRAPHDATASET NAME="graphdataset" VARIABLES=mmean diff MISSING=LISTWISE REPORTMISSING=NO
/GRAPHSPEC SOURCE=INLINE.
BEGIN GPL
SOURCE: s=userSource(id("graphdataset"))
DATA: mmean=col(source(s), name("mmean"))
DATA: diff=col(source(s), name("diff"))
GUIDE: axis(dim(1), label("mmean"))
GUIDE: axis(dim(2), label("diff"))
GUIDE: form.line(position(*,5.616))
GUIDE: form.line(position(*, 1.5))
GUIDE: form.line(position(*, -2.616))
ELEMENT: point(position(mmean*diff))
END GPL.

The additional statements here are the GUIDE statements with the form.line function. Each one specifies all values of the X axis (* in the first argument) and a single Y-axis value: the upper limit, the mean difference, and the lower limit, respectively.

GUIDE: form.line(position(*,5.616))
GUIDE: form.line(position(*, 1.5))
GUIDE: form.line(position(*, -2.616))

Proportional bias is present when the difference between the measures is a function of the average of the measures. In the Bland-Altman plot, this bias shows up as a trend in the scatter points toward higher or lower values of DIFF across the range of values of MMEAN. A simple test for a linear trend of this sort is to run the Linear Regression procedure (Analyze>Regression>Linear). Place DIFF in the "Dependent" box and MMEAN in the "Independent(s)" box and click OK. The default output provides the statistical test information. The "Coefficients" table in the Regression output contains, along with the model coefficient for MMEAN, a t value and significance level for the test of the hypothesis that the coefficient for MMEAN equals 0. Rejection of that null hypothesis is consistent with the presence of proportional bias.
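In syntax, the regression described above could be run roughly as follows. This is a sketch that relies on the REGRESSION defaults for missing values and entry criteria; the syntax pasted from the dialog in your version may list additional subcommands.

```
* Regress the difference on the mean of the two measurements.
REGRESSION
  /STATISTICS COEFF R ANOVA
  /DEPENDENT diff
  /METHOD=ENTER mmean.
```

The row for mmean in the "Coefficients" table carries the t value and significance level discussed above.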

The Bland-Altman plot may indicate that the variance of DIFF changes across the range of MMEAN, for example if the vertical spread of scatter points is much narrower at low values of MMEAN than at high values. In that case, Bland and Altman suggest that the researcher calculate the natural logarithms of the two measures (A and B in this example), recalculate the difference and mean from the logged values, and redraw the Bland-Altman plot with the new difference and mean measures. The new DIFF and MMEAN can be calculated in the Transform>Compute dialog as before, with slightly more complex numeric expressions. Here are the syntax commands for the transformation:

compute diff = ln(A) - ln(B).
compute mmean = (ln(A) + ln(B))/2.

The ln() function returns the natural logarithm of the variable or number in the parentheses.

The remaining commands in the example would remain unchanged. If you wanted to keep the previous DIFF and MMEAN variables in the file, you could compute the new log-based difference and mean as LDIFF and LMMEAN, for example, and replace DIFF and MMEAN in the subsequent commands with those new variables.
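A sketch of that variant follows, using the LDIFF and LMMEAN names suggested above and reusing the T-TEST and legacy GRAPH commands from the earlier example. It assumes A and B are strictly positive, because LN() returns a system-missing value for zero or negative arguments.

```
* Log-based difference and mean, kept alongside the original DIFF and MMEAN.
COMPUTE ldiff = LN(A) - LN(B).
COMPUTE lmmean = (LN(A) + LN(B))/2.
* Descriptive statistics for LDIFF and test that Mean(LDIFF) = 0.
T-TEST
  /TESTVAL=0
  /MISSING=ANALYSIS
  /VARIABLES=ldiff
  /CRITERIA=CI(.95).
* Basic scatterplot for the log-based Bland-Altman plot.
GRAPH
  /SCATTERPLOT(BIVAR)=lmmean WITH ldiff
  /MISSING=LISTWISE.
```

The reference lines for this plot would be placed using the mean and standard deviation of LDIFF from the new T-TEST output, not the values from the untransformed analysis.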

## Related Information



Modified date: 16 April 2020

swg21476730