Example 2: Testing Correlation Coefficients

While the program provides a large variety of statistical procedures, some specialized operations require the use of COMPUTE statements. For example, you may want to test a sample correlation coefficient against a population correlation coefficient. When the population coefficient is nonzero, you can compute a Z statistic from the Fisher-transformed coefficients to test the hypothesis that the sample was drawn from a population with that coefficient.

The formula for Z is:

Figure 1. Formula for Z
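$$Z = \frac{z_r - z_p}{1/\sqrt{N-3}}, \qquad z_r = \tfrac{1}{2}\ln\frac{1+r}{1-r}, \qquad z_p = \tfrac{1}{2}\ln\frac{1+p}{1-p}$$

where r is the sample correlation coefficient, p is the population correlation coefficient, and N is the sample size.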

Suppose you want to test an r of 0.66 obtained from a sample of 30 cases against a population coefficient of 0.85. The following figure shows commands for computing and displaying Z and its two-tailed probability.

Figure 2. Commands for computing Z statistic
DATA LIST FREE / R N P.

BEGIN DATA
.66 30 .85
END DATA.

COMPUTE #ZR = .5* (LN ((1 + R) / (1 - R))).
COMPUTE #ZP = .5* (LN ((1 + P) / (1 - P))).

COMPUTE Z = (#ZR-#ZP)/(1/(SQRT(N-3))).
COMPUTE PROB = 2*(1-CDFNORM(ABS(Z))).

FORMAT PROB (F8.3).
LIST.
  • DATA LIST defines variables containing the sample correlation coefficient (R), sample size (N), and population correlation coefficient (P).
  • BEGIN DATA and END DATA indicate that data are inline.
  • COMPUTE statements calculate Z and its probability. Variables #ZR and #ZP are scratch variables used in the intermediate steps of the calculation.
  • The LIST command output is shown below. Since the absolute value of Z is large and the two-tailed probability (0.016) is below the conventional 0.05 level, we reject the hypothesis that the sample was drawn from a population having a correlation coefficient of 0.85.
Figure 3. Z statistic and its probability
  R       N     P       Z   PROB
.66   30.00   .85   -2.41   .016
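
As a cross-check outside the program, the same arithmetic can be sketched in Python. This is only an illustration under stated assumptions, not part of the command syntax: the variable names are chosen to mirror the COMPUTE statements, and the normal CDF is written with math.erf in place of CDFNORM.

```python
import math

# Same inputs as the inline data: sample r, sample size, population coefficient
r, n, p = 0.66, 30, 0.85

# Fisher transformations, mirroring the scratch variables #ZR and #ZP
zr = 0.5 * math.log((1 + r) / (1 - r))
zp = 0.5 * math.log((1 + p) / (1 - p))

# The difference is divided by the standard error 1/sqrt(N - 3)
z = (zr - zp) / (1 / math.sqrt(n - 3))

# Two-tailed probability; the standard normal CDF is expressed with math.erf
cdf = 0.5 * (1 + math.erf(abs(z) / math.sqrt(2)))
prob = 2 * (1 - cdf)

print(f"{z:.2f} {prob:.3f}")
```

Run as written, this should print -2.41 and 0.016, agreeing with the Z and PROB values in Figure 3 (Python prints the leading zero that the F8.3 format omits).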

If you use the Z test frequently, you may want to construct a macro like the one shown below. The !CORRTST macro computes Z and probability values for a sample correlation coefficient, sample size, and population coefficient specified as values of keyword arguments.

Figure 4. !CORRTST macro
DEFINE !CORRTST ( R = !TOKENS(1)
                  /N = !TOKENS(1)
                  /P = !TOKENS(1)).
INPUT PROGRAM.
- END CASE.
- END FILE.
END INPUT PROGRAM.

COMPUTE #ZR = .5* (LN ((1 + !R) / (1 - !R))).
COMPUTE #ZP = .5* (LN ((1 + !P) / (1 - !P))).

COMPUTE Z = (#ZR-#ZP) / (1/(SQRT(!N-3))).
COMPUTE PROB = 2*(1-CDFNORM(ABS(Z))).
FORMAT PROB(F8.3).

TITLE SAMPLE R=!R, N=!N, POPULATION COEFFICIENT=!P.

LIST.

!ENDDEFINE.

!CORRTST R=.66 N=30 P=.85.
!CORRTST R=.50 N=50 P=.85.
  • DEFINE names the macro as !CORRTST and declares arguments for the sample correlation coefficient (R), the sample size (N), and the population correlation coefficient (P).
  • !TOKENS(1) specifies that the value of an argument is the single token that follows the name of the argument in the macro call. Thus the first macro call specifies values of 0.66, 30, and 0.85 for R, N, and P.
  • Commands between INPUT PROGRAM and END INPUT PROGRAM create an active dataset with one case. COMPUTE statements calculate the Z statistic and its probability using the values of macro arguments R, N, and P. (INPUT PROGRAM commands would not be needed if COMPUTE statements operated on values in an existing file or inline data, rather than macro arguments.)
  • A customized TITLE displays the values of macro arguments used in computing Z.
  • The LIST command displays Z and its probability.
  • The !CORRTST macro is called twice. The first invocation tests an r of 0.66 from a sample of 30 cases against a population coefficient of 0.85 (this generates the same Z value and probability as shown earlier). The second macro call tests an r of 0.50 from a sample of 50 cases against the same population correlation coefficient. The output from these macro calls is shown below.
Figure 5. Output from !CORRTST
SAMPLE R= .66 , N= 30 , POPULATION COEFFICIENT= .85

       Z     PROB
   -2.41     .016


SAMPLE R= .50 , N= 50 , POPULATION COEFFICIENT= .85

       Z     PROB
   -4.85     .000
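
For readers who want to verify both macro calls independently, the following Python sketch wraps the calculation in a function with keyword-only arguments, loosely analogous to the macro's keyword arguments. The function name corr_test is hypothetical, and math.atanh and math.erfc are used as shorthand for the Fisher transformation and the two-tailed normal probability; neither comes from the macro itself.

```python
import math

def corr_test(*, r, n, p):
    """Return (Z, two-tailed probability) for sample r, sample size n, population p."""
    # math.atanh(x) equals .5 * ln((1 + x) / (1 - x)), the Fisher transformation
    z = (math.atanh(r) - math.atanh(p)) * math.sqrt(n - 3)
    # erfc(|z| / sqrt(2)) equals 2 * (1 - CDF(|z|)) for the standard normal
    prob = math.erfc(abs(z) / math.sqrt(2))
    return z, prob

# The same two tests as the two macro calls
for r, n in ((0.66, 30), (0.50, 50)):
    z, prob = corr_test(r=r, n=n, p=0.85)
    print(f"SAMPLE R={r:.2f}, N={n}, POPULATION COEFFICIENT=0.85")
    print(f"{z:8.2f} {prob:8.3f}")
```

Multiplying by sqrt(n - 3) is equivalent to dividing by 1/sqrt(n - 3) in the COMPUTE statement; the two forms differ only in notation.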

The following figure shows a modified !CORRTST macro that you can use to test a sample r against each coefficient in a list of population coefficients.

Figure 6. !CORRTST macro with list-processing loop
DEFINE !CORRTST (R = !TOKENS(1)
                 /N = !TOKENS(1)
                 /P = !CMDEND).
- INPUT PROGRAM.
-   END CASE.
-   END FILE.
- END INPUT PROGRAM.

!DO !I !IN (!P).
- COMPUTE #ZR = .5* (LN ((1 + !R) / (1 - !R))).
- COMPUTE #ZP = .5* (LN ((1 + !I) / (1 - !I))).

- COMPUTE Z = (#ZR-#ZP)/(1/(SQRT(!N-3))).

- COMPUTE PROB=2*(1-CDFNORM(ABS(Z))).
- FORMAT PROB(F8.3).
- TITLE SAMPLE R=!R, N=!N, POPULATION COEFFICIENT=!I.
- LIST.
!DOEND.

!ENDDEFINE.

!CORRTST R=.66 N=30 P=.20 .40 .60 .80 .85 .90.
  • DEFINE names the macro as !CORRTST and declares arguments for the sample correlation coefficient (R), the sample size (N), and a list of population correlation coefficients (P).
  • !TOKENS(1) specifies that the value of an argument is the single token that follows the name of the argument in the macro call. Thus, the macro call specifies the value of R as 0.66 and N as 30.
  • !CMDEND indicates that the value for P is the remaining text in the macro call. Thus the value of P is a list containing the elements 0.20, 0.40, 0.60, 0.80, 0.85, and 0.90.
  • Commands !DO !IN and !DOEND define a list-processing loop. Commands in the loop compute one Z statistic for each element in the list of population coefficients. For example, in the first iteration Z is computed using 0.20 as the population coefficient. In the second iteration 0.40 is used. The same sample size (30) and r value (0.66) are used for each Z statistic.
  • The output from the macro call is shown below. One Z statistic is displayed for each population coefficient.
Figure 7. Output from modified !CORRTST macro
SAMPLE R= .66 , N= 30 , POPULATION COEFFICIENT= .20

       Z     PROB
    3.07     .002


SAMPLE R= .66 , N= 30 , POPULATION COEFFICIENT= .40

       Z     PROB
    1.92     .055


SAMPLE R= .66 , N= 30 , POPULATION COEFFICIENT= .60

       Z     PROB
     .52     .605


SAMPLE R= .66 , N= 30 , POPULATION COEFFICIENT= .80

       Z     PROB
   -1.59     .112


SAMPLE R= .66 , N= 30 , POPULATION COEFFICIENT= .85

       Z     PROB
   -2.41     .016


SAMPLE R= .66 , N= 30 , POPULATION COEFFICIENT= .90

       Z     PROB
   -3.53     .000
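
The list-processing loop can also be mirrored in Python as a rough check on the six results. This is a sketch under assumptions: the names populations and results are illustrative, and statistics.NormalDist stands in for CDFNORM.

```python
import math
from statistics import NormalDist

r, n = 0.66, 30
populations = [0.20, 0.40, 0.60, 0.80, 0.85, 0.90]  # the list passed as P

results = []
for p in populations:  # one pass per list element, like !DO !I !IN (!P)
    # Fisher-transform both coefficients and scale by sqrt(N - 3)
    z = (math.atanh(r) - math.atanh(p)) * math.sqrt(n - 3)
    # Two-tailed probability from the standard normal CDF
    prob = 2 * (1 - NormalDist().cdf(abs(z)))
    results.append((p, z, prob))
    print(f"POPULATION COEFFICIENT={p:.2f} Z={z:8.2f} PROB={prob:8.3f}")
```

The sample r and N are fixed outside the loop, just as the macro reuses !R and !N on every iteration while !I takes each population coefficient in turn.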