Example 1: Automating a File-Matching Task
The following figure shows a listing of 1988 sales data for salespeople working in different regions. The listing shows that salesperson Jones sold 900 units in the Chicago sales territory, while Rodriguez sold 300 units in Baton Rouge.
YEAR REGION SALESPER SALES
1988 CHICAGO JONES 900
1988 CHICAGO GREGORY 400
1988 BATON ROUGE RODRIGUEZ 300
1988 BATON ROUGE SMITH 333
1988 BATON ROUGE GRAU 100
You can use command syntax shown below to obtain each salesperson’s percentage of total sales for their region.
GET FILE = ’SALES88.SAV’.
SORT CASES BY REGION.
AGGREGATE OUTFILE = ’TOTALS.SAV’
/PRESORTED
/BREAK = REGION
/TOTAL@ = SUM(SALES).
MATCH FILES FILE=*
/TABLE = ’TOTALS.SAV’
/BY REGION.
COMPUTE PCT = 100 * SALES / TOTAL@.
TITLE 1988 DATA.
LIST.
- The
GET
command opens SALES88.SAV. This file becomes the active dataset. -
SORT CASES
sorts the active dataset in ascending alphabetical order by REGION. - The
AGGREGATE
command saves total sales (variable TOTAL@) for each region in file TOTALS.SAV. -
MATCH FILES
appends the regional totals to each salesperson’s record in the active dataset. (See theMATCH FILES
command for more information on matching files.) -
COMPUTE
obtains the percentage of regional sales (PCT) for each salesperson. - The
LIST
command output displayed below shows that Rodriguez sold 41% of the products sold in Baton Rouge. Gregory accounted for 31% of sales in the Chicago area.
YEAR REGION SALESPER SALES TOTAL@ PCT
1988 BATON ROUGE RODRIGUEZ 300 733.00 41.00
1988 BATON ROUGE SMITH 333 733.00 45.00
1988 BATON ROUGE GRAU 100 733.00 14.00
1988 CHICAGO JONES 900 1300.00 69.00
1988 CHICAGO GREGORY 400 1300.00 69.00
The following figure shows a macro that issues the commands for
obtaining sales percentages. The macro consists of the commands that
produce sales percentages imbedded between macro definition commands DEFINE
and !ENDDEFINE
.
DEFINE !TOTMAC ().
GET FILE = ’SALES88.SAV’.
SORT CASES BY REGION.
AGGREGATE OUTFILE = ’TOTALS.SAV’
/PRESORTED
/BREAK = REGION
/TOTAL@ = SUM(SALES).
MATCH FILES FILE = *
/TABLE = ’TOTALS.SAV’
/BY REGION.
COMPUTE PCT = 100 * SALES / TOTAL@.
TITLE 1988 DATA.
LIST.
!ENDDEFINE.
!TOTMAC.
- Macro definition commands
DEFINE
and!ENDDEFINE
signal the beginning and end of macro processing.DEFINE
also assigns the name!TOTMAC
to the macro (the parentheses following the name of the macro are required). The macro name begins with an exclamation point so that the macro does not conflict with that of an existing variable or command. Otherwise, if the macro name matched a variable name, the variable name would invoke the macro whenever the variable name appeared in the command stream. - Commands between
DEFINE
and!ENDDEFINE
constitute the macro body. These commands produce sales percentages. - The final statement,
!TOTMAC
, is the macro call, which invokes the macro. When the program reads the macro call, it issues the commands in the macro body. Then these commands are executed, generating identical output.
While the macro shows you how to construct a simple macro, it doesn’t
reduce the number of commands needed to calculate regional percentages.
However, you can use macro features such as looping to minimize coding
in more complicated tasks. For example, let’s say that in addition
to the 1988 data, you have sales data for 1989 (SALES89.SAV), and
each file contains the variables REGION, SALESPER, and SALES.
The modified !TOTMAC
macro below calculates regional
sales percentages for each salesperson for 1988 and 1989.
DEFINE !TOTMAC ().
!DO !I = 88 !TO 89.
- GET FILE = !CONCAT(’SALES’, !I, ’.SAV’).
- SORT CASES BY REGION.
- AGGREGATE OUTFILE = ’TOTALS.SAV’
/PRESORTED
/BREAK = REGION
/TOTAL@ = SUM(SALES).
- MATCH FILES FILE = *
/TABLE = ’TOTALS.SAV’
/BY REGION.
- COMPUTE PCT= 100 * SALES / TOTAL@.
- !LET !YEAR = !CONCAT(’19’,!I).
- TITLE !YEAR DATA.
- LIST.
!DOEND.
!ENDDEFINE.
!TOTMAC.
-
DEFINE
and!ENDDEFINE
signal the beginning and end of macro processing. - Commands
!DO
and!DOEND
define an index loop. Commands between!DO
and!DOEND
are issued once in each iteration of the loop. The value of index variable!I
, which changes in each iteration, is 88 in the first iteration and 89 in the second (final) iteration. - In each iteration of the loop, the
GET
command opens a data file. The name of the file is constructed using the string manipulation function!CONCAT
, which creates a string that is the concatenation of SALES, the value of the index variable, and .sav. Thus the file SALES88.SAV is opened in the first iteration. - Commands between
AGGREGATE
andCOMPUTE
calculate percentages on the active dataset. - Next, a customized title is created. In the first iteration, the direct
assignment command
!LET
assigns a value of 1988 to the macro variable!YEAR
. This variable is used in theTITLE
command on the following line to specify a title of 1988 DATA. - The
LIST
command displays the contents of each variable. - In the second iteration of the loop, commands display percentages
for the 1989 data file. The output from the
!TOTMAC
macro is shown below. Note that the listing for 1988 data is the same as before.
1988 DATA
YEAR REGION SALESPER SALES TOTAL@ PCT
1988 BATON ROUGE RODRIGUEZ 300 733.00 41.00
1988 BATON ROUGE SMITH 333 733.00 45.00
1988 BATON ROUGE GRAU 100 733.00 14.00
1988 CHICAGO JONES 900 1300.00 69.00
1988 CHICAGO GREGORY 400 1300.00 69.00
1989 DATA
YEAR REGION SALESPER SALES TOTAL@ PCT
1989 BATON ROUGE GRAU 320 1459.00 22.00
1989 BATON ROUGE SMITH 800 1459.00 55.00
1989 BATON ROUGE RODRIGUEZ 339 1459.00 23.00
1989 CHICAGO JONES 300 1439.00 21.00
1989 CHICAGO STEEL 899 1439.00 62.00
1989 CHICAGO GREGORY 240 1439.00 17.00
Let’s look at another application of the !TOTMAC
macro,
one that uses keyword arguments to make the application more
flexible. The following figure shows the number of absences for students
in two classrooms. Let’s say you want to calculate deviation scores
indicating how many more (or fewer) times a student was absent than
the average student in his or her classroom. The first step in obtaining
deviation scores is to compute the average number of absences per
classroom. We can use the !TOTMAC
macro to compute
classroom means by modifying the macro so that it computes means and
uses the absences data file (SCHOOL.SAV) as input.
CLASS STUDENT ABSENT
101 BARRY G 3
101 JENNI W 1
101 ED F 2
101 JOHN 0 8
102 PAUL Y 2
102 AMY G 3
102 JOHN D 12
102 RICH H 4
The !TOTMAC
macro below can produce a variety
of group summary statistics such as sum, mean, and standard deviation
for any IBM® SPSS® Statistics data
file. In the macro call you specify values of keyword arguments indicating
the data file (FILE
), the break (grouping) variable
(BREAKVR
), the summary function (FUNC
),
and the variable to be used as input to the summary function (INVAR
).
For example, to obtain mean absences for each classroom, we specify SCHOOL.SAV as
the data file, CLASS as the break variable, MEAN
as
the summary function, and ABSENT as the variable whose values
are to be averaged.
DEFINE !TOTMAC ( BREAKVR = !TOKENS(1)
/FUNC = !TOKENS(1)
/INVAR = !TOKENS(1)
/TEMP = !TOKENS(1) !DEFAULT(TOTALS.SAV)
/FILE = !CMDEND).
GET FILE = !FILE.
SORT CASES BY !BREAKVR.
AGGREGATE OUTFILE = ’!TEMP’
/PRESORTED
/BREAK = !BREAKVR
/!CONCAT(!FUNC,’@’) = !FUNC(!INVAR).
MATCH FILES FILE = *
/TABLE = ’!TEMP’
/BY !BREAKVR.
!ENDDEFINE.
!TOTMAC BREAKVR=CLASS FUNC=MEAN INVAR=ABSENT FILE=SCHOOL.SAV.
COMPUTE DIFF = ABSENT-MEAN@.
LIST.
!TOTMAC BREAKVR=REGION FUNC=SUM INVAR=SALES FILE=SALES89.SAV.
COMPUTE PCT = 100 * SALES / SUM@.
LIST.
- The syntax for declaring keyword arguments follows the name of
the macro in
DEFINE
. -
!TOKENS(1)
specifies that the value of an argument is a string following the name of the argument in the macro call. Thus the first macro call specifies CLASS as the value ofBREAKVR
, MEAN as the value ofFUNC
, and ABSENT as the value ofINVAR
. -
!CMDEND
indicates that the value forFILE
is the remaining text in the macro call (SCHOOL.SAV). -
TEMP
is an optional argument that names an intermediate file to contain the summary statistics. SinceTEMP
is not assigned a value in the macro call, summary statistics are written to the default intermediate file (TOTALS.SAV). - In the body of the macro,
GET FILE
opens SCHOOL.SAV. -
SORT CASES
sorts the file by CLASS. -
AGGREGATE
computes the mean number of absences for each class. The name of the variable containing the means (MEAN@) is constructed using the!CONCAT
function, which concatenates the value ofFUNC
and the @ symbol. -
MATCH FILES
appends the means to student records. -
COMPUTE
calculates the deviation from the classroom mean for each student (variable DIFF). -
LIST
displays the deviation scores, as shown in the output below. For example, John D., who was absent 12 times, had 6.75 more absences than the average student in classroom 102. Rich H., who was absent 4 times, had 1.25 fewer absences than the average student in classroom 102. - The second macro call and remaining commands generate regional sales percentages for the 1989 sales data.
CLASS STUDENT ABSENT MEAN@ DIFF
101 BARRY G 3 3.50 -.50
101 JENNI W 1 3.50 -2.50
101 ED F 2 3.50 -1.50
101 JOHN 0 8 3.50 4.50
102 PAUL Y 2 5.25 -3.25
102 AMY G 3 5.25 -2.25
102 JOHN D 12 5.25 6.75
102 RICH H 4 5.25 -1.25
1989 DATA
YEAR REGION SALESPER SALES TOTAL@ PCT
1989 BATON ROUGE GRAU 320 1459.00 22.00
1989 BATON ROUGE SMITH 800 1459.00 55.00
1989 BATON ROUGE RODRIGUEZ 339 1459.00 23.00
1989 CHICAGO JONES 300 1439.00 21.00
1989 CHICAGO STEEL 899 1439.00 62.00
1989 CHICAGO GREGORY 240 1439.00 17.00
You can modify the macro call to specify a different data file,
input variable, break variable, or summary statistic. To get a different
summary statistic (such as standard deviation), change the value of FUNC
(see
the AGGREGATE
command for more information on summary
functions available in the AGGREGATE
procedure).