About cookies on this site Our websites require some cookies to function properly (required). In addition, other cookies may be used with your consent to analyze site usage, improve the user experience and for advertising. For more information, please review your options. By visiting our website, you agree to our processing of information as described in IBM’sprivacy statement. To provide a smooth navigation, your cookie preferences will be shared across the IBM web domains listed here.
Troubleshooting
Problem
I'm using SPSS for Windows and I have a data file with thousands of variables. I'm looking for a way I can automatically or programatically delete all string variables which are blank throughout the entire dataset. In other words, if there are 1000 respondents, then I want to delete all the string variables which are blank for all the 1000 records. I suppose it would be nice to have a way to do this for the numeric or non-string variables as well. Is there any way to do this without scripting?
Resolving The Problem
Yes, there is. This would probably be much easier to do with the SPSS-Python Programmability extension, but it can be performed solely with SPSS command syntax through some creative use of the Output Management System and macros.
NOTE: the following routines assume that you are using at least version 14 because they make use of multiple datasets.
To use the following examples, please create a directory called c:\temp and open the Employee Data.sav sample dataset. Create 2 new variables: 1 string called 'blankstring' and one numeric called 'blanknumeric'. Leave them blank, of course. Save the file as c:\temp\blanktest.sav.
Then, for the deletion of all string variables:
*DeleteEmptyStringVariables.
*the following SET MPRINT command lets you see exactly how the macro
parameters are being passed into the expanded text.
*you may also wish to go to Edit->Options->Viewer and check
'Display commands in the log'.
SET MPRINT = ON.
*first open the file you wish to delete the blank string variables from
and name the dataset.
GET FILE 'c:\temp\blanktest.sav'.
DATASET NAME main.
*The Output Management System (OMS) lets you capture output information
and route it to an external file.
*In this case we want the table result of the DISPLAY DICTIONARY command
to be brought into another SPSS dataset so that we can access the format
information for each variable.
* OMS.
DATASET DECLARE fileinfo.
OMS
/SELECT TABLES
/IF COMMANDS = ["File Information"]
SUBTYPES = ["Variable Information"]
/DESTINATION FORMAT = SAV NUMBERED = TableNumber_
OUTFILE = fileinfo.
DISPLAY DICTIONARY.
OMSEND.
*Now activate the 'fileinfo' dataset and select out those
cases that contain writeformats = Ann, where 'nn' are numbers.
*These will be the string variables in the dataset.
DATASET ACTIVATE fileinfo.
SELECT IF SUBSTR(writeformat,1,1) = 'A' AND INDEX('1234567890',SUBSTR(writeformat,2,1)) > 0.
EXECUTE.
*Now we use the WRITE command to build an external command syntax file
containing lines that will invoke 2 macros per string variable.
WRITE OUTFILE = 'c:\temp\blnkchk.sps'
/1 '!blankcheck ' Var1
/2 '!blankdelete ' Var1
.
EXECUTE.
DATASET CLOSE fileinfo.
*First, we'll define the 'blankcheck' macro to check the variable for blank data and set
another macro value equal to 0 if all blank or * if there were non-blank records.
DEFINE !blankcheck (!POSITIONAL !TOKENS(1))
DATASET ACTIVATE main.
COMPUTE dummy = 1.
COMPUTE flag = 1.
IF !1 = ' ' flag = 0.
EXECUTE.
DATASET DECLARE aggset.
AGGREGATE
/OUTFILE='aggset'
/BREAK=dummy
/blanksum = SUM(flag).
DATASET ACTIVATE aggset.
FORMATS blanksum (F1.0).
WRITE OUTFILE = 'c:\temp\sumdef.sps'
/1 'DEFINE !sumval ()' blanksum ' !ENDDEFINE.'.
EXECUTE.
INCLUDE 'c:\temp\sumdef.sps'.
DATASET ACTIVATE main.
DATASET CLOSE aggset.
!ENDDEFINE.
*Now we define the 'blankdelete' macro to take the 0/* value
created in 'blankcheck' and run a DELETE VARIABLES command
only if it's 0.
DEFINE !blankdelete (!POSITIONAL !TOKEN(1))
!IF ( !EVAL(!sumval) = 0 ) !THEN
DELETE VARIABLES !1.
!IFEND.
!ENDDEFINE.
*And finally, with those two macros now defined, we can invoke them
using the blnkchk.sps command syntax file we wrote earlier.
INCLUDE 'c:\temp\blnkchk.sps'.
**************************************************.
You'll need a slightly different routine to delete the numeric variables, because you'll need a different way to detect a) the numeric variables, and b) whether they are all fully missing (instead of blank).
*DeleteEmptyNonStringVariables.
*the following SET MPRINT command lets you see exactly how the macro
parameters are being passed into the expanded text.
*you may also wish to go to Edit->Options->Viewer and check
'Display commands in the log'.
SET MPRINT = ON.
*first open the file you wish to delete the blank string variables from
and name the dataset.
GET FILE 'c:\temp\blanktest.sav'.
DATASET NAME main.
*The Output Management System (OMS) lets you capture output information
and route it to an external file.
*In this case we want the table result of the DISPLAY DICTIONARY command
to be brought into another SPSS dataset so that we can access the format
information for each variable.
* OMS.
DATASET DECLARE fileinfo.
OMS
/SELECT TABLES
/IF COMMANDS = ["File Information"]
SUBTYPES = ["Variable Information"]
/DESTINATION FORMAT = SAV NUMBERED = TableNumber_
OUTFILE = fileinfo.
DISPLAY DICTIONARY.
OMSEND.
*Now activate the 'fileinfo' dataset and select out those
cases that contain writeformats <> Ann, where 'nn' are numbers.
*These will be the non-string variables in the dataset.
DATASET ACTIVATE fileinfo.
SELECT IF SUBSTR(writeformat,1,1) <> 'A' OR INDEX('1234567890',SUBSTR(writeformat,2,1)) = 0.
EXECUTE.
*Now we use the WRITE command to build an external command syntax file
containing lines that will invoke 2 macros per non-string variable.
WRITE OUTFILE = 'c:\temp\sysmischk.sps'
/1 '!sysmischeck ' Var1
/2 '!sysmisdelete ' Var1
.
EXECUTE.
DATASET CLOSE fileinfo.
*First, we'll define the 'sysmischeck' macro to check the variable for sysmis data and set
another macro value equal to 0 if all sysmis or * if there were non-sysmis records.
DEFINE !sysmischeck (!POSITIONAL !TOKENS(1))
DATASET ACTIVATE main.
COMPUTE dummy = 1.
COMPUTE flag = 1.
IF SYSMIS(!1) flag = 0.
EXECUTE.
DATASET DECLARE aggset.
AGGREGATE
/OUTFILE='aggset'
/BREAK=dummy
/sysmissum = SUM(flag).
DATASET ACTIVATE aggset.
FORMATS sysmissum (F1.0).
WRITE OUTFILE = 'c:\temp\sumdef.sps'
/1 'DEFINE !sumval ()' sysmissum ' !ENDDEFINE.'.
EXECUTE.
INCLUDE 'c:\temp\sumdef.sps'.
DATASET ACTIVATE main.
DATASET CLOSE aggset.
!ENDDEFINE.
*Now we define the 'sysmisdelete' macro to take the 0/* value
created in 'sysmischeck' and run a DELETE VARIABLES command
only if it's 0.
DEFINE !sysmisdelete (!POSITIONAL !TOKEN(1))
!IF ( !EVAL(!sumval) = 0 ) !THEN
DELETE VARIABLES !1.
!IFEND.
!ENDDEFINE.
*And finally, with those two macros now defined, we can invoke them
using the sysmischk.sps command syntax file we wrote earlier.
INCLUDE 'c:\temp\sysmischk.sps'.
DELETE VARIABLES dummy flag.
Related Information
[{"Product":{"code":"SSLVMB","label":"IBM SPSS Statistics"},"Business Unit":{"code":"BU059","label":"IBM Software w\/o TPS"},"Component":"Not Applicable","Platform":[{"code":"PF025","label":"Platform Independent"}],"Version":"18.0","Edition":"","Line of Business":{"code":"LOB10","label":"Data and AI"}}]
Historical Number
69466
Was this topic helpful?
Document Information
Modified date:
16 April 2020
UID
swg21481480
Manage My Notification Subscriptions