Creating new variables in the active dataset (Java)

The DataUtil class enables you to add new variables, along with their case values, to the active dataset.

Example

This example creates a new string variable, a new numeric variable, and a new date variable, and populates case values for each. A sample dataset is created first.

// Create a sample dataset with three cases.
String[] command={"DATA LIST FREE /case (A5).",
"BEGIN DATA",
"case1",
"case2",
"case3",
"END DATA."};
StatsUtil.submit(command);
// Specify the new variables: type 0 is numeric, a positive type is the string length.
Variable numVar = new Variable("numvar",0);
Variable strVar = new Variable("strvar",1);
// A date variable is a numeric variable with a DATE format.
Variable dateVar = new Variable("datevar",0);
dateVar.setFormatType(VariableFormat.DATE);
// Case values; only one date value is supplied, so only the first case is populated.
double[] numValues = new double[]{1.0,2.0,3.0};
String[] strValues = new String[]{"a","b","c"};
Calendar dateValue = Calendar.getInstance();
dateValue.set(Calendar.YEAR, 2012);
dateValue.set(Calendar.MONTH, Calendar.JANUARY);
dateValue.set(Calendar.DAY_OF_MONTH, 1);
Calendar[] dateValues = new Calendar[]{dateValue};
// Add each variable, populating values beginning at case index 0.
DataUtil datautil = new DataUtil();
datautil.addVariableWithValue(numVar, numValues, 0);
datautil.addVariableWithValue(strVar, strValues, 0);
datautil.addVariableWithValue(dateVar, dateValues, 0);
datautil.release();
  • The Variable class creates the specification for a new variable to be added to the active dataset. The first argument to the constructor is the name of the variable and the second argument is an integer specifying the variable type. Numeric variables have a variable type of 0 and string variables have a variable type equal to the defined length of the string (maximum of 32767 bytes).
  • The addVariableWithValue method of the DataUtil class adds a new variable to the active dataset. The first argument to the method is the Variable object that specifies the properties of the variable. The second argument is an array of values, one for each case to be populated. The third argument specifies the index of the case at which to begin populating the variable values. Case indexes start at 0 for the first case in the active dataset.

    For numeric variables, cases that are not populated are set to the system-missing value. For string variables, cases that are not populated are set to a blank value. In this example, only the first case is populated for the variable dateVar.

  • Variables representing a date, or a date and a time, in IBM® SPSS® Statistics are numeric variables that have a date or datetime format. In the above example, the variable dateVar is a numeric variable whose format has been set to DATE with the setFormatType method of the associated Variable object. When setting the value for such a variable, use a Java Calendar object as shown in this example.
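
The example above supplies a single Calendar value, so only the first case of dateVar is populated. To populate every case, the array needs one Calendar object per case. The following is a minimal sketch of building such an array using only the standard Java library. The class name DateValues and the method consecutiveDays are hypothetical names introduced for illustration, and the choice of consecutive days is arbitrary. The sketch uses GregorianCalendar so that the time of day is midnight rather than the current time; how the plug-in treats a time-of-day portion for a DATE format is not covered here.

```java
import java.util.Calendar;
import java.util.GregorianCalendar;

class DateValues {
    // Builds one Calendar per case, starting at the given date and
    // advancing one day for each subsequent case.
    static Calendar[] consecutiveDays(int year, int month, int day, int count) {
        Calendar[] values = new Calendar[count];
        for (int i = 0; i < count; i++) {
            // This constructor sets the time of day to midnight in the default time zone.
            // The month is zero-based: Calendar.JANUARY == 0.
            Calendar c = new GregorianCalendar(year, month, day);
            c.add(Calendar.DAY_OF_MONTH, i);
            values[i] = c;
        }
        return values;
    }

    public static void main(String[] args) {
        Calendar[] dateValues = consecutiveDays(2012, Calendar.JANUARY, 1, 3);
        for (Calendar c : dateValues) {
            System.out.printf("%d-%02d-%02d%n",
                c.get(Calendar.YEAR),
                c.get(Calendar.MONTH) + 1,
                c.get(Calendar.DAY_OF_MONTH));
        }
        // With the plug-in available, the array would be passed as in the example:
        // datautil.addVariableWithValue(dateVar, dateValues, 0);
    }
}
```

With three cases in the active dataset, an array of three Calendar objects passed with a starting case index of 0 would supply a value for each case.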

Note: To save the modified active dataset to an external file, use the submit method (following the release method) to submit a SAVE command, as in:

StatsUtil.submit("SAVE OUTFILE='/data/mydata.sav'.");

Example: Multiple data passes

Sometimes more than one pass of the data is required, as in the following example involving two data passes. The first data pass is used to read the data and compute a summary statistic. The second data pass is used to add a summary variable to the active dataset.

String[] command={"DATA LIST FREE /var (F).",
"BEGIN DATA",
"40200",
"21450",
"21900",
"END DATA."};
StatsUtil.submit(command);
double total = 0.0;
DataUtil datautil = new DataUtil();
// First data pass: read the cases and accumulate the total.
Case[] data = datautil.fetchCases(false, 0);
for(Case onecase: data){
   total = total + onecase.getDoubleCellValue(0);
}
double meanval = total/data.length;
// Second data pass: add the mean as a new variable with the same value for every case.
Variable mean = new Variable("mean",0);
double[] meanVals = new double[data.length];
for (int i=0;i<data.length;i++){
   meanVals[i]=meanval;
}
datautil.addVariableWithValue(mean, meanVals, 0);
datautil.release();
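
The arithmetic behind the two passes can be checked without the plug-in. The following is a minimal sketch in plain Java that uses the three sample values from the example. The class name MeanColumn and its methods are hypothetical names introduced for illustration; the sketch uses Arrays.fill in place of the explicit loop and makes no plug-in calls.

```java
import java.util.Arrays;

class MeanColumn {
    // First pass: arithmetic mean of the values read from the dataset.
    static double mean(double[] values) {
        if (values.length == 0) {
            throw new IllegalArgumentException("at least one case is required");
        }
        double total = 0.0;
        for (double v : values) {
            total += v;
        }
        return total / values.length;
    }

    // Input for the second pass: the same value repeated once per case.
    static double[] constantColumn(double value, int caseCount) {
        double[] column = new double[caseCount];
        Arrays.fill(column, value);
        return column;
    }

    public static void main(String[] args) {
        double[] values = {40200, 21450, 21900};   // sample data from the example
        double[] meanVals = constantColumn(mean(values), values.length);
        System.out.println(Arrays.toString(meanVals));
        // prints [27850.0, 27850.0, 27850.0]
    }
}
```

An array built this way has one entry per case, which matches what the example passes to addVariableWithValue with a starting case index of 0.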