Running a basic statistical analysis on query results

You can generate basic statistics from your query results by using QMF Analytics for TSO.

About this task

Basic Statistics show sample size, minimum value, maximum value, range, median, mode, mean, standard error, standard deviation, skewness, kurtosis, quartiles, and sextiles for your query result data.

The following procedure shows an example of how to analyze query results by using Basic Statistics.

Procedure

  1. Start QMF for TSO.

    Your QMF administrator can tell you how to start a QMF session and give you a user ID. Check with your QMF administrator if you have any questions about getting started with QMF.

  2. From the QMF Home Panel, enter DISPLAY Q.CLIMATE_10YR on the command line and press Enter.

    QMF for TSO displays the query result as a multi-columned report.

    Figure 1. Query Results
    REPORT           Q.CLIMATE_10YR           LINE 1      POS 1      79     
                                                                                    
                                                                                    
                                                                                    
              YEAR        MONTH      TEMPMIN      TEMPMAX     RAINFALL     SUNSHINE 
       -----------  -----------  -----------  -----------  -----------  ----------- 
              2001            1            9           70            3          234 
              2001            2           18           72            7          205 
              2001            3           16           77           12          180 
              2001            4           32           91            3          230 
              2001            5           32           95            4          234 
              2001            6           41          115            3          230 
              2001            7           43          111            1          227 
              2001            8           39          115            3          238 
              2001            9           32           93            6          226 
              2001           10           27           88            6          221 
              2001           11           14           79           12          183 
              2001           12           19           73            9          204 
              2002            1           27           77           11          185 
              2002            2           25           75           45           25 
              2002            3           25           88           17          161 
              2002            4           32           90            3          226 
              2002            5           32          100            2          225 
              2002            6           43          108            4          241 
              2002            7           46          111            4          228 
              2002            8           45          113            2          238 
              2002            9           39          102            9          197 
              2002           10           32           90            3          227 
              2002           11           27           79            5          221 
              2002           12           23           66            8          219 
              2003            1           14           64           12          195 
              2003            2           16           75            8          190 
              2003            3           25           73           10          187 
              2003            4           27           97            1          233 
              2003            5           36          102            7          220 
              2003            6           45          115            3          242 
              2003            7           37          115            1          237 
              2003            8           41          117            1          240 
              2003            9           39          104            2          235 
     1=Help         2=            3=End       4=Print       5=Chart        6=Query  
     7=Backward     8=Forward     9=Form     10=Left       11=Right       12=       
     OK,  Q.CLIMATE_10YR is displayed.                                              
     COMMAND ===>                                                  SCROLL ===> PAGE 
  3. Enter SHOW ANALYTICS on the command line and press Enter to start QMF Analytics for TSO.
  4. Tab to Basic in the Statistics section of the QMF Analytics for TSO Home panel and press Enter.

    The Parameter Selection panel is displayed.

  5. Create the specification for your analysis.
    Set your parameters as shown in the following table.

    You can enter column names manually or you can tab to the field and press the List function key (F4).

    The List function opens a Column Selection window that lists all of the column names from the query result. To select a column from the Column Selection window, tab to the column name and press Enter. QMF Analytics for TSO adds the name to the Parameter Selection panel automatically.

    Table 1. Parameter Settings for Basic Statistics

    Parameter: Select all numeric columns
    Setting: All
        QMF Analytics for TSO lists all columns with numeric data in the Which column(s) to be analyzed? field.
        Note: You can also specify which columns from the query result to analyze by entering the column names manually or by using the List function key (F4).

    Parameter: Chart Title
    Setting: Enter a title for your chart in the Chart Title field or accept the default value of asterisk (*).
        If you accept the default value of asterisk (*), QMF Analytics for TSO creates a title based on the columns that you have selected.
        If you delete the asterisk (*) and leave this field blank, the analysis output will not include a title.

  6. Optional: Press the Save function key (F5) to save the specification.

    The specification is saved to the QMF database as an ANALYTIC object. In the Save Command Prompt, you can enter object names manually, or you can tab to the field and press the List function key (F4) to see all available objects in the Analytic Object List.

    Saving the specification to the database allows you to run your analysis directly from the QMF command line at a later time, without having to navigate to the QMF Analytics for TSO Home panel and reenter the parameters. Saving specifications makes sense if you plan on running the analysis frequently or multiple times.
    Note: The global variable DSQEC_SESSGLV_SAV in QMF controls how the autosave for parameter settings works. For information about Global Variables, see Global Variable List in the QMF help.
  7. Press the Run function key (F2) to analyze the data.

    QMF Analytics for TSO returns a list of the variables (the numeric columns) in your analysis and for each one shows the sample size, minimum value, maximum value, and mean. The following columns are standard measures in basic statistics: Size, Minimum, Maximum, and Mean.

    In Figure 2, the Size column shows the number of data samples collected during the observation period, which is the number of rows in the table. In this example, the data contains one sample per month, so the value in the Size column (120) is equivalent to the number of months.

    Figure 2. Summary Data for Basic Statistical Analysis of Query Results
                                        Basic                                       
                                                                                    
    Name                                   Size     Minimum     Maximum        Mean 
    YEAR                                    120    2001.000    2010.000    2005.500 
    MONTH                                   120       1.000      12.000       6.500 
    TEMPMIN                                 120       9.000      48.000      29.350 
    TEMPMAX                                 120      64.000     120.000      91.550 
    RAINFALL                                120       1.000      45.000       6.917 
    SUNSHINE                                120      25.000     242.000     211.900 
                                                                                    
                                                                                    
                                                                                    
                                                                                    
                                                                                    
                                                                                    
                                                                                    
                                                                                    
                                                                                    
                                                                                    
                                                                                    
                                                                                    
                                                                                    
                                                                                    
                                                                                    
    1=Help   2=        3=End     4=Print       5=Summary       6=                   
    7=Back   8=Forward 9=       10=Statistics 11=Distribution 12=Chi-Square         

    The largest observation for each variable is listed in the column titled Maximum and the smallest observation for each variable is listed in the column titled Minimum.
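
The four summary measures can be cross-checked outside QMF. The following Python sketch is illustrative only: it uses just the twelve 2001 rows visible in Figure 1, so its output does not match Figure 2, which covers all 120 rows. The names `columns` and `summarize` are hypothetical and are not part of QMF.

```python
from statistics import mean

# The twelve 2001 rows from Figure 1 (a subset of the 120-row result).
columns = {
    "TEMPMIN":  [9, 18, 16, 32, 32, 41, 43, 39, 32, 27, 14, 19],
    "TEMPMAX":  [70, 72, 77, 91, 95, 115, 111, 115, 93, 88, 79, 73],
    "RAINFALL": [3, 7, 12, 3, 4, 3, 1, 3, 6, 6, 12, 9],
    "SUNSHINE": [234, 205, 180, 230, 234, 230, 227, 238, 226, 221, 183, 204],
}


def summarize(values):
    """Return (size, minimum, maximum, mean) for one numeric column."""
    return len(values), min(values), max(values), mean(values)


summary = {name: summarize(values) for name, values in columns.items()}

# Print a layout loosely modeled on the Summary page in Figure 2.
print(f"{'Name':<10}{'Size':>6}{'Minimum':>12}{'Maximum':>12}{'Mean':>12}")
for name, (size, low, high, avg) in summary.items():
    print(f"{name:<10}{size:>6}{low:>12.3f}{high:>12.3f}{avg:>12.3f}")
```

Each column is summarized independently, which mirrors how the Summary page lists one row per numeric variable.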

    From the Summary page you can tab to a specific variable and run statistical analyses on the variable selected.

    The following table lists and describes the functions available from the Summary page.

    Table 2. Statistical analysis functions available from the Summary page

    Function: Statistics (F10)
    Result: Displays additional statistics for the column selected. See Basic statistics for Rainfall for an example.
    Usage notes: You can use the Forward and Backward function keys to scroll through basic statistics for each column.

    Function: Distribution (F11)
    Result: Displays a distribution graph of the column data being analyzed. See Distribution statistics for Rainfall for an example.
    Usage notes: You can use the Forward and Backward function keys to scroll through distribution statistics for each column.

    Function: Chi-square (F12)
    Result: Displays a Chi-square graph of the column data being analyzed. See Chi-square for an example.
        Chi-square is a value used in statistical analyses as a basis for rejecting or accepting the null hypothesis.
        By running a Chi-square Test, you can see whether a sample is normally distributed.
    Usage notes: You can use the Forward and Backward function keys to scroll through chi-square statistics for each column.
  8. Tab to RAINFALL in the Name column and press the Statistics function key (F10).
    Figure 3. Additional Statistics - Rainfall
                                        Basic                               
                                                                            
    RAINFALL                                                                
                        Size         120                                    
                                                                            
                     Minimum       1.000                                    
                     Maximum      45.000                                    
                       Range      44.000                                    
                                                                            
                      Median       5.000       Quartile 1       3.000       
                        Mode       3.000       Quartile 2       5.000       
                        Mean       6.917       Quartile 3       9.000       
                                                                            
              Standard Error       0.611        Sextile 1       2.000       
          Standard Deviation       6.691        Sextile 2       3.000       
                    Skewness       2.583        Sextile 3       5.000       
                    Kurtosis      12.186        Sextile 4       8.000       
                                                Sextile 5      10.000       
                                                                            
                                                                            
                                                                            
                                                                            
                                                                            
                                                                            
                                                                            
                                                                            
                                                                            
                                                                            
                                                                            
                                                                            
                                                                            
                                                                            
                                                                            
                                                                            
                                                                            
                                                                            
                                                                            
                                                                            
                                                                            
                                                                            
    1=Help   2=        3=End     4=Print       5=Summary       6=           
    7=Back   8=Forward 9=       10=Statistics 11=Distribution 12=Chi-Square 
    Press a function key for the appropriate display.                       
    In addition to the standard measures for basic statistics, the Statistics function shows the following:
    Mode
    The value of a variable that occurs most often in a distribution.
    Median
    A measure of central tendency; the midpoint of a distribution.
    Range
    The difference between the maximum and minimum values.
    Standard deviation
    A measure of dispersion. This measures the spread of your data relative to its arithmetic mean. The standard deviation is measured in the same units as the data.
    Standard error
    The standard deviation of the sampling distribution of the mean, estimated as the standard deviation divided by the square root of the sample size.
    Skewness
    A measure of the asymmetry of the values in your distribution. If the distribution has a longer tail to the high end of the scale, your data is positively skewed. If the distribution has a longer tail to the low end of the scale, your data is negatively skewed.
    Quartiles
    Three values that divide a distribution into four parts, such that a quarter of the numbers fall below the first quartile, half below the second quartile, and three quarters below the third quartile.

    The "numbers" in this definition are the samples counted in the Size value. For example, the first quartile is the value below which about one quarter of the samples fall (about 30 of the 120 samples in this example).

    Sextiles
    Five values that divide a distribution into six parts, such that a sixth of the numbers fall below the first sextile, a third below the second sextile, half below the third sextile, and so on.

    The "numbers" in this definition are the samples counted in the Size value. For example, the first sextile is the value below which about one sixth of the samples fall (about 20 of the 120 samples in this example).
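
The definitions above can be tried out with a short Python sketch. It is an illustration under stated assumptions, not a description of how QMF Analytics for TSO computes its values: it uses only the 33 RAINFALL rows visible in Figure 1 (so the numbers differ from Figure 3), the skewness is a simple moment-based estimate, and `statistics.quantiles` uses Python's default "exclusive" method, which may not match the product's method.

```python
from math import sqrt
from statistics import mean, median, mode, quantiles, stdev

# RAINFALL values from the 33 rows visible in Figure 1.
rainfall = [3, 7, 12, 3, 4, 3, 1, 3, 6, 6, 12, 9,
            11, 45, 17, 3, 2, 4, 4, 2, 9, 3, 5, 8,
            12, 8, 10, 1, 7, 3, 1, 1, 2]

size = len(rainfall)
value_range = max(rainfall) - min(rainfall)
middle = median(rainfall)
most_frequent = mode(rainfall)
center = mean(rainfall)
spread = stdev(rainfall)              # sample standard deviation (n - 1)
standard_error = spread / sqrt(size)  # standard error of the mean

# Moment-based skewness: third central moment divided by the cubed
# population standard deviation. A positive value means a longer high tail.
m2 = sum((x - center) ** 2 for x in rainfall) / size
m3 = sum((x - center) ** 3 for x in rainfall) / size
skewness = m3 / m2 ** 1.5

quartiles = quantiles(rainfall, n=4)  # three cut points
sextiles = quantiles(rainfall, n=6)   # five cut points

print(f"Range {value_range}  Median {middle}  Mode {most_frequent}")
print(f"Mean {center:.3f}  Std dev {spread:.3f}  Std error {standard_error:.3f}")
print(f"Skewness {skewness:.3f}")
print("Quartiles", quartiles)
print("Sextiles", sextiles)
```

The second quartile and the third sextile both mark the halfway point of the sorted data, which is why Figure 3 shows the same value for Median, Quartile 2, and Sextile 3.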

  9. Press the Distribution function key (F11) to display distribution statistics for the selected column.
    Figure 4. Distribution statistics - Rainfall

    This graph shows the frequency of rainfall values from the query result during the 120-month period. It indicates that larger rainfall values occur less frequently, and that the mean rainfall is 6.92 millimeters.

    The X-axis shows the rainfall sample value and the Y-axis shows the frequency.
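
A frequency distribution of this kind can be approximated with a simple tally. The following Python sketch is a rough illustration, assuming one bar per distinct rainfall value; the binning that QMF Analytics for TSO uses for its graph is not described here. It uses only the 33 RAINFALL rows visible in Figure 1.

```python
from collections import Counter

# RAINFALL values from the 33 rows visible in Figure 1.
rainfall = [3, 7, 12, 3, 4, 3, 1, 3, 6, 6, 12, 9,
            11, 45, 17, 3, 2, 4, 4, 2, 9, 3, 5, 8,
            12, 8, 10, 1, 7, 3, 1, 1, 2]

# Count how many times each rainfall value occurs.
frequency = Counter(rainfall)

# Text histogram: rainfall value on the left (X-axis in the graph),
# one asterisk per occurrence on the right (Y-axis in the graph).
for value in sorted(frequency):
    print(f"{value:>3} | {'*' * frequency[value]}")
```

Even in this small subset, low rainfall values account for most of the rows and the large values appear only once each.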

  10. Press the Chi-Square function key (F12) to display Chi-Square statistics for the selected column.
    Figure 5. Chi-Square Statistics - Rainfall

    Figure 5 indicates that the hypothesis that the rainfall values follow a normal distribution is rejected.

    The test reports a decision at the 95% and 99% confidence levels, and in this example both decisions are Different. If you accept a 5% chance of making a wrong decision about the rainfall distribution, the Chi-Square criterion rejects the hypothesis of normality, so the distribution is considered different from normal.

    The same holds true if you accept a 1% error level.
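
The reasoning behind such a decision can be sketched as a chi-square goodness-of-fit test. The following Python example is a minimal sketch under several assumptions that are not taken from the product: it fits a normal distribution to the data, splits it into six equal-probability bins, and compares the statistic with standard critical values for 3 degrees of freedom (six bins, minus one, minus two estimated parameters). The function name `chi_square_normality` and the label "Not different" are hypothetical, and only the 33 RAINFALL rows visible in Figure 1 are used.

```python
from statistics import NormalDist, mean, stdev

# RAINFALL values from the 33 rows visible in Figure 1.
rainfall = [3, 7, 12, 3, 4, 3, 1, 3, 6, 6, 12, 9,
            11, 45, 17, 3, 2, 4, 4, 2, 9, 3, 5, 8,
            12, 8, 10, 1, 7, 3, 1, 1, 2]

# Standard chi-square critical values for 3 degrees of freedom.
CRITICAL_DF3 = {"95%": 7.815, "99%": 11.345}


def chi_square_normality(values, bins=6):
    """Chi-square statistic against a normal fit, equal-probability bins."""
    fitted = NormalDist(mean(values), stdev(values))
    # Interior bin edges at the 1/bins, 2/bins, ... quantiles of the fit.
    edges = [fitted.inv_cdf(i / bins) for i in range(1, bins)]
    observed = [0] * bins
    for v in values:
        # The bin index is the number of edges at or below the value.
        observed[sum(v >= edge for edge in edges)] += 1
    expected = len(values) / bins  # same expected count in every bin
    return sum((o - expected) ** 2 / expected for o in observed)


statistic = chi_square_normality(rainfall)
decisions = {
    level: "Different" if statistic > critical else "Not different"
    for level, critical in CRITICAL_DF3.items()
}

print(f"Chi-square statistic: {statistic:.3f}")
for level, decision in decisions.items():
    print(f"Confidence level {level}: {decision}")
```

On these 33 rows the statistic is well above both critical values, so the sketch reports Different at both confidence levels, in line with the decision described for Figure 5.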

  11. Optional: If you have configured print capabilities in QMF, you can print the chart.
  12. Press the End function key (F3) to return to the Parameter Selection panel.
  13. Press the End function key (F3) to return to the QMF Analytics for TSO Home panel.