Running a basic statistical analysis on query results
You can generate basic statistics from your query results by using QMF Analytics for TSO.
About this task
Basic Statistics show sample size, minimum value, maximum value, range, median, mode, mean, standard error, standard deviation, skewness, kurtosis, quartiles, and sextiles for your query result data.
The following procedure shows an example of how to analyze query results by using Basic Statistics.
Procedure
- Start QMF for TSO.
Your QMF administrator can tell you how to start a QMF session and give you a user ID. Check with your QMF administrator if you have any questions about getting started with QMF.
- From the QMF Home Panel, enter DISPLAY Q.CLIMATE_10YR on the command line and press Enter.
QMF for TSO displays the query result as a multi-columned report.
Figure 1. Query Results
REPORT Q.CLIMATE_10YR LINE 1 POS 1 79
       YEAR        MONTH      TEMPMIN      TEMPMAX     RAINFALL     SUNSHINE
-----------  -----------  -----------  -----------  -----------  -----------
       2001            1            9           70            3          234
       2001            2           18           72            7          205
       2001            3           16           77           12          180
       2001            4           32           91            3          230
       2001            5           32           95            4          234
       2001            6           41          115            3          230
       2001            7           43          111            1          227
       2001            8           39          115            3          238
       2001            9           32           93            6          226
       2001           10           27           88            6          221
       2001           11           14           79           12          183
       2001           12           19           73            9          204
       2002            1           27           77           11          185
       2002            2           25           75           45           25
       2002            3           25           88           17          161
       2002            4           32           90            3          226
       2002            5           32          100            2          225
       2002            6           43          108            4          241
       2002            7           46          111            4          228
       2002            8           45          113            2          238
       2002            9           39          102            9          197
       2002           10           32           90            3          227
       2002           11           27           79            5          221
       2002           12           23           66            8          219
       2003            1           14           64           12          195
       2003            2           16           75            8          190
       2003            3           25           73           10          187
       2003            4           27           97            1          233
       2003            5           36          102            7          220
       2003            6           45          115            3          242
       2003            7           37          115            1          237
       2003            8           41          117            1          240
       2003            9           39          104            2          235
1=Help 2= 3=End 4=Print 5=Chart 6=Query
7=Backward 8=Forward 9=Form 10=Left 11=Right 12=
OK, Q.CLIMATE_10YR is displayed.
COMMAND ===> SCROLL ===> PAGE
- Enter SHOW ANALYTICS on the command line and press Enter to start QMF Analytics for TSO.
- Tab to Basic in the Statistics section of the QMF Analytics for TSO Home panel and press Enter.
The Parameter Selection panel is displayed.
- Create the specification for your analysis.
Set your parameters as shown in the following table.
You can enter column names manually or you can tab to the field and press the List function key (F4).
The List function opens a Column Selection window that lists all of the column names from the query result. To select a column from the Column Selection window, tab to the column name and press Enter. QMF Analytics for TSO adds the name to the Parameter Selection panel automatically.
Table 1. Parameter Settings for Basic Statistics
Parameter: Select all numeric columns
Setting: All
QMF Analytics for TSO lists all columns with numeric data in the Which column(s) to be analyzed? field.
Note: You can also specify which columns from the query result to analyze by entering the column names manually or by using the List function key (F4).
Parameter: Chart Title
Setting: Enter a title for your chart in the Chart Title field or accept the default value of asterisk (*).
If you accept the default value of asterisk (*), QMF Analytics for TSO creates a title based on the columns that you have selected.
If you delete the asterisk (*) and leave this field blank, the analysis output will not include a title.
- Optional: Press the Save function key (F5) to save the specification.
The specification is saved to the QMF database as an ANALYTIC object. In Save Command Prompt, you can enter object names manually or you can tab to the field and press the List function key (F4) to see all available objects in Analytic Object List.
Saving the specification to the database allows you to run your analysis directly from the QMF command line at a later time, without having to navigate to the QMF Analytics for TSO Home panel and reenter the parameters. Saving specifications makes sense if you plan to run the analysis frequently.
Note: The global variable DSQEC_SESSGLV_SAV in QMF controls how the autosave for parameter settings works. For information about global variables, see Global Variable List in the QMF help.
- Press the Run function key (F2) to analyze the data.
QMF Analytics for TSO returns a list of the variables (the numeric columns) in your analysis and, for each one, shows the standard measures in basic statistics: Size (the sample size), Minimum, Maximum, and Mean.
In Figure 2, the Size column is the number of data samples collected during the observation period, which is the number of rows in the table. In this example, the value in the Size column equals the number of months, because the data contains one sample per month.
Figure 2. Summary Data for Basic Statistical Analysis of Query Results
Basic
Name          Size    Minimum    Maximum       Mean
YEAR           120   2001.000   2010.000   2005.500
MONTH          120      1.000     12.000      6.500
TEMPMIN        120      9.000     48.000     29.350
TEMPMAX        120     64.000    120.000     91.550
RAINFALL       120      1.000     45.000      6.917
SUNSHINE       120     25.000    242.000    211.900
1=Help 2= 3=End 4=Print 5=Summary 6=
7=Back 8=Forward 9= 10=Statistics 11=Distribution 12=Chi-Square
The largest observation for each variable is listed in the column titled Maximum and the smallest observation for each variable is listed in the column titled Minimum.
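As an illustration outside QMF, the four summary measures can be reproduced with a few lines of Python. This is a minimal sketch, not how QMF Analytics for TSO computes them: it uses only the twelve rows for 2001 from Figure 1, so its results differ from the 120-row values in Figure 2, and the names `summarize`, `ROWS_2001`, and `SUMMARY` are hypothetical.

```python
# Rows for 2001 copied from Figure 1; the full table has 120 rows.
COLUMNS = ("YEAR", "MONTH", "TEMPMIN", "TEMPMAX", "RAINFALL", "SUNSHINE")
ROWS_2001 = [
    (2001, 1, 9, 70, 3, 234),
    (2001, 2, 18, 72, 7, 205),
    (2001, 3, 16, 77, 12, 180),
    (2001, 4, 32, 91, 3, 230),
    (2001, 5, 32, 95, 4, 234),
    (2001, 6, 41, 115, 3, 230),
    (2001, 7, 43, 111, 1, 227),
    (2001, 8, 39, 115, 3, 238),
    (2001, 9, 32, 93, 6, 226),
    (2001, 10, 27, 88, 6, 221),
    (2001, 11, 14, 79, 12, 183),
    (2001, 12, 19, 73, 9, 204),
]


def summarize(rows, columns):
    """Return {column name: (size, minimum, maximum, mean)}."""
    summary = {}
    for index, name in enumerate(columns):
        values = [row[index] for row in rows]
        summary[name] = (
            len(values),                # Size: number of samples (rows)
            min(values),                # Minimum: smallest observation
            max(values),                # Maximum: largest observation
            sum(values) / len(values),  # Mean: arithmetic average
        )
    return summary


SUMMARY = summarize(ROWS_2001, COLUMNS)
for name, (size, low, high, mean) in SUMMARY.items():
    print(f"{name:<10}{size:>6}{low:>10.3f}{high:>10.3f}{mean:>10.3f}")
```

Each printed line mirrors one row of the Summary page layout: Name, Size, Minimum, Maximum, Mean.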
From the Summary page you can tab to a specific variable and run statistical analyses on the variable selected.
The following table lists and describes the functions available from the Summary page.
Table 2. Statistical analysis functions available from the Summary page
Function: Statistics (F10)
Result: Displays additional statistics for the column selected. See Basic statistics for Rainfall for an example.
Usage notes: You can use the Forward and Backward function keys to scroll through basic statistics for each column.
Function: Distribution (F11)
Result: Displays a distribution graph of the column data being analyzed. See Distribution statistics for Rainfall for an example.
Usage notes: You can use the Forward and Backward function keys to scroll through distribution statistics for each column.
Function: Chi-square (F12)
Result: Displays a Chi-square graph of the column data being analyzed. See Chi-square for an example.
Chi-square is a value used in statistical analyses as a basis for rejecting or accepting the null hypothesis.
By running a Chi-square Test, you can see whether a sample is normally distributed.
Usage notes: You can use the Forward and Backward function keys to scroll through chi-square statistics for each column.
- Tab to RAINFALL in the Name column and press the Statistics function key (F10).
Figure 3. Additional Statistics - Rainfall
Basic RAINFALL
Size                    120
Minimum               1.000
Maximum              45.000
Range                44.000
Median                5.000    Quartile 1     3.000
Mode                  3.000    Quartile 2     5.000
Mean                  6.917    Quartile 3     9.000
Standard Error        0.611    Sextile 1      2.000
Standard Deviation    6.691    Sextile 2      3.000
Skewness              2.583    Sextile 3      5.000
Kurtosis             12.186    Sextile 4      8.000
                               Sextile 5     10.000
1=Help 2= 3=End 4=Print 5=Summary 6=
7=Back 8=Forward 9= 10=Statistics 11=Distribution 12=Chi-Square
Press a function key for the appropriate display.
In addition to the standard measures for basic statistics, the Statistics function shows the following:
- Mode: The value of a variable that occurs most often in a distribution.
- Median: A measure of central tendency; the midpoint of a distribution.
- Range: The difference between the maximum and minimum values.
- Standard deviation: A measure of dispersion. It measures the spread of your data relative to its arithmetic mean and is expressed in the same units as the data.
- Standard error: The standard deviation of the sampling distribution of a statistic, here the mean of the variable.
- Skewness: A measure of the asymmetry of the values in your distribution. If the distribution has a longer tail to the high end of the scale, your data is positively skewed. If the distribution has a longer tail to the low end of the scale, your data is negatively skewed.
- Quartiles: Three values that divide a distribution into four parts, such that a quarter of the numbers fall under the first quartile, half under the second quartile, and so on. The "numbers" are the samples counted in Size. For example, the first quartile is the value below which one quarter of the Size samples fall.
- Sextiles: Five values that divide a distribution into six parts, such that a sixth of the numbers fall under the first sextile, a third under the second sextile, half under the third sextile, and so on. As with quartiles, the "numbers" are the samples counted in Size. For example, the first sextile is the value below which one sixth of the Size samples fall.
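The measures defined above can be sketched in Python with the standard `statistics` module. This is an illustration under stated assumptions, not a description of how QMF Analytics for TSO calculates its values: the sample is only the twelve RAINFALL values for 2001 from Figure 1, the skewness and kurtosis use population moment formulas (kurtosis as excess kurtosis), and the quantile interpolation is Python's default, so individual results may differ from Figure 3. The names `describe`, `STATS`, and `FIGURE_3_CHECK` are hypothetical.

```python
import math
import statistics

# RAINFALL values for 2001 copied from Figure 1.
RAINFALL_2001 = [3, 7, 12, 3, 4, 3, 1, 3, 6, 6, 12, 9]


def describe(values):
    """Compute the additional statistics for one numeric column."""
    n = len(values)
    mean = statistics.mean(values)
    stdev = statistics.stdev(values)  # sample standard deviation (divides by n - 1)
    # Central moments; the population form is an assumption, QMF may differ.
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return {
        "size": n,
        "range": max(values) - min(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "mean": mean,
        "standard_deviation": stdev,
        "standard_error": stdev / math.sqrt(n),
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2 - 3,  # excess kurtosis (assumption)
        "quartiles": statistics.quantiles(values, n=4),  # three cut points
        "sextiles": statistics.quantiles(values, n=6),   # five cut points
    }


STATS = describe(RAINFALL_2001)
for name, value in STATS.items():
    print(f"{name}: {value}")

# Consistency check against Figure 3: Standard Deviation / sqrt(Size).
FIGURE_3_CHECK = round(6.691 / math.sqrt(120), 3)
print("standard error from Figure 3 values:", FIGURE_3_CHECK)
```

The final check divides the Standard Deviation in Figure 3 (6.691) by the square root of Size (120) and rounds to 0.611, which agrees with the Standard Error that the panel reports.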
- Press the Distribution function key (F11) to display distribution statistics for the selected column.
Figure 4. Distribution statistics - Rainfall
This graph shows the frequency of rainfall values from the query result during the 120-month period. It indicates that larger rainfall values occur less frequently, and that the average rainfall is 6.92 millimeters.
The X-axis shows the rainfall sample value, and the Y-axis shows the frequency.
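The idea behind the distribution graph, counting how often each sample value occurs, can be sketched in Python. This is only an illustration using the twelve RAINFALL values for 2001 from Figure 1; how QMF Analytics for TSO groups values for its graph is not described here, and `FREQUENCY` is a hypothetical name.

```python
from collections import Counter

# RAINFALL values for 2001 copied from Figure 1.
RAINFALL_2001 = [3, 7, 12, 3, 4, 3, 1, 3, 6, 6, 12, 9]

# Map each distinct sample value (X-axis) to its frequency (Y-axis).
FREQUENCY = Counter(RAINFALL_2001)

# Print a simple text histogram, one line per distinct value.
for value in sorted(FREQUENCY):
    count = FREQUENCY[value]
    print(f"{value:>3} | {'#' * count} ({count})")
```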
- Press the Chi-Square function key (F12) to display Chi-Square statistics for the selected column.
Figure 5. Chi-Square Statistics - Rainfall
Figure 5 indicates that the suggested hypothesis, that the rainfall values follow a normal distribution, is rejected.
The decision at both the 95% and the 99% confidence levels is Different. That is, if we accept a 5% chance of making a wrong decision about the rainfall distribution, the Chi-Square criterion gives the answer NO, so the distribution is different from normal. The same holds true if we accept a 1% error level.
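The reasoning behind a chi-square normality check can be sketched in Python. This is a generic goodness-of-fit sketch under stated assumptions, not the algorithm that QMF Analytics for TSO uses: it takes only the 33 RAINFALL values visible in Figure 1, splits the fitted normal distribution into six equal-probability bins (an arbitrary choice that gives an expected count of 5.5 per bin), and compares the statistic with the standard chi-square critical values for 3 degrees of freedom. The names `chi_square_normality`, `CRITICAL`, and the label "Not different" are hypothetical.

```python
import bisect
import statistics

# RAINFALL values for the 33 rows visible in Figure 1 (2001-01 to 2003-09).
RAINFALL = [
    3, 7, 12, 3, 4, 3, 1, 3, 6, 6, 12, 9,
    11, 45, 17, 3, 2, 4, 4, 2, 9, 3, 5, 8,
    12, 8, 10, 1, 7, 3, 1, 1, 2,
]

BINS = 6
# Degrees of freedom: 6 bins - 1 - 2 estimated parameters (mean, sd) = 3.
# Chi-square critical values for 3 degrees of freedom.
CRITICAL = {"95%": 7.815, "99%": 11.345}


def chi_square_normality(values, bins=BINS):
    """Return (chi-square statistic, observed counts per bin)."""
    n = len(values)
    fitted = statistics.NormalDist(
        statistics.mean(values), statistics.stdev(values)
    )
    # Cut points that give every bin the same probability under the fit.
    cuts = [fitted.inv_cdf(i / bins) for i in range(1, bins)]
    observed = [0] * bins
    for value in values:
        observed[bisect.bisect_right(cuts, value)] += 1
    expected = n / bins  # same expected count in every bin
    statistic = sum((count - expected) ** 2 / expected for count in observed)
    return statistic, observed


STATISTIC, OBSERVED = chi_square_normality(RAINFALL)
DECISIONS = {
    level: "Different" if STATISTIC > limit else "Not different"
    for level, limit in CRITICAL.items()
}
print("observed counts:", OBSERVED)
print(f"chi-square statistic: {STATISTIC:.3f}")
print("decisions:", DECISIONS)
```

For this subset the statistic exceeds both critical values, so the sketch reaches the same decision as Figure 5 at both confidence levels, although the figure is based on all 120 rows.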
- Optional: If you have configured print capabilities in QMF, you can print the chart.
- Press the End function key (F3) to return to the Parameter Selection panel.
- Press the End function key (F3) to return to the QMF Analytics for TSO Home panel.