Topic
  • 9 replies
  • Latest Post - ‏2012-11-29T00:41:09Z by SystemAdmin
SystemAdmin
SystemAdmin
456 Posts

Pinned topic Help - Fitting a distribution to non-normal data in SPSS

‏2012-11-20T21:33:36Z |
Hi,

I am working with non-normal data and I need to find the right distributions for my data so I can run GLMMs. Does SPSS have a function to analyze the distribution of the data (e.g., Poisson, log, ...)? I've found countless tutorials for other programs but none for SPSS.

Thanks!
  • SystemAdmin
    SystemAdmin
    456 Posts

    Re: Help - Fitting a distribution to non-normal data in SPSS

    ‏2012-11-20T21:41:17Z  
    I've now gotten this question from you three times. Once is enough.

    Take a look at the Q-Q plots via Analyze > Descriptive Statistics. You can fit many distributions that way as well as getting a good diagnostic graph, but remember that it's the error term that matters. There is also an extension command, STATS DISTFIT, that can fit distributions in bulk.
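    For readers who want to see what a Q-Q plot compares, here is a minimal sketch in Python. It is not how SPSS builds its plots: the function name `normal_qq_points`, the plotting positions (i - 0.5) / n, and the sample values are all assumptions made for illustration. It pairs each sorted observation with the quantile a fitted normal distribution would predict at the same rank.

```python
from statistics import NormalDist, mean, stdev


def normal_qq_points(values):
    """Return (theoretical, observed) quantile pairs for a normal Q-Q plot."""
    data = sorted(values)
    n = len(data)
    # Fit a normal distribution by matching the sample mean and SD.
    fitted = NormalDist(mean(data), stdev(data))
    # Plotting positions (i - 0.5) / n stay strictly inside (0, 1).
    return [(fitted.inv_cdf((i - 0.5) / n), x)
            for i, x in enumerate(data, start=1)]


# Made-up values, used only to show the output shape.
sample = [135.0, 138.0, 139.0, 140.0, 141.0, 142.0, 145.0]
for theoretical, observed in normal_qq_points(sample):
    print(f"{theoretical:8.2f} {observed:8.2f}")
```

    If the pairs fall close to the line y = x, the candidate distribution is a plausible description of the data; systematic curvature suggests a different family.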

    HTH,
    Jon Peck
  • SystemAdmin
    SystemAdmin
    456 Posts

    Re: Help - Fitting a distribution to non-normal data in SPSS

    ‏2012-11-21T17:34:21Z  
    Thank you!
    I didn't realize that the separate groups were all answered by you. Apologies.
    Are you aware of any decent tutorials that cover the STATS DISTFIT extension command? I have limited experience working with R and even less with SPSS syntax, but I am trying to feel my way through it ... with limited success.
  • SystemAdmin
    SystemAdmin
    456 Posts

    Re: Help - Fitting a distribution to non-normal data in SPSS

    ‏2012-11-21T17:46:31Z  
    I get notified of these postings but don't necessarily answer all of them.

    As for tutorials, you don't need one. Once you install the R Essentials and this extension command, it works just like native Statistics commands, and it has a dialog box interface as well. So you can use it with pointing and clicking and never go near Statistics or R syntax.

    You will have to install R and the R Essentials. Regarding the latter, be sure to download and read the installation instructions. The link to the R Essentials can be found from the SPSS Community site by following Downloads for SPSS Statistics.

    HTH,
    Jon Peck
  • SystemAdmin
    SystemAdmin
    456 Posts

    Re: Help - Fitting a distribution to non-normal data in SPSS

    ‏2012-11-22T21:20:34Z  
    Ok. I have everything installed, but unfortunately when I try to run a distribution fit I hit one of two issues.

    1. I receive the following message and must restart SPSS: "An unknown error has terminated communication with the processor. The SPSS Statistics Processor is unavailable."

    2. I receive the message (for several different distributions that I have tried): "Error: An unsupported distribution was specified".

    Have you any ideas as to what is going wrong?
    Thank you....
  • SystemAdmin
    SystemAdmin
    456 Posts

    Re: Help - Fitting a distribution to non-normal data in SPSS

    ‏2012-11-23T01:51:58Z  
    You need to specify what SPSS version and platform you are using.
    Post the syntax where you are getting the unknown error message and where you are getting the unsupported distribution message.

    And did you try the Q-Q plots?
  • SystemAdmin
    SystemAdmin
    456 Posts

    Re: Help - Fitting a distribution to non-normal data in SPSS

    ‏2012-11-23T02:26:06Z  
    Hi Jon. I'm working with Windows 7 and running SPSS version 20. I did try the Q-Q plots (thanks for that) and they worked perfectly. I was just hoping to run the distributions and get some test of Goodness of Fit so I am not relying solely on my qualitative assessment of the plots.

    This first syntax simply resulted in an error message while the second crashed the program.

    STATS DISTFIT VARIABLE=Sodium
    DISTRIBUTION=logistic
    /OPTIONS QQPLOT=YES.
    Error: An unsupported distribution was specified: logistic

    STATS DISTFIT VARIABLE=Sodium
    DISTRIBUTION=lognormal
    /OPTIONS QQPLOT=YES.
    Error: An unsupported distribution was specified: lognormal
  • SystemAdmin
    SystemAdmin
    456 Posts

    Re: Help - Fitting a distribution to non-normal data in SPSS

    ‏2012-11-23T14:24:31Z  
    Logistic was inadvertently omitted from some of the code. We'll post the fixed version of the command once it has been approved by Legal. You can subscribe to a notification on the file in the Extension Commands Collection to be notified when it has been posted.

    As for the problem with lognormal, I can't reproduce it. If the variable has any nonpositive values, that would violate the range requirement for this distribution, and the fit would fail. When I try that, however, I get a message that the distribution could not be fit, and there was no crash. I expect that you are seeing that the startx process terminated, rather than Statistics. Startx (or startx32) is the connection to the R code. Please confirm.
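    The range requirement can be checked before attempting a fit. The sketch below is in Python rather than SPSS or R, and it is not the code STATS DISTFIT runs; the function name `fit_lognormal` and the sample values are hypothetical. It refuses to fit when any value is zero or negative, and otherwise returns the maximum-likelihood estimates for a lognormal distribution, which are the mean and population standard deviation of the logged data.

```python
import math
from statistics import fmean, pstdev


def fit_lognormal(values):
    """Return (mu, sigma) of log(values); raise if any value is <= 0."""
    bad = [v for v in values if v <= 0]
    if bad:
        # The lognormal distribution is defined only for x > 0.
        raise ValueError(
            f"{len(bad)} nonpositive value(s); lognormal needs x > 0")
    logs = [math.log(v) for v in values]
    # Maximum-likelihood estimates: mean and population SD of the logs.
    return fmean(logs), pstdev(logs)


# Made-up values, used only to show the call.
mu, sigma = fit_lognormal([135.0, 138.0, 140.0, 142.0, 145.0])
print(f"mu={mu:.4f} sigma={sigma:.4f}")
```

    Running a check like this on the variable first separates a data problem (nonpositive values) from a software problem (a crash in the R connection).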

    If you can post the dataset you are fitting or send it to me directly (peck AT us.ibm.com), I can see if that shows the problem.

    HTH,
    Jon Peck
  • SystemAdmin
    SystemAdmin
    456 Posts

    Re: Help - Fitting a distribution to non-normal data in SPSS

    ‏2012-11-28T18:36:39Z  
    Hi Jon,
    With the new code I seem to be able to run models one at a time (more than that results in a crash). Unfortunately I am now getting 2 error messages (below). What am I doing wrong?

    1: In dlogis(x, location, scale, log) : NaNs produced
    2: In dlogis(x, location, scale, log) : NaNs produced
    3: In function (x, y, ..., alternative = c("two.sided", "less", "greater"), :
    cannot compute correct p-values with ties
    STATS DISTFIT VARIABLE=Sodium
    DISTRIBUTION=lognormal
    /OPTIONS QQPLOT=NO.

    Thank you!
  • SystemAdmin
    SystemAdmin
    456 Posts

    Re: Help - Fitting a distribution to non-normal data in SPSS

    ‏2012-11-29T00:41:09Z  
    These messages are coming from the R package. The NaN messages can probably be ignored. They are likely due to the iterative numerical algorithm fitting the logistic distribution, and if the parameter estimates look reasonable, the algorithm probably converged.

    For the p values message, the algorithm expects continuous data, so ties would be a probability 0 event that it does not handle. If the number of ties is small, I wouldn't worry about the message. If there are a lot of them, then the choice of a logistic distribution is probably wrong anyway.
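    To judge whether the number of ties is small, they can simply be counted. This is a minimal Python sketch, not part of the extension command; the function name `tie_summary` and the sample values are hypothetical. It reports how many observations share their value with at least one other observation, and what fraction of the data that is.

```python
from collections import Counter


def tie_summary(values):
    """Return (tied_count, tied_fraction) for a list of observations."""
    if not values:
        return 0, 0.0
    counts = Counter(values)
    # An observation is tied if its value occurs more than once.
    tied = sum(c for c in counts.values() if c > 1)
    return tied, tied / len(values)


# Made-up values, used only to show the call.
tied, fraction = tie_summary([138, 140, 140, 141, 142, 142, 142])
print(f"{tied} tied observations ({fraction:.0%} of the data)")
```

    A small fraction suggests the warning is harmless; a large one usually means the variable is recorded too coarsely for a test that assumes continuous data.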

    HTH,
    Jon