Topic
  • 2 replies
  • Latest Post - ‏2011-01-18T03:56:23Z by SystemAdmin
SystemAdmin

Pinned topic Thread safety of spss module

2011-01-17T20:31:31Z
If n concurrent Python processes use the spss module to create a dataset of the same name, will they clash with each other? On my own machine (win32), a statement like:

spss.Submit("DATASET DECLARE AGG.") yields a message:

_________________________________
Dataset name AGG already defined.
_________________________________


This happens when I call my script more than once, even though the previous Python process has shut down. It seems that some state lingers in the underlying SPSS server instance. I plan to spawn potentially hundreds of parallel SPSS jobs (on a computing cluster) via the Python interface, and I'm a bit concerned about whether this use case was considered in the design.
Updated on 2011-01-18T03:56:23Z by SystemAdmin
  • SystemAdmin

    Thread safety of SPSS module

    ‏2011-01-17T21:04:25Z  
    Each Python process has a separate address space with a copy of SPSS in it, so they will not conflict.

    If you open the same data file in two independent sessions, you are likely to get an in-use warning, but that would not apply to the DATASET DECLARE statements.

    I tried this in two different Python sessions and did not see this warning. Are you sure that you had actually ended the Python process?

    I also tried calling spss.StopSPSS after defining a handle and then spss.StartSPSS and declaring the handle again without getting a message.

    Note that all this applies to processes, not threads. The SPSS backend is NOT thread-safe (and in fact Python threads can't run concurrently anyway without special code to release the GIL). The Python multiprocessing module allows one Python session to manage a set of processes with an API that is similar to that of the threading module.

    Note that in internal mode, the Python state is preserved between programs run in the same session.

    HTH,
    Jon Peck
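
For readers who want to try the multiprocessing approach mentioned in the reply above, here is a minimal sketch, not a tested recipe. It assumes Python 3 and that the spss module is importable in each worker; `run_job`, `main`, the job count, and the pool size are hypothetical names and values chosen for illustration. Because a pool normally reuses its worker processes, a second `DATASET DECLARE AGG.` in the same worker would presumably find the name already defined, so the sketch sets `maxtasksperchild=1` to give every job a fresh process.

```python
import multiprocessing
import os

# The same dataset name is used by every job on purpose: each job runs
# in its own process, which (per the reply above) has its own copy of SPSS.
SYNTAX = "DATASET DECLARE AGG."


def run_job(job_id):
    """Submit the syntax from the current process and report what happened."""
    try:
        import spss  # assumption: the SPSS Python plug-in is installed
    except ImportError:
        return (job_id, os.getpid(), "skipped: spss module not installed")
    spss.Submit(SYNTAX)
    return (job_id, os.getpid(), "submitted")


def main():
    # maxtasksperchild=1 makes each worker exit after a single task, and
    # chunksize=1 ensures one job is one task, so no process runs two jobs.
    with multiprocessing.Pool(processes=4, maxtasksperchild=1) as pool:
        for result in pool.map(run_job, range(8), chunksize=1):
            print(result)


if __name__ == "__main__":
    main()
```

Each printed tuple carries the process ID that handled the job, which makes it easy to confirm that no two jobs shared a process. On a cluster the jobs would more likely be launched by the scheduler as separate `python` invocations, which gives the same one-process-per-job isolation.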
  • SystemAdmin

    Re: Thread safety of SPSS module

    ‏2011-01-18T03:56:23Z  
    Thanks, Jon. I realized what I did: I was running this from IDLE, which (when I press F5) apparently invokes the script as a module from an existing, long-lived Python session. When I run it outside IDLE with just "python foo.py", I don't see the warning either.
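
The fix described above (running the script with a new interpreter each time instead of from a long-lived session) can also be scripted. The following is a rough sketch assuming Python 3.7 or later; `run_in_fresh_process` is a hypothetical helper name, and `foo.py` stands for whatever script issues the SPSS commands.

```python
import subprocess
import sys


def run_in_fresh_process(script_path):
    """Run a script in a brand-new Python interpreter.

    Nothing from the calling session (imported modules, backend state)
    is shared with the child, which is the point of the exercise.
    Returns the child's exit code and its captured standard output.
    """
    completed = subprocess.run(
        [sys.executable, script_path],  # same interpreter, new process
        capture_output=True,
        text=True,
    )
    return completed.returncode, completed.stdout
```

Calling `run_in_fresh_process("foo.py")` twice in a row starts two unrelated interpreters, so the second run should behave like the first, just as with two separate `python foo.py` invocations from a shell.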