Topic
  • 2 replies
  • Latest Post - ‏2011-12-01T14:34:06Z by SystemAdmin
SystemAdmin
396 Posts

Pinned topic Multithreading R Essentials for SPSS V20???

‏2011-11-30T18:29:45Z |
I installed R Essentials for SPSS V20 and it works great. However, I have R code that takes 6-8 hours to run on each file, and I have 16 files to process. Currently, I run each file/code combination in a separate instance of R so that they run in parallel on different processors. Does anyone know of a way to start multiple instances of R from within SPSS, instead of running a single R process as in the examples in the docs?

Thanks,
Jim
  • SystemAdmin
    396 Posts

    Re: Multithreading R Essentials for SPSS V20???

    ‏2011-11-30T20:09:08Z  
    One possibility is to drive the R Plug-in from Python, using the Python Plug-in in external mode. The Python Plug-in starts an instance of the SPSS Statistics backend and lets you submit syntax to it, so in particular you can submit an R program block (BEGIN PROGRAM R - END PROGRAM) from Python. You could run separate instances of Python, each of which would start its own instance of the SPSS Statistics backend. One limitation of this approach is that there is no SPSS Statistics front-end, and thus no Output Viewer.

    For reference, the Python Plug-in is installed with Essentials for Python.
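
As a rough illustration of the external-mode idea above, here is a minimal sketch that builds an R program block as a syntax string and hands it to the backend with spss.Submit. The helper names (build_r_job, run_r_job) and the file paths are hypothetical, and the sketch assumes the R code lives in a script that can be pulled in with source(); it has not been tested against a V20 installation.

```python
def build_r_job(data_file, r_script):
    """Return SPSS syntax that opens data_file and runs r_script in an R block.

    Both arguments are hypothetical paths; forward slashes are used so the
    same string is valid inside the R source() call.
    """
    return "\n".join([
        "GET FILE='%s'." % data_file,
        "BEGIN PROGRAM R.",
        'source("%s")' % r_script,
        "END PROGRAM.",
    ])


def run_r_job(data_file, r_script):
    """Submit the job to a Statistics backend driven from external Python."""
    # Imported here so build_r_job can be used without the plug-in installed.
    # Assumption: the spss module comes from Essentials for Python.
    import spss
    spss.Submit(build_r_job(data_file, r_script))
```

Running a script like this once per data file, each in its own Python process (for example from separate command prompts), would give each file its own backend and its own R process.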
  • SystemAdmin
    396 Posts

    Re: Multithreading R Essentials for SPSS V20???

    ‏2011-12-01T14:34:06Z  
    It is true that you can only have one R process associated with a Statistics job at a time. You could, however, run multiple external-mode Statistics jobs controlled by .NET or Python code. Each job would have its own pair of processes (one Statistics backend and one R process). If you go the Python route, you could use the Python multiprocessing module to launch and monitor these sets of jobs from one parent job. You could probably do something equivalent with .NET. In either case, there will always have to be a one-to-one correspondence between R processes and Statistics processes.
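
The multiprocessing approach described above might look roughly like the following sketch. It assumes each worker process imports the spss module itself, so that every worker gets its own Statistics backend and therefore its own R process. The function names, paths, and worker count are hypothetical, and this has not been run against a real installation.

```python
import multiprocessing


def make_jobs(data_files, r_script):
    """Pair every data file with the (hypothetical) R script to run on it."""
    return [(data_file, r_script) for data_file in data_files]


def run_one(job):
    """Worker: run one data file through one backend/R pair."""
    data_file, r_script = job
    # Imported inside the worker so each process starts its own backend.
    # Assumption: the spss module comes from Essentials for Python.
    import spss
    spss.Submit("\n".join([
        "GET FILE='%s'." % data_file,
        "BEGIN PROGRAM R.",
        'source("%s")' % r_script,
        "END PROGRAM.",
    ]))
    return data_file


def run_all(data_files, r_script, workers=4):
    """Launch the jobs from one parent process and report each as it ends."""
    # maxtasksperchild=1 gives every job a fresh worker process, which keeps
    # the one-to-one pairing of Statistics and R processes per job.
    pool = multiprocessing.Pool(processes=workers, maxtasksperchild=1)
    try:
        for finished in pool.imap_unordered(run_one,
                                            make_jobs(data_files, r_script)):
            print("finished: %s" % finished)
    finally:
        pool.close()
        pool.join()
```

On Windows, the call to run_all should sit under an `if __name__ == "__main__":` guard in the parent script, since multiprocessing re-imports the script in each worker. The workers value would normally be set no higher than the number of available processors.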

    Regards,
    Jon Peck