11 replies Latest Post - ‏2014-08-07T12:36:14Z by JonPeck
Kees_RT
33 Posts

Pinned topic Python /Statistics runs out of memory

‏2014-07-16T08:54:05Z |

While executing a loop, my program halts, sometimes during the third iteration and sometimes during the fourth. Execution halts at approximately the same line (give or take 2 lines).

The Task Manager indicates that the memory requirements of Statistics grow from 238 to 250, but those of startx increase from 200 to 300. Does the latter indicate problems with garbage collection? If so, how come, as during each loop all variables are declared afresh? Or could it be something else? Or is this too specific to be dealt with?

My versions: Statistics V22, OS: Windows Vista 32 bit

I hope you can shed some light!

Regards

Kees

  • JonPeck
    325 Posts

    Re: Python /Statistics runs out of memory

    ‏2014-07-16T20:57:20Z  in response to Kees_RT

    Are you really running out of memory or just seeing memory usage grow?

    Since Python has its own garbage collection system, the Task Manager report does not always show how much memory is really being used.  The memory is not necessarily recycled immediately when it is no longer in use.  The gc module lets you force garbage collection and perform other memory-related tasks, but this is very rarely necessary.

    It is also possible that your code is holding on to objects that could be deleted.  But the TM report isn't very much memory, so it seems unlikely that things are halting because of memory problems.  If you really ran out, there would be a loud noise.
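A minimal sketch of the point above, not taken from the thread: instead of reading Task Manager, count the objects that Python's collector is still tracking after a forced collection. The helper names `tracked_objects`, `watch_growth` and `leaky_step` are made up for illustration; only `gc.collect()` and `gc.get_objects()` are standard library calls. A count that keeps climbing from one loop pass to the next suggests that the code is still holding references.

```python
import gc


def tracked_objects():
    """Force a full collection, then count the container objects gc still tracks."""
    gc.collect()
    return len(gc.get_objects())


def watch_growth(step, passes):
    """Call step() repeatedly and record the tracked-object count after each pass."""
    counts = []
    for _ in range(passes):
        step()
        counts.append(tracked_objects())
    return counts


# Hypothetical example: a step that keeps references alive in a module-level list.
_kept = []


def leaky_step():
    _kept.extend([] for _ in range(100))
```

Calling `watch_growth(leaky_step, 3)` returns three counts that rise by roughly 100 per pass; a step that releases everything it creates should give roughly flat counts.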

    Updated on 2014-07-16T21:07:53Z by JonPeck
    • Kees_RT
      33 Posts

      Re: Python /Statistics runs out of memory

      ‏2014-07-17T10:43:04Z  in response to JonPeck

      Thank you, Jon

      Didn't hear noise or see smoke, but that's about as far as my diagnostic capabilities go, I'm afraid. I do have to restart SPSS, though, because it stops working.

      Using gc I'm able to reduce the growth of startx a little, but within a session it grows nevertheless. I use the following to collect the garbage: 

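          import gc

          dl = dir()
          for i in dl:
              print i
              del i  # unbinds only the loop variable i, not the object it names
          gc.collect(2)  # full collection of all generations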

      But across sessions startx retains its size. I use syntax to run a Python script (or program, I always forget which is which). The first time I run the syntax, startx is 45K. After running the syntax, startx is 82K (my garbage collection notwithstanding). The third time startx starts out with 84K, then 92, etc. (Those figures are when I use small cTables to process. In real life the cTables and amounts of memory are much larger.)

      So, although my Python program does add to the size of startx, the results would be less detrimental if I could force startx to start with its initial size. Is this reasoning correct? If so, is there an API call to produce that effect?

      Do you have any suggestions about how to get rid of objects that aren't needed anymore, apart from using del or setting them to None?
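A small sketch that bears on the question above, with all names (`Table` and the three functions) invented for illustration. It suggests that looping over the strings from `dir()` and deleting the loop variable leaves the named objects alive, whereas deleting the real name, or doing each pass inside a function so its locals go out of scope, releases them. The weak references are only there to observe whether the object is still alive; the immediate release shown here relies on CPython's reference counting.

```python
import gc
import weakref


class Table(object):
    """Hypothetical stand-in for a large object built during one loop pass."""


def survives_del_of_loop_variable():
    table = Table()
    probe = weakref.ref(table)      # lets us check whether the object is still alive
    for name in ["table"]:          # like iterating over dir(): these are just strings
        del name                    # unbinds the loop variable only
    return probe() is not None      # the Table object is still referenced by `table`


def freed_by_del_of_real_name():
    table = Table()
    probe = weakref.ref(table)
    del table                       # drops the only reference
    gc.collect()                    # not needed in CPython here, but harmless
    return probe() is None


def freed_when_function_returns():
    def one_pass():
        table = Table()             # local to this call
        return weakref.ref(table)
    probe = one_pass()              # `table` went out of scope when one_pass returned
    gc.collect()
    return probe() is None
```

All three functions return True under CPython, which is why wrapping the body of each loop pass in its own function is often simpler than manual `del` calls.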

       

      • JonPeck
        325 Posts

        Re: Python /Statistics runs out of memory

        ‏2014-07-17T19:25:12Z  in response to Kees_RT

        Usually the garbage collector does pretty well without manual control.  Since the Task Manager doesn't know how much memory in the Python process is actually free, its statistics are not really reliable here.

        If you want to send me a small job and dataset that shows this misbehavior, I might be able to diagnose the problem.  But it sounds like the only real problem is the freeze, and that might not be related to memory.  What version and bit size are you on?

        • Kees_RT
          33 Posts

          Re: Python /Statistics runs out of memory

          ‏2014-07-18T07:01:56Z  in response to JonPeck

          I didn't realize that TM doesn't know about Python's free memory, but of course.

          What does 'the freeze' mean? Is that something I can look up somewhere?

          I'm on Statistics V22, Python 2.7, Windows Vista Service Pack 2. All 32 bits.

          I'm afraid I can't break the program down to a small job, but thanks for the offer.

          • JonPeck
            325 Posts

            Re: Python /Statistics runs out of memory

            ‏2014-07-18T12:26:22Z  in response to Kees_RT

            The freeze is a puzzle.  If the TM shows CPU time for the startx or spssengine process continuing to increase, then there is probably an infinite loop somewhere in the code.  If the time just stops, then the code is waiting forever for something.

            Can you run this code in external mode where you start Python and run the code directly from there?  Most Python programs can be run that way as long as they do not require the SpssClient (scripting) module.  That has less overhead, and some debuggers can interrupt the code and find out what it is executing.
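When no debugger is at hand, a rough way to find out what frozen code is executing is a watchdog timer that dumps every thread's call stack. The sketch below uses only the standard library (`sys._current_frames`, `traceback.format_stack`, `threading.Timer`); the function names are invented here, and it is an assumption, untested in this thread, that it behaves the same inside the Statistics Python process. It can only fire while the interpreter is still able to switch threads, so it will not help if the main thread is stuck inside a call that never releases the interpreter.

```python
import sys
import threading
import traceback


def format_all_stacks():
    """Return the current call stack of every Python thread as text."""
    parts = []
    for thread_id, frame in sys._current_frames().items():
        parts.append("Thread %s:\n" % thread_id)
        parts.extend(traceback.format_stack(frame))
    return "".join(parts)


def start_watchdog(seconds, stream=sys.stderr):
    """After `seconds`, write all thread stacks to `stream` unless cancelled first."""
    timer = threading.Timer(seconds, lambda: stream.write(format_all_stacks()))
    timer.daemon = True   # do not keep the process alive just for the watchdog
    timer.start()
    return timer
```

A typical use would be `watchdog = start_watchdog(600)` before the long loop and `watchdog.cancel()` after it finishes normally, so a dump appears only when the job overruns.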

            • Kees_RT
              33 Posts

              Re: Python /Statistics runs out of memory

              ‏2014-07-21T12:21:44Z  in response to JonPeck

              I can understand that you expect an infinite loop somewhere. Because the program runs fine when it has to process 14 small cTables but quits when processing 4 large ones, it looks like there's some interaction there: the infinite loop doesn't bother SPSS until after a certain load.

              Running in external mode is not an option as the program uses SpssClient quite a lot.

              I don't think we'll be able to sort this out on the forum. It's probably too specifically bound to my program which I can't send you because of its size and complexity and because variable names and comments are in Dutch.

              So, let's leave it at this, and thanks again for taking the trouble, Jon.

              • JonPeck
                325 Posts

                Re: Python /Statistics runs out of memory

                ‏2014-07-21T12:53:43Z  in response to Kees_RT

                Sorry to see this unresolved.  There is one other thing I could do that might help.  I could run the program as is under my debugger using your data and see, perhaps, where it stops or at least where it is spending its time.  Or, if you have the Wing IDE (my favorite Python IDE), you could do this yourself.  Wing is able to debug Python code running within Statistics in most cases.

                Jon

                • Kees_RT
                  33 Posts

                  Re: Python /Statistics runs out of memory

                  ‏2014-07-21T13:01:36Z  in response to JonPeck

                  On your advice I bought Wing IDE a couple of years ago. I was a bit overwhelmed by it, but hey, I'll put it to the test!

                  Regards

                  kees

                  • JonPeck
                    325 Posts

                    Re: Python /Statistics runs out of memory

                    ‏2014-07-21T13:23:41Z  in response to Kees_RT

                    Good luck.  Read the Wing help on how to set up external debugging.  There are a few settings that need to be modified, and you need to copy the wingdbstub.py module to the directory where your code is.

                    Also, when you insert debugging code, since you are using the SpssClient module, you need to disable the heartbeat between the Statistics processes and the Python process.  The heartbeat is a clock-driven message that allows the processes to be sure that the partner process is still alive.  If no response is given within a certain time limit, the SpssClient process is shut down.

                    To do this, insert the following code in your module where it is sure to be executed before your main job gets run.

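                    # debugging
                    # makes debug apply only to the current thread
                    import thread  # Python 2; this module is named _thread in Python 3
                    import time
                    try:
                        import wingdbstub
                        if wingdbstub.debugger is not None:
                            # stop and restart the debugger
                            wingdbstub.debugger.StopDebug()
                            time.sleep(1)
                            wingdbstub.debugger.StartDebug()
                        # debug only the current thread
                        wingdbstub.debugger.SetDebugThreads({thread.get_ident(): 1}, default_policy=0)
                        # turn off the SpssClient heartbeat (SpssClient must already be imported)
                        SpssClient._heartBeat(False)
                    except Exception:
                        print "debugging failed"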
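To illustrate why a paused debugger and a heartbeat do not mix, here is a generic sketch of a heartbeat timeout check. It is not the SpssClient implementation; the class `HeartbeatMonitor` and its methods are hypothetical. The idea is only that a partner that has not replied within the time limit is treated as dead, which is exactly what a process stopped at a breakpoint looks like.

```python
import time


class HeartbeatMonitor(object):
    """Tracks when the partner process last answered (hypothetical, for illustration)."""

    def __init__(self, timeout_seconds, clock=time.time):
        self.timeout_seconds = timeout_seconds
        self.clock = clock              # injectable so the logic can be tested
        self.last_reply = clock()

    def reply_received(self):
        """Record that the partner answered a heartbeat message just now."""
        self.last_reply = self.clock()

    def partner_alive(self):
        """True while the last reply is no older than the time limit."""
        return (self.clock() - self.last_reply) <= self.timeout_seconds
```

With a 5-second limit, a partner that last replied 6 seconds ago is reported as not alive, so any breakpoint held longer than the limit would trigger a shutdown unless the heartbeat is disabled first.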

                     

                    • Kees_RT
                      33 Posts

                      Re: Python /Statistics runs out of memory

                      ‏2014-08-07T12:14:25Z  in response to JonPeck

                      Hi Jon

                      I've found and remedied the problem. It has nothing to do with memory management or SPSS.

                      My program accesses Excel very, very frequently through win32com, and there's the problem. At (seemingly?) random times (see http://stackoverflow.com/questions/3718037/error-while-working-with-excel-using-python) this generates an error. And I think that the longer my program runs, the higher the risk of encountering this error.

                      So I set the Excel object properties ScreenUpdating and Application.Interactive to False and off I went ...
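The fix described above could be wrapped roughly as follows. This is a sketch, not the poster's actual code: the helper name `quiet_excel` is invented, and it assumes `app` is an Excel Application object such as the one returned by `win32com.client.Dispatch("Excel.Application")` from pywin32. `ScreenUpdating` and `Interactive` are the Excel properties named in the post; the context manager restores their previous values even if the automated work raises an error.

```python
from contextlib import contextmanager


@contextmanager
def quiet_excel(app):
    """Turn off screen updating and user input on `app` for the duration of a block."""
    saved = (app.ScreenUpdating, app.Interactive)
    app.ScreenUpdating = False
    app.Interactive = False
    try:
        yield app
    finally:
        # Restore whatever the settings were before, even after an exception.
        app.ScreenUpdating, app.Interactive = saved


# Usage on Windows with pywin32 installed (left as comments, not run here):
#   import win32com.client
#   excel = win32com.client.Dispatch("Excel.Application")
#   with quiet_excel(excel):
#       ...  # many COM calls against Excel
```

Restoring `Interactive` matters because Excel would otherwise keep ignoring keyboard and mouse input after the program ends.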

                      Regards

                      Kees

                      • JonPeck
                        325 Posts

                        Re: Python /Statistics runs out of memory

                        ‏2014-08-07T12:36:14Z  in response to Kees_RT

                        Interesting.  I would have expected the COM requests to be queued, but it appears that Application.Interactive actually does something like this, although the doc isn't really clear on this point.

                        This property is usually True. If you set this property to False, Microsoft Excel will block all input from the keyboard and mouse (except input to dialog boxes that are displayed by your code).

                        This property is useful if you're using DDE or OLE Automation to communicate with Microsoft Excel from another application.