Topic
  • 11 replies
  • Latest Post - ‏2014-08-07T12:36:14Z by JonPeck
Kees_RT
35 Posts

Pinned topic Python /Statistics runs out of memory

‏2014-07-16T08:54:05Z |

While executing a loop, my program halts, sometimes during the third and sometimes during the fourth iteration. Execution halts at approximately the same line (give or take 2 lines).

The Task Manager indicates that the memory requirements of Statistics grow from 238 to 250, but those of startx increase from 200 to 300. Does the latter indicate problems with garbage collection? If so, how come, as all variables are declared afresh during each iteration? Or could it be something else? Or is this too specific to be dealt with?

My versions: Statistics V22, OS: Windows Vista 32 bit

I hope you can shed some light!

Regards

Kees

  • JonPeck
    360 Posts
    ACCEPTED ANSWER

    Re: Python /Statistics runs out of memory

    ‏2014-07-21T13:23:41Z  
    • Kees_RT
    • ‏2014-07-21T13:01:36Z

    On your advice I bought Wing IDE a couple of years ago. I was a bit overwhelmed by it, but hey, I'll put it to the test!

    Regards

    Kees

    Good luck. Read the Wing help on how to set up external debugging. There are a few settings that need to be modified, and you need to copy the wingdbstub.py module to the directory where your code is.

    Also, when you insert debugging code, since you are using the SpssClient module, you need to disable the heartbeat between the Statistics processes and the Python process.  The heartbeat is a clock-driven message that allows the processes to be sure that the partner process is still alive.  If no response is given within a certain time limit, the SpssClient process is shut down.

    To do this, insert the following code in your module where it is sure to be executed before your main job gets run.

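        # debugging (Python 2.7, as shipped with Statistics 22)
        # Assumes SpssClient has already been imported in this module.
        try:
            import wingdbstub
            if wingdbstub.debugger is not None:
                import time
                # Restart the debugger connection
                wingdbstub.debugger.StopDebug()
                time.sleep(1)
                wingdbstub.debugger.StartDebug()
            import thread
            # makes debug apply only to the current thread
            wingdbstub.debugger.SetDebugThreads({thread.get_ident(): 1}, default_policy=0)
            # disable the heartbeat so that pausing in the debugger does not shut down SpssClient
            SpssClient._heartBeat(False)
        except Exception:
            print("debugging failed")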

     

  • JonPeck
    360 Posts

    Re: Python /Statistics runs out of memory

    ‏2014-07-16T20:57:20Z  

    Are you really running out of memory or just seeing memory usage grow?

    Since Python has its own garbage collection system, the Task Manager report does not always show how much memory is really being used.  The memory is not necessarily recycled immediately when it is no longer in use.  The gc module lets you force garbage collection and perform other memory-related tasks, but this is very rarely necessary.

    It is also possible that your code is holding on to objects that could be deleted. But the Task Manager figures do not show very much memory in use, so it seems unlikely that things are halting because of memory problems. If you really ran out, there would be a loud noise.
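
    To check from inside Python whether a loop is really accumulating objects, one rough approach is to force a collection and count what the collector still tracks after each pass. The sketch below is only illustrative: process_table is a hypothetical stand-in for one pass of the real loop, and the count covers only container objects known to the gc module, not memory held by Statistics itself.

```python
import gc


def live_object_count():
    """Run a full collection, then count the objects the collector still tracks."""
    gc.collect()
    return len(gc.get_objects())


def process_table(size):
    # Hypothetical stand-in for one pass of the real loop.
    rows = [[i, str(i)] for i in range(size)]
    return len(rows)


baseline = live_object_count()
for iteration in range(3):
    process_table(10000)
    growth = live_object_count() - baseline
    print("iteration %d: %d more tracked objects than at start" % (iteration, growth))
```

    If the growth figure stays near zero, the loop is not holding on to its objects; a figure that climbs with every pass suggests that references are being kept somewhere.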

    Updated on 2014-07-16T21:07:53Z by JonPeck
  • Kees_RT
    35 Posts

    Re: Python /Statistics runs out of memory

    ‏2014-07-17T10:43:04Z  
    • JonPeck
    • ‏2014-07-16T20:57:20Z

    Thank you, Jon

    I didn't hear noise or see smoke, but that's about as far as my diagnostic capabilities go, I'm afraid. I do have to restart SPSS, though, because it stops working.

    Using gc I'm able to reduce the growth of startx a little, but within a session it grows nevertheless. I use the following to collect the garbage:

        dl = dir()
        for i in dl:
            print i
            del i
        gc.collect(2)

    But across sessions startx retains its size. I use syntax to run a Python script (or program, I always forget which is which). The first time I run the syntax, startx is 45K. After running the syntax, startx is 82K (my garbage collection notwithstanding). The third time startx starts out with 84K, then 92, etc. (Those figures are when I use small cTables to process. In real life the cTables and amounts of memory are much larger.)

    So, although my Python program does add to the size of startx, the results would be less detrimental if I could force startx to start with its initial size. Is this reasoning correct? If so, is there an API call to produce that effect?

    Do you have any suggestions about how to get rid of objects that aren't needed anymore, apart from using del or setting them to None?
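
    One thing worth noting about the loop over dir() shown above: each i is only a string holding a name, so del i unbinds the loop variable and leaves the named object alone. The sketch below illustrates this with a weak reference; Table and big are hypothetical names used purely for illustration, and the behaviour described is that of CPython.

```python
import gc
import weakref


class Table(object):
    """Hypothetical stand-in for a large object held by the program."""
    pass


big = Table()
probe = weakref.ref(big)  # lets us observe whether the object is still alive

# Same pattern as the loop over dir(): 'i' is a string, so 'del i'
# removes only the loop variable, never the object called 'big'.
for i in ["big"]:
    del i
gc.collect()
alive_after_loop = probe() is not None

# Dropping the last real reference is what lets the object be freed.
del big
gc.collect()
alive_after_del = probe() is not None

print("alive after loop: %s, alive after del: %s" % (alive_after_loop, alive_after_del))
```

    In other words, an object goes away once nothing refers to it any longer, so the practical remedy is to look for containers, module-level names, or caches that still hold it.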

     

  • JonPeck
    360 Posts

    Re: Python /Statistics runs out of memory

    ‏2014-07-17T19:25:12Z  
    • Kees_RT
    • ‏2014-07-17T10:43:04Z

    Usually the garbage collector does pretty well without manual control. Since the Task Manager doesn't know how much memory in the Python process is actually free, its statistics are not really reliable here.

    If you want to send me a small job and dataset that shows this misbehavior, I might be able to diagnose the problem.  But it sounds like the only real problem is the freeze, and that might not be related to memory.  What version and bit size are you on?

  • Kees_RT
    35 Posts

    Re: Python /Statistics runs out of memory

    ‏2014-07-18T07:01:56Z  
    • JonPeck
    • ‏2014-07-17T19:25:12Z

    I didn't realize that the Task Manager doesn't know about Python's free memory, but of course that makes sense.

    What does 'the freeze' mean? Is that something I can look up somewhere?

    I'm on Statistics V22, Python 2.7, Windows Vista Service Pack 2. All 32-bit.

    I'm afraid I can't break the program down to a small job, but thanks for the offer.

  • JonPeck
    360 Posts

    Re: Python /Statistics runs out of memory

    ‏2014-07-18T12:26:22Z  
    • Kees_RT
    • ‏2014-07-18T07:01:56Z

    The freeze is a puzzle. If the Task Manager shows CPU time for the startx or spssengine process continuing to increase, then there is probably an infinite loop somewhere in the code. If the time just stops, then the code is waiting forever for something.

    Can you run this code in external mode, where you start Python and run the code directly from there? Most Python programs can be run that way as long as they do not require the SpssClient (scripting) module. That has less overhead, and some debuggers can interrupt the code and find out what it is executing.
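
    Short of a debugger, a program can also report for itself where it is stuck. The following is a minimal sketch using only the standard library; format_all_stacks and start_watchdog are hypothetical helper names, and the approach assumes the interpreter still gets a chance to run the timer thread (it will not fire if the hang is inside a blocking call that never releases the interpreter).

```python
import sys
import threading
import traceback


def format_all_stacks():
    """Return the current call stack of every Python thread as one string."""
    names = dict((t.ident, t.name) for t in threading.enumerate())
    parts = []
    for thread_id, frame in sys._current_frames().items():
        parts.append("Thread %s (%s):\n" % (thread_id, names.get(thread_id, "unknown")))
        parts.extend(traceback.format_stack(frame))
    return "".join(parts)


def start_watchdog(seconds):
    """Write all stacks to stderr once, after 'seconds'; returns the timer."""
    def report():
        sys.stderr.write(format_all_stacks())

    timer = threading.Timer(seconds, report)
    timer.daemon = True  # do not keep the process alive just for the watchdog
    timer.start()
    return timer
```

    Starting the watchdog before the main loop with a delay somewhat longer than a normal run, and calling cancel() on the returned timer when the job finishes, would produce a stack listing only when the job overruns.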

  • Kees_RT
    35 Posts

    Re: Python /Statistics runs out of memory

    ‏2014-07-21T12:21:44Z  
    • JonPeck
    • ‏2014-07-18T12:26:22Z

    I can understand that you expect an infinite loop somewhere. Because the program runs fine when it has to process 14 small cTables but quits when processing 4 large ones, it looks like there's some interaction there: the infinite loop doesn't bother SPSS until after a certain load.

    Running in external mode is not an option as the program uses SpssClient quite a lot.

    I don't think we'll be able to sort this out on the forum. It's probably too specifically bound to my program which I can't send you because of its size and complexity and because variable names and comments are in Dutch.

    So, let's leave it at this, and thanks again for taking the trouble, Jon.

  • JonPeck
    360 Posts

    Re: Python /Statistics runs out of memory

    ‏2014-07-21T12:53:43Z  
    • Kees_RT
    • ‏2014-07-21T12:21:44Z

    Sorry to see this unresolved. There is one other thing I could do that might help. I could run the program as is under my debugger using your data and see, perhaps, where it stops or at least where it is spending its time. Or, if you have the Wing IDE (my favorite Python IDE), you could do this yourself. Wing is able to debug Python code running within Statistics in most cases.

    Jon

  • Kees_RT
    35 Posts

    Re: Python /Statistics runs out of memory

    ‏2014-07-21T13:01:36Z  
    • JonPeck
    • ‏2014-07-21T12:53:43Z

    On your advice I bought Wing IDE a couple of years ago. I was a bit overwhelmed by it, but hey, I'll put it to the test!

    Regards

    Kees

  • Kees_RT
    35 Posts

    Re: Python /Statistics runs out of memory

    ‏2014-08-07T12:14:25Z  
    • JonPeck
    • ‏2014-07-21T13:23:41Z

    Hi Jon

    I've found and remedied the problem. It's nothing to do with memory management and SPSS.

    My program accesses Excel very, very frequently through win32com, and there's the problem. At (seemingly?) random times (see http://stackoverflow.com/questions/3718037/error-while-working-with-excel-using-python) this generates an error. And I think that the longer my program runs, the higher the risk of encountering this error.

    So I set the Excel Application properties ScreenUpdating and Interactive to False and off I went ...
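
    For readers who want to apply the same remedy, here is a minimal sketch of one way to do it. The helper name quiet_excel is hypothetical and not part of the original program; it only assumes an object exposing the ScreenUpdating and Interactive properties, and it restores the previous values afterwards so that Excel is not left unresponsive if the job fails.

```python
from contextlib import contextmanager


@contextmanager
def quiet_excel(app):
    """Switch off screen updating and user input on an Excel Application
    object for the duration of the block, then restore the old values."""
    saved = (app.ScreenUpdating, app.Interactive)
    app.ScreenUpdating = False
    app.Interactive = False
    try:
        yield app
    finally:
        app.ScreenUpdating, app.Interactive = saved


# Typical use (requires Windows, Excel and the pywin32 package):
#     import win32com.client
#     excel = win32com.client.Dispatch("Excel.Application")
#     with quiet_excel(excel):
#         ...  # many cell reads and writes
```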

    Regards

    Kees

     

  • JonPeck
    360 Posts

    Re: Python /Statistics runs out of memory

    ‏2014-08-07T12:36:14Z  
    • Kees_RT
    • ‏2014-08-07T12:14:25Z

    Interesting. I would have expected the COM requests to be queued, but it appears that Application.Interactive actually does something like this, although the doc isn't really clear on this point:

    This property is usually True. If you set this property to False, Microsoft Excel will block all input from the keyboard and mouse (except input to dialog boxes that are displayed by your code).

    This property is useful if you're using DDE or OLE Automation to communicate with Microsoft Excel from another application.
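
    Where blocking input is not enough, a complementary approach is to retry a call that fails while Excel is busy. This is a generic sketch, not something used in the thread: call_with_retry is a hypothetical helper, and it assumes the intermittent failure surfaces as a Python exception (with pywin32 that would typically be pywintypes.com_error, which can be passed as retry_on).

```python
import time


def call_with_retry(func, attempts=5, delay=0.5, retry_on=(Exception,)):
    """Call func(); if it raises one of 'retry_on', wait and try again.

    The last failure is re-raised once 'attempts' calls have been made.
    """
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == attempts:
                raise
            time.sleep(delay)


# Hypothetical use with a worksheet object obtained through win32com:
#     value = call_with_retry(lambda: sheet.Cells(1, 1).Value,
#                             retry_on=(pywintypes.com_error,))
```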