
Charming Python: Reloading on the fly

Dynamically reloading modules in long-running processes

David Mertz (mertz@gnosis.cx), Applied Metaphysician, Gnosis Software, Inc.
Since conceptions without intuitions are empty, and intuitions without conceptions, blind, David Mertz wants a cast sculpture of Milton for his office. Start planning for his birthday. David may be reached at mertz@gnosis.cx; his life pored over at http://gnosis.cx/dW/. Suggestions and recommendations on this, past, or future, columns are welcomed.

Summary:  A great advantage of Python over most other programming languages is its extreme runtime dynamism capabilities. Thanks to the handy reload() function, we can write programs that run persistently, but that load components that have been modified during the run of the process (pretty useful for services where continuous uptime is critical). This article illustrates runtime program modification by means of some enhancements to the Txt2Html front-end discussed in David's earlier article. Specifically, our sample program will do a background check for new versions of the Txt2Html conversion library on the Internet, and download and reload any new version as needed without manual user intervention.

Date:  01 Nov 2000
Level:  Introductory

Let's paint a scenario for this article: Suppose you want to run a process on your local machine, but part of your program logic lives somewhere else. Specifically, let us assume that this program logic is updated from time to time, and when you run your process, you would like to use the most current program logic. There are a number of approaches to addressing the requirement just described; this article walks you through several of them.

As this Charming Python column has progressed, I have discussed ongoing enhancements to my public-domain utility Txt2Html. This utility converts "smart ASCII" text files to HTML. Previous articles discussed the Web-proxy version of the utility and a curses interface for the utility. Also I occasionally notice that some ASCII markup could be converted in a more useful way, or I fix a bug in handling a particular markup construct.

In fact, my articles for this column are written in ASCII, and converted during the editorial process to the HTML format you are probably reading. Prior to sending a draft article off for publication, I run something like the following process:

txt2html charming_python_7.txt > charming_python_7.html

If I wanted, I could specify some flags to modify the operation; but either way, I just rely on the fact that the most current version of the converter is on my local drive and path. If I am working on a different machine, or for readers who might want to use the utility, the process is more cumbersome: check my Web site, compare version numbers and file dates (sometimes I make a change too small to number), download the current version, copy the current version to the right directory, and run the command-line converter. (See Resources later in this article.)

There are several manual and moderately time-consuming steps involved in the above. It ought to be easier, and it can be.

Command-line Web access

Most people think of the Web as a way to interactively browse pages in a GUI environment. Doing that is nice, of course, but there is also a lot of power in a command line. Systems with the text-mode Web-browser lynx can largely treat the entire Web as just another set of files for command-line tools to work with. For example, some commands I find useful are:

lynx -dump http://gnosis.cx/dW/.
lynx -dump http://ibm.com/developer/. > ibm_developer.txt
lynx -dump http://gnosis.cx/dW | wc | sed "s/\( *[0-9]* *\)\([0-9]*\)\(.*\)/\2/g"

The first of these says: "Display David Mertz' home page to the console (as ASCII text)." The second says: "Save an ASCII version of IBM's current developerWorks home page to a file." The third example says: "Display the number of words in David's home page." (Don't worry about the specifics; the point is just that command-line tools can be combined with pipes.)

One thing about lynx is that (with the -dump option) it does almost exactly the opposite of Txt2Html: the former converts HTML to text; the latter converts in the other direction. But there is no reason not to be able to use Txt2Html in the same fashion as lynx. Doing so can be accomplished with a short Python script:

import sys
from urllib import urlopen, urlencode
if len(sys.argv) == 2:
    cgi = 'http://gnosis.cx/cgi/txt2html.cgi'
    opts = urlencode({'source':sys.argv[1], 'proxy':'NONE'})
    print urlopen(cgi, opts).read()
else:
    print "Please specify URL for Txt2Html conversion"

To run this script, just do something like:

python fetch_txt2html.py http://gnosis.cx/dW/programming/charming_python_7.txt

This does not provide you with all the switches of a local Txt2Html process, but it would be simple to add those if needed. You can pipe and redirect the output just as you would with any command-line tool. However, in the above version, you can only process data files that can be reached by URL, not local files.
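For readers on Python 3, where urlopen and urlencode have moved to urllib.request and urllib.parse, the request that the script sends might be assembled roughly as follows. This is only a sketch: build_txt2html_request and fetch_txt2html are names made up for this illustration, and the CGI address and form fields are simply those shown in the listing above. Nothing here assumes the endpoint still answers.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

CGI = 'http://gnosis.cx/cgi/txt2html.cgi'   # address taken from the listing above


def build_txt2html_request(source_url, cgi=CGI):
    """Build (but do not send) the POST request for a remote conversion.

    Hypothetical helper for illustration; not part of Txt2Html.
    """
    opts = urlencode({'source': source_url, 'proxy': 'NONE'})
    # Supplying data makes urllib issue a POST; in Python 3 it must be bytes.
    return Request(cgi, data=opts.encode('ascii'))


def fetch_txt2html(source_url):
    """Send the request and return the converted HTML as bytes."""
    with urlopen(build_txt2html_request(source_url)) as response:
        return response.read()
```

Separating the request construction from the network call keeps the first half easy to inspect without going online.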

Actually, fetch_txt2html.py does something lynx does not (and neither does Txt2Html by itself): it not only fetches the data source from a URL, it also gets the program logic remotely. If you use fetch_txt2html.py there is no need to even have Txt2Html on your local machine; the processing is remotely invoked (with the latest version), and the results are sent back to you exactly as if you had run a local process. Neat, huh? The local version of Txt2Html can access remote URLs just as with local files, but it cannot make sure it keeps itself up to date . . . yet!


Dynamic initialization

Using fetch_txt2html.py assures that the latest program logic is always used in conversions. Another thing this approach does, however, is move the processor (and memory) requirements onto the gnosis.cx Web server. The load imposed by this particular process is not particularly high, but it is easy to imagine other types of processes where processing on the client is more efficient and desirable.

The way Txt2Html is organized -- the way most programs are organized -- is with a couple of core flow-control functions assisted by a variety of utility functions. In particular, the utility functions are the ones that I update fairly frequently; the core functions (main() and a few others) will be touched only in the event of a major rewrite. In short, what could helpfully be updated at each program run is the utility functions. Most of the time, in fact, most of the functions will be fine in the main Txt2Html module dmTxt2Html.

't2h_textfuncs.py' dynamic Txt2Html updates

"""Hot-pluggable replacement functions for Txt2Html"""
import re

#-- Functions to massage blocks by type
#def Titleify(block):
#def Authorify(block):
# ... [more block massaging functions] ...

#-- Utility functions for text transformation
#def AdjustCaps(txt):
#def capwords(txt):
#def URLify(txt):
def Typographify(txt):
    # [module] names
    r = re.compile(r"""([\(\s'/">]|^)\[(.*?)\]([<\s\.\),:;'"?!/-])""", re.M | re.S)
    txt = r.sub('\\1<em><code>\\2</code></em>\\3',txt)
    # *strongly emphasize* words
    r = re.compile(r"""([\(\s'/"]|^)\*(.*?)\*([\s\.\),:;'"?!/-])""", re.M | re.S)
    txt = r.sub('\\1<strong>\\2</strong>\\3', txt)
    # ... [more text massaging] ...
    return txt

# ... [more text transformation functions] ...

To utilize the support module in its latest and greatest incarnation, a few steps of preparation are necessary. First, download the main Txt2Html module to your local system (this is a one-time step). Second, create a Python script on your local system that reads:

'dyn_txt2html.py' command-line converter

from dmTxt2Html import *     # Import the body of 'Txt2Html' code
from urllib import urlopen
import sys

# Check for updated functions (fail gracefully if not fetchable)
try:
    updates = urlopen('http://gnosis.cx/download/t2h_textfuncs.py').read()
    fh = open('t2h_textfuncs.py', 'w')
    fh.write(updates)
    fh.close()
except:
    sys.stderr.write('Cannot currently download Txt2Html updates')

# Import the updated functions (if available)
try:
    from t2h_textfuncs import *
except:
    sys.stderr.write('Cannot import the updated Txt2Html functions')

# Set options based on runmode (shell vs. CGI)
if len(sys.argv) >= 2:
    cfg_dict = ParseArgs(sys.argv[1:])
    main(cfg_dict)
else:
    print "Please specify URL (and options) for Txt2Html conversion"

In the dyn_txt2html.py script, notice that when the from t2h_textfuncs import * statement is executed, all the functions (like Typographify()) that were defined previously in dmTxt2Html are replaced by the t2h_textfuncs version of the same-named functions. Of course, where functions in t2h_textfuncs are only commented-out placeholders, no replacement occurs.
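The replacement behavior of two successive star-imports can be seen in isolation with a pair of throwaway modules. The sketch below is written for Python 3 and builds its modules in memory; demo_main and demo_updates are invented stand-ins for the main module and the support module, and make_module is a helper made up for this illustration.

```python
import sys
import types


def make_module(name, source):
    """Create an in-memory module and register it so it can be imported."""
    module = types.ModuleType(name)
    exec(source, module.__dict__)
    sys.modules[name] = module
    return module


# Stand-in for the main module: defines two functions.
make_module('demo_main', '''
def Typographify(txt):
    return 'main:' + txt

def URLify(txt):
    return 'main:' + txt
''')

# Stand-in for the support module: one live replacement, one commented-out stub.
make_module('demo_updates', '''
#def URLify(txt):
def Typographify(txt):
    return 'updated:' + txt
''')

# Run the two star-imports in a fresh namespace, in the same order as the script.
namespace = {}
exec('from demo_main import *\nfrom demo_updates import *', namespace)

print(namespace['Typographify']('x'))   # the later import wins
print(namespace['URLify']('x'))         # untouched: the stub was only a comment
```

The order of the two import statements is what matters: whichever module is star-imported last supplies the binding for any name both modules define.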

One minor matter is that different systems handle writes to STDERR differently. Under UNIX-like systems, you can redirect STDERR when you run the script; however, under my current OS/2 shell, and under Windows/DOS, the STDERR messages will be appended to the console output. You might want to either write the above errors/warnings to a log file, or simply get in the habit of redirecting STDOUT to a file (where it is probably more useful anyway). For example:

G:\txt2html> python dyn_txt2html.py test.txt > test.html
Cannot currently download Txt2Html updates

The error goes to console; the converted output to a file.
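If you would rather send such warnings to a log file, the download step could be wrapped along the following lines. This is a Python 3 sketch under stated assumptions: fetch_update, its log_path default, and the injectable opener parameter are all invented for this illustration and are not part of Txt2Html.

```python
import time
from urllib.request import urlopen


def fetch_update(url, dest, log_path='txt2html_update.log', opener=urlopen):
    """Download url to dest; on failure, append a warning to log_path.

    Hypothetical helper. The opener argument exists only so the function
    can be exercised without network access. Returns True on success.
    """
    try:
        with opener(url) as response:
            data = response.read()
        # The destination is opened only after the download has succeeded,
        # so a failed fetch leaves any existing copy of the file alone.
        with open(dest, 'wb') as fh:
            fh.write(data)
        return True
    except OSError as err:          # urllib's URLError is a subclass of OSError
        stamp = time.strftime('%Y-%m-%d %H:%M:%S')
        with open(log_path, 'a') as log:
            log.write('%s Cannot download %s: %s\n' % (stamp, url, err))
        return False
```

A call such as fetch_update('http://gnosis.cx/download/t2h_textfuncs.py', 't2h_textfuncs.py') would then keep the console clean whether or not the download works.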

A more interesting matter is why dyn_txt2html.py does not just download the whole dmTxt2Html module instead of the support module only. There are a few things going on here. The t2h_textfuncs support module is significantly smaller than the main dmTxt2Html module, especially since most of the functions are stubbed/commented out. On a modem connection this could be significantly faster. But download size is not the main thing.

For Txt2Html it probably does not matter if users auto-download the whole latest module. But what about a system where the program logic is distributed, particularly where the responsibility for maintenance is distributed? You might have Alice, Bob, and Charlie be responsible for modules Funcs_A, Funcs_B, and Funcs_C, respectively. Each of them makes periodic (and independent) changes to the functions under their control, and uploads the latest and greatest to their own Web site (such as http://alice.com/Funcs_A.py). In this scenario it is not feasible to have all three programmers make changes to the same main module. But a script similar to dyn_txt2html.py can straightforwardly be extended to try importing Funcs_A, Funcs_B, and Funcs_C all at startup (and fall back to the MainProg version if these resources cannot be obtained).
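One way such a startup step might look is sketched below in Python 3. The overlay_modules helper is made up for this illustration; it covers only the import-with-fallback part and assumes that any downloading has already been attempted.

```python
import importlib


def overlay_modules(namespace, module_names):
    """Overlay the public names of each importable module onto namespace.

    Hypothetical helper. Modules that cannot be imported are skipped, so
    whatever namespace already holds serves as the fallback. Returns the
    names of the modules that were skipped.
    """
    missing = []
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            missing.append(name)
            continue
        for attr in dir(module):
            if not attr.startswith('_'):
                namespace[attr] = getattr(module, attr)
    return missing
```

A main program could call overlay_modules(globals(), ['Funcs_A', 'Funcs_B', 'Funcs_C']) after its own definitions and report whatever comes back in the returned list.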


A long-running dynamic process

The tools we have looked at so far get their dynamic program logic by downloading updated resources at initialization. This makes a lot of sense for command-line or batch processes, but what about long-running applications? Such applications are most likely server processes that respond to client requests continuously. In our case, however, we will use the curses_txt2html.py script developed for a previous article to illustrate Python's reload() function. The program curses_txt2html is a wrapper for a local copy of dmTxt2Html. Without trying to address curses programming a second time herein, it is enough to mention that curses_txt2html provides a set of interactive menus to configure and run multiple, sequential Txt2Html conversions.

curses_txt2html could potentially be left running in the background all the time, and we would like it to be able to utilize up-to-date program logic when we switch to its session and run a conversion. For this specific simple example, it admittedly would not be difficult to close and re-launch the application, and no particular disadvantages would be incurred. But it is easy to imagine other processes that genuinely do depend on being left running all the time (perhaps ones that keep state about the actions performed in a session).

For this article, a new File/Update submenu was added. When activated, it simply calls a new function called update_txt2html(). Aside from the curses calls that relate to providing some confirmation of what occurred, we have already seen these steps in other examples in this article:

'curses_txt2html.py' dynamic update function

def update_txt2html():
    # Check for updated functions (fail gracefully if not fetchable)
    s = curses.newwin(6, 60, 4, 5)
    s.box()
    s.addstr(1, 2, "* PRESS ANY KEY TO CONTINUE *", curses.A_BOLD)
    s.addstr(3, 2, "...downloading...")
    s.refresh()
    try:
        from urllib import urlopen
        updates = urlopen('http://gnosis.cx/download/dmTxt2Html.py').read()
        fh = open('dmTxt2Html.py', 'w')
        fh.write(updates)
        fh.close()
        s.addstr(3, 2, "Module [dmTxt2Html] downloaded to current directory")
    except:
        s.addstr(3, 2, "Download of updated [dmTxt2Html] module failed!")
    reload(dmTxt2Html)
    s.addstr(4, 2, "Module [dmTxt2Html] reloaded from current directory ")
    s.refresh()
    c = s.getch()
    s.erase()

There are two significant differences between dyn_txt2html.py and our update_txt2html() function. One is that we go ahead and import the main dmTxt2Html module rather than just the support functions. This is largely just to simplify the import. The issue here is that we use an import dmTxt2Html to access the module instead of a from dmTxt2Html import *. In many ways this is a safer procedure, but a consequence is that it is more difficult to (accidentally or deliberately) overwrite functions in dmTxt2Html. If we wanted to attach functions from t2h_textfuncs we would have to do a dir() on the imported support module, and attach members to the "dmTxt2Html" namespace in attribute fashion. Doing this style of overwriting is left as an exercise for the reader.
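For readers who would like a head start on that exercise, one possible shape for it is sketched here in Python 3. The attach_overrides helper is invented for this illustration and is only one of several reasonable approaches; it copies plain functions and ignores everything else the support module defines.

```python
import types


def attach_overrides(target, support):
    """Copy each public function found in support onto the target module.

    Hypothetical helper. Returns the names that were attached.
    """
    replaced = []
    for name in dir(support):
        member = getattr(support, name)
        if not name.startswith('_') and isinstance(member, types.FunctionType):
            setattr(target, name, member)
            replaced.append(name)
    return replaced
```

With both modules imported by name, the call would be attach_overrides(dmTxt2Html, support_module). Because a function looks up global names in its own module at call time, code inside the target module that calls a replaced function by name picks up the new version as well.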

The most significant difference introduced by the update_txt2html() function is the use of Python's built-in reload() function. Just performing a brand new import dmTxt2Html will not overwrite the functions previously imported. Watch out for this! A lot of beginners assume that re-importing a module will update the version in memory. It won't. Instead, the way to update the in-memory image of the functions in a module is to reload() the module.
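The difference is easy to confirm with a throwaway module. The sketch below targets Python 3, where reload() is no longer a built-in and is available as importlib.reload(); reload_demo is a module name made up for this illustration, and bytecode caching is switched off only to keep the demonstration independent of file timestamps.

```python
import importlib
import os
import sys
import tempfile

sys.dont_write_bytecode = True          # keep the demo free of cached bytecode

workdir = tempfile.mkdtemp()
sys.path.insert(0, workdir)
module_path = os.path.join(workdir, 'reload_demo.py')   # hypothetical module


def write_module(version):
    """Write a one-line module whose only content is a VERSION string."""
    with open(module_path, 'w') as fh:
        fh.write('VERSION = %r\n' % version)


write_module('first')
importlib.invalidate_caches()
import reload_demo
before = reload_demo.VERSION

write_module('second, edited on disk')
import reload_demo                      # no effect: already in sys.modules
after_import = reload_demo.VERSION

importlib.reload(reload_demo)           # re-executes the source file
after_reload = reload_demo.VERSION

print(before, '|', after_import, '|', after_reload)
```

Only the value read after the reload reflects the edited file; the second import statement simply hands back the module object that is already in memory.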

There is another small trick performed above. The download location of an updated dmTxt2Html module is the local working directory, which may or may not be the directory from where dmTxt2Html was originally loaded. In fact, if it is in the Python library directory, you probably will not be working there (and probably do not have permission to write to it as an ordinary user). But the reload() call tries loading from the current directory first, then from the rest of Python's path. So whether or not the download succeeds, the reload() should be a safe operation (it may or may not load anything new though).
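The search-order behavior can be imitated with two temporary directories. This Python 3 sketch uses importlib.reload() and places the "download" directory at the front of sys.path explicitly, which is an assumption standing in for the current directory being searched first; search_demo, library_dir, and download_dir are names invented for this illustration.

```python
import importlib
import os
import sys
import tempfile

sys.dont_write_bytecode = True          # keep the demo free of cached bytecode


def write_module(directory, version):
    """Write search_demo.py (a hypothetical module) into directory."""
    with open(os.path.join(directory, 'search_demo.py'), 'w') as fh:
        fh.write('VERSION = %r\n' % version)


library_dir = tempfile.mkdtemp()    # stands in for the Python library directory
download_dir = tempfile.mkdtemp()   # stands in for the working directory

# The module is first loaded from the "library" location.
write_module(library_dir, 'installed')
sys.path.insert(0, library_dir)
importlib.invalidate_caches()
import search_demo
loaded_first = search_demo.VERSION

# A newer copy then appears in a directory that comes earlier on the path.
write_module(download_dir, 'downloaded')
sys.path.insert(0, download_dir)
importlib.invalidate_caches()
importlib.reload(search_demo)           # searches the path again
loaded_after = search_demo.VERSION

print(loaded_first, '->', loaded_after)
```

Had nothing been written to download_dir, the reload would simply have found the library copy again, which is the harmless case described above.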


Resources

