Porting Perl To Python

Techniques for migrating legacy, untested Perl to Python

Porting legacy Perl to Python can be a daunting task. In this article, learn some of the theory behind dealing with legacy code, including what not to do.

Noah Gift, Senior Technical Director, GiftCS, LLC

Noah Gift is the co-author of Python For UNIX and Linux System Administration by O'Reilly, and is also working on Google App Engine In Action for Manning. He is an author, speaker, consultant, and community leader, writing for publications such as Red Hat Magazine, O'Reilly, and MacTech. His consulting company's website is http://www.giftcs.com, and much of his writing can be found at http://noahgift.com. You can also follow Noah on Twitter.

He has a Master's degree in CIS from Cal State Los Angeles and a B.S. in Nutritional Science from Cal Poly San Luis Obispo. He is an Apple- and LPI-certified sysadmin, and has worked at companies such as Caltech, Disney Feature Animation, Sony Imageworks, Turner Studios, and Weta Digital. In his free time, he enjoys spending time with his wife, Leah, and their son, Liam, composing for the piano, running marathons, and exercising religiously.



01 September 2010

I'll begin by quoting Damian Conway in Perl Best Practices: "Perl's approach to 'object oriented' is almost excessively Perlish: there are far too many ways to do it... There are just so many possible combinations of implementation, structure, and semantics that it's quite rare to find two unrelated class hierarchies that use precisely the same style of Perl OO."

This inherent flexibility in the design of the Perl language has undoubtedly caused the organic accumulation of Perl code that is technically running yet fragile to change and difficult to understand. Compounding the problem is the possibility that the original developers are no longer available, having moved on to other projects or companies. In addition to the legacy code burden, your production requirements may have changed, or newer vendor APIs are available only in Python. At this point, the monumental feat of porting Perl to Python begins.

Once you've reached this decision, you need to choose the best strategy to solve the problem. If you are fortunate enough to have a codebase of well-written object-oriented Perl with full test coverage, then it might be as simple as porting the unit tests from Perl to Python and then writing Python code that passes the newly ported tests. Although there are many talented Perl programmers who write well-documented and readable code, these folks aren't as common as they could be. Most likely, you'll find yourself in a situation where you have no idea exactly how the Perl code works, only that it does. This is where the tough part of porting Perl to Python really begins.

What not to do: Call Perl code from new Python code

Automating test generation in Python

Python programmers shouldn't get too smug. While many people agree that Python is designed in a way that makes it a highly readable language, there can still be problems with legacy, untested Python code too. After all, Python doesn't test itself...or does it?

One potential way to deal with the problem of legacy, untested Python is to use the test generation tool Pythoscope. Pythoscope's mission statement is "To create an easily customizable and extensible open source tool that will automatically, or semi-automatically, generate unit tests for legacy systems written in Python." Additionally, you can define entry points that automatically test functions based on their execution. Refer to the Resources section for a detailed example of how this works.

Before I dive into the recommended way to do things, let's first explore what not to do. It's human nature to choose the path of least resistance when confronted with a challenge. Porting a decade's worth of organically grown, untested Perl code to Python is a difficult problem, so the most obvious solution would appear to be to find a way around rewriting all of that Perl code. This line of thinking will then lead you to a module called perlmodule, which lets you embed a Perl interpreter in Python. It might then seem simple to just call the old Perl code from new Python and be done with it.

This is a really bad idea, because now you have an even larger problem than when you started! You have legacy code you don't understand, and you have new code that calls code you don't understand. This is like paying off one credit card with a cash advance from another: you're just delaying the inevitable and increasing your technical debt (see Resources for a link to more information on technical debt). To make matters worse, you will have "infected" your new code by incorporating subtle bugs that are difficult or impossible to test. Finally, new developers who join the project later will have to work with a codebase that is a frightening mix of untested Perl and inadequately tested Python.


Functional testing legacy code with nose to create a new spec

In the book Working Effectively With Legacy Code, the author, Michael Feathers, states, "One of the things that nearly everyone notices when they try to write tests for existing code is just how poorly suited code is to testing." Chances are that you will notice the same thing when you first think about porting legacy, untested Perl to Python.

One important psychological and technical step can be to create a functional test that accurately captures the end result of the Perl code you are attempting to port. For example, if you are porting a Perl script that parses a large log file and generates a comma-separated values report, you could write a simple, failing, functional test to check that this actually occurs in the new code you are writing.

To follow along with this next example, you will need to install nose. If you have the Python easy_install tool already installed, you can simply issue the command easy_install nose. If not, you can install setuptools first by following the setuptools installation instructions.

With that out of the way, here is an example nose test:

Listing 1. Intentionally failing nose functional test
#!/usr/bin/env python
"""First pass at porting Perl to Python"""

import os

def test_script_exists():
    """This test intentionally fails"""
    assert os.path.exists("myreport.csv")

If you run this test, the output should look like this:

Listing 2. Test result
linux% /usr/local/bin/nosetests
F
======================================================================
FAIL: test_failing_functional.test_script_exists
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/local/Python/2.5/site-packages/nose-0.10.4-py2.5.egg/nose/case.py", line 182, in runTest
    self.test(*self.arg)
  File "/usr/home/ngift/tests/test_failing_functional.py", line 7, in test_script_exists
    assert os.path.exists("myreport.csv")
AssertionError

----------------------------------------------------------------------
Ran 1 test in 0.037s

FAILED (failures=1)

As you can see from this failing test, the assertion failed because we never did anything to create this file. While this may initially seem pointless, it is a step in the process of mapping out as many things as possible in our black box of legacy code.

Once functional tests have been written that fill out as much of the functional spec of the previous code as possible, it is worthwhile to see if you can identify any modular, testable, and well-written pieces of Perl to create failing unit tests for. More failing tests can be written for those pieces of code until a reasonable specification begins to take shape.

The final step, which is actually the hardest, is to write Python code that passes the tests you created. Unfortunately, there is no silver bullet. Porting legacy, untested code in Perl or any other language is just plain tough, but writing failing tests first is a reasonable strategy and can be a great help.


Conclusion

Let me close by quoting Guido van Rossum in his article "Strong Versus Weak Typing": "You'll never get all the bugs out. Making the code easier to read and write, and more transparent to the team of human readers who will review the source code, may be much more valuable...."

Ultimately, creating readable and testable code is one of the main, if understated, goals of porting legacy code to a new language such as Python. Embracing this ideal can take away some of the fear and pain from the process. Good luck!

Resources

Learn

