Porting Perl To Python
Techniques for migrating legacy, untested Perl to Python
I'll begin by quoting Damian Conway in Perl Best Practices: "Perl's approach to 'object oriented' is almost excessively Perlish: there are far too many ways to do it... There are just so many possible combinations of implementation, structure, and semantics that it's quite rare to find two unrelated class hierarchies that use precisely the same style of Perl OO."
This inherent flexibility in the design of the Perl language has undoubtedly caused the organic accumulation of Perl code that is technically running yet fragile to change and difficult to understand. Compounding the problem is the possibility that the original developers are no longer available, having moved on to other projects or companies. In addition to the legacy code burden, your production requirements may have changed, or newer vendor APIs are available only in Python. At this point, the monumental feat of porting Perl to Python begins.
Once you've reached this decision, you need to choose the best strategy to solve the problem. If you are fortunate enough to have a codebase of well-written object-oriented Perl with full test coverage, then it might be as simple as porting the unit tests from Perl to Python and then writing the Python code needed to pass the newly ported tests. Although there are many talented Perl programmers who write well-documented and readable code, these folks aren't as common as they could be. Most likely, you'll find yourself in a situation where you have no idea how the Perl code works, exactly, other than observing that it does. This is where the tough part of porting Perl to Python really begins.
What not to do: Call Perl code from new Python code
Before I dive into the recommended way to do things, let's first explore what not to do. It's human nature to choose the path of least resistance when confronted with a challenge. Porting a decade's worth of organically grown, untested Perl code to Python is a difficult problem, so the most obvious solution would appear to be to find a way around rewriting all of that Perl code. This line of thinking will then lead you to a module called perlmodule, which lets you embed a Perl interpreter in Python. It might then seem simple to just call the old Perl code from new Python and be done with it.
This is a really bad idea, because now you have an even larger problem than when you started! You have legacy code you don't understand, and you have new code that calls code you don't understand. This is like paying one credit card payment using a cash advance from another credit card—you're just delaying the inevitable and increasing your technical debt (see Related topics for a link to more information on technical debt). To make matters worse, you will have "infected" your new code by incorporating subtle bugs that are difficult or impossible to test. Finally, new developers who come on to the project later will have to work with a code base that is a frightening mix of untested Perl and inadequately tested Python.
Functional testing legacy code with nose to create a new spec
In the book Working Effectively With Legacy Code, the author, Michael Feathers, states, "One of the things that nearly everyone notices when they try to write tests for existing code is just how poorly suited code is to testing." Chances are that you will notice the same thing when you first think about porting legacy, untested, Perl to Python.
One important psychological and technical step can be to create a functional test that accurately captures the end result of the Perl code you are attempting to port. For example, if you are porting a Perl script that parses a large log file and generates a comma-separated values report, you could write a simple, failing, functional test to check that this actually occurs in the new code you are writing.
To follow along with this next example, you will need to install nose. If you have the Python easy_install tool already installed, you can simply issue the command easy_install nose. If not, you can install setuptools first by following the setuptools installation instructions.
With that out of the way, here is an example nose test:
Listing 1. Intentionally failing nose functional test
#!/usr/bin/env python
"""First pass at porting Perl to Python"""

import os
def test_script_exists():
    """This test intentionally fails"""
    assert os.path.exists("myreport.csv")
If you go ahead and actually run this test, it should look like this:
Listing 2. Test result
linux% /usr/local/bin/nosetests
F
======================================================================
FAIL: test_failing_functional.test_script_exists
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/local/Python/2.5/site-packages/nose-0.10.4-py2.5.egg/nose/case.py", line 182, in runTest
    self.test(*self.arg)
  File "/usr/home/ngift/tests/test_failing_functional.py", line 7, in test_script_exists
    assert os.path.exists("myreport.csv")
AssertionError

----------------------------------------------------------------------
Ran 1 test in 0.037s

FAILED (failures=1)
As you can see from this failing test, the assertion failed because we never did anything to create this file. While this may initially seem pointless, it is a step in the process of mapping out as many things as possible in our black box of legacy code.
Once you have written functional tests that cover as much of the old code's functional spec as possible, look for any modular, testable, well-written pieces of Perl for which you can create failing unit tests. Keep writing failing tests for those pieces until a reasonable specification begins to take shape.
The final and hardest step is to write Python code that passes the tests you created. Unfortunately, there is no silver bullet. Porting legacy, untested code in Perl or any other language is just plain tough, but writing failing tests first is a reasonable strategy and can be a great help.
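As a rough picture of what "passing" might look like, here is a minimal sketch of a port that would make the functional test in Listing 1 succeed. Everything specific in it is hypothetical: the three-field log format, the per-host, per-status counting, and the names parse_log_line and write_report are assumptions made for illustration. It assumes Python 3.

```python
"""Minimal sketch of a port that writes myreport.csv from a log file."""
import csv
from collections import Counter


def parse_log_line(line):
    """Parse '<host> <status> <bytes>' into a dict, or None for unusable lines."""
    parts = line.split()
    if len(parts) != 3:
        return None
    host, status, size = parts
    if not (status.isdigit() and size.isdigit()):
        return None
    return {"host": host, "status": int(status), "bytes": int(size)}


def write_report(log_path, report_path="myreport.csv"):
    """Count requests per (host, status), write them as CSV, return the row count."""
    counts = Counter()
    with open(log_path) as log:
        for line in log:
            record = parse_log_line(line)
            if record is not None:
                counts[(record["host"], record["status"])] += 1
    with open(report_path, "w", newline="") as out:
        writer = csv.writer(out)
        writer.writerow(["host", "status", "count"])
        for (host, status), count in sorted(counts.items()):
            writer.writerow([host, status, count])
    return len(counts)
```

Calling write_report with the path to a log file creates myreport.csv in the current directory, which is all Listing 1 asks for. Stricter tests about the report's contents would then drive the next round of changes.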
Let me close by quoting Guido van Rossum in his article "Strong Versus Weak Typing": "You'll never get all the bugs out. Making the code easier to read and write, and more transparent to the team of human readers who will review the source code, may be much more valuable...."
Ultimately, creating readable and testable code is one of the main, if understated, goals of porting legacy code to a new language such as Python. Embracing this ideal can take away some of the fear and pain from the process. Good luck!
Related topics
- For a complete tutorial on nose, read An Extended Introduction to the nose Unit Testing Framework.
- Check the Nose testing site for the latest updates on nose.
- Read "Strong Versus Weak Typing" for more of Guido van Rossum's insights on typing.
- Check CPAN for more information on the perlmodule module.
- The Technical debt entry on Wikipedia gives a little background on this term describing the cost of slapdash architecture and development decisions.
- Books have been written on Working Effectively with Legacy Code.
- The Pythoscope site gives more information on this useful unit test generator for Python.
- Be sure to read Damian Conway's book, Perl Best Practices.