Taking OpenPower for a spin, Part 3: How to avoid having to port your code

Peter Seebach, Freelance author, Plethora.net
Peter Seebach
Peter Seebach first tested his code for 64-bit readiness on an Alpha in 1999. Still, it never hurts to be extra careful. And yes, it all worked, so booyah.

Summary:  Why is porting even hard? In this last article of the Taking OpenPower for a spin series, Peter Seebach looks at what kinds of issues are involved with portability from one architecture to another and contrasts APIs with hardware interfaces.

Date:  26 Sep 2006
Level:  Intermediate

The examples in Part 2 were contrived. Why? Because it's hard to come up with an example of code that requires much porting that isn't either obviously bad, contrived, or too low-level to apply well to non-kernel development. Device drivers and the like have a reason to look at these questions, within reason; user applications, not so much.

Over the years, software engineers have learned a lot about portability, and much of that knowledge has filtered into language design and standard coding practices. If you follow language specifications and use reasonable practices, you should rarely, if ever, have any reason to care whether your host hardware is big- or little-endian, or what exactly the largest type is. Of course, that assumes everyone else follows these practices, too; in reality, they don't always, so there's still plenty of fun to be had.
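As a minimal sketch of what such a practice can look like for byte order: the helpers below build the stored form of a 32-bit value with shifts instead of copying it out of memory, so the bytes come out the same on big- and little-endian hosts. The names put_u32_be and get_u32_be are made up for illustration; they aren't from any library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: store v into buf[0..3], most significant byte first.
 * Shifts operate on values, not on memory layout, so the bytes written are
 * identical on big- and little-endian hosts. */
static void put_u32_be(unsigned char *buf, uint32_t v)
{
    assert(buf != NULL);
    buf[0] = (unsigned char)((v >> 24) & 0xFF);
    buf[1] = (unsigned char)((v >> 16) & 0xFF);
    buf[2] = (unsigned char)((v >> 8) & 0xFF);
    buf[3] = (unsigned char)(v & 0xFF);
}

/* Hypothetical helper: the inverse of put_u32_be. */
static uint32_t get_u32_be(const unsigned char *buf)
{
    assert(buf != NULL);
    return ((uint32_t)buf[0] << 24) |
           ((uint32_t)buf[1] << 16) |
           ((uint32_t)buf[2] << 8)  |
            (uint32_t)buf[3];
}
```

Code written this way never has to ask which kind of machine it is running on; the byte order of the data is fixed by the format, not by the host.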

The problem, though, isn't that portability is hard in general; it's that portability is hard to retrofit. A great deal of commercial code is written without any thought to future portability. So, when you have real code, you might run into portability problems -- but most of them are accidental difficulties, not real challenges inherent in the problem. They're a result of bad code, and in most cases, code without the problem would have been easy to write.

Portability mostly has to do with interacting with APIs, not with hardware. Processor architecture makes a lot less difference when compiling code than APIs do. If you want to port a native UNIX® application to Microsoft® Windows®, you will need a support library; if you want to port a native x86 application to PowerPC®, but stay within the same operating system, you are probably already done.

In fact, though, some code out there doesn't survive transitions gracefully, and there are reasons why. So, this final article looks a bit at the kinds of things you learn about software portability from experiments like "copying my source directory from a Pentium-based laptop to an OpenPower™ server."

Trust the API

If you want to see code that won't survive porting, look for people who think they know more than the library writers. Let me give an example from about a decade back; it's still a very good example. It involves the PINE mail reader. For reasons beyond my ken, PINE was designed with a very elaborate set of libraries and wrappers. When you compiled it, you told it which platform you were on, and then it did Deep Magics. One of these magics was, on nearly every platform, to provide its own declarations for every library function and system call it planned to use, rather than including the standard headers. According to a PINE developer, this was done because there existed a platform somewhere where the standard headers were wrong; I think it may have been Ultrix or something. (I've never seen one, so I can't speculate.)

What this meant was that, on every new platform, PINE needed a new set of its own definitions of exactly what was in the standard headers. If you brought it to a new platform, it needed its definitions updated and corrected.

The example that caught us was compiling PINE on a big-endian system. 4.4BSD systems, such as NetBSD and FreeBSD, all have 64-bit file offsets. PINE, rather than including the system headers, provided its own declarations of the related functions, all defined in terms of 32-bit offsets. On a big-endian system, the 32 bits it looked at were the high-order half of each 64-bit value, which is zero for any file smaller than 4GB, so it thought every file was empty. Data corruption was the worst possible outcome, but the one you actually saw in practice was that it simply never thought there was any e-mail.
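The effect is easy to reproduce in miniature. The sketch below is not PINE's code; it's a simulation, under the assumption that a 32-bit declaration ends up looking at the first four bytes of a 64-bit size as it sits in memory. The function name first_word_of_u64 is made up for illustration.

```c
#include <stdint.h>

/* Simulate a 64-bit file size stored in memory in a given byte order,
 * then read back only the first four bytes, as code built against a
 * 32-bit declaration would. Hypothetical name, illustrative only. */
static uint32_t first_word_of_u64(uint64_t value, int big_endian)
{
    unsigned char mem[8];
    uint32_t word = 0;
    int i;

    /* Lay out the 64-bit value byte by byte in the requested order. */
    for (i = 0; i < 8; i++) {
        int shift = big_endian ? 8 * (7 - i) : 8 * i;
        mem[i] = (unsigned char)((value >> shift) & 0xFF);
    }

    /* Reassemble a 32-bit value from mem[0..3] in the same byte order. */
    for (i = 0; i < 4; i++) {
        int shift = big_endian ? 8 * (3 - i) : 8 * i;
        word |= (uint32_t)mem[i] << shift;
    }
    return word;
}
```

For a 5000-byte mailbox, first_word_of_u64(5000, 1) is 0, while first_word_of_u64(5000, 0) is 5000. The same mistake can look like it works on a little-endian machine, which is a large part of why it survives until someone ports the code.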

The lesson is simple: If someone thinks he or she knows more than the library or system vendor, run away. If you absolutely have to touch the code, expect it to be a nightmare. Then make plans to migrate to something else. Code like this is a disaster waiting to happen.

The number of old UNIX programs that had trouble with 64-bit file offsets is not as small as we might like; however, most of them only have trouble with files that actually exceed 2GB, or 4GB, in size. Only the programs that actively subvert the provided API and the standard C way of getting function declarations are confused by smaller files. Similarly, many older Windows installers will do horrific things (at best, simply refusing to install) if a disk has more than 2GB of free space.

It's easy to forget that the C and UNIX APIs aren't just sets of functions; they're also sets of ways to get the functions and their associated types properly in scope. Trying to outsmart these is foolhardy. I know of only one exception: Despite the fact that Perl's configure-time diagnostic ("Your stdio isn't very std") for being unable to figure out the internals of the stdio library is obviously wrong, Perl handles the situation gracefully. (In fact, I think the message was once taken out of the configure script, but later restored, presumably by someone who thought it was cute and doesn't care that it's flatly incorrect.)

In most cases, if what you want can be done with a higher-level library (stdio instead of raw reads and writes, for instance), you should do that; it gives you more insulation from possible quirks.
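Here's a small sketch of that idea: a byte counter that uses only stdio, with every declaration coming from the standard header. The name count_bytes is made up for illustration, and I've assumed a C99 compiler for long long. No offset type appears anywhere, so there's nothing to get wrong about its width.

```c
#include <stdio.h>

/* Hypothetical helper: count the bytes in a file using only stdio.
 * Returns -1 if the file cannot be opened or a read error occurs. */
static long long count_bytes(const char *path)
{
    FILE *fp = fopen(path, "rb");
    long long n = 0;
    int c;

    if (fp == NULL)
        return -1;

    /* getc returns an int so that EOF is distinct from every byte value. */
    while ((c = getc(fp)) != EOF)
        n++;

    if (ferror(fp)) {
        fclose(fp);
        return -1;
    }
    fclose(fp);
    return n;
}
```

This is slower than asking the system for the file's size, but it makes no assumptions about the platform beyond the presence of a C library.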


We typed 'make'

The (possibly apocryphal) tale of an early Linux® port by a major database vendor is that they were once asked to describe the process of porting to Linux, and answered: "We typed 'make'." This is a fairly good summary of the experience of porting between platforms with essentially similar APIs. Not all platforms are quite this compatible, though. A programmer humor site devoted to horrifying stories reports that someone's thread-based code wasn't compiling well because he wrote it on a Windows system using the MFC libraries and then tried to compile it on some UNIX system. This kind of thing is a bit crazy.

It's important to distinguish between APIs even on a single system. A lot can be said for preferring a more portable API -- OpenGL over DirectX, for instance. Unfortunately, you can't always get a portable API for what you want, and portable APIs might sometimes lack features. Still, the porting costs can be significant, and the ability to target multiple systems is worth something.

Cross-platform development APIs and toolkits abound. In general, these work by trying to abstract away API features to a new level and then providing a generic API on top of that. Sometimes this works pretty well; sometimes it's pretty bad. Interpreted languages, as well as bytecode languages such as Java™, go a step further by abstracting away the entire machine. In fact, this is often an excellent choice; many programs are much more developer-time bound than CPU-bound, and even for CPU-bound activities, being able to copy your program over to an eight-processor server with 32GB of memory might make up for running a bit slower than something that ekes every last drop of performance out of a single-processor workstation. (It was rather humbling to me to realize that I could have a single process on the OpenPower server with more physical memory associated with it than any machine I own has installed.)

For C, your best bet has been the UNIX/POSIX API for about the last twenty years; nothing else is close to the same level of portability or flexibility. The hard part is picking a GUI API; these appear to be insanely short lived, at least by comparison. It is probably not a coincidence that a lot of GUI work is now being done in other forms entirely -- Web pages and Java, for instance.


What's the big deal, then?

So, if portability to new architectures is trivial, why is such a big deal made of it? There are a few reasons. One is that the high-profile ports tend to be from the very tiny subset of code that is actually affected. We hear about ports of Linux to new architectures because it takes some work to bring up a kernel (not to mention the compiler toolchain, and everything else) on a new architecture. What we don't hear about is hundreds of people reporting that a particular package has been "successfully" ported to Linux. For instance, the "Zork test" I like to use on embedded systems is utterly trivial for the OpenPower project. I needed to change one line in the makefile to build a 32-bit version, and about four lines to build a 64-bit version. Both worked without so much as one line of code touched.

A lot of discussion of porting software has to do with two major industries: desktop computers and video game consoles. In both cases, the primary issues are APIs and development environments. In fact, video game consoles are often an exception to general rules about modern code not needing much porting. Because they have comparatively large markets with absolutely identical hardware, they tend to reward and encourage development of code that's very precisely tuned to the hardware. Without fear that some users will have slightly lower-end systems, developers can push performance right to the limits of a given piece of hardware. Nonetheless, even that is starting to fade; developers are more and more likely to use an API rather than accessing hardware directly.

In most cases, if you want to run an application on new hardware, NetBSD and Linux will run on the hardware, and your port is done. It's only when kernel development needs to happen that there's much of a challenge. But that's the part that makes Slashdot, and thus gets people thinking that porting to new systems is really hard. Unless you work on the compiler or kernel, it's not. The applications that need to know much, if anything, about their host hardware are few and far between. You can easily come up with some examples, such as X11 or Java, but 90% of the packages you see have no reason to care significantly about their host platform.

That's not to say they won't try; if you browse around running autoconf scripts, you will still find people checking for sizeof(char), on the off chance that their C code is being compiled by something that isn't a C compiler (the C standard defines sizeof(char) as 1). Most of the time, these checks really don't do anything, and indeed, sometimes they just create exciting new opportunities for bugs, for instance when cross-compiling.
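If you really do want the assumption written down, the compiler can check it for you. This is a sketch assuming a C11 compiler (for _Static_assert); the helper name bits_in is made up for illustration.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* The C standard defines sizeof(char) as 1, so this can never fail under
 * a conforming compiler; a configure-time probe for it adds nothing. */
_Static_assert(sizeof(char) == 1, "sizeof(char) is 1 by definition");

/* What can vary is the number of bits in a byte and the widths of other
 * types; <limits.h> and <stdint.h> report those at compile time.
 * Hypothetical helper: convert a size in bytes to a size in bits. */
static unsigned bits_in(size_t nbytes)
{
    return (unsigned)(nbytes * CHAR_BIT);
}
```

Checks like these are evaluated by the compiler for the target, so nothing has to be built and run on the build machine, which is exactly the step that tends to go wrong when cross-compiling.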

Really, it comes down to our understanding of the word "platform." As time has passed, UNIX has eroded the relationship between an "architecture" (such as PowerPC, or x86) and a "platform" (such as UNIX, or OS X, or Windows). It used to be that platform and architecture were closely related; SPARC meant SunOS, for instance. Now, UNIX is a platform, and the architecture is mostly an afterthought. You can throw together a box from spare parts and run UNIX, or you can buy a carefully pre-configured workstation for US$10,000 and still run UNIX.


OpenPower: Fun, and worth the time

The motivation to get access to something like the OpenPower systems isn't the need to find out whether a program will run at all on a POWER5. It might be useful for some performance testing; it's not always obvious whether a given system will be able to manage a given task, especially if you have hard timing requirements. For instance, if you need a rendering farm, a bit of benchmarking goes a long way.

I have vague recollections of, in the early 90s, being told that free operating systems would never be up to the standards we expected of commercial UNIX, such as multiprocessor systems with RAID arrays, huge amounts of memory, and months of uptime. And yet, here I am with a shell prompt on an eight-way multiprocessor system with RAID drives, 32GB of memory (hey, it looks huge to me), and two months of uptime.

If you're trying to convince management that your code will run on this hardware, having the opportunity to run make and say "yes, it works" might make the pitch a bit easier. You wouldn't have to do that if it weren't for all the hype about "porting" to "new architectures," but there it is; at least it's easy to work around. Some people tend to worry that a Linux distribution for a "new" architecture might not be fully mature or stable, but in the case of Power Architecture systems, that's just not an issue anymore; PowerPC Linux has been stable and usable for a long time. I think it gets a bad rap because of the difficulty in keeping up with Apple's vast array of innovatively incompatible components, such as sound cards and thermal management systems. If you stay away from those particular parts, it's rock solid. This is not that dissimilar from what happens if you try to run x86 Linux on bleeding-edge hardware, especially laptop or desktop systems. Since finding that out was the point of the exercise, I guess I'd call it a success.

