Porting open source projects to z/OS UNIX, Part 1: Open source network retriever

Best of both worlds: Open source on IBM platforms

Discover tools, techniques, and tips to improve your UNIX® and z/OS® software ports. This article describes a real-world software port, with examples of how various porting challenges are resolved. If you are a software developer porting software to UNIX, you will find these techniques invaluable in avoiding common pitfalls, resolving bugs, and improving your productivity.

Kumar Mani (kmani@us.ibm.com), Certified IT Architect, IBM

Kumar Mani is a Certified IT Architect at IBM Global Business Solutions. He has over ten years of experience in the design and architecture of enterprise application and integration solutions. He has expertise in designing solutions using IBM WebSphere Application Server, WebSphere Process Modeler, WebSphere Business Integration (InterChange), Web services, and the Rational toolset. He has mentored architects, analysts, and IT specialists in end-to-end requirements and solution management, and he has been an invited speaker at industry conferences and various IBM events. Kumar holds an M.S. in Computer Science and engineering from Columbia University.



Tom Hubbell (hubbellt@us.ibm.com), Software Engineer, IBM

Tom Hubbell is a software engineer in the IBM Software Group (Lotus). He has over ten years of experience in the design and development of enterprise software solutions for high-volume messaging, collaboration, and knowledge management. He has expertise in C, C++, Java code, and the entire suite of Lotus enterprise products. Tom has been an invited speaker at IBM and industry conferences on Messaging and Collaboration. He holds an M.S. in Computer Science from Rensselaer Polytechnic Institute.



13 April 2010

Introduction

This article describes how to port a popular open source project to the IBM z/OS UNIX environment. The following three reasons motivated us to write this article:

  • Our project touches nearly all the fundamental aspects of modern software, that is, data, networks, command terminal I/O, and standard libraries. They are not only interesting areas in themselves, but they also provide the software designer with an important education on the platform and the porting process.
  • Free software systems have their own set of conventions, distinctions, and limitations that may, at first, seem peculiar or confusing to developers from structured programming backgrounds. Deeper inspection reveals that free software is quite rich in technical innovation and that the free software movement is an effective collaborative environment for evolution and growth. This article illustrates these points and strengthens ties between IBM systems and the free software movement.
  • We want to augment the existing body of knowledge. Many books describe porting (including porting to zSeries®) in generic terms, but real-life accounts that describe specific challenges and their solutions are more useful to developers. We hope that software developers can apply these techniques in their projects and improve their understanding of porting to the IBM platform.

This article is intended for intermediate to advanced level UNIX developers. Familiarity with IBM platforms, particularly zSeries UNIX, is recommended. The article focuses on the key concepts and technical matter, but you are encouraged to keep IBM UNIX references close at hand. The following reference materials were indispensable (see Resources for links):

  • C/C++ Applications on z/OS and OS/390 UNIX (SG24-5992-01)
  • Open Source Software for z/OS and OS/390 UNIX (SG24-5944)
  • z/OS V1R11.0 UNIX System Services Command Reference (SA22-7802-11)
  • z/OS V1R11.0 XL C/C++ Run-Time Library Reference (SA22-7821-11)
  • z/OS UNIX System Services Porting Guide

Setting up the port

This article begins with an overview of the project and the porting environment. The project is called Wget, which is a free software package for retrieving files using various Internet protocols. Take a look at the Wget home page (see Resources) to understand its features, options, and usage.

It is assumed that you have access to a z/OS UNIX environment. A typical login shell through a compatible telnet application is sufficient. Batch users can run the same commands as interactive users and achieve the same outcome. Listing 1 describes how to set up the porting environment.

Listing 1. Set up the port environment
 /tmp/x $ uname -I -a
z/OS MYZOS 11.00 01 2097
 /tmp/x $ ls wget-1.9.tar.gz 
wget-1.9.tar.gz
 /tmp/x $ gunzip -c wget-1.9.tar.gz | pax -ofrom=ISO8859-1,to=IBM-1047 -rv 
wget-1.9
wget-1.9/doc
wget-1.9/doc/ChangeLog-branches
wget-1.9/doc/ChangeLog-branches/1.6_branch.ChangeLog
wget-1.9/doc/ChangeLog
...
wget-1.9/ltmain.sh
wget-1.9/mkinstalldirs
wget-1.9/stamp-h.in
 /tmp/x $ cd wget-1.9
 /tmp/x/wget-1.9 $

The uname command indicates that version 1 release 11 of the operating system is being used. This information is useful for ascertaining the options and features of the libraries, tools, and documentation, and for reporting bugs. The rest of the listing shows how to unpack the Wget source package. Note the use of the pax command to translate the ASCII (ISO8859-1) source into EBCDIC (IBM-1047). Now you're ready to begin the port.
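The translation that pax performs on each file can also be done programmatically. The following is a minimal sketch, assuming the standard iconv(3) interface and the same code set names used in Listing 1; the function name latin1_to_ebcdic is hypothetical, and the names accepted by iconv_open vary by platform, so the function reports failure rather than assuming the conversion exists.

```c
#include <iconv.h>
#include <stddef.h>

/* Hypothetical helper: convert `inlen` bytes of ISO8859-1 text in `in`
   to IBM-1047 (EBCDIC) in `out`, which can hold `outcap` bytes.
   Returns the number of bytes written, or -1 if the conversion is not
   available on this system or fails. */
long latin1_to_ebcdic(const char *in, size_t inlen, char *out, size_t outcap)
{
    /* Code set names are taken from Listing 1; other systems may
       spell them differently. */
    iconv_t cd = iconv_open("IBM-1047", "ISO8859-1");
    if (cd == (iconv_t)-1)
        return -1;

    char *src = (char *)in;   /* iconv takes char **, but only reads the input */
    char *dst = out;
    size_t srcleft = inlen;
    size_t dstleft = outcap;

    size_t rc = iconv(cd, &src, &srcleft, &dst, &dstleft);
    iconv_close(cd);

    if (rc == (size_t)-1)
        return -1;
    return (long)(outcap - dstleft);
}
```

Both code sets are single-byte, so a successful conversion writes exactly as many bytes as it reads.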


Configuring Wget

Free software packages are typically built using three or four steps. Most free software packages include a file called README or INSTALL that describes the build. The first step is configuration; this is done by running the configure script that comes with the package.

Listing 2. Configuration
 /tmp/x/wget-1.9 $ ./configure >config.out 2>config.err  
 /tmp/x/wget-1.9 $ cat config.out
     1  configuring for GNU Wget 1.9
     2  checking build system type... i370-ibm-openedition
     3  checking host system type... i370-ibm-openedition
     4  checking whether make sets $(MAKE)... yes
     5  checking for a BSD-compatible install... ./install-sh -c
     6  checking for gcc... no
     7  checking for cc... cc
     8  checking for C compiler default output... a.out
     9  checking whether the C compiler works... yes
    10  checking whether we are cross compiling... no
    11  checking for suffix of executables...
    12  checking for suffix of object files... o
    13  checking whether we are using the GNU C compiler... no
    14  checking whether cc accepts -g... yes
    15  checking for cc option to accept ANSI C... none needed
    16  checking how to run the C preprocessor... cc -E
 ...
    55  checking for cc option to accept ANSI C... no
    56  checking for function prototypes... no
 ...
    84  checking sys/utime.h usability... yes
    85  checking sys/utime.h presence... no
    86  configure: WARNING: sys/utime.h: accepted by compiler, rejected by preprocessor!
    87  configure: WARNING: sys/utime.h: proceeding with the preprocessor's result
    88  configure: WARNING:     ## ------------------------------------ ##
    89  configure: WARNING:     ## Report this to bug-autoconf@gnu.org. ##
    90  configure: WARNING:     ## ------------------------------------ ##
    91  checking for sys/utime.h... no
 ...
   117  checking for working alloca.h... no
   118  checking for alloca... no
 ...
   195  checking libintl.h usability... yes
   196  checking libintl.h presence... no
   197  configure: WARNING: libintl.h: accepted by compiler, rejected by preprocessor!
   198  configure: WARNING: libintl.h: proceeding with the preprocessor's result
   199  configure: WARNING:     ## ------------------------------------ ##
   200  configure: WARNING:     ## Report this to bug-autoconf@gnu.org. ##
   201  configure: WARNING:     ## ------------------------------------ ##
   202  checking for libintl.h... no
 ...
 /tmp/x/wget-1.9 $

Configuration yields mixed results. The configure script recognizes the platform (line 3) and presumably makes some assumptions and accommodations for it. But there are some wrinkles. There is a conflict about ANSI C (line 15 versus line 55), and some headers and functions (for example, alloca) are not recognized.

What happened? Take a look at the config.log file for some hints.

Listing 3. Configuration errors
 /tmp/x/wget-1.9 $ more config.log
 ...
   341  configure:7526: checking for cc option to accept ANSI C
   342  configure:7569: cc  -c  -O  conftest.c >&5
   343  ERROR CCN3166 configure:7568  Definition of function choke requires parentheses.
   344  ERROR CCN3276 configure:7568  Syntax error: possible missing '{'?
   345  CCN0793(I) Compilation failed for file ./conftest.c.  Object file not created.
   346  FSUM3065 The COMPILE step ended with return code 12.
   347  FSUM3017 Could not compile conftest.c. Correct the errors and try again.
   348  configure:7572: $? = 3
   349  configure: failed program was:
   350  | #line 7544 "configure"
   351  | /* confdefs.h.  */
   352  |
 ...
   373  | /* end confdefs.h.  */
   374  | #if !defined(__STDC__)
   375  | choke me
   376  | #endif
 ...
   886  WARNING CCN3296 configure:9994  #include file <sys/utime.h> not found.
   887  FSUM3065 The COMPILE step ended with return code 4.
   888  configure:9970: $? = 0
 ...
  1037  configure:10584: checking for working alloca.h
  1038  configure:10606: cc -o conftest  -O   conftest.c  >&5
  1039  WARNING CCN3296 configure:10638 #include file <alloca.h> not found.
  1040  FSUM3065 The COMPILE step ended with return code 4.
  1041   IEW2456E 9207 SYMBOL alloca UNRESOLVED.  MEMBER COULD NOT BE INCLUDED FROM THE
  1042            DESIGNATED CALL LIBRARY.
  1043  FSUM3065 The LINKEDIT step ended with return code 8.
  1044  configure:10609: $? = 3
  1045  configure: failed program was:
  1046  | #line 10592 "configure"
  1047  | /* confdefs.h.  */
  1048  |
  1049  | #define PACKAGE_NAME ""
 ...
  1090  | #define RETSIGTYPE void
  1091  | #define HAVE_STRUCT_UTIMBUF 1
  1092  | /* end confdefs.h.  */
  1093  | #include <alloca.h>
 ...
  2474  configure:13513: cc -c  -O  conftest.c >&5
  2475  WARNING CCN3296 configure:13615 #include file <libintl.h> not found.
  2476  FSUM3065 The COMPILE step ended with return code 4.
  2477  configure:13516: $? = 0
 ...
  /tmp/x/wget-1.9 $

The hints are conclusive. Lines 343-344 point to line 375, a deliberately invalid statement that is compiled only when __STDC__ is not defined. In its default mode, cc does not define __STDC__, so this line chokes the compiler. The alloca problem occurs because the symbol is not defined. Although the include headers are not found, the compiler issues only a warning (CCN3296), so configure sees a zero exit status and proceeds. (This confuses matters to the extent that the script urges the user to report the bug!) These problems are resolved as follows. The compiler options langlvl(extc99) and haltonmsg(CCN3296) are specified. The langlvl option enables extended mode to recognize the unresolved symbols, and the haltonmsg option directs the compiler to a hard stop if headers are not found. The stdlib.h standard header, which defines the alloca function on z/OS, is included through a code change to the configure file, as shown in Listing 4. At this point, configuration runs without errors.

Listing 4. Configuration corrections
 /tmp/x/wget-1.9 $ vi configure
 ...
 10660  #   else
 10661  #    ifdef __MVS__
 10662  #     include <stdlib.h>
 10663  #    else  /* non MVS */
 10664  #     ifndef alloca /* predefined by HP cc +Olibcalls */
 10665  char *alloca ();
 10666  #     endif
 10667  #    endif /* MVS */
 10668  #   endif
 10669  #  endif
 10670  # endif
 10671  #endif
 ...
 /tmp/x/wget-1.9 $ ./configure CFLAGS="-Wc,langlvl(extc99),haltonmsg(CCN3296)" \
                   >config2.out.txt 2>&1

Note the use of the __MVS__ symbol. Software should be designed and coded to be portable, which implies that changes made for a new platform should not affect the source code or the build process on other platforms. The z/OS compiler predefines the __MVS__ symbol, so it identifies the z/OS platform without any explicit definition on the command line or in the source. This symbol marks all the changes in this port.
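The same guard works in application code, not only in the configure script. The following is a minimal sketch, assuming that stdlib.h declares alloca on z/OS (as described above) and that other platforms provide an alloca.h header, which is common but not universal. The function trimmed_length is a hypothetical example, not part of Wget.

```c
#include <string.h>

#ifdef __MVS__
# include <stdlib.h>    /* z/OS: alloca is declared here */
#else
# include <alloca.h>    /* assumption: glibc/BSD-style header is present */
#endif

/* Hypothetical example: return the length of `s` once trailing spaces
   are stripped. The work is done on a stack copy so that `s` itself
   is never modified. */
size_t trimmed_length(const char *s)
{
    size_t len = strlen(s);
    char *scratch = alloca(len + 1);   /* released automatically on return */

    memcpy(scratch, s, len + 1);
    while (len > 0 && scratch[len - 1] == ' ')
        scratch[--len] = '\0';
    return len;
}
```

Because everything specific to z/OS sits inside the #ifdef, builds on other platforms see exactly the code they saw before the port. The stack allocation is unbounded here, so this pattern suits short strings only.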


Building wget

Free software is typically built by the make command, and we start with this command.

Listing 5. Start the build
 /tmp/x/wget-1.9 $ gmake
echo timestamp > stamp-h.in
cd src && \
gmake CC='cc' CPPFLAGS='' \
DEFS='-DHAVE_CONFIG_H -DSYSTEM_WGETRC=\"/usr/local/etc/wgetrc\" \
-DLOCALEDIR=\"/usr/local/share/locale\"' \
CFLAGS='-Wc,langlvl\(extc99\),haltonmsg\(CCN3296\)' LDFLAGS='' LIBS='' \
prefix='/usr/local' exec_prefix='/usr/local' bindir='/usr/local/bin' \
infodir='/usr/local/info' mandir='/usr/local/man' manext='1'
gmake[1]: Entering directory `/tmp/x/wget-1.9/src'
cc -I. -I. -I/opt/include   -DHAVE_CONFIG_H -DSYSTEM_WGETRC=\"/usr/local/etc/wgetrc\" \
-DLOCALEDIR=\"/usr/local/share/locale\" -Wc,langlvl\(extc99\),haltonmsg\(CCN3296\) \
-c cmpt.c
...
cc -I. -I. -I/opt/include   -DHAVE_CONFIG_H -DSYSTEM_WGETRC=\"/usr/local/etc/wgetrc\" \
-DLOCALEDIR=\"/usr/local/share/locale\" -Wc,langlvl\(extc99\),haltonmsg\(CCN3296\) \
-c main.c
ERROR CCN3052 ./main.c:461   Duplicate case label for value 132. Labels must be unique.
ERROR CCN3052 ./main.c:478   Duplicate case label for value 136. Labels must be unique.
ERROR CCN3052 ./main.c:488   Duplicate case label for value 146. Labels must be unique.
ERROR CCN3052 ./main.c:494   Duplicate case label for value 148. Labels must be unique.
ERROR CCN3052 ./main.c:527   Duplicate case label for value 165. Labels must be unique.
...
 /tmp/x/wget-1.9 $ vi src/main.c
 ...
 407      case 132:
 408        setoptval ("spider", "on");
 409        break;
 410      case 133:
 411        setoptval ("noparent", "on");
 412        break;
 413      case 136:
 414        setoptval ("deleteafter", "on");
 415        break;
 ...
 461      case 'd':
 462  #ifdef ENABLE_DEBUG
 463        setoptval ("debug", "on");
 464  #else
 465        fprintf (stderr, _("%s: debug support not compiled in.\n"),
 466             exec_name);
 467  #endif
 468        break;
 469      case 'E':
 470        setoptval ("htmlextension", "on");
 471        break;
 472      case 'F':
 473        setoptval ("forcehtml", "on");
 474        break;
 475      case 'H':
 ...
 /tmp/x/wget-1.9 $

Most of the files compile without incident, but the compiler reports "duplicate case label" errors in main.c. These errors occur because the character literals conflict with the numbered labels. This is an ASCII/EBCDIC code page problem. Wget was originally designed for ASCII platforms, where the numeric labels do not conflict with alphabetical characters. On EBCDIC platforms such as z/OS, there is a conflict: for example, the literal 'd' at line 461 has the value 132, which duplicates the numeric label at line 407.
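The arithmetic behind these errors can be checked on any machine. The following sketch computes the IBM-1047 code point of a lowercase letter from its ASCII code; the function name ebcdic1047_lower is hypothetical. The five values flagged in Listing 5 (132, 136, 146, 148, and 165) are the IBM-1047 codes of d, h, k, m, and v.

```c
/* Hypothetical helper: IBM-1047 (EBCDIC) code point of a lowercase
   letter, given its ASCII code. Returns -1 if `ascii` is not in the
   ASCII range 'a'..'z' (0x61..0x7A). Hex constants are used instead of
   character literals so the result does not depend on the code set of
   the machine that compiles this. */
int ebcdic1047_lower(unsigned char ascii)
{
    if (ascii < 0x61 || ascii > 0x7A)
        return -1;

    int idx = ascii - 0x61;                 /* 0 for a, 25 for z */

    if (idx < 9)
        return 0x81 + idx;                  /* a..i occupy 0x81..0x89 */
    if (idx < 18)
        return 0x91 + (idx - 9);            /* j..r occupy 0x91..0x99 */
    return 0xA2 + (idx - 18);               /* s..z occupy 0xA2..0xA9 */
}
```

For example, ebcdic1047_lower(0x64) is 132, which is the value the compiler reports for the label at line 461 of main.c.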

How do you resolve this? The solution is to ensure that the numeric labels (lines 407, 410, 413, and so on) do not conflict with any character in any code page. This would, however, necessitate several (albeit trivial) changes to the source code. Recall that one of the aims in porting software is to minimize the number and extent of source code changes. Therefore, a different technique is called for. Compile the source code entirely in ASCII. This is accomplished by defining the __LIBASCII__ symbol and enabling the CONV(ISO8859-1) option. (The C/C++ Runtime Library Reference has details on how to enable this. See Resources for a link.) This option, where all the literal strings are converted to ASCII, may well solve the problem.
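For comparison, the first approach (numeric labels that cannot collide with any character) is usually written by starting the numeric identifiers above the largest value a character can hold. The following is a sketch of that idea only, not the change made in this port. The enum and function names are hypothetical, and the setting names are borrowed from Listing 5.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical identifiers for options that have no single-letter form.
   Starting at UCHAR_MAX + 1 keeps them clear of every character value,
   whatever the code page. */
enum long_only_option {
    OPT_SPIDER = UCHAR_MAX + 1,
    OPT_NOPARENT,
    OPT_DELETEAFTER
};

/* Map an option code to the name of the setting it turns on, or NULL
   if the code is not recognized. */
const char *option_setting(int code)
{
    switch (code) {
    case 'd':             return "debug";
    case 'E':             return "htmlextension";
    case 'F':             return "forcehtml";
    case OPT_SPIDER:      return "spider";
    case OPT_NOPARENT:    return "noparent";
    case OPT_DELETEAFTER: return "deleteafter";
    default:              return NULL;
    }
}
```

This switch compiles on both ASCII and EBCDIC platforms because no character literal can equal a value greater than UCHAR_MAX. The cost is that every numeric label and every matching table entry in the original source has to change.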

Listing 6. Building wget with LIBASCII
 /tmp/x/wget-1.9 $ gmake
cd src && \
gmake CC='cc' CPPFLAGS='' \
DEFS='-DHAVE_CONFIG_H -DSYSTEM_WGETRC=\"/usr/local/etc/wgetrc\" \
-DLOCALEDIR=\"/usr/local/share/locale\"' \
CFLAGS='-Wc,langlvl\(extc99\),haltonmsg\(CCN3296\)' LDFLAGS='' LIBS='' \
prefix='/usr/local' exec_prefix='/usr/local' bindir='/usr/local/bin' \
infodir='/usr/local/info' mandir='/usr/local/man' manext='1'
gmake[1]: Entering directory `/tmp/x/wget-1.9/src'
cc -I. -I. -I/opt/include   -DHAVE_CONFIG_H -DSYSTEM_WGETRC=\"/usr/local/etc/wgetrc\"\
   -DLOCALEDIR=\"/usr/local/share/locale\" -Wc,langlvl\(extc99\),haltonmsg\(CCN3296\)\
   -Wc,conv\(ISO8859-1\) -D__LIBASCII__ -c cmpt.c
...
cc -I. -I. -I/opt/include   -DHAVE_CONFIG_H -DSYSTEM_WGETRC=\"/usr/local/etc/wgetrc\"\
   -DLOCALEDIR=\"/usr/local/share/locale\" -Wc,langlvl\(extc99\),haltonmsg\(CCN3296\)\
   -Wc,conv\(ISO8859-1\) -D__LIBASCII__ -c log.c
cc -I. -I. -I/opt/include   -DHAVE_CONFIG_H -DSYSTEM_WGETRC=\"/usr/local/etc/wgetrc\"\
   -DLOCALEDIR=\"/usr/local/share/locale\" -Wc,langlvl\(extc99\),haltonmsg\(CCN3296\)\
   -Wc,conv\(ISO8859-1\) -D__LIBASCII__ -c main.c
cc -I. -I. -I/opt/include   -DHAVE_CONFIG_H -DSYSTEM_WGETRC=\"/usr/local/etc/wgetrc\"\
   -DLOCALEDIR=\"/usr/local/share/locale\" -Wc,langlvl\(extc99\),haltonmsg\(CCN3296\)\
   -Wc,conv\(ISO8859-1\) -D__LIBASCII__ -c gen-md5.c
...
 /tmp/x/wget-1.9 $ ./wget                                      ## first run
 %$#..$%$...$%$#$%$#$%$#$% /tmp/x/wget-1.9/src $
 /tmp/x/wget-1.9 $ ./wget >out.txt                             ## second run
 /tmp/x/wget-1.9 $ iconv -f ISO8859-1 -t IBM-1047 out.txt
%s: missing URL
Usage: %s [OPTION]... [URL]...

Try `%s --help' for more options.
 /tmp/x/wget-1.9 $ src/wget http://www.ibm.com >out2.txt 2>&1  ## third run
 /tmp/x/wget-1.9 $ iconv -f ISO8859-1 -t IBM-1047 out2.txt
--%s--  %s
  %s => `%s'
Resolving %s... failed: %s.
 /tmp/x/wget-1.9 $

The build runs without apparent error. However, when you first try to run it, you get illegible output on the terminal. Close examination reveals that wget did run correctly, and even produced the help message. On the third try, when a URL is supplied, it becomes clear that wget fails because it is unable to parse the argument.

What happened? The immediate temptation is to start the debugger and step through the code. But the first three runs have already produced enough output to provide some important clues. The first clue is the presence of formatting specifications (for example, %s) in the output. This indicates that while all the strings in the code were converted to ASCII, they were incorrectly interpreted (or not interpreted at all) by the functions. As an example, %s, which is commonly used in the Xprintf family of functions, is printed out as it is, which indicates that the function did not interpret it at all. The second clue is the failure in the third try. The URL argument to wget (www.ibm.com) is correct, yet wget is not able to resolve the hostname. The likely cause is that the hostname was not translated to ASCII. As a result, the name lookup failed.
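The first clue can be made concrete. In ASCII the percent sign is byte 0x25, while in IBM-1047 it is byte 0x6C, so a formatter that looks for the EBCDIC percent sign finds no directives in an ASCII-encoded format string. The following sketch uses a deliberately simplified scanner (real printf parsing is more involved); the names count_directives and MISSING_URL_ASCII are hypothetical.

```c
#include <stddef.h>

/* The format string "%s: missing URL\n" spelled out as ASCII bytes. */
static const unsigned char MISSING_URL_ASCII[] = {
    0x25, 0x73, 0x3A, 0x20,                     /* "%s: "    */
    0x6D, 0x69, 0x73, 0x73, 0x69, 0x6E, 0x67,   /* "missing" */
    0x20, 0x55, 0x52, 0x4C, 0x0A, 0x00          /* " URL\n"  */
};

/* Simplified scanner: count conversion directives in `fmt`, where
   `percent` is the byte that the formatter treats as the percent sign.
   A doubled percent byte counts as a literal, not a directive. */
size_t count_directives(const unsigned char *fmt, unsigned char percent)
{
    size_t n = 0;

    for (size_t i = 0; fmt[i] != 0; i++) {
        if (fmt[i] != percent)
            continue;
        if (fmt[i + 1] == percent) {   /* "%%" */
            i++;
            continue;
        }
        if (fmt[i + 1] != 0)           /* ignore a lone trailing percent */
            n++;
    }
    return n;
}
```

Scanning for 0x25 finds one directive in this string, and scanning for 0x6C finds none. With no directive recognized, the bytes pass through unchanged, which matches the literal %s in the converted output.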

Native __LIBASCII__ does not solve this problem. Before you apply other techniques, it is helpful to reflect upon the depth and precise nature of the ASCII/EBCDIC issue. It is not limited to converting literal strings. Another aspect of the issue is to ensure that the Xprintf functions interpret their arguments correctly. These functions should also behave sensibly; if their output goes to the terminal, then it must be translated properly. Finally, wget exchanges information (in ASCII) with external hosts; this must be handled correctly. (This last issue is explained in the Porting Guide, pages 103-105. See Resources for the link.)


Advanced build and debug

A working solution to the ASCII/EBCDIC issue is available in the libascii library. This library includes functions that handle the various I/O scenarios sensibly. The point, however, is not that libascii is a panacea to every ASCII/EBCDIC problem. Rather, you should ask the following questions when faced with an ASCII/EBCDIC issue:

  • Is it limited to string literals?
  • Are there assumptions (for example, ctype) that the alphabet 'a'...'z' is contiguous in the code page?
  • Does the system interact (exchange ASCII protocol data) with external hosts?
  • What is the mix of terminal I/O versus file I/O?

These questions will indicate which solution is best suited for the problem.
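The second question deserves a concrete illustration. In IBM-1047 the lowercase letters sit in three separate runs (0x81-0x89, 0x91-0x99, and 0xA2-0xA9), so a range test of the form c >= 'a' && c <= 'z' accepts code points that are not letters. The following sketch counts them; all function names are hypothetical.

```c
/* Nonzero if `c` is the IBM-1047 code of a lowercase letter a..z. */
int ebcdic1047_is_lower(unsigned char c)
{
    return (c >= 0x81 && c <= 0x89) ||   /* a..i */
           (c >= 0x91 && c <= 0x99) ||   /* j..r */
           (c >= 0xA2 && c <= 0xA9);     /* s..z */
}

/* The range test as it behaves on EBCDIC, where 'a' is 0x81 and
   'z' is 0xA9. */
int naive_range_is_lower(unsigned char c)
{
    return c >= 0x81 && c <= 0xA9;
}

/* Count code points that the range test accepts even though they are
   not lowercase letters. */
int count_false_positives(void)
{
    int count = 0;

    for (int c = 0; c <= 0xFF; c++) {
        if (naive_range_is_lower((unsigned char)c) &&
            !ebcdic1047_is_lower((unsigned char)c))
            count++;
    }
    return count;
}
```

The range 0x81-0xA9 spans 41 code points, of which 26 are letters, so the range test admits 15 extra values. Code that relies on the standard ctype functions instead of range comparisons does not have this problem.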

Let's proceed with the libascii library. (See Resources for the link.) The instructions are fairly clear. Remember to modify the Makefiles to include the new code set directive, -D__STRING_CODE_SET__="ISO8859-1", as shown in the CFLAGS settings at the end of Listing 7.

Listing 7. libascii
 /tmp/x/wget-1.9 $ vi src/main.c
 ...
 64  #include "url.h"
 65  #include "progress.h"       /* for progress_handle_sigwinch */
 66  #include "convert.h"
 67
 68  #ifdef HAVE_SSL
 69  # include "gen_sslfunc.h"
 70  #endif
 71
 72  /* On GNU system this will include system-wide getopt.h. */
 73  #include "getopt.h"
 74
 75  #ifdef __MVS__  /* zOS */
 76  #include "_Ascii_a.h"
 77  #endif          /* zOS */
 78
 79  #ifndef PATH_SEPARATOR
 80  # define PATH_SEPARATOR '/'
 81  #endif
 ...
 383    i18n_initialize ();
 384       
 385    append_to_log = 0;
 386
 387    #ifdef __MVS__ /* zOS */
 388    __argvtoascii_a(argc, (char **)argv);
 389    #endif         /* zOS */
 390
 ...
 /tmp/x/wget-1.9 $ grep "^CFLAGS" Makefile src/Makefile
Makefile:CFLAGS = -Wc,langlvl\(extc99\),haltonmsg\(CCN3296\) \
         -D__STRING_CODE_SET__="ISO8859-1"
Makefile:CFLAGS='$(CFLAGS)' LDFLAGS='$(LDFLAGS)' LIBS='$(LIBS)' \
src/Makefile:CFLAGS   = -Wc,langlvl\(extc99\),haltonmsg\(CCN3296\) \
         -D__STRING_CODE_SET__="ISO8859-1"

Build and run are equally straightforward. You should use care in including the header file correctly and in specifying the libascii.a library at link time.

Listing 8. Building wget with the libascii library
 /tmp/x/wget-1.9 $ gmake
cd src && \
gmake CC='cc' CPPFLAGS='' \
DEFS='-DHAVE_CONFIG_H -DSYSTEM_WGETRC=\"/usr/local/etc/wgetrc\" \
-DLOCALEDIR=\"/usr/local/share/locale\"' \
CFLAGS='-Wc,langlvl\(extc99\),haltonmsg\(CCN3296\)\
        -D__STRING_CODE_SET__="ISO8859-1"'\
LDFLAGS='' LIBS='' \
prefix='/usr/local' exec_prefix='/usr/local' bindir='/usr/local/bin' \
infodir='/usr/local/info' mandir='/usr/local/man' manext='1'
gmake[1]: Entering directory `/tmp/x/wget-1.9/src'
cc -I. -I. -I/opt/include   -DHAVE_CONFIG_H -DSYSTEM_WGETRC=\"/usr/local/etc/wgetrc\"\
   -DLOCALEDIR=\"/usr/local/share/locale\" -Wc,langlvl\(extc99\),haltonmsg\(CCN3296\)\
   -D__STRING_CODE_SET__="ISO8859-1" -c cmpt.c
...
cc -I. -I. -I/opt/include   -DHAVE_CONFIG_H -DSYSTEM_WGETRC=\"/usr/local/etc/wgetrc\"\
   -DLOCALEDIR=\"/usr/local/share/locale\" -Wc,langlvl\(extc99\),haltonmsg\(CCN3296\)\
   -D__STRING_CODE_SET__="ISO8859-1" -c version.c
/bin/sh ../libtool --mode=link cc  /tmp/x/libascii/libascii.a -o wget  cmpt.o connect.o\
 convert.o cookies.o ftp.o ftp-basic.o ftp-ls.o ftp-opie.o getopt.o hash.o headers.o\
 host.o html-parse.o html-url.o http.o init.o log.o main.o gen-md5.o gnu-md5.o netrc.o\
 progress.o rbuf.o recur.o res.o retr.o safe-ctype.o snprintf.o  url.o utils.o version.o 
cc -o wget cmpt.o connect.o convert.o cookies.o ftp.o ftp-basic.o ftp-ls.o ftp-opie.o\
 getopt.o hash.o headers.o host.o html-parse.o html-url.o http.o init.o log.o main.o\
 gen-md5.o gnu-md5.o netrc.o progress.o rbuf.o recur.o res.o retr.o safe-ctype.o\
 snprintf.o url.o utils.o version.o /tmp/x/libascii/libascii.a
 /tmp/x/wget-1.9 $
 /tmp/x/wget-1.9 $ ./wget
 wget: missing URL
Usage: wget [OPTION]... [URL]...

Try `wget --help' for more options.
 /tmp/x/wget-1.9/src $ ./wget http://www.ibm.com
--17:19:04--  http://www.ibm.com/
           => `index.html'
Resolving www.ibm.com... 129.42.56.216Killed
 /tmp/x/wget-1.9 $

There is a new problem now. Wget builds successfully, but it mysteriously dies as indicated at the end of Listing 8. Obtaining a CEEDUMP is a good way to start debugging this problem. CEEDUMPs (Common Execution Environment DUMP) are roughly similar to UNIX core dumps. They are often the starting point for debugging programs that suffer from abnormal terminations.

Listing 9. Debugging wget
 /tmp/x/wget-1.9 $
 /tmp/x/wget-1.9 $ export _CEE_DMPTARG=/tmp/x/wget-1.9
 /tmp/x/wget-1.9 $ export _CEE_RUNOPTS="termthdact(dump),trap(on,nospie)"
 /tmp/x/wget-1.9 $ src/wget http://www.ibm.com >out3.txt 2>&1
Segmentation fault
 /tmp/x/wget-1.9 $ ls CEE*
CEEDUMP.20100203.183737.50462961
 /tmp/x/wget-1.9 $ vi CEEDUMP.20100203.183737.50462961
     1  CEE3DMP V1 R11.0: Condition processing resulted in the unhandled condition.  \
             02/03/10 6:37:37 PM                  Page:    1      
     2  ASID: 028E   PID: 50462961   Parent PID: 16908960   User name: ME
     3      
     4  CEE3845I CEEDUMP Processing started.
     5      
     6  Information for enclave main
     7      
     8  Information for thread 242E590000000001
     9          
    10  Traceback:
    11      DSA   Entry       E  Offset  Statement Load Mod ProgramUnit Service Status
    12      1     CEEHDSP     +000041E2            CEEPLPKA CEEHDSP     UK51026 Call
    13      2     _UCS_conv   +0000026A            CEUUZZZZ             UK50307 Exception
    14      3     _iconv_exec +000001CC            CEUUZZZZ             HLE7760 Call
    15      4     @@GETFN     +000000C2            CEEEV003                     Call
    16      5     iconv       +000000B4            CEEEV003             UK50307 Call
    17      6     __toebcdic_a+000000BC            wget                 Call
    18      7     __fputs_a   +000000AC            wget                 Call
    19      8     logputs     +00000138            wget                 Call
    20      9     lookup_host +00000370            wget                 Call
    21      10    gethttp     +00000232            wget                 Call
    22      11    http_loop   +00000764            wget                 Call
    23      12    retrieve_url+00000380            wget                 Call
    24      13    main        +000018E8            wget                 Call
    25      14    EDCZMINV    +000000C2            CEEEV003             Call
    26      15    CEEBBEXT    +000001B8            CEEPLPKA CEEBBEXT    HLE7760  Call
    27
    ...
    64  Original Condition:
    65  CEE3204S The system detected a protection exception (System Completion Code=0C4).
    66  Location:
    67   Program Unit:  Entry: _UCS_conv
    68   Statement:     Offset: +0000026A
    ...

The CEEDUMP reveals everything. (The Language Environment Debugging Guide [see Resources] has extensive documentation on CEEDUMPs.) Note the environment variables required to force a CEEDUMP. The dump indicates that this was a memory protection fault (line 65). It also includes a backtrace (lines 11-26). The backtrace reveals that __toebcdic_a is approximately where the program transitioned out of wget space. This is a strong hint that a data buffer is crossing a memory protection boundary between logputs and iconv.

Listing 10. Solving the memory problem
 /tmp/x/wget-1.9 $
 /tmp/x/wget-1.9 $ vi src/log.c
   ...
   328  void
   329  logputs (enum log_options o, const char *s)
   330  {
   331    FILE *fp;
   332
   333    check_redirect_output ();
   334    if (!(fp = get_log_fp ()))
   335      return;
   336    CHECK_VERBOSE (o);
   337
   338    fputs (s, fp);
   339    if (save_context_p)
   340      saved_append (s);
   341    if (flush_log_p)
   342      logflush ();
   343    else
   344      needs_flushing = 1;
   345  }
   ...
 /tmp/x/wget-1.9 $
 /tmp/x/wget-1.9 $ vi src/log.c      ## solution
   ...
   338  #ifdef __MVS__
   339    char sbuf[256]; int len;
   340    
   341    len = strlen(s);
   342    if (len >= 256) len = 255;
   343    strncpy(sbuf, s, len); sbuf[len] = '\0';
   344    fputs (sbuf, fp);
   345    if (save_context_p)
   346      saved_append (sbuf);
   347  #else /* non MVS */
   348    fputs (s, fp);
   349    if (save_context_p)
   350      saved_append (s);
   351  #endif /* MVS */
   ...
 /tmp/x/wget-1.9 $ gmake
   ...
 /tmp/x/wget-1.9 $ ./wget http://www.ibm.com
 --19:37:58--  http://www.ibm.com/
           => `index.html'
Resolving www.ibm.com... 129.42.56.216
Connecting to www.ibm.com[129.42.56.216]:80... connected.
HTTP request sent, awaiting response... 302 Found
Location: http://www.ibm.com/us/en/ [following]
--19:37:58--  http://www.ibm.com/us/en/
           => `index.html'
Connecting to www.ibm.com[129.42.56.216]:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 38,384 [text/html]

    0K .......... .......... .......... .......              100%  552.13 KB/s

19:37:59 (552.13 KB/s) - `index.html' saved [38384/38384]
 /tmp/x/wget-1.9/src $     #### SUCCESS! 

Now the problem is obvious. The second argument to logputs is declared const, but it is being modified by __fputs_a. The solution, shown in Listing 10, is to copy it into a local buffer so that it may be freely modified. Note that this fix truncates messages longer than 255 bytes. With this change, wget works as designed.
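If truncation is a concern, the same idea (never hand the caller's read-only string to a function that may write to it) can be applied in fixed-size pieces. The following is a sketch of an alternative, not the change made in this port; the function name fputs_via_copy is hypothetical. It splits on byte boundaries, which is safe for single-byte code sets such as ISO8859-1 and IBM-1047.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical alternative to the fix in Listing 10: write `s` to `fp`
   through a writable local buffer, one chunk at a time, so that no
   part of the message is dropped. Returns 0 on success or EOF on a
   write error. */
int fputs_via_copy(const char *s, FILE *fp)
{
    char buf[256];
    size_t left = strlen(s);

    while (left > 0) {
        /* Leave room for the terminating null byte. */
        size_t n = left < sizeof buf - 1 ? left : sizeof buf - 1;

        memcpy(buf, s, n);
        buf[n] = '\0';
        if (fputs(buf, fp) == EOF)
            return EOF;

        s += n;
        left -= n;
    }
    return 0;
}
```

Inside the __MVS__ branch of logputs, a call such as fputs_via_copy (s, fp) would take the place of the copy-and-fputs sequence. The saved_append call would still need a writable copy of its own if it also modifies its argument.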


Summary and lessons learned

Porting software, particularly free software, to a new platform is an interesting journey. This project has covered configuration, platform libraries, standard headers, code page issues, built-in code set features, network interactions, and debugging. These issues are the most likely sources of trouble in real-world software ports. However, Ben Franklin's wisdom remains fresh and relevant: an ounce of prevention is worth a pound of cure.

What lessons can you learn to make your ports clean, quick, and trouble-free? The experience with porting wget suggests the following:

  • Pay attention to ASCII/EBCDIC. It is important to define all the aspects of the issue prior to applying a solution.
  • List the system calls and system functions in use. Some functions may not always work exactly the same on all platforms. It is best to analyze esoteric functions for common or standardized alternatives.
  • List the header files in use. As with system functions, header files may define system calls differently on different platforms. Most of these troubles are evident during configuration. We encourage you to study the usage of header files and look for anomalies such as non-standard headers.
  • Study the operational environment. This includes various factors such as network access, terminal I/O, file/database interaction, platform security, and error handling.

This brings us to the end of Part 1. Stay tuned for Part 2, which explores more advanced topics such as multi-threading, file locking, and PKI security.

Acknowledgements. We are thankful to Messrs. C. A. Goodrich, J. P. Dewan, J. M. Hertzig, D. J. Berkley, and G. Chou for their comments and suggestions.

Resources

