Dynamically speaking

Gary Pollice, Professor of Practice, Worcester Polytechnic Institute
Gary Pollice is a professor of practice at Worcester Polytechnic Institute, in Worcester, MA. He teaches software engineering, design, testing, and other computer science courses, and also directs student projects. Before entering the academic world, he spent more than thirty-five years developing various kinds of software, from business applications to compilers and tools. His last industry job was with IBM Rational Software, where he was known as "the RUP Curmudgeon" and was also a member of the original Rational Suite team. He is the primary author of Software Development for Small Teams: A RUP-Centric Approach, published by Addison-Wesley in 2004. He holds a B.A. in mathematics and an M.S. in computer science.

Summary: From The Rational Edge: Read about three of the most popular programming languages in use today -- the dynamic languages Perl, Python, and Ruby. Why are they used? What do they have in common, and what makes each one unique? This content is part of The Rational Edge.

Date:  15 Jun 2007
Level:  Introductory

The way we write programs has changed significantly since the days when many of us chose to enter the software development field. Not only have the development tools changed, but also the languages we use to express our solutions to complex problems.

Expert software developers today are multilingual. They usually have experience with at least a handful of programming languages for developing applications. It's interesting to see how one class of languages, called dynamic languages, has become the choice for implementing many applications.

This month we'll look at this class of languages, why they're popular, and focus on three of them -- Perl, Python, and Ruby -- in enough depth to understand some of their main features and their differences from each other.

What are dynamic languages?

Technically, a dynamic language is one that is capable of modifying its capabilities at run time. What does this mean? Consider LISP, one of the earliest languages that exhibited this ability. One of the main features of LISP is its ability to treat instructions as data. A LISP program can read LISP statements from a text file and execute them as if they were part of the original program. The program can also create new expressions in its memory and execute them on the fly. The same holds for the newer dynamic languages: a Perl program, for example, can read in a text file that contains Perl code and then execute the code in that file, just as if the programmer had written it as part of the original program. LISP is a functional programming language, but that is not a necessary condition for a dynamic language. Many languages built on other programming paradigms, such as object-oriented and imperative, are also dynamic languages.
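The idea of treating program text as data can be sketched in Python, one of the languages discussed later in this article. This is only a minimal illustration: the names source, namespace, and double are made up for the example, and the text could just as easily be read from a file. Note that executing text from an untrusted source this way is unsafe.

```python
# A program that holds new code as text and then runs it.
# The names source, namespace, and double are illustrative assumptions.
source = """
def double(n):
    return n * 2
"""

namespace = {}             # the new definitions land in this dictionary
exec(source, namespace)    # run the text as if it had been written in the program

double = namespace["double"]   # the function did not exist until exec ran
result = double(21)
print(result)
```

The function double is not part of the program as written; it comes into existence only when the text is executed, which is the run-time modification the definition above describes.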

Dynamic languages often have high-level constructs that make programs written in them very compact. That is, the programmer does not need to write a lot of source code to produce quite sophisticated, complex programs. 1

Many people use the term scripting language when they are really talking about dynamic languages. They do this because many of the dynamic languages are used for writing small- or medium-sized scripts that can be written quickly for a specific purpose. Today, however, we increasingly use these languages for more complex production systems.

Language popularity

Every month, Tiobe Software publishes an index of programming languages. 2 The index gives some indication of the popularity of a programming language based on several factors. The index does not indicate that the most popular language is, in fact, the one that is used most often to write software, but it does give some indication of the software development community's interest in different languages. 3

Five of the top ten languages in the June 2007 index are dynamic languages. This may surprise many of you. These five are PHP, Perl, Python, JavaScript, and Ruby. Each of these languages started out with different goals. And, like many good ideas that stand the test of time, these languages have expanded to support many other goals. In fact, for many applications where a dynamic language is a good choice, the similarities usually outweigh the differences. If we look at the top four languages in the Tiobe index -- Java, C, C++, and (Visual)Basic -- there are usually very good reasons for choosing one over the other for a particular problem we face. But the same cannot be said for many of the dynamic languages. The choice often comes down to personal preferences or the requirement that a language be used because it has already been used in the organization for existing applications.

In the rest of this article I will look at features that are common to three of the languages, and I will also present some features that are unique to each of them. Why, you might ask, did I choose Perl, Python, and Ruby? The simplest answer is that I've actually written some code in each of these languages. I wrote quite a bit of Perl in the 1990s and have written a good bit of Ruby in recent years. I've chosen Python to complement these two because it, like Perl, has been around for a while and has some of the features that I think made Perl so popular and which we now see in Ruby.

My treatment of these three languages is not meant to be exhaustive; I simply hope to give you the flavor of each of them. I hope this will whet your appetite to learn more about these languages and how they can help you solve problems with less effort.

Let's begin the tour.

Every language has a purpose

One thing we learn in language design courses is that every language should be designed for a purpose. While I've seen some people create a new language "just for the fun of it," most language designers create them for a reason.

There are three common reasons language designers give for creating new languages.

The first reason is that existing languages do not solve a certain type of problem. We continually face new challenges due to technology changes and advances in fundamental computing methods and paradigms. Languages need to evolve in order to accommodate the changes effectively. When something new comes along that is not supported by existing languages, programmers will seek out new ways to solve problems with the new technologies and methods, often resorting to developing a new language. A good example of this is the creation of Smalltalk, which was designed to support the object-oriented paradigm as well as provide a rich set of capabilities for building graphical applications on bit-mapped displays.

The second reason commonly given for developing a new language is that existing languages have become stale and bloated. Programmers often talk about "code smells." That is, code, like rotting fish, begins to smell over time. We use refactoring techniques, design patterns, and other computational methods to remove the smelly code segments from our applications. Languages suffer similar maladies. Languages evolve over time, as new features get added to address some new technology or paradigm. When new features don't quite "fit" with the rest of the language, it begins to smell. At some point a bright programmer decides it's time to replace the language with a new one. (Perl, while it did not replace any single language that was becoming stale, combined features of several little languages like sed, awk, and shell scripts, along with a lot of C-style expressions.)

The third reason for developing a new language is the least convincing: People do it because they can, and their language mirrors their unique view of the world and how we should write programs. While this is an intellectually worthwhile goal, it does not usually add to the body of useful languages. Most languages that fall into this category never see the light of day. However, the genesis of Ruby, which I'll discuss further in this article, is partly explained by this reason.

What are the purposes of the three languages we're considering? 4 Well, Perl is the oldest of the three and has been around since 1987. Larry Wall, the creator of Perl, was a programmer at Unisys. He had been using the Unix tools sed, awk, and sh, along with programs written in C, to perform a variety of tasks. He became frustrated at switching from one language to another and decided to create a language that would do what these languages did, and more. His philosophy was that this new language should make it easy to do simple things while still making it possible to solve hard problems. Early uses of Perl included text processing and system administration tasks.

Perl developed into the language of choice for writing CGI scripts for Web applications and had a large following among programmers who work with network applications and databases. The latest version of Perl supports objects, closures, and many other advanced language concepts. This evolution has resulted in a very large language that does not, in my opinion, hold together cleanly as a single programming system. (Of course, I expect Perl devotees will disagree with this.)

Python came out a few years after Perl. Its creator, Guido van Rossum, wanted a language that, like Perl, would make it easy to do easy tasks yet also support hard tasks. However, van Rossum wanted his language to be object-oriented and to emphasize readability over speed and expressiveness. Python is no slower than many other dynamic languages, but van Rossum wanted readability and understandability to be primary attributes of a Python program. One of Python's features that enforces readability is the requirement that code be indented consistently. For example, the two examples in Figure 1 are different:

First example:

if x == y:
    print 'hello'
    print 'world'

Second example:

if x == y:
    print 'hello'
print 'world'

Figure 1: Two examples of Python code

In Figure 1, if x is equal to y, then both of the examples print out two lines with "hello" on the first line and "world" on the second. If x is not equal to y, then the first example prints nothing while the second prints a single line with "world" on it. This "feature" seems to invite errors in my opinion, which is one of the main reasons why I have not done much programming in Python.
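The behavior described for Figure 1 can be checked with a small sketch. The function names first_example and second_example are assumptions made for this illustration, and the functions collect their output in a list instead of printing it so that the two cases are easy to compare.

```python
def first_example(x, y):
    """Both statements are indented, so both belong to the if."""
    lines = []
    if x == y:
        lines.append('hello')
        lines.append('world')
    return lines


def second_example(x, y):
    """Only the first statement belongs to the if; the second always runs."""
    lines = []
    if x == y:
        lines.append('hello')
    lines.append('world')
    return lines


print(first_example(1, 2))    # x is not equal to y
print(second_example(1, 2))   # x is not equal to y
```

The only difference between the two functions is the indentation of one line, which is exactly the kind of slip the paragraph above worries about.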

Python has evolved and become the language of choice for many developers working on network, multi-media, database, and system programming applications. Python is also a good language to provide glue between components. Like Perl, Python interfaces with other languages, such as C. Van Rossum now works at Google where he is continuing to work on Python to enhance its use in the world of applications that interest Google.

Ruby is the latecomer to our trio. It was developed by Yukihiro "Matz" Matsumoto, who began working on the language in 1993 and released it in 1995. Ruby began life, in some sense, for the third purpose I stated above -- that is, Matsumoto had always wanted to design a language. He was competent in Perl and Python, but he wanted a language that, in his opinion, was "more powerful than Perl and more object-oriented than Python." 5 He also stated that he wanted Ruby to be a language that makes programming easy and fun. Others say that Ruby does what you expect it to do. You can usually guess the proper syntax for the command you want. This is called the Principle of Least Surprise.

Ruby is a pure object-oriented language. By that I mean that everything in Ruby is an object. After more than twenty years, I now understand why the Smalltalk programming community was so enthusiastic about Smalltalk. I was never able to really understand Smalltalk, mostly because of its very cryptic (to my mind) syntax. Ruby, however, is like Smalltalk for mere mortals. Now, I "get" it, and it is fun.

For many people, Ruby has replaced Perl and Python as a language for writing quick scripts. But Ruby has been used more recently to implement complete applications and, along with the Rails Web application framework, it is being adopted by many organizations as the language of choice for their Web-based applications. 6

Some common features

All languages must have some capability to control the flow of computation and represent data. All three of our languages have similar capabilities. They support numbers of different types (integers and floating point), characters, strings, and so on. They also have some built-in data structures like arrays, hashes, and some sort of support for record-like structures.

Many programmers who are familiar with languages like C may not know about hashes, which are also called dictionaries or associative arrays. Hashes are structures that are like arrays, but the values in the hash are accessed by key rather than numeric position. So, if you have a hash called grades that contains the final grades for all students in your class, you can set the grade for J. Doe to an A with a statement like:

grades['J. Doe'] = 'A'

Hashes are incredibly useful for accessing data that does not have an obvious ordering. Java programmers are familiar with hashes if they have used any of the Map classes.
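As a minimal sketch, here is the grades hash in Python, where a hash is called a dict. The second student and the default value 'n/a' are hypothetical additions for the sake of the example.

```python
grades = {}                      # an empty hash (a dict in Python)
grades['J. Doe'] = 'A'           # store a value under a key
grades['A. Smith'] = 'B+'        # hypothetical second student

doe_grade = grades['J. Doe']                 # look up by key, not by position
missing = grades.get('X. Nobody', 'n/a')     # supply a default for an absent key
has_smith = 'A. Smith' in grades             # test whether a key is present

print(doe_grade, missing, has_smith)
```

Using get with a default avoids the error that a plain lookup raises when the key is absent.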

Our three languages also share support for regular expressions and pattern matching. Many systems administration tasks involve manipulating text in system files, log files, configuration files, and so on. Much of this information is formatted in such a way that a program can use regular expressions to find information in it. For example, if you wanted to determine if a string, str, contains the substring "IBM" followed by the string "Rational," but not necessarily immediately after it, you could create a small Perl program as follows:

$str = 'This is a sentence containing IBM and Rational in it.';
if ($str =~ /IBM.*Rational/) {
    print "True"
} else {
    print "False"
} 

Here the characters between the slashes form a simple pattern. The only special characters in the pattern are the '.*' which indicates a sequence of zero or more characters of any type. The program above prints "True." Regular expressions should be familiar to most programmers who have ever used a Unix-like system. They can be quite cryptic, but are worth learning because of the inherent power they give you for text processing.
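For comparison, the same test might look like this in Python, using the standard re module. The function name mentions_ibm_then_rational is an assumption made for this sketch. As in Perl, the '.' does not match a newline by default, so both words must appear on the same line.

```python
import re


def mentions_ibm_then_rational(text):
    """Return True if 'IBM' appears with 'Rational' somewhere after it."""
    return re.search(r'IBM.*Rational', text) is not None


sentence = 'This is a sentence containing IBM and Rational in it.'
print(mentions_ibm_then_rational(sentence))
```

The search function looks for the pattern anywhere in the string, which matches the behavior of the Perl =~ operator in the example above.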

Each language under consideration provides support for various types of input and output. Each also has some ability to format data so neat reports can be produced. Indeed, when Perl was first released, people thought "Perl" stood for Practical Extraction and Report Language. In fact, the name "Perl" really has no special meaning. It's not an acronym. Wall was originally going to name his language Pearl, based upon the parable of the Pearl of Great Price from the Gospel of Matthew, but he discovered another language was called PEARL and decided to name his language Perl.

Valuing differences

While there are many more common features I could describe in these three languages, I'd like to spend the rest of my time here talking about some of the differences that make each language unique. I believe that some of these differences yield strengths in a given language, while others spell weakness. And yes, some of my reactions are purely subjective, but others are based on a bit more thoughtful reasoning.

Code readability has always been a sore point for me. Like Guido van Rossum, I value the ability to read and understand code over trying to get the last little bit of performance out of my programs at the expense of clarity. Sure, there are some cases where you simply have to make the program as optimal as possible, but I do that only after confirming the necessity.

All of the languages we're considering can be written in a readable way. However, just because they can be written that way doesn't mean that they are in practice. I mentioned earlier the need to indent Python code. Since I always indent my code, I don't have a problem with this. I do, however, have a problem with the potential for errors like the one I pointed out in Figure 1. Each of the languages has more than one integrated development environment (IDE) that supports writing programs in it. The Eclipse platform has plug-ins that support each of them. When you have an IDE or a syntax-sensitive editor, issues like indentation are usually taken care of for you.

As far as readability, Python wins first place, and Perl is the loser, in my opinion. I often classify Perl as a "write-only" language. Perl programmers have adopted a culture where developing the most compact, terse code is valued highly. Usually, this is done at the expense of readability. Adding to the problem is the use of special variables in Perl. Once you become used to the Perl idioms, many of the special variables are known to you, but for someone learning Perl, trying to read a Perl program is like trying to decipher hieroglyphics. Perl is a language that gives technical folks great glee in showing how they're able to do really neat things with very cool, non-obvious code. The script in Figure 2 shows how cryptic a fairly small piece of Perl code can be. 7

1 #!/usr/local/bin/perl
2 $op = shift;
3 for (@ARGV) {
4     $was = $_;
5     eval $op;
6     die $@ if $@;
7     rename($was,$_) unless $was eq $_;
8 }

Figure 2: Eight lines of Perl code

Ok, let's look at Figure 2 and consider some of the things that might not be obvious to non-Perl programmers. Line 2 shifts the first word off of the arguments to the script and places it in the scalar variable op. We know that the value is being shifted off of the argument list because no argument is given to the shift function. When this happens at the outermost level of a script, it automatically assumes that the argument list is the parameter. We also know that op is a scalar variable because it is prefixed with the $. If it were prefixed with a @, it would be an array, and if it had a % as a prefix, it would be a hash.

Lines 3 through 8 are a loop. This loop iterates over the array, ARGV, which holds the remaining arguments passed to the script on the command line. Line 4 sets the variable was to the value of the current item in the argument list. The $_ is another special variable that has different meanings in different contexts; in a for loop with no named loop variable, it refers to the current element.

Lines 5 and 6 are perhaps the most confusing. Line 5 uses the eval statement, which is one of the most powerful features of dynamic languages. It allows you to execute text that you read in or create as if it were part of your program. What is being executed? It's the first argument that we moved to op. If you didn't understand the purpose of the script, you would have trouble understanding what's happening. This line indicates that the first argument should be some Perl expression that can be applied to a value -- the current argument, which we copied to was (and which is also still in $_). The expectation is that the expression is some sort of regular expression substitution, like s/txt/pl/, which replaces the first occurrence of "txt" with "pl" in the operand. Because a substitution with no explicit target operates on $_, evaluating it changes $_ in place, while was keeps the original name.

If there is an error in the evaluation, line 6 will print out the error message (stored in the special variable, $@) and cause the script to exit.

This is a very powerful script in only a few lines of code. For example, on a Linux system you could enter the following command line (assuming the name of your script is rename.pl and is executable) to change the year part of the names of a set of files:

rename.pl s/2006/2007/ HW1-2006.txt HW2-2006.txt HW3-2006.txt

Rather than typing each file name, you can also let shell wildcards or the output of other Linux commands supply the list of file names.
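The central idea of Figure 2, evaluating a user-supplied expression against each file name, can be sketched in Python as well. This is an illustration under stated assumptions, not a port of the Perl script: the function plan_renames and the variable name are hypothetical, the expression is Python rather than Perl, and the sketch only computes the planned renames instead of touching any files. As with Perl's eval, the expression must come from a trusted source.

```python
def plan_renames(expression, names):
    """Apply a user-supplied Python expression to each file name.

    The expression sees the current file name as the variable `name`,
    much as the Perl script exposes it as $_.
    """
    plan = []
    for name in names:
        new_name = eval(expression, {}, {"name": name})   # run the text as code
        if new_name != name:                              # skip unchanged names
            plan.append((name, new_name))
    return plan


files = ["HW1-2006.txt", "HW2-2006.txt", "notes.txt"]
plan = plan_renames("name.replace('2006', '2007')", files)
for old, new in plan:
    print(old, "->", new)
```

To perform the renames for real, each pair in the plan could be passed to os.rename from the standard library.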

I've spent a lot of time with Perl here. I wanted to illustrate both the power it has as well as the difficulty you might have reading it. And I don't want you to think that I completely shun Perl. I wrote all of the code for my M.S. thesis in Perl. It was a great language for manipulating text and generating source code in other languages, like C and C++. I might still choose Perl today for that purpose, although I think I'm more comfortable with Ruby now.

Python has most of the features of Perl, but adds many more advanced features: object orientation, which is sure to make programmers who want it happy; introspection (like reflection in Java), where you can write programs that examine and manipulate programs; lambda (anonymous) functions; and so on. If you are a language maven who likes to use all of the different programming paradigms, Python is a good choice. It is also supported by many modules and libraries that provide the ability to work with advanced technologies like different audio file formats, video files, and many others. Python has several extensions, like Pyro, which is a development platform for robotics exploration. 8
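A short sketch may make introspection and lambda functions more concrete. The Student class and its data are hypothetical, invented only for this illustration; type, hasattr, getattr, and sorted are standard Python built-ins.

```python
class Student:
    """A hypothetical class used only for this illustration."""

    def __init__(self, name, grade):
        self.name = name
        self.grade = grade

    def summary(self):
        return self.name + ': ' + self.grade


s = Student('J. Doe', 'A')

# Introspection: ask the object about itself at run time.
type_name = type(s).__name__
has_summary = hasattr(s, 'summary')
method = getattr(s, 'summary')     # look the method up by its name as a string
text = method()

# A lambda is a small anonymous function, used here as a sort key.
students = [Student('B. Roe', 'C'), s]
by_grade = sorted(students, key=lambda st: st.grade)
first = by_grade[0].name

print(type_name, has_summary, text, first)
```

Because the method is looked up by a string, the name could just as well come from user input or a configuration file, which is what makes introspection useful for writing programs that work on other programs.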

Figure 3 shows a program written in Python that is similar to the Perl program above. It renames files in a directory, according to some regular expression match. 9

 1 #!/usr/local/bin/python
 2
 3 # Python Rename File 1.0 
 4 # Author: Douglas Palovick
 5 # License: GPL http://www.gnu.org/licenses/gpl.txt
 6
 7 import re, os
 8 rxin = raw_input('enter a regex to search for:\n')
 9 foo = re.compile(rxin)
10 newname = raw_input('enter a new base name:\n')
11 a = 0
12 for fname in os.listdir(os.getcwd()):
13     allowed_name = re.compile(rxin).match
14     if allowed_name(fname):
15         # newfname = string.lower(re.sub(foo,
16                                    # '', fname))
17         # b = (newname + str(a))
18         a += 1
19         c = os.path.splitext(fname)
20         b = (newname + str(a) + c[1])
21         os.rename(fname, b)

Figure 3: A program written in Python

This program is not as terse as the Perl version and does not have the special variables and assumptions. There are some parts I want to point out. Notice line 9. This uses the regular expression module (re) to create a compiled regular expression from the regular expression the user enters in line 8. Java programmers will see the similarity to the Java Pattern and Matcher classes. I'll leave it to the reader to examine the rest of the code.
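The compile-then-match pattern can be sketched on its own, without prompting the user or renaming anything. The function numbered_names and the sample file names are assumptions made for this illustration. Unlike line 13 of Figure 3, which compiles the pattern again on every pass through the loop, this sketch compiles it once and reuses the result.

```python
import os
import re


def numbered_names(pattern, base, names):
    """Pair each matching name with a new numbered name, keeping its extension."""
    matcher = re.compile(pattern)           # compile once, reuse for every name
    result = []
    count = 0
    for name in names:
        if matcher.match(name):             # match() anchors at the start of the name
            count += 1
            extension = os.path.splitext(name)[1]
            result.append((name, base + str(count) + extension))
    return result


pairs = numbered_names(r'HW', 'homework', ['HW1.txt', 'notes.txt', 'HW2.doc'])
for old, new in pairs:
    print(old, '->', new)
```

Returning the pairs rather than renaming immediately makes it possible to inspect the result before any file is changed.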


Conclusion

Neither of the above examples shows off the true power of either Perl or Python, nor was that my intent in a short article. But I hope these examples and comparisons whet your appetite to find out more. I'm going to leave off exploring Ruby until next month when we look at it and the Rails framework.

Before leaving this month's column, I want to point out that there is an archived zip file accompanying this article. It contains four files. One is a PDF file that describes a problem that I gave to three of my current and former students. Each of them has experience in at least one of the languages. I asked them to write a program to solve the problem using one of the dynamic languages that I specified. The three files that accompany the specification are their solutions. Take a look at them and think about which you prefer.

Acknowledgments

I want to thank the three student programmers for taking the time to write the programs provided as a supplement to this article. Tyler Boone wrote the Perl program, Tom Rybka wrote the Python one, and Jim Schementi wrote the Ruby implementation.

Notes

1 In 1994 I attended the USENIX Symposium on Very High Level Languages where many of the papers presented discussed dynamic languages.

2 http://www.tiobe.com/. The Web pages provide information about how the index is calculated and details about their rating system.

3 I'm using software community very broadly here to include anyone who wants to try their hand at writing a program.

4 You can find links to a lot of information about these and other dynamic languages on Wikipedia's dynamic language page: http://en.wikipedia.org/wiki/Dynamic_language.

5 Foreword to the first edition of Programming Ruby, reprinted in the second edition, Pragmatic Programmers, 2004, ISBN 0974514055.

6 Next month we'll discuss Ruby on Rails in more detail.

7 This script, called rename, can be found at http://user.it.uu.se/~matkin/programming/PERL/. The Web page says that it was written by Larry Wall, Perl's creator.

8 See http://pyrorobotics.org/ for information on Pyro.

9 This program can be found at http://www.palovick.com/code/python/python-rename-files.php.



