I attended the Agile 2006 conference in Minneapolis, Minnesota, this summer and had a great time. As I was having a conversation with Ron Jeffries about the approach I take in teaching software development, Bob Payne, who is with the Agile Toolkit, asked if I'd be willing to record a podcast about the subject.1 A year ago in this column, I described my teaching approach2 so I won't cover that ground again. But I would like to focus on an issue that arose while I was recording the podcast: programmers' all-too-common inability to write clean, maintainable code.
How did that topic come up? One of the people listening to us record the podcast simply asked me why students coming out of our colleges today don't write good code (that is, readable, well-formatted, elegant code). "Don't students realize there is a penalty in the real world for writing bad code?" he wondered. "What are you professors in academia doing wrong that the students don't appreciate the need for elegance in their code and solutions?"
In this article, I'll consider some of the problems and consequences of bad code, and I will propose some possible solutions.
In some ways, clean code -- like beauty -- is in the eye of the beholder. Experienced programmers can look at a program's source code and tell whether it's readable or not. They also quickly develop an opinion of whether the code is efficient, whether it's well structured and even elegant in its simplicity. All of these characteristics are hard to define, but you usually get agreement among software developers when you present them with code as to whether it is clean or not.
There are many things that contribute to clean code. Some are universal and apply to any type of programming language or problem domain for which you are developing software. Some of the properties of clean code depend upon the specific programming language.
Let's take a quick look at two short pieces of code that do the same thing: score sets in tennis.3 Each one has a method to record a game won by a player and another method to return a message containing the score. Both assume that all inputs are valid, and neither checks for invalid state. They are written in a design-by-contract style, where the burden of using the class correctly is on the client. They both pass the same set of unit tests. Now, take a look at the two listings below. Which do you think is "cleaner," Listing 1 or Listing 2?
Listing 1: First tennis set scoring example
Listing 2: Second tennis set scoring example
Listing 1 is written in the naive style a beginning programmer might use, with code that seems to repeat itself. It's not necessarily complex, but it seems cluttered to me and rather inefficient. Listing 2 has more complex conditions, but if you understand Java, it reads quite well. The only part that you might question is the first part of the condition in the last else if. It turns out that when you get to this clause, one player has won.
Neither implementation is wrong. In fact, both are small enough that they're not much more than toy examples, so debating how clean the code is isn't very useful for these listings, except to whet your appetite about what makes one implementation cleaner than another.
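To make the comparison concrete, here is a minimal sketch of the kind of class described above: one method records a game won, another returns a score message, and the client is responsible for correct use. It is not the code from either listing. The class name TennisSet, the method names, the message wording, and the assumption that a tiebreak game decides the set at 6-6 are all hypothetical choices made for illustration.

```java
// Hypothetical sketch; not the code from Listing 1 or Listing 2.
final class TennisSet {
    private static final int GAMES_TO_WIN = 6;

    private final String[] names;
    private final int[] games = new int[2];

    TennisSet(String firstPlayer, String secondPlayer) {
        names = new String[] { firstPlayer, secondPlayer };
    }

    /**
     * Records a game won by the given player (0 or 1).
     * Contract: the caller passes a valid index and the set is not yet over.
     */
    void gameWonBy(int player) {
        games[player]++;
    }

    /** Returns a message describing the current score of the set. */
    String score() {
        int leader = games[0] >= games[1] ? 0 : 1;
        int high = games[leader];
        int low = games[1 - leader];

        if (high == low) {
            return "Set is tied " + high + "-" + low;
        }
        if (isSetWon(high, low)) {
            return names[leader] + " wins the set " + high + "-" + low;
        }
        return names[leader] + " leads " + high + "-" + low;
    }

    // Assumption: a tiebreak game is played at 6-6, so 7-6 ends the set.
    private static boolean isSetWon(int high, int low) {
        boolean wonByTwo = high >= GAMES_TO_WIN && high - low >= 2;
        boolean wonTiebreak = high == GAMES_TO_WIN + 1 && low == GAMES_TO_WIN;
        return wonByTwo || wonTiebreak;
    }
}
```

Naming the winning condition (isSetWon, wonByTwo, wonTiebreak) is one way to keep a compound condition readable without adding a comment to every branch.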
There are many articles, Websites, and books that discuss clean code. Everyone has an opinion about what makes code clean. As I've looked over some of these -- I've been reading more and more code for my classes and my research -- I believe there are some general principles you can follow. I will offer them to you later in this article.
I think there are three reasons why we get dirty or sloppy code:
- Time pressure
- Lack of training
Let's look at each of these.
Software projects slip. If you use a waterfall process, they slip more than if you use an iterative approach, but most projects slip somewhat, regardless of the lifecycle model you adopt. With an iterative, incremental lifecycle, the slip may be in functionality and not in the date delivered.
When a project begins to slip, the developers -- and the whole team -- tend to take shortcuts. This is human nature. There's a saying that goes: "When you're up to your neck in alligators, it's hard to remember that you were there to drain the swamp." As pressure is applied, something has to give, and we tend to neglect those things that do not provide immediate relief -- we just attack the alligators. I've noticed that process is a typical victim of this sort of neglect. The process may require taking specific actions or producing certain artifacts, but if we can avoid them we will. We have good intentions and plan to come back after the time crunch at hand subsides, but we seldom do that.
Clean code goes into the same category as process when it comes to things we can drop if we have to. Instead of using readable names like "netPay" for variables, we type "np" since it's shorter and we save valuable time. We stop providing meaningful comments or describing the algorithms. The deadline looms, and we just have to make our code work.
This approach to coding to a deadline is short-sighted and will come back to haunt us later.
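The cost of the "np" shortcut is easy to see in a small sketch. The Payroll class and the pay formula below are hypothetical, chosen only to show the same arithmetic written twice, once with terse names and once with readable ones.

```java
// Hypothetical example; the pay formula is an assumption made for illustration.
final class Payroll {
    // Terse version: the reader must guess what g, t, d, and np stand for.
    static double np(double g, double t, double d) {
        return g - g * t - d;
    }

    // Readable version: the same arithmetic with self-describing names.
    static double netPay(double grossPay, double taxRate, double deductions) {
        double taxWithheld = grossPay * taxRate;
        return grossPay - taxWithheld - deductions;
    }
}
```

Both methods compute the same value; the only difference is how much the next reader has to reconstruct before trusting or changing the code.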
Whether you learn to program in high school, the university, or in some other way, chances are you really don't learn how to write clean code. Part of this is time pressure on the educational institution. There is only so much that can be taught in a period of time, and coding style, readability, and other things that typically contribute to clean code get dropped from the instruction process, just as process gets dropped when the deadline approaches on a project.
If you look at the syllabi from programming courses at various educational institutions, you will find several that list programming style as one of the course topics or goals. For most that I've looked at more deeply, however, it seems that instructors seldom follow through on this goal. Assignments and grading criteria don't mention any penalty or reward for clear, readable, well-written code. Commercial offerings of programming courses are even worse than the academic ones. They usually have less time for the course and are focused on giving you the bare essentials to enable you to go back to your office and start being productive in the language du jour.
In university computer science curricula, most courses are not introductions to programming in one language or another. Rather, they assume students have sufficient grounding in one language or another, so a course in operating systems or networks seldom includes any time devoted to coding style or readability. Students get rewarded for solving the problem, period. There are no style points.
Don't get me wrong; computer scientists appreciate elegant code. I'd like to think that software engineering professors are especially tuned in to code cleanliness. But this appreciation does not necessarily require the code to be readable or maintainable. We often value elegant algorithms or technical elegance over code readability and structure.
The perceived value of getting the code right instead of making it pretty is not unique to the university. It occurs in industry, and to my dismay it can also be seen in programming competitions. Consider this excerpt on the scoring for the ACM International Collegiate Programming Contest world finals:
"Teams are ranked according to the most problems solved. Teams placing in the first twelve places who solve the same number of problems are ranked first by least total time and, if need be, by the earliest time of submittal of the last accepted run."4
There's nothing in the statement that rewards clean code. I realize that "judging" code for its elegance and readability is not an easy task and is, in fact, rather subjective. And even if you argue that writing elegant code gives you a better chance at getting the solution sooner, I know of no studies done on this. My point is, for important programming venues, clean code is too often not even a consideration.
So, there's enough blame to go around for why we get too much dirty code. What can we do about it? I suggest there are three things we can do:
- Make writing clean code part of your personal process.
- Teach how to write clean code.
- Value clean code.
Now I'd like to explain these a bit.
The Agile community picked up on the value of keeping your code clean as a part of following their principles, one of which is:
Continuous attention to technical excellence and good design enhances agility.5
This principle is manifested in practices such as the continual refactoring practice of eXtreme Programming (XP). Programmers who work on an XP team are expected to take the time to keep the code as clean and elegant as possible every time they work on it. Shortcuts are not allowed and are certainly not valued or rewarded. This, of course, assumes that the team is really following the XP practices, even when deadlines loom. But there is no guarantee that the culture of the organization doesn't revert to fighting the alligators when the time crunch hits.
If you make refactoring and writing good comments a part of your personal process, you will get to a point where you are unable to write code in any other way. You will take into account the time it takes you to clean up your code and insert comments so that your estimates will be accurate and reasonable.6
Writing code is like performing any other type of work. It takes practice, criticism, and more practice. Richard Gabriel discusses many of the similarities in writing prose, poetry, and software code in his book, Writers' Workshops & the Work of Making Things: Patterns, Poetry....7 Treating these various forms of writing as an art and insisting on practice with collaborative techniques can be difficult in the classroom, but any means by which teachers can elevate the software coding discipline to the level of art is probably worthwhile.
In order to write well, we must be able to read well. We must understand how to read both the standards and the idioms used in a particular language before we can grow proficient in writing. Reading and writing code is no different. We must learn the language and how to use it well. Then we must become good code readers if we are to become great code writers. We don't spend enough time asking our students to read code and to reflect on and debate its quality and elegance. One of the best software developers I've ever had the pleasure of working with is a copious code reader. He learns how to program better by reading code from various projects and then thinking about how he might improve upon it.
Some schools -- too few -- offer courses and seminars that have titles like Software Studio. In these courses, students come together like artists and writers and share their work with others. They open them up and learn how to give and receive constructive criticism. This is a hard course for many bright students with the typically large egos we see in our industry, but it offers a wonderful learning experience.
We have begun a coding dojo (dojo: a formal training hall in Japan for students of the martial arts) at WPI this year. The coding dojo is something that I learned about at the Agile 2006 conference this summer.8 We meet weekly and tackle a simple coding challenge. We often write the code as a group. The code I presented earlier in Listing 2 is from one of those recent meetings. The only problem I have with our dojo meetings is that they are too short. But the dojo is a great way to get people to think about and incorporate clean coding techniques into their own work. It can be run anywhere and would be a fantastic lunchtime activity in a company.
Most assignments in university courses are small, and as I've said, code elegance and readability aren't really considered that important. But as educators, we need to find ways to increase the code base that our students work with. For instance, we could have them maintain existing code bases and extend them. They can feel the pain of poorly written code and what it does to productivity after it's written. This happened with my students last year, and I realized this year that I needed to enforce code quality in the projects.
The results have been dramatic. I've given my software engineering students base code written by my object-oriented analysis and design students and they have, within a week, begun to use it effectively. That never happened last year. Part of each OOAD project grade -- given weekly -- was based upon the students' Javadoc comments and their code style.
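For readers who have not used Javadoc, a comment of the kind that helps someone pick up unfamiliar code might look like the sketch below. The GradeBook class and the weightedAverage method are hypothetical and are not taken from the student projects. The tags shown (@param, @return, @throws, and the inline {@code}) are standard Javadoc tags.

```java
// Hypothetical example of a documented method; not from the course projects.
final class GradeBook {
    /**
     * Computes the weighted average of a set of scores.
     *
     * @param scores  the score earned on each assignment
     * @param weights the weight of each assignment, in the same order as {@code scores}
     * @return the sum of each score times its weight, divided by the total weight
     * @throws IllegalArgumentException if the arrays are empty, differ in length,
     *                                  or the total weight is not positive
     */
    static double weightedAverage(double[] scores, double[] weights) {
        if (scores.length == 0 || scores.length != weights.length) {
            throw new IllegalArgumentException(
                    "scores and weights must be non-empty and the same length");
        }
        double weightedSum = 0.0;
        double totalWeight = 0.0;
        for (int i = 0; i < scores.length; i++) {
            weightedSum += scores[i] * weights[i];
            totalWeight += weights[i];
        }
        if (totalWeight <= 0.0) {
            throw new IllegalArgumentException("total weight must be positive");
        }
        return weightedSum / totalWeight;
    }
}
```

The comment states what the caller must supply and what comes back, which is the information a student inheriting the code needs before reading the method body.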
If you are a manager, you need to make sure that the people working for you understand your values. Make it a point to ensure that your developers know that code quality and elegance are important to you. When deadlines approach, stick up for the team and fight for scope management over shipping badly written code just to make a (usually arbitrary, anyway) date.
Find some way of publicizing a reward for great code. Rewards aren't hard to implement. It doesn't take a long time until your team realizes what you value, and they will rise to the occasion and deliver according to those values. Even if the larger company's actions do not support those values, your team will be able to do more in the long run because the maintainability of their code will support future efforts.
I hope this article has gotten you to think about the value of clean code in your systems. I know it's gotten me thinking about ways we might measure the value of clean code. I will address measurement criteria in a future column.
1 See http://www.podcastdirectory.com/podcasts/2918 for this and other podcasts from the conference.
2 See the December 2005 (http://www.ibm.com/developerworks/rational/library/dec05/pollice/index.html) and February 2006 (http://www.ibm.com/developerworks/rational/library/feb06/pollice/index.html) articles.
3 If you play tennis, scoring is something that you understand well, but as we developed this code, we found that there were some very complicated conditions in game, set, and match scoring. Check out http://en.wikipedia.org/wiki/Tennis#Scoring for a short overview.
6 See the August 2004 column on the personal process: http://www-128.ibm.com/developerworks/rational/library/content/RationalEdge/aug04/5585.html
7 Pearson Education, 2002, ISBN 020172183X.
8 See http://wiki.agilefinland.com/?CodingDojo for some information about coding dojos.
Gary Pollice is a professor of practice at Worcester Polytechnic Institute, in Worcester, MA. He teaches software engineering, design, testing, and other computer science courses, and also directs student projects. Before entering the academic world, he spent more than thirty-five years developing various kinds of software, from business applications to compilers and tools. His last industry job was with IBM Rational Software, where he was known as "the RUP Curmudgeon" and was also a member of the original Rational Suite team. He is the primary author of Software Development for Small Teams: A RUP-Centric Approach, published by Addison-Wesley in 2004. He holds a B.A. in mathematics and an M.S. in computer science.