Level: Introductory Gary Pollice, Professor of Practice, Worcester Polytechnic Institute
15 Dec 2007
from The Rational Edge: If we're going to make the most of advances in parallel hardware, we need to do a better job of teaching students how to design and develop software that takes advantages of the hardware.
From The Rational Edge.
One of the ongoing discussions we've been having in our computer science department at Worcester Polytechnic Institute (WPI) regards the role of concurrency in the curriculum. Similar discussions have taken place at other colleges and universities. Here's the fundamental question: How much exposure to concurrency should our students receive in their undergraduate courses, and in which courses should they be given such exposure?
This month I'll look at the importance of concurrency in computer science and software engineering and suggest a few things in store for the next generation of computer scientists.
Background
When I first learned to program, I learned how to write out my algorithms in an ordered sequence of steps that led to the final result. After all, computers were machines that executed one instruction after another -- but they were so fast that most of us couldn't imagine a different paradigm for the problems we were solving. Surely we could imagine faster processors, but Moore's law told us that computers would continue to get faster, and they have.
1
Modern operating systems gave us multitasking capabilities. This improved program performance, not because the computers were too slow but rather they were too fast. The computer spent most of its time idle while auxiliary operations such as disk access were completing. Therefore the designers of operating systems found ways to rapidly switch contexts, partition the memory resources, and use the idle cycles so multiple programs could run, seemingly simultaneously. For most needs, this form of multitasking works just fine. Right now, on my desktop machine, which is a single processor Pentium 4, I have 57 processes running with 634 threads, using just 3% of my processing power and approximately 50% of my physical memory capacity. What I find interesting is that I can only identify about ten of the processes as being ones that I think I asked for. The rest are owned by my operating system and software configuration.
Years ago, we quickly outgrew the capabilities provided by multiple processes. The computer scientists who developed operating systems gave us a boost when they introduced threads -- lightweight processes that run within a process. Threads are segments of execution for the same program and, in theory, allow us to simultaneously execute different operations in the program simultaneously. However, we were still running on a single processor and even if two threads could really execute simultaneously, physical laws prevented it.
Now, we're at a critical point in computing where multi-processors are not only possible, they are practical for all of our computing needs. I would venture to guess that over half of you reading this are reading from a computer with more than one processor. If you have such a processor you have certainly seen your programs speed up. But with so many idle cycles on my desktop processor, why would I need another processor? As we develop programs that use more threads, we're going to encounter more times when threads can run simultaneously. If we're able to provide the ability to execute them at the same time, the performance increase will be noticeable. I see this with my new laptop. It has a dual-core processor with a base processor speed that is quite a bit slower than my single-core desktop. Yet the performance improvement is quite noticeable.
Multi-core processors, where a single CPU chip has several processors sharing resources, are here to stay. David Patterson, a professor at the University of California and former president of the Association for Computing Machinery, offers a very readable view of multi-core technology in 2006 in his paper "Computer Science Education in 21st Century"
2
. There is also a nice summary of the basic trends toward concurrency by Herb Sutter that appeared in Dr. Dobb's Journal in February, 2005.
3
As software engineers, we should be somewhat chagrined by the fact that hardware is so far ahead of software. Surely we needed hardware to execute our programs in parallel before any software benefits could be realized, but now that the hardware is here, we're trying hard to catch up with our software. This is not a new phenomenon. As new processors appeared in the last couple of decades, with new instruction sets, the compiler writers, OS designers, and other system software developers worked hard to make their programs take advantage of the new hardware. Some of the hardware engineers added features to the hardware that took the place of traditional software. The DEC Alpha chip, for example, did quite a bit of code optimization -- entering into an area that traditionally was the realm of compiler developers.
Who really needs concurrency?
It's fair to ask whether we really do parallel
4
computation and when? It's nice to have threads running on different physical processors, but do I, as a computer scientist and software developer, need to worry much about it?
As long as the hardware is able to schedule threads appropriately, it's possible that many programmers will never really have to worry about parallel programming. Most of the problems we deal with on a daily basis are solved quite nicely by our standard sequential programs. This doesn't mean that all programs are amenable to sequential solutions. It also doesn't mean that we won't discover solutions to many problems that heretofore have been practically unsolvable with our computers.
Many problems that require true parallel computing are solved via scientific calculations that perform operations on vectors and matrices. In the '80s and '90s, several special vector processors were developed to support this type of computation. Special compilers for programming languages like FORTRAN were developed to optimize the code to take advantage of the vector processors.
Some companies specialized in massively parallel computing. One of the better known was Thinking Machines Corp. that was founded in 1982 by Danny Hillis and Sheryl Handler. Thinking Machines developed special processors and languages based on LISP, FORTRAN, and C for developing software on their processor. Unfortunately, the number of customers who had problems requiring massive parallelism and could afford machines such as those provided by Thinking Machines were few.
Today, we have more problems that require parallel computing and luckily we have much less expensive hardware that can accommodate our needs regarding parallel computation. A sampling of these include weather prediction, data mining, genetic and biomedical engineering, and artificial intelligence applications.
5
Concurrency in today's computer science curriculum
A typical computer science graduate today will have been introduced to parallelism in a few of their courses. In general, the topic is considered a special or advanced topic that deserves mention, but seldom is there a deep investigation on how to incorporate parallel thinking into the main CS topics.
There are two fundamental types of concurrency that students encounter. The first is concurrent resource allocation. When there are processes competing for limited resources, we need to make those resources available on a fair basis according to some definition of fairness and need. The second type of concurrency problem is how to design programs to effectively use resources that are available concurrently, such as multiple processors. Students today have much more exposure to the first type of problem than the second.
That's because students usually encounter parallel processing and concurrency issues in an operating systems course. It's impossible to understand operating systems design today without considering processes, threads, deadlocking, and other concurrent topics. Courses will typically explore these topics in the context of a specific operating system. Often these are open-source operating systems like Linux or Open Solaris.
6
Concurrency appears in processes and threads most often in these projects. There are several classic deadlock and synchronization problems the student must understand in order to begin to reason about creating secure, efficient operating systems. I think every CS graduate has had at least one homework assignment involving the dining philosophers problem or a producer-consumer problem.
7
These problems help us reason about how to ensure that we avoid deadlocks between processes competing for resources. They assume some sort of concurrency exists in the computation system. As far as delving into the nature of parallel computation, the student receives a smattering of the many other issues needed to successfully deal with concurrency.
Yet many students who have studied operating systems are unable to put their knowledge of processes and threads to practical use. I discovered this when I assumed that students in my object-oriented analysis and design course were comfortable working with threads in Java. They knew what a thread was, but were never challenged to write an actual threaded application. Their practice hadn't caught up to their knowledge of theory. Since then, I usually have some part of my courses include writing code that utilizes threads. Threads, however, are just the tip of the concurrent iceberg.
Database courses tackle concurrent resource problems. The theory behind atomic transactions, rollbacks, and other critical database topics are very similar to the issues students study in operating systems. Concurrency is studied from a resource viewpoint.
Many CS students never get into the fascinating topic of designing and implementing programs that are naturally concurrent. Consider writing programs to control traffic, air, land, or even network traffic. You not only deal with events occurring asynchronously, but also the many threads controlling the objects in your software system, which must be able to easily communicate and share information quickly. There are some opportunities, but not many that offer a typical undergraduate the chance to consider such issues.
Most computer science departments offer a course in computer architecture and organization. It was in such a course that I was introduced to parallel computation at a much finer level of granularity than the task or thread. Even though I'm not very knowledgeable about hardware, the course was a great introduction to the issues that hardware designers must take into account when they want to improve processor performance. They "get" concurrency and many of the issues required to implement fast, accurate computations in parallel. Plus, they've spent more time thinking about this as a group than software developers have.
Compilers certainly present the sort of programming issue that should deal with parallelism at different levels. Compilers are programs that are the interface between the programmer and the underlying machine architecture. If you don't have a compiler that can produce code to take advantage of the underlying processor architecture and instruction set, then the fastest machines will be useless. Unfortunately, most undergraduate compiler courses only teach the basics of scanning, parsing, intermediate representations, and some code generation. The hard issues of parallelism and the optimizations needed to produce effective code just don't fit into the time available for a single course in compiler construction. These issues are deferred to either a second semester or to graduate courses -- and even then they may not be covered in much detail.
Some students will take a course in programming language design or programming paradigms. I've taught a graduate course in the design of programming languages and was happy to see that the textbook I adopted had a chapter on parallel computing and programming. The chapter was the last one in the book and admits that it just presents the rudiments of parallel programming, but it's at least a start.
8
You might notice that I've been discussing the more fundamental, theoretical computer science courses. I think this is important. If we're going to become effective with parallel computation, we need to develop a deep understanding of it at the theoretical foundations. Currently, the theory courses that a typical CS undergraduate will take have little or no content in the area of concurrency. The courses I'm specifically thinking of are those that we have labeled Foundations of Computing, Algorithms, and discrete mathematics.
Changing our approach
The lack of exposure to parallelism is unfortunate but not hopeless. First of all, we need to realize that those students with a solid foundation in computing do enter the arena of advanced computing techniques and contribute significantly to advanced computing techniques and tools. But we can do better preparing future students.
I hope you will agree that if we're going to make the most of advances in parallel hardware we need to do a better job of teaching the students how to design and develop software that takes advantages of the hardware. We need to do this from the bottom up. We need to change some of the foundational material that we present to the students and reinforce it throughout the curriculum.
Most CS students choose the major because they either like to program, or think they will. Depending on the school, they will jump into a popular programming language such as C++ or Java and learn how to construct sequential programs using mainly iterative and event-driven designs. Some schools, like WPI, take a slightly different approach. The first course our CS majors take is called Introduction to Program Design, which teaches how to approach problem-solving through student programs. We use the Scheme programming language in this course, because Scheme is very simple to learn, has minimal syntax, and is very flexible. Scheme also supports solving problems recursively, which is a natural way to solve many problems. Students who study both recursive programming methods and iterative methods are much better positioned to solve a wider selection of problems than students who are not so exposed.
So, what if we expanded the range of problems presented to our students, requiring them to reason about problems in such a way that they might recognize when a concurrent solution is appropriate? We need to add something to the curriculum, and unfortunately, curriculum development is a zero-sum game. If you add something, you have to take something away, and this is the problem. We can add a lot of material about concurrency, but what should we give up?
Academic institutions consistently review and revise their course offerings. The situation is not much different from a corporation trying to decide how to allocate the yearly budget and determine the best product mix to produce for the market. We have to decide what constitutes basic skills for tomorrow's computer scientists and what additional topics we can offer with the available staff. And, there are hard decisions that must be made.
Real advances in developing a deep understanding of concurrent computing will necessarily come through graduate study and advanced undergraduate courses that will only be offered infrequently. Since I've been teaching, there have been so many courses I'd like to offer to our undergraduates but I have come to realize that the core courses must be covered and that doesn't leave a lot of time for the special courses I'd like to give. Every now and then I am able to offer a special topics seminar or independent study.
What's needed
Ideally, introductory computer science courses would include examples of problems that are best solved by parallel solutions and some experience with writing simple programs that use parallel constructs. If you use a language like Scheme there are some examples of a concurrent extension. There are concurrent extensions for other functional languages like LISP. Parallel programs seem more natural to me as extensions of functional languages; although I must admit that I haven't looked closely for others.
The next change I would suggest is that computer science curriculums ensure that their foundational course work in computing, algorithms, and discrete mathematics -- i.e., the theoretical foundation courses -- includes parallelism. A selection of algorithms that rely on concurrent processors can show the students the benefits of using parallelism. Introducing temporal logic in discrete mathematics and foundations courses offers some tools to reason about concurrency.
Once we plant the seeds of concurrency in these base courses, we then could expand on them throughout the curriculum. This should be reasonably easy in courses like artificial intelligence, database, and operating systems. By the time a student graduates he or she should be able to approach a job interview with proof that they are able to work effectively with concurrent technology.
Conclusion
It won't be too long before parallel computing courses become more widespread. They are beginning to show up at several universities. These courses will develop the material that can be fed into the core courses when appropriate. They will also be the foundation for new algorithms, languages, and architectures that will lead to a new computing paradigm based upon the effective use of concurrency.
I'm most excited about these possibilities for the new languages and corresponding compilers that will be developed to support them. It took a couple of decades for the object-oriented paradigm to take hold in the mainstream. I believe we're at the beginning of another important paradigm shift and will see concurrent programming be a tool in almost every programmer's toolbox in much less time than it took objects to rise to prominence. Advances like this are what keeps computer science so fascinating for me.
Notes
1 Whether Moore's Law accurately predicts nanotechnology improvements is debatable (see http://www.firstmonday.org/issues/issue7_11/tuomi/ for a discussion), but it has been the most widely referenced indicator of increased hardware performance.
2 See the section on parallelism, http://academic.evergreen.edu/projects/icer/documents/Patterson.pdf.
3 See http://www.gotw.ca/publications/concurrency-ddj.htm.
4 I use the words "parallel" and "concurrent" interchangeably.
5 More information can be found in the notes for the first class in Manavendra Misra's course on Parallel Computing for Scientists and Engineers. http://www.krellinst.org/UCES/archive/classes/ASC/main.html.
6 See http://www.linux.org/ and http://www.opensolaris.org/os/.
7 See http://www.isi.edu/~faber/cs402/notes/lecture8.pdf, or http://www.cs.cornell.edu/Courses/cs414/2004su/slides/08_classicsynchproblems.pdf.
8 The textbook is Programming Languages and Methodologies, Robert J. Schalkoff, Jones and Bartlett, 2007, ISBN 9780763740597.
Resources
About the author  | 
|  | Gary Pollice is a professor of practice at Worcester Polytechnic Institute, in Worcester, MA. He teaches software engineering, design, testing, and other computer science courses, and also directs student projects. Before entering the academic world, he spent more than thirty-five years developing various kinds of software, from business applications to compilers and tools. His last industry job was with IBM Rational Software, where he was known as "the RUP Curmudgeon" and was also a member of the original Rational Suite team. He is the primary author of Software Development for Small Teams: A RUP-Centric Approach, published by Addison-Wesley in 2004. He holds a B.A. in mathematics and an M.S. in computer science. |
Rate this page
|