The Clojure programming language
Take advantage of the Clojure plug-in for Eclipse
This article covers the Clojure programming language. Clojure is a Lisp dialect. It is assumed that you do not already know Lisp. Instead, it is assumed that you have knowledge of Java technology. To write Clojure programs, you need a Java Development Kit V5 or higher and the Clojure library. For this article, JDK V1.6.0_13 and Clojure V1 were used. You should also take advantage of the Clojure plug-in for Eclipse (clojure-dev), and you will need Eclipse for that. For this article, Eclipse V3.5 was used, along with clojure-dev 0.0.34. See Related topics for links.
What is Clojure?
It was not that long ago that running your programs on the Java Virtual Machine (JVM) meant writing them in the Java programming language. Those days are long gone because now you have many choices. Many popular choices, such as Groovy, Ruby (via JRuby), and Python (via Jython), allow for a more procedural, scripting style of programming, or they have their own flavor of object-oriented programming. Both paradigms are familiar to Java programmers. One could argue that with these languages, you write programs similar to what you would write in the Java language; you just get to use a different syntax.
Clojure is yet another programming language for the JVM. However, it is quite different from Java technology or any of the other JVM languages mentioned. It is a dialect of Lisp. The Lisp family of programming languages has been around a long time: since the 1950s, in fact. Lisp uses distinctive S-expressions, or prefix notation. This notation can be summarized as (function arguments...). You always start with the name of a function, then list zero or more arguments to pass in to that function. The function and its arguments are organized together by surrounding them with parentheses. This leads to one of the trademarks of Lisp: a lot of parentheses.
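As a small illustration of prefix notation (a sketch with arbitrary values, not one of the article's listings), the expressions below show how familiar Java-style arithmetic looks when the function name comes first:

```clojure
;; Java: 1 + 2 + 3
(+ 1 2 3)        ; => 6

;; Java: 2 * (3 + 4). Nesting takes the place of operator precedence.
(* 2 (+ 3 4))    ; => 14

;; Java: Math.max(3, 7). Ordinary function calls have the same shape.
(max 3 7)        ; => 7
```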
As you might guess, Clojure is a functional programming language. Academics can debate its "purity," but it definitely embraces the pillars of functional programming: avoidance of mutable state, recursion, higher-order functions, etc. Clojure is also a dynamically typed language, though you can optionally add type information to improve performance for critical paths in your code. Clojure not only runs on the JVM but is designed with Java interoperability in mind. Finally, Clojure is a language designed with concurrency in mind and has some unique features related to concurrent programming.
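To make two of those pillars concrete, here is a minimal sketch. The names numbers, more-numbers, and incremented are made up for illustration and do not appear elsewhere in the article.

```clojure
;; Avoiding mutable state: conj returns a new vector and leaves the original untouched.
(def numbers [1 2 3])
(def more-numbers (conj numbers 4))
;; numbers is still [1 2 3]; more-numbers is [1 2 3 4]

;; Higher-order function: map takes another function (inc) as its first argument.
(def incremented (map inc numbers))   ; => (2 3 4)
```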
Clojure by example
For many, the best way to learn a new language is to start writing code. In this spirit, we will take some simple programming problems and solve them using Clojure. We will go through the solutions in detail to gain a better understanding of how Clojure works, how you can use it, and what kind of things it does well. However, as with any other language, we first need to set up a development environment. Luckily, this is pretty easy with Clojure.
Minimal setup
All you need for working with Clojure is a JDK and the Clojure library, which is a single JAR file. There are two common ways to develop and run Clojure programs. The most common is using its read-eval-print-loop (REPL).
Listing 1. The Clojure REPL
$ java -cp clojure-1.0.0.jar clojure.lang.Repl
Clojure 1.0.0-
user=>
The command was run from the directory where the Clojure JAR was located. Adjust the path to the JAR as needed. You can also create a script and execute the script. To do this, you need to execute a Java class called clojure.main.
Listing 2. Clojure main
$ java -cp clojure-1.0.0.jar clojure.main /some/path/to/Euler1.clj
233168
Again, you need to adjust the path to your Clojure JAR and your scripts. Finally, there is IDE support for Clojure. Eclipse users can install the clojure-dev plug-in using its Eclipse update site. Once it is installed, make sure you are in the Java perspective, then you can create a new Clojure project and new Clojure files, as shown below.
Figure 1. Using clojure-dev, the Clojure plug-in for Eclipse
With clojure-dev, you get some basic syntax highlighting, including parentheses matching (a must-have for any Lisp). You can also launch any script in a REPL that is embedded directly in Eclipse. The plug-in is still very new, as of the writing of this article, and its features are moving forward rapidly. Now that we have the basic setup out of the way, let's explore the language by writing some Clojure programs.
Example 1: Working with sequences
The name Lisp comes from "list processing," and it is often said that everything in Lisp is a list. In Clojure, this is generalized as sequences. For the first example, we will take the following programming problem.
If we list all the natural numbers below 10 that are multiples of 3 or 5, we get 3, 5, 6, and 9. The sum of these multiples is 23. Find the sum of all the multiples of 3 or 5 below 1,000.
This problem is taken from Project Euler, a collection of mathematical problems that can be solved using clever (or sometimes not-so-clever) computer programming. In fact it is Problem No. 1. Listing 3 shows a solution to it using Clojure.
Listing 3. Example 1 from Project Euler
(defn divisible-by-3-or-5? [num] (or (== (mod num 3) 0) (== (mod num 5) 0)))
(println (reduce + (filter divisible-by-3-or-5? (range 1000))))
The first line defines a function. Remember: Functions are the primary building blocks of programs in Clojure. Most Java programmers are used to objects being the building blocks of their programs, so using functions can take some getting used to. You might think that defn is a keyword of the language, but it is actually a macro. A macro allows you to extend the Clojure compiler to essentially add new keywords to the language. Thus, defn is not part of the language specification but is added by the language's core library. In this case, it is creating a function called divisible-by-3-or-5?. This follows Clojure naming conventions. Words are separated by hyphens, and the function's name ends with a question mark, indicating it is a predicate in that it returns true or false. The function takes a single parameter named num. If there were more input parameters, they would appear inside the square brackets, separated by spaces.
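For instance, a two-parameter variant might look like the sketch below. The function divisible-by? and its divisor parameter are hypothetical and are not used in the article's solution.

```clojure
;; Hypothetical generalization of the predicate: two parameters, separated by a space.
(defn divisible-by? [num divisor]
  (== (mod num divisor) 0))

(divisible-by? 9 3)    ; => true
(divisible-by? 10 3)   ; => false
```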
Next comes the body of the function. First, we call or. This is the normal logical or you are used to; here it is invoked like a function rather than written as an operator (strictly speaking it is a macro, which is what lets it stop evaluating as soon as one argument is true). We pass it two parameters. Each of these is also an expression. The first expression starts with the == function. This does a value-based numeric comparison of the parameters passed to it. There are two parameters passed to it. The first is another expression; this expression calls the mod function. This is the modulo operator from mathematics, and for non-negative numbers it behaves like the % operator in the Java language. It returns the remainder, so in this case, the remainder when num is divided by 3. That remainder is compared to 0 (if the remainder is 0, then num is divisible by 3). Similarly, we check whether the remainder when num is divided by 5 is 0. If either of these remainders is 0, the function returns true.
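Evaluating the pieces one at a time at the REPL can make the nesting easier to follow. The sample values 9 and 7 below are arbitrary choices for illustration:

```clojure
(mod 9 3)                                  ; => 0
(mod 9 5)                                  ; => 4
(== (mod 9 3) 0)                           ; => true
(== (mod 9 5) 0)                           ; => false
(or (== (mod 9 3) 0) (== (mod 9 5) 0))     ; => true, 9 is divisible by 3
(or (== (mod 7 3) 0) (== (mod 7 5) 0))     ; => false, 7 is divisible by neither
```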
On the next line, we are creating an expression and printing it out. Let's start from the innermost set of parentheses. Here, we call the range function and pass in the number 1,000. This creates a sequence, starting with 0, of all numbers less than 1,000. This is exactly the set of numbers we want to check to see if they are divisible by 3 or 5.
Moving out, we call the filter function. This takes two parameters: The first is another function that must be a predicate in that it must return true or false; the second parameter is a sequence, in this case the sequence (0, 1, 2, ... 999). The filter function applies the predicate to each element, and if it returns true, the element is added to the result. The predicate is just the divisible-by-3-or-5? function defined on the line above.
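A scaled-down version is easy to check by hand. The sketch below repeats the predicate from Listing 3 so that it runs on its own, and uses 10 instead of 1,000 to match the small example in the problem statement:

```clojure
(defn divisible-by-3-or-5? [num]
  (or (== (mod num 3) 0) (== (mod num 5) 0)))

(range 10)                                  ; => (0 1 2 3 4 5 6 7 8 9)
(filter odd? (range 10))                    ; => (1 3 5 7 9), using a built-in predicate
(filter divisible-by-3-or-5? (range 10))    ; => (0 3 5 6 9)
```

Note that 0 passes the predicate as well; because it contributes nothing to a sum, it does not change the answer.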
So the filter expression will result in a sequence of integers where each is less than 1,000 and divisible by 3 or 5. This is exactly the set of integers we are interested in, so now we just need to add them. To do this, we use the reduce function. This function takes two parameters: a function and a sequence. It applies the function to the first two elements in the sequence. Then it applies the function to the previous result and the next element in the sequence. In this case, the function is the + function, or addition. Thus, it will add all of the elements in the sequence.
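The folding steps can be traced on the four multiples from the problem statement. This sketch assumes a Clojure release newer than the V1 used in this article, because it relies on reductions being available in clojure.core to show the intermediate results:

```clojure
;; reduce folds the sequence from the left:
;; (+ 3 5) => 8, then (+ 8 6) => 14, then (+ 14 9) => 23
(reduce + [3 5 6 9])        ; => 23

;; reductions returns every intermediate result instead of only the last one
(reductions + [3 5 6 9])    ; => (3 8 14 23)
```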
Taking a look at Listing 3, a lot happens in a small amount of code. That is one of the appeals of Clojure. Yet once you get used to the notation, the code is self-explanatory. Certainly, it would take a lot more Java code to do the same thing. Let's move on to another example.
Example 2: Laziness is a virtue
For this example, we will take a look at recursion and at laziness in Clojure. This is another concept new to many Java programmers. Clojure lets you define sequences that are "lazy" because their elements are not calculated until they are needed. This allows you to define infinite sequences, and you definitely do not see those in the Java language. To see an example of when this is especially useful, let's take a look at an example that involves another important aspect of functional programming: recursion. Once again, we use a programming problem from Project Euler, but this time it's Problem No. 2.
Each new term in the Fibonacci sequence is generated by adding the previous two terms. By starting with 1 and 2, the first 10 terms will be: 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, ...
Find the sum of all the even-valued terms in the sequence that do not exceed 4 million.
To solve this problem, a Java programmer might be tempted to define a function that gives you the nth Fibonacci number. A naive implementation of this is shown below.
Listing 4. A naive Fibonacci function
(defn fib [n]
  (if (= n 0) 0
    (if (= n 1) 1
      (+ (fib (- n 1)) (fib (- n 2))))))
This checks if n is 0; if so, it returns 0. Then it checks if n is 1. If so, it returns 1. Otherwise, it calculates the (n-1)th Fibonacci number and the (n-2)th Fibonacci number and adds them together. This is certainly correct, but if you have done much Java programming, you see the problem. A recursive definition like this nests calls as deep as n, so it can overflow the stack for large inputs, and it recomputes the same Fibonacci numbers over and over, so it slows down dramatically as n grows. The Fibonacci numbers form an infinite sequence, so it should be described as such using Clojure's infinite lazy sequences. This is shown in Listing 5. Note that although a more efficient Fibonacci implementation is available in the clojure-contrib library, it is more complex, so the Fibonacci sequence shown here comes from Stuart Halloway's book (see Related topics for more information).
Listing 5. A lazy sequence for the Fibonacci numbers
(defn lazy-seq-fibo
  ([]
    (concat [0 1] (lazy-seq-fibo 0 1)))
  ([a b]
    (let [n (+ a b)]
      (lazy-seq
        (cons n (lazy-seq-fibo b n))))))
In Listing 5, the lazy-seq-fibo function has two definitions. The first definition has no arguments, hence the empty square brackets. The second definition takes two arguments [a b]. For the no-arguments case, we take the sequence [0 1] and concatenate it to an expression. That expression is a recursive call to lazy-seq-fibo, but this time, it is calling the two-argument case, passing in 0 and 1 to it.
The two-argument case starts off with a let expression. This is how local names are bound in Clojure, the closest equivalent to variable assignment. The expression [n (+ a b)] is defining a variable n and setting it equal to a+b. It is then using the lazy-seq macro. As the name suggests, the lazy-seq macro is used to create a lazy sequence. Its body is an expression. In this case, it's using the cons function. This is a classic function in Lisp. It takes in an element and a sequence and returns a new sequence by prepending the element to the sequence. In this case, the sequence is the result of again calling the lazy-seq-fibo function. If this sequence were not lazy, the lazy-seq-fibo function would get called again and again without end. However, the lazy-seq macro ensures that the function will only be invoked as the elements are accessed. To see this sequence in action, you can use the REPL, as shown in Listing 6.
Listing 6. Generating Fibonacci numbers
1:1 user=> (defn lazy-seq-fibo
  ([]
    (concat [0 1] (lazy-seq-fibo 0 1)))
  ([a b]
    (let [n (+ a b)]
      (lazy-seq
        (cons n (lazy-seq-fibo b n))))))
#'user/lazy-seq-fibo
1:8 user=> (take 10 (lazy-seq-fibo))
(0 1 1 2 3 5 8 13 21 34)
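To make the laziness itself visible, the following sketch counts how many elements are actually computed. The names realized-count, counting-naturals, naturals, and first-three are hypothetical helpers introduced only for this illustration; the atom is used purely as a counter.

```clojure
;; Hypothetical counter: incremented each time an element is actually computed.
(def realized-count (atom 0))

;; An infinite lazy sequence n, n+1, n+2, ... built the same way as lazy-seq-fibo.
(defn counting-naturals [n]
  (lazy-seq
    (swap! realized-count inc)            ; runs only when this element is realized
    (cons n (counting-naturals (inc n)))))

(def naturals (counting-naturals 0))      ; nothing has been computed yet
(def first-three (doall (take 3 naturals)))   ; doall forces the three elements

first-three        ; => (0 1 2)
@realized-count    ; => 3, even though the sequence is infinite
```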
The take function is used to take a certain number (in this case, 10) of elements from a sequence. Now that we have a good way to generate Fibonacci numbers, let's solve the problem.
Listing 7. Example 2
(defn less-than-four-million? [n] (< n 4000000))
(println (reduce +
  (filter even?
    (take-while less-than-four-million? (lazy-seq-fibo)))))
In Listing 7, we define a function called less-than-four-million?. This simply tests if its input is less than 4 million. In the next expression, it is useful to start in the innermost expression. We first get the infinite Fibonacci sequence. We then use the take-while function. This is like the take function, but it takes a predicate. Once the predicate returns false, it stops taking from the sequence. So in this case, as soon as we get a Fibonacci number greater than 4 million, we stop taking. We take this result and apply a filter. The filter uses the built-in even? function. This function does just what you would think: It tests if a number is even. The result is all of the Fibonacci numbers less than 4 million and even. Now we total them up using reduce, just as we did in the first example.
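The same pipeline can be inspected stage by stage with a smaller limit. In this sketch, the limit of 100 and the names less-than-100?, below-100, and even-below-100 are illustrative assumptions; lazy-seq-fibo is repeated from Listing 5 so that the block runs on its own.

```clojure
(defn lazy-seq-fibo
  ([]
    (concat [0 1] (lazy-seq-fibo 0 1)))
  ([a b]
    (let [n (+ a b)]
      (lazy-seq
        (cons n (lazy-seq-fibo b n))))))

;; Hypothetical smaller limit so that every stage fits on one line.
(defn less-than-100? [n] (< n 100))

(def below-100 (take-while less-than-100? (lazy-seq-fibo)))
;; => (0 1 1 2 3 5 8 13 21 34 55 89)

(def even-below-100 (filter even? below-100))
;; => (0 2 8 34)

(reduce + even-below-100)   ; => 44
```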
Listing 7 solves the problem at hand, but it is not completely satisfying. To use the take-while function, we had to define a very simple function called less-than-four-million?. It turns out that this is not necessary. It should come as no surprise that Clojure has support for closures. This can simplify code like that in Listing 7.
Closures in Clojure
Closures are common in many programming languages, especially in functional languages, such as Clojure. Functions are not only first-class citizens that can be passed as arguments to other functions; they can also be defined inline or anonymously. Listing 8 shows a simplification of Listing 7, using a closure.
Listing 8. Simpler solution
(println (reduce + (filter even? (take-while (fn [n] (< n 4000000)) (lazy-seq-fibo)))))
In Listing 8, we have used the fn macro. This creates an anonymous function and returns it. Predicate functions are often very simple and better off being defined using a closure. As it turns out, Clojure has an even more-abbreviated way to define closures.
Listing 9. Shorthand closure
(println (reduce + (filter even? (take-while #(< % 4000000) (lazy-seq-fibo)))))
We have used the #(...) shorthand to create the closure instead of the fn macro. We have also used the % symbol for the first parameter passed to the function. You could also use %1 for the first parameter and similarly %2, %3, etc. if the function accepted multiple parameters.
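The sketch below shows two things the listings above only hint at: an anonymous function that captures a value from its surrounding scope, and the shorthand form with more than one parameter. The less-than helper and its limit parameter are hypothetical names introduced for illustration.

```clojure
;; Hypothetical helper: returns a predicate that "closes over" limit.
(defn less-than [limit]
  (fn [n] (< n limit)))

(def less-than-four-million? (less-than 4000000))
(less-than-four-million? 3524578)    ; => true
(less-than-four-million? 5702887)    ; => false

;; Shorthand form with two parameters, %1 and %2.
(map #(* %1 %2) [1 2 3] [4 5 6])     ; => (4 10 18)
(reduce #(+ %1 %2) [1 2 3 4])        ; => 10
```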
With just these two simple examples, we have seen many features of Clojure. One other important aspect of Clojure is its tight integration with the Java language. Let's look at another example where leveraging Java from Clojure is helpful.
Example 3: Using Java technology
The Java platform has a lot to offer. The performance of JVM and the richness of both the core APIs and the numerous third-party libraries written in the Java language are all powerful tools that can save you from reinventing too many wheels. Clojure is built around these ideas. It is easy to call Java methods, create Java objects, implement Java interfaces, and extend Java classes. To see some examples of this, let's take a look at another Project Euler problem.
Listing 10. Problem No. 8 from Project Euler
Find the greatest product of five consecutive digits in the 1000-digit number.
73167176531330624919225119674426574742355349194934
96983520312774506326239578318016984801869478851843
85861560789112949495459501737958331952853208805511
12540698747158523863050715693290963295227443043557
66896648950445244523161731856403098711121722383113
62229893423380308135336276614282806444486645238749
30358907296290491560440772390713810515859307960866
70172427121883998797908792274921901699720888093776
65727333001053367881220235421809751254540594752243
52584907711670556013604839586446706324415722155397
53697817977846174064955149290862569321978468622482
83972241375657056057490261407972968652414535100474
82166370484403199890008895243450658541227588666881
16427171479924442928230863465674813919123162824586
17866458359124566529476545682848912883142607690042
24219022671055626321111109370544217506941658960408
07198403850962455444362981230987879927244284909188
84580156166097919133875499200524063689912560717606
05886116467109405077541002256983155200055935729725
71636269561882670428252483600823257530420752963450
In this problem, we have a 1,000-digit number. This could be represented numerically in Java technology using a BigInteger. However, we do not need to do computations on the entire number, only five digits at a time. Thus it is easier to treat it as a string. However, to make calculations, we need to treat the digits as integers. Luckily, there are APIs in the Java language for going back and forth between strings and integers. To start with, we need to deal with the large piece of unruly text from above.
Listing 11. Parsing the text
(def big-num-str
  (str "73167176531330624919225119674426574742355349194934
96983520312774506326239578318016984801869478851843
85861560789112949495459501737958331952853208805511
12540698747158523863050715693290963295227443043557
66896648950445244523161731856403098711121722383113
62229893423380308135336276614282806444486645238749
30358907296290491560440772390713810515859307960866
70172427121883998797908792274921901699720888093776
65727333001053367881220235421809751254540594752243
52584907711670556013604839586446706324415722155397
53697817977846174064955149290862569321978468622482
83972241375657056057490261407972968652414535100474
82166370484403199890008895243450658541227588666881
16427171479924442928230863465674813919123162824586
17866458359124566529476545682848912883142607690042
24219022671055626321111109370544217506941658960408
07198403850962455444362981230987879927244284909188
84580156166097919133875499200524063689912560717606
05886116467109405077541002256983155200055935729725
71636269561882670428252483600823257530420752963450"))
Here, we take advantage of Clojure's support for multi-line strings. The multi-line string literal is passed through the str function, which simply returns it as a string (with a single string argument, the call is not strictly needed). We then use def, a special form, to define a constant called big-num-str. However, what will be most useful is to turn this into a sequence of integers. This is done in Listing 12.
Listing 12. Creating a numerical sequence
(def the-digits
  (map #(Integer. (str %))
    (filter #(Character/isDigit %) (seq big-num-str))))
Again, let's start in the innermost expression. We use the seq function to turn big-num-str into a sequence. However, it turns out that this sequence is not exactly what we want. You can see this with the help of the REPL, shown below.
Listing 13. Examining the big-num-str sequence
user=> (seq big-num-str)
(\7 \3 \1 \6 \7 \1 \7 \6 \5 \3 \1 \3 \3 \0 \6 \2 \4 \9 \1 \9 \2 \2 \5 \1 \1 \9 \6 \7 \4 \4 \2 \6 \5 \7 \4 \7 \4 \2 \3 \5 \5 \3 \4 \9 \1 \9 \4 \9 \3 \4 \newline...
The REPL shows characters (a Java char) as \c. So \7 is the char 7, and \newline is the char \n (a new line). This is what we get for parsing the text directly. Clearly, we need to get rid of the newlines and convert to integers before we can do any useful calculations. This is what we do in Listing 12. There we use a filter to remove the newlines. Notice that once again, we used a shorthand closure for the predicate function passed to the filter function. The closure is using Character/isDigit. This is the static method isDigit from java.lang.Character. Thus, the filter only allows in chars that are numeric digits, discarding the newline characters.
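The filtering step is easy to try on a short string. The seven-character sample below is made up for illustration; it contains one embedded newline, just as big-num-str contains many.

```clojure
;; Hypothetical two-line sample standing in for big-num-str.
(def sample "731\n969")

(seq sample)
;; => (\7 \3 \1 \newline \9 \6 \9)

;; Character/isDigit is the static Java method java.lang.Character.isDigit.
(filter #(Character/isDigit %) (seq sample))
;; => (\7 \3 \1 \9 \6 \9)
```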
Now that we have gotten rid of the newlines, we need to convert to integers. Moving outward in Listing 12, notice that we use the map function, which takes two parameters: a function and a sequence. It returns a new sequence where the nth element of the sequence is the result of applying the function to the nth element of the original sequence. For the function, we are once again using the shorthand closure notation. First we use the str function from Clojure to convert the char to a string. Why do we do this? Because next, we create an integer using the constructor for java.lang.Integer. This is denoted by Integer. (the class name followed by a dot). You could think of this expression as new java.lang.Integer(str(%)). Using this with the map function, we get a sequence of integers, just as we wanted. Now we can solve the problem.
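Before moving on, here is a quick check of the conversion step on three characters. The sample values are arbitrary, and the use of the static method Integer/parseInt in place of the Integer. constructor is an alternative assumed here for illustration (the constructor is deprecated in later Java releases); it is not what Listing 12 uses.

```clojure
;; map applies the function to every element and returns a new sequence.
(map inc [1 2 3])                               ; => (2 3 4)

;; Step 1: char to string, using Clojure's str.
(map str [\7 \3 \1])                            ; => ("7" "3" "1")

;; Step 2: string to integer, here through the static method Integer.parseInt.
(map #(Integer/parseInt (str %)) [\7 \3 \1])    ; => (7 3 1)
```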
Listing 14. Example 3
(println (apply max
  (map #(reduce * %)
    (for [idx (range (count the-digits))]
      (take 5 (drop idx the-digits))))))
To understand this piece of code, let's start with the for macro. This is not like a for loop in the Java language. Instead, it is a sequence comprehension. First, we create a binding using the square brackets. In this case, we are binding the variable idx to each element of a sequence from 0 ... N-1, where N is the number of elements in the sequence the-digits (N = 1,000, as the original number had 1,000 digits). Next, the for macro takes an expression it uses to generate a new sequence. It will iterate over each element of the idx sequence, evaluate the expression, and add the result to the return sequence. You can see how in some ways this does act kind of like a for loop. The expression used in the comprehension will first use the drop function to drop the first idx elements of the sequence, then use the take function to take the first five elements of the shortened sequence. Remember that idx will be 0, then 1, then 2, etc., so the result will be a sequence of sequences, where the first element will be (e1, e2, e3, e4, e5), the next element will be (e2, e3, e4, e5, e6), etc., where e1, e2, etc. are the elements from the-digits.
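The windowing behavior is easier to see on a short sequence. In the sketch below, the seven-element sample and the window size of 3 are illustrative assumptions; the second expression uses the core function partition as an alternative that the article's solution does not use.

```clojure
;; Hypothetical short input standing in for the-digits.
(def sample [1 2 3 4 5 6 7])

;; Same shape as the comprehension in Listing 14, with a window of 3.
(def windows
  (for [idx (range (count sample))]
    (take 3 (drop idx sample))))
;; => ((1 2 3) (2 3 4) (3 4 5) (4 5 6) (5 6 7) (6 7) (7))

;; partition with a step of 1 yields only the full-size windows.
(def full-windows (partition 3 1 sample))
;; => ((1 2 3) (2 3 4) (3 4 5) (4 5 6) (5 6 7))
```

Note that the comprehension also produces a few shorter windows at the tail of the sequence, because take simply returns whatever elements remain.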
Now that we have this sequence of sequences, we use the map function. We transform each sequence of five numbers to the product of those five numbers by using the reduce function. Now we have a sequence of integers, where the first element is the product of elements 1-5, the second element is the product of elements 2-6, etc. We want the maximum such product. To do this, we use the max function. However, max expects multiple elements passed to it, not a single sequence. To turn the sequence into multiple elements to pass to max, we use the apply function. This produces the maximum that we wanted to solve the problem, and of course prints out the answer. Now you have solved several problems while learning how to use Clojure at the same time.
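The last two steps can be sketched on a handful of small windows. The values and the names windows and products are made up for illustration.

```clojure
;; Hypothetical windows of three numbers each.
(def windows [[1 2 3] [2 3 4] [3 4 5]])

;; Turn each window into the product of its elements.
(def products (map #(reduce * %) windows))   ; => (6 24 60)

;; apply spreads the sequence into separate arguments: same as (max 6 24 60).
(apply max products)                         ; => 60

;; Without apply, max receives a single argument and returns it unchanged.
(max products)                               ; => (6 24 60)
```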
Summary
In this article, we have introduced the Clojure programming language and have benefited from the use of the Clojure plug-in for Eclipse. We took a brief look at some of its philosophies and features, but concentrated on code examples. In those simple examples, we have seen many of the core features of the language: functions, macros, bindings, recursion, lazy sequences, closures, comprehensions, and integration with Java technology. There are many more aspects to Clojure. Hopefully, the language has caught your attention, and you will take a look at some of the resources and learn more about it.
Related topics
- Visit Clojure.org to download Clojure, read tutorials, and access reference documentation.
- See all the math problems at the Project Euler site.
- Check out clojure-contrib for essential libraries created by the Clojure community and used by many Clojure projects. This library is included with the Eclipse plug-in by default.
- The best way to go from beginner to expert in Clojure is to read Stuart Halloway's Programming Clojure. The lazy sequence implementation of the Fibonacci sequence was pulled from this book.
- Learn about how Clojure works and its future by watching the QCon interview with Clojure's creator Rich Hickey.
- Read "Beginning Haskell" for an introduction to another functional language.
- "Higher-order functions" explains higher-order functions and Scheme, another dialect of Lisp.
- Check out "Develop Lisp applications using the Cusp Eclipse plug-in" to see how you can use Eclipse to write Common Lisp programs.
- "The busy Java developer's guide to Scala: Functional programming for the object oriented" explores more functional programming on the JVM using the Scala programming language.
- Read "Getting started with the Eclipse Platform" for an introduction to the Eclipse platform.
- You need the Java Development Kit V5 or higher. This article uses Java Development Kit V1.6.0_13.
- Download Clojure V1.
- Get the latest Eclipse IDE. Eclipse V3.5 was used in this article.
- clojure-dev is an IDE for the Clojure programming language, built on the Eclipse platform. V0.0.34 was used in this article.