Java.next: Overcome synonym suffering

Recognize similar functional constructs across the Java.net languages

The previous Java.next installment ("Functional coding styles") compared and contrasted functional coding styles in Scala, Groovy, and Clojure. In this article, series author Neal Ford delves more deeply into the filter, map, and reduce functions in the Java.next languages. A series of short coding examples help you to sort out the somewhat confusing differences in how the three languages name these key functional constructs.

Share:

Neal Ford, Director / Software Architect / Meme Wrangler, ThoughtWorks

Photo of Neal fordNeal Ford is Director, Software Architect, and Meme Wrangler at ThoughtWorks, a global IT consultancy. He is also the designer and developer of applications, instructional materials, magazine articles, courseware, and video/DVD presentations, and he is the author or editor of books spanning a variety of technologies, including the most recent Presentation Patterns. He focuses on designing and building large-scale enterprise applications. He is also an internationally acclaimed speaker at developer conferences worldwide. Check out his website.



28 January 2014

Also available in Chinese Russian Japanese

About this series

The Java™ legacy will be the platform, not the language. More than 200 languages run on the JVM, and it's inevitable that one of them will eventually supplant the Java language as the best way to program the JVM. This series explores three next-generation JVM languages: Groovy, Scala, and Clojure, comparing and contrasting new capabilities and paradigms, to provide Java developers a glimpse into their own near future.

Functional programming languages feature several families of common functions. Yet developers sometimes have difficulty moving between languages because familiar functions have unfamiliar names. Functional languages tend to name these common functions based on functional paradigms. Languages that are derived from scripting backgrounds tend to use more descriptive names (sometimes multiple names, with several aliases that point to the same function).

In this installment, I continue to discuss the utility of three key functions (filter, map, and reduce) and show implementation details from each Java.next language. The discussion and examples are designed to allay the confusion that can arise from the inconsistent names that the three languages use for similar functional constructs.

Filter

With the filter function, you can specify a Boolean criterion (typically in the form of a higher-order function) to apply to a collection. The function returns the subset of the collection whose elements match the criterion. Filtering is closely related to find functions, which return the first matching element in a collection.

Scala

Scala has many filter variants. The simplest case filters a list based on passed criteria. In this first example, I create a list of numbers. Then I apply the filter() function, passing a code block that specifies the criterion that all elements must be divisible by 3:

val numbers = List.range(1, 11)
numbers filter (x => x % 3 == 0)
// List(3, 6, 9)

I can create a terser version of the code block by relying on implicit parameters:

numbers filter (_ % 3 == 0)
// List(3, 6, 9)

This second version is less verbose because in Scala, you can replace parameters with underscores. Both versions yield the same result.

Many examples of filtering operations use numbers, but filter() applies to any collection. This example applies filter() to a list of words to determine the three-letter words:

val words = List("the", "quick", "brown", "fox", "jumped", "over", "the", "lazy", "dog")
words filter (_.length == 3)
// List(the, fox, the, dog)

Another filtering variant in Scala is the partition() function, which modifies a collection by splitting it into multiple parts. The split is based on the higher-order function that you pass to determine the separation criteria. Here, the partition() function returns two lists that are split according to which list members are divisible by 3:

numbers partition (_ % 3 == 0)
// (List(3, 6, 9),List(1, 2, 4, 5, 7, 8, 10))

The filter() function returns a collection of matching elements, whereas find() returns only the first match:

numbers find (_ % 3 == 0)
// Some(3)

However, the return value for find() isn't the matched value itself, but rather one that's wrapped in an Option class. Option has two possible values: Some or None. Scala, like some other functional languages, uses Option as a convention to avoid returning null in the absence of a value. The Some() instance wraps the actual return value, which is 3 in the case of numbers find (_ % 3 == 0). If I try to find something that doesn't exist, the return is None:

numbers find (_ < 0)
// None

Scala also includes several functions that process a collection based on a predicate function and either retain values or discard them. The takeWhile() function returns the largest set of values from the front of the collection that satisfy the predicate function:

List(1, 2, 3, -4, 5, 6, 7, 8, 9, 10) takeWhile (_ > 0)
// List(1, 2, 3)

The dropWhile() function skips the largest number of elements that satisfy the predicate:

words dropWhile (_ startsWith "t")
// List(quick, brown, fox, jumped, over, the, lazy, dog)

Groovy

Groovy isn't considered a functional language, but it contains many functional paradigms — some with names that are derived from scripting languages. For example, the function that's traditionally called filter() in functional languages is Groovy's findAll() method:

(1..10).findAll {it % 3 == 0}
// [3, 6, 9]

Like Scala's filter functions, Groovy's work on all types, including strings:

def words = ["the", "quick", "brown", "fox", "jumped", "over", "the", "lazy", "dog"]
words.findAll {it.length() == 3}
// [The, fox, the, dog]

Groovy also has a partition()-like function called split():

(1..10).split {it % 3}
// [[1, 2, 4, 5, 7, 8, 10], [3, 6, 9]]

The return value of the split() method is a nested array, like the nested list in Scala that's returned from partition().

Groovy's find() method returns the first match from the collection:

(1..10).find {it % 3 == 0}
// 3

Unlike Scala, Groovy follows Java conventions and returns null when find() fails to find an element:

(1..10).find {it < 0}
// null

Groovy also has takeWhile() and dropWhile() methods with similar semantics to Scala's versions:

[1, 2, 3, -4, 5, 6, 7, 8, 9, 10].takeWhile {it > 0}
// [1, 2, 3]
words.dropWhile {it.startsWith("t")}
// [quick, brown, fox, jumped, over, the, lazy, dog]

As in the Scala example, dropWhile acts as a specialized filter: It drops the largest prefix that matches the predicate, filtering only the first part of the list:

def moreWords = ["the", "two", "ton"] + words
moreWords.dropWhile {it.startsWith("t")}
// [quick, brown, fox, jumped, over, the, lazy, dog]

Clojure

Clojure has an astounding number of collection-manipulation routines. Many of them are very generic because of Clojure's dynamic typing. Many developers gravitate toward Clojure because of the richness and flexibility of its collection libraries. Clojure uses traditional functional programming names, as shown by the (filter ) function:

(def numbers (range 1 11))
(filter (fn [x] (= 0 (rem x 3))) numbers)
; (3 6 9)

Like the other languages, Clojure has a terser syntax for simple anonymous functions:

(filter #(zero? (rem % 3)) numbers)
; (3 6 9)

And as in the other languages, Clojure's functions work on any applicable type, such as strings:

(def words ["the" "quick" "brown" "fox" "jumped" "over" "the" "lazy" "dog"])
(filter #(= 3 (count %)) words)
; (the fox the dog)

Clojure's return type for (filter ) is a Seq, which is delineated by parentheses. Seq is the core abstraction for sequential collections in Clojure.


Map

The second major functional transformation that's common to all Java.next languages is map. A map function accepts a higher-order function and a collection, then applies the passed function to each element and returns a collection. The returned collection (unlike with filtering) is the same size as the original collection, but with updated values.

Scala

Scala's map() function accepts a code block and returns the transformed collection:

List(1, 2, 3, 4, 5) map (_ + 1)
// List(2, 3, 4, 5, 6)

The map() function works on all applicable types, but it doesn't necessarily return a transformed collection of the elements of the collection. In this example, I return a list of the sizes of all elements in a string:

words map (_.length)
// List(3, 5, 5, 3, 6, 4, 3, 4, 3)

Nested lists occur so often in functional programming languages that library support for de-nesting — typically called flattening — is common. Here is an example of flattening a nested list:

List(List(1, 2, 3), List(4, 5, 6), List(7, 8, 9)) flatMap (_.toList)
// List(1, 2, 3, 4, 5, 6, 7, 8, 9)

The resulting List contains just the elements, with the extra infrastructure removed. The flatMap function also works on data structures that might not seem nested in the traditional way. For example, you can consider a string as a series of nested characters:

words flatMap (_.toList)
// List(t, h, e, q, u, i, c, k, b, r, o, w, n, f, o, x, ...

Groovy

Groovy also includes several map variants called collect(). The default variant accepts a code block to apply to each element of the collection:

(1..5).collect {it += 1}
// [2, 3, 4, 5, 6]

Like the other languages, Groovy allows shorthand for simple anonymous higher-order functions; the it reserved word stands in for the lone parameter.

The collect() method works on any collection to which you can supply a reasonable predicate, such as a list of strings:

def words = ["the", "quick", "brown", "fox", "jumped", "over", "the", "lazy", "dog"]
words.collect {it.length()}
// [3, 5, 5, 3, 6, 4, 3, 4, 3]

Groovy also has a method similar to flatMap() that collapses inner structure, called flatten():

[[1, 2, 3], [4, 5, 6], [7, 8, 9]].flatten()
// [1, 2, 3, 4, 5, 6, 7, 8, 9]

The flatten() method also works on nonobvious collections such as strings:

(words.collect {it.toList()}).flatten()
// [t, h, e, q, u, i, c, k, b, r, o, w, n, f, o, x, j, ...

Clojure

Clojure includes a (map ) function that accepts a higher-order function (which includes operators) and a collection:

(map inc numbers)
; (2 3 4 5 6 7 8 9 10 11)

The first parameter for (map ) can be any function that accepts a single parameter: a named function, an anonymous function, or a preexisting function such as inc that increments its argument. The more typical anonymous syntax is illustrated in this example, which generates a collection of the lengths of the words in a string:

(map #(count %) words)
; (3 5 5 3 6 4 3 4 3)

Clojure's (flatten ) function is similar to Groovy's:

(flatten [[1 2 3] [4 5 6] [7 8 9]])
; (1 2 3 4 5 6 7 8 9)

Fold/reduce

The third common function has the most variations in name, and many subtle differences, among the three Java.next languages. foldLeft and reduce are specific variations on a list-manipulation concept called a catamorphism, which is a generalization of list folding. In this case, a "fold left" means:

  1. Use a binary function or operator to combine the first element of the list with the second element to create a new first element.
  2. Repeat Step 1 until the list is exhausted and you are left with a single element.

Notice that this is exactly what you do when you sum a list of numbers: start with zero, add the first element, take that result and add it to the second, and continue until the list is consumed.

Scala

Scala has the richest set of fold operations, in part because it facilitates several typing scenarios that don't appear in the dynamically typed Groovy and Clojure. Reduce is commonly used to perform sums:

List.range(1, 10) reduceLeft((a, b) => a + b)
// 45

The function that's supplied to reduce() is typically a function or operator that accepts two arguments and returns a single result, thereby consuming the list. You can use Scala's syntactic sugar to shorten the function definition:

List.range(1, 10).reduceLeft(0)(_ + _)
// 45

The reduceLeft() function assumes that the first element is the left side of the operation. For operators such as plus, the placement of operands is irrelevant, but order matters for operations such as divide. If you want to reverse the order in which the operator is applied, use reduceRight():

List.range(1, 10) reduceRight(_ - _)
// 5

Understanding when you can use higher-level abstractions such as reduce is one of the keys to mastery of functional programming. This example uses reduceLeft() to determine the longest word in a collection:

words.reduceLeft((a, b) => if (a.length > b.length) a else b)
// jumped

The reduce and fold operations have overlapping functionality, with subtle distinctions that go beyond the scope of this article. However, one obvious difference suggests common uses. In Scala, the signature of reduceLeft[B >: A](op: (B, A) => B): B shows that the only parameter that's expected is the function to combine elements. The initial value is expected to be the first value in the collection. In contrast, the signature of foldLeft[B](z: B)(op: (B, A) => B): B indicates an initial seed value for the result, so you can return types that are different from the type of the list elements.

Here's an example summing a collection by using foldLeft:

List.range(1, 10).foldLeft(0)(_ + _)
// 45

Scala supports operator overloading, so the two common fold operations, foldLeft and foldRight, have corresponding operators: /: and :\ respectively. Thus, you can create a terser version of sum by using foldLeft:

(0 /: List.range(1, 10)) (_ + _)
// 45

Similarly, to find the cascading difference between each member of a list (the reverse of a sum operation, admittedly a rare requirement) you can use either the foldRight() function or the :\ operator:

(List.range(1, 10) :\ 0) (_ - _)
// 5

Groovy

Groovy's entry in the reduce category uses overloading to support the same functionality as Scala's reduce() and foldLeft() options. One version of the function accepts an initial value. This example uses the inject() method to generate a sum over a collection:

(1..10).inject {a, b -> a + b}
// 55

The alternate form accepts an initial value:

(1..10).inject(0, {a, b -> a + b})
// 55

Groovy has a much smaller functional library than Scala or Clojure — not surprising in view of the fact that Groovy is a multiparadigm language that doesn't emphasize functional programming.

Clojure

Clojure is primarily a functional programming language, so it supports (reduce ). The (reduce ) function accepts an optional initial value to cover both the reduce() and foldLeft() cases that Scala handles. The (reduce ) function brings no surprises. It accepts a function that expects two arguments and a collection:

(reduce + (range 1 11))
; 55

Clojure includes advanced support for reduce-like functionality in a library called reducers, which I cover in an upcoming installment.


Conclusion

Part of the challenge of learning a different paradigm (such as functional programming) is learning new terminology. That effort is complicated when different communities use different terms. But once you grasp the similarities, you see that all three Java.next languages offer overlapping functionality in syntactically surprising ways.

In the next installment, I explore memoization in the Java.next languages and discuss how functional features used in combination can lead to concise power.

Resources

Learn

  • Scala: Scala is a modern, functional language on the JVM.
  • Groovy: Groovy is a dynamic variant of the Java language, with updated syntax and capabilities.
  • Clojure: Clojure is a modern, functional Lisp that runs on the JVM.
  • reducers: This powerful library for Clojure adds automatic parallelism for reduce operations.
  • Functional thinking: Explore functional programming in Neal Ford's column series on developerWorks.
  • Language designer's notebook: In this developerWorks series, Java Language Architect Brian Goetz explores some of the language design issues that have presented challenges for the evolution of the Java language in Java SE 7, Java SE 8, and beyond.
  • developerWorks Java technology zone: Find hundreds of articles about every aspect of Java programming.

Get products and technologies

Discuss

  • Get involved in the developerWorks community. Connect with other developerWorks users as you explore the developer-driven blogs, forums, groups, and wikis.

Comments

developerWorks: Sign in

Required fields are indicated with an asterisk (*).


Need an IBM ID?
Forgot your IBM ID?


Forgot your password?
Change your password

By clicking Submit, you agree to the developerWorks terms of use.

 


The first time you sign into developerWorks, a profile is created for you. Information in your profile (your name, country/region, and company name) is displayed to the public and will accompany any content you post, unless you opt to hide your company name. You may update your IBM account at any time.

All information submitted is secure.

Choose your display name



The first time you sign in to developerWorks, a profile is created for you, so you need to choose a display name. Your display name accompanies the content you post on developerWorks.

Please choose a display name between 3-31 characters. Your display name must be unique in the developerWorks community and should not be your email address for privacy reasons.

Required fields are indicated with an asterisk (*).

(Must be between 3 – 31 characters.)

By clicking Submit, you agree to the developerWorks terms of use.

 


All information submitted is secure.

Dig deeper into Java technology on developerWorks


static.content.url=http://www.ibm.com/developerworks/js/artrating/
SITE_ID=1
Zone=Java technology
ArticleID=961121
ArticleTitle=Java.next: Overcome synonym suffering
publish-date=01282014