Contents


Functional thinking

Transformations and optimizations

More functional comparisons across languages

Comments

Content series:

This content is part # of # in the series: Functional thinking

Stay tuned for additional content in this series.

This content is part of the series:Functional thinking

Stay tuned for additional content in this series.

Functional programming has its roots in both mathematics and computer science, both of which have strong opinions about terminology. Language and frameworks designers develop their favorite nomenclature, only to discover that the underlying paradigms already have names. Learning functional programming paradigms is difficult because of the inconsistency of terminology.

In "Tons of transformations," I took the problem of classifying prime numbers and implemented a solution across several functional languages on the JVM and two functional Java frameworks. Continuing that theme, this installment optimizes the previous algorithm in a couple of ways, showing the subsequent changes across languages. This installment, like the last, illustrates the differences in terminology and availability of features across tools and languages. In particular, I exercise map, filter, and memoize for these examples.

Prime-number classification optimized, in pure Java

The stated problem is determining whether a number is a prime number, one whose only factors are 1 and itself. Among the several algorithms that solve this problem, I've chosen to illustrate filtering and mapping in the functional programming world.

In the last installment, I took a naive approach to the algorithm to determine a number's factors, opting for simple code rather than optimally performing code. In this installment, I optimize that algorithm using several different functional concepts. In addition, I optimize each version for the use case in which the class is invoked multiple times to classify the same number.

My original Java code for determining prime numbers appears in Listing 1:

Listing 1. Original Java version of the prime-number classifier
public class PrimeNumberClassifier {
    private Integer number;

    public PrimeNumberClassifier(int number) {
        this.number = number;
    }

    public boolean isFactor(int potential) {
        return number % potential == 0;
    }

    public Set<Integer> getFactors() {
        Set<Integer> factors = new HashSet<Integer>();
        factors.add(1);
        factors.add(number);
        for (Integer i = 2; i < number; i++)
            if (isFactor(i))
                factors.add(i);
        return factors;
    }

    public int sumFactors() {
        int sum = 0;
        for (int i : getFactors())
            sum += i;
        return sum;
    }

    public boolean isPrime() {
        return sumFactors() == number + 1;
    }
}

In Listing 1, the getFactors() method iterates the potential factors from 2 to the number being classified, which is quite inefficient. Consider the fact that factors always occur in pairs; that suggests that when I find one factor, I can ascertain its mate by simple division. Thus, I don't need to iterate all the way to the number; I can instead iterate up to the number's square root, harvesting the factors in pairs. The improved getFactors() method appears within the overall-improved version in Listing 2:

Listing 2. Optimized pure Java version
public class PrimeNumber {
    private Integer number;
    private Map<Integer, Integer> cache;

    public PrimeNumber() {
        cache = new HashMap<Integer, Integer>();
    }

    public PrimeNumber setCandidate(Integer number) {
        this.number = number;
        return this;
    }

    public static PrimeNumber getPrime(int number) {
        return new PrimeNumber().setCandidate(number);
    }

    public boolean isFactor(int potential) {
        return number % potential == 0;
    }

    public Set<Integer> getFactors() {
        Set<Integer> factors = new HashSet<Integer>();
        factors.add(1);
        factors.add(number);
        for (int i = 2; i < sqrt(number) + 1; i++)
            if (isFactor(i)) {
                factors.add(i);
                factors.add(number / i);
            }
        return factors;
    }

    public int sumFactors() {
        int sum = 0;
        if (cache.containsValue(number))
            sum = cache.get(number);
        else
            for (int i : getFactors())
                sum += i;
        return sum;
    }

    public boolean isPrime() {
        return number == 2 || sumFactors() == number + 1;
    }

}

In the getFactors() method in Listing 2, I iterate from 2 to the square root of the number (plus 1, to handle rounding errors) and harvest the factors in pairs. Returning a Set is important in this code because of an edge case concerning perfect-square numbers. Consider the number 16, whose square root is 4. In the getFactors() method, using a List instead of a Set will generate duplicate 4s in the list. Unit tests exist to find edge cases like this!

The other optimization in Listing 2 concerns multiple invocations. If the typical usage of this code is to evaluate the same number many times for primeness, the calculation that the sumFactors() method in Listing 1 performs is inefficient. Instead, in the sumFactors() method in Listing 2, I create a class-wide cache to hold previously calculated values.

Achieving the second optimization requires a little dubious class design, forcing it to be stateful so that the instance can act as the owner of the cache. This could be improved, but the improvement is trivial in subsequent examples, so I won't bother here.

Optimized Functional Java

Functional Java (see Related topics) is a framework to add functional features to Java. The two methods that optimization affects are the getFactors() and sumFactors() methods, whose original (unoptimized) versions appear in Listing 3:

Listing 3. Original Functional Java getFactors() and sumFactors() methods
public List<Integer> getFactors() {
    return range(1, number + 1)
            .filter(new F<Integer, Boolean>() {
                public Boolean f(final Integer i) {
                    return isFactor(i);
                }
            });
}

public int sumFactors() {
    return getFactors().foldLeft(fj.function.Integers.add, 0);
}

The getFactors() method in Listing 3 filters the range of numbers from 1 to the target number plus 1 (because ranges in Functional Java are noninclusive) using the isFilter() method to determine inclusion. The optimized version of the Functional Java prime-number classifier appears in Listing 4:

Listing 4. Optimized Functional Java version
import fj.F;
import fj.data.List;
import java.util.HashMap;
import java.util.Map;
import static fj.data.List.range;
import static fj.function.Integers.add;
import static java.lang.Math.round;

import static java.lang.Math.sqrt;

public class FjPrimeNumber {
    private int candidate;
    private Map<Integer, Integer> cache;

    public FjPrimeNumber setCandidate(int value) {
        this.candidate = value;
        return this;
    }

    public FjPrimeNumber(int candidate) {
        this.candidate = candidate;
        cache = new HashMap<Integer, Integer>();
    }

    public boolean isFactor(int potential) {
        return candidate % potential == 0;
    }

    public List<Integer> getFactors() {
        final List<Integer> lowerFactors = range(1, (int) round(sqrt(candidate) + 1))
                .filter(new F<Integer, Boolean>() {
                    public Boolean f(final Integer i) {
                        return isFactor(i);
                    }
                });
        return lowerFactors.append(lowerFactors.map(new F<Integer, Integer>() {
            public Integer f(final Integer i) {
                return candidate / i;
            }
        }))
        .nub();
    }

    public int sumFactors() {

        if (cache.containsKey(candidate))
            return cache.get(candidate);
        else {
            int sum = getFactors().foldLeft(add, 0);
            cache.put(candidate, sum);
            return sum;
        }
    }

    public boolean isPrime() {
        return candidate == 2 || sumFactors() == candidate + 1;
    }
}

In the getFactors() method in Listing 4, I use the same range() and filter() methods more selectively. The first range harvests the factors up to the square root, using the filter() method as in Listing 3. The second line of the method uses the map() method from Functional Java to generate the factors above the square root. The map() method applies the function to each element in the collection, returning a transformed collection. This list of factors above the square root is appended to the factors below the square root (lowerFactors). The last method call to Functional Java's nub() method converts the list to a set, alleviating the perfect-square duplication problem.

The sumFactors() optimization in Listing 4 uses a cache identically to the pure Java version in Listing 2, implying the same statefulness requirement on the class as in that version.

Optimized Groovy

The original Groovy versions of the getFactors() and sumFactors() methods appear in Listing 5:

Listing 5. Original Groovy getFactors() and sumFactors() methods
public def getFactors() {
  (1..number).findAll { isFactor(it) }.toSet()
}

public def sumFactors() {
  getFactors().inject(0, {i, j -> i + j})
}

In Groovy, the findAll() method filters the range of numbers, and the sumFactors() method uses Groovy's inject() method, applying the code block to each element to reduce the list to one element (which will be the sum, because the code block sums each pair as the reduction operation). The optimized Groovy version of the prime number classifier appears in Listing 6:

Listing 6. Optimized Groovy version
import static java.lang.Math.sqrt

class PrimeNumber {
  static def isFactor(potential, number) {
    number % potential == 0;
  }

  static def factors = { number ->
    def factors = (1..sqrt(number)).findAll { isFactor(it, number) }
    factors.addAll factors.collect { (int) number / it}
    factors.toSet()
  }

  static def getFactors = factors.memoize();

  static def sumFactors(number) {
    getFactors(number).inject(0, {i, j -> i + j})
  }

  static def isPrime(number) {
    number == 2 || sumFactors(number) == number + 1
  }
}

Just as in the Functional Java version, the factors() method in Listing 6 partitions the factors using the square root and converts the resulting list into a set via the toSet() method. The primary difference is the terminology difference between Functional Java and Groovy; in Functional Java, the filter() and foldLeft() methods are synonyms for Groovy's findAll() and inject(), respectively.

The optimization solution in Listing 6 is radically different from the previous Java versions. Rather than add statefulness to the class, I use Groovy's memoize() method. The factors method in Listing 6 is a pure function, meaning that it relies on no state except its parameter. Once that requirement is met, the Groovy runtime can cache the values automatically via the memoize() method, which returns a cached version of the factors() method named getFactors(). This is a great example of the ability of functional programming to reduce the number of mechanisms the developer must maintain, such as caching. I cover memoization more fully in the "Functional design patterns, Part 1" installment of this series.

Optimized Scala

The original Scala versions of the getFactors() and sumFactors() methods appear in Listing 7:

Listing 7. Original Scala factors() and sum() methods
def factors(number: Int) =
  (1 to number) filter (isFactor(number, _))

def sum(factors: Seq[Int]) =
 factors.foldLeft(0)(_ + _)

The code in Listing 7 makes use of the convenient _ placeholder for parameters whose names aren't important. The optimized version of the prime-number classifier appears in Listing 8:

Listing 8. Optimized Scala version
import scala.math.sqrt;

object PrimeNumber {
  def isFactor(number: Int, potentialFactor: Int) =
    number % potentialFactor == 0

  def factors(number: Int) = {
    val lowerFactors = (1 to sqrt(number).toInt) filter (isFactor(number, _))
    val upperFactors = lowerFactors.map(number / _)
    lowerFactors.union(upperFactors)
  }

  def memoize[A, B](f: A => B) = new (A => B) {
    val cache = scala.collection.mutable.Map[A, B]()
    def apply(x: A): B = cache.getOrElseUpdate(x, f(x))
  }

  def getFactors = memoize(factors)

  def sum(factors: Seq[Int]) =
    factors.foldLeft(0)(_ + _)

  def isPrime(number: Int) =
    number == 2 || sum(getFactors(number)) == number + 1
}

The optimized factors() method uses the same technique as previous examples (such as in Listing 3), adapted to Scala's syntax, yielding a straightforward implementation.

Scala does not have a standard memoization facility, although it is suggested as a future addition. It can be implemented numerous ways; an easy implementation relies on Scala's built-in mutable map and its handy getOrElseUpdate() method.

Optimized Clojure

The Clojure versions of the factors and sum-factors methods appear in Listing 9:

Listing 9. Original factors and sum-factors methods
(defn factors [n]
  (filter #(factor? n %) (range 1 (+ n 1))))

(defn sum-factors [n]
  (reduce + (factors n)))

As in the other preoptimized versions, the original Clojure code filters the range of numbers from 1 to the number plus 1, and it uses Clojure's reduce function to apply the + function to each element, yielding the sum. The optimized Clojure prime-number classifier appears in Listing 10:

Listing 10. Optimized Clojure version
(ns primes)

(defn factor? [n, potential]
  (zero? (rem n potential)))

(defn factors [n]
  (let [factors-below-sqrt (filter #(factor? n %) (range 1 (inc (Math/sqrt n))))
        factors-above-sqrt (map #(/ n %) factors-below-sqrt)]
    (concat factors-below-sqrt factors-above-sqrt)))

(def get-factors (memoize factors))

(defn sum-factors [n]
  (reduce + (get-factors n)))

(defn prime? [n]
  (or (= n 2) (= (inc n) (sum-factors n))))

The factors method uses the same optimized algorithm as used in previous examples (such as Listing 3), harvesting the factors below the square root by filtering the range from 1 to the square root plus 1: (filter #(factor? n %) (range 1 (inc (Math/sqrt n)))). The Clojure version uses its own symbol (%) for unnamed parameters, like the Scala version in Listing 8. The #(/ n %) syntax creates an anonymous function, as syntactic-sugar shorthand for (fn [x] (/ n x)).

Clojure includes the ability to memoize pure functions via the memoize function, just as in the Groovy version, making the second optimization trivial to implement.

Conclusion

In this installment as in the last, I illustrated how similar concepts have grown different names across languages and frameworks, as well as built-in abilities. Groovy is the odd language out when it comes to function naming (findAll() rather than the more common filter(), and collect() rather than map(), for example). The presence of memoization makes a huge difference for the ease and safety of implementing caching.

In the next installment, I'll explore laziness more fully across languages and functional frameworks.


Downloadable resources


Related topics

  • The Productive Programmer (Neal Ford, O'Reilly Media, 2008): Neal Ford's book discusses tools and practices that help you improve your coding efficiency.
  • Scala: Scala is a modern, functional language on the JVM.
  • Clojure: Clojure is a modern, functional Lisp that runs on the JVM.
  • Functional Java: Functional Java is a framework that adds many functional language constructs to Java.
  • Practically Groovy: Explore Groovy's functional (and other) features in this developerWorks article series.
  • "Execution in the Kingdom of Nouns" (Steve Yegge, March 2006): An entertaining rant about some aspects of Java language design.
  • Download IBM product evaluation versions and get your hands on application development tools and middleware products from DB2®, Lotus®, Rational®, Tivoli®, and WebSphere®.

Comments

Sign in or register to add and subscribe to comments.

static.content.url=http://www.ibm.com/developerworks/js/artrating/
SITE_ID=1
Zone=Java development
ArticleID=840128
ArticleTitle=Functional thinking: Transformations and optimizations
publish-date=10162012