Running the example

Complete the following steps to run the WordCount example.

Procedure

  1. Compile WordCount.java and create a .jar file:
    $ export JAVA_HOME=/nz/export/ae/languages/java/java_sdk/host
    $ export MR_HOME=/nz/export/ae/products/netezza/mapreduce/current
    $ mkdir wordcount_classes
    $ ${JAVA_HOME}/bin/javac \
        -classpath ${MR_HOME}/mapreduce.jar \
        -d wordcount_classes \
        WordCount.java
    $ ${JAVA_HOME}/bin/jar -cvf wordcount.jar -C wordcount_classes/ .
  2. Create a mapreduce_db database:
    CREATE DATABASE mapreduce_db;
  3. Connect to mapreduce_db and seed table wordcount_input with sample data:
    CREATE TABLE wordcount_input(
        id int,
        text varchar(100)
    );
    INSERT INTO wordcount_input VALUES(1, 'Hello World Bye World');
    INSERT INTO wordcount_input VALUES(2, 'Hello mapreduce Goodbye mapreduce');
    INSERT INTO wordcount_input VALUES(3, 'Hello INZA');
  4. Run the application with the following command:
    $ ${MR_HOME}/bin/mapreduce jar wordcount.jar WordCount \
        mapreduce_db \
        wordcount_input id text \
        wordcount_output word count
  5. Show the output table:
    $ nzsql mapreduce_db -c "select * from wordcount_output"
       WORD    | COUNT
    -----------+-------
     Hello     |     3
     INZA      |     1
     World     |     2
     mapreduce |     2
     Bye       |     1
     Goodbye   |     1
    (6 rows)
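
To sanity-check the counts shown above without a database, the same three sample rows can be counted locally with standard POSIX tools. This is only a rough sketch, not part of the product: the function names count_words and sample_rows are hypothetical, and it assumes that WordCount splits the text column on whitespace, which the procedure above does not state explicitly. Row order in the local output depends on the sort locale and may differ from the order that nzsql returns.

```shell
# Hypothetical helper: read text on stdin and print "word count" pairs.
# Assumes words are separated by whitespace only.
count_words() {
  tr -s '[:space:]' '\n' | sed '/^$/d' | sort | uniq -c | awk '{ print $2, $1 }'
}

# The three text values inserted into wordcount_input in step 3.
sample_rows() {
  printf '%s\n' \
    'Hello World Bye World' \
    'Hello mapreduce Goodbye mapreduce' \
    'Hello INZA'
}

sample_rows | count_words
```

If the local counts and the wordcount_output table disagree, a likely cause is a difference between the data that was actually loaded and the sample rows listed in step 3.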