Qlik Sense integrated with Hortonworks Data Platform (HDP) running on IBM Power Systems

Steps for discovering and visualizing data in HDP on IBM Power Systems using Qlik Sense

Qlik Sense is a business intelligence (BI) solution that helps uncover insights that query-based BI tools can miss. You can search and explore across all your data and pivot your analysis as new ideas surface. Qlik Sense is a cloud-ready data analytics platform that supports the full spectrum of BI use cases, including access to data in Apache Hadoop environments. Validation testing was performed to verify that Qlik Sense can integrate with, and visualize data in, Hortonworks Data Platform (HDP) on IBM® POWER8®. This article provides an overview of the validation tests that were completed.

Objectives

The key objectives for the validation testing of Qlik Sense were to:

  1. Configure Qlik Sense to connect to HDP 2.6 running on an IBM POWER8 processor-based server.
  2. Extract and visualize sample data from the Hadoop Distributed File System (HDFS) of HDP running on a POWER8 processor-based server.

Test environment

This section lists the high-level components included in the test environment.

Qlik Sense

  • Qlik Sense Desktop 11.11.1 for a Microsoft Windows 7 PC
  • Hortonworks ODBC Driver for Apache Hive (v2.1.5.1006 64 bit)
  • A notebook running Windows 7

Hortonworks Data Platform

  • Hortonworks Data Platform (HDP) version 2.6
  • Red Hat Enterprise Linux (RHEL) 7.2
  • Minimum resources: Eight virtual processors, 24 GB memory, 50 GB disk space
  • IBM PowerKVM™
  • IBM POWER8 processor-based server

Deployment architecture

The deployment architecture is quite simple. Qlik Sense and the Hortonworks ODBC driver were installed and run on a Windows 7 notebook. HDP was installed and run on a POWER8 server. Qlik Sense and the ODBC driver were configured to connect to HDP. Data in HDP was accessed and visualized by Qlik Sense. Tests were run in a single-node HDP environment and a multi-node HDP cluster.

Installation and configuration

This section covers installation and configuration of an HDP cluster and the Qlik Sense software.

Installing and configuring an HDP cluster

Here are the high-level steps to install and configure the HDP cluster:

  1. Follow the installation guide for HDP on IBM Power Systems™ to install and configure the HDP cluster. Refer to the Related topics section for more information.
  2. Log in to the Ambari server and ensure that all the services are running.
  3. Monitor and manage the HDP cluster, Hadoop, and related services through Ambari.

Setting up test data and Hive tables

Download the MovieLens and driver test data, create the Hive tables, and copy the data to the table locations in HDFS.

  1. Download the MovieLens data set from MovieLens Dataset (see the citation in Related topics).
  2. Create tables in Hive.
     
    hive> create database movielens;
    OK
    Time taken: 3.063 seconds
    hive> use movielens;
    OK
    Time taken: 0.261 seconds
    hive> 
        > CREATE EXTERNAL TABLE ratings (
        >   userid INT, 
        >   movieid INT,
        >   rating INT, 
        >   tstamp STRING
        > ) ROW FORMAT DELIMITED
        > FIELDS TERMINATED BY '#'
        > STORED AS TEXTFILE
        > LOCATION '/dataset/movielens/ratings';
    OK
    Time taken: 0.578 seconds
    hive> 
        > CREATE EXTERNAL TABLE movies (
        >   movieid INT, 
        >   title STRING,
        >   genres ARRAY<STRING>
        > ) ROW FORMAT DELIMITED
        > FIELDS TERMINATED BY '#'
        > COLLECTION ITEMS TERMINATED BY "|"
        > STORED AS TEXTFILE
        > LOCATION '/dataset/movielens/movies';
    OK
    Time taken: 0.429 seconds
    hive> 
        > CREATE EXTERNAL TABLE users (
        >   userid INT, 
        >   gender STRING, 
        >   age INT,
        >   occupation INT,
        >   zipcode STRING
        > ) ROW FORMAT DELIMITED
        > FIELDS TERMINATED BY '#'
        > STORED AS TEXTFILE
        > LOCATION '/dataset/movielens/users';
    OK
    Time taken: 0.291 seconds
    hive> 
        > CREATE EXTERNAL TABLE occupations (
        >   id INT,
        >   occupation STRING
        > ) ROW FORMAT DELIMITED
        > FIELDS TERMINATED BY '#'
        > STORED AS TEXTFILE
        > LOCATION '/dataset/movielens/occupations';
    OK
    Time taken: 0.336 seconds
    hive>
  3. Load the test data into tables.
     
    hadoop fs -put ratings.t /dataset/movielens/ratings
    hadoop fs -put movies.t /dataset/movielens/movies
    hadoop fs -put users.t /dataset/movielens/users
    hadoop fs -put occupations.t /dataset/movielens/occupations
  4. Verify the data in HDFS.
     
    [root@hdp264build ml-1m]# hadoop fs -ls /dataset/movielens
    Found 4 items
    drwxr-xr-x   - hive hdfs          0 2018-05-30 05:19 /dataset/movielens/movies
    drwxr-xr-x   - hive hdfs          0 2018-05-30 05:19 /dataset/movielens/occupations
    drwxr-xr-x   - hive hdfs          0 2018-05-30 05:19 /dataset/movielens/ratings
    drwxr-xr-x   - hive hdfs          0 2018-05-30 05:19 /dataset/movielens/users
    [root@hdp264build ml-1m]#
  5. Verify the data in Hive tables.
     
    hive> show databases;
    OK
    default
    movielens
    Time taken: 2.559 seconds, Fetched: 2 row(s)
    hive> show tables in movielens;
    OK
    movies
    occupations
    ratings
    users
    Time taken: 0.371 seconds, Fetched: 4 row(s)
    hive>
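
The tables above expect fields separated by '#', with '|' separating the genres in the movies table. The article does not show how the .t files were prepared. The following is a minimal sketch, assuming the raw MovieLens 1M records use "::" between fields and that the .t files were produced by replacing that separator. The function names and the sample record are illustrative only.

```python
# Sketch only: the article does not show how the .t files were prepared.
# Assumption: raw MovieLens 1M records use "::" between fields.

RAW_SEPARATOR = "::"      # separator assumed for the raw MovieLens files
HIVE_SEPARATOR = "#"      # matches FIELDS TERMINATED BY '#'
GENRE_SEPARATOR = "|"     # matches COLLECTION ITEMS TERMINATED BY "|"


def to_hive_line(raw_line: str) -> str:
    """Rewrite one raw record so that Hive can split it on '#'."""
    line = raw_line.rstrip("\n")
    if HIVE_SEPARATOR in line:
        # A '#' inside a field would shift every column after it.
        raise ValueError(f"record already contains {HIVE_SEPARATOR!r}: {line!r}")
    return line.replace(RAW_SEPARATOR, HIVE_SEPARATOR)


def parse_movie(hive_line: str) -> dict:
    """Split a converted record the way the movies table is declared."""
    movieid, title, genres = hive_line.split(HIVE_SEPARATOR)
    return {
        "movieid": int(movieid),               # movieid INT
        "title": title,                        # title STRING
        "genres": genres.split(GENRE_SEPARATOR),  # genres ARRAY<STRING>
    }


# Illustrative record in the assumed raw layout.
sample = "1::Toy Story (1995)::Animation|Children's|Comedy"
converted = to_hive_line(sample)
movie = parse_movie(converted)
print(converted)
print(movie)
```

Checking a few converted records this way before running hadoop fs -put can catch a stray separator early, because Hive reads a column that fails to parse as NULL rather than reporting an error.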

Installing and configuring Hortonworks ODBC driver

Here are the steps to install and configure the ODBC driver:

  1. Download the Hortonworks ODBC driver on Microsoft Windows 7 (see Related topics for the download website).
  2. Install and configure the ODBC driver. Follow the instructions in the tutorial and guide listed in the Related topics section.
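
The values entered in the driver's setup dialog (host, port, HiveServer type, authentication mechanism, and credentials) can also be written down as a DSN-less connection string, which is convenient for recording or scripting the configuration. The following is a minimal sketch. The key names, the driver name, and the AuthMech value are assumptions based on common Hive ODBC driver conventions and must be confirmed against the driver guide listed in Related topics. The host name and password are hypothetical.

```python
# Sketch only. The key names below (Host, Port, HiveServerType, AuthMech,
# Schema, UID, PWD) and the driver name are assumptions; confirm them in the
# driver guide before relying on them.


def odbc_value(value) -> str:
    """Brace-quote a value containing characters that ODBC treats specially."""
    text = str(value)
    if any(ch in text for ch in ";{}"):
        return "{" + text.replace("}", "}}") + "}"
    return text


def build_hive_connection_string(host, password, user="hive",
                                 port=10000, schema="default") -> str:
    """Assemble a DSN-less connection string for a HiveServer2 endpoint."""
    settings = [
        ("Host", host),
        ("Port", port),            # 10000 is the usual HiveServer2 port
        ("HiveServerType", 2),     # HiveServer2, the "HIVE2" service
        ("AuthMech", 3),           # assumed to mean user name and password
        ("Schema", schema),
        ("UID", user),
        ("PWD", password),
    ]
    parts = ["Driver={Hortonworks Hive ODBC Driver}"]  # assumed driver name
    parts += [f"{key}={odbc_value(value)}" for key, value in settings]
    return ";".join(parts)


# Hypothetical host name and password, for illustration only.
conn_str = build_hive_connection_string(
    "hdp.example.com", "secret", schema="movielens")
print(conn_str)
```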

Installing and configuring Qlik Sense

Here are the steps to install and configure Qlik Sense:

  1. Go to the Qlik Sense download page (see Related topics) to download Qlik Sense Desktop Edition on Windows 7.
  2. Follow the instructions to install it on Windows 7.
  3. Follow the instructions in the Qlik Sense Quick Installation Guide (see Related topics). Some of the steps are described in the following configuration section.

Connecting HDP to Qlik Sense

Qlik Sense can fetch data from HIVE2 running on HDP using either of the following methods. In the test, the first method (Method A) was used to ingest data from Hive.

  • Method A: Data loaded to Qlik Sense In-Memory Associative Data Store
  • Method B: Qlik Sense Hybrid Solution – Qlik Sense Direct Discovery on top of Hadoop

Here are the steps to configure the connection between HDP and Qlik Sense.

  1. Launch ODBC Administrator from Windows and add a data source for Hortonworks Hive as shown in Figure 1.
    Figure 1. Hortonworks Hive ODBC Driver setup screens
  2. On Windows 7, launch the Qlik Sense application. Create a new app by clicking Create new app. Enter a name for the app and click Create. Open the newly created app and click Create new connection -> ODBC. Refer to Figure 2 and Figure 3.
    Figure 2. Create a new app
    Figure 3. Qlik Sense - ODBC new connection screen
  3. Connect to the HIVE2 server on the HDP 2.6 instance running on the IBM POWER8 processor-based server, as shown in Figure 3. Select the ODBC data source added from the ODBC Administrator in the previous step. Provide the Hive user name and password (use the Hive DB password and not the Hive UNIX user password). The connection to HIVE2 must succeed before you can continue.

    Note: If you created the Hive DB and tables using the Hive user account, then use hive as the user name together with the Hive DB password.

  4. Click Load data at the upper-right side to load the data from the Hive table to the Qlik Sense application memory, as shown in Figure 4.
    Figure 4. Loading data to the Qlik Sense application memory

    From the drop-down menu at the upper-left side (as shown in Figure 5), select Data load editor and then click Add data.

    Select the required data connection (in this case, ODBC) from the left-side menu (as shown in Figure 6).

    Figure 5. Add data to the new Qlik Sense app
    Figure 6. Add data to the new Qlik Sense App (continued)

    Select the tables that need to be added to the new app and click Add data. Refer to Figure 7. After the data is loaded, it is available for analysis. Figure 8 shows the data loaded in the Qlik Sense app.

    Figure 7. Add data to the new Qlik Sense app (continued)
    Figure 8. Tables in the new Qlik Sense App
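
If the connection to HIVE2 in step 3 does not succeed, one quick thing to rule out is basic network reachability from the Windows notebook to the HDP server. The following is a minimal sketch, assuming HiveServer2 listens on its usual default port 10000. The function name is hypothetical, and a successful TCP connection says nothing about whether the ODBC credentials are correct.

```python
import socket


def hiveserver2_reachable(host: str, port: int = 10000,
                          timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    Port 10000 is assumed as the HiveServer2 default; use the port that is
    actually configured for the cluster.
    """
    try:
        # create_connection resolves the host and attempts the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


# Example call (hypothetical host name), left as a comment so that the
# sketch does not attempt a network connection when it is simply loaded:
#   hiveserver2_reachable("hdp.example.com")
```

A False result points to a firewall, host name, or port problem rather than to the ODBC driver or Qlik Sense configuration.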

Visualization and analysis in Qlik Sense

Here are the steps to visualize and analyze data using Qlik Sense:

  1. Connect the tables (circles) shown in Figure 8 by dragging them toward each other to create an association. After the tables are connected, a new window opens (Figure 9) in which you can select the fields for the association. The associated data looks similar to Figure 10.
    Figure 9. Data association in Qlik Sense
    Figure 10. Data associated in Qlik Sense
  2. You can perform analysis and visualization on the data fetched from the Hive DB. Note that the data is now in memory and analysis is done on the data in memory. Figure 11 and Figure 12 show example visualizations within the Qlik Sense dashboard.
    Figure 11. Qlik Sense visualization example 1
    Figure 12. Qlik Sense visualization example 2
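
Associating two tables on a shared field and then charting an aggregate is, conceptually, a join followed by a group-by over data held in memory. The following is a minimal sketch in plain Python, using a few made-up rows in the shape of the movies and ratings tables. It is not how Qlik Sense is implemented internally. It only illustrates the kind of result that a chart such as "average rating per genre" is built from.

```python
from collections import defaultdict

# Made-up rows in the shape of the Hive tables (not real MovieLens data).
movies = [
    {"movieid": 1, "title": "Movie A", "genres": ["Comedy", "Drama"]},
    {"movieid": 2, "title": "Movie B", "genres": ["Drama"]},
]
ratings = [
    {"userid": 10, "movieid": 1, "rating": 4},
    {"userid": 11, "movieid": 1, "rating": 5},
    {"userid": 10, "movieid": 2, "rating": 2},
]


def average_rating_by_genre(movies, ratings):
    """Associate ratings with movies on movieid, then average per genre."""
    genres_by_movie = {m["movieid"]: m["genres"] for m in movies}
    totals = defaultdict(lambda: [0, 0])  # genre -> [sum of ratings, count]
    for r in ratings:
        # A rating counts once for every genre of the movie it refers to.
        for genre in genres_by_movie.get(r["movieid"], []):
            totals[genre][0] += r["rating"]
            totals[genre][1] += 1
    return {genre: total / count for genre, (total, count) in totals.items()}


result = average_rating_by_genre(movies, ratings)
print(result)
```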

Related topics


publish-date=06082018