Datameer

Datameer Analytics Solution (DAS) is a Hadoop-based solution for big data analytics that includes data source integration, storage, an analytics engine and visualization.

About this task

Follow these steps to use Datameer within the MapReduce framework in IBM® Spectrum Symphony:

Procedure

Download Datameer. For details on the Cloudera’s Distribution including Hadoop (CDH) versions supported for the MapReduce framework in IBM Spectrum Symphony, see Supported distributed file systems for MapReduce or YARN integration.

For information on installing Datameer, refer to the Datameer documentation.
Unzip the Datameer distribution package and install Datameer as described in the Datameer installation guide. Modify and run the following commands as they apply to your installation:
export INSTALL_LOCATION=/opt/datameer

cd $INSTALL_LOCATION

unzip das-1.3.7-0.20.2.zip

cd das-1.3.7-0.20.2
Edit the etc/das-env.sh file to provide values for the following properties:
- export DAS_PORT=8081
- export DAS_DB_MODE=hsql-file
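The two settings can also be applied from a script. The following is a minimal sketch, not part of the Datameer distribution: set_export is a hypothetical helper name, and the sketch assumes that etc/das-env.sh uses plain export NAME=value lines.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with Datameer): set "export NAME=value" in an env file.
# Rewrites an existing uncommented export of NAME, or appends one if none exists.
# Assumes the value contains no "|" or "&" characters, which are special to sed here.
set_export() {
    local file=$1 name=$2 value=$3
    if grep -q "^export ${name}=" "$file"; then
        # Write to a scratch file, then copy back so the original file keeps its permissions.
        sed "s|^export ${name}=.*|export ${name}=${value}|" "$file" > "${file}.tmp" &&
            cat "${file}.tmp" > "$file" &&
            rm -f "${file}.tmp"
    else
        printf 'export %s=%s\n' "$name" "$value" >> "$file"
    fi
}
```

For example, from the das-1.3.7-0.20.2 directory, run set_export etc/das-env.sh DAS_PORT 8081 and then set_export etc/das-env.sh DAS_DB_MODE hsql-file.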
Remove the existing Hadoop jar files and create symbolic links for your distribution’s Hadoop files (including Jackson) for HDFS integration:
cd /opt/datameer/das-1.3.7-0.20.2/webapps/ROOT/WEB-INF/lib

mv hadoop-0.20.2-core.jar hadoop-0.20.2-tools.jar /tmp

ln -s /usr/lib/hadoop/lib/jackson-mapper-asl-1.5.2.jar .

ln -s /usr/lib/hadoop/lib/jackson-core-asl-1.5.2.jar .

ln -s /usr/lib/hadoop/hadoop-core-0.20.2-cdh3u2.jar .

ln -s /usr/lib/hadoop/hadoop-tools-0.20.2-cdh3u2.jar .
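Jar names and versions differ between distributions, so a plain ln -s can silently create a dangling link. As a hedged sketch, the hypothetical helper link_jars below (an illustration, not a Datameer or IBM utility) refuses to link a jar that does not exist.

```shell
#!/usr/bin/env bash
# Hypothetical helper: symlink each given jar into a destination directory.
# Stops with an error if a source jar is missing instead of creating a dangling link.
# Pass absolute jar paths, because a relative link target is resolved from the destination.
link_jars() {
    local dest=$1
    shift
    local jar
    for jar in "$@"; do
        if [ ! -f "$jar" ]; then
            echo "missing jar: $jar" >&2
            return 1
        fi
        ln -sf "$jar" "$dest/$(basename "$jar")"
    done
}
```

For example, link_jars . /usr/lib/hadoop/hadoop-core-0.20.2-cdh3u2.jar /usr/lib/hadoop/hadoop-tools-0.20.2-cdh3u2.jar, run from the WEB-INF/lib directory, creates the same two links as the corresponding commands above when both jars are present.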

Edit the $INSTALL_LOCATION/das-1.3.7-0.20.2/bin/conductor.sh file:

Define PMR_LD_PATH and prepend it to the existing JAVA_LIBRARY_PATH value (substitute version, os_type, and ego_version for your installation). For example:

@@ -171,6 +171,8 @@
   JAVA_PLATFORM=`CLASSPATH=${HADOOP_JAR} $JAVA -Xmx32m org.apache.hadoop.util.PlatformName | sed -e "s/ /_/g"`

   JAVA_LIBRARY_PATH=${homeDir}/lib/native/${JAVA_PLATFORM}
+  PMR_LD_PATH="/opt/pmr/soam/mapreduce/version/os_type/lib64:/opt/pmr/soam/mapreduce/version/os_type/lib:/opt/pmr/soam/version/os_type/lib64:/opt/pmr/soam/version/os_type/lib:/opt/pmr/perf/soam/version/linux-x86_64/lib:/opt/pmr/perf/ego/ego_version/linux-x86_64/lib:/opt/pmr/soam/version/os_type/lib:/opt/pmr/ego_version/os_type/lib"
+  JAVA_LIBRARY_PATH=$PMR_LD_PATH:$JAVA_LIBRARY_PATH
 fi

 echo "DeployMode: $DAS_DEPLOY_MODE"
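A long colon-separated value such as PMR_LD_PATH is easy to mistype when edited by hand. As a minimal sketch, the hypothetical helper join_paths below (an assumption for illustration, not part of conductor.sh) builds such a value from a list of directories and drops empty entries.

```shell
#!/usr/bin/env bash
# Hypothetical helper: join directory arguments with ":" and skip empty ones,
# so the result never contains an empty path entry.
join_paths() {
    local out="" dir
    for dir in "$@"; do
        [ -n "$dir" ] || continue
        # Add a ":" separator only when something has already been collected.
        out=${out:+$out:}$dir
    done
    printf '%s\n' "$out"
}
```

For example, PMR_LD_PATH=$(join_paths /opt/pmr/soam/mapreduce/version/os_type/lib64 /opt/pmr/soam/mapreduce/version/os_type/lib) yields the first two entries of the value shown above; the remaining directories are passed the same way.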

Set CUSTOM_JARS to the PMR library directories, change the link prefix for the custom jars, and limit the jar search to the top level of each directory. For example:

@@ -186,11 +188,11 @@

 # Link custom libraries
 if [[ $command == start || $command == restart ]]; then
-    CUSTOM_JARS=${homeDir}/etc/custom-jars
+    CUSTOM_JARS="/opt/pmr/soam/version/os_type/lib /opt/pmr/soam/mapreduce/version/os_type/lib /opt/pmr/soam/mapreduce/version/os_type/lib/cloudera-cdh3u2 /opt/pmr/soam/mapreduce/version/os_type/lib/hadoop-0.20.203"
     JAR_DIRS="${homeDir}/job-jar ${homeDir}/webapps/ROOT/WEB-INF/lib"
-    LINK_PREFIX=das-custom-jar-
+    LINK_PREFIX=a-

-    customJars=`find $CUSTOM_JARS -name "*.jar"`
+    customJars=`find $CUSTOM_JARS -maxdepth 1 -name "*.jar"`
     for dir in $JAR_DIRS; do
         jars=`find $dir -name "$LINK_PREFIX*.jar"`
         for jar in $jars; do
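The CUSTOM_JARS value lists a lib directory together with two of its own subdirectories. Without -maxdepth 1, find descends from lib into those subdirectories (and any others), so their jars are returned more than once. The sketch below illustrates the difference with two hypothetical helper functions; it is not taken from conductor.sh.

```shell
#!/usr/bin/env bash
# Hypothetical helper mirroring the modified find call: list only the jars that sit
# directly in each given directory, without descending into subdirectories.
list_top_level_jars() {
    find "$@" -maxdepth 1 -name '*.jar'
}

# Hypothetical helper: the original, recursive search, shown for comparison.
list_all_jars() {
    find "$@" -name '*.jar'
}
```

Calling both functions with a directory and one of its subdirectories shows the effect: the recursive form reports each jar in the subdirectory twice, while the -maxdepth 1 form reports every jar once.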

Add the following properties to the $INSTALL_LOCATION/$DAS/conf/das-common.properties file:
mapreduce.application.name=application_name

mapreduce.job.login.user=Admin

mapreduce.job.login.password=Admin
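Before starting Datameer, it can help to confirm that all three properties are present. The following is a hedged sketch: missing_properties is a hypothetical helper name, and it assumes one key=value pair per line in the properties file.

```shell
#!/usr/bin/env bash
# Hypothetical check: print every required key that is absent from a properties file.
# Returns non-zero if at least one key is missing.
missing_properties() {
    local file=$1
    shift
    local key rc=0
    for key in "$@"; do
        # Compare the text before the first "=" exactly, ignoring surrounding blanks.
        if ! awk -F= -v k="$key" '
                { name = $1; gsub(/^[ \t]+|[ \t]+$/, "", name); if (name == k) found = 1 }
                END { exit !found }' "$file"; then
            echo "$key"
            rc=1
        fi
    done
    return $rc
}
```

For example, missing_properties das-common.properties mapreduce.application.name mapreduce.job.login.user mapreduce.job.login.password prints nothing and returns 0 when all three keys are set.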
Start Datameer as a user who has permission to submit jobs through the mrsh utility and to write to HDFS:
cd /opt/datameer/das-1.3.7-0.20.2 && bin/conductor.sh start
In DAS, go to Administrator > Hadoop Cluster > Edit and set the Hadoop cluster to use Distributed mode.
Add the datastore and run the import job.