The Big SQL component of BigInsights 3.0 offers powerful SQL-on-Hadoop capabilities for analyzing Big Data. HDFS keeps the Hadoop data safe by replicating data blocks, so the data remains available even when a data node fails.
But what about the metadata used by Big SQL, including the metadata managed by the Hive Metastore? This metadata includes definitions for schemas, tables, statistics, storage formats, functions, and security (to name just a few), all of which Big SQL relies on to provide those capabilities.
This Big SQL v3.0 Metadata Resiliency Guide provides best practices for making your metadata resilient through regular backups and through redundancy of the Catalog that contains the Hive Metastore. The guide also shows you how to recover from a failure of the metadata databases.
Contents
I. Introduction (page 3)
II. First Steps (page 6)
III. Performing Offline Backups (page 11)
IV. Enabling ONLINE Backups (page 15)
V. Setup Automatic Online Backups (page 20)
VI. Restore the Databases from a Backup (page 23)
VII. Purge Old Backups and Archive Logs (page 26)
VIII. Redundancy of the Catalog with HADR (page 29)
IX. Summary (page 41)
X. [Addendum] Detailed Table of Contents (page 43)
Download the complete guide here (PDF):
https://www.ibm.com/developerworks/community/files/app/file/9b9d5ac0-7aea-44a9-8ac8-6046c1468cf8