Ranger policy with Kerberos security on Hive warehouse cases

Test case name Step Description
Hive data warehouse Ranger policy setup 1 Assign RWX on /user/hivedirectory for hdp-user1 on HDP (local native HDFS) and IBM Storage Scale HDFS Transparency cluster using Ranger UI.
DDL operations
  1. LOAD data local inpath
  2. INSERT into table
  3. INSERT Overwrite TABLE
1 Drop remote database if EXISTS cascade.
2 Create remote_db with hive warehouse on remote IBM Storage Scale HDFS Transparency cluster.
3 Create internal nonpartitioned table on remote_db.
4 LOAD data local inpath into table created in the above step.
5 Create internal nonpartitioned table on remote IBM Storage Scale HDFS Transparency cluster.
6 LOAD data local inpath into table created in the above step.
7 Create internal transactional table on remote_db.
8 INSERT into table from internal nonpartitioned table.
9 Create internal partitioned table on remote_db.
10 INSERT OVERWRITE TABLE from internal nonpartitioned table.
11 Create external nonpartitioned table on remote_db.
12 Drop local database if EXISTS cascade.
13 Create local_db with hive warehouse on local native HDFS cluster.
14 Create internal nonpartitioned table on local_db.
15 LOAD data local inpath into table created in the above step.
16 Create internal nonpartitioned table on local native HDFS cluster.
17 LOAD data local inpath into table created in the above step.
18 Create internal transactional table on local_db.
19 INSERT into table from internal nonpartitioned table.
20 Create internal partitioned table on local_db.
21 INSERT OVERWRITE TABLE from internal nonpartitioned table.
22 Create external nonpartitioned table on local_db.

DML operations

1. Query local

database

tables

2. Query remote

database

tables

1 Query data from local external nonpartitioned table.
2 Query data from local internal nonpartitioned table.
3 Query data from local nonpartitioned remote data table.
4 Query data from local internal partitioned table.
5 Query data from local internal transactional table.
6 Query data from remote external nonpartitioned table.
7 Query data from remote internal nonpartitioned table.
8 Query data from remote nonpartitioned remote data table.
9 Query data from remote internal partitioned table.
10 Query data from remote internal transactional table.
JOIN tables in local database 1 JOIN external nonpartitioned table with internal nonpartitioned table.
2 JOIN internal nonpartitioned table with internal nonpartitioned remote table.
3 JOIN internal nonpartitioned remote table with internal partitioned table.
4 JOIN internal partitioned table with internal transactional table.
5 JOIN internal transactional table with external nonpartitioned table.
JOIN tables in remote database 1 JOIN external nonpartitioned table with internal nonpartitioned table.
2 JOIN internal nonpartitioned table with internal nonpartitioned remote table.
3 JOIN internal nonpartitioned remote table with internal partitioned table.
4 JOIN internal partitioned table with internal transactional table.
5 JOIN internal transactional table with external nonpartitioned table.
JOIN tables between local_db and remote_db 1 JOIN local_db external nonpartitioned table with remote_db internal nonpartitioned table.
2 JOIN local_db internal nonpartitioned table with remote_db internal nonpartitioned remote table.
3 JOIN local_db internal nonpartitioned remote table with remote_db internal partitioned table.
4 JOIN local_db internal partitioned table with remote_db internal transactional table.
5 JOIN local_db internal transactional table with remote_db external nonpartitioned table.
Local temporary table from remote_db table 1 Create temporary table on local_db AS select query from remote_db table.
2 Query data from temporary table.

IMPORT and EXPORT

operations

1 EXPORT local_db internal partitioned table to remote IBM Storage Scale HDFS Transparency cluster location.
2 List the directory/file created on remote Hadoop cluster by EXPORT operation.
3 IMPORT table to create table in local_db from the EXPORT data on remote HDFS Transparency cluster.
4 List the directory/file created on the local Hadoop cluster by IMPORT operation.
5 Query data from local_db table created by IMPORT operation.
6 EXPORT remote_db external table to the local HDP (local native HDFS) Hadoop cluster location.
7 List the directory/file created on the local Hadoop cluster by EXPORT operation.
8 IMPORT table to create table on remote_db from the EXPORT data on the local Hadoop cluster.
9 List directory/file created on remote HDFS Transparency cluster by IMPORT operation.
10 Query data from remote_db table created by the IMPORT operation.
Table-level and column-level statistics 1 Run table-level statistics command on external nonpartitioned table.
2 Run DESCRIBE EXTENDED to check the statics of the nonpartitioned table.
3 Run column-level statistics command on internal partitioned table.
4 Run DESCRIBE EXTENDED to check the statics of the partitioned table.