IBM Support

Knox Service Level Authorization - Hadoop Dev

Technical Blog Post


Body

Apache Knox Gateway has a “Service Level Authorization” provider that lets you specify ACL-like authorization rules based on users, groups, and client IP addresses. It is a simple way to enforce authorization at the service level. This post explains the feature in more detail.

Every topology that exposes a service you want to protect needs a “Service Level Authorization” provider definition.

A “Service Level Authorization” provider definition is usually included in the topology by default. In the form shown below it imposes no restrictions, since no ACL has been defined.

    <provider>
        <role>authorization</role>
        <name>AclsAuthz</name>
        <enabled>true</enabled>
    </provider>

Next, we need to discuss how “users”, “groups”, and “IP addresses” are defined for “Service Level Authorization”. “Users” are Knox users' short names, such as “guest”.

To define groups, it is necessary to understand the “Identity Assertion provider”. This provider allows us to map a Knox principal name (short name like “guest”) to an identity used by a Hadoop cluster.

The principal.mapping setting doesn’t map anything by default, but a mapping can be configured if desired. For example, you could map Knox user “guest” to Hadoop user “hdfs” (the XML snippet further down in this article shows this in more detail).

With that mapping in place, when Knox connects to HDFS in Simple authentication mode, “user.name” is set to “hdfs” instead of “guest”.
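To make the effect of principal.mapping concrete, here is a minimal Python sketch of the lookup it implies. It is not Knox's implementation: the function names `parse_principal_mapping` and `effective_user` are hypothetical, the user "alice" is made up for illustration, and the parsing assumes only the simple "knoxUser=clusterUser;" form used in this article.

```python
def parse_principal_mapping(value):
    """Parse a 'knoxUser=clusterUser;' style string into a dict.

    Assumption: only the simple one-to-one form shown in this article is handled.
    """
    mapping = {}
    for entry in value.split(";"):
        entry = entry.strip()
        if not entry:
            continue  # tolerate the trailing ';' in 'guest=hdfs;'
        knox_user, cluster_user = entry.split("=", 1)
        mapping[knox_user.strip()] = cluster_user.strip()
    return mapping


def effective_user(knox_user, mapping):
    """Return the mapped cluster user, or the Knox user unchanged if unmapped."""
    return mapping.get(knox_user, knox_user)


mapping = parse_principal_mapping("guest=hdfs;")
print(effective_user("guest", mapping))  # hdfs
print(effective_user("alice", mapping))  # alice (hypothetical unmapped user)
```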

The second mapping supported by the “Identity Assertion” provider, and the one more relevant to the “Service Level Authorization” provider, is “group.principal.mapping”. This is how Knox users are assigned group membership. Note that this is purely a Knox construct and may bear no relation to cluster groups.

We use client host IP addresses (not hostnames) to indicate where a request for a given service may come from.
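As an illustration of how a group.principal.mapping value such as "*=users;hdfs=admin" could be read, the following Python sketch resolves the groups of a user. It is a sketch under stated assumptions, not Knox code: the helper names are hypothetical, "*" is taken to mean "every user", and accepting comma-separated lists on either side of "=" is an assumption that goes beyond the single example in this article.

```python
def parse_group_mapping(value):
    """Parse 'principal=group;principal=group' into {principal: [groups]}.

    Assumption: comma-separated lists are accepted on both sides of '='.
    """
    mapping = {}
    for entry in value.split(";"):
        entry = entry.strip()
        if not entry:
            continue
        principals, groups = entry.split("=", 1)
        group_list = [g.strip() for g in groups.split(",") if g.strip()]
        for principal in principals.split(","):
            mapping.setdefault(principal.strip(), []).extend(group_list)
    return mapping


def groups_for(user, mapping):
    """Groups granted to everyone ('*') followed by the user's own groups."""
    return mapping.get("*", []) + mapping.get(user, [])


mapping = parse_group_mapping("*=users;hdfs=admin")
print(groups_for("hdfs", mapping))   # ['users', 'admin']
print(groups_for("guest", mapping))  # ['users']
```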

After defining the “authorization” provider and whether it is “enabled”, we add some “param” elements.

There are two types of “param” elements:

  • ${serviceName}.acl
  • ${serviceName}.acl.mode

The first param element, “${serviceName}.acl”, defines the “user” list, “group” list, and “IP address” list to consider for service level authorization.

    <param>
        <name>${serviceName}.acl</name>
        <value>username[,*|username...];group[,*|group...];ipaddr[,*|ipaddr...]</value>
    </param>
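The value format above can be read as three semicolon-separated fields, each holding a comma-separated list. A minimal Python sketch of that split follows. The function name `parse_acl` is hypothetical, and rejecting values that do not have exactly three fields is a choice made for this sketch rather than documented Knox behavior.

```python
def parse_acl(value):
    """Split 'users;groups;ips' into three lists of trimmed entries."""
    fields = value.split(";")
    if len(fields) != 3:
        # Sketch-level choice: insist on all three fields being present.
        raise ValueError("expected three ';'-separated fields: users;groups;ips")
    users, groups, ips = (
        [item.strip() for item in field.split(",") if item.strip()]
        for field in fields
    )
    return {"users": users, "groups": groups, "ips": ips}


acl = parse_acl("hdfs;admin;127.0.0.2,127.0.0.3")
print(acl)
# {'users': ['hdfs'], 'groups': ['admin'], 'ips': ['127.0.0.2', '127.0.0.3']}
```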

The second param element, "${serviceName}.acl.mode", is optional and defines the ACL "mode". The mode can be "AND" or "OR". If "${serviceName}.acl.mode" is not defined, the mode is assumed to be "AND".
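The mode lookup can be sketched in Python as follows. This is an illustration, not Knox's code: `resolve_acl_mode` is a hypothetical helper, and the fallback to a topology-wide "acl.mode" param is an assumption suggested by the "acl.mode" entry in the full example in this article. "WEBHCAT" is used only as a second, illustrative service name.

```python
def resolve_acl_mode(params, service_name):
    """Pick the ACL mode for a service from a dict of provider params.

    Order (assumed): '<service>.acl.mode', then 'acl.mode', then 'AND'.
    """
    for key in (f"{service_name}.acl.mode", "acl.mode"):
        mode = params.get(key)
        if mode:
            return mode.strip().upper()
    return "AND"


params = {"acl.mode": "OR", "WEBHDFS.acl.mode": "AND"}
print(resolve_acl_mode(params, "WEBHDFS"))  # AND
print(resolve_acl_mode(params, "WEBHCAT"))  # OR
print(resolve_acl_mode({}, "WEBHDFS"))      # AND
```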

Here is an example of a full service level definition:

    <provider>
        <role>identity-assertion</role>
        <name>Default</name>
        <enabled>true</enabled>
        <param>
            <name>principal.mapping</name>
            <!-- Here we map Knox user "guest" to cluster user "hdfs", so that the
                 Knox dispatch uses user.name=hdfs in the case of Simple authentication. -->
            <value>guest=hdfs;</value>
        </param>
        <param>
            <!-- The group.principal.mapping of the identity-assertion provider is the
                 part relevant to Service Level Authorization.
                 Here we define that all Knox users belong to the "users" group, and that
                 Knox user "hdfs" (i.e. "guest", per the mapping above) belongs to the "admin" group.
            -->
            <name>group.principal.mapping</name>
            <value>*=users;hdfs=admin</value>
        </param>
    </provider>
    <provider>
        <role>authorization</role>
        <name>AclsAuthz</name>
        <enabled>true</enabled>
        <param>
            <name>acl.mode</name>
            <value>OR</value>
        </param>
        <param>
            <name>WEBHDFS.acl.mode</name>
            <value>AND</value>
        </param>
        <param>
            <name>WEBHDFS.acl</name>
            <value>hdfs;admin;127.0.0.2,127.0.0.3</value>
        </param>
    </provider>


Service Level Param description.

  1. ${ServiceName} is the role name of your service, such as “WEBHDFS”.
  2. ${Users_csv} is a comma-separated list of users.
  3. ${Groups_csv} is a comma-separated list of groups.
  4. ${IP_addresses_csv} is a comma-separated list of IP addresses.
  5. An optional parameter sets the “ACL mode”. If it is not present, the mode defaults to “AND”.
  6. In "AND" mode, logical AND is applied across the three fields of the "value" parameter, "User_CSV;Group_CSV;IP_CSV". Access is restricted to "User_CSV AND Group_CSV AND IP_CSV" ( U ^ G ^ IP ). All three conditions must match.
  7. In "OR" mode, logical OR is applied across the three fields of the "value" parameter, "User_CSV;Group_CSV;IP_CSV". Access is restricted to "User_CSV OR Group_CSV OR IP_CSV" ( U v G v IP ). At least one of the three conditions must match.
  8. "*" means "allow any" in AND ACL mode and "deny any" in OR ACL mode. In other words, in OR mode a field set to "*" never grants access on its own.
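The rules in this list can be sketched as a small Python evaluator. It models only the semantics described here and is not Knox's implementation. All function names are hypothetical, and the user "alice" and the address 10.0.0.1 are made-up test inputs.

```python
WILDCARD = "*"


def parse_acl(value):
    """Split 'users;groups;ips' into three lists of trimmed entries."""
    fields = value.split(";")
    if len(fields) != 3:
        raise ValueError("expected 'users;groups;ips'")
    return [[item.strip() for item in field.split(",") if item.strip()]
            for field in fields]


def field_matches(allowed, candidates, mode):
    """Check one ACL field against the request's candidate values."""
    if WILDCARD in allowed:
        # Rule 8: '*' is "allow any" in AND mode and "deny any" in OR mode.
        return mode == "AND"
    return any(candidate in allowed for candidate in candidates)


def is_authorized(acl_value, mode, user, groups, ip):
    """Evaluate the user, group, and IP checks and combine them by mode."""
    mode = mode.upper()
    if mode not in ("AND", "OR"):
        raise ValueError("mode must be 'AND' or 'OR'")
    acl_users, acl_groups, acl_ips = parse_acl(acl_value)
    checks = [
        field_matches(acl_users, [user], mode),
        field_matches(acl_groups, groups, mode),
        field_matches(acl_ips, [ip], mode),
    ]
    return all(checks) if mode == "AND" else any(checks)


acl = "hdfs;admin;127.0.0.2,127.0.0.3"
groups = ["users", "admin"]
print(is_authorized(acl, "AND", "hdfs", groups, "127.0.0.2"))      # True
print(is_authorized(acl, "AND", "hdfs", groups, "10.0.0.1"))       # False
print(is_authorized(acl, "OR", "hdfs", groups, "10.0.0.1"))        # True
print(is_authorized("guest;*;*", "AND", "guest", [], "10.0.0.1"))  # True
print(is_authorized("guest;*;*", "OR", "alice", [], "10.0.0.1"))   # False
```

With the WEBHDFS ACL from the full example, the first call corresponds to Knox user "guest" (mapped to "hdfs", and therefore in groups "users" and "admin") calling from 127.0.0.2 in AND mode.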

The following picture illustrates the default "AND" ACL behavior.

knox_acl_and

The following picture illustrates the "OR" ACL behavior.

knox_acl_or
