Set Up Hadoop Impersonation

For Enterprise Steam to act on behalf of logged-in users when launching clusters on Hadoop/YARN, a Hadoop administrator must explicitly allow it to impersonate those users. This requires changes to the Hadoop core-site.xml file. Do not edit core-site.xml manually; instead, use Cloudera Manager, Ambari, or a similar tool that manages the Hadoop configuration.

The Hadoop administrator needs to add the following properties to core-site.xml:

<property>
     <name>hadoop.proxyuser.SERVICEID.hosts</name>
     <value>HOST</value>
</property>
<property>
     <name>hadoop.proxyuser.SERVICEID.groups</name>
     <value>*</value>
</property>

where:

  • SERVICEID is the user ID of the Kerberos principal associated with the Enterprise Steam Kerberos keytab, or the Enterprise Steam service ID (usually steam).
  • HOST is the hostname of the Enterprise Steam server. A wildcard (*) is also accepted.

The following example shows valid core-site.xml changes that allow Enterprise Steam running on steam.mycompany.loc to impersonate any user (replace SERVICEID with the actual service ID):

<property>
     <name>hadoop.proxyuser.SERVICEID.hosts</name>
     <value>steam.mycompany.loc</value>
</property>
<property>
     <name>hadoop.proxyuser.SERVICEID.groups</name>
     <value>*</value>
</property>
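
As an illustration of how the placeholders are filled in, the following is a minimal Python sketch that renders the two properties for a given service ID and host. The function name proxyuser_properties and the service ID steam are assumptions made for this example; they are not part of Enterprise Steam or Hadoop. The output is meant to be pasted into a configuration management tool, not written into core-site.xml directly.

```python
from xml.sax.saxutils import escape


def proxyuser_properties(service_id: str, host: str, groups: str = "*") -> str:
    """Render the two hadoop.proxyuser.* properties as XML text.

    service_id and host correspond to the SERVICEID and HOST placeholders
    described above; this helper is hypothetical and only formats text.
    """
    entries = [
        (f"hadoop.proxyuser.{service_id}.hosts", host),
        (f"hadoop.proxyuser.{service_id}.groups", groups),
    ]
    blocks = []
    for name, value in entries:
        # escape() guards against characters such as & or < in the values.
        blocks.append(
            "<property>\n"
            f"     <name>{escape(name)}</name>\n"
            f"     <value>{escape(value)}</value>\n"
            "</property>"
        )
    return "\n".join(blocks)


if __name__ == "__main__":
    # Assumed values: service ID "steam" on host steam.mycompany.loc.
    print(proxyuser_properties("steam", "steam.mycompany.loc"))
```

Passing a different service ID or "*" as the host produces the matching property names and values in the same layout as the snippets above.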

Additional information about these changes is available in the Hadoop documentation: https://hadoop.apache.org/docs/r2.7.3/hadoop-project-dist/hadoop-common/Superusers.html

In Cloudera Manager

  1. Log in to Cloudera Manager as a Hadoop administrator who is allowed to change the Hadoop configuration.
  2. Go to the HDFS service.
  3. Go to Configuration.
  4. Search for the Cluster-wide Advanced Configuration Snippet (Safety Valve) for core-site.xml setting.
  5. Add an entry with the name hadoop.proxyuser.SERVICEID.hosts and the value HOST, as described in the previous section.
  6. Add an entry with the name hadoop.proxyuser.SERVICEID.groups and the value *, as described in the previous section.
  7. Save the changes.
  8. Deploy the client configuration and restart the cluster.
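
After the configuration has been deployed, it can be useful to confirm that both properties actually appear in the deployed file. The following is a minimal Python sketch of such a check, assuming the service ID is steam; the function names are hypothetical, and the sample XML stands in for the contents of a deployed core-site.xml. It inspects only the file text and does not prove that the running Hadoop services have loaded the new values.

```python
import xml.etree.ElementTree as ET


def read_proxyuser_settings(core_site_xml: str, service_id: str) -> dict:
    """Return the hadoop.proxyuser.<service_id>.* values found in the XML.

    Keys are the property suffixes (for example "hosts" and "groups").
    """
    root = ET.fromstring(core_site_xml)
    prefix = f"hadoop.proxyuser.{service_id}."
    settings = {}
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        if name.startswith(prefix):
            value = (prop.findtext("value") or "").strip()
            settings[name[len(prefix):]] = value
    return settings


def impersonation_configured(core_site_xml: str, service_id: str) -> bool:
    """True when both the hosts and groups properties are present and non-empty."""
    settings = read_proxyuser_settings(core_site_xml, service_id)
    return bool(settings.get("hosts")) and bool(settings.get("groups"))


# Sample content standing in for a deployed core-site.xml (assumed values).
SAMPLE = """<configuration>
  <property>
    <name>hadoop.proxyuser.steam.hosts</name>
    <value>steam.mycompany.loc</value>
  </property>
  <property>
    <name>hadoop.proxyuser.steam.groups</name>
    <value>*</value>
  </property>
</configuration>"""

if __name__ == "__main__":
    print(read_proxyuser_settings(SAMPLE, "steam"))
    print(impersonation_configured(SAMPLE, "steam"))
```

To check a real cluster, the text of the deployed core-site.xml would be passed in place of SAMPLE; the location of that file depends on the distribution and is not specified here.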