Monitoring Amazon ElasticMapReduce (EMR)

The Amazon ElasticMapReduce (EMR) sensor is automatically deployed and installed after you install the Instana agent.

For information about the other supported AWS services, see the AWS documentation.

This sensor monitors AWS ElasticMapReduce (EMR) environments and their instances.

Sensor (Data Collection)

Cluster Details

  • Cluster ID
  • Cluster Name
  • Cluster Creation Time
  • Cluster Version
  • Cluster State
  • Grouping zone (region)

Metrics

Cluster Metrics

Name                 Description
Apps Running         The number of applications currently running in the cluster.
Apps Pending         The number of applications pending in the cluster.
Apps Failed          The number of applications that failed in the cluster.
Memory Allocated     The amount of memory allocated to the cluster, in bytes.
Memory Reserved      The amount of memory reserved, in bytes.
Memory Available     The amount of memory available to be allocated, in bytes.
Containers Running   The number of containers in the cluster.

Node Metrics

Name                   Description
Active Nodes           The number of nodes currently running MapReduce tasks within the cluster.
Lost Nodes             The number of nodes that are allocated to MapReduce tasks with a LOST state.
Unhealthy Nodes        The number of nodes that are allocated to MapReduce tasks with an UNHEALTHY state.
Decommissioned Nodes   The number of nodes that are allocated to MapReduce tasks with a DECOMMISSIONED state.

Input/output Metrics

Name                  Description
Bytes Written to S3   The number of bytes written to the S3 bucket by the cluster.
Bytes Read from S3    The number of bytes read from the S3 bucket by the cluster.
HDFS Usage            The percentage of HDFS storage currently being used.
Total Load            The total number of concurrent data transfers.

Required Permissions

  • cloudwatch:GetMetricStatistics
  • cloudwatch:GetMetricData
  • elasticmapreduce:ListClusters
  • elasticmapreduce:DescribeCluster
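
As an illustration, the permissions above could be granted through an IAM policy. The following is a minimal sketch written as an AWS CloudFormation resource in YAML. The logical name InstanaEmrMonitoringPolicy and the choice of CloudFormation are assumptions made for this example and do not come from the Instana documentation. Resource is set to '*' for simplicity; narrow it where your environment allows.

```yaml
# Hypothetical CloudFormation fragment; the logical name is illustrative only.
Resources:
  InstanaEmrMonitoringPolicy:
    Type: AWS::IAM::ManagedPolicy
    Properties:
      Description: Read-only access for monitoring EMR clusters (example)
      PolicyDocument:
        Version: '2012-10-17'
        Statement:
          - Effect: Allow
            Action:
              # The four actions listed under Required Permissions
              - cloudwatch:GetMetricStatistics
              - cloudwatch:GetMetricData
              - elasticmapreduce:ListClusters
              - elasticmapreduce:DescribeCluster
            Resource: '*'
```

The resulting policy would then be attached to the IAM role or user whose credentials the agent uses.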

Configuration

By default, metrics for EMR are pulled every 300 seconds. To change the interval, set cloudwatch_period in the agent configuration file <agent_install_dir>/etc/instana/configuration.yml:

com.instana.plugin.aws.emr:
  cloudwatch_period: 300

To disable monitoring of EMR instances, use the following configuration:

com.instana.plugin.aws.emr:
  enabled: false

Proxy configuration

To configure the EMR sensor to use a proxy, add the following agent configuration settings:

com.instana.plugin.aws.emr:
  proxy_host: 'example.com' # proxy host name or ip address
  proxy_port: 3128 # proxy port
  proxy_protocol: 'HTTP' # proxy protocol: HTTP or HTTPS
  proxy_username: 'username' # OPTIONAL: proxy username
  proxy_password: 'password' # OPTIONAL: proxy password
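
The snippets in this section all use the same top-level key, com.instana.plugin.aws.emr. Because YAML mapping keys are expected to be unique, a reasonable approach is to merge the settings you need under a single key instead of repeating it. The following sketch assumes you want a longer polling period together with a proxy; all values are placeholders.

```yaml
com.instana.plugin.aws.emr:
  cloudwatch_period: 600      # example value; the default is 300 seconds
  proxy_host: 'example.com'   # placeholder proxy host name or IP address
  proxy_port: 3128            # placeholder proxy port
  proxy_protocol: 'HTTP'      # HTTP or HTTPS
```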

Monitoring multiple AWS accounts

Refer to the Monitoring multiple AWS accounts documentation to set up monitoring of multiple AWS accounts with one AWS agent in the same region.

AWS named profiles approach

To override which profiles are used to monitor ElasticMapReduce, use the following configuration:

com.instana.plugin.aws.emr:
  profile_names:
    - 'profile2'
    - 'profile3'

Defining profiles at the service level overrides the global AWS profiles configuration.

AWS STS approach

To override which IAM Roles are used to monitor ElasticMapReduce, use the following configuration:

com.instana.plugin.aws.emr:
  role_arns:
    - 'arn:aws:iam::<account_1_id>:role/<role_1_name>'
    - 'arn:aws:iam::<account_2_id>:role/<role_2_name>'

Defining IAM roles at the service level overrides the global AWS IAM roles configuration.

Filtering

To make configuration easier, you can define which tags to include in discovery or exclude from discovery. Multiple tags can be defined, separated by commas. Each tag is provided as a key-value pair separated by a colon (:). If a tag is defined in both lists (include and exclude), the exclude list has higher priority. If you do not need to filter services, do not define this configuration. You do not have to define all values to enable filtering.

You can specify how often sensors poll the AWS tagged resources by using the tagged-services-poll-rate configuration property (default: 300 seconds).

Tags are only available with the AWS Agent.

To define how often sensors poll the tagged resources, use the following configuration:

com.instana.plugin.aws:
  tagged-services-poll-rate: 60 #default 300

To include services in discovery by tags, use the following configuration:

com.instana.plugin.aws.emr:
  include_tags: # Comma separated list of tags in key:value format (e.g. env:prod,env:staging)

To exclude services from discovery by tags, use the following configuration:

com.instana.plugin.aws.emr:
  exclude_tags: # Comma separated list of tags in key:value format (e.g. env:dev,env:test)

AWS services without tags are monitored by default but can be excluded by setting the include_untagged field to false:

com.instana.plugin.aws.emr:
  include_untagged: false # True value by default
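
Putting the filtering options together, a configuration might look like the following sketch. The tag values are made up for illustration, and the comma-separated key:value form follows the format described in the comments above. Because env:staging appears in both lists, the exclude list has priority and clusters tagged env:staging are not discovered.

```yaml
com.instana.plugin.aws.emr:
  include_tags: 'env:prod,env:staging'    # hypothetical tags to include
  exclude_tags: 'env:staging,team:test'   # hypothetical tags to exclude; takes priority
  include_untagged: false                 # skip clusters that have no tags
```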

Instana Agent Tags

Note that tags are only available with the AWS Agent. More details on using tags are described here.