IBM Support

Starting Flume agents by using the Ambari web interface - Hadoop Dev

Technical Blog Post


The following directions detail the manual installation of software into IBM Open Platform for Apache Hadoop. These directions, and any binaries that may be provided as part of this article (either hosted by IBM or otherwise), are provided for convenience and make no guarantees as to stability, performance, or functionality of the software being installed. Product support for this software will not be provided (including upgrade support for either IOP or the software described). Questions or issues encountered should be discussed on the BigInsights StackOverflow forum or the appropriate Apache Software Foundation mailing list for the component(s) covered by this article.

Apache Flume can be used to efficiently collect, aggregate, and move large amounts of log data from many different sources to a centralized data store. Ambari provides an intuitive, easy-to-use Hadoop management web interface backed by its RESTful APIs. You can use the Ambari web interface to configure Flume and to start, stop, or monitor Flume agents.
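
Because the web interface sits on top of those REST APIs, the same start and stop actions can in principle be scripted. The Python sketch below only builds the request so that you can inspect it before sending anything. The server URL, cluster name, and credentials are hypothetical placeholders, and the helper name build_service_state_request is an assumption made for this example. It acts on the Flume service as a whole rather than on individual agents.

```python
import base64
import json
import urllib.request


def build_service_state_request(base_url, cluster, user, password, state):
    """Build (but do not send) a PUT request that sets the Flume service state.

    In the Ambari REST API, state "STARTED" starts a service and
    "INSTALLED" stops it.
    """
    url = f"{base_url}/api/v1/clusters/{cluster}/services/FLUME"
    body = {
        "RequestInfo": {"context": f"Set FLUME to {state} via REST"},
        "Body": {"ServiceInfo": {"state": state}},
    }
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    headers = {
        "Authorization": f"Basic {token}",
        # Ambari expects this header on requests that modify state.
        "X-Requested-By": "ambari",
    }
    return urllib.request.Request(
        url, data=json.dumps(body).encode(), headers=headers, method="PUT"
    )


if __name__ == "__main__":
    # Hypothetical server, cluster name, and credentials.
    req = build_service_state_request(
        "http://ambari.example.com:8080", "mycluster", "admin", "admin", "STARTED"
    )
    print(req.get_method(), req.full_url)
    # To actually send the request: urllib.request.urlopen(req)
```

Keeping the request construction separate from the network call makes it easy to review the URL and payload against your own cluster before anything is changed.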

Flume agents that are started by Ambari run as the flume user, so any file directory that is referenced in the agent configuration must be accessible to that user. The agent log files are located in /var/log/flume on each Flume node.

This article shows you how to run Flume agents from Ambari on an IBM Open Platform (IOP) 4.2 cluster.

Starting Flume agents from Ambari

  1. Log in to Ambari and verify that the services are running. Click the Flume service; the Summary page displays the overall status. Click the Configs tab to edit the Flume configuration.
    Pic1
  2. The Configs page has two sections: “flume.conf” for the Flume agent configuration, and “Advanced flume-env” for the flume-env.sh file. Expand “flume.conf”, which by default contains a single comment line: # Flume agent config. Add your agent configuration to this field. Click Save, and then click Restart to restart the Flume service so that the changes take effect.
    Pic2
  3. On the Summary page, the newly defined agents should now be listed as running. You can start or stop the Flume agent on a specific host from the drop-down list next to the corresponding Running button. You can also restart, start, or stop the whole Flume service from the Service Actions menu, or monitor channel metrics in the Metrics section.
    Pic3
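
As an illustration of what might go into the flume.conf field in step 2, the Python sketch below holds a minimal single-agent configuration (a netcat source, a memory channel, and a logger sink) and checks that the components are wired consistently before you paste the text into Ambari. The agent name a1, the component names r1, c1, and k1, the port 44444, and the helpers parse_properties and check_agent are assumptions made for this example; they do not come from the article.

```python
# Sample agent definition. All names (a1, r1, c1, k1) and the port are
# illustrative assumptions, not values taken from this article.
SAMPLE_FLUME_CONF = """\
# Flume agent config
a1.sources = r1
a1.channels = c1
a1.sinks = k1

a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444
a1.sources.r1.channels = c1

a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000

a1.sinks.k1.type = logger
a1.sinks.k1.channel = c1
"""


def parse_properties(text):
    """Parse simple 'key = value' lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def check_agent(props, agent):
    """Return a list of wiring problems for one agent (empty if none found)."""
    problems = []
    channels = set(props.get(f"{agent}.channels", "").split())
    for source in props.get(f"{agent}.sources", "").split():
        used = set(props.get(f"{agent}.sources.{source}.channels", "").split())
        if not used:
            problems.append(f"source {source} has no channels")
        for name in sorted(used - channels):
            problems.append(f"source {source} uses undeclared channel {name}")
    for sink in props.get(f"{agent}.sinks", "").split():
        channel = props.get(f"{agent}.sinks.{sink}.channel", "")
        if channel not in channels:
            problems.append(f"sink {sink} uses undeclared channel {channel!r}")
    return problems


if __name__ == "__main__":
    issues = check_agent(parse_properties(SAMPLE_FLUME_CONF), "a1")
    print("OK" if not issues else "\n".join(issues))
```

The check covers only the source-to-channel and sink-to-channel links; it does not validate component types or their individual properties.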

Using configuration groups to start Flume agents with different configurations

The example agents in the previous section run on all hosts. In most cases, Flume flows are more complex, with different agents running on different Flume hosts. From the Ambari web interface, you can divide hosts into configuration groups, and each configuration group can have its own Flume agent configuration. This section shows you how to create configuration groups for Flume agents.

  1. On the Flume Configs page, the default group contains all of the Flume hosts, and all hosts in this default group have the same Flume agent running.
    Pic4
  2. To create a configuration group, click Manage Config Groups -> +. Type a name and description for the new group. Click OK.
    Pic5
  3. Select the new group (in this example, “group1”), then click + under the empty host name box.
    Pic6
  4. Select hosts to include in the configuration group. Click Save.
    Pic7
  5. Create other configuration groups, as appropriate.
    Pic8
  6. You can change the configuration values for a group. For example, select group1 from the group list, click + in the flume.conf section, and then add the Flume agent configuration details.
    Pic9
    Pic10
  7. Expand the Advanced flume-env section, click +, and change the flume-env values, as appropriate. For example, you can adjust the memory settings or add custom JAR files to the class path.
    Pic11
  8. Similarly, you can change the configuration values for another group, such as group2.
    Pic12
  9. On the Summary page, you can now see that the hosts in different configuration groups have different Flume agents running.
    Pic13
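
The effect of the steps above can be pictured with a small conceptual model in Python: a host that belongs to a configuration group receives that group's flume.conf, and every other host falls back to the default configuration. This is only a sketch of the idea, not how Ambari stores or resolves configurations. The host names, the one-line configuration strings, and the function effective_flume_conf are hypothetical, and the model assumes that each host belongs to at most one non-default group.

```python
# Conceptual model only. Host names and configuration text are hypothetical.
DEFAULT_FLUME_CONF = "# Flume agent config"

CONFIG_GROUPS = {
    "group1": {
        "hosts": {"host1.example.com", "host2.example.com"},
        "flume_conf": "agent1.sources = r1",
    },
    "group2": {
        "hosts": {"host3.example.com"},
        "flume_conf": "agent2.sources = r2",
    },
}


def effective_flume_conf(host):
    """Return the flume.conf text that applies to a host in this model."""
    for group in CONFIG_GROUPS.values():
        if host in group["hosts"]:
            # The group's override replaces the default for its member hosts.
            return group["flume_conf"]
    # Hosts outside every custom group keep the default configuration.
    return DEFAULT_FLUME_CONF


if __name__ == "__main__":
    for host in ("host1.example.com", "host3.example.com", "host9.example.com"):
        print(host, "->", effective_flume_conf(host))
```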


UID

ibm16260033