Installing Filebeat on the Hadoop nodes

You must install Filebeat on all of the Hadoop node computers.

Procedure

  1. Download the Filebeat RPM package.

    For an offline installation, download the RPM on a computer that has internet access by running the following command:

    curl -L -O https://artifacts.elastic.co/downloads/beats/filebeat/filebeat-6.2.4-x86_64.rpm
  2. Copy the RPM file to each Hadoop node computer.
  3. On the Hadoop node computer, go to the directory where you copied the RPM file, and run the following command:
    sudo rpm -vi filebeat-6.2.4-x86_64.rpm
  4. As the root user, open the /etc/filebeat/filebeat.yml file in a text editor.
    1. Update the paths setting so that it points to the log files to collect:
      # Paths that should be crawled and fetched. Glob based paths.
       paths:
       - /home/sifsuser/logs/sifsspark*.log
      
    2. Uncomment the Logstash output section, and replace <kubernetes-master-ip> with the IP address of your Kubernetes master node:
      #----------------------------- Logstash output --------------------------
      output.logstash:
        # The Logstash hosts
        hosts: ["<kubernetes-master-ip>:5045"]
      
    3. Save and close the file.
  5. Start Filebeat by running the following command:
    sudo service filebeat start 

    Filebeat starts sending the logs from the path that you specified in step 4 to the Logstash service that is running on the Kubernetes cluster.

  6. Repeat steps 2 through 5 on each remaining Hadoop node computer.
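
When there are many Hadoop nodes, the per-node steps can be scripted. The following is a minimal sketch and is not part of the documented procedure. It only prints the commands for steps 2, 3, and 5 so that you can review them before you run anything. The node names (hadoop-node1, hadoop-node2), the /tmp staging directory, and the use of scp and ssh are assumptions. Step 4 still requires editing filebeat.yml on each node.

```shell
#!/bin/sh
# Sketch: print (do not run) the per-node commands for steps 2, 3, and 5.
# NODES and the /tmp staging directory are hypothetical; adjust them for your cluster.
RPM="filebeat-6.2.4-x86_64.rpm"
NODES="${NODES:-hadoop-node1 hadoop-node2}"

plan_node() {
  node="$1"
  # Step 2: copy the RPM file to the node.
  printf 'scp %s %s:/tmp/\n' "$RPM" "$node"
  # Step 3: install the RPM from the directory it was copied to.
  printf 'ssh -t %s "cd /tmp && sudo rpm -vi %s"\n' "$node" "$RPM"
  # Step 4 is manual: the paths and output.logstash settings must be edited by hand.
  printf '# edit /etc/filebeat/filebeat.yml on %s (step 4)\n' "$node"
  # Step 5: start the Filebeat service.
  printf 'ssh -t %s "sudo service filebeat start"\n' "$node"
}

# Print the plan for every node in the list.
for node in $NODES; do
  plan_node "$node"
done
```

To use your own host names, set NODES to a space-separated list before you run the script, for example NODES="node-a node-b" sh plan.sh, where plan.sh is a hypothetical file name for the script.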