Customizing IBM Cloud Private Filebeat nodes for the logging service

You can customize Filebeat to collect system or application logs for a subset of nodes.

The IBM Cloud Private logging service uses Filebeat as the default log collection agent.

Filebeat monitors logs that are produced by workloads, such as containers, on the same node. It extracts the logs and transfers them to the logging server for further processing and storage. If no Filebeat instance runs on a node, the logs from workloads on that node are not streamed to Elasticsearch.

By default, a Filebeat instance on each IBM Cloud Private node collects all application logs for the node. You can use node labels and selectors to customize which nodes run Filebeat.

The Filebeat registry records the position that Filebeat last read in each log file, so that collection can resume from that point. You can configure the Filebeat registry path with the following Helm chart parameter:

filebeat:
  registryHostPath: "/var/lib/icp/logging/filebeat-registry/{{ .Release.Name }}"

For managed logging instances, {{ .Release.Name }} is logging.
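
As a minimal sketch of how the default registry path resolves, the following shell function mimics the substitution of {{ .Release.Name }} in the path. The function name filebeat_registry_path is hypothetical and used only for illustration. Helm performs the real substitution when it renders the chart.

```shell
# Hypothetical helper: mimics how {{ .Release.Name }} is substituted
# into the default registryHostPath value. Helm does the real rendering.
filebeat_registry_path() {
  # $1 = Helm release name
  printf '/var/lib/icp/logging/filebeat-registry/%s\n' "$1"
}

# For a managed logging instance, the release name is "logging".
filebeat_registry_path logging
# prints /var/lib/icp/logging/filebeat-registry/logging
```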

Install the kubectl command line interface. See Accessing your cluster from the Kubernetes CLI (kubectl).

Get a list of IBM Cloud Private nodes by running the following command:

kubectl get nodes --show-labels

The command output resembles the following text:

NAME          STATUS    AGE       VERSION                    LABELS
9.42.24.5     Ready     5h        v1.7.3-11+f747daa02c9ffb   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,gpu/nvidia=NA,kubernetes.io/hostname=9.42.24.5,role=master
9.42.30.64    Ready     4h        v1.7.3-11+f747daa02c9ffb   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,gpu/nvidia=NA,kubernetes.io/hostname=9.42.30.64
9.42.41.109   Ready     4h        v1.7.3-11+f747daa02c9ffb   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,gpu/nvidia=NA,kubernetes.io/hostname=9.42.41.109,management=true

Label the nodes on which to run Filebeat by running the following command. <node_name> is the name of a node that should run Filebeat, and myfilebeat=true is a label that is used later to match that node for the Filebeat deployment. Any label that conforms to Kubernetes label syntax works.
```
kubectl label node <node_name> myfilebeat=true
```
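When several nodes need the label, the command can be wrapped in a small loop. The following is a sketch only: the function names label_filebeat_nodes and list_filebeat_nodes are hypothetical, and the sketch assumes that kubectl is already configured for the cluster. The --selector flag limits the node list to nodes that carry the label, which gives a quick way to confirm the result.

```shell
# Hypothetical helper: apply the myfilebeat=true label to every node
# name that is passed as an argument. Stops at the first failure.
label_filebeat_nodes() {
  for node in "$@"; do
    kubectl label node "$node" myfilebeat=true || return 1
  done
}

# Hypothetical helper: list only the nodes that carry the label.
list_filebeat_nodes() {
  kubectl get nodes --selector=myfilebeat=true
}

# Example, using node names from the sample output:
#   label_filebeat_nodes 9.42.30.64 9.42.41.109
#   list_filebeat_nodes
```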
Get a list of the current Filebeat instances for each architecture. Search for the DaemonSets in the namespace that generate the Filebeat instances by running the following command. <namespace> is the namespace to search.
```
kubectl get ds --namespace=<namespace>
```
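If the namespace hosts many DaemonSets, the list can be narrowed with a filter. The following sketch assumes that the Filebeat DaemonSet names contain the string filebeat, which might not hold for every installation. The function name list_filebeat_daemonsets is hypothetical.

```shell
# Hypothetical helper: print the DaemonSets in a namespace whose names
# contain "filebeat".
# Assumption: the Filebeat DaemonSets include "filebeat" in their names.
list_filebeat_daemonsets() {
  # $1 = namespace to search
  kubectl get ds --namespace="$1" --output=name | grep filebeat
}

# Example:
#   list_filebeat_daemonsets <namespace>
```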
Add the label to the nodeSelector for the Filebeat DaemonSet. The nodeSelector block tells Kubernetes how to match the cluster nodes that should run a particular DaemonSet pod.
1. Open a Filebeat DaemonSet definition in an editor. <filebeat_daemonset> is the name of an active Filebeat DaemonSet and <namespace> is the namespace that hosts the DaemonSet.
```
kubectl edit ds <filebeat_daemonset> --namespace=<namespace>
```
2. Add the myfilebeat=true label to the nodeSelector parameter. Kubernetes now deploys pods for that DaemonSet only to nodes that match all nodeSelector criteria. The result resembles the following text:
```
nodeSelector:
  beta.kubernetes.io/arch: amd64
  myfilebeat: "true"
```
3. Save the file.
4. Verify that the Filebeat DaemonSet is running. <filebeat_daemonset> is the name of the Filebeat DaemonSet that you modified and <namespace> is the namespace that hosts the DaemonSet.
```
kubectl get ds <filebeat_daemonset> --namespace=<namespace>
```
  If the updated Filebeat DaemonSet is running properly, the DESIRED and AVAILABLE instance counts in the command output match.
Repeat the nodeSelector steps (1 to 4) for each remaining Filebeat DaemonSet.
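
For repeated or scripted changes, the nodeSelector edit and the verification can also be done without an interactive editor. The following is a sketch under stated assumptions, not the documented procedure: it uses kubectl patch with the default strategic merge behavior, which adds the myfilebeat key and keeps existing selector entries such as beta.kubernetes.io/arch, and it reads the desiredNumberScheduled and numberAvailable fields from the DaemonSet status. The function names add_filebeat_node_selector and filebeat_ds_ready are hypothetical.

```shell
# Hypothetical helper: add myfilebeat=true to the pod template
# nodeSelector of a DaemonSet. Existing selector keys are kept.
add_filebeat_node_selector() {
  # $1 = DaemonSet name, $2 = namespace
  kubectl patch ds "$1" --namespace="$2" \
    --patch '{"spec":{"template":{"spec":{"nodeSelector":{"myfilebeat":"true"}}}}}'
}

# Hypothetical helper: succeed only when the desired and available
# instance counts of a DaemonSet match.
filebeat_ds_ready() {
  # $1 = DaemonSet name, $2 = namespace
  desired=$(kubectl get ds "$1" --namespace="$2" \
    --output=jsonpath='{.status.desiredNumberScheduled}') || return 1
  available=$(kubectl get ds "$1" --namespace="$2" \
    --output=jsonpath='{.status.numberAvailable}') || return 1
  # numberAvailable is omitted from the status when it is zero.
  available=${available:-0}
  echo "desired=${desired} available=${available}"
  [ -n "$desired" ] && [ "$desired" = "$available" ]
}

# Example:
#   add_filebeat_node_selector <filebeat_daemonset> <namespace>
#   filebeat_ds_ready <filebeat_daemonset> <namespace>
```

Pods can take some time to start after the patch, so the readiness check might need to be rerun before the counts match.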