How to use Databases for Elasticsearch to index and search your VPC Flow Logs.

You may have set up VPC Flow Logs for your entire VPC to write to a Cloud Object Storage (COS) bucket. I did this a few weeks ago for a VPC that I use mainly for short-term projects, with no real-world production workload. Recently, I was looking for a way to search through the data in the Flow Logs COS bucket, which had been accumulating for over a month. I knew of this excellent post and associated Python code repository from my colleague Powell Quiring that walk you through ingesting the Flow Logs objects into IBM Log Analysis with LogDNA.

In addition to searching through this data, I envisioned integrating the search results with an ongoing pet development project of mine. In that project, the environment is built and destroyed frequently, and I needed a fast method to bring back the data from the COS bucket. The Python sample referenced above is geared mostly toward handling objects as they are added to the bucket, not toward processing weeks' worth of Flow Logs objects. For my use case, I decided the solution was a tool written in Golang, the same language I was using in my project, that indexes the data in IBM Cloud Databases for Elasticsearch.

In this post, I will share my configuration and a sample of the code I am using to index VPC Flow Logs to Elasticsearch.  

Deployment scenario

As mentioned earlier, I started with an existing VPC and already had a Flow Logs collector configured to a COS bucket. If you are new to VPC or VPC Flow Logs, you can read through Powell’s post and code sample — he provides the steps and scripts to generate VPC resources and configure Flow Logs. You can also check out our more extensive tutorials on VPC.

  1. A Flow Logs collector is configured for the VPC.
  2. The collector interfaces with IBM Cloud Object Storage and writes to the “flowlogs” bucket.
  3. A Databases for Elasticsearch instance is provisioned for indexing and searching the Flow Logs.
  4. A second COS bucket “indexed-flowlogs” is created to store objects that have already been indexed in Elasticsearch.
  5. The vpc-flowlogs-elasticsearch tool is configured to read the objects from the flowlogs bucket…
  6. …, index to Elasticsearch …
  7. … and write the indexed objects to the indexed-flowlogs bucket and delete them from the flowlogs bucket.
  8. Once the data is indexed, the tool, Postman, or any other application can be used to query Elasticsearch.
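
To make the indexing step more concrete, here is a minimal sketch in Go of turning one Flow Logs object into a request body for the Elasticsearch _bulk endpoint. It is not the code from the vpc-flowlogs-elasticsearch tool. The struct fields (flow_logs, start_time, initiator_ip, target_ip, target_port, action) are an assumed subset of the Flow Logs object format, and the index name "flowlogs" is a hypothetical choice. The sketch only uses the standard library and stops at building the body; reading from COS and sending the request are left out.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// FlowLogObject mirrors an assumed subset of one Flow Logs object stored in COS.
type FlowLogObject struct {
	FlowLogs []FlowLog `json:"flow_logs"`
}

// FlowLog holds a few per-connection fields. The field names are assumptions
// made for this sketch and should be checked against real objects.
type FlowLog struct {
	StartTime   string `json:"start_time"`
	InitiatorIP string `json:"initiator_ip"`
	TargetIP    string `json:"target_ip"`
	TargetPort  int    `json:"target_port"`
	Action      string `json:"action"`
}

// buildBulkBody parses one object and returns a newline-delimited JSON body
// for the _bulk endpoint, plus the number of documents it contains.
func buildBulkBody(index string, object []byte) ([]byte, int, error) {
	var obj FlowLogObject
	if err := json.Unmarshal(object, &obj); err != nil {
		return nil, 0, fmt.Errorf("parse flow log object: %w", err)
	}

	var buf bytes.Buffer
	enc := json.NewEncoder(&buf) // Encode writes each value followed by a newline
	for _, fl := range obj.FlowLogs {
		// Action line: tells Elasticsearch to index the next line into `index`.
		action := map[string]map[string]string{"index": {"_index": index}}
		if err := enc.Encode(action); err != nil {
			return nil, 0, err
		}
		// Source line: the document itself.
		if err := enc.Encode(fl); err != nil {
			return nil, 0, err
		}
	}
	return buf.Bytes(), len(obj.FlowLogs), nil
}

func main() {
	// Hypothetical sample object with a single flow log entry.
	sample := []byte(`{"flow_logs":[{"start_time":"2020-06-01T10:00:00Z",` +
		`"initiator_ip":"192.0.2.10","target_ip":"10.240.0.4",` +
		`"target_port":22,"action":"rejected"}]}`)

	body, n, err := buildBulkBody("flowlogs", sample)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%d document(s)\n%s", n, body)
}
```

Sending many documents per bulk request rather than one request per flow log is what makes it practical to catch up on weeks of accumulated objects. A real run would also check the bulk response for per-item errors before moving the object to the indexed-flowlogs bucket.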

Downloading, configuring, and running the tool

The tool demonstrates how to do the following:

The source code is available on GitHub, and you can copy and tailor it to your needs. The README.md in the repository will guide you on how to create a Databases for Elasticsearch instance, clone the repository, configure, and run the tool.  

Once configured to interact with your COS and Elasticsearch instances, indexing is a simple command: 

Searching can be performed straight from the tool using some pre-configured queries (are you one of the IPs trying to hack my server?):

You can also use Postman or another client; some example Elasticsearch queries are also made available in the repository. 
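
As an illustration of what such a query might look like, the sketch below builds a search body in Go that filters for rejected connections from a single initiator IP. It is not one of the queries shipped in the repository. The field names action and initiator_ip, the value "rejected", and the helper name rejectedFromIP are assumptions for this example, and the term filters assume those fields are mapped as keyword (or ip) rather than analyzed text. The resulting JSON could be sent with Postman or any HTTP client as the body of a POST to the index's _search endpoint.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// rejectedFromIP returns a search body that matches rejected connections
// initiated by the given IP address. Field names are assumptions.
func rejectedFromIP(ip string, size int) ([]byte, error) {
	query := map[string]interface{}{
		"size": size, // maximum number of hits to return
		"query": map[string]interface{}{
			"bool": map[string]interface{}{
				// Filters match exactly and do not affect scoring.
				"filter": []interface{}{
					map[string]interface{}{
						"term": map[string]interface{}{"action": "rejected"},
					},
					map[string]interface{}{
						"term": map[string]interface{}{"initiator_ip": ip},
					},
				},
			},
		},
	}
	return json.Marshal(query)
}

func main() {
	// 192.0.2.10 is a documentation address used as a placeholder.
	body, err := rejectedFromIP("192.0.2.10", 10)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))
}
```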

Questions and feedback

The GitHub repository has an Issues tab where you can comment on the content and code. If you have suggestions or issues, please submit your feedback.
