Use Flow Logs for VPC to monitor network traffic and troubleshoot configuration and connectivity.
Virtual Private Cloud (VPC) Gen2 lets you deploy compute in a logically isolated virtual network that you define, including selecting your own IP address range and creating subnets and security groups.
IBM Cloud Flow Logs for VPC captures the IP traffic into and out of the network interfaces of the customer-generated VSIs in a VPC and persists it in an IBM Cloud Object Storage (COS) bucket. You can use flow logs to diagnose connectivity issues or monitor traffic that enters and leaves the network interfaces of the VPC instances. This allows you to answer questions like the following:
- Are unexpected TCP ports being accessed on my VSIs?
- Is SSH traffic reaching the VPC but getting rejected?
- Are bad actors trying to access my network?
COS provides an excellent landing place for high-volume, continuously growing storage. It is also possible to ingest this data from the COS bucket into other analysis tools.
In this blog post, you'll learn how to use IBM Cloud SQL Query to analyze flow logs in a serverless way based on the scenario that my colleague Powell Quiring outlined in his excellent post, "Use IBM Log Analysis with LogDNA to Analyze VPC Network Traffic from IBM Cloud Flow Logs for VPC."
With IBM Cloud SQL Query, you only pay for what you use; there are no standing costs and no setup steps. You pay a fee only for the data that is accessed to execute your query. No pre-processing or transformation is necessary, so you can start analyzing your flow logs in minutes.
There are various ways to process flow logs for specific use cases, and all have specific advantages:
- Use IBM Log Analysis with LogDNA to Analyze VPC Network Traffic from IBM Cloud Flow Logs for VPC
- Indexing and Searching VPC Flow Logs in IBM Cloud Databases for Elasticsearch
- Getting More Out of Your VPC Flow Logs with Kentik Network Observability Cloud
Deploying the sample code
The source code implementing these flows is available in GitHub. It comes with scripts to set up Cloud Object Storage. Detailed instructions can be found in the README, but simply start in the Cloud Shell and type the following:
Once you have configured your shell environment with the local.env file, you can start running the scripts. You will need the IBM Cloud CLI, the Cloud Object Storage plugin, the Schematics plugin and the jq command line utility, which are already installed in Cloud Shell:
000-prereqs.sh, which performs basic checks of the target resource group, the target region, the required IBM Cloud plugins and external tools.
010-create-services.sh to create a Cloud Object Storage service, a bucket and SQL Query service.
030-create-vpc.sh to create the VPC, subnets, instances and flow log collectors. Terraform in the IBM Cloud Schematics service is used to create all of the resources except the flow log collector, which is created using the ibmcloud CLI. After the script completes, check out the flow log collector configuration in the IBM Cloud Console.
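As a rough illustration of the kind of tool check a prerequisites step performs, here is a minimal Python sketch. It is not the content of 000-prereqs.sh; the function name `missing_tools` and the tool list are assumptions made for this example, and it only checks that the `ibmcloud` and `jq` executables are on the PATH (it does not check CLI plugins).

```python
import shutil
from typing import Callable, Iterable, List, Optional

# Executables named in the post; the check itself is an illustrative assumption.
REQUIRED_TOOLS = ["ibmcloud", "jq"]


def missing_tools(
    tools: Iterable[str],
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return the tools that cannot be found on the PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools(REQUIRED_TOOLS)
    if missing:
        print("Missing tools: " + ", ".join(missing))
    else:
        print("All required tools found.")
```

The lookup function is passed in as a parameter so that the check can be exercised without the tools being installed.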
VPC Infrastructure observed by the flow log collector:
Flow log collector:
In a few minutes, the COS bucket will start to have flow log objects. A few minutes after that, you can start querying your flow logs in SQL Query.
The final few lines displayed by 030-create-vpc.sh will look something like this. Copy these and keep them handy:
SQL Query analysis
The command curl 22.214.171.124:3000/info shows the private IP address of vsi1, which is 10.240.0.4. The command curl 126.96.36.199:3000/remote curls the private instance from within the public instance, displaying the private IP address of vsi1 (it is private). Attempt to ssh to the public IP address; this will likely hang. We will investigate this soon:
Setting up SQL Query
Navigate to your SQL Query instance and then click Launch SQL Query UI.
Important: Replace these variables in the following steps:
- bucket: The bucket where your flow logs are stored.
- region: The region alias of the bucket that holds your flow logs.
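To keep the two variables consistent across statements, it can help to build the object storage location in one place. The sketch below assumes the `cos://<region>/<bucket>/<prefix>` location form; the helper name `flow_log_location` and the example values are hypothetical.

```python
def flow_log_location(region: str, bucket: str, prefix: str = "") -> str:
    """Build a cos:// location string from the region alias and bucket name."""
    if not region or not bucket:
        raise ValueError("both region and bucket must be set")
    # Strip a leading slash so the result never contains '//' after the bucket.
    return f"cos://{region}/{bucket}/{prefix.lstrip('/')}"


# Hypothetical values; replace them with your own region alias and bucket.
print(flow_log_location("us-south", "my-flowlogs-bucket"))
```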
First, define a table for flow logs:
Then, define a view that will give us a nice flattened view on flow logs and make it easier to work with the data:
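To make the idea of flattening concrete, here is a minimal Python sketch of the same transformation on a single flow log object. It assumes each object carries some header fields plus a nested `flow_logs` array of per-connection records; the sample values and any field names other than `initiator_ip`, `target_ip`, `target_port` and `action` are illustrative assumptions, not taken from the post.

```python
import json
from typing import Any, Dict, Iterator


def flatten_flow_log_object(obj: Dict[str, Any]) -> Iterator[Dict[str, Any]]:
    """Yield one flat row per nested record, repeating the header fields."""
    header = {key: value for key, value in obj.items() if key != "flow_logs"}
    for record in obj.get("flow_logs", []):
        row = dict(header)
        row.update(record)
        yield row


# Hypothetical sample object (documentation IP ranges, made-up identifiers).
sample = json.loads("""
{
  "network_interface_id": "nic-1",
  "flow_logs": [
    {"initiator_ip": "203.0.113.7", "target_ip": "10.240.0.4",
     "target_port": 3000, "action": "accepted"},
    {"initiator_ip": "198.51.100.9", "target_ip": "10.240.0.4",
     "target_port": 22, "action": "rejected"}
  ]
}
""")

rows = list(flatten_flow_log_object(sample))
for row in rows:
    print(row)
```

Each resulting row is a flat record, which is what makes the filters and groupings in the following sections straightforward.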
Looking for bad actors
Let's look for flow logs that show all connections to vsi1 that have been rejected:
As you can see, there are quite a few records. Let's group all rejected connections by initiator_ip to get an idea of where the connections originated:
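The grouping logic itself is simple. As a minimal Python sketch on made-up, already flattened records (the sample data and the function name `rejected_by_initiator` are hypothetical), it amounts to a filter followed by a count per initiator:

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple


def rejected_by_initiator(records: Iterable[Dict]) -> List[Tuple[str, int]]:
    """Count rejected connections per initiator_ip, most frequent first."""
    counts = Counter(
        record["initiator_ip"]
        for record in records
        if record["action"] == "rejected"
    )
    return counts.most_common()


# Hypothetical records using documentation IP ranges.
records = [
    {"initiator_ip": "203.0.113.7", "target_port": 22, "action": "rejected"},
    {"initiator_ip": "203.0.113.7", "target_port": 23, "action": "rejected"},
    {"initiator_ip": "198.51.100.9", "target_port": 445, "action": "rejected"},
    {"initiator_ip": "192.0.2.10", "target_port": 3000, "action": "accepted"},
]

print(rejected_by_initiator(records))
```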
As you can see, most of the rejected connections originated from 188.8.131.52. To check whether someone ran a port scan against vsi1, we include the target_port of each rejected connection:
Over 700 different ports have been scanned from that single IP, so this was almost certainly an attempted port scan.
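The port scan check boils down to counting distinct target ports per initiator. Here is a minimal Python sketch of that heuristic; the threshold of 100 ports, the sample data and the function names are assumptions chosen for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def ports_per_initiator(records: Iterable[Dict]) -> Dict[str, int]:
    """Count the distinct rejected target ports for each initiator_ip."""
    ports = defaultdict(set)
    for record in records:
        if record["action"] == "rejected":
            ports[record["initiator_ip"]].add(record["target_port"])
    return {ip: len(seen) for ip, seen in ports.items()}


def likely_scanners(records: Iterable[Dict], min_ports: int = 100) -> List[str]:
    """Return initiators that hit at least min_ports distinct ports (assumed threshold)."""
    counts = ports_per_initiator(records)
    return sorted(ip for ip, count in counts.items() if count >= min_ports)


# Hypothetical data: one initiator probes ports 1..150, another retries port 445.
scan_records = [
    {"initiator_ip": "203.0.113.7", "target_port": port, "action": "rejected"}
    for port in range(1, 151)
] + [
    {"initiator_ip": "198.51.100.9", "target_port": 445, "action": "rejected"}
    for _ in range(3)
]

print(likely_scanners(scan_records))
```

Counting distinct ports rather than raw connections keeps a client that retries a single port many times from being flagged as a scanner.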
Let's pivot the data by the port that was targeted instead of the initiator_ip to see if that reveals new insights.
Grouping data by target port instead of initiator_ip:
As you can see, over 1000 individual IPs tried to establish a connection on port 445, which is used for direct TCP/IP MS Networking access. It is very likely that attackers attempted to scan the VSI for potential vulnerabilities.
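Pivoting by port is the mirror image of the earlier grouping: count distinct initiators per target port. A minimal Python sketch, using hypothetical records and a hypothetical function name:

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def initiators_per_port(records: Iterable[Dict]) -> List[Tuple[int, int]]:
    """Return (target_port, distinct initiator count) pairs, busiest port first."""
    by_port = defaultdict(set)
    for record in records:
        if record["action"] == "rejected":
            by_port[record["target_port"]].add(record["initiator_ip"])
    pairs = [(port, len(ips)) for port, ips in by_port.items()]
    # Sort by count descending, then by port number for a stable order.
    return sorted(pairs, key=lambda pair: (-pair[1], pair[0]))


# Hypothetical records: three distinct initiators on 445, one (repeated) on 23.
port_records = [
    {"initiator_ip": "203.0.113.1", "target_port": 445, "action": "rejected"},
    {"initiator_ip": "203.0.113.2", "target_port": 445, "action": "rejected"},
    {"initiator_ip": "203.0.113.3", "target_port": 445, "action": "rejected"},
    {"initiator_ip": "198.51.100.9", "target_port": 23, "action": "rejected"},
    {"initiator_ip": "198.51.100.9", "target_port": 23, "action": "rejected"},
]

print(initiators_per_port(port_records))
```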
Looking for expected traffic
A few minutes after the curl commands above complete, there should be some accepted traffic:
Notice that the target_port is 3000, the port used by our curl commands.
The private instance has had no packets received. Let's validate this:
Try looking for connections initiated by the private instance — there are quite a few packets:
What does this mean? Looking at one of the records, I noticed target_port=67, which is for the bootp protocol. This seems okay, so I'll filter this out and look more, filtering out DNS (53) and NTP (123). I continued with this process to notice the following:
- Port 67: BOOTP
- Port 123: NTP (Network Time Protocol)
- Port 53: DNS
- Port 443: HTTPS for installing software
- Port 80: HTTP for installing software
It might be interesting to look at the individual target_ip addresses for the 443 and 80 ports and verify they are the software providers that my company has approved:
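One way to do that verification is to compare each target_ip on ports 80 and 443 against a list of approved networks. The following Python sketch assumes such a list exists as CIDR ranges; the approved range, the sample records and the function name are all hypothetical.

```python
import ipaddress
from typing import Dict, Iterable, Set

SOFTWARE_PORTS = {80, 443}


def unapproved_targets(
    records: Iterable[Dict], approved_networks: Iterable[str]
) -> Set[str]:
    """Return target IPs on ports 80/443 that fall outside the approved networks."""
    networks = [ipaddress.ip_network(cidr) for cidr in approved_networks]
    flagged = set()
    for record in records:
        if record["target_port"] not in SOFTWARE_PORTS:
            continue  # BOOTP, NTP, DNS and other traffic is ignored here
        address = ipaddress.ip_address(record["target_ip"])
        if not any(address in network for network in networks):
            flagged.add(record["target_ip"])
    return flagged


# Hypothetical outbound records from the private instance.
outbound = [
    {"target_ip": "198.51.100.20", "target_port": 443},
    {"target_ip": "203.0.113.50", "target_port": 80},
    {"target_ip": "192.0.2.1", "target_port": 53},
]

# Hypothetical approved software provider range.
print(unapproved_targets(outbound, ["198.51.100.0/24"]))
```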
Maybe I should change my security groups or network ACLs to be more constrained.
On my laptop, I obtained my own IP address using curl ifconfig.me:
This is good news. The network path from my laptop to the VSI is viable. But why is it being rejected?
This is likely due to Security Group Rules or Network ACLs. In the IBM Cloud Console, navigate to the VPC instances, click on the vsi1 instance and examine the Security Groups attached to the Network Interface. If you click on the Security Groups, you will notice that the install-software group is for installing software and the sg1 is for external access, but only to port 3000. That is the port used in the curl commands. There is no port 22 access, so this is likely the cause of the rejection.
On the x-sg1 security group page, in the inbound rules section, click the New rule button and then add a new inbound rule for Protocol: TCP, Port range: 22, 22. Try the ssh again to verify the solution. Look for action=accepted in the flow log by re-running the previous query.
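The reasoning behind the fix can be illustrated with a deliberately simplified model of inbound rules. This Python sketch only matches protocol and port range (real security group rules also consider the source, which is ignored here), and the `InboundRule` type is an assumption made for the example.

```python
from typing import Iterable, NamedTuple


class InboundRule(NamedTuple):
    """Simplified, hypothetical model of an inbound security group rule."""
    protocol: str
    port_min: int
    port_max: int


def is_allowed(rules: Iterable[InboundRule], protocol: str, port: int) -> bool:
    """Return True if any rule permits the protocol and port."""
    return any(
        rule.protocol == protocol and rule.port_min <= port <= rule.port_max
        for rule in rules
    )


# Before the change: only port 3000 is open, so SSH (port 22) is rejected.
sg1 = [InboundRule("tcp", 3000, 3000)]
print(is_allowed(sg1, "tcp", 22))  # False

# After adding the new inbound rule for TCP, port range 22 to 22.
sg1.append(InboundRule("tcp", 22, 22))
print(is_allowed(sg1, "tcp", 22))  # True
```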
The more I look into the flow logs using IBM SQL Query, the more questions I have. I need to have a solid understanding of the communication paths required for my application and carefully constrain packet flows that could cause harm.
In SQL Query, I can analyze flows in various ways using SQL superpowers. Create tables and views, as described above, so that you can reuse and share them across your analyses.
As SQL Query offers a JDBC Driver, I can connect all of my data easily back into existing reporting infrastructure to augment existing reports.
Since SQL Query can work with any rectangular data stored on object storage and supports the full breadth of SQL, I can perform elaborate analytics, including joining data from various systems, such as custom inventory data, Cloud Internet Services, Activity Tracker and LogDNA.
Having said that, this blog post is just the beginning of exploring how SQL Query can help to get insight from different data sources and build a single-pane-of-glass view onto my logging data lake.
One example of this is to combine flow logs with Cloud Internet Services logs to get a single-pane-of-glass view of network connections as they travel through the cloud and to perform additional in-depth analytics.
I'll explore this option in depth in a future blog post and introduce some of the analytics that you can do.
To remove all resources created as part of this blog post, execute
IBM Cloud Flow Logs for VPC provides detailed traffic logs, and IBM Cloud SQL Query is a great way to analyze, troubleshoot and understand network traffic. The full breadth of SQL allows for augmenting and enriching network flows so that network access can be audited in a comprehensive way.
Find out more about VPC in the Solution tutorials — just open the Virtual Private Cloud section on the left. Check out IBM Cloud Flow Logs for VPC documentation for a more detailed description of how to use IBM Cloud SQL Query and some best practices.