July 29, 2020 By Powell Quiring 6 min read

Use Flow Logs for VPC to monitor network traffic and troubleshoot connectivity.

IBM Cloud Flow Logs for VPC captures the IP traffic into and out of the network interfaces of the customer-created virtual server instances (VSIs) in a VPC and persists it in an IBM Cloud Object Storage (COS) bucket. You can use flow logs to diagnose connectivity issues or monitor the traffic that enters and leaves the network interfaces of your VPC instances. This allows you to answer questions like the following:

  • Are unexpected TCP ports being accessed on my VSIs?
  • Is SSH traffic reaching the VPC but getting rejected?
  • Are bad actors trying to access my network?

COS provides an excellent landing place for high-volume, continuously growing data, and the flow log objects can be ingested from the bucket into other analysis tools. In this blog post, the target is IBM Log Analysis with LogDNA.
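
Each flow log object in the bucket is a JSON document carrying a batch of flow records. As a preview, here is one record sketched as a Python dict; the field names match the LogDNA searches used later in this post, while the values and the direction field are illustrative assumptions rather than captured data:

# One abridged flow log record (values invented for illustration)
record = {
    "initiator_ip": "198.51.100.7",  # source of the connection attempt
    "target_ip": "10.240.0.4",       # network interface that was hit
    "initiator_port": 54321,
    "target_port": 22,
    "action": "rejected",            # "accepted" or "rejected"
    "direction": "inbound",          # assumption: not referenced in this post
}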

Pushing VPC Flow Logs to LogDNA

Before running the code below, it can be helpful to initialize platform logs in the target region. The invocations of the functions deployed in the next section will then be visible in the platform logs.

Deploying the sample code

A simple way to run these shell scripts is in the IBM Cloud Shell. Open cloud.ibm.com in a browser, log in, and click the Cloud Shell icon in the upper right.

The source code implementing these flows is available in GitHub. It comes with scripts to set up Cloud Object Storage and to deploy the IBM Cloud Functions triggers and actions. Detailed instructions can be found in the README; to get started, type the following in the Cloud Shell:

git clone https://github.com/IBM-Cloud/vpc-flowlogs-logdna
cd vpc-flowlogs-logdna

You will need the IBM Cloud CLI, the Cloud Object Storage plugin, the Cloud Functions plugin, the Schematics plugin, and the jq command-line utility, all of which are already installed in Cloud Shell. Configure your shell environment with the local.env file, and you can start running the scripts:

cp template.local.env local.env
edit local.env
source local.env
# verify PREFIX and TF_VAR_ssh_key_name are in the environment by using the env command

000-prereqs.sh performs basic checks of the target resource group, the target region, the required IBM Cloud plugins, and the external tools.

010-create-services.sh creates a Cloud Object Storage service instance, a bucket, and a LogDNA service instance. It also creates the service keys that the Cloud Functions actions use to access the COS and LogDNA services.


020-create-functions.sh configures Cloud Functions. It creates an authorization between Cloud Functions and Cloud Object Storage, allowing Cloud Functions to be notified by COS. Then, it adds a Cloud Functions trigger that fires when a flow log object is added to the COS bucket. The trigger starts the log action. The action is written in Python and can be found in the file actions/__main__.py. When the script finishes, open the Cloud Functions Triggers page in the IBM Cloud Console and select the appropriate namespace from the drop-down menu at the top. Also, check out the Actions panel in the left navigation.
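
For orientation, here is a minimal sketch of what such an action can look like. It is not the repo's actual __main__.py: the parameter names, the assumption that the COS trigger passes the bucket and object key in params, and the gzip handling are illustrative only:

import gzip
import json

import ibm_boto3
import requests
from ibm_botocore.client import Config

def main(params):
    # Connect to COS (credentials are assumed to be bound to the action)
    cos = ibm_boto3.client(
        "s3",
        ibm_api_key_id=params["cos_api_key"],
        ibm_service_instance_id=params["cos_instance_crn"],
        config=Config(signature_version="oauth"),
        endpoint_url=params["cos_endpoint"],
    )

    # Fetch the flow log object that fired the trigger
    body = cos.get_object(Bucket=params["bucket"], Key=params["key"])["Body"].read()
    if params["key"].endswith(".gz"):
        body = gzip.decompress(body)
    records = json.loads(body).get("flow_logs", [])

    # Forward each record as one line to the LogDNA ingestion API
    lines = [{"line": json.dumps(r), "app": "vpc-flow-logs"} for r in records]
    resp = requests.post(
        params["logdna_endpoint"] + "/logs/ingest",
        params={"hostname": "flowlog-forwarder"},
        auth=(params["logdna_ingestion_key"], ""),  # key as basic-auth user
        json={"lines": lines},
    )
    return {"status": resp.status_code, "forwarded": len(lines)}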

030-create-vpc.sh creates the VPC, subnets, instances, and flow log collector. Terraform in the IBM Cloud Schematics service is used to create all of the resources except the flow log collector, which is created using the ibmcloud CLI. After the script completes, check out the flow log collector configuration in the IBM Cloud Console.
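
The CLI step is roughly equivalent to the following sketch using the ibm-vpc Python SDK; the API key, target ID, region endpoint, and names are placeholders:

from ibm_cloud_sdk_core.authenticators import IAMAuthenticator
from ibm_vpc import VpcV1

vpc = VpcV1(authenticator=IAMAuthenticator("<api key>"))
vpc.set_service_url("https://us-south.iaas.cloud.ibm.com/v1")

# Attach a collector to a resource; a VPC, subnet, instance, or
# network interface ID all work as the target
collector = vpc.create_flow_log_collector(
    storage_bucket={"name": "<prefix>-bucket"},
    target={"id": "<target resource ID>"},
    name="<prefix>-collector",
    active=True,
).get_result()
print(collector["id"], collector["lifecycle_state"])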


In a few minutes, flow log objects will start to appear in the COS bucket. A few minutes after that, the function invocation logs will be visible in the platform logs. A few more minutes after that, the flow logs will be available in the LogDNA instance created by 010-create-services.sh.

The final few lines displayed by 030-create-vpc.sh will look something like this. Copy these and keep them handy:

>>> to exercise the vpc
curl 52.116.136.250:3000; # get hello world string
curl 52.116.136.250:3000/info; # get the private IP address
curl 52.116.136.250:3000/remote; # get the remote private IP address

LogDNA Analysis

Getting started

The curl 52.116.136.250:3000/info command outputs the private IP address of vsi1, 10.240.0.4. The curl 52.116.136.250:3000/remote command causes vsi1 to curl the private instance and return that instance's private IP address (it is reachable only from within the VPC).
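
The application itself ships with the repository. Purely to illustrate the behavior of the three endpoints, a minimal equivalent could look like the sketch below; the framework choice, the hard-coded peer address, and the IP discovery trick are assumptions, not the repo's code:

import socket
import urllib.request

from flask import Flask

app = Flask(__name__)
PEER = "10.240.64.4"  # private IP of the other instance (example value)

def my_private_ip():
    # Connect a UDP socket toward the peer to learn this host's private address
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.connect((PEER, 80))  # no packet is sent; this only selects a route
    ip = s.getsockname()[0]
    s.close()
    return ip

@app.route("/")
def hello():
    return "hello world\n"

@app.route("/info")
def info():
    return my_private_ip() + "\n"

@app.route("/remote")
def remote():
    # Curl the private instance over the VPC network and relay its answer
    with urllib.request.urlopen(f"http://{PEER}:3000/info") as resp:
        return resp.read()

app.run(host="0.0.0.0", port=3000)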

Attempt to ssh to the public IP address—this will likely hang. We will investigate this soon:

$ ssh root@52.116.136.250

Looking for bad actors

Navigate to the IBM Log Analysis page and then click the View LogDNA link next to the instance matching your prefix.

Let's look for flow logs in the Everything view using the search target_ip:10.240.0.4 action:rejected.

There are quite a few records. I narrowed the search further with target_port:443 and found a record with an initiator_ip address of 92.63.197.61.

A quick Google search indicated that this IP address is unexpected and is 100% likely to be a bad actor.

Looking for expected traffic

A few minutes after the curl commands above complete, there should be some accepted traffic. Search for target_ip:10.240.0.4 action:accepted. Notice that the target_port is 3000, matching our curl commands.

The private instance has much less traffic. Try target_ip:10.240.64.4 and you might see only a few packets. Try initiator_ip:10.240.64.4 and there are quite a few packets. What does this mean?

Looking at one of the records, I noticed target_port:67, which is the BOOTP (DHCP) protocol. This seems okay, so I'll filter it out and keep looking (notice the minus sign): initiator_ip:10.240.64.4 -target_port:67. I continued with this process and ended up with the following: initiator_ip:10.240.64.4 -target_port:67 -target_port:123 -target_port:53.

  • Port 67: BOOTP (DHCP)
  • Port 123: NTP (Network Time Protocol)
  • Port 53: DNS
  • Port 443: HTTPS, for installing software
  • Port 80: HTTP, for installing software

It might be interesting to look at the target_ip addresses for ports 443 and 80, verify that they belong to the software providers my company has approved, and set an alarm if they do not. Maybe I should also change my security groups or network ACLs to be more constrained.
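
The same narrowing can be done offline against a downloaded flow log object. This sketch assumes records shaped like the example earlier, a local file name, and an invented list of approved provider addresses:

import json

EXPECTED_PORTS = {67, 123, 53}         # BOOTP/DHCP, NTP, DNS
APPROVED_PROVIDERS = {"198.51.100.8"}  # placeholder approved software mirrors

# One flow log object previously downloaded from the COS bucket
with open("flowlog-object.json") as f:
    records = json.load(f)["flow_logs"]

for r in records:
    if r["initiator_ip"] != "10.240.64.4":
        continue
    if r["target_port"] in EXPECTED_PORTS:
        continue
    if r["target_port"] in (80, 443) and r["target_ip"] not in APPROVED_PROVIDERS:
        print("unapproved software provider:", r["target_ip"])
    else:
        print("unexplained flow:", r["target_ip"], "port", r["target_port"])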

Cannot SSH

On my laptop, I obtained my own IP address using curl ifconfig.me:

$ curl ifconfig.me
24.22.68.94

In LogDNA, search for: target_ip:10.240.0.4 target_port:22 initiator_ip:24.22.68.94.

A packet is found, with the field action:rejected.

This is good news. The network path from my laptop to the VSI is viable. But why is it being rejected?

This is likely due to security group rules or network ACLs. In the IBM Cloud Console, navigate to the VPC instances, click the vsi1 instance, and examine the security groups attached to the network interface. If you click through the security groups, you will notice that the install-software group is for installing software and sg1 is for external access, but only to port 3000, the port used in the curl commands. There is no rule allowing port 22, so this is likely the cause of the rejection.

On the x-sg1 security group page, in the inbound rules section, click the New rule button and add a new inbound rule with Protocol: TCP and a port range of 22 to 22. Try the ssh again to verify the fix, and look for action:accepted in the flow logs.
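
If you would rather script the change, a hypothetical equivalent with the ibm-vpc Python SDK could look like this (the security group ID and endpoint are placeholders; in practice, narrow the remote to your own address):

from ibm_cloud_sdk_core.authenticators import IAMAuthenticator
from ibm_vpc import VpcV1

vpc = VpcV1(authenticator=IAMAuthenticator("<api key>"))
vpc.set_service_url("https://us-south.iaas.cloud.ibm.com/v1")

# Allow inbound SSH (TCP port 22); 0.0.0.0/0 is wide open, for testing only
rule = vpc.create_security_group_rule(
    "<x-sg1 security group ID>",
    {
        "direction": "inbound",
        "protocol": "tcp",
        "port_min": 22,
        "port_max": 22,
        "remote": {"cidr_block": "0.0.0.0/0"},
    },
).get_result()
print(rule["id"])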

More investigation

The more I look into the flow logs using IBM Log Analysis with LogDNA, the more questions I have. I need a solid understanding of the communication paths required by my application so that I can carefully constrain packet flows that could cause harm.

In LogDNA, you can create dashboards and alarms. For example, you may want to be notified via Slack whenever an ssh access to an instance in your VPC is successful. Narrow the search terms in the search box to find the records of interest. In the upper-right corner, click the Unsaved View drop-down and select Save as new view / alert. In the pop-up, choose Alert > View-specific alert, click Slack or one of the other notification mechanisms, and follow the provided instructions.

Conclusion

IBM Cloud Flow Logs for VPC provides detailed traffic logs, and IBM Log Analysis with LogDNA is a great way to interactively search and understand network traffic. Real-time alert notifications allow network access to be audited.

Find out more about VPC in the Solution tutorials—just open the Virtual Private Cloud section on the left.  Check out Auditing corporate policies for Identity and Access Management and Key Protect for a more detailed description of LogDNA Slack alerts.
