This post will demonstrate a Squid NFV using VPC routing.
A virtual private cloud (VPC) gives an enterprise the ability to define and control a virtual network that is logically isolated from all other public cloud tenants, creating a private, secure place on the public cloud.
VPC routing allows more control over network flow and can be used to support Network Functions Virtualization (NFV) for advanced networking services, such as third-party routing, firewalls, local/global load balancing, web application firewalls and more.
Squid is the example here, but other off-the-shelf firewall instances like those from Palo Alto and F5 can be configured similarly. To quote the Squid site: "Squid is a caching proxy for the Web supporting HTTP, HTTPS, FTP, and more. It reduces bandwidth and improves response times by caching and reusing frequently-requested web pages."
The host instance is going to read from internet websites. Internet-bound traffic from the host subnet will be sent to the proxy instance by the routing table and routes. The Squid NFV on the proxy instance will connect to the website and act as a middle man between the host and the website.
In the diagram above, the website is neverssl.com; the Squid proxy will impersonate (AKA spoof) neverssl.com. The proxy will be an undetectable intermediary in the conversation so existing applications on the host do not require code changes to benefit from Squid functionality.
You could click through the console to create the VPC, subnets, route table, route table route, instances, etc. This post will use Terraform, so it will be up and running in just a few minutes.
The provisioning steps are done from the CLI, which lets you move them into a CI/CD pipeline or IBM Cloud Schematics over time.
You can skip these prerequisites by using the IBM Cloud Shell, where the tools are preinstalled. Otherwise, use your workstation and verify that the following tools are installed. See the "Getting started with solution tutorials" guide for help on installing them:
- IBM Cloud CLI
You will need permissions to create VPC resources. Even if you are the account owner, an additional IAM policy is required to create instances with network interfaces that allow spoofing. See about IP spoofing checks.
I am the account administrator, so I executed this command line in the Cloud Shell using my email address:
Alternatively, you can add this policy in the IBM Cloud Console IAM section starting at Users:
- Click the User
- Click Access policies
- Click Assign access
- Click IAM services
- Choose VPC Infrastructure Services from the drop down
- Click on the IP Spoofing Operator
Create and test
Clone the source code repository and execute the tooling prerequisite check:
Create the resources. Take a look at the script; it is pretty simple:
If Terraform produces the following error message instead of provisioning the proxy instance, make sure you have correctly configured your account with the IP Spoofing Operator permission as mentioned above:
The Terraform heavy lifting is defined in main.tf. Even if you are not familiar with Terraform, take a look. You will find a self-documenting blueprint of the architecture. Once Terraform completes successfully, open the VPC layout in the IBM Cloud console and select all of the subnets. I configured a basename of Squid in local.env:
Run the test script to verify it is working as expected. You will need to accept the ssh IP addresses when prompted:
In test-driven fashion, let's dive deeper into the system that has been created.
The only instance that can be reached directly via ssh is the jump (bastion). Check out the security group ssg_ssl in main.tf; the tutorial "Securely access remote instances with a bastion host" details the concepts. The Terraform output has a copy/paste string you can use to ssh to the host through the jump. The rest of the testing is done using the jump host. You can verify the test results. For me, it looked like this:
Host to proxy access
In the last step you ssh'd to host. Let's reproduce some of the tests. Is the proxy reachable?
Next, verify that the Squid service is running on the proxy and that Squid is able to reach the internet. Squid is listening on port 8080, so the following curl should work:
Host access via Virtual Network Function
Finally, the beauty of VPC routing and NFV can be seen by opening the routing tables, selecting the VPC and clicking on the route table:
The 10.0.0.0 CIDR covers the VPC. The 18.104.22.168 and 22.214.171.124 CIDRs are service endpoints in the IBM Cloud for software repository mirrors, time servers, DNS servers, etc. Check out the available endpoints. These are all delegated to the default routing table. See create route.
The interesting CIDR, 0.0.0.0/0, matches everything else. The next hop, 10.0.1.4, is the proxy. When the host connects to neverssl.com at IP address 126.96.36.199, it will match this route and the connection will be made to the proxy instance.
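As a rough illustration of how a route table picks a next hop, here is a minimal Python sketch of longest-prefix matching. It assumes the most specific matching CIDR wins. The ROUTES list, the lookup helper, the /8 prefix length on the VPC CIDR and the stand-in destination 203.0.113.10 are all hypothetical and are not taken from main.tf.

```python
import ipaddress

# Hypothetical route table modeled on the text: (destination CIDR, action, next hop).
ROUTES = [
    ("10.0.0.0/8", "delegate", None),      # VPC-internal; the /8 length is an assumption
    ("0.0.0.0/0", "deliver", "10.0.1.4"),  # everything else is handed to the proxy
]


def lookup(destination):
    """Return (action, next_hop) for the most specific route matching destination."""
    addr = ipaddress.ip_address(destination)
    matches = []
    for cidr, action, hop in ROUTES:
        network = ipaddress.ip_network(cidr)
        if addr in network:
            matches.append((network.prefixlen, action, hop))
    # The longest prefix is the most specific match.
    _, action, hop = max(matches, key=lambda match: match[0])
    return action, hop


# 203.0.113.10 is a documentation address standing in for an internet site.
print(lookup("203.0.113.10"))
print(lookup("10.0.0.4"))
```

An internet destination only matches 0.0.0.0/0, so it is delivered to the proxy, while an address inside the VPC matches the more specific entry and is delegated.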
The Squid service and Linux iptables are configured in the proxy_user_data.sh file. Notice the command:
The above iptables command executed on the proxy adds a rule to the kernel's NAT table that redirects some of the incoming packets to the intercept port of the Squid application. Let's break it down:
-t nat: Add entry #1 in the network address translation table.
-s $host_ipv4_cidr_block: Only consider packets from the host CIDR block.
-p tcp: Only consider the TCP protocol.
--dport 80: Only consider packets to port 80 (HTTP).
-j REDIRECT: Redirect the matching packets.
--to-port 3129: Change the destination port from 80 to 3129.
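To make the matching logic concrete, the following Python sketch models what the rule decides for a single packet. It is a simulation only, not how iptables is implemented. The Packet class, the apply_redirect helper and the 10.0.0.0/24 value for $host_ipv4_cidr_block are assumptions made for illustration.

```python
import ipaddress
from dataclasses import dataclass, replace

# Hypothetical stand-in for $host_ipv4_cidr_block; the real value comes from Terraform.
HOST_CIDR = ipaddress.ip_network("10.0.0.0/24")
INTERCEPT_PORT = 3129  # --to-port 3129


@dataclass(frozen=True)
class Packet:
    src: str
    protocol: str
    dport: int


def apply_redirect(packet):
    """Return the packet as it looks after the rule: the port is rewritten only on a full match."""
    matches = (
        ipaddress.ip_address(packet.src) in HOST_CIDR  # -s $host_ipv4_cidr_block
        and packet.protocol == "tcp"                   # -p tcp
        and packet.dport == 80                         # --dport 80
    )
    # The real REDIRECT target also rewrites the destination address to the
    # proxy's own address; this sketch only tracks the port.
    return replace(packet, dport=INTERCEPT_PORT) if matches else packet


print(apply_redirect(Packet("10.0.0.4", "tcp", 80)))
print(apply_redirect(Packet("10.0.0.4", "tcp", 443)))
```

Only HTTP traffic from the host subnet is redirected; a packet to port 443 or from another source passes through unchanged.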
The Squid configuration does the rest:
Squid will intercept the packets at port 3129 and serve as the middle man. Only a few sites are allowed: neverssl.com, test.com and ubuntu.com. All other sites will be rejected by Squid. Now we can make sense of this portion of the diagram:
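The allow-list behavior can be sketched in a few lines of Python. This is not Squid's implementation or the configuration from proxy_user_data.sh. The is_allowed helper is hypothetical, and treating subdomains of an allowed site as allowed is an assumption about how the ACL is written.

```python
# Sites named in the text as allowed through the proxy.
ALLOWED_DOMAINS = ("neverssl.com", "test.com", "ubuntu.com")


def is_allowed(host):
    """Return True when host is an allowed domain or (by assumption) one of its subdomains."""
    host = host.lower().rstrip(".")
    return any(
        host == domain or host.endswith("." + domain)
        for domain in ALLOWED_DOMAINS
    )


for candidate in ("neverssl.com", "archive.ubuntu.com", "virus.com"):
    print(candidate, "allowed" if is_allowed(candidate) else "denied")
```

Matching on a leading dot before the domain keeps look-alike names such as notubuntu.com from slipping through.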
The configuration is centralized in the route table and applies to instances on configured subnets. The host instance requires no configuration to access the internet via Squid. If you could eavesdrop on the end-to-end conversation, you would see the following:
The source and destination IP addresses of the TCP packets are as you would expect, except for the ones explicitly noted:
- The request is addressed to 188.8.131.52. The route table next hop route delivers it to the proxy at 10.0.1.4.
- At the proxy, the Linux iptables rule redirects the packet to the Squid process. The Squid process establishes a connection to neverssl.com. The IP address is provided by the public gateway.
- The response is returned to the proxy/Squid over the public gateway.
- Squid impersonates neverssl.com, spoofing the IP address 184.108.40.206. The curl command running on host is none the wiser.
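The four legs of the conversation can be written out as source and destination pairs. The Python sketch below uses placeholder addresses: only the proxy address 10.0.1.4 comes from the text, while the host, public gateway and website addresses are hypothetical documentation values.

```python
HOST = "10.0.0.4"         # hypothetical host address
PROXY = "10.0.1.4"        # proxy address, the next hop in the route table
GATEWAY = "198.51.100.7"  # hypothetical public gateway address
SITE = "203.0.113.10"     # stand-in for the neverssl.com address

# (description, source address, destination address) for each leg.
LEGS = [
    (f"host request, delivered to the proxy at {PROXY} by the route", HOST, SITE),
    ("Squid request, as seen beyond the public gateway", GATEWAY, SITE),
    ("website response, returning over the public gateway", SITE, GATEWAY),
    ("Squid response to the host, source spoofed as the website", SITE, HOST),
]


def describe(legs):
    """Format each leg as 'description: source -> destination'."""
    return [f"{label}: {src} -> {dst}" for label, src, dst in legs]


for line in describe(LEGS):
    print(line)
```

From the host's point of view, the first and last legs look like a direct exchange with the website, which is why no client configuration is needed.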
Using tcpdump, it is possible to see some of the traffic. Bring up another ssh session to the proxy. The ssh command can be found in the Terraform output. On the proxy, run tcpdump port 80, and do the same on the host but put it into the background: tcpdump port 80 &. Then, on the host in the foreground, run the curl command again. The text below has been edited for readability.
Verify that Squid denies access to virus.com. Notice the Squid error message:
When you are done investigating, clean up all the resources using ./040-cleanup.sh. Take a look at the script:
Configuring the VPC routing tables for Network Functions Virtualization (NFV) can be a great way to transparently insert functionality into a VPC network. More generally, route tables and routes can be used to both isolate and extend network connectivity.
Get more experience with NFV by configuring Squid to work with HTTPS traffic: SslBump Peek and Splice.
You can also find these NFVs in the IBM Cloud Catalog:
- Palo Alto Networks VM-Series Next-Generation Firewall (BYOL) on IBM Cloud VPC
- F5 BIG-IP Virtual Edition for VPC