March 22, 2021 By Powell Quiring 8 min read

This post will demonstrate a Squid NFV using VPC routing.

A virtual private cloud (VPC) gives an enterprise the ability to define and control a virtual network that is logically isolated from all other public cloud tenants, creating a private, secure place on the public cloud.  

VPC routing allows more control over network flow and can be used to support Network Functions Virtualization (NFV) for advanced networking services, such as third-party routing, firewalls, local/global load balancing, web application firewalls and more.

The NFV demonstrated in this post is Squid; other off-the-shelf firewall instances, like those from Palo Alto Networks and F5, can be configured similarly. To quote the Squid site: “Squid is a caching proxy for the Web supporting HTTP, HTTPS, FTP, and more. It reduces bandwidth and improves response times by caching and reusing frequently-requested web pages.”

The host instance is going to read from internet websites. Internet-bound traffic from the host subnet will be sent to the proxy instance by the routing table and routes. The Squid NFV on the proxy instance will connect to the website and act as a middle man between the host and the website.

In the diagram above, the website is neverssl.com; the Squid proxy will impersonate (AKA spoof) neverssl.com. The proxy will be an undetectable intermediary in the conversation, so existing applications on the host do not require code changes to benefit from Squid functionality.

You could click through the console to create the VPC, subnets, route table, route table route, instances, etc. This post uses Terraform instead, so the system will be up and running in just a few minutes.
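As a preview of the Terraform to come, here is a minimal sketch of the routing pieces, using the IBM provider's ibm_is_vpc_routing_table and ibm_is_subnet resources. The names and CIDRs are illustrative, not copied from the repository:

# Custom routing table for the VPC (illustrative names)
resource "ibm_is_vpc_routing_table" "rt" {
  vpc  = ibm_is_vpc.vpc.id
  name = "squid-routing-table"
}

# Attach the routing table to the host subnet so its internet-bound
# traffic is governed by the routes added to the table
resource "ibm_is_subnet" "host" {
  name            = "host-subnet"
  vpc             = ibm_is_vpc.vpc.id
  zone            = "us-south-1"
  ipv4_cidr_block = "10.0.0.0/24"
  routing_table   = ibm_is_vpc_routing_table.rt.routing_table
}

The routes themselves are covered later in the post.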

Tooling prerequisites

The provisioning steps are done from the CLI. This will allow you to move these steps to a CI/CD pipeline or into IBM Cloud Schematics over time.

You can skip these prerequisites by using the IBM Cloud Shell, where these tools are preinstalled — or use your workstation and verify the installation of the following tools. See the “Getting started with solution tutorials” guide for help on installing them:

  • Git
  • IBM Cloud CLI
  • Terraform
  • jq

IAM prerequisites

You will need permissions to create VPC resources. Even if you are the account owner, an additional IAM policy is required to create instances whose network interfaces allow spoofing. See About IP spoofing checks.

I am the account administrator, so I executed this command line in the Cloud Shell using my email address:

ibmcloud iam user-policy-create YOUR_USER_EMAIL_ADDRESS --roles "IP Spoofing Operator" --service-name is

Alternatively, you can add this policy in the IBM Cloud Console IAM section starting at Users:

  • Click the User
  • Click Access policies
  • Click Assign access
  • Click IAM services
  • Choose VPC Infrastructure Services from the drop-down
  • Click on the IP Spoofing Operator
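For context, this permission matters because the proxy instance's network interface is created with spoofing enabled. Here is a minimal Terraform sketch of that flag, with illustrative names (the real definitions are in main.tf):

# Proxy instance whose interface may send packets with source addresses
# it does not own; creating this requires the IP Spoofing Operator role
resource "ibm_is_instance" "proxy" {
  name    = "squid-proxy"
  vpc     = ibm_is_vpc.vpc.id
  zone    = "us-south-1"
  profile = "cx2-2x4"
  image   = data.ibm_is_image.ubuntu.id
  keys    = [ibm_is_ssh_key.key.id]
  primary_network_interface {
    subnet            = ibm_is_subnet.proxy.id
    allow_ip_spoofing = true
  }
}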

Create and test

Clone the source code repository and execute the tooling prerequisite check:

git clone https://github.com/IBM-Cloud/vpc-nfv-squid
cd vpc-nfv-squid
cp local.env.template local.env
edit local.env
source local.env
./000-prereq.sh

Create the resources. Take a look at the script; it is pretty simple:

cat ./010-create.sh
#!/bin/bash

terraform init
terraform apply -auto-approve

If Terraform produces the following error message instead of provisioning the proxy instance, make sure you have correctly configured your account with the IP Spoofing Operator permission, as described above:

Error: the provided token is not authorized to the specified instance (ID:NEWRESOURCE) in this account
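To check whether the policy is attached to your user, you can list your policies and look for the IP Spoofing Operator role on the VPC Infrastructure Services (is) service; the grep pattern below is just illustrative:

ibmcloud iam user-policies YOUR_USER_EMAIL_ADDRESS | grep -B 2 -A 2 Spoofing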

The Terraform heavy lifting is defined in main.tf. Even if you are not familiar with Terraform, take a look — you will find a self-documenting blueprint of the architecture. Once Terraform completes successfully, open the VPC layout in the IBM Cloud console and select all of the subnets. I configured a basename of Squid in local.env, so the resource names all share that prefix.
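If you prefer the CLI to the console, you can list the subnets instead. This assumes the vpc-infrastructure CLI plugin is installed and a region is targeted:

ibmcloud is subnets | grep -i squid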

Run the test script to verify everything is working as expected. You will need to accept the ssh host key prompts for the new IP addresses:

$ ./030-test.sh

>>> verify it is possible to ssh to the host and execute the true command
ssh -J root@52.116.133.164 root@10.0.0.4 true

>>> verify proxy connectivity using ping
ssh -J root@52.116.133.164 root@10.0.0.4 ping 10.0.1.4 -c 2
PING 10.0.1.4 (10.0.1.4) 56(84) bytes of data.
64 bytes from 10.0.1.4: icmp_seq=1 ttl=64 time=0.540 ms
64 bytes from 10.0.1.4: icmp_seq=2 ttl=64 time=0.422 ms

--- 10.0.1.4 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1013ms
rtt min/avg/max/mdev = 0.422/0.481/0.540/0.059 ms

>>> verify explicitly specifying the squid proxy server ip works. Testing the network path - not testing the router
ssh -J root@52.116.133.164 root@10.0.0.4 set -o pipefail; curl neverssl.com -s --proxy 10.0.1.4:8080 | grep poorly-behaved > /dev/null

>>> verify direct access to neverssl.com, end to end, through the route table
ssh -J root@52.116.133.164 root@10.0.0.4 set -o pipefail; curl neverssl.com -s | grep poorly-behaved > /dev/null

>>> verify implicit access to a denied host fails
ssh -J root@52.116.133.164 root@10.0.0.4 curl virus.com -s | grep squid > /dev/null
>>> success

In test-drive fashion, let's dive deeper into the system that has been created.

Jump instance

The only instance that can be reached directly via ssh is the jump (bastion) instance. Check out the security group ssg_ssl in main.tf — the tutorial Securely access remote instances with a bastion host details the concepts. The Terraform output has a copy/paste string you can use to ssh to the host through the jump. The rest of the testing is done through the jump host. You can verify the test results yourself. For me, it looked like this:

$ terraform output host 
[
  {
    "ip_host" = "10.0.0.4"
    "sshhost" = "ssh -J root@52.116.137.7 root@10.0.0.4"
  },
]
$ ssh -J root@52.116.137.7 root@10.0.0.4
...
root@squid-us-south-1-host:~# 
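The -J flag is ssh's ProxyJump option. If you will be connecting repeatedly, an equivalent ~/.ssh/config entry saves typing. Substitute the IP addresses from your own terraform output:

# ~/.ssh/config — host alias for the jump path (addresses from the output above)
Host squid-host
  HostName 10.0.0.4
  User root
  ProxyJump root@52.116.137.7

With that entry in place, ssh squid-host is equivalent to the copy/paste string.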

Host to proxy access

In the last step, you ssh'd to the host. Let's reproduce some of the tests. Is the proxy reachable?

root@squid-us-south-1-host:~# ping 10.0.1.4 -c 2
PING 10.0.1.4 (10.0.1.4) 56(84) bytes of data.
64 bytes from 10.0.1.4: icmp_seq=1 ttl=64 time=0.532 ms
64 bytes from 10.0.1.4: icmp_seq=2 ttl=64 time=0.428 ms

--- 10.0.1.4 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1020ms
rtt min/avg/max/mdev = 0.428/0.480/0.532/0.052 ms

Next, verify that the Squid service is running on the proxy and that Squid is able to reach the internet. Squid is listening on port 8080, so the following curl should work:

root@squid-us-south-1-host:~# curl neverssl.com -s --proxy 10.0.1.4:8080
<html>
    <head>
        <title>NeverSSL - helping you get online</title>
...

Host access via Virtual Network Function

Finally, the beauty of VPC routing and NFV can be seen by opening the routing tables, selecting the VPC and clicking on the route table:

The 10.0.0.0 CIDR is the address range of the VPC. The 166.8.0.0 and 161.26.0.0 CIDRs are service endpoints in the IBM Cloud for software repository mirrors, time servers, DNS servers, etc. Check out the available endpoints. These are all delegated to the default routing table. See create route.

The interesting CIDR, 0.0.0.0/0, matches everything else. The next hop — 10.0.1.4 — is the proxy. When the host connects to neverssl.com at IP address 54.230.125.14, it matches this route and the connection is delivered to the proxy instance.
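In Terraform, the two kinds of routes look roughly like the following sketch. It uses the ibm_is_vpc_routing_table_route resource with illustrative names; this is not an excerpt from main.tf:

# Hand IBM Cloud service traffic back to the default routing behavior
resource "ibm_is_vpc_routing_table_route" "services" {
  vpc           = ibm_is_vpc.vpc.id
  routing_table = ibm_is_vpc_routing_table.rt.routing_table
  zone          = "us-south-1"
  name          = "delegate-service-endpoints"
  destination   = "161.26.0.0/16"
  action        = "delegate"
  next_hop      = "0.0.0.0" # required, but ignored for delegate routes
}

# Deliver everything else to the Squid proxy
resource "ibm_is_vpc_routing_table_route" "internet" {
  vpc           = ibm_is_vpc.vpc.id
  routing_table = ibm_is_vpc_routing_table.rt.routing_table
  zone          = "us-south-1"
  name          = "internet-to-proxy"
  destination   = "0.0.0.0/0"
  action        = "deliver"
  next_hop      = "10.0.1.4" # private IP of the proxy instance
}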

The Squid service and Linux iptables are configured in the proxy_user_data.sh file. Notice the command:

iptables -t nat -I PREROUTING 1 -s $host_ipv4_cidr_block -p tcp --dport 80 -j REDIRECT --to-port 3129

The above iptables command, executed on the proxy, configures the kernel's NAT table to redirect some of the incoming packets to the intercept port of the Squid application. Let's break it down:

  • -t nat -I PREROUTING 1: Insert this rule as entry #1 in the PREROUTING chain of the network address translation (NAT) table.
  • -s $host_ipv4_cidr_block: Only consider packets from the host CIDR block.
  • -p tcp: Only consider the TCP protocol.
  • --dport 80: Only consider packets destined for port 80 (HTTP).
  • -j REDIRECT: Redirect the matching packets.
  • --to-port 3129: Change the destination port from 80 to 3129.
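You can confirm the rule landed where expected by listing the PREROUTING chain on the proxy; entry 1 should show a REDIRECT to port 3129:

root@squid-proxy:~# iptables -t nat -L PREROUTING -n --line-numbers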

The Squid configuration does the rest:

cat > /etc/squid/squid.conf <<EOF
visible_hostname squid

#Handling HTTP requests
http_port 3129 intercept
http_port 8080
acl allowed_http_sites dstdomain .neverssl.com
acl allowed_http_sites dstdomain .test.com
acl allowed_http_sites dstdomain .ubuntu.com
http_access allow allowed_http_sites
EOF

Squid will intercept the packets at port 3129 and serve as the middle man. Only a few sites are allowed — neverssl.com, test.com and ubuntu.com. All other sites will be rejected by Squid. Now we can make sense of this portion of the diagram:
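If you want to confirm this on the proxy, Squid can syntax-check its own configuration, and ss shows the two listeners:

root@squid-proxy:~# squid -k parse            # validate /etc/squid/squid.conf
root@squid-proxy:~# ss -tlnp | grep squid     # expect LISTEN on ports 3129 and 8080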

Continue testing:

root@squid-us-south-1-host:~# curl -s neverssl.com
<html>
    <head>
        <title>NeverSSL - helping you get online</title>
...

The configuration is centralized in the route table and applies to instances on configured subnets. The host instance requires no configuration to access the internet via Squid. If you could eavesdrop on the end-to-end conversation, you would see the following:

The source and destination IP addresses of the TCP packets are as you would expect, except for the ones explicitly noted:

  1. The request is addressed to 54.230.125.14. The route table's next-hop route delivers it to the proxy at 10.0.1.4.
  2. At the proxy, the Linux iptables rule redirects the packet to the Squid process. The Squid process establishes its own connection to neverssl.com; the source IP address is provided by the public gateway.
  3. The response is returned to the proxy/Squid over the public gateway.
  4. Squid impersonates neverssl.com, spoofing the IP address 54.230.125.14. The curl command running on the host is none the wiser.

Using tcpdump, it is possible to see some of the traffic. Bring up another ssh session to the proxy; the ssh command can be found in the terraform output. On the proxy, run tcpdump port 80 and do the same on the host, but put it into the background: tcpdump port 80 &. Then, on the host in the foreground, run the curl command again. The text below has been edited for readability.

Proxy:

root@squid-proxy:~# tcpdump port 80
listening on ens3, link-type EN10MB (Ethernet), capture size 262144 bytes
22:27:38.344635 IP 10.0.0.4.59434 > server-54-230-125-14.dfw50.r.cloudfront.net.http: Flags [S],
22:27:38.344774 IP server-54-230-125-14.dfw50.r.cloudfront.net.http > 10.0.0.4.59434: Flags [S

Host:

root@squid-us-south-1-host:~# tcpdump port 80 &
root@squid-us-south-1-host:~# curl -s neverssl.com >/dev/null
22:27:38.228500 IP squid-us-south-1-host.59434 > server-54-230-125-14.dfw50.r.cloudfront.net.http: Flags [S]
22:27:38.229147 IP server-54-230-125-14.dfw50.r.cloudfront.net.http > squid-us-south-1-host.59434: Flags [S

Verify that Squid denies access to virus.com. Notice the Squid error message:

root@squid-us-south-1-host:~# curl -s virus.com
...
<p>Generated Mon, 15 Mar 2021 22:41:20 GMT by squid (squid/3.5.27)</p>
<!-- ERR_ACCESS_DENIED -->
</div>
</body></html>
root@squid-us-south-1-host:~#

Clean up

When you are done investigating, clean up all of the resources using ./040-cleanup.sh. Take a look at the script; it is essentially a terraform destroy.

Conclusion

Configuring the VPC routing tables for Network Functions Virtualization (NFV) can be a great way to transparently insert functionality into a VPC network. More generally, route tables and routes can be used to both isolate and extend network connectivity.

Get more experience with NFV by configuring Squid to work with HTTPS traffic: SslBump Peek and Splice.

You can also find NFVs like these in the IBM Cloud catalog.
