Network Functions Virtualization (NFV) Using VPC Routing

This post will demonstrate a Squid NFV using VPC routing.

A virtual private cloud (VPC) gives an enterprise the ability to define and control a virtual network that is logically isolated from all other public cloud tenants, creating a private, secure place on the public cloud.  

VPC routing allows more control over network flow and can be used to support Network Functions Virtualization (NFV) for advanced networking services, such as third-party routing, firewalls, local/global load balancing, web application firewalls and more.

This post will demonstrate a Squid NFV. Other off-the-shelf firewall instances like those from Palo Alto and F5 can be similarly configured. To quote the Squid site: "Squid is a caching proxy for the Web supporting HTTP, HTTPS, FTP, and more. It reduces bandwidth and improves response times by caching and reusing frequently-requested web pages."

The host instance is going to read from internet websites. Internet-bound traffic from the host subnet will be sent to the proxy instance by the routing table and routes. The Squid NFV on the proxy instance will connect to the website and act as a middle man between the host and the website.

In the diagram above, the website is neverssl.com; the Squid proxy will impersonate (AKA spoof) neverssl.com. The proxy will be an undetectable intermediary in the conversation so existing applications on the host do not require code changes to benefit from Squid functionality.

You could click through the console to create the VPC, subnets, route table, route table route, instances, etc. This post will use Terraform, so it will be up and running in just a few minutes.

Tooling prerequisites

The provisioning steps are run from the CLI. This will allow you to move these steps into a CI/CD pipeline or into IBM Cloud Schematics over time.

You can skip these prerequisites by using the IBM Cloud Shell, where the tools are preinstalled. Otherwise, verify that the following tools are installed on your workstation. See the "Getting started with solution tutorials" guide for help on installing them:

  • Git
  • IBM Cloud CLI
  • Terraform
  • jq
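
If you want a quick check before cloning the repository, a minimal sketch along the following lines reports which of the tools above are not on your PATH. The function name missing_tools is hypothetical, and this is not the content of the repository's own prerequisite script; it only looks for the commands and does not check versions.

```shell
#!/bin/sh
# Hypothetical helper: print the names of any commands that are not on PATH.
missing_tools() {
  missing=""
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      missing="$missing $tool"
    fi
  done
  # Trim the leading space before printing.
  echo "${missing# }"
}

# Check the four tools listed above.
result=$(missing_tools git ibmcloud terraform jq)
if [ -n "$result" ]; then
  echo "Missing tools: $result"
else
  echo "All tools found"
fi
```

In the IBM Cloud Shell this should report that all tools are found, since they are preinstalled there.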

IAM prerequisites

You will need permissions to create VPC resources. Even if you are the account owner, an additional IAM policy is required to create instances with network interfaces that allow spoofing. See "About IP spoofing checks."

I am the account administrator, so I executed this command line in the Cloud Shell using my email address:

ibmcloud iam user-policy-create YOUR_USER_EMAIL_ADDRESS --roles "IP Spoofing Operator" --service-name is

Alternatively, you can add this policy in the IBM Cloud Console IAM section starting at Users:

  • Click the User
  • Click Access policies
  • Click Assign access
  • Click IAM services
  • Choose VPC Infrastructure Services from the drop-down
  • Select the IP Spoofing Operator role

Create and test

Clone the source code repository and execute the tooling prerequisite check:

git clone https://github.com/IBM-Cloud/vpc-nfv-squid
cd vpc-nfv-squid
cp local.env.template local.env
edit local.env
source local.env
./000-prereq.sh

Create the resources. Take a look at the script; it is pretty simple:

cat ./010-create.sh
#!/bin/bash

terraform init
terraform apply -auto-approve

If Terraform produces the following error message instead of provisioning the proxy instance, make sure you have correctly configured your account with the IP Spoofing Operator permission as mentioned above:

Error: the provided token is not authorized to  the specified instance (ID:NEWRESOURCE) in this account

The Terraform heavy lifting is defined in main.tf. Even if you are not familiar with Terraform, take a look. You will find a self-documenting blueprint of the architecture. Once Terraform completes successfully, open the VPC layout in the IBM Cloud console and select all of the subnets. I configured a basename in local.env of Squid:

Run the test script to verify that everything is working as expected. You will need to accept the ssh host keys for the new IP addresses when prompted:

$ ./030-test.sh

>>> verify it is possible to ssh to the host and execute the true command
ssh -J root@52.116.133.164 root@10.0.0.4 true

>>> verify proxy connectivity using ping
ssh -J root@52.116.133.164 root@10.0.0.4 ping 10.0.1.4 -c 2
PING 10.0.1.4 (10.0.1.4) 56(84) bytes of data.
64 bytes from 10.0.1.4: icmp_seq=1 ttl=64 time=0.540 ms
64 bytes from 10.0.1.4: icmp_seq=2 ttl=64 time=0.422 ms

--- 10.0.1.4 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1013ms
rtt min/avg/max/mdev = 0.422/0.481/0.540/0.059 ms

>>> verify explicy specifying the squid proxy server ip works. Testing the network path - not testing the router
ssh -J root@52.116.133.164 root@10.0.0.4 set -o pipefail; curl neverssl.com -s --proxy 10.0.1.4:8080 | grep poorly-behaved > /dev/null

>>> veriy direct access to neverssl.com, end to end, through the route table
ssh -J root@52.116.133.164 root@10.0.0.4 set -o pipefail; curl neverssl.com -s | grep poorly-behaved > /dev/null

>>> verify implicit access to a denied host fails
ssh -J root@52.116.133.164 root@10.0.0.4 curl virus.com -s | grep squid > /dev/null
>>> success

In a test-driving fashion, let's dive deeper into the system that has been created.

Jump instance

The only instance that can be reached directly via ssh is the jump (bastion). Check out the security group ssg_ssl in main.tf; "Securely access remote instances with a bastion host" details the concepts. The Terraform output has a copy/paste string you can use to ssh to the host through the jump. The rest of the testing is done through the jump host, so you can verify the test results yourself. For me, it looked like this:

$ terraform output host 
[
  {
    "ip_host" = "10.0.0.4"
    "sshhost" = "ssh -J root@52.116.137.7 root@10.0.0.4"
  },
]
$ ssh -J root@52.116.137.7 root@10.0.0.4
...
root@squid-us-south-1-host:~# 

Host to proxy access

In the last step you ssh'd to host. Let's reproduce some of the tests. Is the proxy reachable?

root@squid-us-south-1-host:~# ping 10.0.1.4 -c 2
PING 10.0.1.4 (10.0.1.4) 56(84) bytes of data.
64 bytes from 10.0.1.4: icmp_seq=1 ttl=64 time=0.532 ms
64 bytes from 10.0.1.4: icmp_seq=2 ttl=64 time=0.428 ms

--- 10.0.1.4 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1020ms
rtt min/avg/max/mdev = 0.428/0.480/0.532/0.052 ms

Next, verify that the Squid service is running on the proxy and that Squid is able to reach the internet. Squid is listening on port 8080, so the following curl should work:

root@squid-us-south-1-host:~# curl neverssl.com -s --proxy 10.0.1.4:8080
<html>
    <head>
        <title>NeverSSL - helping you get online</title>
...

Host access via Virtual Network Function

Finally, the beauty of VPC routing and NFV can be seen by opening the routing tables, selecting the VPC and clicking on the route table:

The 10.0.0.0 entry is the CIDR for the VPC. The 166.8.0.0 and 161.26.0.0 CIDRs are service endpoints in the IBM Cloud for software repository mirrors, time servers, DNS servers, etc. Check out the available endpoints. These are all delegated to the default routing table. See create route.

The interesting CIDR, 0.0.0.0/0, matches everything else. The next hop, 10.0.1.4, is the proxy. When the host connects to neverssl.com at IP address 54.230.125.14, it will match this route and the connection will be made to the proxy instance.
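
To make the route selection concrete, here is a rough sketch that mimics it in shell. It assumes the most specific (longest prefix) matching route wins, and the prefix lengths /8, /14 and /16 are assumptions for illustration because they are not stated in this post. The function names are hypothetical; the real decision is made by the VPC, not by any script on the instances.

```shell
#!/bin/sh
# Convert a dotted-quad IPv4 address to an integer.
ip_to_int() {
  (
    # Split on dots inside a subshell so the IFS change does not leak.
    IFS=.
    set -- $1
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
  )
}

# in_cidr IP CIDR: succeed if IP falls inside CIDR.
in_cidr() {
  _ip=$(ip_to_int "$1")
  _net=$(ip_to_int "${2%/*}")
  _len=${2#*/}
  _mask=$(( (4294967295 << (32 - $_len)) & 4294967295 ))
  [ $(( $_ip & $_mask )) -eq $(( $_net & $_mask )) ]
}

# next_hop_for IP: print the action of the longest matching prefix.
# Prefix lengths below are assumed; "delegate" stands for the default routing.
next_hop_for() {
  best_len=-1
  best_hop=""
  while read -r cidr hop; do
    len=${cidr#*/}
    if in_cidr "$1" "$cidr" && [ "$len" -gt "$best_len" ]; then
      best_len=$len
      best_hop=$hop
    fi
  done <<'ROUTES'
10.0.0.0/8 delegate
166.8.0.0/14 delegate
161.26.0.0/16 delegate
0.0.0.0/0 10.0.1.4
ROUTES
  echo "$best_hop"
}

next_hop_for 54.230.125.14   # prints 10.0.1.4 (internet traffic goes to the proxy)
next_hop_for 10.0.1.4        # prints delegate (VPC traffic keeps default routing)
```

Only 0.0.0.0/0 matches the neverssl.com address, so that traffic is handed to the proxy, while addresses inside the VPC or the service endpoint ranges match a more specific delegated route first.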

The Squid service and Linux iptables are configured in the proxy_user_data.sh file. Notice the command:

iptables -t nat -I PREROUTING 1 -s $host_ipv4_cidr_block -p tcp --dport 80 -j REDIRECT --to-port 3129

The above iptables command, executed on the proxy, adds a rule to the kernel's NAT table that redirects some of the incoming packets to the intercept port of the Squid application. Let's break it down:

  • -t nat -I PREROUTING 1: Insert the rule as entry #1 in the PREROUTING chain of the network address translation table.
  • -s $host_ipv4_cidr_block: Only consider packets from the host CIDR block.
  • -p tcp: Only consider the TCP protocol.
  • --dport 80: Only consider packets to port 80 (HTTP).
  • -j REDIRECT: Redirect the matching packets.
  • --to-port 3129: Change the destination port from 80 to 3129.
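
As a small illustration of how the pieces fit together, the sketch below prints (but does not apply) the rule for a given source CIDR. The function name redirect_rule is hypothetical, and 10.0.0.0/24 is an assumed value for $host_ipv4_cidr_block chosen to be consistent with the host address 10.0.0.4; the actual value comes from the Terraform configuration.

```shell
#!/bin/sh
# Hypothetical helper: print the REDIRECT rule for one source CIDR.
# Nothing is applied; the output is only the command text.
redirect_rule() {
  src_cidr=$1
  intercept_port=${2:-3129}   # Squid intercept port used in this post
  printf 'iptables -t nat -I PREROUTING 1 -s %s -p tcp --dport 80 -j REDIRECT --to-port %s\n' \
    "$src_cidr" "$intercept_port"
}

# Assumed host subnet CIDR, for illustration only.
redirect_rule 10.0.0.0/24
```

On the proxy itself, running iptables -t nat -L PREROUTING -n --line-numbers as root should list the installed rule.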

The Squid configuration does the rest:

cat > /etc/squid/squid.conf <<EOF
visible_hostname squid

#Handling HTTP requests
http_port 3129 intercept
http_port 8080
acl allowed_http_sites dstdomain .neverssl.com
acl allowed_http_sites dstdomain .test.com
acl allowed_http_sites dstdomain .ubuntu.com
http_access allow allowed_http_sites
EOF

Squid will intercept the packets at port 3129 and serve as the middle man. Only a few sites are allowed: neverssl.com, test.com and ubuntu.com. All other sites will be rejected by Squid. Now we can make sense of this portion of the diagram:

Continue testing:

root@squid-us-south-1-host:~# curl -s neverssl.com
<html>
    <head>
        <title>NeverSSL - helping you get online</title>
...

The configuration is centralized in the route table and applies to instances on configured subnets. The host instance requires no configuration to access the internet via Squid. If you could eavesdrop on the end-to-end conversation, you would see the following:

The source and destination IP addresses of the TCP packets are as you would expect, except for the ones explicitly noted:

  1. The request is addressed to 54.230.125.14. The route table next hop route delivers it to the proxy at 10.0.1.4.
  2. At the proxy, the Linux iptables rule redirects the request to the Squid process. The Squid process establishes a connection to neverssl.com. The IP address is provided by the public gateway.
  3. The response is returned to the proxy/Squid over the public gateway.
  4. Squid impersonates neverssl.com, spoofing the IP address 54.230.125.14. The curl command running on host is none the wiser.

Using tcpdump, it is possible to see some of the traffic. Open another ssh session to the proxy; the ssh command can be found in the Terraform output. On the proxy, run tcpdump port 80. On the host, run the same command in the background: tcpdump port 80 &. Then, on the host in the foreground, run the curl command again. The text below has been edited for readability.

Proxy:

root@squid-proxy:~# tcpdump port 80
listening on ens3, link-type EN10MB (Ethernet), capture size 262144 bytes
22:27:38.344635 IP 10.0.0.4.59434 > server-54-230-125-14.dfw50.r.cloudfront.net.http: Flags [S],
22:27:38.344774 IP server-54-230-125-14.dfw50.r.cloudfront.net.http > 10.0.0.4.59434: Flags [S

Host:

root@squid-us-south-1-host:~# tcpdump port 80 &
root@squid-us-south-1-host:~# curl -s neverssl.com >/dev/null
22:27:38.228500 IP squid-us-south-1-host.59434 > server-54-230-125-14.dfw50.r.cloudfront.net.http: Flags [S]
22:27:38.229147 IP server-54-230-125-14.dfw50.r.cloudfront.net.http > squid-us-south-1-host.59434: Flags [S

Verify that Squid denies access to virus.com. Notice the Squid error message:

root@squid-us-south-1-host:~# curl -s virus.com
...
<p>Generated Mon, 15 Mar 2021 22:41:20 GMT by squid (squid/3.5.27)</p>
<!-- ERR_ACCESS_DENIED -->
</div>
</body></html>
root@squid-us-south-1-host:~#
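
The allow and deny results seen so far follow from the dstdomain entries in squid.conf. As a rough sketch, assuming a dstdomain entry with a leading dot matches the domain itself and any of its subdomains, the allow list can be mimicked in shell. The function name is_allowed is hypothetical and the check is a simplification of what Squid actually does.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the host is one of the allowed domains
# or a subdomain of one, mirroring the three dstdomain entries.
is_allowed() {
  for domain in neverssl.com test.com ubuntu.com; do
    case "$1" in
      "$domain" | *".$domain") return 0 ;;
    esac
  done
  return 1
}

for site in neverssl.com www.ubuntu.com virus.com; do
  if is_allowed "$site"; then
    echo "$site: allowed"
  else
    echo "$site: denied"
  fi
done
```

Here neverssl.com and www.ubuntu.com come out as allowed and virus.com as denied, which matches the curl results above.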

Clean up

When you are done investigating, clean up all the resources using ./040-cleanup.sh. Take a look at the script; it runs terraform destroy.

Conclusion

Configuring the VPC routing tables for Network Functions Virtualization (NFV) can be a great way to transparently insert functionality into a VPC network. More generally, route tables and routes can be used to both isolate and extend network connectivity.

Get more experience with NFV by configuring Squid to work with HTTPS traffic: SslBump Peek and Splice.

You can also find these NFVs in the IBM Cloud Catalog:

 
