Computing today is undergoing something of a revival. Although virtualization was created decades ago, its true potential is now being realized on commodity hardware. Virtualization consolidates server workloads for efficiency, but other elements of the server ecosystem are emerging as candidates for further consolidation. Many view virtualization as a consolidation of CPU, memory, and storage, but this view oversimplifies the solution. The network is a key aspect of virtualization and a first-class element of the virtualized environment.
Let's begin with a high-level exploration of the problem, and then dig down into the various ways that Linux® builds and supports network virtualization.
In a traditional environment (see Figure 1), a set of physical servers hosts the necessary application set. To enable communication among the servers, each server includes one or more network interface cards (NICs) that attach to an external networking infrastructure. The NIC, along with a networking software stack, enables communication among endpoints through the network infrastructure. As Figure 1 shows, this functionality is represented by a switch, which enables efficient packet communication among the participating endpoints.
Figure 1. Traditional networking infrastructure
The key innovation behind server consolidation is an abstraction of the physical hardware to allow multiple operating systems and applications to share the hardware (see Figure 2). This innovation is called a hypervisor (or virtual machine [VM] monitor). Each VM (an operating system and application set) views the underlying hardware as unshared and a complete machine, even though portions of it may not exist or may be shared by multiple VMs. An example of this is the virtual NIC (vNIC). The hypervisor may create one or more vNICs for each VM. These NICs can appear as physical NICs to the VM, but they actually represent only the interface of the NIC. The hypervisor also permits the dynamic construction of a virtual network, complete with virtualized switches to enable configurable communication among the VM endpoints. Finally, the hypervisor also permits communication to the physical networking infrastructure by attaching the server's physical NICs to the hypervisor's logical infrastructure, permitting efficient communication among VMs within the hypervisor as well as efficient communication to the external network. In the Resources section, you'll find plenty of links to more information on hypervisors with Linux (a rich area for the open source operating system).
Figure 2. Virtualized networking infrastructure
The virtualized network infrastructure has also enabled other interesting innovations, such as the virtual appliance. We'll look at these in addition to the elements of the virtual network as part of this exploration.
One of the key developments in virtualized networking infrastructure is the virtual switch. The virtual switch attaches vNICs to the physical NICs of the server and—more importantly—ties vNICs to other vNICs within the server for local communication. What makes this interesting is that traffic through a virtual switch is limited not by network speed but by memory bandwidth, which allows efficient communication among local VMs and minimizes the overhead of the network infrastructure. The savings results from the physical network being used only for communication among servers, with inter-VM traffic isolated within each server.
But as Linux already incorporates a layer-2 switch within the kernel, some have asked why a virtual switch is even necessary. The answer covers multiple attributes, but one of the most important is captured by a new classification for these switches: the distributed virtual switch, which enables cross-server bridging in a way that makes the underlying server architecture transparent. A virtual switch in one server can transparently join with a virtual switch in another server (see Figure 3). This makes migrating VMs between servers much simpler, because a migrated VM's virtual interfaces can attach to the distributed virtual switch on the new server and transparently join its virtual switched network.
Figure 3. The distributed virtual switch
One of the most important projects in this space is called the Open vSwitch, which this article explores next.
One issue with isolating local traffic within a server is that the traffic is not externally visible (for example, to network analyzers). Implementations address this problem through a variety of schemes, such as OpenFlow, NetFlow, and sFlow, which export interfaces for remotely monitoring and controlling traffic.
Early implementations of distributed virtual switches were closed and restricted to operating with proprietary sets of hypervisors. But in today's cloud environments, it's ideal to support a heterogeneous environment in which multiple hypervisors can coexist.
The Open vSwitch is a multilayer virtual switch that's available as open source under the Apache 2.0 license. As of May 2010, Open vSwitch was available as version 1.0.1 and supports an impressive set of features. Open vSwitch supports the leading open source hypervisor solutions, including Kernel-based VM (KVM), VirtualBox, Xen, and XenServer. It's also a drop-in replacement for the current Linux bridge module.
Open vSwitch consists of a switch daemon and a companion kernel module that manages flow-based switching. A variety of other daemons and utilities also exist for managing the switch (particularly from the perspective of OpenFlow). You can run Open vSwitch entirely in user space, but doing so degrades performance.
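As a concrete sketch of how the switch is managed, the following Python helper assembles the `ovs-vsctl` invocations that create a bridge and attach interfaces to it. The `add-br` and `add-port` subcommands are Open vSwitch's standard management commands; the bridge and interface names are illustrative, and the commands are returned rather than executed so the helper can be inspected without a running Open vSwitch installation.

```python
# Sketch: build the ovs-vsctl command lines that create a virtual switch
# (bridge) and attach a physical NIC plus per-VM tap interfaces to it.

def ovs_setup_cmds(bridge, ports):
    """Return argv lists: one add-br for the bridge, one add-port per port."""
    cmds = [["ovs-vsctl", "add-br", bridge]]
    for port in ports:
        cmds.append(["ovs-vsctl", "add-port", bridge, port])
    return cmds

# Example: one bridge joining the physical NIC and two VM tap devices.
for argv in ovs_setup_cmds("br0", ["eth0", "tap0", "tap1"]):
    print(" ".join(argv))  # on a live system: subprocess.run(argv, check=True)
```

Run from a shell on a live system, these three commands would create `br0` and make inter-VM traffic between `tap0` and `tap1` flow through the virtual switch without touching the physical network.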
In addition to providing a production-quality switch for VM environments, Open vSwitch includes an impressive feature road map to compete with closed, proprietary solutions.
The virtualization of NIC hardware has existed for some time in a variety of forms—well before the introduction of virtual switching. This section looks at some of the implementations as well as some of the hardware acceleration that's available to improve the speed of network virtualization.
Although QEMU is a platform emulator, it provides software emulation for a variety of hardware devices, including NICs. In addition, QEMU provides an internal Dynamic Host Configuration Protocol server for IP address assignment. QEMU works in concert with KVM to offer platform emulation and individual device emulation to provide the platform for KVM-based virtualization. You can learn more about QEMU in the Resources section.
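As a sketch of how QEMU's device emulation is wired up in practice, the helper below assembles a QEMU command line that backs a guest NIC with a host TAP device. The `-netdev`/`-device` flags follow QEMU's standard syntax; the disk image and TAP names are illustrative, and the list is returned rather than executed.

```python
# Sketch: assemble a QEMU command line giving a KVM guest a virtio NIC
# backed by a host-side TAP device.

def qemu_net_args(image, tap):
    return [
        "qemu-system-x86_64",
        "-enable-kvm",                                        # KVM acceleration
        "-m", "1024",                                         # 1 GB guest memory
        "-drive", "file=%s,format=qcow2" % image,
        "-netdev", "tap,id=net0,ifname=%s,script=no" % tap,   # host-side TAP
        "-device", "virtio-net-pci,netdev=net0",              # guest-side NIC
    ]

print(" ".join(qemu_net_args("guest.qcow2", "tap0")))
```

The `-netdev` half names the host backend (here, a TAP device) and the `-device` half names the emulated NIC the guest sees; the shared `netdev=net0` identifier ties the two together.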
virtio is an input/output (I/O) para-virtualization framework for Linux that simplifies and expedites I/O traffic from a VM to a hypervisor. virtio creates a standardized transport mechanism for I/O between a VM and the hypervisor for the purpose of virtualizing block devices, generic Peripheral Component Interconnect (PCI) devices, network devices, and others. You can learn more about the internals of virtio in the Resources section.
Virtualization has been implemented in networking stacks for quite some time to give guest VM networking stacks access to the host networking stack. Two of the schemes are TAP and TUN. TAP is a virtual network kernel driver that implements an Ethernet device and, as such, operates at the Ethernet frame level. The TAP driver provides the Ethernet "tap" through which guest Ethernet frames can be communicated. TUN (or network "tunnel") simulates a network-layer device and communicates at the higher level of IP packets, which provides a bit of an optimization, because the underlying Ethernet device can manage the layer-2 framing of the TUN's IP packets.
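To make the TAP mechanics concrete, here is a minimal sketch of the `ifreq` structure a program packs before issuing the `TUNSETIFF` ioctl on `/dev/net/tun`. The flag values mirror `<linux/if_tun.h>`; because opening the device requires privileges, the privileged steps appear only as comments.

```python
import struct

# Flag and request values from <linux/if_tun.h>:
IFF_TUN = 0x0001        # layer-3 device: reads/writes raw IP packets
IFF_TAP = 0x0002        # layer-2 device: reads/writes Ethernet frames
IFF_NO_PI = 0x1000      # omit the extra packet-information header
TUNSETIFF = 0x400454CA  # ioctl request that creates/attaches the device

def build_ifreq(name, flags):
    """Pack the start of struct ifreq: 16-byte name plus a short of flags."""
    return struct.pack("16sH", name.encode(), flags)

req = build_ifreq("tap0", IFF_TAP | IFF_NO_PI)
# With root privileges, a program would continue:
#   fd = os.open("/dev/net/tun", os.O_RDWR)
#   fcntl.ioctl(fd, TUNSETIFF, req)
#   ...then read()/write() whole Ethernet frames on fd.
```

Choosing `IFF_TUN` instead of `IFF_TAP` would yield the layer-3 variant, with `read()` returning IP packets rather than Ethernet frames.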
I/O virtualization (IOV) is a standardized scheme from the PCI Special Interest Group (PCI-SIG) that enables the acceleration of virtualization at the hardware level. In particular, Single Root IOV (SR-IOV) exposes an interface through which a single PCI Express (PCIe) card can appear as multiple PCIe cards, allowing multiple independent drivers to attach to the card without knowledge of one another. SR-IOV accomplishes this by extending virtual functions to the various users; these appear as physical functions of the PCIe space but are represented within the card as shared functions.
The benefit that SR-IOV brings to network virtualization is performance. Rather than the hypervisor implementing sharing of the physical NIC, the card itself implements the multiplexing, allowing I/O from guest VMs to pass directly through to the card.
Linux includes support for SR-IOV today, which benefits the KVM hypervisor. Xen also includes support for SR-IOV, allowing it to efficiently present a vNIC to the guest VMs. Support for SR-IOV is on the road map for the Open vSwitch.
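As a minimal sketch of the Linux-side control path, the helper below enables a number of virtual functions through the `sriov_totalvfs`/`sriov_numvfs` attributes under `/sys/class/net/<dev>/device`. This sysfs interface appeared in later kernels (driver module parameters were the more common mechanism at the time of this article), and the sysfs root is parameterized so the helper can be exercised without SR-IOV hardware.

```python
import os

def enable_vfs(dev, count, sysfs="/sys/class/net"):
    """Ask the kernel to instantiate `count` virtual functions for `dev`."""
    base = os.path.join(sysfs, dev, "device")
    with open(os.path.join(base, "sriov_totalvfs")) as f:
        total = int(f.read())               # VF limit reported by the card
    if count > total:
        raise ValueError("%s supports at most %d VFs" % (dev, total))
    with open(os.path.join(base, "sriov_numvfs"), "w") as f:
        f.write(str(count))                 # kernel creates the VFs on write
```

Each virtual function then appears as its own PCIe device that a guest driver can attach to, bypassing the hypervisor's software NIC sharing.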
Although related, virtual LANs (VLANs) are a method of virtualizing the physical network. VLANs provide the ability to create virtual networks across a distributed network so that disparate hosts (on independent networks) appear to be part of the same broadcast domain. VLANs accomplish this by tagging frames with VLAN information that identifies their membership in a particular VLAN (per the Institute of Electrical and Electronics Engineers [IEEE] 802.1Q standard). Hosts work in concert with VLAN-aware switches to virtualize the physical network. But although VLANs provide the illusion of separate networks, they share the same physical network and therefore its available bandwidth, along with the effects of congestion.
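The tagging that 802.1Q describes can be sketched in a few lines: a 4-byte tag (the 16-bit tag protocol identifier 0x8100, then 16 bits of tag control information holding priority, the drop-eligible bit, and the 12-bit VLAN ID) is inserted between the source MAC address and the EtherType. A minimal illustration, with a toy zeroed-out frame:

```python
import struct

TPID_8021Q = 0x8100  # tag protocol identifier marking an 802.1Q tag

def tag_frame(frame, vid, pcp=0, dei=0):
    """Insert a 4-byte 802.1Q tag after the 12 bytes of dst/src MAC."""
    assert 0 <= vid < 4096, "VLAN ID is a 12-bit field"
    tci = (pcp << 13) | (dei << 12) | vid  # tag control information
    tag = struct.pack("!HH", TPID_8021Q, tci)
    return frame[:12] + tag + frame[12:]

# A toy untagged frame: zeroed MACs, IPv4 EtherType, dummy payload.
untagged = bytes(12) + struct.pack("!H", 0x0800) + b"payload"
tagged = tag_frame(untagged, vid=42)
```

A VLAN-aware switch forwards a tagged frame only to ports that are members of VLAN 42, which is how disparate hosts come to share one broadcast domain while remaining isolated from others on the same wire.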
A number of I/O-focused virtualization accelerations have begun to appear that address NICs and other devices. Intel® Virtualization Technology for Directed I/O (VT-d) provides the capability to isolate I/O resources for improved reliability and security; it includes remapping of direct memory access transfers (using multi-level page tables) and remapping of device-associated interrupts, and it supports both unmodified guests and those that are virtualization aware. Intel Virtual Machine Device Queues (VMDq) also accelerate network traffic in virtualized settings by embedding queues and sorting intelligence in hardware, reducing CPU utilization in the hypervisor and improving overall system performance. Linux includes support for both.
So far, this article has explored the virtualization of NIC devices and switches, some of the existing implementations, and some of the ways these virtualizations are accelerated through hardware. Now, let's expand this discussion to general network services.
One of the interesting innovations in the virtualization space is the ecosystem that's evolving from the consolidation of servers. Rather than devote applications to specialized versions of hardware, portions of a server are isolated to power VMs that extend services within the server. These VMs are called virtual appliances, as they focus on a specific application and are developed for a virtualization setting.
The virtual appliance typically connects to the hypervisor—or the generic networking infrastructure that the hypervisor presents—to extend a specific service. What makes this unique is that in a consolidated server, portions of the processing capacity (such as cores) and I/O bandwidth can be dynamically configured for the virtual appliance. This ability makes it more cost-effective (because a single server is not dedicated to it), and you can dynamically alter its capacity based on the needs of the other applications running within the server. Virtual appliances can also be simpler to manage, because the application is tied to the operating system (within the VM). No special configuration is required, as the VM is preconfigured as a whole. That's a considerable benefit for virtual appliances and why their use is growing today.
Virtual appliances have been developed for many enterprise applications and include WAN optimization, routers, virtual private networks, firewalls, intrusion-protection/detection systems, e-mail classification and management, and more. Outside of network services, virtual appliances exist for storage, security, application frameworks, and content management.
There was once a time when everything that was manageable could also be physically touched. But today, in our increasingly virtualized world, physical devices and services disappear into the ether. Physical networks are virtually segmented to permit traffic isolation and the construction of virtual networks across geographically disparate entities. Applications disappear into virtual appliances that are segmented amongst cores within powerful servers, creating greater complexity for the administrator but also greater flexibility and improved manageability. And of course, Linux is at the forefront.
- Linux represents a fantastic operating system as well as a platform for virtualization solutions. You can learn more about Linux and virtualization in Virtual Linux (developerWorks, December 2006) and Anatomy of a Linux hypervisor (developerWorks, May 2009).
- Linux implements an I/O virtualization framework (used by KVM) called virtio, which provides a common framework for the development of efficient para-virtualized drivers. You can learn more about virtio and its internals in Virtio: An I/O virtualization framework for Linux (developerWorks, January 2010).
- SR-IOV provides the means to virtualize a physical adapter for use by multiple guest VMs. You can read more about device emulation and I/O virtualization in Linux virtualization and PCI passthrough (developerWorks, October 2009).
- SR-IOV allows multiple guest operating systems to share PCIe devices. You can learn more about SR-IOV from Intel's hardware design site. The PCI-SIG provides the specifications for the various IOV schemes.
- Virtual appliances are a relatively new form factor for the delivery of software applications. An important goal for virtual appliances is the ability to share them among multiple hypervisors for the greatest portability. A step in this direction is the Open Virtualization Format (OVF), which defines the format of virtual appliance metadata. You can learn more about virtual appliances and the OVF in Virtual appliances and the Open Virtualization Format (developerWorks, October 2009).
- QEMU is an open source emulator for complete computer systems that also provides a complete virtualization solution (via emulation). You can learn more about QEMU in System emulation with QEMU (developerWorks, September 2007).
- The IEEE 802.1Q standard provides the networking standard for VLAN tagging, defining the concept of a VLAN at the MAC layer for virtual isolation of devices.
Get products and technologies
- OpenSolaris implemented NIC and switch virtualization as part of a project called Crossbow. Project Crossbow brought virtualization and bandwidth resource control into the network stack to minimize complexity and overhead.
- Open vSwitch is the first open source multi-layer switch that serves the virtualized ecosystem. Recently released at version 1.0, Open vSwitch provides a great feature list and supports a number of open source hypervisors (including KVM, Xen, XenServer, and VirtualBox).
- The Xen Cloud Platform is a virtualization infrastructure that incorporates the Open vSwitch virtual switch package as part of its stack.
M. Tim Jones is an embedded firmware architect and the author of Artificial Intelligence: A Systems Approach, GNU/Linux Application Programming (now in its second edition), AI Application Programming (in its second edition), and BSD Sockets Programming from a Multilanguage Perspective. His engineering background ranges from the development of kernels for geosynchronous spacecraft to embedded systems architecture and networking protocols development. Tim is a Consultant Engineer for Emulex Corp. in Longmont, Colorado.