Virtualization is standard equipment with most server operating systems on the market today. In the Linux® world, there are two primary choices for server virtualization: the Kernel-based Virtual Machine (KVM) and Xen. KVM is the primary technology that Red Hat and others use. Although Citrix owns Xen, much of the core functionality remains open source.
The Virtual Machine Manager (VMM, or
virt-manager) project provides a tool for
managing the creation and running of both KVM and Xen virtual machine (VM)
instances. VMM is written in Python using the GTK+ library for graphical
user interface construction. The real work is done through the
libvirt library, which is what you'll be using
for this article series. Although libvirt is a
Red Hat-sponsored effort, it remains an open source project available
under the GNU Lesser General Public License.
libvirt is made up of several different pieces,
including the application programming interface (API) library, a daemon
(libvirtd), and a default command-line utility
(virsh). For the purposes of this article, all
testing is done using Ubuntu Server version 11.04. The Installation and setup section
covers everything I did to configure my server for developing the scripts
presented here. Part 1 covers the basics of
libvirt and Kernel-based Virtual Machine (KVM)
virtualization along with a few command-line scripts to whet your
appetite. Part 2 will dive deeper and show you how you can build your own
virtualization management tools using libvirt,
Python, and wxPython.
Before we dive into the actual code examples, let's go over a few terms and concepts related to virtualization with KVM. When you install KVM on a server like Ubuntu Server 11.04, you're establishing a virtualization host, or hypervisor. That means that your server will be able to host multiple guest operating systems running on top of the KVM host. Each unique guest is called a domain and functions in much the same way you would expect from a single server instance on an individual machine. You can connect to a guest over Secure Shell (SSH) or Virtual Network Computing (VNC) just as if you were communicating with a physical machine.
Although KVM functions as the hypervisor or guest manager, QEMU provides the actual machine emulation, meaning that QEMU executes the native instruction set of the target machine. For x86 guests, this execution translates into native instructions capable of direct execution on the underlying hardware. For other architectures, such as ARM, a translation process must take place. The combination of KVM and QEMU provides all the support functions needed to virtualize essentially every currently available operating system plus a few that are no longer available.
A guest domain consists of a number of files, including one or
more disk image files and an XML-based configuration file. This setup
makes it extremely simple to manage multiple VMs by creating a baseline
system image, and then modifying the configuration file to suit your
needs. One method of configuring and communicating with KVM/QEMU is the
libvirt toolkit. Multiple vendors have
standardized their management products based on
libvirt.
Look at the contents of a typical libvirt XML configuration file. Listing 1 shows the testdev.xml file from the
libvirt examples.
Listing 1. Device XML definition
<device>
  <name>File_test_device</name>
  <capability type='system'>
    <hardware>
      <vendor>Libvirt</vendor>
      <version>Test driver</version>
      <serial>123456</serial>
      <uuid>11111111-2222-3333-4444-555555555555</uuid>
    </hardware>
    <firmware>
      <vendor>Libvirt</vendor>
      <version>Test Driver</version>
      <release_date>01/22/2007</release_date>
    </firmware>
  </capability>
</device>
From the test domfv0.xml file shown in Listing 2, you can see a bit more detail about configuring virtual devices.
Listing 2. domfv0.xml device definition file
<devices>
  <emulator>/usr/lib/xen/bin/qemu-dm</emulator>
  <interface type='bridge'>
    <source bridge='xenbr0'/>
    <mac address='00:16:3e:5d:c7:9e'/>
    <script path='vif-bridge'/>
  </interface>
  <disk type='file'>
    <source file='/root/fv0'/>
    <target dev='hda'/>
  </disk>
  <disk type='file' device='cdrom'>
    <source file='/root/fc5-x86_64-boot.iso'/>
    <target dev='hdc'/>
    <readonly/>
  </disk>
  <disk type='file' device='floppy'>
    <source file='/root/fd.img'/>
    <target dev='fda'/>
  </disk>
  <graphics type='vnc' port='5904'/>
</devices>
The key point here is the relative ease with which you can read these files and subsequently create your own. Although you could build any number of configuration files by hand, it's also possible to automate the building using a scripting language like Python.
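As a rough illustration of that kind of automation, the following sketch uses ElementTree from the Python Standard Library to generate a devices fragment similar to the one in Listing 2. The function name build_devices and its parameters are hypothetical, chosen only for this example; the element and attribute names are taken from Listing 2.

```python
import xml.etree.ElementTree as ET


def build_devices(bridge, mac, disk_image, vnc_port):
    """Return a <devices> XML fragment as a string.

    Hypothetical helper for illustration; element names follow Listing 2.
    """
    devices = ET.Element('devices')

    # Bridged network interface with a fixed MAC address
    iface = ET.SubElement(devices, 'interface', {'type': 'bridge'})
    ET.SubElement(iface, 'source', {'bridge': bridge})
    ET.SubElement(iface, 'mac', {'address': mac})

    # File-backed disk presented to the guest as hda
    disk = ET.SubElement(devices, 'disk', {'type': 'file'})
    ET.SubElement(disk, 'source', {'file': disk_image})
    ET.SubElement(disk, 'target', {'dev': 'hda'})

    # VNC console on the requested port
    ET.SubElement(devices, 'graphics', {'type': 'vnc', 'port': str(vnc_port)})

    return ET.tostring(devices).decode('utf-8')


if __name__ == '__main__':
    print(build_devices('xenbr0', '00:16:3e:5d:c7:9e', '/root/fv0', 5904))
```

Calling the function in a loop with different MAC addresses and image paths is one way to stamp out many similar guest definitions from a single baseline.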
Because this article is about scripting KVM, there is a basic assumption that you have a server with KVM installed. In the case of Ubuntu Server 11.04, you have the option of installing virtualization during the setup process by choosing the Virtual Machine Host option on the Software selection screen. You might also want to choose the OpenSSH server should you want to connect remotely to the machine.
The first order of business is to install the latest version of
libvirt. To do this, you have to do some
command-line work. When you install Ubuntu Server 11.04, you get
libvirt version 0.8.8. The latest and greatest
version available from the libvirt website is
0.9.5. To install a later version, you need to add a Personal Package
Archive (PPA) repository to your system containing a more recent version
of libvirt. A quick search on the launchpad.net
site for libvirt shows a number of potential
candidates. It's important to view the repository details page before you
try to perform an update, as some may have broken packages. The Ubuntu
Virtualization Team maintains a PPA repository with several packages,
including libvirt. The latest version available
in that PPA at the time of this writing was 0.9.2-4.
Perform the following steps to install that version:
- Install the python-software-properties package as follows:
sudo apt-get install python-software-properties
This command makes available the add-apt-repository command that you need to reference the third-party source.
- Type the following commands:
sudo add-apt-repository ppa:ubuntu-virt/ppa
sudo apt-get update
sudo apt-get install libvirt-bin
- Because you'll be using Python to do all the scripting for this
article, install the IDLE shell to make it easier to write and test
scripts.
This step assumes that you have installed the desktop environment on your Ubuntu server. The quickest way to get the desktop installed is to use the following command:
sudo apt-get install ubuntu-desktop
After that's done, you'll have access to any number of graphical applications along with the Ubuntu software installer. You can use the Ubuntu Software Center to install the Python IDLE tool.
At this point, let's look at some of the fundamentals of working with
libvirt before getting too deeply into code.
Communication between an application and the
libvirt library uses a simple remote procedure
call mechanism, which makes it possible to build applications to
communicate with remote hypervisors over a TCP/IP connection. Uniform
Resource Identifiers (URIs, defined by Internet Engineering Task Force
[IETF] Request for Comments [RFC] 2396) are used to identify a specific
hypervisor with which you want to establish a connection.
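To make the URI idea more concrete, here is a small sketch that assembles connection URIs of the general form driver[+transport]://[user@][host]/path. The helper name build_uri is hypothetical, and the host name and user in the second call are placeholders. The local form qemu:///system is the one used in the listings later in this article.

```python
def build_uri(driver, path, host='', transport='', user=''):
    """Assemble a libvirt-style connection URI (hypothetical helper)."""
    scheme = driver
    if transport:
        # A remote transport is appended to the driver name, e.g. qemu+ssh
        scheme = '%s+%s' % (driver, transport)
    authority = host
    if user and host:
        authority = '%s@%s' % (user, host)
    return '%s://%s/%s' % (scheme, authority, path.lstrip('/'))


if __name__ == '__main__':
    # Local system-level QEMU/KVM hypervisor
    print(build_uri('qemu', 'system'))
    # The same hypervisor reached over SSH; host and user are placeholders
    print(build_uri('qemu', 'system', host='kvmhost.example.com',
                    transport='ssh', user='root'))
```

The resulting string is what you would pass to libvirt.open() to establish the connection.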
Local connections typically do not require authentication, although some remote connections do. The libvirtd.conf file controls the security configuration of the libvirt daemon. The most extensive control over communicating with a unique domain is through network filtering. Here's an example of how you would control network traffic using a filter:
<devices>
  <interface type='bridge'>
    <mac address='00:16:3e:5d:c7:9e'/>
    <filterref filter='clean-traffic'/>
  </interface>
</devices>
This snippet references a filter named
clean-traffic that will be applied to all
network traffic on the interface with the specified media access control (MAC) address. If
you examine the clean-traffic XML, it contains
the following:
<filter name='clean-traffic' chain='root'>
  <uuid>6f145c54-e3de-4c33-544a-70b69c16d9da</uuid>
  <filterref filter='no-mac-spoofing'/>
  <filterref filter='no-ip-spoofing'/>
  <filterref filter='allow-incoming-ipv4'/>
  <filterref filter='no-arp-spoofing'/>
  <filterref filter='no-other-l2-traffic'/>
  <filterref filter='qemu-announce-self'/>
</filter>
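Because a filter such as clean-traffic is mostly a list of references to other filters, a few lines of Python are enough to pull those names out. The sketch below parses a copy of the XML shown above with xml.dom.minidom; the function name referenced_filters is hypothetical, and the XML is embedded as a string so that the example runs without a hypervisor.

```python
from xml.dom.minidom import parseString

FILTER_XML = """
<filter name='clean-traffic' chain='root'>
  <filterref filter='no-mac-spoofing'/>
  <filterref filter='no-ip-spoofing'/>
  <filterref filter='allow-incoming-ipv4'/>
  <filterref filter='no-arp-spoofing'/>
  <filterref filter='no-other-l2-traffic'/>
  <filterref filter='qemu-announce-self'/>
</filter>
"""


def referenced_filters(xml_text):
    """Return the names of all filters that a filter definition references."""
    doc = parseString(xml_text.strip())
    return [ref.getAttribute('filter')
            for ref in doc.getElementsByTagName('filterref')]


if __name__ == '__main__':
    for name in referenced_filters(FILTER_XML):
        print(name)
```

Each name returned is itself a filter definition that you can inspect the same way.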
The filter capabilities are quite extensive and fully documented. It takes
only one command if you want to have a local copy of the
libvirt documentation and sample files. Here's
what you need to do:
sudo apt-get install libvirt-doc
With that done, all the documentation is available in the
/usr/share/doc/libvirt-doc directory. You'll see the Python examples a bit
later. If you've updated to a more recent version of
libvirt, you may need to explicitly install the
Python bindings. Doing so requires a single command:
sudo apt-get install python-libvirt
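A quick way to confirm that the bindings are importable, and to see which libvirt version they are built against, is sketched below. libvirt reports its version as a single packed integer (major * 1,000,000 + minor * 1,000 + release); the helper name decode_version is hypothetical and exists only to unpack that number.

```python
def decode_version(number):
    """Split libvirt's packed version integer into (major, minor, release)."""
    major = number // 1000000
    minor = (number // 1000) % 1000
    release = number % 1000
    return (major, minor, release)


if __name__ == '__main__':
    try:
        import libvirt
    except ImportError:
        print('The libvirt Python bindings are not installed')
    else:
        # getVersion() with no arguments returns the library version
        print('libvirt version %d.%d.%d' % decode_version(libvirt.getVersion()))
```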
Use Python's IDLE console to run Python code that establishes a connection with a local QEMU instance and then examines the defined domains. Listing 3 shows what you should see using this approach.
Listing 3. Viewing Python code in the IDLE console
Python 2.7.1+ (r271:86832, Apr 11 2011, 18:13:53)
[GCC 4.5.2] on linux2
Type "copyright", "credits" or "license()" for more information.
==== No Subprocess ====
>>> import libvirt
>>> conn=libvirt.open("qemu:///system")
>>> names = conn.listDefinedDomains()
>>> print names
['Test1', 'SBSLite', 'UbuntuServer1104', 'Win7_64-bit']
>>>
This code shows how to get a list of all defined domains, that is, domains that are configured but not currently running. The return from
the listDefinedDomains() function shows a list
containing four named domains. Once you establish a connection to the
hypervisor, you will have access to a laundry list of available functions.
Here's a short script showing how to get a list of all functions
available on the conn object:
clist = dir(conn)
for item in clist:
    print(item)
To see a list of defined filters, you could use a similar approach:
filts = conn.listNWFilters()
for item in filts:
    print(item)
The IDLE tool is a great way to investigate the various API calls and
quickly see the results returned when executed. Some of the functions
operate only on running domains. The Python
dir() function returns a list of valid
attributes for the specified object. It's a convenient command-line tool
to quickly see what a particular object provides. You can use it as shown
above to get a list of functions available after establishing a connection
to the hypervisor.
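The raw output of dir() also includes private and special attributes such as __init__. As a small convenience, the sketch below keeps only the public callables. The helper name public_methods is hypothetical, and a plain Python list stands in for the conn object so that the example runs without a hypervisor.

```python
def public_methods(obj):
    """Return the sorted names of callable attributes without a leading underscore."""
    names = []
    for name in dir(obj):
        if name.startswith('_'):
            continue  # skip private and special attributes such as __init__
        if callable(getattr(obj, name)):
            names.append(name)
    return sorted(names)


if __name__ == '__main__':
    # Any object works; a plain list stands in for the conn object here
    for name in public_methods([]):
        print(name)
```

In the IDLE console you would pass conn, or a domain object, instead of the list.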
To demonstrate, you can use a few lines of Python code in the IDLE console to get an idea of the kinds of operations you can perform on a specific domain. Listing 4 provides an example of what you can do.
Listing 4. Python output of a domain object
>>> import libvirt
>>> import pprint
>>> conn=libvirt.open("qemu:///system")
>>> p = conn.lookupByName('ubuntu100403')
>>> pprint.pprint(dir(p))
['ID',
'OSType',
'UUID',
'UUIDString',
'XMLDesc',
'__del__',
'__doc__',
'__init__',
'__module__',
'_conn',
'_o',
'abortJob',
'attachDevice',
'attachDeviceFlags',
'autostart',
'blkioParameters',
'blockInfo',
'blockPeek',
'blockStats',
'connect',
'coreDump',
'create',
'createWithFlags',
'destroy',
'detachDevice',
'detachDeviceFlags',
'hasCurrentSnapshot',
'hasManagedSaveImage',
'info',
'injectNMI',
'interfaceStats',
'isActive',
'isPersistent',
You can take this basic approach to building a simple script that lists
information about all running domains. You use the
listDomainsID() and
lookupByID() function calls to do most of the
work, as Listing 5 shows.
Listing 5. Python list domains script
import libvirt

conn = libvirt.open("qemu:///system")

for id in conn.listDomainsID():
    dom = conn.lookupByID(id)
    # info() returns [state, max memory, memory, virtual CPUs, CPU time]
    infos = dom.info()
    print('ID = %d' % id)
    print('Name = %s' % dom.name())
    print('State = %d' % infos[0])
    print('Max Memory = %d' % infos[1])
    print('Number of virt CPUs = %d' % infos[3])
    # infos[2] is the memory currently in use (in KB); CPU time in ns is infos[4]
    print('Memory (in KB) = %d' % infos[2])
    print(' ')
The output from this script, with one domain active and another suspended, looks like this:
ID = 3
Name = ubuntu100403
State = 3
Max Memory = 1048576
Number of virt CPUs = 1
Memory (in KB) = 1048576

ID = 4
Name = Win7_64-bit
State = 1
Max Memory = 2097152
Number of virt CPUs = 2
Memory (in KB) = 2097152
libvirt also implements Python docstrings for
all classes and methods. You can access this information by typing
help(libvirt) for the top-level help or
help(libvirt.class) for a specific class. You
must have imported the libvirt module before
typing the help() command. The version I tested
for this review implements the following 11 classes:
- libvirtError
- virConnect
- virDomain
- virDomainSnapshot
- virInterface
- virNWFilter
- virNetwork
- virSecret
- virStoragePool
- virStorageVol
- virStream
This list should help you decode the syntax for accessing
libvirt functions from Python. It also gives
you a list of all named constants, like
VIR_DOMAIN_RUNNING, which equals 1. Functions
like dom.info(), used above, return integer values (the domain state is the first
element of the returned list) that need to be decoded against this constant table.
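One way to do that decoding is with a small lookup table, sketched below. The integers mirror the domain state constants that the libvirt module exposes (for example, VIR_DOMAIN_RUNNING is 1 and VIR_DOMAIN_PAUSED is 3); they are repeated here as plain numbers so that the example runs without libvirt installed. The names STATE_NAMES and decode_state are hypothetical.

```python
# Values of the domain state constants exposed by the libvirt module,
# for example libvirt.VIR_DOMAIN_RUNNING == 1. They are repeated here
# as plain integers so the sketch runs without libvirt installed.
STATE_NAMES = {
    0: 'no state',
    1: 'running',
    2: 'blocked',
    3: 'paused',
    4: 'shutting down',
    5: 'shut off',
    6: 'crashed',
}


def decode_state(state):
    """Translate the first element of dom.info() into a readable name."""
    return STATE_NAMES.get(state, 'unknown (%d)' % state)


if __name__ == '__main__':
    # The two state values that appear in the script output shown earlier
    for state in (3, 1):
        print('State %d = %s' % (state, decode_state(state)))
```

With libvirt imported, you could build the same table from the named constants instead of literal numbers.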
Utility scripts for automation
You could write any number of utility scripts to manage a KVM installation
using libvirt and Python. It might not be
efficient for a small number of domains but can quickly save time when the
count gets into double digits. One simple task would be to make a mass
change of static IP addresses for all domain images. You can do this by
iterating over all the .conf files, and then making the appropriate
changes. Python has many built-in features to help with this task.
Listing 6 shows an example of an XML network definition.
Listing 6. Network configuration XML file
<network>
  <name>testnetwork</name>
  <bridge name="virbr1" />
  <forward/>
  <ip address="192.168.100.1" netmask="255.255.255.0">
    <dhcp>
      <range start="192.168.100.2" end="192.168.100.254" />
      <host mac='de:af:de:af:00:02' name='vm-1' ip='192.168.100.2' />
      <host mac='de:af:de:af:00:03' name='vm-2' ip='192.168.100.3' />
      <host mac='de:af:de:af:00:04' name='vm-3' ip='192.168.100.4' />
      <host mac='de:af:de:af:00:05' name='vm-4' ip='192.168.100.5' />
      <host mac='de:af:de:af:00:06' name='vm-5' ip='192.168.100.6' />
      <host mac='de:af:de:af:00:07' name='vm-6' ip='192.168.100.7' />
      <host mac='de:af:de:af:00:08' name='vm-7' ip='192.168.100.8' />
      <host mac='de:af:de:af:00:09' name='vm-8' ip='192.168.100.9' />
      <host mac='de:af:de:af:00:10' name='vm-9' ip='192.168.100.10' />
    </dhcp>
  </ip>
</network>
If you wanted to change the main subnet from 192.168.100 to 192.168.200, you could open the configuration file in an editor and do a global search and replace. The trick is when you want to do something a bit more complex, like add 10 to all the IP and MAC addresses starting with 2. Listing 7 shows how you might do that with a little over 20 lines of Python code.
Listing 7. Python script to change MAC and IP addresses
#!/usr/bin/env python
from xml.dom.minidom import parseString
import sys


def main():
    target = sys.argv[1]         # path to the network XML file
    number = int(sys.argv[2])    # amount to add to each address

    with open(target, 'r') as f:
        doc = parseString(f.read())

    for host in doc.getElementsByTagName('host'):
        # Add the offset to the last octet of the IP address
        ip = host.getAttribute('ip')
        parts = ip.split('.')
        parts[-1] = str(int(parts[-1]) + number)
        host.setAttribute('ip', '.'.join(parts))

        # MAC octets are hexadecimal, so parse and format them as two hex digits
        mac = host.getAttribute('mac')
        parts = mac.split(':')
        parts[-1] = '%02x' % (int(parts[-1], 16) + number)
        host.setAttribute('mac', ':'.join(parts))

    with open(target, 'w') as f:
        f.write(doc.toxml())


if __name__ == '__main__':
    main()
This script demonstrates the power of Python when you take advantage of the
Python Standard Library. Here, you use
parseString from
xml.dom.minidom to do the heavy lifting of
parsing the XML file. After you have a specific XML attribute, you simply
break it into individual pieces using the Python
string split() method. Then, just do the math
and put the strings back together. You can expand this approach to make
bulk changes to any XML file, including the .conf files for
libvirt.
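The simpler subnet change mentioned earlier (192.168.100 to 192.168.200) can be handled with the same parsing approach rather than a text editor. The sketch below is one possible way to do it; the function name change_subnet is hypothetical, and the sample XML is a shortened version of Listing 6. It works on strings rather than files so that it is easy to try in the IDLE console.

```python
from xml.dom.minidom import parseString

SAMPLE = ('<network>'
          '<ip address="192.168.100.1" netmask="255.255.255.0"><dhcp>'
          '<range start="192.168.100.2" end="192.168.100.254"/>'
          '<host mac="de:af:de:af:00:02" name="vm-1" ip="192.168.100.2"/>'
          '</dhcp></ip></network>')


def change_subnet(xml_text, old_prefix, new_prefix):
    """Replace old_prefix with new_prefix in every IPv4 address attribute."""
    doc = parseString(xml_text)
    # (tag name, attributes on that tag that hold IPv4 addresses)
    targets = [('ip', ['address']),
               ('range', ['start', 'end']),
               ('host', ['ip'])]
    for tag, attrs in targets:
        for node in doc.getElementsByTagName(tag):
            for attr in attrs:
                value = node.getAttribute(attr)
                if value.startswith(old_prefix + '.'):
                    node.setAttribute(attr, new_prefix + value[len(old_prefix):])
    return doc.toxml()


if __name__ == '__main__':
    print(change_subnet(SAMPLE, '192.168.100', '192.168.200'))
```

Limiting the change to named attributes leaves values such as the netmask and the MAC addresses untouched, which a blind search and replace cannot guarantee.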
Another helpful script would take a snapshot of all running domains. This
script would need first to get a list of all running domains, then
individually pause and create a snapshot of each one. Although this
operation might not be practical for a production environment, you could
set it to run as a cron job in the middle of
the night. This script would be straightforward to implement with the
commands highlighted so far, along with a call to
snapshotCreateXML().
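A minimal sketch of such a script follows, under a few assumptions. The function name snapshot_running_domains and the snapshot description are hypothetical, the conn argument is expected to behave like the object returned by libvirt.open(), and the snapshot XML is deliberately minimal. Whether a snapshot succeeds also depends on the guest's storage; QEMU generally needs disk images in a format that supports snapshots, such as qcow2.

```python
VIR_DOMAIN_RUNNING = 1  # same value as libvirt.VIR_DOMAIN_RUNNING

# Deliberately minimal snapshot definition; the description is a placeholder
SNAPSHOT_XML = ('<domainsnapshot>'
                '<description>Nightly snapshot</description>'
                '</domainsnapshot>')


def snapshot_running_domains(conn, xml=SNAPSHOT_XML):
    """Pause, snapshot, and resume every running domain on a connection.

    Returns the names of the domains that were snapshotted.
    """
    done = []
    for dom_id in conn.listDomainsID():
        dom = conn.lookupByID(dom_id)
        if dom.info()[0] != VIR_DOMAIN_RUNNING:
            continue  # leave paused or otherwise inactive domains alone
        dom.suspend()
        try:
            dom.snapshotCreateXML(xml, 0)
            done.append(dom.name())
        finally:
            # Always resume the guest, even if the snapshot fails
            dom.resume()
    return done
```

On a KVM host you would call it as snapshot_running_domains(libvirt.open("qemu:///system")) after importing libvirt. Skipping domains that are not in the running state avoids resuming a guest that someone paused on purpose.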
This article has just scratched the surface of the capabilities contained
in libvirt. Check the Resources section for links to more in-depth reading on
libvirt and virtualization in general.
Understanding the basics of KVM goes a long way when you start trying to
implement code to monitor and manage your environment. The next
installment in this series will take the foundation established here and
build a few real-world virtual management tools.
Learn
- libvirt website: Check out the entire site for more information.
- Reference Manual for libvirt: Access the complete libvirt API reference manual.
- Python.org: Find the Python resources you need from the official website.
- IETF RFC 2396: Read the complete document that outlines the generic syntax for URIs.
- developerWorks
Open source zone: Find extensive how-to information, tools, and
project updates to help you develop with open source technologies and use
them with IBM products.
- Events of interest: Check out upcoming conferences, trade shows,
and webcasts that are of interest to IBM open source
developers.
- developerWorks
podcasts: Tune into interesting interviews and discussions for
software developers.
- developerWorks On demand demos: Watch our no-cost demos and learn
about IBM and open source technologies and product functions.
- developerWorks on
Twitter: Follow us for the latest news.
Get products and technologies
- Evaluate IBM
software products: From trial downloads to cloud-hosted products,
you can innovate your next open source development project using software
especially for developers.
Discuss
- developerWorks
community: Connect with other developerWorks users while exploring
the developer-driven blogs, forums, groups, and wikis. Help build the Real world open source group in the developerWorks
community.
Paul Ferrill has been writing in the computer trade press for more than 20 years. He got his start writing networking reviews for PC Magazine on products like LANtastic and early versions of Novell Netware. Paul holds both BSEE and MSEE degrees and has written software for more computer platforms and architectures than he can remember.



