OpenStack supports many of the features found in libvirt. Backing your KVM guests with hugepages is not currently one of them.
While this post is aimed at readers who already want to back their guests with hugepages, and are therefore familiar with them, I'm going to take a moment to explain what hugepages are. Hugepages are, as the name suggests, pages that are larger than the standard size. For example, a machine may have a standard page size of 4 kB and a hugepage size of 2048 kB. Here are a few interesting properties of hugepages:
They are large areas of contiguous memory. This makes them harder to allocate, so you may need to reboot your system to get your desired number of pages.
They are not swappable and therefore will not be paged out.
The hugepage size is a multiple of the standard page size.
This functionality is planned for OpenStack's Juno release, with the implementation detailed here by Daniel Berrangé. However, if you would like to try backing your guests with hugepages now, you can do so with the help of virsh and virt-install. The process detailed here is not a workaround for getting your OpenStack-created guest machines backed by hugepages. It does, however, provide a method of creating a virtual machine that you could use to test a workload that may benefit from them.
At IBM we are preparing to test new features that will be in the Juno release, to make sure that they work on Power hardware. In an effort to understand hugepages more thoroughly and make sure that libvirt's hugepages functionality works on PowerKVM, I was tasked with backing a guest VM with hugepages. I have tried the following procedure on three combinations of architecture and Linux distribution:
x86_64 running RHEL (Successfully backed an Ubuntu 14 guest)
ppc64 running PowerKVM 2.1.1 (Successfully backed a Fedora-20 guest)
ppc64 running Fedora-20 (Ran into a qemu bug due to an old version being in the repository we were using)
In a nutshell, to back a VM with hugepages you first dump the guest machine's XML to a file using virsh's dumpxml command, modify that file to indicate that you want the guest to be backed by hugepages, then recreate the guest using virsh. Though the process seems to work independently of the distributions and architectures we used, I of course cannot guarantee that you will see the same results.
Before we begin, it's worth noting that you can monitor the status of your hugepage allocations by running the following command in another terminal:
This will display information about the hugepage allocation in each NUMA node on the system. As you progress through the steps below, the numbers here will change, verifying that the changes you are making on the system are taking effect.
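One way to get this kind of per-node view is sketched below. It assumes the kernel exposes per-node counters in /sys/devices/system/node/node*/meminfo, which is the usual location on Linux. The function name numa_hugepages and its optional directory argument are made up for this example.

```shell
# Print the hugepage counters for every NUMA node.
# The optional argument is only there so the function can be pointed at a
# different directory; by default it reads the usual sysfs location.
numa_hugepages() {
    node_dir="${1:-/sys/devices/system/node}"
    for f in "$node_dir"/node*/meminfo; do
        if [ -r "$f" ]; then
            # Keep only the HugePages_* lines; ignore nodes with no match.
            grep -i huge "$f" || true
        fi
    done
}

numa_hugepages
```

To keep it refreshing in another terminal, something like watch -n 1 'cat /sys/devices/system/node/node*/meminfo | grep -i huge' should do the job.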
With all of that out of the way, let's take a look at how to do it!
1. First, determine how large a hugepage is on your system. To do this, run the following command:
cat /proc/meminfo | grep -i huge
You should see a few lines of output. Among them are HugePages_Total, which is the total number of hugepages allocated on the system; HugePages_Free, which is the number of allocated pages that are not in use; and HugePages_Rsvd, which is the number of pages that have been promised to a process but not yet handed out (different from HugePages_Free). The most important one for this step is 'Hugepagesize', which is the size of a hugepage on this system in kB. For example, on my machine running PowerKVM, I saw:
Hugepagesize: 16384 kB
Jot down the number you see, as you will need it for later on.
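If you would rather capture the number in a script than jot it down, here is a minimal sketch. The function name hugepage_size_kb is my own, and the optional file argument exists only so it can be tried against a saved copy of /proc/meminfo.

```shell
# Print the hugepage size in kB, as reported on the Hugepagesize line.
# Reads /proc/meminfo unless another file is given as the first argument.
hugepage_size_kb() {
    awk '/^Hugepagesize:/ {print $2}' "${1:-/proc/meminfo}"
}

# Example use on the host:
#   HUGEPAGE_KB=$(hugepage_size_kb)
```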
2. Next, check whether hugetlbfs is mounted by issuing the following command:
mount | grep huge
If it is mounted, you should see output like this:
hugetlbfs on /dev/hugepages type hugetlbfs (rw)
If there was no output, you can mount hugetlbfs on the host using this command:
mount -t hugetlbfs hugetlbfs /dev/hugepages
Please note that you may need to create the '/dev/hugepages' directory yourself.
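The check and the mount can be combined so that the mount only happens when it is needed. The sketch below assumes /proc/mounts is available, where the third field of each line is the filesystem type. The function name hugetlbfs_mounted is just for illustration.

```shell
# Succeed (exit status 0) if any mounted filesystem has type hugetlbfs.
# Reads /proc/mounts unless another file is given as the first argument.
hugetlbfs_mounted() {
    awk '$3 == "hugetlbfs" {found = 1} END {exit !found}' "${1:-/proc/mounts}"
}

# Example use on the host, as root:
#   hugetlbfs_mounted || {
#       mkdir -p /dev/hugepages
#       mount -t hugetlbfs hugetlbfs /dev/hugepages
#   }
```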
3. Next, you need to reserve enough hugepages to back your guest. To do this, pick a memory size large enough to back your guest, but small enough to avoid overcommitting the host. Convert this number to kB and then divide it by the Hugepagesize you noted earlier. Here, I am reserving 4 GB of hugepages with the intent of creating a guest with 1 GB of memory. You probably don't need that much headroom, but if you think you might create more hugepage-backed guests later on, or might use hugepages for other purposes, it might be a good idea to reserve more pages.
echo 256 > /proc/sys/vm/nr_hugepages
256 is the number of hugepages I wanted to reserve. In this case, the hugepage size was 16 MB (16384 kB). 256 * 16 MB = 4 GB, and since 4 GB is larger than the 1 GB I want to use, this amount is sufficient. If you run cat /proc/meminfo again, you should see this number as the value of HugePages_Total and HugePages_Free. If HugePages_Total ends up lower than what you asked for, the kernel could not find enough contiguous memory, which is when a reboot may help. If you have the watch command mentioned at the beginning running in a separate terminal, you will see the changes there as well.
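The arithmetic is easy to get wrong by a factor of 1024, so here is a small sketch of it. The helper name pages_needed is my own; it takes the memory you want to reserve and the hugepage size, both in kB, and rounds up so that a partial page still counts as a whole one.

```shell
# Number of hugepages needed to cover mem_kb, rounding up.
# $1 = memory to reserve in kB, $2 = Hugepagesize in kB
pages_needed() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

# 4 GB expressed in kB, with the 16384 kB hugepages from my PowerKVM machine
pages_needed $((4 * 1024 * 1024)) 16384
```

With those inputs it prints 256, the value written to /proc/sys/vm/nr_hugepages above.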
4. If you already have a guest virtual machine running, you don't need to perform this step and can move on to step 5. However, keep in mind that I have only tried this method on machines created with virt-install. While I don't foresee any issues with other VM creation methods, I cannot guarantee that they will work.
If you need to create a guest machine, I recommend using virt-install. If your machine does not have virt-install, you can get it via a package manager such as yum or apt-get. This step also requires virsh.
4.1. First, you need to create the default storage pool. This may already exist, which you can check by running 'virsh pool-list'. If it does exist, this step is not needed.
virsh pool-create-as default dir --target=/var/lib/libvirt/images
4.2. Next, create the volume where the guest will reside. You can of course customize your image's file name and capacity. Once again, make sure that you are reserving enough disk space to back your guest while avoiding overcommitment.
4.3. Download the .iso of the distribution you want your guest to use. For simplicity, select an image that uses the same architecture as your machine. For example, if you are using an x86_64 processor, select an image that is made for x86_64. Place this file somewhere accessible to virsh and note the path as you will need it for the next step.
4.4. Next, create the guest using virt-install. The -r argument indicates, in MB, how much RAM you want your guest to have, so make sure it fits within the amount of memory you reserved in step 3. The --accelerate argument tells virt-install to use the kernel's acceleration capabilities. The -n argument indicates the name of the guest in virsh. The -f argument indicates the path to the volume image you created in step 4.2, and the --cdrom argument indicates the path to the iso you downloaded in step 4.3.
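As a rough sketch of steps 4.2 and 4.4 together, the helper below prints the two commands instead of running them unless DRY_RUN is set to 0. The guest name, the 8G capacity, the .img file name, the iso path and the helper names run and create_guest are all placeholders of mine, not values from my setup. It assumes the default pool lives in /var/lib/libvirt/images as in step 4.1. Newer virt-install releases prefer --memory and --disk over -r and -f, so check the man page for your version.

```shell
# Print commands by default; set DRY_RUN=0 to actually execute them.
DRY_RUN="${DRY_RUN:-1}"
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# $1 = guest name, $2 = disk capacity, $3 = RAM in MB, $4 = path to the iso
create_guest() {
    # Step 4.2: create the volume in the default pool
    run virsh vol-create-as default "$1.img" "$2"
    # Step 4.4: install the guest from the iso onto that volume
    run virt-install --accelerate -n "$1" -r "$3" \
        -f "/var/lib/libvirt/images/$1.img" --cdrom "$4"
}

create_guest sample_guest 8G 1024 /path/to/distro.iso
```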
There are many other arguments you can pass to virt-install to customize your guest. For more information, see the man page of virt-install.
4.5. Follow the steps of the installer. This varies between distributions, but once the install completes you need to exit the guest machine's console using 'Ctrl+]'. This should bring you back to the host machine's shell.
5. Now you need to modify the guest to be backed by hugepages. To do this, you first need to get the XML file that virsh uses to create this guest. Then you need to modify this XML file to indicate that the guest should be backed by hugepages.
5.1. On the host machine, dump the guest XML to a file
virsh dumpxml sample_guest > sample_guest.xml
Of course, you can use any file name you'd like and you need to replace 'sample_guest' with the name you chose for your guest.
5.2. Using the text editor of your choice, add the hugepages option to the file you created in step 5.1
Be sure to place these lines at the <domain></domain> level of the XML and not at the <domain><devices></devices></domain> level.
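The element libvirt uses for this is <memoryBacking> with a <hugepages/> child. If you would rather not edit the file by hand, the sketch below inserts it just before the closing </domain> tag, which keeps it at the domain level. It assumes </domain> sits on its own line, as it does in virsh dumpxml output. The function name add_hugepages is made up for this example.

```shell
# Print the guest XML from file $1 with a hugepages memoryBacking element
# inserted immediately before the closing </domain> tag.
add_hugepages() {
    awk '
        /<\/domain>/ && !inserted {
            print "  <memoryBacking>"
            print "    <hugepages/>"
            print "  </memoryBacking>"
            inserted = 1
        }
        { print }
    ' "$1"
}

# Example use (write to a new file, then inspect it before using it):
#   add_hugepages sample_guest.xml > sample_guest_hugepages.xml
```

Writing to a second file leaves the original dump untouched in case you want to go back to it.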
6. Destroy the guest using virsh. Replace sample_guest with the name you chose for your guest. Don't worry, this only stops the guest; the image file will retain the machine's data.
virsh destroy sample_guest
7. Finally, recreate the guest using virsh and the updated guest XML.
virsh create sample_guest.xml
Again, replace sample_guest.xml with the filename of the XML you created. To reenter your guest, use the virsh console command. If you'd like to verify that it worked, run cat /proc/meminfo again and you should see a change in the value of HugePages_Free. If you ran the watch command, you should see the changes there as well.
So it's as simple as that! You should now have a guest backed by hugepages. Keep a lookout for future posts on the subject. I'm planning to write a post discussing the benefits of using hugepages and, if I can get it working, another post on using this method with guests created in OpenStack. Such posts will be tagged with 'hugepages' and, when applicable, 'openstack'. I hope this was helpful as well as interesting!
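To put a number on that change, you can subtract HugePages_Free from HugePages_Total. This is a small sketch; the function name hugepages_in_use is my own. Expect the count to grow as the guest touches its memory rather than jumping to the full guest size right away.

```shell
# Print how many hugepages are currently in use (Total minus Free).
# Reads /proc/meminfo unless another file is given as the first argument.
hugepages_in_use() {
    awk '
        /^HugePages_Total:/ {total = $2}
        /^HugePages_Free:/  {free = $2}
        END {print total - free}
    ' "${1:-/proc/meminfo}"
}

# Example use on the host:
#   hugepages_in_use
```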