Virtual machines (VMs) can help make life easier. Unfortunately, they can also be a pain point when it comes to security. To better protect your data, it is helpful to understand how your data can leak from a virtual machine.
What you need to know about a virtual machine is that, from the host's point of view, it is just a normal process running on the hypervisor. Every process has its own memory space, and for a VM process that memory space holds the guest's memory. The hypervisor can access every memory page directly; therefore, it can be used to inspect all the confidential information held in memory on your VM.
Suppose you use secure shell (SSH) to access your VM, so all the traffic is encrypted, and you also encrypt the virtual disk with strong encryption such as PGP. You think everything is protected and no one can steal your data. However, if your VMs are running on a compromised hypervisor, your data is not safe at all. The key point is that encrypted data must eventually be decrypted and held in memory in plain text before it can be used. Otherwise, how could you read it in your editor? Consequently, everything shown in your editor and any string you type into the browser can be read directly from memory. Your data is naked.
Every process running on the Linux kernel has two files that expose its memory:
- /proc/<the pid of the process>/maps
- /proc/<the pid of the process>/mem
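The first file lists the virtual address ranges that are mapped in the process's address space, together with their permissions. It does not show how those ranges map to physical memory (that information lives in /proc/<the pid of the process>/pagemap), but it tells you exactly which address ranges can be read through the second file.
Here is a snippet from /proc/<the pid of the process>/maps: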
7f56bc021000-7f56c0000000 ---p 00000000 00:00 0
7f56c0000000-7f56c0021000 rw-p 00000000 00:00 0
7f56c0021000-7f56c4000000 ---p 00000000 00:00 0
7f56c4000000-7f56c4021000 rw-p 00000000 00:00 0
7f56c4021000-7f56c8000000 ---p 00000000 00:00 0
7f56c8000000-7f56c8021000 rw-p 00000000 00:00 0
7f56c8021000-7f56cc000000 ---p 00000000 00:00 0
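To make the format concrete, here is a minimal Python sketch that parses lines like the ones above into address ranges and keeps only the readable ones. The names `Region`, `parse_maps`, and `readable` are made up for this illustration, and the sketch only looks at the first two fields of each line (the address range and the permission string).

```python
from typing import List, NamedTuple


class Region(NamedTuple):
    start: int   # first virtual address of the range (inclusive)
    end: int     # end of the range (exclusive)
    perms: str   # permission string, e.g. "rw-p" or "---p"


def parse_maps(text: str) -> List[Region]:
    """Parse the text of a maps file into a list of regions."""
    regions = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        start_hex, _, end_hex = fields[0].partition("-")
        regions.append(Region(int(start_hex, 16), int(end_hex, 16), fields[1]))
    return regions


def readable(regions: List[Region]) -> List[Region]:
    """Keep only regions whose permission string starts with 'r'."""
    return [r for r in regions if r.perms.startswith("r")]


# Two lines in the same format as the snippet above.
SAMPLE = """\
7f56bc021000-7f56c0000000 ---p 00000000 00:00 0
7f56c0000000-7f56c0021000 rw-p 00000000 00:00 0
"""

if __name__ == "__main__":
    for region in readable(parse_maps(SAMPLE)):
        print(hex(region.start), hex(region.end), region.perms)
```

On a real host, the text would come from reading the maps file of the VM process, which normally requires root.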
Once you have the address ranges, you can use them to walk through the second file. The second file exposes the contents of the process's memory: seeking to an address inside a mapped, readable range and reading from there returns the bytes stored at that address. From that point it is easy to search the memory for something like a credit card number by pattern matching.
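The pattern-matching step could look roughly like the Python sketch below. It assumes you already have a list of `(start, end)` address pairs and a binary file object opened on the mem file. The names `scan_regions`, `luhn_ok`, and `CARD_RE` are made up for this illustration. The Luhn checksum filter is an addition of this sketch to cut down on false positives, not something the technique requires, and the pattern only covers 16-digit numbers written without separators.

```python
import io
import re
from typing import BinaryIO, Iterable, Iterator, Tuple

# Exactly 16 consecutive ASCII digits, not part of a longer digit run.
CARD_RE = re.compile(rb"(?<!\d)\d{16}(?!\d)")


def luhn_ok(digits: bytes) -> bool:
    """Return True if the ASCII digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = ch - 48  # ASCII code of '0' is 48
        if i % 2 == 1:  # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def scan_regions(
    mem: BinaryIO, regions: Iterable[Tuple[int, int]]
) -> Iterator[Tuple[int, bytes]]:
    """Yield (address, match) for card-like numbers found in each region."""
    for start, end in regions:
        try:
            mem.seek(start)
            data = mem.read(end - start)
        except (OSError, ValueError, OverflowError):
            continue  # region could not be read; move on to the next one
        for m in CARD_RE.finditer(data):
            if luhn_ok(m.group()):
                yield start + m.start(), m.group()


if __name__ == "__main__":
    # Stand-in for process memory so the sketch runs without root.
    fake = io.BytesIO(b"\x00" * 100 + b"card=4111111111111111;" + b"\x00" * 50)
    for addr, number in scan_regions(fake, [(0, 172)]):
        print(hex(addr), number.decode())
```

Against a real process, `mem` would be the mem file opened in binary mode as root. The sketch reads each region in one call, so for large regions reading in fixed-size chunks would be kinder to the scanner's own memory.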
The precondition for using this technique is root privilege on the hypervisor. The new question becomes: how easy is it to get root privilege? Think about the public cloud. Public cloud providers have root privilege on their own hypervisors, so your data is transparent to them. You are not the only person who can access the data stored in your VM!