Make the most of large drives with GPT and Linux

Preparing for future disk storage with the GUID Partition Table

Once a remote prospect, an important barrier in disk storage has become a reality: the venerable master boot record (MBR) partitioning scheme can't fully handle disks larger than 2.2TB (2TiB). With disks as large as 3TB readily available and with much larger RAID arrays common, alternatives to the MBR partitioning scheme have become important to understand. The heir apparent is the GUID Partition Table (GPT). Learn how to make sure your Linux system is fully prepared for the future of disk storage.

03 July 2012 - Author refreshed the entire article with current information.

Roderick W. Smith, Consultant and author

Roderick W. Smith is a consultant and author of more than 20 books on UNIX and Linux, including The Definitive Guide to Samba 3, Linux in a Windows World, and Linux Professional Institute Certification Study Guide. He is also the author of the GPT fdisk partitioning software and forked the abandoned rEFIt boot manager to create rEFInd. He currently resides in Woonsocket, Rhode Island.



03 July 2012 (First published 28 July 2009)

Before embarking on a quest to replace your hard disk's partitioning scheme, it's helpful to review the limitations that are forcing this change. Understanding these limits—and the proposed tools for overcoming them—will enable you to judge how quickly you should jump ship from the master boot record (MBR) to the GUID Partition Table (GPT), particularly if you're considering adopting GPT before new disk purchases force your hand. GPT offers advantages over MBR even on smaller disks, but you must balance those advantages against the difficulties of a switch.

Understanding the MBR's limits

Understanding disk measurements

Disk sizes have traditionally been measured in units of kilobytes (KB), megabytes (MB), gigabytes (GB), and terabytes (TB). Unfortunately, the meanings of these units have often been unclear. Kilo, mega, and other names are taken from the International System (SI) of Units, which describes units used by the metric system. As such, SI units use decimal (base-10) multipliers: kilo means 1,000 of the base units, mega refers to 1,000,000 of the base units, and so on. In the computer field, however, binary (base-2) units are often more convenient. Thus, the SI prefixes have often (but not always) been applied to binary units that are close to some decimal units: kilo to 1,024, mega to 1,048,576, and so on.

Unfortunately, this inconsistent application of SI units to binary measures can lead to confusion. Hard disk manufacturers and some disk utilities have traditionally used SI units in their base-10 way, whereas other utilities have used base-2 units. This choice results in discrepancies in reported sizes. In the days of floppy disks, this difference was minor; but today, it's much more substantial. At the terabyte level, it's about 10 percent. Thus, the Institute of Electrical and Electronics Engineers (IEEE) created a standard for binary units, known as IEEE1541-2002 (IEEE1541 for short). This system uses new names and suffix abbreviations for binary units: kibibytes (KiB, 2^10 bytes), mebibytes (MiB, 2^20 bytes), gibibytes (GiB, 2^30 bytes), tebibytes (TiB, 2^40 bytes), and so on.

In this article, I use both IEEE1541 and SI units, depending on which is most appropriate for the context. Most (but not all) data structure limits are best described in binary—hence, in IEEE1541 units—but disk sizes and a few data structure limits are closer to decimal limits—hence, SI units.
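
To make the size of the discrepancy concrete, the following Python sketch compares each binary unit with its SI counterpart. It is only an illustration of the arithmetic described above; the function and variable names are invented for this example.

```python
# Compare decimal (SI) and binary (IEEE1541) units at each prefix level.
PREFIXES = ["kilo/kibi", "mega/mebi", "giga/gibi", "tera/tebi"]


def unit_gap_percent(power: int) -> float:
    """Return how much larger the binary unit is than the SI unit, in percent.

    power=1 compares 1024 with 1000, power=2 compares 1024**2 with 1000**2,
    and so on.
    """
    return (1024 ** power / 1000 ** power - 1) * 100


if __name__ == "__main__":
    for power, name in enumerate(PREFIXES, start=1):
        print(f"{name:10s} {unit_gap_percent(power):5.2f}%")
```

The gap grows from roughly 2.4 percent at the kilo level to just under 10 percent at the tera level, which is why the difference that was negligible on floppy disks matters on modern drives.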

The MBR partitioning system is a hodge-podge of data structure patches applied to overcome earlier limits. The MBR itself resides entirely on the first sector (512 bytes) of a hard disk. The first 440 bytes of the MBR are devoted to code: the boot loader. The basic input/output system (BIOS) reads this code and executes it when the computer boots.

Following the code area, the MBR stores data about four partitions, known as primary partitions. Each partition is described in two ways: using cylinder/head/sector (CHS) notation and using logical block addressing (LBA) notation. The CHS notation is almost a historical footnote today, because it's a 24-bit number. This means that it's limited to describing areas of about 8GB in size. The 32-bit LBA values permit 2TiB sizes, assuming a sector size of 512 bytes. This 2TiB ceiling is not easily overcome; there simply aren't any unallocated fields left in the MBR that could be used to add more bits to the LBA addresses.
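
The two ceilings mentioned above follow directly from the width of the address fields. The Python sketch below works through the arithmetic; it treats the CHS value as a flat 24-bit number, as the text does, and assumes 512-byte sectors. The names are invented for this example.

```python
SECTOR_SIZE = 512  # bytes; the traditional logical sector size assumed in the text


def max_addressable_bytes(address_bits: int, sector_size: int = SECTOR_SIZE) -> int:
    """Largest area describable with an address field of the given width."""
    return (2 ** address_bits) * sector_size


chs_limit = max_addressable_bytes(24)  # CHS: 24-bit value
lba_limit = max_addressable_bytes(32)  # MBR LBA: 32-bit value

if __name__ == "__main__":
    print(f"CHS limit: {chs_limit} bytes = {chs_limit / 2**30:.0f} GiB")
    print(f"LBA limit: {lba_limit} bytes = {lba_limit / 2**40:.0f} TiB "
          f"(about {lba_limit / 10**12:.1f} TB)")
```

The 32-bit result, 2TiB, is the same figure as the 2.2TB quoted in SI units. Real CHS geometry restrictions make the practical CHS limit slightly lower than the flat 24-bit figure.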

In addition to the looming 2TiB problem, the MBR presents other difficulties. Chief among these is the limitation of four primary partitions. To work around this limitation, it's possible to set aside one primary partition as a placeholder (known as an extended partition) to hold an arbitrary number of additional partitions, known as logical partitions. This is, however, an ugly workaround that creates its own problems, such as difficulties installing multiple operating systems when too many of them want too many primary partitions to themselves.

The MBR has data-integrity problems, as well. It is a single data structure that's vulnerable to damage by carelessness or hardware failure. In addition, because logical partitions are defined in a linked-list structure, damage to one of them can block access to the remaining logical partitions. None of these data structures includes any form of error-detection capability, so damage can be difficult to spot.


GPT's solution

Intel® created the GPT definition as part of its Extensible Firmware Interface (EFI) specification for a BIOS replacement (see Resources for links to more information). Despite the fact that GPT is part of a standard that's meant to replace the legacy BIOS, it's possible to use GPT even on BIOS-based systems. If your computer uses EFI, this fact is another plus to GPT adoption. Whether your computer uses a legacy BIOS or an EFI, GPT fixes many of the MBR's limitations:

  • GPT uses LBA exclusively, so CHS headaches are gone.
  • Disk pointers are 64 bits in size, meaning that GPT can handle disks of up to 512 x 2^64 bytes (8 zebibytes, or 8.6 billion TiB), assuming 512-byte sectors.
  • GPT data structures are stored twice on the disk: once at the start and again at the end. This duplication improves the odds of successful recovery in case of damage from an accident or a bad sector.
  • Cyclic redundancy check values are computed for critical data structures, improving the odds of detection of data corruption.
  • GPT stores all partitions in a single partition table (with backup), so there's no need for extended or logical partitions. By default, 128 partitions are supported, although you can change the partition table size if the partitioning software supports such changes.
  • Whereas MBR provides a 1-byte partition type code, GPT uses a 16-byte globally unique identifier (GUID) value to identify partition types. This makes partition-type collisions less likely.
  • GPT enables storing a human-readable partition name. You can use this field to name your Linux® /home, /usr, /var, and other partitions for easier identification within partitioning software.
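
The error-detection point in the list above can be illustrated with a short Python sketch. It builds a synthetic 92-byte GPT header in memory, stores a CRC32 over it, and then shows that a single flipped bit is detected. The field layout follows the EFI specification as I understand it; the disk size and GUID are made-up sample values, and the helper names are invented for this example. Nothing here reads or writes a real disk.

```python
import struct
import uuid
import zlib

# Little-endian layout of the GPT header: signature, revision, header size,
# header CRC32, reserved, current LBA, backup LBA, first usable LBA,
# last usable LBA, disk GUID, entry-array LBA, entry count, entry size,
# entry-array CRC32.
HEADER_FORMAT = "<8sIIIIQQQQ16sQIII"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 92 bytes

GPT_MAX_BYTES = 512 * 2**64  # 64-bit sector pointers, 512-byte sectors


def build_header(total_sectors: int, disk_guid: uuid.UUID, entries_crc: int = 0) -> bytes:
    """Pack a sample primary header and fill in its CRC32."""
    fields = [
        b"EFI PART",         # signature
        0x00010000,          # revision 1.0
        HEADER_SIZE,
        0,                   # CRC32 placeholder; must be zero while computing
        0,                   # reserved
        1,                   # this header lives in sector 1
        total_sectors - 1,   # backup header in the last sector
        34,                  # first usable sector (after 128 x 128-byte entries)
        total_sectors - 34,  # last usable sector (before the backup entries)
        disk_guid.bytes_le,  # GUIDs are stored in mixed-endian form
        2,                   # partition entries start in sector 2
        128,                 # default number of entries
        128,                 # bytes per entry
        entries_crc,
    ]
    fields[3] = zlib.crc32(struct.pack(HEADER_FORMAT, *fields)) & 0xFFFFFFFF
    return struct.pack(HEADER_FORMAT, *fields)


def header_is_intact(header: bytes) -> bool:
    """Recompute the CRC32 with the CRC field zeroed and compare."""
    stored_crc = struct.unpack_from("<I", header, 16)[0]
    zeroed = header[:16] + b"\x00\x00\x00\x00" + header[20:]
    return header[:8] == b"EFI PART" and (zlib.crc32(zeroed) & 0xFFFFFFFF) == stored_crc


if __name__ == "__main__":
    sample_guid = uuid.UUID("12345678-1234-5678-1234-567812345678")  # hypothetical
    good = build_header(1_000_000, sample_guid)
    damaged = good[:40] + bytes([good[40] ^ 0x01]) + good[41:]  # flip one bit
    print("intact header passes: ", header_is_intact(good))
    print("damaged header passes:", header_is_intact(damaged))
    print("GPT size limit in TiB:", GPT_MAX_BYTES // 2**40)
```

Because a second copy of these structures sits at the end of the disk, a tool that detects a bad checksum in one copy can fall back on the other.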

The first sector of the disk is reserved for a protective MBR, which is a legal MBR data structure that defines a single partition of type 0xEE (EFI GPT). On sub-2TiB disks, this partition should span the entire disk; on larger disks, it should be 2TiB in size. The idea is to protect the GPT disk from damage by GPT-unaware disk utilities. If such tools look at the disk, they'll see an MBR disk with no free space. (Some disk utilities can create a hybrid MBR, which defines up to three MBR partitions in addition to the EFI GPT partition. The idea is to enable a GPT-unaware operating system, such as most pre-Windows Vista® versions of Windows®, to coexist on a disk along with GPT partitions. This configuration is decidedly non-standard and kludgy, though.)
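
As a rough illustration of what a GPT-unaware tool sees, the Python sketch below builds a protective MBR in memory and then parses it the way an MBR-only utility would. The offsets are the standard MBR layout (partition table at byte 446, signature at byte 510); capping the sector count at 0xFFFFFFFF on large disks is my reading of the EFI specification, and it corresponds to the "2TiB in size" case described above. The function names and CHS filler bytes are invented for this example.

```python
import struct

SECTOR_SIZE = 512
PARTITION_TABLE_OFFSET = 446  # 440 bytes of code, 4-byte disk signature, 2 reserved bytes
ENTRY_FORMAT = "<B3sB3sII"    # status, CHS start, type, CHS end, first LBA, sector count
ENTRY_SIZE = struct.calcsize(ENTRY_FORMAT)  # 16 bytes
BOOT_SIGNATURE = b"\x55\xaa"
TYPE_EFI_GPT = 0xEE


def build_protective_mbr(total_sectors: int) -> bytes:
    """Return a 512-byte sector holding one 0xEE partition that covers the disk."""
    # The 32-bit size field cannot describe more than 0xFFFFFFFF sectors.
    size = min(total_sectors - 1, 0xFFFFFFFF)
    entry = struct.pack(ENTRY_FORMAT, 0x00, b"\x00\x02\x00",
                        TYPE_EFI_GPT, b"\xff\xff\xff", 1, size)
    sector = bytearray(SECTOR_SIZE)
    sector[PARTITION_TABLE_OFFSET:PARTITION_TABLE_OFFSET + ENTRY_SIZE] = entry
    sector[510:512] = BOOT_SIGNATURE
    return bytes(sector)


def parse_partitions(sector: bytes) -> list:
    """List (slot, type, first LBA, sector count) for each non-empty MBR entry."""
    if sector[510:512] != BOOT_SIGNATURE:
        raise ValueError("not an MBR: boot signature missing")
    found = []
    for slot in range(4):
        offset = PARTITION_TABLE_OFFSET + slot * ENTRY_SIZE
        _status, _chs_start, ptype, _chs_end, first_lba, count = struct.unpack_from(
            ENTRY_FORMAT, sector, offset)
        if ptype != 0x00:
            found.append((slot + 1, ptype, first_lba, count))
    return found


if __name__ == "__main__":
    three_tb_in_sectors = 3 * 10**12 // SECTOR_SIZE
    for slot, ptype, first, count in parse_partitions(build_protective_mbr(three_tb_in_sectors)):
        print(f"slot {slot}: type 0x{ptype:02X}, starts at sector {first}, {count} sectors")
```

An MBR-only tool that reads this sector finds a single partition of an unfamiliar type and no free space, which is exactly the protective effect the structure is meant to have.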

Because GPT incorporates a protective MBR, a BIOS-based computer can boot from a GPT disk using a boot loader stored in the protective MBR's code area, but the boot loader and operating system must both be GPT aware. (Some buggy BIOSes have problems booting from GPT disks, though.) EFI provides its own boot methods, so you can boot from a GPT disk on an EFI-based system.

The main problem with GPT is one of compatibility: Low-level disk utilities and operating systems must all support GPT. Such support is fairly common for Linux, although you may need to attend to some of these details and change some of the tools you use for low-level disk maintenance. If you multi-boot a computer, you'll have to look into GPT support for all of your operating systems.

If you administer many Linux systems, or if you anticipate adding an over-2TiB disk in the not-too-distant future, you may want to consider doing a test installation with GPT. Doing so before you're forced to do it will give you first-hand experience with GPT's features as well as with the quirks of some of the GPT-aware Linux utilities.

It's possible to run a system with a mixture of MBR and GPT disks. For instance, you can boot from an MBR disk but still use GPT for a data disk. Such a configuration is most useful for Windows on BIOS-based systems, because Windows can't boot from GPT using BIOS, but Windows Vista and later Microsoft operating systems can use a GPT data disk.


Using GPT

Three main classes of software all require GPT support: the kernel, the boot loader, and low-level disk utilities. If you're using GPT because you're setting up a large redundant array of independent disks (RAID), you may also need to look into file system support for extra-large disks.

Note: If you're installing Linux from scratch and want to use GPT, your installer must provide GPT support in all three of these categories. In 2012, this support is present in all the major Linux distributions.

Kernel support

The Linux kernel must provide GPT support to provide access to data on the disk's partitions. Fortunately, this support has long been present in Linux. If you compile your own kernel, be sure to select EFI GUID Partition Support in the Partition Types area of the Enable the Block Layer configuration area, as shown in Figure 1. (This item used to be located under File Systems, so look there if you've got an older kernel.)

Figure 1. The Linux kernel provides GPT support, but it must be enabled when you compile a new kernel
Kernel configuration: Select Enable the Block Layer, Partition Types, EFI GUID Partition Support

If you don't compile your own kernel, you're at the mercy of your distribution provider to enable this support. Fortunately, most do so. If you're in doubt, you can use a GPT-aware partitioning tool to set up GPT partitions on a test disk. If Linux recognizes the partitions, then your kernel is properly configured.
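
If your distribution ships the kernel's configuration file, you can also check the setting directly. The Python sketch below looks for the CONFIG_EFI_PARTITION symbol, which is the kernel option behind EFI GUID Partition Support. The two file locations it tries are common conventions rather than guarantees, so treat them as assumptions; the function names are invented for this example.

```python
import gzip
import platform


def efi_partition_enabled(config_text: str) -> bool:
    """Return True if the kernel configuration text enables GPT support."""
    return any(line.strip() == "CONFIG_EFI_PARTITION=y"
               for line in config_text.splitlines())


def read_kernel_config():
    """Return the running kernel's configuration text, or None if unavailable."""
    # Assumed locations; not every distribution provides either file.
    candidates = [f"/boot/config-{platform.release()}", "/proc/config.gz"]
    for path in candidates:
        opener = gzip.open if path.endswith(".gz") else open
        try:
            with opener(path, "rt") as handle:
                return handle.read()
        except OSError:
            continue
    return None


if __name__ == "__main__":
    text = read_kernel_config()
    if text is None:
        print("No kernel configuration file found; try the test-disk approach instead.")
    elif efi_partition_enabled(text):
        print("GPT support is enabled in this kernel.")
    else:
        print("GPT support does not appear to be enabled in this kernel.")
```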

Boot loader support

Boot loader support for GPT is variable and depends on your computer's firmware type. Under BIOS, only the Grand Unified Bootloader (GRUB) 2 officially supports GPT. Most Linux distributions today use GRUB 2 as the default boot loader, but some continue to use the older GRUB Legacy. GRUB Legacy doesn't officially support GPT, but patched versions with GPT support are readily available. The still-older Linux Loader (LILO) doesn't explicitly support GPT, but its disk-addressing methods are based on sector locations, so it often does work (in practice).

If you use GRUB 2 on a BIOS-based computer, be sure to create a BIOS Boot Partition, which holds GRUB's second-stage code. (This partition is identified as having its bios_grub flag set under GNU Parted or as being of type EF02 under gdisk.) The BIOS Boot Partition can be as small as 32KiB in some configurations, although it must sometimes be a bit larger. Given modern partition alignment policies, a size of 1MiB is common.

If your computer uses EFI, any EFI-capable boot loader will work with GPT; but EFI boot loader selection for Linux is tricky. As of mid-2012, some boot loaders remain unreliable or have system-specific quirks. In my experience, the Linux kernel's EFI stub loader (introduced with the 3.3.0 kernel) is the most reliable, followed in order by the EFI LILO (ELILO), the heavily patched version of GRUB Legacy that Fedora uses, and finally GRUB 2. In addition to the boot loader, you might need a separate boot manager to enable operating system selection, particularly if you dual-boot and use the kernel's EFI stub loader or ELILO to boot Linux. Two common choices for this task are rEFIt and rEFInd, the latter being a more up-to-date fork of the former. (Note that I maintain rEFInd.) See Resources for links to all of these programs.

EFI requires the presence of an EFI System Partition (ESP) to boot. (Macs are a partial exception to this rule, although they ship with ESPs defined.) The ESP should contain a FAT32 file system. The EFI standard doesn't specify a size, but something between 100MiB and 500MiB usually works well. If you use the Linux kernel's EFI stub loader or ELILO, you may need to store your kernel on the ESP, so creating an ESP on the large end of the scale is advisable.

Utilities support

The third area of GPT support is system utilities. Linux provides three main families of partitioning tools, with varying support for GPT:

  • The fdisk family. These programs (fdisk, cfdisk, and sfdisk) are text-mode tools that can handle MBR and some more exotic partition tables, but they can't handle GPT.
  • GNU Parted (libparted). The GNU Parted project provides a library (libparted) and a text-mode utility (parted) for partitioning. Several graphical user interface (GUI) utilities are built atop libparted, as well. The libparted library can handle MBR, GPT, and several other partition table types.
  • GPT fdisk. This family (gdisk, cgdisk, and sgdisk) is modelled after the fdisk family but works on GPT disks. (Note that I'm the author of GPT fdisk.)

As a general rule, tools based on GNU Parted—and particularly GUI tools such as GParted or the Palimpsest Disk Utility—are the easiest to use; however, GPT fdisk (and particularly gdisk) provides access to more GPT features. Thus, you might want to use GParted or other GUI tools to set up your disks but use GPT fdisk to fine-tune your configuration or repair damage to a GPT disk.

If you want to create fresh GPT partitions on a disk using GNU Parted, you should launch the program, then use its mklabel command, as in Listing 1.

Listing 1. Using GNU Parted to create GPT disk partitions
# parted /dev/sdd
GNU Parted 3.1
Using /dev/sdd
Welcome to GNU Parted! Type 'help' to view a list of commands.
(parted) mklabel
New disk label type? gpt
(parted)

At this point, you can begin creating partitions using GNU Parted's mkpart command or otherwise manipulate partitions. The process is similar to that of managing MBR partitions with parted, with a few twists. For instance, there's no need to specify a partition type as primary or logical; but you can enter a name for the partition.

Using gdisk is similar to using fdisk. Launching the program on a blank disk creates a new GPT, as in Listing 2.

Listing 2. Using gdisk to create GPT disk partitions
# gdisk /dev/sdd
GPT fdisk (gdisk) version 0.8.4

Partition table scan:
  MBR: not present
  BSD: not present
  APM: not present
  GPT: not present

Creating new GPT entries.

Command (? for help):

The gdisk commands for creating and manipulating partitions are similar to those used in fdisk, such as n to create a partition. As with parted on GPT disks, there's no need to specify a partition as primary, extended, or logical. Type codes in gdisk are based on MBR type codes but multiplied by 0x100—for instance, a Linux swap partition is of type 0x82 in MBR and 0x8200 in gdisk. You can set a partition's name with the c command or perform more advanced operations as described in gdisk's man page and online documentation.
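
The relationship between MBR type codes and gdisk type codes is simple enough to express in a few lines. This Python sketch is only an illustration of the multiply-by-0x100 rule described above; the function name is invented for this example.

```python
def mbr_to_gdisk_code(mbr_type: int) -> str:
    """Convert a 1-byte MBR type code to the matching 4-digit gdisk code."""
    if not 0x00 <= mbr_type <= 0xFF:
        raise ValueError("MBR type codes are a single byte")
    return f"{mbr_type * 0x100:04X}"


if __name__ == "__main__":
    for name, code in [("Linux swap", 0x82), ("Linux filesystem", 0x83)]:
        print(f"{name}: MBR 0x{code:02X} -> gdisk {mbr_to_gdisk_code(code)}")
```

The spare low byte leaves room for gdisk codes that have no MBR equivalent, such as EF02 for the BIOS Boot Partition.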

Whether you use parted or gdisk, when you're done, you can use the normal Linux file system management tools, such as mkfs, to create file systems on your disk. You can also create logical volume management and RAID configurations much as you would on MBR disks.

If you prefer a GUI tool, the Gnome Partition Editor (GParted) will do the job. Click Device > Create Partition Table to create a new GPT data structure. Click Advanced, then select gpt from the Select new partition table type list, as in Figure 2. Click Apply to create your new GPT data structures. You can then create new partitions in the same way you would if you were manipulating an MBR disk.

Figure 2. You must explicitly set the GPT partition table type to create GPT partitions in the Gnome Partition Editor
GParted label creation dialog box

The GPT creation tools of both GNU Parted and GParted are inherently destructive: If you've got an MBR disk, the only way to turn it into a GPT disk with these tools is to destroy your existing MBR partitions. If you want to convert an MBR disk in place, GPT fdisk does so automatically when you launch it. Be aware, though, that this conversion renders a BIOS boot disk unbootable until the boot loader is re-installed.

Linux employs a handful of MBR partition type codes, such as 0x82 and 0x83, to identify its MBR partitions. Similar GUID codes exist to identify Linux GPT partitions. One important caveat is that Linux has traditionally used the same GUID code as Windows for its data partitions. Thus, it's impossible to differentiate Linux partitions and NTFS file system or FAT partitions from their partition table GUIDs alone. This is unimportant on a Linux-only system, but if you dual-boot Windows and Linux on an EFI-based computer or if you create Linux partitions on a removable disk and use it in Windows, the result is that your Linux partitions appear to be uninitialized partitions in Windows, and Windows may ask whether you want to format the partitions if you try to access them. You can correct this problem in gdisk by giving your Linux partitions a gdisk type code of 8300. This new type code should be supported by libparted in the future, but it hadn't been implemented as of libparted version 3.1.
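
To show what these type GUIDs look like, the Python sketch below maps a few gdisk type codes to their GUIDs and decodes a GUID from its on-disk form. The GUID values are the well-known ones as far as I'm aware, but check them against your partitioning tool's own list before relying on them; the table and function names are invented for this example.

```python
import uuid

# gdisk type code -> (description, partition type GUID)
TYPE_GUIDS = {
    "0700": ("Microsoft basic data", uuid.UUID("EBD0A0A2-B9E5-4433-87C0-68B6B72699C7")),
    "8200": ("Linux swap", uuid.UUID("0657FD6D-A4AB-43C4-84E5-0933C84B4F4F")),
    "8300": ("Linux filesystem", uuid.UUID("0FC63DAF-8483-4772-8E79-3D69D8477DE4")),
    "EF00": ("EFI System", uuid.UUID("C12A7328-F81F-11D2-BA4B-00A0C93EC93B")),
    "EF02": ("BIOS boot partition", uuid.UUID("21686148-6449-6E6F-744E-656564454649")),
}


def describe_type(raw: bytes) -> str:
    """Identify a partition from the 16 type-GUID bytes stored in its GPT entry."""
    # GPT stores the first three GUID fields little-endian, hence bytes_le.
    guid = uuid.UUID(bytes_le=raw)
    for code, (name, known) in TYPE_GUIDS.items():
        if guid == known:
            return f"{code} ({name})"
    return f"unknown ({str(guid).upper()})"


if __name__ == "__main__":
    for code, (name, guid) in TYPE_GUIDS.items():
        print(f"{code}  {str(guid).upper()}  {name}")
    on_disk = TYPE_GUIDS["8300"][1].bytes_le
    print("decoded from disk bytes:", describe_type(on_disk))
```

A Linux partition created with the 0700 GUID is indistinguishable from a Windows data partition in this lookup, which is the ambiguity that the 8300 code removes.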

Large file systems support

If you're switching to GPT because you're using a large RAID configuration, you may need to investigate support for large file system sizes in the file systems you deploy. Table 1 summarizes these limits. (Note that some values vary with partitioning options.) Some of these values are quite large and use suffixes that may be unfamiliar: for example, 1TiB is 1024GiB, 1 pebibyte (PiB) is 1024TiB, 1 exbibyte (EiB) is 1024PiB, and 1 zebibyte (ZiB) is 1024EiB.

Table 1. File system volume and size limits

File system                                          Maximum volume size   Maximum file size
Second and third extended file systems (ext2/ext3)   16TiB                 2TiB
Fourth extended file system (ext4)                   1EiB                  16TiB
ReiserFS                                             16TiB                 8TiB
Journaled file system (JFS)                          32PiB                 4PiB
XFS                                                  16EiB                 8EiB
B-tree file system (Btrfs, under development)        16EiB                 16EiB
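
As a quick planning aid, the limits in Table 1 can be turned into a lookup that reports which file systems can hold a volume of a given size. The Python sketch below simply encodes the table's values; the names are invented for this example, and the 24TB array is a made-up sample.

```python
TIB = 2**40
PIB = 2**50
EIB = 2**60

# File system -> (maximum volume size, maximum file size), from Table 1.
LIMITS = {
    "ext2/ext3": (16 * TIB, 2 * TIB),
    "ext4": (1 * EIB, 16 * TIB),
    "ReiserFS": (16 * TIB, 8 * TIB),
    "JFS": (32 * PIB, 4 * PIB),
    "XFS": (16 * EIB, 8 * EIB),
    "Btrfs": (16 * EIB, 16 * EIB),
}


def filesystems_for(volume_bytes: int) -> list:
    """Return the file systems whose maximum volume size covers volume_bytes."""
    return [name for name, (max_volume, _max_file) in LIMITS.items()
            if volume_bytes <= max_volume]


if __name__ == "__main__":
    array_size = 24 * 10**12  # a hypothetical 24TB RAID array
    print(f"{array_size / TIB:.1f} TiB fits on:", ", ".join(filesystems_for(array_size)))
```

Note that a 24TB array is about 21.8TiB, which already exceeds the 16TiB volume limit of ext2, ext3, and ReiserFS.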

Beyond the file- and volume-size limits, there are file system performance differences. This topic is extremely complex, so you may need to consult with others who run setups similar to the one you're planning.


GPT partitioning advice

Some special concerns crop up for GPT partitioning, particularly if your computer uses EFI or you run in a multi-boot environment:

  • EFI requires an ESP, as noted earlier, on any boot disk.
  • Also as noted earlier, you should create a BIOS Boot Partition if you plan to boot from GPT on a BIOS-based computer.
  • Many GPT partitioning tools create gaps of about 128MiB after each partition (the ESP is an exception to this rule). The intention is that disk utilities can use this space to help with their jobs.
  • On Mac OS X systems, partitions are created in sizes that are multiples of 4KiB (typically, eight sectors). This feature relates to limitations of the HFS Plus file system that most modern Macs use.

You can follow these partitioning rules or ignore them as you see fit. Linux is flexible enough that it won't be bothered by a disregard for these rules, unless your computer requires an ESP or BIOS Boot Partition to boot.

One other rule isn't GPT specific but is important on most large disks produced since early 2010: These disks use 4KiB physical sectors but 512-byte logical sectors. This discrepancy creates potentially severe performance issues if partitions aren't aligned on physical sector boundaries. Partitioning tools released since late 2010 generally handle this well, but if you're using older tools, be sure to create properly aligned partitions.
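
If you need to check alignment by hand, the arithmetic is straightforward. The Python sketch below tests whether a partition's starting sector falls on a 4KiB physical sector boundary and rounds a start sector up to the 1MiB boundary that modern tools commonly use. The names are invented for this example, and the sample sector numbers are illustrative.

```python
LOGICAL_SECTOR = 512      # bytes per sector as reported to the operating system
PHYSICAL_SECTOR = 4096    # bytes per sector on the platter
ALIGNMENT_SECTORS = 2048  # 1MiB expressed in 512-byte sectors


def is_aligned(start_sector: int) -> bool:
    """True if the partition starts on a physical sector boundary."""
    return (start_sector * LOGICAL_SECTOR) % PHYSICAL_SECTOR == 0


def align_up(start_sector: int, alignment: int = ALIGNMENT_SECTORS) -> int:
    """Round a start sector up to the next multiple of the alignment value."""
    return -(-start_sector // alignment) * alignment


if __name__ == "__main__":
    for sector in (34, 63, 2048):
        print(f"sector {sector}: aligned={is_aligned(sector)}, "
              f"next 1MiB boundary={align_up(sector)}")
```

Sector 63, the traditional start of the first MBR partition, and sector 34, the first usable sector on a default GPT disk, are both misaligned on such drives, whereas sector 2048 is safe.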


Conclusion

GPT is becoming the standard for hard disk partitioning because of the size limitations of the older MBR. Fortunately, Linux is well prepared for this transition. Although Linux users may have to give up certain tools (such as fdisk), other tools are available to take their place (libparted and GPT fdisk, for instance). Understanding the requirements will help you make the transition easily when the time comes to do so. You'll need to attend to your kernel configuration, your boot loader configuration, and the utilities you use to create and manage partitions.

Resources

Get products and technologies

  • GNU Parted is a mature text-mode MBR and GPT partitioning tool.
  • The Gnome Partition Editor (GParted) is a GUI MBR and GPT partitioning tool that's based on libparted.
  • GPT fdisk is a GPT-only partitioning program modeled after Linux fdisk.
  • The GRUB web page provides information and resources for both GRUB 0.97 (GRUB Legacy) and GRUB 2.
  • Download and learn more about LILO.
  • The ELILO web page provides information and resources for ELILO.
  • Find information and resources for the rEFIt boot manager.
  • Find information and resources for the rEFInd boot manager.
  • The Linux SystemRescueCd is a useful utility for emergency maintenance. It includes a GPT-aware version of GRUB that you can install from the CD boot.