Level: Introductory
Lewin Edwards (sysadm@zws.com), Author, Freelance
08 Feb 2005
This installment of "Migrating from x86 to PowerPC" discusses detailed similarities and differences between booting Linux on an x86-based platform (typically a PC-compatible SBC) and a custom embedded platform based around PowerPC, ARM, and others. It discusses suggested hardware and software designs and highlights the tradeoffs of each. It also describes important design pitfalls and best practices.
This article describes the most common traits of embedded Linux™
distributions that people employ on x86 hardware and contrasts some of the
different options frequently seen on non-x86 embedded systems.
By the time a system has booted itself to the point where it can run your
application-level code, any one variant of Linux is, practically by
definition, largely similar to another. However, there are several
different methodologies that you can use to get the system from power-on
reset to a running kernel, and beyond that point, you can construct the filesystem in which your application will run in different ways.
Each approach has its own distinct advantages and disadvantages, and a definite, two-way relationship exists between the
hardware you choose to implement and the way you will structure the
power-up and Initial Program Load (IPL) process. Understanding the
software options available to you is a critical part of the research you
must do before designing or selecting hardware.
The x86 Linux boot process
The most fundamental and obvious difference between x86 boards and
embedded systems based on PPC, ARM, and others is that the x86 board will ship
with one or more layers of manufacturer-supplied "black box" firmware that
helps you with power-on initialization and the task of loading the
operating system out of secondary storage. This firmware takes the system
from a cold start to a known, friendly software environment ready to run
your operating system. Figure 1 is a diagram of the typical PC boot process,
with considerably more detail than you tend to find in PC-centric
literature:
Figure 1. Typical start-up process for x86 Linux
For cost reasons, modern PC mainboard BIOS code is always stored
compressed in flash. The only directly executable code in that chip is a
tiny boot stub. Therefore, the first task on power-up is to initialize the
mainboard chipset enough to get the DRAM controller working so that the
main BIOS code can be decompressed out of flash into a mirror area in RAM,
referred to as shadow RAM. This area is then write-protected and control
is passed to the RAM-resident code. Shadow RAM is permanently stolen by
the mainboard chipset; it cannot later be reclaimed by the operating
system. For legacy reasons, special hardware mappings are set up so that
the shadow RAM areas appear in the CPU's real-mode memory map at the
locations where old operating systems like MS-DOS would expect to find
them.
Keep in mind that the PC is an open architecture. This openness even
extends down to firmware modules within the BIOS itself. Once the power-on
initialization (POI) code has run, the next step it takes is to enumerate
peripherals, and optionally install hooks provided by expansion ROMs in
those peripherals. (Some of those expansion ROMs -- for instance, the
video BIOS in a system that has onboard integrated video hardware -- will
physically reside in the main BIOS image, but conceptually they are
separate entities). The reasons the BIOS has to do this redundant
initialization are:
- The main BIOS itself needs basic console services to announce
messages and allow the user to override default start-up behavior and
configure system-specific parameters.
- Historical issues limit the size of a user-supplied bootloader
program to slightly less than 512 bytes. Since this isn't enough space to
implement all the possible device drivers that might be required to access
different displays and storage devices, it's necessary for the BIOS to
install standardized software interfaces for all installed, recognized
hardware that might be required by the bootloader.
Once all the BIOS-supported system peripherals are initialized, the main
BIOS code will run through candidate boot devices (in accordance with a
user-configurable preference list) looking for a magic signature word.
Storage devices for IBM®-compatible PCs have historically used a sector
size of 512 bytes, and therefore the BIOS only loads the first 512 bytes
from the selected boot device. The operating
system's installation program is responsible for storing sufficient code in that zone to
bootstrap the remainder of the IPL process.
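The signature check described above can be sketched in a few lines. This is an illustration rather than real BIOS code: the names `has_boot_signature` and `first_bootable` are invented for this example, and it assumes the conventional PC signature bytes 0x55, 0xAA in the last two bytes of the 512-byte boot sector.

```python
SECTOR_SIZE = 512
BOOT_SIGNATURE = b"\x55\xaa"  # conventional PC boot signature at offsets 510-511


def has_boot_signature(sector: bytes) -> bool:
    """Return True if a 512-byte sector ends with the PC boot signature."""
    if len(sector) != SECTOR_SIZE:
        return False
    return sector[510:512] == BOOT_SIGNATURE


def first_bootable(devices):
    """Walk (name, first_sector) pairs in preference order and return the
    name of the first device whose boot sector carries the signature."""
    for name, sector in devices:
        if has_boot_signature(sector):
            return name
    return None
```

Because the signature occupies the last two bytes of the sector, the bootloader stub itself has slightly less than 512 bytes to work with.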
Although it would be possible to write a minimalist Linux bootloader that
would fit into such a space, practical Linux bootloaders for the PC
consist of two stages: a small stub that lives in the boot sector, and a
larger segment that lives somewhere else on the boot medium, usually
inside the partition that contains the root filesystem. LILO and GRUB are
the best-known bootloaders for mainstream Linux installations, and
SYSLINUX is a popular choice for embedded distributions.
Using a RAMdisk
The primary purpose of the bootloader is to load the operating system
kernel from secondary storage into RAM. In a Linux system (x86 or
otherwise), the bootloader can also optionally load an initial RAMdisk
image. This is a small filesystem that resides entirely in RAM. It
contains a minimal set of modules to get the operating system off the
ground before mounting the primary root filesystem. The original design
purpose for initial RAMdisk support in the kernel was to provide a means
whereby numerous optional device drivers could be made available at boot
time (potentially drivers that needed to be loaded before the root
filesystem could be mounted).
You can get an idea of the original usage scenario for the RAMdisk by
considering a bootable Linux installation CD-ROM. The disk needs to
contain drivers for many different hardware types, so that it can boot
properly on a wide variety of different systems. However, it's desirable
to avoid building an enormous kernel with every single option statically
linked (partly for memory space reasons, but also to a lesser degree
because some drivers "fight" and shouldn't be loaded simultaneously). The
solution to this problem is to link the bare minimum of drivers statically
in the kernel, and to build all the remaining drivers as separately
loadable modules, which are then placed in the RAMdisk. When the unknown
target system is booted, the kernel (or start-up script) mounts the
RAMdisk, probes the hardware, and loads only those modules appropriate for
the system's current configuration.
Having said all that, many embedded Linux applications run entirely out of
the initial RAMdisk. As long as you can spare the memory -- 8MB is usually
more than enough -- it's a very attractive way of organizing your system.
Generally speaking, this is the boot architecture I favor, for a few
reasons:
- The root filesystem is always writeable. It's much less work to have
a writeable root than it is to coerce all your other software to put its
temporary files in special locations.
- There is no danger of exhausting flash memory erase-modify-write
lifetimes or of corrupting the boot copy of the root filesystem, because
the system executes entirely out of a volatile RAM copy.
- It is easy to perform integrity-checking on the root filesystem at
boot time. If you calculate a CRC or other check value when you first
install the root filesystem, that same value will be valid on all
subsequent boots.
- (Particularly interesting to applications where the root filesystem
is stored in flash) You can compress the boot copy of the root
filesystem, and there is no run time performance hit. Although it's
possible to run directly out of a compressed filesystem, there's obviously
an overhead every time your software needs to access that filesystem.
Compressed filesystems also have other annoyances, such as the inability
to report free space accurately (since the estimated free space is a
function of the anticipated compression ratio of whatever data you plan to
write into that space).
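As a rough sketch of the integrity-checking and compression points above, the following assumes the boot copy of the root filesystem is stored as a gzip-compressed image and that a CRC-32 of the stored image was recorded at install time. The function names are hypothetical, and a real bootloader would do this in its own firmware code rather than in Python.

```python
import gzip
import zlib


def image_crc32(stored_image: bytes) -> int:
    """CRC-32 of the stored (compressed) root filesystem image."""
    return zlib.crc32(stored_image) & 0xFFFFFFFF


def load_ramdisk(stored_image: bytes, expected_crc: int) -> bytes:
    """Verify the stored image against the install-time CRC, then
    decompress it into the bytes that would populate the RAMdisk."""
    if image_crc32(stored_image) != expected_crc:
        raise ValueError("root filesystem image failed its integrity check")
    return gzip.decompress(stored_image)
```

Since the running system only ever modifies the RAM copy, the stored image never changes, and the CRC recorded at install time stays valid for every later boot.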
Other x86 boot considerations
Notice a few other points from Figure 1. The first
is that the color coding is meaningful. In the blue boxes, the system is
running BIOS code and accessing all system resources through BIOS calls.
In the green boxes, the system is running user-provided code out of RAM,
but all resources are still accessed through BIOS calls. In the yellow
boxes, the system is running Linux kernel code out of RAM and operating
out of a RAM disk. Hardware is accessed through the Linux device driver
architecture. The purple boxes are like the yellow boxes, except that the
system is running out of some kind of secondary storage rather than a
RAMdisk. The rules being followed in the gray box are system-specific.
You'll observe from this that there are several possible boot routes
once the kernel has been loaded. You can load an initial
RAMdisk and run entirely out of that; you can use the initial RAMdisk and
then switch over to a main root filesystem on some other storage medium;
or you can skip the initial RAMdisk altogether and simply tell the kernel
to mount a secondary storage device as root. Desktop Linux distributions
tend to use the last of these design models.
Also note that there is an awful lot of redundant code here. The BIOS
performs system tests and sets up a fairly complex software environment to
make things cozy for operating systems like MS-DOS. The Linux kernel has
to duplicate much of the hardware discovery process. As a rule, once the
kernel loads, none of the ROM-resident services are used again (although
there are some exceptions to this statement), yet you still have to waste a
bunch of RAM shadowing that useless BIOS code.
The non-x86 Linux boot process
In contrast to the x86's complex boot process, an embedded device like the
Kuro Box jumps as directly as possible into the operating system. Although
there are extant standards for implementing firmware interfaces
(equivalent to the PC ROM-BIOS) in PowerPC® systems, these standards are
rarely implemented in embedded appliances. The general firmware
construction in such a system (assuming that it is based on Linux) is that
the operating system kernel, a minimal filesystem, and a small bootloader
all reside in linearly-accessible flash memory.
At power-up, the bootloader initializes the RAM controller and copies the
kernel and (usually) the initial RAMdisk into RAM. Flash memory is
typically slow and often has a narrower data bus than other memories in
the system, so it's practically unheard of to execute the kernel directly
out of flash memory, although it's theoretically possible with an
uncompressed kernel image.
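The copy step can be modeled very simply. The sketch below treats flash and RAM as byte buffers and copies named regions from one to the other. The function names and the region tuple format are assumptions made for this illustration, and a real bootloader would perform the same job with memory-to-memory copies in C or assembly after bringing up the RAM controller.

```python
def copy_region(flash: bytes, ram: bytearray, src: int, dst: int, length: int) -> None:
    """Copy `length` bytes from flash offset `src` to RAM offset `dst`."""
    if src + length > len(flash) or dst + length > len(ram):
        raise ValueError("region does not fit")
    ram[dst:dst + length] = flash[src:src + length]


def load_images(flash: bytes, ram: bytearray, regions):
    """Copy each (name, flash_offset, ram_offset, length) region into RAM
    and return a map of image name to RAM load address."""
    loaded = {}
    for name, src, dst, length in regions:
        copy_region(flash, ram, src, dst, length)
        loaded[name] = dst
    return loaded
```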
Most bootloaders also give the user some kind of recovery interface,
whereby the kernel and initial RAMdisk can be reloaded from some external
interface if the flash copies are bad or missing. Off-the-shelf
bootloaders used in these applications include blob, U-Boot and RedBoot,
although there are others -- and there are many applications that use
utterly proprietary bootloaders. Figure 2 illustrates a typical start-up
flow for a non-x86 embedded Linux device:
Figure 2. Typical start-up process for PPC or ARM Linux
Observe that, as with the x86 start-up process above, you have the same
choice of routes once the kernel has been loaded. Also note that
once control passes to the kernel, the boot process is identical to what
it was on the x86. This is to be expected: the further you get in the
boot process, the more the software environment is defined by the
operating system's API specification rather than the vagaries of the
underlying hardware.
The layout of flash memory
The exact layout of such a system in flash memory depends on two principal
factors: the flash device sector size (usually in the neighborhood of
64KB), and the processor's power-on-reset behavior. A core like ARM, which
starts execution at address 0, will put the bootloader at the bottom of
flash. A core like x86, which starts execution just below the top of its
address space, will need to put the bootloader at the top.
There are at least two, and generally four, entities that need to be
installed in flash: the bootloader (mandatory); an optional parameter
block providing nonvolatile storage for boot options, calibration data, and
other information; the Linux kernel itself (again, mandatory); and almost
always an initial RAMdisk image. For example, a layout for a 4MB flash chip
with a 64KB sector size might be as follows:
Listing 1. Typical layout of a 4MB flash chip
000000-01FFFF Bootloader (128KB)
020000-02FFFF Parameter block (64KB, probably mostly unused)
030000-1FFFFF Kernel (1.8MB)
200000-3FFFFF Initial RAMdisk image (2MB)
While it is possible to pack these segments together so that they do not
begin and end on sector boundaries, with two segments sharing a sector
(and it is especially tempting in the case of the parameter
block, which will likely be more than 99% empty), this is an extremely
unwise practice and should be avoided unless you are under terribly severe
flash space constraints. It is particularly vital that the bootloader
should reside in sectors of its own that can be left write-protected.
Otherwise, a failed firmware upgrade operation may leave the system
entirely inoperable. Good system engineering should provide a safe
fallback position from any possible user-initiated upgrade process.
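One way to catch the sector-sharing mistake early is to check a proposed layout mechanically. The sketch below encodes the addresses from Listing 1 and reports any region that does not begin and end on a 64KB sector boundary, overlaps its neighbor, or runs past the end of the chip. The `check_layout` helper is hypothetical and not part of any bootloader.

```python
SECTOR_SIZE = 64 * 1024          # 64KB erase sectors, as in Listing 1
FLASH_SIZE = 4 * 1024 * 1024     # 4MB chip

# (name, first address, last address), copied from Listing 1
LAYOUT = [
    ("bootloader", 0x000000, 0x01FFFF),
    ("parameters", 0x020000, 0x02FFFF),
    ("kernel", 0x030000, 0x1FFFFF),
    ("ramdisk", 0x200000, 0x3FFFFF),
]


def check_layout(layout, sector_size=SECTOR_SIZE, flash_size=FLASH_SIZE):
    """Return a list of problems; an empty list means every region
    occupies whole sectors of its own inside the chip."""
    problems = []
    previous_end = -1
    for name, start, end in sorted(layout, key=lambda region: region[1]):
        if start % sector_size != 0:
            problems.append(f"{name} does not start on a sector boundary")
        if (end + 1) % sector_size != 0:
            problems.append(f"{name} does not end on a sector boundary")
        if start <= previous_end:
            problems.append(f"{name} overlaps the previous region")
        if end >= flash_size:
            problems.append(f"{name} extends past the end of flash")
        previous_end = max(previous_end, end)
    return problems
```

Running `check_layout(LAYOUT)` on the Listing 1 addresses returns an empty list, since every region there starts and ends on a 64KB boundary.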
The only part of this software bundle that absolutely must be preloaded at
the factory is the bootloader. Once the system is startable from that boot
code, you can use other (end-user-accessible) interfaces to load the
kernel and RAMdisk image.
Why not do this on x86?
By the way, at this point the attentive reader may be wondering why
embedded PC applications can't use a special boot ROM that simply loads
the operating system kernel directly off disk (or some other medium).
The answer to this is that while it's possible to write a custom cut-down
bootstrap program for a PC motherboard (see, for example, the LinuxBIOS
project), the types of applications that use PC hardware tend to be using
the board as a black box. Typically, the system integrator will not even
have access to datasheets or schematics for the board; they can't write a
bootstrap program even if they want to. Furthermore, PC operating systems
are built on the assumption that lowest-common-denominator BIOS services
are available, at least at boot time. In other words, it's a simple fact
that the path of least resistance is so much easier than a fully custom
alternative that practically nobody tries to do it the "smart" way. The
inefficiencies of the multi-layer BIOS approach are lost in the noise (as
it were) compared with the overall system specifications.
Having digested all the above, assuming you understand approximately how
large your various software modules will be, you are well prepared to
select flash and RAM sizes and layouts for a custom embedded system. Kuro
Box happens to use a very uncomplicated memory architecture. It has a
single 4MB linear flash chip and 64MB SDRAM. While this is the
simplest design, it is not necessarily the cheapest, and you may wish to
consider other alternatives if you are designing your own system.
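To make the sizing exercise concrete, here is a small sketch that takes estimated image sizes, rounds each one up to whole sectors, and reports the total flash required. The function names are invented for this example, the 64KB sector size is only a default, and the power-of-two chip capacity is an assumption about how parts are commonly sold.

```python
def round_up(value: int, multiple: int) -> int:
    """Round `value` up to the next multiple of `multiple`."""
    return -(-value // multiple) * multiple


def plan_flash(sizes, sector_size=64 * 1024):
    """Given (name, size_in_bytes) pairs, give each image whole sectors.
    Returns (regions, total) where regions are (name, start, end) tuples."""
    regions = []
    offset = 0
    for name, size in sizes:
        span = round_up(size, sector_size)
        regions.append((name, offset, offset + span - 1))
        offset += span
    return regions, offset


def smallest_chip(total: int) -> int:
    """Smallest power-of-two capacity, in bytes, that holds `total` bytes."""
    capacity = 1
    while capacity < total:
        capacity *= 2
    return capacity
```

For instance, a 100KB bootloader, a 1KB parameter block, a 1.5MB kernel, and a 1,900,000-byte RAMdisk image round up to 3,670,016 bytes in total, which fits a 4MB part.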
A few alternatives
One hardware architecture that I have used with some success, and which I
have also seen in a few other commercial products, is to use a very small,
cheap (generally narrow-bus) OTP EPROM as the primary boot device. This
chip is factory-programmed with just enough bootstrap code to load the
main firmware image off a secondary storage device and check its
integrity. It is very useful if you can also include a little additional
intelligence so that the secondary storage device can be reloaded from
some external source -- removable media, a serial port, Ethernet, USB or
something else -- if the main firmware image becomes corrupted.
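The load, check, and recover logic of such a boot PROM might look roughly like the following. Everything about the image format here is an assumption made for illustration: a 12-byte header holding a made-up magic value, the payload length, and a CRC-32 of the payload. The `read_main_image` and `recover` callables stand in for whatever storage and external interfaces the hardware actually provides.

```python
import struct
import zlib

HEADER = struct.Struct("<4sII")  # hypothetical header: magic, payload length, CRC-32
MAGIC = b"FWIM"                  # made-up magic value for this sketch


def pack_image(payload: bytes) -> bytes:
    """Prefix a payload with the hypothetical header."""
    crc = zlib.crc32(payload) & 0xFFFFFFFF
    return HEADER.pack(MAGIC, len(payload), crc) + payload


def validate_image(blob: bytes):
    """Return the payload if the header and CRC check out, else None."""
    if len(blob) < HEADER.size:
        return None
    magic, length, crc = HEADER.unpack_from(blob)
    payload = blob[HEADER.size:HEADER.size + length]
    if magic != MAGIC or len(payload) != length:
        return None
    if zlib.crc32(payload) & 0xFFFFFFFF != crc:
        return None
    return payload


def boot(read_main_image, recover):
    """Use the stored image if it is valid; otherwise fetch a
    replacement from an external source through `recover`."""
    payload = validate_image(read_main_image())
    if payload is not None:
        return "main", payload
    payload = validate_image(recover())
    if payload is None:
        raise RuntimeError("no valid firmware image available")
    return "recovered", payload
```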
An attractive choice of storage device for the main image is NAND flash,
which is cheaper than the linear NOR flash used by Kuro Box. NAND flash is
produced in vast quantities for removable storage devices: CompactFlash
cards, USB "pen disks," Secure Digital (SD) cards, MP3 players, and so on.
Although it is possible, with a minimal amount of external logic, to graft
NAND flash onto a normal flash/ROM/SRAM controller (such as that in the
MPC8241), there are a couple of reasons why you can't simply boot directly
out of the NAND flash. The first is that NAND is not guaranteed
error-free; it's the host's responsibility to maintain ECC and bad-block
mapping information. The second reason is that NAND flash is addressed
serially; you send the chip a block number, then read the block out into
RAM. Hence, you need a little boot firmware on a normal random-access
PROM to do the physical-level management of the NAND. (See Resources for more on NAND and NOR.)
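The block-at-a-time access pattern and the need to work around bad blocks can be illustrated with a toy model. The `SimulatedNand` class and `read_image` function below are invented for this sketch. Skipping over blocks marked bad is only one simple management strategy, and ECC handling is left out entirely.

```python
class SimulatedNand:
    """Toy model of a NAND chip: data is reachable only one whole block at a time."""

    def __init__(self, blocks, bad_blocks=()):
        self._blocks = list(blocks)
        self._bad = set(bad_blocks)

    def block_count(self) -> int:
        return len(self._blocks)

    def is_bad(self, index: int) -> bool:
        return index in self._bad

    def read_block(self, index: int) -> bytes:
        if index in self._bad:
            raise OSError(f"block {index} is marked bad")
        return self._blocks[index]


def read_image(nand, first_block: int, block_total: int) -> bytes:
    """Read `block_total` good blocks starting at `first_block`,
    skipping any block that is marked bad."""
    out = bytearray()
    index = first_block
    remaining = block_total
    while remaining > 0:
        if index >= nand.block_count():
            raise OSError("ran out of blocks before the image was complete")
        if not nand.is_bad(index):
            out += nand.read_block(index)
            remaining -= 1
        index += 1
    return bytes(out)
```

The point of the sketch is that the image cannot simply be executed in place: some piece of code has to walk the blocks, skip the bad ones, and assemble the result in RAM.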
Note that some microcontrollers provide hardware NAND controllers that
obviate the need for the little boot PROM I discussed above. The
disadvantage of relying entirely on that sort of hardware is that you lose
the failsafe system-recovery features that can easily be implemented in
the boot PROM. However, if you're working against space or cost
constraints, and your micro has the NAND control hardware, you may want to
avail yourself of it. SoCs sold for cell phone applications use this sort
of technology.
Summary
By now you should understand many of the design choices to be made when
building your embedded Linux distribution and selecting the memories in
which it will reside and run. You should also understand the way
parts of your software distribution can be spread between a RAMdisk and
secondary storage devices. The next article gets into the
gory details of the software preloaded on the Kuro Box and sets the stage
for some real development work.
Resources
- Migrating from x86 to PowerPC is the only
developerWorks Power Architecture technology series on the entire Internet
that will help you build your own remote-controlled robot submarine army.
Missed a previous installment? Don't dismay: it's astonishingly easy to
read them all now.
- Need to hack a serial port onto your Kuro Box? Lewin has posted all of
the details to his site.
- Tired of waiting for the BIOS to probe hardware? You may be able to
load the kernel directly (developerWorks, May 2004).
- The LinuxBIOS project is a BIOS
replacement for x86 PCs. It directly boots the Linux kernel and requires no
secondary storage.
- Das U-Boot is a
flexible and popular bootloader for numerous different hardware architectures.
- The blob bootloader
(Boot Loader OBject) is a small, easy-to-use bootloader designed for ARM systems.
- Red Hat's RedBoot
bootloader is actually a miniature operating system, based on the
eCos product, also from Red Hat. It is very flexible and has been
ported to many platforms.
- This Toshiba presentation describes some of the structure and advantages
of NAND flash memories (versus NOR).
- Red Hat offers a fairly detailed x86-centric
description of the Linux start-up and shutdown process.
- This useful article discusses how to write code that lives in the Master Boot Record (MBR).
- A recent developerWorks article, Standards
and specs: Open Firmware -- the bridge between power-up and OS, covers
the Open Firmware boot loader environment (developerWorks, October 2004).
- IBM has some related material in Linux
Handbook: A Guide to IBM Linux Solutions and Resources (IBM Redbook,
2003).
- Find more Linux-related resources at the IBM developerWorks
Linux zone and the developerWorks Linux on Power
Architecture Developer's corner.
- Have experience you'd be willing to share with Power Architecture zone
readers? Article submissions on all aspects of Power Architecture technology from authors inside and outside
IBM are welcomed. Check out the Power Architecture author
FAQ to learn more.
- Have a question or comment on this story, or
on Power Architecture technology in general?
Post it in the Power Architecture technical forum
or send in a letter to the editors.
- The Power Architecture Community Newsletter includes full-length articles as well as recent news about members of the Power Architecture community and upcoming events of interest. Subscribe to the newsletter today!
- All things Power are chronicled in the developerWorks Power
Architecture editors' blog, which is just one of many developerWorks
blogs.
- Find more articles and resources on Power Architecture
technology and all things
related in the developerWorks Power
Architecture technology content area.
- Download an IBM PowerPC 405 Evaluation Kit to demo a SoC in a simulated
environment, or just to explore the fully licensed version of
Power Architecture technology. This and other fine Power Architecture-related downloads are listed in
the developerWorks Power Architecture technology content area's downloads section.
About the author
Lewin A.R.W. Edwards works for a Fortune 50 company as a wireless security/fire safety device design engineer. Prior to that, he spent five years developing x86, ARM and PA-RISC-based networked multimedia appliances at Digi-Frame Inc. He has extensive experience in encryption and security software and is the author of two books on embedded systems development. He can be reached at sysadm@zws.com.