Electronic publishing: Green and convenient
Continuous access to information is a basic assumption for today's mobile workforce and always-on society. Web-based content is inherently available from any networked, browser-equipped mobile device, but longer content is most conveniently stored locally on your mobile or portable device as documents in a variety of electronic publishing formats.
Capitalizing on the ubiquity of devices such as tablets, smart phones, and eBook readers, more organizations are delivering product literature and documentation in electronic publishing formats for both internal and customer consumption. Delivering documents electronically gets them into consumer and customer hands quicker, saves money, conserves paper, and simplifies document updates. By giving customers control over where and how they access your publications and by making it easier to access them, electronic publishing provides direct benefits to both consumers and publishers.
Problems in portable publishing
Electronic publishing solves many of the issues in distributing and accessing documents but introduces a new problem: the format in which electronic content is delivered. Table 1 lists the most common electronic publishing formats and the devices that support them, though many other formats are actively in use. As this table illustrates, every electronic document reader, whether an application or a dedicated device, supports only a subset of these formats. When manually creating electronic content, corporate publishers must therefore select the "right" electronic publishing formats: at a minimum, those supported on the devices and applications used internally and by customers.
Table 1. Electronic publishing formats and supported devices
| Format | Description | eReader support |
|---|---|---|
| TXT | Plain text format | All |
| HTML | Hypertext Markup Language | Most (exception: Barnes & Noble Nook) |
| PDF | Adobe® Portable Document Format; not mobile friendly unless text re-flow is enabled during document creation | Most |
| EPUB | Widely supported open format for electronic publishing (see Resources for more information) | Most (exception: Amazon Kindle) |
| MOBI | Follows the specifications of the Open eBook (OEB) format; may also have a PRC (Palm) or AZW (Amazon DRM-enabled MOBI) extension | Most (exception: Nook) |
| LIT | Microsoft's literature format | Microsoft® Reader application only |
| PDB | Extension refers to different formats on different devices | Sony Reader, open source Plucker application, TealDoc documents on Palm and compatible devices |
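One way to reason about Table 1 is to encode it as data and ask which formats a given set of readers has in common. The sketch below is illustrative only: the `EXCEPTIONS` mapping and the `common_formats` helper are names invented for this example, the mapping copies just the rows of Table 1 that list "All" or "Most" with a named exception, and actual device support changes over time.

```python
# Rows of Table 1 with "All" or "Most (exception: ...)" support, mapped to
# the readers the table names as exceptions. Illustrative data only.
EXCEPTIONS = {
    "TXT": set(),
    "HTML": {"Barnes & Noble Nook"},
    "EPUB": {"Amazon Kindle"},
    "MOBI": {"Barnes & Noble Nook"},
}


def common_formats(readers):
    """Return the formats that none of the given readers is listed as lacking."""
    wanted = set(readers)
    return sorted(
        fmt for fmt, excluded in EXCEPTIONS.items() if not (wanted & excluded)
    )


if __name__ == "__main__":
    print(common_formats(["Amazon Kindle", "Barnes & Noble Nook"]))
```

For an audience that uses both Kindle and Nook devices, only plain text survives this filter, which is why publishers usually end up delivering the same document in more than one format.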
The free and open source Calibre application provides a convenient
translator for all of these different formats. It enables you to convert
electronic publications from one format to another, liberating electronic
documents from the restrictions of a specific format or device. Calibre
can read all common electronic publishing formats as well as import
documents in PDF, text, and HTML format. You can then use Calibre's
conversion capability to save those imported documents in the particular
format you need.
Corporate publishers should keep in mind that converting electronic publications between different formats for corporate use may require purchasing additional copies of those publications. If in doubt, consult your organization's legal staff before converting and redistributing electronic publications that your company does not own. Converting electronic publications between different formats for personal use is generally viewed as acceptable use as long as you do not remove any DRM mechanisms present in those publications and respect any redistribution limitations that the original publisher imposed.
Installable versions of Calibre for Windows® and Apple Mac OS X
operating systems are available from the Calibre download page (see Resources). Calibre for Linux® is provided
as a package in the repositories for most Linux distributions and can be
installed using your distribution's package management
utilities—typically, apt-get,
rpm, Synaptic, or
yum. If the repositories for your distribution
do not include a prepackaged version of Calibre or you want to install a
newer version of Calibre than what is available for your Linux
distribution, you can download and install the latest version of Calibre
over the Internet using the following command:
python -c "import urllib2; \
exec urllib2.urlopen('http://status.calibre-ebook.com/linux_installer').read(); main()"
You must execute this installation command as a privileged user by using
the su or sudo commands. Type the command on a single line, without the
backslash character (\, used in this example for formatting purposes).
After entering this command, the latest Calibre package for Linux is downloaded to your system. You are then prompted for the name of the directory in which you want to install Calibre:
Enter the installation directory for calibre [/opt]:
Enter the name of the directory in which the installer should create a calibre sub-directory to hold Calibre and associated applications, data files, and libraries. You can also simply press Return to accept the default value, /opt, which will create the /opt/calibre directory and install Calibre in that location. Installing Calibre over the Internet creates symbolic links in your system's /usr/bin directory that point to the Calibre application and other applications in the Calibre package.
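After installation, a quick way to confirm that the symbolic links are in place is to check whether the Calibre commands resolve on your PATH. The following is a minimal sketch that uses only Python's standard library; the helper names `locate_tools` and `missing_tools` are made up for this example, and the tool list covers only the commands named in this article.

```python
import shutil

# Commands named in this article; the Calibre package installs others too.
CALIBRE_TOOLS = ("calibre", "ebook-convert", "ebook-meta")


def locate_tools(names=CALIBRE_TOOLS):
    """Map each tool name to its resolved path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


def missing_tools(names=CALIBRE_TOOLS):
    """Return the tool names that could not be found on PATH."""
    return [name for name, path in locate_tools(names).items() if path is None]


if __name__ == "__main__":
    for name, path in locate_tools().items():
        print(f"{name}: {path or 'not found'}")
```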
Import documents into a Calibre library
Calibre is a graphical application that uses the Qt UI Framework (see Resources) to provide a standard, cross-platform GUI. Installing Calibre also provides a number of command-line tools, discussed later.
When using Calibre in graphical mode, you must import your existing documents into your Calibre library before you can convert, annotate, or otherwise modify them. To select the documents that you want to import, click the Add books icon on the Calibre toolbar. In the navigation window that appears, browse to and select those documents, then import them by clicking Open.
Calibre's main window displays summary information about the documents currently in your Calibre library. When you first start Calibre, this listing contains only the Calibre Quick Start Guide, included to help you get started using Calibre by walking you through its basic capabilities. After you have imported documents into your Calibre library, the summary listing also displays entries for those documents, as shown in Figure 1.
Figure 1. The main Calibre window showing imported documents
Selecting any document in your library displays a preview of that
document in the right pane. The output formats in which that document is
currently available are listed below the preview image.
After you have imported documents into Calibre, you can click icons in its toolbar to perform various tasks, including converting those documents to other formats and adding or augmenting document metadata. See the Calibre documentation or online help for a complete list of toolbar icons and their associated tasks.
Convert documents between electronic formats
Calibre's Convert books toolbar icon makes it easy to convert documents in your Calibre library to other electronic formats. Selecting one or more documents in your library and clicking this icon displays Calibre's primary conversion window, shown in Figure 2.
Figure 2. Calibre's primary conversion window
The Input format list at the upper left identifies
the format of a selected publication, while the Output
format list on the right enables you to select the electronic
publishing format to which you want to convert that publication. Use the
buttons in the left pane to customize various aspects of the conversion
process specific to your input and output formats. These options let you
customize the default conversion process by, for example, specifying
logical expressions that identify the portions of the input publication
that should be treated as entries for the table of contents in the
converted document.
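In Calibre, the expressions that select table-of-contents entries are XPath expressions such as `//h:h1`, where the `h` prefix stands for the XHTML namespace. To get a feel for what such an expression matches, the sketch below evaluates similar expressions with Python's standard `xml.etree.ElementTree` module, which supports only a limited XPath subset. It is not how Calibre itself evaluates them, and both the sample document and the `toc_entries` helper are invented for this example.

```python
import xml.etree.ElementTree as ET

# Prefix-to-namespace mapping; "h" mirrors the prefix used in Calibre's
# XPath expressions for XHTML elements.
XHTML_NS = {"h": "http://www.w3.org/1999/xhtml"}

# Hypothetical sample document, invented for this example.
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <h1>Installing</h1>
    <p>Text</p>
    <h2>Linux</h2>
    <h1>Converting</h1>
    <h2 class="sidebar">Tip</h2>
  </body>
</html>"""


def toc_entries(xhtml, expression):
    """Return the text of every element the expression matches, in document order.

    ElementTree needs a relative path, so write ".//h:h1" here where you
    would write "//h:h1" in Calibre.
    """
    root = ET.fromstring(xhtml)
    return [element.text for element in root.findall(expression, XHTML_NS)]


if __name__ == "__main__":
    print(toc_entries(SAMPLE, ".//h:h1"))
    print(toc_entries(SAMPLE, ".//h:h2[@class='sidebar']"))
```

Trying an expression against a small sample like this is a cheap way to see whether it selects too many or too few headings before you start a full conversion.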
After you have made your customizations to the conversion process, clicking OK begins the conversion as a background operation and redisplays the main Calibre window. A status indicator in the lower-left corner identifies the number of conversion operations in progress. When the process is complete, Calibre updates the list of output formats in which a document is available to include the newly generated formats.
Add or modify document metadata with Calibre
Metadata, often referred to as data about data, is information about another piece of information. Documents typically contain metadata such as the author, publisher, and publication date. Calibre's Edit metadata command makes it easy to display the current metadata associated with a selected document, enabling you to add, edit, or remove document metadata. Selecting a document in your library and clicking the Edit metadata icon displays Calibre's Edit Metadata window, as shown in Figure 3.
Figure 3. Calibre's Edit Metadata window
To add new metadata or edit existing entries, select the appropriate
field and make your changes or additions. Click OK to
save the modified metadata in the document, or click
Cancel to discard your changes.
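The same kind of edit can be scripted with Calibre's ebook-meta command-line tool. The following sketch builds an ebook-meta call that sets a title, authors, and publisher. The helper names, the file name, and the sample values are assumptions made for this example; `--title`, `--authors`, and `--publisher` are the only ebook-meta options it relies on.

```python
import subprocess


def ebook_meta_command(path, title=None, authors=None, publisher=None):
    """Build an ebook-meta argument list that sets the given fields."""
    command = ["ebook-meta", str(path)]
    if title:
        command += ["--title", title]
    if authors:
        # ebook-meta expects multiple authors separated by ampersands.
        command += ["--authors", " & ".join(authors)]
    if publisher:
        command += ["--publisher", publisher]
    return command


def apply_metadata(path, **fields):
    """Run ebook-meta; raises CalledProcessError if the tool reports failure."""
    subprocess.run(ebook_meta_command(path, **fields), check=True)


if __name__ == "__main__":
    # Hypothetical file name and values; requires Calibre to be installed.
    apply_metadata("guide.epub", title="User Guide", authors=["A. Writer"])
```

Calling ebook-meta with only a file name and no options prints that file's current metadata, which is a convenient way to check the result.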
Automate electronic publishing using Calibre's command-line utilities
Installing Calibre also installs command-line tools for working with
electronic publications. These tools include the
ebook-convert application for converting
documents and the ebook-meta application for
adding or modifying document metadata. Calibre's command-line utilities
eliminate the need to import documents into your Calibre library and are
convenient if you are integrating document conversion or automatic
metadata insertion into an automated document build or production system.
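As a rough illustration of what that integration might look like, the sketch below generates one ebook-convert call per source file and target format, using the basic form `ebook-convert input_file output_file`. The function names, file names, and default target formats are assumptions for this example, not part of Calibre.

```python
import subprocess
from pathlib import Path


def conversion_jobs(sources, output_dir, targets=("epub", "mobi")):
    """Return one ebook-convert argument list per (source file, target format).

    ebook-convert infers the input and output formats from the file extensions.
    """
    jobs = []
    for source in sources:
        source = Path(source)
        for fmt in targets:
            destination = Path(output_dir) / f"{source.stem}.{fmt}"
            jobs.append(["ebook-convert", str(source), str(destination)])
    return jobs


def run_jobs(jobs):
    """Run each conversion in turn, stopping at the first failure."""
    for job in jobs:
        subprocess.run(job, check=True)


if __name__ == "__main__":
    # Hypothetical build step; requires Calibre and an existing "dist" directory.
    run_jobs(conversion_jobs(["manual.html", "faq.html"], "dist"))
```

Keeping the command construction separate from the code that runs it makes the build step easy to log, dry-run, or unit test without Calibre installed.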
Calibre's command-line utilities provide the same capabilities available
through its GUI, enabling or overriding specific behavior through
command-line options and associated arguments. Each command-line utility
provides a -h option, which displays a list of
the options available for that command. For example, to see a complete
list of the options available for the conversion process, execute the
ebook-convert command; specify the input
document, output document, and associated format (identified by file
extension); and supply the -h option.
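The sketch below shows one way to capture that option list from a script, for example to check whether an option is available before a build depends on it. The helper names are invented for this example, and the regular expression is a simple heuristic that collects anything shaped like a long option from the help text.

```python
import re
import subprocess


def help_command(input_name, output_name):
    """Build the ebook-convert call that lists options for this format pair."""
    return ["ebook-convert", input_name, output_name, "-h"]


def long_options(help_text):
    """Extract the distinct long option names (such as --help) from help text."""
    return sorted(set(re.findall(r"--[a-z0-9][a-z0-9-]*", help_text)))


def available_options(input_name, output_name):
    """Run ebook-convert -h and return the long options it reports."""
    result = subprocess.run(
        help_command(input_name, output_name),
        capture_output=True, text=True, check=True,
    )
    return long_options(result.stdout)


if __name__ == "__main__":
    # Hypothetical file names; requires Calibre to be installed.
    print("\n".join(available_options("manual.pdf", "manual.epub")))
```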
Tips and other tools for electronic document conversion
Calibre provides usable defaults for its conversion processes, but automatic format conversion is rarely perfect. Converting your documents to other formats usually benefits from iterative experimentation with available conversion options, typically involving an initial conversion, displaying the converted document to identify problems or areas for improvement in the conversion process, and re-converting the document after modifying your conversion settings.
The part of the conversion that most often needs tuning is the table of contents: identifying which items belong in it and how links to the associated portions of your document should be created. Figure 4 shows a converted document in Calibre's document previewer, which enables you to explore converted documents without copying them to your eReader device.
Figure 4. A converted document in Calibre's eBook previewer
Other ways of improving the conversion process include using other
tools to perform an initial conversion to another format that may be
easier for Calibre to convert. For example, when converting long technical
documents, I sometimes find that the open source
pdftohtml utility, provided on Linux systems as
part of the poppler-utils package, does a
better job of extracting text from documents in PDF format than Calibre's
built-in text-to-HTML conversion. After converting documents manually with
external tools, you can then use Calibre to generate electronic documents
from the output files that the initial conversion process generates.
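A two-step conversion of this kind could be scripted roughly as follows. This is a sketch under a few assumptions: the function names and the file name are made up, the `-noframes` option asks pdftohtml for a single HTML file rather than a frameset, and the exact name of the file that pdftohtml writes may vary between poppler versions, so check it before relying on the second step.

```python
import subprocess
from pathlib import Path


def two_step_commands(pdf_path, output_format="epub"):
    """Return the pdftohtml call followed by the ebook-convert call."""
    pdf = Path(pdf_path)
    html = pdf.with_suffix(".html")
    ebook = pdf.with_suffix(f".{output_format}")
    return [
        # Step 1: extract the PDF's text into a single HTML file.
        ["pdftohtml", "-noframes", str(pdf), str(html)],
        # Step 2: let Calibre build the eBook from that HTML.
        ["ebook-convert", str(html), str(ebook)],
    ]


def convert_via_html(pdf_path, output_format="epub"):
    """Run both steps in order, stopping if either tool reports failure."""
    for command in two_step_commands(pdf_path, output_format):
        subprocess.run(command, check=True)


if __name__ == "__main__":
    # Hypothetical input; requires poppler-utils and Calibre to be installed.
    convert_via_html("manual.pdf")
```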
Many other open source tools exist for pre-conversion or augmenting and improving the documents that Calibre has converted. For example, Sigil is a graphical WYSIWYG editor for documents in EPUB format that simplifies fine-tuning your converted documents, as shown in Figure 5.
Figure 5. Fine-tuning a converted EPUB document in Sigil
You might feel that editing converted documents reduces the time and
cost savings that Calibre's document conversion capabilities provide.
However, exploring converted documents in tools such as Sigil can also
help identify ways you can improve the conversion process. Similarly, you
can often improve the conversion process by using specific techniques in
the tool in which your electronic publications were originally created. As
an example, the Resources section provides links
to presentations that show how to improve the conversion of electronic
documents that were created in Adobe InDesign®. Though such tips
often focus on internal EPUB generation and conversion capabilities, they
can also be useful when using other conversion tools, such as Calibre.
Like traditional publishers, organizations are finding it convenient and cost-effective to deliver documentation in electronic formats. Today's electronic publishing technologies deliver eBooks and other documents in a variety of formats, not all of which are usable on all eReaders and applications.
Calibre, an open source application, makes it easy to convert documents between different electronic publishing formats. Organizations can create documents in one format and use Calibre to quickly convert them to other formats, making those documents portable and easy for both internal users and customers to use.
Resources
Learn
- See the Calibre website for the latest
version of Calibre, video tutorials, demos, and other up-to-date
information about Calibre.
- Learn more about Nokia's Qt UI Framework.
- If you use InDesign to prepare corporate
documentation, see Preparing your InDesign files for ePub export and Using InDesign to create eBooks for tips on simplifying exporting
and converting your InDesign documents to the EPUB format.
- See the International Digital Publishing Forum site for information about
existing and upcoming standards in the digital publishing industry.
- The Open source
developerWorks zone provides a wealth of information on open
source tools and using open source technologies.
Get products and technologies
- Download the
Calibre installer for Windows operating systems from the Calibre Windows
download page.
- Download the
latest Calibre installer for Mac OS X from the Mac OS X download
page. The latest version of Calibre for Apple PowerPC systems or
Intel-based Apple systems running Mac OS X Tiger is version 0.7.28, which
is separately available from the Calibre site at SourceForge.
- LexCycle's free Stanza application enables you to
read electronic publications on the Apple iPod Touch, iPhone, and iPad
platforms.
- See the
poppler-utils website and package for the pdftohtml utility, which simplifies extracting text from PDF files.
- Sigil is a free
cross-platform WYSIWYG editor for eBooks in EPUB format.
William von Hagen has been a writer and UNIX systems administrator for more than 20 years and a Linux advocate since 1993. Bill is the author or co-author of books on subjects such as Ubuntu Linux, Xen Virtualization, the GNU Compiler Collection (GCC), SUSE Linux, Mac OS X, Linux file systems, and SGML. He has also written numerous articles for Linux and Mac OS X publications and Web sites. You can reach Bill at wvh@vonhagen.org.



