Build a digital book with EPUB

The open XML-based eBook format

Liza Daly, Software Engineer and Owner, Threepress Consulting Inc.
Liza Daly is a software engineer who specializes in applications for the publishing industry. She has been the lead developer on major online products for Oxford University Press, O'Reilly Media, and other publishers. Currently she is an independent consultant and the founder of Threepress, an open source project developing ebook applications.

Summary:  Need to distribute documentation, create an eBook, or just archive your favorite blog posts? EPUB is an open specification for digital books based on familiar technologies like XML, CSS, and XHTML, and EPUB files can be read on portable e-ink devices, mobile phones, and desktop computers. This tutorial explains the EPUB format in detail, demonstrates EPUB validation using Java technology, and moves step-by-step through automating EPUB creation using DocBook and Python.

05 Feb 2009 - As a follow-up to reader comments, the author revised the content of Listing 3 and refreshed the epub-raw-files.zip file (see Downloads).

27 Apr 2010 - Refreshed the epub-raw-files.zip file (see Downloads).

03 Jun 2010 - At author request, revised the content of Listings 3 and 8. Also refreshed the epub-raw-files.zip file (see Downloads).

11 Jan 2011 - At author request, revised the content of Listing 5. Changed second line of code from <item id="ncx" href="toc.ncx" media-type="text/xml"/> to <item id="ncx" href="toc.ncx" media-type="application/x-dtbncx+xml"/>.

12 Jul 2011 - As a follow-up to reader comments, revised the content of Listing 14. Removed ` character near end of first line of code from <?xml version="1.0" encoding="utf-8"?`>. Revised code now reads: <?xml version="1.0" encoding="utf-8"?>.

Date:  13 Jul 2011 (Published 25 Nov 2008)
Level:  Intermediate

Summary

The Python script in the previous section is just a first step in fully automating any kind of EPUB conversion. For the sake of brevity, it does not handle many common cases, such as arbitrarily nested paths, stylesheets, or embedded fonts. Ruby fans can look at dbtoepub, included in the DocBook XSL distribution, for a similar approach in that language.

Because EPUB is a relatively young format, many useful conversion paths still await creation. Fortunately, most types of structured markup, such as reStructuredText or Markdown, have pipelines that produce HTML or XHTML already; adapting those to produce EPUBs should be fairly straightforward, especially using the DocBook-to-EPUB Python or Ruby scripts as a guide.
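
As a rough illustration of how little is needed once a pipeline already emits XHTML, the sketch below packages rendered XHTML strings into a minimal EPUB 2 archive using only the Python standard library. The function name build_epub, the OEBPS directory layout, and the (filename, title, xhtml) tuple shape are assumptions made for this example, not part of any existing tool, and the result has not been checked against a validator here. It assumes chapter filenames are plain names that need no XML escaping.

```python
import zipfile
from xml.sax.saxutils import escape

CONTAINER_XML = """<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
"""


def build_epub(epub_path, title, book_id, chapters):
    """Package already-rendered XHTML into a minimal EPUB 2 archive.

    chapters is a list of (filename, chapter_title, xhtml_text) tuples
    (a hypothetical shape chosen for this sketch).
    """
    manifest, spine, nav_points = [], [], []
    for i, (name, chapter_title, _) in enumerate(chapters, start=1):
        manifest.append(
            '<item id="ch%d" href="%s" media-type="application/xhtml+xml"/>' % (i, name))
        spine.append('<itemref idref="ch%d"/>' % i)
        nav_points.append(
            '<navPoint id="nav%d" playOrder="%d"><navLabel><text>%s</text></navLabel>'
            '<content src="%s"/></navPoint>' % (i, i, escape(chapter_title), name))

    opf = """<?xml version="1.0" encoding="UTF-8"?>
<package xmlns="http://www.idpf.org/2007/opf" version="2.0" unique-identifier="bookid">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:title>%s</dc:title>
    <dc:language>en</dc:language>
    <dc:identifier id="bookid">%s</dc:identifier>
  </metadata>
  <manifest>
    <item id="ncx" href="toc.ncx" media-type="application/x-dtbncx+xml"/>
    %s
  </manifest>
  <spine toc="ncx">
    %s
  </spine>
</package>
""" % (escape(title), escape(book_id), "\n    ".join(manifest), "\n    ".join(spine))

    ncx = """<?xml version="1.0" encoding="UTF-8"?>
<ncx xmlns="http://www.daisy.org/z3986/2005/ncx/" version="2005-1">
  <head>
    <meta name="dtb:uid" content="%s"/>
    <meta name="dtb:depth" content="1"/>
    <meta name="dtb:totalPageCount" content="0"/>
    <meta name="dtb:maxPageNumber" content="0"/>
  </head>
  <docTitle><text>%s</text></docTitle>
  <navMap>
    %s
  </navMap>
</ncx>
""" % (escape(book_id, {'"': "&quot;"}), escape(title), "\n    ".join(nav_points))

    with zipfile.ZipFile(epub_path, "w", zipfile.ZIP_DEFLATED) as epub:
        # The mimetype entry must be the first file and must be stored uncompressed.
        epub.writestr("mimetype", "application/epub+zip",
                      compress_type=zipfile.ZIP_STORED)
        epub.writestr("META-INF/container.xml", CONTAINER_XML)
        epub.writestr("OEBPS/content.opf", opf)
        epub.writestr("OEBPS/toc.ncx", ncx)
        for name, _, xhtml_text in chapters:
            epub.writestr("OEBPS/" + name, xhtml_text)
```

A Markdown or reStructuredText converter would supply the xhtml_text values; a call such as build_epub("notes.epub", "My Notes", "urn:uuid:...", chapters) then writes the archive. Like the script in the previous section, this sketch ignores stylesheets, images, nested paths, and embedded fonts.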

Because EPUB is mostly ZIP and XHTML, there's little reason not to distribute documentation bundles as EPUB archives rather than as simple .zip files. Users with EPUB readers benefit from the additional metadata and automatic table of contents, but those without can simply treat the EPUB archive as a normal ZIP file and view the XHTML contents in a browser. Consider adding EPUB-generating code to any kind of documentation system, such as Javadoc or Perldoc. EPUB is designed for documentation at book length, so it's a perfect distribution format for the increasing number of online or in-progress programming books.
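
The fallback described above, treating an EPUB as an ordinary ZIP file, can be sketched in a few lines of Python. The function name unpack_epub and the list of file suffixes are assumptions for this example; a real reader would instead follow META-INF/container.xml to the OPF manifest to find content documents.

```python
import zipfile

# Suffixes treated as browser-viewable content in this sketch (an assumption).
XHTML_SUFFIXES = (".xhtml", ".html", ".htm")


def unpack_epub(epub_path, dest_dir):
    """Extract an EPUB like any other ZIP file and return its XHTML member names."""
    with zipfile.ZipFile(epub_path) as archive:
        archive.extractall(dest_dir)
        return [name for name in archive.namelist()
                if name.lower().endswith(XHTML_SUFFIXES)]
```

The returned names are paths relative to dest_dir, so each one can be opened directly in a browser after extraction.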
