
Create a printable and searchable version of your online documentation

Use Adobe Acrobat to create a PDF from online information or a Web site

Jennifer Heins (heinsj@us.ibm.com), Advisory Software Engineer, IBM, Software Group
Jennifer Heins is a lead information developer in IBM Pervasive Computing and WebSphere. Besides Web development, writing, and designing samples, she enjoys architecting and implementing efficient Web information design. Since 1998, Jennifer has been designing online information and information interfaces to run on various computing platforms. Her most recent attention has been on developing information and samples for mobile devices. You can contact Jennifer Heins at heinsj@us.ibm.com.

Summary:  Creating a PDF version of your online documentation or Web site is easy and provides a simple way for users to print, view, and search the information online or offline. This article walks you through using Adobe Acrobat to convert your existing Web site or online documentation into a single PDF file that users can print, download, and easily search.

Date:  01 Jun 2002
Level:  Intermediate

If you have online documentation or a Web site with a lot of information on it, chances are you have been asked these questions: How do I print all of the topics? Do you have a PDF version of the information? Creating a PDF version of your online documentation is easy and provides a simple way for users to print and view the information online or offline. A single PDF file also lets users search across all topics at once.

If you followed my previous article, "Make your Web page picture perfect with frames," you learned how to create a Web site or online information center that links users to specific topics or articles on your site (see Resources). What if your users want to access the information offline or want to print all of the topics?

This article walks you through using Adobe Acrobat to convert your existing Web site or online documentation into one PDF file that users can print, download, and easily search.

Getting started

You only need three things to get started:

  1. Access to all the information topics that you want to include in the PDF
  2. An HTML file with links to all the topics that you want in the PDF
  3. Adobe Acrobat 4.0 or higher (not Adobe Acrobat Reader)

An HTML table of contents

First, make sure that you can access all the information topics that you want in your PDF. Then, before creating your PDF, you need an HTML file that has links to the topics and articles you want to include in your PDF. I call this HTML file htmltoc. Adobe Acrobat uses this file as a guide for creating your PDF. The file must be in HTML format for Adobe Acrobat to import it. You may already have a table of contents file or site map with the appropriate links, but it may not be in HTML, or it may contain a lot of JavaScript. If your existing table of contents is plain HTML, use it; otherwise, create a new htmltoc file. Make sure the links in the htmltoc file are in the same order that you want the topics to appear in the PDF. I have included a sample htmltoc file in the zip file accompanying this article (see Resources).
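If you need to create the htmltoc file from scratch, a short script can write it for you. The following is a minimal sketch in Python and is not part of the sample zip file. The function name build_htmltoc and the topic file names other than welcome.html are hypothetical. It emits plain HTML links, with no JavaScript, in the order given.

```python
from html import escape


def build_htmltoc(title, topics):
    """Return the HTML for a plain table-of-contents page.

    topics is a list of (href, link_text) pairs, already in the order
    in which the topics should appear in the PDF.
    """
    items = "\n".join(
        f'    <li><a href="{escape(href, quote=True)}">{escape(text)}</a></li>'
        for href, text in topics
    )
    return (
        "<html>\n"
        f"<head><title>{escape(title)}</title></head>\n"
        "<body>\n"
        "  <ol>\n"
        f"{items}\n"
        "  </ol>\n"
        "</body>\n"
        "</html>\n"
    )


if __name__ == "__main__":
    # Hypothetical topic list; replace it with your own files.
    topics = [
        ("welcome.html", "Welcome"),
        ("install.html", "Installing"),
        ("reference.html", "Reference"),
    ]
    with open("htmltoc.html", "w", encoding="utf-8") as out:
        out.write(build_htmltoc("Table of contents", topics))
```

Keeping the topic list in one place makes it easy to reorder topics and regenerate the file before each PDF build.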

I will use an online documentation example. I assume that an htmltoc file exists and contains the table of contents links for the PDF. Figure 1 shows an example of online documentation that I will convert to PDF. Once you have the htmltoc file, you are ready to move on to the next step. For this example, I am using a file named htmltoc.html.
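When the files are on your local machine, it can help to confirm that everything the htmltoc file points to actually exists before you move on. The following Python sketch is one possible check, under the assumption that the links are relative paths next to htmltoc.html. The class and function names are hypothetical, and the sketch skips anything with a URL scheme (such as http:) rather than fetching it.

```python
from html.parser import HTMLParser
from pathlib import Path
from urllib.parse import unquote, urlparse


class LinkCollector(HTMLParser):
    """Collect <a href> and <img src> values in document order."""

    def __init__(self):
        super().__init__()
        self.targets = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            wanted = (tag == "a" and name == "href") or (tag == "img" and name == "src")
            if wanted and value:
                self.targets.append(value)


def missing_local_targets(toc_path):
    """Return the relative links in the htmltoc file whose files do not exist."""
    toc = Path(toc_path)
    collector = LinkCollector()
    collector.feed(toc.read_text(encoding="utf-8"))
    missing = []
    for target in collector.targets:
        parsed = urlparse(target)
        # Skip remote URLs, mailto: links, and same-page #anchors.
        if parsed.scheme or parsed.netloc or not parsed.path:
            continue
        if not (toc.parent / unquote(parsed.path)).exists():
            missing.append(target)
    return missing


if __name__ == "__main__":
    for target in missing_local_targets("htmltoc.html"):
        print("Missing:", target)
```

This only checks the files that htmltoc.html links to directly; graphics referenced from the topic files themselves would need the same check run on each topic.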


Figure 1. Online documentation

Adobe Acrobat

Once you have an htmltoc file, you need a copy of Adobe Acrobat 4.0 or higher. Version 4.0 of Adobe Acrobat includes a feature that allows you to import Web pages. This feature works well for any information that is in HTML format, such as online documentation. Adobe Acrobat provides many settings that affect the imported HTML pages. Before actually creating the PDF file, I suggest that you make the following initial configuration changes to Adobe Acrobat.


Configuring Adobe Acrobat

The first time you create a PDF from HTML you should configure some settings in Adobe Acrobat. Acrobat saves these settings when you create the first PDF. You only need to configure these settings when you have a new installation of Adobe Acrobat. The following steps explain why and how to configure Adobe Acrobat for creating PDFs from HTML.

  1. Launch Adobe Acrobat and select File -> Open Web Page from the menu.
    Figure 2. Open Web Page
  2. Click Conversion Settings to access the configuration settings.
  3. Conversion Settings allows you to configure how the HTML pages will appear when converted to PDF. Figure 3 shows an example of the screen including recommended settings.
    Figure 3. Conversion Settings - General

    On the General tab, under General Settings for Generated PDF, select one or more of the following options:

    1. Create Bookmarks to New Content: Enable this check box to include left-hand navigation with bookmarks in your PDF file.

      The text used for the bookmarks comes from the text in the <title></title> tag in each HTML file that is imported into the PDF. Once the PDF is created, you can edit, move, and nest the bookmarks. However, any manual manipulation must be done each time you create the PDF. Typically, this is why I create the PDF without bookmarks as I will demonstrate in this article. Later on, you can try building the PDF with and without bookmarks to determine which is most appropriate for your information.

    2. Put Headers and Footers on New Page: Enable this check box to put the Web server, directory, and filename of each article imported into the PDF at the bottom of every page.

      You may or may not want to expose this information to users. If you want readers to be able to find the exact page on your Web site later, the Web address may be useful to show. However, if you are creating the PDF from files on your local machine or a private Web server, you may not want this information displayed. When this option is enabled, the page numbers in the PDF can be misleading -- the page numbers represent each imported article rather than the entire PDF file.

    3. Add PDF Structure: I do not usually select this and have not seen a difference whether selected or not.
    4. Save Refresh Commands: I do not usually select this and have not seen a difference whether selected or not.
  4. On the General tab, select HTML under Content-Type Specific Settings and click Settings to open the HTML Conversion Settings window. These settings allow you to control layout, color, and fonts.
    Figure 4. HTML Conversion Settings - Layout

    On the Layout tab of the HTML Conversion Settings window, select one or more of the following options:

    1. Default Colors: I suggest leaving the default colors. You can always return later to make additional changes to these settings.
    2. I strongly suggest that you enable the Force These Settings for All Pages check box. This is important because the HTML files you import may have a background color other than white, and Adobe Acrobat imports this color unless you tell it not to. You may want to keep the background color, but for this example I want consistency. For example, htmltoc.html has a blue background, and the rest of the files that will be imported into the PDF have white backgrounds.
    3. Leave all the other default settings as they are shown in Figure 4.

  5. On the HTML Conversion Settings window, click the Fonts tab to configure the fonts and select font sizes.
    Figure 5. HTML Conversion Settings - Fonts

    For each font type, click Choose Font. Configure the fonts and sizes to resemble your online documentation or Web site. The online documentation in this example uses a sans-serif font and I want to maintain a similar font in the PDF. Figure 5 shows the settings I have chosen.

  6. To return to the Conversion Settings window when you are done with the layout and font settings, click OK. Your changes are saved for future use.
  7. You are almost done. The last thing I suggest doing is updating the default margins that Adobe Acrobat supplies. In the Conversion Settings window, select the Page Layout tab to set the margins for the PDF.
    Figure 6. Conversion Settings - Page Layout
    1. Margins: I think the default margins are too small and should be increased for easier reading. Set the margins as desired. Figure 6 provides recommendations.
    2. Scaling: Scale Wide Contents to Fit Page adjusts the size of text to make it fit on a page. A good example is if you are using <pre></pre> tags to show code samples. Text within <pre></pre> tags does not wrap to the next line, so Adobe Acrobat attempts to reduce the size of the text until it fits on the page. Auto-Switch to Landscape if Scaling Smaller than 70% turns the page orientation to landscape if Adobe Acrobat has to scale the text to less than 70% (or whatever value you enter) of its original size. For this example, keep these options. You may want to change them later if your information contains many code samples or long directory names.
  8. These are all the changes you need to make to get started. Click OK to return to the Open Web Page window, and leave that window open for the next section.
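The scaling rule described in step 7 comes down to a little arithmetic. The following Python sketch only illustrates that rule as it is described above; it is not Acrobat's actual layout code. The function name and the widths (in points, roughly a letter-size page with half-inch margins) are assumptions, as is the choice to refit the content to the wider landscape page after switching.

```python
def plan_page(content_width, portrait_width, landscape_width, threshold=0.70):
    """Return (orientation, scale) for content of the given width.

    Widths are printable widths in the same unit. The threshold plays
    the role of the 70% value in the Auto-Switch to Landscape option.
    """
    # Shrink only when the content is wider than the page.
    scale = min(1.0, portrait_width / content_width)
    if scale < threshold:
        # Assumption: after switching, the content is refitted to the
        # wider landscape page.
        return "landscape", min(1.0, landscape_width / content_width)
    return "portrait", scale


if __name__ == "__main__":
    # Hypothetical printable widths: 540 pt portrait, 720 pt landscape.
    for width in (400, 600, 900):
        print(width, plan_page(width, 540, 720))
```

Under these assumed numbers, a <pre> block a little wider than the page is simply shrunk, while one that would need to shrink below 70% triggers the switch to landscape.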

Creating a PDF

Now that you have configured Adobe Acrobat, you are ready to build your first PDF file. The following steps explain how to do it.

  1. Before you begin:
    1. Make sure all the files, including graphics, are on your local machine or on a Web server. Adobe Acrobat informs you if files are missing, so it also serves as a good link checker.
    2. Make sure the htmltoc file is ready. This is what you use to create the PDF. See Getting started for more information.
    3. Make sure you have configured Adobe Acrobat with the proper settings to create a PDF file from HTML files. See Configuring Adobe Acrobat for more information.
  2. If Adobe Acrobat is not already open, open it.
  3. Select File -> Open Web Page from the menu, if it is not already open.
  4. Enter the URL or location of the htmltoc file that will serve as a guide for importing topics into the PDF. The topic files can reside on a Web server or on your local machine. If the files are on your local machine, use the exact syntax as in the following examples or the PDF will not build.
    • Information on Web server: http://yoursite.yourco.com/info/htmltoc.html
    • Information on local machine: file:///C|/htmlltopdf/info/htmltoc.html

    This syntax is the same syntax that Netscape uses when opening a local file.

  5. On the Open Web Page window under Settings, select Levels and enter 2.
    Level 1 imports the HTML file you are pointing to. In this example, it would only import htmltoc.html. Level 2 imports htmltoc.html (level 1) and all the files that htmltoc.html links to (level 2), such as welcome.html. Level 3 imports htmltoc.html (level 1), all the files that htmltoc.html links to (level 2), and all the files that the level 2 files link to. If you choose more than two levels, you risk pulling in external links that may be on your pages. For example, I have a link to http://www.ibm.com/developerworks on welcome.html. You may not want to import external links into your PDF, so be aware of this.
    Note: Levels resets to 1 each time you open Adobe Acrobat. To save time, set the value for Levels before creating the PDF. All other settings are saved.
  6. Click Download to start importing files. Adobe Acrobat tells you if it can't find a file.
    Figure 7. Open Web Page
  7. Once the PDF is created you can save the file and review it page by page. You can view an example PDF file created from the sample online documentation.
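One way to picture the Levels setting from step 5 is as a breadth-first walk over your links, where the starting file is level 1. The following Python sketch illustrates the counting only and is not how Adobe Acrobat itself is implemented. The link map is a made-up stand-in for the sample documentation: only welcome.html and its developerWorks link come from the example above, and install.html and the function names are hypothetical.

```python
from collections import deque
from urllib.parse import urlparse


def pages_for_levels(start, links, levels):
    """Return {page: level} for every page reached within the given levels.

    links maps each page to the list of pages it links to.
    """
    seen = {start: 1}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        if seen[page] >= levels:
            continue  # do not follow links beyond the last level
        for target in links.get(page, []):
            if target not in seen:
                seen[target] = seen[page] + 1
                queue.append(target)
    return seen


def is_external(url):
    """Treat anything with an http or https scheme as an external link."""
    return urlparse(url).scheme in ("http", "https")


if __name__ == "__main__":
    # Hypothetical link map for a small documentation set.
    links = {
        "htmltoc.html": ["welcome.html", "install.html"],
        "welcome.html": ["http://www.ibm.com/developerworks"],
    }
    for levels in (1, 2, 3):
        pages = pages_for_levels("htmltoc.html", links, levels)
        external = [p for p in pages if is_external(p)]
        print(levels, sorted(pages), "external:", external)
```

With this link map, the external developerWorks page only appears once Levels reaches 3, which is why the example in this article stops at 2.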

Adding a title to your first page

You may have noticed that the first page of the PDF with your table of contents is not very attractive. It does not say what the information is about and does not have any pretty graphics on it. This section provides some tips to add a front page with a title and graphic.

I suggest adding a PDF heading section that contains a title for the PDF and a graphic. This section should remain in the file (commented out) at all times and only be used when creating a PDF. When you want to update the PDF file, simply remove the comment tags to reveal the title and graphic. When you are done creating the PDF, just comment out the title section again. Using comment tags allows you to store the title section in the file so you do not need to recreate it each time you update your PDF. Listing 1 shows a code sample of what can be added as the first line inside the htmltoc file BODY.


Listing 1. Hidden title to use on your PDF
<!--START PDF TITLE SECTION-->
<!--
<h2>[Your Product] Version [x.x]</h2>
<hr>
<p>(c) Copyright 2000, 2001 [Your Company]. All rights reserved.</p>
<hr>
<p><img src="companylogo.gif" align="right"/></p>
-->
<!--END PDF TITLE SECTION-->

You can view an example PDF file including the PDF title section.
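If you rebuild the PDF often, the step of removing and restoring the comment tags could be scripted. The following Python sketch is one way to do that, assuming the htmltoc file contains exactly the START and END marker comments shown in Listing 1. The function names are hypothetical, and the sketch assumes the title section itself contains no other HTML comments, because HTML comments cannot be nested.

```python
import re

START = "<!--START PDF TITLE SECTION-->"
END = "<!--END PDF TITLE SECTION-->"

# Everything between the two marker comments, across line breaks.
_SECTION = re.compile(re.escape(START) + r"(.*?)" + re.escape(END), re.DOTALL)


def _is_commented(inner):
    return inner.startswith("<!--") and inner.endswith("-->")


def reveal_title(html):
    """Remove the comment tags around the title section, if present."""
    def repl(match):
        inner = match.group(1).strip()
        if _is_commented(inner):
            inner = inner[len("<!--"):-len("-->")].strip()
        return f"{START}\n{inner}\n{END}"
    return _SECTION.sub(repl, html, count=1)


def hide_title(html):
    """Wrap the title section in comment tags, unless it already is."""
    def repl(match):
        inner = match.group(1).strip()
        if not _is_commented(inner):
            inner = f"<!--\n{inner}\n-->"
        return f"{START}\n{inner}\n{END}"
    return _SECTION.sub(repl, html, count=1)


if __name__ == "__main__":
    with open("htmltoc.html", encoding="utf-8") as src:
        page = src.read()
    # Write a separate copy to import into Acrobat, leaving the original
    # file with its title section still commented out.
    with open("htmltoc-pdf.html", "w", encoding="utf-8") as out:
        out.write(reveal_title(page))
```

Writing the revealed version to a separate file (htmltoc-pdf.html is an arbitrary name) means the online copy never shows the title section by accident.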


Conclusion

You have just created a printable, viewable, and searchable version of the example information. I hope this article has provided you with enough information to get started and some tips to pursue. Many other Adobe Acrobat features, such as bookmarks, are not covered in this article but may be useful. Now that you can create a PDF version of your information, don't forget to add a link on your Web site or online documentation to your new printable and searchable PDF file.



Download

Description:  Sample code for this article
Name:  wa-web2pdf/htmltopdf-sample.zip
Size:  26.4 KB
Download method:  HTTP



Resources

  • Download the source code for this article, including an online documentation example and PDFs, in a zip format. Then, open index.html to view the online documentation example and use htmltoc.html to create a PDF.

  • To obtain Adobe Acrobat, go to http://www.adobe.com.

  • If you like the sample online documentation provided in this article, refer to "Make your Web page picture perfect with frames" (developerWorks, April 2001) to learn more about using frames to create online documentation.

