Driving more traffic to your WebSphere Commerce site

Using search engine optimization techniques

Some of the most effective methods to drive more traffic to your site are using free search engines and indexing your Web site properly. This article covers simple techniques you can use to increase online customer visits to your WebSphere® Commerce site.


Darshanand Khusial (dkhusial@ca.ibm.com), Lead Architect, IBM

Darshanand Khusial is the lead architect for WebSphere Commerce in IBM Software Group at the IBM Toronto lab. He has a Master of Engineering degree from the University of Toronto, Canada.



Walfrey Ng (walfreyn@ca.ibm.com), Software Engineer, IBM

Walfrey Ng is a software engineer at the IBM Toronto Lab, Ontario, Canada. He is a team leader for the Component Foundry Solution Development team, and the solution owner for the WebSphere Commerce Gift Center.



28 September 2005

Introduction

You are probably in the process of building or hosting a site targeted to the general public. Having used WebSphere Commerce, you have taken advantage of its numerous features to sell hundreds or even thousands of items online. With the WebSphere Commerce infrastructure, your site is ready to scale to millions of customers, serving them in sub-second response time. Like any merchant, one of your key goals is to get customers to "step" into your store and to impress them with a wide selection of goods at great prices. In this article, we show you simple techniques that you can use to increase the number of customer visits.

To attract customers to your site, you can distribute flyers, run email campaigns, or advertise on television. Distributing flyers that offer discounts on a select set of items is a good option to attract customers to your site. The problem with flyers is that they target a few thousand individuals, but you built an online presence to attract millions of customers from the Internet. You may employ a third party to run an email campaign to reach the millions of online shoppers, but any good email filter routes the email promotion to the trash can. An expensive alternative to reach the millions of potential customers is to advertise on television. There is a simpler option. By making your site search engine friendly, you can use a free and effective "marketing" technique to reach millions of online shoppers.


The incentive for a search friendly site

Let's investigate the stages of the buying decision process to determine why it is important to make your site consumable by search engines. Below are the phases a typical customer goes through to purchase a product:

  1. Desire for product
  2. Perform research
  3. Comparison of alternatives
  4. Product purchase
  5. Post-purchase behavior

In the first phase, desire for product, a search engine does not create a need or wish in the customer's mind. Additionally, the search engine does not play a significant role in the post-purchase behavior phase. However, the middle three phases are where the value of a search engine can shine.

Many online users set a search engine page as their home page. Mozilla Firefox, an alternative to Microsoft® Internet Explorer, builds the Google® search toolbar into its graphical user interface (GUI). One of the main reasons for this is that the disorganization of the Internet's vast resources causes online users to perform a search to find the information they need. During the middle three phases of the buying process, a search engine is heavily employed. When an online customer buys a product, he will most likely research the product using his favorite search engine. If he is looking for an office chair, he types "office chair" into the search engine. After browsing various office chairs for a while, he narrows down his choice to a specific type of office chair, such as "wooden office chair". He subsequently employs his search engine to compare various wooden office chairs.

At the next stage when the customer is about to make the purchase and is shopping for the best price and speed of delivery, he uses his search engine. During each of these three phases of research, comparison, and purchasing, sites that are returned by the search engine start to stick in the customer's memory. More importantly, sites appearing near the top of the search results have an automatic credibility factor, such as the site's reputation, that is established in the customer's mind. Having your site appear at the top of search results of any phase is an important marketing strategy for your site.


Constructing search friendly sites

During the research, comparison, or purchasing phases, sites can pay for advertisements that are displayed on the search engine result page. However, having your site appear as part of the search result, instead of as an advertisement on a search result page, is more valuable. If you are looking for a simple introduction on how to optimize your site, or if you have neither the resources to employ a search consultant nor the desire to read an in-depth book on optimizing your site for search engines, then this article provides the steps to make your WebSphere Commerce site search friendly. The techniques described do not require an expert object-oriented programmer, but do require familiarity with JSP programming.

We first take you through an analysis of your site to determine how well it is being indexed by search engines. Next, we describe how to construct pages that are search engine friendly. Additionally, we show you how you can use the WebSphere Commerce infrastructure to optimize the searching of your site. Also, we look at various artifacts on a site that cause a search engine to choke. Finally, we look at some tools that can help you develop search engine friendly sites.


Querying the search engine's knowledge

It is important to understand what the search engine knows about your site because this determines how much rework is needed to make it search friendly. There are two key pieces of information that you need:

  • One is how many pages the search engine keeps in its local repository about your site. This is known as the search engine’s index of your site.
  • The next vital piece of information is the number of third-party sites the search engine has detected that link to your site.

To determine how well your site is being indexed by search engines, use popular search engines from Google, Yahoo!®, and Microsoft. Search on the domain of your site to see how many pages are indexed by the respective engines. You can search for the domain through the advanced features link on the search engine’s main page. Figure 1 shows a shortcut to this on Google. Enter site:<hostname> in the Google text field and submit the form. The results page tells you the number of pages indexed by the search engine. Compare the number of pages indexed by the search engine against the actual number of pages on your site. Do not be surprised if the number of pages returned by the search engines varies widely, because each engine has its own algorithm for indexing pages and the engines differ in their capabilities.

Some engines handle multiple URL parameters, whereas others cannot. If the number of pages indexed by the engine is significantly lower than the actual number of pages on your site, the techniques described below can help. If the number of pages indexed for your site is much higher than the actual number of physical pages on your site, the search engine may be keeping duplicate copies of the same page. Ensure that your site is not adding unique URL parameters on a per-session basis, which causes the search engine to reference the same page under two or more different URL entries.
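
As a rough illustration of the duplicate-page problem, the following Python sketch collapses URLs that differ only in per-session parameters so that you can count how many distinct pages a list of indexed URLs really contains. The parameter names in SESSION_PARAMS and the example.com host are assumptions made for this example; substitute whatever your site actually appends.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical names of per-session parameters; adjust to match your site.
SESSION_PARAMS = {"jsessionid", "sessionid", "sid"}


def canonical_url(url):
    """Return the URL without per-session parameters, with the rest sorted."""
    parts = urlsplit(url)
    # Drop a ";jsessionid=..." style path parameter if one is present.
    path = parts.path.split(";", 1)[0]
    query = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in SESSION_PARAMS
    )
    return urlunsplit((parts.scheme, parts.netloc, path, urlencode(query), ""))


def count_distinct_pages(urls):
    """Count the pages that remain once session-only differences are removed."""
    return len({canonical_url(url) for url in urls})
```

If the count returned for the URLs a search engine lists is much lower than the number of URLs themselves, session parameters are a likely cause of the inflated index.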

Figure 1. Google search engine

Another important piece of information you can gather is the number of sites that link to your site. The greater the number of referring sites, the better the chances of your site being returned in a search result. In both Google and MSN, you obtain this information by entering link:<hostname> in the search engine's input field. This is shown using MSN Search in Figure 2. Perform the same search on your competitor's Web site to see its link count. If it is significantly higher, look at the third-party sites that link to the competitor's site. You may discover that your competitor is listed as a local business on your home city's Web site, but you are not. You can easily correct this situation with a phone call.

Figure 2. MSN search engine

Targeting page content to the search engine

Now that you know what the search engine knows about your Web site, how do you go about telling it more? Recall that the search engine crawls a large set of pages in a small time frame. The ranking of your pages in the search results depends largely on how you optimize the content of a page. When a search engine analyzes a page, some specific locations are given more weight than others. It is important to put appropriate keywords in these locations.

The search engine treats the first 200-300 words on a page as important and skips complicated content. Therefore, keep your pages simple and descriptive. The following aspects of your page play a key role in determining the rank of the page:

  • Page title: The page title, placed in the <title>...</title> element, is the text that appears in a browser title bar. Use the title to give a short, but precise description of the page’s content. A title, such as "Company XYZ: Category – Chairs", is not as meaningful as "Office Chairs". In the original title, the first three words do not describe the page's content. When you do a search on a keyword, such as "office chairs" in your favorite search engine, the top results are the ones that have the title of "office chairs" at the beginning.
  • Page heading: In the body of the page, the <H1> element is used to highlight the key text in the page. Use it to expand on the page title and to give a short description of the content. For example, use "Office Chairs – Top quality brands such as".
  • First sentence of the body: In the first sentence, carefully craft words that describe the purpose of the page. These words can attract visitors to click on this entry in the search results page.
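
The three locations above can be checked mechanically. The following Python sketch, which uses only the standard library, extracts the title, the first-level heading, and the first sentence of the body from a page and reports whether a target keyword appears in each. The class and function names are made up for this example, and treating everything up to the first period as the "first sentence" is a simplifying assumption.

```python
from html.parser import HTMLParser

TRACKED = ("title", "h1", "body", "script", "style")


class KeyLocations(HTMLParser):
    """Collect the text found in <title>, in <h1>, and directly in the body."""

    def __init__(self):
        super().__init__()
        self.title, self.heading, self.body = [], [], []
        self._open = []  # stack of tracked elements that are currently open

    def handle_starttag(self, tag, attrs):
        if tag in TRACKED:
            self._open.append(tag)

    def handle_endtag(self, tag):
        if self._open and self._open[-1] == tag:
            self._open.pop()

    def handle_data(self, data):
        if not self._open:
            return
        current = self._open[-1]
        if current == "title":
            self.title.append(data)
        elif current == "h1":
            self.heading.append(data)
        elif current == "body":  # script and style text is ignored
            self.body.append(data)


def keyword_report(html, keyword):
    """Report whether the keyword appears in the three key page locations."""
    parser = KeyLocations()
    parser.feed(html)
    parser.close()
    keyword = keyword.lower()
    title = " ".join("".join(parser.title).split()).lower()
    heading = " ".join("".join(parser.heading).split()).lower()
    body = " ".join(" ".join(parser.body).split()).lower()
    first_sentence = body.split(".", 1)[0]
    return {
        "title_starts_with_keyword": title.startswith(keyword),
        "heading_contains_keyword": keyword in heading,
        "first_sentence_contains_keyword": keyword in first_sentence,
    }
```

Run against a page titled "Company XYZ: Category – Chairs" with the keyword "office chairs", the first entry of the report is False, which matches the advice to lead the title with the descriptive words.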

Work with your marketing manager to analyze the keywords that potential customers will use to search for your products and services. Use these keywords in the page title, page heading, and first sentence of the page's body. Consider building a tool that lets the marketing manager change these key attributes of the site, instead of having a developer make the changes. If building a tool is unreasonable, you may want to change your pages so that the key locations described above are populated from separate property files, which your marketing manager can modify more easily. The following section describes how to use the tools provided by WebSphere Commerce and how to modify the page.


Facilitating page optimization

WebSphere Commerce facilitates the optimization of your page content by providing the Change Pages function and the Product Management tool. These features allow marketing personnel to easily update text in these locations:

  • Store Home page - Store the description in the dynamic properties file so that the marketing manager can use the Change Pages function in WebSphere Commerce Accelerator to modify the text.
  • Catalog page - For the product and category pages, use the appropriate fields in the WebSphere Commerce CATENTDESC table to store the words in different page locations. Below are some examples:
    • For the page title, use the Short Description field.
    • In the heading, use the first Auxiliary Description field.
    • For the meta tag description, for example, <meta name="description" content="">, use the second Auxiliary Description field.
    • In the body, use the Long Description field.

For more details, see Adding dynamic text and Changing store information.


Guiding the search engine throughout your site

Web pages designed for human visitors are not always friendly to crawlers. There are a number of site design techniques you can use to make the search engine’s time at your site both easy and meaningful:

  • Use a site map to lead the crawler around your site.
  • Cross-reference pages in your site to facilitate easy transition between related pages.
  • Scope out the areas of your site that are searchable by using a robots.txt file. This file hints to the search engine that it should not index particular pages, such as error pages.

Building a site map for easy navigation

If it takes many clicks from the home page to reach the inner pages of your site, the search engine takes a long time to reflect the updated content of these inner pages in its local index repository. It can take several days before your updated content is returned as part of a search result. A site map facilitates deep crawling so that crawlers can easily visit and index pages deeply embedded within the site. It helps to optimize the entire structure of the site for a crawler by providing an alternate set of Web pages built specifically so that crawlers can quickly access and index a large number of embedded pages.

A site map home page lists the top categories, and also the categories under the top categories, such as 2nd and 3rd level categories. This minimizes the number of clicks required for the crawler to get to any page on the site from the site map. As a result, it reduces the time that it takes for the content to refresh in the search engine. As new content and product information are added to the site, the site map helps the search engine to index this new content quickly.
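
To make the idea concrete, here is a small Python sketch that turns a category tree into a flat list of site map links, stopping at the third category level. The tree layout (dictionaries with name, url, and children keys) and the URLs in it are assumptions made for this example; a real store would read its categories from the catalog.

```python
from html import escape


def site_map_links(categories, max_depth=3, depth=1):
    """Yield (depth, name, url) for every category down to max_depth levels."""
    for category in categories:
        yield depth, category["name"], category["url"]
        if depth < max_depth:
            yield from site_map_links(
                category.get("children", []), max_depth, depth + 1
            )


def render_site_map(categories, max_depth=3):
    """Render the links as a plain HTML list that a crawler can follow."""
    lines = ["<ul>"]
    for depth, name, url in site_map_links(categories, max_depth):
        lines.append(
            '<li class="level%d"><a href="%s">%s</a></li>'
            % (depth, escape(url, quote=True), escape(name))
        )
    lines.append("</ul>")
    return "\n".join(lines)
```

Because every listed category is one click away from the site map page, a crawler that starts there reaches third-level categories without walking the navigation hierarchy.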

Cross referencing for quick transition

Pages with links back to the top-level pages help with crawling. These links also provide a cross-referencing effect that helps your pages obtain a high ranking. Ensure that your pages have links back to top-level pages for easy crawling.

Scoping the searchable area of your site

Usually on a site, there are pages that are not useful for a search engine to index, such as files in the scripts directory, administrative pages, error pages, and so on. In the case of an error page, it may be linked to a valid URL in the search engine local repository. Imagine a new shopper's impression of your site when he clicks a link from a search result and he is presented with an error message. Additionally, search engines do not like to index pages containing duplicate content. If your site has multiple URLs leading to pages that look exactly the same, such as a set of pages to be cached through edge caching, you may also want to prevent the search engine from indexing multiple copies.

Use the robots exclusion protocol to scope the searchable surface of your site. To do this, create a file called robots.txt and place it in your Web server's root directory. The robots.txt file has two main directives. The first is the User-agent directive, which you can use to target a specific search engine. Following each User-agent directive, specify one or more Disallow directives to indicate, for the specific User-agent, the areas of the site the crawler should not visit.

Here is an example of the robots.txt file that disallows all search engines from indexing the administrative pages and the files in the scripts directory:

User-agent: *
Disallow: /Admin
Disallow: /Scripts

When creating the robots.txt file, make sure that it is created correctly. A mistake in the robots.txt file can accidentally disallow spiders from crawling the entire site, resulting in no indexed pages. For more details, see Robots Exclusion Protocol.
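
One way to guard against such mistakes is to test the file before deploying it. The following Python sketch uses the standard library's urllib.robotparser module to report which paths a given robots.txt blocks. The helper name blocked_paths and the sample paths are assumptions made for this example.

```python
from urllib.robotparser import RobotFileParser

# The example file shown above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /Admin
Disallow: /Scripts
"""


def blocked_paths(robots_txt, paths, user_agent="*"):
    """Return the paths that the given robots.txt content tells a crawler to skip."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [path for path in paths if not parser.can_fetch(user_agent, path)]
```

Checking a handful of product and category paths in this way shows immediately whether a stray "Disallow: /" line has closed the whole site to crawlers.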


Using flat URLs instead of dynamic URLs

To help search engines index your site, WebSphere Commerce provides a feature, URL Mapping, that takes a dynamic URL and converts it into a static-looking URL. As an example, you can convert a typical dynamic URL, such as:

webapp/wcs/stores/servlet/ProductDisplay?storeId=10001&catalogId=10001&productId=10032&langId=-1

To a static looking URL, such as:

webapp/wcs/stores/servlet/product_10001_10001_10032_-1

In this example, the ProductDisplay command name was mapped to product, and each URL parameter value was appended to the product string in sequential order. Figure 3 shows how this renders in a browser.

Figure 3. Rendered flattened URL

Configuring the URL mapping in WebSphere Commerce

Instead of converting your entire site to static URLs, pick the pages that you want a crawler to index. Typical choices are the category display and product display pages. You can then use the URL Static Converter tool provided by WebSphere Commerce to convert the links in these pages to static URLs. Next, enable URL mapping in WebSphere Commerce so that the WebSphere Commerce Server can convert these flattened URLs back to dynamic links for business logic execution. WebSphere Commerce provides a default mapping file that you can customize with your own commands or parameters.

Here is a snippet of the mapping file:

<pathInfo_mappings separator="_" subdirectory="SiteMap">
    <pathInfo_mapping name="product" requestName="ProductDisplay">
        <parameter name="storeId" />
        <parameter name="catalogId" />
        <parameter name="productId" />
        <parameter name="langId" />
        <parameter name="parent_category_rn" />
    </pathInfo_mapping>
</pathInfo_mappings>

With this mapping, WebSphere Commerce maps the original dynamic URL:

/ProductDisplay?storeId=10001&catalogId=10051&productId=10032&langId=-1

To this static URL:

/product_10001_10051_10032_-1
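
The conversion in both directions can be sketched in a few lines of Python. This is only an illustration of the idea, not the WebSphere Commerce implementation: the MAPPINGS table mirrors the product entry from the mapping snippet, and the function names are made up for this example. The sketch also assumes that parameter values never contain the separator character.

```python
from urllib.parse import urlsplit, parse_qs, urlencode

SEPARATOR = "_"
# Flattened name -> (command name, parameters in the order they are appended).
MAPPINGS = {
    "product": (
        "ProductDisplay",
        ["storeId", "catalogId", "productId", "langId", "parent_category_rn"],
    ),
}


def flatten(dynamic_url):
    """Convert a dynamic URL into its static-looking form."""
    parts = urlsplit(dynamic_url)
    command = parts.path.rsplit("/", 1)[-1]
    prefix = parts.path[: len(parts.path) - len(command)]
    query = parse_qs(parts.query)
    for name, (request_name, params) in MAPPINGS.items():
        if request_name == command:
            values = []
            for param in params:
                if param not in query:
                    break  # trailing parameters may be omitted
                values.append(query[param][0])
            return prefix + SEPARATOR.join([name] + values)
    return dynamic_url  # no mapping defined for this command


def unflatten(static_url):
    """Convert a flattened URL back into the dynamic URL it stands for."""
    prefix, _, last = static_url.rpartition("/")
    name, *values = last.split(SEPARATOR)
    if name not in MAPPINGS:
        return static_url
    request_name, params = MAPPINGS[name]
    return prefix + "/" + request_name + "?" + urlencode(list(zip(params, values)))
```

Because the values are positional, the order of the parameter entries in the mapping determines how a flattened URL is read back, so the two sides must agree on it.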

For more information, see Optimizing your site for search engines.


Avoiding certain site design techniques

A search engine crawls a large set of pages efficiently in a small time frame to build its index. To accomplish this task, the current generation of search engines makes some compromises. Complicated pages or pages with hidden agendas can cause the search engine to choke and may result in the search engine not indexing the site. In the rest of the section, we discuss site design techniques to avoid.

Pages containing pure HTML are easy to parse and understand. Pages complicated by JavaScript, frames, and Macromedia® Flash are difficult for the search engine to parse. If you need to put JavaScript in your pages, put it into a separate file and include this file in your page. If this is not possible, ensure that the JavaScript does not take up the prime real estate of the first 200-300 words in the page. The quick solution is to move the JavaScript to the bottom of the page. Avoid using frames; if you have to use them, ensure that the <NOFRAMES> element is descriptive. For Macromedia Flash, a search engine SDK is provided by Macromedia to help convert your Flash-enabled pages to search friendly content. You can use languages such as JavaScript to make a page appealing. However, to be search friendly, ensure that most of the pure HTML content is at the top of the page. Additionally, keep pages under 100 KB: break up larger pages and keep descriptive content in the top main portion of each page.
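
A quick audit of these guidelines can be scripted. The Python sketch below reports whether a page is within the 100 KB guideline, how many words of inline script or style content appear before the first visible word, and what the leading visible words are. The names are made up for this example, and the word-based measures only approximate what a crawler sees.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collect visible words and count script words that precede them."""

    def __init__(self):
        super().__init__()
        self.words = []
        self.script_words_before_text = 0
        self._in_script = False

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._in_script = True

    def handle_endtag(self, tag):
        if tag in ("script", "style"):
            self._in_script = False

    def handle_data(self, data):
        if self._in_script:
            if not self.words:  # nothing visible has been seen yet
                self.script_words_before_text += len(data.split())
        else:
            self.words.extend(data.split())


def page_report(html, lead_words=300, max_bytes=100 * 1024):
    """Summarize page size, leading script content, and leading visible words."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    return {
        "size_ok": len(html.encode("utf-8")) <= max_bytes,
        "script_words_before_text": parser.script_words_before_text,
        "lead": parser.words[:lead_words],
    }
```

A large script_words_before_text value suggests moving the script into a separate file or to the bottom of the page.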

Avoid constructing pages that require a session. Search engines operate in a stateless mode, so pages that execute only in the presence of a cookie are ignored. If a site checks that the browser supports cookies, this check hampers the search engine's ability to index the site. The cookie check is typically implemented as a redirect back to the page being accessed, and these types of redirects also cause problems for the search engine. Pages whose URL contains a session parameter are duplicated many times in the search engine's index for your site. This dilutes the relevance of a link that the search engine produces for your site.

Although search engines usually choke on processing complicated pages, that does not mean they are not sophisticated. Trying to trick a search engine by including hidden fields, or overloading it with meta tags with particular keywords, causes the search engine to ignore the page and the site.

Redirects are used on many sites, but some forms of redirects may trip up a search engine. Some sites use short hostnames, such as www.ibm.com, to redirect to a longer URL. This method is used to route requests to specific servers for load balancing, to serve a request from a server in the requestor's local region, and so on. If you need to use redirects for one of these cases, we suggest using a 301 server redirect. Status code 301 indicates that the requested page is permanently under a different URL (see Status Code Definitions). Most search engine spiders have problems crawling through other types of redirects, such as the meta tag or JavaScript redirect. If possible, avoid redirects. If you have to use redirects, ensure that they are 301 redirects.

Below are examples of bad redirects.

Meta tag redirects: <meta http-equiv="Refresh" content="0;URL=/webapp/wcs/stores/servlet/xxx" />

JavaScript redirects: <body onLoad="setTimeout(location.href='http://xxx', '0')" >
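
To find such redirects on an existing site, you can classify each response by its status code and body. The Python sketch below is a rough heuristic written for this illustration: the function name is made up, and the two regular expressions only recognize the patterns shown above, so it can both miss redirects and flag harmless scripts.

```python
import re

# Matches a <meta http-equiv="Refresh" ...> element.
META_REFRESH = re.compile(r"<meta[^>]+http-equiv\s*=\s*[\"']?refresh", re.IGNORECASE)
# Matches an assignment to location or location.href in inline script.
JS_REDIRECT = re.compile(r"location(\.href)?\s*=", re.IGNORECASE)


def classify_redirect(status, body=""):
    """Classify a response by the kind of redirect it appears to perform."""
    if status == 301:
        return "301 permanent redirect"
    if status in (302, 303, 307):
        return "other HTTP redirect"
    if META_REFRESH.search(body):
        return "meta tag redirect"
    if JS_REDIRECT.search(body):
        return "JavaScript redirect"
    return "no redirect detected"
```

Any result other than "301 permanent redirect" or "no redirect detected" marks a page worth reviewing against the advice above.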


Using tools to tune your site

There are a number of free tools available on the Internet that help you build search friendly sites. These tools automate tasks we have described in this article. The first is an online analysis Web site available at Ranks.nl Web Site Promotion Tools. This tool checks your Web site for the proper usage of keywords and compares various search engines. The drawback with this tool is that the content you want to analyze must be live on the Internet. Web CEO is a software package whose basic edition is available to download for free; the advanced versions must be purchased. The basic version helps with site promotion and workflow analysis. To do serious search optimization, purchase one of the advanced versions. These are just a few of the tools we have used, but there are many more available.


Conclusion

This article discussed why a search engine is the tool most likely to be used by online shoppers during the research, comparison, and purchasing phases of the buying process. Therefore, it is imperative that you use search engines to increase your site's traffic and presence. The article also discussed various techniques you can use to ensure that your site is optimized for searching, such as developing simple pages and using flattened URLs. On a monthly basis, query popular search engines for their information on your site to ensure that your site is being indexed properly. A site that is not indexed properly by a search engine will miss the opportunity to target a significant segment of customers who use search engines to find and buy products.

Resources

  • Kotler, Philip, and Armstrong, Gary. Principles of Marketing. Tenth Edition, Prentice-Hall, Inc. 2003. Provides a good introduction to fundamental marketing concepts.
  • Adding dynamic text. Describes how to add dynamic text to a WebSphere Commerce store.
  • Changing store information. Describes how to change the store information in WebSphere Commerce.
  • Robots Exclusion Protocol. Covers information on the Robots Exclusion Protocol that is used to inform a search engine to avoid pages on a site.
  • Optimizing your site for search engines. Describes how to enable a WebSphere Commerce site with flattened URLs.
  • Status Code Definitions. Provides status codes for the HTTP protocol.
  • Ranks.nl Web Site Promotion Tools. Provides information on analyzing keyword density and placement and on how various search engines rank your pages.
  • Moran, Mike, and Hunt, Bill. Search Engine Marketing, Inc: Driving Search Traffic to Your Company's Web Site. First Edition, IBM Press, 2005. Covers information on search marketing and search engines.
