Skip to main content

By clicking Submit, you agree to the developerWorks terms of use.

The first time you sign into developerWorks, a profile is created for you. Select information in your developerWorks profile is displayed to the public, but you may edit the information at any time. Your first name, last name (unless you choose to hide them), and display name will accompany the content that you post.

All information submitted is secure.

  • Close [x]

The first time you sign in to developerWorks, a profile is created for you, so you need to choose a display name. Your display name accompanies the content you post on developerworks.

Please choose a display name between 3-31 characters. Your display name must be unique in the developerWorks community and should not be your email address for privacy reasons.

By clicking Submit, you agree to the developerWorks terms of use.

All information submitted is secure.

  • Close [x]

Customizing the WebSphere Commerce data extract framework using in-memory paging

Vipin Murali (vipin.murali@in.ibm.com), Software Engineer, IBM
Photo of Vipin Murali
Vipin Murali is a Software Developer with the WebSphere Commerce team at the IBM India Software Lab. He has three years of experience in the e-commerce field. His areas of experience include Java, J2EE, and web services.
Dinup P. Pillai (dinup.pillai@in.ibm.com), Software Engineer, IBM
Photo of Dinup P Pillai
Dinup P. Pillai is a Software Developer with the WebSphere Commerce team at the IBM India Software Lab. He has three years of experience in the e-commerce field. His areas of experience include Java, J2EE, and web services.

Summary:  Learn how to customize the WebSphere® Commerce data extract framework by incorporating in-memory paging, which can significantly improve performance for large data sets and provide a more efficient data extraction process.

Date:  12 Oct 2011
Level:  Intermediate PDF:  A4 and Letter (110KB | 9 pages)Get Adobe® Reader®

Activity:  7889 views
Comments:  

Introduction

The Coremetrics® Intelligent Offer data extraction utility is a command-line utility that creates the Enterprise Product Report (EPR) data. This data is required by Coremetrics for dynamic recommendations. The utility extracts catalog data from the database and generates Enterprise Category Definition File (ECDF) and Enterprise Product Content Mapping File (EPCMF) files in the correct format to load into Coremetrics. The utility also extracts data from the WebSphere Commerce database and formats and writes it into CSV files. It contains several components, such as Data Reader, Business Object Builder, Business Object mediator, and Data Writer.

This utility is provided as a part of WebSphere Commerce V7 Feature Pack 3. For more information, see this Information Center topic, Data extraction utility for dynamic recommendations in Coremetrics Intelligent Offer.

The default implementation for the data extract solution performs "database level paging" by injecting paging indexes to the main SQL query. This query is executed for every service invocation. You can increase the database level paging size by changing the page size parameter in the data extract configuration to increase the data extract performance as needed. For more information, see this Information Center topic, Sample business object configuration file EPCMF data.

If you have a very large dataset and want to achieve considerable performance improvements, you may want to explore the in-memory paging custom approach for the Intelligent Offer Data Extract utility as discussed in this article. This approach runs the main SQL query only once and loads all the primary keys in the memory. Based on the specified paging parameters, the sub-list of the primary keys is passed on to the associated SQLs. For this article, you need to primarily customize the Data Reader layer, SQL Composer, and Xpath SQL Key Processor. You also need to be familiar with the WebSphere Commerce Data Service Layer. For more information, see this Information Center topic, Working with the data service layer.

This article assumes that WebSphere Commerce V7 FEP3 is installed and the Intelligent Offer data extract utility is set up and configured. For more information, see Configuring the Intelligent Offer data extraction utility.


Step 1: Create a custom SQL Composer

Create a new class CustomDataExtractSQLComposer, which extends the abstract class SQLComposer. This custom composer contains code that executes the main SQL query on the first service request, and thereafter executes a dummy SQL query for the subsequent service requests.

Methods to be overridden in the subclasses

The method shown in Listing 1 is the extension point where SQLComposer can modify and compose the final SQL statement.

Listing 1. SQLComposer extension point

a) public SQLComposerInfo composeSQLStatement(String sqlName, String entityTableName, 
List resultSetInfo, String sqlstatement, List params) throws DataServiceSystemException

Refer to CustomDataExtractSQLComposer.java in the code_snippets.zip file that is provided in the Download section of the article.

Path:

WebSphereCommerceServerExtensionsLogic\src\com\mycompany\commerce\catalog\facade\server\
services\dataaccess\db\jdbc\CustomDataExtractSQLComposer.java

<WebSphereCommerceServerExtensionsLogic> is the name of the project where the data extract specific customized files are placed.


Step 2: Create a custom Xpath SQL Key Processor

Create a new class CustomDataExtractKeyProcessor, which extends the abstract class XPathSQLKeyProcessor. On the first service request, this custom processor loads the list of primary keys returned by the main SQL query into memory. It then creates a sublist of the primary keys depending on the paging parameters.

Methods to be overridden in the subclasses

The method shown in Listing 2 returns a list of keys that will be used in subsequent associated SQLs.

Listing 2. XpathKeyProcessor extension point

a) abstract public List getKeys(List keys, Map ahmXPathQueryParameters, int 
sqlPagingLimit)

Refer to CustomDataExtractKeyProcessor.java in the code_snippets.zip that is provided in the Download section of the article.

Path:

WebSphereCommerceServerExtensionsLogic\src\com\mycompany\commerce\catalog\facade\server\
services\dataaccess\processor\CustomDataExtractKeyProcessor.java


Step 3: Create the main SQL Query in the query template file

Create a new SQL query in the query template file as shown in Listing 3. This makes use of the custom composer and processor created in the previous two steps.

Listing 3. Query template file

BEGIN_XPATH_TO_SQL_STATEMENT
name=/CatalogEntry[CatalogEntryIdentifier[ExternalIdentifier[StoreIdentifier
[(UniqueID=)]] and InMemoryPaging]]
base_table=CATENTRY
className=com.ibm.commerce.catalog.facade.server.services.dataaccess.db.
jdbc.CustomDataExtractSQLComposer
sql_key_processor=com.ibm.commerce.catalog.facade.server.services.dataaccess.
processor.CustomDataExtractKeyProcessor
sql=
	SELECT
 		CATENTRY.$COLS:CATENTRY_ID$
     	FROM
		CATENTRY JOIN STORECENT ON (CATENTRY.CATENTRY_ID =
		STORECENT.CATENTRY_ID AND STORECENT.STOREENT_ID IN
		(?UniqueID?))
	WHERE
	CATENTRY.CATENTTYPE_ID != 'ItemBean' AND CATENTRY.CATENTTYPE_ID
       != 'BundleBean' AND CATENTRY.BUYABLE=1 AND CATENTRY.MARKFORDELETE=0
	ORDER BY
		CATENTRY.CATENTRY_ID
END_XPATH_TO_SQL_STATEMENT

Refer to wc-query-MyCompany-CatalogEntry-admin-get-ext.tpl in the code_snippets.zip file that is provided in the Download section of the article.

Path:

WC\xml\config\com.ibm.commerce.catalog-ext\wc-query-MyCompany-CatalogEntry-admin-get-ext.
tpl


Step 4: Create a custom Reader Mediator

Create a new class CustomDataExtractReaderMediator, which extends the abstract class mediator and invokes the catalog component service created in Step 3. The following parameters are passed along with the service request:

  • _cat.beginIndex: Sets the value of the record set start number.
  • _cat.maxItems: Sets the page size for the record set.
  • _cat.isFirstCall: Indicates the first service request.
  • _wcf.ap: Sets the access profile for the request.
  • _wcf.dataLanguageIds: Sets the data language ID for the request.

Methods to be overridden in the subclasses

The following methods from CatalogEntryReaderMediator need to be overridden.

The method shown in Listing 4 originally initializes the catalog entry reader mediator. Hence, it needs to be overridden to initialize the delta extract mediator classes.

Listing 4. Initialization

a) public void init () throws DataLoadException

The method shown in Listing 5 retrieves a list of catalog entry logical nouns by invoking a catalog service based on the list of catalog entry IDs and access profile.

Listing 5. Service invocation

b) protected Object getDataObject(String beginIndex, String pageSize, String storeId) 
throws AbstractBusinessObjectDocumentException, DataLoadException

Refer to CustomDataExtractReaderMediator.java in the code_snippets.zip file that is provided in the Download section of the article.

Path:

WebSphereCommerceServerExtensionsLogic\src\com\mycompany\commerce\catalog\dataload\
datareader\CustomDataExtractReaderMediator.java


Step 5: Update the business object configuration file

Update the data reader class name in the business object configuration file wc-dataextract-catalog-entry.xml (see Listing 6) with the custom data reader mediator created in Step 4.

Listing 6. Configuration

<_config:DataReader className="com.mycompany.commerce.catalog.dataload.
 datareader. CustomDataExtractReaderMediator " pageSize="700" >
  <_config:property name="clientId" value="99999999"/>
  <_config:property name="storeId" value="10001"/>
  <_config:property name="username" value="wcsadmin"/>
  <_config:property name="password" value="3fdBFMFoiGNQ0zUStB865w=="/>
</_config:DataReader>

Refer to wc-dataextract-catalog-entry.xml in the code_snippets.zip file that is provided in the Download section of the article.

Path:

samples\DataExtract\Catalog\InMemoryPaging\wc-dataextract.xml
samples\DataExtract\Catalog\InMemoryPaging\wc-dataextract-env.xml
samples\DataExtract\Catalog\InMemoryPaging\wc-dataextract-catalog-entry.xml

<samples> is the name of the directory where the data extract configuration files are located.


Step 6: Run the utility

You can run the data extract utility from the command line (see Listing 7) to extract the catalog entry records using in-memory paging.

Listing 7. Running the utility

dataextract.bat <WC_toolkit>\samples\DataExtract\Catalog\InMemoryPaging\wc-dataextract.xml

Figure 1 shows the output console after successfully running the data extract utility by performing in-memory paging.


Figure 1. Output messages in the console window
Output messages in the console window

Limitation

The in-memory paging mechanism may result in design complexities. We strongly recommend that you perform this mechanism in cases where there are not too many customizations and to use it with large datasets.


Conclusion

In this article, you learned how to customize the data extract framework by incorporating in-memory paging. By following this approach, you can expect a performance improvement of over eight percent for a dataset with 100K records.



Download

DescriptionNameSizeDownload method
Code samplecode_snippets.zip14KBHTTP

Information about download methods


Resources

Learn

Discuss

About the authors

Photo of Vipin Murali

Vipin Murali is a Software Developer with the WebSphere Commerce team at the IBM India Software Lab. He has three years of experience in the e-commerce field. His areas of experience include Java, J2EE, and web services.

Photo of Dinup P Pillai

Dinup P. Pillai is a Software Developer with the WebSphere Commerce team at the IBM India Software Lab. He has three years of experience in the e-commerce field. His areas of experience include Java, J2EE, and web services.

Report abuse help

Report abuse

Thank you. This entry has been flagged for moderator attention.


Report abuse help

Report abuse

Report abuse submission failed. Please try again later.


developerWorks: Sign in


Need an IBM ID?
Forgot your IBM ID?


Forgot your password?
Change your password

By clicking Submit, you agree to the developerWorks terms of use.

 


The first time you sign into developerWorks, a profile is created for you. Select information in your developerWorks profile is displayed to the public, but you may edit the information at any time. Your first name, last name (unless you choose to hide them), and display name will accompany the content that you post.

Choose your display name

The first time you sign in to developerWorks, a profile is created for you, so you need to choose a display name. Your display name accompanies the content you post on developerWorks.

Please choose a display name between 3-31 characters. Your display name must be unique in the developerWorks community and should not be your email address for privacy reasons.

(Must be between 3 – 31 characters.)

By clicking Submit, you agree to the developerWorks terms of use.

 


Rate this article

Comments

Help: Update or add to My dW interests

What's this?

This little timesaver lets you update your My developerWorks profile with just one click! The general subject of this content (AIX and UNIX, Information Management, Lotus, Rational, Tivoli, WebSphere, Java, Linux, Open source, SOA and Web services, Web development, or XML) will be added to the interests section of your profile, if it's not there already. You only need to be logged in to My developerWorks.

And what's the point of adding your interests to your profile? That's how you find other users with the same interests as yours, and see what they're reading and contributing to the community. Your interests also help us recommend relevant developerWorks content to you.

View your My developerWorks profile

Return from help

Help: Remove from My dW interests

What's this?

Removing this interest does not alter your profile, but rather removes this piece of content from a list of all content for which you've indicated interest. In a future enhancement to My developerWorks, you'll be able to see a record of that content.

View your My developerWorks profile

Return from help

static.content.url=http://www.ibm.com/developerworks/js/artrating/
SITE_ID=1
Zone=WebSphere
ArticleID=764731
ArticleTitle=Customizing the WebSphere Commerce data extract framework using in-memory paging
publish-date=10122011

Tags

Help
Use the search field to find all types of content in My developerWorks with that tag.

Use the slider bar to see more or fewer tags.

For articles in technology zones (such as Java technology, Linux, Open source, XML), Popular tags shows the top tags for all technology zones. For articles in product zones (such as Info Mgmt, Rational, WebSphere), Popular tags shows the top tags for just that product zone.

For articles in technology zones (such as Java technology, Linux, Open source, XML), My tags shows your tags for all technology zones. For articles in product zones (such as Info Mgmt, Rational, WebSphere), My tags shows your tags for just that product zone.

Use the search field to find all types of content in My developerWorks with that tag. Popular tags shows the top tags for this particular content zone (for example, Java technology, Linux, WebSphere). My tags shows your tags for this particular content zone (for example, Java technology, Linux, WebSphere).

Try IBM PureSystems. No charge.

Special offers