The Coremetrics® Intelligent Offer data extraction utility is a command-line utility that creates the Enterprise Product Report (EPR) data, which Coremetrics requires for dynamic recommendations. The utility extracts catalog data from the WebSphere Commerce database and generates the Enterprise Category Definition File (ECDF) and the Enterprise Product Content Mapping File (EPCMF) as CSV files in the format that Coremetrics expects. It contains several components, such as the Data Reader, Business Object Builder, Business Object Mediator, and Data Writer.
This utility is provided as a part of WebSphere Commerce V7 Feature Pack 3. For more information, see this Information Center topic, Data extraction utility for dynamic recommendations in Coremetrics Intelligent Offer.
The default implementation of the data extract solution performs "database level paging" by injecting paging indexes into the main SQL query. This query is executed for every service invocation. To improve data extract performance, you can increase the database level paging size by changing the page size parameter in the data extract configuration. For more information, see this Information Center topic, Sample business object configuration file EPCMF data.
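To see why a larger page size helps, note that the main query runs once per page under database level paging. The following sketch is illustrative arithmetic only. It assumes exactly one main-query execution per service invocation and ignores any extra call that the utility might make to detect the end of the data. The class and method names are hypothetical and are not part of WebSphere Commerce.

```java
public class PagingCost {

    /**
     * Number of service invocations (and therefore main-query executions)
     * needed to read totalRecords records with the given page size.
     */
    public static long mainQueryExecutions(long totalRecords, int pageSize) {
        if (totalRecords < 0) {
            throw new IllegalArgumentException("totalRecords must not be negative");
        }
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        // Ceiling division: a partly filled last page still costs one query.
        return (totalRecords + pageSize - 1) / pageSize;
    }

    public static void main(String[] args) {
        System.out.println(mainQueryExecutions(100_000, 700));   // prints 143
        System.out.println(mainQueryExecutions(100_000, 5_000)); // prints 20
    }
}
```

With in-memory paging, the same count drops to one regardless of the page size, because only the first request runs the main query.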
If you have a very large dataset and want a considerable performance improvement, you can use the custom in-memory paging approach for the Intelligent Offer data extract utility that is discussed in this article. This approach runs the main SQL query only once and loads all the primary keys into memory. Based on the specified paging parameters, a sublist of the primary keys is then passed on to the associated SQL queries. For this article, you primarily need to customize the Data Reader layer, the SQL Composer, and the XPath SQL Key Processor. You also need to be familiar with the WebSphere Commerce Data Service Layer. For more information, see this Information Center topic, Working with the data service layer.
This article assumes that WebSphere Commerce V7 FEP3 is installed and the Intelligent Offer data extract utility is set up and configured. For more information, see Configuring the Intelligent Offer data extraction utility.
Step 1: Create a custom SQL Composer
Create a new class CustomDataExtractSQLComposer,
which extends the abstract class SQLComposer.
This custom composer contains code that executes the main SQL query on the
first service request, and thereafter executes a dummy SQL query for the
subsequent service requests.
Methods to be overridden in the subclasses
The method shown in Listing 1 is the extension point where
SQLComposer can modify and compose the final
SQL statement.
Listing 1. SQLComposer extension point
public SQLComposerInfo composeSQLStatement(String sqlName, String entityTableName, List resultSetInfo, String sqlstatement, List params) throws DataServiceSystemException
Refer to
CustomDataExtractSQLComposer.java
in the code_snippets.zip file that is provided in
the Download section of the article.
Path:
WebSphereCommerceServerExtensionsLogic\src\com\mycompany\commerce\catalog\facade\server\services\dataaccess\db\jdbc\CustomDataExtractSQLComposer.java
<WebSphereCommerceServerExtensionsLogic>
is the name of the project where the data extract specific customized
files are placed.
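The first-call behavior described in this step can be pictured in plain Java. The sketch below is not the code in CustomDataExtractSQLComposer.java and does not use the SQLComposer API. The class name, the way the first-call flag is passed in, and the text of the dummy statement are all assumptions made for illustration. The real dummy query is defined in the downloadable source.

```java
public class FirstCallStatementChooser {

    // Hypothetical stand-in for the dummy query: a cheap statement that
    // returns no rows, so later requests do not repeat the expensive scan.
    public static final String DUMMY_SQL =
            "SELECT CATENTRY_ID FROM CATENTRY WHERE 1 = 0";

    /**
     * Returns the main statement on the first service request and the
     * dummy statement on every later request.
     */
    public static String chooseStatement(String mainSql, boolean isFirstCall) {
        return isFirstCall ? mainSql : DUMMY_SQL;
    }

    public static void main(String[] args) {
        String mainSql = "SELECT CATENTRY_ID FROM CATENTRY ORDER BY CATENTRY_ID";
        System.out.println(chooseStatement(mainSql, true));  // main query
        System.out.println(chooseStatement(mainSql, false)); // dummy query
    }
}
```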
Step 2: Create a custom XPath SQL Key Processor
Create a new class
CustomDataExtractKeyProcessor,
which extends the abstract class
XPathSQLKeyProcessor.
On the first service request, this custom processor loads the list of
primary keys returned by the main SQL query into memory. It then creates
a sublist of the primary keys depending on the paging parameters.
Methods to be overridden in the subclasses
The method shown in Listing 2 returns a list of keys that will be used in
subsequent associated SQLs.
Listing 2. XPathSQLKeyProcessor extension point
abstract public List getKeys(List keys, Map ahmXPathQueryParameters, int sqlPagingLimit)
Refer to
CustomDataExtractKeyProcessor.java
in the code_snippets.zip that is provided in the
Download section of the article.
Path:
WebSphereCommerceServerExtensionsLogic\src\com\mycompany\commerce\catalog\facade\server\services\dataaccess\processor\CustomDataExtractKeyProcessor.java
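The core of the key processor, caching the keys once and then handing out sublists, can be sketched without any WebSphere Commerce classes. This is a minimal illustration under stated assumptions and is not the downloadable implementation. `InMemoryKeyPager`, `loadKeys`, and `page` are hypothetical names, and the keys are modeled as `Long` values.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class InMemoryKeyPager {

    // Primary keys from the main query, held in memory after the first request.
    private List<Long> cachedKeys = Collections.emptyList();

    /** Stores a copy of the keys returned by the main query (first request only). */
    public void loadKeys(List<Long> keysFromMainQuery) {
        cachedKeys = new ArrayList<>(keysFromMainQuery);
    }

    /** Returns the keys for one page; the list is empty when beginIndex is past the end. */
    public List<Long> page(int beginIndex, int maxItems) {
        if (beginIndex < 0 || maxItems < 0) {
            throw new IllegalArgumentException("beginIndex and maxItems must not be negative");
        }
        int from = Math.min(beginIndex, cachedKeys.size());
        // Use long arithmetic so that from + maxItems cannot overflow.
        int to = (int) Math.min((long) from + maxItems, cachedKeys.size());
        return new ArrayList<>(cachedKeys.subList(from, to));
    }

    public static void main(String[] args) {
        InMemoryKeyPager pager = new InMemoryKeyPager();
        pager.loadKeys(Arrays.asList(10001L, 10002L, 10003L, 10004L, 10005L));
        System.out.println(pager.page(0, 2)); // prints [10001, 10002]
        System.out.println(pager.page(4, 2)); // prints [10005]
        System.out.println(pager.page(6, 2)); // prints []
    }
}
```

Clamping both bounds to the size of the cached list means that a request past the last record yields an empty page instead of an exception, which gives the caller a simple way to detect the end of the data.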
Step 3: Create the main SQL Query in the query template file
Create a new SQL query in the query template file as shown in Listing 3.
This makes use of the custom composer and processor created in the
previous two steps.
Listing 3. Query template file
BEGIN_XPATH_TO_SQL_STATEMENT
  name=/CatalogEntry[CatalogEntryIdentifier[ExternalIdentifier[StoreIdentifier[(UniqueID=)]] and InMemoryPaging]]
  base_table=CATENTRY
  className=com.mycompany.commerce.catalog.facade.server.services.dataaccess.db.jdbc.CustomDataExtractSQLComposer
  sql_key_processor=com.mycompany.commerce.catalog.facade.server.services.dataaccess.processor.CustomDataExtractKeyProcessor
  sql=
    SELECT
      CATENTRY.$COLS:CATENTRY_ID$
    FROM
      CATENTRY JOIN STORECENT ON (CATENTRY.CATENTRY_ID = STORECENT.CATENTRY_ID AND STORECENT.STOREENT_ID IN (?UniqueID?))
    WHERE
      CATENTRY.CATENTTYPE_ID != 'ItemBean' AND CATENTRY.CATENTTYPE_ID != 'BundleBean' AND CATENTRY.BUYABLE=1 AND CATENTRY.MARKFORDELETE=0
    ORDER BY
      CATENTRY.CATENTRY_ID
END_XPATH_TO_SQL_STATEMENT
Refer to
wc-query-MyCompany-CatalogEntry-admin-get-ext.tpl
in the code_snippets.zip file that is provided in
the Download section of the article.
Path:
WC\xml\config\com.ibm.commerce.catalog-ext\wc-query-MyCompany-CatalogEntry-admin-get-ext.tpl
Step 4: Create a custom Reader Mediator
Create a new class
CustomDataExtractReaderMediator, which extends
the abstract class
CatalogEntryReaderMediator and invokes the catalog component service using the query created in Step 3. The following parameters are
passed along with the service request:
- _cat.beginIndex: Sets the value of the record set start number.
- _cat.maxItems: Sets the page size for the record set.
- _cat.isFirstCall: Indicates the first service request.
- _wcf.ap: Sets the access profile for the request.
- _wcf.dataLanguageIds: Sets the data language ID for the request.
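As a rough picture of how these parameters change from one request to the next, the sketch below builds the parameter map for a given page. It is a plain-Java illustration and not the code in CustomDataExtractReaderMediator.java. The class name, the method name, and the rule that the first call is the one starting at index 0 are assumptions, and the access profile and language ID in `main` are placeholder values.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class PagingRequestParams {

    /** Builds the request parameters for the page that starts at beginIndex. */
    public static Map<String, String> build(int beginIndex, int pageSize,
                                            String accessProfile, String languageId) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("_cat.beginIndex", String.valueOf(beginIndex));
        params.put("_cat.maxItems", String.valueOf(pageSize));
        // Assumption: the first service request is the one that starts at index 0.
        params.put("_cat.isFirstCall", String.valueOf(beginIndex == 0));
        params.put("_wcf.ap", accessProfile);
        params.put("_wcf.dataLanguageIds", languageId);
        return params;
    }

    public static void main(String[] args) {
        // Placeholder values: substitute your own access profile and language ID.
        System.out.println(build(0, 700, "EXAMPLE_ACCESS_PROFILE", "-1"));
        System.out.println(build(700, 700, "EXAMPLE_ACCESS_PROFILE", "-1"));
    }
}
```

Only the first map carries `_cat.isFirstCall=true`, which is the signal that the custom composer and key processor need in order to run the main query and cache its keys.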
Methods to be overridden in the subclasses
The following methods from
CatalogEntryReaderMediator need to be
overridden.
The method shown in Listing 4 originally initializes the catalog entry reader mediator.
Hence, it needs to be overridden to initialize the delta extract mediator
classes.
Listing 4. Initialization
public void init() throws DataLoadException
The method shown in Listing 5 retrieves a list of catalog entry logical nouns by invoking a
catalog service based on the list of catalog entry IDs and access profile.
Listing 5. Service invocation
protected Object getDataObject(String beginIndex, String pageSize, String storeId) throws AbstractBusinessObjectDocumentException, DataLoadException
Refer to
CustomDataExtractReaderMediator.java
in the code_snippets.zip file that is provided in
the Download section of the article.
Path:
WebSphereCommerceServerExtensionsLogic\src\com\mycompany\commerce\catalog\dataload\datareader\CustomDataExtractReaderMediator.java
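The way a reader can walk through the pages is sketched below in plain Java. The article does not describe how the data load framework drives getDataObject or when it stops, so the loop and its stop condition (a page shorter than the page size) are assumptions. `PagedReadLoop`, `readAll`, and the `fetchPage` callback are hypothetical names that stand in for the service invocation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.BiFunction;

public class PagedReadLoop {

    /**
     * Calls fetchPage(beginIndex, pageSize) repeatedly, advancing beginIndex
     * by pageSize, until a page shorter than pageSize is returned.
     */
    public static <T> List<T> readAll(BiFunction<Integer, Integer, List<T>> fetchPage,
                                      int pageSize) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        List<T> all = new ArrayList<>();
        int beginIndex = 0;
        while (true) {
            List<T> page = fetchPage.apply(beginIndex, pageSize);
            all.addAll(page);
            if (page.size() < pageSize) {
                break; // a short or empty page means the data is exhausted
            }
            beginIndex += pageSize;
        }
        return all;
    }

    public static void main(String[] args) {
        // Stand-in for the catalog service: pages over a fixed in-memory list.
        List<String> records = Arrays.asList("A", "B", "C", "D", "E");
        List<String> result = readAll(
                (begin, size) -> records.subList(
                        Math.min(begin, records.size()),
                        Math.min(begin + size, records.size())),
                2);
        System.out.println(result); // prints [A, B, C, D, E]
    }
}
```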
Step 5: Update the business object configuration file
Update the data reader class name in the business object configuration file
wc-dataextract-catalog-entry.xml (see Listing 6)
with the custom data reader mediator created in Step 4.
Listing 6. Configuration
<_config:DataReader className="com.mycompany.commerce.catalog.dataload.datareader.CustomDataExtractReaderMediator" pageSize="700">
  <_config:property name="clientId" value="99999999"/>
  <_config:property name="storeId" value="10001"/>
  <_config:property name="username" value="wcsadmin"/>
  <_config:property name="password" value="3fdBFMFoiGNQ0zUStB865w=="/>
</_config:DataReader>
Refer to wc-dataextract-catalog-entry.xml in the
code_snippets.zip file that is provided in the
Download section of the article.
Path:
samples\DataExtract\Catalog\InMemoryPaging\wc-dataextract.xml
samples\DataExtract\Catalog\InMemoryPaging\wc-dataextract-env.xml
samples\DataExtract\Catalog\InMemoryPaging\wc-dataextract-catalog-entry.xml
<samples> is the name of the directory
where the data extract configuration files are located.
You can run the data extract utility from the command line (see Listing 7)
to extract the catalog entry records using in-memory paging.
Listing 7. Running the utility
dataextract.bat <WC_toolkit>\samples\DataExtract\Catalog\InMemoryPaging\wc-dataextract.xml
Figure 1 shows the output console after successfully running the data extract utility by performing in-memory paging.
Figure 1. Output messages in the console window
The in-memory paging mechanism can add design complexity. We strongly recommend that you use it where there are not too many customizations and the dataset is large.
In this article, you learned how to customize the data extract framework by incorporating in-memory paging. By following this approach, you can expect a performance improvement of over eight percent for a dataset with 100K records.
| Description | Name | Size | Download method |
|---|---|---|---|
| Code sample | code_snippets.zip | 14KB | HTTP |
Learn
- Coremetrics web site
- WebSphere Commerce V7.0 Information Center
- developerWorks WebSphere Commerce zone