The Intelligent Offer data extraction utility is a command-line utility that creates the Enterprise Product Report (EPR) data that Coremetrics® requires for dynamic recommendations. It extracts catalog data from the WebSphere® Commerce database and writes it to CSV files, generating the Enterprise Category Definition File (ECDF) and the Enterprise Product Content Mapping File (EPCMF) in the format that Coremetrics loads. The utility consists of several components, including the Data Reader, Business Object Builder, Business Object Mediator, and Data Writer.
This utility is provided as part of the WebSphere Commerce V7 Feature Pack 3 release. For more information, see the Information Center topic Data extraction utility for dynamic recommendations in Coremetrics Intelligent Offer.
The default implementation of the data extraction utility retrieves all the catalog entries that belong to the store. Even if only a few records have been modified since the previous extraction, the full dataset is extracted from the source system each time the utility runs. The delta extraction mechanism described in this article improves efficiency by retrieving only those records that have changed after a specified date, which makes the extraction faster. To implement it, you primarily customize the Data Reader layer. The mechanism works in the production environment as well as in a staging environment (base schema).
This article assumes that WebSphere Commerce V7 FEP3 is installed and the Intelligent Offer data extract utility is set up and configured. For more information, see Configuring the Intelligent Offer data extraction utility.
Step 1: Create the Delta Extract Reader Mediator classes
Create a new abstract AbstractDeltaExtractCatalogReaderMediator class that extends the AbstractCatalogReaderMediator class, and create a catalog-entry-specific reader mediator DeltaExtractCatalogEntryReaderMediator class that extends the AbstractDeltaExtractCatalogReaderMediator class. These mediators invoke the change history API to retrieve the change history information.
WebSphere Commerce provides the change history API that returns change history information, such as the primary object ID of the changed noun, based on the search criteria. The change history feature captures the information of a catalog entry if:
- A new catalog entry is created.
- An existing catalog entry is deleted.
- An existing catalog entry's property is modified.
The following search criteria are passed to the API:
- Workspace: Sets the workspace name.
- TaskGroup: Sets the task group name.
- ObjectType: Sets the type of the noun, for example, CatalogEntry.
- StoreId: Sets the store ID from which the change history is to be retrieved.
- StartDate: Sets the date starting from which change history information is returned.
- UIObjectNames: Lists the catalog entry types to be retrieved, for example, product, kit, and so on.
- Actions: Returns the change history information based on the actions performed on the noun, for example, N (new), D (delete), U (update).
- DBType: Sets the database type, which determines the paging mechanism.
- BeginIndex: Sets the index of the first record to return.
- PageSize: Sets the number of records to return per page.
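The criteria above can be pictured as a simple value object. The following is a minimal sketch only, not the WebSphere Commerce search criteria class: the class name DeltaSearchCriteriaSketch, its fields, and the nextPage helper are hypothetical, and the Workspace, TaskGroup, and DBType criteria are left out for brevity. It mainly illustrates how BeginIndex and PageSize might advance through the change history one page at a time.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in that mirrors some of the search criteria listed above.
// It is NOT the WebSphere Commerce change history search criteria class.
final class DeltaSearchCriteriaSketch {
    final String objectType;           // for example, "CatalogEntry"
    final String storeId;              // store whose change history is read
    final String startDate;            // changes on or after this date are returned
    final List<String> uiObjectNames;  // illustrative values such as "Product", "Kit"
    final List<String> actions;        // "N" (new), "U" (update), "D" (delete)
    final int beginIndex;              // index of the first record of the page
    final int pageSize;                // number of records per page

    DeltaSearchCriteriaSketch(String objectType, String storeId, String startDate,
            List<String> uiObjectNames, List<String> actions,
            int beginIndex, int pageSize) {
        if (beginIndex < 0 || pageSize <= 0) {
            throw new IllegalArgumentException("beginIndex must be >= 0 and pageSize > 0");
        }
        this.objectType = objectType;
        this.storeId = storeId;
        this.startDate = startDate;
        this.uiObjectNames = uiObjectNames;
        this.actions = actions;
        this.beginIndex = beginIndex;
        this.pageSize = pageSize;
    }

    // Returns a copy of the criteria that points at the next page of results.
    DeltaSearchCriteriaSketch nextPage() {
        return new DeltaSearchCriteriaSketch(objectType, storeId, startDate,
                uiObjectNames, actions, beginIndex + pageSize, pageSize);
    }

    public static void main(String[] args) {
        DeltaSearchCriteriaSketch first = new DeltaSearchCriteriaSketch(
                "CatalogEntry", "10001", "2011-01-01 00:00:00.000000000",
                Arrays.asList("Product", "Kit"), Arrays.asList("N", "U", "D"), 0, 700);
        System.out.println("Next page starts at " + first.nextPage().beginIndex);
    }
}
```

Keeping the criteria immutable and deriving each page from the previous one is a design choice of this sketch; the actual mediators in code_snippets.zip may manage paging differently.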
A database connection needs to be passed along with the change history
search criteria. The change history API retrieves the change history
information from the database with the help of this database connection.
The database properties need to be configured by the user in the
environment configuration file for the data extraction utility,
wc-dataextract-env.xml. This configuration step
is covered later in Step 4.
A brief description of the newly created mediators is provided below:
- AbstractDeltaExtractCatalogReaderMediator: This class contains the abstract methods that need to be implemented by the subclasses. It initializes the StartDate parameter configured in the business object configuration file. With the help of the primary object keys retrieved by the change history API, the catalog-entry-specific data is returned based on the following actions:
  - Actions = (N, U): Primary object keys are passed on to the new catalog service as XPath parameters to retrieve the records.
  - Actions = (D): Because the records pertaining to the deleted catalog entries no longer exist in the database, a new response BOD is built for the deleted entries with their parent catalog group set to "Uncategorized".
- DeltaExtractCatalogEntryReaderMediator: This class implements the abstract methods declared in the abstract class AbstractDeltaExtractCatalogReaderMediator. It sets the change history search criteria specific to the catalog entry.
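The split between the (N, U) and (D) actions described above can be sketched in plain Java. This is an illustration under stated assumptions, not the code from code_snippets.zip: ChangeRecordSketch is a hypothetical, simplified stand-in for a change history record, and the comma-separated placeholder row is an invented output shape used only to show the "Uncategorized" parent group.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified change history record for one changed catalog entry.
final class ChangeRecordSketch {
    final String objectId;  // primary object ID of the changed catalog entry
    final String action;    // "N" (new), "U" (update) or "D" (delete)

    ChangeRecordSketch(String objectId, String action) {
        this.objectId = objectId;
        this.action = action;
    }
}

final class DeltaActionSplitSketch {
    // New and updated entries still exist in the database, so their IDs
    // can be handed to the catalog service for retrieval.
    static List<String> idsToFetch(List<ChangeRecordSketch> history) {
        List<String> ids = new ArrayList<String>();
        for (ChangeRecordSketch record : history) {
            if ("N".equals(record.action) || "U".equals(record.action)) {
                ids.add(record.objectId);
            }
        }
        return ids;
    }

    // Deleted entries cannot be read back, so a placeholder row is built
    // with the parent catalog group set to "Uncategorized".
    static List<String> deletedRows(List<ChangeRecordSketch> history) {
        List<String> rows = new ArrayList<String>();
        for (ChangeRecordSketch record : history) {
            if ("D".equals(record.action)) {
                rows.add(record.objectId + ",Uncategorized");
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        List<ChangeRecordSketch> history = Arrays.asList(
                new ChangeRecordSketch("10001", "N"),
                new ChangeRecordSketch("10002", "D"),
                new ChangeRecordSketch("10003", "U"));
        System.out.println("Fetch by ID: " + idsToFetch(history));
        System.out.println("Deleted:     " + deletedRows(history));
    }
}
```

The sketch does not handle an entry that appears more than once in the history (for example, created and later deleted within the same window); a real implementation would need to decide which action wins.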
Methods to be overridden in the sub classes
The following methods from
AbstractCatalogReaderMediator need to be
overridden.
The method shown in Listing 1 originally initializes the catalog entry
reader mediator. Therefore, this method needs to be overridden to
initialize the delta extract mediator classes.
Listing 1. Initialization
a) public void init() throws DataLoadException
The method shown in Listing 2 retrieves a list of catalog entry logical
nouns by invoking a catalog service based on the list of catalog entry IDs
and the access profile.
Listing 2. Service invocation
b) protected Object getDataObject(String beginIndex, String pageSize, String storeId) throws AbstractBusinessObjectDocumentException, DataLoadException
The following methods from
AbstractDeltaExtractCatalogReaderMediator need
to be overridden.
The method shown in Listing 3 gets the details pertaining to the deleted
catalog entries from the
TaskGroupChangeHistoryDataSet object, in the
form of a Business Object Document (BOD).
Listing 3. Retrieve the deleted catalog entries
a) protected Object getDeletedDataObjects(List<TaskGroupChangeHistoryDataSet> changeHistoryList)
The method shown in Listing 4 builds
ShowCatalogEntryDataAreaType, with the value of
recordSetTotal retrieved from the list of the
TaskGroupChangeHistoryDataSet objects.
Listing 4. Build the data object
b) protected Object buildShowDataAreaObject(List<TaskGroupChangeHistoryDataSet> changeHistoryList)
The method shown in Listing 5 builds the
SelectionCriteriaHelper object using the XPath
expression, XPath parameters, and the control parameters.
Listing 5. Build the selection criteria
c) protected SelectionCriteriaHelper buildSelectCriteriaHelper(String uniqueIdParameter)
The method shown in Listing 6 returns a list of CatalogEntry nouns from the
response.
Listing 6. Get the retrieved catalog entries
d) protected List getDataObjectFromResponse(Object response) throws DataLoadException
The method shown in Listing 7 returns the record set total returned from
the CatalogEntry service.
Listing 7. Get the record set total
e) protected String getRecordSetTotal(Object response) throws DataLoadException
The method shown in Listing 8 populates the
ChangeHistorySearchCriteria object with the
search criteria specific to the catalog entry.
Listing 8. Build the change history search criteria
f) protected void buildChangeHistorySearchCriteria() throws DataLoadException
Refer to AbstractDeltaExtractCatalogReaderMediator.java and DeltaExtractCatalogEntryReaderMediator.java in the code_snippets.zip file that is provided in the Download section of this article.
Path:
WebSphereCommerceServerExtensionsLogic\src\com\mycompany\commerce\catalog\dataload\datareader\AbstractDeltaExtractCatalogReaderMediator.java
WebSphereCommerceServerExtensionsLogic\src\com\mycompany\commerce\catalog\dataload\datareader\DeltaExtractCatalogEntryReaderMediator.java
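The relationship between the two mediators follows the template method pattern: the abstract class drives initialization and the subclass supplies the noun-specific criteria. The following sketch shows only that shape. The class names ending in Sketch, the describeCriteria helper, the String-based init parameter, and the "Product" and "Kit" values are assumptions made so that the example compiles on its own; the real classes work with WebSphere Commerce types and the signatures shown in Listings 1 to 8.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for AbstractDeltaExtractCatalogReaderMediator.
// The real class extends AbstractCatalogReaderMediator.
abstract class AbstractDeltaReaderSketch {
    protected String startDate;
    protected String objectType;
    protected final List<String> uiObjectNames = new ArrayList<String>();

    // Mirrors the role of init(): remember the configured start date, then
    // let the subclass fill in the noun-specific search criteria.
    public void init(String configuredStartDate) {
        this.startDate = configuredStartDate;
        this.uiObjectNames.clear();
        buildChangeHistorySearchCriteria();
    }

    // Hook implemented by each noun-specific reader.
    protected abstract void buildChangeHistorySearchCriteria();

    // Helper for this sketch only: shows which criteria were set.
    String describeCriteria() {
        return objectType + uiObjectNames + " since " + startDate;
    }
}

// Hypothetical stand-in for DeltaExtractCatalogEntryReaderMediator.
final class CatalogEntryDeltaReaderSketch extends AbstractDeltaReaderSketch {
    @Override
    protected void buildChangeHistorySearchCriteria() {
        objectType = "CatalogEntry";
        uiObjectNames.addAll(Arrays.asList("Product", "Kit")); // illustrative values
    }

    public static void main(String[] args) {
        AbstractDeltaReaderSketch reader = new CatalogEntryDeltaReaderSketch();
        reader.init("2011-01-01 00:00:00.000000000");
        System.out.println(reader.describeCriteria());
    }
}
```

With this split, supporting another noun would mean adding one more subclass rather than changing the shared initialization logic.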
Step 2: Create the custom exception classes and properties file
- Create a new exception class, DeltaExtractApplicationException, that extends the DataLoadApplicationException class. The delta extract data reader mediators will throw the newly created exception when application errors occur during the extraction of business data.
- Create a new properties file, WcDataloadMessages_en_US.properties, that contains the exception messages for the delta extract application exception.
- Create a new message keys file, DeltaExtractMessageKeys.java, that contains the exception message keys for the delta extract application exception.
- Refer to DeltaExtractApplicationException.java, WcDataloadMessages_en_US.properties, and DeltaExtractMessageKeys.java in the code_snippets.zip file that is provided in the Download section of this article.
Path:
WebSphereCommerceServerExtensionsLogic\src\com\mycompany\commerce\catalog\dataload\exception\DeltaExtractApplicationException.java
WebSphereCommerceServerExtensionsLogic\src\com\mycompany\commerce\catalog\dataload\logging\properties\WcDataloadMessages_en_US.properties
WebSphereCommerceServerExtensionsLogic\src\com\mycompany\commerce\catalog\dataload\exception\DeltaExtractMessageKeys.java
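The way the three files in this step fit together can be sketched as follows. This is not the code from code_snippets.zip: the key name _ERR_INVALID_START_DATE, the message text, and the class names ending in Sketch are hypothetical, the exception extends plain Exception instead of DataLoadApplicationException, and an in-memory map stands in for WcDataloadMessages_en_US.properties so that the example compiles without WebSphere Commerce on the class path.

```java
import java.text.MessageFormat;
import java.util.HashMap;
import java.util.Map;

// Hypothetical key constants; the real keys live in DeltaExtractMessageKeys.java.
final class MessageKeysSketch {
    static final String INVALID_START_DATE = "_ERR_INVALID_START_DATE";

    private MessageKeysSketch() {
    }
}

// Stand-in for DeltaExtractApplicationException (which extends
// DataLoadApplicationException in the real extension).
class DeltaExtractExceptionSketch extends Exception {
    private static final long serialVersionUID = 1L;

    // Stand-in for the messages in WcDataloadMessages_en_US.properties.
    private static final Map<String, String> MESSAGES = new HashMap<String, String>();
    static {
        MESSAGES.put(MessageKeysSketch.INVALID_START_DATE,
                "The startDate value \"{0}\" is not a valid timestamp.");
    }

    private final String messageKey;

    DeltaExtractExceptionSketch(String messageKey, Object... args) {
        super(format(messageKey, args));
        this.messageKey = messageKey;
    }

    String getMessageKey() {
        return messageKey;
    }

    // Resolves the key to its message pattern and fills in the arguments;
    // falls back to the bare key when no message is registered.
    private static String format(String key, Object[] args) {
        String pattern = MESSAGES.get(key);
        return pattern == null ? key : MessageFormat.format(pattern, args);
    }

    public static void main(String[] args) {
        Exception e = new DeltaExtractExceptionSketch(
                MessageKeysSketch.INVALID_START_DATE, "2011-13-45");
        System.out.println(e.getMessage());
    }
}
```

Separating keys from message text keeps the mediator code free of user-facing strings, which is presumably why the article uses a dedicated keys file and properties file.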
Step 3: Create the main SQL query in the query template file
- Create a new SQL query in the query template file as shown in Listing 9. The XPath parameters consist of a list of IDs of the catalog entries that have been modified after a specific date. This main SQL query returns the primary keys that are passed to the associated SQL queries.
Listing 9. Query template file
BEGIN_XPATH_TO_SQL_STATEMENT
  name=/CatalogEntry[CatalogEntryIdentifier[(UniqueID=)]]
  base_table=CATENTRY
  sql=
    SELECT CATENTRY.$COLS:CATENTRY_ID$
    FROM CATENTRY
    WHERE CATENTRY_ID IN (?UniqueID?)
END_XPATH_TO_SQL_STATEMENT
- Refer to wc-query-MyCompany-CatalogEntry-admin-get-ext.tpl in the code_snippets.zip file that is provided in the Download section of this article.
Path:
WC\xml\config\com.ibm.commerce.catalog-ext\wc-query-MyCompany-CatalogEntry-admin-get-ext.tpl
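To make the role of the UniqueID parameter more concrete, the sketch below builds an XPath expression for a list of changed catalog entry IDs. It assumes the multi-value form in which each ID becomes an or-joined UniqueID predicate; treat that form, the XPathSketch class, and the catalogEntryXPath method as assumptions of this example and verify the expression syntax against the WebSphere Commerce documentation before relying on it.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper; not part of WebSphere Commerce or code_snippets.zip.
final class XPathSketch {
    // Builds an expression such as
    // /CatalogEntry[CatalogEntryIdentifier[(UniqueID='1' or UniqueID='2')]]
    // (assumed multi-value form, see the lead-in).
    static String catalogEntryXPath(List<String> catalogEntryIds) {
        if (catalogEntryIds.isEmpty()) {
            throw new IllegalArgumentException("At least one catalog entry ID is required");
        }
        StringBuilder xpath = new StringBuilder("/CatalogEntry[CatalogEntryIdentifier[(");
        for (int i = 0; i < catalogEntryIds.size(); i++) {
            String id = catalogEntryIds.get(i);
            // Accept only numeric IDs so nothing unexpected ends up in the expression.
            if (!id.matches("-?\\d+")) {
                throw new IllegalArgumentException("Not a numeric catalog entry ID: " + id);
            }
            if (i > 0) {
                xpath.append(" or ");
            }
            xpath.append("UniqueID='").append(id).append("'");
        }
        return xpath.append(")]]").toString();
    }

    public static void main(String[] args) {
        System.out.println(catalogEntryXPath(Arrays.asList("10001", "10002")));
    }
}
```

In the template in Listing 9, the IDs supplied this way would correspond to the values bound to ?UniqueID? in the IN clause.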
Step 4: Update the data extract configuration files
The following data extract configuration files shown in Listing 10 and Listing 11 need to be updated.
In the business object configuration file,
wc-dataextract-catalog-entry.xml:
- Update the data reader className value with the custom data reader mediator class created in Step 1.
- Add a new startDate property to specify the start date for the
delta extraction, as shown in Listing 10.
Listing 10. Configuration
<_config:DataReader className="com.mycompany.commerce.catalog.dataload.datareader.DeltaExtractCatalogEntryReaderMediator" pageSize="700">
  <_config:property name="clientId" value="99999999"/>
  <_config:property name="storeId" value="10001"/>
  <_config:property name="username" value="wcsadmin"/>
  <_config:property name="password" value="3fdBFMFoiGNQ0zUStB865w=="/>
  <_config:property name="startDate" value="2011-01-01 00:00:00.000000000"/>
</_config:DataReader>
- Add the database properties shown in Listing 11 to the wc-dataextract-env.xml file. This configuration is used to create a database connection for the change history API.
Listing 11. Database configuration
<_config:Database type="db2" name="mall" user="build" password="xK36ck80s6GbQL+aVIOszg==" server="localhost" port="50000" schema="wcs"/>
- Refer to wc-dataextract-catalog-entry.xml and wc-dataextract-env.xml in the code_snippets.zip file that is provided in the Download section of this article.
Path:
samples\DataExtract\Catalog\DeltaExtract\wc-dataextract.xml
samples\DataExtract\Catalog\DeltaExtract\wc-dataextract-env.xml
samples\DataExtract\Catalog\DeltaExtract\wc-dataextract-catalog-entry.xml
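Before running the utility, it can help to sanity-check the two values introduced in this step. The sketch below is an assumption-laden illustration rather than part of the utility: ConfigCheckSketch and its methods are hypothetical, the startDate is assumed to follow the JDBC timestamp escape format that java.sql.Timestamp.valueOf accepts (which matches the sample value in Listing 10), and the URL uses the standard DB2 type 4 JDBC form. How the utility itself parses the date or builds its connection is not shown in this article.

```java
import java.sql.Timestamp;

// Hypothetical helper for checking the Step 4 configuration values.
final class ConfigCheckSketch {
    // Parses a startDate such as "2011-01-01 00:00:00.000000000".
    // Throws IllegalArgumentException when the value is not in the
    // yyyy-mm-dd hh:mm:ss[.fffffffff] format.
    static Timestamp parseStartDate(String value) {
        return Timestamp.valueOf(value.trim());
    }

    // Builds a DB2 type 4 JDBC URL from the server, port, and name attributes.
    static String db2JdbcUrl(String server, int port, String databaseName) {
        return "jdbc:db2://" + server + ":" + port + "/" + databaseName;
    }

    public static void main(String[] args) {
        Timestamp start = parseStartDate("2011-01-01 00:00:00.000000000");
        System.out.println("Start date parsed, epoch millis: " + start.getTime());
        System.out.println("JDBC URL: " + db2JdbcUrl("localhost", 50000, "mall"));
    }
}
```

A date-only value such as 2011-01-01 is rejected by this check, which is a quick way to catch a startDate that is missing its time portion.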
Step 5: Run the data extract utility
To perform delta extraction, run the data extract utility from the command
line to extract the catalog entry records that have changed since the
configured start date (Listing 12).
Listing 12. Running the utility
dataextract.bat <WC_toolkit>\samples\DataExtract\Catalog\DeltaExtract\wc-dataextract.xml
Figure 1 shows the console output after the data extract utility successfully completes a delta extraction.
Figure 1. Running the data extract utility
The delta extraction mechanism is not supported for a Cloudscape or Derby database.
In this article, you learned how to extend the data extract framework to perform delta extractions. Because only the records that changed after the configured start date are retrieved, the extraction process is more efficient than a full extraction.
Download

| Description | Name | Size | Download method |
|---|---|---|---|
| Code sample | code_snippets.zip | 17KB | HTTP |
Learn
- Customizing the WebSphere Commerce data extract framework using in-memory paging
- Coremetrics web site
- WebSphere Commerce V7.0 Information Center
- developerWorks WebSphere Commerce zone