Topic
  • 4 replies
  • Latest Post - 2014-06-23T14:39:13Z by jdicarl
SystemAdmin
7 Posts

Pinned topic Data Explorer (Velocity) Best Practices

2012-11-14T00:01:00Z
What are some of your Data Explorer (Velocity) best practices?
Updated on 2013-03-18T10:44:50Z by SystemAdmin
  • SystemAdmin
    7 Posts

    Re: Data Explorer (Velocity) Best Practices

    2013-01-02T19:43:38Z
    I hate answering a question with a question, but in this case I must: best practices in what particular area?
    Installation? Hardware? Crawling requirements? Architecture? Best practices can be all over the place. Was there any particular area you were concerned about?

    BTW, do you have an instance of Data Explorer to play with? Have you run into problems? Not sure what to do next?
  • SystemAdmin
    7 Posts

    Re: Data Explorer (Velocity) Best Practices

    2013-03-18T10:44:50Z
    Carlos,

    There is a 66-page guide to Velocity entitled "Velocity Best Practices Guide". I don't see a link to download it from here, but I am sure you can request it.

    Below is a list from the table of contents to give you an idea of what's covered.

    This document was a boon on my last project where I was struggling to get information on tuning caching and the like.

    Thanks,
    Manny

    2. Hardware Recommendations
    2.1. Development Hardware Recommendations
    2.2. Deployment Hardware Recommendations
    2.3. General Hardware Metrics and Suggestions
    2.4. Velocity on Virtual Machines or in the Cloud
    2.5. Testing Hardware Performance and Configuration
    3. Introduction to Velocity Performance Considerations
    3.1. Common Sources of Performance Problems
    3.2. Common Performance Improvements
    4. Managing Sources and Collections
    4.1. Creating Single or Multiple Collections
    4.1.1. Overview: Common Ways of Organizing Search Collections
    4.1.2. Advantages and Disadvantages of Single Collections
    4.1.3. Advantages and Disadvantages of Multiple Collections
    4.1.4. Decision Trees for Collection Partitioning
    4.1.4.1. Dividing Data Across Search Collections
    4.1.4.2. Dividing Search Collections Across Servers
    4.2. Using Sources and Source Bundles
    4.2.1. When to Use Source Bundles
    4.2.2. Combining Indexed and Federated Sources
    4.3. Managing Collection Storage Requirements
    4.3.1. Storage Requirements for a Collection
    4.3.2. Relocating Collections
    4.3.3. Managing Log Data for Collections
    5. Managing Crawling and Conversion
    5.1. Configuring the Crawler
    5.1.1. Global Settings
    5.1.2. Conditional Settings
    5.2. Configuring the Converter Framework
    5.2.1. Converter Configuration Settings
    5.2.2. Converter Configuration Tips
    5.2.2.1. Consider Disabling Normalization
    5.2.2.2. Consider Disabling Title Truncation
    5.2.2.3. Consider Limiting Static Summary Size
    5.2.2.4. Consider Disabling Shingles
    5.3. Configuring Logging Levels
    5.3.1. Connector Logging
    5.3.2. Distributed Indexing Logging
    5.4. Using Light Crawler Mode
    5.5. Troubleshooting Crawler Problems
    5.5.1. Debugging General Crawler Problems
    5.5.2. Debugging Connector Problems
    5.5.3. Debugging Converter Problems
    5.5.3.1. Converter Resources
    5.5.3.2. Converter execution
    6. Indexing
    6.1. Configuring Indexing
    6.1.1. Configuration Options in the General Section
    6.1.2. Configuration Options in the Indices Section
    6.1.3. Configuration Options in the Duplicate Filtering Section
  • Luke_P
    2 Posts

    Re: Data Explorer (Velocity) Best Practices

    2014-05-04T20:46:03Z
    Hi All,

    The engine best practices guide is part of the documentation found here: http://pic.dhe.ibm.com/infocenter/dataexpl/v9r0/index.jsp Note that the link changes as new versions of the product are released, so searching the web for the documentation is a good way to find the latest version. For the best practices section, see the menu on the left side.

    For 360-degree applications, the Redbook found here might also be useful: http://www.redbooks.ibm.com/abstracts/sg248133.html?Open

  • jdicarl
    1 Post

    Re: Data Explorer (Velocity) Best Practices

    2014-06-23T14:39:13Z

    Best Practices for Watson Explorer Engine Application 9.0

    http://www.ibm.com/support/knowledgecenter/SS8NLW_9.0.0/com.ibm.swg.im.infosphere.dataexpl.engine.best-prac.doc/c_bp-wrapper.html