IBM InfoSphere BigMatch for Hadoop v.11.5 Fixpack 4 new features
madavid
1. Manual Linking and Unlinking using APIs
2. Enhancements to the Link workflow to support disk caching for larger data sets and to avoid out-of-memory issues.
3. Token Frequency Analysis enhancements, including an option to ignore existing strings; the file output on HDFS contains all the strings for a given record table.
4. File-based bulk search and enhancements to the Search workflow to include data in results.
a. New file-based bulk search (in addition to table-based), with a file uploader for using a file from local disk.
b. Bulk search output also includes source record/entity data.
c. Column discovery on the source table to assist with populating output table column names.
5. Support for CSV and TSV output formats in the Sample Pair workflow.
6. Addition of Self Score and Center Member Score to Extract Workflow output
7. Major enhancements to Strings UI including import/export and viewing/editing of strings
8. Enhancements to the Configuration Wizard, including the ability to load data, discover data, and automatically map it to attribute types.
9. Enhancements to the Create Table and Load Data workflow in HBase during project creation.
10. Enhancements to the Mapping screen, including the ability to discover data, see sample data, etc.
11. Simplified workflow output path in the UI.
12. More workflows now have JVM settings available in the UI
13. Score Analysis UI now also shows Glue Threshold recommendation.
14. Workflows run from the UI have pre-populated resource recommendations (JVM, worker count, executor instances, executor memory, etc.). These are calculated automatically from the resources of your cluster so that the workflows perform optimally. Change them only in advanced situations where you need to control resource usage.
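To give a feel for how resource recommendations of the kind described in item 14 can be derived from cluster size, here is a minimal Python sketch. It is not BigMatch's actual calculation, which is not documented here. It applies a widely used Spark-on-YARN sizing rule of thumb: reserve one core and 1 GB per node for the operating system and daemons, use about five cores per executor, and keep one executor slot free for the application master. The names ClusterResources and recommend_executors, and the 10% memory overhead fraction, are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class ClusterResources:
    """Hypothetical description of a cluster; not a BigMatch data structure."""
    nodes: int
    cores_per_node: int
    memory_gb_per_node: int


def recommend_executors(cluster, cores_per_executor=5, overhead_fraction=0.10):
    """Rule-of-thumb executor sizing (illustrative only, not the product's formula)."""
    # Reserve one core and 1 GB per node for the OS and Hadoop daemons.
    usable_cores = cluster.cores_per_node - 1
    usable_memory_gb = cluster.memory_gb_per_node - 1
    if usable_cores < 1 or usable_memory_gb < 1:
        raise ValueError("each node needs at least 2 cores and 2 GB of memory")

    # Cap cores per executor at what a single node can actually offer.
    cores = min(cores_per_executor, usable_cores)
    executors_per_node = usable_cores // cores

    # Keep one executor slot free for the application master.
    instances = max(1, executors_per_node * cluster.nodes - 1)

    # Split node memory across executors, then subtract off-heap overhead.
    memory_per_slot_gb = usable_memory_gb / executors_per_node
    executor_memory_gb = max(1, int(memory_per_slot_gb * (1 - overhead_fraction)))

    return {
        "executor_instances": instances,
        "executor_cores": cores,
        "executor_memory_gb": executor_memory_gb,
    }


if __name__ == "__main__":
    # Example: 10 worker nodes, each with 16 cores and 64 GB of memory.
    print(recommend_executors(ClusterResources(10, 16, 64)))
```

Values produced this way are only starting points, which is consistent with the advice above to change the pre-populated settings only when you need to control resource usage.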
You can download the fixpack from Fix Central using the following package name.
Package name: 11.5