IBM InfoSphere Master Data Management, Version 10.1

Bucketing data assists in candidate selection by identifying groups of shared information.
During initial configuration, buckets are defined for data comparison, and each bucket can involve one or more attributes. For example, buckets can be defined for name (first, last, middle), birth date + last name, address, and Social Security number or other identifiers. Any attribute with an anonymous value is skipped during bucketing. The attribute values in each bucket are combined into a standardized string, which is then converted to a format that allows quick access during candidate selection.
Using the example of John Q. Public, the standardized strings in the following table might be generated.
| Bucket | Standardized string |
|---|---|
| Name | JohnQPublic |
| Last name + DOB | Public1024 |
| Phone | 5556060 |
| Address + Zip code | 1043WEasySt85545 |
| SSN | 482891822 |
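
The table above can be reproduced with a minimal Python sketch. It is an illustration only, not the product's implementation: the bucket definitions, the `ANONYMOUS` value list, the record field names, the input formats, and the use of a truncated SHA-1 digest as the "quick access" format are all assumptions made for this example. The sketch also assumes that a bucket is still generated when only some of its attributes are anonymous.

```python
import hashlib
import re
from collections import defaultdict

# Assumption: values treated as anonymous (the real list is configurable).
ANONYMOUS = {"", "UNKNOWN", "000000000"}

# Assumption: bucket name -> record attributes that make up the bucket.
BUCKETS = {
    "Name": ("first", "middle", "last"),
    "Last name + DOB": ("last", "dob"),
    "Phone": ("phone",),
    "Address + Zip code": ("address", "zip"),
    "SSN": ("ssn",),
}


def standardize(value):
    """Keep only letters and digits; treat a missing value as empty."""
    return re.sub(r"[^A-Za-z0-9]", "", value or "")


def bucket_strings(record):
    """Build one standardized string per bucket, skipping anonymous attributes."""
    strings = {}
    for bucket, attributes in BUCKETS.items():
        parts = [standardize(record.get(a)) for a in attributes]
        parts = [p for p in parts if p.upper() not in ANONYMOUS]
        if parts:  # no usable attribute means no bucket for this record
            strings[bucket] = "".join(parts)
    return strings


def bucket_key(bucket, text):
    """Assumption: a short case-insensitive digest stands in for the lookup format."""
    digest = hashlib.sha1(f"{bucket}:{text.upper()}".encode("utf-8"))
    return digest.hexdigest()[:12]


class BucketIndex:
    """Maps bucket keys to the IDs of the records that fall into them."""

    def __init__(self):
        self._members = defaultdict(set)

    def add(self, record_id, record):
        for bucket, text in bucket_strings(record).items():
            self._members[bucket_key(bucket, text)].add(record_id)

    def candidates(self, record):
        """Return IDs of indexed records sharing at least one bucket."""
        found = set()
        for bucket, text in bucket_strings(record).items():
            found |= self._members.get(bucket_key(bucket, text), set())
        return found


# Hypothetical input record for John Q. Public.
john = {
    "first": "John", "middle": "Q.", "last": "Public", "dob": "10/24",
    "phone": "555-6060", "address": "1043 W. Easy St.", "zip": "85545",
    "ssn": "482-89-1822",
}

for bucket, text in bucket_strings(john).items():
    print(f"{bucket}: {text}")
```

Once `john` is added to a `BucketIndex`, any incoming record that shares a bucket with it, for example the same phone number, is returned by `candidates` even if its name differs. A record whose only overlapping attribute is anonymous is not returned.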
