Using entity analytics with IBM SPSS Modeler
Suppose you suspect that your data has identity problems. For example, individuals might appear more than once, or distinct individuals might appear to be merged or missing. How can IBM® SPSS® Modeler Entity Analytics help you address this? The following is a suggested procedure, though you may need to vary it to suit your particular requirements.
- Read the source data into IBM SPSS Modeler
- Create a repository ready to store the data
- Connect IBM SPSS Modeler to the repository
- Map the data fields to repository features
- Export the data into the repository and resolve the identities
- Analyze the resolved identities
- Resolve new cases against the repository
- Generate any necessary alerts (batch or real-time)
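To make the idea of resolving identities concrete, the steps above can be sketched in plain Python. This is only a toy illustration and not how IBM SPSS Modeler Entity Analytics works internally: the match rule (normalized name plus date of birth), the record fields, and the function names `normalize`, `resolve`, and `check_new_case` are all assumptions made for the example, and the "repository" is just an in-memory dictionary.

```python
from collections import defaultdict


def normalize(record):
    """Build a crude match key (assumed rule): the name reduced to
    lower-case letters and digits, combined with the date of birth."""
    name = "".join(ch for ch in record["name"].lower() if ch.isalnum())
    return (name, record["dob"])


def resolve(records):
    """Group record ids that share a match key; each group stands for
    one resolved entity in this toy repository."""
    entities = defaultdict(list)
    for rec in records:
        entities[normalize(rec)].append(rec["id"])
    return dict(entities)


def check_new_case(repository, record):
    """Return the ids of existing records that a new case resolves to,
    or an empty list if it matches no known entity."""
    return repository.get(normalize(record), [])


# Hypothetical source data: records 1 and 2 describe the same person.
records = [
    {"id": 1, "name": "Ann O'Neil", "dob": "1980-02-14"},
    {"id": 2, "name": "ann oneil", "dob": "1980-02-14"},
    {"id": 3, "name": "Bob Stone", "dob": "1975-07-01"},
]

repository = resolve(records)

# Entities backed by more than one source record are likely duplicates.
duplicates = {key: ids for key, ids in repository.items() if len(ids) > 1}

# Resolve a new case against the repository and raise a simple alert.
new_case = {"id": 4, "name": "ANN ONEIL", "dob": "1980-02-14"}
matches = check_new_case(repository, new_case)
if matches:
    print(f"Alert: new case {new_case['id']} matches existing records {matches}")
```

In the product itself, the repository, the field mapping, and the resolution rules are configured through the Entity Analytics nodes rather than written by hand; the sketch shows only the general shape of the procedure.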
At this point, you need to know something of how IBM SPSS Modeler works. IBM SPSS Modeler is a user-friendly graphical tool in which a workflow is represented as a stream of data flowing through a number of nodes. Each node represents a particular stage of the workflow.
IBM SPSS Modeler itself provides a wide range of nodes, covering all the standard data mining functions. IBM SPSS Modeler Entity Analytics adds nodes for use specifically in entity analytics. These are the Entity Analytics (EA) export node, the EA source node, and the Streaming EA process node.
The following figure illustrates the process.
