The solution architecture

IBM® Surveillance Insight for Financial Services has a layered architecture that is made up of several components.

The following diagram shows the different layers that make up the product:

Figure 1. Product layers

The data layer shows the various types of structured and unstructured data that are consumed by the product.
The data ingestion layer contains the FTP/TCP-based adaptor that is used to load data into Hadoop. The Kafka messaging system is used for loading e-communications into the system.
Note: IBM Surveillance Insight for Financial Services does not provide the adaptors with the product.
The analytics layer contains the following components:
- The Workbench components and the supporting REST services for the user interfaces.
- Specific use case implementations that leverage the base toolkit operators.
- The surveillance library that contains the common components that provide core platform capabilities such as alert management, reasoning, and the policy engine.
- The Spark Streaming API is used by Spark jobs as part of the use case implementations.
- Speech 2 Text and the NLP APIs are used in voice surveillance and eComms surveillance.
- Solr is used to index content to enable search capabilities in the Workbench.
- Kafka is used as an integration component in the use case implementations and to enable asynchronous communication between the Streams jobs and the Spark jobs.
The data layer primarily consists of data in Hadoop and IBM DB2®. The day-to-day market data is stored in Hadoop and is accessed by using the spark-sql or spark-graphx APIs. Data in DB2 is accessed by using traditional relational SQL. REST services are provided for data that needs to be accessed by the user interfaces and for certain operations such as alert management.

The following diagram shows the end-to-end component interactions in IBM Surveillance Insight for Financial Services.

Figure 2. End-to-end component interaction

Trade data is loaded into Hadoop through secure FTP. The Data Loader Streams job monitors specific folders in Hadoop and provides the data to the use cases that need market data.
The trade use case implementations analyze the data and create the relevant risk evidence.
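The folder-monitoring step in the trade flow can be pictured with a small polling loop. This is only an illustrative sketch: it watches a local directory as a stand-in for a Hadoop folder, and the function name `poll_new_files`, the `seen` set, and the file names are assumptions made for the example, not part of the product. The actual Data Loader is a Streams job.

```python
import os
import tempfile
from typing import Callable, Set


def poll_new_files(folder: str, seen: Set[str], handle: Callable[[str], None]) -> int:
    """Pass each not-yet-seen file in `folder` to `handle`; return how many were new.

    Hypothetical helper: a local directory stands in for a monitored Hadoop folder.
    """
    new_count = 0
    for name in sorted(os.listdir(folder)):
        path = os.path.join(folder, name)
        if os.path.isfile(path) and path not in seen:
            seen.add(path)   # remember the file so that it is delivered only once
            handle(path)     # hand the file to the consuming use case
            new_count += 1
    return new_count


if __name__ == "__main__":
    landing = tempfile.mkdtemp()  # stand-in for the folder that secure FTP writes to
    seen: Set[str] = set()
    delivered = []
    open(os.path.join(landing, "trades_day1.csv"), "w").close()
    print(poll_new_files(landing, seen, delivered.append))  # 1: one new file found
    print(poll_new_files(landing, seen, delivered.append))  # 0: nothing new on the second poll
```

Tracking delivered paths in `seen` keeps repeated polls idempotent, so a use case receives each file once even if the folder is scanned many times.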
Email and chat data is brought into the system through a REST service that drops the data from third-party sources into a Kafka topic.
The unstructured data is analyzed by the Streams jobs and the results are persisted to Kafka.
Voice data is obtained through secure FTP. The trigger for processing the data is then passed on through a Kafka message that contains the metadata about the voice data that needs to be processed.
After the voice data is converted to text, the rest of the analysis is performed in the same way as for email and chat data.
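The voice flow separates the audio files, which arrive through secure FTP, from the trigger message, which carries only metadata. The sketch below shows one way that such a metadata message could be encoded and decoded as JSON. The class `VoiceTrigger` and every field name in it are hypothetical, because the message schema is not described here, and no Kafka client is used: the functions only produce and parse the bytes that a message would carry.

```python
import json
from dataclasses import asdict, dataclass
from typing import List


@dataclass
class VoiceTrigger:
    # All field names are assumptions for illustration, not the product's schema.
    call_id: str
    audio_path: str          # where the audio file was placed after the FTP transfer
    participants: List[str]
    duration_seconds: int


def encode_trigger(trigger: VoiceTrigger) -> bytes:
    """Serialize the metadata to UTF-8 JSON, the payload of the trigger message."""
    return json.dumps(asdict(trigger), sort_keys=True).encode("utf-8")


def decode_trigger(raw: bytes) -> VoiceTrigger:
    """Rebuild the metadata on the consumer side before the audio is processed."""
    return VoiceTrigger(**json.loads(raw.decode("utf-8")))


if __name__ == "__main__":
    trigger = VoiceTrigger("call-001", "/voice/in/call-001.wav", ["trader-a", "client-b"], 95)
    payload = encode_trigger(trigger)
    print(decode_trigger(payload) == trigger)  # True: the metadata survives the round trip
```

Because the message holds a reference to the audio rather than the audio itself, the messages stay small and the consumer can decide when to read the file.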
The output of the use case implementations (trade, eComms, and voice), that is, the risk evidence, is dropped into the Kafka messaging topics for the use case-specific Spark jobs. The Spark jobs perform the post processing after the evidence is received from the Streams jobs.
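The hand-off from the Streams jobs to the use case-specific Spark jobs can be sketched with an in-memory stand-in for the messaging layer. Everything named here is an assumption for illustration: the topic names, the `InMemoryBroker` class, and the shape of an evidence record are hypothetical, and the `processed` flag merely marks where real post processing would happen. The sketch only shows the routing idea, one topic per use case with a separate consumer for each.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, List

# Hypothetical topic names, one per use case.
TOPIC_BY_USE_CASE = {
    "trade": "evidence.trade",
    "ecomm": "evidence.ecomm",
    "voice": "evidence.voice",
}


class InMemoryBroker:
    """Minimal stand-in for a message broker: one FIFO queue per topic."""

    def __init__(self) -> None:
        self.topics: Dict[str, Deque[dict]] = defaultdict(deque)

    def publish(self, topic: str, message: dict) -> None:
        self.topics[topic].append(message)

    def drain(self, topic: str) -> List[dict]:
        """Remove and return all messages that are waiting on `topic`, oldest first."""
        queue = self.topics[topic]
        messages = []
        while queue:
            messages.append(queue.popleft())
        return messages


def publish_evidence(broker: InMemoryBroker, evidence: dict) -> str:
    """Producer side: route an evidence record to the topic for its use case."""
    topic = TOPIC_BY_USE_CASE[evidence["use_case"]]
    broker.publish(topic, evidence)
    return topic


def post_process(broker: InMemoryBroker, use_case: str) -> List[dict]:
    """Consumer side: take the waiting evidence for one use case and mark it processed."""
    return [dict(e, processed=True) for e in broker.drain(TOPIC_BY_USE_CASE[use_case])]


if __name__ == "__main__":
    broker = InMemoryBroker()
    publish_evidence(broker, {"use_case": "trade", "id": 1})
    publish_evidence(broker, {"use_case": "voice", "id": 2})
    print(post_process(broker, "trade"))  # [{'use_case': 'trade', 'id': 1, 'processed': True}]
```

Keeping one topic per use case means that a consumer never sees evidence from another use case, and producers do not wait for consumers to finish.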