
Delivering effective analytics with limited or no ground truth


Insurance companies are continually subjected to questionable claims, whether they involve outright fraud, waste, or abuse. In the U.S. alone, insurance fraud costs insurance companies USD 32 billion per year in property and casualty (P&C) and USD 84 billion per year in health care. Each carrier processes tens or even hundreds of thousands of claims, yet fraudulent claims are only a small fraction of the total. The result is highly unbalanced datasets with sparse data, which makes fraud detection especially hard.

On top of that, new schemes are constantly emerging, and no ground truth is available for them until well after a scheme has been successfully carried out. This leaves insurance companies at a disadvantage.

Anti-fraud AI and machine learning processes

In typical AI and machine learning (ML) processes, the detection model is trained on a set of labeled data that annotates whether each claim should be considered fraudulent or not. Because this labeled dataset is highly imbalanced, it calls for different techniques than a “simple AI/ML model”. The IBM Financial Crimes Insight for Claims Fraud 6.5.1 product specializes in helping insurance companies find fraud, waste, and abuse. In this article I discuss a couple of the analytical techniques that we use, given the lack of ground truth that characterizes this data.

One technique we employ is using clustering models to analyze the data across a variety of dimensions. First, the claims are segmented by claim and/or party characteristics into populations that are expected to behave differently. Next, a cluster model is created for each segment to identify “micro-clusters” within the population, based on actual behavioral patterns in the claims data. Finally, known outcomes such as referrals and investigations are overlaid onto these micro-clusters to provide additional insight into claims that lack definitive outcomes. The final result is a set of features and feature values that help compensate for the lack of ground truth on specific claims.
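The segment, cluster, and overlay steps can be illustrated with a minimal sketch. It is not the product's implementation: the claim records, the two numeric features, the segment names, the choice of plain k-means, and the `referral_rate` feature are all assumptions made up for illustration.

```python
from collections import defaultdict
from math import dist


def kmeans(points, k, iterations=20):
    """Plain k-means; the first k points seed the centroids (deterministic)."""
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign every point to its nearest centroid.
        labels = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


def micro_cluster_features(claims, k=2):
    """Segment claims, cluster each segment, then overlay known referral outcomes."""
    by_segment = defaultdict(list)
    for claim in claims:
        by_segment[claim["segment"]].append(claim)

    features = {}
    for segment, members in by_segment.items():
        labels = kmeans([c["features"] for c in members], k)
        # Collect the outcomes that are actually known in each micro-cluster.
        outcomes = defaultdict(list)
        for claim, label in zip(members, labels):
            if claim["referred"] is not None:
                outcomes[label].append(claim["referred"])
        # Every claim, labeled or not, inherits its micro-cluster's referral rate.
        for claim, label in zip(members, labels):
            known = outcomes[label]
            rate = sum(known) / len(known) if known else None
            features[claim["id"]] = {
                "micro_cluster": f"{segment}-{label}",
                "referral_rate": rate,
            }
    return features


# Hypothetical claims: "referred" is True/False when known, None when there is no outcome.
claims = [
    {"id": 1, "segment": "auto", "features": (1.0, 1.0), "referred": None},
    {"id": 2, "segment": "auto", "features": (5.0, 5.2), "referred": True},
    {"id": 3, "segment": "auto", "features": (1.2, 0.9), "referred": False},
    {"id": 4, "segment": "auto", "features": (5.1, 4.8), "referred": True},
    {"id": 5, "segment": "auto", "features": (0.9, 1.1), "referred": False},
    {"id": 6, "segment": "auto", "features": (4.9, 5.0), "referred": None},
    {"id": 7, "segment": "auto", "features": (1.1, 1.2), "referred": None},
    {"id": 8, "segment": "auto", "features": (5.2, 5.1), "referred": False},
    {"id": 9, "segment": "property", "features": (10.0, 2.0), "referred": None},
    {"id": 10, "segment": "property", "features": (20.0, 9.0), "referred": True},
    {"id": 11, "segment": "property", "features": (11.0, 2.5), "referred": False},
    {"id": 12, "segment": "property", "features": (21.0, 8.0), "referred": None},
]

profiles = micro_cluster_features(claims)
```

In this toy data, claim 6 has no outcome of its own, but it falls into a micro-cluster where two of the three known outcomes were referrals, so it picks up a `referral_rate` of about 0.67 as a feature for downstream models.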

Auto-encoding is another technique featured in the Financial Crimes Insight (FCI) solution, and it actually takes advantage of the unbalanced nature of the data. An auto-encoder is a type of artificial neural network that is trained on valid claims. The training set can be limited to claims that were investigated and determined to be valid, or it can optionally include claims that were never investigated, which greatly expands the amount of training data available.

A systematic approach to fraud detection

Without going into great detail, auto-encoding is a process by which the data is compressed into a simpler mathematical representation and then reconstructed back into its (nearly) original state. Since the model was trained on valid claims, a fraudulent claim will typically produce a high reconstruction error, which provides a clear fraud signal to the downstream ensemble models performing the final fraud assessment.
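To make the reconstruction-error idea concrete, here is a deliberately tiny sketch. It assumes a linear auto-encoder with a single bottleneck value and tied encoder/decoder weights, trained by batch gradient descent. A production model would be a deeper neural network. The two features, the training data, the learning rate, and the suspicious claim are all hypothetical.

```python
def reconstruction_error(w, x):
    """Encode x to one number (z = w . x), decode it (z * w), return squared error."""
    z = sum(wi * xi for wi, xi in zip(w, x))
    return sum((xi - z * wi) ** 2 for wi, xi in zip(w, x))


def train_autoencoder(claims, lr=0.05, epochs=500):
    """Fit the tied weights by batch gradient descent on mean squared error."""
    w = [0.5] * len(claims[0])
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x in claims:
            z = sum(wi * xi for wi, xi in zip(w, x))
            residual = [xi - z * wi for wi, xi in zip(w, x)]
            r_dot_w = sum(ri * wi for ri, wi in zip(residual, w))
            for i in range(len(w)):
                # d(error)/d(w_i) = -2 * (z * residual_i + (residual . w) * x_i)
                grad[i] += -2.0 * (z * residual[i] + r_dot_w * x[i])
        w = [wi - lr * gi / len(claims) for wi, gi in zip(w, grad)]
    return w


# Made-up valid claims: the second feature is roughly twice the first.
valid_claims = [(t / 10, 2 * t / 10 + (0.02 if t % 2 else -0.02)) for t in range(1, 11)]
weights = train_autoencoder(valid_claims)

valid_errors = [reconstruction_error(weights, x) for x in valid_claims]
suspicious_claim = (0.9, 0.3)  # breaks the usual relationship between the two features
suspicious_error = reconstruction_error(weights, suspicious_claim)
```

Because the model only ever saw claims that follow the usual pattern, it reconstructs them almost perfectly, while the claim that breaks the pattern comes back with a far larger error. That error value is the kind of signal that can be handed to a downstream model.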

These techniques are part of a systematic approach to fraud detection that combines multiple supervised and unsupervised learning methods. It leverages features created using traditional fraud indicators from the data, multiple deterministic techniques and enhanced statistical methods. 
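As a rough illustration of how such signals might be combined, the sketch below blends a deterministic rule count, a cluster-derived feature, and an auto-encoder reconstruction error into a single score with a logistic function. The signal names, weights, and bias are invented for this example. In practice a supervised model would typically learn the combination from whatever labeled outcomes exist.

```python
from math import exp

# Hypothetical weights and bias, chosen by hand for illustration only.
WEIGHTS = {
    "rule_hits": 0.8,               # count of deterministic fraud indicators triggered
    "cluster_referral_rate": 2.0,   # share of known referrals in the claim's micro-cluster
    "reconstruction_error": 3.0,    # auto-encoder reconstruction error for the claim
}
BIAS = -3.0


def fraud_score(signals):
    """Blend the available signals into a score between 0 and 1."""
    total = BIAS + sum(WEIGHTS[name] * signals.get(name, 0.0) for name in WEIGHTS)
    return 1.0 / (1.0 + exp(-total))


routine = fraud_score(
    {"rule_hits": 0, "cluster_referral_rate": 0.05, "reconstruction_error": 0.01}
)
suspicious = fraud_score(
    {"rule_hits": 2, "cluster_referral_rate": 0.67, "reconstruction_error": 0.45}
)
```

With these made-up numbers the routine claim scores well below 0.1 and the suspicious claim scores above 0.5, so no single signal has to be decisive for a claim to be flagged.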

In addition to detecting known patterns, this combined FCI approach allows the system to potentially discover emergent fraudulent patterns that are not yet well established. Detecting them early in the claim lifecycle allows us to expedite the suspicious behavior alert to the appropriate insurance investigator or analyst.

Preventing insurance fraud webinar replays on demand

The earlier in the lifecycle that a valid alert can be raised, the better the insurance company's chance of stopping the attempt and mitigating losses. For more about the latest analytic techniques to combat claims fraud, check out the IBM webinar replays on demand: “COVID-19: Responding to the threat of fraudulent claims” and “How can the Insurance Fraud Industry stay ahead of Insurance Fraud in the new normal?”

IBM Financial Crimes Insight for Claims Fraud is part of the IBM RegTech regulatory compliance solutions that are designed to help financial institutions better meet their regulatory monitoring, reporting, compliance and risk management needs. Learn more about IBM Financial Crimes Insight for Claims Fraud.

Sr Cert ITS; Detection Architect, Watson Financial Crimes Insight IBM Cloud and Cognitive Software
