For full details on this research, see the X-Force Red whitepaper “Disrupting the Model: Abusing MLOps Platforms to Compromise ML Models and Enterprise Data Lakes”.

Machine learning operations (MLOps) platforms are used by enterprises of all sizes to develop, train, deploy and monitor large language models (LLMs) and other foundation models (FMs), as well as the generative AI (gen AI) applications built on top of these models. The rush to leverage AI throughout enterprises has meant that security has often been overlooked in the name of progress, resulting in weak controls and direct access to sensitive data lakes and crown jewel data for retrieval augmented generation (RAG) use. As with attacks targeting development operations (DevOps), an attacker who gains unauthorized access to these MLOps platforms could cause significant impact through a variety of attacks that affect the confidentiality, integrity and availability of the machine learning (ML) models and the data they provide. Threat actors are likely motivated to abuse these gaps and are pursuing early research and private toolkits to attack MLOps platforms, steal valuable FMs/LLMs and their weights, poison LLMs used for computer vision and military use, and compromise the sensitive enterprise datasets connected to AI-integrated applications.

This research provides background on MLOps platforms and the machine learning security operations (MLSecOps) lifecycle, and details ways to abuse some of the most popular cloud-based and internally hosted platforms used by enterprises, such as BigML, Azure Machine Learning and Google Cloud Vertex AI. The attack scenarios include data poisoning, data extraction and model extraction. The research is accompanied by a public release of open-source tooling to perform and facilitate these attacks, along with defensive guidance for protecting these MLOps platforms.