Overview
Develop business insights
IBM Cloud Object Storage enables organizations to build a centralized data repository for nearly unlimited amounts of data. Data remains in its native format and doesn’t need to be moved in and out of IBM Cloud Object Storage; rather, the IBM Cloud Object Storage-based data lake is the persistent data store for analytics.
How customers use it
Query data in place
Use the data lake as an active workspace for a range of big data analytics use cases, with query-in-place functionality that lets you run analytics directly on your data at rest.
Use Apache Spark Analytics
Get a low-cost, scalable persistent storage layer for analytics with optimized connectivity to Apache Spark.
Store training models
Accelerate machine and deep learning workflows required to infuse AI into your business. Build and train AI models, and prepare and analyze data, in a single, integrated environment.
Build and analyze pipelines
Store massive amounts of IoT data at low cost and allow analytics frameworks to access the data directly. Data pipelines can be easily set up and managed to generate analytics-ready data.
Features
Key capabilities that enable this use case
Readily move data from HDFS clusters
Free up space on Hadoop clusters by using IBM Big Replicate to efficiently move data from Hadoop clusters to IBM Cloud Object Storage, with continuous replication and data consistency.
Query data in place
IBM Cloud SQL Query is a fully managed service that lets developers analyze and transform data stored across multiple files in various formats using ANSI SQL statements.
Perform Apache Spark analytics
IBM Cloud Object Storage offers optimized connectivity to Apache Spark services to store data from multiple sources. Decouple the storage and compute tiers: keep data in the object storage layer and spin up compute clusters as needed.
Store data for machine learning workflows
IBM Watson® Studio is a hybrid cloud platform built on open source and IBM tools to analyze data and use it to build and deploy AI models.
Perform intelligent data discovery
Once your data is in IBM Cloud Object Storage, it can be governed with the IBM Watson Knowledge Catalog using data profilers that segment and protect data to allow for better governance.
Easily build and analyze IoT data pipelines
IBM Cloud® provides services based on Apache Kafka and Apache Spark, including IBM Event Streams and Spark as a service. Pipelines from IBM Event Streams to object storage can be set up and managed.
Easy data collection and ingestion
IBM offers a variety of ways to get your data into IBM Cloud Object Storage, including natively integrated Aspera® high-speed data transfer capabilities for quick data transfer over the network.
Cost-effective and flexible
Organizations can build a centralized data repository to leverage cost-effective and scalable storage to collect and store virtually unlimited amounts of data of any type, from any source.
Always available
Designed to deliver 99.999999999% durability based on IBM internal data. Patented technology helps encrypt data and distribute it across multiple devices in IBM data center facilities.
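To get a feel for what an eleven-nines durability figure could mean in practice, here is a rough sketch of the arithmetic. It assumes, purely for illustration, that the figure is read as an annual per-object survival probability and that losses are independent; the page itself does not define the figure this way. The function name and the object count of 10 million are hypothetical.

```python
from fractions import Fraction


def expected_annual_losses(object_count: int, durability_percent: str) -> Fraction:
    """Expected number of objects lost per year.

    Assumption (not stated in the source): durability is an annual,
    per-object survival probability, and losses are independent.
    """
    # Exact rational arithmetic avoids floating-point rounding on tiny values.
    loss_probability = 1 - Fraction(durability_percent) / 100
    return object_count * loss_probability


# Hypothetical workload: 10 million stored objects.
losses_per_year = expected_annual_losses(10_000_000, "99.999999999")
years_per_expected_loss = 1 / losses_per_year

print(f"Expected losses per year: {float(losses_per_year)}")
print(f"Roughly one object lost every {int(years_per_expected_loss)} years")
```

Under these assumptions, 10 million objects would see an expected loss of about one object every 10,000 years. The result is only as good as the interpretation of the figure, so treat it as an order-of-magnitude illustration.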
Highly secure
Secure data using automatic server-side encryption, with options for keys managed by the IBM Key Protect key management system or keys that you manage yourself.