Overview

Develop business insights

IBM Cloud Object Storage enables organizations to build a centralized data repository for nearly unlimited amounts of data. Data remains in its native format and doesn’t need to be moved in and out of IBM Cloud Object Storage; rather, the IBM Cloud Object Storage-based data lake is the persistent data store for analytics.

How customers use it

Migrate data

Free up space on expensive Hadoop clusters by efficiently migrating large amounts of data from Hadoop to IBM Cloud Object Storage.

Query data in place

Use IBM Cloud Object Storage as an active workspace for a range of big data analytics use cases, with query-in-place functionality that lets you run analytics directly on your data at rest.

Use Apache Spark Analytics

Get a low-cost, scalable persistent storage layer for analytics with optimized connectivity to Apache Spark.

Store training models

Accelerate the machine learning and deep learning workflows required to infuse AI into your business. Build and train AI models, and prepare and analyze data, in a single, integrated environment.

Build and analyze pipelines

Store massive amounts of IoT data at low cost and allow analytics frameworks to access the data directly. Data pipelines can be easily set up and managed to generate analytics-ready data.

Features

Key capabilities that enable this use case

Readily move data from HDFS clusters

Free up space on Hadoop clusters by using IBM Big Replicate to efficiently move data from Hadoop clusters to IBM Cloud Object Storage, with continuous replication and data consistency.

Query data in place

IBM Cloud SQL Query is a fully managed service that lets developers analyze and transform data stored across multiple files in various formats using ANSI SQL statements.
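As a rough illustration of what querying in place can look like, the sketch below composes a single SQL statement that reads objects where they are stored and writes the result back to object storage. The helper name, bucket names, column names, and object paths are all hypothetical, and the `cos://` locations with `STORED AS` and `INTO` clauses reflect one reading of the service's SQL dialect, so check the current IBM documentation before relying on the exact syntax.

```python
def build_query_in_place(columns, source_uri, source_format,
                         target_uri, target_format, where=None):
    """Compose one SQL statement that reads objects where they already live.

    Hypothetical helper for illustration; it only builds text and does not
    call any IBM Cloud service.
    """
    select_list = ", ".join(columns) if columns else "*"
    parts = [
        f"SELECT {select_list}",
        # The source is an object storage location, not a loaded table.
        f"FROM {source_uri} STORED AS {source_format.upper()}",
    ]
    if where:
        parts.append(f"WHERE {where}")
    # The result is written back to object storage as new objects.
    parts.append(f"INTO {target_uri} STORED AS {target_format.upper()}")
    return "\n".join(parts)


# Example values are assumptions, not names taken from the text.
statement = build_query_in_place(
    columns=["device_id", "temperature"],
    source_uri="cos://us-south/example-bucket/readings/",
    source_format="parquet",
    target_uri="cos://us-south/example-results/hot-readings/",
    target_format="csv",
    where="temperature > 30",
)
print(statement)
```

The statement names storage locations directly in place of tables, which is the point of the capability: the files are not loaded into a separate database before they are analyzed.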

Perform Apache Spark analytics

IBM Cloud Object Storage offers optimized connectivity to Apache Spark services to store data from multiple sources. Decouple the storage and compute tiers so that data stays in the object storage layer while you spin up Spark clusters.
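One way to picture the decoupled tiers is as a small set of Hadoop properties that point a Spark cluster at a bucket. The sketch below assumes the open source Stocator connector, which the text does not name, and uses a hypothetical service alias, bucket, and endpoint. It only builds the property map and an object URI, so it runs without a Spark installation.

```python
import os


def stocator_cos_conf(service, endpoint, access_key, secret_key):
    """Return Hadoop properties for reading cos:// URIs through Stocator.

    Assumption: the cluster uses the Stocator connector. Property names
    follow that project's README; confirm them against the version you run.
    """
    prefix = f"fs.cos.{service}"
    return {
        "fs.stocator.scheme.list": "cos",
        "fs.cos.impl": "com.ibm.stocator.fs.ObjectStoreFileSystem",
        "fs.stocator.cos.impl": "com.ibm.stocator.fs.cos.COSAPIClient",
        "fs.stocator.cos.scheme": "cos",
        f"{prefix}.endpoint": endpoint,
        f"{prefix}.access.key": access_key,
        f"{prefix}.secret.key": secret_key,
    }


def cos_uri(bucket, service, key):
    """Build the bucket.service style URI that the connector resolves."""
    return f"cos://{bucket}.{service}/{key}"


# Hypothetical values; credentials are read from the environment, never hard-coded.
conf = stocator_cos_conf(
    service="datalake",
    endpoint="https://s3.us-south.cloud-object-storage.appdomain.cloud",
    access_key=os.environ.get("COS_ACCESS_KEY", "<access-key>"),
    secret_key=os.environ.get("COS_SECRET_KEY", "<secret-key>"),
)
uri = cos_uri("example-bucket", "datalake", "events/2024/part-0000.parquet")
print(uri)
```

With PySpark, each pair could be passed to `SparkSession.builder.config` with the key prefixed by `spark.hadoop.`, after which a call such as `spark.read.parquet(uri)` would read from the bucket. Because the data stays in object storage, the cluster can be shut down afterward without losing it.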

Store data for machine learning workflows

IBM Watson® Studio is a hybrid cloud platform built on open source and IBM tools to analyze data and use it to build and deploy AI models.

Perform intelligent data discovery

Once your data is in IBM Cloud Object Storage, it can be governed with the IBM Watson Knowledge Catalog using data profilers that segment and protect data to allow for better governance.

Easily build and analyze IoT data pipelines

IBM Cloud® provides services based on Apache Kafka and Apache Spark, including IBM Event Streams and Spark as a service. Pipelines from IBM Event Streams to object storage can be set up and managed.

Easy data collection and ingestion

IBM offers a variety of ways to get your data into IBM Cloud Object Storage, including natively integrated Aspera® high-speed data transfer capabilities for quick data transfer over the network.

Cost-effective and flexible

Organizations can build a centralized data repository to leverage cost-effective and scalable storage to collect and store virtually unlimited amounts of data of any type, from any source.

Always available

Designed to deliver 99.999999999% durability based on IBM internal data. Patented technology helps encrypt data and distribute it across multiple devices in IBM data center facilities.
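To put the durability figure in perspective, the short calculation below converts it into an expected number of lost objects. It assumes the percentage is read as the probability that a single object survives one year, an interpretation the text does not state, and the object count is an arbitrary example. `Decimal` is used because eleven nines sits close to the limits of floating-point precision.

```python
from decimal import Decimal

# Durability figure quoted in the text, as a percentage.
DURABILITY_PERCENT = Decimal("99.999999999")


def expected_annual_losses(object_count, durability_percent=DURABILITY_PERCENT):
    """Expected objects lost per year under the stated assumption.

    Assumption: durability is the per-object probability of surviving
    one year, and objects are lost independently of one another.
    """
    loss_probability = Decimal(1) - durability_percent / Decimal(100)
    return Decimal(object_count) * loss_probability


# Arbitrary example: ten million stored objects.
losses = expected_annual_losses(10_000_000)
print(losses)
```

Under that reading, ten million objects correspond to an expectation of one ten-thousandth of an object lost per year, or roughly one object every ten thousand years.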

Highly secure

Secure data using automatic server-side encryption, with the option of encryption keys managed by the IBM Key Protect key management system or keys that you manage yourself.