by DaeSung Chung, Big Data & Machine Learning, IBM Systems Lab Services, IBM Korea
I recently had a chance to implement a 10 petabyte object storage solution using IBM Spectrum Scale and IBM Elastic Storage Server (ESS). The solution used Spectrum Scale to provide OpenStack Swift API services and Elastic Storage Server as back-end storage. In this post, I want to share some tips based on the lessons I learned from this project.
Tip 1: Look for usage patterns of many, many files with heavy metadata searches.
The starting point for any storage implementation is assessing and validating your storage needs. If you need scale-out storage that can grow up to tens of petabytes, ESS can easily meet this requirement, so why would you also need object storage? Perhaps your infrastructure is being stressed by heavy metadata searches. This is one instance where object storage comes into play. (For other use cases, refer to note 1 at the end of this article.)
Figure 1 illustrates how general object storage differs from the classic network-attached storage (NAS) shown on the left. With NAS, metadata searches are done on specially reserved areas of the file system that hold the metadata, or inodes. Because the capacity of these areas is limited, heavy input/output (I/O) on them stresses the system. Object storage can overcome this bottleneck by storing metadata in a lightweight database that’s optimized for fast searches.
IBM Spectrum Scale provides object storage services by adopting OpenStack Swift in its architecture. OpenStack Swift is widely used by cloud service providers.
Figure 1: Difference in metadata operation between classical NAS and object storage (Adapted from “The Storage Evolution: From Blocks, Files and Objects to Object Storage Systems”)
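To make the Swift API a little more concrete, here is a minimal Python sketch of what an object upload request looks like. It only builds the request and does not send it. The endpoint, account, container, object name and token are all hypothetical placeholders, not values from this project; the URL layout (/v1/account/container/object) and the X-Auth-Token and X-Object-Meta-* headers follow the public OpenStack Swift API.

```python
from urllib.parse import quote
from urllib.request import Request


def build_swift_put(endpoint, account, container, obj, token, data, metadata):
    """Build (but do not send) a Swift PUT request for one object."""
    # Swift addresses objects as /v1/{account}/{container}/{object}.
    path = "/".join(quote(part, safe="") for part in (account, container))
    # Object names may contain "/" to form pseudo-folders, so keep it unescaped.
    path += "/" + quote(obj)
    headers = {"X-Auth-Token": token}
    # Custom object metadata travels as X-Object-Meta-* headers.
    for key, value in metadata.items():
        headers[f"X-Object-Meta-{key}"] = value
    return Request(f"{endpoint}/v1/{path}", data=data, headers=headers, method="PUT")


# All values below are hypothetical and for illustration only.
req = build_swift_put(
    endpoint="https://swift.example.com:8080",
    account="AUTH_demo",
    container="sensor-logs",
    obj="2017/06/01/device42.json",
    token="hypothetical-token",
    data=b'{"temp": 21.5}',
    metadata={"Device": "42"},
)
print(req.get_method(), req.full_url)
```

Because the metadata is attached to the object as headers, an application can tag each object at upload time instead of relying on the directory tree to describe it.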
Tip 2: Determine if you need to maintain file system–based services in parallel with object storage services.
What if you need both object and file storage? What makes Spectrum Scale stand out against competing solutions is its “unified file and object access” feature. Some organizations need both object access and classical Portable Operating System Interface (POSIX) file access on a single storage system because they still have existing applications running on old systems. It’s a rare requirement but one that IBM Spectrum Scale for object storage can meet.
Figure 2 shows the unified nature of Spectrum Scale, serving both objects and files.
Figure 2: Spectrum Scale with unified file and object access
Tip 3: Develop a network design that’s capable of ingesting your network traffic.
Network design is also critical, because for many organizations the network determines how fast data can be ingested. Figure 3 shows an example of how system resources are used on a protocol node. Each protocol node has two 10GbE network interface cards (NICs), and the two NICs are bonded. Looking at the network performance figure highlighted with a red box, you can see that the network throughput ranges from 531 MB/sec to 937 MB/sec. Considering that the theoretical throughput of each NIC is 1.25 GB/sec, the peak of this ingest traffic amounts to about 75 percent of a single NIC’s line rate, a large share of this organization’s available network bandwidth.
Figure 3: System resource usage under heavy ingest traffic
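The arithmetic behind these numbers can be checked with a short Python sketch. It assumes decimal units (1 Gbps = 125 MB/sec) and ignores protocol overhead, and it compares against a single NIC only, because the usable bandwidth of a bonded pair depends on the bonding mode. The function name is my own.

```python
def nic_utilization(observed_mb_per_s, link_gbps):
    """Return observed throughput as a fraction of one NIC's line rate."""
    # 1 Gbps = 1000 Mbit/s = 125 MB/s (decimal units, no protocol overhead).
    capacity_mb_per_s = link_gbps * 125
    return observed_mb_per_s / capacity_mb_per_s


# Low and high throughput values read from Figure 3, against one 10GbE NIC.
low = nic_utilization(531, 10)
high = nic_utilization(937, 10)
print(f"low: {low:.1%} of line rate, high: {high:.1%} of line rate")
```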
Comparing the network throughput with the node’s CPU utilization makes it clear that the network, not the CPU, was the limiting factor. Figure 4 shows that the nodes had around 50 percent unused CPU capacity. For more information about network design for ESS, I recommend reading “Elastic Storage Server: Networking is what it’s all about!” and checking out the IBM Knowledge Center.
Figure 4: CPU usage under heavy ingest traffic
Figure 5 illustrates the final solution design for the example described here. This organization’s requirement was to ingest 300 TB/day using the OpenStack Swift API. To handle this, my team installed two 40GbE NICs in each protocol node and configured network bonding. With this design, the aggregated network throughput of the five protocol nodes can reach up to 200 Gbps. Between the protocol nodes and the ESS systems, we placed 40GbE switches. Using a network-centric design helps eliminate potential bottlenecks in the architecture so you can utilize the ESS at its full speed.
Figure 5: Network design
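As a rough sanity check on this sizing, the daily ingest target can be converted into an average line rate. The sketch below assumes decimal terabytes and traffic spread evenly over 24 hours across all five protocol nodes; real ingest is bursty, so peak demand will be higher than this average. The function and variable names are my own.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def required_gbps(tb_per_day):
    """Convert a daily ingest volume (decimal TB) to an average rate in Gbps."""
    bytes_per_second = tb_per_day * 1e12 / SECONDS_PER_DAY
    return bytes_per_second * 8 / 1e9


average_gbps = required_gbps(300)   # the 300 TB/day requirement
per_node_gbps = average_gbps / 5    # five protocol nodes
headroom = 200 / average_gbps       # against the 200 Gbps aggregate
print(f"average ingest: {average_gbps:.1f} Gbps")
print(f"per protocol node: {per_node_gbps:.1f} Gbps")
print(f"aggregate capacity is {headroom:.1f}x the average")
```

The gap between the average rate and the aggregate capacity is the headroom available for bursts and for traffic between the protocol nodes and the ESS systems.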
Based on my experience, integration of ESS and OpenStack Swift can create added value in that it:
- Provides a proven and high-performing solution for petabyte-scale data
- Minimizes development costs by unifying file and object services
- Uses industry standards, which keeps you from being locked into proprietary solutions
The project also taught me a lot about performance optimization of OpenStack Swift, but I’ll save that for another post. I hope these lessons prove helpful if you find yourself facing a similar situation!
For more on Spectrum Scale, see “IBM Spectrum Scale Performance and Sizing Update” by Sven Oehme.
Note 1: Other use cases of IBM Spectrum Scale with unified file and object access:
- In-place analytics: In this use case, user data is uploaded via cloud API and stored as objects. This is a common pattern used on cloud platforms. Then, data scientists access the collected data as normal files and execute analytics to draw insights. If you use analytics based on Hadoop, the unified feature of Spectrum Scale can provide higher efficiency.
- Migration to cloud: In this scenario, user data is stored as normal files, and users run legacy applications on IBM Spectrum Scale. While legacy applications are maintained, new applications are developed and deployed using cloud APIs. The new applications can use Spectrum Scale as a common platform that facilitates the transition to cloud services.