With the rapid growth of data, many IT organizations want to use commodity storage hardware as part of their Software Defined Storage infrastructures. IBM General Parallel File System (GPFS) turns commodity local or external storage into elastic, highly available enterprise cloud storage for Cloud and Big Data applications.
Let’s take a broader look at the value IBM GPFS offers to our customers across the world. GPFS is extremely robust after years of supporting the world's largest supercomputers. It is also highly distributed and delivers very high bandwidth, attributes that suit data-intensive IBM InfoSphere BigInsights Hadoop workloads. GPFS can combine any block storage device, including SSD, SAS HDD and NL HDD, into a single shared file system. GPFS storage pools and placement policies can be used to create cost-optimized or performance-optimized storage pools to help you maintain your Service Level Agreements (SLAs).
Here is the demonstration video that shows how GPFS-based file storage can leverage commodity storage for BigInsights Hadoop development and production workloads. We assumed the role of an administrator for a car company that wanted to use BigInsights for analytics. In our environment, we divided our locally attached storage into three storage pools, each providing different performance characteristics. We created a Platinum storage pool that consists of a few ultra-fast SSDs, a Gold storage pool that consists of several NL HDDs and a Bronze storage pool with a few NL SAS HDDs. We set up the GPFS cluster once with our desired placement policies and haven’t had to modify the configuration since.
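To make the idea of placement policies more concrete, here is a minimal Python sketch of how an ordered list of placement rules can route new files to a storage pool, with the first matching rule winning and a final catch-all acting as the default. This is only a model of the concept, not GPFS policy syntax and not the configuration used in the demo. The pool names come from the setup described above; the rule names and file-name patterns are hypothetical.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import Optional, Sequence


@dataclass(frozen=True)
class PlacementRule:
    """One placement rule: files matching `pattern` go to `pool`."""
    name: str
    pool: str
    pattern: Optional[str] = None  # None means "match everything" (default rule)


# Hypothetical rules for illustration only; the patterns are assumptions,
# not taken from the demo environment.
RULES = [
    PlacementRule("hot_indexes", "platinum", "*.idx"),
    PlacementRule("prod_data", "gold", "prod_*"),
    PlacementRule("default", "bronze"),
]


def place(file_name: str, rules: Sequence[PlacementRule] = RULES) -> str:
    """Return the pool chosen by the first rule that matches file_name."""
    for rule in rules:
        if rule.pattern is None or fnmatchcase(file_name, rule.pattern):
            return rule.pool
    raise LookupError(f"no placement rule matches {file_name!r}")


if __name__ == "__main__":
    for name in ("sales.idx", "prod_sales.csv", "scratch.tmp"):
        print(name, "->", place(name))
```

Because the rules are evaluated in order, a file such as `prod_sales.idx` lands in the Platinum pool in this sketch: the index rule is listed before the production-data rule.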
Since the GPFS Cinder driver is already integrated into OpenStack (Havana), we simply used a pattern language to automate the creation of volumes and attach them to each BigInsights data node VM. This was done quickly, without having to work through configuration details by hand. For the initial deployment we chose the Bronze storage pool, but discovered that its storage performance caused us to violate our SLAs. We quickly redeployed onto the faster Gold storage pool and verified that the storage performance was sufficient to maintain our SLAs.
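The decision we made by hand here, moving from Bronze to Gold after measuring performance, can be sketched as a small selection function: pick the least expensive pool whose measured throughput meets the SLA. The throughput figures and the SLA threshold below are made-up placeholders for illustration, not measurements from the demo, and the `Pool` type and function name are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Pool:
    name: str
    cost_rank: int            # lower means cheaper
    measured_mb_per_s: float  # throughput observed for the workload


def cheapest_pool_meeting_sla(pools: Iterable[Pool],
                              required_mb_per_s: float) -> Optional[Pool]:
    """Return the cheapest pool that satisfies the SLA, or None if none does."""
    candidates = [p for p in pools if p.measured_mb_per_s >= required_mb_per_s]
    if not candidates:
        return None
    return min(candidates, key=lambda p: p.cost_rank)


# Placeholder numbers, assumed for illustration only.
POOLS = [
    Pool("bronze", cost_rank=1, measured_mb_per_s=80.0),
    Pool("gold", cost_rank=2, measured_mb_per_s=200.0),
    Pool("platinum", cost_rank=3, measured_mb_per_s=900.0),
]

if __name__ == "__main__":
    choice = cheapest_pool_meeting_sla(POOLS, required_mb_per_s=150.0)
    print(choice.name if choice else "no pool meets the SLA")
```

With these placeholder values the function skips Bronze and settles on Gold, which mirrors the outcome of our redeployment without paying for Platinum.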
GPFS storage can grow as customer needs change over time. If a client needs more capacity, they can add more drives or nodes to the file system. Additionally, GPFS provides enterprise-class availability and protects data in the case of drive or node failure. To learn more, I encourage you to visit the Software Defined Systems community or join us on Twitter @IBMSDE.
Darryl E. Gardner
Hardware Architect - Storage Systems
IBM Systems & Technology Group