A world-class research institution, The University of Queensland (UQ) sought to simplify data capture, storage, analysis, and management for its high-performance computing (HPC) environment. Collaborating with IBM Business Partner Sundata, the University developed a unified data fabric with IBM® Spectrum® Scale software, significantly accelerating image-intensive and AI workloads.
To speed up research collaboration, including for complex, AI-driven projects, UQ needed a storage solution that supports hundreds of terabytes of data generated daily.
UQ built a high-performance data fabric powered by and centrally managed with IBM Spectrum Scale, recently adding an IBM Elastic Storage® System (ESS) solution to support its fastest HPC environment.
How can we use ultrasound technologies so that therapeutic antibodies can overcome the blood-brain barrier and slow Alzheimer’s disease? What can the neural circuits of fruit flies teach us about designing robotic movements? Why does cellular inflammation lead to cancer, and how can we learn more by imaging live cells at nanoscale in real time? Across UQ, creative researchers tackle these and other tough questions, often leading to discoveries that can change the world and people’s lives.
The research teams focused on these questions rely on the University’s fastest GPU-accelerated computer to carry out their cutting-edge work. Designed specifically for imaging-intensive science and AI workloads, this supercomputer, along with other HPC systems at the University, needs extremely fast, scalable, and flexible data storage available anytime, anywhere.
To create a faster path from ingest to insights, the Research Computing Centre (RCC) at UQ sought to deploy a uniform, high-performance storage strategy and architecture to effectively support and manage university-wide data capture and analysis.
The RCC wanted a solution that could not only accommodate exponential growth in data volume, velocity, and variety but also provide rapid data access. Researchers at the University generate structured and unstructured data using a variety of computer systems – from desktops to HPC clusters – and from an enormous range of scientific instruments, such as MRI scanners, optical microscopes, and DNA sequencers, explains Professor David Abramson, Director at the RCC. “Our paradigm around data is to keep one logical copy of it and then render it in many different ways, making the data available when a researcher needs it, where they need it,” he says. While evaluating potential solutions, the RCC also looked for technologies that could expand in line with the University’s needs well into the future.
The RCC built a high-performance data storage fabric known as MeDiCI (Metropolitan Data Caching Infrastructure), powered by and centrally managed with IBM Spectrum Scale. “For researchers to drive innovation, they need to be able to undertake high quality research in a timely, scalable and boundary pushing manner, leveraging cutting-edge research computing infrastructure. Our partnership with IBM helps meet these needs,” explains Jake Carroll, Chief Technology Officer, Research Computing Centre at UQ. “With MeDiCI, researchers and students across the University and at other international institutes can seamlessly work with data stored on any compute cluster at UQ and collaborate.”
“When researchers sit down, they see all of their data. They don't realize it's actually moving across optical wires at blind speed from a remote data center,” says Abramson.
In addition, the MeDiCI ecosystem supports a variety of platforms, instruments, and data. “IBM Spectrum Scale software allows us to unify all of our different silos of storage sources into one integrated, intelligent storage infrastructure and then render the data in whichever protocol is appropriate, resulting in faster analytics and greater resource utility,” says Abramson. MeDiCI also automatically captures project metadata, including users, instruments, and data parameters.
The RCC team continues to evolve the MeDiCI infrastructure, most recently deploying it as a storage solution for UQ HPC Wiener. The goal is to allow researchers to do more in the same timeframe given the increased throughput that the platform provides. “We needed a solution that could not only sustain quite substantial bandwidth from a gigabytes-per-second perspective but also a very high IOPS requirement to support massive amounts of data coming at an unprecedented rate from disk systems and flash storage simultaneously,” explains Carroll.
"We wanted [a hardware platform with] IBM Spectrum Scale because its functionality is pretty close to unique,” explains Carroll. “With the ESS solution, we get all the benefits of a high-speed parallel file system inside a supercomputer with the data management transparency that AFM and other IBM Spectrum Scale features provide. That integration fits into the workflow of our users, and in scientific outputs, workflow is king. That’s why we leverage software-defined storage,” he adds.
With the ESS solution, UQ can support massive data volumes with up to 40 GB per second of throughput and the ability to scale out to exabytes of storage, and its hybrid cloud model provides rapid metadata access. With the IBM Spectrum Scale RAID erasure coding feature, the solution is designed to support high levels of storage reliability, availability, and performance. Combined with AFM, it also enables the RCC to streamline data access within specific project workflows while still maintaining a single, common storage architecture.
The IBM Systems Lab Services and IBM Systems technical sales teams in Australia worked with Sundata and RCC to quickly deploy the ESS GH14S solution on an InfiniBand network and integrate it with the end-to-end MeDiCI IT architecture. The teams worked cohesively and with attention to detail at every stage, implementing the array in five days.
The RCC has recently implemented the IBM Storage Insights offering, a cloud-based storage management and support platform with predictive analytics. It provides the team with more in-depth, cohesive visibility across the entire infrastructure, enabling higher performance through faster issue resolution.
IBM recently placed a new ESS 5000 at UQ for extensive testing and evaluation. Abramson says IBM is partnering with the RCC because the center has developed a reputation for stretching existing technologies.
“We have already demonstrated significant innovation in applying Spectrum Scale at the University. We have been able to provide feedback on how well it works in our environment and where it can be enhanced,” explains Abramson. “I’m very excited to be able to test IBM’s other leading-edge hardware on our most demanding research needs.”
With a uniform data fabric featuring IBM Spectrum Scale technologies such as active file management (AFM) for accessing files across the university, the RCC can optimize researchers’ time and university resources while centralizing data management and controlling IT costs. Across UQ, researchers now have comprehensive compute and storage capabilities to support the creation of massive amounts of data at scale and run complex workloads.
With the expanded bandwidth and IOPS available from the ESS device, research teams that rely on the Wiener HPC system can process data at unprecedented speeds. “Machine learning and AI is front and center with the ESS GH14S empowering how our supercomputer’s GPUs get utilized, enabling researchers to do more in the same timeframe and accelerating time to discovery,” says Carroll. In fact, the new storage array delivered an ROI in just two hours, based on performance improvements that save medical imaging researchers across UQ hundreds of processing hours each week.
At UQ’s Queensland Brain Institute (QBI), for instance, neuroscientists studying Alzheimer’s disease reduced the time required to run their project workload, known as a finite element analysis, by approximately 74 percent, shrinking the run time down to 18.72 hours. With a deeper understanding of ultrasound wave distribution on the human skull, researchers can develop technology needed to overcome the blood-brain barrier for drug delivery. “It’s a very complex undertaking, and it needs an enormous amount of compute power and storage,” explains Carroll.
In another case, QBI and other researchers looking at neural circuits in fruit flies developed genetic methods to label and manipulate individual neuron types. With Wiener, they can rapidly process terabytes of high-speed videos of the tiny insects in motion, measuring precise movements of the antennae, abdomen, and joints on six legs. With new insight into each neuron’s role, they can better understand principles governing complex motor tasks, such as walking and flying behavior.
At UQ’s Institute for Molecular Bioscience, researchers studying cellular inflammation employ lattice light-sheet microscopy to capture high-resolution 4D images of living cellular processes. Viewed using a mathematical modeling process known as deconvolution microscopy, the images provide an unprecedented, real-time look at how cancer forms. The Wiener storage solution helps make this possible, including reducing deconvolution time by more than 70 percent. The RCC saved researchers additional time by building a user-friendly portal for streamlining deconvolution tasks.
“We have to provide the best infrastructure we can to support an enormous range of research endeavors. Given exponential data growth, we also need to achieve economies of scale,” says Carroll. “IBM and Sundata help make that possible.”
For more than a century, The University of Queensland (UQ) has maintained a global reputation for delivering knowledge leadership for a better world. The most prestigious and widely recognized rankings of world universities consistently place UQ among the world’s top universities. UQ has also won more national teaching awards than any other Australian university. This commitment to quality teaching empowers our 53,600 current students, who study across UQ’s three campuses, to create positive change for society. Our research has a global impact, delivered by an interdisciplinary research community of more than 1500 researchers at our six faculties, eight research institutes, and more than 100 research centers.
To learn more about IBM Storage solutions, please contact your IBM representative or IBM Business Partner, or visit the following website: ibm.com/storage
Founded in 1986, IBM Business Partner Sundata helps corporate businesses, governments, and educational institutions align their business strategy with technology. A midsized systems integrator and reseller based in Brisbane, the company provides a wide range of planning, installation, support, and financing services.
© Copyright IBM Corporation 2020. IBM Corporation, IBM Systems Hardware, New Orchard Road, Armonk, NY 10504.
Produced in the United States of America, October 2020.
IBM, the IBM logo, ibm.com, IBM Elastic Storage, and IBM Spectrum are trademarks of International Business Machines Corp., registered in many jurisdictions worldwide. Other product and service names might be trademarks of IBM or other companies. A current list of IBM trademarks is available on the web at “Copyright and trademark information” at www.ibm.com/legal/copytrade.shtml.
This document is current as of the initial date of publication and may be changed by IBM at any time. IBM Business Partners set their own prices, which may vary. Not all offerings are available in every country in which IBM operates.
The performance data and client examples cited are presented for illustrative purposes only. Actual performance results may vary depending on specific configurations and operating conditions. It is the user’s responsibility to evaluate and verify the operation of any other products or programs with IBM products and programs. THE INFORMATION IN THIS DOCUMENT IS PROVIDED “AS IS” WITHOUT ANY WARRANTY, EXPRESS OR IMPLIED, INCLUDING WITHOUT ANY WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND ANY WARRANTY OR CONDITION OF NON-INFRINGEMENT. IBM products are warranted according to the terms and conditions of the agreements under which they are provided.
Actual available storage capacity may be reported for both uncompressed and compressed data and will vary and may be less than stated.
Note: The lead space image in the case study is a stock photo.