Building an open seamless science cloud

A Q&A with Ezra Silvera

Ezra Silvera is an IBM researcher who has spent more than a decade tackling the challenge of massive-scale data in the cloud’s virtual machines. In plain English, he tries to get groups of computers to work together as one virtual machine and process petabytes of data, without anyone knowing these computers are actually sitting continents apart.

Recently, Ezra presented IBM’s vision for the design of the Helix Nebula Science Cloud. IBM is one of the four finalists in this effort, and its vision for a sustainable science cloud must be capable of serving Europe’s biggest research centers, including CERN, EMBL, ESA, and PIC.

When did you start getting interested in cloud technologies?

Ezra Silvera: For the last 12 years, my research has focused on areas related to system management and virtualization, including network storage and cloud computing. I’ve been heavily involved in projects around OpenStack and, more recently, in work on containers and Docker.

Ezra Silvera, Staff Member at IBM Research – Haifa

Tell us about the Helix Nebula Science Cloud

EZ: Helix Nebula is a new, pioneering partnership between leading IT providers and some of Europe’s biggest research centers, CERN, EMBL, ESA and PIC, to chart a course towards sustainable cloud services for the research communities. This effort is known as the European Open Science Cloud (EOSC) initiative. The vision of the EOSC is to offer Europe’s 1.7 million researchers and 70 million science and technology professionals a virtual environment with open and seamless services for storage, management, analysis, and re-use of research data across borders and scientific disciplines, free at the point of use.

The scientists at CERN and similar research organizations need an infrastructure that is built specifically for their calculations, with computers connected for high performance. Their experiments run jobs that can take days or even months, processing enormous amounts of data.

For example, the CERN particle accelerator generates more than 25 petabytes (25,000,000,000,000,000 bytes or 25,000 terabytes) of new data per year. The data is then distributed to more than 100 data centers across the globe for further analysis.
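To make that scale concrete, the unit arithmetic can be sketched in a few lines of Python. This is only an illustration: it assumes decimal (SI) prefixes, and the names `TB`, `PB`, `yearly_bytes`, and `to_terabytes` are introduced here for the example.

```python
# Decimal (SI) storage units. Binary units (TiB, PiB) would give
# slightly different figures; SI is an assumption made for this sketch.
TB = 10**12  # bytes in one terabyte
PB = 10**15  # bytes in one petabyte

# The "more than 25 petabytes per year" figure, taken as exactly 25 PB.
yearly_bytes = 25 * PB


def to_terabytes(n_bytes: int) -> float:
    """Convert a byte count to terabytes (SI)."""
    return n_bytes / TB


print(f"{yearly_bytes:,} bytes per year")
print(f"{to_terabytes(yearly_bytes):,.0f} TB per year")
print(f"about {to_terabytes(yearly_bytes) / 365:.1f} TB per day on average")
```

Spread evenly over a year, that volume works out to tens of terabytes of new data every day.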

Personally, I find it fascinating to work on super-challenging problems together with real client needs. The fact that these clients are top EU research institutes makes the project especially interesting.

What unique requirements make this science cloud so challenging?

EZ: These research institutions want to move to an infrastructure that takes advantage of new cloud trends such as the ‘pay as you use’ model, high performance computing in the cloud, and the ability to elastically use unlimited resources as needed by the data.

The data being used by these organizations is varied and massive, touching areas like the human genome, astrophysics, physics, and more. It’s not just a matter of opening more VMs in the cloud; it involves creating a system that can be reconfigured on the fly to meet different needs, as the computing jobs change.

The Helix Nebula Science Cloud defined four challenge areas:

  • Transparent data access, meaning that jobs running on virtual machines in the cloud should ‘feel’ as though they are running in the on-premises data center.
  • An innovative pricing model that can handle new approaches such as spot instances, resource auctions, or scheduling certain jobs when prices are lower.
  • Dedicated communication lines for the research institutes.
  • Identity management, so that scientists in the various organizations keep their existing accounts and access privileges, only now in the cloud.
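As a rough illustration of the pricing idea of scheduling jobs when prices are lower, the sketch below picks the cheapest contiguous window of hours for a job of known length. It is not how Helix Nebula or any provider implements pricing: the function `cheapest_start` and the hourly price list are hypothetical, invented for this example.

```python
from typing import Sequence


def cheapest_start(prices: Sequence[float], duration: int) -> tuple[int, float]:
    """Return (start_hour, total_cost) of the cheapest contiguous window.

    `prices` holds one forecast price per hour and `duration` is the
    number of consecutive hours the job needs. Both are assumptions of
    this sketch.
    """
    if duration <= 0 or duration > len(prices):
        raise ValueError("duration must be between 1 and len(prices)")

    # Cost of the first window, then slide it one hour at a time.
    window = sum(prices[:duration])
    best_start, best_cost = 0, window
    for start in range(1, len(prices) - duration + 1):
        # Add the hour entering the window, drop the hour leaving it.
        window += prices[start + duration - 1] - prices[start - 1]
        if window < best_cost:
            best_start, best_cost = start, window
    return best_start, best_cost


# Made-up hourly prices, in arbitrary currency units.
hourly_prices = [5, 4, 2, 1, 1, 3, 6, 7]
start, cost = cheapest_start(hourly_prices, duration=3)
print(f"start at hour {start}, total cost {cost}")
```

A real scheduler would also have to weigh deadlines, data locality, and the risk of a spot instance being reclaimed; this sketch only captures the "wait for cheaper hours" part.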

In some ways, this is really the pinnacle of cloud and information technology. I consider it a privilege to participate in this initiative.
