Harvard | IBM

Harvard research team faced GPU bottlenecks in AI safety studies

At the forefront of responsible AI research, the Calmon Lab at the Harvard John A. Paulson School of Engineering and Applied Sciences was tackling one of the most pressing challenges in AI: aligning large language models (LLMs) with human values and safety standards. Their work focused on improving the performance of chain-of-thought (CoT) reasoning in commonly used models, such as DeepSeek-R1 and Llama, by applying inference-time alignment methods.

However, their progress was hindered by infrastructure limitations. Running state-of-the-art models required access to several NVIDIA H100 GPUs, and the Harvard cluster was overwhelmed with demand. The resulting wait times significantly limited their ability to experiment efficiently on large models, slowing the overall pace of their research.

2,000+ tokens per second

Run inferences at speeds exceeding 2,000 tokens per second

No wait times

Train and deploy LLMs without wait times

IBM Cloud powered high-speed model training and experimentation

To overcome these infrastructure constraints, the Calmon Lab partnered with IBM. Using IBM Cloud®, they provisioned two NVIDIA HGX H100 8-GPU servers within a secured virtual private cloud (VPC), each server equipped with 640 GB of GPU memory and 2 TB of physical memory. The setup included high input/output operations per second (IOPS) block storage, fast network file sharing and IBM Cloud Object Storage for seamless data transfer.
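To make the hardware figures concrete, the following is a minimal sizing sketch, not the lab's own tooling. It takes the 640 GB of GPU memory per 8-GPU server from the description above and estimates how many such servers are needed to hold a model's weights. The function names, the 70-billion-parameter example and the 2-bytes-per-parameter (16-bit) assumption are illustrative only, and the estimate ignores KV cache, activations and framework overhead.

```python
import math

GB = 10**9  # decimal gigabyte, matching the 640 GB figure in the text

GPU_MEMORY_PER_SERVER_GB = 640.0  # from the text: one HGX H100 8-GPU server
GPUS_PER_SERVER = 8               # from the text


def per_gpu_memory_gb() -> float:
    """GPU memory available on each individual GPU in one server."""
    return GPU_MEMORY_PER_SERVER_GB / GPUS_PER_SERVER


def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Approximate memory for the model weights only, in GB."""
    return num_params * bytes_per_param / GB


def servers_needed(weights_gb: float) -> int:
    """Smallest number of servers whose combined GPU memory holds the weights."""
    return math.ceil(weights_gb / GPU_MEMORY_PER_SERVER_GB)


if __name__ == "__main__":
    # Hypothetical example: a 70-billion-parameter model stored in 16-bit precision.
    weights = weight_memory_gb(70e9, 2)
    print(f"per-GPU memory: {per_gpu_memory_gb():.0f} GB")
    print(f"weights: {weights:.0f} GB, servers needed: {servers_needed(weights)}")
```

A weights-only estimate is a lower bound; real deployments usually need additional headroom for the KV cache and batching.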

Using Red Hat® Enterprise Linux® 9, the Anaconda platform and virtual large language model (vLLM) for model serving, the lab quickly transitioned to a high-performance environment. Within a week, the team was running inference at over 2,000 tokens per second and training models without delays. This transformation empowered them to explore new frontiers in AI safety, including identifying unproductive reasoning paths and refining model alignment techniques.
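A throughput figure such as "tokens per second" is typically the number of generated tokens divided by wall-clock time. The sketch below shows one way this could be measured; it is an assumption about methodology, not a description of how the lab obtained its number. The `generate` callable is a hypothetical stand-in for whatever serving call is used (with vLLM, it would wrap the library's generation call and return the generated token IDs; that wiring is not shown here).

```python
import time
from typing import Callable, List, Sequence


def tokens_per_second(
    generate: Callable[[Sequence[str]], List[List[int]]],
    prompts: Sequence[str],
    clock: Callable[[], float] = time.perf_counter,
) -> float:
    """Time one batched generation call and return generated tokens per second.

    `generate` is a hypothetical callable that returns one list of generated
    token IDs per prompt. `clock` is injectable so the function can be tested.
    """
    start = clock()
    outputs = generate(prompts)
    elapsed = clock() - start
    if elapsed <= 0:
        raise ValueError("elapsed time must be positive")
    total_tokens = sum(len(token_ids) for token_ids in outputs)
    return total_tokens / elapsed


def fake_generate(prompts: Sequence[str]) -> List[List[int]]:
    """Stand-in generator for demonstration: 100 dummy token IDs per prompt."""
    return [[0] * 100 for _ in prompts]


if __name__ == "__main__":
    rate = tokens_per_second(fake_generate, ["example prompt"] * 50)
    print(f"{rate:.0f} tokens/s (dummy generator, not a real benchmark)")
```

Counting only generated tokens, rather than prompt tokens, is a design choice; reported throughput figures differ in which tokens they include, so the convention should be stated alongside any number.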

"As a research team working on the frontiers of information theory and generative AI, we need an increasingly powerful compute infrastructure. Our experience working on IBM Cloud was phenomenal. The capabilities and expertise from IBM helped us to run large models that require large GPU clusters quickly and efficiently, helping us obtain new research results, related to AI safety, in record time."

Professor Flavio du Pin Calmon

Thomas D. Cabot Associate Professor

Harvard School of Engineering and Applied Sciences

Faster insights. Greater impact on AI safety.

Following the transformation, the Calmon Lab experienced a substantial improvement in research velocity resulting from access to reliable and easy-to-use GPU infrastructure on IBM Cloud. They were able to:

Train and deploy LLMs, such as DeepSeek and Llama, on demand, without wait times.
Run inferences at speeds exceeding 2,000 tokens per second.
Collaborate at scale with shared storage and high-speed data transfer.
Achieve new research breakthroughs in model multiplicity and AI safety.

IBM continues to support the lab’s mission by providing scalable, secured and high-performance infrastructure—empowering researchers to push the boundaries of trustworthy AI.

About Harvard

Harvard, located in Cambridge, Massachusetts, is a prestigious Ivy League university founded in 1636. It is renowned for its academic excellence and extensive contributions to various fields, including education, research and culture. Harvard serves a diverse student body and offers a wide range of programs across its 12 degree-granting schools. The university is consistently ranked among the top universities globally, reflecting its significant influence and resources.

Solution components

IBM Cloud®

Red Hat® Enterprise Linux® 9

NVIDIA GPUs on IBM Cloud®

IBM Cloud® Object Storage

Legal

© Copyright IBM Corporation 2025. IBM, the IBM logo, Granite, watsonx.ai, watsonx.data, watsonx.governance, and watsonx Orchestrate are trademarks of IBM Corp., registered in many jurisdictions worldwide.

Examples presented as illustrative only. Actual results will vary based on client configurations and conditions and, therefore, generally expected results cannot be provided.

Accelerating AI safety research with a scalable cloud infrastructure