Harvard researchers solved infrastructure limitations with IBM’s help
Date: 2025-08-14
At the forefront of responsible AI research, the Calmon Lab at Harvard John A. Paulson School of Engineering and Applied Sciences was tackling one of the most pressing challenges in AI. They were trying to align large language models (LLMs) with human values and safety standards. Their work focused on improving the performance of chain-of-thought (CoT) reasoning in commonly used models, such as DeepSeek-R1 and Llama, by applying inference-time alignment methods.
However, infrastructure limitations hindered their progress. The Harvard cluster was overwhelmed with demand, and running state-of-the-art models required access to several NVIDIA H100 GPUs. The resulting wait times for compute limited the lab’s ability to experiment efficiently on large models and slowed the overall pace of their research.
Run inferences at speeds exceeding 2,000 tokens per second
Train and deploy LLMs without wait times
To overcome these infrastructure constraints, the Calmon Lab partnered with IBM. Using IBM Cloud®, they provisioned two NVIDIA HGX H100 8-GPU servers within a secured virtual private cloud (VPC), each equipped with 640 GB of GPU memory and 2 TB of physical memory. The setup included high input/output operations per second (IOPS) block storage, fast network file sharing and IBM Cloud Object Storage for seamless data transfer.
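As a rough sanity check on those figures, 640 GB across 8 GPUs works out to 80 GB per GPU, and the two servers together offer 1,280 GB of GPU memory. The sketch below reproduces that arithmetic and estimates how much memory a model's weights alone would need. The 70-billion-parameter model size and 16-bit precision are illustrative assumptions, not details of the lab's setup, and the estimate ignores KV cache and activation memory.

```python
# Back-of-the-envelope GPU memory arithmetic for the setup described above.
# Server figures come from the text; the model size and precision used in
# the example are illustrative assumptions.

GPUS_PER_SERVER = 8
GPU_MEMORY_PER_SERVER_GB = 640
SERVERS = 2


def per_gpu_memory_gb() -> float:
    """GPU memory available on each individual GPU."""
    return GPU_MEMORY_PER_SERVER_GB / GPUS_PER_SERVER


def total_gpu_memory_gb() -> int:
    """GPU memory across both servers."""
    return GPU_MEMORY_PER_SERVER_GB * SERVERS


def weight_memory_gb(params_billions: float, bytes_per_param: int = 2) -> float:
    """Memory for model weights only, in decimal GB.

    One billion parameters at one byte each is 1 GB, so the product of
    the two arguments is already in GB. Excludes KV cache and activations.
    """
    return params_billions * bytes_per_param


def fits_on_one_server(params_billions: float, bytes_per_param: int = 2) -> bool:
    """True if the weights alone fit in a single server's GPU memory."""
    return weight_memory_gb(params_billions, bytes_per_param) <= GPU_MEMORY_PER_SERVER_GB


if __name__ == "__main__":
    print(per_gpu_memory_gb())      # 80.0
    print(total_gpu_memory_gb())    # 1280
    print(weight_memory_gb(70))     # 140 (assumed 70B model, 16-bit weights)
    print(fits_on_one_server(70))   # True
```

Under these assumptions, the weights of a 70-billion-parameter model would occupy about 140 GB, which leaves most of a single server's GPU memory free for caches and batching.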
Using Red Hat® Enterprise Linux® 9, the Anaconda platform and virtual large language model (vLLM) for model serving, the lab quickly transitioned to a high-performance environment. Within a week, the team was running inference at over 2,000 tokens per second and training models without delays. This transformation empowered them to explore new frontiers in AI safety, including identifying unproductive reasoning paths and refining model alignment techniques.
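A tokens-per-second figure like the one above is typically obtained by dividing the number of generated tokens by the wall-clock time of the run. The following is a minimal sketch of that measurement, not the lab's actual benchmark: `fake_generate` is a hypothetical stand-in for a real model-serving call, and it is assumed that the serving layer can report how many tokens it produced for each prompt.

```python
import time
from typing import Callable, Sequence


def tokens_per_second(total_tokens: int, elapsed_seconds: float) -> float:
    """Throughput as generated tokens divided by wall-clock seconds."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return total_tokens / elapsed_seconds


def measure_throughput(
    generate: Callable[[Sequence[str]], Sequence[int]],
    prompts: Sequence[str],
) -> float:
    """Time one batched call and return its throughput.

    `generate` is assumed to return the number of tokens generated
    for each prompt in the batch.
    """
    start = time.perf_counter()
    token_counts = generate(prompts)
    elapsed = time.perf_counter() - start
    return tokens_per_second(sum(token_counts), elapsed)


def fake_generate(prompts: Sequence[str]) -> Sequence[int]:
    """Hypothetical stand-in for a model server; not a real API."""
    time.sleep(0.01)  # simulate generation latency
    return [len(p.split()) for p in prompts]


if __name__ == "__main__":
    # 4,000 tokens generated in 2 seconds is 2,000 tokens per second.
    print(tokens_per_second(4000, 2.0))
    print(measure_throughput(fake_generate, ["a b c", "d e"]))
```

Measuring over a whole batch rather than a single request matters here, because batched serving tends to report much higher aggregate throughput than any one request sees.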
With reliable, easy-to-use GPU infrastructure on IBM Cloud, the Calmon Lab saw a substantial improvement in research velocity. They were able to run inference at speeds exceeding 2,000 tokens per second and to train and deploy LLMs without wait times.
IBM continues to support the lab’s mission by providing scalable, secured and high-performance infrastructure—empowering researchers to push the boundaries of trustworthy AI.
Harvard, located in Cambridge, Massachusetts, is a prestigious Ivy League university founded in 1636. It is renowned for its academic excellence and extensive contributions to various fields, including education, research and culture. Harvard serves a diverse student body and offers a wide range of programs across its 12 degree-granting schools. The university is consistently ranked among the top universities globally, reflecting its significant influence and resources.
