Open Source

Ocean Tensor Package: General-Purpose Software for Tensor Computation

Recent years have brought a tremendous proliferation of hardware acceleration and computation devices such as GPUs and FPGAs to address the ever-increasing need for computational power. Deep learning, which requires the processing of large volumes of data through computationally intensive neural networks, has both been enabled by and driven the development of advanced computational hardware. In conjunction, several tensor-computation packages specialized towards deep learning have been developed to leverage the compute capabilities of these advanced new devices. However, tensor computations are not restricted to deep learning and arise in various other fields, including:

  • Scientific computing
  • Numerical optimization
  • Image and signal processing
  • General machine learning
  • Data science

Current implementations of tensor operations, present in all deep-learning packages, still fall short in one or more respects. For instance, an implementation may lack modularity, or may support only a limited set of data types, with complex numbers often missing. Internally, there may be little or no flexibility in the memory layout of tensors, and the operations used to manipulate tensors may not be available for stand-alone use.

Introducing the Ocean Tensor Package

Given the need for a comprehensive general-purpose tensor package, I developed the Ocean Tensor Package. The Ocean Tensor Package has a modular design that makes it easy to add new functionality, provide support for new and emerging device types, and install packages on a per-need basis. Moreover, the layered implementation makes it possible for users to access functions ranging from low to high level. In particular, the Ocean Tensor Package consists of three layers:

  1. The Solid foundation library, which provides low-level functions that are independent of the higher-level tensor representation;
  2. The Ocean tensor library, which implements the tensor and module infrastructure and provides the high-level tensor APIs; and
  3. A Python interface that provides user-friendly access to all tensor functions, as well as interoperability with existing packages.


The Ocean Tensor Package provides support for various integer, floating-point, and complex data types and supports non-aligned and byteswapped memory layouts. It supports automatic conversion between data types and devices, as well as dimension broadcasting, and can be configured to provide low-level control over all operations. On the GPU, high levels of asynchronicity are enabled by consistent usage of streams and the availability of special intermediate tensors.
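The type-promotion and broadcasting behavior described above is similar in spirit to NumPy's. As a rough analogy only (Ocean's exact casting rules and byte-swap handling may differ), the following NumPy sketch illustrates the kind of semantics involved:

```python
import numpy as np

# Dimension broadcasting: a (3,1) column combines with a (4,)
# row to produce a (3,4) result.
col = np.arange(3, dtype=np.float64).reshape(3, 1)
row = np.arange(4, dtype=np.float32)
result = col + row
assert result.shape == (3, 4)

# Automatic type conversion: mixing float32 and float64 promotes
# the result to the wider type.
assert result.dtype == np.float64

# Byte-swapped memory layouts: operations on non-native-endian
# data produce the same values as on native-endian data.
swapped = col.astype(col.dtype.newbyteorder())
assert np.allclose(swapped + row, result)
```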

Illustrative example

As an example of the flexible but well-defined use of different devices and data types, consider the implementation below of the modified Gram-Schmidt algorithm for QR factorization, in which the byteswapped double-precision Q matrix is updated in-place on the CPU, while the single-precision R matrix is maintained on a GPU device.


The Ocean Tensor Package runs on various platforms, including macOS and Linux, on Intel and Power machines, with or without GPU devices, and is available as open-source software. A preprint of the accompanying paper is also available.

Sample implementation

# Modified Gram-Schmidt QR factorization, updating Q in place
# one column at a time.
def InplaceQR(Q,R) :
   n = Q.size[1]
   for i in range(n) :
      q = Q[:,i]
      r = ocean.sqrt(q.T * q)
      q /= r
      R[i,i] = r
      for j in range(i+1,n) :
         r = q.T * Q[:,j]
         Q[:,j] -= q * r
         R[i,j] = r

# Equivalent vectorized variant: all remaining columns are
# orthogonalized against q in a single block update.
def InplaceQR(Q,R) :
   n = Q.size[1]
   for i in range(n) :
      q = Q[:,i]
      r = ocean.sqrt(q.T * q)
      q /= r
      R[i,i] = r
      if (i+1 < n) :
         r = Q[:,i+1:].T * q
         Q[:,i+1:] -= q * r.T
         R[i,i+1:] = r

import ocean

# Create an example matrix A with one added to the diagonal
# entries to make it full rank.
A = ocean.arange(25, ocean.double).reshape(5,5)
d = A.diag(); d += 1

# As an example, create a byte-swapped copy Q on the cpu and a
# single-precision result tensor R on gpu[0].
Q = A.clone(); Q.byteswap()
R = ocean.zeros(A.size, ocean.float, ocean.gpu[0])

# Call the in-place QR factorization code (see code above)
InplaceQR(Q,R)

# Display matrices, verify orthogonality, and check factorization
print(Q); print(R)
print(ocean.norm(Q.T * Q - ocean.eye(Q.size[0])))
print(ocean.norm(Q*R - A))

# Matrix Q
    0.17961   0.41037   0.58318   0.51237   0.44353
    0.17961   0.77910  -0.59087  -0.07841   0.07392
    0.35921   0.26763   0.51389  -0.66920  -0.29569
    0.53882  -0.05947  -0.06544   0.50915  -0.66530
    0.71842  -0.38658  -0.20594  -0.15575   0.51745

# Matrix R
    5.56776   15.44606   25.50395   35.56185   45.61975
    0.00000    5.42396    9.96772   14.69584   19.42397
    0.00000    0.00000    2.27881    2.87354    3.90708
    0.00000    0.00000    0.00000    1.76914    1.69502
    0.00000    0.00000    0.00000    0.00000    1.55236

# Orthogonality and factorization using single-precision R