Managing storage volumes

You can create and manage connections to storage volumes from IBM Cloud Pak® for Data.

Many enterprise applications use a mounted file system to work with data sets. For example, many Spark jobs process CSV, Parquet, and Avro files that are stored on a POSIX-compliant shared file system that all of the executors can access. In addition, you might need to store source code for Spark jobs or extra packages that your Spark jobs use. In some cases, these additional files must be stored on a mountable, shared file system. You can store these files in a volume instance by creating connections to a storage volume.

What types of storage can you use to create a storage volume?

You can use the following types of storage to create a storage volume in IBM Cloud Pak for Data:

Storage types and their requirements:

External NFS: The NFS server must be accessible from the OpenShift® worker nodes through a low-latency network. The NFS server should also be resilient and highly available.

External SMB: SMB Version 3.0 or later.

To connect to a file share path on a remote SMB server, a cluster administrator must complete Enabling users to connect to external SMB storage volumes.

Existing PVC: To use an existing persistent volume claim (PVC) on the cluster, a cluster administrator must create the persistent volume claims that point to the storage that you want to use.
Restriction: Use persistent volume claims that point to file storage rather than block storage.

Additionally, the cluster administrator must add a label to the persistent volume claims so that Cloud Pak for Data can find them.

To label a persistent volume claim:
  1. Show the list of persistent volume claims:
    oc get pvc
  2. Set the PVC_NAME environment variable to the name of the persistent volume claim that you want to label:
    export PVC_NAME=<pvc-name>
  3. Add the zen_storage_volume_include=true label to the persistent volume claim:
    oc label pvc ${PVC_NAME} zen_storage_volume_include=true
New PVC: To use a new persistent volume claim (PVC) on the cluster, a cluster administrator must set up dynamic storage on the cluster.
Restriction: Use storage classes that provision file storage rather than block storage. If you try to use a storage class that provisions block storage, you might encounter an error when you try to create storage volumes.

Additionally, the cluster administrator should consider whether they want to restrict the list of storage classes that users can choose from. For details, see Restricting the list of storage classes that are available to an instance of Cloud Pak for Data.
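Before creating volumes, a cluster administrator can confirm these prerequisites from the command line. The following is an illustrative sketch (guarded so it is a no-op on a workstation without the oc CLI or a cluster login); the label key comes from the labeling steps above:

```shell
# Label that Cloud Pak for Data uses to discover PVCs for storage volumes.
PVC_LABEL="zen_storage_volume_include=true"

# Guarded so the snippet is safe to run without cluster access.
if command -v oc >/dev/null 2>&1; then
  # Storage classes and their provisioners; for new PVCs, choose one that
  # provisions file storage rather than block storage.
  oc get storageclass || true

  # Existing PVCs that are already labeled for Cloud Pak for Data to find.
  oc get pvc -l "${PVC_LABEL}" || true
fi
```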

Create a Cloud Pak for Data storage volume

Who needs to complete this task?

To complete this task, you must have the Create service instances permission in Cloud Pak for Data. You can check which permissions you have from your profile.

Before you begin

Ensure that you review the requirements in What types of storage can you use to create a storage volume?.

To create a Cloud Pak for Data storage volume:

  1. From the navigation menu, select Administration > Storage volumes.
  2. Click New volume.
  3. Enter required information about the volume and select the Volume type.
    • Namespace: Specify the namespace (Red Hat® OpenShift Container Platform project) where the volume will be created.

      If there are projects tethered to the project where Cloud Pak for Data is installed, you can optionally create the volume in a tethered project. Otherwise, the volume is created in the same project as the Cloud Pak for Data control plane.

    • Name: Enter the name of the volume. Do not include special characters or blanks in the name of the volume.
    • Description: Optionally, enter a description of the volume.
    • Volume type: Select one of the following options:
      External NFS
      • NFS server: Specify the IP address or the fully qualified hostname of the NFS server.
      • Exported path: Specify the exported directory path that is configured on the NFS server. For example, /shared/data.
      • Mount path: Specify the directory path from which users can access the contents of this volume. For convenience, you can use the exported path. For example, /shared/data.
      External SMB
      • SMB file share server: Specify the IP address or the fully qualified hostname of the SMB file share server.
      • SMB file share path: Specify the directory path that is configured for the file share on the SMB server. For example, /shared/data.
      • Active Directory domain (optional): If you use Active Directory domain to manage the SMB file share servers in your network, specify the domain that the server belongs to.
      • Mount path: Specify the directory path by which users can access the contents of this file share. For convenience, you can use the SMB file share path. For example, /shared/data.
      • Username: Enter the username for the file share. The username is used to create a Kubernetes secret so that other users who have access to this storage volume can access the file share but cannot see the credentials.
      • Password: Enter the password for the file share. The password is used to create a Kubernetes secret so that other users who have access to this storage volume can access the file share but cannot see the credentials.
      Existing PVC
      • Existing PVC: Select the PVC that you want to give users access to.
      • Mount path: Specify the directory path from which users can access the contents of this volume. For example, /shared/data.
      New PVC
      • Storage class: Specify a storage class. A cluster administrator can create storage classes to define different types of storage. Work with your cluster administrator to determine which storage class to use.
      • Size in GB: Enter the amount of storage to allocate to this volume. The size is constrained by either the total amount of storage on the storage device or the storage class configuration.
      • Mount path: Specify the directory path from which users can access the contents of this volume. For example, /shared/data.
  4. Click Add.

    After a volume instance is created, you can mount the volume in the appropriate pods in your Cloud Pak for Data deployment. The mount path to the storage volume is prefixed with /mnts/, and you can specify a path within that directory.

Create a storage volume connection

After you create a storage volume, you can create a storage volume connection on the Platform connections page. For more information, see Connecting to data sources.

Manage access to a storage volume

Important: To manage access to a volume, it must be running. Check the status of the volume on the Storage volumes page.

You can specify which users can access the storage volume to ensure that only authorized users can use it.

  1. On the Storage volumes page, click a volume name to open the volume. Then, open the Access control tab.
  2. Click Add users to grant access to users who can access the storage volume.
  3. Select users and choose the role of each user as Editor, Viewer, or Admin.
  4. Click Add.

As the creator of a volume instance, you can remove access to a storage volume.

  • To remove a single user, on the Access control tab, click the Remove icon in the user's row.

    The user will no longer be able to access the volume.

  • To remove multiple users, on the Access control tab, select the users and then click the Remove icon in the toolbar.

    The users will no longer be able to access the volume.

View details about a storage volume

You can view a list of the available storage volumes, the number of users with access to each volume, and the status of each volume.

  1. On the Storage volumes page, click a volume name to see details.
  2. To generate or revoke an API key for this volume, click Instance API key.
  3. To copy the access token for this volume, click the Copy icon.
  4. To regenerate the access token for this volume, click the Regenerate token icon.

You can use the endpoint and the access token of the storage volume in the Volumes APIs. For more information, see Managing persistent volume instances with the Volumes API.
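The shape of such a request can be sketched as follows. The route, volume name, token, and exact path segments are illustrative assumptions; verify them against the Volumes API reference for your release. The command is only echoed (a dry run), not executed:

```shell
# Hypothetical values: replace with your cluster route, the volume name, and
# the access token copied from the volume's details page.
CPD_URL="https://cpd.example.com"
VOLUME="my-volume"
TOKEN="<access-token>"

# Target file path inside the volume, with '/' percent-encoded as %2F
# (here, 'spark-jobs/job.py').
FILE_PATH="spark-jobs%2Fjob.py"

# Assumed URL shape for an upload request; check it against the Volumes API
# documentation for your version before use.
UPLOAD_URL="${CPD_URL}/zen-volumes/${VOLUME}/v1/volumes/files/${FILE_PATH}"

# Dry run: print the request instead of sending it.
echo "curl -X PUT '${UPLOAD_URL}' -H 'Authorization: Bearer ${TOKEN}' -F upFile=@job.py"
```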

Delete a storage volume

You can delete a storage volume. However, by default the data inside the storage volume is not deleted and services can continue to use the volume until a Red Hat OpenShift administrator removes the persistent volume claim to reclaim the storage. The reclaim policy that is specified in the storage class determines what happens when the persistent volume claim is deleted.
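Whether the data survives depends on that reclaim policy. A hedged sketch for checking it (the storage class name is a placeholder, and the oc call is guarded so the snippet is a no-op without cluster access):

```shell
# Placeholder storage class name; substitute the class that backs your volume.
SC_NAME="managed-nfs-storage"

if command -v oc >/dev/null 2>&1; then
  # 'Delete' removes the underlying storage with the PVC; 'Retain' keeps it
  # until an administrator reclaims it manually.
  oc get storageclass "${SC_NAME}" -o jsonpath='{.reclaimPolicy}{"\n"}' || true
fi
```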

Important: By default, if you delete a storage volume, you cannot reuse the same name to create a new storage volume. If you want to reuse a name:
  1. Record the name of the storage volume.
  2. Ask a Red Hat OpenShift administrator to complete Cleaning up storage volume resources.
To delete a storage volume:
  1. From the Storage volumes page, click the Options icon for a volume.
  2. Click Delete. The connection is deleted and all users' access to the volume is removed. Users or applications that connect to this volume can no longer connect.

Browse the storage volume and upload content

You can view the content (files and directories) in a storage volume, add or delete files and directories, upload or download files, and extract the contents of files and directories. You can use:
  • The integrated file browser

    To access the integrated file browser, on the Storage volumes page, click a volume name to open the volume. Then, open the File browser tab.

  • The Volumes API
Guidance for large files: If you are working with large files, you might encounter errors when you upload or download them.

Uploading large files

If you use the integrated file browser, your web browser and network speed affect the size of the files that you can upload and the amount of time that it takes to upload large files.

If a file upload fails, try one of the following options:
  • If you want to use the integrated file browser, compress large files (for example files that are 10 GB or larger) as ZIP files or TAR files before you upload them. You can optionally extract compressed files on upload.
  • Use the Volumes API to upload large files to a volume. The API is not constrained by the file size limits that your web browser imposes and can upload very large files, for example files of 75 GB.

    You can also compress large files before you upload them.
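For example, a directory of job files can be packaged as a single compressed archive before upload (the directory and file names here are stand-ins):

```shell
# Stand-in directory and file; in practice this would hold your large files.
mkdir -p spark-jobs
printf 'print("hello")\n' > spark-jobs/job.py

# Package the directory as one compressed archive; the integrated file
# browser can optionally extract archives like this on upload.
tar -czf spark-jobs.tar.gz spark-jobs

# Inspect the archive contents.
tar -tzf spark-jobs.tar.gz
```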

If you encounter issues when downloading files, or if you continue to have issues uploading files after implementing the preceding recommendations, see Requests time out when you upload or download files from volumes.