Planning for Microsoft Azure cloud
Learn about prerequisites, configurations, and other important planning information you must consider before the deployment of an IBM Storage Scale cluster on Azure.
- Preparing the Azure environment to deploy the IBM Storage Scale cluster.
- Planning the virtual network (VNET) architecture.
- Planning for domain name system (DNS).
- Planning for bastion.
- Creating an IBM Storage Scale virtual machine (VM) image in advance.
- Planning for the IBM Storage Scale deployment architecture on Azure.
- Planning for IBM Storage Scale cluster deployment profiles.
- Determining your performance, scalability, data availability, and data protection requirements.
- Planning for encryption at rest.
Preparing the Azure environment
- Create a service principal by using the az ad sp create-for-rbac command. To display the credentials, issue the following command:
az ad sp create-for-rbac --query "{ client_id: appId, client_secret: password, tenant_id: tenant }"
The following output is an example of the previous command:
{
  "client_id": "00000000-0000-0000-0000-34XXXXXXX",
  "client_secret": "00000000-0000-0000-0000-50XXXXXXX",
  "tenant_id": "00000000-0000-0000-0000-40XXXXXXX"
}
To authenticate on Azure, you also need your Azure subscription ID. Issue the az account show command to display this ID:
az account show --query "{ subscription_id: id }"
The following output is an example of the previous command:
{ "subscription_id": "00000000-aea2-4177-a0f1-7117aXXXX" }
Remember: Make a note of the following values because they are required when you configure cloudkit:
- client_id
- client_secret
- tenant_id
- subscription_id
- Make sure to assign at least the Contributor role. This role is required for the Azure service principal to run cloudkit.
- Verify the Azure quota limits and make sure that enough quota exists for the Azure instance types that you intend to deploy. To check the quota limits for Azure instances, see Azure Quotas documentation.
- Prepare the installer node where the cloudkit is meant to run. For more information, see Preparing the installer node.
- Optionally, create a resource group that holds all Azure resources that are specific to the IBM Storage Scale cluster. For more information, see Create resource groups.
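As a quick sanity check before cloudkit is configured, the four values noted earlier in this section can be collected from the two az outputs and checked for completeness. The following Python sketch is not part of cloudkit. The function name merge_credentials and the sample strings are illustrative assumptions; the JSON keys are the ones produced by the --query expressions shown in this section.

```python
import json

# Values that this section says are required when cloudkit is configured.
REQUIRED_KEYS = ("client_id", "client_secret", "tenant_id", "subscription_id")


def merge_credentials(sp_json: str, account_json: str) -> dict:
    """Merge the JSON printed by the two az commands and check completeness.

    sp_json      -- output of `az ad sp create-for-rbac --query ...`
    account_json -- output of `az account show --query ...`
    (merge_credentials is a hypothetical helper, not a cloudkit function.)
    """
    merged = {**json.loads(sp_json), **json.loads(account_json)}
    missing = [key for key in REQUIRED_KEYS if not merged.get(key)]
    if missing:
        raise ValueError("missing values: " + ", ".join(missing))
    return {key: merged[key] for key in REQUIRED_KEYS}


if __name__ == "__main__":
    # Placeholder values copied from the example outputs in this section.
    sp = ('{"client_id": "00000000-0000-0000-0000-34XXXXXXX", '
          '"client_secret": "00000000-0000-0000-0000-50XXXXXXX", '
          '"tenant_id": "00000000-0000-0000-0000-40XXXXXXX"}')
    account = '{"subscription_id": "00000000-aea2-4177-a0f1-7117aXXXX"}'
    print(sorted(merge_credentials(sp, account)))
```

Because client_secret is sensitive, a real script would avoid printing or logging the values themselves; the example prints only the key names.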
Planning the virtual network (VNET) architecture for Azure
When resources are being deployed in the cloud, the cloudkit can either create a new virtual network (VNET) and provision the resources into it or use a previously created VNET.
When the cloudkit creates a new VNET, the cloudkit designs your network infrastructure from scratch. It chooses the subnets, network address ranges, and security groups that best suit the IBM Storage Scale deployment.
A VNET can be created by using the cloudkit create network command, which creates only the VNET. Alternatively, the cloudkit create cluster command can create the VNET as part of the cluster deployment:
? VPC Mode:[Use arrows to move, type to filter]
> New
Existing
- Deploy IBM Storage Scale into a new VNET with a single availability zone
- This option builds a new Azure environment that consists of the VNET, private and public subnets, network security groups, application security groups, security rules, bastion hosts, and other infrastructure components; then, it deploys IBM Storage Scale into this new VNET with a single availability zone.
- Deploy IBM Storage Scale into an existing VNET
- This option provisions IBM Storage Scale in your existing Azure infrastructure.
- Mandatory: Private subnets (with an allocatable IP address range) in the availability region, with a minimum of one private subnet per availability region.
- Optional: A public subnet, which is required only in either of the following cases:
  - The cluster is to be accessed through a jump host. For more information, see Other considerations.
  - An IBM Storage Scale VM image needs to be precreated. For more information, see Planning for Microsoft Azure by precreating a VM image of IBM Storage Scale.
- Optional: A cloud NAT attached to the private subnet, which is required only when the IBM Storage Scale instances need access to the internet (for example, for operating system updates and security patches).
- If you configured an Azure Private DNS, you must provide the Azure Private DNS information during the cluster deployment.
? Do you wish to use Azure Cloud DNS: Yes
- Make sure to select the resource group into which the IBM Storage Scale cluster is to be deployed.
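When an existing VNET is used, it can help to confirm in advance that a private subnet lies inside the VNET address space and has enough allocatable addresses for the planned nodes. The following Python sketch illustrates one way to do this with the standard ipaddress module. The address ranges, the node count, and the function names are hypothetical; the only Azure-specific fact used is that Azure reserves five addresses in every subnet.

```python
import ipaddress

# Azure reserves five addresses in every subnet (network address, default
# gateway, two addresses for Azure DNS, and the broadcast address).
AZURE_RESERVED_PER_SUBNET = 5


def allocatable_addresses(subnet_cidr: str) -> int:
    """Number of addresses in the subnet that can be assigned to VMs."""
    network = ipaddress.ip_network(subnet_cidr)
    return max(network.num_addresses - AZURE_RESERVED_PER_SUBNET, 0)


def check_private_subnet(vnet_cidr: str, subnet_cidr: str, planned_nodes: int) -> list:
    """Return a list of problems; an empty list means the plan looks consistent."""
    problems = []
    vnet = ipaddress.ip_network(vnet_cidr)
    subnet = ipaddress.ip_network(subnet_cidr)
    if not subnet.subnet_of(vnet):
        problems.append(f"{subnet} is not inside the VNET range {vnet}")
    free = allocatable_addresses(subnet_cidr)
    if planned_nodes > free:
        problems.append(f"{subnet} offers {free} addresses, {planned_nodes} are planned")
    return problems


if __name__ == "__main__":
    # Hypothetical plan: 64 storage plus 65 compute instances in one subnet.
    print(check_private_subnet("10.0.0.0/16", "10.0.1.0/24", 64 + 65))
```

The check ignores addresses that are already in use in the subnet, so it gives an upper bound rather than the exact free capacity.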
Planning for domain name system (DNS)
To facilitate hostname resolution, use the ./cloudkit create dns command to create or associate a DNS.
? Do you wish to use an existing DNS zone: (y/N)
E: No existing DNS zones found. In order to configure, use 'cloudkit create dns'.
Planning for bastion
Use the cloudkit create jumphost command to create a jump host or bastion in the public subnet.
? Bastion OS: RHEL-HA | RedHat | 8_8-gen2 | 8.8.2023121916 | azureuser
? Bastion instance type: Standard_B2s | vCPU(2) | RAM (4.0 GB)
? Do you wish to continue by creation of jumphost ssh key pair: Yes
I: Bastion instance(s) private key path: /user/clouduser/bastion_ssh_keys/id_rsa
? Bastion CIDR allow list: xxx.xxx.xxx.20/32
Standard_B2s | vCPU(2) | RAM (4.0 GB)
Standard_B2s_v2 | vCPU(2) | RAM (8.0 GB)
Standard_B4s_v2 | vCPU(4) | RAM (16.0 GB)
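The Bastion CIDR allow list prompt shown above takes entries in CIDR notation, where a single client address is written with a /32 suffix. As a small illustration, the following Python sketch normalizes an address into that form and rejects malformed entries. The helper name to_allow_list_entry and the sample addresses (taken from the documentation-only ranges 203.0.113.0/24 and 198.51.100.0/24) are assumptions and are not part of cloudkit.

```python
import ipaddress


def to_allow_list_entry(address: str) -> str:
    """Return the address in CIDR notation; a bare IPv4 host becomes a /32.

    strict=True makes ip_network raise ValueError when host bits are set,
    for example "203.0.113.20/24", which is usually a typing mistake.
    """
    return str(ipaddress.ip_network(address, strict=True))


if __name__ == "__main__":
    print(to_allow_list_entry("203.0.113.20"))     # single workstation
    print(to_allow_list_entry("198.51.100.0/24"))  # whole office range
```

Keeping the allow list as narrow as possible, ideally single /32 entries, limits who can reach the jump host in the public subnet.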
Planning for Microsoft Azure by precreating a VM image of IBM Storage Scale
When the cloudkit is used to create an IBM Storage Scale cluster by using the cloudkit create cluster command, it can either create a stock VM image automatically or use a custom image that the customer created previously.
The cloudkit can automatically create an image by using Red Hat Enterprise Linux (RHEL) 8.8 or RHEL 9.2. When the option to use a stock image is chosen, the cloudkit automatically performs all required actions to create an IBM Storage Scale image and create the IBM Storage Scale cluster.
A custom image can be created by using the cloudkit create image command. The create image command can either accept an existing image as input or create an image from a Red Hat operating system image.
- A Red Hat version that is supported by IBM Storage Scale.
- This is an optional step. For any customer applications already preinstalled, the cloudkit create image command installs all required IBM Storage Scale packages on top of the existing image.
- Up to two virtual machine images.
- An image that is a logical volume management (LVM) partition.
Planning for the IBM Storage Scale deployment architecture on Azure
Before the IBM Storage Scale cluster can be created on the cloud, the deployment architecture must be planned.
- Combined-compute-storage
- A unified IBM Storage Scale cluster with both storage and compute nodes. It is recommended that any customer workload runs only on the compute nodes.
- Compute-only
- An IBM Storage Scale cluster that consists only of compute nodes. This cluster does not have any local file system and therefore must remote mount the file system from an IBM Storage Scale storage cluster.
- Storage-only
- This IBM Storage Scale cluster consists only of storage nodes that have access to storage where the file system is created. This file system is remotely mounted to any number of compute-only IBM Storage Scale clusters.
When you run the cloudkit create cluster command, the following prompt asks the user to select from among the different deployment models that are offered:
? IBM Spectrum Scale deployment model:[Use arrows to move, type to filter, ? for more help]
> Storage-only
Compute-only
Combined-compute-storage
Deployment purpose
The deployment purpose is applicable only to storage clusters.
- Non-Production
- This profile lists instance types that are experimental (best suited for proof of concepts) and may require tuning for running scale-suited workloads.
- Production
- This profile lists instance types that are well suited for scale workloads. Optimal tuning parameters are calculated and set during the deployment.
? IBM Storage Scale deployment model: Storage-only
? Deployment purpose: [Use arrows to move, type to filter]
Non-Production
> Production
IBM Storage Scale cluster resource group
The Azure resource group plays an important role during the deployment of an IBM Storage Scale cluster because cluster deployment depends on resources such as the VNET and DNS, which are tied to a single resource group. When you deploy an IBM Storage Scale cluster, select the same resource group so that all the dependent resources are accessible.
The cloudkit provides the option to choose among the existing resource groups when you deploy an IBM Storage Scale cluster on Azure.
When you run the cloudkit create cluster command, the following prompt asks the user to select from among the available resource groups:
? Resource Group: scale-cluster
IBM Storage Scale cluster deployment profiles
- Throughput-Performance-Persistent-Storage
- This profile uses a single availability zone with persistent storage, which means that the file system device retains data after the instance is stopped. In this mode, cloudkit calculates the number of storage nodes based on the provided file system capacity. The number of storage instances is limited to 64; compute instances are limited to 65.
Important: When you choose this profile, the file system is configured in such a manner that the data is not replicated; only the metadata is replicated across the instances.
- Throughput-Performance-Scratch-Storage
- This profile uses a single availability zone and a proximity placement group, which means that it packs instances close together inside an availability zone. This strategy enables workloads to achieve the low-latency network performance necessary for tightly coupled node-to-node communication that is typical of high-performance computing (HPC) applications. The profile uses instance (temporary) storage, meaning that the file system device loses data after the instance is stopped. In this mode, the number of storage instances is limited to 10 and the number of compute instances is limited to 65.
Important: This profile uses temporary storage (NVMe), which offers high performance and low latency for data-intensive workloads. However, the data that is stored in this mode is volatile and can be lost if the instance is stopped or terminated. Therefore, it is recommended to take frequent backups. This profile must not be used for long-term storage or if the data is not backed up elsewhere.
- Throughput-Advance-Persistent-Storage
- This profile uses a single availability zone with persistent storage, which means that the file system device retains data after the instance is stopped. This mode is meant for storage capacity rather than performance. The number of storage instances is limited to 64; compute instances are limited to 65. This profile offers these disk types to choose from: Standard_LRS, StandardSSD_LRS, Premium_LRS, and PremiumV2_LRS.
Important: When you choose this profile, the file system is configured in such a manner that the data is not replicated; only the metadata is replicated across the instances.
- Balanced
- This profile uses multiple (3) availability zones to deploy the IBM Storage Scale cluster into. In this mode, the number of storage instances is limited to 64. It offers a choice of disk types: Standard_LRS, StandardSSD_LRS, Premium_LRS, and PremiumV2_LRS.
Important: The IBM Storage Scale file system is configured in such a way that data and metadata are replicated across availability zones. Instances are spread across the first two availability zones that are specified in the selection, and the tie-breaker instance is provisioned in the third availability zone that is specified during the selection.
? Tuning profile: [Use arrows to move, type to filter, ? for more help]
Throughput-Performance-Scratch-Storage
> Throughput-Performance-Persistent-Storage
Throughput-Advance-Persistent-Storage
Balanced
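The instance limits of the four profiles described above can be captured in a small lookup table so that a planned cluster size is checked before deployment. The following Python sketch uses only the limits stated in this section. The names PROFILE_LIMITS and check_node_counts are illustrative and not part of cloudkit, and because no compute limit is stated for the Balanced profile, the sketch leaves it unchecked.

```python
# (maximum storage instances, maximum compute instances) per profile,
# as stated in this section. None means that no limit is stated here.
PROFILE_LIMITS = {
    "Throughput-Performance-Persistent-Storage": (64, 65),
    "Throughput-Performance-Scratch-Storage": (10, 65),
    "Throughput-Advance-Persistent-Storage": (64, 65),
    "Balanced": (64, None),
}


def check_node_counts(profile: str, storage_nodes: int, compute_nodes: int) -> list:
    """Return a list of limit violations; an empty list means the plan fits."""
    max_storage, max_compute = PROFILE_LIMITS[profile]
    problems = []
    if storage_nodes > max_storage:
        problems.append(f"storage instances: {storage_nodes} > {max_storage}")
    if max_compute is not None and compute_nodes > max_compute:
        problems.append(f"compute instances: {compute_nodes} > {max_compute}")
    return problems


if __name__ == "__main__":
    # Hypothetical plan that exceeds the scratch profile's storage limit.
    print(check_node_counts("Throughput-Performance-Scratch-Storage", 12, 8))
```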
Determining your performance, scalability, data availability, and data protection requirements
Before the deployment of an IBM Storage Scale cluster, it is important to understand your requirements in terms of performance, scalability, data availability, and data protection. These criteria determine which Azure instance types to use for the storage nodes and compute nodes, and which disk types to use.
The cloudkit provides the following choices for VM instance types:
For IBM Storage Scale compute nodes, all the VM instance types are supported, according to their availability per region.
For IBM Storage Scale storage nodes, the support depends on the profile, as described in the following list.
- For the Throughput-Performance-Persistent-Storage and Balanced profiles, the following VM instance types are supported.
Standard_F8s_v2 | vCPU(8) | RAM (16.0 GB) | Max network bandwidth (12500 Mbps)
Standard_F16s_v2 | vCPU(16) | RAM (32.0 GB) | Max network bandwidth (12500 Mbps)
Standard_F32s_v2 | vCPU(32) | RAM (64.0 GB) | Max network bandwidth (16000 Mbps)
- For the Throughput-Performance-Scratch-Storage profile, the following VM instance types are supported.
Standard_L8s_v3 | vCPU(8) | RAM (64) | Instance Storage (1 x 1.92 TB NVMe Disk) | Expected network bandwidth (12500 Mbps)
Standard_L16s_v3 | vCPU(16) | RAM (128) | Instance Storage (2 x 1.92 TB NVMe Disk) | Expected network bandwidth (12500 Mbps)
Standard_L32s_v3 | vCPU(32) | RAM (256) | Instance Storage (4 x 1.92 TB NVMe Disk) | Expected network bandwidth (16000 Mbps)
Standard_L48s_v3 | vCPU(48) | RAM (384) | Instance Storage (6 x 1.92 TB NVMe Disk) | Expected network bandwidth (24000 Mbps)
Standard_L64s_v3 | vCPU(64) | RAM (512) | Instance Storage (8 x 1.92 TB NVMe Disk) | Expected network bandwidth (30000 Mbps)
Standard_L80s_v3 | vCPU(80) | RAM (640) | Instance Storage (10 x 1.92 TB NVMe Disk) | Expected network bandwidth (32000 Mbps)
- For the Throughput-Advance-Persistent-Storage profile, the following VM instance types are supported.
Standard_F8s_v2 | vCPU(8) | RAM (16.0 GB) | Max network bandwidth (12500 Mbps)
Standard_F16s_v2 | vCPU(16) | RAM (32.0 GB) | Max network bandwidth (12500 Mbps)
Standard_F32s_v2 | vCPU(32) | RAM (64.0 GB) | Max network bandwidth (16000 Mbps)
For more information about choosing instance types, see Virtual Machine series in Azure documentation.
? Disk type: [Use arrows to move, type to filter, ? for more help]
> Standard_LRS
StandardSSD_LRS
Premium_LRS
PremiumV2_LRS
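For the Throughput-Performance-Scratch-Storage profile, the instance storage figures listed in this section give a rough upper bound on capacity. The following Python sketch multiplies the NVMe disks per instance by the number of storage instances, using only the figures from this section (1.92 TB per disk and at most 10 storage instances). The function name is hypothetical, and the result is raw device capacity, not the usable file system capacity that cloudkit would report.

```python
# NVMe disks per instance type, as listed for the scratch profile.
NVME_DISKS_PER_INSTANCE = {
    "Standard_L8s_v3": 1,
    "Standard_L16s_v3": 2,
    "Standard_L32s_v3": 4,
    "Standard_L48s_v3": 6,
    "Standard_L64s_v3": 8,
    "Standard_L80s_v3": 10,
}
NVME_DISK_TB = 1.92                 # size of one NVMe disk, from the listing
MAX_SCRATCH_STORAGE_INSTANCES = 10  # storage instance limit of the profile


def raw_scratch_capacity_tb(instance_type: str, storage_instances: int) -> float:
    """Raw NVMe capacity in TB for a scratch-profile storage tier."""
    if not 1 <= storage_instances <= MAX_SCRATCH_STORAGE_INSTANCES:
        raise ValueError("the scratch profile allows 1 to 10 storage instances")
    disks = NVME_DISKS_PER_INSTANCE[instance_type]
    return disks * NVME_DISK_TB * storage_instances


if __name__ == "__main__":
    for name in NVME_DISKS_PER_INSTANCE:
        print(name, round(raw_scratch_capacity_tb(name, 10), 2), "TB raw at 10 instances")
```

Comparing this bound with the required file system capacity shows early whether the scratch profile can hold the working set at all or whether a persistent-storage profile is the better fit.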
Planning for encryption at rest
The cloudkit offers a simplified way to enable encryption of the disks that are used by IBM Storage Scale instances.
- Platform-managed encryption key. No configuration is required. By default, disks are automatically encrypted at rest with platform-managed keys.
- Customer-managed encryption key (CMEK). Managed through Azure Key Vault.
To enable encryption at rest, you need Key Vault Crypto Service Encryption permissions, which rely on an encryption set. Key vault and key creation steps are out of scope.
- Azure virtual machines for deployment
- Azure Resource Manager for template deployment
- Azure Disk Encryption for volume encryption
? Data stored in boot and data disk(s) are encrypted automatically. Select an encryption key management solution: [Use arrows to move, type to filter]
> Platform-managed-encryption-key
Customer-managed-encryption-key
You must select a solution for encryption key management.
- If the key used for encrypting IBM Storage Scale disk volumes is deleted, data cannot be retrieved.
- An invalid key, or a user that is not configured to run cloudkit and therefore lacks permission to read the key, causes failures.
The Platform-managed-encryption-key solution does not require any further configuration.
- In these prompts, input a key name for the Azure Key Vault:
? Data stored in boot and data disk(s) are encrypted automatically. Select an encryption key management solution: Customer-managed-encryption-key
? Cloud Vault Name: scalettest1
? Customer-managed encryption key (CMEK): scalekey1
- When the cluster deployment configuration is storage only, both data and boot volumes are encrypted.
- When the cluster deployment configuration is compute only, only the boot volumes are encrypted.
Limitations
- Both boot and data-related volumes are encrypted.
- All data and boot volumes are encrypted by using a single key.
- Temporary disks or scratch storage disks are not encrypted when Customer-managed-encryption-key is used.
- The Azure Key Vault must use the same resource group that is used for IBM Storage Scale cluster creation.
- You cannot use the cloudkit command in the Azure Cloud VM.