Local file system
Creating a local file system consists of the following steps:
- Specify device names.
- Label nodes that have a connection to the shared disks.
- Configure file system replication (optional).
- Create a LocalDisk resource for each disk to use for the file system.
- Create a Filesystem resource that references the LocalDisk resources.
Specify device names
Skip this step if the disks that are used have one of the device names sd*, hd*, vd*, scini*, pmem*, nvm*, dm-*, vpath*, dasd*, or emcpower*.
If disks with different device names are used, the devices must be specified in the Cluster custom resource.
apiVersion: scale.spectrum.ibm.com/v1beta1
kind: Cluster
...
spec:
daemon:
nsdDevicesConfig:
localDevicePaths:
- devicePath: /dev/test*
deviceType: generic
Property | Required | Default | Description |
---|---|---|---|
cluster.spec.daemon.nsdDevicesConfig.localDevicePaths.devicePath | Yes | None | The device name of the disks. The device name also allows wildcards. |
cluster.spec.daemon.nsdDevicesConfig.localDevicePaths.deviceType | No | generic | The type of the device. The type is usually generic. Contact IBM if device type generic does not work. |
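If you are not sure which device names your disks use, you can list the block devices on a storage node before deciding whether localDevicePaths must be configured. The following command is only a sketch for Red Hat OpenShift and uses the example node name worker0.example.com; on other Kubernetes distributions, inspect the node in an equivalent way.

oc debug node/worker0.example.com -- chroot /host lsblk -d -o NAME,SIZE,TYPE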
Configure replication (technology preview)
The replication feature is available as a technology preview. Technology preview features are not supported for use within production environments. Use with non-production workloads, in demo or proof of concept environments only. IBM production service level agreements (SLAs) are not supported. Technology preview features might not be functionally complete. The purpose of technology preview features is to give access to new and exciting technologies, enabling customers to test functionality and provide feedback during the development process. The existence of a technology preview feature does not mean a future release is guaranteed. Feedback is welcome and encouraged.
Optionally, the file system data can be replicated between disks by using 2-way or 3-way replication. The disks must be grouped into failure groups. The IBM Storage Scale file system writes the data block replicas to disks with different failure group numbers.
A failure group number can be specified per local disk. For more information, see LocalDisk spec. All disks with the same number belong to the same failure group. The total disk capacity of all failure groups must be equal (a small difference is tolerated). At least 2 failure groups are required for 2-way replication. At least 3 failure groups are required for 3-way replication. If more failure groups are created, IBM Storage Scale writes replicas to disks of different failure groups to fill the disks evenly.
Instead of manually assigning a failure group per disk, the failure groups can also be assigned automatically based on Kubernetes zones. In this case, all nodes with a connection to a disk must be placed in the same zone, and the total capacity of all disks within a zone must be equal across all zones.
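As an illustration only, the following sketch shows how 2-way replication could be configured with the properties that are described later in this topic (failureGroup in the LocalDisk spec and local.replication in the Filesystem spec). The disk, node, and file system names (disk0, disk1, worker0.example.com, replicated-sample) are placeholders; adjust them to your environment and make sure that the total capacity per failure group is equal.

apiVersion: scale.spectrum.ibm.com/v1beta1
kind: LocalDisk
metadata:
  name: disk0
  namespace: ibm-spectrum-scale
spec:
  node: worker0.example.com
  device: /dev/sdb
  failureGroup: "1"   # disks in failure group 1
---
apiVersion: scale.spectrum.ibm.com/v1beta1
kind: LocalDisk
metadata:
  name: disk1
  namespace: ibm-spectrum-scale
spec:
  node: worker0.example.com
  device: /dev/sdc
  failureGroup: "2"   # disks in failure group 2; total capacity must match failure group 1
---
apiVersion: scale.spectrum.ibm.com/v1beta1
kind: Filesystem
metadata:
  name: replicated-sample
  namespace: ibm-spectrum-scale
spec:
  local:
    type: shared
    replication: 2-way   # two replicas, so at least two failure groups are required
    pools:
      - name: system
        disks:
          - disk0
          - disk1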
Create LocalDisk resources
A LocalDisk custom resource must be created for each disk or volume that is used to provide storage for a local file system.
The following steps guide you through creating a LocalDisk resource.
- Download a copy of the sample localdisk.yaml from the GitHub repository:

  curl -fs https://raw.githubusercontent.com/IBM/ibm-spectrum-scale-container-native/v5.2.3.x/generated/scale/cr/localdisk/localdisk.yaml > localdisk.yaml || echo "Failed to download LocalDisk sample CR"

  The sample localdisk.yaml provides a list of sample disks that must be changed. Each LocalDisk resource must have a unique metadata.name.
- Edit the localdisk.yaml file and change the fields that are specific to your installation. For more information, see LocalDisk spec.
- Apply the resource by using the following command:

  kubectl apply -f localdisk.yaml
- View the LocalDisk resources by using the following command:

  kubectl get localdisk -n ibm-spectrum-scale
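Optionally, you can wait until the operator reports the disks as ready before you continue. The following command is a sketch; it assumes that all LocalDisk resources in the ibm-spectrum-scale namespace belong to this file system. For more information about the Ready condition, see LocalDisk status conditions.

kubectl wait localdisk --all -n ibm-spectrum-scale --for=condition=Ready --timeout=30m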
LocalDisk spec
The following table describes the properties for LocalDisk:
Property | Required | Default | Description |
---|---|---|---|
metadata.name | Yes | None | The name of the CR. |
device | Yes | None | The device path of the disk on the specified node (see the node property) at creation time. After successful creation of the local disk, this parameter is no longer used. Note: Device paths of disks can change after a node reboot. |
node | Yes | None | The Kubernetes node where the specified device exists at creation time. |
nodeConnectionSelector | No | None | This node selector selects the nodes that are expected to have a physical connection to the disk. If not specified, by default all nodes are expected to have a physical connection to the disk. At local disk creation time, the operator verifies whether the disk has a connection to the selected nodes. Only nodes that are selected by the cluster.spec.daemon.nodeSelector node selector are considered. For more information on cluster.spec.daemon.nodeSelector, see Cluster spec. |
failureGroup | No | None | The failure group number is only needed if the file system that uses this disk is configured for 2-way or 3-way replication. The replicas of blocks are written to disks with different failure group numbers. If this parameter is not specified, all disks within the same Kubernetes zone automatically have the same failure group number. Even though the failure group is a number greater than or equal to zero, it must be specified in quotation marks, for example "2". This parameter cannot be changed after the disk is used by a file system. For more information, see Configure replication. |
existingDataSkipVerify | No | False | This property controls whether an existing Storage Scale data structure is overwritten when the disk is created. A disk can have Storage Scale data structures on it if it was used in a file system before and was not cleaned up. If false, a "DiskHasFilesystemData" event is displayed if the disk still has Storage Scale data on it, and the disk is not used. If true, the disk is formatted regardless of whether it still has data on it. |
thinDiskType | No | false | The space reclaim disk type of IBM Storage Scale disks. |
In the following example:
apiVersion: scale.spectrum.ibm.com/v1beta1
kind: LocalDisk
metadata:
name: disk0
namespace: ibm-spectrum-scale
spec:
node: worker0.example.com
device: /dev/sdb
nodeConnectionSelector:
matchExpressions:
- key: kubernetes.io/hostname
operator: In
values:
- worker0.example.com
- worker1.example.com
- worker2.example.com
---
apiVersion: scale.spectrum.ibm.com/v1beta1
kind: LocalDisk
metadata:
name: disk1
namespace: ibm-spectrum-scale
spec:
node: worker0.example.com
device: /dev/sdc
nodeConnectionSelector:
matchExpressions:
- key: kubernetes.io/hostname
operator: In
values:
- worker0.example.com
- worker1.example.com
- worker2.example.com
---
apiVersion: scale.spectrum.ibm.com/v1beta1
kind: LocalDisk
metadata:
name: disk2
namespace: ibm-spectrum-scale
spec:
node: worker0.example.com
device: /dev/sdd
nodeConnectionSelector:
matchExpressions:
- key: kubernetes.io/hostname
operator: In
values:
- worker0.example.com
- worker1.example.com
- worker2.example.com
Three local disks with devices /dev/sdb, /dev/sdc, and /dev/sdd on the worker0.example.com Red Hat OpenShift node are created. All three disks are expected to have a connection to the worker0.example.com, worker1.example.com, and worker2.example.com nodes. For each disk, the connection to the worker0.example.com, worker1.example.com, and worker2.example.com nodes is verified after the local disk is created.
After a Red Hat OpenShift node is rebooted, the disks might have different device paths on the node. The different device paths are not a problem because the device parameter only matters at local disk creation time.
Enter the kubectl explain localdisk.spec command to view more details.
LocalDisk status
LocalDisk type
The Type in the local disk status indicates how the disk is connected to the nodes.
Type | Description |
---|---|
unshared | The disk is connected to the node specified in spec.node. The spec.nodeConnectionSelector parameter is not specified. |
shared | The disk is connected to all nodes that are selected by spec.nodeConnectionSelector. If spec.nodeConnectionSelector is not specified, the disk is connected to all nodes that are selected to run a core pod. |
partially-shared | The disk is connected to a subset of the nodes that are selected by spec.nodeConnectionSelector. If spec.nodeConnectionSelector is not specified, the disk is connected to a subset of the nodes that are selected to run a core pod. |
Only file systems with type shared are supported. After the LocalDisk resources are created, they must all display the shared type.
kubectl get localdisk -n ibm-spectrum-scale
NAME TYPE READY USED AVAILABLE FILESYSTEM SIZE AGE
disk0 shared True False False 1.4 TiB 80m
disk1 shared True False False 1.4 TiB 80m
disk2 shared True False False 1.4 TiB 80m
If a disk has type unshared or partially-shared in status, the disk is not connected to all nodes that are selected by spec.nodeConnectionSelector or to all nodes that are selected to run a core pod. In this case, the connection of the disk must be fixed or spec.nodeConnectionSelector must be corrected before the disk can be used for the local file system.
If the file system must be replicated, use the -o wide option to display the failure group numbers.
$ kubectl get localdisk -o wide -n ibm-spectrum-scale
NAME TYPE READY USED AVAILABLE FILESYSTEM POOL FAILUREGROUP SIZE AGE
disk0 shared True False False 0 1.4 TiB 80m
disk1 shared True False False 1 1.4 TiB 80m
disk2 shared True False False 2 1.4 TiB 80m
LocalDisk status conditions
The status Conditions can be viewed as a snapshot of the current and most up-to-date status of a LocalDisk instance.
- The Ready condition is set to True if the LocalDisk successfully created a disk within the IBM Storage Scale cluster and the disk is ready to be used by a local file system.
- The Used condition is set to True if the disk is used within a local file system.
- The Available condition is set to True if the disk within the IBM Storage Scale cluster is operational. The condition has Unknown status if the disk is not used by a local file system.
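To inspect these conditions for a single disk, you can query the status directly. The following command is a sketch that uses the example disk name disk0.

kubectl get localdisk disk0 -n ibm-spectrum-scale -o jsonpath='{range .status.conditions[*]}{.type}={.status}{"\n"}{end}'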
Create Filesystem resource
To configure a local file system in the IBM Storage Scale container native cluster, a Filesystem custom resource (CR) must be defined for each local file system.
A LocalDisk custom resource must be created for each disk or volume before proceeding. For more information, see Create LocalDisk resources.
Samples are provided as a starting point with some defaults already set in the configuration based on the target environment. Use one of the following commands to download a copy of the sample.
Use caution to select the correct curl command based on your environment. The source files are different, but the downloaded files are all named filesystem.local.yaml.
Red Hat OpenShift
Use the following command to download a copy of the sample for Red Hat OpenShift:
curl -fs https://raw.githubusercontent.com/IBM/ibm-spectrum-scale-container-native/v5.2.3.x/generated/scale/cr/filesystem/filesystem.local.yaml > filesystem.local.yaml || echo "Failed to download Filesystem sample CR"
Kubernetes and Google Kubernetes Engine (GKE)
IBM Storage Scale container native deployed in a Google Kubernetes Engine (GKE) is available as a technology preview. For more information, see Support for Google Kubernetes Engine (GKE).
Use the following command to download a copy of the sample for Kubernetes or Google Kubernetes Engine (GKE):
curl -fs https://raw.githubusercontent.com/IBM/ibm-spectrum-scale-container-native/v5.2.3.x/generated/scale/cr/filesystem/filesystem.local-kubernetes.yaml > filesystem.local.yaml || echo "Failed to download Filesystem sample CR"
Apply the local file system resource
- Edit the filesystem.local.yaml file to change the fields that are specific to your installation. For more information, see Filesystem spec.
- After you complete the changes, use the following command to apply the yaml:

  kubectl apply -f filesystem.local.yaml
- Use the following command to view the Filesystem resources:

  kubectl get filesystem -n ibm-spectrum-scale -o wide

  To view the detailed properties of a local file system named local-sample:

  kubectl describe filesystem local-sample -n ibm-spectrum-scale
Filesystem spec
The following table describes the properties for Filesystem:
Property | Required | Default | Description |
---|---|---|---|
metadata.name | Yes | None | The name of the CR. |
local | No | None | If specified, describes the file system to be a local file system. |
local.type | Yes | None | The type of the file system. Only shared is supported. |
local.replication | Yes | None | Specifies the number of replicas to create for each data/metadata block that is written to the file system. Specify 1-way to not replicate the file system data. Specify 2-way or 3-way to replicate data. |
local.pools | Yes | None | List of file system pools. |
local.pools.name | Yes | None | The name of the pool. One pool with name system is mandatory (default pool). |
local.pools.disks | Yes | None | The names of the LocalDisk resources that provide storage to this local file system pool. |
local.blockSize | No | 4M | The file system block size. A block size of 4 MiB provides good sequential performance, makes efficient use of disk space, and provides good performance for small files. It works well for the widest variety of workloads. |
You can define one or more Filesystem custom resources; each file system must use its own local disks.
In the following example:
apiVersion: scale.spectrum.ibm.com/v1beta1
kind: Filesystem
metadata:
  name: local-sample
  namespace: ibm-spectrum-scale
spec:
  local:
    type: shared
    replication: 1-way
    pools:
      - disks:
          - disk0
          - disk1
          - disk2
        name: system
The local file system local-sample uses the disks that are represented by the disk0, disk1, and disk2 local disk resources.
Enter the kubectl explain filesystem.spec.local command to view more details.
Consider the Local file system hardware requirements.
Filesystem status
Status Conditions can be viewed as a snapshot of the current and most up-to-date status of a Filesystem instance. The Success condition is set to True if the file system is created and mounted.
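For example, to block until the local file system from the earlier example is created and mounted, you can wait for the Success condition. This is a sketch that assumes the file system name local-sample; adjust the name and timeout as needed.

kubectl wait filesystem local-sample -n ibm-spectrum-scale --for=condition=Success --timeout=30m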
Enter the kubectl explain filesystem.status.pools command to view more details.