Information Lifecycle Management for AIX data using IBM SONAS

Use IBM SONAS to offload Information Lifecycle Management responsibilities for business data that resides in UNIX environments.

Use the information management facilities of IBM® Scale Out Network Attached Storage (IBM SONAS) to keep AIX data managed. IBM SONAS provides data placement, migration, and deletion policies that ease the burden of Information Lifecycle Management for AIX® data and help ensure compliance with government regulations. This article illustrates in detail how AIX data management responsibilities can be delegated to IBM SONAS.


Bhushan Pradip Jain (bhujain1@in.ibm.com), Associate Software Engineer, IBM ISL

Bhushan Pradip Jain is an Associate Software Engineer at the IBM India Software Labs. He published a technology named "Policy-Driven File Encryption Explorer Based on OpenPGP" on alphaWorks and is currently working on IBM Unified Scalable Storage. He has also worked on developing an intrusion detection system and on implementing part of the operating system for a multi-antenna telescope. Bhushan completed his B.Tech. in Computer Engineering at the College of Engineering, Pune (COEP). You can contact him at bhujain1@in.ibm.com.



Ujwala P Tulshigiri (ujwala.tulshigiri@in.ibm.com), Software developer, IBM India Software Lab

Ujwala P Tulshigiri is a software developer at the IBM India Software Labs. She is currently working on the CIM component of the SONAS (Scale Out Network Attached Storage) project. She was part of the Extreme Blue internship project named Enabling ODF for Social Collaboration with Composite Application and Mashups. Ujwala completed a bachelor's degree in computer science at the College of Engineering, Pune (COEP). You can contact her at ujwala.tulshigiri@in.ibm.com.



16 August 2011


Introduction

Information Lifecycle Management (ILM) is an umbrella term for the practices and processes used to manage information from the time it is created to the time it is discarded. One objective of ILM is to match the cost of the IT infrastructure to the business value of the information it holds. In the past, organizations retained old data at their own discretion or because of its business value. Today, compliance regulations such as Sarbanes-Oxley, HIPAA, and FFIEC in the United States and the European Data Privacy Directive in the European Union require organizations to retain and control specific data for specified, often very long, periods of time. To meet these retention requirements with minimum expenditure, organizations try to store more data at the lowest possible cost. They typically do so by keeping the most recent and important data on expensive, high-performance storage and archived data on cheaper storage. Confidential data may additionally need to be kept on more secure storage than ordinary data. Organizations therefore need to understand how their data evolves and grows, observe how its usage changes over time, decide how long it should be kept, and comply with all the rules and regulations that apply to it. ILM addresses these issues, among others, so that an appropriate storage technology can be selected automatically for each phase of the data's lifecycle.

In this article, we discuss how to delegate ILM responsibilities to IBM SONAS when it is used as a backend store for AIX systems. This helps AIX users across industries tier their data and make efficient use of the available infrastructure.


IBM SONAS used as a backend to store data from AIX systems

IBM SONAS is a highly scalable storage system that delivers high performance by combining prominent storage subsystems with the IBM General Parallel File System (GPFS). Its scale-out design, automated data placement, and unified management capabilities let customers expand their storage infrastructure quickly and with little effort. For additional information on IBM SONAS, see Resources.

Figure 1. IBM SONAS used as a backend to store data from AIX systems

IBM SONAS allows multiple clients to connect to a SONAS system and access the data located on it. Figure 1 shows an example setup in which a SONAS system serves as a backend NAS filer for an AIX machine. The IBM SONAS system (Version 1.1.1.0-19) acts as a file server and exports a path using the NFS protocol. This system has the domain name almari.in.ibm.com, which resolves to three IP addresses (9.122.122.151 to 9.122.122.153). In this article, we use an AIX system, motu.in.ibm.com, configured as an NFS client of the IBM SONAS system. The basic configuration of a SONAS system as a backend for AIX is discussed in the developerWorks article "IBM SONAS storage for AIX and Linux environment" (see Resources). End users access their files from the AIX machine, while the files are actually stored on IBM SONAS. We then see how files created from the AIX client are placed, migrated, and deleted automatically on the IBM SONAS system according to the policies defined.

We first use ssh to log in to the management node of the SONAS cluster almari.in.ibm.com as the user cliuser and run SONAS CLI commands. We create an NFS export on the IBM SONAS system using the mkexport command and then mount that export on the AIX machine.

cliuser is a special user that works in a chroot jail and can execute only SONAS CLI commands.

Code 1: Creation of exports on IBM SONAS and mounting them on AIX client
-bash-3.2# hostname
motu.in.ibm.com

-bash-3.2# ssh cliuser@almaric1.in.ibm.com
cliuser@almaric1.in.ibm.com's password:

[almari.in.ibm.com]$ lscluster
ClusterId            	      Name              PrimaryServer SecondaryServer
12402779238933128657 almari.in.ibm.com strg001st001  strg002st001

[almari.in.ibm.com]$ mkexport Sonas_Test /ibm/gpfs0/Sonas_Test --nfs "*(rw,no_root_squash)"
EFSSG0019I The export Sonas_Test has been successfully created.

[almari.in.ibm.com]$ lsexport -v
Name       Path                  Protocol Active Timestamp      Options
Sonas_Test /ibm/gpfs0/Sonas_Test NFS      true   9/6/10 7:00 PM *(rw,no_root_squash,fsid=2135594380)

[almari.in.ibm.com]$ logout
Connection to almaric1.in.ibm.com closed.

-bash-3.2# mkdir /mnt/Sonas

-bash-3.2# mount 9.122.122.151:/ibm/gpfs0/Sonas_Test /mnt/Sonas 

-bash-3.2# mount
  node       mounted      mounted over    vfs       date         options
-------- --------------- --------------- ------ ------------ ---------------
         /dev/hd4        /               jfs2   Jul 26 19:22 rw,log=/dev/hd8
.....
9.122.122.151 /ibm/gpfs0/Sonas_Test /mnt/Sonas nfs3 Sep 06 19:24
.....
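
A quick way to confirm the mount from a script is to scan the output of the mount command for NFS entries. The following Python sketch assumes the column layout shown in Code 1, where remote mounts start with a node column; the function name nfs_mounts is hypothetical and the sample text is copied from the listing.

```python
def nfs_mounts(mount_output):
    """Return (server, remote_path, mount_point) for each NFS line.

    Assumes the AIX layout shown in Code 1: a remote mount line starts with
    the node, followed by the remote path, the local mount point and the
    vfs type.
    """
    found = []
    for line in mount_output.splitlines():
        fields = line.split()
        # Remote mounts carry the vfs type in the fourth field. Local mounts
        # have an empty node column, so their fourth field is part of the date.
        if len(fields) >= 4 and fields[3].startswith("nfs"):
            found.append((fields[0], fields[1], fields[2]))
    return found


SAMPLE = """\
node mounted mounted over vfs date options
-------- --------------- --------------- ------ ------------ ---------------
/dev/hd4 / jfs2 Jul 26 19:22 rw,log=/dev/hd8
9.122.122.151 /ibm/gpfs0/Sonas_Test /mnt/Sonas nfs3 Sep 06 19:24
"""

for server, remote_path, mount_point in nfs_mounts(SAMPLE):
    print(f"{mount_point} is served by {server}:{remote_path}")
```

With the sample text, the sketch reports /mnt/Sonas as served by 9.122.122.151:/ibm/gpfs0/Sonas_Test.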

Information Lifecycle Management for IBM SONAS

IBM SONAS allows us to create three types of policies: placement, migration, and deletion. We can apply these policies to files created under the shared directory on the IBM SONAS system, and we use SONAS CLI policy commands to create and apply them.

  • Placement policy:

    This policy defines rules that determine the pool in which newly created files are placed, based on file attributes such as file name, path, and owner. For example, frequently accessed files that belong to a particular directory can be stored in a fast pool, while rarely used or less important files can be placed in a lower-performing pool.

  • Migration policy:

    This policy defines rules for migrating files from one pool to another, based on file attributes such as access time, modification time, file name, path, and owner. For example, files that have not been accessed during the last month are probably not actively required, so a migration policy can move them to a cheaper pool, even though that pool gives lower performance.

  • Deletion policy:

    This policy defines rules for deleting files from the system, based on file attributes such as access time, modification time, file name, path, and owner. For example, files that belong to a user who no longer works for the organization, and that nobody else requires, can be deleted using this type of policy.
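
As a rough illustration of how the three policy types differ, the following Python sketch assembles rule strings in the forms used by the listings in this article (SET POOL for placement, MIGRATE TO POOL for migration, DELETE for deletion). The helper names are hypothetical and are not part of the SONAS CLI; the sketch only builds text that could be passed to mkpolicy with the -R option.

```python
def placement_rule(name, pool, condition=None):
    """Rule text that places new files in `pool`, optionally filtered."""
    rule = f"RULE '{name}' SET POOL '{pool}'"
    return f"{rule} WHERE {condition}" if condition else rule


def migration_rule(name, to_pool, condition):
    """Rule text that moves matching files to `to_pool`."""
    return f"RULE '{name}' MIGRATE TO POOL '{to_pool}' WHERE {condition}"


def deletion_rule(name, condition):
    """Rule text that deletes matching files."""
    return f"RULE '{name}' DELETE WHERE {condition}"


# The three rules used in this article, rebuilt from their parts.
print(placement_rule("Place_Confidential", "secret", "name like 'Confidential-%'"))
print(migration_rule("Migrate_txt", "system", "name like '%.txt'"))
print(deletion_rule("Delete_Confidential", "name like 'Confidential-%'"))
```

Keeping the rule name, target pool, and condition as separate arguments makes it harder to mistype the quoting when several rules are combined into one policy.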

Use the following commands for SONAS policy creation and management.

  • mkpolicy

    Using the mkpolicy command, we can create a policy and specify its rules.

  • chkpolicy

    chkpolicy validates the rules specified by a policy. It checks the syntax of the policy rules and verifies that the pools and other objects named in the rules actually exist.

  • setpolicy

    Use the setpolicy command to set a placement policy for a filesystem.

  • runpolicy

    Use runpolicy to apply migration or deletion policies on existing files within a filesystem.

  • lspolicy

    The lspolicy command lists all the policies created on the SONAS system and shows which policy is applied to a filesystem.
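
The way these commands fit together can be sketched as follows. This hypothetical Python helper only builds the argument lists in the order used in this article (mkpolicy, chkpolicy, then setpolicy for a placement policy or runpolicy for a migration or deletion policy). It does not contact a SONAS system, and the function name is an assumption made for illustration.

```python
def policy_commands(policy, rules, filesystem, kind):
    """Return the CLI argument lists for creating and applying a policy.

    kind is 'placement', 'migration' or 'deletion'. Placement policies are
    activated with setpolicy; the other two are run on demand with runpolicy.
    """
    if kind not in ("placement", "migration", "deletion"):
        raise ValueError(f"unknown policy kind: {kind}")
    apply_cmd = "setpolicy" if kind == "placement" else "runpolicy"
    return [
        ["mkpolicy", policy, "-R", rules],        # create the policy
        ["chkpolicy", filesystem, "-P", policy],  # validate it
        [apply_cmd, filesystem, "-P", policy],    # activate or run it
    ]


rule = "RULE 'Delete_Confidential' DELETE WHERE name like 'Confidential-%'"
for argv in policy_commands("Deletion", rule, "gpfs0", "deletion"):
    # Print only the command and its target; the rule text needs shell
    # quoting when typed at the CLI prompt.
    print(argv[0], argv[1])
```

The lspolicy command is left out of the sequence because it only reports state and can be run at any point.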

In the following sections, we discuss how to create and use policies that take advantage of storage tiers for your data. Creating the tiers themselves is outside the scope of this article.


Information Lifecycle Management: Data placement

In this section, we discuss how IBM SONAS placement policies ensure that a given type of file is stored in a particular pool. The filesystem gpfs0 on IBM SONAS is configured with two pools, secret and system, as the output of the lspool command shows. The secret pool is built on disks with higher security and higher performance, so we place all our important and confidential files in that pool and leave less important files in the system pool, which gives comparatively lower performance. To specify this, we create a policy named Placement using the mkpolicy command. This policy has a rule called Place_Confidential, which places every file whose name starts with the prefix Confidential- in the storage pool secret. The chkpolicy command validates the rules of the policy and checks that a storage pool named secret exists in the gpfs0 filesystem. If this command fails, either a rule is syntactically invalid or the gpfs0 filesystem has no storage pool named secret. After creating the policy, we run the setpolicy command to activate it. The lspolicy command lets us check which policy is currently set.
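
To make the matching step concrete, here is a minimal Python sketch of how a pattern such as 'Confidential-%' selects file names. It is a model for illustration only, not the GPFS policy engine: it assumes SQL-style LIKE semantics (% matches any run of characters, _ matches exactly one) and assumes that the first matching rule decides the pool. The function names are hypothetical, and the case where no rule matches simply yields None here.

```python
import re


def like_to_regex(pattern):
    """Translate a SQL-style LIKE pattern into a compiled regular expression."""
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")           # any run of characters
        elif ch == "_":
            parts.append(".")            # exactly one character
        else:
            parts.append(re.escape(ch))  # everything else is literal
    return re.compile("".join(parts), re.DOTALL)


def choose_pool(file_name, rules):
    """Return the pool of the first rule whose pattern matches, else None."""
    for pattern, pool in rules:
        if like_to_regex(pattern).fullmatch(file_name):
            return pool
    return None


RULES = [("Confidential-%", "secret")]  # models the Place_Confidential rule

for name in ("Confidential-Sales.txt", "Notes.txt"):
    print(name, "->", choose_pool(name, RULES))
```

Running the sketch shows Confidential-Sales.txt mapped to secret and Notes.txt mapped to None, because only the first name carries the prefix.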

Code 2: Generic SONAS Policy commands
-bash-3.2# hostname
motu.in.ibm.com

-bash-3.2# ssh cliuser@almaric1.in.ibm.com
cliuser@almaric1.in.ibm.com's password:

[almari.in.ibm.com]$ lspool
Filesystem Name   Size    Usage Available fragments Available blocks Disk list
...
gpfs0      secret 7.05 TB 0.0%  248 kB              7.05 TB          array0_sata_60001ff0732e40988c40003
gpfs0      system 7.05 TB 2.8%  37.72 MB            6.86 TB          array0_sata_60001ff0732e40d88cd0007
...

[almari.in.ibm.com]$ mkpolicy Placement -R "RULE 'Place_Confidential' SET POOL 'secret' WHERE name like 'Confidential-%'"
EFSSG1000I The command completed successfully.

[almari.in.ibm.com]$ lspolicy -P Placement
Policy Name Declaration Name   Default Declarations
Placement   Place_Confidential N       RULE 'Place_Confidential' SET POOL 'secret' WHERE name like 'Confidential-%'

[almari.in.ibm.com]$ chkpolicy gpfs0 -P Placement
EFSSG1000I The command completed successfully.

[almari.in.ibm.com]$ setpolicy gpfs0 -P Placement
EFSSG1000I The command completed successfully.

[almari.in.ibm.com]$ lspolicy -A
Cluster           Device Policy Set Name Policies  Applied Time   Who applied it?
...
almari.in.ibm.com gpfs0  Placement       Placement 9/6/10 6:45 PM root
...

[almari.in.ibm.com]$ logout
Connection to almaric1.in.ibm.com closed.

Now that the policy is applied on the IBM SONAS system, we create a few files with the prefix Confidential- from the AIX NFS client, motu.in.ibm.com, under the mounted directory /mnt/Sonas/. To confirm that the files are placed in the secret pool on the SONAS system, we connect to the SONAS system as the root user and run the GPFS command mmlsattr on the files. The output shows that the files are indeed placed in the secret pool. This type of policy can therefore be used to place data automatically in the appropriate pools.

Note that cliuser does not have permission to execute the GPFS command mmlsattr, which is why we run it as root. We use this command only to check that a file is stored in the specified pool.

Code 3: File placement policy in IBM SONAS
-bash-3.2# hostname
motu.in.ibm.com

-bash-3.2# cd /mnt/Sonas/

-bash-3.2# echo "Confidential Sales File" > Confidential-Sales.txt

-bash-3.2# echo "Confidential Development File" > Confidential-Development.txt

-bash-3.2# echo "Confidential Report File" > Confidential-Report.doc

-bash-3.2# ls
Confidential-Development.txt Confidential-Report.doc Confidential-Sales.txt 

-bash-3.2# ssh root@almaric1.in.ibm.com
root@almaric1.in.ibm.com's password:

[root@almari.mgmt001st001 ~]# cd /ibm/gpfs0/Sonas_Test

[root@almari.mgmt001st001 Sonas_Test]# ls
Confidential-Development.txt Confidential-Report.doc Confidential-Sales.txt

[root@almari.mgmt001st001 Sonas_Test]# mmlsattr -L Confidential-Development.txt
file name: Confidential-Development.txt
metadata replication: 1 max 2
data replication: 1 max 2
immutable: no
flags:
storage pool name: secret
fileset name: root
snapshot name:

[root@almari.mgmt001st001 Sonas_Test]# logout
Connection to almaric1.in.ibm.com closed.
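
When many files need to be checked, the pool can be read out of the mmlsattr output programmatically. The following Python sketch assumes the "key: value" layout shown in Code 3 and works on captured text only; the function name parse_mmlsattr is hypothetical.

```python
def parse_mmlsattr(output):
    """Turn the 'key: value' lines of mmlsattr -L output into a dictionary."""
    attrs = {}
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines without a colon
            attrs[key.strip()] = value.strip()
    return attrs


# Output copied from Code 3.
SAMPLE = """\
file name: Confidential-Development.txt
metadata replication: 1 max 2
data replication: 1 max 2
immutable: no
flags:
storage pool name: secret
fileset name: root
snapshot name:
"""

attrs = parse_mmlsattr(SAMPLE)
print(attrs["file name"], "is stored in pool", attrs["storage pool name"])
```

Attributes that have no value in the listing, such as flags, come back as empty strings.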

Information Lifecycle Management: Data migration

In this section, we discuss how IBM SONAS migration policies use tiered storage pools to migrate files. Suppose that we want to migrate all files with the .txt extension to the system pool. We create a migration policy named Migration with this rule using the mkpolicy command and validate it using the chkpolicy command. Unlike a placement policy, a migration policy does not need to be set on the SONAS system; we run it on demand using the runpolicy command, which carries out the migration. Files with the .txt extension that are already in the system pool are not affected, and neither are files in the secret pool that do not have the .txt extension; only .txt files in the secret pool are migrated to the system pool. The file Confidential-Development.txt that we created previously is therefore migrated to the system pool. To confirm the migration, we log in to the management server as root and run the mmlsattr GPFS command on the files created earlier. The output shows that the storage pool of Confidential-Development.txt has changed to system, while that of Confidential-Report.doc is still secret. The migration policy has thus moved only the files that match the specified rule to the new pool.
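
The effect described above can be modelled in a few lines of Python. This is a simplified sketch, not the SONAS implementation: files are represented as a dictionary from name to pool, and the rule condition name like '%.txt' is approximated with a plain suffix test. The function name run_migration is hypothetical.

```python
def run_migration(files, matches, to_pool):
    """Return a new name-to-pool mapping with matching files moved to `to_pool`."""
    return {
        name: (to_pool if matches(name) else pool)
        for name, pool in files.items()
    }


# State after the placement policy: all three files are in the secret pool.
before = {
    "Confidential-Development.txt": "secret",
    "Confidential-Report.doc": "secret",
    "Confidential-Sales.txt": "secret",
}

# Stand-in for the rule condition: name like '%.txt'
after = run_migration(before, lambda name: name.endswith(".txt"), "system")

for name, pool in sorted(after.items()):
    print(f"{name}: {before[name]} -> {pool}")
```

The two .txt files end up in system, while Confidential-Report.doc stays in secret. Files already in the target pool are left unchanged by this model.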

Code 4: File migration policy in IBM SONAS
-bash-3.2# hostname
motu.in.ibm.com

-bash-3.2# ssh cliuser@almaric1.in.ibm.com
cliuser@almaric1.in.ibm.com's password:

[almari.in.ibm.com]$ mkpolicy Migration -R "RULE 'Place_txt' SET POOL 'system' ; RULE 'Migrate_txt' MIGRATE TO POOL 'system' WHERE name like '%.txt'"
EFSSG1000I The command completed successfully.

[almari.in.ibm.com]$ lspolicy -P Migration
Policy Name Declaration Name Default Declarations
Migration   Place_txt        N       RULE 'Place_txt' SET POOL 'system'
Migration   Migrate_txt      N       RULE 'Migrate_txt' MIGRATE TO POOL 'system' WHERE name like '%.txt'

[almari.in.ibm.com]$ chkpolicy gpfs0 -P Migration
EFSSG1000I The command completed successfully.

[almari.in.ibm.com]$ runpolicy gpfs0 -P Migration
[I] GPFS Current Data Pool Utilization in KB and %
secret 24189440 7574913024 0.319336%
system 30284032 7574913024 0.399794%
[I] 2120151 of 100000256 inodes used: 2.120146%.
[I] Loaded policy rules from /var/opt/IBM/sofs/PolicyFiles/policy1282223388136.
...
EFSSA0094I A command was executed on node int001st001.
EFSSG1000I The command completed successfully.

[almari.in.ibm.com]$ logout
Connection to almaric1.in.ibm.com closed.

-bash-3.2# ssh root@almaric1.in.ibm.com
root@almaric1.in.ibm.com's password:

[root@almari.mgmt001st001 ~]# cd /ibm/gpfs0/Sonas_Test

[root@almari.mgmt001st001 Sonas_Test]# ls
Confidential-Development.txt Confidential-Report.doc Confidential-Sales.txt

[root@almari.mgmt001st001 Sonas_Test]# mmlsattr -L Confidential-Development.txt
file name: Confidential-Development.txt
metadata replication: 2 max 2
data replication: 2 max 2
immutable: no
flags:
storage pool name: system
fileset name: root
snapshot name:

[root@almari.mgmt001st001 Sonas_Test]# mmlsattr -L Confidential-Report.doc
file name: Confidential-Report.doc
metadata replication: 2 max 2
data replication: 2 max 2
immutable: no
flags:
storage pool name: secret
fileset name: root
snapshot name:
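
The pool utilization lines that runpolicy prints in Code 4 can be checked with simple arithmetic. The Python sketch below assumes that the three columns are the pool name, the space used in KB, and the pool size in KB, followed by the percentage used; this reading is inferred from the header "GPFS Current Data Pool Utilization in KB and %" and from the numbers agreeing with the 7.05 TB pool size reported by lspool. The function names are hypothetical.

```python
def parse_pool_line(line):
    """Split a line such as 'secret 24189440 7574913024 0.319336%'."""
    pool, used_kb, total_kb, reported = line.split()
    return pool, int(used_kb), int(total_kb), reported


def utilization(used_kb, total_kb):
    """Percentage of the pool in use, formatted with six decimals."""
    return f"{used_kb / total_kb * 100:.6f}%"


# Lines copied from the runpolicy output in Code 4.
LINES = (
    "secret 24189440 7574913024 0.319336%",
    "system 30284032 7574913024 0.399794%",
)

for line in LINES:
    pool, used_kb, total_kb, reported = parse_pool_line(line)
    size_tb = total_kb / 1024 ** 3  # KB to TB, binary units
    print(pool, utilization(used_kb, total_kb), "reported:", reported,
          f"size: {size_tb:.2f} TB")
```

The recomputed percentages agree with the reported ones, and the pool size works out to 7.05 TB for both pools.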

Information Lifecycle Management: Data deletion

To reduce management overhead, unused and irrelevant files need to be deleted from time to time. We now explore how a deletion policy can automatically delete the files that satisfy given criteria, which we specify by creating a policy. Suppose that we need to delete all files whose names start with the prefix Confidential-. We create a policy named Deletion using the mkpolicy command and validate it using the chkpolicy command. Like migration policies, deletion policies do not need to be set on the SONAS system; they are run on demand using the runpolicy command, which deletes the files. All the files that we created earlier satisfy the criterion, so all of them are deleted when the policy runs. When we then list the files in the NFS-mounted directory on the AIX client, the output is empty, which shows that the policy has deleted the files.
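
Because deletion cannot be undone, it can help to preview which names a rule would select before running it. The following Python sketch is one way to do such a dry run on a list of file names. It uses the shell-style wildcard Confidential-* as a stand-in for the LIKE pattern 'Confidential-%', and the file Readme.txt is a hypothetical addition to show a name that is kept. Nothing is deleted by this code.

```python
from fnmatch import fnmatchcase


def split_for_deletion(file_names, pattern):
    """Return (selected, kept) lists for a case-sensitive wildcard pattern."""
    selected = [name for name in file_names if fnmatchcase(name, pattern)]
    kept = [name for name in file_names if not fnmatchcase(name, pattern)]
    return selected, kept


names = [
    "Confidential-Development.txt",
    "Confidential-Report.doc",
    "Confidential-Sales.txt",
    "Readme.txt",  # hypothetical file, not part of the article's setup
]

selected, kept = split_for_deletion(names, "Confidential-*")
print("would delete:", selected)
print("would keep:  ", kept)
```

In the article's setup, all three files carry the prefix, so the directory is empty after the policy runs.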

Code 5: File deletion policy in IBM SONAS
-bash-3.2# hostname
motu.in.ibm.com

-bash-3.2# ssh cliuser@almaric1.in.ibm.com
cliuser@almaric1.in.ibm.com's password:

[almari.in.ibm.com]$ mkpolicy Deletion -R "RULE 'Place_data' SET POOL 'system'; RULE 'Delete_Confidential' DELETE WHERE name like 'Confidential-%'"
EFSSG1000I The command completed successfully.

[almari.in.ibm.com]$ lspolicy -P Deletion
Policy Name Declaration Name    Default Declarations
Deletion    Place_data          N       RULE 'Place_data' SET POOL 'system'
Deletion    Delete_Confidential N       RULE 'Delete_Confidential' DELETE WHERE name like 'Confidential-%'

[almari.in.ibm.com]$ chkpolicy gpfs0 -P Deletion
EFSSG1000I The command completed successfully.

[almari.in.ibm.com]$ runpolicy gpfs0 -P Deletion
[I] GPFS Current Data Pool Utilization in KB and %
secret 24189440 7574913024 0.319336%
system 30284288 7574913024 0.399797%
[I] 2120154 of 100000256 inodes used: 2.120149%.
[I] Loaded policy rules from /var/opt/IBM/sofs/PolicyFiles/policy1282224298452.
...
EFSSA0094I A command was executed on node mgmt001st001.
EFSSG1000I The command completed successfully.

[almari.in.ibm.com]$ logout
Connection to almaric1.in.ibm.com closed.

-bash-3.2# cd /mnt/Sonas/

-bash-3.2# ls

-bash-3.2#

Conclusion

Using the placement, migration, and deletion policies of IBM SONAS, we can automate much of the data handling that regulations require, at minimal cost, while offloading administrative responsibilities to IBM SONAS. Having IBM SONAS as a backend store for AIX or any other UNIX® flavor can thus help us achieve Information Lifecycle Management and comply with regulations in a cost-effective manner.

Acknowledgement

The authors sincerely acknowledge Sandeep R. Patil (rsandeep@in.ibm.com) from IBM Corporation for his valued insights, exposure, and motivation to write this article and help convey the subject to the community, customers, and practitioners.

Resources

Learn

Get products and technologies

Discuss
