Question & Answer
Question
The cplv command is very slow copying filesystems, and it locks the volume group so no other copies can be done until the first is finished. Is there another way to copy filesystems that would be faster?
Answer
The command /usr/sbin/cplv will copy a logical volume to another logical volume. Typically this is used to move a logical volume to a new volume group. The cplv command can either copy raw logical volumes or ones containing a JFS or JFS2 filesystem. If the source and target are filesystems it may be necessary to change some attributes of the target logical volume or filesystem.
We're interested in eliminating three problems inherent in cplv:
- It is fairly slow
- It can only run on 1 logical volume at a time
- It locks the volume group that is being used
What we can use instead is the /usr/bin/dd command, which is more flexible and isn't subject to cplv's constraints. In this technote we'll discuss using dd on a filesystem that resides inside a logical volume, rather than on a raw logical volume; however, the same technique works for raw logical volumes.
With either command, the source filesystem must be unmounted first. This prevents data corruption that could occur if files changed while we were reading and copying them.
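For example, if the source filesystem is mounted at /data and the target at /target (the mount points used later in the checksum test), unmount both before copying:

# umount /data
# umount /target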
So let's say we have two 200GB filesystems, both with inline logs. We would like to copy the entire contents of the filesystem on the logical volume "datalv" to the target logical volume "targetlv".
We'll test out a variety of methods to see if we can get faster transfers between these two logical volumes.
1. Using cplv
First we change the target logical volume type to "copy" to allow cplv to copy over our existing filesystem that resides on it. See the references section for other technotes with specific steps for using cplv.
# chlv -t copy targetlv
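If we want to confirm the change, lslv will show the new type:

# lslv targetlv | grep TYPE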
Now use cplv to copy the data from our source logical volume to the target. We'll use /usr/bin/timex to tell us how long the transfer takes.
# timex cplv -f -e targetlv datalv
cplv: Logical volume datalv successfully copied to targetlv.

real 1851.37
user 2.88
sys 97.38
Total time 30min 51.37sec
The cplv command moves data in 128KB blocks. This is not always efficient. Plus it locks the volume group while it's in use, so any other cplv on a logical volume in that volume group will hang until the first cplv is complete.
2. Using dd
Let's try the same thing using the dd command to copy the logical volume data across. We can use a block size larger than 128KB for efficiency, and we can use the raw (character) logical volume device rather than the block (cooked) device, which saves us from buffering into 4KB blocks in memory. We'll use a large block size of 1MB:
# dd if=/dev/rSOURCE_LV of=/dev/rDEST_LV bs=1024K
We should also check the LTG (Logical Track Group) size for the volume groups. If the LTG size is greater than 1024KB (1MB), increase the dd block size to equal the LTG size.
# lsvg sourcevg | grep LTG
LTG size (Dynamic): 256 kilobyte(s)
# lsvg targetvg | grep LTG
LTG size (Dynamic): 256 kilobyte(s)
In this example the LTG size is only 256KB, so we keep our larger 1024KB block size.
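For instance, if lsvg had instead reported a hypothetical LTG size of 2048 kilobytes, we would raise the dd block size to match:

# dd if=/dev/rdatalv of=/dev/rtargetlv bs=2M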
Our first test simply runs a single dd across the entire logical volume:
# timex dd if=/dev/rdatalv of=/dev/rtargetlv bs=1M
204800+0 records in.
204800+0 records out.

real 667.79
user 0.60
sys 67.52
So that took 11min 7.79 seconds.
667.79 / 1851.37 = 36.1% of the cplv time, or 2.77X faster
Now we should run fsck to make sure the filesystem is not damaged in any way.
# fsck -y /dev/targetlv
The current volume is: /dev/targetlv
Primary superblock is valid.
J2_LOGREDO:log redo processing for /dev/targetlv
Primary superblock is valid.
*** Phase 1 - Initial inode scan
*** Phase 2 - Process remaining directories
*** Phase 3 - Process remaining files
*** Phase 4 - Check and repair inode allocation map
*** Phase 5 - Check and repair block allocation map
File system is clean.
After this I mounted both source and target, ran cksum on every file in each, then compared the sums.
# mount /target
# cd /target
# find . -exec cksum {} \; > /tmp/target.sum
# mount /data
# cd /data
# find . -exec cksum {} \; > /tmp/source.sum
# cd /tmp
# diff source.sum target.sum
The checksums all matched, so we can be reasonably sure that the data is intact.
3. Parallel dd
For even more speed, we can run multiple dd commands in parallel on the same logical volume. In essence, using the "skip" and "seek" options to dd, each invocation copies a specific range of blocks from the source and writes them to the same locations in the target logical volume.
For example if we decide we want to run 4 dd's simultaneously, we would divide them up like this:
block 0 50G 100G 150G 200G
|-----------------|-----------------|-----------------|-----------------|
So we have one dd transfer running from block 0 to 50GB into the filesystem. The next dd runs from 50GB to 100GB and so on.
Since these dd's do not overlap we can run them simultaneously.
We use the "skip" to skip over input blocks and "seek" to go out a number of blocks before writing the output.
We're going to use 1024K (1MB) blocks for the transfer, so each 50GB chunk is 50 x 1024 = 51200 blocks.
First transfer of 50G worth of 1MB blocks:
# dd if=/dev/rdatalv of=/dev/rtargetlv bs=1M count=51200 &
2nd transfer, we skip over the first 50GB and write the next 50GB
# dd if=/dev/rdatalv of=/dev/rtargetlv bs=1M skip=51200 seek=51200 count=51200 &
3rd transfer, we skip over first 100GB and write 50GB
# dd if=/dev/rdatalv of=/dev/rtargetlv bs=1M skip=102400 seek=102400 count=51200 &
4th transfer, we skip over the first 150GB and write the final 50GB
# dd if=/dev/rdatalv of=/dev/rtargetlv bs=1M skip=153600 seek=153600 count=51200 &
Then we'll need to fsck and run our checksum test to make sure the data is ok.
The best way to use this is to put the dd's in a script; a sample parallel-dd script is included below.
# timex ./parallel-dd
starting transfer 1
Sending nohup output to nohup.out.
starting transfer 2
Sending nohup output to nohup.out.
starting transfer 3
Sending nohup output to nohup.out.
starting transfer 4
Sending nohup output to nohup.out.

real 489.27
user 0.61
sys 53.43
So for this run, we get a total time of 8 min 9.27sec
73% of the single dd time, or 1.36X faster
26% of the cplv time, or 3.78X faster
Changing Disk Attributes
There are some disk parameters that may make this transfer go faster. Although the focus of this document is not about I/O tuning, we can show that modifying some disk parameters may give us better performance for our copy.
Notice that the virtual SCSI disk has a queue depth of 3 on the client:
# lsattr -El hdisk13
...
queue_depth 3 Queue DEPTH True
...
However the disk mapped to it from the VIO server has a queue depth of 20:
# lsattr -El hdisk2
...
queue_depth 20 Queue DEPTH True
...
Matching the queue depth of the virtual SCSI disk to that of the physical disk backing it is an important setting. We can vary off the volume groups and change the queue depth to see if that improves the transfer:
# varyoffvg sourcevg
# varyoffvg targetvg
# chdev -a queue_depth=20 -l hdisk12
hdisk12 changed
# chdev -a queue_depth=20 -l hdisk13
hdisk13 changed
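The volume groups then need to be varied back on before rerunning the transfer:

# varyonvg sourcevg
# varyonvg targetvg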
Now run our 4 parallel dds again:
# timex ./parallel-dd
starting transfer 1
Sending nohup output to nohup.out.
starting transfer 2
Sending nohup output to nohup.out.
starting transfer 3
Sending nohup output to nohup.out.
starting transfer 4
Sending nohup output to nohup.out.

real 388.38
user 0.55
sys 74.79
This is a little faster: 79% of the parallel dd time before tuning, or 1.26X faster.
So with parallel dd's and tuned disk parameters, the copy is 4.77X faster than the original cplv. There may be other parameters we could tune on the disk or adapter, but this example illustrates how effective dd can be when a disk subsystem is tuned for multiple large streams of I/O.
Results
The last transfer, using parallel dd with tuned disk attributes, moved 200GB (204800MB) in 388.38 seconds, which works out to 527.31 MB/s (4218.48 megabits/sec).
Our initial cplv transfer works out to 110.62 MB/s (884.96 megabits/sec).
At those rates, 1TB of data would take the parallel dd approximately 32.4 minutes, while the cplv would take approximately 2h 34min.
This technique could also be expanded to copy multiple filesystems at the same time rather than a single one, as sketched below.
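For example, because dd does not take the volume group lock, copies of two different logical volumes in the same volume group can run side by side. The wrapper script names here are hypothetical, one per filesystem:

# nohup ./parallel-dd-data &
# nohup ./parallel-dd-other &
# wait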
References
How can I copy a non-striped logical volume to a striped one?
Moving a JFS/JFS2 File System to a New Volume Group
The parallel-dd script
#!/bin/ksh
#
# Disclaimer
# ==========
#
# Information in this document is correct to the best of our knowledge at the
# time of this writing. Please use this information with care. IBM will not be
# responsible for damages of any kind resulting from its use. The use of this
# information is the sole responsibility of the customer and depends on the
# customer's ability to evaluate and integrate this information into the
# customer's operational environment.
#
# The following is an example script and is not supported by IBM.
#
# This script implements a parallel dd from one filesystem to another
# Since these dd's do not overlap we can run them simultaneously.
# We use the "skip" to skip over input blocks and "seek" to go out
# a number of blocks before writing the output.
#
# We use nohup to keep the transfers running in case our session disconnects.
#
# We're going to use 1024K (1MB) blocks for the transfer.
#
# First transfer of 50G worth of 1MB blocks:
echo "starting transfer 1"
nohup dd if=/dev/rdatalv of=/dev/rtargetlv bs=1M count=51200 &
# 2nd transfer, we skip over the first 50GB and write 50GB
echo "starting transfer 2"
nohup dd if=/dev/rdatalv of=/dev/rtargetlv bs=1M skip=51200 seek=51200 count=51200 &
# 3rd transfer, we skip over first 100GB and write 50GB
echo "starting transfer 3"
nohup dd if=/dev/rdatalv of=/dev/rtargetlv bs=1M skip=102400 seek=102400 count=51200 &
# 4th transfer, we skip over the first 150GB and write the final 50GB
echo "starting transfer 4"
nohup dd if=/dev/rdatalv of=/dev/rtargetlv bs=1M skip=153600 seek=153600 count=51200 &
wait
exit 0
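A generalized variant of the parallel-dd script
As a further sketch, the fixed set of four dd commands above can be generalized to N parallel streams. This variant assumes the same 200GB logical volume (204800 1MB blocks), the same datalv/targetlv names, and a block count that divides evenly by N; like the script above, it is an illustration and not a supported tool.

#!/bin/ksh
# Sketch: generalized parallel dd with N non-overlapping streams.
# Assumes TOTAL divides evenly by N; the same disclaimer as above applies.
N=4                   # number of parallel dd streams
TOTAL=204800          # logical volume size in 1MB blocks (200GB)
CHUNK=$((TOTAL / N))  # blocks handled by each stream

i=0
while [ $i -lt $N ]; do
    OFFSET=$((i * CHUNK))
    echo "starting transfer $((i + 1))"
    # Each stream copies its own block range, so the streams never overlap.
    nohup dd if=/dev/rdatalv of=/dev/rtargetlv bs=1M \
        skip=$OFFSET seek=$OFFSET count=$CHUNK &
    i=$((i + 1))
done
wait
exit 0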