Anthony's Blog: Using System Storage - An Aussie Storage Blog
I think this picture speaks for itself: Three XIVs. Three cities. Three way iSCSI.
One of the many popular features of the XIV is the ability to replicate using iSCSI. On XIV Gen3 there are now at least 10 and up to 22 active iSCSI ports on each machine (depending on how many modules you order).
Implementing the iSCSI connection between two XIVs is a piece of cake. If both XIVs are defined to the XIV GUI (which they should be), you just need to drag and drop links between them to bring the iSCSI mirroring connections alive. If the network gods are with you, the link goes green. But if the network gods are against you, the links stay red, and then the question is... what to do?
Old fashioned problem diagnosis leads us straight to the ping command. However I routinely find that the ping command works fine (all interfaces respond), but the link stubbornly remains red.
The first possible problem is that iSCSI uses TCP port 3260, so make sure there are no firewalls blocking that port between the two sites.
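You can verify reachability of that port yourself with a quick TCP connection test. Here is a minimal sketch in Python (192.168.0.1 is a placeholder for your remote XIV iSCSI interface address):

```python
import socket

def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Probe the iSCSI port on the remote XIV interface (placeholder address):
# tcp_port_open("192.168.0.1", 3260)
```

If this returns False while a plain ping works, a firewall blocking port 3260 is a likely suspect.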
The second possible problem is the MTU size (Maximum Transmission Unit). When we define the iSCSI interfaces on the XIV we set the MTU as a value of up to 4500 bytes. When we establish connections between two XIVs, each XIV will send test packets that are sized to the MTU. If the intervening network does not support that packet size, the packets will be dropped by the network, because the XIV sets the don't fragment flag to ON.
So how to work out what the MTU is? Well the first thing to do is ask your friendly networking team member. But sometimes I find that the intervening networks are controlled by third parties, which means that getting a straight (and reliable) answer can prove difficult. Even worse, some of these third parties charge a fee every time you call them, so there may be hesitation to even get them involved!
One simple trick is to re-use the ping command but play with payload sizes. We can use a command that looks like this:
ping -f -l 1472 192.168.0.1
That command (Windows syntax) sends a ping with a payload of 1472 bytes to IP address 192.168.0.1. The -f parameter sets the don't-fragment flag, preventing packet fragmentation. (On Linux the equivalent is ping -M do -s 1472 192.168.0.1.) What you then do is slowly increase the payload until you no longer get a reply.
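Rather than stepping the size up by hand, you can binary-search for the cut-off. A hypothetical sketch in Python, where probe stands in for whatever actually runs the ping (the function name and the search bounds are my own):

```python
def max_payload(probe, lo: int = 0, hi: int = 9000) -> int:
    """Binary-search the largest payload size for which probe(size) succeeds.

    probe(size) -> bool; in practice it would run something like
    `ping -f -l <size> <host>` and return True on a reply.
    """
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if probe(mid):
            lo = mid      # mid got a reply, search higher
        else:
            hi = mid - 1  # mid was dropped, search lower
    return lo

# With a simulated network that drops anything over 1472 bytes of payload:
print(max_payload(lambda size: size <= 1472))  # 1472
```

A dozen or so probes will pin down the exact limit, instead of creeping up one byte at a time.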
This process works fine and is a great way to determine the maximum payload size the end-to-end network will support. However, if you're using the payload size to determine the maximum transmission unit, there is a little trick. The MTU is the maximum packet size, but a ping sends a payload wrapped in 28 bytes of IP and ICMP headers. So our example:
ping -f -l 1472 192.168.0.1
sends a 1500 byte packet to the 192.168.0.1 IP address (1472 bytes of payload plus 28 bytes of headers).
If this command succeeds, you can use an MTU of 1500 in the XIV GUI or XCLI (rather than an MTU of 1472, which is 28 bytes smaller).
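The arithmetic is simple enough to capture in a few lines of Python (a sketch; the 28-byte figure is the standard IPv4-plus-ICMP header overhead mentioned above):

```python
IP_HEADER = 20    # IPv4 header, bytes
ICMP_HEADER = 8   # ICMP echo header, bytes
OVERHEAD = IP_HEADER + ICMP_HEADER  # 28 bytes total

def mtu_from_ping_payload(max_payload: int) -> int:
    """Convert the largest unfragmented ping payload into the end-to-end MTU."""
    return max_payload + OVERHEAD

print(mtu_from_ping_payload(1472))  # 1500
print(mtu_from_ping_payload(4472))  # 4500 - the XIV's maximum MTU setting
```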
For those wondering how I did the network sniffing to get the screen captures above, I used a brilliant piece of free software called Wireshark. My only warning is that your corporate security policies may have rules about sniffing the network, so don't take my blog post as permission to use it. And for the networking geeks among you: yes, I know extra bytes can be wrapped around our Ethernet frame for things like VLAN tags or encapsulation, but hopefully this does not affect our mathematics.
Controlling the background traffic
One final pointer. Having finally gotten the link up and going, you are now free to start replicating volumes. But how much traffic can the cross-site link support? The XIV can limit the background copy bandwidth with a parameter called max_initialization_rate. This is useful to stop you flooding the cross-site link and annoying your link co-tenants. To display the current settings, open an XCLI window and list your mirroring targets with XML output.
For each target you should see three parameters:
<max_initialization_rate value="100"/>
<max_resync_rate value="300"/>
<max_syncjob_rate value="300"/>
These three settings should be tuned to reflect the achievable throughput of the cross-site links.
To change the settings use a command like this (change the target name and the rates to suit):
target_config_sync_rates target="Remote_XIV" max_initialization_rate=120 max_syncjob_rate=240 max_resync_rate=240
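When picking those values, keep the units straight: the XIV rates are (as I understand it) in megabytes per second, while WAN links are usually quoted in megabits per second. A hypothetical sizing helper (the 80% headroom factor is my own assumption, not an XIV rule):

```python
def xiv_rate_for_link(link_mbps: float, share: float = 0.8) -> int:
    """Suggest an XIV rate value (MB/s) for a WAN link quoted in Mbit/s,
    leaving some headroom for the other tenants on the link."""
    return int(link_mbps / 8 * share)

print(xiv_rate_for_link(1000))  # a 1 Gbps link -> 100 MB/s
```

Dividing by 8 converts bits to bytes; the share factor is what stops you from being the annoying co-tenant.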
If you want to see the current throughput rate, open XIV Top on the remote machine. You should see how much write I/O is being sent to the mirror target volumes in MBps.
So hopefully you're now better positioned to diagnose iSCSI link issues, maximize your MTU, and tune and monitor your link speed.
Questions? Fire away...
anthonyv
The first update for Storwize V7000 and SVC release 6.3 is now available. You will find it here for Storwize V7000 and here for SVC (note both links will require you to login to Fix Central with your IBM ID). As usual the new release contains a combination of new features and fixes. The new features are:
New features in this release:
* Support for multi-session iSCSI host attachment
* Language support for Brazilian Portuguese, French, German, Italian, Japanese, Korean, Spanish, Turkish, Simplified Chinese and Traditional Chinese
There are also several fixes (with some variation between SVC and Storwize V7000, mainly around the platform hardware). The release notes (which you can find at the links above) detail them all. Two fixes I have been looking forward to are:
IC80253 Unable to log into the GUI if password contains special characters.
IC80501 Performance statistics collection fails to record read and write response times for internal drives.
Note that the drive firmware does not need to be updated with this release; the new upgrade test tool (version 7.3) will not ask you to update it. I will let you know when that situation changes.
IBM SAN Volume Controller (SVC) has offered Fibre Channel storage virtualization since June 2003. Two SVC nodes communicate with each other via Fibre Channel to form a high-availability I/O group. They communicate with the storage they virtualize via Fibre Channel, and they serve that virtual storage to hosts via Fibre Channel as well. When IBM added real-time (Metro Mirror) and near real-time (Global Mirror) replication, that too was done over Fibre Channel, with each SVC cluster connecting to the other via Fibre Channel protocol transported over dark fibre (with or without a WDM) or via FCIP (Fibre Channel over IP) routers.
Each Fibre Channel port on an SVC node can be a SCSI initiator to backend storage, a SCSI target to hosts, and all the while communicate with its peer nodes over those same ports. With every generation of SVC node these ports got faster, going from 2 Gbps to 4 Gbps to 8 Gbps. In SVC firmware V5.1, IBM added iSCSI capability to the SVC using the two 1 Gbps Ethernet ports in each node. This allowed each node to also be an iSCSI target to LAN-attached hosts.
When the Storwize V7000 came out in October 2010, it offered all of this capability, plus two fundamental changes to the design.
When IBM added 10 Gbps Converged Enhanced Ethernet adapters to the SVC and to the Storwize V7000, these adapters operated as iSCSI Targets, allowing clients to access their volumes via a high-speed iSCSI network. In V6.4 code IBM allowed these adapters to also be used for FCoE (Fibre Channel over Ethernet). These are also effectively SCSI targets ports allowing hosts that use CEE adapters to connect to the SVC or V7000 over a converged network.
If you have a look at the Configuration limits page for SVC and Storwize V7000 version 6.4 (the Storwize V7000 one is here), you will see this interesting comment:
"Partnerships between systems, for Metro Mirror or Global Mirror replication, do not require Fibre Channel SAN connectivity and can be supported using only FCoE if desired"
So does this mean we can stop using FCIP routers to achieve near real-time replication between SVC clusters or Storwize V7000s? The short answer is: most likely not. Let's look at why...
The whole reason Fibre Channel became the standard method to interconnect enterprise storage with enterprise hosts is simple: packet loss is prevented by buffer credit flow control. Frames are not allowed to enter a Fibre Channel network unless there are buffers in the system to hold them; frames are normally only dropped if there is no destination to accept them. Fibre Channel is a highly reliable, scalable and mature architecture. When we extend Fibre Channel over a WAN we do not want to lose this reliable nature, so we use FCIP routers like the Brocade 7800, which continue to ensure frames are reliably delivered, in order, from one endpoint to another.
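To get a feel for why buffer credits matter over distance, here is a rule-of-thumb sketch in Python. The numbers are my own back-of-envelope assumptions (a roughly 2148-byte full-size FC frame and ~5 microseconds per km propagation in fibre); real sizing should come from your switch vendor:

```python
import math

FRAME_BITS = 2148 * 8     # full-size FC frame including headers, ~2148 bytes
FIBRE_US_PER_KM = 5.0     # light in fibre travels ~1 km per 5 microseconds

def bb_credits_needed(distance_km: float, speed_gbps: float) -> int:
    """Rough buffer-to-buffer credits needed to keep a FC link streaming
    at full rate over distance: enough frames in flight to cover the
    round-trip time of the credit (R_RDY) handshake."""
    serialization_us = FRAME_BITS / (speed_gbps * 1000)  # time to put one frame on the wire
    round_trip_us = 2 * distance_km * FIBRE_US_PER_KM
    return math.ceil(round_trip_us / serialization_us) + 1

print(bb_credits_needed(100, 8))  # ~467 credits for 100 km at 8 Gbps
```

With only the default handful of credits per port, a long link would spend most of its time idle, waiting for acknowledgements - which is exactly the problem FCIP routers and distance-licensed switch ports exist to solve.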
Converged Enhanced Ethernet allows Fibre Channel to be transported inside enhanced Ethernet frames. One fundamental that CEE brings to the table is that same principle: a frame should not enter the network without a buffer to hold it. Extending FCoE over distance has the same challenge: the moment you start moving those frames over a WAN connection, you need to ensure frames are not lost due to congestion. How do we do this? The same way we did with Fibre Channel: we use dark fibre, we use WDMs, or we use routers. The same issues and requirements exist.
For more information on FCoE over distance check out this fantastic Q&A from Cisco:
If you want to understand FCoE better, this document from Brocade is very good: