I thought I'd never have to write about fillwords. I thought there would be a phase of a few months and then the topic would be dead. Strangely enough, it's still alive. I still get questions about them, I still see people blaming them, and I still see avoidable problems caused by changing them.
For every new line rate (now read "Generation" or "Gen"), the switch and HBA vendors are usually the first to adopt the new standard and release their products. It was the same for 8Gbps, which came with a new fillword. Fillwords are 4-byte words without a special task. A port sends them whenever it has nothing else to send. They're used to maintain the synchronization of the link, and so the fillword used up to and including 4Gbps was fittingly called IDLE. Depending on the workload, the ports and the CPU utilization of a PC have one thing in common: you see a lot of IDLE. Therefore it made sense to think about the optimal fillword, and so it was changed for 8Gbps. The first published version of the standard said, roughly, "Let's replace all instances of IDLE with a better one: ARBff". The first products were developed on that basis, among them Brocade's 8Gbps switches.
Later it turned out that it would be better not to replace all IDLEs out of hand, because they were used not only as a fillword but in link initialization, too. The standard was updated and then said, "Use ARBff as a fillword, but keep the IDLE for link initialization".
For products released after that point, the vendors usually implemented the new version of the standard, which was not compatible with the first one. So clients bought new 8Gbps-capable devices, for example DS5000 boxes or SAN Volume Controllers, and failed to get them online. These devices used the standard-compliant IDLE during the critical link initialization phase, and when they noticed that the switches sent ARBff instead, the link initialization failed.
I have to admit that most vendors' information policy was rather unfortunate at that time. Everybody blamed everybody else. After some protocol traces it was clear that the problem was the use of ARBff during link initialization. So as a workaround we recommended configuring the switches to use IDLE again (mode 0) instead of ARBff everywhere (mode 1). Eventually new firmware versions were written and Brocade came up with two new fillword modes - one of them compliant with the updated standard (mode 2) and another, more dynamic one (mode 3). Mode 3 tries ARBff in link initialization first (like mode 1) and, if that fails, behaves like mode 2. So mode 3 became the natural choice.
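To keep the four modes apart, here is a minimal Python sketch of the behavior described above. It is only a toy model, not switch firmware: the names Word, MODES and negotiate are made up for illustration, and real link initialization involves far more than picking a single word.

```python
from enum import Enum


class Word(Enum):
    IDLE = "IDLE"
    ARBFF = "ARBff"


# Hypothetical table: mode -> ordered attempts of (link-init word, fillword).
# Mode 3 has two attempts: first like mode 1, then like mode 2.
MODES = {
    0: [(Word.IDLE, Word.IDLE)],
    1: [(Word.ARBFF, Word.ARBFF)],
    2: [(Word.IDLE, Word.ARBFF)],
    3: [(Word.ARBFF, Word.ARBFF), (Word.IDLE, Word.ARBFF)],
}


def negotiate(mode, device_accepts_in_init):
    """Return the first (init word, fillword) pair whose link-init word
    the attached device accepts, or None if link initialization fails."""
    for init_word, fill_word in MODES[mode]:
        if init_word in device_accepts_in_init:
            return init_word, fill_word
    return None


# A device built to the updated standard expects IDLE during link init.
compliant_device = {Word.IDLE}
for mode in MODES:
    print(mode, negotiate(mode, compliant_device))
```

In this model, mode 1 is the only one that fails against a device that insists on IDLE during link initialization, which is the situation the early 8Gbps switches ran into.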
For some time we had a lot of cases for that problem, and many people in the broad area of storage came across the term fillword. While the number of problem cases decreased, the memory of fillwords stayed active in people's minds. In addition there is a counter called "er_bad_os" for each port. It means "Error: bad ordered set" and increases in basically two situations: 1) if such a 4-byte word is corrupted or 2) if the port receives an ordered set it didn't expect. The first situation is a problem, but you get other indications as well ("enc out", "enc in", ...). The second situation could for example happen if a running port expects the IDLE fillword (because it was configured to mode 0 as a workaround, as stated above) but receives ARBff. Although the counter increases in the ASIC, there is no impact on a running connection. In fact the Fibre Channel protocol says that each well-encoded ordered set without any other function should be treated the same way as an IDLE. So as long as there is no bit error in them, it doesn't matter what kind of fillword is received - the switch must use it to maintain the synchronization.
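The two situations can be told apart in a small Python sketch. This is a simplified model of the reasoning above, not how the ASIC really works: PortCounters, enc_errors (a stand-in for "enc in"/"enc out") and receive are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class PortCounters:
    er_bad_os: int = 0
    enc_errors: int = 0  # hypothetical stand-in for "enc in" / "enc out"


def receive(counters, expected, received, well_encoded=True):
    """Count one received fillword.

    Returns True if the word can still be used to maintain
    synchronization, i.e. the running link is not affected.
    """
    if not well_encoded:
        # Situation 1: corrupted word. A real problem, and it shows up
        # in other counters as well.
        counters.er_bad_os += 1
        counters.enc_errors += 1
        return False
    if received != expected:
        # Situation 2: unexpected but well-encoded word. The counter
        # goes up, but the word is treated like an IDLE.
        counters.er_bad_os += 1
    return True


port = PortCounters()
link_ok = receive(port, expected="IDLE", received="ARBff")
print(link_ok, port)
```

A rising er_bad_os with no accompanying encoding errors therefore points to situation 2, which is harmless for a running link.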
However, the myth was already born: Blame it on the fillword! For a lot of totally unrelated problems, like performance problems, CRC errors, occasional link resets and even SFP heat issues, SAN admins and even the support personnel for the attached devices blamed the fillword. "The fillword is wrong!", "Change the fillword first!", "Look at this rapidly increasing error counter!" - Changing the fillword mode to 3 became the new mantra for every storage problem, however remote. And now it's very similar to bloodletting in the medicine of previous centuries: a sophisticated-sounding theory everybody could agree on and a simple action plan.
But just like bloodletting, it only helps in certain situations and used as a general treatment it does more harm than good.
Changing the fillword mode is disruptive for a link. If you really have a problem with a wrong fillword setting, this is not much of a concern, because as stated above, the link initialization will have failed and the device isn't online at that moment anyway. But in all the cases where the port is actually up and running, there will be a new link initialization. All in-flight I/O on this port will be lost. There will be command timeouts. Error recovery needs to take place. Depending on the robustness of the attached device, this alone could lead to problems. As if that weren't enough, I have seen a lot of SAN admins changing the fillword mode even for normal E-ports, which is complete nonsense. Believe me, you don't want to disturb your fabric stability by bouncing each and every ISL in your SAN environment within a short time without a solid reason.
And changing running ports to a more compliant fillword is certainly NOT a solid reason.
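The argument so far can be condensed into a tiny decision helper. It is a sketch of the reasoning in this post only; the function and its parameters are made up and do not correspond to any switch command.

```python
def should_change_fillword(port_online: bool,
                           is_e_port: bool,
                           link_init_failing: bool) -> bool:
    """Return True only in the one case where changing the mode helps."""
    if is_e_port:
        # Bouncing ISLs just to change the fillword puts the fabric at risk.
        return False
    if port_online:
        # The link came up, so the fillword is not what's broken,
        # and changing the mode would bounce a working link.
        return False
    # Port is down: worth a look only if link initialization keeps failing.
    return link_init_failing


print(should_change_fillword(port_online=True, is_e_port=False,
                             link_init_failing=False))  # prints False
```

Anything that returns False here needs a different diagnosis, not a different fillword.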
The sad part is that the perceived problems often did improve after this action. But a simple portdisable/portenable would most probably have had the same effect. It's like patients who recover - not because of the bloodletting, but despite it.
Conclusion and tl;dr
Don't change the fillword mode on a running port! It's disruptive!