I was at a customer discussing High Availability and came across an interesting aspect of WLM.
The scenario was two sites. Each site had a production LPAR and a development LPAR in the same sysplex (not good practice).
If the first site fails, all of its work moves to the second site. We all thought it was "obvious" that we should make the development LPAR very low priority, as we wanted production to get all of the resource.
The problem is that some sysplex-wide processes may be running on this development LPAR. If this LPAR gets no CPU, these processes will not run, and the whole sysplex will behave strangely.
I learned from this:
- Do not mix development/test in the same sysplex as production
- In a failure scenario, shut down the development LPAR rather than just giving it no resources.
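
To make the difference between "starved" and "shut down" concrete, here is a toy model in Python. It is only a sketch and not how WLM or PR/SM actually work: the `Lpar` class, the weight-based `cpu_share` calculation, and the names PROD and DEV are all hypothetical. In particular, the takeover rule (another active LPAR picks up the sysplex-wide task once its host is shut down) is my assumption about why shutting down is the safer option.

```python
from dataclasses import dataclass


@dataclass
class Lpar:
    name: str
    weight: int          # relative CPU share; 0 models "gets no CPU"
    active: bool = True  # False models an LPAR that has been shut down


def cpu_share(lpars, name):
    """Fraction of CPU the named LPAR receives in this toy model."""
    active = [l for l in lpars if l.active]
    total = sum(l.weight for l in active)
    for l in active:
        if l.name == name:
            return l.weight / total if total else 0.0
    return 0.0  # inactive or unknown LPARs get nothing


def host_of_shared_task(lpars, preferred):
    """LPAR hosting the sysplex-wide task.

    The preferred LPAR keeps the task while it is active, even if it is
    starved. Only when it is shut down does another active LPAR take over
    (an assumption of this sketch).
    """
    for l in lpars:
        if l.name == preferred and l.active:
            return l
    for l in lpars:
        if l.active:
            return l
    return None


def shared_task_runs(lpars, preferred):
    """True if the sysplex-wide task's host actually gets some CPU."""
    host = host_of_shared_task(lpars, preferred)
    return host is not None and cpu_share(lpars, host.name) > 0


# Development LPAR left up but given no CPU: the task stays there and stalls.
starved = [Lpar("PROD", 100), Lpar("DEV", 0)]

# Development LPAR shut down: the task moves to PROD and keeps running.
shut_down = [Lpar("PROD", 100), Lpar("DEV", 0, active=False)]

print(shared_task_runs(starved, "DEV"))
print(shared_task_runs(shut_down, "DEV"))
```

In this model the starved case leaves the task stuck on an LPAR with a zero share, while the shut-down case lets it run on production, which matches the behaviour described above.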