Today's post is light on technical detail, but hopefully still insightful. Many clients have tried-and-true internal holiday preparedness plans and have been very successful with them. For others, there are factors that cannot be anticipated, or that end up being overlooked. One of the main objectives of all the e-support we put out during this season is to address exactly that: learn from the mistakes of others. Each year there are a few situations close to Black Friday (or other client-specific traffic and sales peaks) that really handcuff IBM Support when we are engaged. The gist of everything I'll be posting this year is 'be prepared', and that means preparing in early summer, not late November. Below are some paraphrased summaries based on real-life situations we have encountered. Consider whether you might find yourself asking any of these questions in the months ahead. The content IBM Support is sharing throughout the season will elaborate on each of these in detail, from a more technical perspective.
Worst practice: “I’m relieved we got our massive new catalog on production now. At least we tested the old one already last month.”
Best practice: make load testing representative of peak day, in terms of infrastructure, integrations, user shopping patterns, and data!
Worst practice: “We just loaded all the new promotions from marketing last night, and now it takes 20 seconds to add to cart!”
Best practice: see above! Combinations of promotions being calculated during checkout can really kill your performance, so work out the bugs in advance.
Worst practice: “We only have 3 more defects to close out then we can push out our final release just in time for Thanksgiving rush!”
Best practice: do not make code changes into November. Have a code freeze well in advance so that no new factors are introduced into your tests.
Worst practice: “There was a mandatory security fix pushed out in September? Can we still squeeze it in on time?”
Best practice: Don't be the last to know! Stay on top of proactive alerts and notifications from IBM to be aware of any critical fixes.
Worst practice: “We had an outage on Monday, but didn’t capture any javacores. What do we need in place for next time?”
Best practice: Review key performance MustGathers in advance, and prepare any necessary configurations. Don't require a second costly outage to troubleshoot the issue.
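As one example of preparing in advance: IBM JVMs on Linux and UNIX write a javacore by default when the process receives SIGQUIT (the same signal as `kill -3 <pid>`). The sketch below requests a short series of javacores from a running JVM. The function name, the count, and the interval are illustrative assumptions, not values taken from a MustGather; use the MustGather document for your product and version for the actual collection steps.

```python
import os
import signal
import time


def request_javacores(pid, count=3, interval_sec=30, send=os.kill, sleep=time.sleep):
    """Send SIGQUIT to a JVM process `count` times, pausing between signals.

    `count` and `interval_sec` are placeholder values (assumptions).
    `send` and `sleep` are injectable so the function can be dry-run
    without signalling a real process.
    """
    for i in range(count):
        send(pid, signal.SIGQUIT)  # equivalent to: kill -3 <pid>
        if i < count - 1:
            sleep(interval_sec)  # no pause needed after the last signal


if __name__ == "__main__":
    # Dry run: print what would be sent instead of signalling anything.
    request_javacores(
        12345,  # hypothetical PID
        send=lambda pid, sig: print(f"would send signal {int(sig)} to {pid}"),
        sleep=lambda seconds: None,
    )
```

Rehearsing a script like this on a test system before the peak confirms that the javacores actually land where you expect, and that the account running it has permission to signal the JVM.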
Worst practice: “Anyone else notice those Out-of-memory errors we’ve been getting for a while?”
Best practice: The quiet days leading up to Black Friday are not a good time to start discovering long-standing problems on your site. A log full of exceptions also makes it incredibly difficult to isolate and debug a specific problem, and to differentiate 'what is normal' from 'what is happening now'.
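One way to pin down 'what is normal' ahead of time is to tally the exceptions in a log from a quiet period and compare later logs against that tally. The following is a minimal sketch, not an IBM tool: the regular expression, the function names, and the 2x growth threshold are all illustrative assumptions, and it only recognizes Java-style class names ending in `Exception` or `Error`.

```python
import re
from collections import Counter

# Matches Java-style class names such as java.lang.OutOfMemoryError
# or NullPointerException (an assumption about the log format).
EXCEPTION_RE = re.compile(r"\b((?:[A-Za-z_]\w*\.)*[A-Z]\w*(?:Exception|Error))\b")


def count_exceptions(lines):
    """Count occurrences of each exception class name in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        for name in EXCEPTION_RE.findall(line):
            counts[name] += 1
    return counts


def new_or_growing(baseline, current, factor=2.0):
    """Return exceptions missing from the baseline, or at least `factor` times more frequent.

    The result maps each flagged name to a (baseline_count, current_count) pair.
    """
    flagged = {}
    for name, n in current.items():
        base = baseline.get(name, 0)
        if base == 0 or n >= factor * base:
            flagged[name] = (base, n)
    return flagged


if __name__ == "__main__":
    # Hypothetical log excerpts, for illustration only.
    quiet_day = [
        "SystemErr R java.lang.NullPointerException",
        "SystemErr R java.lang.NullPointerException",
    ]
    today = [
        "SystemErr R java.lang.NullPointerException",
        "SystemErr R java.lang.OutOfMemoryError: Java heap space",
    ]
    print(new_or_growing(count_exceptions(quiet_day), count_exceptions(today)))
```

The cleaner the baseline, the more useful the comparison, which is one more reason to clear out long-standing exceptions well before the peak.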
Worst practice: “I just noticed our site is pretty slow. We still have a few days till Thanksgiving, should we open a PMR?”
Best practice: If you are having trouble getting your site up to speed, engage IBM Support or Services as soon as possible. There is very little that can be done at the last minute.
Worst practice: "We're on 22.214.171.124, can we get that caching fix retrofit?"
Best practice: Being on old maintenance levels hurts our ability to troubleshoot, mitigate, or provide fixes or enhancements. You do not want to lose revenue, or spend money, diagnosing and resolving issues that are already known!
Worst practice: "We added more CPUs just as a buffer, so we may as well increase the webcontainer threads too so we can handle more throughput"
Best practice: Tuning is an art! More CPUs do not automatically mean more webcontainer threads. IBM Lab Services are the IBM tuning experts, but there are various configuration mistakes that we see time and time again.
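To see why thread counts should follow measured load and not CPU count, a rough starting point is Little's Law: the average number of requests in flight equals throughput multiplied by average response time. The sketch below applies that arithmetic. It is not an IBM sizing formula; the function name, the `headroom` parameter, and the sample numbers are hypothetical, and real values should come from representative load tests.

```python
import math


def estimated_busy_threads(throughput_per_sec, avg_response_sec, headroom=1.0):
    """Estimate concurrently busy request threads using Little's Law.

    in-flight requests = throughput (req/s) * average response time (s)
    `headroom` is an assumed multiplier for bursts; 1.0 means none.
    """
    in_flight = throughput_per_sec * avg_response_sec
    return math.ceil(in_flight * headroom)


if __name__ == "__main__":
    # Hypothetical peak: 200 requests/s at 0.25 s average response time.
    print(estimated_busy_threads(200, 0.25))                # 50
    # Same load with 50% headroom for bursts.
    print(estimated_busy_threads(200, 0.25, headroom=1.5))  # 75
```

Note that CPU count does not appear in the calculation at all. If response time is dominated by waits on the database or another back end, adding threads mostly adds more waiting requests, which is why changes like this belong in a load test and not in a last-minute configuration tweak.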