I am preparing sessions for a POWER/AIX Technical Conference in Norway - unfortunately the only place they could book for the event was a Ski Resort up in the mountains and there is still good snow. It is a tough gig but someone has to do it :-) As part of that conference, I am updating them on Systems Director and demonstrating it. I had a slide with these hints and tips. A slide with zero percent marks for style (far too many words and just a long list) and I thought ... I should share these with everyone, so here they are. Most of these are the result of my mistakes, hard-won experience or working with customers trying Systems Director.
1) Trying Systems Director for the first time?
2) Know and check the pre-requisites - no rule bending as it will not work
- Use the very latest Systems Director version with all service packs and update the Common Agents on AIX+VIOS to the latest version.
- Don't run last year's version with its now-fixed bugs!
3) Do you know AIX?
- From my list (that might be a little out of date):
- HMC 7.3.4+ - I am on 7.7.2
- VIOS 2.1.2+ - I am on 2.2 FP24 sp1.
- AIX 5.3 up to TL06 - will never work with Systems Director and I would update the TL anyway!
- AIX 5.3 TL07…TL12 - install the latest Common Agent manually (note TL12 is the last one ever)
- AIX 6.1 TL00…TL02 - will never work with Systems Director and I would update the TL anyway!
- AIX 6.1 TL03…TL06 - good but update to the latest Common Agent with Systems Director Update Manager
- Always add the latest AIX service packs - a few of these fix Common Agent bugs.
- NIM Server - might as well start as high as possible: AIX 7.1 at the latest TL and service packs
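The AIX rows of that list can be turned into a quick check. This is only a sketch: the function name `sd_aix_check` and its messages are made up for illustration, and it assumes a level string in the form that `oslevel -r` prints (for example 6100-04). Levels the list does not mention are reported as "not covered" rather than guessed at.

```shell
# sd_aix_check <level>: classify an AIX level such as "6100-04" or
# "5300-08-03-0831" against the list above. Hypothetical helper.
sd_aix_check() {
    rel=${1%%-*}          # release part, e.g. 6100
    tl=${1#*-}            # drop the release ...
    tl=${tl%%-*}          # ... and anything after the TL, e.g. 04
    tl=${tl#0}            # strip one leading zero: 04 -> 4
    case $rel in
        5300)
            if [ "$tl" -le 6 ]; then
                echo "no good: update the TL"
            elif [ "$tl" -le 12 ]; then
                echo "install the latest Common Agent manually"
            else
                echo "not covered by this list"
            fi ;;
        6100)
            if [ "$tl" -le 2 ]; then
                echo "no good: update the TL"
            elif [ "$tl" -le 6 ]; then
                echo "good: update the Common Agent with Update Manager"
            else
                echo "not covered by this list"
            fi ;;
        *)
            echo "not covered by this list" ;;
    esac
}
```

On an AIX box you would call it as `sd_aix_check "$(oslevel -r)"`.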
If not why are you reading this?
4) Allocate plenty of resources for the Systems Director server
- Run the Systems Director server on AIX using spare “free” Power CPU cycles you have in the shared CPU Pool.
- Don't try Systems Director on your first ever Linux on POWER virtual server - KISS.
- Let's have a quick success before we try to run "lean and mean".
- This avoids many self-inflicted problems - sorry but I have no sympathy.
- CPU: POWER5/6/7 two shared CPUs, uncapped and virtual CPU=4, SMT=maximum.
- Memory: 6 to 8 GB RAM and, if you have it, switch on AME.
- Disk: 64 GB of disk space on four or more spindles.
- Network: 1Gbps - virtual Ethernet is fine.
- OS: AIX 6.1 TL05 or later or AIX 7.1 + all service packs
6) Back up every night - even the prototype - or support can’t help you after it has gone "pear shaped"
- Use the default Apache Derby DB for no more than 200 AIX endpoints - I found it slows down from there (it is slower and takes more memory than DB2)
- If you use the default Apache Derby DB, never run out of memory or paging space, and back up every day
- Only use default Apache Derby for a quick test that you throw away in two to four weeks' time - you have to start from scratch to switch to a real one
- DB2 is recommended – via the no cost single use license offering, see the download site for details.
- Only use one of the alternatives if a) you have the skills b) you have a licence and c) don't mind wasting your money!
- Currently Storage Control requires DB2 for the Systems Director database.
- RDBMS = needs disks for I/O speed & start with 64GB minimum.
7) Install VNC & do graphical installs (if not installp files)
- I have seen customers who trashed the database trying to update Systems Director - OK, something went badly wrong. They then asked for support to fix a now corrupt database but had no backup and expected a miracle.
- The first part of recovery would be rebuilding the database from the backup to a known working state - give the support guys a break!
8) Opt for manual install
- AIX system admin guys love installp and some parts, like the Common Agent, come that way but many parts don't and I have given up trying to do command line installs, which silently log failures to unknown log files.
- Some install tools have console mode - this is not the Power or AIX idea of a console like HMC VTerm but a Windows view of a console, as in a graphics screen and keyboard, and will never work on Power.
- If you have X Windows available via VNC for AIX then the errors are on the screen and it tends to work first time rather than tending to fail every time.
9) Don’t rely on the AIX lslpp command (list license program product) for the Systems Director server version
10) Security may not fit your systems management practices - just get over it!
- Remote initial installs of features from Systems Director also suffer from the Task Log user panel showing "error" with no hint of why.
- So I opt for manual installs - like for the VMControl NIM Agent and WPAR global AIX Agent.
- Then you know in 5 seconds that you forgot to install the pre-req - dsm.core or whatever.
11) For the first month use Systems Director base features and perhaps Active Energy Manager (AEM) ONLY
- We can't expect one tool to have 100s of security model, policy and method options to fit around the 1000s of permutations used by every computer room on the planet.
- You may have to bend a bit to fit in with Systems Director.
- It does have security but it might be different from your ideas.
- This is very common for most Systems Management tools and not a Systems Director only issue.
- Focus on the basics - not the rocket science advanced new features - let's walk before running a marathon.
- There are still lots of features and benefits, like:
- Health, Events, Thresholds, escalation
- Update HMC, System Firmware, AIX, AIX device drivers even VIOS
- Automation Monitor, filters, thresholds, action plans
- CEC level performance monitor (add the VMControl plug-in to get these switched on)
- Switch to Power Saving Favour Performance mode and get those POWER7 CPUs over-clocking.
12) Only print the Redbook, if you need a door stop!
13) You will need Systems Director support and comfort
- To pick up skills watch a movie/video - it's much more relaxing and the only way to learn a graphical user interface. See:
- http://tinyurl.com/AIXmovies = 19 Hands-on techie videos in the Systems Director section
- http://tinyurl.com/SDnewwiki = 25 Overview videos
- http://youtube.com search for (with the quote marks) “IBM Systems Director” = 28 videos
14) Don’t rush the Browser
- Some little thing will “get you stuck” and the only way to find out what is happening is to switch on Trace support, re-run the issue and study the human hostile XML logs.
- Reading these XML logs is a black art - about as friendly as the people in "The Matrix" film watching the raw Matrix feed ripple down the screen.
- Also get yourself a Systems Director mentor - sometimes it really helps to talk and exchange ideas, theories and experience.
15) Don't trust the Access Failed message
- Patience is a virtue (or so my wife tells me, often) and trust me, version 6.2.0 (May 2010) was a big improvement in speed.
- Don’t click or type as the screen arrives - you need all that JavaScript and stuff for the screen to work correctly.
- Wait until the page is fully displayed - in Firefox, I get a "Done" at the bottom left.
- And don't go updating to a bleeding edge browser either - early Firefox 4 beta had some weird effects and Chrome a few years ago was ... interesting in the wrong way (now fixed and works well)!
- Also note that modern browsers auto-fill on-screen fields (sometimes badly, as with IP addresses, or annoyingly, as with passwords) - this is a browser feature and nothing to do with Systems Director.
- Also sometimes a click on a button actually closes a "pop up list" instead of clicking the button and Systems Director "seems" to have ignored you - this is a browser feature and nothing to do with Systems Director.
16) Getting Access to an AIX Endpoint without the root password by doing it backwards (from AIX to the Systems Director server)
- When gaining Access to an Endpoint like AIX, it might report "Failed" but sometimes there is a “delayed OK” so don’t despair too early.
- Check you have the super user password right!!
- Then double check via the Navigator - All Systems and search for the Endpoint.
- I have seen it take 30 seconds.
16) My minimum filesystem space for installing Systems Director server, as AIX commands in a script:
- Some computer rooms don't allow root access or root password under normal operation.
- Log in to the virtual server (LPAR) endpoint as a non-root user and then get root privileges (you have to work this bit out yourself) and run the command below:
- /opt/ibm/director/agent/runtime/agent/toolkit/bin/configure.sh -force -amhost 184.108.40.206 -passwd XYZ
- The IP address is the Systems Director server (or more correctly the Agent Manager, which 99% of the time is the same thing) and the XYZ password is the Agent Manager password that you set when you installed Systems Director server.
- Another good command will get the virtual server (LPAR) endpoint to forget the Systems Director server it was connected to:
- /opt/ibm/director/agent/runtime/agent/toolkit/bin/configure.sh -unmanaged -force
- Particularly useful if you rebuilt your Systems Director server and the endpoint is still trying to "talk" to the old one using now invalid security keys.
- Which Systems Director server does AIX think it's connected to? The answer is in the file /opt/ibm/director/agent/runtime/agent/config/endpoint.properties
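The two configure.sh calls above can be wrapped into one small helper for the "rebuilt server" case. Treat this as a sketch under assumptions: the names `run`, `sd_repoint` and the `DRY_RUN` switch are mine, and running the -unmanaged call first followed by the -amhost call is my guess at a sensible order, not a documented procedure. By default it only prints what it would run. Remember that a password given on the command line can be seen by other users on the box.

```shell
# Path taken from the commands above.
CONFIGURE=/opt/ibm/director/agent/runtime/agent/toolkit/bin/configure.sh

# run: print the command when DRY_RUN=1 (the default), otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# sd_repoint <server IP> <Agent Manager password> (hypothetical helper):
# make the endpoint forget its old server, then register with the new one.
sd_repoint() {
    run "$CONFIGURE" -unmanaged -force &&
    run "$CONFIGURE" -force -amhost "$1" -passwd "$2"
}
```

Run as root on the endpoint, for example `DRY_RUN=0 sd_repoint <server-ip> <password>`, once the printed commands look right.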
17) Normally an AIX endpoint should have CAS Access but not necessarily CIM
- chfs -a size=512M /
- chfs -a size=3G /usr
- chfs -a size=1G /var
- chfs -a size=1G /tmp
- chfs -a size=5G /home
- chfs -a size=16G /opt
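As a script, the chfs lines above might look like the sketch below. The names `SIZES`, `run`, `current_mb`, `grow_fs`, `grow_all` and the `DRY_RUN` switch are invented for this example. It only grows a filesystem that is below the minimum: as far as I know, chfs with an absolute size smaller than the current one will try to shrink a JFS2 filesystem, which is not what you want here. The `current_mb` helper assumes the size in MB is the second column of the second line of `df -m` output.

```shell
# Minimum sizes from the list above, converted to MB.
SIZES="/:512 /usr:3072 /var:1024 /tmp:1024 /home:5120 /opt:16384"

# run: print the command when DRY_RUN=1 (the default), otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then echo "+ $*"; else "$@"; fi
}

# current_mb <mount point>: current size in whole MB (assumed df -m layout).
current_mb() {
    df -m "$1" | awk 'NR == 2 { print int($2) }'
}

# grow_fs <mount point> <wanted MB> <current MB>: grow only, never shrink.
grow_fs() {
    [ -n "$3" ] || { echo "cannot find the size of $1" >&2; return 1; }
    if [ "$3" -lt "$2" ]; then
        run chfs -a size="${2}M" "$1"
    else
        echo "# $1 already has $3 MB, leaving it alone"
    fi
}

# grow_all: walk the SIZES list and grow whatever is too small.
grow_all() {
    for pair in $SIZES; do
        fs=${pair%%:*}
        want=${pair#*:}
        grow_fs "$fs" "$want" "$(current_mb "$fs")"
    done
}
```

Call `grow_all` to see the plan, then `DRY_RUN=0 grow_all` as root to apply it.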
18) Easy to forget and difficult to find commands to "work" the Common Agent installed by default on your Virtual I/O Server
- If you Navigate to your AIX OS Endpoint, right click then Security and then Configure-Access you should see:
- CIM - No access - https://220.127.116.11:5989/
- CAS - Access OK - https://18.104.22.168:9510/
- You may also get SSH (if installed) as a bonus.
- No real idea what it all means but if CAS is not Access OK - most things will not work.
19) Fed up with that 30 minute User Timeout? - Stop it or make it larger
- startsvc DIRECTOR_agent
- stopsvc DIRECTOR_agent
20) Need to Discover a whole bunch of specific AIX clients?
- Edit /opt/ibm/director/lwi/conf/overrides/usmi_settings.properties
- Find the line containing "sessionsGlobalTimeout" and set it to more minutes, or to -1 (minus one) to disable the inactivity time-out.
- Then restart the Systems Director server: smstop; smstart; smstatus -r
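If you would rather script that edit than open an editor, something like the following could do it. It is a sketch only: the function name `set_session_timeout` is made up, and it assumes the setting sits on an uncommented line in the usual properties form `...sessionsGlobalTimeout... = value`, which you should check in your own file first. It keeps a .bak copy and avoids `sed -i`, which AIX sed does not have.

```shell
# set_session_timeout <properties file> <minutes, or -1 to disable>
# Hypothetical helper: rewrites the value on the uncommented line that
# contains sessionsGlobalTimeout and leaves every other line untouched.
set_session_timeout() {
    file=$1
    minutes=$2
    cp "$file" "$file.bak" || return 1     # keep the original as a backup
    sed "s/^\([^#]*sessionsGlobalTimeout[^=]*=\).*/\1$minutes/" \
        "$file.bak" > "$file"
}

# Example (not run here):
#   set_session_timeout /opt/ibm/director/lwi/conf/overrides/usmi_settings.properties -1
#   smstop; smstart; smstatus -r
```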
21) Only watch background Tasks from a master top level from the left-hand menu.
- One of the few times that I would recommend the command line interface.
- Use the following gem in a script:
- smcli discover -i IP-Address-of-AIX-Endpoint
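A minimal way to wrap that command in a script is to loop over a file of addresses. The names `run`, `discover_all` and `DRY_RUN`, and the file format (one IP address or hostname per line, blank lines and # comments ignored), are assumptions for this sketch; only `smcli discover -i` comes from the tip above. By default it just prints the commands.

```shell
# run: print the command when DRY_RUN=1 (the default), otherwise execute it
# with stdin detached so it cannot swallow the rest of the address list.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "+ $*"
    else
        "$@" < /dev/null
    fi
}

# discover_all <file>: run one discovery per address listed in the file.
discover_all() {
    while read -r ip rest; do
        case $ip in
            '' | '#'*) continue ;;          # skip blank lines and comments
        esac
        run smcli discover -i "$ip" || echo "discovery failed for $ip" >&2
    done < "$1"
}
```

Run it on the Systems Director server, for example `DRY_RUN=0 discover_all aix_clients.txt` (the file name is just an example).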
22) The AIX NIM server can be the Systems Director server itself
- I mean from the left-hand menu "Task Management" then "Active and Scheduled Jobs" - you can then drill down to the specific task and can follow the breadcrumbs back up to the top full list.
- If after starting any background task, you take an offered "Display Properties" button - you can't drill out to the full list of all tasks.
- You also get stuck in a weird state where the left-hand menu "Task Management" + "Active and Scheduled Jobs" will not work until you kill the single task "Display Properties" tab.
- This one drives me nuts.
- I still recommend a separate “disposable” NIM server as you can then upgrade the NIM server (it has to be at the latest AIX level that you want to deploy) without affecting the Systems Director server (which may not be tested on the very latest/just released AIX level).
- Disposable means you can "crash and burn" the NIM server and connect Systems Director to another one and it will carry on - it will if necessary re-download the packages to it.
- Do back up those Appliance images. Then if you rebuild the NIM server, VMControl can re-discover them once they are recovered from your backup.
Well, I hope this list is useful, saves you some time and lets you dodge a few pain points, thanks, Nigel Griffiths
ps: This is Part 1 as I think as soon as I hit the "Post" button, I will think of three more Hints but no promises.