Overview of LDAP caches in Portal
WebSphere Portal 6.1 and 7.0 both use a number of out-of-the-box caches to improve system performance. When Portal is configured against an enterprise LDAP, two specific sets of caches hold data from the LDAP:
- Portal User Management Architecture (PUMA) caches
- Virtual Member Manager (VMM) caches
Rather than going to the LDAP every single time LDAP data is needed, Portal uses these caches instead, significantly reducing load on the LDAP server. A typical flow looks like this:
1) Portlet needs LDAP data (such as user or group information)
2) Portlet asks PUMA for LDAP data
3) PUMA checks PUMA caches to see if LDAP data is cached and not yet expired
-->If cached and not yet expired, it will return the data to the portlet.
-->If not cached, OR, if cache has expired, move onto next stage in flow
4) PUMA will query VMM for the data
*Technical note: Portal itself never talks directly to the LDAP during runtime. In a standalone LDAP configuration, both WAS and VMM code are used to talk to the LDAP. In a federated LDAP configuration, only VMM code is used to talk to the LDAP. The key point is that during runtime Portal always goes through another layer to reach the LDAP.
5) VMM checks VMM caches to see if LDAP data is cached and not yet expired
-->If cached and not yet expired, it will return the data to PUMA. PUMA will update its caches with the data from VMM and return the data to the portlet.
-->If not cached, OR, if cache has expired, move onto next stage in flow
6) VMM queries LDAP for requested data. VMM caches are updated.
7) VMM returns to PUMA the requested data. PUMA caches are updated.
8) PUMA returns requested data to the portlet
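The flow above can be sketched as a small, self-contained Java simulation. This is only an illustration under stated assumptions, not Portal or VMM code: the class name `TwoTierCacheSketch`, the `lookup` method, the in-memory `ldap` map and the explicit "minutes" clock are all made up for the example. Only the 10-minute and 20-minute timeouts come from the defaults described in this entry.

```java
import java.util.HashMap;
import java.util.Map;

public class TwoTierCacheSketch {

    /** A cached value plus the minute at which it stops being valid. */
    static final class Entry {
        final String value;
        final long expiresAt;
        Entry(String value, long expiresAt) {
            this.value = value;
            this.expiresAt = expiresAt;
        }
    }

    /** Minimal time-to-live cache; the current minute is passed in so the sketch is deterministic. */
    static final class TtlCache {
        private final Map<String, Entry> entries = new HashMap<>();
        private final long ttlMinutes;
        TtlCache(long ttlMinutes) { this.ttlMinutes = ttlMinutes; }

        String get(String key, long now) {
            Entry e = entries.get(key);
            return (e != null && now < e.expiresAt) ? e.value : null;
        }

        void put(String key, String value, long now) {
            entries.put(key, new Entry(value, now + ttlMinutes));
        }
    }

    final Map<String, String> ldap = new HashMap<>(); // hypothetical stand-in for the directory
    final TtlCache pumaCache = new TtlCache(10);      // default PUMA timeout from the text
    final TtlCache vmmCache = new TtlCache(20);       // default VMM timeout from the text
    int ldapQueries = 0;

    /** Steps 2-8 of the flow: PUMA cache first, then VMM cache, then the LDAP itself. */
    String lookup(String key, long now) {
        String value = pumaCache.get(key, now);       // step 3
        if (value != null) {
            return value;
        }
        value = vmmCache.get(key, now);               // step 5
        if (value == null) {
            value = ldap.get(key);                    // step 6: only now is the LDAP queried
            ldapQueries++;
            vmmCache.put(key, value, now);
        }
        pumaCache.put(key, value, now);               // PUMA cache updated on the way back
        return value;                                 // step 8
    }

    public static void main(String[] args) {
        TwoTierCacheSketch portal = new TwoTierCacheSketch();
        portal.ldap.put("sn", "Smith");
        System.out.println("t=0  " + portal.lookup("sn", 0));   // Smith (read from LDAP)
        portal.ldap.put("sn", "Jones");                         // direct update in the LDAP
        System.out.println("t=5  " + portal.lookup("sn", 5));   // Smith (PUMA cache hit)
        System.out.println("t=12 " + portal.lookup("sn", 12));  // Smith (PUMA expired, VMM hit)
        System.out.println("t=21 " + portal.lookup("sn", 21));  // Smith (PUMA refilled at t=12)
        System.out.println("t=25 " + portal.lookup("sn", 25));  // Jones (both expired)
        System.out.println("LDAP queries: " + portal.ldapQueries); // 2
    }
}
```

In this simplified model, five lookups over 25 minutes cost only two LDAP queries, and the change made in the directory just after t=0 does not become visible until both caches have expired.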
By default, PUMA cache entries expire after 10 minutes and VMM cache entries after 20 minutes. Both settings are configurable. The defaults generally strike a good balance: they keep data cached during peak busy periods, and they let the caches expire during non-busy periods so as not to use up too much server memory. The two sets of caches operate independently of each other, so it is entirely possible for one cache to time out while the other is still active.
Direct Updates to data in LDAP do not show in the Portal
While caches offer the advantage of reducing load on your LDAP server, the one disadvantage is that data in the Portal is not guaranteed to be as current as the data in the LDAP. That is, it is entirely possible that an update is made to the data in the LDAP but does not show up immediately in the Portal, because Portal is still pulling the cached data from either the PUMA caches or the VMM caches. Generally speaking, updates made to data in LDAP become visible in the Portal after 30-60 minutes, once the caches expire.
One potential strategy is to modify the timeout values from the defaults of 10 minutes for PUMA, and 20 minutes for VMM to a lower value. The following Technote discusses this strategy in more detail:
As noted in the Technote, turning off both caches is NOT recommended. In addition to increased CPU/memory usage on the Portal server, you run the risk of overloading your production LDAP server(s), risking an enterprise outage of login services across multiple applications.
OK, so we know there is an advantage to the caches to help with performance, and there is also a disadvantage that the caches may contain stale data relative to what's in the LDAP. Is there a way to get the best of both worlds, i.e. maintain the caches in most cases, but, when needed we can pull the most recent data from the LDAP?
The answer is yes. APAR PM16430 introduced the ability to programmatically invalidate the current LDAP caches and refresh them from the current data in LDAP.
Sample Code to programmatically refresh data from the LDAP
How do we implement the steps in PM16430? Let's go over some prerequisites before we get into specifics:
i. You should be in a federated LDAP configuration. Federated LDAP code is guaranteed to go through VMM (and the VMM caches) every time; a standalone LDAP configuration with WebSphere Portal is not. The APAR will still work with standalone LDAP code, but the recommendation is to have a federated LDAP configuration if you plan on implementing the APAR.
-->See this Infocenter document for how to convert from standalone LDAP to federated LDAP if needed: http://www-10.lotus.com/ldd/portalwiki.nsf/dx/Changing_from_a_standalone_repository_to_a_federated_repository_on_AIX_wp7
ii. You should already have code in place which leverages the PUMA API to query data from the LDAP, OR, you should have a willingness to experiment with creating such code
iii. Review the following blog entry to be familiar with using JSPs to implement sample code: https://www.ibm.com/developerworks/mydeveloperworks/blogs/PortalL2Thoughts/entry/debugging_your_portal_with_jsps_basics28?lang=en
To get started:
1) Locate the wimconfig.xml file under:
In the file, locate the section which corresponds to the LDAP server whose caches you wish to programmatically invalidate. From my lab system, we have:
<config:repositories xsi:type="config:LdapRepositoryType" adapterClassName="com.ibm.ws.wim.adapter.ldap.LdapAdapter"
id="PORTSELDAP" isExtIdUnique="true" supportAsyncMode="false" supportExternalName="false"
supportPaging="false" supportSorting="false" supportTransactions="false" certificateFilter=""
certificateMapMode="EXACT_DN" ldapServerType="IDS" translateRDN="false">
<config:baseEntries name="o=ibm,c=us" nameInRepository="o=ibm,c=us"/>
Note the id="PORTSELDAP" in particular; we'll need this a bit later.
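If the file defines several repositories, the ids can also be listed with a few lines of standalone Java instead of reading the XML by eye. The following is a minimal sketch, assuming the element and attribute names shown in the excerpt above. The class `RepositoryIdLister`, the `<excerpt>` wrapper element and the second repository in the sample string are hypothetical and exist only to keep the example self-contained; a real wimconfig.xml has more surrounding structure.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class RepositoryIdLister {

    /** Returns the id of every config:repositories element typed as an LDAP repository. */
    static List<String> ldapRepositoryIds(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Not namespace-aware: match on the literal prefixed names used in the file
            factory.setNamespaceAware(false);
            Document doc = factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

            List<String> ids = new ArrayList<>();
            NodeList repos = doc.getElementsByTagName("config:repositories");
            for (int i = 0; i < repos.getLength(); i++) {
                Element repo = (Element) repos.item(i);
                if ("config:LdapRepositoryType".equals(repo.getAttribute("xsi:type"))) {
                    ids.add(repo.getAttribute("id"));
                }
            }
            return ids;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse the configuration excerpt", e);
        }
    }

    public static void main(String[] args) {
        // Trimmed-down sample; the wrapper and the second repository are made up
        String sample =
              "<excerpt>"
            + "<config:repositories xsi:type=\"config:LdapRepositoryType\""
            + " id=\"PORTSELDAP\" ldapServerType=\"IDS\">"
            + "<config:baseEntries name=\"o=ibm,c=us\" nameInRepository=\"o=ibm,c=us\"/>"
            + "</config:repositories>"
            + "<config:repositories xsi:type=\"config:FileRepositoryType\" id=\"SomeFileRepo\"/>"
            + "</excerpt>";
        System.out.println(ldapRepositoryIds(sample)); // [PORTSELDAP]
    }
}
```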
2) Login to the WAS Admin console (or Deployment Manager if in a cluster). Navigate to Resources --> Resource Environment Providers --> WP PumaStoreService --> Custom Properties
3) Create a new custom property:
4) Create a new custom property:
*Note, for the value here, use the id= we identified from the wimconfig.xml file in step #1.
5) Save changes. If clustered, sync nodes.
6) Restart the Portal Servers.
7) Create a new file named reloadall.jsp with the following code:
<%@ page import="com.ibm.portal.um.*" %>
<%@ page import="javax.naming.*"%>
<%@ page import="java.util.*"%>
<%
final String BR = "<br><br>";
out.print("Starting to gather info"); out.print(BR);

Context ctx = null;
PumaHome pumaHome = null;
PumaProfile pumaProfile = null;
PumaLocator pumaLocator = null;
PumaController pumaController = null;

try
{
    // Look up the PUMA entry point through JNDI and obtain its helper objects
    ctx = new InitialContext();
    pumaHome = (PumaHome)ctx.lookup(PumaHome.JNDI_NAME);
    pumaProfile = pumaHome.getProfile();
    pumaController = pumaHome.getController();
    pumaLocator = pumaHome.getLocator();

    // List of attribute names passed to getAttributes()
    List myList = new ArrayList();
    // Wildcard search: every user that has a uid
    List userList = pumaLocator.findUsersByAttribute("uid","*");

    if (userList.size() == 0)
    {
        out.print("No results returned from search! Nothing to reload"); out.print(BR);
    }
    else
    {
        out.print("Found a number of users in LDAP " + userList.size()); out.print(BR);
        for (int i = 0; i < userList.size(); i++)
        {
            User user = (User)userList.get(i);
            // First read: served from the PUMA/VMM caches, so it may be stale
            out.print("About to Reload User Attributes for user number " + (i+1) + " " + pumaProfile.getAttributes(user, myList)); out.print("<br>");
            // The reload() call described in step 12 belongs here, between the two reads.
            // Assumption: it is exposed by the PumaController obtained above; confirm the
            // owning interface and signature in the PUMA SPI Javadoc before enabling it.
            // pumaController.reload(user);
            // Second read: shows the attributes as they are after the reload
            out.print("Done Reloading. Updated User Attributes for user " + pumaProfile.getAttributes(user, myList)); out.print(BR);
        }
    }
}
catch (Exception e)
{
    out.print("Exception occurred!!! " + e); out.print(BR);
}
%>
8) Upload the reloadall.jsp file to your Portal server under <wp_profile>/installedApps/<cellname>/PA_Blurb.ear/Blurb.war/jsp
9) Set up a test page with the Welcome portlet. Configure the Welcome portlet to point to the following jsp:
10) Allow the page to refresh. At this point, we are pulling data for the first time from the LDAP, so the information will be current / fresh.
11) Make a direct change in the LDAP to one of the users. For purposes of this article, I changed the "sn" attribute on the "uid=travis" user
12) At this point, the data in the LDAP is updated and the Portal/VMM caches are both stale. The first call we make to fetch the LDAP data pulls from these caches and returns stale, outdated data. The reload() call, now with PM16430 active, invalidates the caches and pulls the data fresh from the LDAP. The second call we make to fetch the LDAP data still pulls from the caches; however, because the reload() call has pulled fresh data into the caches, we now display the correct, updated data from the LDAP.
13) Enjoy! Feel free to implement within your own custom applications as needed.
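The stale read, reload, fresh read sequence from step 12 can be modelled outside of Portal with a few lines of plain Java. This is a simplified sketch and not the PUMA implementation: `ReloadSketch`, `getAttribute` and the three maps are hypothetical names, cache timeouts are left out, and `reload` here simply drops the entry from both simulated caches and reads it again, which is the behavior this entry describes with PM16430 active.

```java
import java.util.HashMap;
import java.util.Map;

public class ReloadSketch {

    final Map<String, String> ldap = new HashMap<>();      // hypothetical stand-in for the directory
    final Map<String, String> pumaCache = new HashMap<>(); // simulated PUMA cache
    final Map<String, String> vmmCache = new HashMap<>();  // simulated VMM cache

    /** Cached read: PUMA cache, then VMM cache, then the directory. */
    String getAttribute(String key) {
        String value = pumaCache.get(key);
        if (value == null) {
            value = vmmCache.get(key);
            if (value == null) {
                value = ldap.get(key);
                vmmCache.put(key, value);
            }
            pumaCache.put(key, value);
        }
        return value;
    }

    /** Invalidate both cached copies, then read again so the caches hold fresh data. */
    void reload(String key) {
        pumaCache.remove(key);
        vmmCache.remove(key);
        getAttribute(key);
    }

    public static void main(String[] args) {
        ReloadSketch portal = new ReloadSketch();
        portal.ldap.put("uid=travis/sn", "Smith");
        System.out.println(portal.getAttribute("uid=travis/sn")); // Smith (step 10, first load)

        portal.ldap.put("uid=travis/sn", "Jones");                // step 11, direct LDAP change
        System.out.println(portal.getAttribute("uid=travis/sn")); // Smith (stale, from cache)

        portal.reload("uid=travis/sn");
        System.out.println(portal.getAttribute("uid=travis/sn")); // Jones (cache was refreshed)
    }
}
```

Note that the last read is still a cache hit. The data is current only because reload() replaced the cached copies, which is why the caches can stay enabled for every other request.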
Addendum to Original Posting
New bits of information will be periodically added to this blog entry.
1) Data stored in a property extension database will never be stored in VMM caches. However, it is possible for that data to be stored in PUMA caches. It is also possible for another application (say the Deployment Manager) to make an update to the property extension data. Thus, it is possible for the same condition to occur: ApplicationA (DMGR) makes an update to the property extension database, while ApplicationB (Portal) is still actively using its caches and may not retrieve the new data. Therefore, reload() should be used to invalidate the PUMA caches and force the updated data to be retrieved from the property extension database.
2) The APAR and reload() functionality does NOT apply to groups and listing members of a group. The results of those calls are stored in the SearchResults cache, whereas the APAR only applies to invalidating the AttributesCache. Therefore, if you make a direct update in LDAP to a group (say, adding a new member to a group), the Portal server is not guaranteed to pull this new data immediately. For example, if you search for the group name in the Manage Users & Groups portlet and click on the group, it may show outdated information relative to what's in the LDAP.
However, for access control purposes, such as a user's ability to see pages and portlets based on which LDAP groups they belong to, there is a potential workaround via a membership attribute (ibm-allGroups for Tivoli Directory Server, memberOf for Active Directory, etc.). In general, membership attributes offer huge performance gains during user logins, and we recommend implementing a membership attribute if your LDAP server supports it. Normally, without a membership attribute, we query the LDAP for the groups and their members one at a time, eventually determining which groups the individual user belongs to. Think of this as a "Group" operation, similar to the example above with the Manage Users and Groups portlet. With a membership attribute, a single query is made to the LDAP server to pull the groups the user belongs to. Think of this as a "User" operation, and note the word "attribute" in particular. User + attribute means the membership attribute is stored in the AttributesCache, and therefore PM16430+reload()+a membership attribute allows us to pull updated group information for a user, in addition to the other updated attribute information. Using our example above, if we call reload() for this user immediately after they are added to a group in the LDAP, and we have a membership attribute configured, their updated groups will be successfully pulled from LDAP. Note that due to other Portal Access Control caches in Portal, the changes may not show immediately. If you have a requirement for such changes to show up immediately, please open a PMR with IBM Support, referencing this blog entry and an interest in "PAC invalidation API".
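To make the difference between the "Group" operation and the "User" operation concrete, here is a rough Java sketch that counts simulated directory queries for both approaches. It is only a model: the class `MembershipLookupSketch`, its two maps and the query counter are hypothetical, nested groups are ignored, and a real LDAP server would maintain the membership attribute itself.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class MembershipLookupSketch {

    // group -> members, as stored on the group entries
    final Map<String, Set<String>> groupMembers = new LinkedHashMap<>();
    // user -> groups, standing in for a membership attribute such as ibm-allGroups or memberOf
    final Map<String, Set<String>> membershipAttribute = new HashMap<>();
    int queries = 0; // number of simulated round trips to the directory

    void addMember(String group, String user) {
        groupMembers.computeIfAbsent(group, g -> new LinkedHashSet<>()).add(user);
        membershipAttribute.computeIfAbsent(user, u -> new LinkedHashSet<>()).add(group);
    }

    /** "Group" operation: read each group's members, one query per group. */
    Set<String> groupsByScanning(String user) {
        Set<String> result = new TreeSet<>();
        for (Map.Entry<String, Set<String>> group : groupMembers.entrySet()) {
            queries++;
            if (group.getValue().contains(user)) {
                result.add(group.getKey());
            }
        }
        return result;
    }

    /** "User" operation: a single query for the user's membership attribute. */
    Set<String> groupsByMembershipAttribute(String user) {
        queries++;
        return new TreeSet<>(
                membershipAttribute.getOrDefault(user, Collections.<String>emptySet()));
    }

    public static void main(String[] args) {
        MembershipLookupSketch directory = new MembershipLookupSketch();
        directory.addMember("cn=admins", "uid=travis");
        directory.addMember("cn=editors", "uid=maria");
        directory.addMember("cn=readers", "uid=travis");

        System.out.println(directory.groupsByScanning("uid=travis"));            // [cn=admins, cn=readers]
        System.out.println("queries so far: " + directory.queries);              // 3
        System.out.println(directory.groupsByMembershipAttribute("uid=travis")); // [cn=admins, cn=readers]
        System.out.println("queries so far: " + directory.queries);              // 4
    }
}
```

Both approaches return the same groups, but the scan costs one query per group in the directory, while the membership attribute costs one query per user regardless of how many groups exist.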
3) WebSphere Application Server has its own code which can be used to pull a user's groups from the LDAP. WebSphere Application Server also maintains a separate set of caches for this information. WebSphere Portal offers a configuration option to allow Portal to reuse group information from WAS. This option is disabled by default, so by default Portal does not reuse group information from WAS. If the reuse WAS group information option is enabled, then PM16430+reload()+a membership attribute will still refresh information from LDAP in the Portal and VMM caches, but not necessarily in the WAS caches. In this situation, the WAS caches which store group information for a user will be outdated, while the Portal PUMA and VMM caches will be current. However, because Portal is configured to reuse group information from WAS, it will do so before checking its own PUMA caches or VMM caches. Therefore, the WAS caches may contain stale data, and Portal will reuse that stale data. More details on this condition are noted in the following Technote: http://www-01.ibm.com/support/docview.wss?uid=swg21593268
If this condition occurs, a decision will need to be made, weighing the need to reuse WAS group information against the functionality offered by PM16430+reload(). If the PM16430+reload() functionality is desired, then disable group reuse as noted in the Technote. If the reuse WAS group information option should remain enabled, consider adjusting the timeouts on your caches as noted further up in the original blog entry under the subheading "Direct Updates to data in LDAP do not show in the Portal".