Portal administration and performance
I've updated my Theme Performance blog entry below on using "Expires" for Portal content to also force additional Cache-Control directives on content not already tagged by Portal.
In researching this, I found that a lot of people think you can use "FilesMatch" in the httpd.conf file to match content coming from Portal/WAS back to IHS through the plugin for this purpose. It will not work! The reason is that "FilesMatch" only matches the file names of files stored on the IHS server itself. To match content coming back from Portal/WAS, you have to use "LocationMatch". See the example in my blog post below related to Cache-Control.
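A minimal sketch of the kind of LocationMatch stanza I mean - the URI pattern and cache lifetime here are illustrative assumptions, and mod_headers must be loaded:

```apache
# Match Portal/WAS responses proxied through the plugin by their URI.
# FilesMatch would NOT work here: these responses are not files on
# the IHS file system. Pattern and max-age are illustrative only.
<LocationMatch "\.(css|js|gif|jpg|png)$">
    Header set Cache-Control "public, max-age=86400"
</LocationMatch>
```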
When asked if WebSphere Portal supports 64bit, what really needs to be asked is whether WebSphere Portal supports running in a 64bit JVM. We have supported running a 32bit JVM on 64bit hardware for quite some time. We introduced our first 64bit JVM support on the Series i platform with WP 5.0.2. We added 64bit JVM support on zLinux with WP 188.8.131.52 (to overcome the 31bit address space limitation with small maximum heap sizes) and most recently added 64bit JVM support with HP-UX on HP Integrity Servers with WP 6.0.1. We will be adding more and more platforms, especially with new releases, but I typically ask in return whether you are sure you really need 64bit JVM support.
Sure, with 64bit you can have very large heap sizes (many gigabytes instead of the maximum of 2GB on most UNIX systems), and thus allow a single application server instance to become CPU saturated before the heap is consumed, but that isn't necessarily a good thing. The larger the heap can grow, the longer garbage collections can take, especially full GCs, which require a pause of the JVM while the heap is scanned and defragmented, looking for the maximum amount of garbage to collect. The larger the heap, the greater the potential fragmentation, and thus the longer the full GC cycles. And with the sheer number of objects that are created and destroyed every second in a portal, heap fragmentation can happen more often than you might think. These pauses can amount to a poor user experience.
Personally, I haven't seen any specific data to suggest what the perfect maximum heap size is for a portal, and I'm certain that number will vary by implementation, but based on conversations I've had with performance specialists, I suspect it is somewhere in the 1.75GB to 2GB range.
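For reference, pinning a 32bit heap into that range with standard JVM arguments might look like this (values are illustrative, not a recommendation; -Xverbosegclog is the IBM JDK's GC logging option):

```
-Xms1792m -Xmx1792m -verbose:gc -Xverbosegclog:/tmp/gc.log
```

Setting -Xms equal to -Xmx avoids heap expansion pauses, and the verbose GC log lets you confirm how long full GCs actually take at that size.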
Thomas_Hurek
This blog post outlines a few development tips and tricks for a WebSphere Portal / WCM 8 project.
Rendering of a site area via site area rendering template:
With version 8 it is possible to render a site area that has no default content associated with it. To use this feature, create a rendering template for your site area, create a new site area authoring template, associate the rendering template with the site area authoring template in the rendering behavior section, and start creating site areas with that template. You can then also change the rendering template for the site area. This process helps you avoid having to create default content just to render a certain component.
The following screenshot shows a sample of such a site area:
Clean WebSphere Portal URLs without state information:
The following article explains how to implement state-free URLs:
Implementing friendly URLs in IBM WebSphere Portal 8-based WCM rendering
While the vast majority of WCM URLs were state-free, page navigator links still contained state information, since the information about which page to navigate to needs to be stored in the URL.
Order of Theme and Rendering Plugins
WCM Rendering Plugins can be very useful for influencing the rendered content of a WCM Rendering Portlet. Note that a Rendering Plugin gets called before the theme starts to render, so if code must execute before the Rendering Plugin, a different place has to be used - e.g. a custom SessionValidatorFilter.
Dynamic Content Spot JSPs are called without Render Parameters:
The Portal 184.108.40.206 and 8.0.x themes introduced the concept of dynamic content spots - layers that hold dynamic logic separated from the static html/css/... files that are by default stored in WebDAV. Note that the dynamic content spots do not have access to render parameters passed in the URL. If render parameters need to be accessed before portlet rendering starts, the logic needs to be implemented somewhere else - for instance as part of Default.jsp or in a custom SessionValidatorFilter.
Location when importing items with WCI as components or content:
Web Content Integrator is a powerful and easy-to-use tool for importing existing data into WCM. WCI can create not only content but also components. Note that it is currently not possible to specify the location of components, so they cannot be organized into folders while WCI runs; after the import, the location can be changed (e.g. via API or UI). For content, on the other hand, the location can be specified during the WCI import.
Nested Navigator Designs:
In today's blog post we will explain a few lessons around WebSphere Portal and Web Content Manager Administration:
1. How to exclude the friendly name of a page in the path:
Before installing maintenance, the TAI should be re-activated.
9. Modifying contenthandler / mycontenthandler URLs via ConfigService setting
You can find more information here: http://www-01.ibm.com/support/docview.wss?uid=swg21647572
AlexLang
A common problem that customers often face is having a large Portal cluster in one part of the world, but having to access it from another. Because of the large number of "GET" requests needed to render a Portal page, along with the latency of transcontinental access, performance typically suffers.
A typical Portal topology might look like this:
Client browser <-> WebServer <-> Portal Server
To improve the performance, a user can change the topology to this:
Client browser (in EMEA, for example) <-> WebServer (in EMEA) <-> Web Server (in USA) <-> Portal Cluster (co-located in USA)
In addition, the WebServer in EMEA would implement a caching proxy as part of the reverse proxy cache. So, the client browser would access the site via the EMEA WebServer. That EMEA webserver would serve content from its local cache when possible. If the content is not in the local cache, it would forward the request to the USA WebServer. Note that this would always happen for the base Portal page as well as any statics not already in the EMEA cache.
Instead of using the address of the server(s) in the USA, the EMEA customer would now request the page via the hostname of the EMEA WebServer. Something like this:
http://host.usa.com/wps/portal becomes http://host.emea.com/wps/portal
Users could either use the new hostname (as immediately above), or the DNS server in their geography could return the EMEA webserver instead of the USA version of it. The latter would make this local proxy server strategy transparent to the user.
The EMEA Apache web server would use this in its httpd.conf to proxy the request to the USA webserver:
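A minimal sketch of such a forwarding rule (the hostname is the placeholder from the example above; the paths are illustrative):

```apache
# EMEA httpd.conf: forward Portal requests to the USA web server
LoadModule proxy_module modules/mod_proxy.so
LoadModule proxy_http_module modules/mod_proxy_http.so
ProxyPass        /wps http://host.usa.com/wps
ProxyPassReverse /wps http://host.usa.com/wps
```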
Note that the EMEA web server would also use mod_disk_cache for the local caching of static content.
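A sketch of the corresponding disk cache configuration for Apache 2.2 / IHS 7 (the cache root path is a placeholder for your environment):

```apache
# EMEA httpd.conf: cache static content on local disk
LoadModule cache_module      modules/mod_cache.so
LoadModule disk_cache_module modules/mod_disk_cache.so
CacheEnable disk /wps
CacheRoot "/var/cache/ihs"   # placeholder path; must exist and be writable
CacheDirLevels 2
CacheDirLength 1
```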
Thomas_Hurek
The workshop on WebSphere Portal 8 Administration was just published and is now available:
It offers a lot of hands-on exercises to get to know WebSphere Portal 8 Administration better.
AlexLang
These statics are not security aware. In other words, a lot of these statics are delivered on the same URL regardless of your security rights. The statics are, generally, not considered secure content.
Given that CPU cycles are considered "expensive" on the Portal/WAS servers and cheaper on the IHS/Apache servers, it is very desirable to reverse proxy these statics on the IHS servers using the mod_cache facilities.
Prior to IHS version 7, the only choice available was the "mod_mem_cache" module. Mod_mem_cache provides an RFC 2616-compliant reverse proxy cache. IHS version 7 added support for the "mod_disk_cache" option in addition to mem_cache.
Choosing the best type of cache is neither intuitive nor obvious. The Apache Caching guide provides some guidance.
In short, the correct answer is to use mod_disk_cache. Let's look at some attributes of each type of cache.
Attributes of mod_mem_cache:
1. Cache is "per process": Apache spawns processes to handle inbound HTTP(S) requests, and an instance of the mem_cache is created for each process. Cache entries are therefore duplicated across processes (i.e. wasted memory).
2. Cache size limitations: Because of "1", the cache instances must necessarily be limited in size so as not to exhaust main memory. There are several mod_mem_cache directives to help limit the size of responses that can be stored in the cache.
3. Occasionally inefficient replacement algorithm: Because of "1" and "2" together, responses that are near the size limit allowed in the cache may force removal of responses better left in the cache.
4. Limited exposure to stale pages: Because the cache is limited in size and gets regenerated with each new process instantiation, there is very little chance of stale responses remaining in the cache.
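Item 2 above mentions the mod_mem_cache sizing directives; an illustrative set (values are examples only, not recommendations):

```apache
LoadModule mem_cache_module modules/mod_mem_cache.so
CacheEnable mem /wps
MCacheSize 65536            # total cache size per process, in KBytes
MCacheMaxObjectSize 131072  # largest cacheable object, in bytes
MCacheMaxObjectCount 1000   # maximum number of objects in the cache
```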
Attributes of mod_disk_cache:
1. Cached responses are shared among all processes: There is one instance of the cache system-wide, so there is less wasted space in memory.
2. Disk_cache takes advantage of Unix/Linux file buffering: See the commentary below for discussion of this item.
3. Need to use a clean-up utility - htcacheclean: mod_disk_cache does not automatically clean stale items from the cache. This can result in wasted disk space. More troubling is that some responses from the response owner (i.e. Portal/WAS) may not have proper cache-control headers indicating how long responses are allowed to live in proxy caches, so the cache can potentially return the wrong, stale version of a response. The htcacheclean utility therefore needs to be run periodically (via "cron", for example) to ensure stale responses are removed from the cache.
4. Need to allocate disk space: Since the responses are stored on disk, there is always the potential to exhaust disk space. As on all production Unix machines, monitoring policies need to be in place to ensure you don't let this happen.
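To keep the cache pruned, htcacheclean can be driven from cron; an illustrative crontab entry (the install path and size limit are assumptions for your environment):

```
# Prune the disk cache hourly, limiting it to roughly 1GB
0 * * * * /opt/IBM/HTTPServer/bin/htcacheclean -p /var/cache/ihs -l 1024M
```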
When first considering which type of caching to use, most would immediately suggest mem_cache as the better option. From a performance perspective, serving from memory is obviously better than serving from a disk. In reality though, if you understand how Unix/Linux buffers file I/O, the benefits of disk_cache become apparent. Unix allocates unused portions of memory to buffer files as they are read. So, the initial read request starts reading the file into memory, and subsequent read requests for the same file are served from memory without even touching the disk. So, with the exception of the initial load, disk_cache performs as well as mem_cache, and there is only one instance of the response in memory as opposed to the per-process duplication of mem_cache. Because memory utilization is more efficient, cache hit ratios can be much higher with disk_cache.
There is some interesting commentary for optimizing disk_cache on Linux in the article "Some Tuning Tips For Apache Mod_Cache (mod_disk_cache)"
I'm afraid the following decision tree slightly over-simplifies the process of determining which application integration technique should be chosen for WebSphere Portal. It is not meant to be the "gospel" by which all integration techniques are decided. Instead, it should serve as a guide to the types of information that need to be factored into the decision-making process. (This blog post is a continuation of the previous post, where the integration techniques are described in detail.)
The questions below are posited in order. If I believe the answer to a question leads to a certain integration technique, I will say so. Otherwise, I'll direct you to the next question.
After reading through the questions, it may seem odd at first that I lead with NOT starting with portlets, but I found it easier to eliminate the lower-fidelity, less elegant solutions first than to try to arrive at the decision to run everything as a portlet first. There are just too many advantages to using portlets to enumerate as a series of discrete questions, whereas there are very few questions that can help you eliminate portlets outright. Hopefully this will make sense to you as you work your way through the list below. Let's get started:
mlamb
I'm asked quite often what the best approach is for integrating applications into an existing portal. Should they be rewritten as portlets and run in the Portal itself? What about application isolation, because I don't want a bad portlet taking down my portal? What about the latency effects of running some applications remotely? The answer, in typical developer fashion, is "it depends."
First, I think it makes sense to briefly outline your options. You can find more detail on all of these through our product's Info Center as well as from the wide variety of white papers and wiki posts available from the Portal Zone (http://www-106.ibm.com/developerworks/websphere/zones/portal/).
In my next blog post, I'll walk through the decision making process to help you decide which one of the above options would work best for you.
Whenever you deploy a new WebSphere Portal instance, you must tune the Portal appropriately for the environment. For example, out of the box, Portal is NOT tuned for production.
The IBM Portal Performance team has produced several documents that contain the tuning that should be done after installation and before deploying a Portal to production. This is the document for V7 and this is the one for V8. This document should be considered a prerequisite for the beginning of performance testing and certainly a prerequisite for production.
There are several components that should be tuned. These include the Portal itself, the LDAP, the database, the OS and the web server.
Mike White and I have written a ConfigEngine script that will apply the tunings for the Portal automatically. The script is included in Portal V220.127.116.11 CF22 and Portal V8 CF05. Use of this script will improve the accuracy of applying the tuning changes as well as reduce the time needed to do the task.
The changes are driven by a properties file (tuning.properties) along with several resource environment provider files. Since the tuning.properties file assumes you are a WCM rendering server that is only a subscriber (and NOT a syndicator), you may need to adjust some of the settings to match your environment. You can easily do this by copying the read-only tuning.properties to a local directory, updating the name/value pairs that need updating, and pointing to the new copy when you run the task.
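The invocation might look like the following sketch; the task name and the property pointing at the copied file are placeholders - use the names from the README:

```
# Copy the read-only properties file somewhere writable and edit it first
cp PortalServer/installer/wp.config/config/tuning.properties /tmp/my-tuning.properties
# Then run the ConfigEngine task against the copy
# ("apply-tuning" and "-DtuningPropertiesFile" are placeholder names)
./ConfigEngine.sh apply-tuning -DtuningPropertiesFile=/tmp/my-tuning.properties
```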
Note for zOS users: This task does run on zOS. Note, however, that significant fixes were put into the code to resolve zOS issues in March 2014. These fixes will be included in 8.0.0.1 CF10 as well as the initial release of V8.5. If you are on zOS and not running at least the 8.0.0.1 CF10 level, please download the linked code immediately following.
The latest README file is linked here. It provides the details on using this new task. I have also included the (current) latest version of the code here. The latest version can be untarred in the PortalServer/installer/wp.config/config subdirectory as-is. Note that by default the subdirectory just mentioned does NOT have write permission; the permission must be changed before trying to untar this file.
Note: It has been discovered that if you are on Portal 8.0.0.x CF06 (or earlier) and also have installed the tuning task from the tar file (linked just above) that the installation of a Portal CF may fail. The failure reason is that the files installed by the above linked tar file are not supposed to exist according to the CF installer. To resolve this issue (and thus to be able to proceed with CF installation) just remove the files that were installed for the tuning task. This list can be obtained by just listing the files in the tar file. After removal, just install the CF as per instructions.
WCM customers frequently want to preview draft WCM content in an actual WebSphere Portal context before approving that content.
Out of the box, WCM only supports a "preview" button in the WCM Authoring GUI that will pop up a window in the browser and display draft content via the presentation template assigned to that content. However, the methods that a user might use to get to this new component cannot be previewed. For example, the preview button does not allow viewing new draft content in an existing WCM Menu component.
This topology and sample code provide one solution to this problem.
To implement a preview server topology, we need at least 3 servers. The high level picture looks like this:
From the picture, you can see that the WCM Authoring Server is a syndication source. It will syndicate content to two systems. One is the Production Rendering Server. The other system is the WCM Preview Server.
The WCM Preview Server is a subscriber only; it will not syndicate content but only receive it. Further, in Portal 7, all page definitions that are required to render content via WCM rendering portlets must be defined. In Portal 8, Portal pages themselves can be syndicated along with the WCM content.
Once content is syndicated to the preview server, it will automatically be advanced from "draft" status to "publish" status. By doing this, all content will immediately render (on the preview server) via the Portal pages' WCM rendering portlets. So, for example, content which is "draft" on the authoring server will appear on the preview server in WCM menus exactly as it would ultimately appear on the production rendering server.
An approver could look at the draft content in a traditional Portal context on the preview server. If the approver wishes to advance the draft content in its workflow after previewing it, he would go to the authoring server and move the draft item to the next stage of the workflow. Ultimately, the content would move to the publish state on the authoring server and be syndicated to the production rendering server. Note that it would also re-syndicate to the preview server with the new state.
In order for this scheme to work, all content that syndicates to the preview server will need to move from the “draft” state to the WCM “publish” state automatically. By moving to the “publish” state, the draft content will be viewable in the WCM rendering portlets contained on the actual portal pages.
One approach to automatically convert this content to the "publish" state is to write Java code which listens for the updating of items on the preview server. The mechanics of the process rely on the fact that, if configured properly, WCM can generate a JMS message any time an item is created, updated, or removed. When an "itemUpdated" event occurs on the preview server, WCM will fire a JMS message. This JMS message will contain the identification of the item that was updated.
The act of "subscription" causes an item to be updated on the preview server. If the item that is syndicated is in "draft" state, JMS handlers deployed on the subscriber will consume the JMS "itemUpdated" message. The new JMS handler (a Javabean) is notified as each item is processed on the subscriber. This Javabean will iteratively move the syndicated content through its workflow stages until it reaches the "publish" state. This conversion to the "publish" state allows the content item to be shown in Portal pages and WCM rendering portlets.
Sample code is included which consumes the JMS itemUpdated message and iteratively moves the item through its workflow stages until it reaches the "publish" state.
Simply install the provided EAR file as a WebSphere Application.
WCM Setup Required
In order to use the code, WCM must be configured to generate JMS events on an item update event. To do this, refer to this WCM configuration URL.
When you install WebSphere Portal / Web Content Manager version 7 or 8, versioning is enabled by default. That means whenever you update a WCM artifact, an additional version is created. After some time you have a lot of versions, and this can slow down your system.
If you want to find out how many version nodes you have you can run the following SQL query and check for the number of entries in the jcr:versioning workspace (replace JCR with the according schema):
SELECT JCR.ICMSTJCRWS.WSID AS WORKSPACE_ID, WSNAME AS WORKSPACE_NAME, COUNT(*) AS NODE_COUNT
FROM JCR.ICMSTJCRWSNODES, JCR.ICMSTJCRWS
WHERE JCR.ICMSTJCRWSNODES.WSID > 0 AND JCR.ICMSTJCRWS.WSID = JCR.ICMSTJCRWSNODES.WSID
GROUP BY JCR.ICMSTJCRWS.WSID, WSNAME
My recommendation is to only enable versioning for the design items and keep versioning set to manual for content items and site areas.
This can be accomplished with the following:
In WAS AdminConsole:
Go to WAS Admin Console -> Resources -> Resource Environment Providers -> WCM WCMConfigService
Set the following:
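The property names below are from my reading of the WCMConfigService versioning settings - verify the exact keys and values against your product version's documentation before applying them:

```
versioningStrategy.Default=Always
versioningStrategy.Content=Manual
versioningStrategy.SiteArea=Manual
```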
The settings require a restart of the JVMs.
Now you still have to remove the versions you do not need any more.
If you are wondering what happened to the VersionCropper in WebSphere Portal version 7 and 8 - it is now called ClearVersions.
There are two options with the ClearVersions module - a report mode that reports the number of versions it would clean, and a fix mode that actually cleans the versions.
In the sample below we keep the current version and one extra version behind.
Note that one system version is always kept behind by the system even if you delete all versions.
Report mode (sample URLs need to be adjusted for your environment - note you should go to only one of the JVMs directly):
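An illustrative report-mode URL - the ?MOD= pattern is the usual WCM tool-module syntax, but the fix/keep parameter names are assumptions to verify against the product documentation:

```
http://your-portal-host:10039/wps/wcm/connect?MOD=ClearVersions&fix=false&keep=1
```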
You should make sure to have a DB backup before triggering the fix mode.
Fix mode (keeping the dates of the live version intact):
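An illustrative fix-mode URL - again, the ?MOD= pattern is the usual WCM tool-module syntax, but the parameter names are assumptions to verify against the product documentation:

```
http://your-portal-host:10039/wps/wcm/connect?MOD=ClearVersions&fix=true&keep=1
```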
Details are documented here:
WebSphere Portal leverages the VMM (Virtual Member Manager) component to access the configured user registries. VMM is part of the WebSphere Application Server. For authentication, WebSphere Application Server allows configuration of either the standalone or federated option - the federated option also uses VMM for that purpose. Our recommendation is to use the federated option since, with it, one configuration is sufficient and caches can be used both for authentication by WebSphere Application Server and for user and group lookups by Portal.
Every main profile repository that is used with Virtual Member Manager needs to have an attribute whose value is unique, static, and never reused for any member entry. In VMM, this attribute is called extId. By default when configuring an LDAP, VMM will leverage the default unique ID from the LDAP (i.e. objectGUID in Microsoft Active Directory) as the extId. WebSphere Portal and Web Content Manager use the extId to map permissions (roles), page and portlet customizations, private pages, tags and ratings, web content manager content ownership, and other artifacts to the extId of groups or users. The value is different between different LDAPs - so even if the "same" group or user exists in a staging and production LDAP, they can have different extIds. Syndication is a process that transfers WCM artifacts between systems. If those systems have different LDAPs configured, the mapping of the artifact will no longer function after the item has been syndicated. To fix this it is possible to run the member fixer task. With Web Content Manager 8 it is possible to configure member fixer to run as part of syndication.
See the following link for details: InfoCenter Configuration of Member Fixer as part of Syndication
If the extId chosen by default is not unique or if there are issues with member fixer, an option is to change the mapping of the extId to another value that is identical between multiple LDAPs but still unique and static. An ideal value can be the distinguished name value in the LDAP since it is identical between different LDAPs for the same group or user. The following steps describe how to change the configured extId. Note, ideally the steps should be performed directly after configuring the LDAP in the environment. If the change is made at a later point in time, when artifacts in Portal and WCM are already mapped to the extId values, it is non-trivial to change the mappings. Also consider that in your environment the distinguished name might be changed in the LDAP by some process and in that case the mapping in Portal / WCM would be lost too.
In a cluster the VMM configuration files exist in multiple locations - in the deployment manager profile and in the profile of each JVM belonging to the cell / cluster. For a standalone system, only the VMM config files in the single profile need to be adjusted.
It is important to take a backup of all involved servers' file systems and Portal databases before proceeding. In case syndication is used after configuring the steps, you might want to consider the same backup for the subscriber.
1. Stop all WebSphere Portal and Deployment Manager and nodeagent JVMs involved.
2. In all involved profiles, edit the file <profile>/config/cells/<cellname>/wim/config/wimconfig.xml making the following adjustments:
Find the tag <config:attributeConfiguration> for the LDAP in question and in there add the following tag before the closing </config:attributeConfiguration> tag:
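To the best of my recollection, the element that maps the extId to the distinguished name looks like the following - verify the exact element and attribute names against the VMM documentation for your release:

```xml
<!-- maps the VMM external ID to the LDAP distinguished name -->
<config:externalIdAttributes name="distinguishedName"/>
```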
Save the file.
3. Restart the involved JVMs - in a cell/cluster starting with the Deployment Manager, nodeagents and then WebSphere Portal JVMs.
The default unique identifiers for the supported LDAP types are:
IBM Tivoli Directory Server: ibm-entryUUID
Microsoft Active Directory: objectGUID
Novell eDirectory: GUID
IBM Domino Server: dominoUNID
SunOne Directory Server: nsuniqueId
Web Content Manager performance can be improved by enabling caching. In fact, we recommend enabling WCM caching for production sites. There are multiple caching options with WCM.
In certain scenarios the default cache key options will not be sufficient. This blog entry describes how to customize the WCM Advanced Cache Keys.
The solution uses the WCMGroupProcessor extension point to add custom keywords. These keywords are added to the Advanced cache by WCM if the Personalized Advanced cache option is configured. Note that for the solution to work, you need to configure sessions to always be created for the WCM portlets, which can have a performance impact.
To utilize the solution take the following steps:
1. Download the Eclipse project containing the source code. You can find the project here: https://www.ibm.com/developerworks/mydeveloperworks/blogs/portalops/resource/cache_preprocessor_eclipse_project.zip
2. Customize the source code - the source contains only one class, called WCMGroupProcessor.java. Note that the package, the class name, and the public methods of the class must remain exactly as they are.
To add custom attributes / values look for the comment "add custom attributes here".
3. Build the source code and place the created jar file in a shared library - e.g. WebSphere\PortalServer\wcm\prereq.wcm\wcm\shared\app\
4. Configure the following in the WCM WCMConfigService in WAS:
5. Ensure that you have either PM72350 as separate fix or a Cumulative Fix installed that contains PM72350.
To verify the functionality, I recommend using the WebSphere Extended Cache Monitor and looking at the contents of the Processing cache after hitting a WCM viewer portlet.
Extended Cache Monitor: http://www.ibm.com/developerworks/websphere/downloads/cache_monitor.html
When there is a suspected memory leak in a Portal application, or in the Portal itself, I typically follow this process to collect the heapdumps necessary to properly debug the problem:
Now that you have three heapdumps spanning a relatively long period of time, you need to analyze the heapdumps to look for possible leak suspects. I use HeapAnalyzer from alphaWorks (http://www.alphaworks.ibm.com/tech/heapanalyzer). Be warned, though, that for heapdumps taken from heaps of size 1.5GB or more, you will need a LOT of memory on the system where you run HeapAnalyzer to analyze the heapdump. I would recommend running it on a 64-bit system where you can configure the tool itself with a massive heapsize (7GB or more). It will take a long time to analyze it too.
Once analyzed, the tool can be used to point out suspected memory leaks. By having the system quiesce (no active requests or sessions), there should be a large disparity between the leak suspects and other allocated "noise" in the heap, especially as you analyze the two older heapdumps.
In terms of detecting whether you actually have a memory leak situation versus simply running out of memory because of running too many requests through a single portal, look at the verboseGC output. The JVM heap will fill up over time, and sometimes quickly depending on traffic patterns, but once it reaches 90% capacity or so, the JVM should perform a full GC with compaction, to defragment the heap and claim as much memory as possible. I call the point it returns to the "low water mark". If over time, during a load test with a constant number of users, you see this low water mark creep upwards, then you may have a memory leak. Ideally, it should return to about the same point each time.
I have used AlphaWorks' PMAT tool (http://www.alphaworks.ibm.com/tech/pmat) to graphically detail the GC cycles. It is very simple to visually see the pattern and determine if you see the low water mark creeping upwards over time.