Using Netcool/WebGUI to scale your OMNIbus architecture
ZaneBray
Many organisations run globally distributed network operations centres (NOCs) and have multiple Netcool/OMNIbus installations - sometimes located in different countries around the world. These organisations are usually required to provide 24-hour support and therefore implement a "follow-the-sun" environment, where operations centres around the globe take successive shifts to provide round-the-clock management of events originating from around the world. Operators who monitor events in multiple regions often prefer a single event list containing filtered views of events from all regions.
On a technical level, the challenge has been how to combine the events from multiple disparate Netcool/OMNIbus installations into a single event view.
With Netcool/Webtop, a single Active Event List (AEL) could only include events from a single datasource. The only way to provide a combined view of events from different OMNIbus systems with Webtop was to merge the Aggregation layer ObjectServers into a single pair located in one of the regions. This solution carried the overhead of propagating all events from the remote regional Collection layer ObjectServers to the central Aggregation ObjectServer pair - and then pushing the event data back out to the remote regional Display ObjectServers for local servicing of AEL clients. Another problem with this approach was that if a network outage disconnected the remote NOC from the Aggregation ObjectServer pair, the AELs would not function until the connection was restored - or some manual contingency plan was executed. Once the connection was restored, the two systems' sets of events, both of which may have changed while the systems were disconnected, would then have to be resynchronised.
With Netcool/WebGUI, however, multiple datasources can be included within a single map object or AEL. This means that the underlying, separate Netcool/OMNIbus systems do not need to be combined or modified in any way in order to show their events in the same AEL. The WebGUI server simply pulls the data it requires from each system and aggregates the data sets into a single AEL view within the WebGUI server. During a network outage, the events from an unreachable datasource drop out of the AEL view and a message alerts the operator that the datasource has gone offline. Meanwhile, the operator is still able to work with the events from the datasources that remain available. When the connection to the remote datasource is restored, the AEL automatically redisplays the updated events from that datasource and continues as normal. The loss and restoration of datasources are handled seamlessly and automatically.
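The behaviour described above can be pictured with a small conceptual sketch. It is not WebGUI code and does not use any WebGUI or ObjectServer API: the function `build_combined_view`, the exception `DatasourceUnavailable`, the fetcher functions and the `Datasource` tag are all hypothetical names chosen for illustration, and the sample rows are made up. It only shows the idea of pulling rows from each datasource, skipping the ones that cannot be reached, and reporting them to the operator.

```python
from typing import Callable, Dict, List, Tuple

Event = Dict[str, object]


class DatasourceUnavailable(Exception):
    """Hypothetical error raised by a fetcher when its datasource is unreachable."""


def build_combined_view(
    fetchers: Dict[str, Callable[[], List[Event]]],
) -> Tuple[List[Event], List[str]]:
    """Merge events from every reachable datasource into one list.

    Returns the combined events (highest severity first) and the names
    of the datasources that could not be reached on this refresh.
    """
    combined: List[Event] = []
    offline: List[str] = []
    for name, fetch in fetchers.items():
        try:
            rows = fetch()
        except DatasourceUnavailable:
            # The unreachable datasource simply drops out of the view.
            offline.append(name)
            continue
        for row in rows:
            # Tag each event with its origin so the operator can tell them apart.
            combined.append({**row, "Datasource": name})
    combined.sort(key=lambda e: e["Severity"], reverse=True)
    return combined, offline


# Made-up sample data standing in for three regional OMNIbus systems.
def fetch_emea() -> List[Event]:
    return [
        {"Node": "lon-rtr-01", "Severity": 5, "Summary": "Link down"},
        {"Node": "par-sw-07", "Severity": 3, "Summary": "High CPU"},
    ]


def fetch_amer() -> List[Event]:
    return [{"Node": "nyc-fw-02", "Severity": 4, "Summary": "Fan failure"}]


def fetch_apac() -> List[Event]:
    raise DatasourceUnavailable("APAC is unreachable")


view, offline = build_combined_view(
    {"EMEA": fetch_emea, "AMER": fetch_amer, "APAC": fetch_apac}
)
for event in view:
    print(event["Datasource"], event["Severity"], event["Node"], event["Summary"])
for name in offline:
    print(f"Warning: datasource {name} is offline")
```

Because the view is rebuilt from whatever is reachable on each refresh, a restored datasource reappears on the next call without any resynchronisation step between the underlying systems.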
This new functionality provided by Netcool/WebGUI allows for a simpler, cleaner geographically distributed Netcool/OMNIbus system - and minimises the impact of any sort of network outage between sites. Additionally, it allows a large Netcool solution to be radically more scalable. No longer constrained by the maximum number of events an Aggregation pair can hold, an overall system can now be made up of potentially numerous, separate Netcool/OMNIbus instances - with WebGUI pulling the data it requires from as many datasources as is necessary.
In the above example, the OMNIbus systems are logically distinct from each other by geographical separation. Multiple OMNIbus systems could, however, also be deployed within the same geographical region - where, for example, there are more events than one system can manage on its own. In such a scenario, events could be categorised in some other way - for example, by event type, business sector or ITNM domain - and each subset of events allocated to its own dedicated OMNIbus system. WebGUI will still be able to combine the events from these multiple, separate OMNIbus systems within the same AEL, and the aggregation will be transparent to the end users. Cross-system correlation, if it is needed, can be carried out by Netcool/Impact. This architecture model allows a Netcool solution architect to dramatically increase an organisation's event handling capacity.
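As a rough illustration of allocating event subsets to dedicated systems, the sketch below splits a batch of events by a category field. Everything in it is an assumption made for the example: the `Category` field, the system names in `ROUTING`, and the `partition_events` helper are hypothetical, and in a real deployment the allocation would more likely be decided by where each probe or gateway sends its events than by a script like this.

```python
from collections import defaultdict
from typing import Dict, List

Event = Dict[str, object]

# Hypothetical mapping from event category to a dedicated OMNIbus system.
ROUTING: Dict[str, str] = {
    "network": "OMNIBUS_NET",
    "application": "OMNIBUS_APP",
}
# Hypothetical catch-all system for categories not listed above.
DEFAULT_SYSTEM = "OMNIBUS_MISC"


def partition_events(events: List[Event], key: str = "Category") -> Dict[str, List[Event]]:
    """Group events by the system that should own them, based on one field."""
    buckets: Dict[str, List[Event]] = defaultdict(list)
    for event in events:
        system = ROUTING.get(str(event.get(key)), DEFAULT_SYSTEM)
        buckets[system].append(event)
    return dict(buckets)


# Made-up sample events.
sample = [
    {"Node": "core-rtr-01", "Category": "network", "Severity": 5},
    {"Node": "crm-app-03", "Category": "application", "Severity": 4},
    {"Node": "edge-sw-12", "Category": "network", "Severity": 2},
    {"Node": "ups-02", "Category": "facilities", "Severity": 3},
]

allocation = partition_events(sample)
for system, owned in allocation.items():
    print(system, [e["Node"] for e in owned])
```

Each system then holds only its own subset, and the combined AEL is the only place where the subsets appear side by side.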
This new architecture design concept further extends the boundaries of OMNIbus' already extensive scalability - giving organisations enormous headroom for the growth and expansion of their existing Netcool/OMNIbus deployments. If one of your existing Netcool/OMNIbus systems starts reaching its maximum capacity, simply add another one - and include it in your WebGUI AEL view.
A best practice publication will follow in due course, including details of the architecture design and some performance metrics gathered from a real system deployed across three different continents. Keep an eye on the Best Practice pages for upcoming publications.