Monitor and adjust resource allocation in a PureData for Analytics system using IBM Netezza Performance Portal 2.1
The IBM Netezza Performance Portal is a web-based tool for managing and monitoring the PureData for Analytics data warehouse appliance (also known as the Netezza appliance). The Netezza Performance Portal:
- Is for system and database administrators who want to manage and monitor one or more appliances from within a single user interface.
- Offers a wide range of database administration capabilities (creating, viewing, modifying, and deleting database objects in the system, or managing database users and groups).
- Includes real-time monitoring of the system's hardware configuration and health, running and finished SQL queries, and active sessions.
- Has a simple but powerful mechanism for collecting system performance data and correlating it with the SQL queries that ran on the system during a given time frame in the past.
Netezza Performance Portal is available for PureData for Analytics 6.x and 7.x users at no additional cost and can be installed either on the appliance or on a separate RHEL 5.x or 6.x system.
In this article, you learn to monitor and adjust resource utilization of the data warehouse appliance using Netezza Performance Portal 2.1 features and PureData for Analytics 7.1.
New features of Netezza Performance Portal 2.1
Netezza Performance Portal 2.1 introduces multiple features to improve the monitoring and management of the data warehouse appliance. Several of the new features are related to new features in PureData for Analytics 7.1, such as support for client information fields and the scheduler rules.
Other features offer additional insight into activities performed on the warehouse, such as displaying explain information for SQL queries run in the system, showing the history of backup and restore operations along with their log files, and showing resource allocation history for particular resource sharing groups (discussed in Resource allocation).
Managing workload with scheduler rules
The scheduler rules feature, introduced in PureData System for Analytics 7.1, lets you exert more direct control over scheduling and executing queries. A scheduler rule consists of a set of conditions that identify the workload and the action to be taken for the plans belonging to this workload. Conditions allow you to filter:
- Only the plans coming from a specific user or resource sharing group.
- Only the plans with a specific priority.
- Only the plans with a specific duration estimate or job type (loads, for example).
- Only the plans using a specific database or table.
- Plans coming from a client identified by specific client information fields (user ID, application name, workstation name, and accounting string).
You can provide any number of conditions, or no conditions at all. Multiple conditions are evaluated in conjunction, so a rule applies to a plan only when every condition holds. With no conditions, the rule is applied to all jobs running on the appliance. Actions that can be triggered by the scheduler rule include changing the job's priority or estimate, adding a tag, executing as a different resource group (which can impact scheduling), treating the job as a short query, or even aborting the job altogether. Scheduler rules are a simple but powerful tool to control the workload execution on a PureData System for Analytics (PDA) appliance.
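The way conditions and actions combine can be illustrated with a small model. The following Python sketch is not the appliance's implementation or API; the `Plan` and `SchedulerRule` classes, the field names, and the helper functions are hypothetical and exist only to show that conditions are combined with a logical AND and that a rule without conditions matches every plan.

```python
from dataclasses import dataclass, field, replace
from typing import Callable, List, Optional


@dataclass(frozen=True)
class Plan:
    """Hypothetical, simplified description of a plan submitted to the system."""
    user: str
    group: str          # resource sharing group
    priority: str
    database: str
    tags: frozenset = frozenset()


Condition = Callable[[Plan], bool]
Action = Callable[[Plan], Optional[Plan]]   # returning None models an abort


@dataclass
class SchedulerRule:
    name: str
    action: Action
    conditions: List[Condition] = field(default_factory=list)

    def matches(self, plan: Plan) -> bool:
        # Conditions are ANDed. all() of an empty list is True, so a rule
        # with no conditions matches every plan.
        return all(cond(plan) for cond in self.conditions)

    def apply(self, plan: Plan) -> Optional[Plan]:
        return self.action(plan) if self.matches(plan) else plan


# A few of the action types named in the article, modeled as functions.
def abort(plan: Plan) -> Optional[Plan]:
    return None


def set_priority(level: str) -> Action:
    return lambda plan: replace(plan, priority=level)


def add_tag(tag: str) -> Action:
    return lambda plan: replace(plan, tags=plan.tags | {tag})


def execute_as(group: str) -> Action:
    return lambda plan: replace(plan, group=group)


# Illustrative rule: lower the priority of BATCH_GRP plans on one database.
rule = SchedulerRule(
    "demo_lower_batch",
    set_priority("LOW"),
    [lambda p: p.group == "BATCH_GRP", lambda p: p.database == "SALES"],
)
plan = Plan(user="etl_user", group="BATCH_GRP", priority="NORMAL", database="SALES")
print(rule.apply(plan).priority)
```

In this model, a plan that fails any single condition passes through unchanged, which mirrors the "evaluated in conjunction" behavior described above.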
A detailed discussion of the PureData System for Analytics approach to workload management (WLM) is outside the scope of this article. For more information, see Workload management on the Netezza appliance in the IBM Netezza System Administration section of the IBM PureData System for Analytics information center.
Monitoring and adjusting resource allocation in Netezza Performance Portal
This section discusses the features of the resource allocation graph and how to create a custom graph.
Resource allocation graph
Prior to Version 2.1, Netezza Performance Portal offered capabilities to manage resource sharing groups (defining the groups, setting thresholds, and changing the group assignment for database users). Netezza Performance Portal 2.1 provides the ability to monitor the resource allocation for particular resource sharing groups. Access this feature by clicking the Groups -> Resource Allocation Performance tab, as shown in Figure 1.
Figure 1. Resource Allocation Performance tab
In the Resource Allocation Performance view, you see the default chart, which displays the actual_rsg_pct value for all active resource sharing groups. The actual_rsg_pct is the percentage of system resources assigned to the given resource sharing group. The data collection interval for the chart is 10 minutes and it always shows the last 1000 samples. Pie charts show the current breakdown of resources among resource sharing groups and the percentage of jobs run from particular resource sharing groups.
The example in Figure 1 shows that in the recent past, there was activity carried out by members of the LIGHT_GRP_1, PUBLIC, PWR_GRP, SPECIAL_GRP, and BATCH_GRP groups. Currently, only LIGHT_GRP_1 is active, using 23% of the system resources, while the rest of the resources are not allocated.
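Two numbers follow directly from this description: the time span covered by the default chart and the share of resources left unallocated. The short Python sketch below only does this arithmetic with the values stated in the article; the function names are illustrative.

```python
from datetime import timedelta

SAMPLE_INTERVAL = timedelta(minutes=10)   # collection interval of the default chart
MAX_SAMPLES = 1000                        # the chart shows the last 1000 samples


def chart_window(interval: timedelta = SAMPLE_INTERVAL,
                 samples: int = MAX_SAMPLES) -> timedelta:
    """Time span covered by a chart with the given interval and sample count."""
    return interval * samples


def unallocated_pct(actual_rsg_pct: dict) -> float:
    """Share of system resources not assigned to any resource sharing group."""
    return 100 - sum(actual_rsg_pct.values())


print(chart_window())                         # 6 days, 22:40:00
print(unallocated_pct({"LIGHT_GRP_1": 23}))   # 77
```

In other words, the default chart covers a little under one week of history, and in the Figure 1 example 77% of the system resources are currently unallocated.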
Click Details in the Resource Allocation pane to see a tabular view of the resource allocation metrics, as shown in Figure 2. The metrics show the actual, maximum, and target percentages for each resource sharing group. The numbers of plans that are waiting or already running on the system are also shown.
Figure 2. Workload Management details view
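One way to read the tabular view is to compare each group's actual percentage with its target and maximum. The following Python sketch assumes a simplified row layout; the `GroupMetrics` field names, the classification labels, and the sample values are hypothetical and are not taken from the portal.

```python
from dataclasses import dataclass


@dataclass
class GroupMetrics:
    """Hypothetical row of the details view for one resource sharing group."""
    group: str
    target_pct: float
    max_pct: float
    actual_pct: float
    plans_waiting: int
    plans_running: int


def classify(row: GroupMetrics) -> str:
    """Rough reading of one row; the labels are illustrative only."""
    if row.plans_waiting == 0 and row.plans_running == 0:
        return "idle"
    if row.actual_pct >= row.max_pct:
        return "at maximum"
    if row.actual_pct < row.target_pct:
        return "below target"
    return "at or above target"


# Made-up sample values, used only to exercise the function.
rows = [
    GroupMetrics("PWR_GRP", 40, 60, 60, plans_waiting=3, plans_running=5),
    GroupMetrics("LIGHT_GRP_1", 20, 100, 23, plans_waiting=0, plans_running=2),
    GroupMetrics("BATCH_GRP", 30, 50, 0, plans_waiting=0, plans_running=0),
]
for row in rows:
    print(row.group, classify(row))
```

A group that sits at its maximum while plans are waiting is a natural candidate for a threshold change or for a scheduler rule.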
Creating custom resource allocation graphs
In the upper part of the Resource Allocation Performance tab, you can create custom resource allocation charts. For example, let's visualize the number of long-running and short-running plans for resource sharing groups PWR_GRP and SPECIAL_GRP. To do so, select the groups in the multiple selection widget, as in Figure 3. You can plot data for no more than four groups. After you make the group selection, click Next.
Figure 3. Selecting groups for custom resource allocation graph
In the second step, select the attribute or attributes (no more than two) in the same manner, as in Figure 4. Those attributes are plotted in the graph for every resource sharing group selected in the first step.
Figure 4. Selecting attributes for custom resource allocation graph
In the third and final step, you can specify whether data for the chart should be shown unaggregated, aggregated per hour, or aggregated per day, as in Figure 5. The aggregation level determines the range of data shown in the chart because the chart only shows up to 1000 points. In the example, where only a few days' worth of data is available, the hourly aggregation is chosen. Click Finish to display the custom chart in a new tab in the bottom pane.
Figure 5. Selecting aggregation for custom resource allocation graph
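The three wizard steps impose simple limits that can be summarized in code. The Python sketch below is a hypothetical model, not the portal's API: the `ChartRequest` class and the attribute names `long_plans` and `short_plans` are made up for illustration, and it assumes that unaggregated custom charts use the same 10-minute sampling interval as the default chart.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import Tuple

MAX_GROUPS = 4        # no more than four groups per chart
MAX_ATTRIBUTES = 2    # no more than two attributes per chart
MAX_POINTS = 1000     # the chart shows up to 1000 points

# Width of one chart point per aggregation level.
# Assumption: unaggregated data is sampled every 10 minutes.
BUCKETS = {
    "none": timedelta(minutes=10),
    "hour": timedelta(hours=1),
    "day": timedelta(days=1),
}


@dataclass(frozen=True)
class ChartRequest:
    groups: Tuple[str, ...]
    attributes: Tuple[str, ...]
    aggregation: str = "none"

    def __post_init__(self):
        if not 1 <= len(self.groups) <= MAX_GROUPS:
            raise ValueError(f"select between 1 and {MAX_GROUPS} groups")
        if not 1 <= len(self.attributes) <= MAX_ATTRIBUTES:
            raise ValueError(f"select 1 or {MAX_ATTRIBUTES} attributes")
        if self.aggregation not in BUCKETS:
            raise ValueError(f"aggregation must be one of {sorted(BUCKETS)}")

    def series_count(self) -> int:
        # Every selected attribute is plotted for every selected group.
        return len(self.groups) * len(self.attributes)

    def covered_span(self) -> timedelta:
        # Longest time range that fits into the point limit.
        return BUCKETS[self.aggregation] * MAX_POINTS


request = ChartRequest(
    groups=("PWR_GRP", "SPECIAL_GRP"),
    attributes=("long_plans", "short_plans"),   # hypothetical attribute names
    aggregation="hour",
)
print(request.series_count())   # 4
print(request.covered_span())   # 41 days, 16:00:00
```

Under these assumptions, hourly aggregation fits about six weeks of history into one chart, and daily aggregation fits 1000 days.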
The resulting graph looks like Figure 6. You can see that the PWR_GRP was running only long queries and the SPECIAL_GRP was hardly running any queries in the recent time frame. Again, it is possible to access the data in tabular format by clicking Details.
Figure 6. Custom resource allocation graph
The custom graphs are not persistent between Netezza Performance Portal sessions; once you log out of the Portal, they disappear. However, they do remain in place when you switch between various Netezza Performance Portal views during one session.
Handling scheduler rules in Netezza Performance Portal
Netezza Performance Portal 2.1 introduces a view for managing scheduler rules. Access the Scheduler Rules view from the tree menu under the System node for 7.1 appliances only (it is not displayed for earlier appliances that don't yet support the scheduler rules). The view lists all scheduler rules currently in the system, along with their owner and status information. Status indicates whether the rule is active and whether it applies to jobs of administrative users as well or only to jobs of non-administrative users. Right-click the list to open the context menu, which shows what actions are allowed on the rules, as in Figure 7.
Figure 7. Scheduler Rules list
Even if no scheduler rules were created explicitly on the system, there are three default inactive rules present after installing PureData System for Analytics 7.1, as shown in Figure 7.
Creating a scheduler rule
To create a new scheduler rule, click Add rule from the Scheduler Rules view context menu. A dialog opens where you can configure rule conditions and actions. For the example, let's create a rule called AbortPwrGroupJobs, which will abort all the queries executed as the resource sharing group PWR_GRP. Specify the condition by clicking Add Condition, as in Figure 8.
Figure 8. Adding condition
After specifying the condition, you need to specify the action to be taken. In the example, the action is Abort, as in Figure 9. The conditions and action are graphically visible in the rule editor window so you can easily add more conditions or remove existing conditions or actions. By default, the new rule is activated. It is possible to create an inactive rule by clicking the Set scheduler rule off box in the upper right of the rule's pane. Click OK in this view to save the new scheduler rule.
Figure 9. Rule layout
After you create the scheduler rule, it can be modified using the context menu for rules. You can rename it, edit it in SQL mode, drop it, change its owner, and deactivate or reactivate it.
Scheduler rules' effect on resource allocation
Using our example, let's look at the effect of scheduler rule AbortPwrGroupJobs on the workload executed on a PDA system. Before the activation of the rule, the PWR_GRP was active and ran a significant part of the jobs executed on the system, as shown in Figure 1 and Figure 6. Once the rule is activated, all jobs from PWR_GRP should be aborted. Indeed, this is what happens. The Sessions view in Netezza Performance Portal shows that all sessions from the various PWR_USR users belonging to PWR_GRP have been disconnected, as in Figure 10.
Figure 10. Resource allocation graph after scheduler rule activation
Disabling a scheduler rule
When you no longer want the scheduler rule to be applied to the workload on an appliance, deactivate the rule by clicking Deactivate from the context menu. The rule will be deactivated immediately and will not be applied to any warehouse jobs started after deactivation.
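The activation and deactivation behavior can be pictured as a check made when each job starts. The Python sketch below is a hypothetical model rather than appliance code: the `AbortRule` and `Appliance` classes and the job identifiers are made up, and the model only covers jobs submitted while the rule is in a given state.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AbortRule:
    """Hypothetical stand-in for a rule such as AbortPwrGroupJobs."""
    name: str
    group: str
    active: bool = True   # new rules are active by default


@dataclass
class Appliance:
    rule: AbortRule
    running: List[str] = field(default_factory=list)
    aborted: List[str] = field(default_factory=list)

    def submit(self, job_id: str, group: str) -> None:
        # The rule is consulted when the job starts, so deactivating it
        # affects only jobs started afterwards.
        if self.rule.active and group == self.rule.group:
            self.aborted.append(job_id)
        else:
            self.running.append(job_id)


rule = AbortRule("AbortPwrGroupJobs", group="PWR_GRP")
system = Appliance(rule)

system.submit("job-1", "PWR_GRP")       # aborted while the rule is active
system.submit("job-2", "LIGHT_GRP_1")   # other groups are unaffected

rule.active = False                     # Deactivate from the context menu
system.submit("job-3", "PWR_GRP")       # runs normally after deactivation

print(system.aborted)   # ['job-1']
print(system.running)   # ['job-2', 'job-3']
```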
This article discussed two new features of Netezza Performance Portal: resource allocation graphs and the scheduler rules management view. Used together, they let PDA users dynamically adjust the workload running on their system and influence the resource allocation.
- IBM PureData System for Analytics Information Center: Contains the product documentation for the IBM Netezza Performance Software for release 7.0.3. The Netezza Performance Software is designed for use on the IBM Netezza and IBM PureData System for Analytics family of data warehouse appliances.
- IBM PureData System for Analytics: Powered by Netezza technology, the IBM PureData System for Analytics is a simple database for serious analytics.
- Check out the Redbooks® publication The Netezza Data Appliance Architecture: A Platform for High Performance Data Warehousing and Analytics.