IBM Support

Best Practice and Guidelines - Performance - Adjusting Big SQL Memory Allocation - Hadoop Dev

Technical Blog Post


Body

1. Identify which applications use YARN Memory

Big SQL memory utilization can be changed at install time or afterward. When deciding whether to increase Big SQL memory on the cluster, first consider which applications are running on the system. Ambari examines the cluster profile and recommends an initial memory configuration that balances YARN and non-YARN resources. Applications that use YARN include Pig, Hive, and Spark. Non-YARN resources include HBase, Kafka, and Big SQL (though LOAD and ANALYZE v1 use YARN). For example, if both Spark and Big SQL applications run on your cluster, you may want to spread the memory evenly across YARN and non-YARN applications.

1.1 Big SQL Dependencies on YARN Memory

Big SQL does not use YARN memory by default, with the exception of the LOAD and ANALYZE v1 commands. ANALYZE v2 is the default since Big SQL 4.2, so no ANALYZE-related configuration change is needed on Big SQL 4.2 or later. In Big SQL v5 there is also integration with YARN through Apache Slider.

2. Determine how much memory is used by YARN

As noted in Section 1, Ambari recommends an initial memory configuration based on the cluster profile. To determine the memory actually allocated to YARN, check the following properties in Ambari under YARN->Configs->Settings->Memory:

  • yarn.nodemanager.resource.memory-mb
  • yarn.scheduler.maximum-allocation-mb
  • yarn.scheduler.minimum-allocation-mb
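The same three properties can also be read from a node's deployed yarn-site.xml rather than the Ambari UI. The following is a minimal sketch, not part of the original procedure: the helper name yarn_prop is hypothetical, the sample values are made up, and it assumes each <value> element sits on the line directly after its <name> element, which is the usual layout but is not guaranteed.

```shell
#!/bin/sh
# yarn_prop NAME FILE: print the value of property NAME from a
# yarn-site.xml-style FILE. Hypothetical helper; assumes <value> is on
# the line immediately following the matching <name> line.
yarn_prop() {
    grep -F -A1 "<name>$1</name>" "$2" \
        | sed -n 's:.*<value>\(.*\)</value>.*:\1:p'
}

# Demonstration against a temporary sample file with made-up values.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<configuration>
  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>98304</value>
  </property>
  <property>
    <name>yarn.scheduler.maximum-allocation-mb</name>
    <value>53248</value>
  </property>
  <property>
    <name>yarn.scheduler.minimum-allocation-mb</name>
    <value>1024</value>
  </property>
</configuration>
EOF

for p in yarn.nodemanager.resource.memory-mb \
         yarn.scheduler.maximum-allocation-mb \
         yarn.scheduler.minimum-allocation-mb; do
    echo "$p=$(yarn_prop "$p" "$sample")"
done
rm -f "$sample"
```

On a real node the second argument would be the path of the deployed configuration file (often /etc/hadoop/conf/yarn-site.xml, though that location is an assumption and varies by distribution). Ambari remains the authoritative place to view and change these values.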

As an example, consider Figure 2, which shows an approximate allocation of memory for YARN and Big SQL as a percentage of the memory on each physical machine of the cluster.
Figure 2. Big SQL and YARN memory allocation

3. Increasing Big SQL Memory and Decreasing YARN Memory

To increase the memory of Big SQL, you must first reduce the memory allocated to YARN. Figures 3 and 4 show Big SQL memory increased by 25% and 50%. Note that YARN memory must be decreased by the same amount before Big SQL memory is changed.

Figure 3. Increasing Big SQL memory by 25%

Figure 4. Increasing Big SQL memory by 50%

3.1 Changing Big SQL Memory from Ambari at Install

When you install the Big SQL service, the Ambari dashboard contains configurations that you can change as the administrator. One of these configurations is bigsql_resource_percent, also referred to as bigsql_resource_allocation. The default percentage given to Big SQL is 25%. You can choose a different percentage to reflect your usage of Big SQL.
If bigsql_resource_percent is changed at install time, follow the instructions in Section 3.2 “Updating YARN Memory Configuration Properties” to adjust the YARN properties accordingly once the installation of Big SQL has completed successfully.
Modifying the bigsql_resource_percent value in Ambari after the initial install of Big SQL has no effect. The property must be modified from the command line. See Section 3.3 “Changing Big SQL Memory after Install” for more details.

3.2 Updating YARN Memory Configuration Properties

Before changing any of the YARN configuration properties, record their original values.
1. Take note of and save the following configuration values:

  • yarn.nodemanager.resource.memory-mb
  • yarn.scheduler.maximum-allocation-mb
  • yarn.scheduler.minimum-allocation-mb
  • yarn.nodemanager.resource.cpu-vcores

2. Calculate the memory in MB needed for the change in bigsql_resource_percent:
Determine how much memory is on the system by looking for MemTotal in the /proc/meminfo file, then convert this value from kB to MB (SysMem). An example is shown below:

    cat /proc/meminfo
    MemTotal:       132151672 kB

Use the following calculation to determine the amount of memory, in MB, by which you want to increase Big SQL memory:

MemInc = SysMem * (x2-x1)/100

where:
x1 is the percentage you start with (the default or the current value) and x2 is the percentage of memory you want to end up with;
SysMem is the total memory in MB on the system and MemInc is the memory to be added to Big SQL.

For example, to increase Big SQL memory from 25% to 50% with the MemTotal shown in the example above:

    x1 = 25, x2 = 50
    SysMem = 132151672 / 1024 = 129054
    MemInc = 129054 * (50 - 25) / 100 = 32263
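The same arithmetic can be sketched with shell integer arithmetic. This is an illustrative sketch only: the function names kb_to_mb and mem_inc_mb are hypothetical, and the input is the sample MemTotal of 132151672 kB from the example. Integer division truncates fractional MB, matching the rounded-down figures in the worked example.

```shell
#!/bin/sh
# kb_to_mb KB: convert a MemTotal value in kB to whole MB (truncating).
kb_to_mb() {
    echo $(( $1 / 1024 ))
}

# mem_inc_mb SYSMEM_MB X1 X2: MemInc = SysMem * (x2 - x1) / 100,
# truncated to whole MB.
mem_inc_mb() {
    echo $(( $1 * ($3 - $2) / 100 ))
}

mem_total_kb=132151672                    # sample MemTotal from the example
sys_mem=$(kb_to_mb "$mem_total_kb")       # SysMem in MB
mem_inc=$(mem_inc_mb "$sys_mem" 25 50)    # going from 25% to 50%

echo "SysMem=${sys_mem} MB"
echo "MemInc=${mem_inc} MB"
```

On a live node, the kB figure could be read with awk '/^MemTotal:/ {print $2}' /proc/meminfo instead of being hard-coded.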

3. Update the following YARN parameters to accommodate the change made to bigsql_resource_percent:

  • yarn.nodemanager.resource.memory-mb=current value - MemInc
  • yarn.scheduler.maximum-allocation-mb=yarn.nodemanager.resource.memory-mb
  • yarn.scheduler.minimum-allocation-mb=1024 (fixed value in MB)
  • yarn.nodemanager.resource.cpu-vcores=current value * ((100 - x)/100)

where x is the delta in percentage points: if you intend to increase the memory for Big SQL from 25% to 50%, the delta is 25, and if you intend to increase it from 25% to 75%, the delta is 50. Note that the maximum allowed value for yarn.scheduler.maximum-allocation-mb is 53248 (52 GB), and the maximum for yarn.nodemanager.resource.cpu-vcores is 32 vcores. If the calculation above gives numbers greater than these values, set them to the maximum allowed.
The resulting yarn.nodemanager.resource.memory-mb is the total physical memory in MB available on the system minus the memory allocated to Big SQL and other services (e.g. HBase Region Server, HDFS DataNode, YARN NodeManager, Linux OS). Essentially, whatever you add to Big SQL you remove from YARN. The following diagrams show the configuration of YARN after changing the configuration parameters.
Reducing YARN memory
Reducing YARN CPU resources
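The step 3 formulas, including the caps of 53248 MB and 32 vcores, can be sketched as follows. The starting values for yarn.nodemanager.resource.memory-mb (98304) and yarn.nodemanager.resource.cpu-vcores (24) are hypothetical and should be replaced with the values saved in step 1. MemInc and the delta come from the 25% to 50% example, and min_of is an illustrative helper name.

```shell
#!/bin/sh
# min_of A B: print the smaller of two integers (used to apply the caps).
min_of() {
    if [ "$1" -lt "$2" ]; then echo "$1"; else echo "$2"; fi
}

cur_mem_mb=98304    # hypothetical current yarn.nodemanager.resource.memory-mb
cur_vcores=24       # hypothetical current yarn.nodemanager.resource.cpu-vcores
mem_inc=32263       # MemInc from the 25% -> 50% example
delta=25            # x, the delta in percentage points

# memory-mb = current value - MemInc
new_mem_mb=$(( cur_mem_mb - mem_inc ))
# maximum-allocation-mb = memory-mb, capped at 53248
new_max_alloc=$(min_of "$new_mem_mb" 53248)
# cpu-vcores = current value * ((100 - x)/100), capped at 32
new_vcores=$(min_of $(( cur_vcores * (100 - delta) / 100 )) 32)

echo "yarn.nodemanager.resource.memory-mb=${new_mem_mb}"
echo "yarn.scheduler.maximum-allocation-mb=${new_max_alloc}"
echo "yarn.scheduler.minimum-allocation-mb=1024"
echo "yarn.nodemanager.resource.cpu-vcores=${new_vcores}"
```

The multiplication is done before the division so that integer arithmetic does not reduce (100 - x)/100 to zero.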

4. Restart the services for the changes to take effect, including Big SQL, YARN, and Hive (there may be others).

3.3 Changing Big SQL Memory after Install

This section describes the steps needed to update the amount of memory assigned to Big SQL after install. Note that changing the value of bigsql_resource_percent in the Ambari UI after install does not change the memory assigned to Big SQL. The YARN/MR properties must also be updated as outlined in Section 3.2 before Big SQL is restarted.
1. Verify that all Big SQL services are running. If Big SQL is stopped, start the Big SQL service.
2. Log on to the Big SQL head node by running the following command:
su - bigsql
When prompted, enter the bigsql administrator password that you created at installation time.
3. If you want to know the maximum amount of memory that can be consumed with the current configuration, run the following commands:
Issue the ATTACH command to connect the application to the database instance:
db2 attach to bigsql
Show the detail of the INSTANCE_MEMORY configuration:
db2 get dbm cfg show detail | grep INSTANCE_MEMORY
The result shows both the percentage and the actual memory that is allocated, in 4 KB units.
Terminate the session:
db2 terminate
4. As the owner of the Big SQL instance, run the following command to connect to the bigsql database:
db2 connect to BIGSQL
5. As the owner of the Big SQL instance, run the following commands to update the node resources percentage:
db2 "call syshadoop.big_sql_service_mode('on')"
db2 autoconfigure using mem_percent x workload_type complex is_populated no apply db and dbm
db2 "call syshadoop.big_sql_service_mode('off')"
In the autoconfigure command, substitute an integer value, for example 50 or 75, in place of x. The value of x represents the percentage of memory on the machine that the local database is allowed to consume. It may also be helpful to save the output of these commands for future reference.
6. Terminate the session by using the following command:
db2 terminate
7. As the Big SQL administrator user, stop and then restart the Big SQL service.
8. You can verify the change in the resources percentage by using the following commands:
cd $BIGSQL_HOME/bin
su - bigsql
db2 get dbm cfg | grep INSTANCE_MEMORY
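Because INSTANCE_MEMORY is reported in 4 KB units, converting the figure to MB makes it easier to compare with the SysMem and MemInc values calculated in Section 3.2. The following is a minimal sketch: pages_to_mb is a hypothetical helper name and the page count is a made-up value, not output from a real system.

```shell
#!/bin/sh
# pages_to_mb PAGES: convert a count of 4 KB units to whole MB.
# 4 KB * PAGES / 1024 = MB (integer division truncates).
pages_to_mb() {
    echo $(( $1 * 4 / 1024 ))
}

instance_memory_pages=8257536   # made-up value, in 4 KB units
instance_memory_mb=$(pages_to_mb "$instance_memory_pages")

echo "INSTANCE_MEMORY is about ${instance_memory_mb} MB"
```

The page count would be copied by hand from the db2 get dbm cfg output. The sketch deliberately does not parse that output, since its exact layout is not shown here.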

4. Restoring Memory Defaults

4.1 Restoring YARN Settings to the Default

Change the following YARN settings back to the values that you saved in step 1 of Section 3.2.

  • yarn.nodemanager.resource.memory-mb
  • yarn.scheduler.maximum-allocation-mb
  • yarn.scheduler.minimum-allocation-mb
  • yarn.nodemanager.resource.cpu-vcores

4.2 Restoring Big SQL memory to the default

1. Verify that all Big SQL services are running. If Big SQL is stopped, start the Big SQL service.
2. Log on to the Big SQL head node by running the following command:
su - bigsql
When prompted, enter the bigsql administrator password that you created at installation time.
3. If you want to know the maximum amount of memory that can be consumed with the current configuration, run the following commands:
Issue the ATTACH command to connect the application to the database instance:
db2 attach to bigsql
Show the detail of the INSTANCE_MEMORY configuration:
db2 get dbm cfg show detail | grep INSTANCE_MEMORY
The result shows both the percentage and the actual memory that is allocated, in 4 KB units.
Terminate the session:
db2 terminate
4. As the owner of the Big SQL instance, run the following command to connect to the bigsql database:
db2 connect to BIGSQL
5. As the owner of the Big SQL instance, run the following commands to reset the node resources percentage to the default of 25:
db2 "call syshadoop.big_sql_service_mode('on')"
db2 autoconfigure using mem_percent 25 workload_type complex is_populated no apply db and dbm
db2 "call syshadoop.big_sql_service_mode('off')"
It may also be helpful to save the output of these commands for future reference.
6. Disconnect from the BIGSQL database and then terminate the session by using the following two commands:
db2 disconnect BIGSQL
db2 terminate
7. As the Big SQL administrator user, stop and then restart the Big SQL service.
8. You can verify the change in the resources percentage by using the following commands:
cd $BIGSQL_HOME/bin
su - bigsql
db2 get dbm cfg | grep INSTANCE_MEMORY
