Db2 Big SQL scheduler
The Big SQL scheduler is a service that helps in the compilation and execution of queries involving Datalake tables. The scheduler provides access to the metadata for Datalake tables that are involved in a scan. This metadata includes table descriptors with details such as column types, column lengths, partition information, and input format.
Scheduler cache
expireAfterAccess
, which resets the timer every time a cache item is accessed
before it expires, allowing for the lifespan of cache items to be extended when being frequently
accessed. The alternative policy is expireAfterWrite
, where the timer begins
counting down after the cache items are first written into the cache. You can check the current
value of these parameters using the GET_Datalake_CONFIG table
function:select * from table(syshadoop.get_datalake_config('BIGSQL'));
call syshadoop.set_datalake_config('BIGSQL', 'scheduler.tableMetaDataCache.timeToLive', '0');
ALTER Datalake TABLE schema1.table1 SET TBLPROPERTIES('scheduler.cache.exclusion'='true');
This
operation takes effect immediately.Error logs
When the scheduler receives or generates an exception, an SQL5105N error is produced with the
name of the component that issued the exception. The error includes a log entry identifier, a
three-part name that enables you to identify to which file to refer for a particular error message.
The identifier consists of three parts as in the following example:
SCL-NNN-XXXYYYZZZ
. You can use the SYSHADOOP.LOG_ENTRY table function to get log file information for a particular log entry
identifier. Alternatively, you can manually search the
$BIGSQL_DIST_VAR/logs/bigsql-sched.log file.
If the Db2 Big SQL I/O engine invokes the Db2 Big SQL scheduler and receives an exception, the log entry identifier might be specific to the Db2 Big SQL I/O engine. Therefore, you might need to search the logs that are specific to the readers and writers.
For more information about log entry identifiers and the locations of the log files, see Db2 Big SQL log files.
Scheduler configuration
- scheduler.java.opts
- The Java options for the scheduler JVM. You can specify any JVM option. The following example
sets the starting memory to 512 MB and the maximum memory to 2
GB:
call syshadoop.set_datalake_config('BIGSQL', 'scheduler.java.opts', '-Xms512M -Xmx2G');
- scheduler.client.request.timeout
- Throughout normal workload execution, the Db2 engine components in Db2 Big SQL make requests to the scheduler and then wait for a response. Because the scheduler is running as a different service than the Db2 engine, there is a timeout that enables the Db2 engine to exit early and fail if the scheduler is taking too long to respond. The DDL processor component also uses this property when sending metadata refresh requests to the scheduler. The default value of this property is 120 seconds.
- scheduler.client.request.IUDEnd.timeout
- This property is an identical concept to the scheduler.client.request.timeout property, except that this property is uniquely for INSERT, or UPDATE, or DELETE (IUD) queries only. Due to the nature of IUD workloads, the scheduler might take somewhat longer for these types of operations. Therefore, this configuration property provides more granular timeout control depending on the type of query. In other words, it is possible to give more time for IUD queries with respect to scheduler.client.request.timeout, without having to influence the existing timeouts for non-IUD queries. The default value of this property is 600 seconds.