Query Optimizer activation failure

Issue: Query Optimizer activation fails to complete within a reasonable time frame. The oc get wxdengine command displays an ERROR status for the optimizer pod.

Symptoms

The failure can occur when the Big SQL metastore or scheduler is not running correctly, which prevents the optimizer from initializing.

Resolving the problem

Verify the Query Optimizer status and, in case of a failure, manually start Big SQL to activate Query Optimizer.

  1. Run the following command to verify the status of Query Optimizer.
    oc get wxdengine
    The output might look like the following example:
    NAME                       VERSION   TYPE          DISPLAY NAME     SIZE     RECONCILE    STATUS    AGE
    lakehouse-oaas             2.0.3     optimizer     lakehouse-oaas   custom   InProgress   ERROR     70m
    (skip)
  2. If the status is ERROR, run the following command to check the Query Optimizer pod logs. Replace <OPT_POD> with the name of the optimizer pod, for example c-lakehouse-oaas-db2u-0.
    oc logs <OPT_POD>
    The logs might show the error "Exception: Bigsql metastore or the scheduler are not running", as in the following example:
    oc logs c-lakehouse-oaas-db2u-0
    {"level":"error","component":"server","subComponent":"GetDb2HealthV2","caller":"[14]:db2.go:301:GetDb2HealthV2()","timestamp":"2024-11-19T12:47:50Z","message":"--readiness status failed:  stderr: Traceback (most recent call last):\n  File \"/usr/local/bin/status\", line 33, in <module>\n    sys.exit(load_entry_point('db2u-containers==1.0.0.dev1', 'console_scripts', 'status')())\n  File \"/usr/local/lib/python3.9/site-packages/cli/status.py\", line 28, in main\n    Status().readiness()\n  File \"/usr/local/lib/python3.9/site-packages/status/status.py\", line 78, in readiness\n    self.status_check()\n  File \"/usr/local/lib/python3.9/site-packages/status/status.py\", line 66, in status_check\n    raise Exception(\"Bigsql metastore or the scheduler are not running\")\nException: Bigsql metastore or the scheduler are not running\n hostname: c-lakehouse-oaas-db2u-0"}
    (skip)
    {"level":"error","component":"server","subComponent":"GetDb2HealthV2","caller":"[14]:db2.go:301:GetDb2HealthV2()","timestamp":"2024-11-20T00:57:23Z","message":"--readiness status failed:  stderr: Traceback (most recent call last):\n  File \"/usr/local/bin/status\", line 33, in <module>\n    sys.exit(load_entry_point('db2u-containers==1.0.0.dev1', 'console_scripts', 'status')())\n  File \"/usr/local/lib/python3.9/site-packages/cli/status.py\", line 28, in main\n    Status().readiness()\n  File \"/usr/local/lib/python3.9/site-packages/status/status.py\", line 78, in readiness\n    self.status_check()\n  File \"/usr/local/lib/python3.9/site-packages/status/status.py\", line 66, in status_check\n    raise Exception(\"Bigsql metastore or the scheduler are not running\")\nException: Bigsql metastore or the scheduler are not running\n hostname: c-lakehouse-oaas-db2u-0"}
  3. If the logs show the error from the previous step, run the following command to start Big SQL manually. The command expects the OPT_POD shell variable to be set to the optimizer pod name.
    oc exec -i $OPT_POD -- sudo su - db2inst1 -c "bigsql start"
    If the command succeeds, the output shows the Big SQL components starting:
    $ oc exec -i $OPT_POD -- sudo su - db2inst1 -c "bigsql start"
    Defaulted container "db2u" out of: db2u, instdb (init), init-labels (init), init-kernel (init)
    Global config update           : OK
    Starting Big SQL Scheduler     : OK
    Starting Big SQL               : OK
    Starting Standalone Metastore  : OK
    $ 
    
  4. Run the following command to verify the status of Query Optimizer again.
    oc get wxdengine
    The status should change from ERROR to SYNCING:
    # oc get wxdengine
    NAME                       VERSION   TYPE          DISPLAY NAME     SIZE     RECONCILE        STATUS    AGE
    lakehouse-oaas             2.0.3     optimizer     lakehouse-oaas   custom   Syncing tables   SYNCING   13h
    (skip)
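
The manual steps above can be combined into a small script. The following is a minimal sketch, not an official tool: the function names optimizer_status, has_bigsql_error, and recover_optimizer are hypothetical, and the sketch assumes that the oc get wxdengine output has the column layout shown in the examples (TYPE as the third column, STATUS as the second-to-last column) and that OPT_POD is set to the optimizer pod name.

```shell
#!/usr/bin/env bash
# Sketch only: these helper names are hypothetical and not part of any product tooling.

# Read `oc get wxdengine` output on stdin and print the STATUS value of each
# optimizer row. RECONCILE can contain spaces ("Syncing tables"), so STATUS is
# read as the second-to-last field instead of by a fixed column number.
optimizer_status() {
  awk '$3 == "optimizer" { print $(NF - 1) }'
}

# Succeed if the log text on stdin contains the readiness error from step 2.
has_bigsql_error() {
  grep -q 'Bigsql metastore or the scheduler are not running'
}

# Start Big SQL only when the status is ERROR and the logs show the
# metastore/scheduler error. Assumes a single optimizer engine and that
# OPT_POD is already set by the caller.
recover_optimizer() {
  local status
  status=$(oc get wxdengine | optimizer_status)
  if [ "$status" = "ERROR" ] && oc logs "$OPT_POD" | has_bigsql_error; then
    oc exec -i "$OPT_POD" -- sudo su - db2inst1 -c "bigsql start"
  fi
}
```

To use the sketch, source the file, set OPT_POD (for example, OPT_POD=c-lakehouse-oaas-db2u-0), and call recover_optimizer. Then run oc get wxdengine again to confirm that the status changed, as described in step 4.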