Useful commands for use with a running node

Enter the relevant command to return information about a running node. The information ranges from the status of the node to the platform name that each site is on.

Note: The following commands should be run as an admin user.

To display status information for a machine, whether it is in a cluster or standalone, run the following command:

status

If the status check is a success, you receive a message that ends with the following:

SUCCESS: All services are Up and the cluster timestamps are in sync

If the machine is not clustered, the message you receive is:

SUCCESS: All services are Up
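The success check above can be scripted. The following is a minimal sketch, not part of the product: status_succeeded is a hypothetical helper name, and it assumes only what is stated above, namely that a successful message ends with a line beginning "SUCCESS: All services are Up".

```shell
# Hypothetical helper: reads captured `status` output on stdin and succeeds
# only if the last non-empty line is the SUCCESS message described above.
status_succeeded() {
    # Keep the last line that contains at least one field (skips blank lines).
    last_line=$(awk 'NF { line = $0 } END { print line }')
    case "$last_line" in
        # Matches both the clustered and the non-clustered success message.
        "SUCCESS: All services are Up"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage on a node (not executed here):
#   status | status_succeeded && echo "node looks healthy"
```

Matching on the prefix covers both messages, because the clustered message only appends the timestamp clause to the non-clustered one.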

The status command displays the following status information:

System and distribution version
Cluster members, and whether they are active
Configuration values
Database status
Web server and file synchronization
Cluster timestamps

To list the timestamp files that are on a node, run the following command:

timestamps

Each node writes the current time to a file every minute, and these files are copied around the cluster by csync2 and lsyncd. If any file is more than 90 seconds behind or ahead of the current system time, the command returns a non-zero return code that indicates how many timestamps were bad.
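The 90-second rule can be illustrated with a short sketch. This is not the implementation of the timestamps command. The function names are hypothetical, and the timestamps are assumed to be plain epoch seconds passed as arguments.

```shell
# Illustrative only: mirrors the rule described above, not the tool's code.
MAX_SKEW_SECONDS=90

# timestamp_is_bad NOW TS
# Succeeds when TS is more than 90 seconds behind or ahead of NOW.
timestamp_is_bad() {
    skew=$(($1 - $2))
    # Use the absolute value so that timestamps ahead of NOW also count.
    if [ "$skew" -lt 0 ]; then
        skew=$((0 - skew))
    fi
    [ "$skew" -gt "$MAX_SKEW_SECONDS" ]
}

# count_bad_timestamps NOW TS...
# Prints how many of the given timestamps are bad, in the same way that the
# return code of the command reports the number of bad timestamps.
count_bad_timestamps() {
    current=$1
    shift
    bad=0
    for stamp in "$@"; do
        if timestamp_is_bad "$current" "$stamp"; then
            bad=$((bad + 1))
        fi
    done
    echo "$bad"
}
```

On a node, the real count is available from the return code itself, for example by running timestamps and then inspecting $? in the same shell.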

To show the processes running on this machine that are performing any of the tasks mentioned in the Queue System section of the topic Load balancing the requests the Cloud Manager sends to the Developer Portal, run the following command:

show_queue_executors

Note: All task types except add_language are considered priority tasks; add_language tasks are considered non-priority tasks. The command also shows the locks that are held across the cluster, and any tasks waiting on the queue that are currently executable on this node.

To show all tasks that have run or re-run on the queue, with information about each execution, including the return code, the time taken, and the host it ran on, run the following command:

show_queue_history

To show any task runs or re-runs that returned a non-zero return code, run the following command:

show_queue_history -n

To show any tasks that still returned a non-zero return code after all retry attempts (the number of attempts is controlled by QUEUE_TASK_MAX_RETRIES in config/config.ini), run the following command:

show_queue_history -f

A value of Killed in the rc column means that run_site_queue found a queue lock whose original locking process was no longer running, so it killed the old queue lock and therefore does not know the return code of the task. This counts as a task failure, so the retry count is incremented and the task re-runs at the next opportunity.

To show the queue history for just the single site specified, enter the following command:

show_queue_history -q orguuid.envuuid

To show all the columns, enter the following command in conjunction with any of the preceding commands:

show_queue_history -a

Some columns are not shown by default. If you find a task that has completely failed and you want to retry it, use the preceding command to show all columns, and then copy the contents of the JSON column. Then enter the following command:

echo '<paste the json contents here>' | site_action

Note: If you re-issue an add action, site_action automatically adds all the corresponding add_language actions to the queue, so you do not need to re-issue them as well.
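If the JSON contains single quotes, pasting it into the echo command breaks the shell quoting. One possible workaround, sketched below, is to save the JSON to a file and feed it to site_action on standard input, which is how the echo pipeline delivers it. The helper name retry_task_from_file and the SITE_ACTION_CMD variable are hypothetical and exist only in this sketch.

```shell
# Hypothetical helper, not shipped with the product.
# SITE_ACTION_CMD defaults to site_action; override it only for dry runs.
SITE_ACTION_CMD=${SITE_ACTION_CMD:-site_action}

# retry_task_from_file FILE
# Sends the saved JSON column contents to site_action on stdin.
retry_task_from_file() {
    json_file=$1
    # Refuse to run when the file is missing or empty.
    if [ ! -s "$json_file" ]; then
        echo "error: $json_file is missing or empty" >&2
        return 1
    fi
    # Equivalent to: echo '<json>' | site_action, without the quoting problem.
    "$SITE_ACTION_CMD" < "$json_file"
}

# Usage on a node (not executed here):
#   retry_task_from_file /tmp/failed_task.json
```

Setting SITE_ACTION_CMD=cat prints the JSON instead of queuing it, which is a simple way to check what would be sent.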

To delete all traces of a site if it appears to be in a broken state and it cannot be deleted by the CMC, run the following command:

delete_site orguuid.envuuid

To delete all traces of a site if you cannot find the corresponding UUID for the site, enter the following command:

delete_site -u mysite.url.com

First check that the URL you are passing in is not in the list returned by list_sites. If the site does appear in list_sites, delete it by passing the UUID to delete_site instead, for example delete_site orguuid.envuuid, because this also deletes the mapping between the UUID and the URL.
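The check against list_sites can be sketched as follows. The output format of list_sites is not described here, so this sketch assumes that the site URL appears as a whitespace-separated field somewhere in the output. Both function names are hypothetical.

```shell
# Assumption: the URL appears as a whole whitespace-separated field in the
# list_sites output that is piped in on stdin.
url_is_listed() {
    awk -v url="$1" '
        { for (i = 1; i <= NF; i++) if ($i == url) found = 1 }
        END { exit !found }
    '
}

# suggest_delete_mode URL
# Prints "by-uuid" when the URL is listed (use delete_site orguuid.envuuid),
# or "by-url" when it is not (use delete_site -u URL).
suggest_delete_mode() {
    if url_is_listed "$1"; then
        echo "by-uuid"
    else
        echo "by-url"
    fi
}

# Usage on a node (not executed here):
#   list_sites | suggest_delete_mode mysite.url.com
```

The helper only prints a suggestion and never calls delete_site itself, so a mistaken match cannot delete anything.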

config.ini: changes to values in config.ini take effect almost immediately. run_site_queue checks the file regularly on the node where you edited it, and the changes reach all other nodes within a few seconds more.

To find which platform name each site is on, run the following command:

list_sites -p

Then you can run the following command on both nodes involved:

sudo rm -f /var/lib/csync2/*

Then run the following command on the sending and receiving nodes:

sudo service apim_dcluster update-cluster

Tip: From API Connect Version 5.0.7.2, you can check the status of a Developer Portal cluster by calling a cluster health REST API. For more information, see Obtaining health check data of Developer Portal servers by using a REST API call.

Failed log in commands

To show the number of failed log in attempts per site and user, and reset the failed log in count per site either for a specific user or all users, run the following command:

Usage: reset_locked_user [-s] [-l [sitename]] [-r sitename/-a accountname/-a]

Where -s lists sites by URL, and -l shows all locked accounts per site either for all sites, or for sitename if the value is provided.

-r resets the failed log in attempts to zero for sitename and accountname.

-a stands for all sites when it is given in place of sitename, or all accounts when it is given in place of accountname.

To show all locked users on all sites, run the following command:

reset_locked_user -l

To show locked users on a single site, run:

reset_locked_user -l mysite.com

Then, you can unlock specific users on specific sites by running one of the following commands:

reset_locked_user -r mysite.com admin

Or:

reset_locked_user -r mysite.com myuser@email.com

You can reset the access for all users on a single site by running the following command:

reset_locked_user -r mysite.com -a

Or, to reset the access for all users on all sites, run the following command:

reset_locked_user -r -a -a

Bootstrap commands

To check the database status for the entire cluster of machines, run the following command:

bootstrap_cluster -s

The bootstrap_cluster -s command checks the database status on all defined cluster members. If the check is successful, you receive a message that ends with the following line:

SUCCESS: All reachable cluster members reporting Primary database status

To forcibly restart the database on each cluster member, enter the following command:

bootstrap_cluster -bf

Use this command if some of the cluster members show as Starting after you run bootstrap_cluster -s, but you know that the entire cluster is down.

If all the cluster members show as STOPPED, you can run the following command to find the machine with the most up-to-date database, and use it to bootstrap the cluster:

bootstrap_cluster -b

After this machine has bootstrapped, it will start the database on the other machines consecutively so that they join the cluster.
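The choice between the two bootstrap commands can be summarized in a small sketch. The function suggest_bootstrap is hypothetical. It takes the member states that you read from the bootstrap_cluster -s output as arguments, because the exact output format is not described here. It only prints a suggestion. In particular, -bf is appropriate only when you already know that the entire cluster is down, which the sketch cannot verify.

```shell
# suggest_bootstrap STATE...
# Prints the bootstrap command that the guidance above points to.
suggest_bootstrap() {
    total=0
    stopped=0
    starting=0
    for state in "$@"; do
        total=$((total + 1))
        # Compare case-insensitively, since the states are written as
        # both "Starting" and "STOPPED" above.
        case $(printf '%s' "$state" | tr '[:upper:]' '[:lower:]') in
            stopped)  stopped=$((stopped + 1)) ;;
            starting) starting=$((starting + 1)) ;;
        esac
    done
    if [ "$total" -gt 0 ] && [ "$stopped" -eq "$total" ]; then
        # Every member is stopped: bootstrap from the most up-to-date database.
        echo "bootstrap_cluster -b"
    elif [ "$starting" -gt 0 ]; then
        # Some members are stuck in Starting: a forced restart is an option,
        # but only if the whole cluster is known to be down.
        echo "bootstrap_cluster -bf"
    else
        echo "none"
    fi
}

# Example: suggest_bootstrap STOPPED STOPPED STOPPED
```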