Enabling tracing to aid in troubleshooting

To identify the cause of failures or errors, you can enable tracing (logging) for Cassandra queries or all operations. Only enable tracing during problem identification and disable immediately upon resolution.

About this task

Important: Tracing affects performance negatively. Never enable tracing in a production environment unless advised by an IBM® Support Engineer.
You can set logging properties in the server configuration file (server.xml) by adding a logging element to the server configuration file as follows:
<logging traceSpecification = "*=audit:com.myco.mypackage.*=finest" />
The format of the log detail level specification is:
<component> = <level>
Where <component> is the component for which to set a log detail level, and <level> is one of the valid logger levels (off, fatal, severe, warning, audit, info, config, detail, fine, finer, finest, all). Separate multiple log detail level specifications with colons (:).
Tracing for Cassandra query language (CQL) is enabled by setting the logging level of a Cassandra Data Access Object (DAO) to ALL. Cassandra DAOs are in the following Java™ package:
com.ibm.mailbox.database.dao.cassandra

To enable tracing:

Procedure

  1. To enable tracing for a specific Cassandra Data Access Object (DAO), specify the logging level with the following value:
    logging traceSpecification = com.ibm.mailbox.database.dao.cassandra.DAOName.level=ALL
    Where DAOName is the name of the DAO whose CQL execution you want to trace.
    Tip: All CQL tracing for a DAO is performed and reported in the CassandraDAO abstract superclass.
  2. To enable CQL tracing for all Cassandra DAOs, specify the logging level with the following value:
    logging traceSpecification = com.ibm.mailbox.database.dao.cassandra.level=ALL
    Restriction: Only use this option if you do not resolve the issue with the first option.
  3. To enable all logging in Global Mailbox, including CQL trace, specify the logging level with the following value:
    logging traceSpecification = com.ibm.mailbox.level=ALL
    Restriction: Only use this option if you do not resolve the issue with the first or second option.
  4. A mediator class coordinates activities between various DAOs. The CQL statements created by mediated DAOs are not executed until the mediator commit() method is called. To enable tracing of the CQL statements executed by the mediator, enable tracing by using the following logging level:
    com.ibm.mailbox.database.mediator.CassandraAtomicityMediator.CQL.level=ALL
    Restriction: This option enables CQL tracing for ALL CQL statements created by mediated DAO methods.

Results

For a successful trace, the output in the log looks similar to the following excerpt:
[2015-03-13T18:05:08.695-0500] 00000001 DatacenterDAO > 
com.ibm.mailbox.database.dao.cassandra.DatacenterDAO listAllDatacenters ENTRY
[2015-03-13T18:05:08.729-0500] 00000001 CassandraDAO  
I com.ibm.mailbox.database.dao.cassandra.CassandraDAO cqlTrace
Statements:
SELECT * FROM gatekeeper.datacenters;

Host (queried): localhost/127.0.0.1:44952
Host (tried): localhost/127.0.0.1:44952
Trace id: 64f51d80-c9d5-11e4-996d-7d8ea8b489ab

------------------------------------------------------------------------------------------------+---------------------------+------------+---------------
Determining replicas to query | 2015-03-13T18:05:08-05:00 | /127.x.0.x |xx
Executing seq scan across 0 sstables for [min(-9223372036854775808), 
min(-9223372036854775808)] | 2015-03-13T18:05:08-05:00 | /127.x.0.x |   yyy
Scanned 0 rows and matched 0 | 2015-03-13T18:05:08-05:00 | /127.0.0.1 |   zzz
The Statements section shows the CQL statements executed (in the case of a Batch, there might be more than one CQL statement executed), followed by a table. The Queried host is the host that coordinated (performed) the query. The Tried host is the host (or hosts) that the query is performed against. This is usually the same as the queried host but if connection failures occur, this is a list of all hosts that were tried. The Events table shows a list of query events (oldest to youngest). Each row contains:
  • Event description
  • Event time stamp
  • Event host - the host that generated the event
  • Event elapsed time - the amount of time, in microseconds, taken from the start of query processing to when the event was raised.

Sometimes, the tracing information cannot be retrieved from Cassandra. Five attempts are made by the Cassandra driver logic to retrieve the tracing information. If the information cannot be retrieved, a log entry similar to the following message is created:

[2015-03-13T18:05:08.775-0500] 00000001 DatacenterDAO > 
com.ibm.mailbox.database.dao.cassandra.DatacenterDAO listAllDatacenters ENTRY
[2015-03-13T18:04:52.806-0500] 00000001 CassandraDAO  
I com.ibm.mailbox.database.dao.cassandra.CassandraDAO cqlTrace
Statements:
SELECT * FROM gatekeeper.datacenters;
Failed to retrive trace for statement(s)

The tracing information from mediated DAO methods looks similar to the following excerpt:

[2015-03-18T11:51:09.640-0700] 00000001 CassandraAtom I com.ibm.mailbox.database.mediator.CassandraAtomicityMediator$CQL cqlTrace
Statements:
INSERT INTO gatekeeper.password_vault(id,password) VALUES (?,?);
INSERT INTO gatekeeper.password_vault(id,password) VALUES (?,?);
INSERT INTO gatekeeper.password_vault(id,password) VALUES (?,?);
INSERT INTO gatekeeper.password_vault(id,password) VALUES (?,?);
INSERT INTO gatekeeper.password_vault(id,password) VALUES (?,?);
INSERT INTO gatekeeper.password_vault(id,password) VALUES (?,?);
INSERT INTO gatekeeper.password_vault(id,password) VALUES (?,?);
INSERT INTO gatekeeper.password_vault(id,password) VALUES (?,?);
INSERT INTO gatekeeper.password_vault(id,password) VALUES (?,?);
INSERT INTO gatekeeper.password_vault(id,password) VALUES (?,?);

Host (queried): localhost/127.0.0.1:47195
Host (tried): localhost/127.0.0.1:47195
Trace id: bdc3afd0-cd9f-11e4-a376-f51d2de36ca2

-----------------------------------------------------+---------------------------+------------+---------------
Determining replicas for atomic batch | 2015-03-18T11:51:09-07:00 | /127.0.0.1 |          557
       Acquiring switchLock read lock | 2015-03-18T11:51:09-07:00 | /127.0.0.1 |         4163
               Appending to commitlog | 2015-03-18T11:51:09-07:00 | /127.0.0.1 |         4190
          Adding to batchlog memtable | 2015-03-18T11:51:09-07:00 | /127.0.0.1 |         4216
       Acquiring switchLock read lock | 2015-03-18T11:51:09-07:00 | /127.0.0.1 |         8058
               Appending to commitlog | 2015-03-18T11:51:09-07:00 | /127.0.0.1 |         8133
    Adding to password_vault memtable | 2015-03-18T11:51:09-07:00 | /127.0.0.1 |         8253
       Acquiring switchLock read lock | 2015-03-18T11:51:09-07:00 | /127.0.0.1 |         8637
               Appending to commitlog | 2015-03-18T11:51:09-07:00 | /127.0.0.1 |         8726
    Adding to password_vault memtable | 2015-03-18T11:51:09-07:00 | /127.0.0.1 |         8817
       Acquiring switchLock read lock | 2015-03-18T11:51:09-07:00 | /127.0.0.1 |         9109
               Appending to commitlog | 2015-03-18T11:51:09-07:00 | /127.0.0.1 |         9217
       Acquiring switchLock read lock | 2015-03-18T11:51:09-07:00 | /127.0.0.1 |         9462
       Acquiring switchLock read lock | 2015-03-18T11:51:09-07:00 | /127.0.0.1 |        10697
       Acquiring switchLock read lock | 2015-03-18T11:51:09-07:00 | /127.0.0.1 |        10755
       Acquiring switchLock read lock | 2015-03-18T11:51:09-07:00 | /127.0.0.1 |        10791
       Acquiring switchLock read lock | 2015-03-18T11:51:09-07:00 | /127.0.0.1 |        10852
       Acquiring switchLock read lock | 2015-03-18T11:51:09-07:00 | /127.0.0.1 |        10888
       Acquiring switchLock read lock | 2015-03-18T11:51:09-07:00 | /127.0.0.1 |        10992
               Appending to commitlog | 2015-03-18T11:51:09-07:00 | /127.0.0.1 |        11464
    Adding to password_vault memtable | 2015-03-18T11:51:09-07:00 | /127.0.0.1 |        11511
               Appending to commitlog | 2015-03-18T11:51:09-07:00 | /127.0.0.1 |        11618
    Adding to password_vault memtable | 2015-03-18T11:51:09-07:00 | /127.0.0.1 |        11642
               Appending to commitlog | 2015-03-18T11:51:09-07:00 | /127.0.0.1 |        11730
    Adding to password_vault memtable | 2015-03-18T11:51:09-07:00 | /127.0.0.1 |        11750
               Appending to commitlog | 2015-03-18T11:51:09-07:00 | /127.0.0.1 |        11832
    Adding to password_vault memtable | 2015-03-18T11:51:09-07:00 | /127.0.0.1 |        11856
               Appending to commitlog | 2015-03-18T11:51:09-07:00 | /127.0.0.1 |        11937
    Adding to password_vault memtable | 2015-03-18T11:51:09-07:00 | /127.0.0.1 |        11959
               Appending to commitlog | 2015-03-18T11:51:09-07:00 | /127.0.0.1 |        12021
    Adding to password_vault memtable | 2015-03-18T11:51:09-07:00 | /127.0.0.1 |        12043
               Appending to commitlog | 2015-03-18T11:51:09-07:00 | /127.0.0.1 |        12131
    Adding to password_vault memtable | 2015-03-18T11:51:09-07:00 | /127.0.0.1 |        12151
    Adding to password_vault memtable | 2015-03-18T11:51:09-07:00 | /127.0.0.1 |        18755
       Acquiring switchLock read lock | 2015-03-18T11:51:09-07:00 | /127.0.0.1 |        19153
               Appending to commitlog | 2015-03-18T11:51:09-07:00 | /127.0.0.1 |        19167
          Adding to batchlog memtable | 2015-03-18T11:51:09-07:00 | /127.0.0.1 |        19202

What to do next

When you resolve the issue, immediately disable logging. Remove the added entry and save the new version of server.xml.