Troubleshooting performance issues with SPARQL queries
You can take a number of steps to improve the performance of SPARQL queries that take a long time to run.
- Ensure that the AUTO_RUNSTATS database configuration parameter is set to ON.
If you recently performed a significant update of triples in the RDF store, consider explicitly invoking the RUNSTATS command on the tables of the RDF store. RUNSTATS profiles are created for each table in the RDF store. Invoke RUNSTATS on each RDF store table with the following command: db2 RUNSTATS ON TABLE SCHEMA.TABLE USE PROFILE
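As an illustration, both of these checks can be scripted over JDBC: the sketch below queries the SYSIBMADM.DBCFG view for the auto_runstats setting and calls the SYSPROC.ADMIN_CMD procedure to run RUNSTATS with the profile. The connection URL, credentials, schema, and table names are placeholders, not the actual names of your RDF store tables.

```java
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class RdfStoreStats {

    // SQL that reports the current auto_runstats setting.
    static String autoRunstatsCheckSql() {
        return "SELECT VALUE FROM SYSIBMADM.DBCFG WHERE NAME = 'auto_runstats'";
    }

    // ADMIN_CMD text for RUNSTATS ... USE PROFILE on one table.
    static String runstatsCall(String schema, String table) {
        return "CALL SYSPROC.ADMIN_CMD('RUNSTATS ON TABLE "
                + schema + "." + table + " USE PROFILE')";
    }

    public static void main(String[] args) throws Exception {
        System.out.println(autoRunstatsCheckSql());
        System.out.println(runstatsCall("DB2ADMIN", "MYSTORE_DP"));
        if (args.length == 0) {
            return; // no database URL supplied; just show the statements
        }
        // Placeholder credentials and table names.
        try (Connection conn = DriverManager.getConnection(args[0], "db2admin", "password")) {
            try (Statement st = conn.createStatement();
                 ResultSet rs = st.executeQuery(autoRunstatsCheckSql())) {
                if (rs.next()) {
                    System.out.println("auto_runstats = " + rs.getString(1));
                }
            }
            for (String table : new String[] {"MYSTORE_DP", "MYSTORE_RP"}) {
                try (CallableStatement cs = conn.prepareCall(runstatsCall("DB2ADMIN", table))) {
                    cs.execute();
                }
            }
        }
    }
}
```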
- Ensure that you have bound and are using the REOPT ONCE or REOPT
ALWAYS packages. Because parameterized SQL statements are used
during SQL query execution, you can take advantage of any skew in
the data distribution by performing the following steps:
- Bind the NULLIDR1 package set with the REOPT ONCE bind option or bind the NULLIDRA package set with the REOPT ALWAYS bind option.
- Specify the appropriate value for the currentPackageSet property of the DB2BaseDataSource class by using the corresponding setXXX method. In an application server environment, you can set the currentPackageSet property on a DataSource.
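A minimal sketch of these two steps, assuming the IBM Data Server Driver for JDBC accepts currentPackageSet as a connection property; the URL and credentials are placeholders:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.util.Properties;

public class PackageSetConfig {

    // Maps the bind option that was used to the matching package set name.
    static String packageSetFor(String reoptOption) {
        return "REOPT ONCE".equals(reoptOption) ? "NULLIDR1" : "NULLIDRA";
    }

    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.setProperty("user", "db2admin");       // placeholder credentials
        props.setProperty("password", "password");
        // Route parameterized statements to the REOPT ONCE packages:
        props.setProperty("currentPackageSet", packageSetFor("REOPT ONCE"));
        System.out.println("currentPackageSet = " + props.getProperty("currentPackageSet"));

        if (args.length == 0) {
            return; // no database URL supplied; nothing to connect to
        }
        try (Connection conn = DriverManager.getConnection(args[0], props)) {
            System.out.println("connected with package set applied");
        }
    }
}
```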
- Ensure that you have set the schedule for the Db2® Administrative Task
Scheduler to gather statistics specific to the RDF stores by using the
setstatsschedule command. Also ensure that the task is running successfully
at the scheduled interval. You can verify this by first getting the ID of the task from the ADMINTASKS
table and then checking its status in the ADMINTASKSTATUS table.
If you recently performed a significant update of triples in the RDF store, issue the updaterdfstorestats command to force the RDF store specific statistics to be updated immediately.
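The two lookups can be sketched in JDBC as follows. The SYSTOOLS schema qualifier, the column names, and the task-name pattern are assumptions about the administrative task scheduler catalog; verify them against your own catalog before relying on them.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class StatsTaskCheck {

    // Looks up the task ID by name (schema/column names are assumptions).
    static String findTaskSql() {
        return "SELECT TASKID FROM SYSTOOLS.ADMINTASKS WHERE NAME LIKE ?";
    }

    // Fetches the recorded status rows for one task ID.
    static String taskStatusSql() {
        return "SELECT STATUS, SQLCODE FROM SYSTOOLS.ADMINTASKSTATUS WHERE TASKID = ?";
    }

    public static void main(String[] args) throws Exception {
        System.out.println(findTaskSql());
        System.out.println(taskStatusSql());
        if (args.length == 0) {
            return; // no database URL supplied; just show the statements
        }
        try (Connection conn = DriverManager.getConnection(args[0], "db2admin", "password");
             PreparedStatement find = conn.prepareStatement(findTaskSql())) {
            find.setString(1, "%RDF%"); // placeholder task-name pattern
            try (ResultSet rs = find.executeQuery()) {
                while (rs.next()) {
                    int taskId = rs.getInt(1);
                    try (PreparedStatement status = conn.prepareStatement(taskStatusSql())) {
                        status.setInt(1, taskId);
                        try (ResultSet sr = status.executeQuery()) {
                            while (sr.next()) {
                                System.out.println("task " + taskId + ": " + sr.getString(1)
                                        + " (SQLCODE " + sr.getInt(2) + ")");
                            }
                        }
                    }
                }
            }
        }
    }
}
```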
- After performing the previous three steps, if you still experience
performance issues with SPARQL queries where the queries take a long
time to execute or you encounter any exception during the processing
of SPARQL queries, turn on the query log in the RDF store API. The
following example shows different methods of turning on the query
log in the RDF store API.
```java
Store store = StoreManager.connectStore(conn, "db2admin", "Sample");

// Set the file where SPARQL query information should be logged.
// Note: if the file already exists, it is overwritten.
store.setQueryLogFile("\querylogfile.txt");

// Set the file where SPARQL query information should be logged,
// and also a time limit (in milliseconds). If a query takes more
// time than the specified value, it is logged.
store.setQueryLogFile("\querylogfile.txt", 3000);
```
All exceptions that occur during the processing of SPARQL queries are logged to the query log file. To help resolve these performance and processing issues, contact IBM® Support and provide the query log file to aid diagnosis.
- The Db2 RDF component throws Java™ runtime exceptions whenever an error is encountered. Turn on LOG4J logging in the Java virtual machine (JVM) application to collect more information when such exceptions occur. The application can examine the SQL exceptions and error codes to determine whether the problem is a database setup issue.
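Because the SQLException is typically wrapped inside the runtime exception, a small helper that walks the cause chain can surface the SQLSTATE and error code for diagnosis. This is a generic sketch; the exception classes thrown by the RDF component itself may differ, and the SQLException shown is a constructed example.

```java
import java.sql.SQLException;

public class SqlCauseInspector {

    // Walks the cause chain of an exception thrown by the RDF component
    // and returns the first SQLException found, or null if there is none.
    static SQLException findSqlCause(Throwable t) {
        for (Throwable c = t; c != null; c = c.getCause()) {
            if (c instanceof SQLException) {
                return (SQLException) c;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Simulated failure: a runtime exception wrapping a Db2-style
        // SQLException (SQLSTATE 42704, SQLCODE -204: undefined name).
        RuntimeException boom = new RuntimeException("store failure",
                new SQLException("table missing", "42704", -204));
        SQLException sql = findSqlCause(boom);
        if (sql != null) {
            // prints SQLSTATE=42704 errorCode=-204
            System.out.println("SQLSTATE=" + sql.getSQLState()
                    + " errorCode=" + sql.getErrorCode());
        }
    }
}
```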