Loading the sample Spark application code

Examples of Apache Spark application code are provided in files that you can load into your $HOME/spark/apps directory and use as templates for writing your own code.

Note: The current examples are based on Spark Release 2.3.0. Earlier versions of these examples were based on an earlier Spark release and might not run correctly with the current Spark release. Therefore, always load the latest examples.

The following conditions apply to the sample files:

  • The Python examples are individual files, each of which has the file extension .py.
  • The Scala examples have the file extension .scala and are grouped into a single archive file with the name idax_examples.jar.
  • The R examples are individual files, each of which has the file extension .R.

To load these files into your $HOME/spark/apps directory, issue the following command:

spark-submit.sh --load-samples

This command creates the spark/apps directory in your Db2® Warehouse home directory (if it does not already exist), and copies the Python examples, the R examples, and the idax_examples.jar file into that directory.

You might want to download one or more of the examples from your $HOME/spark/apps directory to your client system, so that you can use the sample code as a basis for your own applications.

To list the available samples, issue the following command:

spark-submit.sh --list-files apps

Issue the following command for each file that you want to download:

spark-submit.sh --download-file apps file_name
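
When several samples are needed, the per-file download command can be wrapped in a loop. The following is a minimal sketch rather than part of the product: the function name download_samples and the SPARK_SUBMIT override variable are hypothetical, and the second file name in the usage comment is a placeholder for a name reported by --list-files apps.

```shell
# Hypothetical helper: download each named file from the apps directory.
# SPARK_SUBMIT is an assumed override for the location of spark-submit.sh;
# when it is unset, spark-submit.sh is looked up on the PATH.
download_samples() {
  for file_name in "$@"; do
    # Same command as in the text, issued once per file.
    # Stop at the first file that fails to download.
    "${SPARK_SUBMIT:-spark-submit.sh}" --download-file apps "$file_name" || return 1
  done
}

# Usage (the second name is a placeholder):
#   download_samples idax_examples.jar file_name
```

Stopping at the first failure keeps a misspelled file name from going unnoticed in a longer list.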

Using a REST API call

Alternatively, use the IBM® Db2 Warehouse Analytics API to submit an HTTP POST request that calls the /dashdb-api/analytics/public/samples/load endpoint. For example, issue the following cURL command (replace the user ID, password, and host name):
curl --user "userid:password" \
  -X POST "https://hostname:8443/dashdb-api/analytics/public/samples/load"
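
In a script, it can help to make the load request fail visibly when the server returns an HTTP error. The sketch below assumes only standard cURL options (--fail, --silent, --show-error); the function name load_samples and the CURL override variable are hypothetical, and the format of the response body is not described here, so the function simply passes it through.

```shell
# Hypothetical helper: ask the server to load the samples.
# Arguments: host name, then credentials in the form userid:password.
load_samples() {
  host="$1"
  credentials="$2"
  # --fail makes curl exit with a non-zero status on HTTP errors (400 and above);
  # --silent hides the progress meter, and --show-error keeps error messages.
  "${CURL:-curl}" --fail --silent --show-error \
    --user "$credentials" \
    -X POST "https://${host}:8443/dashdb-api/analytics/public/samples/load"
}

# Usage (replace the host name, user ID, and password):
#   load_samples hostname "userid:password" || echo "Loading the samples failed" >&2
```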

To download a file from your $HOME/spark/apps directory to your client system, use the IBM Db2 Warehouse API to submit an HTTP GET request that calls the /dashdb-api/home endpoint. Add the path to the file to be downloaded as a suffix to the URL.

For example, to list the available samples, issue the following cURL command, which requests the spark/apps directory itself (replace the user ID, password, and host name):

curl --user "userid:password" \
  -X GET "https://hostname:8443/dashdb-api/home/spark/apps"

Issue the following cURL command to download idax_examples.jar (replace the user ID, password, and host name):

curl --user "userid:password" \
  -X GET "https://hostname:8443/dashdb-api/home/spark/apps/idax_examples.jar" > idax_examples-copy.jar
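
Because the download URL is the /dashdb-api/home endpoint followed by the file path, the request can be built from a host name and a file name. The following is a minimal sketch under that assumption: home_url, fetch_sample, and the CURL override variable are hypothetical names, and port 8443 is taken from the examples above.

```shell
# Hypothetical helper: build the /dashdb-api/home URL for a path relative to $HOME.
home_url() {
  printf 'https://%s:8443/dashdb-api/home/%s\n' "$1" "$2"
}

# Hypothetical helper: download one file from spark/apps into the current directory.
# Arguments: host name, credentials (userid:password), file name.
fetch_sample() {
  host="$1"
  credentials="$2"
  file_name="$3"
  # -o names the local output file; --fail makes curl exit with a non-zero
  # status on an HTTP error instead of treating the error page as the download.
  "${CURL:-curl}" --fail \
    --user "$credentials" \
    -o "$file_name" \
    -X GET "$(home_url "$host" "spark/apps/$file_name")"
}

# Usage (replace the host name, user ID, and password):
#   fetch_sample hostname "userid:password" idax_examples.jar
```

Using -o instead of shell redirection keeps the output file name inside the function, so the caller does not have to repeat it.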