Import a custom JDBC data source
If a user plans to connect to a relational database that is not supported by default, an IBM® Watson™ Studio Local administrator must first import a JDBC driver for that database.
For a list of supported data sources, see Add data sources and remote data sets.
To use data from an unsupported relational database, complete the following tasks:
Import a JDBC driver
This task must be completed by the IBM Watson Studio Local administrator.
To import a JDBC driver:
- From the command line, change to the /wdp/utils/ directory on the master node of the cluster.
- Enter the following command:

  moveJarClasspath.sh -add jarfile-name

  Replace jarfile-name with the fully qualified path to the JAR file that contains the JDBC driver. For example:

  moveJarClasspath.sh -add /wdp/tmp/postgresql-42.1.1.jar
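Putting both steps together, a session on the master node might look like this (the JAR location under /wdp/tmp is illustrative):

  cd /wdp/utils/
  moveJarClasspath.sh -add /wdp/tmp/postgresql-42.1.1.jar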
Remove a JDBC driver
This task must be completed by the IBM Watson Studio Local administrator.
To remove a JDBC driver:
- From the command line, change to the /wdp/utils/ directory on the master node of the cluster.
- Enter the following command:

  moveJarClasspath.sh -remove jarfile-name

  Replace jarfile-name with the name of the file (not the fully qualified path of the file). For example:

  moveJarClasspath.sh -remove postgresql-42.1.1.jar
Create a custom data source and remote data set
- From your project, go to the Data Sources page and click add data source.
- In the Create Data Source window, specify all of the required fields.
- In the Data Source Type pull-down menu, select Custom JDBC.
- Type in the JDBC driver class name, for example, org.postgresql.Driver.
- Type in the JDBC URL, for example, jdbc:postgresql://<host>:29922/compose. (Common driver class and URL patterns are listed below.)
- If you select the Shared check box, the data source and its credentials become accessible to all collaborators in the project. If you do not select Shared, each collaborator must provide their own credentials to use the data source.
All collaborators in a project can use a shared data source in their notebook, or add their own private data source. Collaborators with viewer privileges cannot edit data sources.
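The driver class name and URL format are defined by the database vendor's JDBC driver. For reference, two common patterns (host, port, and database names are placeholders):

  PostgreSQL:  org.postgresql.Driver    jdbc:postgresql://<host>:<port>/<database>
  MySQL:       com.mysql.jdbc.Driver    jdbc:mysql://<host>:<port>/<database>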
Load data in the notebook
To load data from a relational database into a notebook, you must manually add a connection to the database in your notebook.
The following table lists the recommended APIs for adding a connection:
| Language | Recommended APIs |
|---|---|
| Scala | Spark APIs, java.sql.{Connection,DriverManager} |
| Python | Spark APIs, jaydebeapi |
| R | Spark APIs |
You can use the following examples to get started:
- Scala syntax, using java.sql.{Connection,DriverManager}:

```scala
import java.sql.{Connection, DriverManager}

val url = "jdbc:postgresql://9.87.654.321:5432/postgres"
val driver = "org.postgresql.Driver"
val username = "postgres"
val password = "password"

var connection: Connection = null
try {
  // Register the driver and open the connection
  Class.forName(driver)
  connection = DriverManager.getConnection(url, username, password)
  val statement = connection.createStatement
  val rs = statement.executeQuery("SELECT * FROM myschema.cars")
  while (rs.next) {
    val make = rs.getString("make")
    println("make = %s".format(make))
  }
} catch {
  case e: Exception => e.printStackTrace
} finally {
  // Close the connection even if the query fails
  if (connection != null) connection.close()
}
```

- Python syntax, using jaydebeapi:
```python
import sys
import jaydebeapi

# Enter the values for your database connection
dsn_database = "postgres"      # e.g. "BLUDB"
dsn_hostname = "9.87.654.321"  # e.g. "hostname.site.com"
dsn_port = "5432"              # e.g. "50000"
dsn_uid = "postgres"           # e.g. "dash104434"
dsn_pwd = "password"           # e.g. "7dBZ3jWt9xN6$o0JiX!m"
driver_class = "org.postgresql.Driver"

connection_string = 'jdbc:postgresql://' + dsn_hostname + ':' + dsn_port + '/' + dsn_database
print(connection_string)

# The jaydebeapi.connect argument layout differs between Python 2 and Python 3
if sys.version_info >= (3, 0):
    conn = jaydebeapi.connect(driver_class, connection_string, [dsn_uid, dsn_pwd])
else:
    conn = jaydebeapi.connect(driver_class, [connection_string, dsn_uid, dsn_pwd])

curs = conn.cursor()
curs.execute("SELECT * FROM myschema.cars")
print(curs.fetchall())
curs.close()
conn.close()
```

- R syntax, using the Spark APIs:
```r
library(SparkR)

url <- "jdbc:postgresql://9.87.654.321:5432/postgres"
user <- "postgres"
password <- "password"
dbtable <- "myschema.cars"

sparkR.session()
df <- read.jdbc(url, dbtable, user = user, password = password)
collect(df)
```