Using project-lib for Python (deprecated)

If you need to interact with your Watson Studio projects and project assets from a notebook, you can use the project-lib library for Python. The library provides a programmatic interface to a project.

By using the project-lib library for Python, you can access project metadata and assets, including files and connections. The library also contains functions that simplify fetching files associated with the project.

The project-lib library for Python is deprecated and has been replaced by the ibm-watson-studio-lib library for Python. Although you can still use the project-lib library, you should start using the ibm-watson-studio-lib library in your notebooks. See ibm-watson-studio-lib for Python.

The project-lib functions

The instantiated project object that is created after you have imported the project-lib library exposes a set of functions that are grouped in the following way:

Fetch project information

You can use the following functions to fetch project-related information programmatically:

Fetch files

You can use the following function to fetch files associated with your project.

Save data

You can use the following function to save data to a file associated with your project. The function does two things: it writes the data to a file, and then it adds that file as a data asset to your project, so that the saved data appears in the list of data assets in your project.

save_data(file_name, data, set_project_asset=True, overwrite=False)

The function takes the following parameters:

Here is an example, which shows you how you can save data to a file:

# Import the lib
from project_lib import Project
project = Project.access()

# Assume that pandas_df is a pandas DataFrame that contains the data
# you want to save as a CSV file
project.save_data("file_name.csv", pandas_df.to_csv(index=False))

# the function returns a dict which contains the asset_id and file_name
# upon successful saving of the data
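
The following is a minimal sketch of how the save_data call could be wrapped when your data is not in a pandas DataFrame. The helper names rows_to_csv and save_rows are hypothetical and are not part of project-lib; the sketch assumes only the save_data signature and the returned dict (asset_id, file_name) described above.

```python
import csv
import io


def rows_to_csv(header, rows):
    """Serialize a header and a list of rows to CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


def save_rows(project, file_name, header, rows, overwrite=False):
    """Hypothetical helper: save rows as a CSV data asset.

    `project` is assumed to be the object returned by Project.access().
    Returns whatever save_data returns (a dict with asset_id and file_name).
    """
    csv_text = rows_to_csv(header, rows)
    return project.save_data(
        file_name, csv_text, set_project_asset=True, overwrite=overwrite
    )
```

In a notebook, you might call save_rows(project, "file_name.csv", ["id", "name"], [[1, "a"]], overwrite=True) and then read result["asset_id"] from the returned dict.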

Read data from a connection

You can use the following function to get the metadata (credentials) of a given connection.

get_connection: this function takes as input either the ID or the name of the connection. You can get these values by using the get_assets() function, which returns the id, name, and type of all the assets listed in the project.

The function get_connection returns the connection credentials which you can use to fetch data from the connection data source.
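
To look up a connection name or ID, you could filter the output of get_assets(). The sketch below is illustrative only: find_assets is a hypothetical helper, the sample list is made-up data, and the dictionary keys ("id", "name", "type") and the type value "connection" are assumptions based on the description above. Check the actual output of get_assets() in your project before relying on them.

```python
def find_assets(assets, asset_type=None, name=None):
    """Return the asset entries that match the given type and/or name.

    `assets` is assumed to be a list of dicts, one per project asset.
    """
    matches = []
    for asset in assets:
        if asset_type is not None and asset.get("type") != asset_type:
            continue
        if name is not None and asset.get("name") != name:
            continue
        matches.append(asset)
    return matches


# Made-up sample data, shaped like the assumed get_assets() output
sample_assets = [
    {"id": "1", "name": "sales.csv", "type": "data_asset"},
    {"id": "2", "name": "warehouse", "type": "connection"},
]

connections = find_assets(sample_assets, asset_type="connection")
```

In a notebook, you would pass project.get_assets() instead of the sample list.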

Here is an example, which shows you how you can fetch the credentials of a connection by using the get_connection function:


# Import the lib
from project_lib import Project
project = Project.access()

conn_creds = project.get_connection(name="<ConnectionName>")

For example, if your connection is a connection to dashDB, you can fetch your data by running the following code:

from pyspark.sql import SparkSession
spark = SparkSession.builder.getOrCreate()

host_url = "jdbc:db2://{}:{}/{}".format(conn_creds["host"], "50000", conn_creds["database"])
data_df = spark.read.jdbc(host_url, table="<TableName>", properties={"user": conn_creds["username"], "password": conn_creds["password"]})
data_df.show()
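
The URL and properties in the code above can also be built in small helper functions, which makes a missing credential easier to spot before Spark is involved. This is a sketch only: build_db2_jdbc_url and jdbc_properties are hypothetical names, and the sketch assumes the credential keys used above (host, database, username, password) and the same port, 50000.

```python
def _require_keys(creds, keys):
    """Raise a KeyError that names every missing credential key."""
    missing = [key for key in keys if key not in creds]
    if missing:
        raise KeyError("missing credential keys: {}".format(", ".join(missing)))


def build_db2_jdbc_url(creds, port="50000"):
    """Build a Db2 JDBC URL from a connection credentials dict."""
    _require_keys(creds, ["host", "database"])
    return "jdbc:db2://{}:{}/{}".format(creds["host"], port, creds["database"])


def jdbc_properties(creds):
    """Build the properties dict expected by spark.read.jdbc."""
    _require_keys(creds, ["username", "password"])
    return {"user": creds["username"], "password": creds["password"]}
```

With these helpers, the read becomes spark.read.jdbc(build_db2_jdbc_url(conn_creds), table="<TableName>", properties=jdbc_properties(conn_creds)).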

Fetch connected data

You can use the following function to fetch the credentials of connected data. The function returns a dictionary that contains the connection credentials in addition to a datapath attribute that points to specific data in that connection, for example, a table in a dashDB instance or a database in a Cloudant instance.

get_connected_data: this function takes as input either the ID or the name of the connected data. You can get these values by using the get_assets() function, which returns the id, name, and type of all the assets listed in the project.

Here is an example, which shows you how to fetch the credentials of connected data in a dashDB instance by using the get_connected_data function:

# Import the lib
from project_lib import Project
project = Project.access()

creds = project.get_connected_data(name="<ConnectedDataName>")
# creds is a dictionary that has the connection credentials in addition to
# a datapath that references a specific table in the database, for example: 
# creds: {'database': 'DB_NAME',
# 'datapath': '/DASHDB/SAMPLE_TABLE',
# 'host': 'hostname',
# 'password': 'XXXX',
# 'sg_service_url': 'https://sgmanager.ng.bluemix.net',
# 'username': 'XXXX'}
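
To read the referenced table, the datapath value usually has to be turned into a table name. The following sketch assumes that the datapath has the form /<schema>/<table>, as in the example above; other connection types may use a different layout. The helpers split_datapath and qualified_table_name are hypothetical and are not part of project-lib.

```python
def split_datapath(datapath):
    """Split a datapath such as '/DASHDB/SAMPLE_TABLE' into (schema, table)."""
    parts = [part for part in datapath.split("/") if part]
    if len(parts) != 2:
        raise ValueError(
            "expected a datapath of the form /<schema>/<table>, got: {!r}".format(datapath)
        )
    return parts[0], parts[1]


def qualified_table_name(datapath):
    """Return 'SCHEMA.TABLE' for use as a JDBC table name."""
    schema, table = split_datapath(datapath)
    return "{}.{}".format(schema, table)
```

For the example above, qualified_table_name(creds["datapath"]) gives a name that could be passed as the table argument of spark.read.jdbc.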

Learn more

See a demo of these functions in a blog post.
