Migrating from project-lib for Python to ibm-watson-studio-lib

The ibm-watson-studio-lib library is the successor of the project-lib library. Although you can continue to use the project-lib API in your notebooks, it is deprecated, and you should plan to migrate existing notebooks to the ibm-watson-studio-lib library.

Advantages of using ibm-watson-studio-lib include:

  • The ibm-watson-studio-lib library can be used in notebooks in projects and in notebooks that were promoted to a deployment space, without requiring modifications to the code. In project-lib, the scope is restricted to projects only.
  • The asset browsing API provides read-only access to all types of assets, not only those explicitly supported by the library.
  • ibm-watson-studio-lib uses a consistent API naming convention that structures available functions according to their area of application.

The following sections describe the changes you need to make in existing Python notebooks to start using the ibm-watson-studio-lib library.

Set up the library

You need to make the following changes in existing notebooks to start using ibm-watson-studio-lib:

In code using project-lib change:

from project_lib import Project
project = Project()

# or:
project = Project.access()

To the following using ibm-watson-studio-lib:

from ibm_watson_studio_lib import access_project_or_space
credentials_dic = {"project_id": "<ProjectId>", "token": "<ProjectToken>"}
wslib = access_project_or_space(params=credentials_dic)

Library usage

The following sections describe the code changes that you need to make in your notebooks when migrating functions in project-lib to the corresponding functions in ibm-watson-studio-lib.

Get project information

To fetch project-related information programmatically, you need to change the following functions:

List data connections

In code using project-lib change:

project.get_connections()

To the following using ibm-watson-studio-lib:

assets = wslib.list_connections()
wslib.show(assets)

Alternatively, with ibm-watson-studio-lib, you can list connected data assets:

assets = wslib.list_connected_data()
wslib.show(assets)

List data files

This function returns the list of the data files in your project.

In code using project-lib change:

project.get_files()

To the following using ibm-watson-studio-lib:

assets = wslib.list_stored_data()
wslib.show(assets)

Get name or description

In ibm-watson-studio-lib, you can retrieve any metadata about the project, for example the name of a project or its description, via the entrypoint wslib.here.

In code using project-lib change:

name = project.get_name()
desc = project.get_description()

To the following using ibm-watson-studio-lib:

name = wslib.here.get_name()
desc = wslib.here.get_description()

Get metadata

There is no direct replacement in ibm-watson-studio-lib for the project-lib function get_metadata:

project.get_metadata()

The entry point wslib.here in ibm-watson-studio-lib exposes parts of this information. To see what project metadata information is available, use:

help(wslib.here.API)

For example:

  • wslib.here.get_name(): Returns the project name
  • wslib.here.get_description(): Returns the project description
  • wslib.here.get_ID(): Returns the project ID
  • wslib.here.get_storage(): Returns the storage metadata

Get storage metadata

In code using project-lib change:

project.get_storage_metadata()

To the following using ibm-watson-studio-lib:

wslib.here.get_storage()

Fetch data

To access data in a file, you need to change the following functions.

In code using project-lib change:

buffer = project.get_file("MyAssetName.csv")

# or, without direct storage access:
buffer = project.get_file("MyAssetName.csv", direct_storage=False)

# or:
buffer = project.get_file("MyAssetName.csv", direct_os_retrieval=False)

To the following using ibm-watson-studio-lib:

buffer = wslib.load_data("MyAssetName.csv")

Additionally, ibm-watson-studio-lib offers a function to download a data asset and store it in the local file system:

info = wslib.download_file("MyAssetName.csv", "MyLocalFile.csv")
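
If existing notebook code parsed the buffer returned by project.get_file, the same parsing should carry over to the buffer returned by wslib.load_data. The following is a minimal sketch, assuming the returned object is a binary file-like buffer (such as io.BytesIO) that holds UTF-8 encoded CSV text. The helper rows_from_buffer is hypothetical and not part of either library, and an in-memory buffer stands in for the wslib.load_data call so that the sketch runs outside a notebook.

```python
import csv
import io


def rows_from_buffer(buffer, encoding="utf-8"):
    """Decode a binary file-like buffer holding CSV text into a list of dict rows.

    Hypothetical helper; assumes the buffer supports seek() and read().
    """
    buffer.seek(0)  # rewind in case the buffer was already read
    text = buffer.read().decode(encoding)
    return list(csv.DictReader(io.StringIO(text, newline="")))


# Stand-in for: buffer = wslib.load_data("MyAssetName.csv")
buffer = io.BytesIO(b"id,name\n1,alpha\n2,beta\n")
rows = rows_from_buffer(buffer)
```

In a notebook, the stand-in line would be replaced by the wslib.load_data call shown above.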

Save data

To save data to a file, you need to change the following functions.

In code using project-lib change (and for all variations of direct_storage=False and set_project_asset=True):

project.save_data("NewAssetName.csv", data)
project.save_data("MyAssetName.csv", data, overwrite=True)

To the following using ibm-watson-studio-lib:

asset = wslib.save_data("NewAssetName.csv", data)
wslib.show(asset)
asset = wslib.save_data("MyAssetName.csv", data, overwrite=True)
wslib.show(asset)

Additionally, ibm-watson-studio-lib offers a function to upload a local file to the project or space storage and create a data asset:

asset = wslib.upload_file("MyLocalFile.csv", "MyAssetName.csv")
wslib.show(asset)
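
The data argument in the calls above has to be prepared before it is saved. The following is a minimal sketch, assuming that wslib.save_data accepts the content as bytes. The helper rows_to_csv_bytes is hypothetical and not part of the library; it only shows one way to serialize rows to CSV bytes with the Python standard library.

```python
import csv
import io


def rows_to_csv_bytes(rows, fieldnames, encoding="utf-8"):
    """Serialize a list of dict rows to CSV and return the result as bytes.

    Hypothetical helper; uses "\n" line endings for a predictable result.
    """
    text = io.StringIO(newline="")
    writer = csv.DictWriter(text, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return text.getvalue().encode(encoding)


data = rows_to_csv_bytes([{"id": 1, "name": "alpha"}], ["id", "name"])

# In a notebook (not executed here, the library is only assumed to be available):
# asset = wslib.save_data("NewAssetName.csv", data)
```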

Get connection information

To return the metadata associated with a connection, you need to change the following functions.

In code using project-lib change:

connprops = project.get_connection(name="MyConnection")

To the following using ibm-watson-studio-lib:

connprops = wslib.get_connection("MyConnection")

Get connected data information

To return the metadata associated with a connected data asset, you need to change the following functions.

In code using project-lib change:

dataprops = project.get_connected_data(name="MyConnectedData")

To the following using ibm-watson-studio-lib:

dataprops = wslib.get_connected_data("MyConnectedData")

Access asset by ID instead of name

You can return the metadata of a connection or connected data asset by accessing the asset by ID instead of by name.

In code using project-lib change:

connprops = project.get_connection(id="xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx")

# or:
connprops = project.get_connection("xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx")

# or:
dataprops = project.get_connected_data(id="xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx")

# or:
dataprops = project.get_connected_data("xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx")

To the following using ibm-watson-studio-lib:

connprops = wslib.by_id.get_connection("xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx")
dataprops = wslib.by_id.get_connected_data("xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx")

In project-lib, it is not possible to access files (stored data assets) by ID. You can only do this by name. The ibm-watson-studio-lib library supports accessing files by ID. See Using ibm-watson-studio-lib.
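
Because project-lib accepted either a name or an ID in the same positional argument of get_connection, migrated code may need to decide which ibm-watson-studio-lib call to use. The following is a minimal sketch, assuming that asset IDs are canonical UUID strings as in the placeholders above and that no connection name has that shape. The helpers looks_like_asset_id and get_connection_props are hypothetical, and a stand-in object replaces wslib so that the sketch runs outside a notebook.

```python
import uuid
from types import SimpleNamespace


def looks_like_asset_id(value):
    """Return True if value is a canonical, hyphenated UUID string."""
    try:
        return str(uuid.UUID(value)) == value.lower()
    except (ValueError, AttributeError, TypeError):
        return False


def get_connection_props(wslib, name_or_id):
    """Dispatch to the by-ID or by-name call depending on the argument's shape."""
    if looks_like_asset_id(name_or_id):
        return wslib.by_id.get_connection(name_or_id)
    return wslib.get_connection(name_or_id)


# Stand-in that only records which entry point was used; not the real library.
fake_wslib = SimpleNamespace(
    get_connection=lambda name: ("by_name", name),
    by_id=SimpleNamespace(get_connection=lambda asset_id: ("by_id", asset_id)),
)

by_name = get_connection_props(fake_wslib, "MyConnection")
by_id = get_connection_props(fake_wslib, "123e4567-e89b-12d3-a456-426614174000")
```

Where the call site already knows whether it holds a name or an ID, calling the matching function directly is simpler than dispatching on the string's shape.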

Fetch assets by asset type

When you retrieve the list of all project assets, you can pass the optional parameter asset_type to the function get_assets which allows you to filter assets by type. The accepted values for this parameter in project-lib are data_asset, connection and asset.

In code using project-lib change:

project.get_assets()

# Or, for a supported asset type:
project.get_assets("<asset_type>")

# Or:
project.get_assets(asset_type="<asset_type>")

To the following using ibm-watson-studio-lib:

assets = wslib.assets.list_assets("asset")
wslib.show(assets)

# Or, for a specific asset type:
assets = wslib.assets.list_assets("<asset_type>")

# Example, list all notebooks:
notebook_assets = wslib.assets.list_assets("notebook")
wslib.show(notebook_assets)

To list the available asset types, use:

assettypes = wslib.assets.list_asset_types()
wslib.show(assettypes)

Spark support

To work with Spark, you need to change the functions that enable Spark support and that retrieve the URL of a file.

Set up Spark support

To set up Spark support:

In code using project-lib change:

# Provide SparkContext during setup
from project_lib import Project
project = Project(sc,"<ProjectId>","<ProjectToken>")

To the following using ibm-watson-studio-lib:

from ibm_watson_studio_lib import access_project_or_space
wslib = access_project_or_space({'token':'<ProjectToken>'})

# provide SparkContext in a subsequent step
wslib.spark.provide_spark_context(sc)

Retrieve URL to access a file from Spark

To retrieve a URL to access a file referenced by an asset from Spark via Hadoop:

In code using project-lib change:

url = project.get_file_url("MyAssetName.csv")
# or
url = project.get_file_url("MyAssetName.csv", direct_storage=False)
# or
url = project.get_file_url("MyAssetName.csv", direct_os_retrieval=False)

To the following using ibm-watson-studio-lib:

url = wslib.spark.get_data_url("MyAssetName.csv")

Get file URL for usage with Spark

Retrieve a URL to access a file in the project storage directly from Spark via Hadoop.

In code using project-lib change:

project.get_file_url("MyFileName.csv", direct_storage=True)
# or
project.get_file_url("MyFileName.csv", direct_os_retrieval=True)

To the following using ibm-watson-studio-lib:

wslib.spark.storage.get_data_url("MyFileName.csv")

Access project storage directly

You can fetch data from the project storage or save data to the project storage without synchronising the project assets.

Fetch data

To fetch data from the project storage:

In code using project-lib change:

project.get_file("MyFileName.csv", direct_storage=True)

# Or:
project.get_file("MyFileName.csv", direct_os_retrieval=True)

To the following using ibm-watson-studio-lib:

wslib.storage.fetch_data("MyFileName.csv")

Save data

To save data to a file in the project storage:

In code using project-lib change:

# Save and do not create an asset in a project
project.save_data("NewFileName.csv", data, direct_storage=True)

# Or:
project.save_data("NewFileName.csv", data, set_project_asset=False)

To the following using ibm-watson-studio-lib:

wslib.storage.store_data("NewFileName.csv", data)

In code using project-lib change:

# Save (and overwrite if file exists) and do not create an asset in the project
project.save_data("MyFileName.csv", data, direct_storage=True, overwrite=True)

# Or:
project.save_data("MyFileName.csv", data, set_project_asset=False, overwrite=True)

To the following using ibm-watson-studio-lib:

wslib.storage.store_data("MyFileName.csv", data, overwrite=True)

Additionally, ibm-watson-studio-lib provides a function to download a file from the project or space storage to the local file system:

wslib.storage.download_file("MyStorageFile.csv", "MyLocalFile.csv")

You can also register a file in the project or space storage as a data asset using:

wslib.storage.register_asset("MyStorageFile.csv", "MyAssetName.csv")
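
The mappings described in this topic can be collected into a lookup table to help locate project-lib calls in existing notebook code. The following is a minimal sketch, not a complete migration tool: the scanner find_project_lib_calls and the REPLACEMENTS table are hypothetical, the hints only restate the sections above, and the sketch assumes that the project-lib object is bound to a variable named project.

```python
import re

# Replacement hints restating the sections of this topic.
REPLACEMENTS = {
    "get_connections": "wslib.list_connections()",
    "get_files": "wslib.list_stored_data()",
    "get_name": "wslib.here.get_name()",
    "get_description": "wslib.here.get_description()",
    "get_metadata": "no direct replacement; see help(wslib.here.API)",
    "get_storage_metadata": "wslib.here.get_storage()",
    "get_file": "wslib.load_data(...) or wslib.storage.fetch_data(...)",
    "save_data": "wslib.save_data(...) or wslib.storage.store_data(...)",
    "get_connection": "wslib.get_connection(...) or wslib.by_id.get_connection(...)",
    "get_connected_data": "wslib.get_connected_data(...) or wslib.by_id.get_connected_data(...)",
    "get_assets": "wslib.assets.list_assets(...)",
    "get_file_url": "wslib.spark.get_data_url(...) or wslib.spark.storage.get_data_url(...)",
}

# Matches calls such as project.get_file( on a variable named exactly "project".
CALL_PATTERN = re.compile(r"\bproject\.(\w+)\s*\(")


def find_project_lib_calls(source):
    """Return (line number, function name, hint) for each known project-lib call."""
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in CALL_PATTERN.finditer(line):
            name = match.group(1)
            if name in REPLACEMENTS:
                findings.append((lineno, name, REPLACEMENTS[name]))
    return findings


sample = (
    'buffer = project.get_file("MyAssetName.csv")\n'
    "name = project.get_name()\n"
    "other = something_else.get_file()\n"
)
findings = find_project_lib_calls(sample)
for lineno, name, hint in findings:
    print(f"line {lineno}: project.{name} -> {hint}")
```

Each finding still needs manual review, because several project-lib functions map to different ibm-watson-studio-lib functions depending on the arguments that were passed, for example direct_storage=True.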

Learn more

To use the ibm-watson-studio-lib library for Python in notebooks, see ibm-watson-studio-lib for Python.
