IBM Support

How to read a file in the notebook from Cloud Object Storage?

Question & Answer


Question

How to read a file in the notebook from Cloud Object Storage?

Answer


The notebook in Watson Studio can insert auto-generated code to read .csv files. However, if you upload any other type of file, it does not generate code to read the file. Instead, it will likely insert a StreamingBody object or a SparkSession setup. For example, if you upload a pickle file and insert a StreamingBody object, you get this result:

# Your data file was loaded into a botocore.response.StreamingBody object.
# Please read the documentation of ibm_boto3 and pandas to learn more about your possibilities to load the data.
# ibm_boto3 documentation: https://ibm.github.io/ibm-cos-sdk-python/
# pandas documentation: http://pandas.pydata.org/
streaming_body_1 = client_a9bbfb9f99684afe9ec11076b75f1831.get_object(Bucket='catalogdsxreproduce4a77ab6a4f2f47b3b6bedc7174a64c4a', Key='test.pickle')['Body']

# add missing __iter__ method, so pandas accepts body as file-like object
if not hasattr(streaming_body_1, "__iter__"): streaming_body_1.__iter__ = types.MethodType( __iter__, streaming_body_1 )

You now have a StreamingBody object, which is simply the body of the HTTP response that the boto client returns.
Complete the following steps to read the file:
  1. Read the object into memory using the following command:
    readrawdata = streaming_body_1.read()
    Note: Call read() on streaming_body_1 only once. The first call consumes the stream, so any subsequent read() returns an empty bytes object.

  2. Wrap the data in a BytesIO object so that you can read it with the pickle library or any other connector library that expects a file-like object. For example, you might want to read an Excel file using xlrd. Complete these steps:
    1. Import BytesIO using the following command:
      from io import BytesIO
    2. Import the connector library using the following command:
      import pickle
      The pickle module ships with Python. If you use a connector library that is not already installed, you can install it using !pip install --user pypi-packagename
  3. Read it using your connector library function. For example, you might use pickle.load():
    df = pickle.load(BytesIO(readrawdata))
To view the contents of the loaded object, such as a dataframe, run df in a notebook cell.
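
The steps above can be sketched end to end as follows. This is a minimal illustration, not the code that Watson Studio generates: because a Cloud Object Storage connection is not available outside the notebook, a BytesIO object stands in for streaming_body_1 (an assumption made only so that the example runs anywhere), and the sample dictionary named original is hypothetical data. In a notebook, use the streaming_body_1 object from the inserted code instead.

```python
import pickle
from io import BytesIO

# Hypothetical sample data, pickled so there is something to load.
original = {"a": [1, 2, 3], "b": "text"}

# Stand-in for the StreamingBody returned by get_object(...)['Body'].
# BytesIO is used here only because it offers the same read() method.
streaming_body_1 = BytesIO(pickle.dumps(original))

# Step 1: read the whole object into memory, exactly once.
readrawdata = streaming_body_1.read()

# A second read() finds the stream already consumed and returns empty bytes.
second_read = streaming_body_1.read()

# Steps 2 and 3: wrap the bytes in BytesIO and pass them to the loader.
df = pickle.load(BytesIO(readrawdata))
```

Keeping the raw bytes in readrawdata means you can create a fresh BytesIO wrapper as often as needed, for example to retry with a different connector library, without reading the stream again.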


Document Information

More support for:
IBM Watson Studio Cloud

Software version:
All Versions

Document number:
963490

Modified date:
01 August 2019

UID

ibm1KB0011031
