Question & Answer
Question
How do I read a file from Cloud Object Storage in a notebook?
Answer
The notebook in Watson Studio can insert auto-generated code to read .csv files. However, if you upload any other type of file, it does not auto-generate code to read the file. Instead, it will likely insert a StreamingBody object or a SparkSession setup. For example, if you upload a pickle file and insert a StreamingBody object, you get this result:
# Your data file was loaded into a botocore.response.StreamingBody object.
# Please read the documentation of ibm_boto3 and pandas to learn more about your possibilities to load the data.
# ibm_boto3 documentation: https://ibm.github.io/ibm-cos-sdk-python/
# pandas documentation: http://pandas.pydata.org/
streaming_body_1 = client_a9bbfb9f99684afe9ec11076b75f1831.get_object(Bucket='catalogdsxreproduce4a77ab6a4f2f47b3b6bedc7174a64c4a', Key='test.pickle')['Body']
# add missing __iter__ method, so pandas accepts body as file-like object
if not hasattr(streaming_body_1, "__iter__"): streaming_body_1.__iter__ = types.MethodType( __iter__, streaming_body_1 )
To read the file from the StreamingBody object, complete these steps:
- Read the object into memory using the following command:
readrawdata = streaming_body_1.read()
Note: Calling read() on streaming_body_1 again returns empty bytes (b''), because the stream has already been consumed, so execute this call just once.
- Convert the object to BytesIO so that you can read it with pickle or any other connector library. For example, you might want to read an Excel file using xlrd. Complete these steps:
- Import BytesIO using the following command:
from io import BytesIO
- Import the connector library using the following command:
import pickle
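The pickle module is part of the Python standard library. If your connector library is not already installed, you can install it using !pip install --user pypi-packagename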
- Read it using your connector library function. For example, you might use load():
df = pickle.load(BytesIO(readrawdata))
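The steps above can be combined into one small helper. The following is a minimal sketch, not part of the auto-generated notebook code: the function name load_pickle_from_body is hypothetical, and a BytesIO object stands in for streaming_body_1 so that the example runs without a Cloud Object Storage connection. It assumes only that the body exposes a read() method, as StreamingBody does.

```python
import pickle
from io import BytesIO


def load_pickle_from_body(body):
    """Read a file-like body once and unpickle its contents.

    `body` is any object with a read() method, for example the
    StreamingBody returned by get_object(...)['Body'].
    (Hypothetical helper, named here for illustration only.)
    """
    readrawdata = body.read()  # consumes the stream, so call it only once
    # Wrap the raw bytes in BytesIO so pickle.load gets a file-like object.
    # Only unpickle data that comes from a source you trust.
    return pickle.load(BytesIO(readrawdata))


# Stand-in for streaming_body_1: an in-memory stream holding pickled data.
sample = {"name": "test", "values": [1, 2, 3]}
fake_body = BytesIO(pickle.dumps(sample))

df = load_pickle_from_body(fake_body)
leftover = fake_body.read()  # a second read finds the stream already consumed

print(df)
print(leftover)
```

In a notebook, you would pass streaming_body_1 to the helper instead of fake_body. Because the raw bytes are held in memory, the same BytesIO approach should also work for other libraries that accept a file-like object.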
Document Information
More support for:
IBM Watson Studio Cloud
Software version:
All Versions
Document number:
963490
Modified date:
01 August 2019
UID
ibm1KB0011031