Flight service in R notebooks
You can use Flight service and the Apache Arrow Flight protocol to read data from and write data to data assets in a project or space. These data assets can be files in the storage associated with your current project or space, or data accessed through a database connection.
Loading data with generated code using Flight service
When you load data from a project asset into a notebook, the generated code uses pyarrow to invoke Flight service. The pyarrow function calls are not visible because they are wrapped in higher-level functions provided by another library called itc_utils.
The itc_utils library is pre-installed in all notebook runtime environments provided by IBM. Its purpose is to reduce code size and make the code more readable. To achieve this, itc_utils uses information from the runtime environment and from the ibm_watson_studio_lib library.
Another significant advantage of the generated code is that the data request identifies the asset by name, through one of the properties 'connection_name', 'connected_data_name', or 'data_name', depending on the kind of asset for which you generate the code. itc_utils converts these properties into an asset_id, plus a project_id or space_id, before creating a flight descriptor.
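As a rough sketch, requests for the three kinds of asset might be built as follows. The property names come from the paragraph above, and the connection request mirrors the generated code. The asset names "MyConnectedTable" and "myfile.csv" are hypothetical placeholders, the mapping of each property to an asset kind is an assumption, and a given asset may need further properties that are omitted here.

```r
library("reticulate")

# Build one request per kind of asset; only one of them is passed to
# get_flight_info at a time. Asset names are hypothetical placeholders.

# Data accessed through a connection (same shape as the generated code)
connection_request <- dict(
    "connection_name" = "MyConnection",
    "interaction_properties" = dict(
        "schema_name" = "<schema>",
        "table_name" = "<table>"
    )
)

# A connected data asset (assumed to need only its name)
connected_data_request <- dict("connected_data_name" = "MyConnectedTable")

# A file in the project or space storage (assumed to need only its name)
file_request <- dict("data_name" = "myfile.csv")
```

Whichever request you build, it takes the place of nb_data_request in the call to get_flight_info.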
Example of the code that is generated to load data:
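# Load reticulate to call Python libraries from R, and arrow to handle the result tables
library("reticulate")
library("arrow")

# Import the Flight service helpers from itc_utils and create a client
itcfs <- import("itc_utils.flight_service")
readClient <- itcfs$get_flight_client()

# Describe the data to read: the connection by name, plus schema, table, and row limit
nb_data_request = dict(
    "connection_name" = "MyConnection",
    "interaction_properties" = dict(
        "row_limit" = 500,
        "schema_name" = "<schema>",
        "table_name" = "<table>"
    )
)

# Resolve the request, read the tables, and convert the first one to a data frame
flightInfo <- itcfs$get_flight_info(readClient, nb_data_request=nb_data_request)
tables <- itcfs$read_tables(readClient, flightInfo, timeout=240)
data_df_1 <- as.data.frame(tables[[1]])
head(data_df_1)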
You can use the functions provided by itc_utils to extend code that uses the pyarrow library. Because itc_utils is based on pyarrow, you can pick specific functions of itc_utils and combine them with your pyarrow code.
The itc_utils library provides helper functions that you can use to make your programs easier to read. IBM can remove this library at any time, if deemed necessary, and can change its functions without prior notice.
For a list of the itc_utils functions, which can greatly simplify your code development, see Using itc_utils with your own code. Note that the descriptor and request syntax and all function calls in the examples there are in Python notation, which you need to adapt for R.
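To illustrate that adaptation, the sketch below pairs Python notation (in the comments) with an R equivalent through reticulate. It uses only the functions that appear in the generated code earlier in this section. The asset name "myfile.csv" is a hypothetical placeholder, and the py_module_available guard is an addition so that the snippet also runs where itc_utils is not installed.

```r
library("reticulate")
library("arrow")

# Python: nb_data_request = {"data_name": "myfile.csv"}
# R:      a Python dict literal becomes a dict() call
nb_data_request <- dict("data_name" = "myfile.csv")  # hypothetical asset name

# itc_utils exists only in IBM-provided runtimes, so guard the calls
if (py_module_available("itc_utils")) {
    # Python: import itc_utils.flight_service as itcfs
    itcfs <- import("itc_utils.flight_service")

    # Python: read_client = itcfs.get_flight_client()
    # R:      "." becomes "$" when calling into a module or object
    readClient <- itcfs$get_flight_client()

    # Python: flight_info = itcfs.get_flight_info(read_client, nb_data_request=nb_data_request)
    flightInfo <- itcfs$get_flight_info(readClient, nb_data_request = nb_data_request)

    # Python: tables = itcfs.read_tables(read_client, flight_info, timeout=240)
    tables <- itcfs$read_tables(readClient, flightInfo, timeout = 240)

    # R indexing is 1-based: tables[[1]] is the first table
    data_df_1 <- as.data.frame(tables[[1]])
}
```

Other differences to watch for when translating: Python True, False, and None correspond to TRUE, FALSE, and NULL in R, and a plain R number such as 500 is passed to Python as a float, so write 500L where a Python integer is required.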
If you encounter timeout errors, see Hints and tips around timeouts. Here too, all function calls in the examples are in Python notation, which you would need to modify appropriately for R.