Reading data from a data source
You can read Cognos Analytics data in a notebook using the Python or R programming languages.
- Uploaded CSV or XLS files
- Data sets
- Data modules
- Framework Manager packages, including OLAP data
read_data()
method in a notebook cell, do the following steps:- Create or edit a notebook and position your cursor in the cell after which you want to do the read.
- Click Sources .
- Navigate to a data source, select it, and then click Open. The method is inserted in the cell with the data source to read specified.
- In Python
-
data = CADataConnector.read_data(parameters as described in the following sections)
- In R
-
data <- CADataConnector$read_data(parameters as described in the following sections)
- DataFrame
- Returned by default.
- List of DataFrame
- Returned when the sheet_name parameter is specified and multiple sheets are requested.
- Iterator of DataFrame
- Returned when the chunksize or iterator parameter is specified.
Parameters common to all data sources
For examples of how to code the read_data()
method, see Python notebook examples.
Parameter | Required or optional | Description |
---|---|---|
path id |
Specify one of path or id |
Path to the data source. If the data source is in My content, specify
.my_folders at the start of the path. If the data source is in Team
content, specify .public_folders at the start of the path. For
example, to specify a file called sales-notebook that is stored in
My content, specify
The ID of the file. For information about how to get the ID of a file, see Finding the ID of a file. For
example:
|
usecols | optional |
List of integers or strings that identifies a subset of columns to be returned. You can specify
one or more columns by the position number of the column in the file or by column name. For example,
to request columns City and Quantity, specify
If not specified, then all columns are returned.
Note: This parameter is not applicable for OLAP
data in a Framework Manager package.
|
chunksize | optional |
An integer. If specified, the method returns an iterator. The value specifies the number of rows
to return each time the method is invoked. For example, to return 3 rows,
specify
If both chunksize and iterator are not specified, then a DataFrame is returned. |
iterator | optional |
If specified, the method returns an iterator. Specifying
without chunksize specified returns one row
each time the method is invoked.Specifying Reset the iterator using the reset method:
|
nrows | optional |
The maximum number of rows returned. Specify an integer. If omitted, the maximum number of rows returned is 10,000. If
is
specified, then the column headings are returned. |
XLS file parameters
Uploaded files are treated as sheets of data. Reading one of these data sources returns the rows for the specified sheet of data in the data source.
- sheet_name
-
Optional. Integer, string, or a list of integers or strings. Specifies the integer position of a specific sheet or tab in an XLS file. The first sheet is 0. If you specify a list of strings or integers, then a list of the corresponding sheets is returned. If not specified, the value defaults to 0. Example:
- In Python
-
sheet_name=["sheet2","sheet3"]
- In R
-
sheet_name=list("sheet2","sheet3")
Data module parameters
Reading data from a data module must reflect the modeling done on the data source. For example, a join between two tables and aggregation types to apply.
Parameter | Required or optional | Description |
---|---|---|
table_name | Optional |
String or list of strings. Restrict the scope of the request to specific table(s) in the data module. Specify the table(s) by table name(s). If you omit table_name, the
read_data() method returns a
DataFrame that contains the names of the tables defined in the data module. Example:
|
calculation | Optional | String or list of strings. Read a calculation from a data module. Specify a calculated column in the data module. |
Package parameters
Reading data from a package must reflect the modeling done on the data source. For example, the relationship between objects in the package.
Parameter | Required or optional | Description |
---|---|---|
query_subject | Optional | String or list of strings. Restrict the scope of the request to the specific query
subjects(s) in the package. Specify the query subject(s) by query subject name(s).
Example:
|
folder_name | Optional | String or list of strings. Restrict the scope of the request to a folder in the package.
Displays the contents of all folder(s) with that name in the package. If you want to see only the
content of folder c that's in folder b that's in folder
a specify the folder_name parameter as a list:
|
metadata | Optional. Applies only to OLAP data in a Framework Manager package. | Value can be True or False. Specifying
returns the
query subjects in the package. Specifying returns the data in
the package. If not specified, then a default value of False is used. |