Adding data from other sources
You can add IoT and non-IoT data for devices from other tables in your database into the data lake by using some of the built-in functions or by extending some of the base functions.
Adding time series data
You might want to augment the time series data that was ingested by Maximo® Monitor with time series data for the same devices from another source. The data must be stored relationally in the other tables.
Extend the BaseDataSource class in the base.py module in your custom function. The base function is designed to
work with time series data.
When you merge time series data, you must align time stamps between the source and the target. In your custom function, your merge strategy might be to treat the incoming time stamps as the primary time stamps. Alternatively, you might decide to create a new time stamp.
Example: Adding time series data
The following data is stored for a container device type in the data lake.
| Time stamp | Device ID | Pressure |
|---|---|---|
| 1 Nov 8:00 | 8039 | 80.3 |
| 1 Nov 8:00 | 8040 | 80.4 |
| 1 Nov 8:04 | 8039 | 80.3 |
| 1 Nov 8:04 | 8040 | 80.5 |
The following production data is stored in an external system for the same devices:
| Timestamp | Device ID | Rejected_QTY |
|---|---|---|
| 1 Nov 8:01 | 8039 | 3 |
| 1 Nov 8:01 | 8040 | 2 |
| 1 Nov 8:05 | 8039 | 3 |
| 1 Nov 8:05 | 8040 | 2 |
In your custom function, you specify the items to merge and how to align the time stamps. In this example, you set the merge strategy to merge to the nearest time stamp. After the function runs, the container data is updated in the data lake to include the number of rejected devices for each container.
| Time stamp | Device ID | Pressure | Rejected_qty |
|---|---|---|---|
| 1 Nov 8:01 | 8039 | 80.3 | 3 |
| 1 Nov 8:01 | 8040 | 80.4 | 2 |
| 1 Nov 8:05 | 8039 | 80.3 | 3 |
| 1 Nov 8:05 | 8040 | 80.5 | 2 |
For device 8040, you can see that the 8.05 time stamp includes data from the 8.04 time stamp in Table 1.
Adding non-time variant data
You might want to augment the time series data that was ingested by Maximo Monitor with non-time variant data for the same devices from another source.
You can use the DataBaseLookup built-in function. When you configure the function, specify the table name, the items to look up, and the key value that maps data items for the device type to the data in the external table.
For example, you might want to look up EmployeeCount and Country from a Company table and use the country_code field as the key.
If the built-in function does not meet your needs, extend the BaseDatabasLookup base function in a custom function.
Example: Adding non-time variant data
The following data is stored for a container device type in the data lake.
| Time stamp | Device ID | Pressure |
|---|---|---|
| 1 Nov 8:00 | 8039 | 80.3 |
| 1 Nov 8:00 | 8040 | 80.4 |
| 1 Nov 8:04 | 8039 | 80.3 |
| 1 Nov 8:04 | 8040 | 80.5 |
The following device data is stored for the same devices in a device master table :
| Asset ID | Device ID |
|---|---|
| A47 | 8039 |
| A48 | 8040 |
In your custom function, you specify the key that is shared between both tables and the items to import. In this example, after the function runs, the container data is updated in the data lake to include the device ID for each container.
| Time stamp | Device ID | Pressure | Asset ID |
|---|---|---|---|
| 1 Nov 8:01 | 8039 | 80.3 | A47 |
| 1 Nov 8:01 | 8040 | 80.4 | A48 |
| 1 Nov 8:05 | 8039 | 80.3 | A47 |
| 1 Nov 8:05 | 8040 | 80.5 | A48 |
Adding slow changing dimensions
You might want to add slow changing dimensions (SCDs) from another source to enhance your data. Slow changing dimensions are aspects of your data that do not change frequently. For example, your device might change location from one building to another once a year.
You can use the SCDLookup.py built in function to add an SCD property as a data item. When you configure the function, you specify the table name that holds the property.
The table must include the following columns:
-
start_date -
end_date -
device_id -
property
For example, you might want to look up location for devices from a
Location table and add location as a dimension.
Example: Adding slow changing dimensions
The following data is stored for a container device type in the data lake.
| Time stamp | Device ID | Pressure |
|---|---|---|
| 1 Nov 8:00 | 8039 | 80.3 |
| 2 Nov 8:00 | 8039 | 80.3 |
| 3 Nov 8:00 | 8039 | 80.5 |
The following location data is stored for the same devices in a Location table:
| Property | Device ID | start_date | end_date |
|---|---|---|---|
| London | 8039 | 1 Nov 8:00 | Nov 2 8:15 |
| New York | 8039 | 2 Nov 8:16 | Nov 4 8:00 |
In the SCDLookup function, you specify the table name. In this example,
after the function runs, the container data is updated in the data lake to include the location of
each container at the timestamp.
| Time stamp | Device ID | Pressure | Location |
|---|---|---|---|
| 1 Nov 8:00 | 8039 | 80.3 | London |
| 2 Nov 8:00 | 8039 | 80.3 | London |
| 3 Nov 8:00 | 8039 | 80.5 | New York |
Adding activity data
You might want to enhance the time series data from IoT devices with activity data for the same devices from another source. For example, you might store information about scheduled maintenance and refuel activities in an Activity table in the database.
You can use the ActivityDuration built-in function. When you configure the function, specify the table name and one or more activities codes.
The table must include the following columns:
- Activity code
- start date
- end date
- deviceid
The function returns an activity duration for each activity code.
If the built-in function does not meet your needs, extend the BaseDBActivityMerge base function in a custom function.
Example: Adding activity data
The following data is stored for a container device type in the data lake.
| Time stamp | Device ID | Pressure |
|---|---|---|
| 1 Nov 8:00 | 8039 | 80.3 |
| 1 Nov 8:00 | 8040 | 80.4 |
| 1 Nov 11:00 | 8039 | 80.7 |
| 1 Nov 11:00 | 8040 | 80.7 |
| 1 Nov 12:00 | 8039 | 80.3 |
| 1 Nov 12:00 | 8040 | 80.4 |
The following activity data is stored for the same devices in a device master table:
| Device ID | Activity Code | Start Time | End Time |
|---|---|---|---|
| 8039 | scheduled_maint | 8:00 | 16:00 |
| 8040 | scheduled_maint | 8:00 | 16:00 |
| 8039 | refuel | 11:00 | 12:00 |
| 8040 | refuel | 11:00 | 12:00 |
After the activity duration function runs, the container data is updated in the data lake to include the duration of the specified activity for each device type.
| Timestamp | Device ID | Pressure | Scheduled Maintenance | Refueling |
|---|---|---|---|---|
| 1 Nov 8:00 | 8039 | 80.3 | 3 hours | |
| 1 Nov 8:00 | 8040 | 80.4 | 3 hours | |
| 1 Nov 11:00 | 8039 | 80.7 | 1 hour | |
| 1 Nov 11:00 | 8040 | 80.7 | 1 hour | |
| 1 Nov 12:00 | 8039 | 80.3 | 4 hours | |
| 1 Nov 12:00 | 8040 | 80.4 | 4 hours |
Adding shift data from a calendar
You might want to enhance the time series data from IoT devices with shift data for the same devices from a calendar.
You can use the ShiftCalendar built-in function. When you configure the function, specify the shift definition in JSON format.
Example: Adding shift data from a calendar
The following data is stored for a container device type in the data lake.
| Time stamp | Device ID | Pressure |
|---|---|---|
| 1 Nov 8:00 | 8039 | 80.3 |
| 1 Nov 8:00 | 8040 | 80.4 |
| 1 Nov 8:04 | 8039 | 80.3 |
| 1 Nov 8:04 | 8040 | 80.5 |
Shifts are defined as follows:
| Shift name | Start | End |
|---|---|---|
| Day_shift | 8:00 | 16:00 |
| Evening_shift | 16:00 | 24:00 |
| Night_shift | 24:00 | 8:00 |
You want to merge this shift calendar data with the time series data. You want to know the shift day for each record. Remember, this day might not be the same as the gregorian day. You also want to know the shift ID of each record.
When you configure the function, add a shift_definition in the form of a dictionary that is keyed on shift_id. The dictionary contains a tuple with the start and end hours of the shift that is expressed as numbers.
For example:
{
"1": [
8,
16
],
"2": [
16,
24
],
"3": [
24,
32
]
}
After the function runs, the container data is updated in the data lake to include the shift day and the shift ID for each record.
| Time stamp | Device ID | Pressure | Shift Day | Shift ID |
|---|---|---|---|---|
| 1 Nov 15:59 | 8039 | 80.3 | 7 Nov | 1 |
| 1 Nov 15:59 | 8040 | 80.4 | 7 Nov | 1 |
| 1 Nov 16:05 | 8039 | 80.3 | 7 Nov | 2 |
| 1 Nov 16:05 | 8040 | 80.5 | 7 Nov | 2 |