If you are familiar with Python, you can add dimensions and dimension values to existing device types programmatically.
You can develop a custom function based on the sample custom function to add dimensions. The sample function uses the BasePreload base class of IoT Functions to add dimensions. Each time that you run the function, it deletes and re-creates the dimensions and their values. If several device types share dimensions, you can apply the custom function to each device type to set the dimensions.
Complete the steps in Tutorial: Add a custom function to learn how to create and register a custom function. Refer to the tutorial for information about the following tasks:
If you want to simulate dimensions for a device type, you can use the sample script to add the dimensions to the device type. In the script, call the make_dimension and generate_dimension_data methods on the device type to simulate dimension data.
Create a custom function that adds a dimension table to the database and assigns values to the dimensions. Use the following sample code as a template for your function. You can extend the custom function to assign the dimension values. For example, you might extend the function to load dimension values from a CSV file.
The sample function has two classes. Both classes add a dimension table and assign values to the dimension per device ID.
SampleDimensionPreload_random: Assigns random values.
SampleDimensionPreload_preset: Assigns preset values.

import logging
import pandas as pd
import numpy as np
from iotfunctions.base import BasePreload
from iotfunctions import ui
from sqlalchemy import Column, Integer, String, Float, DateTime, Boolean, func
logger = logging.getLogger(__name__)
# Specify the URL to your package here.
# This URL must be accessible via pip install.
# Example assumes the repository is private.
# Replace XXXXXX with your personal access token.
PACKAGE_URL = 'git+https://XXXXXX@github.com/<user_id><path_to_repository>@<package'
# If your code is hosted in GitLab, use the format 'git+https://<deploy_token_username>:<deploy_token_password>@gitlab.com/<user_id><path_to_repository>@prod'
class SampleDimensionPreload_random(BasePreload):
    '''
    Create a dimension (dimension_1) and add random values to it.
    The function deletes the old values and reloads the dimension table.
    '''

    dim_table_name = None

    def __init__(self, output_item='dimension_preload_done'):
        super().__init__(dummy_items=[], output_item=output_item)

    def execute(self, df, start_ts=None, end_ts=None, entities=None):
        '''
        The function uses the preload class to create and load dimension data
        and randomly assign test data.
        '''
        entity_type = self.get_entity_type()
        self.db = entity_type.db
        schema = entity_type._db_schema

        # Get the dimension table name and add dimension values to the table.
        try:
            self.dim_table_name = (entity_type.get_attributes_dict()['_dimension_table_name']).upper()
        except Exception:
            # Fall back to a default name if no dimension table is registered.
            self.dim_table_name = entity_type.logical_name + '_DIMENSION'
        msg = 'Dimension table name: ' + str(self.dim_table_name)
        logger.debug(msg)

        # Read the device ID of each device and add the dimension to each device ID.
        ef = self.db.read_table(entity_type.logical_name, schema=schema, columns=[entity_type._entity_id])
        ids = set(ef[entity_type._entity_id].unique())
        msg = 'entities with device_id present:' + str(ids)
        logger.debug(msg)

        # Note:
        # The make_dimension method only adds a new dimension table.
        # Drop the old dimension table first. All of the data in it is deleted.
        self.db.drop_table(self.dim_table_name, schema=schema)

        # Make a dimension table (dimension_1)
        entity_type.make_dimension(self.dim_table_name,
                                   Column('dimension_1', String(50)),  # add dimension_1
                                   **{'schema': schema})

        # Register the device type to add the dimension to its metadata.
        entity_type.register()

        # Generate random data for dimension_1
        # for those device IDs that don't already have data.
        entity_type.generate_dimension_data(entities=ids)

        return True

    @classmethod
    def build_ui(cls):
        '''
        Register metadata
        '''
        # Define arguments that behave as function inputs
        inputs = []
        # Define arguments that behave as function outputs
        outputs = []
        outputs.append(ui.UIStatusFlag(name='output_item'))
        return (inputs, outputs)

class SampleDimensionPreload_preset(BasePreload):
    '''
    Create a dimension (dimension_1) and add temporary values to it.
    The function assigns the string preload_value as the value.
    The function deletes the old values and reloads the dimension table.
    '''

    dim_table_name = None

    def __init__(self, output_item='dimension_preload_done'):
        super().__init__(dummy_items=[], output_item=output_item)

    def execute(self, df, start_ts=None, end_ts=None, entities=None):
        '''
        The function uses the Preload class to create and load dimension data.
        It assigns preset values.
        '''
        entity_type = self.get_entity_type()
        self.db = entity_type.db
        schema = entity_type._db_schema

        # Get the dimension table name and add dimension values to the table.
        try:
            self.dim_table_name = (entity_type.get_attributes_dict()['_dimension_table_name']).upper()
        except Exception:
            # Fall back to a default name if no dimension table is registered.
            self.dim_table_name = entity_type.logical_name + '_DIMENSION'
        msg = 'Dimension table name: ' + str(self.dim_table_name)
        logger.debug(msg)

        # Read the device ID of each device and add the dimension to each device ID.
        ef = self.db.read_table(entity_type.logical_name, schema=schema, columns=[entity_type._entity_id])
        ids = set(ef[entity_type._entity_id].unique())
        msg = 'entities with device_id present:' + str(ids)
        logger.debug(msg)

        # Note:
        # The make_dimension method only adds a new dimension table.
        # Drop the old dimension table first. All of the data in it is deleted.
        self.db.drop_table(self.dim_table_name, schema=schema)

        # Make a dimension table.
        entity_type.make_dimension(self.dim_table_name,
                                   Column('dimension_1', String(50)),  # add dimension_1
                                   **{'schema': schema})

        # Register the device type to add the dimension to its metadata.
        entity_type.register()

        # Preload dimensional data with preset values.
        # These values are hardcoded but can be extended by
        # loading data from a CSV file or through an HTTP request.
        # In this template, the value is hardcoded to 'preload_value'.

        # Create hardcoded data
        preload_data = {}
        preload_values = np.repeat('preload_value', len(ids))
        preload_data['dimension_1'] = preload_values
        preload_data[entity_type._entity_id] = list(ids)
        df = pd.DataFrame(preload_data)

        # Load the hardcoded data into the database.
        msg = 'Setting columns for dimensional table\n'
        required_cols = self.db.get_column_names(table=self.dim_table_name, schema=schema)
        missing_cols = list(set(required_cols) - set(df.columns))
        msg = msg + 'required_cols ' + str(required_cols) + '\n'
        msg = msg + 'missing_cols ' + str(missing_cols) + '\n'
        logger.debug(msg)

        # Write the dimension dataframe to the IBM IOT Platform database table.
        self.write_frame(df=df, table_name=self.dim_table_name, if_exists='append')

        kwargs = {
            'dimension_table': self.dim_table_name,
            'schema': schema,
        }
        entity_type.trace_append(created_by=self,
                                 msg='Wrote dimension to table',
                                 log_method=logger.debug,
                                 **kwargs)

        return True

    @classmethod
    def build_ui(cls):
        '''
        Registration metadata
        '''
        # Define arguments that behave as function inputs.
        inputs = []
        # Define arguments that behave as function outputs.
        outputs = []
        outputs.append(ui.UIStatusFlag(name='output_item'))
        return (inputs, outputs)

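
The preset class hardcodes its values, and the text above mentions loading them from a CSV file instead. The following is a minimal sketch of that idea, not part of the sample function. It uses only the Python standard library and builds a dictionary with the same shape as preload_data, which could then be passed to pd.DataFrame. The function name preload_data_from_csv, the column name deviceid, and the sample CSV contents are assumptions made for illustration.

```python
import csv
import io


def preload_data_from_csv(csv_file, entity_id_col, dimension_cols, known_ids):
    """Return a dict of column lists, shaped like preload_data in the sample.

    Hypothetical helper: rows for device IDs that are not in known_ids are skipped.
    """
    reader = csv.DictReader(csv_file)
    required = [entity_id_col, *dimension_cols]
    missing = [col for col in required if col not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f'CSV is missing required columns: {missing}')

    # Keep one row per known device ID; a later row overrides an earlier one.
    rows_by_id = {}
    for row in reader:
        if row[entity_id_col] in known_ids:
            rows_by_id[row[entity_id_col]] = row

    # Emit the columns in a stable order (sorted by device ID).
    preload_data = {col: [] for col in required}
    for device_id in sorted(rows_by_id):
        for col in required:
            preload_data[col].append(rows_by_id[device_id][col])
    return preload_data


# Usage with in-memory CSV text; a real file would be opened with open(path, newline='').
sample_csv = io.StringIO(
    'deviceid,dimension_1\n'
    'dev_001,line_a\n'
    'dev_002,line_b\n'
    'dev_999,line_z\n'
)
preload_data = preload_data_from_csv(
    sample_csv, 'deviceid', ['dimension_1'], {'dev_001', 'dev_002'})
print(preload_data)
```

In the execute method, the resulting dictionary would take the place of the hardcoded preload_data before the dataframe is created. Filtering against the known device IDs keeps rows for unknown devices, such as dev_999 here, out of the dimension table.
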
When you test the custom function in your local environment, you can use a script similar to the following sample to test the SampleDimensionPreload_random class. If you plan to use preset values, replace the class name SampleDimensionPreload_random with SampleDimensionPreload_preset.
import json
import logging
from ai.function_dimension import (SampleDimensionPreload_random,
SampleDimensionPreload_preset)
from iotfunctions.db import Database
from iotfunctions.enginelog import EngineLogging
EngineLogging.configure_console_logging(logging.DEBUG)
'''
1. Get the database credentials and create a database object to access the Analytics Service DB.
Go to Services > Watson IOT Platform Analytics > Copy to clipboard.
Paste the contents in a credentials_as.json file.
Save the file in scripts.
Take care not to push the credentials file to your external repository.
'''
schema = 'bluadmin'  # set if you are not using the default
with open('./scripts/credentials_as.json', encoding='utf-8') as F:
    credentials = json.loads(F.read())
db = Database(credentials=credentials)
'''
2. Register the custom function.
You must use unregister_functions if you change the method signature or required inputs.
'''
db.unregister_functions(['SampleDimensionPreload_random'])
db.register_functions([SampleDimensionPreload_random])
'''
3. Get the device type.
This example assumes that the device type to which you are adding dimensions already exists.
The custom function is added to this device type to test it locally.
Remember to update the device type name to the name of your device type.
'''
entity_name = 'entity_dimension_test_random'
entity_type = db.get_entity_type(name=entity_name)
# Get the dimension table name and add dimension values to the table.
try:
    dim_table_name = (entity_type.get_attributes_dict()['_dimension_table_name']).lower()
except Exception:
    # Fall back to a default name if no dimension table is registered.
    dim_table_name = entity_name + '_dimension'
entity_type._functions.extend([SampleDimensionPreload_random()])
'''
Test the execution of KPI calculations defined for the device type locally.
A local test will not update the server job log or write KPI data to the Analytics Service data
lake. Instead, the KPI data is written to the local file system in csv format.
'''
entity_type.exec_local_pipeline(**{'_production_mode': False})
'''
View device data.
'''
print("Reading new dimension table")
print(dim_table_name)
df = db.read_dimension(dimension=dim_table_name, schema=schema)
print(df.head())
print("Finished reading the device dimension table")
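
The sample function and both scripts repeat the same lookup: use the registered dimension table name if there is one, otherwise fall back to a name derived from the device type. A minimal sketch of that logic as one helper follows. It assumes that get_attributes_dict() returns a plain dictionary. The helper name resolve_dimension_table_name and the device type name pump are hypothetical. Unlike the scripts above, the helper applies the chosen case to the fallback name as well.

```python
def resolve_dimension_table_name(attributes, logical_name, upper=False):
    """Return the registered dimension table name, or a default one.

    attributes is assumed to be a plain dict, such as the result of
    entity_type.get_attributes_dict(). This helper is hypothetical.
    """
    name = attributes.get('_dimension_table_name') or f'{logical_name}_dimension'
    # The sample classes use uppercase names; the scripts use lowercase names.
    return name.upper() if upper else name.lower()


# Usage with stand-in attribute dictionaries
print(resolve_dimension_table_name({'_dimension_table_name': 'Pump_Dim'}, 'pump'))
print(resolve_dimension_table_name({}, 'pump'))
print(resolve_dimension_table_name({}, 'pump', upper=True))
```

Using dict.get rather than a bare try/except means that only a missing or empty entry triggers the fallback, and other errors are not hidden.
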
You can simulate dimensions for a device type. You can use a Python script similar to the following script to add dimensions and their values. Random values are set from an array of values. The script includes these steps:
import json
import logging
from iotfunctions.db import Database
from iotfunctions.enginelog import EngineLogging
from sqlalchemy import Column, String, Integer, Float
EngineLogging.configure_console_logging(logging.DEBUG)
'''
This script adds dimensions to a device type and simulates the values by adding random values.
'''
'''
1. Get the database credentials.
Go to Services > Watson IOT Platform Analytics > Copy to clipboard.
Paste the contents in a credentials_as.json file.
Save the file in scripts.
Take care not to push the credentials file to your external repository.
'''
schema = 'bluadmin'  # set if you are not using the default
with open('./scripts/credentials_as.json', encoding='utf-8') as F:
    credentials = json.loads(F.read())
'''
2. Create a database object to access the Analytics Service database.
'''
db = Database(credentials=credentials)
entity_name = 'entity_dimension_simulate'
entity_type = db.get_entity_type(name=entity_name)
# Get the dimension table name and add dimension values to the table.
try:
    dim_table_name = (entity_type.get_attributes_dict()['_dimension_table_name']).lower()
except Exception:
    # Fall back to a default name if no dimension table is registered.
    dim_table_name = entity_name + '_dimension'
# db.drop_table(dim_table_name, schema=schema)
'''
3. Create new dimensions by specifying them as columns.
Dimensions can be of the following types:
Integer, INTEGER, Float, FLOAT, String, VARCHAR, DateTime, Timestamp
3.1 Using dimensions with predefined values:
When you use any of these dimension names, values are selected from a predefined set.
In the following arrays, each dimension name on the left-hand side generates dimension values from the array on the right-hand side.
Dimension name: Values
['company', 'company_id', 'company_code']: ['ABC', 'ACME', 'JDI']
['plant', 'plant_id', 'plant_code']: ['Zhongshun', 'Holtsburg', 'Midway']
['country', 'country_id', 'country_code']: ['US', 'CA', 'UK', 'DE']
['firmware', 'firmware_version']: ['1.0', '1.12', '1.13', '2.1']
['manufacturer']: ['Rentech', 'GHI Industries']
['zone']: ['27A', '27B', '27C']
['status', 'status_code']: ['inactive', 'active']
['operator', 'operator_id', 'person', 'employee']: ['Fred', 'Joe', 'Mary', 'Steve', 'Henry', 'Jane', 'Hillary', 'Justin', 'Rod']
3.2 Using other dimension names:
Any other dimension name generates random values.
'''
db.drop_table(dim_table_name, schema=schema)
entity_type.make_dimension(dim_table_name,
                           Column('company', String(50)),
                           Column('status', String(50)),
                           Column('operator', String(50)),
                           **{'schema': schema})
entity_type.register()
'''
To test the execution of KPI calculations locally use the following function.
A local test will not update the server job log or write KPI data to the Analytics Service data
lake. Instead, the KPI data is written to the local file system in csv format.
'''
# entity_type.exec_local_pipeline(**{'_production_mode': False})
ef = db.read_table(entity_type.logical_name, schema=schema, columns=[entity_type._entity_id])
ids = set(ef[entity_type._entity_id].unique())
entity_type.generate_dimension_data(entities=ids)
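
The comment in the script describes how a dimension name selects the values that are generated: known names draw from a predefined set, and any other name gets random values. The following sketch only illustrates that rule. It is not the IoT Functions implementation. The name groups are a subset copied from the script comment, while the function simulate_dimension_values and the format of the random fallback values are assumptions.

```python
import random

# Subset of the name groups and value sets listed in the script comment.
PREDEFINED_DOMAINS = [
    (('company', 'company_id', 'company_code'), ['ABC', 'ACME', 'JDI']),
    (('firmware', 'firmware_version'), ['1.0', '1.12', '1.13', '2.1']),
    (('manufacturer',), ['Rentech', 'GHI Industries']),
    (('zone',), ['27A', '27B', '27C']),
    (('status', 'status_code'), ['inactive', 'active']),
]


def simulate_dimension_values(dimension_name, device_ids, rng):
    """Return {device_id: value} for one dimension (illustration only)."""
    for names, values in PREDEFINED_DOMAINS:
        if dimension_name in names:
            # Known dimension name: pick from its predefined value set.
            return {dev: rng.choice(values) for dev in sorted(device_ids)}
    # Any other name: make up a random value (hypothetical format).
    return {dev: f'{dimension_name}_{rng.randint(0, 999)}' for dev in sorted(device_ids)}


# Usage: a seeded generator makes the simulated values repeatable.
rng = random.Random(42)
status_values = simulate_dimension_values('status', {'dev_001', 'dev_002'}, rng)
color_values = simulate_dimension_values('color', {'dev_001', 'dev_002'}, rng)
print(status_values)
print(color_values)
```

In the script above, the company, status, and operator columns all use names from the predefined list, so each device receives one of the listed values for those dimensions.
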