Adding dimensional data from a custom function

If you are familiar with Python, you can add dimensions and dimension values to existing device types through code.

You can develop a custom function for a batch data metric, based on the sample custom function, to add dimensions. The sample function uses the BasePreload base class of IoT Functions to add dimensions. Each time that the function runs, it deletes and re-creates the dimensions and their values. If several device types share dimensions, you can apply the custom function to each device type to set the dimensions.

Before you begin

Complete the steps in Tutorial: Adding a custom function to learn how to set up your environment and package, store, install, test, register, and apply your function.

If you want to simulate dimensions for a device type, you can use the sample script to add the dimensions to a device type. In the script, use the make_dimension and generate_dimension_data methods for a device type to simulate dimension data.

Adding dimensions by using a custom function

Create a custom function that adds a dimension table to the database and assigns values to the dimensions. Use the following sample code as a template for your function. You can extend the custom function to assign the dimension values. For example, you might extend the function to load dimension values from a .csv file.
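
As a minimal sketch of the .csv extension mentioned above, the following code reads per-device values from a .csv file and builds the column-oriented dictionary that the preset sample passes to pd.DataFrame. The helper names load_dimension_values and build_preload_data, the deviceid column name, and the file layout are assumptions for illustration and are not part of IoT Functions.

```python
import csv
import io


def load_dimension_values(csv_file, id_column="deviceid"):
    """Return {device_id: {dimension_name: value}} read from an open CSV file."""
    values = {}
    for row in csv.DictReader(csv_file):
        device_id = row.pop(id_column)  # remaining keys are dimension columns
        values[device_id] = row
    return values


def build_preload_data(ids, values, dimension, default="preload_value",
                       id_column="deviceid"):
    """Build a dict of columns; devices without a CSV row get the default."""
    ordered_ids = sorted(ids)
    return {
        dimension: [values.get(i, {}).get(dimension, default) for i in ordered_ids],
        id_column: ordered_ids,
    }


# Hypothetical file content; in practice you would open a real .csv file.
sample_csv = io.StringIO("deviceid,dimension_1\nD1,north\nD2,south\n")
values = load_dimension_values(sample_csv)
preload_data = build_preload_data({"D1", "D2", "D3"}, values, "dimension_1")
print(preload_data)
```

In a custom function, the resulting dictionary could take the place of the hardcoded preload_data dictionary before the DataFrame is written to the dimension table.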

The sample function has two classes. Both classes add a dimension table and assign values to the dimension per device ID.

  • The SampleDimensionPreload_random class assigns random values.
  • The SampleDimensionPreload_preset class assigns preset values.

Sample custom function

The following code is the sample custom function. The SampleDimensionPreload_random class creates a dimension, dimension_1, and assigns randomly generated test values to it. The SampleDimensionPreload_preset class creates the same dimension and assigns a preset value, which is hardcoded to preload_value in the template. Each time that it runs, each class drops the existing dimension table and reloads it. Instead of hardcoding the values, you can extend the function to load them from a .csv file or through an HTTP request.
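
Loading values through an HTTP request might look like the following minimal sketch. It assumes a hypothetical endpoint that returns a JSON object keyed by device ID. The fetch_dimension_values helper, the URL, and the payload shape are illustrative assumptions, and the opener parameter exists only so that the example can run without network access.

```python
import io
import json
from urllib.request import urlopen


def fetch_dimension_values(url, opener=urlopen):
    """Fetch a JSON object of the assumed form {"<device_id>": "<value>", ...}."""
    with opener(url) as response:
        payload = json.load(response)
    if not isinstance(payload, dict):
        raise ValueError("expected a JSON object keyed by device ID")
    return {str(device_id): str(value) for device_id, value in payload.items()}


def fake_opener(url):
    """Stand-in for urlopen so that the sketch runs offline."""
    return io.BytesIO(b'{"D1": "north", "D2": "south"}')


# Hypothetical URL; replace fake_opener with the default to make a real request.
values = fetch_dimension_values("https://example.com/dimensions.json",
                                opener=fake_opener)
ids = ["D1", "D2", "D3"]
# Devices that are missing from the response keep the template's preset value.
dimension_1 = [values.get(device_id, "preload_value") for device_id in ids]
print(dimension_1)
```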

For the PACKAGE_URL variable, specify the URL to your package. This URL must be accessible through a pip installation. Use the following format: git+https://<personal_access_token>@github.com/<user_id><path_to_repository>@<package. If your code is hosted in GitLab, use the following format: git+https://<deploy_token_username>:<deploy_token_password>@gitlab.com/<user_id><path_to_repository>@prod

import logging
import pandas as pd
import numpy as np


from iotfunctions.base import BasePreload
from iotfunctions import ui

from sqlalchemy import Column, Integer, String, Float, DateTime, Boolean, func

logger = logging.getLogger(__name__)

PACKAGE_URL = 'PACKAGE_URL'

class SampleDimensionPreload_random(BasePreload):

    dim_table_name = None

    def __init__(self, output_item ='dimension_preload_done'):

        super().__init__(dummy_items=[],output_item = output_item)


    def execute(self, df, start_ts = None,end_ts=None,entities=None):

        entity_type = self.get_entity_type()
        self.db = entity_type.db
        schema = entity_type._db_schema

        try:
            self.dim_table_name = (entity_type.get_attributes_dict()['_dimension_table_name']).upper()
        except Exception:
            # fall back to the default name if no dimension table name is set
            self.dim_table_name = entity_type.logical_name + '_DIMENSION'

        msg = 'Dimension table name: ' + str(self.dim_table_name)
        logger.debug(msg)

        ef = self.db.read_table(entity_type.logical_name, schema=schema, columns=[entity_type._entity_id])
        ids = set(ef[entity_type._entity_id].unique())
        msg = 'entities with device_id present:' + str(ids)
        logger.debug(msg)

        self.db.drop_table(self.dim_table_name, schema=schema)
        entity_type.make_dimension(self.dim_table_name,
                                   Column('dimension_1', String(50)), # add dimension_1
                                   **{'schema': schema})
        entity_type.register()
       
        entity_type.generate_dimension_data(entities=ids)

        return True

    @classmethod
    def build_ui(cls):
     
        inputs = []
        outputs=[]
        outputs.append(ui.UIStatusFlag(name='output_item'))
        return (inputs, outputs)


class SampleDimensionPreload_preset(BasePreload):

    dim_table_name = None

    def __init__(self, output_item='dimension_preload_done'):

        super().__init__(dummy_items=[], output_item=output_item)

    def execute(self, df, start_ts=None, end_ts=None, entities=None):
       
        entity_type = self.get_entity_type()
        self.db = entity_type.db
        schema = entity_type._db_schema
        try:
            self.dim_table_name = (entity_type.get_attributes_dict()['_dimension_table_name']).upper()
        except Exception:
            # fall back to the default name if no dimension table name is set
            self.dim_table_name = entity_type.logical_name + '_DIMENSION'

        msg = 'Dimension table name: ' + str(self.dim_table_name)
        logger.debug(msg)
        ef = self.db.read_table(entity_type.logical_name, schema=schema, columns=[entity_type._entity_id])
        ids = set(ef[entity_type._entity_id].unique())
        msg = 'entities with device_id present:' + str(ids)
        logger.debug(msg)

        self.db.drop_table(self.dim_table_name, schema=schema)
        entity_type.make_dimension(self.dim_table_name,
                                   Column('dimension_1', String(50)),  # add dimension_1
                                   **{'schema': schema})
        entity_type.register()
       
        preload_data = {}
        preload_values = np.repeat('preload_value', len(ids))
        preload_data['dimension_1'] = preload_values
        preload_data[entity_type._entity_id] = list(ids)
        df = pd.DataFrame(preload_data)
        msg = 'Setting columns for dimensional table\n'
        required_cols = self.db.get_column_names(table=self.dim_table_name, schema=schema)
        missing_cols = list(set(required_cols) - set(df.columns))
        msg = msg + 'required_cols ' + str(required_cols) + '\n'
        msg = msg + 'missing_cols ' + str(missing_cols) + '\n'
        logger.debug(msg)

        self.write_frame(df=df, table_name=self.dim_table_name, if_exists='append')

        kwargs = {
            'dimension_table': self.dim_table_name,
            'schema': schema,
        }
        entity_type.trace_append(created_by=self,
                                 msg='Wrote dimension to table',
                                 log_method=logger.debug,
                                 **kwargs)

        return True

    @classmethod
    def build_ui(cls):
        
        inputs = []
        outputs = []
        outputs.append(ui.UIStatusFlag(name='output_item'))
        return (inputs, outputs)
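
Both classes resolve the dimension table name in the same way and, in the preset class, compare the table's columns with the columns of the DataFrame. The following minimal sketch isolates those two steps as plain functions so that they can be tried without a database. The function names resolve_dim_table_name and missing_columns are hypothetical, and a plain dictionary stands in for the result of get_attributes_dict.

```python
def resolve_dim_table_name(attributes, logical_name):
    """Use the configured dimension table name, or fall back to a default.

    Mirrors the sample: a configured name is uppercased, and the fallback
    is the logical name followed by '_DIMENSION'.
    """
    name = attributes.get("_dimension_table_name")
    if not name:
        return logical_name + "_DIMENSION"
    return name.upper()


def missing_columns(required_cols, frame_cols):
    """Return table columns that the DataFrame does not provide."""
    return sorted(set(required_cols) - set(frame_cols))


configured = resolve_dim_table_name({"_dimension_table_name": "my_dim"}, "pump")
fallback = resolve_dim_table_name({}, "pump")
missing = missing_columns(["deviceid", "dimension_1", "site"],
                          ["deviceid", "dimension_1"])
print(configured, fallback, missing)
```

Unlike the sample, this sketch checks for a missing name explicitly instead of catching an exception, which keeps unrelated errors visible.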

When you test the custom function in your local environment, you can use a script similar to the following sample to test the SampleDimensionPreload_random class. If you plan to use preset values, replace the class name SampleDimensionPreload_random with SampleDimensionPreload_preset.

Before you use the script, retrieve the database credentials.

  1. Download the credentials_as.json file.
  2. Replace the variables with your data and then save the file to your local workstation.
Note: Do not save the credentials file to any external repository.

You can use the following script to complete the following tasks:
  • Create a database object to access the Maximo Monitor database. Set the schema value if you are not using the default.
  • Register the custom function. You must use unregister_functions if you change the method signature or required inputs.
  • Get the device type. This example assumes that the device type to which you are adding dimensions exists. The script adds the custom function to this device type to test it locally. Update the device type name to the name of your device type.
  • Get the dimension table name and add dimension values to the table.
  • Test the execution of metric calculations that are defined for the device type locally. A local test does not update the server job log or write metric data to the data lake. Instead, the metric data is written to the local file system in .csv format.
  • View device data.

import json
import logging
from ai.function_dimension import (SampleDimensionPreload_random,
                                   SampleDimensionPreload_preset)
from iotfunctions.db import Database
from iotfunctions.enginelog import EngineLogging

EngineLogging.configure_console_logging(logging.DEBUG)

schema = 'bluadmin' 
with open('./scripts/credentials_as.json', encoding='utf-8') as F:
    credentials = json.loads(F.read())
db = Database(credentials = credentials)

db.unregister_functions(['SampleDimensionPreload_random'])
db.register_functions([SampleDimensionPreload_random])

entity_name = 'entity_dimension_test_random'
entity_type = db.get_entity_type(name=entity_name)

try:
    dim_table_name = (entity_type.get_attributes_dict()['_dimension_table_name']).lower()
except Exception:
    # fall back to the default name if no dimension table name is set
    dim_table_name = entity_name + '_dimension'

entity_type._functions.extend([SampleDimensionPreload_random()])

entity_type.exec_local_pipeline(**{'_production_mode': False})

print("Reading new dimension table")
print(dim_table_name)
df = db.read_dimension(dimension=dim_table_name, schema=schema)
print(df.head())

print("Finished reading the device dimension table")

The following script adds dimensions to a device type and simulates the values by adding random values. The script creates the dimensions by adding them as columns. The dimension type can be Integer, INTEGER, Float, FLOAT, String, VARCHAR, DateTime, or Timestamp.

Before you use the script, retrieve the database credentials.

  1. Download the credentials_as.json file.
  2. Replace the variables with your data and then save the file to your local workstation.
Note: Do not save the credentials file to any external repository.
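
One way to keep the credentials file out of a repository is to read its location from an environment variable. The following minimal sketch assumes a hypothetical variable name, MONITOR_CREDENTIALS_FILE, and falls back to the path that the sample scripts use. The demonstration writes a placeholder file whose keys do not reflect the real structure of credentials_as.json.

```python
import json
import os
import tempfile
from pathlib import Path


def load_credentials(env_var="MONITOR_CREDENTIALS_FILE",
                     default="./scripts/credentials_as.json"):
    """Load the credentials JSON from the path in env_var, else from default."""
    path = Path(os.environ.get(env_var, default))
    with path.open(encoding="utf-8") as handle:
        return json.load(handle)


# Demonstration with a temporary placeholder file (not real credentials).
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "credentials_as.json"
    demo_path.write_text(json.dumps({"placeholder_key": "placeholder_value"}),
                         encoding="utf-8")
    os.environ["MONITOR_CREDENTIALS_FILE"] = str(demo_path)
    credentials = load_credentials()

print(sorted(credentials))
```

The returned dictionary could then be passed to Database(credentials=credentials) as in the sample scripts.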

In the script, set the schema value if you are not using the default. The script lists the dimension names for which values are selected from a predefined set. Any other dimension name generates random values.
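
The selection rule can be illustrated with the following minimal sketch. It is not the IoT Functions implementation. The simulate_value helper is hypothetical, only a few of the predefined sets are reproduced, and the eight-letter random string for other names is an assumption about what a random value might look like.

```python
import random
import string

# A subset of the predefined sets that the sample script lists.
PREDEFINED = {
    ("company", "company_id", "company_code"): ["ABC", "COMPANY", "JDI"],
    ("status", "status_code"): ["inactive", "active"],
    ("zone",): ["27A", "27B", "27C"],
}


def simulate_value(dimension, rng):
    """Pick from a predefined set when the name matches, else a random string."""
    for names, choices in PREDEFINED.items():
        if dimension in names:
            return rng.choice(choices)
    # Assumed fallback format: eight random lowercase letters.
    return "".join(rng.choice(string.ascii_lowercase) for _ in range(8))


rng = random.Random(0)  # seeded so that repeated runs give the same values
row = {name: simulate_value(name, rng)
       for name in ["company", "status", "dimension_1"]}
print(row)
```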

To test the execution of metric calculations locally, uncomment the entity_type.exec_local_pipeline(**{'_production_mode': False}) line in the script. A local test does not update the server job log or write metric data to the data lake. Instead, the metric data is written to the local file system in .csv format.

Figure 1. Sample script

import json
import logging
from iotfunctions.db import Database
from iotfunctions.enginelog import EngineLogging

from sqlalchemy import Column, String, Integer, Float

EngineLogging.configure_console_logging(logging.DEBUG)

schema = 'bluadmin'
with open('./scripts/credentials_as.json', encoding='utf-8') as F:
    credentials = json.loads(F.read())
db = Database(credentials = credentials)

entity_name = 'entity_dimension_simulate'
entity_type = db.get_entity_type(name=entity_name)

try:
    dim_table_name = (entity_type.get_attributes_dict()['_dimension_table_name']).lower()
except Exception:
    # fall back to the default name if no dimension table name is set
    dim_table_name = entity_name + '_dimension'

# db.drop_table(dim_table_name, schema=schema)

# Dimension names for which values are selected from a predefined set
# (dimension names: values):
# ['company', 'company_id', 'company_code']: ['ABC', 'COMPANY', 'JDI']
# ['plant', 'plant_id', 'plant_code']: ['Zhongshun', 'Holtsburg', 'Midway']
# ['country', 'country_id', 'country_code']: ['US', 'CA', 'UK', 'DE']
# ['firmware', 'firmware_version']: ['1.0', '1.12', '1.13', '2.1']
# ['manufacturer']: ['Rentech', 'GHI Industries']
# ['zone']: ['27A', '27B', '27C']
# ['status', 'status_code']: ['inactive', 'active']
# ['operator', 'operator_id', 'person', 'employee']: ['Fred', 'Joe', 'Mary', 'Steve', 'Henry', 'Jane', 'Hillary', 'Justin', 'Rod']

db.drop_table(dim_table_name, schema=schema)
entity_type.make_dimension(dim_table_name,
                           Column('company', String(50)),
                           Column('status', String(50)),
                           Column('operator', String(50)),
                           **{'schema': schema})

entity_type.register()

# entity_type.exec_local_pipeline(**{'_production_mode': False})

ef = db.read_table(entity_type.logical_name, schema=schema, columns=[entity_type._entity_id])
ids = set(ef[entity_type._entity_id].unique())
entity_type.generate_dimension_data(entities=ids)

Simulating dimension data

You can simulate dimensions for a device type. You can use a Python script similar to the following script to add dimensions and their values. Random values are set from an array of values. The script includes these steps:

  1. Connect to Maximo Monitor.
  2. Create a database object to access the Maximo Monitor database.
  3. Add new database columns for the dimensions and add random values.