Preparing the data for the ETL module
Explains how you prepare the raw data for the ETL module on the Jupyter node as a .csv file.
- connectivity
- electric_station
- exclude_time
- feeder_root
- meter
- meter_load
- meter_voltage
- overhead_cable
- scada
- scada_load
- substation_region
- transformer
- underground_cable
The prepared data includes master data and reading data. The master data contains the assets in the power grid, and the reading data contains the measurement readings, for example: voltage and load.
As the ETL module supports data in .csv format, by using the csv format, you can edit and update the master data and reading data. You can also incrementally add reading data to the ana store. You can define multiple csv files in each folder. The ETL module does not support sub-folders.
The format of the .csv files is given as follows:
Master data
| Column name | Type | Description | Constraints |
|---|---|---|---|
| assetId | String | asset ID | Primary Key, Unique, Not NULL. |
| substation | String | ID of substation region to where electric station belongs. | Foreign Key, must be a valid substation in the substation region table. |
| type | String | The asset type. | |
| phase | String | The phase of the asset. | |
| numberOfPhases | String | The amount of phases. | |
| normalStatus | String | The normal status. | |
| node1 | String | The node 1. | |
| node2 | String | The node 2. |
| Column name | Type | Description | Constraints |
|---|---|---|---|
| id | String | ID of the electric station | Primary Key, Unique, Not NULL. |
| substation | String | ID of substation region to where electric station belongs. | Foreign Key, must be a valid substation in the substation region table. |
| geometry | String | The geometry of the electric station. | Must be in a valid WKT POLYGON, in WSG84 projection. |
| Column name | Type | Description | Constraints |
|---|---|---|---|
| feeder | String | ID of the feeder. | Primary Key. Must be a valid feeder in the feeder table or feeder_group in the feeder_group table |
| type | String | Type of analysis. | Primary Key. The candidate value include Load, Voltage, Voltage_with_scada. |
| startTime | String | Timestamp, in the format of yyyy-MM-ddThh:mm:ss.sss. | Primary Key. The time range to be excluded from the analysis with the startTime being the start time of the range (inclusive). |
| endTime | String | Timestamp, in the format of yyyy-MM-ddThh:mm:ss.sss. | Primary Key. The time range to be excluded from the analysis with the endTime being the end time of the range (exclusive). |
| Column name | Type | Description | Constraints |
|---|---|---|---|
| feeder | String | ID of the feeder where the meter is connected | Foreign Key, must be a valid feeder in the feeder table. |
| rootAsset | String | The root asset. |
| Column name | Type | Description | Constraints |
|---|---|---|---|
| id | String | ID of the meter | Primary Key, Unique, Not NULL. |
| substation | String | ID of substation region where the meter belongs to. | Foreign Key, must be a valid substation in the substation region table. |
| feeder | String | ID of the feeder where the meter is connected | Foreign Key, must be a valid feeder in the feeder table. |
| transformer | String | ID of the transformer | Foreign Key, must be valid transformer in transformer table. |
| phase | String | The phase of meter | Must be a valid phase code from 1 to 7. For example: 0b100=4=A, 0b010=2=B, 0b001=1=C, 0b110=6=AB. |
| geometry | String | Geospatial geometry in WKT format, should be a point. | Must be a valid WKT POINT, in WSG84 projection. |
| Column name | Type | Description | Constraints |
|---|---|---|---|
| id | String | ID of the overhead cable | Primary Key, Unique, Not NULL. |
| substation | String | ID of the substation region where the overhead cable belongs to. | Foreign Key, must be a valid substation in the substation region table. |
| feeder | String | ID of the feeder where the overhead cable is connected. | Foreign Key, must be a valid feeder in the feeder table. |
| phase | String | The phase of the overhead cable. | Must be a valid phase code from 1 to 7. For example: 0b100=4=A, 0b010=2=B, 0b001=1=C, 0b110=6=AB. |
| geometry | String | The geometry of the overhead cable. | Must be a valid WKT MULTILINESTRING, in WSG84 projection. |
| Column name | Type | Description | Constraints |
|---|---|---|---|
| assetId | String | ID of the feeder. | Primary Key, Unique, Not NULL. |
| measurement | String | Type of analysis | Primary Key Candidate value must include Load, Voltage, Voltage_with_scada |
| scadaId | String | ID of SCADA. |
| Column name | Type | Description | Constraints |
|---|---|---|---|
| id | String | ID of the substation region. | Primary Key, Unique, Not Null. |
| geometry | String | Geospatial geometry in WKT format, should be a polygon. | Must be valid WKT POLYGON. In WSG84 projection |
| Column name | Type | Description | Constraints |
|---|---|---|---|
| id | String | ID of the transformer. | Primary Key, Unique, Not NULL. |
| substation | String | ID of the substation region where the transformer belongs to. | Foreign Key, must be a valid substation in the substation region table. |
| feeder | String | ID of the feeder where the transformer is connected. | Foreign Key, must be a valid feeder in the feeder table. |
| phase | String | The phase of the transformer | Must be a valid phase code from 1 to 7. For example: 0b100=4=A, 0b010=2=B, 0b001=1=C, 0b110=6=AB. |
| kva | Int | The kilovolt-amps. | |
| voltage | Int | The voltage. | |
| geometry | String | Geospatial geometry in WKT format, should be a point. | Must be a valid WKT POINT, in WSG84 projection. |
| Column name | Type | Description | Constraints |
|---|---|---|---|
| id | String | ID of the underground cable. | Primary Key, Unique, Not NULL. |
| substation | String | ID of the substation region where the underground cable belongs to, | Foreign Key, must be a valid substation in the substation region table. |
| feeder | String | ID of the feeder where the underground cable is connected. | Foreign Key, must be a valid feeder in the feeder table. |
| phase | String | The phase number. | Must be a valid phase code from 1 to 7. For example: 0b100=4=A, 0b010=2=B, 0b001=1=C, 0b110=6=AB. |
| geometry | String | The geometry of the underground cable. | Must be a valid WKT MULTILINESTRING, in WSG84 projection. |
Reading data
- The timestamp must be in same time zone as the corresponding load table. For example both load tables and reading tables must either use UTC or use the local time zone.
- The time interval between two timestamps must be fixed and be the same as time interval in corresponding load table. For example, if the time interval is 1 hour, then all readings in the meter load table and feeder load table should use 1 hour interval.
- The timestamp should align with timestamp in the corresponding load table. For example, if the feeder load has a timestamp 9:00 then the meter load timestamp should be the same.
- The timestamp represents the end edge. For example, if the timestamp is 10:00 and interval used is 1 hour, the kWh is the load between 9:00-10:00.
| Column name | Type | Description | Constraint |
|---|---|---|---|
| scadaId | String | ID of the feeder | Derived |
| timestamp | String | Timestamp in the format yyyy-MM-ddThh:mm:ss.sss . |
Expected in UTC |
| kwh1 | Double | Load of phase A. |
The value represents the aggregated active load in the past time interval. All values must be in units of kWh not MWh. The gap between feeder load and aggregated meter load should less than 10% otherwise the accuracy is affected. The kwh1 is load of phase A that maps to phase code 0b100=4. |
| kwh2 | Double | Load of phase B. | |
| kwh3 | Double | Load of phase C. |
| Column name | Type | Description | Constraint |
|---|---|---|---|
| meterId | String | ID of the meter | Primary Key, Foreign Key Must be a valid meter in the meter table |
| timestamp | String | Timestamp of reading data, in the format of yyyy-MM-ddThh:mm:ss.sss |
|
| kwh | Double | Load | Value represent aggregated active load of all 3 phases in past interval. All values in unit to KWH. |
| Column name | Type | Description | Constraint |
|---|---|---|---|
| scadaId | String | ID of the SCADA tag. |
Derived |
| timestamp | String | Timestamp in the format yyyy-MM-ddThh:mm:ss.sss. |
Expected in UTC |
| volt1 | Double | Phase A voltage |
The value represents the average voltage in the past time interval. The value must be normalized to the same voltage level. For example, if some meters are 120v, and others are 240v, when the feeder voltage is 1kV, the values is normalized to the minimal voltage level, that is: Value * 120 / 240. When ch1volt, ch2volt, ch3volt represent the voltage of 3 phases, ch1volt maps to phase A, ch2volt maps to phase B, and ch3volt maps to phase C. |
| volt2 | Double | Phase B voltage | |
| volt3 | Double | Phase C voltage |
| Column name | Type | Description | Constraint |
|---|---|---|---|
| meterId | String | ID of the meter | Primary Key, Foreign Key The feeder should be a valid feeder in the feeder table, or feeder group in the feeder_group table. |
| timestamp | String | Timestamp yyyy-MM-ddThh:mm:ss.sss. | |
| volt1 | Double | Phase A voltage |
The value represents the average voltage in the past interval. The value must be normalized to the same voltage level. For example, if some meters are 120v, and others are 240v, when the feeder voltage is 1kV, the values is normalized to the minimal voltage level, that is: Value * 120 / 240. Channels ch1, ch2 and ch3 need not be the same value. For a single phase meter, only one channel should have a value. For two phases meters, only two channels should have values. For 3 phases meters, all channels must have values. |
| volt2 | Double | Phase B voltage | |
| volt3 | Double | Phase C voltage |