SevOne SDN Collector Advanced Configuration & Troubleshooting Guide
This document offers detailed instructions for executing advanced configurations of the SDN collector by utilizing configuration variables. It also includes troubleshooting guidelines.
Advanced Configuration
Please refer to SDN Plugin in SevOne NMS User Guide for the APIC Connectivity details to configure the Cisco ACI solution.
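For each device created through the SDN plugin, two configuration files are available in /config/SDN.
- <device-name>.yaml
- default-<device-name>.yaml
Example
- Apic.yaml
- default-Apic.yaml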
Example: Apic.yaml (sample file)
The list of SDN plugin variables can be found in Configuration Variables table below.
deployment_name: Apic
version: 7.2.0
run_agents_immediately_and_exit_collector: true
log:
  level: debug
agent:
  include:
    - InstallerAgent
    - TopologyInstallerAgent
    - PodAgent
    - NodeAgent
    - PodExtendedAgent
    - NodeExtendedAgent
    - NodeInterfaceAgent
    - MetadataAgent
    - TopologyAgent
    - DeviceDescriptionAgent
    - FaultStreamingAgent
    - ObjectGroupAgent
    - ExternalSwitchAgent
    - HypervisorAndVirtualMachineAgent
vendor:
  is_multi_site_mode: false
  no_prefix: false
  site:
    name: Apic
    apic_url: https://10.52.0.171
    apic_uid: developer
    apic_password: DevTeam1234#
    device_name_prefix: MyPrefix
    fault_configuration_filename: ""
    timeout: 30s
  page_size: 10000
  sleep_time: 200
  dn_order: true
  do_nodes_traffic: true
  fault_prefix: ""
  do_pod_traffic: true
  do_virtual_traffic: false
  do_bytes: true
  do_packets: false
  skip_tunnel_if: true
  skip_off_vm: true
  skip_bad_nic: true
  pod_agent:
    schedule: ""
  node_agent:
    schedule: ""
  pod_extended_agent:
    schedule: ""
  node_interface_agent:
    schedule: ""
  node_extended_agent:
    schedule: ""
  external_switch_agent:
    schedule: ""
  hypervisor_and_virtual_machine_agent:
    schedule: ""
  topology_agent:
    schedule: ""
  object_group_agent:
    schedule: ""
nms:
  api:
    insecure_tls_connection: true
    host: 127.0.0.1
    v2_api_key: eyJhbGciOiJIUzUxMiJ9eyJpc3MiOiJhZG1pbiJ92wPJ-R9zaAoD3sJ95dSzN_irIaLn7E_o1SpHrkpVTOegoInNZ0r-s7zELy6GJS7bdLJuExqF9ksB4JfMHlcKJA
    v3_api_key: eyJ1dWlkIjoiYzNhMTc1NGEtZDBjMC00ZTczLWE1YzgtODk5OTBiMWMxZDQ3IiwiYXBwbGljYXRpb24iOiJTRE4iLCJlbnRyb3B5IjoiazNZN0JMWGIwWVBCbzhzcGlmdmpUbjdOOHlEenh0WFpPUktnZVZVWVRTTzQzTWtwMDZSVmozQ3p0RWFUYlZkbyJ9
fault_config:
  filter: []
  granular_fault_filter: []
  severity_mapping: []
Example: default-Apic.yaml (sample file)
After the SDN plugin is configured, you can set or modify the SDN plugin configuration variables by editing default-Apic.yaml in a text editor of your choice.
Please see the Configuration Variables table below for the list of SDN plugin configuration variables available.
Log rotations are performed automatically.
As of SDN 7.2.1, the log path directory has been changed from /var/log to /var/log/SDN. For example, /var/log/SDN/<site name provided when adding SDN device>/<v7.2.x>/
deployment_name: ""
version: 7.2.0
run_agents_immediately_and_exit_collector: true
log:
  level: debug
agent:
  include:
    - InstallerAgent
    - TopologyInstallerAgent
    - PodAgent
    - NodeAgent
    - PodExtendedAgent
    - NodeExtendedAgent
    - NodeInterfaceAgent
    - MetadataAgent
    - TopologyAgent
    - DeviceDescriptionAgent
    - FaultStreamingAgent
    - ObjectGroupAgent
vendor:
  is_multi_site_mode: false
  no_prefix: true
  site:
    name: ""
    apic_url: ""
    apic_uid: ""
    apic_password: ""
    device_name_prefix: SiteName
    fault_configuration_filename: ""
    timeout: 30s
  page_size: 10000
  sleep_time: 200
  dn_order: true
  do_nodes_traffic: true
  fault_prefix: ""
  do_pod_traffic: true
  do_virtual_traffic: false
  do_bytes: true
  do_packets: false
  skip_tunnel_if: true
  skip_off_vm: true
  skip_bad_nic: true
  pod_agent:
    schedule: ""
  node_agent:
    schedule: ""
  pod_extended_agent:
    schedule: ""
  node_interface_agent:
    schedule: ""
  node_extended_agent:
    schedule: ""
  external_switch_agent:
    schedule: ""
  hypervisor_and_virtual_machine_agent:
    schedule: ""
  topology_agent:
    schedule: ""
  object_group_agent:
    schedule: ""
nms:
  api:
    insecure_tls_connection: true
    host: ""
    v2_api_key: ""
    v3_api_key: ""
fault_config:
  filter: []
  granular_fault_filter: []
  severity_mapping: []
Filter Alerts
- SSH to SevOne NMS appliance as root user.
  ssh root@<NMS appliance>
- Change directory to /config/SDN.
  cd /config/SDN
- You will see two configuration files, <device-name>.yaml and default-<device-name>.yaml, for the device created through the SDN plugin. For example,
  ls
  Apic.yaml default-Apic.yaml
  where Apic is the device name of the device created in the example above.
- Using a text editor of your choice, edit and save the /config/SDN/default-<device-name>.yaml file. Please refer to the table below for details on the variables in the .yaml file.
  Note: If you are configuring the alerts for the first time, the fault_config values in the /config/SDN/default-<device-name>.yaml file will be blank.
  For example,
  vi /config/SDN/default-Apic.yaml
  fault_config:
    filter:
      - filter_on: aci_severity
        filter_value:
          - aci-severity-1
          - aci-severity-2
      - filter_on: aci_fault_code
        filter_value:
          - fault-code-1
          - fault-code-2
    granular_fault_filter:
      - code: fault-code-3
        aci_severity:
          - aci-severity-3
          - aci-severity-4
      - code: fault-code-4
        aci_severity:
          - aci-severity-4
          - aci-severity-5
    severity_mapping:
      - code:
          - fault-code-1
          - fault-code-2
        severity: nms-severity-1
      - code:
          - fault-code-3
          - fault-code-4
          - fault-code-5
        severity: nms-severity-2
  Save the /config/SDN/default-Apic.yaml file.
| Variable | Description |
|---|---|
| aci_severity | This sheet is used to provide attributes of a fault to filter on. code: Contains ACI severities to create SevOne NMS alerts on. |
| fault_code | This sheet is used to provide attributes of a fault to filter on. code: Contains fault codes to create SevOne NMS alerts on. To learn more about the fault codes, please refer to https://www.cisco.com/c/en/us/td/docs/switches/datacenter/aci/apic/sw/all/syslog/guide/b_ACI_System_Messages_Guide.html |
| granular | This sheet is used to provide attributes of a fault to filter on. code: Contains fault codes to create SevOne NMS alerts on. To learn more about the fault codes, please refer to https://www.cisco.com/c/en/us/td/docs/switches/datacenter/aci/apic/sw/all/syslog/guide/b_ACI_System_Messages_Guide.html aci_severity: ACI severities that the faults with the above-mentioned fault codes need to be mapped to. |
| severity_mapping | This sheet is used if the severity of faults with certain codes needs to be mapped to a particular SevOne NMS severity. code: Contains fault codes mapped to the severity mentioned in severity. severity: SevOne NMS severity that the faults with the above-mentioned fault codes need to be mapped to. Accepted keywords are emergency, alert, critical, error, warning, notice, info, or debug. |
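Before saving a severity_mapping section, it can help to check that each severity uses one of the accepted keywords listed in the table. The following is a minimal Python sketch of such a check, not part of the SDN plugin. It assumes the fault_config section has already been loaded into a Python dictionary, and that keywords are matched exactly in lower case. The function names and the fault codes F0532 and F1394 are hypothetical placeholders used only for illustration.

```python
# Keywords the guide lists as accepted SevOne NMS severities.
ACCEPTED_SEVERITIES = {
    "emergency", "alert", "critical", "error",
    "warning", "notice", "info", "debug",
}


def invalid_severities(fault_config):
    """Return severity values in severity_mapping that are not accepted keywords."""
    bad = []
    for entry in fault_config.get("severity_mapping") or []:
        severity = entry.get("severity")
        # Assumption: keywords are compared exactly, in lower case.
        if severity not in ACCEPTED_SEVERITIES:
            bad.append(severity)
    return bad


def mapped_severity(fault_config, fault_code):
    """Return the NMS severity mapped to fault_code, or None if it is unmapped."""
    for entry in fault_config.get("severity_mapping") or []:
        if fault_code in (entry.get("code") or []):
            return entry.get("severity")
    return None


# Hypothetical example data; the fault codes are placeholders.
example = {
    "severity_mapping": [
        {"code": ["F0532"], "severity": "critical"},
        {"code": ["F1394"], "severity": "major"},  # "major" is not an accepted keyword
    ]
}
```

Calling invalid_severities(example) flags the second entry, because major is not among the accepted keywords, while mapped_severity(example, "F0532") returns critical.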
Troubleshooting
Upgrade from SevOne NMS version 6.8.x to 7.2.1 fails with Service and Pod issues
The upgrade process from SevOne NMS version 6.8.x to 7.2.1 can result in a system initialization failure, potentially causing service outages and pods stuck in CrashLoopBackOff or Init status.
- Log in to SevOne NMS as the support user and enter the NMS container to view the list of all running containers.
ssh support@<NMS_IP_address> sudo su nms podman ps
- Locate the container ID of the faulty container in the displayed list.
- Using the same container ID, run the command to access the faulty container.
  podman exec -it <container_id> /bin/sh
- Run the revert script.
  ./sdn-plugin revert.sh
  Important: SevOne NMS version 7.2.1 includes a revert script that streamlines the upgrade process.
- Exit the container.
  exit
- Following successful execution, restart the impacted container.
  podman restart <container_id>
- To verify whether the container is running successfully, run the command below and monitor the container logs.
  podman logs <container_id>
After performing the revert operation and restarting the container, the system will successfully complete the upgrade process.
Where can the log files be found?
- Go to the log folder.
  cd /var/log/Apic
- Go to the version folder, for example, 7.2.0.
  cd 7.2.0
  You are now in /var/log/Apic/7.2.0.
- You will find a folder for each supported agent.
- If you are looking for the log file for an agent, for example, DeviceDescriptionAgent, then go to folder DeviceDescriptionAgent.
  cd DeviceDescriptionAgent
  You are now in /var/log/Apic/7.2.0/DeviceDescriptionAgent.
- You will now find the log file <device-name>_DeviceDescriptionAgent_7.2.0.log. For example, Apic_DeviceDescriptionAgent_7.2.0.log
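The folder and file naming described in the steps above can be expressed as a small helper. This is a minimal Python sketch that only composes the path from the pattern shown in the example. The function name agent_log_path and the base parameter are hypothetical, and the sketch does not check whether the file exists on the appliance.

```python
from pathlib import PurePosixPath


def agent_log_path(device_name, version, agent, base="/var/log"):
    """Build the log file path for one agent, following the pattern in the steps above."""
    # Folder layout from the example: <base>/<device-name>/<version>/<agent>
    folder = PurePosixPath(base) / device_name / version / agent
    # File name from the example: <device-name>_<agent>_<version>.log
    return folder / f"{device_name}_{agent}_{version}.log"


path = agent_log_path("Apic", "7.2.0", "DeviceDescriptionAgent")
print(path)
```

The base parameter is left configurable because this guide notes that, as of SDN 7.2.1, logs are written under /var/log/SDN. Whether the folder names below that directory follow the same pattern is an assumption to verify on your appliance.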
Why are the SDN TopN views unavailable on NMS version 7.0.x after upgrading from version 6.0.x?
If you deploy a new SevOne SDN solution in an NPM version 7.0.x environment that was upgraded from NMS version 6.0.x without configuring the SDN solution beforehand, the SDN TopN OOTB views will not be available after the upgrade. Because these views are absent, the SDN TopN OOTB reports in SevOne Data Insight will be inaccessible.
Do not import the .spk file for the SDN TopN views before configuring the SDN solution and adding the SDN devices. Execute these steps only after the SDN device has been enabled with the SDN plugin.
To import the .spk file for the SevOne SDN Solution TopN views, perform the steps on the cluster leader appliance of the NMS cluster running version 7.0.x.
- Enter the SDN container to execute the commands.
  podman exec -it <nms-container_id_or_name> /bin/sh
  Example
  podman exec -it nms-collections-sdn-plugin /bin/sh
- Copy the spk files from the /opt/reports/OOTB directory to the /config directory.
  cp -r /opt/reports/OOTB/* /config/
- Exit from the current container.
  exit
- Log in to the NMS container.
  podman exec -it nms-nms-nms /bin/sh
- To import the spk files, run the commands shown below.
  SevOne-import --file config/SDNSolution-ootb-reports-NMS.spk
  SevOne-import --file config/SDNSolution-ACI-Capacity-reports-NMS.spk
  Example:
  SevOne-import --file config/SDNSolution-ACI-Capacity-reports-NMS.spk
  * Verifying the package manifest...
  * Done.
  Allow overwrite: no
  Import tags only: no
  Dry run: no
  Output CSV: no
  Importing items for core/TopnView.
  Ignoring existing Top N View 'SDN Solution - ACI Capacity Average'.
  Ignoring existing Top N View 'SDN Solution - ACI Capacity Maximum'.
  Ignoring existing Top N View 'SDN Solution - ACI Capacity Minimum'.
  Ignoring existing Top N View 'SDN Solution - ACI Switch Capacity Average'.
  Ignoring existing Top N View 'SDN Solution - ACI Switch Capacity Maximum'.
  Ignoring existing Top N View 'SDN Solution - ACI Switch Capacity Minimum'.
  === Import complete
Why does the upgrade from NMS version 6.8.x to 7.0.x result in a loss of functionality and data, after migrating from SDN solution to plugin mode?
The loss of functionality and data during the upgrade from NMS version 6.8.x to 7.0.x, when the SDN solution fails to transition to plugin mode, could be due to specific configurations that were not present or correctly set up in version 6.8.x, leading to a malfunction when attempting to migrate.
- Using ssh, log in to SevOne NMS appliance as root.
  ssh root@<SevOne NMS appliance IP address>
- Create the working directory and change to it.
  mkdir /tmp/clean_migration
  cd /tmp/clean_migration
- Download the following (latest) files from IBM Passport Advantage (https://www.ibm.com/software/passportadvantage/pao_download_software.html) via Passport Advantage Online. However, if you are on a legacy / flexible SevOne contract and do not have access to IBM Passport Advantage but have an active Support contract, please contact IBM SevOne Support for the latest files.
  In this case, you must download the sevone_solutions_sdn_cleanMigration.tar.gz file and place it in the /tmp/clean_migration directory.
- Retrieve the container id.
  podman ps
  In this example, the container id is identified as 00bfd5a708e4.
- Copy the sevone_solutions_sdn_cleanMigration.tar.gz file to the /tmp folder of the container.
  sudo podman cp sevone_solutions_sdn_cleanMigration.tar.gz <container_id>:/tmp
  Example:
  sudo podman cp sevone_solutions_sdn_cleanMigration.tar.gz 00bfd5a708e4:/tmp
- Enter the SDN container to execute the commands.
  podman exec -it <container_id> /bin/sh
- Change directory to the /tmp folder.
  cd /tmp/
- Extract the binary file from the .tar file.
  tar -xvf sevone_solutions_sdn_cleanMigration.tar.gz
- Execute the revert command.
  ./cleanMigration revert
- After executing the previous commands, restart the container using the following command.
  podman restart <container_id>
  Note: Wait for at least 15 minutes after restarting the container to allow for the changes to take effect.
SelfMon Policy 'Trigger Condition' modifications do not persist after first save on SevOne NMS
Modifications made to the Policy Trigger Condition for SDN SelfMon policies do not persist after the first save in NMS. Upon refreshing the browser or re-opening the policy, the updated condition is not reflected. However, saving the modification a second time typically resolves the issue.
This issue is causing false positives in alerting.
- Using a web browser of your choice, log in to SevOne NMS cluster.
- From Events drop-down, click Configuration > Policy Browser.
- Search for SDN.
- Select siteN::SDN::SelfmonAvailability.
- Select Trigger Conditions tab.
- Under Conditions, edit the condition. By default, the values are set to:
  - Indicator = availability
  - Type = Time since newest data point
  - Threshold = 7200 seconds; 7200 seconds is the default OOTB value. If the threshold value is changed, the new value does not persist the first time.
    Workaround: To persist the value, you need to set the new value again.
  - Custom Message = Agent $objectName not available as $indicatorName is less than 100%, it is no longer running and/or is having communication issues writing to the NMS
- Click Save button.
For details, please refer to Support Ticket DT444645.
| Policy Name | Trigger Condition | Clear Condition | Description |
|---|---|---|---|
| SelfmonAvailability | 7200 seconds since newest data point | 100 seconds since newest data point | SDN Selfmon availability |
Configuration Variables
| YAML setting | Default Value | Description |
|---|---|---|
| msp_name | ORGANIZATION | MSP name for this instance. MSP is a grouping of one or more tenants. For example, ORGANIZATION. |
| version | 7.2.0 | Version of the build. For example, 7.2.0 |
| run_agents_immediately_and_exit_collector | true | Runs all the agents in the include list sequentially and then exits the collector. |
| log.level | debug | Log output minimum level. May be one of: debug, info, warning, error. |
| agent.include | | Set to an array of agent names to explicitly include. |
| vendor.site.name | (required) - <enter value> | Provide the site name. |
| vendor.site.apic_url | (required) - <enter value> | APIC IP address. For example, https://192.168.1.2 |
| vendor.site.apic_uid | (required) - <enter value> | APIC username. |
| vendor.site.apic_password | (required) - <enter value> | APIC password. |
| vendor.site.device_name_prefix | Site Name | Common prefix name for all devices. |
| vendor.site.timeout | 30s | The amount of time to wait before timing out when attempting to connect to the APIC. |
| vendor.is_multi_site_mode | false | If set to True, run the collector in multisite mode. Default setting is false. |
| vendor.no_prefix | false | If set to true, prefixes will be provided to device names. |
| vendor.page_size | 10000 | The page size to use for paginating API requests. |
| vendor.sleep_time | 200 | The time to sleep after APIC API queries in milliseconds. |
| vendor.dn_order | true | Request objects to be sorted by DN in the APIC API query. |
| vendor.do_nodes_traffic | true | Enable Node device's network statistics. |
| vendor.fault_prefix | "" | Used to specify a prefix text in the summary field of alerts that are created from ACI faults. |
| vendor.do_pod_traffic | true | Enable POD device's network statistics. |
| vendor.do_bytes | true | Collect statistics in bytes. |
| vendor.do_packets | false | Collect statistics in packets. |
| vendor.do_virtual_traffic | false | Poll for network statistics of VMs and HVs. |
| vendor.skip_tunnel_if | true | Skip polling the POD for Tunnel Interfaces. |
| vendor.skip_off_vm | true | Skip VMs that have been powered off. |
| vendor.skip_bad_nic | true | Skip VM network interfaces with an IP address of 0.0.0.0. |
| vendor.pod_agent.schedule | "" | Poll pod agent devices every 10 mins. |
| vendor.node_agent.schedule | "" | Poll node agent devices every 10 mins. |
| vendor.pod_extended_agent.schedule | "" | Poll pod extended agent devices every 10 mins. |
| vendor.node_interface_agent.schedule | "" | Poll node interface agent devices every 10 mins. |
| vendor.node_extended_agent.schedule | "" | Poll node extended agent devices every 10 mins. |
| vendor.external_switch_agent.schedule | "" | Poll external switch agent devices every 10 mins. |
| vendor.hypervisor_and_virtual_machine_agent.schedule | "" | Poll hypervisor and virtual machine agent devices every 10 mins |
| vendor.topology_agent.schedule | "" | Poll topology agent devices every 10 mins |
| vendor.object_group_agent.schedule | "" | Poll object group agent devices every 10 mins. |
| nms.api.host | "" | The hostname or IP address for SOA and REST API endpoints. |
| nms.api.v2_api_key | "" | API key used for NMS REST API authentication. |
| nms.api.v3_api_key | "" | API key used for NMS SOA authentication. |
| nms.api.insecure_tls_connection | true | Set to true to enable an insecure TLS connection by skipping certificate verification. |
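
The settings marked (required) in the table can be checked before the collector is started. The following is a minimal Python sketch of such a check, not part of the SDN plugin. It assumes the .yaml file has already been loaded into a nested Python dictionary, that dotted names in the table map to nested keys, and that the APIC URL key is spelled apic_url as in the sample files. The names lookup and missing_required are hypothetical.

```python
# Settings the table marks as (required), using the key spelling from the sample files.
REQUIRED_SETTINGS = (
    "vendor.site.name",
    "vendor.site.apic_url",
    "vendor.site.apic_uid",
    "vendor.site.apic_password",
)


def lookup(config, dotted_name):
    """Walk a nested dict using a dotted setting name; return None if it is absent."""
    node = config
    for key in dotted_name.split("."):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def missing_required(config):
    """Return the required settings that are absent or set to an empty string."""
    return [name for name in REQUIRED_SETTINGS if lookup(config, name) in (None, "")]


# Hypothetical example: apic_uid is left blank, as in default-<device-name>.yaml.
example_config = {
    "vendor": {
        "site": {
            "name": "Apic",
            "apic_url": "https://192.168.1.2",
            "apic_uid": "",
            "apic_password": "example-password",
        }
    }
}
print(missing_required(example_config))
```

In this example only vendor.site.apic_uid is reported, because it is present but empty.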