Creating a monitoring job

You can create monitoring jobs to check the up and down status of application endpoints and services through PING (ICMP), TCP, HTTP(S), and DNS internet protocols.

Before you begin

  • You must have permissions to create monitoring jobs.
  • If you want to set notifications, your user or team permissions must allow you to manage notifier lists.
  • The number of IBM® NS1 Connect® monitoring regions available to you depends on your plan type. You can view your plan type and usage limits on the Usage page.
  • If you are creating a PING (Internet Control Message Protocol, or ICMP) monitoring job, verify that your firewall settings allow ICMP/ICMPv6 packets to reach the device under test; otherwise, the ping could fail.
  • You must create a notifier list if you want to send notifications when the endpoint is down or when regional failures occur, or if you want to send automatic updates to the metadata of an answer that you connect the monitor to.

About this task

You can create a monitoring job to use any of the following internet protocols to check the availability of your application endpoints and services.

  • PING (ICMP/ICMPv6 echo): Check the availability of network devices by sending an ICMP echo packet and waiting for a response. The monitor then assesses the connection health based on specified criteria, such as round-trip time (RTT) or percent packet loss.
  • TCP: Test the availability and performance of network services that use TCP for communication, such as email or file transfer. The monitoring job attempts to establish a connection with a server on a specific port, and then sends and receives data to ensure that the connection is functioning correctly. For example, if testing a mail server (port 25 or 587), NS1 Connect sends the string EHLO and validates that the response is as expected. If the response is not what is expected or if the response time is too long, the monitoring job marks the endpoint as down. Optionally, you can configure the probe to connect using Transport Layer Security (TLS) to encrypt the connection.
  • HTTP(S): Check website availability. NS1 Connect sends a request over HTTP or HTTPS to a server and waits for a response. Then, the monitoring job assesses the connection health based on specified criteria, including the HTTP headers and body in the response.
  • DNS: Check the availability and response time of a DNS record from a name server or resolver. This can be used to check, for example, if an external public resolver is serving the correct data to your end users.
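
The TCP check described above (connect, send a string, validate the reply) can be sketched in a few lines of Python. This is only an illustration of the general technique, not how NS1 Connect implements its probes; the function name tcp_check, its parameters, and the default timeouts are assumptions made for this example.

```python
import socket
import time


def tcp_check(host, port, send=None, expect=None,
              connect_timeout=2.0, response_timeout=2.0):
    """Run one TCP probe and return (is_up, detail).

    All names and defaults here are illustrative, not NS1 Connect settings.
    """
    try:
        started = time.monotonic()
        # Connect timeout: how long to wait for the TCP handshake.
        with socket.create_connection((host, port), timeout=connect_timeout) as sock:
            connect_ms = (time.monotonic() - started) * 1000.0
            # Response timeout: how long to wait for data after connecting.
            sock.settimeout(response_timeout)
            if send is not None:
                sock.sendall(send)
            if expect is None:
                return True, f"connected in {connect_ms:.1f} ms"
            # Simplification: read a single chunk; a real probe may read more.
            reply = sock.recv(4096)
            if expect in reply:
                return True, f"matched reply after connecting in {connect_ms:.1f} ms"
            return False, "unexpected reply"
    except OSError as exc:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False, f"probe failed: {exc}"
```

A real SMTP server sends a greeting before it answers EHLO, so a production-grade probe would likely need to read more than one chunk before comparing the reply.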

When you create a monitoring job, you configure the following:

  • The type of monitoring job and the settings specific to the type of monitoring job
  • The conditions to determine if an endpoint is available
  • The regions from where the probes assess availability
  • A policy to establish when the endpoint is considered down
  • Notifications (optional)

By default, the monitoring probes connect over IPv4; however, you can choose to connect over IPv6.

Procedure

  1. Click Monitors.
  2. Click Create a monitoring job.
  3. In Monitoring job name, enter a name for the monitoring job.
    Make sure that the job name is more than 2 characters.
Select the type of monitoring job and configure the settings specific to the type of monitoring job:
  1. From the Monitoring job type drop-down list, select the method to use to check your endpoints.
    • PING
    • TCP
    • HTTP(S)
    • DNS
  2. Complete the steps for the type of monitoring job that you selected:
    Type of monitoring job | Settings
    PING
    1. In IP address or hostname to ping, enter the IPv4 or IPv6 address to probe using ICMP echo packets.
    2. To change the number of packets to send, in Number of packets to send, enter the number. The default is 4. A higher number of packets results in a more accurate RTT calculation and packet loss statistics.
    3. To specify the amount of time between subsequent ICMP echo request packets, in Time between packets, enter the amount in milliseconds.
    4. To specify the amount of time before an endpoint is marked as down, in PING timeout, enter the amount in milliseconds. If the RTT is greater than the ping timeout, that ICMP echo packet is counted as a timeout. This metric influences the up condition for percent packet loss. For example, if you set the number of packets to four and one of the packets takes longer than the ping timeout value that you set, then 25% (1 out of 4) of the packets are considered lost.
    TCP
    1. In IP address or hostname, enter the IPv4 or IPv6 address or hostname of the endpoint to monitor.
    2. To negotiate a Secure Sockets Layer (SSL) connection before sending or receiving protocol data, select the Connect with SSL checkbox.
    3. To enforce verification of TLS certificates, select the Add TLS verify checkbox. If the monitoring job is failing due to a certificate error, such as an expired certificate that you don't want to use as a failure condition, clear this checkbox.
    4. In TCP port, enter the port number to use to connect to the endpoint.
    5. To specify the maximum amount of time allowed for a data transfer after the connection is established, in Response timeout, enter the amount in milliseconds.
    6. To specify a string to send to the endpoints when the connection is established, in String to send, enter the string.
    7. To specify the amount of time that the monitoring job spends trying to establish a connection with an endpoint (host:port), in Connect timeout, enter the amount in milliseconds. If there is no response or if the endpoint rejects the connection within this period of time, the monitoring job is considered down.
    HTTP(S)
    1. In URL, enter the URL, including the protocol, to monitor. The hostname in the URL is used to determine the IP address of the endpoint to check. If you want to check the health of a specific virtual hosting server, specify that hostname in the URL; for example, https://example.amazonaws.com/healthcheck.php for a server hosted on AWS.
    2. To enforce verification of TLS certificates, select the Add TLS Verify checkbox. If the monitoring job is failing due to a certificate error, such as an expired certificate that you don't want to use as a failure condition, clear this checkbox.
    3. To allow connection to an HTTP redirect, select the Follow HTTP redirects when enabled checkbox. The status that shows for the monitoring job is for the new URL. To not follow redirects and show the status of the monitoring job for the original URL entered, clear the checkbox.
    4. To include a string as the Host request header in the HTTP transaction, in Virtual host, enter the string. For example, if virtual host = www.example.com, then Host: www.example.com is included in the HTTP request header. The virtual host is also used to support testing hosts that use server name indication (SNI). The monitor adds the virtual host to the TLS handshake process so that it receives the correct certificate, enabling the rest of the TLS handshake to proceed as normal.
    5. To include a brief description of the monitoring job in the User-Agent request header, in User agent, enter the description.
    6. To include a value for the Authorization request header in the HTTP transaction, such as a bearer token or API key, in Authorization header, enter the value.
    7. To specify an HTTP method for the monitoring job, in HTTP method, enter HEAD, GET, or POST.
    8. To specify the amount of time that the monitoring job spends trying to establish a connection with an endpoint (host:port), in Connection timeout, enter the amount in seconds. If there is no response or if the endpoint rejects the connection within this period of time, the monitoring job is considered down.
    9. To enter the amount of time that NS1 Connect waits to read the response body after the connection is established and the HTTP request is sent, in Idle timeout, enter the amount in seconds.
    DNS
    1. In Query domain, enter the domain name that the monitoring job queries.
    2. In Nameserver IP or hostname, enter the IP address (IPv4 or IPv6) or fully qualified domain name (FQDN) of the DNS server to use to query the domain.
    3. To specify the DNS record type to query in the domain, enter the type in Query type.
    4. To specify the port number in the DNS query, in DNS port, enter the number.
    5. To specify the maximum amount of time to wait for output before the endpoint is considered down, in Response timeout, enter the amount in milliseconds.
  3. To use IPv6 to connect to a server, select the Connect over IPv6 checkbox; otherwise, the probe connects over IPv4.
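
The packet-loss arithmetic described for the PING timeout setting can be worked through with a small Python sketch. The function name ping_stats and the sample RTT values are made up for illustration; only the rule itself (a packet whose RTT exceeds the timeout counts as lost) comes from the text above.

```python
def ping_stats(rtts_ms, timeout_ms):
    """Return (percent_packet_loss, average_rtt_ms) for one ping run.

    rtts_ms holds one RTT per packet in milliseconds, or None when no
    reply arrived at all. Names are illustrative assumptions.
    """
    # A packet counts as answered only if its RTT is within the timeout.
    answered = [r for r in rtts_ms if r is not None and r <= timeout_ms]
    lost = len(rtts_ms) - len(answered)
    loss_pct = 100.0 * lost / len(rtts_ms)
    avg_rtt = sum(answered) / len(answered) if answered else None
    return loss_pct, avg_rtt


# Four packets, one of which exceeds a 500 ms timeout.
loss_pct, avg_rtt = ping_stats([12.0, 15.0, 950.0, 11.0], timeout_ms=500)
print(loss_pct)  # 25.0
```

Note that the slow packet is also excluded from the average RTT in this sketch, which is a design choice of the example rather than documented behavior.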
Define the conditions to determine if an endpoint is available:
  1. Click Add condition.
  2. Select a metric and an operator, and enter a target value to create a logical statement so that NS1 Connect can determine if an endpoint is up.
    If you are creating a PING monitoring job, you can set conditions for RTT and for percent packet loss.
    If you are creating a TCP monitoring job, you can set conditions for output received from the connection and the time for the connection to open.
    If you are creating an HTTP(S) monitoring job, you can set conditions for HTTP(S) response body and HTTP(S) status code.
    If you are creating a DNS monitoring job, you can set conditions for number of records, record RDATA, and response time.
  3. Click Add.
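
To make the idea of a condition as a logical statement concrete, the following Python sketch evaluates a list of (metric, operator, target) triples against measured values. The metric names (rtt_ms, loss_pct), the operator symbols, and the rule that every condition must hold for the endpoint to be up are assumptions made for this example, not the product's actual identifiers.

```python
import operator

# Hypothetical operator table; the symbols are chosen for this sketch.
OPERATORS = {
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
    "==": operator.eq,
    "!=": operator.ne,
    "contains": lambda value, target: target in value,
}


def endpoint_is_up(conditions, metrics):
    """Return True when every (metric, operator, target) condition holds.

    Assumption: all conditions must be true for the endpoint to be up.
    """
    return all(
        OPERATORS[op](metrics[name], target)
        for name, op, target in conditions
    )


conditions = [("rtt_ms", "<", 200), ("loss_pct", "<=", 25)]
print(endpoint_is_up(conditions, {"rtt_ms": 12.7, "loss_pct": 25.0}))  # True
print(endpoint_is_up(conditions, {"rtt_ms": 12.7, "loss_pct": 50.0}))  # False
```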
Select the geographic regions from where to assess endpoint availability:
  1. Under Monitoring regions, select the global regions from where the probes assess endpoint availability.
    The number of monitoring regions available to you depends on your plan type.
Select a policy to establish when the endpoints are down:
  1. Under Policy, choose one of the following methods to determine if the endpoint is down.
    Policy | Description
    Quorum | The endpoint is considered down if health checks conducted from a majority of regions indicate that the endpoint is down.
    All | The endpoint is considered down if health checks from all regions indicate that the endpoint is down. In other words, if health checks from one or more regions indicate that an endpoint is up, the endpoint is considered up.
    One | The endpoint is considered down if a health check from one of the regions indicates that the endpoint is down. In other words, if health checks from all of the regions indicate that the endpoint is up, the endpoint is considered up.
  2. In Interval, enter the amount of time in seconds between each health check in the selected regions.
  3. Optional: To automatically conduct a second verification before changing the status of an endpoint, select the Rapid recheck checkbox. Selecting this option can help prevent false positives.
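
The three policies can be compared side by side with a short Python sketch that combines per-region results into a single status. The function name, the lowercase policy strings, and the region labels are hypothetical; the sketch also assumes that "majority" means strictly more than half of the selected regions.

```python
def endpoint_is_down(policy, region_down):
    """Combine per-region results into one status.

    region_down maps a region label to True when that region's health
    check reports the endpoint as down. Names are illustrative only.
    """
    total = len(region_down)
    if total == 0:
        raise ValueError("at least one region is required")
    down = sum(1 for is_down in region_down.values() if is_down)
    if policy == "quorum":
        # Assumption: a majority is strictly more than half the regions.
        return down > total / 2
    if policy == "all":
        return down == total
    if policy == "one":
        return down >= 1
    raise ValueError(f"unknown policy: {policy}")


results = {"region_a": True, "region_b": True, "region_c": False}
print(endpoint_is_down("quorum", results))  # True
print(endpoint_is_down("all", results))     # False
print(endpoint_is_down("one", results))     # True
```

With two of three regions reporting a failure, Quorum and One mark the endpoint as down, while All still treats it as up because one region can reach it.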
Set override status:
  1. Optional: To set this monitoring job to not probe, toggle the Pause probing switch to the on position.
Set the type of notifications and the frequency of notifications:
  1. To set notifications, click Next.
  2. To turn on notifications, toggle the Notifications switch to the On position.
    With notifications turned on, notifications are sent when an endpoint is down and when regional failures occur. Also, automatic updates are sent to the metadata of any answer that you connect the monitor to.
  3. In Notifier list, select the type of notifier list to use.
    • To send automatic updates to the metadata of an answer, select a data feed notifier.
    • To alert systems and stakeholders that an endpoint is down, select an external notifier (email, Slack, PagerDuty, or custom webhook).
    Attention: If you don’t select a notifier list containing the data feed notifier, one is automatically generated when you connect the monitoring job to the metadata of an answer in a DNS record.
  4. Use the Automatic recovery checkbox to control up event notifications. This setting does not affect down events. When it is enabled, NS1 Connect notifies all connected notifiers when probe results meet the monitoring policy conditions. If the NS1 Monitoring Data Feed is included, the DNS answer metadata automatically returns to up.
    If you prefer to manually force a monitoring job up after it has gone down, clear the Automatic recovery checkbox.
  5. Optional: To send a notification to systems or stakeholders in the notifier list if NS1 Connect detects a failure in any of the selected monitoring regions, select the Notify about regional failure checkbox.
    NS1 Connect sends a notification even if the status of the monitoring job is up.
  6. Optional: To set the length of time that NS1 Connect waits before sending the monitoring alert, enter the time in seconds in Notify delay.
    Enter 0 to send an alert immediately when the monitoring job detects that the endpoint is down.
  7. Optional: To set the length of time between repeated notifications sent until the status of the monitoring job returns to an up state, enter the time in seconds in Notify repeat.
    Repeat notifications are reminders that the monitored endpoint is still down. To send one notification to external notifiers for each down event, enter 0.
  8. Optional: To enter details about this monitoring job for internal reference, type the details in Notes.
  9. Click Create.
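
The interaction between Notify delay and Notify repeat can be illustrated with a small Python sketch that lists when down alerts would go out for one outage. The function name and the timeline are invented for the example, and the sketch assumes that no alert is sent if the endpoint recovers before the delay elapses; actual NS1 Connect behavior may differ.

```python
def notification_times(down_at, up_at, notify_delay, notify_repeat):
    """Return the times, in seconds, at which down alerts would be sent.

    down_at and up_at bound a single outage. All names are illustrative.
    """
    first = down_at + notify_delay
    if first >= up_at:
        # Assumption: recovery before the delay elapses suppresses the alert.
        return []
    if notify_repeat == 0:
        # A repeat value of 0 means one notification per down event.
        return [first]
    # Otherwise, repeat until the endpoint returns to an up state.
    return list(range(first, up_at, notify_repeat))


# Outage from t=0 to t=1000, with a 60 s delay and a 300 s repeat interval.
print(notification_times(0, 1000, notify_delay=60, notify_repeat=300))
# [60, 360, 660, 960]
```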

What to do next

To enable automatic updates, you can connect the monitoring job to a DNS answer.

If you paused probing for this monitoring job, you can start probing when you are ready.