Automatic failover
DNS failover refers to a traffic steering configuration in which the platform automatically diverts DNS traffic away from down or unavailable endpoints, opting for alternatives to ensure the high availability of your applications and services during an outage. This configuration is recommended if multiple endpoints host the same application or provide the same service. Automating this process makes it easy to adapt quickly to changing network conditions—minimizing downtime and the manual effort to maintain this configuration.
On the IBM® NS1 Connect® platform, critical components of an automatic failover configuration include:
- An NS1 monitor or third-party data source tracks an endpoint's up/down status.
- A data feed connects the monitor to the up/down status of the corresponding DNS answer, enabling automatic updates.
- A Filter Chain containing the Up filter eliminates unavailable answers when making traffic steering decisions. Typically, the Up filter is combined with other traffic steering filters. Some examples:
- Up + Priority + Select First N supports an active-passive failover configuration.
- Up + Shuffle + Select First N supports active-active failover with round-robin traffic distribution.
- Up + Geotarget Regional + Select First N supports active-active failover with geographic-based distribution.
How it works
Suppose you have an A record with multiple answers—each specifying the IPv4 address of a host on which an application or service is accessible. To configure automatic failover, you create a monitor for each endpoint, connect each monitor to its corresponding DNS answer, and create a Filter Chain within the record that includes the Up filter.
Each monitor frequently probes its designated endpoint from one or more monitoring regions to determine whether it should be considered up or available based on the up conditions defined in the monitor settings. If the results of a probe fail to meet these conditions, the endpoint is considered down. In response, the data feed connecting the monitor to its corresponding answer pushes an update, automatically changing the answer's up metadata value to false.
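The flow from probe result to answer metadata can be sketched in a few lines of Python. This is an illustrative model only, not the NS1 Connect API: the `Answer` and `DataFeed` classes, the `evaluate_probe` function, and the up condition (HTTP 200 in under 500 ms) are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Answer:
    """A DNS answer with its metadata (hypothetical model)."""
    address: str
    meta: dict = field(default_factory=lambda: {"up": True})


@dataclass
class DataFeed:
    """Hypothetical stand-in for a data feed tied to one answer."""
    answer: Answer

    def push(self, is_up: bool) -> None:
        # Update the answer's up metadata with the latest monitor status.
        self.answer.meta["up"] = is_up


def evaluate_probe(status_code: int, latency_ms: float) -> bool:
    """Assumed up condition: HTTP 200 returned in under 500 ms."""
    return status_code == 200 and latency_ms < 500


answer = Answer("192.0.2.10")
feed = DataFeed(answer)

# A healthy probe keeps the answer marked as up.
feed.push(evaluate_probe(status_code=200, latency_ms=42.0))
print(answer.meta["up"])  # True

# A failing probe flips the up metadata value to false.
feed.push(evaluate_probe(status_code=503, latency_ms=42.0))
print(answer.meta["up"])  # False
```

In the platform itself, the monitor and data feed perform this update for you; the sketch only shows which value changes and when.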
As the platform receives incoming queries for the record's domain and type, it references the Filter Chain to determine the best answer(s) to return. The Filter Chain must include the Up filter to facilitate automatic failover, but it is typically used in conjunction with other filters to apply secondary processing. Without additional filters, you risk directing all traffic to the same answer as long as it is up.
For example, if the Filter Chain contains the Up, Shuffle, and Select First N filters (in that order), then incoming queries would be processed as follows:
- The Up filter eliminates any answer marked as down from the answer pool before passing the list to the next filter. Note that because you have automatic updates configured, each answer's up/down status reflects the monitored endpoint.
- (Optional) The Shuffle filter randomizes the order of answers in the list. Note that many filters can be used after the Up filter to achieve more even traffic distribution among the available endpoints or to favor specific endpoints over others based on some conditions.
- The Select First N filter eliminates all but the first N (number) of answers in the list. In most cases, and by default, N is set to 1, meaning only the first answer in the list remains. This filter is placed at the end of most Filter Chain configurations to ensure only one answer is returned to requesting clients.
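The three steps above can be modeled as a small pipeline. The following Python sketch is a simplified simulation under stated assumptions, not the platform's implementation: the function names and the dictionary layout of each answer are hypothetical, and the addresses come from a documentation-only range.

```python
import random


def up_filter(answers):
    """Drop every answer whose up metadata is false."""
    return [a for a in answers if a["meta"]["up"]]


def shuffle_filter(answers, rng):
    """Return the answers in random order without mutating the input."""
    shuffled = list(answers)
    rng.shuffle(shuffled)
    return shuffled


def select_first_n(answers, n=1):
    """Keep only the first n answers in the list."""
    return answers[:n]


def resolve(answers, rng, n=1):
    """Apply Up, then Shuffle, then Select First N, in that order."""
    remaining = up_filter(answers)
    remaining = shuffle_filter(remaining, rng)
    return select_first_n(remaining, n)


answers = [
    {"address": "192.0.2.10", "meta": {"up": True}},
    {"address": "192.0.2.20", "meta": {"up": False}},
    {"address": "192.0.2.30", "meta": {"up": True}},
]

rng = random.Random()
result = resolve(answers, rng)
# Prints 192.0.2.10 or 192.0.2.30; the down answer 192.0.2.20 is never returned.
print(result[0]["address"])
```

Because the Up filter runs first, the down endpoint is removed before any randomization takes place, so later filters only ever see available answers.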
At a minimum, the Filter Chain must include the Up filter to support automatic failover, but most Filter Chains leverage additional filters based on the desired outcome. For example, the Select First N filter is typically placed at the end of the Filter Chain so that only one answer is returned to the requesting client. Further, additional filters, such as randomization or geographic-based filters, can be used to achieve more even traffic distribution among the available endpoints or to favor specific endpoints over others based on some conditions.
Configuring automatic failover
The following steps assume you already have a DNS record with multiple answers for which you want to configure automatic failover.
Create an NS1 monitor or a third-party data source from one of the supported monitoring integrations for each endpoint represented by an answer within the DNS record.
The type of NS1 monitor you create depends on the nature of the endpoint you are monitoring. Refer to Create a monitoring job to learn about configuring monitoring jobs for DNS, HTTP/S, PING (ICMP), and TCP.
Alternatively, you can configure a data source from a third-party monitoring service.
A data feed is a mechanism to push updates from an NS1 monitor or third-party data source to the corresponding answer. A data feed is automatically created when you create an NS1 monitor via the portal. If you are using a third-party data source, you must create the data feeds manually.
To connect each data feed to the corresponding answer:
- Navigate to the DNS record on which you want to configure automatic failover: click Zones, click the name of the zone containing the record, and then click the name of the record to view its details, including the list of answers.
- Click the Overflow menu icon to the right of one of the answers in the list, and click Edit Answer Metadata.
- Click the Up/down metadata setting in the Setting column.
- Click a feed in the Feeds column to display a list of available data feeds from NS1 monitors or third-party data sources in the Available column.
- Select the data feed corresponding to the endpoint represented by this answer.
- Click OK on the bottom right of the window.
- Repeat these steps for each answer in the record.
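Because the steps are repeated per answer, it is easy to miss one or to attach the same feed twice. The following Python sketch shows one way to sanity-check such a mapping. It assumes a hand-maintained list of answers and feed names; the `up_feed` key, the feed names, and both helper functions are hypothetical and do not correspond to any NS1 Connect API.

```python
from collections import Counter


def answers_missing_feed(answers):
    """Return addresses of answers with no feed attached to up/down."""
    return [a["address"] for a in answers if not a.get("up_feed")]


def shared_feeds(answers):
    """Return feed names that are attached to more than one answer."""
    counts = Counter(a["up_feed"] for a in answers if a.get("up_feed"))
    return sorted(name for name, count in counts.items() if count > 1)


# Hypothetical inventory of answers and the feed connected to each one.
answers = [
    {"address": "192.0.2.10", "up_feed": "monitor-app-east"},
    {"address": "192.0.2.20", "up_feed": "monitor-app-west"},
    {"address": "192.0.2.30", "up_feed": None},
]

print(answers_missing_feed(answers))  # ['192.0.2.30']
print(shared_feeds(answers))          # []
```

An answer reported by the first check would keep whatever up/down value it was last given, since no monitor updates it.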
Complete the following instructions to configure a Filter Chain that supports an active-active or active-passive failover configuration.
- On the record details page, click Create Filter Chain.
- Click the plus sign (+) next to the Up filter.
- (Recommended) Add one or more filters to the middle of the Filter Chain to apply secondary processes. Doing so can help prevent one endpoint from being overloaded when multiple or all answers are available.
- If configuring an active-passive configuration with one primary endpoint and one or more backup endpoints, use the Priority filter and enter a priority metadata value for each answer. Note that lower numbers indicate a higher priority; for example, 1 is the highest priority.
Attention: The order of answers on the Record details page indicates the priority order unless the priority is defined in the answer metadata. If you do not override the priority or use a second filter after the Up filter in this chain, then the platform will always return the first answer that appears on this page if it is available.
- If configuring an active-active configuration where all endpoints should share DNS traffic, use another filter to apply secondary filtering. For example, use the Shuffle filter to distribute traffic evenly across your endpoints, a Weighted Shuffle filter to skew traffic toward specific endpoints more often, a geographic filter to favor endpoints that are geographically proximate to the requester, or any of the other filters to achieve the desired outcome.
Note: If you apply a filter that references answer metadata, you must edit the answer metadata manually or connect a data source to update that field automatically.
- (Recommended) Add the Select First N filter at the end of the Filter Chain to control the number of responses returned to the requesting client.
- Click Save Filter Chain.
This completes the automatic failover configuration process. When a client queries the DNS record, any endpoints marked as down are removed from the answer pool to ensure the requester can connect to your application or service.
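To illustrate the active-passive case, the following Python sketch simulates a chain of Up, Priority, and Select First N. It is a simplified model under stated assumptions rather than the platform's implementation: the function names and answer layout are hypothetical, and the Priority filter is modeled as a sort that places the lowest priority number first.

```python
def up_filter(answers):
    """Drop every answer whose up metadata is false."""
    return [a for a in answers if a["meta"]["up"]]


def priority_filter(answers):
    """Order answers so the lowest priority number comes first."""
    return sorted(answers, key=lambda a: a["meta"]["priority"])


def resolve(answers, n=1):
    """Apply Up, then Priority, then keep the first n answers."""
    return priority_filter(up_filter(answers))[:n]


answers = [
    {"address": "192.0.2.10", "meta": {"up": True, "priority": 1}},  # primary
    {"address": "192.0.2.20", "meta": {"up": True, "priority": 2}},  # backup
]

# While the primary is up, it is always the answer returned.
print(resolve(answers)[0]["address"])  # 192.0.2.10

# When the primary's monitor reports it down, traffic fails over to the backup.
answers[0]["meta"]["up"] = False
print(resolve(answers)[0]["address"])  # 192.0.2.20
```

Once the primary's up metadata returns to true, the same chain selects it again without any manual change.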