What is DNS failover?

22 October 2024

 

 

Authors

Chrystal R. China

Writer

What is DNS failover?

DNS failover is an automated routing technique that redirects traffic from failed or unreachable servers to operational, available servers.

Typically provided by cloud-based authoritative DNS providers, failover services use health checks and monitoring nodes to assess DNS server health. If a server responds appropriately to monitoring nodes during a health check, user queries are routed to and resolved by that server. If, however, the server is unavailable (due to an unresponsive host or a server outage), failover services withdraw its IP address and redirect network traffic to a new IP address with a working server. 

Failover works through the domain name system (DNS), which converts human-readable domain names into the computer-readable IP addresses devices use to identify each other on the network.

In a traditional DNS infrastructure, domain names drive traffic to IP addresses that hold the correct resources for addressing user queries. When a user enters a domain name, their computer communicates with a DNS resolver. The resolver traverses the DNS to reach an authoritative name server (typically, the primary DNS server), which holds the IP address for the requested website. The server then converts domain names into corresponding IP addresses and sends the queried information back to the user.
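
The resolution flow above can be sketched in a few lines of Python. This is a toy model rather than a real resolver: the `CachingResolver` class, the `AUTHORITATIVE` table and the 60-second TTL are illustrative assumptions, and the address comes from a range reserved for documentation. It shows a resolver asking an authoritative source once and then answering from cache until the record expires.

```python
import time
from typing import Callable, Dict, Tuple

# Hypothetical authoritative data: name -> (IP address, TTL in seconds).
AUTHORITATIVE: Dict[str, Tuple[str, int]] = {
    "www.example.com": ("192.0.2.10", 60),
}


class CachingResolver:
    """Toy resolver: answers from cache until the record's TTL expires."""

    def __init__(self, authoritative: Dict[str, Tuple[str, int]],
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.authoritative = authoritative
        self.clock = clock
        self.cache: Dict[str, Tuple[str, float]] = {}  # name -> (ip, expiry)
        self.upstream_queries = 0

    def resolve(self, name: str) -> str:
        now = self.clock()
        cached = self.cache.get(name)
        if cached is not None and cached[1] > now:
            return cached[0]  # still fresh: no upstream query needed
        # Cache miss or expired entry: ask the authoritative source.
        self.upstream_queries += 1
        ip, ttl = self.authoritative[name]
        self.cache[name] = (ip, now + ttl)
        return ip
```

Because the cached answer is reused until it expires, a change on the authoritative side only becomes visible to users after the TTL runs out, which is why TTL values matter for failover.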

In many ways, failover DNS servers are non-essential to network function in a traditional infrastructure; the DNS can perform query resolution tasks when only primary servers are available. However, backup servers maintain synchronized copies of DNS records in case primary servers fail, making them essential to DNS failover. Without failover servers, resolution for the affected domains would fail if the primary servers went down or became unreachable.

As such, DNS failover services are vital to maintaining resilient, redundant, high-availability computing networks.

DNS servers, explained

The DNS was designed with a hierarchical, distributed database structure that facilitates a more dynamic approach to domain name resolution, one that could keep pace with a rapidly expanding network of computers. It’s colloquially called the “phonebook for the internet,” but a more apt analogy is that the DNS manages domain names in much the same way as smartphones manage contacts. 

Smartphones eliminate the need for users to remember individual phone numbers by storing them in easily searchable contact lists. Similarly, the DNS allows users to connect to websites using internet domain names instead of IP addresses. Rather than having to remember the web server at "93.184.216.34," users can just go to the webpage "www.example.com".

When a domain is registered, its name server records are created and stored on a primary DNS server. The primary DNS server holds the original read/write version of the zone file and various types of resource records (including A records, AAAA records, MX records, CNAME records and other types) that map and route the appropriate data back to the user. 

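Backup DNS servers, or failover servers, hold read-only replicas of the zone file. They function as secondary DNS servers. In the failover arrangement described here, they handle requests during primary server downtime or when the primary server is overloaded, although many deployments also let secondary servers answer queries during normal operation.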
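
As a rough illustration of how a read-only replica might stay synchronized, the sketch below models a primary that bumps a serial number on every change and a secondary that copies the zone only when it sees a newer serial. Real DNS servers do this with the SOA serial and zone transfers; the class names here are hypothetical, and the plain integer comparison ignores the wrap-around rules that real serial numbers follow.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Record = Tuple[str, str, str]  # (name, record type, value)


@dataclass
class Zone:
    serial: int
    records: List[Record] = field(default_factory=list)


class PrimaryServer:
    """Holds the read/write copy of the zone."""

    def __init__(self, zone: Zone) -> None:
        self.zone = zone

    def update(self, records: List[Record]) -> None:
        # Every change bumps the serial so secondaries can detect it.
        self.zone = Zone(self.zone.serial + 1, list(records))


class SecondaryServer:
    """Holds a read-only replica refreshed from the primary."""

    def __init__(self) -> None:
        self.zone = Zone(serial=0)

    def refresh(self, primary: PrimaryServer) -> bool:
        # Copy the zone only when the primary's serial is newer.
        if primary.zone.serial > self.zone.serial:
            self.zone = Zone(primary.zone.serial, list(primary.zone.records))
            return True
        return False
```

Calling `refresh` on a schedule keeps the replica close to the primary, so the secondary can keep answering with recent data if the primary becomes unreachable.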

Though primary DNS servers are central to how the DNS operates, they also represent a single point of failure. If they fail and there are no designated backup servers to take over the workload, the entire DNS resolution process can suffer. Conversely, backup servers cannot exist without a primary DNS server, but if there’s a primary server outage, backup servers manage failover protocols and make sure that user queries are resolved until the primary server is restored.

Today, most leading managed DNS providers offer name server IPs to use. Behind each of those IPs is a pool of geographically distributed DNS servers that route requests using Anycast. Unlike the one-to-one communication dynamics associated with conventional DNS, Anycast DNS routes user requests to a network of resolvers (instead of a single resolver) and to the closest available server for resolution, optimizing load balancing features and overall network resilience.

How does DNS failover work?

DNS failover protocols can vary significantly between networks, but they typically involve a few key processes.

Health monitoring

DNS systems must conduct ongoing health checks to determine the status and performance of the internet service provider (ISP), all network API endpoints and the primary IP servers. Health checks can include Internet Control Message Protocol (ICMP) pings at the network level, HTTP/HTTPS checks to assess web servers at the application level, Transmission Control Protocol (TCP) and User Datagram Protocol (UDP) checks at the port level and any other custom scripts a business wants to run.
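
Two of these check types can be sketched with the Python standard library alone. The function names and the two-second default timeout are assumptions made for illustration; ICMP pings are left out because they usually need raw sockets and elevated privileges.

```python
import http.client
import socket
import urllib.error
import urllib.request


def tcp_check(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds in time."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def http_check(url: str, timeout: float = 2.0) -> bool:
    """Return True if the URL answers with a 2xx status in time."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, http.client.HTTPException,
            OSError, ValueError):
        # Covers refused connections, timeouts, 4xx/5xx and malformed URLs.
        return False
```

A monitoring node would typically run checks like these on a fixed interval and from several locations, so that one bad network path is not mistaken for a server outage.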

Failure detection

Administrators typically customize failure criteria based on the needs of applications and the mission-criticality of services. Regardless of criteria, if the monitoring nodes detect a failure (where the primary server is unresponsive or returns errors), they trigger a failover event and send out failure notifications.
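
One common way to express a failure criterion is "N consecutive failed checks." The sketch below assumes that rule; the `FailureDetector` class and its `notify` callback are hypothetical names, and real services may use other criteria such as failure rates or quorum across monitoring nodes.

```python
from typing import Callable


class FailureDetector:
    """Declares failure after `threshold` consecutive failed checks."""

    def __init__(self, threshold: int,
                 notify: Callable[[str], None]) -> None:
        if threshold < 1:
            raise ValueError("threshold must be at least 1")
        self.threshold = threshold
        self.notify = notify
        self.consecutive_failures = 0
        self.failed = False

    def record(self, check_passed: bool) -> bool:
        """Feed one health-check result; return True while failed."""
        if check_passed:
            self.consecutive_failures = 0
            self.failed = False
            return False
        self.consecutive_failures += 1
        if self.consecutive_failures >= self.threshold and not self.failed:
            self.failed = True
            # Notify once per failure event, not on every failed check.
            self.notify("primary failed health checks; triggering failover")
        return self.failed
```

Requiring several failures in a row trades a little detection speed for fewer false alarms from a single dropped packet.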

The monitoring nodes then dynamically withdraw the unavailable IP address and move the hostname to a backup IP (or CNAME) so that DNS responses point clients to a secondary IP address until primary servers are restored. Failover DNS also relies on short time-to-live (TTL) values, which control how long resolvers cache a record, to make sure that changes propagate quickly to DNS resolvers across the network and users experience minimal, if any, downtime.
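
The record swap and the effect of TTL can be illustrated as follows. The `FailoverRecord` type and the `worst_case_switch_seconds` helper are assumptions for this sketch, and the estimate is deliberately rough: it presumes resolvers honor the TTL and ignores any delay inside the DNS provider.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class FailoverRecord:
    hostname: str
    primary_ip: str
    backup_ip: str
    ttl_seconds: int  # kept short so cached answers expire quickly

    def answer(self, primary_healthy: bool) -> Tuple[str, int]:
        """Return the (ip, ttl) pair to serve for this hostname."""
        ip = self.primary_ip if primary_healthy else self.backup_ip
        return ip, self.ttl_seconds


def worst_case_switch_seconds(check_interval: int, failures_required: int,
                              ttl_seconds: int) -> int:
    """Rough upper bound: time to detect the failure plus one full TTL."""
    return check_interval * failures_required + ttl_seconds
```

For example, with checks every 10 seconds, three failures required and a 30-second TTL, `worst_case_switch_seconds(10, 3, 30)` gives 60 seconds. Lowering the TTL shortens that window at the cost of more queries reaching the authoritative servers.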

Recovery and failback

When the primary servers are restored and can pass health checks, the system prepares for failback, wherein DNS settings and resolution processes revert to the primary IP address. Monitoring nodes oversee the process to avoid flapping (frequent switching between primary and backup servers) and continue to perform health checks to keep the network working optimally. 
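
One way to avoid flapping is to require more evidence before failing back than before failing over. The sketch below assumes that approach; the `FailoverController` class and its default thresholds are illustrative, not taken from any particular product.

```python
class FailoverController:
    """Tracks which target is active, with thresholds in both directions."""

    def __init__(self, fail_after: int = 3, recover_after: int = 5) -> None:
        self.fail_after = fail_after
        self.recover_after = recover_after
        self.active = "primary"
        self._streak = 0  # consecutive results that contradict `active`

    def observe(self, primary_check_passed: bool) -> str:
        """Feed one primary health-check result; return the active target."""
        if self.active == "primary":
            # Count consecutive failures; any success resets the streak.
            self._streak = 0 if primary_check_passed else self._streak + 1
            if self._streak >= self.fail_after:
                self.active, self._streak = "backup", 0
        else:
            # Count consecutive successes; any failure resets the streak.
            self._streak = self._streak + 1 if primary_check_passed else 0
            if self._streak >= self.recover_after:
                self.active, self._streak = "primary", 0
        return self.active
```

Because a single passing check does not trigger failback, a primary that recovers briefly and fails again leaves traffic on the backup instead of bouncing users between the two.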

Many businesses also implement advanced failover strategies, such as multi-region failover (where routing policies across multiple regions direct users to the nearest or best-performing server) and Anycast DNS (where the same IP address is advertised from multiple locations and requests are routed to the best server based on network topology).
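
A multi-region routing policy can be reduced to a small selection rule, sketched here under the assumption that the provider already knows each region's health and a latency estimate for the user. The `pick_region` function and the region names are hypothetical. Anycast is not modeled, since it is handled by network routing rather than by application logic.

```python
from typing import Dict, Optional


def pick_region(latency_ms: Dict[str, float],
                healthy: Dict[str, bool]) -> Optional[str]:
    """Return the healthy region with the lowest latency, or None."""
    # Regions missing from the health map are treated as unhealthy.
    candidates = [r for r in latency_ms if healthy.get(r, False)]
    if not candidates:
        return None
    return min(candidates, key=lambda r: latency_ms[r])
```

If the nearest region fails its health checks, the same rule falls through to the next-best region, which is the multi-region failover behavior described above.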

Moreover, DNS failover services can facilitate round-robin DNS, which rotates answers across the available servers to spread traffic roughly evenly and can help absorb traffic spikes, including some distributed denial-of-service (DDoS) attacks. Hybrid failover solutions, which combine failover DNS with other high-availability network solutions (global server load balancing (GSLB) and content delivery networks (CDNs), for instance), can optimize traffic management, minimize latency and accommodate more complex failover scenarios.
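
Round-robin DNS combined with health checks can be sketched as a rotating list that skips unhealthy addresses. The `RoundRobinPool` class is an assumption for illustration, and the addresses come from a documentation range.

```python
from collections import deque
from typing import Deque, Dict, Iterable, List


class RoundRobinPool:
    """Rotates the answer order and leaves out unhealthy addresses."""

    def __init__(self, addresses: Iterable[str]) -> None:
        self._ring: Deque[str] = deque(addresses)

    def next_answer(self, healthy: Dict[str, bool]) -> List[str]:
        """Return healthy addresses, rotated by one position per call."""
        answer = [ip for ip in self._ring if healthy.get(ip, False)]
        self._ring.rotate(-1)  # the next call starts one address later
        return answer
```

Clients that pick the first address in each answer end up spread across the pool, while an address that fails its health check simply stops appearing in responses.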

Benefits of DNS failover

  • High network availability. DNS failover is a cornerstone of high-uptime networks. It provides a mechanism for automated traffic switching so that web services remain accessible even in the face of server or infrastructure failures.
  • Disaster recovery. In disaster recovery scenarios, DNS failover can be used to switch traffic to servers in different geographic regions or data centers to mitigate the impact of regional outages or catastrophic events.
  • Enhanced user experience. Robust DNS failover mechanisms help create a seamless user experience, as users are less likely to encounter downtime and other service disruptions.
  • Multi-region redundancy. DNS failover enables deployment of backup servers in different geographical locations, maximizing network resilience against regional outages, facilitating better load distribution and improving response times for users across the world.
  • Cost-effectiveness. Compared to other high-availability solutions like data center failover, DNS failover can be more cost-effective. It uses existing DNS infrastructure, so it requires less hardware spending and fewer complex architectural configurations.
