Infrastructure correlation
This page describes how application monitoring and infrastructure monitoring are integrated. While some users rely mostly on application monitoring, it is sometimes useful to understand how the logical layer maps to the physical layer. It becomes essential when troubleshooting issues that are detected at the application level but whose root cause lies in the infrastructure layer.
Instana allows bi-directional navigation in the UI between application and infrastructure monitoring:
- From applications and services to infrastructure monitoring
- From calls to infrastructure monitoring
- From infrastructure monitoring to application monitoring
Infrastructure correlation also plays a significant role in application and service mapping and in incidents.
Similarly, application monitoring and website monitoring are integrated, as explained in more detail here.
- Navigation from services
- Navigation from calls
- Navigation from Infrastructure
- How it works
- Infrastructure correlation in application and service mapping
- Infrastructure correlation and incidents
How it works
Before anything, it is important to understand that application and infrastructure monitoring are powered by two distinct data pipelines:
- Application monitoring: the data (traces and calls) comes from the Instana tracers or third-party tracers.
- Infrastructure monitoring: the data (tags and metrics) comes from the Instana sensors.
These two worlds merge seamlessly thanks to a mechanism that we call infrastructure linking, where calls are linked to monitored infrastructure entities. Linking occurs when a common identifier is found on both sides.
Instrumented services
Tracers instrument your processes to capture incoming and outgoing calls. These calls are then reported to the Instana backend, where we attempt to link the source and destination of those calls to known infrastructure entities. When the source process (or destination process) is instrumented, it necessarily implies that the process is also monitored by an Instana sensor, which knows everything about it. Because the tracer and the sensor are co-located, they both know the host and process, which makes infrastructure linking possible.
For example, a Python process is instrumented by the Python tracer, which captures all of the incoming and outgoing calls. Meanwhile, several sensors are activated on the host where this process is running: the host, process, and Python sensors. The tracer and the sensors send data separately to the Instana backend, but both include the same identifier for the process. It is therefore possible to link the destination of the incoming calls, and the source of the outgoing calls, to the Python process.
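The linking step for instrumented processes can be pictured as a lookup on a shared key. The following is a minimal sketch, not Instana's actual implementation: the `ProcessKey` type, the `Call` type, and the choice of host id plus process id as the common identifier are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ProcessKey:
    """Hypothetical shared identifier: host id plus process id."""
    host_id: str
    pid: int


@dataclass
class Call:
    name: str
    source: Optional[ProcessKey]       # reported by the tracer, None if unknown
    destination: Optional[ProcessKey]  # reported by the tracer, None if unknown


def link_call(call, entities):
    """Return the entities linked to the call's source and destination.

    `entities` maps ProcessKey -> entity name, as reported by the sensors.
    A side without a matching entity is returned as None (unlinked).
    """
    src = entities.get(call.source) if call.source else None
    dst = entities.get(call.destination) if call.destination else None
    return src, dst


# Sensor side: entities discovered on the host (illustrative values).
entities = {ProcessKey("host-1", 4242): "python process (pid 4242)"}

# Tracer side: an incoming call handled by the same process.
incoming = Call("GET /orders", source=None, destination=ProcessKey("host-1", 4242))
print(link_call(incoming, entities))
```

Only the destination of the incoming call is linked here, because the caller in this example is not instrumented.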
Databases, messaging systems, and cloud services
Instana tracers do not instrument databases, messaging systems, or cloud services. However, processes which call these untraced systems are instrumented, and therefore outgoing requests are properly mapped to calls. For example, the Java tracer records outgoing requests from a Java process to a MySQL database, and these are analyzed into calls with the Java process as the source and the MySQL database as the destination. These calls are visible in Instana and their destination is usually linked to the infrastructure entity which receives the call. How is this possible?
On the one hand, Instana does monitor the database or messaging system through one of the Instana sensors and therefore knows about the process, its port, and the host. On the other hand, Instana analyzes an outgoing request which may contain enough information to guess the destination process, usually the hostname or IP address and the port, carried for example in the connection string.
For example, an outgoing request to a MySQL instance could contain the connection string jdbc:mysql://10.128.0.6:3306. Infrastructure monitoring detected a corresponding MySQL process exposing the port 3306 and running on a host which exposes the IP 10.128.0.6. Because both the IP and port match, the calls and the MySQL instance are linked together.
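The matching described above can be sketched in a few lines. This is an illustration under stated assumptions rather than the backend's real logic: the function names, the shape of the `monitored` records, and the hostname db.internal are hypothetical; only the connection string, IP, and port come from the example above.

```python
from urllib.parse import urlsplit


def parse_endpoint(connection_string):
    """Extract (host, port) from a JDBC-style connection string."""
    # Drop the "jdbc:" prefix so the remainder parses as a regular URL.
    if connection_string.startswith("jdbc:"):
        connection_string = connection_string[len("jdbc:"):]
    parts = urlsplit(connection_string)
    return parts.hostname, parts.port


def find_destination(connection_string, processes):
    """Return the name of the monitored process matching host and port, if any."""
    host, port = parse_endpoint(connection_string)
    for proc in processes:
        if host in proc["host_ips"] and port in proc["ports"]:
            return proc["name"]
    return None  # no common identifier found, the call stays unlinked


# Infrastructure side: processes known from the sensors (illustrative values).
monitored = [
    {"name": "mysqld", "host_ips": {"10.128.0.6"}, "ports": {3306}},
    {"name": "redis-server", "host_ips": {"10.128.0.7"}, "ports": {6379}},
]

print(find_destination("jdbc:mysql://10.128.0.6:3306", monitored))
print(find_destination("jdbc:mysql://db.internal:3306", monitored))
```

The second lookup returns no match because the hostname in the connection string is not one of the addresses known on the infrastructure side.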
Instana also supports connection strings which contain a Kubernetes service name like jdbc:mysql://mysql-svc. Behind the scenes it will attempt to fully qualify the service name to uniquely identify the service across all namespaces and clusters. The result is a call whose destination is linked to the Kubernetes service, instead of the final process.
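How the backend qualifies a service name is not documented here, so the following is only a sketch of the idea. It assumes the caller's namespace and cluster are known, and it relies on the standard Kubernetes DNS form service.namespace.svc.cluster-domain; the function name and the sample namespace and cluster values are hypothetical.

```python
def qualify_service_name(host, caller_namespace, cluster):
    """Build a (cluster, namespace, service) key from a Kubernetes service host.

    Accepts "svc", "svc.ns", or "svc.ns.svc.cluster.local" style names.
    """
    labels = host.split(".")
    service = labels[0]
    # A second label, when present, is the namespace. Otherwise the caller's
    # own namespace applies, mirroring how Kubernetes DNS resolves short names.
    namespace = labels[1] if len(labels) > 1 else caller_namespace
    return (cluster, namespace, service)


print(qualify_service_name("mysql-svc", "shop", "prod-cluster"))
print(qualify_service_name("mysql-svc.data.svc.cluster.local", "shop", "prod-cluster"))
```

The resulting tuple is unique across namespaces and clusters, which is what allows two services that share a short name to be told apart.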
For cloud services there are no processes, but the idea is the same: find a common identifier shared by the monitored cloud service and the outgoing request to that service. This could be, for example, a resource identifier like an AWS ARN.
Linking calls to infrastructure is sometimes not possible, because the host or IP given in the connection string does not match any of the hosts or IPs known on the infrastructure monitoring side. This is usually the case when there is a level of indirection, where the process calling the remote database (or messaging service or cloud service) uses a hostname which is:
- an entry in the /etc/hosts system file
- a DNS CNAME entry
- a pointer to a proxy or load balancer
- an alias given by a Service Discovery service like Consul or Zookeeper
External services
External services are by definition not monitored by Instana and therefore not even visible on the infrastructure monitoring side. Because we know nothing about them, calls to these services are simply not linked to any known infrastructure entities.
In the Infrastructure tab, you can identify these calls as "Unmonitored".
Infrastructure correlation in application and service mapping
What is the role of infrastructure correlation in application and service mapping?
When the Instana backend analyzes traces and calls, it will first link them to known infrastructure entities, and enrich them with infrastructure tags such as host.name, springboot.name or docker.label.
These tags are then used to automatically map these calls to services using pre-defined rules or user-defined rules.
For example, a call linked to a Spring Boot process will be mapped to a service which gets its name from the Spring Boot application name. Or you could define a Docker label service-name which could be used to create a custom service mapping rule to name most of your services running in Docker.
The same is true for application mapping, where you can use these infrastructure tags to define applications, for example using the kubernetes.namespace tag.
When infrastructure linking is not possible, service mapping cannot rely on infrastructure tags and relies instead on so-called fallback rules, which are defined using call tags like call.http.host or call.database.schema.
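The interplay between infrastructure tags and fallback rules can be sketched as an ordered rule lookup. This is a simplified illustration, not the actual rule engine: the rule order, the tag key docker.label.service-name, the "unknown" default, and the function name are assumptions.

```python
# Rules that need infrastructure tags (available only when linking succeeded).
INFRA_RULES = ["springboot.name", "docker.label.service-name"]
# Fallback rules that use call tags only.
FALLBACK_RULES = ["call.http.host", "call.database.schema"]


def map_to_service(call_tags, infra_tags=None):
    """Return a service name for a call.

    `infra_tags` is None when the call could not be linked to infrastructure.
    """
    if infra_tags:
        for tag in INFRA_RULES:
            if tag in infra_tags:
                return infra_tags[tag]
    for tag in FALLBACK_RULES:
        if tag in call_tags:
            return call_tags[tag]
    return "unknown"  # hypothetical default when no rule applies


linked = map_to_service(
    {"call.http.host": "api.example.com"},
    {"springboot.name": "orders", "host.name": "host-1"},
)
unlinked = map_to_service({"call.http.host": "api.example.com"})
print(linked, unlinked)
```

A linked call gets its service name from the infrastructure tag, while the same call without infrastructure linking falls back to the call tag.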
Infrastructure correlation and incidents
An incident groups related events by leveraging the Dynamic Graph. The ability to link calls (and therefore applications and services) to infrastructure entities enriches the Dynamic Graph with additional connections that bridge the two worlds, which results in more complete incidents and faster root cause analysis.