What is a GraphQL gateway?

GraphQL gateway, defined

A GraphQL gateway is an architectural component that receives queries in the GraphQL language and routes, orchestrates and aggregates responses across disparate backend services, which might expose REST, gRPC, GraphQL or other application programming interfaces (APIs). This arrangement enables clients to retrieve data from multiple sources through a single API call to a single endpoint.

Traditional API gateways present a single entry point for REST APIs (APIs that adhere to RESTful design principles), but each endpoint returns a fixed, predefined data structure determined by the server. This can lead to over-fetching, where responses include more data than the client needs, or under-fetching, where a single call doesn’t return sufficient information and clients are forced to make multiple calls to piece together the information needed.

GraphQL gateways address this problem by allowing clients to specify exactly the fields they need, and by aggregating relevant data across multiple microservices into a single response. This approach eliminates over-fetching because responses contain only the requested fields, and under-fetching because data from multiple services is returned at once.

For example, in a healthcare context, a GraphQL gateway might enable a client (an app or dashboard that serves doctors or patients) to access a patient’s medical history (through the “patient” service), identify her next appointment (through the “appointments” service) and determine her outstanding balance (through the “billing” service) with a single API call, rather than through three separate requests.
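The healthcare scenario can be sketched in a few lines of Python. This is an illustrative sketch rather than a real gateway: the three service functions, their field names and the sample data are all hypothetical, and the "query" is a plain dictionary instead of parsed GraphQL.

```python
# Hypothetical backend services. In practice these would be REST, gRPC or GraphQL calls.
def patient_service(patient_id: str) -> dict:
    return {"id": patient_id, "name": "A. Rivera", "history": ["asthma"], "insurer": "Acme Health"}

def appointments_service(patient_id: str) -> dict:
    return {"next": "2025-03-14T09:30", "clinician": "Dr. Chen", "room": "4B"}

def billing_service(patient_id: str) -> dict:
    return {"balance": 120.50, "currency": "USD", "last_invoice": "INV-0042"}

SERVICES = {
    "patient": patient_service,
    "appointments": appointments_service,
    "billing": billing_service,
}

def gateway(patient_id: str, selection: dict) -> dict:
    """Call only the services named in the selection and keep only the requested fields."""
    response = {}
    for service_name, fields in selection.items():
        full_record = SERVICES[service_name](patient_id)
        response[service_name] = {field: full_record[field] for field in fields}
    return response

# One request to one entry point: three services, only the requested fields.
result = gateway("p-1", {
    "patient": ["history"],
    "appointments": ["next"],
    "billing": ["balance"],
})
print(result)
```

Each service still returns its full record internally, but the client sees only the three fields it asked for, combined in one response.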

Aside from giving clients more control over query responses than REST, GraphQL gateways can also contribute to a more flexible, scalable IT system. They are an essential part of GraphQL federation, where teams can independently manage, update and optimize their own services while receiving centralized governance and oversight through the gateway.

GraphQL API gateways can also perform high-level management tasks across security (authentication and authorization), observability (logging, monitoring and tracing), orchestration (query routing, aggregation and error handling) and optimization (caching, rate limiting and batching).

But GraphQL gateways aren’t appropriate for every use case. For smaller development teams, GraphQL gateways might be unnecessarily complex and costly, requiring more custom logic, maintenance and technical expertise. The gateway must dynamically resolve unpredictable query shapes, devise intricate execution strategies and account for dependencies. These capabilities are harder to build and maintain, and consume more network and computational resources, than straightforward REST frameworks.

Also, without robust governance, individual resolver functions can fire a separate API call for each field in a query, creating excessive and redundant backend requests. For example, fetching 100 users’ purchase histories might trigger 101 database calls (one for the list of users, and one for each individual user’s purchase history), rather than two bundled calls. This tendency, known as the N+1 problem, can drive up costs and hurt performance at scale but can be mitigated through caching, batching and other techniques. Clients also cannot easily anticipate how resource-intensive a particular request might be.
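The arithmetic behind the N+1 problem can be made concrete with a small sketch. The `CountingDb` class below is a hypothetical in-memory stand-in for a database that simply counts how many calls it receives. It is not a real driver or GraphQL library.

```python
from collections import defaultdict

class CountingDb:
    """Hypothetical in-memory data store that counts every call it receives."""

    def __init__(self, user_count: int = 100):
        self.calls = 0
        self.user_ids = list(range(user_count))
        self.purchases = [{"user_id": uid, "item": f"item-{uid}"} for uid in self.user_ids]

    def fetch_users(self) -> list:
        self.calls += 1
        return list(self.user_ids)

    def fetch_purchases_for(self, user_id: int) -> list:
        self.calls += 1
        return [p for p in self.purchases if p["user_id"] == user_id]

    def fetch_purchases_for_many(self, user_ids: list) -> dict:
        self.calls += 1
        wanted = set(user_ids)
        grouped = defaultdict(list)
        for purchase in self.purchases:
            if purchase["user_id"] in wanted:
                grouped[purchase["user_id"]].append(purchase)
        return dict(grouped)

def naive_calls() -> int:
    """One call for the user list, then one call per user (N+1)."""
    db = CountingDb()
    for user_id in db.fetch_users():
        db.fetch_purchases_for(user_id)
    return db.calls

def batched_calls() -> int:
    """One call for the user list, then one bundled call for all purchases."""
    db = CountingDb()
    db.fetch_purchases_for_many(db.fetch_users())
    return db.calls

print(naive_calls(), batched_calls())  # 101 2
```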

Monolithic vs. schema stitching vs. GraphQL federation

Before we explore how GraphQL gateways work, it’s important to understand the role they play in different architectural patterns.

In monolithic GraphQL frameworks, a gateway isn’t necessary. A single GraphQL server exposes and defines the API directly, including its underlying schema (the contract that defines the types of data that the API exposes) and resolver logic (the functions that fetch requested data). However, monolithic architectures can become unwieldy as organizations scale: Because every team shares the same API layer, services can become tightly coupled and updates can affect the entire codebase, slowing deployments and introducing bottlenecks, among other issues.

A decentralized alternative called schema stitching gives teams greater autonomy by enabling them to define and manage GraphQL schemas individually at the service level. These schemas are then manually stitched together, with the GraphQL gateway defining relationships between each service so that they appear as a unified API on the front end. But this approach has shortcomings similar to those of monolithic frameworks. As new services are added over time, developers must spend more resources on custom coding and maintenance to preserve relationships between each schema at the gateway level.

GraphQL federation is a decentralized architectural pattern that aims to overcome these challenges by shifting relationship logic from the gateway to individual services. In terms of a theoretical GraphQL maturity model, federation might be considered the most advanced option—but also the most complex and difficult to manage.

In GraphQL federation, services define and manage their own schemas, as well as how the data those schemas expose relates to data owned by other services. In many implementations, such as Apollo Federation, a schema registry receives service updates, validates them and composes them into a unified schema that the gateway can reference. This process can be automated with a CI/CD pipeline or built-in composition tools.
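The validate-and-compose step can be illustrated with a deliberately simplified sketch. It assumes each service's schema is a plain dictionary of type names to field definitions, and it checks only for conflicting field types. Real composition engines, including Apollo's, apply many more rules. The `compose` function and the sample subgraphs are hypothetical.

```python
def compose(subgraphs: dict) -> dict:
    """Merge subgraph schemas into one supergraph, rejecting conflicting field types."""
    supergraph = {}
    for subgraph_name, schema in subgraphs.items():
        for type_name, fields in schema.items():
            merged_type = supergraph.setdefault(type_name, {})
            for field_name, field_type in fields.items():
                entry = merged_type.setdefault(
                    field_name, {"type": field_type, "owners": []}
                )
                if entry["type"] != field_type:
                    raise ValueError(
                        f"{type_name}.{field_name}: {subgraph_name} declares "
                        f"{field_type}, but {entry['owners'][0]} declares {entry['type']}"
                    )
                # Record which service can resolve this field, for later routing.
                entry["owners"].append(subgraph_name)
    return supergraph

# Hypothetical schemas owned by two independent teams.
subgraphs = {
    "users": {"User": {"id": "ID!", "name": "String"}},
    "orders": {
        "User": {"id": "ID!", "orders": "[Order]"},
        "Order": {"id": "ID!", "total": "Float"},
    },
}
supergraph = compose(subgraphs)
print(supergraph["User"])
```

Because the "orders" service declares its own extension of `User`, the relationship lives with the service that owns it, and the gateway only consumes the composed result.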

The gateway is only responsible for routing requests and aggregating responses—it has no role in defining service relationships. One major benefit of this approach is that teams can scale, add, test and update services without needing to directly modify the gateway each time. Teams achieve improved flexibility and scalability while still presenting clients with a single, unified API.  

What is GraphQL Apollo federation?

Introduced in 2019, Apollo federation is arguably the most extensively developed standardization framework for GraphQL federation. In Apollo GraphQL federation, independent services are known as subgraphs. For example, a “user” subgraph might manage usernames, while a “product” subgraph might manage product listings.

A composition engine composes each subgraph’s schema to form a unified supergraph. The GraphQL gateway (also called a federated or federation gateway in this context) can then reference this supergraph as it responds to client queries. Schema-embedded directives provide added context for how specific data fields should be handled.

Apollo federation is made up of two key components:

  • The Apollo Router is a Rust-based runtime that responds to API queries and retrieves data for clients—essentially, Apollo’s version of a GraphQL gateway. It can also be thought of as the execution layer or the data plane. (A separate platform called Apollo Server enables teams to design and build individual GraphQL APIs with JavaScript or TypeScript.)

  • GraphOS is the control plane. It enables teams to view and monitor APIs, read schema documentation (or docs), track performance, manage integrations, implement access controls and adjust high-level settings (configurations or configs). In Apollo federation, GraphOS also composes subgraphs to form the supergraph, which the router references at run time.

While Apollo is a popular choice for federating microservices architectures, it isn’t the only GraphQL framework available. Alternative libraries and toolsets—including Hive Gateway, AWS AppSync and Grafbase Gateway—are designed for different use cases, with varying levels of operational complexity.

One emerging option is Open Federation, a “community-driven, open source” specification, primarily designed by WunderGraph, that is compatible with Apollo but enables organizations to mix and match components. This configuration helps organizations avoid vendor lock-in.

The non-profit GraphQL Foundation’s Composite Schema Working Group is also designing its own high-level federation specification, according to the foundation’s website. (The group regularly publishes updates through its GitHub repository.)

GraphQL gateway capabilities

While GraphQL gateways are primarily designed to facilitate efficient API request routing and aggregation, they can perform numerous other functions that help organizations balance team autonomy with centralized management. Capabilities include:

Orchestration

At its core, a GraphQL gateway can intelligently interpret API requests, construct a plan for collecting requested data (query planning), aggregate that data across disparate sources and return it as a single, readable response. The gateway can also handle operation types other than queries: Clients can use “mutations” to create, update and delete data, and “subscriptions” to receive a notification every time a particular event takes place. Finally, clients and the gateway can exchange metadata (such as security tokens, rate limits, request traces and caching instructions) to provide more context alongside the payload itself.
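Query planning can be sketched as grouping requested fields by the service that owns them, so that each service is called once. The ownership map and field names below are hypothetical, and real planners also handle nesting and dependencies between steps, which this sketch ignores.

```python
# Hypothetical ownership map: which service resolves which top-level field.
FIELD_OWNERS = {
    "name": "user",
    "email": "user",
    "orders": "orders",
    "balance": "billing",
}

def plan(requested_fields: list) -> dict:
    """Group requested fields into one fetch step per owning service."""
    steps = {}
    for field in requested_fields:
        owner = FIELD_OWNERS.get(field)
        if owner is None:
            raise KeyError(f"Unknown field: {field}")
        steps.setdefault(owner, []).append(field)
    return steps

query_plan = plan(["name", "orders", "email", "balance"])
print(query_plan)
# {'user': ['name', 'email'], 'orders': ['orders'], 'billing': ['balance']}
```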

Authentication and authorization

GraphQL gateways support various authentication patterns—including JSON Web Tokens (JWTs), API keys and custom middleware—to verify the identity of the user or service making a request. Authorization, or the process of determining what an authenticated user or service is allowed to access, can be enforced at the gateway, schema or resolver levels. Often, these layers are combined for more robust security.
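A minimal sketch of layered authorization follows. It assumes the caller's identity has already been verified upstream (for example, by validating a JWT, which is out of scope here) and is passed in as a `user` dictionary. The `FIELD_ROLES` policy and the role names are hypothetical.

```python
# Hypothetical policy: which roles may read which fields (deny by default).
FIELD_ROLES = {
    "name": {"patient", "doctor"},
    "history": {"doctor"},
    "balance": {"patient", "billing"},
}

def authorize(user, requested_fields: list) -> list:
    """Gateway-level check (is there a verified identity?) plus field-level checks."""
    if user is None:
        raise PermissionError("unauthenticated request")
    denied = [
        field for field in requested_fields
        if user["role"] not in FIELD_ROLES.get(field, set())
    ]
    if denied:
        raise PermissionError(f"role {user['role']!r} may not read: {', '.join(denied)}")
    return requested_fields

print(authorize({"role": "doctor"}, ["name", "history"]))
```

Fields missing from the policy are denied to everyone, which is a conservative default chosen for this sketch rather than a requirement of GraphQL.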

Observability

Many GraphQL gateways feature built-in tracing and logging tools, which help organizations maintain a record of every stage of the API request-response lifecycle. Gateways might also support OpenTelemetry (OTel), an open source observability framework that can collect and organize telemetry data across disparate microservices and programming languages.

Error codes contribute to efficient error handling by giving clients a consistent, machine-readable signal about what went wrong, making errors easier to diagnose and resolve. Finally, dashboards and analytics tools enable IT teams to visualize data flows, optimize performance and quickly identify vulnerabilities.

Request optimization

GraphQL gateways improve request efficiency in several ways. Most fundamentally, a single GraphQL query can retrieve data across multiple backend services in one request, eliminating the redundant round trips common in REST architectures.

Caching enables gateways to preserve and quickly recall previous results, saving time and compute when the same request is made repeatedly. However, because GraphQL queries are typically sent as POST requests, which standard HTTP caches generally do not store, gateways often implement specialized approaches like persisted queries to enable effective caching.

Request-layer batching further reduces latency by bundling multiple independent queries into a single HTTP request. At the resolver layer, individual resolvers can forward field-level requests to a batching function, executing what would otherwise be many separate calls as a single bundled request. This method helps reduce server strain and improve efficiency.
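Resolver-layer batching can be sketched with Python's standard `asyncio` module. The `BatchLoader` class below is a hypothetical, simplified take on the batching-function pattern: it collects every key requested during one pass of the event loop and then makes a single backend call. The `fetch_prices` function stands in for a real service call.

```python
import asyncio

class BatchLoader:
    """Collects keys requested in the same event-loop pass and fetches them in one call."""

    def __init__(self, batch_fn):
        self.batch_fn = batch_fn
        self.pending = {}        # key -> Future awaiting that key's value
        self.dispatch_task = None

    def load(self, key):
        loop = asyncio.get_running_loop()
        if key not in self.pending:
            self.pending[key] = loop.create_future()
        if self.dispatch_task is None:
            # The task starts only after resolvers that are already queued have run.
            self.dispatch_task = loop.create_task(self._dispatch())
        return self.pending[key]

    async def _dispatch(self):
        batch, self.pending, self.dispatch_task = self.pending, {}, None
        try:
            results = await self.batch_fn(list(batch))
        except Exception as exc:
            for future in batch.values():
                future.set_exception(exc)
            return
        for key, future in batch.items():
            future.set_result(results[key])

batch_calls = []  # records every backend call made by the batch function

async def fetch_prices(product_ids):
    """Hypothetical backend call that returns a price for each product ID."""
    batch_calls.append(list(product_ids))
    return {pid: 10 * pid for pid in product_ids}

async def resolve_price(loader, product_id):
    return await loader.load(product_id)

async def main():
    loader = BatchLoader(fetch_prices)
    return await asyncio.gather(*(resolve_price(loader, pid) for pid in [1, 2, 3]))

prices = asyncio.run(main())
print(prices, len(batch_calls))
```

Three resolvers run, but the backend sees one call containing all three product IDs.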

Developer support

Many GraphQL gateways include built-in developer features that can streamline core gateway management functions, including orchestration, oversight and testing. Introspection tools enable developers to query each schema to better understand what data and operations they expose. Third-party plug-ins extend the gateway’s capabilities; for example, code generators and integrated development environments (IDEs) can accelerate coding workflows by adding autofill and other features to GraphQL-based architectures.

GraphQL gateway vs. traditional API gateway

Both GraphQL gateways and traditional API gateways present a single point of entry for clients to access multiple backend services. With this approach, teams can independently manage, update and scale their own services—and work in any programming language—while the gateway provides centralized governance and oversight.

Traditional API gateways, such as REST API gateways, feature fixed, clearly defined endpoints and align naturally with HTTP caching standards. These qualities can make them a more straightforward and performant choice for simpler deployments.

However, while traditional API gateways are well equipped to return data at the resource level, they do not natively support field-level requests. Also, REST API gateways are unable to compose data from multiple resources in a single response without custom aggregation logic.

Imagine an e-commerce user who has multiple items in their cart. During the checkout process, the client application (the e-commerce site) needs to verify the cost of each product by referencing the “product” service. In its API request, the client can specify which products it needs to reference by including specific product IDs.

But with REST, the client has limited control over the fields returned for each product. As a result, instead of returning only the requested prices, the REST API gateway might also return unnecessary fields, such as product description and supplier data. This is an example of over-fetching, which can lead to performance slowdowns and bottlenecks at scale.

Organizations can overcome this limitation by manually designing custom logic for each API or by building custom endpoints for specific workflows. But these approaches require extensive development resources and can lead to more verbose requests and misalignments as organizations scale up.

Now, imagine that the e-commerce client also needs to determine whether the user is eligible for a discount. With a traditional API gateway, the client application cannot combine this request with the earlier product price request. Instead, it needs to make a separate call to the “user” service. This is an example of under-fetching—having to make multiple requests to obtain requested data—which can contribute to excessive cloud spending and inefficiencies.

A GraphQL gateway aims to address both under- and over-fetching. Because orchestration and field logic are embedded in the supergraph, clients can access multiple resources through a single request—and can specify which fields they need to obtain from those resource servers.

GraphQL gateway challenges

While GraphQL gateways can streamline the client experience, enhance scalability, foster team autonomy and improve security through centralization, they can also introduce new operational hurdles. Challenges include:

Caching complexity

In traditional API frameworks, client-side caching is relatively straightforward: The client can request specific resources through a URL, which serves as a stable, unique identifier for a particular resource. Meanwhile, the server embeds caching instructions, including who can cache and for how long, through HTTP headers. This feature enables the client to reuse cached responses instead of calling the same resources repeatedly, reducing bandwidth and server load, among other benefits.

But with GraphQL, queries typically flow through just one endpoint, which uses a single URL to handle many requests. That means the URL can no longer be used to identify specific resources on its own; instead, the contents of the query itself dictate what will be returned. Because clients can request any combination of fields, traditional, resource-level caching strategies become insufficient.

To overcome this challenge, clients often rely on unique identifiers to cache data at the entity level, which is more granular than resource-level caching approaches. On the server side, developers might perform caching at the resolver or response layers, although this approach requires additional management and configuration.
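Entity-level caching can be sketched as a store keyed by type name and identifier rather than by URL. The `EntityCache` class, the `__typename:id` key format and the sample product are assumptions made for illustration. Client libraries implement this idea in their own ways.

```python
class EntityCache:
    """Caches fields per entity, keyed by type name and ID instead of by URL."""

    def __init__(self):
        self.entities = {}

    def write(self, entity: dict) -> str:
        key = f'{entity["__typename"]}:{entity["id"]}'
        # Merge new fields into whatever is already known about this entity.
        self.entities.setdefault(key, {}).update(entity)
        return key

    def read(self, typename: str, entity_id: str, fields: list):
        cached = self.entities.get(f"{typename}:{entity_id}", {})
        if all(field in cached for field in fields):
            return {field: cached[field] for field in fields}
        return None  # cache miss: at least one requested field is unknown

cache = EntityCache()
# Two different queries returned overlapping data about the same product.
cache.write({"__typename": "Product", "id": "1", "name": "Lamp"})
cache.write({"__typename": "Product", "id": "1", "price": 19.99})

hit = cache.read("Product", "1", ["name", "price"])
miss = cache.read("Product", "1", ["supplier"])
print(hit, miss)
```

A third query asking for `name` and `price` can be answered from the cache even though no single earlier query requested both fields together.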

Another common approach is persisted queries, where each unique query is assigned a fixed identifier, allowing clients to send that identifier rather than the full query. While this approach doesn’t fully replicate REST’s URL-based caching model, it enables standard HTTP caching mechanisms to apply.
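A minimal sketch of a persisted query registry follows. It assumes the identifier is a SHA-256 hash of the query text, which is one common choice rather than a requirement. The `PersistedQueryStore` class, the `id` URL parameter and the example endpoint are hypothetical.

```python
import hashlib

class PersistedQueryStore:
    """Maps a fixed identifier to the full query text it stands for."""

    def __init__(self):
        self.queries = {}

    def register(self, query: str) -> str:
        query_id = hashlib.sha256(query.encode("utf-8")).hexdigest()
        self.queries[query_id] = query
        return query_id

    def lookup(self, query_id: str):
        return self.queries.get(query_id)  # None if the ID was never registered

def cacheable_url(base_url: str, query_id: str) -> str:
    """A short, stable GET URL that standard HTTP caches can use as a key."""
    return f"{base_url}?id={query_id}"

store = PersistedQueryStore()
query = "{ product(id: 1) { price } }"
query_id = store.register(query)
url = cacheable_url("https://gateway.example/graphql", query_id)
print(url)
```

Because the same query text always yields the same identifier, repeated requests map to the same URL and can be served by ordinary HTTP caches.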

Performance bottlenecks

As with other gateways, every API request flows through one centralized component, so the GraphQL gateway can become a single point of failure, where errors and vulnerabilities affect all downstream GraphQL services. If the gateway is bombarded with requests, performance bottlenecks can also emerge. Organizations can counteract this problem by deploying the gateway across several horizontally scaled server instances, with a load balancer distributing traffic. (Because the gateway is typically stateless and does not store session data, instances can be added or removed without disrupting clients.)

Security vulnerabilities

Because a single endpoint is responsible for handling every client query, GraphQL’s expressive query language can create greater exposure to query-based attacks than traditional REST architectures. Attackers can create complex or malicious queries to overwhelm servers with denial-of-service attacks, or exploit backend services to reveal sensitive information. Enterprises can implement limits on query depth and complexity to mitigate this issue, and combine these implementations with rate limiting and other controls for more robust security.
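A depth limit can be sketched as a recursive check that runs before any resolver executes. This sketch assumes the query has already been parsed into nested dictionaries, where a leaf field maps to an empty dictionary. The function names and the limit of three levels are hypothetical.

```python
def query_depth(selection: dict) -> int:
    """Return the deepest level of nesting in a parsed selection."""
    if not selection:
        return 0
    return 1 + max(query_depth(child) for child in selection.values())

def enforce_depth_limit(selection: dict, max_depth: int) -> None:
    """Reject the query before execution if it nests too deeply."""
    depth = query_depth(selection)
    if depth > max_depth:
        raise ValueError(f"query depth {depth} exceeds limit of {max_depth}")

shallow = {"user": {"name": {}, "email": {}}}
# A recursive relationship lets an attacker nest a query arbitrarily deep.
deep = {"user": {"friends": {"friends": {"friends": {"name": {}}}}}}

print(query_depth(shallow), query_depth(deep))  # 2 5
enforce_depth_limit(shallow, max_depth=3)       # passes silently
```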

Vendor lock-in

Organizations might struggle to move existing REST APIs to GraphQL because doing so requires teams to define new schemas and reassign service ownership to accommodate a single endpoint. Also, even after adopting a GraphQL gateway, organizations might face vendor lock-in because many third-party GraphQL platforms incorporate proprietary features and conventions that make switching difficult.

Team coordination and consistency

Because teams are responsible for defining and maintaining their own schemas, they must be aligned around a consistent, shared documentation model. Otherwise, poor governance can lead different teams to adopt disparate policies and conventions, contributing to naming overlaps, data mismatches, inconsistent rollouts and other problems.

Management complexity

Because GraphQL gateways must interpret dynamic, unpredictable client queries, they carry a heavier computational burden than traditional gateways. While query routing is abstracted away from the client, which interacts with a single endpoint, GraphQL gateways can nonetheless cause scaling and performance issues. If resolvers are not optimized, a single query can trigger dozens or hundreds of individual resolver executions.

Also, schemas can become tangled or fragmented as services scale, and existing caching, monitoring and troubleshooting mechanisms—designed around REST’s predictable resource model—often need to be redesigned. Debugging can also be challenging, as GraphQL embeds errors within responses rather than surfacing them through standard HTTP status codes.
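The debugging difficulty can be seen in a small sketch of partial failure. The response shape, with `data` and `errors` keys and a `path` on each error, follows the GraphQL specification. The resolvers and the choice to return HTTP status 200 alongside errors reflect a common convention and are assumptions in this sketch.

```python
def execute(resolvers: dict, fields: list) -> tuple:
    """Run each field's resolver and report failures inside the response body."""
    data, errors = {}, []
    for field in fields:
        try:
            data[field] = resolvers[field]()
        except Exception as exc:
            data[field] = None
            errors.append({"message": str(exc), "path": [field]})
    body = {"data": data}
    if errors:
        body["errors"] = errors
    # The transport-level status stays 200 even though one field failed.
    return 200, body

def resolve_price():
    return 19.99

def resolve_discount():
    raise RuntimeError("discount service unavailable")

status, body = execute(
    {"price": resolve_price, "discount": resolve_discount},
    ["price", "discount"],
)
print(status, body)
```

Monitoring that watches only HTTP status codes would record this request as a success, which is why GraphQL deployments usually need to inspect the `errors` array as well.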

GraphQL gateway benefits

While GraphQL gateways can present operational challenges, especially during the implementation phase, they can also foster a more efficient, scalable, secure and resilient IT environment when paired with robust governance. Benefits include:

Improving efficiency and client control

The GraphQL gateway enables clients to fetch the resources they need with more precision than is possible with traditional REST frameworks. Instead of making multiple requests to fetch data from different services, client applications can make a single, specific request and rely on the gateway to fulfill the request without returning unnecessary data. The gateway’s efficient fetching can also reduce latency and improve uptime.

Extending team autonomy

Like traditional API gateways, GraphQL gateways are designed to give teams a wide degree of freedom to scale services, test new features and deploy updates without interrupting system-wide performance and security. But with GraphQL, services are more loosely coupled compared to endpoint-based frameworks. Integration responsibilities shift to a shared schema, while teams maintain full ownership over their own schema contributions.

Simplifying schema management

With Apollo GraphQL federation, developers can use a specific set of directives (including @key, @external, @requires and @extends) to more easily define entity relationships, rather than relying on complex, manual stitching.

Because relationship logic is composed through a shared schema, the gateway can independently coordinate complex orchestration, reducing the need for custom integration logic at the service level. These capabilities free up development resources so that programmers can spend more time on higher-value tasks.

Supporting event-driven and asynchronous architectures

GraphQL natively supports subscriptions, which enable clients to receive real-time updates when predefined events occur, and gateways can manage and route subscriptions alongside standard queries. Gateways can also fan out requests to multiple backend services in parallel, receiving responses asynchronously and improving overall resource efficiency and responsiveness.
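Parallel fan-out can be sketched with `asyncio.gather` from the Python standard library. The service names and the simulated 50-millisecond delay are hypothetical. A real gateway would issue network calls instead of sleeping.

```python
import asyncio

async def call_service(name: str, delay: float) -> tuple:
    """Hypothetical backend call, simulated with a non-blocking sleep."""
    await asyncio.sleep(delay)
    return name, f"{name}-data"

async def fan_out() -> dict:
    # All three calls are in flight at once, so the total wait is roughly
    # the slowest single call rather than the sum of all three.
    calls = [
        call_service("user", 0.05),
        call_service("product", 0.05),
        call_service("billing", 0.05),
    ]
    return dict(await asyncio.gather(*calls))

merged = asyncio.run(fan_out())
print(merged)
```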

Authors

Nick Gallagher

Staff Writer, Automation & ITOps

IBM Think

Michael Goodwin

Staff Editor, Automation & ITOps

IBM Think
