Designing RESTful APIs

The way that you design APIs can have a significant impact on their adoption. This section lists considerations for API design in general, and principles for effective RESTful implementation in particular.

Think consumer

APIs don't exist in isolation. Modern APIs are the way in which the capabilities of services are shared with others. When implemented correctly, APIs that are used inside your organization can enforce consistency and promote efficient reuse. Public APIs that are used outside your organization can expand the reach of your business, by allowing developers to extend the services that you provide. Ease-of-use for consumers is vital for the adoption of the API.

Consumers are developers. They could be developers in your own organization, or a mix of internal and third-party developers. These developers expect APIs that make it quick for them to deliver: quick to learn, easy to use, and aimed at their use cases. Using the API must be faster and more expedient than coding an alternative solution. A successful API encourages developers to use it and to share it with other developers.

An API designer must decide on the following functional requirements:
  • What function needs to be exposed, and how.
  • How to model an API that supports the needs of the user and follows RESTful principles.
A properly designed API appeals to the user and is easy to understand and implement.
Thinking of an API as a business product helps to differentiate it from traditional application programming interfaces. A traditional application programming interface represents a piece of software that you have built and deployed. A modern API represents a package of capabilities that is both attractive to a user and independent of any specific piece of back-end software. The API is designed from the perspective of the intended user. Before you develop one, you must understand:
  • Who is the user? You might have one clear target user for the API or a mix of users. If you have a mix of users, you must understand each of them.
  • What do they want? Instead of focusing on what the API can do, that is, its functions and capabilities, think about the ways in which it might be used.
  • How can you make those use cases as easy as possible? Think about:
    Stability
    How can you minimize disruption for the consumer when you change the API?
    Flexibility
    Although you can't exhaustively cover every possibility, how can you build in some flexibility for the consumer? A simple example is allowing either uppercase or lowercase input.
    Consistency
    What standards can you set for your API so that consumers know what to expect?
    Documentation
    What documentation can you provide, and how do you make this as straightforward to use as possible?

Think resources

Traditional services focused on methods, such as "createAccount" or "updateAccount". Designing RESTful services means that you have to think differently: you focus on resources. For example, a resource could be "Account", then the standard HTTP methods are used to operate on that resource. These methods act as verbs for the nouns of the resources.

The verbs POST, GET, PUT, and DELETE are already defined. Try to handle all operations with a combination of these verbs and the resources. The more bespoke verbs that you define, the less generalized your interface becomes.

Design the URIs

For client applications that address resources, the URIs determine how intuitive the REST Web service is and whether the service will be used in ways that the designers can anticipate. REST Web service URIs should be intuitive to the point where they are easy to guess. Think of a URI as a kind of self-documenting interface that requires little, if any, explanation or reference for a developer to understand what it points to and to derive related resources. To this end, the structure of a URI should be straightforward, predictable, and easily understood.

One way to achieve this level of usability is to define directory structure-like URIs. This type of URI is hierarchical, rooted at a single path, and branching from it are sub-paths that expose the service's main areas. According to this definition, a URI is not merely a slash-delimited string, but rather a tree with subordinate and superordinate branches connected at nodes. For example, in a discussion threading service that gathers a range of topics, you might define a structured set of URIs like this:
http://www.myservice.org/discussion/topics/{topic}
The root, /discussion, has a /topics node beneath it. Underneath that there are a series of topic names, such as technology and so on, each of which points to a discussion thread. Within this structure, it's easy to pull up discussion threads just by typing something after /topics/.

In some cases, the path to a resource lends itself especially well to a directory-like structure. Take resources organized by date, for instance, which are a very good match for a hierarchical syntax. This example is intuitive because it is based on rules: http://www.myservice.org/discussion/2008/12/10/{topic}. The first path fragment is a four-digit year, the second path fragment is a two-digit day, and the third path fragment is a two-digit month. This is the level of simplicity we're after. Humans and machines can easily generate structured URIs like this because they are based on rules. Filling the slots of the syntax with path parts works well because there is a definite pattern from which to compose them: http://www.myservice.org/discussion/{year}/{day}/{month}/{topic}
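As a rough illustration of composing such a URI from rules, the following Python sketch fills the slots of the template above. The helper name topic_uri is hypothetical, and the slot order (year, then day, then month) simply follows the template as it is written in this section.

```python
from datetime import date
from urllib.parse import quote

# Base path taken from the example in the text.
BASE_URI = "http://www.myservice.org/discussion"


def topic_uri(posted_on: date, topic: str) -> str:
    """Fill the {year}/{day}/{month}/{topic} template (hypothetical helper)."""
    # Zero-pad so every URI has a four-digit year and two-digit day and month.
    year = f"{posted_on.year:04d}"
    day = f"{posted_on.day:02d}"
    month = f"{posted_on.month:02d}"
    # Percent-encode the topic so that it stays a single path segment.
    return f"{BASE_URI}/{year}/{day}/{month}/{quote(topic, safe='')}"
```

Because the function is a pure mapping from a date and a topic name to a path, a client can derive the URI for any day without first asking the service for it.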

Some additional guidelines while thinking about URI structure for a RESTful Web service are:
  • Hide the server-side scripting technology file extensions (.jsp, .php, .asp), if any, so you can convert to another scripting language without changing the URIs.
  • Keep everything lowercase.
  • Substitute spaces with either hyphens or underscores.
  • Avoid query strings as much as you can.
  • Instead of using the 404 Not Found code if the request URI is for a partial path, always provide a default page or resource as a response.
  • URIs should also be static so that when the resource changes or the implementation of the service changes, the link stays the same. This allows bookmarking. It's also important that the relationship between resources that is encoded in the URIs remains independent of the way the relationships are represented where they are stored.
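Some of these guidelines can be applied mechanically. The sketch below is one possible way to do so, not a prescribed implementation: to_path_segment and public_path are hypothetical helpers that lowercase a name, replace spaces with hyphens, and hide a scripting-technology file extension.

```python
import re

# Extensions named in the guideline above; extend the list as needed.
_SCRIPT_EXTENSION = re.compile(r"\.(jsp|php|asp)$", re.IGNORECASE)


def to_path_segment(name: str) -> str:
    """Lowercase a name and replace each run of whitespace with a hyphen."""
    return re.sub(r"\s+", "-", name.strip().lower())


def public_path(internal_path: str) -> str:
    """Drop a trailing scripting extension so the URI hides the technology."""
    return _SCRIPT_EXTENSION.sub("", internal_path)
```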

Apply HTTP methods explicitly

One of the key characteristics of a RESTful Web service is the explicit use of HTTP methods in a way that follows the protocol as defined by RFC 2616 (since superseded by RFC 9110). HTTP GET, for instance, is defined as a data-producing method that's intended to be used by a client application to retrieve a resource, to fetch data from a Web server, or to execute a query with the expectation that the Web server will look for and respond with a set of matching resources.

REST asks developers to use HTTP methods explicitly and in a way that's consistent with the protocol definition. This basic REST design principle establishes a one-to-one mapping between create, read, update, and delete (CRUD) operations and HTTP methods. According to this mapping:
  • To create a resource on the server, use POST.
  • To retrieve a resource, use GET.
  • To change the state of a resource or to update it, use PUT.
  • To remove or delete a resource, use DELETE.
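The mapping above can be sketched as a small dispatcher over an in-memory collection. This is a minimal illustration under stated assumptions: the UserStore class, its handle method, and the choice of status codes (201, 204, 404, 405, 409) are not taken from this section, and for brevity the user name is passed as an argument rather than parsed from a request URI or body.

```python
from typing import Dict, Optional, Tuple

Response = Tuple[int, Optional[dict]]


class UserStore:
    """In-memory stand-in for a /users collection (hypothetical)."""

    def __init__(self) -> None:
        self._users: Dict[str, dict] = {}

    def handle(self, method: str, name: str, body: Optional[dict] = None) -> Response:
        # Only the four verbs from the mapping are accepted.
        if method not in ("POST", "GET", "PUT", "DELETE"):
            return 405, None  # Method Not Allowed
        if method == "POST":  # create
            if name in self._users:
                return 409, None  # Conflict: the resource already exists
            self._users[name] = dict(body or {})
            return 201, self._users[name]
        if name not in self._users:
            return 404, None  # Not Found
        if method == "GET":  # read
            return 200, self._users[name]
        if method == "PUT":  # update by replacing the representation
            self._users[name] = dict(body or {})
            return 200, self._users[name]
        del self._users[name]  # DELETE: remove
        return 204, None
```

Note that no operation name such as "adduser" appears anywhere: the method alone selects the operation, and the name alone selects the resource.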

An unfortunate design flaw inherent in many Web APIs is in the use of HTTP methods for unintended purposes. The request URI in an HTTP GET request, for example, usually identifies one specific resource. Or the query string in a request URI includes a set of parameters that defines the search criteria used by the server to find a set of matching resources. At least this is how the HTTP/1.1 RFC describes GET. But there are many cases of unattractive Web APIs that use HTTP GET to trigger something transactional on the server: for instance, to add records to a database. In these cases the GET request URI is not used properly, or at least is not used according to RESTful design principles. If the Web API uses GET to invoke remote procedures, it looks like this: GET /adduser?name=Robert HTTP/1.1

It's not a very attractive design because this Web method supports a state-changing operation over HTTP GET. Put another way, this HTTP GET request has side effects. If successfully processed, the result of the request is to add a new user: in this example, Robert, to the underlying data store. The problem here is mainly semantic. Web servers are designed to respond to HTTP GET requests by retrieving resources that match the path (or the query criteria) in the request URI and return these or a representation in a response, not to add a record to a database. From the standpoint of the intended use of the protocol method then, and from the standpoint of HTTP/1.1-compliant Web servers, using GET in this way is inconsistent.

Using a GET action that triggers the deletion, modification, or addition of a record in a database, or changes server-side state in some way, invites Web caching tools (crawlers) and search engines to make server-side changes unintentionally simply by crawling a link. A simple way to overcome this common problem is to move the parameter names and values on the request URI into the request body. The resulting request body, a JSON representation of the entity to create, can be sent in the body of an HTTP POST whose request URI is the intended parent of the entity. For example:
Before:
GET /adduser?name=Robert HTTP/1.1
After:
POST /users HTTP/1.1
   Host: myserver
   Content-Type: application/json

   { "user": { "name": "Robert" } }

This method is exemplary of a RESTful request: proper use of HTTP POST and inclusion of the payload in the body of the request. On the receiving end, the request can be processed by adding the resource contained in the body as a subordinate of the resource identified in the request URI; in this case the new resource should be added as a child of /users. This containment relationship between the new entity and its parent, as specified in the POST request, is analogous to the way a file is subordinate to its parent directory. The client sets up the relationship between the entity and its parent and defines the new entity's URI in the POST request.
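From the client side, the refactored request might be built as in the following sketch, which uses only the Python standard library. The function name build_create_user_request is hypothetical, the base URL is assumed from the Host header in the example, and the request is only constructed here, not sent.

```python
import json
from urllib.request import Request


def build_create_user_request(base_url: str, name: str) -> Request:
    """Build (but do not send) a POST that creates a user under /users."""
    # The entity to create travels in the body, not in the query string.
    payload = json.dumps({"user": {"name": name}}).encode("utf-8")
    return Request(
        f"{base_url}/users",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen would send it; that step is left out so that the sketch does not depend on a running server.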

A client application can then get a representation of the resource using the new URI, noting that at least logically the resource is located under /users:
GET /users/Robert HTTP/1.1
   Host: myserver
   Accept: application/json
Using GET in this way is explicit because GET is for data retrieval only. GET is an operation that should be free of side effects, a property known as safety; it is also idempotent, meaning that repeating the request has the same effect as making it once. A similar refactoring of a Web method also needs to be applied in cases where an update operation is supported over HTTP GET:
GET /updateuser?name=Robert&newname=Bob HTTP/1.1
This changes the name attribute (or property) of the resource. Query strings aren't a bad thing (they're good for implementing filter specifications, for example) but the query-string-as-method-signature pattern that is used in this simple example can break down when used for more complex operations. Because your goal is to make explicit use of HTTP methods, a more RESTful approach is to send an HTTP PUT request to update the resource, instead of HTTP GET, for the same reasons stated earlier.
PUT /users/Robert HTTP/1.1
   Host: myserver
   Content-Type: application/json

   { "user": { "name": "Bob" } }

Using PUT to replace the original resource provides a much cleaner interface that's consistent with REST's principles and with the definition of HTTP methods. The PUT request in this example is explicit in the sense that it points at the resource to be updated by identifying it in the request URI and in the sense that it transfers a new representation of the resource from client to server in the body of a PUT request instead of transferring the resource attributes as a loose set of parameter names and values on the request URI. This also has the effect of renaming the resource from Robert to Bob, and in doing so changes its URI to /users/Bob. In a REST Web service, subsequent requests for the resource using the old URI would generate a standard 404 Not Found error.

Another consideration is handling large result sets. A standard approach is to use explicit pagination: the GET returns a limited number of objects when it is invoked against a set (irrespective of whether it is filtered), and includes a link to the next page or batch that can be requested. The size of a page can be included on the GET (for example, as a query string parameter) and, if there is a danger of returning too many objects, it should fall back to a default when the caller does not set it.
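A minimal sketch of explicit pagination follows. Everything named here is an assumption made for illustration: the paginate function, the page and pageSize query parameters, the default of 20, the cap of 100, and the shape of the returned dictionary.

```python
from typing import Any, Dict, List, Optional

DEFAULT_PAGE_SIZE = 20   # used when the caller does not set a size
MAX_PAGE_SIZE = 100      # upper bound so a caller cannot request too many


def paginate(items: List[Any], base_uri: str, page: int = 1,
             page_size: Optional[int] = None) -> Dict[str, Any]:
    """Return one page of items plus a link to the next page, if any."""
    if page_size is None:
        size = DEFAULT_PAGE_SIZE
    else:
        size = min(max(page_size, 1), MAX_PAGE_SIZE)
    page = max(page, 1)
    start = (page - 1) * size
    result: Dict[str, Any] = {
        "items": items[start:start + size],
        "page": page,
        "pageSize": size,
    }
    # Only advertise a next page when more items remain.
    if start + size < len(items):
        result["next"] = f"{base_uri}?page={page + 1}&pageSize={size}"
    return result
```

Because the page number arrives with every request, the service does not need to remember where a particular client left off.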

As a general design principle, it helps to follow REST guidelines for using HTTP methods explicitly by using nouns in URIs instead of verbs. In a RESTful Web service, the verbs POST, GET, PUT, and DELETE are already defined by the protocol. And ideally, to keep the interface generalized and to allow clients to be explicit about the operations they invoke, the Web service should not define more verbs or remote procedures, such as /adduser or /updateuser. This general design principle also applies to the body of an HTTP request, which is intended to be used to transfer resource state, not to carry the name of a remote method or remote procedure to be invoked.

Be stateless

REST Web services need to scale to meet increasingly high performance demands. Clusters of servers with load-balancing and failover capabilities, proxies, and gateways are typically arranged in a way that forms a service topology, which allows requests to be forwarded from one server to the other as needed to decrease the overall response time of a Web service call. Using intermediary servers to improve scale requires REST Web service clients to send complete, independent requests; that is, to send requests that include all data needed to be fulfilled so that the components in the intermediary servers can forward, route, and load-balance without any state being held locally in between requests.

A complete, independent request doesn't require the server, while processing the request, to retrieve any kind of application context or state. A REST Web service application (or client) includes within the HTTP headers and body of a request all of the parameters, context, and data needed by the server-side component to generate a response. Statelessness in this sense improves Web service performance and simplifies the design and implementation of server-side components because the absence of state on the server removes the need to synchronize session data with an external application.

Figure 1 illustrates a stateful service from which an application can request the next page in a multi-page result set, assuming that the service keeps track of where the application leaves off while navigating the set. In this stateful design, the service increments and stores a previousPage variable somewhere to be able to respond to requests for next.
Figure 1. Stateful design, where the service increments and stores a previousPage variable somewhere to be able to respond to requests for next

Stateful services like this get complicated. In a Java Platform, Enterprise Edition (Java EE) environment stateful services require a lot of up-front consideration to efficiently store and enable the synchronization of session data across a cluster of Java EE containers. In this type of environment, there's a problem familiar to servlet/JavaServer Pages (JSP) and Enterprise JavaBeans (EJB) developers who often struggle to find the root causes of java.io.NotSerializableException during session replication. Whether it's thrown by the servlet container during HttpSession replication or thrown by the EJB container during stateful EJB replication, it's a problem that can cost developers days in trying to pinpoint the one object that doesn't implement the Serializable interface in a sometimes complex graph of objects that constitute the server's state. In addition, session synchronization adds overhead, which impacts server performance.

Stateless server-side components, on the other hand, are less complicated to design, write, and distribute across load-balanced servers. A stateless service not only performs better, it shifts most of the responsibility of maintaining state to the client application. In a RESTful Web service, the server is responsible for generating responses and for providing an interface that enables the client to maintain application state on its own. For example, in the request for a multi-page result set, the client should include the actual page number to retrieve instead of simply asking for next.
Figure 2. Stateless design, where the client includes the actual page number to retrieve instead of simply asking for next
A stateless Web service generates a response that links to the next page number in the set and lets the client do what it needs to in order to keep this value around. This aspect of RESTful Web service design can be broken down into two sets of responsibilities as a high-level separation that clarifies just how a stateless service can be maintained:
Server
  • Generates responses that include links to other resources to allow applications to navigate between related resources. This type of response embeds links. Similarly, if the request is for a parent or container resource, a typical RESTful response might also include links to the parent's children or subordinate resources so that these remain connected.
  • Generates responses that indicate whether they are cacheable or not to improve performance by reducing the number of requests for duplicate resources, and by eliminating some requests entirely. The server does this by including Cache-Control and Last-Modified (a date value) HTTP response headers.
Client application
  • Uses the Cache-Control response header to determine whether to cache the resource (make a local copy of it) or not. The client also reads the Last-Modified response header and sends back the date value in an If-Modified-Since header to ask the server if the resource has changed. This is called Conditional GET, and the two headers go hand-in-hand in that the server's response is a standard 304 code (Not Modified) and omits the actual resource requested if it has not changed since that time. A 304 HTTP response code means the client can safely use a cached, local copy of the resource representation as the most up-to-date, in effect bypassing subsequent GET requests until the resource changes.
  • Sends complete requests that can be serviced independently of other requests. This requires the client to make full use of HTTP headers as specified by the Web service interface and to send complete representations of resources in the request body. The client sends requests that make very few assumptions about prior requests, the existence of a session on the server, the server's ability to add context to a request, or about application state that is kept in between requests.
This collaboration between client application and service is essential to being stateless in a RESTful Web service. It improves performance by saving bandwidth and minimizing server-side application state.
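The Conditional GET exchange described above can be sketched on the server side as follows. This is an illustration under assumptions, not a complete cache implementation: the conditional_get function and the max-age value of 60 seconds are hypothetical, and last_modified is expected to be a timezone-aware UTC datetime.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime
from typing import Dict, Tuple


def conditional_get(body: bytes, last_modified: datetime,
                    request_headers: Dict[str, str]) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, body); 304 with an empty body if unchanged."""
    headers = {
        "Cache-Control": "max-age=60",  # assumed policy, for illustration only
        "Last-Modified": format_datetime(last_modified, usegmt=True),
    }
    since = request_headers.get("If-Modified-Since")
    if since is not None:
        try:
            since_dt = parsedate_to_datetime(since)
        except (TypeError, ValueError):
            since_dt = None  # ignore a date that cannot be parsed
        if since_dt is not None:
            if since_dt.tzinfo is None:
                since_dt = since_dt.replace(tzinfo=timezone.utc)
            # HTTP dates have one-second resolution, so drop microseconds.
            if last_modified.replace(microsecond=0) <= since_dt:
                return 304, headers, b""
    return 200, headers, body
```

A client that stored the Last-Modified value from an earlier 200 response sends it back in If-Modified-Since; a 304 then tells it that the local copy is still current.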

Data format

A resource representation typically reflects the current state of a resource, and its attributes, at the time a client application requests it. Resource representations in this sense are mere snapshots in time. This could be as simple as a representation of a record in a database that consists of a mapping between column names and JSON fields, where the element values in the JSON contain the row values. Or, if the system has a data model, according to this definition a resource representation is a snapshot of the attributes of one of the things in your system's data model. These are the things you want your REST Web service to serve up.

The last set of constraints that goes into a RESTful Web service design has to do with the format of the data that the application and service exchange in the request/response payload or in the HTTP body. This is where it really pays to keep things simple, human-readable, and connected. The objects in your data model are usually related in some way, and the relationships between data model objects (resources) should be reflected in the way they are represented for transfer to a client application. In the discussion threading service, an example of connected resource representations might include a root discussion topic and its attributes, and embed links to the responses given to that topic.

And last, to give client applications the ability to request a specific content type that's best suited for them, construct your service so that it makes use of the built-in HTTP Accept header, where the value of the header is a MIME type. The JSON MIME type, application/json, is typically used by RESTful services: a client requests it with the header Accept: application/json, and the message that carries the JSON payload declares it with the header Content-Type: application/json.
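One way a service might honor the Accept header is sketched below. The choose_media_type function is hypothetical and deliberately simplified: it reads the q weights and supports wildcards, but it does not implement the full precedence rules of HTTP content negotiation.

```python
from typing import List, Optional, Tuple


def choose_media_type(accept_header: str, supported: List[str]) -> Optional[str]:
    """Pick the supported MIME type with the highest client preference."""
    preferences: List[Tuple[str, float]] = []
    for part in accept_header.split(","):
        fields = [field.strip() for field in part.split(";")]
        media = fields[0].lower()
        if not media:
            continue
        weight = 1.0  # a missing q parameter means full preference
        for field in fields[1:]:
            if field.startswith("q="):
                try:
                    weight = float(field[2:])
                except ValueError:
                    weight = 0.0
        preferences.append((media, weight))

    best: Optional[str] = None
    best_weight = 0.0
    for candidate in supported:
        for media, weight in preferences:
            matches = (
                media in (candidate, "*/*")
                or (media.endswith("/*") and candidate.startswith(media[:-1]))
            )
            if matches and weight > best_weight:
                best, best_weight = candidate, weight
    # None means nothing acceptable; a service could answer 406 Not Acceptable.
    return best
```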

Designing APIs for use with interceptors

Service interceptors are not called for API requests. Interceptors that are configured for services are only triggered if the service is invoked directly from an HTTP or HTTPS request. They are not triggered if the service is invoked from an API.

Consider the following criteria when designing your APIs to work with interceptors:
  • You should only combine services that have the same security constraints (authorization and audit) into a single API. For example, do not combine getBalance and accountTransfer services into the same API if the authorization or audit requirements of these banking services are different.
  • You should only combine services that have the same logging constraints into a single API.
  • You should only combine services into a single API if API-level monitoring is sufficient. For example, you want to monitor the number of API requests to an account but not the number of balance inquiries, postings, or transfers.