Designing RESTful APIs
The way that you design APIs can have a significant impact on their adoption. This section lists considerations for API design in general, and principles for effective RESTful implementation in particular.
Think consumer
APIs don't exist in isolation. Modern APIs are the way in which the capabilities of services are shared with others. When implemented correctly, APIs that are used inside your organization can enforce consistency and promote efficient reuse. Public APIs that are used outside your organization can expand the reach of your business, by allowing developers to extend the services that you provide. Ease-of-use for consumers is vital for the adoption of the API.
Consumers are developers. They could be developers in your own organization, or a mix of internal and third-party developers. These developers expect APIs that make it quick for them to deliver: quick to learn, easy to use, and aimed at their use cases. Using the API must be faster and more expedient than coding an alternative solution. A successful API encourages developers to use it and to share it with other developers.
- Decide what function needs to be exposed, and how.
- Model an API that supports the needs of the user and follows RESTful principles.
- Who is the user? You might have one clear target user for the API or a mix of users. If you have a mix of users, you must understand each of them.
- What do they want? Instead of focusing on what the API can do, that is, its functions and capabilities, think about the ways in which it might be used.
- How can you make those use cases as easy as possible? Think about:
  - Stability: How can you minimize disruption for the consumer when you change the API?
  - Flexibility: Although you can't exhaustively cover every possibility, how can you build in some flexibility for the consumer? A simple example is allowing either uppercase or lowercase input.
  - Consistency: What standards can you set for your API so that consumers know what to expect?
  - Documentation: What documentation can you provide, and how do you make this as straightforward to use as possible?
Think resources
Traditional services focused on methods, such as "createAccount" or "updateAccount". Designing RESTful services means that you have to think differently: you focus on resources. For example, a resource could be "Account", and the standard HTTP methods are then used to operate on that resource. These methods act as verbs for the nouns of the resources.
The verbs POST, GET, PUT, and DELETE are already defined. Try to handle all operations with a combination of these verbs and the resources. The more bespoke verbs that you define, the less generalized your interface becomes.
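As a minimal sketch of this resource-oriented style, the following Python models an in-memory Account resource on which the four standard verbs are the only operations. The `handle` function, the `accounts` store, and the status codes chosen here are illustrative assumptions, not part of any particular framework or product.

```python
# Hypothetical in-memory store: account id -> account representation.
accounts = {}

STANDARD_VERBS = ("POST", "GET", "PUT", "DELETE")


def handle(method, account_id, body=None):
    """Apply a standard HTTP verb to the Account resource.

    Returns a (status_code, representation) tuple.
    """
    if method not in STANDARD_VERBS:
        return 405, None  # no bespoke verbs are defined
    if method == "POST":
        if account_id in accounts:
            return 409, None  # the account already exists
        accounts[account_id] = body
        return 201, body
    if account_id not in accounts:
        return 404, None
    if method == "GET":
        return 200, accounts[account_id]
    if method == "PUT":
        accounts[account_id] = body  # replace the stored representation
        return 200, body
    # The only remaining verb is DELETE.
    del accounts[account_id]
    return 204, None
```

The same four verbs would apply unchanged to any other resource, which is what keeps the interface generalized.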
Design the URIs
For client applications that address resources, the URIs determine how intuitive the REST Web service is and whether the service will be used in ways that the designers can anticipate. REST Web service URIs should be intuitive to the point where they are easy to guess. Think of a URI as a kind of self-documenting interface that requires little, if any, explanation or reference for a developer to understand what it points to and to derive related resources. To this end, the structure of a URI should be straightforward, predictable, and easily understood.
For example, a discussion service might expose its topics through URIs of this form:
http://www.myservice.org/discussion/topics/{topic}
The root, /discussion, has a /topics node beneath it. Underneath that are a series of topic names, such as technology, each of which points to a discussion thread. Within this structure, it's easy to pull up discussion threads just by typing something after /topics/.
In some cases, the path to a resource lends itself especially well to a directory-like structure. Resources organized by date, for instance, are a very good match for a hierarchical syntax. This example is intuitive because it is based on rules:
http://www.myservice.org/discussion/2008/12/10/{topic}
The first path fragment is a four-digit year, the second path fragment is a two-digit day, and the third fragment is a two-digit month. This is the level of simplicity we're after. Humans and machines can easily generate structured URIs like this because there is a definite pattern from which to compose them, with each path part filling a slot in the syntax:
http://www.myservice.org/discussion/{year}/{day}/{month}/{topic}
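Because the URI follows a fixed pattern, composing and recognizing it can be mechanical. The following sketch, with hypothetical helper names `build_uri` and `parse_uri`, assumes the fragment order described above (year, then day, then month) and works on the path portion of the URI only.

```python
import re

# Pattern for /discussion/{year}/{day}/{month}/{topic}.
DISCUSSION_URI = re.compile(
    r"^/discussion/(?P<year>\d{4})/(?P<day>\d{2})/(?P<month>\d{2})/(?P<topic>[^/]+)$"
)


def build_uri(year, day, month, topic):
    """Compose a discussion path by filling the slots of the template."""
    return f"/discussion/{year:04d}/{day:02d}/{month:02d}/{topic}"


def parse_uri(path):
    """Return the path fragments as a dict, or None if the path does not match."""
    match = DISCUSSION_URI.match(path)
    return match.groupdict() if match else None
```

A path that breaks the rules, such as one with a two-digit year, does not match, so it can be rejected or redirected consistently.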
- Hide the server-side scripting technology file extensions (.jsp, .php, .asp), if any, so you can convert to another scripting language without changing the URIs.
- Keep everything lowercase.
- Substitute spaces with either hyphens or underscores.
- Avoid query strings as much as you can.
- Instead of using the 404 Not Found code if the request URI is for a partial path, always provide a default page or resource as a response.
- URIs should also be static so that when the resource changes or the implementation of the service changes, the link stays the same. This allows bookmarking. It's also important that the relationship between resources that is encoded in the URIs remains independent of the way the relationships are represented where they are stored.
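Several of these guidelines can be applied automatically when a path segment is derived from a resource name. The sketch below is one possible approach: the function name `to_uri_segment` is hypothetical, and it assumes hyphens (rather than underscores) as the replacement for spaces.

```python
import re

# Extensions that reveal the server-side scripting technology.
SCRIPT_EXTENSIONS = (".jsp", ".php", ".asp")


def to_uri_segment(name):
    """Normalize a resource name into a lowercase, hyphenated path segment."""
    segment = name.strip().lower()
    for ext in SCRIPT_EXTENSIONS:
        if segment.endswith(ext):
            segment = segment[: -len(ext)]  # hide the technology extension
            break
    # Replace each run of whitespace with a single hyphen.
    return re.sub(r"\s+", "-", segment)
```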
Apply HTTP methods explicitly
One of the key characteristics of a RESTful Web service is the explicit use of HTTP methods in a way that follows the protocol as defined by RFC 2616 (an HTTP/1.1 specification that has since been superseded, most recently by RFC 9110). HTTP GET, for instance, is defined as a data-producing method that's intended to be used by a client application to retrieve a resource, to fetch data from a Web server, or to execute a query with the expectation that the Web server will look for and respond with a set of matching resources. The basic create, retrieve, update, and delete operations map onto HTTP methods as follows:
- To create a resource on the server, use POST.
- To retrieve a resource, use GET.
- To change the state of a resource or to update it, use PUT.
- To remove or delete a resource, use DELETE.
An unfortunate design flaw inherent in many Web APIs is the use of HTTP methods for unintended purposes. The request URI in an HTTP GET request, for example, usually identifies one specific resource, or the query string in the request URI includes a set of parameters that defines the search criteria used by the server to find a set of matching resources. At least this is how the HTTP/1.1 RFC describes GET. But there are many cases of unattractive Web APIs that use HTTP GET to trigger something transactional on the server: for instance, to add records to a database. In these cases the GET request URI is not used properly, or at least is not used according to RESTful design principles. If the Web API uses GET to invoke remote procedures, it looks like this:
GET /adduser?name=Robert HTTP/1.1
It's not a very attractive design because this Web method supports a state-changing operation over HTTP GET. Put another way, this HTTP GET request has side effects. If successfully processed, the result of the request is to add a new user (in this example, Robert) to the underlying data store. The problem here is mainly semantic. Web servers are designed to respond to HTTP GET requests by retrieving resources that match the path (or the query criteria) in the request URI and returning these or a representation in a response, not to add a record to a database. From the standpoint of the intended use of the protocol method then, and from the standpoint of HTTP/1.1-compliant Web servers, using GET in this way is inconsistent.
Before:
GET /adduser?name=Robert HTTP/1.1
After:
POST /users HTTP/1.1
Host: myserver
Content-Type: application/json
{ "user": { "name": "Robert" } }
This method is exemplary of a RESTful request: proper use of HTTP POST and inclusion of the payload in the body of the request. On the receiving end, the request can be processed by adding the resource contained in the body as a subordinate of the resource identified in the request URI; in this case the new resource should be added as a child of /users. This containment relationship between the new entity and its parent, as specified in the POST request, is analogous to the way a file is subordinate to its parent directory. The client sets up the relationship between the entity and its parent and defines the new entity's URI in the POST request.
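On the server side, this containment relationship might be handled along the following lines. This is only a sketch: `create_child` and the dictionary-based `store` are hypothetical, and it assumes that the child's URI is formed from the parent URI plus the user's name, and that the server reports the new URI in a Location header with a 201 Created status.

```python
def create_child(store, parent_uri, name, representation):
    """Add a resource beneath parent_uri and report where it now lives.

    Returns a (status_code, response_headers) tuple.
    """
    new_uri = f"{parent_uri.rstrip('/')}/{name}"
    if new_uri in store:
        return 409, {}  # a child with this name already exists
    store[new_uri] = representation
    # 201 Created, with the URI of the new subordinate resource.
    return 201, {"Location": new_uri}
```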
HTTP GET request:
GET /users/Robert HTTP/1.1
Host: myserver
Accept: application/json
Updates follow the same pattern.
Before:
GET /updateuser?name=Robert&newname=Bob HTTP/1.1
After:
PUT /users/Robert HTTP/1.1
Host: myserver
Content-Type: application/json
{ "user": { "name": "Bob" } }
Using PUT to replace the original resource provides a much cleaner interface that's consistent with REST's principles and with the definition of HTTP methods. The PUT request in this example is explicit in the sense that it points at the resource to be updated by identifying it in the request URI and in the sense that it transfers a new representation of the resource from client to server in the body of a PUT request instead of transferring the resource attributes as a loose set of parameter names and values on the request URI. This also has the effect of renaming the resource from Robert to Bob, and in doing so changes its URI to /users/Bob. In a REST Web service, subsequent requests for the resource using the old URI would generate a standard 404 Not Found error.
Another consideration is handling large result sets. A standard approach is to use explicit pagination: the GET returns a limited number of objects when it is invoked against a set (irrespective of whether it is filtered), and includes a link to the next page or batch that can be requested. The size of a page can be included on the GET (for example, as a query string parameter) and, if there is a danger of returning too many objects, it should be set to a default value if the caller does not set it.
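The pagination approach can be sketched as follows. The parameter names `offset` and `limit`, the default and maximum page sizes, and the shape of the returned dictionary are all assumptions made for illustration; an API can use whatever names it standardizes on.

```python
DEFAULT_PAGE_SIZE = 20  # assumed default, used when the caller omits the size
MAX_PAGE_SIZE = 100     # assumed upper bound to avoid oversized responses


def paginate(items, base_uri, offset=0, limit=None):
    """Return one page of items plus a link to the next page, if any."""
    if limit is None:
        limit = DEFAULT_PAGE_SIZE
    limit = max(1, min(limit, MAX_PAGE_SIZE))
    offset = max(0, offset)
    page = items[offset:offset + limit]
    next_offset = offset + limit
    next_link = None
    if next_offset < len(items):
        # More items remain, so tell the client where to find them.
        next_link = f"{base_uri}?offset={next_offset}&limit={limit}"
    return {"items": page, "next": next_link}
```

Because the next link carries the offset and limit, the client can follow it without the server remembering where the previous page ended.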
As a general design principle, it helps to follow REST guidelines for using HTTP methods explicitly by using nouns in URIs instead of verbs. In a RESTful Web service, the verbs POST, GET, PUT, and DELETE are already defined by the protocol. And ideally, to keep the interface generalized and to allow clients to be explicit about the operations they invoke, the Web service should not define more verbs or remote procedures, such as /adduser or /updateuser. This general design principle also applies to the body of an HTTP request, which is intended to be used to transfer resource state, not to carry the name of a remote method or remote procedure to be invoked.
Be stateless
REST Web services need to scale to meet increasingly high performance demands. Clusters of servers with load-balancing and failover capabilities, proxies, and gateways are typically arranged in a way that forms a service topology, which allows requests to be forwarded from one server to the other as needed to decrease the overall response time of a Web service call. Using intermediary servers to improve scale requires REST Web service clients to send complete, independent requests; that is, to send requests that include all data needed to be fulfilled so that the components in the intermediary servers can forward, route, and load-balance without any state being held locally in between requests.
A complete, independent request doesn't require the server, while processing the request, to retrieve any kind of application context or state. A REST Web service application (or client) includes within the HTTP headers and body of a request all of the parameters, context, and data needed by the server-side component to generate a response. Statelessness in this sense improves Web service performance and simplifies the design and implementation of server-side components because the absence of state on the server removes the need to synchronize session data with an external application.
Stateful services, by contrast, get complicated. In a Java Platform, Enterprise Edition (Java EE) environment, stateful services require a lot of up-front consideration to efficiently store and enable the synchronization of session data across a cluster of Java EE containers. In this type of environment, there's a problem familiar to servlet/JavaServer Pages (JSP) and Enterprise JavaBeans (EJB) developers who often struggle to find the root causes of java.io.NotSerializableException during session replication. Whether it's thrown by the servlet container during HttpSession replication or thrown by the EJB container during stateful EJB replication, it's a problem that can cost developers days in trying to pinpoint the one object that doesn't implement the Serializable interface in a sometimes complex graph of objects that constitute the server's state. In addition, session synchronization adds overhead, which impacts server performance.
In a stateless design, the server and the client application divide responsibilities as follows:
- Server
  - Generates responses that include links to other resources to allow applications to navigate between related resources. This type of response embeds links. Similarly, if the request is for a parent or container resource, a typical RESTful response might also include links to the parent's children or subordinate resources so that these remain connected.
  - Generates responses that indicate whether they are cacheable or not to improve performance by reducing the number of requests for duplicate resources, and by eliminating some requests entirely. The server does this by including a Cache-Control and Last-Modified (a date value) HTTP response header.
- Client application
  - Uses the Cache-Control response header to determine whether to cache the resource (make a local copy of it) or not. The client also reads the Last-Modified response header and sends back the date value in an If-Modified-Since header to ask the server if the resource has changed. This is called Conditional GET, and the two headers go hand-in-hand in that the server's response is a standard 304 code (Not Modified) and omits the actual resource requested if it has not changed since that time. A 304 HTTP response code means the client can safely use a cached, local copy of the resource representation as the most up-to-date, in effect bypassing subsequent GET requests until the resource changes.
  - Sends complete requests that can be serviced independently of other requests. This requires the client to make full use of HTTP headers as specified by the Web service interface and to send complete representations of resources in the request body. The client sends requests that make very few assumptions about prior requests, the existence of a session on the server, the server's ability to add context to a request, or about application state that is kept in between requests.
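The Conditional GET exchange can be sketched from the server's point of view as follows. The function `conditional_get`, the dictionary of request headers, and the `max-age=60` caching policy are assumptions for illustration; only the header names and the 304 status come from HTTP itself. Everything the server needs arrives with the request, so no session state is involved.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def conditional_get(resource, last_modified, request_headers):
    """Return (status, headers, body) for a GET that may be conditional.

    last_modified must be a timezone-aware UTC datetime.
    """
    # HTTP dates have one-second resolution, so drop microseconds first.
    last_modified = last_modified.replace(microsecond=0)
    headers = {
        "Cache-Control": "max-age=60",  # assumed caching policy
        "Last-Modified": format_datetime(last_modified, usegmt=True),
    }
    since = request_headers.get("If-Modified-Since")
    if since is not None:
        try:
            if last_modified <= parsedate_to_datetime(since):
                return 304, headers, None  # the client's copy is still current
        except (TypeError, ValueError):
            pass  # unusable date: fall through and send the full resource
    return 200, headers, resource


# First request: no validator, so the full representation is returned.
modified = datetime(2008, 12, 10, 12, 0, tzinfo=timezone.utc)
status, headers, body = conditional_get({"topic": "technology"}, modified, {})

# Follow-up request: the client echoes Last-Modified as If-Modified-Since.
status_again, _, body_again = conditional_get(
    {"topic": "technology"}, modified,
    {"If-Modified-Since": headers["Last-Modified"]},
)
```

In this sketch the first call yields a 200 response with the body, and the follow-up yields 304 with no body, because the resource has not changed in between.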
Data format
A resource representation typically reflects the current state of a resource, and its attributes, at the time a client application requests it. Resource representations in this sense are mere snapshots in time. This could be as simple as a representation of a record in a database that consists of a mapping between column names and JSON fields, where the element values in the JSON contain the row values. Or, if the system has a data model, according to this definition a resource representation is a snapshot of the attributes of one of the things in your system's data model. These are the things you want your REST Web service to serve up.
The last set of constraints that goes into a RESTful Web service design has to do with the format of the data that the application and service exchange in the request/response payload or in the HTTP body. This is where it really pays to keep things simple, human-readable, and connected.
The objects in your data model are usually related in some way, and the relationships between data model objects (resources) should be reflected in the way they are represented for transfer to a client application. In the discussion threading service, an example of connected resource representations might include a root discussion topic and its attributes, and embed links to the responses given to that topic.
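A connected representation of a discussion topic might be assembled as in the following sketch. The field names (`name`, `title`, `links`), the `/responses/{id}` path, and the sample data are hypothetical; the point is only that the snapshot carries links to related resources instead of leaving the client to guess them.

```python
import json


def topic_representation(topic, response_ids):
    """Build a snapshot of a topic that links to its responses."""
    topic_uri = f"/discussion/topics/{topic['name']}"
    return {
        "name": topic["name"],
        "title": topic["title"],
        "links": {
            "self": topic_uri,
            # One link per response so that clients can navigate to each.
            "responses": [f"{topic_uri}/responses/{rid}" for rid in response_ids],
        },
    }


# Serialize the representation for the response body.
payload = json.dumps(
    topic_representation({"name": "technology", "title": "Technology"}, [1, 2])
)
```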
And last, to give client applications the ability to request a specific content type that's best suited for them, construct your service so that it makes use of the built-in HTTP Accept header, where the value of the header is a MIME type. The JSON MIME type is typically used by RESTful services: a client requests it with the header Accept: application/json, and the header Content-Type: application/json identifies a JSON payload in the body of a request or response.
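A service that honors the Accept header has to pick one of the types it can produce. The following is a simplified sketch of that selection, not a complete implementation of HTTP content negotiation: the function name `choose_content_type` and the list of supported types are assumptions, and only the `q` parameter and simple wildcards are handled.

```python
SUPPORTED_TYPES = ("application/json", "application/xml")  # assumed for illustration


def choose_content_type(accept_header, supported=SUPPORTED_TYPES):
    """Pick the supported MIME type that the client prefers, or None."""
    if not accept_header:
        return supported[0]  # no preference stated: use the default type
    candidates = []
    for position, part in enumerate(accept_header.split(",")):
        fields = [field.strip() for field in part.split(";")]
        media_type, quality = fields[0].lower(), 1.0
        for param in fields[1:]:
            if param.startswith("q="):
                try:
                    quality = float(param[2:])
                except ValueError:
                    quality = 0.0
        candidates.append((-quality, position, media_type))
    # Highest quality first; ties keep the order in which the client listed them.
    for neg_quality, _, media_type in sorted(candidates):
        if neg_quality == 0:
            continue  # q=0 means "not acceptable"
        for offered in supported:
            if media_type in ("*/*", offered) or (
                media_type.endswith("/*") and offered.startswith(media_type[:-1])
            ):
                return offered
    return None
```

When the function returns None, the service cannot satisfy the request in any acceptable format; responding with 406 Not Acceptable is one conventional choice.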
Designing APIs for use with interceptors
Service interceptors are not called for API requests. Interceptors that are configured for services are only triggered if the service is invoked directly from an HTTP or HTTPS request. They are not triggered if the service is invoked from an API.
- You should only combine services that have the same security constraints (authorization and audit) into a single API. For example, do not combine getBalance and accountTransfer services into the same API if the authorization or audit requirements of these banking services are different.
- You should only combine services that have the same logging constraints into a single API.
- You should only combine services into a single API if API-level monitoring is sufficient. For example, you want to monitor the number of API requests to an account but not the number of balance inquiries, postings, or transfers.