Hello everybody. This is the first in a series of articles describing how to build the authentication scheme for your own REST API and your mobile client App. Throughout the series, we will assume the following scenario:

You are developing a RESTful API as the backend for a mobile App frontend. The App will communicate with your REST API to perform CRUD operations on the data. You will allow the user to authenticate against the REST server with a common username/password scheme, and you will also allow social login by means of the usual “login with Facebook” or “login with Twitter” buttons in your App. You want to integrate the social authentication seamlessly into your App’s user experience. Once logged in, the App will communicate with the REST API on behalf of the user to perform the CRUD operations.

This first article focuses on the basics of OAuth, specifically its second version, OAuth 2.0. The first step in adopting a technology is understanding it.

So, what is OAuth 2.0?

OAuth is an authorization protocol. It was born because certain web services needed to access information held by other web services, and each service had its own username and password authentication scheme. If service B needed some information from service A, the user would end up entering his or her service A username and password into service B.

This situation became a real problem when Yahoo! acquired Flickr and Google bought Blogger. Some Google services, like Google Calendar, allowed developers to access the information from calendars if the user provided the credentials. The idea of Yahoo! asking for Google passwords was found unacceptable, so both firms developed proprietary protocols for handling authorization. As developers were not happy about integrating a different custom authorization scheme into their apps for every web service, it was decided that a standard protocol was needed to handle authorization over users’ information.

The first version of OAuth, OAuth 1.0, relied on cryptographic signatures on each request to protect the authorization information transmitted by the protocol. With the widespread adoption of TLS/SSL, OAuth 2.0 dropped the mandatory request signatures and relies on the encrypted connection instead. By using OAuth, we can get authorization from the user to retrieve certain information or perform certain actions on his or her behalf on third party services, such as Facebook, Twitter, Google+, Google Calendar, Yahoo!, etcetera.

It is important to emphasize that OAuth is not an authentication protocol; it is an authorization protocol. Authentication means verifying the identity of the user by his or her credentials, whereas authorization means granting access to certain information about the user. As I wrote in a previous post, most developers nowadays are using OAuth as an authentication protocol, thus employing the wrong protocol for the task. Nevertheless, we are going to talk about how you can use OAuth 2.0 to achieve a (kind of) delegated authentication scheme.

OAuth 2.0 implementations usually follow REST conventions, typically using GET and POST requests, and the server’s responses are usually expressed in JSON format, although this depends on each service’s implementation. Facebook, for example, returns the access token as a plain text string.

Meet the cast

When discussing the OAuth protocol, we will refer to the following actors:

  • Resource server: The server hosting user-owned resources that are protected by OAuth. This is typically an API provider that holds and protects data such as photos, videos, calendars, or contacts.
  • Resource owner: Typically the user of an application, the resource owner has the ability to grant access to their own data hosted on the resource server.
  • Client (or client application): An application making API requests to perform actions on protected resources on behalf of the resource owner and with its authorization.
  • Authorization server: The authorization server gets consent from the resource owner and issues access tokens to clients for accessing protected resources hosted by a resource server. Smaller API providers may use the same application and URL space for both the authorization server and resource server.
  • Access token: A key that allows a client (an App or web client acting on behalf of the resource owner) to access the information protected by OAuth on a resource server. In OAuth 2.0, these access tokens are called “bearer tokens”, and can be used alone, with no signature or cryptography, to access the information. Access tokens are usually passed to the servers in a request header, in the form of “Authorization: Bearer <token-string>” (this is the recommended way of sending the access token), although depending on the service’s OAuth implementation, the access token can also be passed as a POST parameter or even as part of the GET URI.
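
As a minimal sketch of the recommended header form, the snippet below attaches a bearer token to a request using only Python’s standard library. The URL https://api.example.com/tasks and the helper name build_bearer_request are hypothetical and only serve the illustration; no request is actually sent.

```python
import urllib.request


def build_bearer_request(url: str, access_token: str) -> urllib.request.Request:
    """Return a GET request that carries the token in the Authorization header."""
    request = urllib.request.Request(url, method="GET")
    # Bearer tokens are sent as-is: no signature, no extra cryptography.
    request.add_header("Authorization", f"Bearer {access_token}")
    return request


# Hypothetical resource server URL and token value.
request = build_bearer_request("https://api.example.com/tasks", "abcde")
print(request.get_header("Authorization"))  # Bearer abcde
```

Because anybody holding the token can use it, a request like this should only ever travel over HTTPS.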

Know your flows

There are different scenarios where OAuth can be deployed, with different types of clients and different protocol flows for obtaining authorization. OAuth 2.0 defines four “grant types” to handle four authorization flows, depending on the kind of client accessing the information and the kind of access it needs. Some flows are complex and difficult to understand, but I will try to explain them in a developer-friendly manner.

Server-side web application flow

This flow is used in the following scenario: we have a user accessing your web service by means of a browser. Your web service is the client of the authorization server, which grants it authorization on behalf of the user.

When the client needs authorization to access some information about the user, it redirects the resource owner’s browser (the user agent) to the OAuth authorization server. There, the user is faced with an authentication dialog (this dialog is not shown if the user is already authenticated), after which he or she is presented with an authorization dialog explaining the permissions that the client is requesting: the information that it needs to access or the actions that it needs to perform on his or her behalf. For example, if the OAuth authorization service is Facebook, the request will be:

https://www.facebook.com/dialog/oauth?
client_id={app-id}&
redirect_uri={redirect-uri}
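
A minimal sketch of how the client could assemble this URL, assuming the parameter names shown above. The values my-app-id and the redirect URI are placeholders, and the helper name build_authorization_url is hypothetical. The point of the example is that the redirect URI has to be URL-encoded before it is placed in the query string.

```python
from urllib.parse import urlencode

# Authorization endpoint taken from the request above.
AUTHORIZATION_ENDPOINT = "https://www.facebook.com/dialog/oauth"


def build_authorization_url(client_id: str, redirect_uri: str) -> str:
    """Build the URL the user's browser is redirected to."""
    # urlencode percent-encodes the values, so the redirect URI
    # survives being embedded inside another URL.
    params = {"client_id": client_id, "redirect_uri": redirect_uri}
    return f"{AUTHORIZATION_ENDPOINT}?{urlencode(params)}"


url = build_authorization_url("my-app-id", "http://www.example.com/oauth_code")
print(url)
```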

The client ID (and its associated client secret) can usually be obtained by registering a developer account with the service and creating a new application in its developer portal. Facebook, for example, allows the developer to create an App at http://developers.facebook.com. The redirect-uri will be a URI of your REST server, for example: http://www.example.com/oauth_code. Once the user has granted access, the authorization service will redirect the browser to the redirect-uri, including a code that the client REST server can exchange for the desired access token. For example, the Facebook server would redirect the user to:

http://www.example.com/oauth_code?code=1234
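
On the REST server, the handler for this redirect has to pull the code out of the query string. The following is a rough sketch using the standard library; the function name is hypothetical, and the handling of an “error” parameter assumes the service reports a denied request that way, so check your provider’s documentation.

```python
from urllib.parse import parse_qs, urlsplit


def extract_authorization_code(callback_url: str) -> str:
    """Return the authorization code carried by the redirect URL."""
    query = parse_qs(urlsplit(callback_url).query)
    # Assumption: a refused authorization comes back as ?error=...
    if "error" in query:
        raise ValueError(f"authorization failed: {query['error'][0]}")
    if "code" not in query:
        raise ValueError("callback did not include an authorization code")
    return query["code"][0]


code = extract_authorization_code("http://www.example.com/oauth_code?code=1234")
print(code)  # 1234
```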

The client REST server will then send another request to the authorization server, including this code. The redirect_uri must be the same one that was used in the first request. In the case of Facebook it would be:

https://graph.facebook.com/oauth/access_token?
client_id={app-id}&
redirect_uri=http://www.example.com/oauth_code&
client_secret={app-secret}&
code=1234

If everything goes well, the authorization server will respond with the access token, and optionally a refresh token, in the body of its response:

access_token=abcde&expires={seconds}&refresh_token=fghijk
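
Putting the exchange together, here is a sketch of the server-side step under a few assumptions: the response body is a plain text, form-encoded string as described for Facebook, and the HTTP call is injected as a fetch function so the example runs without a network. The names exchange_code_for_token and fake_fetch, as well as all the values, are hypothetical.

```python
from typing import Callable, Dict
from urllib.parse import parse_qs, urlencode

TOKEN_ENDPOINT = "https://graph.facebook.com/oauth/access_token"


def exchange_code_for_token(
    code: str,
    client_id: str,
    client_secret: str,
    redirect_uri: str,
    fetch: Callable[[str], str],
) -> Dict[str, str]:
    """Exchange an authorization code for tokens via the injected fetch."""
    params = {
        "client_id": client_id,
        "redirect_uri": redirect_uri,  # same URI as in the first request
        "client_secret": client_secret,  # never leaves the REST server
        "code": code,
    }
    body = fetch(f"{TOKEN_ENDPOINT}?{urlencode(params)}")
    # Assumption: the body is form-encoded text, not JSON.
    return {key: values[0] for key, values in parse_qs(body).items()}


def fake_fetch(url: str) -> str:
    """Stand-in for a real HTTPS call; returns a canned response body."""
    return "access_token=abcde&expires=3600&refresh_token=fghijk"


tokens = exchange_code_for_token(
    "1234", "my-app-id", "my-app-secret",
    "http://www.example.com/oauth_code", fake_fetch,
)
print(tokens["access_token"])  # abcde
```

In a real server, fetch would perform the HTTPS request and return the response body as text.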

The access token is a temporary key that can be used to access the information or actions that the user granted, for a short amount of time that is typically indicated by the “expires” or “expires_in” value. After this time, the token will be invalidated and you will no longer be able to access the resources with it. That’s the reason why you can also ask for an additional refresh token, which allows “offline access” to the data, i.e. it allows the client (your REST server in this case) to obtain new access tokens without asking the user again for permission. This “offline access” mode must be indicated in the initial request to the authorization server, so the user will be notified that he or she is granting it.

The Server-side web application flow is especially designed for long-lived access where the OAuth client is your REST API server, and the resource owner accesses the information by means of a web app displayed in a browser. Its main advantage is that the access token never leaves the REST server, so it is never exposed to the user’s browser.
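
One way the client could keep track of a token’s lifetime is to convert the relative “expires_in” value into an absolute expiry time as soon as the token arrives. This is only a sketch: the class name StoredToken and the 60 second safety margin are assumptions, not part of OAuth.

```python
import time
from dataclasses import dataclass
from typing import Optional


@dataclass
class StoredToken:
    access_token: str
    expires_at: float  # absolute UNIX timestamp
    refresh_token: Optional[str] = None

    @classmethod
    def from_response(cls, access_token, expires_in, refresh_token=None, now=None):
        """Turn the relative lifetime (in seconds) into an absolute expiry time."""
        issued_at = time.time() if now is None else now
        return cls(access_token, issued_at + expires_in, refresh_token)

    def needs_refresh(self, now=None, leeway=60.0):
        """True once the token is expired or within `leeway` seconds of expiring."""
        current = time.time() if now is None else now
        return current >= self.expires_at - leeway


# Fixed timestamps are passed in so the outcome does not depend on the clock.
token = StoredToken.from_response("abcde", 3600, "fghijk", now=1000.0)
print(token.needs_refresh(now=1000.0))  # False
print(token.needs_refresh(now=4550.0))  # True
```

When needs_refresh returns True, the client would use the refresh token (if it has one) to obtain a new access token instead of sending the user through the dialogs again.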

Client-side web application flow

Commonly called the “implicit” flow, this grant type is aimed at situations where the OAuth client runs in the browser. This flow can only provide short-lived tokens, not refresh tokens, and does not require an intermediate authorization code like the Server-side web application flow. The experience provided by this flow is better if the user is always (or most of the time) logged in to the OAuth provider. Some services, like Google, will not show the authorization dialog if the user is logged in and has previously approved the same permissions, which limits the impact that the short life of the access token has on the user experience.

In this scenario, when the client needs to get authorization from the resource owner, it will redirect the user to the authorization server, where he or she is asked to authenticate (if not already logged in) and then to authorize the requested permissions. The request will include a redirect_uri to which the user’s browser will be redirected if he or she grants access. For example, for Google services, this request would be:

https://accounts.google.com/o/oauth2/auth?
client_id={app-id}&
redirect_uri=http://www.example.com/oauth_token&
scope={permissions}&
response_type=token

In this case, the “response_type=token” parameter selects the client-side web application flow. After access has been granted, the browser is redirected to the redirect_uri address, which now includes an access_token that the client can use directly to request information or perform operations on behalf of the user:

http://www.example.com/oauth_token#access_token=abcde&expires_in=3600

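Notice how in this case the access token is separated from the server’s base URL by a ‘#’ sign. This means the token travels in the URL fragment, which the browser keeps to itself and does not send to the server, so it is up to the client code running in the browser to read it. There are differences in the way different services implement the OAuth protocol, so be sure to read the API documentation of each service you want to include in your application.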
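
In a real web client this parsing would be done by JavaScript running in the page, but the logic is the same in any language. The sketch below uses Python for consistency with the other examples, with a hypothetical function name, and shows that the token lives in the fragment while the query string stays empty.

```python
from urllib.parse import parse_qs, urlsplit


def parse_token_fragment(redirect_url: str) -> dict:
    """Read the access token and its lifetime from the URL fragment."""
    parts = urlsplit(redirect_url)
    values = parse_qs(parts.fragment)
    return {
        "access_token": values["access_token"][0],
        "expires_in": int(values["expires_in"][0]),
    }


redirect_url = "http://www.example.com/oauth_token#access_token=abcde&expires_in=3600"
print(parse_token_fragment(redirect_url))
# The query string is empty: nothing after '#' is part of the request.
print(repr(urlsplit(redirect_url).query))  # ''
```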

Resource owner password flow

The Resource owner password flow uses a username and password to obtain an access token (and optionally a refresh token). This flow is mostly used in official applications (e.g. the official Twitter App) where it is acceptable to ask the user for a password. As one of the main goals behind OAuth was to avoid having the user enter his or her username and password when accessing the service through third party Apps, this flow is discouraged for other uses.

As the password is exposed to the application, there must be strong trust in it. The password is supposed to be exchanged for the access/refresh tokens and then discarded, so this flow is still a (marginal) improvement over a username/password scheme where the password is stored in the application (or has to be requested from the user again and again), as the tokens are easier to revoke.

As an example, let’s suppose we have an authorization server from a Task Management Framework that includes a frontend mobile App. The client would be the App, and it would ask the resource owner for his or her username and password, exchanging them for an access token and a refresh token, so the user won’t need to enter his or her credentials again. The endpoint for the authorization server could be:

https://www.taskmanagementexample.com/oauth/token?
grant_type=password&
scope=reading,writing,offline_access&
client_id={client-app-id}&
client_secret={client-app-secret}&
username=user&password=pass
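
The same request can be sketched in Python as follows. Note one deliberate difference, which is an assumption about what the server accepts: the parameters are sent in the body of a POST request instead of the query string, so the password does not end up in URLs and server logs. The helper name and the credential values are hypothetical, and no request is actually sent.

```python
import urllib.request
from urllib.parse import urlencode

TOKEN_URL = "https://www.taskmanagementexample.com/oauth/token"


def build_password_grant_request(client_id, client_secret, username, password, scopes):
    """Build (but do not send) a POST request for the password grant."""
    form = {
        "grant_type": "password",
        # Comma-separated scopes, as in the example request above.
        "scope": ",".join(scopes),
        "client_id": client_id,
        "client_secret": client_secret,
        "username": username,
        "password": password,
    }
    data = urlencode(form).encode("ascii")
    request = urllib.request.Request(TOKEN_URL, data=data, method="POST")
    request.add_header("Content-Type", "application/x-www-form-urlencoded")
    return request


request = build_password_grant_request(
    "client-app-id", "client-app-secret", "user", "pass",
    ["reading", "writing", "offline_access"],
)
print(request.get_method(), request.full_url)
```

After the tokens come back, the App should discard the password and keep only the tokens.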

If the credentials are valid, the authorization server’s response could be similar to this (in JSON):

{
   "user_id": 123,
   "access_token": "abcde",
   "expires_in": 3600,
   "refresh_token": "fghij"
}
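
A short sketch of how the App could read such a response, assuming the field names shown above. Treating “access_token” and “expires_in” as required and the other fields as optional is my assumption for the example, and the function name is hypothetical.

```python
import json

# Assumption: these two fields are always present in a successful response.
REQUIRED_FIELDS = ("access_token", "expires_in")


def parse_json_token_response(body: str) -> dict:
    """Decode a JSON token response and check the fields we rely on."""
    payload = json.loads(body)
    missing = [field for field in REQUIRED_FIELDS if field not in payload]
    if missing:
        raise ValueError(f"token response is missing: {', '.join(missing)}")
    return payload


body = '{"user_id": 123, "access_token": "abcde", "expires_in": 3600, "refresh_token": "fghij"}'
tokens = parse_json_token_response(body)
# The refresh token is optional, so it is read with .get().
print(tokens["access_token"], tokens.get("refresh_token"))  # abcde fghij
```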

In this case, the “offline_access” value in the scope parameter of the request tells the authorization server that it should return a refresh token in addition to the access token, so the App can have long-lived access without asking the user for the password again.

Client credentials flow

This is the last flow supported by OAuth 2.0. It is intended for scenarios where an App or REST server (acting as a client) needs to retrieve information or get access to resources from an authorization server on behalf of itself instead of a resource owner, or where access to certain resources has already been granted to the App outside of the OAuth flows.

This flow can look like the easiest one to implement, and the one to go for in a mobile App, but it has several important drawbacks when used that way. One of the main caveats is that you need to store the client ID and client secret of the application in the client App, so anybody with access to the App could potentially extract them and use them to pose as the legitimate client. Another factor to take into account is that the client is authenticating and authorizing on behalf of itself. It has no way to indicate which user is accessing the services (unless you create a client ID and client secret for every user of the App, which is not usually feasible), so you cannot perform any operation or get any information that depends on the identity of a user, unless you resort to some custom hack, which we’ll discuss in further articles.

Facebook, for example, supports this kind of flow. You need to do a GET request with the following format:

GET /oauth/access_token?
client_id={app-id}&
client_secret={app-secret}&
grant_type=client_credentials
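
Given the caveat about storing the secret in the App, a safer home for this request is a server you control, with the credentials kept out of the source code. The sketch below reads them from environment variables; the variable names OAUTH_CLIENT_ID and OAUTH_CLIENT_SECRET are hypothetical, and the host is assumed to be the same graph.facebook.com endpoint used in the server-side flow.

```python
import os
from typing import Mapping
from urllib.parse import urlencode

# Assumption: same token endpoint as in the server-side flow example.
TOKEN_ENDPOINT = "https://graph.facebook.com/oauth/access_token"


def build_client_credentials_url(env: Mapping[str, str] = os.environ) -> str:
    """Build the token request URL from credentials held in the environment."""
    params = {
        "client_id": env["OAUTH_CLIENT_ID"],
        "client_secret": env["OAUTH_CLIENT_SECRET"],
        "grant_type": "client_credentials",
    }
    return f"{TOKEN_ENDPOINT}?{urlencode(params)}"


# A plain dict stands in for the real environment in this example.
url = build_client_credentials_url(
    {"OAUTH_CLIENT_ID": "my-app-id", "OAUTH_CLIENT_SECRET": "my-app-secret"}
)
print(url)
```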

The response would look similar to this:

access_token=abcdefghijklmnopqrstuvwxyz

Notice how the Facebook response, as indicated previously, is a plain text response rather than JSON, unlike most other OAuth implementations. This access token can be used to get some information and perform some operations, but some of the API functions cannot be reached using it. Besides, Facebook explicitly discourages using this flow in native or desktop apps.

Where to go from here

In the next article of this series, I will present three different approaches for using these OAuth flows to achieve (kind of) authentication for your REST API, with implementation details, and we will discuss their advantages and disadvantages. If you have any comments, suggestions or corrections, please let me know in the comments. Some inspiration for this article comes from the book “Getting started with OAuth 2.0”, by Ryan Boyd. If you want an even deeper analysis of OAuth 2.0, I recommend grabbing a copy.