This article is part of the series "The Present and Future of the RFS Project for Strengthening the Technical Infrastructure".
In today’s article, we will discuss how the Mercari ID platform team applied the OAuth 2.0 token exchange industry standard for its internal use.
In the Mercari architecture, the business services are supported by various platform services. One such platform service that is critical for the overall security of the system is the identity platform (IDP). It provides authentication and authorization features to other services within the Mercari group, and is based on several industry standards. The platform’s responsibilities include:
- Authorizing external clients to access the Mercari platform
- Authorizing access between services within the Mercari platform
- Authorizing access between various subsystems in the Mercari platform
The ID platform also constantly evolves to support new services and use cases that appear. In this article, we would like to describe one such evolution, where the OAuth 2.0 Token Exchange standard was applied to support several new features of the Mercari ID Platform.
The OAuth 2.0 token exchange standard
As mentioned in a previous article, the Mercari ID platform uses several industry standards, such as OAuth 2.0 (RFC 6749) and OpenID Connect (OpenID Connect Core 1.0). OAuth 2.0, on which OpenID Connect is also based, defines a protocol that allows clients (for example a web or mobile application, a backend server, …) to obtain a credential called an “access token”, and use it to access a protected resource (such as an HTTP service).
Several flows are available for obtaining the access token, but the general idea is that an authorization (called an “authorization grant” in OAuth 2.0, with each kind of grant identified by a “grant type”) is obtained from the owner of the resource, and the client then exchanges it for an access token by calling a pre-defined endpoint (called the “token endpoint” in OAuth 2.0) of an authorization server. Clients may also obtain an access token on their own behalf.
The protocol is very flexible, and it supports various use cases, such as an end-user in a web application accessing a backend server, or a server accessing another server without user interaction. In addition, the protocol is designed in a way that allows new grant types to be defined.
However, the scope of the OAuth 2.0 standard ends when the resources have been accessed using the access token. It does not define how an entity that already holds a valid token can access a resource located elsewhere, over the boundary of a security domain for example. As explained in the previous article, the entity in this case could itself be a resource server that was accessed by a different client. A flow involving a user authorization could look like the following:
This type of access scenario can also happen with clients acting on their own behalf. More generally, any entity that holds a valid token and needs to access an external system must determine how it will be authorized by that system.
The security domain B in the examples above might be completely unrelated to domain A, and have independent access requirements. Therefore, even if a server in the security domain A already has a valid security token (a more generic concept that includes access tokens), it might still not be able to use it to access a resource in the security domain B.
This type of scenario is not specific to architectures using OAuth 2.0. Security Token Services (STS) have traditionally been used in those cases, to issue a new token for security domain B from an existing valid token. For environments like Mercari where OAuth 2.0 is already used, there exists a standard (OAuth 2.0 token exchange, RFC 8693) that defines a way for OAuth 2.0 authorization servers to act as an STS, by extending the OAuth 2.0 standard with a new “grant type”.
In that specification, a client that holds a valid token (regardless of how the token was initially obtained) may call the token endpoint of an OAuth 2.0 authorization server to obtain a new valid token:
In the example above, “Domain A resource server” acts as an OAuth 2.0 client to exchange the original token (called the “subject token” here) ST1 for an access token AT2 using the token exchange grant type. In this case, the authorization server needs to be able to validate ST1, as well as issue tokens for the security domain B.
To be able to support the token exchange scenarios described above, the standard augments the OAuth 2.0 specification with some parameters for the token endpoint. Let’s take a look at some of those new parameters and values:
- grant_type: this must be set to “urn:ietf:params:oauth:grant-type:token-exchange”
- subject_token: this is the source token that needs to be exchanged.
- subject_token_type: the type of the token passed in subject_token. Several types of tokens are supported in the specification, such as OAuth 2.0 access tokens or ID tokens (from the OIDC standard).
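As an illustration, the three parameters above can be assembled into the form-encoded body that a client would POST to the token endpoint. This is only a minimal sketch in Python: the helper name `build_token_exchange_body` and the placeholder token value are assumptions made for the example, while the URN constants are the values defined in RFC 8693.

```python
from urllib.parse import parse_qs, urlencode

# Identifiers defined by RFC 8693.
TOKEN_EXCHANGE_GRANT = "urn:ietf:params:oauth:grant-type:token-exchange"
ACCESS_TOKEN_TYPE = "urn:ietf:params:oauth:token-type:access_token"
ID_TOKEN_TYPE = "urn:ietf:params:oauth:token-type:id_token"


def build_token_exchange_body(subject_token: str, subject_token_type: str) -> str:
    """Return the form-encoded body of a token exchange request.

    Hypothetical helper: a real client would also authenticate itself to
    the authorization server, which is left out of this sketch.
    """
    params = {
        "grant_type": TOKEN_EXCHANGE_GRANT,
        "subject_token": subject_token,
        "subject_token_type": subject_token_type,
    }
    return urlencode(params)


if __name__ == "__main__":
    # "ST1-placeholder" stands in for a real subject token.
    body = build_token_exchange_body("ST1-placeholder", ACCESS_TOKEN_TYPE)
    # Decode the body again to show the parameters that would be sent.
    print(parse_qs(body))
```

The body is sent with the `application/x-www-form-urlencoded` content type, like any other OAuth 2.0 token endpoint request.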
In token exchange scenarios, clients may act on their own, or they may act on behalf of another entity E. The standard distinguishes between two common scenarios:
- Impersonation: the client acts as if it were the other entity E. From the authorization server perspective, it is as if the token exchange request came from E. Similarly, from the point of view of a resource server where the resulting token is later used, the request came from E.
- Delegation: the client acts on behalf of the other entity E, but the two entities remain clearly distinguished. This is achieved by sending an additional “actor_token” parameter in the token exchange request, which contains the token of the client acting on behalf of E. The issued token is then associated with information about both entities.
Together, these two modes support a wide range of token exchange scenarios and can accommodate the access requirements of the target security domain.
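The difference between impersonation and delegation can be sketched in terms of the request parameters and of the claims of the issued token. The Python below is an illustrative sketch, not the ID platform's implementation: the function names and token values are hypothetical, and it assumes the issued token is a JWT. The `actor_token_type` parameter and the `act` claim are taken from RFC 8693.

```python
from typing import Dict, Optional

TOKEN_EXCHANGE_GRANT = "urn:ietf:params:oauth:grant-type:token-exchange"
ACCESS_TOKEN_TYPE = "urn:ietf:params:oauth:token-type:access_token"


def build_exchange_params(
    subject_token: str,
    subject_token_type: str,
    actor_token: Optional[str] = None,
    actor_token_type: Optional[str] = None,
) -> Dict[str, str]:
    """Build token exchange parameters for impersonation or delegation."""
    if (actor_token is None) != (actor_token_type is None):
        # RFC 8693: actor_token_type is required together with actor_token
        # and must not be sent without it.
        raise ValueError("actor_token and actor_token_type must be set together")
    params = {
        "grant_type": TOKEN_EXCHANGE_GRANT,
        "subject_token": subject_token,
        "subject_token_type": subject_token_type,
    }
    if actor_token is not None and actor_token_type is not None:
        # Delegation: the acting party presents its own token as well.
        params["actor_token"] = actor_token
        params["actor_token_type"] = actor_token_type
    return params


def illustrative_claims(subject: str, actor: Optional[str] = None) -> Dict[str, object]:
    """Show how an issued JWT could identify the parties involved."""
    claims: Dict[str, object] = {"sub": subject}
    if actor is not None:
        # Delegation: the "act" claim names the party acting for the subject.
        claims["act"] = {"sub": actor}
    return claims


if __name__ == "__main__":
    # Impersonation: only the token of entity E is sent.
    print(build_exchange_params("token-of-E", ACCESS_TOKEN_TYPE))
    # Delegation: the client's own token is added as the actor token.
    print(build_exchange_params(
        "token-of-E", ACCESS_TOKEN_TYPE, "token-of-client", ACCESS_TOKEN_TYPE
    ))
    print(illustrative_claims("E", actor="client"))
```

With impersonation, a resource server only sees E as the subject. With delegation, it can see both E and the acting client, and can base its access decision on either.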
Applying the OAuth 2.0 token exchange standard
This token exchange standard is used in multiple features provided by the Mercari ID platform. One such feature is a custom Terraform provider developed by the IDP team, but before we describe this use case, some background information about the IDP team’s processes needs to be explained.
The ID platform provides several services related to authentication and authorization, used for both internal communication between microservices, and communication with clients outside of the Mercari security domain. To support this, some entities need to be registered in advance, such as:
- access permissions for services inside the Mercari platform,
- access permissions between subsystems in the Mercari platform,
- and several others.
The registration of those permissions was handled manually by the IDP team in the past. This allowed having a careful review process, but it also made the registration flow time-consuming. We looked for a way to keep a strict review process, while making it easier and faster for all teams to register permissions.
We eventually settled on the idea of using Terraform for this purpose. HashiCorp Terraform is a tool that allows managing external resources as code. Resources are often infrastructure entities (such as servers, cloud storage, network components, …), but it is also possible to develop custom plugins (called “providers”) for managing other resources, such as the pre-registered permissions described above. This solution has two main benefits:
- a custom Terraform provider allows declaring, as code, the resources that represent the data to be registered (access permissions for example), so it’s possible for each team to manage those entities themselves,
- we could take advantage of our existing code review flow to keep a strict review process.
This seemed like a suitable solution, but one problem remained: how could we authorize the custom Terraform provider to access the ID platform API to manage those registrations? It was necessary to have a way to obtain a valid token to access the ID platform resources from Terraform.
Terraform commands are executed on the Continuous Integration (CI) platform. Since the CI platform resides outside of the Mercari microservices platform, the token exchange mechanism explained above seemed like a perfect fit.
As explained in the previous section, the token exchange protocol requires the client to present a token type that can be validated by the authorization server. Since the CI platform is based on Google Cloud services, the custom Terraform provider can leverage the Google Cloud IAM service to obtain a short-lived service account Google ID token, which is used as the subject token in the token exchange process. After obtaining the access token, the custom Terraform provider can access the ID Platform and register the necessary permission resources.
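The exchange step described above could look roughly like the following Python sketch. It is written under several assumptions: the endpoint URL, the function names, and the fake authorization server are hypothetical, client authentication is omitted, and obtaining the Google ID token is represented by a placeholder string. The response fields checked here (`access_token`, `issued_token_type`, `token_type`) are the ones RFC 8693 requires in a successful response.

```python
import json
import urllib.request
from typing import Callable
from urllib.parse import parse_qs, urlencode

TOKEN_EXCHANGE_GRANT = "urn:ietf:params:oauth:grant-type:token-exchange"
ID_TOKEN_TYPE = "urn:ietf:params:oauth:token-type:id_token"
ACCESS_TOKEN_TYPE = "urn:ietf:params:oauth:token-type:access_token"

# Hypothetical URL, used only for illustration.
TOKEN_ENDPOINT = "https://idp.example.com/oauth2/token"

# A transport takes (url, form_encoded_body) and returns the response text.
PostForm = Callable[[str, str], str]


def http_post_form(url: str, body: str) -> str:
    """POST a form-encoded body using the standard library.

    Simplified: urlopen raises HTTPError for 4xx/5xx responses, so OAuth
    error bodies are not returned to the caller by this transport.
    """
    request = urllib.request.Request(
        url,
        data=body.encode("ascii"),
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")


def exchange_id_token(
    google_id_token: str,
    post_form: PostForm = http_post_form,
    token_endpoint: str = TOKEN_ENDPOINT,
) -> str:
    """Exchange an ID token (the subject token) for an access token."""
    body = urlencode({
        "grant_type": TOKEN_EXCHANGE_GRANT,
        "subject_token": google_id_token,
        "subject_token_type": ID_TOKEN_TYPE,
    })
    response = json.loads(post_form(token_endpoint, body))
    if "error" in response:
        raise RuntimeError("token exchange failed: " + str(response["error"]))
    required = ("access_token", "issued_token_type", "token_type")
    missing = [name for name in required if name not in response]
    if missing:
        raise RuntimeError("response is missing: " + ", ".join(missing))
    return response["access_token"]


def fake_post_form(url: str, body: str) -> str:
    """Stand-in for the authorization server so the sketch runs offline."""
    form = parse_qs(body)
    if form.get("grant_type") != [TOKEN_EXCHANGE_GRANT]:
        return json.dumps({"error": "unsupported_grant_type"})
    return json.dumps({
        "access_token": "access-token-for-" + form["subject_token"][0],
        "issued_token_type": ACCESS_TOKEN_TYPE,
        "token_type": "Bearer",
        "expires_in": 300,
    })


if __name__ == "__main__":
    # The placeholder stands in for a short-lived Google ID token.
    print(exchange_id_token("google-id-token-placeholder", post_form=fake_post_form))
```

Injecting the transport keeps the exchange logic testable without a network. A real provider would pass the returned access token as a bearer token when calling the ID platform API.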
The Google ID token itself could not have been used directly as a token to access the resource server. ID tokens are security tokens, but they are not access tokens: they only provide information about the authentication of an entity. The issued access token, on the other hand, is designed for accessing resource servers, and as such has mechanisms to control where and how it can be used via the audiences and scopes associated with it. In addition, the resource server handling permissions might, in the future, be called by clients other than the custom Terraform provider. Using a standard access token to authorize access to the permission resources makes it simple to expand to other use cases later.
Another benefit of using the token exchange standard is that the custom Terraform provider can leverage the token exchange impersonation mechanism described above, to act on behalf of each service. In the Mercari CI platform, a service account is assigned to each service, which is used when executing Terraform commands for the resources of that service only. In addition to the security benefits of this approach, it also allows the ID platform to verify, for each request to one of its resource endpoints, that the requesting service is the owner of the permission that needs to be modified. In that case, the service account email of a particular service is used as the subject of both the Google ID token and the access token issued during the token exchange flow. From the point of view of the authorization service and of the ID platform, it is as if all actions were performed by that specific service account.
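The ownership check mentioned above can be sketched as a comparison between the subject of the access token and the owner recorded for a permission. This Python example is hypothetical: the `PermissionResource` type, its fields, and the email addresses are invented for illustration, and the token claims are assumed to have already been validated (signature, expiry, issuer).

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class PermissionResource:
    """Hypothetical representation of a registered permission."""

    name: str
    owner_service_account: str  # email of the owning service's account


def may_modify(
    access_token_claims: Mapping[str, object],
    resource: PermissionResource,
) -> bool:
    """Allow a change only when the token subject owns the resource."""
    return access_token_claims.get("sub") == resource.owner_service_account


if __name__ == "__main__":
    permission = PermissionResource(
        name="service-a-to-service-x",
        owner_service_account="service-a@example.com",
    )
    # The subject of the exchanged access token is the service account email.
    print(may_modify({"sub": "service-a@example.com"}, permission))
    print(may_modify({"sub": "service-b@example.com"}, permission))
```

Because the subject is carried over from the Google ID token to the access token, this check needs no knowledge of the Terraform provider itself.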
Finally, not every service account is allowed to obtain an access token using a Google ID token. The authorization server ensures that access tokens are issued only to some predefined service accounts associated with the Terraform provider OAuth client, and the access token is restricted to the specific audience and scopes allowed for that OAuth client. The resource server then rejects any request that does not have the required audience and scopes. This ensures that the Terraform provider cannot be used in an unexpected way, and that only authorized clients are allowed to manage the permission resources.
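Both checks described above can be sketched as follows: one on the authorization server side, before issuing a token, and one on the resource server side, before serving a request. The client identifier, service account emails, audience, and scope names are all hypothetical. The sketch also assumes the access token claims have already been validated and that scopes are carried in a space-delimited `scope` claim.

```python
from typing import Dict, FrozenSet, Iterable, Mapping

# Hypothetical allowlist: service accounts allowed per OAuth client.
ALLOWED_SUBJECTS: Dict[str, FrozenSet[str]] = {
    "terraform-provider": frozenset({
        "service-a@example.com",
        "service-b@example.com",
    }),
}


def may_issue_token(client_id: str, subject: str) -> bool:
    """Authorization server side: is this subject allowed for this client?"""
    return subject in ALLOWED_SUBJECTS.get(client_id, frozenset())


def has_required_access(
    claims: Mapping[str, object],
    required_audience: str,
    required_scopes: Iterable[str],
) -> bool:
    """Resource server side: check the audience and scopes of a token."""
    aud = claims.get("aud", [])
    # The "aud" claim may be a single string or a list of strings.
    if isinstance(aud, str):
        audiences = [aud]
    elif isinstance(aud, (list, tuple)):
        audiences = list(aud)
    else:
        audiences = []
    scope = claims.get("scope", "")
    granted = set(scope.split()) if isinstance(scope, str) else set()
    return required_audience in audiences and set(required_scopes) <= granted


if __name__ == "__main__":
    print(may_issue_token("terraform-provider", "service-a@example.com"))
    claims = {
        "sub": "service-a@example.com",
        "aud": "permission-api",
        "scope": "permissions:read permissions:write",
    }
    print(has_required_access(claims, "permission-api", ["permissions:write"]))
```

Keeping the two checks separate mirrors the split of responsibilities in the text: the authorization server decides who may obtain a token, and the resource server decides what a given token may do.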
We have seen how the ID Platform team could leverage an industry standard to improve a critical internal process. This was made possible thanks to the flexibility of the OAuth 2.0 framework and its token exchange extension. The standard allowed us to develop a robust solution that significantly improved both the security and the efficiency of our internal permission registration process. This article only scratches the surface of this topic. In a future article in this series, my colleague will discuss another application of this industry standard.
If you found this topic interesting, and would like to work with us on our authentication and authorization platform, please take a look at our current open position!