2022/02/18

Dynamic Service Routing using Istio

Author: SharmaRajesh


This article is part of the Developer Productivity Engineering Camp blog series. It introduces a feature called Dynamic Service Routing, which lets us route traffic between different versions of a microservice. Let’s find out how this feature works. I hope you find it interesting.

Introduction

At Mercari, we use a microservice architecture in which each microservice focuses on a particular piece of business logic. All these microservices are containerized and deployed in a Kubernetes cluster. They are not standalone services: every microservice depends on other microservices that it needs to call. For example, every service calls an authority service to authenticate the user. This is a simple example, and there are many more complicated dependencies.

As shown in the above picture, microservice-A sends traffic to microservice-B. The service address of microservice-B is configured in microservice-A’s deployment. Similarly, a microservice with multiple dependencies has multiple endpoints configured in its deployment, one for each upstream service it sends traffic to.
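
As a rough illustration of what "configured in the deployment" can mean, the sketch below passes microservice-B’s address to microservice-A through an environment variable. Everything specific here is an assumption made for the example: the lowercase resource names, the namespaces, the image, the port, and the MICROSERVICE_B_ADDR variable are hypothetical and not taken from Mercari’s setup.

```yaml
# Hypothetical Deployment for microservice-A; names, image, and port are illustrative.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: microservice-a
  namespace: microservice-a
spec:
  replicas: 1
  selector:
    matchLabels:
      app: microservice-a
  template:
    metadata:
      labels:
        app: microservice-a
    spec:
      containers:
      - name: microservice-a
        image: example.com/microservice-a:latest  # placeholder image
        env:
        # Address of the upstream service that microservice-A calls (assumed variable name).
        - name: MICROSERVICE_B_ADDR
          value: microservice-b.microservice-b.svc.cluster.local:8080
```

With more dependencies, the env list would simply grow by one entry per upstream service.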

Dynamic Service Routing

As the name says, “Dynamic Service Routing” is a feature that can route traffic between microservices dynamically. In a monolithic architecture, you have a single giant service that contains all your business logic and handles all requests from users. In a microservices architecture, you have multiple services, each focusing on a particular piece of business logic, so a single user request can fan out into multiple requests to dependent services.

As shown in the above diagram, a single user request to microservice-A is sent on to microservice-B, microservice-C and microservice-D. Finally, a response is sent back to the user.

Now let’s say microservice-B has a new feature request for User1, which needs to be developed and tested. Normally we would develop and deploy the new microservice-B and traffic routing will be the same as before.

Everything looks fine, right? No, we will face the following issue:

Let’s say another team is working on a feature for User2. If there is a bug in the new microservice-B, User2’s request R2 will also fail, which is not acceptable. R2 should not be affected by unrelated changes to another service. There should be a separate version of microservice-B that serves only R1 requests until the changes are rolled out to master.

How do we solve the above issue?

One way is to have multiple instances of each microservice. Let’s say microservice-A depends on microservice-B. Whenever we change microservice-B, we first create separate instances of microservice-A and microservice-B to verify that the changes do not break anything. This way, we can test the changes without affecting other services.

As shown in the above picture, we replicate a few dependent services to confirm the behavior of microservice-B. Once verified, we replace the original microservice-B and delete replicated microservices.

Developers can replicate their downstream (caller) microservices and confirm their changes, but think about QA members. They cannot replicate and test a single microservice in isolation: QA needs to test the whole workflow whenever there is any change. So in the above scenario, if more microservices depend on microservice-B, we need to replicate them as well.

At Mercari, we have more than 300 microservices, and hundreds of them are being developed and tested in parallel. Replicating the dependent microservices for every single feature under development is therefore not feasible.

To overcome this issue, we have developed an in-house feature called Dynamic Service Routing (DSR). Using DSR, we can dynamically divert traffic and test any feature. We only need to replicate the target service and do not need to replicate the dependent services, as shown below.

Only requests from User1 go to the replicated service microservice-B-v2, and the rest of the traffic flows the same as before.

Technical details

In this section, we will discuss how Dynamic Service Routing works and routes traffic between multiple versions of microservices.

Dynamic Service Routing is a feature based on Istio traffic shifting using headers. We will explain in brief how Istio traffic shifting works. For more details, please refer to the official Istio traffic shifting document.

Dynamic Service Routing mainly uses the following resources to route traffic between multiple versions of deployments:

- Kubernetes Services (there is a separate service targeting each version of the deployment)
- Deployments (multiple versions of the deployment)
- Istio VirtualService (this is the resource that routes requests to specific Kubernetes services)

The config of a VirtualService looks like below.
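
The listing here is a sketch put together from the description that follows and from the reviews sample later in this section, so treat it as an illustration rather than an exact manifest. {HEADER_NAME} and {HEADER_VALUE} come from the text; {SERVICE_NAME}, {FEATURE_SERVICE_NAME} and {NAMESPACE} are placeholders assumed for this sketch, and all of them must be substituted before the resource can be applied.

```yaml
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: {SERVICE_NAME}
  namespace: {NAMESPACE}
spec:
  hosts:
  - {SERVICE_NAME}.{NAMESPACE}.svc.cluster.local
  http:
  # Header-based route: taken only when the request carries the matching header.
  - match:
    - headers:
        {HEADER_NAME}:
          exact: {HEADER_VALUE}
    route:
    - destination:
        host: {FEATURE_SERVICE_NAME}.{NAMESPACE}.svc.cluster.local
  # Default route: no match condition, so it handles every other request.
  - route:
    - destination:
        host: {SERVICE_NAME}.{NAMESPACE}.svc.cluster.local
```

Istio evaluates the http routes in order and uses the first one that matches, which is why the header-based route is listed before the default one.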

There are two routes in the above VirtualService under the http block. The first is a header-based route that matches requests whose {HEADER_NAME} header is equal to {HEADER_VALUE} and routes them to a particular host. The second is a default route that has no match condition and routes all remaining requests to another host.

Sample configuration

Let’s say we have a service called reviews in the reviews namespace. Developers are working on two different features and want to test them without affecting the original reviews service. In that case, the following resources need to be created.

For original service, we already have:
- Kubernetes service called reviews
- Kubernetes deployment called reviews

For feature-1, we will have:
- Kubernetes service called reviews-v1
- Kubernetes deployment called reviews-v1

For feature-2, we will have:
- Kubernetes service called reviews-v2
- Kubernetes deployment called reviews-v2
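
To make the list above concrete, here is a minimal sketch of the feature-1 pair. Only the names reviews-v1 and the reviews namespace come from the text; the labels, ports and image are assumptions chosen for illustration. It also assumes the original reviews service uses a selector that does not match these pods, otherwise default traffic would reach the feature version too.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: reviews-v1
  namespace: reviews
spec:
  selector:
    app: reviews
    version: v1          # assumed label; selects only the feature-1 pods
  ports:
  - name: http
    port: 80             # assumed service port
    targetPort: 8080     # assumed container port
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: reviews-v1
  namespace: reviews
spec:
  replicas: 1
  selector:
    matchLabels:
      app: reviews
      version: v1
  template:
    metadata:
      labels:
        app: reviews
        version: v1
    spec:
      containers:
      - name: reviews
        image: example.com/reviews:feature-1   # placeholder image tag
        ports:
        - containerPort: 8080
```

The feature-2 pair would be the same with v2 in place of v1.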

Finally, we create an Istio VirtualService with the following configuration, which routes requests to a particular Kubernetes service based on the header:

apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: reviews
  namespace: reviews
spec:
  hosts:
  - reviews.reviews.svc.cluster.local
  http:
  - match:
    - headers:
        feature:
          exact: v1
    name: feature-v1
    route:
    - destination:
        host: reviews-v1.reviews.svc.cluster.local
  - match:
    - headers:
        feature:
          exact: v2
    name: feature-v2
    route:
    - destination:
        host: reviews-v2.reviews.svc.cluster.local
  - name: default
    route:
    - destination:
        host: reviews.reviews.svc.cluster.local

Traffic routing at Mercari

At Mercari, we have a common Kubernetes cluster for developers and QA teams, where all kinds of testing are done before any microservice is deployed to the production environment. Below is an example of the kind of traffic and the service layout we have, and of how we use Dynamic Service Routing to make testing easier for developers and QA teams.

Details about the above picture:

- SRE team is monitoring the original services
- QA team is doing testing for a new feature introduced in svcA
- DEV team is developing a new feature for svcB

Before Dynamic Service Routing, we had to replicate the Gateway service pods for each user so that they could test their target services without disturbing other users and services. With Dynamic Service Routing, we replicate only the target services, which saves a lot of configuration effort and resource cost.

As explained in the Technical details section above, we need to create a Kubernetes service and deployment for each target feature and also configure the VirtualService resource. We have two additional tools that automate this work for us:

- Pull request based replication controller
- Service router

Pull request based replication controller

This tool creates a Kubernetes service and deployment for a particular microservice whenever a developer opens a pull request against the master branch. In the above traffic flow picture, svcA and svcB have replicas called svcA-pr1 and svcB-pr2, respectively. Every replica has the PR number as a suffix.

Service router

This is another tool (a Kubernetes controller) that works alongside the “Pull request based replication controller” and configures the VirtualService resource. Its job is simple: whenever a pull request based service and deployment are created, it adds the following configuration block to the existing VirtualService resource.

  - match:
    - headers:
        service-router-{MICROSERVICE_NAME}-{NAMESPACE}:
          exact: {NEWLY_CREATED_KUBERNETES_SERVICE_NAME}.{NAMESPACE}
    name: {PR_NUMBER}
    route:
    - destination:
        host: {NEWLY_CREATED_KUBERNETES_SERVICE_NAME}.{NAMESPACE}.svc.cluster.local

A sample VirtualService configuration looks like this:

apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: svcA
  namespace: {NAMESPACE}
spec:
  hosts:
  - svcA.{NAMESPACE}.svc.cluster.local
  http:
  - match:
    - headers:
        service-router-svcA-{NAMESPACE}:
          exact: svcA-pr1.{NAMESPACE}
    name: pr1
    route:
    - destination:
        host: svcA-pr1.{NAMESPACE}.svc.cluster.local
  - name: default
    route:
    - destination:
        host: svcA.{NAMESPACE}.svc.cluster.local

As you can see, header-based traffic is routed to the corresponding PR-based service, while default traffic (without the header) goes to the stable default service. This is how the SRE team can monitor the original services without any trouble, the QA team can test new features, and the DEV team can develop and test new features without affecting anyone else.
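
For comparison, applying the same configuration block to the other service in the traffic flow picture would give something like the sketch below for svcB and its svcB-pr2 replica. It is derived from the template in the Service router section, not copied from an actual cluster, and {NAMESPACE} remains a placeholder.

```yaml
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: svcB
  namespace: {NAMESPACE}
spec:
  hosts:
  - svcB.{NAMESPACE}.svc.cluster.local
  http:
  # Requests carrying the svcB routing header go to the PR replica.
  - match:
    - headers:
        service-router-svcB-{NAMESPACE}:
          exact: svcB-pr2.{NAMESPACE}
    name: pr2
    route:
    - destination:
        host: svcB-pr2.{NAMESPACE}.svc.cluster.local
  # Everything else keeps going to the stable svcB.
  - name: default
    route:
    - destination:
        host: svcB.{NAMESPACE}.svc.cluster.local
```

Because the header name includes the microservice name, the svcA and svcB headers are independent: a request can select a PR replica of one service without changing how it is routed at the other, assuming the relevant header actually reaches that service.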

Benefits of Dynamic Service Routing

DSR helps developers and QA develop and test microservices in several ways:

- Simple setup: we only need to deploy a single PR-based replica of the target service. There is no need to deploy separate versions of all downstream (caller) services
- Reduces the time needed to test new features, so we can release quickly
- Reduces the complexity of provisioning a test environment
- No need to change application source code
- Makes bugs easy to reproduce by dynamically routing requests

Wrap Up

In this blog post, I introduced Dynamic Service Routing (DSR), which helps developers and QA teams route traffic dynamically between microservices.

This is one of the features we have built on Istio. We (the Network team) are working on many more. If you found this blog post interesting and want to develop more features that empower developers, you are welcome to join us.
