*This article is a translation of the Japanese article published on August 30th, 2022.
Author: @urahiroshi, Engineering Manager of the Web Platform team
On August 4, 2022, a server called "web-2" was shut down at Mercari. This was the end of an era for those teams involved in developing Mercari Web.
The web-2 server was a web server written in PHP, and it had been serving content under https://www.mercari.com/jp/ since 2015. Several web microservices are now responsible for its functionality, and pages under https://www.mercari.com/jp/ are redirected to pages served by these.
The process of migrating Mercari Web to microservices and finally shutting down web-2 actually took over four years to accomplish. In this article, we are going to cover what the development teams discussed over these four years, describe our architectural design choices, and consider what kind of organization and architecture can best handle change. I hope you’ll find some useful information here!
Launch of the Web Re-Architecture project (from May 2018)
We launched the project to overhaul the architecture of Mercari Web in 2018.
All Mercari Web content was served from the web-2 server at that time. Back when web-2 development first began, there were a lot of team members with plenty of experience in PHP, and we were using an internally supported PHP framework called Dietcube (https://github.com/mercari/dietcube) to write an efficient server-side implementation. However, we began to notice some issues as time went on.
- As the number of functions provided by web-2 increased, so did code and deployment complexity. This meant higher development costs to make changes, and more issues occurring after release.
(We were also using CDNs, proxy servers, and databases, but I’ve omitted them here)
In order to solve these issues, we launched the Web Re-Architecture project in May 2018. Our goals were to overhaul the codebase and architecture to make Mercari Web easier to maintain, and to use the latest web frontend technologies to increase page rendering performance.
We also started a plan to migrate the API servers into microservices during the same year. A decision was reached to replace our monolithic API servers (written in PHP) with Go microservices running on a Kubernetes infrastructure.
The infrastructure and deployment processes for web-2 were originally managed by the SRE Team. But after the decision was made to migrate Mercari Web from a monolith to various microservices, the Web Team, under the Web Re-Architecture project, assumed responsibility for designing and operating the new web infrastructure.
This is how we began considering which technologies to use and which microservices to provide. The microservices created during the Re-Architecture project are listed below, along with brief descriptions of how they came to be.
web-fuji: Provides server-side rendering (SSR)
- We decided to use SSR for SEO, to support OGP, and to improve page rendering performance. Specifically, we chose Next.js (https://nextjs.org/) as our framework.
- We decided to name the future Mercari Web microservices after mountains, so we named this service web-fuji, after Mount Fuji. Unfortunately, we never named any other services after mountains!
web-graphql: Provides web APIs
- The web-2 server also provided endpoints for calling APIs. This functionality is called backend for frontend (BFF). We decided to use GraphQL as the protocol for providing this BFF functionality, and selected Apollo Server (https://www.apollographql.com/docs/apollo-server/) as our framework.
web-gateway: Performs routing for web microservices
- We planned to launch Web Re-Architecture releases gradually for each page, and so created the web-gateway microservice in order to perform routing between Mercari Web microservices. By controlling routing for canary releases, we could gradually switch requests between web-2 and Mercari Web microservices.
- We were using NGINX as a load balancer for web-2 routing, and so we decided to use ingress-nginx (https://kubernetes.github.io/ingress-nginx/) because it would make it easier to migrate NGINX logic. It also provides cookie-based session affinity and canary release functionality.
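The gradual switching that web-gateway made possible can be pictured with a small sketch. This is only an illustration of weighted canary routing in TypeScript, not the ingress-nginx implementation: the function names, the hash, and the idea of bucketing clients by a stable identifier are all assumptions made for the example.

```typescript
// Hypothetical names throughout; this only illustrates weighted canary routing.
type Backend = "web-2" | "web-fuji";

// Map a stable client identifier (for example, a cookie value) to a bucket
// in [0, 100) using an FNV-1a-style string hash.
function bucketOf(clientId: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < clientId.length; i++) {
    hash ^= clientId.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // keep the value an unsigned 32-bit integer
  }
  return hash % 100;
}

// Route the client to the canary (web-fuji) when its bucket falls below the
// canary weight, given as a percentage from 0 to 100.
function chooseBackend(clientId: string, canaryWeight: number): Backend {
  return bucketOf(clientId) < canaryWeight ? "web-fuji" : "web-2";
}
```

Because the bucket depends only on the client identifier, a client that has been moved to web-fuji stays there as the weight is raised, which is one way to get the kind of stickiness that cookie-based session affinity provides.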
web-session: Retrieves and updates session information
- The web-2 server stored user session information associated with session IDs from cookies in a database. These session IDs were converted into access tokens prior to calling APIs. As web-gateway gradually switched routing between web-2 and Mercari Web microservices, we needed to synchronize web-2 session information during the migration period. We expected that this would become pretty complicated, so we decided to create a microservice dedicated to retrieving and updating session information.
- For the web-session tech stack, we chose Node.js to reuse as much of the same technology as possible across Mercari Web microservices. We also planned to use gRPC as the communication protocol between microservices, since it was being recommended within Mercari (we ended up changing this, which I’ll cover later). Finally, we chose Cloud Spanner as the database for storing session information.
Our Web Re-Architecture plan was presented during the 2018 Mercari Tech Conference. That presentation also covers the goals of the re-architecture and the architecture itself.
Release of web microservices (from June 2019)
Our first goal was to serve just the front page from the new architecture as soon as possible. However, we ran into several problems and fell behind our release schedule.
Underestimation of the new technology:
Even though we were only releasing the front page, we needed to finish a lot of tasks, including building infrastructure for each microservice and a CI/CD platform. Many technologies were brand new for the Web Team, and we underestimated how long it would take to become proficient.
Reworking due to changes in underlying technology:
When we were selecting technologies to be used for web-session, we had planned on implementing the gRPC server in Node.js. However, gRPC wasn’t fully supported by the Node.js ecosystem at that time, so we decided to use Go to implement our gRPC server instead. At that point we still assumed that we would use gRPC itself, but there were some issues with implementing gRPC clients in Node.js. In the end, we decided to use REST APIs instead of gRPC.
Schedule delays due to service dependencies:
web-session’s functionality was required to call microservice APIs from web-fuji and web-graphql. Delays in web-session blocked development of web-fuji and web-graphql, which further delayed development overall.
Changes to development processes:
At the same time, Mercari introduced scrum and design docs as part of the development workflow. It took the Web Team some trial and error to effectively learn these techniques.
The first release finally became possible in June 2019, roughly one year after development began. We first released web-gateway. Although it merely proxied requests to web-2 at this time, we decided to release it first so that we could control routing destinations as we released other web microservices.
Web-gateway added in order to relay requests
In order to release other services, we used the canary release functionality of web-gateway to gradually switch front page traffic from web-2 to web-fuji from 0% (access allowed only to internal users) to 1%, 5%, 10%, and so on, and finally were able to switch 100% of requests to web-fuji in August.
Front page now distributed from web-fuji and web-graphql
I posted an article in the past (Japanese only) that provides more details on how web-fuji and web-graphql were released after the architecture change and describes web-gateway functionality, so feel free to check it out if you’re interested in learning more.
After releasing the front page, we began gradually switching other pages to web-fuji. There were some concerns that web-fuji and web-graphql would themselves become new monoliths, so we began considering how to decouple services.
While all of this was happening, we arrived at a turning point in December 2019.
Launch of the GroundUp Web project (from December 2019)
In December 2019, everyone on the Web Team met to discuss a new plan. This was called the GroundUp Web project, and its goal was to completely redesign and reimplement Mercari Web. This turned into a discussion on why we needed to redesign and reimplement Mercari Web, only four months after the release of Web Re-Architecture.
There were several reasons for the launch of the GroundUp Web project.
Overhaul the user interface:
The plan for the Web Re-Architecture project was to replace the architecture and implementation, without changing the existing user interface. However, due to the growing number of mobile users, we decided to improve the user experience by overhauling the design and providing a layout better suited for mobile browsers. These were the goals of the GroundUp Web project.
Simultaneously release core functionality instead of releasing page-by-page:
The Re-Architecture project was released on a page-by-page basis, and therefore required release work to be performed for each page. We also needed to consider compatibility between old and new pages when migrating, which added to the migration effort. In addition, we wanted to overhaul the user interface and release the core functionality of the C2C marketplace all together.
Implement architecture that allows the Web Frontend Team to have the same responsibilities as the Android and iOS Teams:
The Mercari Web application was missing some features compared to the Android and iOS applications, and we wanted to add them. If we wanted to implement a certain feature on all apps, the plan for the Android and iOS apps was that the Backend Team would first implement the required APIs, and then client code would be implemented. For the web app, however, we would also need to design and implement GraphQL APIs and consider server-side rendering processes. These development procedures and responsibilities were different from those of the Android and iOS Teams, and were bottlenecks in implementing features on all apps simultaneously. The aim here was to change the architecture so that we would have the same responsibilities as the Android Team and iOS Team, allowing new features to be provided faster.
Of course, we also considered an approach that would have resolved this issue within the existing Web Re-Architecture project architecture. However, we wanted to design an architecture without being restricted by the existing one.
Some of the decisions made for the GroundUp Web project architecture are described below.
Use web components to overhaul the Design System
- During the Web Re-Architecture project, we used our in-house Design System library for UI components. The Design System library was originally written based on React, but we decided to rewrite it using web components, which don’t rely on any particular framework, so that it could be more widely used throughout Mercari Group. We decided to use the rewritten system for the GroundUp Web project as well. More details on our Design System can be found here: https://engineering.mercari.com/blog/entry/20210823-8128e0d987/ (Japanese only).
Use static site generation (SSG) and dynamic rendering (web-suruga microservice)
- We decided to design the architecture based on static site generation (SSG) and selected Gatsby (https://www.gatsbyjs.com/) as our SSG framework. We did this for two reasons. First, using web components would make SSR difficult. Second, we wanted to reduce the amount of development work that SSR would introduce, and to follow the same development procedures as the iOS and Android development teams.
- The microservice providing GroundUp Web was named "web-suruga." It was named after Suruga Bay, which looks out over Mount Fuji, the inspiration for the name of web-fuji in the Web Re-Architecture project.
Create the new web-auth microservice
- We decided to create a new microservice called web-auth to implement login and registration, rather than implementing this functionality on web-suruga.
- The intention was to be able to use the login and registration screens from web services other than Mercari Web, and we thought it would be better from an ownership perspective to split this microservice from web-suruga (which provides Mercari Web).
- We were planning to implement an architecture for web-suruga that returned HTML content from Google Cloud Storage (GCS). However, the login and registration functionality sometimes needed to receive callback requests after authentication with Google, Facebook, and Apple accounts. It was difficult to support this by returning static files from GCS, so web-auth required a different architecture that returned responses from a Go server.
Eliminate web-graphql and web-session
- We also revised session management and API calling. Instead of going through a GraphQL API, we wanted to call the APIs provided by the Backend Team directly from the Web client, just like with the Android and iOS applications.
- We initially intended to avoid storing access tokens directly in the browser, out of concern for security. To that end, we opted to store session information, including access tokens, in JWT format in HttpOnly cookies. However, we eventually switched to using access tokens with sufficiently short lifetimes, storing them in the browser’s Local Storage, and using session IDs saved as HttpOnly cookies only when reissuing access tokens.
- Because access tokens would be saved in the browser, and the same microservice used by the iOS and Android applications could be used to reissue tokens, the Web Team would no longer need to manage the web-session microservice.
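The short-lived token approach can be sketched roughly as follows. This is a minimal TypeScript illustration under several assumptions: the `AccessToken` shape, the `TokenStore` class, and the `reissue` callback are hypothetical names, the token is kept in memory rather than Local Storage, and reissue is synchronous for brevity, whereas a real client would make a network request in which the browser attaches the HttpOnly session cookie.

```typescript
// Hypothetical types and names; a sketch of short-lived access tokens with reissue.
interface AccessToken {
  value: string;
  expiresAt: number; // epoch milliseconds
}

class TokenStore {
  private token: AccessToken | null = null;
  private readonly reissue: () => AccessToken;
  private readonly now: () => number;
  private readonly skewMs: number;

  // `reissue` stands in for a call to the token-issuing service. Script code
  // never reads the session ID itself, because it lives in an HttpOnly cookie.
  constructor(reissue: () => AccessToken, now: () => number = Date.now, skewMs: number = 5000) {
    this.reissue = reissue;
    this.now = now;
    this.skewMs = skewMs;
  }

  // Return a usable token, reissuing when none is held or the current one
  // expires within `skewMs` milliseconds.
  get(): string {
    if (this.token === null || this.token.expiresAt - this.skewMs <= this.now()) {
      this.token = this.reissue();
    }
    return this.token.value;
  }
}
```

The point of the design is that a leaked access token is only useful for a short time, while the longer-lived session ID stays out of reach of page scripts.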
To develop the new microservices, we looked back on our development experiences during the Web Re-Architecture project, and decided to provide a BFF for development that could easily call APIs until session management and API calling could be properly implemented. This made it easier to build the frontend. We were also able to take what we learned during the Web Re-Architecture project and apply it toward tasks such as configuring CI/CD, implementing end-to-end testing, and building infrastructure on Kubernetes.
We continued to add functionality to web-2 and web-fuji in parallel for a while, but ended up halting this and instead worked with all of the Mercari Web development teams on implementing the GroundUp Web project.
We were finally prepared for release after a development period of around one year. We had been using URLs under https://www.mercari.com/jp/ for Mercari Web, but were also using https://www.mercari.com for Mercari US. We wanted to maintain separate Japan and US sites, and so we decided to switch to the https://jp.mercari.com domain after releasing GroundUp Web.
We came up with the following plan for launching gradual releases.
- Internal Release: Open https://jp.mercari.com only to Mercari employees and gather feedback.
- Limited Release: Opt in a certain percentage of users to be redirected from https://www.mercari.com/jp/ to https://jp.mercari.com.
- Full Release: Redirect all requests to pages under https://www.mercari.com/jp/ to https://jp.mercari.com.
We were able to launch the Internal Release in March 2021, but had to temporarily stop development in order to join a companywide response to a security incident in which an attacker gained unauthorized access to Codecov (https://about.mercari.com/press/news/articles/20210521_incident_report/ (Japanese only)).
Once we were able to resume development, we began designing and implementing redirection for the Limited Release and Full Release. Until this point, web-2 and web-fuji were processing requests for each path under https://www.mercari.com/jp/, so we implemented redirection on both web-2 and web-fuji.
Once this was finished, we launched the Limited Release without issue on August 5, gradually increased the redirect ratio, and then launched the Full Release on September 29.
GroundUp Web released, web-suruga and web-auth content distributed from the jp.mercari.com domain
In this article, I’ve focused mainly on the roles of each web microservice. If you’d like to learn more about the web-suruga microservice, which played a central role in the GroundUp Web project, details on the architecture and development organization can be found in the following article: https://engineering.mercari.com/en/blog/entry/20210810-the-new-mercari-web/
The web-2 Sunset project (from September 2021)
Although we didn’t have any issues launching the Full Release of GroundUp Web, not all of the pages under https://www.mercari.com/jp/ were being redirected to https://jp.mercari.com, as there were still some pages being served by web-2. The services created during the Web Re-Architecture project (web-fuji, web-graphql, and web-session) were also still running to provide redirection. Our next goal was to completely eliminate the service infrastructure for web-2, web-fuji, web-graphql, and web-session, because we would otherwise have to continue to maintain them and implement security measures, even if most of their functionality wasn’t being used. We called this the web-2 Sunset project.
We started by eliminating the services that were only providing redirection (web-fuji, web-graphql, and web-session). Although redirection itself was still required, there was no need to maintain infrastructure for these services. For one month after the release of GroundUp Web, redirection had preserved the login status of users who were already logged in on the old domain, which required complex functionality to read cookies and issue access tokens. We could now eliminate this functionality, so it would be sufficient to simply map URLs under https://www.mercari.com/jp/ to URLs under https://jp.mercari.com.
In November 2021, we provided a microservice called web-redirection (used only for redirection), and were able to remove the infrastructure for web-fuji, web-graphql, and web-session.
We implemented web-redirection using Cloud Functions, because we wanted to reduce the amount of work to maintain the service, and because we wanted to provide the implementation with some flexibility.
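To give a feel for how small a redirection-only service can be, here is a hedged TypeScript sketch of a URL mapping function. The mapping rule (drop the `/jp` prefix and keep the rest of the path and the query string) and the names `redirectTarget`, `OLD_HOST`, `OLD_PREFIX`, and `NEW_ORIGIN` are assumptions made for illustration; the article does not describe the actual rules used by web-redirection.

```typescript
// Hypothetical mapping rule: drop the "/jp" prefix and switch to the new origin.
// The real path mapping used by web-redirection is not described in the article.
const OLD_HOST = "www.mercari.com";
const OLD_PREFIX = "/jp";
const NEW_ORIGIN = "https://jp.mercari.com";

// Return the redirect target for an old URL, or null when the URL is out of scope.
function redirectTarget(requestUrl: string): string | null {
  const url = new URL(requestUrl);
  if (url.hostname !== OLD_HOST) return null;
  // Match "/jp" exactly or anything under "/jp/", but not paths such as "/jpx".
  if (url.pathname !== OLD_PREFIX && !url.pathname.startsWith(OLD_PREFIX + "/")) return null;
  const rest = url.pathname.slice(OLD_PREFIX.length) || "/";
  return NEW_ORIGIN + rest + url.search;
}
```

A handler deployed on a platform such as Cloud Functions would call a function like this and respond with a redirect status and a `Location` header when the result is not null.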
Redirection provided by the web-redirection service, allowing the web-fuji and web-graphql services to be eliminated
Our next task was to determine how to migrate remaining pages served by web-2. The remaining pages could be classified as follows.
- Terms of service, privacy policies, and other legal documents for each service
- Landing pages for promotions and certain services
- Pages providing users with instructions on using Mercari (these pages are also called “Mercari Guide”)
- Pages that launch the Android or the iOS app through universal links, etc.
Of these, Mercari Guide relied on CMS functionality provided by web-2 and is independent of the Mercari Web app, so there was already a plan in place to launch a new help-center microservice and migrate the existing pages to it.
As for the other pages, including terms of service, privacy policies, and landing pages for promotions, there was no need to call APIs, and so these pages could be provided using static files such as HTML. We decided to create a new microservice called web-static-page and migrate these pages to it.
That left the pages that launch the Android app or iOS app through universal links. This functionality can be provided by redirection, so we decided to implement it on the web-redirection service.
Using a common domain for multiple microservices can cause issues, such as Local Storage or cookie information being shared unintentionally. We therefore decided to serve each microservice’s content on a different domain.
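The Local Storage side of this concern follows from how browsers scope Web Storage: data is partitioned by origin (scheme, host, and port). The small TypeScript check below illustrates that rule; the function name `sharesLocalStorage` is hypothetical and the URLs are only examples built from the domains mentioned in this article.

```typescript
// Local Storage is partitioned by origin (scheme + host + port), so two pages
// can read each other's entries only when their origins are identical.
function sharesLocalStorage(pageA: string, pageB: string): boolean {
  return new URL(pageA).origin === new URL(pageB).origin;
}

// Same origin, different paths: storage is shared.
const sameDomain = sharesLocalStorage("https://jp.mercari.com/", "https://jp.mercari.com/mypage");
// Different hosts: storage is isolated.
const separateDomains = sharesLocalStorage("https://jp.mercari.com/", "https://help.jp.mercari.com/");
```

Cookies follow different rules: a cookie set without a `Domain` attribute is sent only to the host that set it, while one set with a parent domain in `Domain` is also sent to its subdomains, so cookie scope still needs care even when services live on separate subdomains.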
The teams with ownership of these services worked on these development projects in parallel, and were able to migrate their pages on the following schedule.
- April 2022: Released the help.jp.mercari.com domain and the corresponding help-center microservice, and redirected corresponding pages under https://www.mercari.com/jp/
- May 2022: Released the static.jp.mercari.com domain and the corresponding web-static-page microservice, and redirected corresponding pages under https://www.mercari.com/jp/
- June 2022: Switched over all redirection features on web-2 to the web-redirection service
As of June 2022, all requests under https://www.mercari.com/jp/ were being redirected, and there was no longer any traffic to web-2. The infrastructure for web-2 had been completely eliminated as of August 4. Finally, after more than four years of work, our conversion of Mercari Web into microservices was complete.
Appropriate microservices process requests for each type of content
Our architecture has continued to change to support the needs of the application and the organization, and to suit the technology available to us. The microservices we’ve developed over the last four years will continue to change as well.
As of the writing of this article, the Web Team is planning to make the following architecture changes.
- Migrate from dynamic rendering to SSR:
We’ve noticed that costs have increased due to the CPU load on the dynamic rendering server and that response delays have had an effect on SEO, so we’d like to switch over to SSR.
- Separate the domain of the pages provided by web-auth from https://jp.mercari.com:
We want to ensure independence from web-suruga and make network routing easier.
As for the first change, we actually had some concerns over performance when initially designing the architecture, and had conducted a load test prior to deciding to use dynamic rendering. However, the application itself hadn’t been completed when we conducted the load test, and so we instead performed rendering using test content. Our results ended up being quite different from those of the final application. We first realized that the load would be a problem immediately prior to release, but it wouldn’t have been possible to migrate to SSR at that point. We decided to investigate the issue later, since this dynamic rendering feature wouldn’t have any impact on users.
On the other hand, this change wouldn’t impact any other microservices, since web-suruga was the only microservice using dynamic rendering. I guess this shows the benefits of a microservices architecture!
When I reflect on these four years of architecture changes and development work, I see a lot of areas where we could have done better. This isn’t meant to place any blame or criticize any work! Instead, it’s better to look back on the purposes, development processes, and results of the changes we made, and apply what we’ve learned to future changes. I think that’s how you build a stronger organization, and I hope this article helps with that.
Finally, I’d like to thank everyone who contributed to web-2, and also everyone involved in the Web Re-Architecture project, GroundUp Web project, and web-2 Sunset project. I really appreciate all of your work building the basis for Mercari Web and your efforts during each project to make Mercari Web what it is today. Thank you very much!
If you are interested in joining projects like these at Mercari, please take a look at our career page.