- This article is a translation of the Japanese article written on April 6, 2022.
This article is for day 3 of Merpay Tech Openness Month 2022.
Hello. This is @sinmetal from the Merpay Solutions Team.
In this article, I discuss migrating gchammer from Google App Engine to Cloud Run. gchammer is an internal Merpay web UI tool for running queries (SELECT only) on Cloud Spanner ("Spanner" below).
I described what gchammer does in a previous article (only available in Japanese).
Why migrate to Cloud Run?
- To use Go 1.18 immediately after it’s released
- To use various machine types
- To get away from using App Engine
Go 1.18 was a massive update that introduced generics, among other things. The developer of gchammer, @vvakame, was really looking forward to generics and said he’d spend the next 10 years in a permanent frown if he couldn’t use them as soon as Go 1.18 was released, so I knew I needed to treat this as a high-priority item.
Queries that return many Spanner results need plenty of memory, so Cloud Run, which can run on machines with a wider range of specifications, is more appealing than App Engine. Another mark in favor of Cloud Run is that we are billed only for the time spent processing requests, instead of in 15-minute increments like with App Engine.
Within Mercari Group, the previous incarnation of Souzoh, Inc. used App Engine several years ago, but the current Souzoh runs its Mercari Shops service on Cloud Run and no longer uses App Engine. As a result, the proportion of people in the group with App Engine experience has been gradually decreasing. App Engine happens to be a specialty of mine, so continuing to use it would have been no problem for now, but I wanted us to start moving away from it so that other members of the team could more easily participate in development.
Our post-Cloud Run architecture
Cloud Run cannot use Identity-Aware Proxy (IAP) natively, so we placed an HTTP load balancer in front of it. We had originally served the API (GraphQL) and the other web frontend resources (HTML, etc.) from the same server, but we ended up building a separate, simply structured container image for each. We replaced the routing previously performed by app.yaml with URL maps, which direct each request to the appropriate container.
Requests from Cloud Tasks are sent directly to the tqworker service, without passing through IAP. We do this because, due to serverless NEG constraints, the backend timeout cannot be configured and stays at the default of 30 seconds. That is too short for requests from Cloud Tasks, which can sometimes take a long time to process.
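As a rough mental model of what the URL map does in place of app.yaml, the following Go sketch picks a backend service by the longest matching path prefix. It is a simplification, not the load balancer's actual matching algorithm, and the paths and service names (`/api/`, `api`, `frontend`) are hypothetical. The real routing lives in the load balancer configuration, not in application code.

```go
package main

import (
	"fmt"
	"strings"
)

// route is a hypothetical path rule: requests whose path starts with
// Prefix are sent to the Cloud Run service named Service.
type route struct {
	Prefix  string
	Service string
}

// pickService returns the service of the longest matching prefix,
// or def (the default service) when no rule matches.
func pickService(routes []route, def, path string) string {
	best, bestLen := def, -1
	for _, r := range routes {
		if strings.HasPrefix(path, r.Prefix) && len(r.Prefix) > bestLen {
			best, bestLen = r.Service, len(r.Prefix)
		}
	}
	return best
}

func main() {
	// Hypothetical rules: GraphQL API on one service, everything else
	// (HTML and other static resources) on the frontend service.
	routes := []route{
		{Prefix: "/api/", Service: "api"},
	}
	for _, p := range []string{"/api/query", "/index.html"} {
		fmt.Printf("%s -> %s\n", p, pickService(routes, "frontend", p))
	}
}
```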
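To illustrate what a worker receiving tasks directly might look like, here is a minimal Go sketch of a handler that reads the headers Cloud Tasks attaches to HTTP target requests (`X-CloudTasks-QueueName`, `X-CloudTasks-TaskName`, `X-CloudTasks-TaskRetryCount`). The handler name and the `/tasks/run` path are hypothetical and not taken from gchammer. These headers are not authentication by themselves, and how the tqworker service is protected is outside the scope of this sketch.

```go
package main

import (
	"fmt"
	"log"
	"net/http"
	"net/http/httptest"
	"strconv"
)

// taskInfo holds metadata Cloud Tasks sends with an HTTP target request.
type taskInfo struct {
	Queue      string
	Name       string
	RetryCount int
}

// parseTaskHeaders reads the Cloud Tasks headers through get, which is
// typically r.Header.Get. It reports false when no task name is present.
func parseTaskHeaders(get func(string) string) (taskInfo, bool) {
	name := get("X-CloudTasks-TaskName")
	if name == "" {
		return taskInfo{}, false
	}
	// A missing or malformed retry count is treated as 0.
	retry, _ := strconv.Atoi(get("X-CloudTasks-TaskRetryCount"))
	return taskInfo{
		Queue:      get("X-CloudTasks-QueueName"),
		Name:       name,
		RetryCount: retry,
	}, true
}

// handleTask is a hypothetical worker endpoint.
func handleTask(w http.ResponseWriter, r *http.Request) {
	info, ok := parseTaskHeaders(r.Header.Get)
	if !ok {
		http.Error(w, "not a Cloud Tasks request", http.StatusBadRequest)
		return
	}
	log.Printf("task=%s queue=%s retry=%d", info.Name, info.Queue, info.RetryCount)
	// Long-running work would go here. A 2xx response marks the task as done.
	fmt.Fprintln(w, "ok")
}

func main() {
	// Exercise the handler in-process instead of starting a server.
	req := httptest.NewRequest(http.MethodPost, "/tasks/run", nil)
	req.Header.Set("X-CloudTasks-QueueName", "example-queue")
	req.Header.Set("X-CloudTasks-TaskName", "example-task")
	req.Header.Set("X-CloudTasks-TaskRetryCount", "0")
	rec := httptest.NewRecorder()
	handleTask(rec, req)
	fmt.Println("status:", rec.Code)
}
```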
Issues with migrating to Cloud Run
Development of gchammer began in 2019, so it made very little use of APIs unique to App Engine, and the migration went ahead without any major issues.
Broadly speaking, the following changes were made in the app code:
- Changed the environment variables used to obtain the App Engine module name, etc. to their Cloud Run equivalents
- Changed Cloud Tasks tasks from App Engine targets to HTTP targets
- Changed from using App Engine UserService to using Identity-Aware Proxy HTTP headers for obtaining user information
- Changed how secret values are obtained
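The first change in the list can be sketched as follows. Cloud Run sets `K_SERVICE` and `K_REVISION` in the container, while newer App Engine runtimes set `GAE_SERVICE` and `GAE_VERSION`. The `runtimeInfo` type, the `detectRuntime` function, and the local fallback values are hypothetical names for illustration, not gchammer's actual code.

```go
package main

import (
	"fmt"
	"os"
)

// runtimeInfo holds the name and version of the running service.
type runtimeInfo struct {
	Service  string
	Revision string
}

// detectRuntime prefers the Cloud Run variables, falls back to the
// App Engine ones, and finally to placeholder values for local runs.
// getenv is typically os.Getenv.
func detectRuntime(getenv func(string) string) runtimeInfo {
	if s := getenv("K_SERVICE"); s != "" {
		return runtimeInfo{Service: s, Revision: getenv("K_REVISION")}
	}
	if s := getenv("GAE_SERVICE"); s != "" {
		return runtimeInfo{Service: s, Revision: getenv("GAE_VERSION")}
	}
	// Hypothetical defaults when running outside both platforms.
	return runtimeInfo{Service: "local", Revision: "dev"}
}

func main() {
	info := detectRuntime(os.Getenv)
	fmt.Printf("service=%s revision=%s\n", info.Service, info.Revision)
}
```

Passing the lookup function in as a parameter keeps the logic testable without modifying the process environment.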
Identity-Aware Proxy has no equivalent of UserService.IsAdmin, so we use the Cloud Resource Manager API to read the project’s IAM policy and obtain similar functionality.
We had stored secret values in Secret Manager and used berglas to fetch them when the app started. However, Cloud Run has a feature that exposes Secret Manager values as environment variables, so we changed the architecture to make use of this feature.
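A minimal sketch of this idea, under some assumptions: IAP passes the user in the `X-Goog-Authenticated-User-Email` header in the form `accounts.google.com:alice@example.com`, and the IAM policy bindings have already been fetched from the Cloud Resource Manager API (the fetch itself is omitted here). Which roles count as "admin" is a hypothetical choice made by the caller, and this sketch does not resolve group membership. A production service would normally also verify the signed `X-Goog-IAP-JWT-Assertion` header.

```go
package main

import (
	"fmt"
	"strings"
)

// emailFromIAPHeader extracts the email address from the value of the
// X-Goog-Authenticated-User-Email header set by IAP.
func emailFromIAPHeader(v string) (string, bool) {
	const prefix = "accounts.google.com:"
	if !strings.HasPrefix(v, prefix) || len(v) == len(prefix) {
		return "", false
	}
	return strings.TrimPrefix(v, prefix), true
}

// binding mirrors one entry in the bindings list of an IAM policy.
type binding struct {
	Role    string
	Members []string
}

// isAdmin reports whether email is a direct "user:" member of any binding
// whose role is in adminRoles. Groups and service accounts are ignored.
func isAdmin(bindings []binding, adminRoles map[string]bool, email string) bool {
	member := "user:" + email
	for _, b := range bindings {
		if !adminRoles[b.Role] {
			continue
		}
		for _, m := range b.Members {
			if strings.EqualFold(m, member) {
				return true
			}
		}
	}
	return false
}

func main() {
	email, ok := emailFromIAPHeader("accounts.google.com:alice@example.com")
	if !ok {
		fmt.Println("no IAP identity")
		return
	}
	// Hypothetical policy and admin role set for illustration.
	bindings := []binding{
		{Role: "roles/owner", Members: []string{"user:alice@example.com"}},
	}
	adminRoles := map[string]bool{"roles/owner": true}
	fmt.Println(email, "admin:", isAdmin(bindings, adminRoles, email))
}
```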
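On the application side, the result is that a secret is read like any other environment variable. The sketch below assumes a hypothetical variable name, `GCHAMMER_EXAMPLE_SECRET`, that would be mapped to a Secret Manager secret in the Cloud Run service configuration. It fails fast when the value is missing rather than starting with an empty secret.

```go
package main

import (
	"fmt"
	"os"
)

// requireEnv returns the value of key, or an error when the variable is
// unset or empty. lookup is typically os.LookupEnv.
func requireEnv(lookup func(string) (string, bool), key string) (string, error) {
	v, ok := lookup(key)
	if !ok || v == "" {
		return "", fmt.Errorf("required environment variable %s is not set", key)
	}
	return v, nil
}

func main() {
	// GCHAMMER_EXAMPLE_SECRET is a hypothetical name used for illustration.
	if _, err := requireEnv(os.LookupEnv, "GCHAMMER_EXAMPLE_SECRET"); err != nil {
		fmt.Println("startup check failed:", err)
		return
	}
	// Never log the secret value itself.
	fmt.Println("secret loaded")
}
```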
Feature to add later
Spanner provides two types of tags (request tags and transaction tags) for use as information associated with requests. Tags can be used for a variety of purposes, including identifying the sources of requests or for categorization. For now, it would be convenient if we could determine whether something is a query sent from gchammer. If we could identify and exclude queries from gchammer when analyzing query stats, there would be less extraneous noise to worry about.
If we could attach identification tags to all of our known access sources (applications and jobs), we could identify anything without a tag as access from some source other than our own code, such as from the Cloud Console. Tags are therefore a crucial element in making use of stats.
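To make the idea concrete, here is a small Go sketch that classifies query stats rows by their request tag and drops the ones that came from gchammer. It assumes the rows were already read from a query statistics table that exposes a request tag column, and that gchammer tags its queries with a hypothetical `app=gchammer` key-value tag. In the Go client, I believe a request tag can be attached through the `RequestTag` field of `spanner.QueryOptions`, but that call is not shown here so the example stays standard-library only.

```go
package main

import (
	"fmt"
	"strings"
)

// gchammerTag is a hypothetical tag value, not gchammer's actual one.
const gchammerTag = "app=gchammer"

// queryStat is a simplified row from a query statistics table.
type queryStat struct {
	RequestTag     string
	Text           string
	ExecutionCount int64
}

// classify labels the source of a query based on its request tag.
// Untagged queries come from outside our own code, e.g. the Cloud Console.
func classify(tag string) string {
	switch {
	case tag == "":
		return "untagged"
	case tag == gchammerTag || strings.HasPrefix(tag, gchammerTag+","):
		return "gchammer"
	default:
		return "application"
	}
}

// excludeGchammer returns the rows that were not issued by gchammer.
func excludeGchammer(stats []queryStat) []queryStat {
	kept := make([]queryStat, 0, len(stats))
	for _, s := range stats {
		if classify(s.RequestTag) != "gchammer" {
			kept = append(kept, s)
		}
	}
	return kept
}

func main() {
	stats := []queryStat{
		{RequestTag: "app=gchammer,action=adhoc", Text: "SELECT 1", ExecutionCount: 3},
		{RequestTag: "app=example-batch", Text: "SELECT 2", ExecutionCount: 120},
		{RequestTag: "", Text: "SELECT 3", ExecutionCount: 1},
	}
	for _, s := range excludeGchammer(stats) {
		fmt.Printf("%-12s %s (%d executions)\n", classify(s.RequestTag), s.Text, s.ExecutionCount)
	}
}
```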
This concludes my article on migrating gchammer from Google App Engine to Cloud Run. I hope you found it interesting!