2022/10/13

Leverage Kotlin in your Android CI

Author: pgreze

Hello, my name is Pierrick, and I’m a member of the Architecture Team at Mercari.

In this blog post, I will outline the multistep strategy we adopted over the years to transition our Android CI from being a mix of loosely coupled integrations to a consistent environment maintainable by any engineer familiar with Kotlin.

Android CI Context

One of the biggest CI projects in Mercari is the Android client application, where we are continuously improving the app and pushing weekly updates on the Google Play Store.

To keep a fast delivery speed, our CI is used in a variety of ways to ensure our codebase is not breaking and no regressions are introduced along the way.

Mercari CI usages

We are leveraging our CI primarily in the following two ways:

Hot path

The hot path is the common flow of CI tasks we’re running every time we push a change to our Android repository.

This is a time-critical path, because it determines how long our engineers have to wait after pushing a change before they can merge it.

This is also a cost-sensitive path, because our Android application is a complex project that requires powerful machines to compile in an acceptable amount of time. Each additional minute we spend on these expensive machines is therefore visible in the overall cost of our infrastructure.

Since we spend most of our waiting time and money on the hot path, our improvements mainly focus on this workflow.

Automation of regular tasks

The second way we leverage CI is by running regular tasks that are needed to maintain our project. We have a lot of regular tasks including:

weekly autoupdate of our dependencies,
daily reminders of the next release schedule, and who will be the owner of the release,
etc.

Compared to the hot path we defined before, such tasks are not as time sensitive, because we’re running them early in the morning or during the weekend, when no one is actively waiting for their results. But they introduce new challenges, like monitoring or maintenance.

Since these tasks are triggered automatically, we risk forgetting them if they start failing silently when everyone is busy with more urgent tasks.

Even if we notice an error, solving the failure might be difficult because people may not be used to working with that part of our tooling. It gets even harder if the failing code is written in a programming language our Android engineers have limited experience with, like Ruby or JavaScript.

Historical tech stacks

Our Android CI was originally maintained by the same people who were maintaining our iOS CI, and so Fastlane was used for most of the Android CI logic.

Fastlane is a popular solution in the iOS ecosystem. But since Fastlane is based on Ruby, we needed to maintain Ruby language support, including linting, dependency installation, and updates. Due to a lack of knowledge and interest, Fastlane became an unmaintained part of our codebase and accumulated many unresolved security warnings.

We clearly lacked a strategy for maintaining such automations, and an action plan to converge on a coherent ecosystem. After some Android members volunteered to take over the Android CI maintenance, we had to define a strategy to replace unmaintained parts of our infrastructure with more Android-friendly technologies.

Kotlin was chosen as a replacement for all the existing scripting languages. But the migration had to be incremental, because we had limited resources and had to change parts of the CI while it was still being used by all our Android engineers daily. That’s how we ended up with our first iteration of Kotlin-based tooling.

First iteration: provide Kotlin-based automation with custom Docker images

To gradually introduce Kotlin-powered automation, the first idea was to ship our first Kotlin-based tooling in new custom Docker images.

We were able to provide the Docker images by creating a new, separate project called toolbox, which focused on maintaining Kotlin automation code and the Dockerfiles required to run it.

We used this approach for the following reasons:

More deterministic Docker image upgrades

Up to this point, the CircleCI team was updating the images that we were using, without clear visibility on when changes would happen and why. By building new images based on the CircleCI images, we were able to introduce more explicit versioning of their releases, enabling us to control potential regressions.

FROM circleci/android:api-30

# Install toolbox
COPY ./kotlin-ci/build/install/kotlin-cli-shadow /mercari/kotlin-cli
ENV PATH="$PATH:/mercari/kotlin-cli/bin"

Extending an existing Docker image with a fat-jar is pretty straightforward.

We now had to regularly update our images manually, but this turned out to be an advantage because changes were introduced in a predictable way.

Include Kotlin automation code in the Docker image

Because of the images we were using to build Android apps, we already had a dependency on the JVM. Thus, we could run the jar files of our custom Kotlin automation code without bringing in any new dependency.

The following are some examples of commands that we started distributing with this approach:

apk-publish: Publish an APK deploy message to a Slack channel + the related GitHub PR.
pr-comment: Post a message to a linked PR, if it exists.
slack-msg: Post a message to Slack.
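To give an idea of what such a command could look like, here is a minimal sketch of a slack-msg style entry point. It is not the actual toolbox code: the function names, the SLACK_WEBHOOK_URL environment variable, and the use of a Slack incoming webhook are assumptions made for illustration, and it relies on the java.net.http client available since JDK 11.

```kotlin
import java.net.URI
import java.net.http.HttpClient
import java.net.http.HttpRequest
import java.net.http.HttpResponse
import kotlin.system.exitProcess

/** Escapes the few characters this sketch cares about inside a JSON string. */
fun jsonEscape(text: String): String = buildString {
    for (c in text) when (c) {
        '"' -> append("\\\"")
        '\\' -> append("\\\\")
        '\n' -> append("\\n")
        else -> append(c)
    }
}

/** Builds the body expected by a Slack incoming webhook: {"text":"..."}. */
fun slackPayload(message: String): String =
    "{\"text\":\"${jsonEscape(message)}\"}"

/** Posts the message and returns the HTTP status code. */
fun postToSlack(webhookUrl: String, message: String): Int {
    val request = HttpRequest.newBuilder(URI.create(webhookUrl))
        .header("Content-Type", "application/json")
        .POST(HttpRequest.BodyPublishers.ofString(slackPayload(message)))
        .build()
    return HttpClient.newHttpClient()
        .send(request, HttpResponse.BodyHandlers.ofString())
        .statusCode()
}

fun main(args: Array<String>) {
    when (args.firstOrNull()) {
        "slack-msg" -> {
            // Hypothetical variable name; the real tool may read its configuration differently.
            val webhook = System.getenv("SLACK_WEBHOOK_URL")
                ?: error("SLACK_WEBHOOK_URL is not set")
            val status = postToSlack(webhook, args.drop(1).joinToString(" "))
            if (status !in 200..299) exitProcess(1)
        }
        else -> {
            System.err.println("Usage: kotlin-cli slack-msg <message>")
            exitProcess(64)
        }
    }
}
```

Packaged as a fat-jar and added to the PATH as in the Dockerfile above, a CI step could then call something like kotlin-cli slack-msg "Build finished".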

Second iteration: bring automation directly into the Android codebase

The custom Docker image approach allowed us to experiment with writing Kotlin-based automation code. But since each update was tied to a new release of the related Docker images, we encountered a slow feedback loop.

At the same time, we investigated a new idea that would allow us to write automation code directly in the Android project. We decided to try out Gradle's JavaExec task type, which can run a Java or Kotlin program located in the Gradle project directly as a Gradle task.

But since our Android project is already big and has many Android-related dependencies, we wanted to avoid adding dependencies that every engineer would have to resolve, including those who only work on Android feature development.

To solve this issue, we created a new module, included as a Gradle composite build, that would host all our JavaExec-powered integrations and only depend on Kotlin and a few other automation-related libraries. This project would only be loaded on demand on local machines, and always on the CI.

// Only load scripts folder if required.
val includedScripts: String? by settings
if ("true" == System.getenv("CI") || "true" == includedScripts) {
    includeBuild("scripts")
}

In the settings.gradle.kts file, the scripts composite build is included only when the CI environment variable is set to true, or on demand when the includedScripts Gradle property is set to true.

This new scripts module became a great place to host all isolated tooling that we wanted to expose locally, but only on demand, like code generation when creating a new module.

./gradlew -p scripts generateFeatureModule

The -p (--project-dir) option allows us to run Gradle with the scripts folder as the root project, instead of the Android project in the current directory.
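For readers unfamiliar with JavaExec, the build file of such a scripts project could look roughly like the sketch below. This is an illustration under assumptions, not our actual configuration: the Kotlin plugin version, the com.example.scripts package, the task group, and the moduleName property are all hypothetical, and the scripts folder also needs its own settings.gradle.kts to be usable as a standalone or included build.

```kotlin
// scripts/build.gradle.kts (hypothetical layout)
plugins {
    kotlin("jvm") version "1.7.20" // assumed version
}

repositories {
    mavenCentral()
}

// Runs a Kotlin entry point from this build's main source set as a Gradle task.
tasks.register<JavaExec>("generateFeatureModule") {
    group = "automation"
    description = "Generates the skeleton of a new feature module."
    classpath = sourceSets["main"].runtimeClasspath
    // Hypothetical class: a top-level main() declared in GenerateFeatureModule.kt.
    mainClass.set("com.example.scripts.GenerateFeatureModuleKt")
    // Forwards an optional -PmoduleName=... property to the program.
    args(providers.gradleProperty("moduleName").getOrElse("sample"))
}
```

Because the task only depends on this small build, running it does not require configuring the Android modules.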

With this new approach, we could completely replace several Ruby/Python scripts, while being even more integrated with the Gradle environment. This integration simplified the maintenance of these scripts because we could directly reference Gradle outputs (APK, test results) instead of hardcoding their paths in other scripts.

We noticed that Android engineers could manage a Kotlin project much more easily than standalone Python or Ruby scripts, and that they brought with them the habit of writing tests for each new addition.

And last but not least, we could even refactor our initial automation code in the toolbox project mentioned in the first step. The idea would be to publish reusable code into a private Maven repository that could be consumed not only in the scripts folder of the main Android project, but also in any other project that wants to use similar automation in its CI flow.
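As a rough illustration of that idea, a library module could be published with Gradle's maven-publish plugin along these lines. The coordinates, repository name, URL, and environment variable names are placeholders chosen for this sketch, not our real setup.

```kotlin
// toolbox/build.gradle.kts (hypothetical layout)
plugins {
    kotlin("jvm") version "1.7.20" // assumed version
    `maven-publish`
}

// Placeholder coordinates for the published artifact.
group = "com.mercari.test"
version = "1.0"

repositories {
    mavenCentral()
}

publishing {
    publications {
        create<MavenPublication>("testlab") {
            artifactId = "testlab"
            // Publishes the jar produced by the Kotlin/JVM plugin.
            from(components["java"])
        }
    }
    repositories {
        maven {
            name = "internal"
            // Placeholder URL for a private repository.
            url = uri("https://example.org/maven2/")
            credentials {
                // Hypothetical variable names, typically injected by the CI.
                username = System.getenv("MAVEN_USERNAME")
                password = System.getenv("MAVEN_PASSWORD")
            }
        }
    }
}
```

With such a configuration, ./gradlew publish would upload the artifact, and consumers would declare the same repository and coordinates.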

Third iteration: embrace Kotlin Script for a pure scripting experience

We were reaching a pretty nice situation in our migration, gradually replacing even more of the existing Ruby or Python scripts in the Android project itself. But at some point, a particularly complicated Python script that had accumulated many features over time became the opportunity to try Kotlin Script in our Android project.

This script contains the logic for calling the Firebase Test Lab service for our E2E tests, waiting for the Android devices to complete their tests, and publishing the results to the pull request. Originally, the script was using the gcloud command to trigger the tests. With gcloud being implemented in Python, we decided at that time to write our integration in the same language.

Luckily, we had already moved to Flank, which runs on the JVM, so we had a chance to drop the Python dependency. Moving to Flank allowed us to rewrite this logic in Kotlin, and create stronger foundations for our future needs.

Kotlin Script seemed like a good choice for the following reasons:

Easy to run on a local machine and in the CI

All our execution environments require at least a JVM. For Kotlin Script, the only extra dependency is a Kotlin installation.
Locally, Kotlin can be installed with SDKMAN! or Homebrew:

# With homebrew
brew install kotlin

# With sdkman
sdk install kotlin

On the CI side, we needed to create a new Dockerfile providing a JVM and a Kotlin installation to run our Kotlin scripts. Note that with GitHub Actions, you can run Kotlin scripts natively without any additional setup, because the default Ubuntu image ships with Kotlin preinstalled.

Support for external dependencies out-of-the-box

Kotlin Script is a great choice when you have existing Maven dependencies, because adding them is as easy as including an annotation in your script.

@file:Repository("https://repo.maven.apache.org/maven2/")
@file:Repository("https://example.org/maven2/")
@file:DependsOn("org.jetbrains.kotlinx:kotlinx-coroutines-core:1.6.4")
@file:DependsOn("com.mercari.test:testlab:1.0")

import com.mercari.testlab.runE2eTests
import kotlinx.coroutines.runBlocking

runBlocking {
    val results = runE2eTests(...)
    println("Results: $results")
}

Anyone who works with Python or Ruby knows the pain of having to prepare a virtual environment, relying on pip or bundler to install your script dependencies before running any non-trivial script. With Kotlin Script, the workflow is the same whether or not you have dependencies. Kotlin makes sure they are downloaded, or reused from a local cache, before compiling and running your script on the JVM. And this works without touching any build system like Gradle.

Considering this feature, Kotlin Script is a strong candidate when you want to write glue code for business logic implemented in another repository, where you can take the time to have a complete setup for writing quality code (linting, testing, etc.). In this case, Kotlin Script can be treated as a more maintainable equivalent to Bash.
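To make the comparison with Bash more concrete, here is a small sketch of a script that shells out to an external command and reacts to its result. The file name check-clean.main.kts, the helper names, and the choice of git status as the command are assumptions made for this example, and it expects git to be available on the PATH.

```kotlin
// check-clean.main.kts (hypothetical file name)

/** Result of an external command: exit code plus combined stdout/stderr. */
data class CommandResult(val exitCode: Int, val output: String)

/** Runs a command, waits for it to finish, and captures everything it printed. */
fun exec(vararg command: String): CommandResult {
    val process = ProcessBuilder(*command)
        .redirectErrorStream(true) // merge stderr into stdout
        .start()
    val output = process.inputStream.bufferedReader().readText()
    return CommandResult(process.waitFor(), output.trim())
}

/** Turns the result of `git status --porcelain` into a human-readable message. */
fun describe(status: CommandResult): String = when {
    status.exitCode != 0 -> "git failed with exit code ${status.exitCode}:\n${status.output}"
    status.output.isEmpty() -> "Working tree is clean"
    else -> "Uncommitted changes:\n${status.output}"
}

// Top-level statements are the entry point of a Kotlin script.
println(describe(exec("git", "status", "--porcelain")))
```

Such a file can be run with kotlin check-clean.main.kts. Compared to the Bash equivalent, the exit code and the output are typed values that can be unit tested or moved into a library later.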

Conclusion

This multistep effort of converting our existing esoteric scripts to a more consistent environment allowed us to remove a large amount of unmaintained code that was previously running in our CI. We also gained a more powerful CI that enabled us to create automations that we couldn’t confidently provide before.

Thank you for reading. We’re always on the lookout for new Android engineers, to contribute to our app and infrastructure.
If you are interested in working with us, check our careers page.

To dig even more into this topic:

This blog post is the written summary of a talk I gave at DroidKaigi 2022; you can find the slides here.
A sample project showcasing these approaches is available here.