We have created unit testing guidelines

* This article is a translation of the Japanese article written on April 22, 2022.

This article is for day 15 of Merpay Tech Openness Month 2022.

Introduction

Hello. This is @tanaka0325, a backend engineer on the Credit Design Team. I work primarily on developing Merpay Smart Payment.
In this article, I discuss the unit testing guidelines my team has recently created.

Issues

I’m currently working on the Merpay Smart Payment microservice. This microservice was created by reusing code from Mercari Monthly Deferred Payments and adding features to it for new requirements.
In converting this into a microservice, our policy was to migrate data from Mercari Monthly Deferred Payments as needed after the microservice was released, so we added new features such as fixed-amount payments while continuing to migrate existing data. The following article goes over the history of creating the Merpay Smart Payment microservice.

[Transcription, Japanese] Converting Merpay Smart Payment to a microservice – Takuya Yoshida [Merpay Tech Fest 2021]

Adding various new features increased not only the amount of code, but also the number of tests. Some tests were no longer needed, and in other cases it was difficult to tell what a test was originally intended to verify. All of this made the code harder to understand, which began to have a negative impact on the time it took to add new features.

In order to resolve this, we decided to refactor our tests. However, much like with designing code, there is no single correct answer when it comes to writing tests. We decided to first write some guidelines that could satisfy everyone on the team.

Important points

We kept the following points in mind in writing these guidelines.

The first point was to keep things realistic. We wouldn’t be writing new tests, but instead would be applying these guidelines to existing tests. We decided we would not set rules that would require massive revisions to implementation code. It would be meaningless to create a rule that was ideal but unrealistic, because we could never implement it.

The second point was to consider whether the guidelines suited us, rather than simply adopting rules generally thought to be good ideas. Just because a rule is generally considered a good idea doesn't mean it's the right choice for us here and now. We decided to think carefully before adopting each rule, rather than incorporating them unconditionally.

The third point involves what we would do if we ever felt that the guidelines weren’t working for us. We decided we would meet as a team and update them as needed. The code we test changes daily, from how it’s designed to how it’s written. A rule that makes sense today may no longer apply tomorrow. Our thinking also changes with time, and it’s certainly possible that a team member could come up with an even better idea later on. We decided it would be best for team members to consult with one another as needed from the initial discussion phase, so that we could react flexibly to change.

Individual guidelines

Before I cover any individual guidelines, I’d like to stress one thing. These guidelines were written with our current team and the services we are currently running in mind. Our team, and the code we test, will need to change as our situation changes. The important thing is to first discuss things within the team, and then write guidelines that suit us.

I’ll be covering several individual guidelines throughout the rest of this article.

Include business logic tests within the packages containing that logic

Imagine code where one package ("A") imports another package ("B"). Depending on the test, tests related to the logic in B may end up in both B and A. Revising a feature in B would then affect tests in both packages, and when reading a test for A you would also have to check the tests related to B, making it harder to understand A's tests on their own. Some kinds of tests, such as e2e and scenario tests, legitimately involve multiple packages; for unit tests, however, we decided that tests for a specific package should live only in that package.

Eliminate redundant/unnecessary test cases

Related to the previous guideline, we decided to remove redundant tests (where the same kind of test appears in multiple packages). We also decided to remove any tests that were no longer necessary, for example because a feature had been revised. Having more tests is not always a good thing, so we want to remove anything we don't need.

Do not perform multiple condition testing using conditional branches within a single test function

We have an enormous number of tests, and some of them include conditional branches. Including conditional branches and other logic in tests makes it more difficult to understand what a test does. The team decided tests would be easier to understand if we split test functions by condition and reduced the number of conditional branches within individual test functions, so that's what we did.

Test function naming rules

We set some rules on how functions should be named when splitting up test functions.

  • Test functions are named based on the following rule:
    • Test_{*test subject*}_{condition}_{*expected result*}
      • Items marked with * are required; {condition} may be omitted
    • Examples (test names are for reference only)
      • Function with a condition
        • Test_CreateUser_AdminUser_OK
      • Function without a condition
        • Test_CreateUser_OK
        • Test_CreateUser_ValidationError
Test packages can be split or left unsplit

Prior to writing these guidelines, it was recommended that we keep implementation code and its tests in separate packages. However, this rule wasn't in place from the beginning, so some packages follow it while others don't. We met to discuss whether it would actually be better to keep code and tests in separate packages, and decided to do whatever seems right in each case, since each approach has benefits and drawbacks.

By the way, if you want to read about the benefits of keeping these in separate packages, and about tests using private features, you might find the following article interesting.

Go Friday Tidbits: Tests using private (unexported) features (Japanese)

Write tests as needed, without worrying about whether the function to test is public or private

It's generally said that tests should be written against public functions. We probably would have decided to test only public functions if all we were doing was writing new tests. However, after looking at our existing implementation and test code, we decided it would also be okay to write tests for private functions. There are several reasons for this, including the fact that we already have many existing private function tests. Another reason is that, given the structure of our existing implementation code, individual functions tend to be large, so testing only through public functions would not necessarily make tests easier to write. And as mentioned earlier, we had decided not to make massive revisions to implementation code.
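As a concrete sketch: in Go, a test file that declares the same package as the implementation (rather than a separate `_test` package) can call unexported functions directly. The `normalizeName` helper here is a hypothetical example of a private function large enough to be worth testing on its own:

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeName is unexported; it stands in for a private helper that has
// grown large enough to be worth testing directly (hypothetical example).
func normalizeName(s string) string {
	return strings.ToLower(strings.TrimSpace(s))
}

// Because the internal test file declares the same package as the
// implementation (package foo, not foo_test), it can call normalizeName:
//
//	func Test_normalizeName_OK(t *testing.T) {
//		if got := normalizeName("  Alice "); got != "alice" {
//			t.Errorf("got %q, want %q", got, "alice")
//		}
//	}

func main() {
	fmt.Println(normalizeName("  Alice "))
}
```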

Use a common fixture generation function when writing test data

We create test data as a preliminary step for each test. This can add a lot of lines of code depending on the test, which can make it difficult to understand dependencies and can reduce readability. We considered reducing the number of lines of code to make dependencies more easily understandable, and decided to create a new tool. We published an article the other day that covers this in more detail:

Using Go to write good test fixtures (Japanese)
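One common shape for such a fixture generator is a constructor with defaults plus functional options, so each test overrides only the fields it cares about. The names below (`User`, `NewUserFixture`, `WithAdmin`) are illustrative assumptions, not our actual tool:

```go
package main

import "fmt"

// User is a hypothetical domain model used only for illustration.
type User struct {
	Name   string
	Status string
	Admin  bool
}

// NewUserFixture returns a User with sensible defaults; callers override
// only the fields a given test cares about via functional options.
func NewUserFixture(opts ...func(*User)) *User {
	u := &User{Name: "test-user", Status: "active"}
	for _, opt := range opts {
		opt(u)
	}
	return u
}

// WithAdmin marks the fixture user as an administrator.
func WithAdmin() func(*User) {
	return func(u *User) { u.Admin = true }
}

func main() {
	u := NewUserFixture(WithAdmin())
	fmt.Println(u.Name, u.Status, u.Admin)
}
```

A test that only cares about admin behavior then reads as a single line, `NewUserFixture(WithAdmin())`, instead of many lines of setup, which keeps the test's actual dependencies visible.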

When writing a failure cause test, always include the associated ticket URL

No matter how hard you try, you cannot completely eliminate failures. When a failure occurs, we add a test to reproduce it. The conditions under which a failure occurs often differ from those of normal operation, so it can be difficult to fully understand the situation just from reading the test code. We therefore decided that any test code added to reproduce a failure must also include a comment with the URL of a ticket explaining that failure in detail.

In conclusion

In this article, I discussed the unit testing guidelines created by my team. We actually do refactor tests as a sort of side project, but there are still many tests that have yet to be rewritten. As we continue to rewrite these tests, we’ll likely have plenty of opportunities to rethink these guidelines as a team and make them even better.

There’s no such thing as a perfect set of guidelines that cover every situation. I definitely recommend getting together as a team to discuss how testing can best suit your own situation. I hope this article will serve as a launching point for team discussion.
