* This article is a translation of the Japanese article written on August 24, 2020.
This article is for day 6 of Merpay Tech Openness Month 2020.
Hello, everyone. I’m Yoshiki Shibata (@yoshiki_shibata), a backend engineer at Merpay. In this article, I discuss the parallelization feature provided by the testing package of the Go programming language (Golang).
Go provides a package called testing, which is used to write test code. As a piece of software grows, the amount of test code grows with it, and so does the time it takes for all tests to complete. This is especially true for tests that access a database, where communication with the database accounts for much of the testing time. In such cases, running tests in parallel rather than sequentially can reduce testing time. (The correct term is “concurrent” rather than “parallel,” but because I’m covering the t.Parallel() method here, I’ll use “parallel” throughout the article.)
I’ll be explaining the Parallel() method of *testing.T here.
Executing tests from multiple packages in parallel
By default, test code written with the testing package is executed sequentially. Note, however, that this applies only to the tests within a given package.
If tests from multiple packages are specified, the tests are run in parallel at the package level. For example, imagine there are two packages, package a and package b. The test code in package a is run sequentially, and the test code in package b is also run sequentially. However, the tests for package a and package b are run in parallel with each other. Let’s look more closely at how this works.
If multiple packages are specified (or if all packages are specified with ./...), the number of packages whose tests are run in parallel is controlled by the -p flag of the go test command (strictly speaking, a build flag). The description of the -p flag provided by go help build is shown below.
-p n
the number of programs, such as build commands or
test binaries, that can be run in parallel.
The default is the number of CPUs available.
In other words, for tests, the value given to the -p flag is the maximum number of test binaries that can run as parallel processes. If the -p flag is not given, the maximum is the number of CPUs. The packages to be tested are automatically assigned to processes, and each process runs the tests for a single package sequentially. What happens if we specify -p=1? Only one test process runs at a time, so all tests are run sequentially, one package at a time.
Note: If you specify a value greater than 1 with the -p flag, specify multiple packages (or ./...), run the tests, and then execute the ps command from another terminal while the tests are running, you’ll see that a separate test binary is created and run for each package.
Specifying a larger value for the -p flag allows up to that many test processes to run at once, which improves parallelism across packages. However, keep in mind that this only means that tests from multiple packages are run in parallel. It does not mean that tests within an individual package are run in parallel. To improve parallelism for tests within a package, we need to use the t.Parallel() method.
t.Parallel() method
*testing.T has a method called Parallel(). Using the t.Parallel() method can be tricky, and it’s important to understand how it behaves in order to use it properly.
The description of the Parallel() method is as follows.
func (t *T) Parallel()
Parallel signals that this test is to be run in parallel with (and only
with) other parallel tests. When a test is run multiple times due to use of
-test.count or -test.cpu, multiple instances of a single test never run in
parallel with each other.
Let’s look at a simple example.
Imagine we have some test code using the testing package. It contains top-level test functions with the signature func TestXXX(t *testing.T), and a top-level test function may contain subtest functions written using t.Run(). Let’s start by seeing what happens when the t.Parallel() method is called only by top-level test functions.
Take a look at the following code.
package main

import (
	"fmt"
	"testing"
)

// trace prints a message on entry and returns a function that prints a
// message on return. Use it as: defer trace("name")().
func trace(name string) func() {
	fmt.Printf("%s entered\n", name)
	return func() {
		fmt.Printf("%s returned\n", name)
	}
}

func Test_Func1(t *testing.T) {
	defer trace("Test_Func1")()
	// ...
}

func Test_Func2(t *testing.T) {
	defer trace("Test_Func2")()
	t.Parallel()
	// ...
}

func Test_Func3(t *testing.T) {
	defer trace("Test_Func3")()
	// ...
}

func Test_Func4(t *testing.T) {
	defer trace("Test_Func4")()
	t.Parallel()
	// ...
}

func Test_Func5(t *testing.T) {
	defer trace("Test_Func5")()
	// ...
}
There are five test functions. Test_Func1, Test_Func3, and Test_Func5 are normal test functions. Test_Func2 and Test_Func4 call the t.Parallel() method. If we run this using the go test command, the following occurs.
1. Test_Func1 is executed and finishes processing.
2. Next, the program moves on to running Test_Func2. However, it pauses once the t.Parallel() method is called.
3. With Test_Func2 paused, Test_Func3 is run and finishes processing.
4. Next, the program moves on to running Test_Func4. However, it pauses once the t.Parallel() method is called.
5. With Test_Func4 paused, Test_Func5 is run and finishes processing.
Once the functions that do not call the t.Parallel() method (Test_Func1, Test_Func3, and Test_Func5) have all run in order, the functions that do call the t.Parallel() method (Test_Func2 and Test_Func4) resume, run in parallel, and then finish.
The results are shown below.
=== RUN Test_Func1
Test_Func1 entered
Test_Func1 returned <- 1 (completed)
--- PASS: Test_Func1 (0.00s)
=== RUN Test_Func2
Test_Func2 entered
=== PAUSE Test_Func2 <- 2 (paused)
=== RUN Test_Func3
Test_Func3 entered
Test_Func3 returned <- 3 (completed)
--- PASS: Test_Func3 (0.00s)
=== RUN Test_Func4
Test_Func4 entered
=== PAUSE Test_Func4 <- 4 (paused)
=== RUN Test_Func5
Test_Func5 entered
Test_Func5 returned <- 5 (completed)
--- PASS: Test_Func5 (0.00s)
=== CONT Test_Func2 <- processing resumes
Test_Func2 returned <- completed
=== CONT Test_Func4 <- processing resumes
Test_Func4 returned <- completed
--- PASS: Test_Func2 (0.00s)
--- PASS: Test_Func4 (0.00s)
PASS
Pay special attention to how, in the results above, calling the t.Parallel() method makes a function pause and later resume. A pause is indicated by === PAUSE, and a resumption by === CONT.
The condition for resuming a test paused by a call to the t.Parallel() method is described below as Operation 1.
Operation 1: Once all the top-level test functions (within a package) that do not call the t.Parallel() method have completed, the top-level test functions that call the t.Parallel() method resume and run in parallel.
Operation 1 means that, if a top-level test function does not call the t.Parallel() method, the program will not move on to the next top-level test function until all of its subtest functions have completed, even if a subtest function started with t.Run() calls the t.Parallel() method.
For example, let’s rewrite Test_Func1 as follows.
func Test_Func1(t *testing.T) {
	defer trace("Test_Func1")()

	t.Run("Func1_Sub1", func(t *testing.T) {
		defer trace("Func1_Sub1")()
		t.Parallel()
		// ...
	})

	t.Run("Func1_Sub2", func(t *testing.T) {
		defer trace("Func1_Sub2")()
		t.Parallel()
		// ...
	})
	// ...
}
We’ve added two subtest functions, both of which call the t.Parallel() method.
The results of running this are shown below.
=== RUN Test_Func1
Test_Func1 entered
=== RUN Test_Func1/Func1_Sub1
Func1_Sub1 entered <- Func1_Sub1 starts
=== PAUSE Test_Func1/Func1_Sub1 <- Func1_Sub1 pauses
=== RUN Test_Func1/Func1_Sub2
Func1_Sub2 entered <- Func1_Sub2 starts
=== PAUSE Test_Func1/Func1_Sub2 <- Func1_Sub2 pauses
Test_Func1 returned <- Test_Func1 call returns(*)
=== CONT Test_Func1/Func1_Sub1 <- Func1_Sub1 resumes
Func1_Sub1 returned <- Func1_Sub1 completes
=== CONT Test_Func1/Func1_Sub2 <- Func1_Sub2 resumes
Func1_Sub2 returned <- Func1_Sub2 completes
--- PASS: Test_Func1 (0.00s) <- Test_Func1 results displayed
--- PASS: Test_Func1/Func1_Sub1 (0.00s)
--- PASS: Test_Func1/Func1_Sub2 (0.00s)
=== RUN Test_Func2 <- Test_Func2 is not run until this point
Test_Func2 entered
=== PAUSE Test_Func2
=== RUN Test_Func3
Test_Func3 entered
Test_Func3 returned
--- PASS: Test_Func3 (0.00s)
=== RUN Test_Func4
Test_Func4 entered
=== PAUSE Test_Func4
=== RUN Test_Func5
Test_Func5 entered
Test_Func5 returned
--- PASS: Test_Func5 (0.00s)
=== CONT Test_Func2
Test_Func2 returned
=== CONT Test_Func4
Test_Func4 returned
--- PASS: Test_Func4 (0.00s)
--- PASS: Test_Func2 (0.00s)
PASS
As shown in the results above, Test_Func1 does not call the t.Parallel() method, so the program does not start the subsequent Test_Func2 until all subtests within Test_Func1 have completed. In other words, if no top-level test function calls the t.Parallel() method, the tests in the package are run sequentially, one top-level test function at a time. Of course, if subtest functions started with t.Run() within a top-level test function call the t.Parallel() method, those subtest functions are run in parallel with each other.
There’s something else to note in the results.
Operation 2: If a subtest function started with t.Run() calls the t.Parallel() method, the subtest function pauses at that call and remains paused until the call to its parent top-level test function returns. (This behavior is the same whether or not the parent top-level test function called the t.Parallel() method.)
In other words, we can express Operation 2 as follows.
Operation 2 (expressed differently): If a subtest function started with t.Run() calls the t.Parallel() method and is paused by it, the subtest function resumes only after the call to the parent top-level test function returns.
Combining Operation 1 and Operation 2, we can state that “in order to improve parallelism as much as possible, the t.Parallel() method must be called by both the top-level test function and its subtest functions.” Doing so means that all subtest functions in the package that call the t.Parallel() method can run in parallel at once.
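The conclusion above can be sketched as a table-driven test in which both the top-level test function and every subtest call t.Parallel(). This is only a minimal sketch: the normalize function and its test cases are hypothetical and exist purely for illustration.

```go
package main

import (
	"strings"
	"testing"
)

// normalize is a hypothetical function under test, used only for illustration.
func normalize(s string) string {
	return strings.ToLower(strings.TrimSpace(s))
}

func TestNormalize(t *testing.T) {
	// Top-level call: lets this test run alongside the other top-level
	// tests in the package that also call t.Parallel() (Operation 1).
	t.Parallel()

	cases := []struct {
		name, in, want string
	}{
		{"trims spaces", "  go  ", "go"},
		{"lowercases", "GoLang", "golang"},
		{"empty", "", ""},
	}
	for _, tc := range cases {
		tc := tc // copy; required before Go 1.22, where the loop variable is shared
		t.Run(tc.name, func(t *testing.T) {
			// Subtest call: pauses here until TestNormalize returns,
			// then resumes in parallel with its siblings (Operation 2).
			t.Parallel()
			if got := normalize(tc.in); got != tc.want {
				t.Errorf("normalize(%q) = %q, want %q", tc.in, got, tc.want)
			}
		})
	}
}
```

Saved in a file ending in _test.go and run with go test -v, the subtests should show === PAUSE followed by === CONT lines like those in the logs above.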
Parallel level
I just said that all subtest functions would run in parallel at once, but in reality, the number of test functions that can run simultaneously is limited. This limit is specified using the -parallel flag.
-parallel n
Allow parallel execution of test functions that call t.Parallel.
The value of this flag is the maximum number of tests to run
simultaneously; by default, it is set to the value of GOMAXPROCS.
Note that -parallel only applies within a single test binary.
The 'go test' command may run tests for different packages
in parallel as well, according to the setting of the -p flag
(see 'go help build').
If the flag is not explicitly specified, the limit is the value of GOMAXPROCS, which can be set with the GOMAXPROCS environment variable. If GOMAXPROCS is not explicitly set, it is equal to the number of (logical) CPUs.
If many of the tests, including subtests, access a database, consider explicitly setting the parallel level to a number larger than the number of CPUs. Otherwise, tests will spend much of their time waiting for communication while the CPUs sit idle. In contrast, specifying a large value will not improve performance if the tests perform CPU-heavy computation.
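To check what the default parallel level would be on a given machine, a small program like the following can help. It is only a sketch: the helper name defaultParallel is an assumption made for this example. It relies on runtime.GOMAXPROCS(0), which reports the current setting without changing it.

```go
package main

import (
	"fmt"
	"runtime"
)

// defaultParallel returns the value that -parallel falls back to when the
// flag is not given: the current GOMAXPROCS setting.
// (The function name is hypothetical, chosen for this example.)
func defaultParallel() int {
	// An argument below 1 queries the setting without modifying it.
	return runtime.GOMAXPROCS(0)
}

func main() {
	fmt.Println("logical CPUs:", runtime.NumCPU())
	fmt.Println("GOMAXPROCS (default for -parallel):", defaultParallel())
}
```

If the printed value is too low for database-bound tests, a larger limit can be passed explicitly, for example go test -parallel 32 ./... (32 is an arbitrary value used for illustration).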
defer statement and t.Cleanup() method
Some caution is required when using the defer statement or the t.Cleanup() method to run post-processing once a test is complete. Basic considerations for top-level test functions are listed below.
- If a top-level test function does not contain subtest functions using the t.Run() method, either the defer statement or the t.Cleanup() method may be used to write the post-process.
- If a top-level test function contains subtest functions using the t.Run() method but none of these subtest functions call the t.Parallel() method, either the defer statement or the t.Cleanup() method may be used to write the post-process.
- If a top-level test function contains subtest functions using the t.Run() method and at least one of these subtest functions calls the t.Parallel() method, use the t.Cleanup() method to write the post-process.
A function registered with the defer statement is called when the function containing the statement returns. Review the previous execution example. The Test_Func1 function returns before the subtest functions Func1_Sub1 and Func1_Sub2 complete (Operation 2). Therefore, any function deferred in Test_Func1 with the defer statement is called before the paused Func1_Sub1 and Func1_Sub2 subtest functions resume (note where Test_Func1 returned is displayed in the execution results above).
For example, imagine a post-process that deletes table records created by a subtest function. If the top-level test function registers this post-process with a defer statement and the subtest function calls the t.Parallel() method, the post-process runs before the rest of the subtest function is executed. In this case, instead of using the defer statement, you should use the t.Cleanup() method to write the post-process.
The description of the t.Cleanup() method is as follows.
func (c *T) Cleanup(f func())
Cleanup registers a function to be called when the test and all its subtests
complete. Cleanup functions will be called in last added, first called
order.
The description indicates that a function registered with the t.Cleanup() method is called once the test and all of its subtests complete.
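The difference between defer and t.Cleanup() can be sketched as follows. This is an illustrative example only: the store type is a hypothetical in-memory stand-in for a database table, not anything from the testing package or from Merpay’s code.

```go
package main

import (
	"sync"
	"testing"
)

// store is a hypothetical in-memory stand-in for a database table.
type store struct {
	mu   sync.Mutex
	rows map[string]bool
}

func newStore() *store { return &store{rows: make(map[string]bool)} }

// insert adds a row with the given id.
func (s *store) insert(id string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.rows[id] = true
}

// has reports whether a row with the given id exists.
func (s *store) has(id string) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.rows[id]
}

// clear removes all rows; it plays the role of the post-process.
func (s *store) clear() {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.rows = make(map[string]bool)
}

func TestStore(t *testing.T) {
	s := newStore()
	s.insert("seed")

	// A `defer s.clear()` here would run when TestStore returns, which is
	// before the paused parallel subtests resume, so they would find the
	// store empty. t.Cleanup runs only after all subtests have completed.
	t.Cleanup(s.clear)

	for _, name := range []string{"reader-1", "reader-2"} {
		t.Run(name, func(t *testing.T) {
			t.Parallel() // pauses until TestStore returns
			if !s.has("seed") {
				t.Error("seed row was removed before the subtest ran")
			}
		})
	}
}
```

Replacing the t.Cleanup call with defer s.clear() should make both subtests fail, which is a quick way to observe Operation 2 in practice.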
So, what about a post-process within a subtest function written using the t.Run() method? The considerations look a lot like the three mentioned above, just with the subjects changed. Let’s take a look.
- If a subtest function does not contain any nested sub-subtest functions using the t.Run() method, either the defer statement or the t.Cleanup() method may be used to write the post-process.
- If a subtest function contains nested sub-subtest functions using the t.Run() method but none of these sub-subtest functions call the t.Parallel() method, either the defer statement or the t.Cleanup() method may be used to write the post-process.
- If a subtest function contains nested sub-subtest functions using the t.Run() method and at least one of these sub-subtest functions calls the t.Parallel() method, use the t.Cleanup() method to write the post-process.
If you’d rather not memorize these six considerations for top-level test functions and subtest functions, you could instead adopt a project-wide rule of always using t.Cleanup() to write post-processes in any test code that uses the t.Parallel() method.
Summary
You might think that tests will run in parallel as expected as long as the t.Parallel() method is called. However, you need to keep several points in mind, as discussed in this article.
I’ll summarize the key points below.
- The -p flag specifies how many packages’ tests may be run in parallel as separate processes. -p=1 causes packages to be tested one at a time.
- Calling the t.Parallel() method causes top-level test functions or subtest functions in a package to run in parallel.
- A test function calling the t.Parallel() method (including the top level) pauses at that call and does not resume until its parent test function call returns.
- By default, the parallel level of the t.Parallel() method is the value of GOMAXPROCS. To change it explicitly, either specify the value using the -parallel flag, or set the GOMAXPROCS environment variable.
- Decide whether to use the t.Cleanup() method or the defer statement for post-processes within test functions based on whether or not the included subtest functions call the t.Parallel() method.
- Even if the t.Parallel() method is used, tests from multiple packages are never run at the same time within a single test process.
This article summarized some points I noticed while maximizing test parallelization for Merpay (microservice) packages. I was able to significantly reduce test times for these packages (to 10% or less of the original). I didn’t cover the specifics of programming for parallel tests in this article, but I’d like to if I get the chance.