Today, I want to share a project I have been working on for a while that has now been officially open sourced!
What is Testdeck?
Testdeck is a framework for integration, end-to-end (E2E), and security testing of gRPC microservices written in Golang and deployed in Google Kubernetes Engine (GKE). Its features include:
- Integration/E2E testing for gRPC and HTTP endpoints
- Fuzz testing
- Injection of malicious payloads (similar to Burp Suite’s Intruder)
- Utility methods for gRPC/HTTP requests
- Connecting a debugging proxy such as Charles or Burp Suite to intercept, analyze, modify, or replay requests while testing
As mentioned in this blog post, microservices at Mercari are deployed into a Kubernetes cluster. Prior to Testdeck, services were tested using only mocks. Since the service’s interactions with real components in the system weren’t being tested, this could not be called “true” integration or E2E testing.
The concept of Testdeck is to deploy a temporary test service into the cluster that will run a set of tests to call the service’s endpoints directly and verify its responses for correct behavior. In other words, the test service will interact with the service just as a real consumer service would. Since this test service is deployed as a Kubernetes job, the pod will automatically stop running after test execution has completed.
Testdeck test cases are divided into four stages: Arrange, Act, Assert, and After. All stages are optional, so you can use only the ones that are relevant to your test. The reason for this design is discussed later in this article.
A simple test case would look something like the code sample below. Since Testdeck is built on Golang’s native testing library, you can implement and run test cases the exact same way you do with unit tests!
Our overall system architecture is as follows:
- The microservice is deployed into a Kubernetes pod using Spinnaker
- Tests for the microservice are deployed to a different pod, also using Spinnaker. The tests run as a Kubernetes job
- Test results are saved to a database
- The team receives a Slack notification with Pass or Fail, and a link to the full test run report
- The team can view the full test run report on the dashboard (which reads test result data from the database)
Below are five advantages of Testdeck over other similar testing frameworks:
1) Easy Debugging
As mentioned earlier, Testdeck test cases are separated into four stages so that when a test case fails, you have a rough idea of where the failure occurred. We came up with this design because QA’s E2E tests tend to be long, with multiple setup steps for creating test data such as test accounts, so it is helpful to be able to quickly isolate the step at which a test failed:
- If a test case fails at Arrange or After, we know that something the test depends on (e.g. test account and test data) is broken
- If a test case fails at Act, most likely there is something wrong with the service itself (e.g. the endpoint returned an error, the service could not be reached, etc.)
- If a test case fails at Assert, the response returned is different from what was originally expected. This could mean that either the response format or spec has changed so the test case needs to be updated, or there is a bug
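The triage rules above can be written down as a small lookup. This is only an illustration of the reasoning, not part of Testdeck; the `Diagnose` function is hypothetical.

```go
package main

import "fmt"

// Diagnose maps the stage at which a test failed to the likely cause
// described in the list above. The function is illustrative only.
func Diagnose(stage string) string {
	switch stage {
	case "Arrange", "After":
		return "a test dependency is broken (e.g. test account or test data)"
	case "Act":
		return "the service itself is at fault (e.g. endpoint error or service unreachable)"
	case "Assert":
		return "the response differs from the expectation (spec changed or there is a bug)"
	default:
		return "unknown stage"
	}
}

func main() {
	for _, s := range []string{"Arrange", "Act", "Assert", "After"} {
		fmt.Printf("%-7s -> %s\n", s, Diagnose(s))
	}
}
```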
2) Metrics Collection
It is a best practice in test automation to collect metrics about tests, such as the number of failed tests, failure reasons, and test run duration. These metrics can later be used in statistical analysis to identify and fix issues such as “flaky” tests (test cases that fail randomly and are unreliable).
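As one example of such analysis, the sketch below flags a test as flaky when it has both passed and failed in the collected results. This is a deliberately simple heuristic that assumes all results come from runs against the same code version; the `Result` type and `FlakyTests` function are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// Result is a hypothetical record of one test execution.
type Result struct {
	Test   string
	Passed bool
}

// FlakyTests returns, sorted by name, the tests that both passed and failed
// at least once in the given results.
func FlakyTests(results []Result) []string {
	passed := map[string]bool{}
	failed := map[string]bool{}
	for _, r := range results {
		if r.Passed {
			passed[r.Test] = true
		} else {
			failed[r.Test] = true
		}
	}
	var flaky []string
	for name := range passed {
		if failed[name] {
			flaky = append(flaky, name)
		}
	}
	sort.Strings(flaky)
	return flaky
}

func main() {
	history := []Result{
		{"TestLogin", true}, {"TestLogin", false}, {"TestLogin", true},
		{"TestSearch", true}, {"TestSearch", true},
		{"TestCheckout", false}, {"TestCheckout", false},
	}
	fmt.Println(FlakyTests(history)) // prints [TestLogin]
}
```

A test that fails consistently (like `TestCheckout` here) is not flagged, since a consistent failure points to a real problem rather than an unreliable test.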
Testdeck is designed to save test results to your specified database for use in reporting or statistical analysis. To configure metrics collection, simply follow these steps:
- Set the environment variable DB_URL to the root URL of your DB’s API methods
- Clone Testdeck and modify db.go to fit your database’s schema (the statistics that you want to collect will likely differ from ours so we just provided ours as an example)
If you are collecting metrics in a database, you may also want to visualize them on a dashboard for stakeholders (e.g. QA, PMs, etc.), as this is also considered a best practice in test automation. Below is a screenshot of a simple dashboard that we made to show test results, including the lifecycle stage at which each test failed and links to the output logs for use in debugging:
3) Same technology stack used by engineers
Testdeck was written in Golang, based on Golang’s native testing library, and deployed into the development cluster using Spinnaker in order to use the exact same technology stack as the backend engineers. This is to encourage “Shift-Left Testing”: instead of waiting until the Testing stage of the Software Development Lifecycle (SDLC) for QA to begin testing, testing is “shifted left” into the Development stage. Engineers take responsibility for the quality of their code by creating their own automated tests as they implement features. By choosing the same technology stack, we hope that engineers can seamlessly adapt to Shift-Left Testing without having to spend time learning or integrating technologies that they are not familiar with.
4) Easy to extend and integrate with other Golang libraries
Since Testdeck is written purely in Golang, it can easily integrate other Golang libraries such as google/gofuzz (which we use to generate random input for fuzz testing) and stretchr/testify (a common library used for assertions in unit testing). It is also easy to add your own features as needed, making it a highly customizable tool that can be extended and modified to suit your team’s needs.
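The sketch below shows the general shape of a fuzz check: feed random input to a function and confirm that it never panics. To keep it runnable with the standard library alone, `randomString` is a small stand-in for a fuzzing library such as google/gofuzz, not that library's API, and `ValidateNickname` is a hypothetical function under test.

```go
package main

import (
	"fmt"
	"math/rand"
	"unicode/utf8"
)

// randomString is a stdlib stand-in for a fuzzing library. It returns up to
// maxLen random bytes, including control bytes and invalid UTF-8.
func randomString(rng *rand.Rand, maxLen int) string {
	b := make([]byte, rng.Intn(maxLen+1))
	for i := range b {
		b[i] = byte(rng.Intn(256))
	}
	return string(b)
}

// ValidateNickname is a hypothetical function under test.
func ValidateNickname(s string) error {
	if !utf8.ValidString(s) {
		return fmt.Errorf("nickname is not valid UTF-8")
	}
	if n := utf8.RuneCountInString(s); n < 1 || n > 20 {
		return fmt.Errorf("nickname must be 1 to 20 characters, got %d", n)
	}
	return nil
}

// FuzzValidate feeds n random inputs to the validator and counts how many
// were accepted. A panic inside the validator is converted into an error.
func FuzzValidate(seed int64, n int) (accepted int, err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("validator panicked: %v", r)
		}
	}()
	rng := rand.New(rand.NewSource(seed))
	for i := 0; i < n; i++ {
		if ValidateNickname(randomString(rng, 40)) == nil {
			accepted++
		}
	}
	return accepted, nil
}

func main() {
	accepted, err := FuzzValidate(1, 1000)
	fmt.Printf("accepted %d of 1000 random inputs, err: %v\n", accepted, err)
}
```

Using a fixed seed keeps a failing run reproducible, which matters when a fuzz failure needs to be debugged later.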
5) A single tool for both QA and Security testing
During my career transition from Automation Engineer to Security Engineer, I noticed that there are surprisingly many similarities between the two fields. Two examples are: 1) the recent trend for both QA and Security to shift left and rely heavily on automation (“Shift-Left Testing” and “Shift-Left Security”), and 2) the fact that both quality and security must be built into the software rather than treated as an afterthought. I also noticed that there is a natural overlap between QA testing and security testing. The two types of testing are done from the perspectives of different types of users: the normal or slightly curious user, and the malicious user. Some types of testing, such as input validation, error cases, and business logic (particularly around features involving authentication and authorization), overlap, so we came up with the idea of using one tool to conduct both types of testing. By doing so, we can minimize time spent on integrating, learning, and mastering multiple tools, and we can execute automated QA and security tests together to ensure that a baseline of quality and security is always met with any code change.
Setting up Testdeck
For information on how to set up Testdeck for your organization, please see the setup documentation.
Testdeck test cases run inside the Kubernetes cluster, so we use a tool called Telepresence during local development of tests. This allows us to proxy into the cluster and access microservices directly, even when running and debugging our tests locally.
Mercari’s microservices backend has grown significantly over the past few years, and with over a hundred microservices now in existence, it is no longer scalable for QA and security teams to test each code change manually. Our development teams are moving towards a DevOps culture in which backend engineers take full responsibility for all stages of the software development lifecycle for their microservices. We created Testdeck with a focus on ease of use for engineers, extensibility for adding new features or integrating existing Golang libraries in the future, and a one-stop solution for the testing and reporting of both QA and security test cases.
We hope that we have given you some ideas for automating QA and security testing at your company. If you would like to learn more about Testdeck, please see the newly released public repository here.