The 15th day’s post of Merpay Advent Calendar 2019 is brought to you by Robert, a Backend Engineer @ Merpay.
Hi! In my previous post (tech.mercari.com) I talked a little bit about the different challenges we face.
Apart from security, we also need to worry about stability. So, in this post I’ll talk a little bit about that.
What do you mean by stability?
When I say stability, I mean we need to make sure our service continues operating under heavy load even when it gets accidentally load tested on production 🙂
As Merpay uses a microservices-based architecture, it is of course important that all services are independently stable, but this time I’ll focus on a specific use case.
What happens when that time of the month comes?
I’m talking about the due date our users choose when they set up Merpay Smart Payments.
When that time comes, the user has several options for repayment. One of those options is an automatic system to transfer funds from a previously connected bank account.
As the number of banks we support increases, we get more and more users who connect their bank accounts.
This means that we get more and more requests for real-time bank transfers for the purpose of repayment.
So… what’s the problem?
Of course, you can be like, just call the bank’s API and be done with it, right?
Well, most banks and the underlying networks in Japan impose a limit on the number of concurrent transactions they can process.
Hmm, shouldn’t it be their responsibility to handle the load?
Well, yes and no. We also need to be mindful users and properly rate-limit the requests we make.
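To make "being a mindful user" a bit more concrete, here is a minimal sketch of capping in-flight requests per bank with a semaphore. It is an illustration only, not Merpay's actual code: the bank names, the cap values, and the `call_with_cap` helper are all made up for this example.

```python
import threading

# Hypothetical per-bank caps on concurrent requests (assumed values,
# not real limits published by any bank).
MAX_CONCURRENT = {"bank_a": 2, "bank_b": 5}

# One semaphore per bank; each permit is one in-flight request.
_semaphores = {
    bank: threading.BoundedSemaphore(limit)
    for bank, limit in MAX_CONCURRENT.items()
}


def call_with_cap(bank, send, transfer):
    """Run send(transfer) while holding one of the bank's concurrency slots.

    Callers block here when the bank is already at its cap, so the bank
    never sees more than MAX_CONCURRENT[bank] requests at once.
    """
    with _semaphores[bank]:
        return send(transfer)
```

A worker pool of any size can call `call_with_cap("bank_a", send, transfer)`; the semaphore, not the pool size, decides how many requests reach the bank at the same time.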
OK, let’s try to play nice
As Merpay is still a startup company at its heart, the first iteration of repayments for Merpay Smart Payments worked by making small batches from all of the bank transfers to process and gradually sending them over to the bank service for processing.
This batching of bank transfers for the specific purpose of repayment happened in a different microservice. But because we’ve also seen the number of bank transfers increase over time, we decided we needed a system that can process bank transfers in a smarter way 😄
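For readers who want a picture of that first iteration, the "small batches, sent gradually" idea can be sketched roughly like this. This is a simplified illustration under my own assumptions, not the real implementation: `batches`, `send_gradually`, the batch size, and the pause length are all hypothetical.

```python
import time
from itertools import islice


def batches(transfers, size):
    """Yield consecutive lists of at most `size` transfers."""
    it = iter(transfers)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def send_gradually(transfers, size, pause_seconds, submit, sleep=time.sleep):
    """Submit transfers batch by batch, pausing between batches.

    `submit` stands in for the call to the bank service; `sleep` is
    injectable so the pacing can be tested without waiting.
    Returns the number of transfers submitted.
    """
    sent = 0
    for index, batch in enumerate(batches(transfers, size)):
        if index > 0:
            sleep(pause_seconds)  # give the bank time to drain the last batch
        submit(batch)
        sent += len(batch)
    return sent
```

The weakness of this shape is visible in the signature: the caller has to pick `size` and `pause_seconds`, which means it needs to know how much each bank can take.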
Smart Rate Limiting
It should just be called plain-old rate limiting, but that doesn’t sound as cool 😎
For the second iteration of processing repayments, we decided to process them using a queue.
Because we run on GCP, the go-to solution is Cloud Pub/Sub (cloud.google.com).
While Pub/Sub scales well, it’s still a very basic solution and doesn’t really work all that well by itself. As such, we built a custom processing pipeline on top of it.
The custom processing pipeline allows us to vary the rate limit depending on the bank and has a dynamic back-off algorithm for graceful handling of retries. In the future we will also be able to leverage this pipeline to reschedule bank transfers depending on whether a bank is in maintenance, prioritize different types of bank transfers, and more.
The microservice responsible for creating the batches no longer has to worry about how rate limiting should be done for each bank; it just needs to be a mindful user of the bank service.
With this, we have moved the bank-domain-specific parts out of the microservice that handles Merpay Smart Payments to where they belong: the bank service.
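I can't share the pipeline itself here, but the two ideas (a per-bank rate limit and a back-off for retries) can be sketched with a token bucket and exponential back-off with jitter. Treat this as one possible shape under stated assumptions, not a description of what we run: the token bucket, the jitter formula, the message layout, and the names `TokenBucket`, `backoff_delay`, and `handle` are all chosen for illustration, and no real Pub/Sub API is used.

```python
import random
import time


class TokenBucket:
    """Allow up to `rate_per_sec` sends per second, with bursts up to `capacity`."""

    def __init__(self, rate_per_sec, capacity, clock=time.monotonic):
        self.rate = rate_per_sec
        self.capacity = capacity
        self.tokens = float(capacity)
        self.clock = clock
        self.last = clock()

    def try_acquire(self):
        """Take one token if available; never blocks."""
        now = self.clock()
        # Refill in proportion to elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


def backoff_delay(attempt, base=1.0, cap=60.0, rng=random.random):
    """Exponential back-off with full jitter.

    Returns a delay in [0, min(cap, base * 2**attempt)) seconds.
    """
    return rng() * min(cap, base * (2 ** attempt))


def handle(message, send, buckets, rng=random.random):
    """Process one queued transfer.

    `message` is a plain dict with hypothetical keys "bank" and "attempt".
    Returns None when the transfer was sent, or the number of seconds to
    wait before the message should be redelivered.
    """
    bucket = buckets[message["bank"]]
    if not bucket.try_acquire():
        return backoff_delay(message["attempt"], rng=rng)
    send(message)
    return None
```

Because each bank has its own bucket, a slow or strict bank only delays its own transfers, and the jitter keeps retried messages from all coming back at the same instant.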
Future
As I said, this is just the second iteration, and we already have plans to build on it to make the user experience even better.
If you’re interested in working on these kinds of problems, feel free to look at our openings 😉
Tomorrow’s blog post, the 16th in the Advent Calendar, will be written by @akifumi. Please look forward to it! 🙂