Migrating SumUp's internal tools
Written by Vitor Falcao
Migrating internal tools can be painful. So let me share our experience migrating from one CI and CD tool to another.
Introduction
All companies have their internal tools. There are a few for communication, another for version control, one more for handling agendas, and the list keeps going.
In this post, I'll focus on our migration from one CI and CD tool to another. More specifically, from BuildKite to GitHub Actions. But I believe the knowledge and strategy apply to most migrations.
Lights, camera, GitHub Actions!
At SumUp, we've used a fantastic tool called BuildKite for years. But, unfortunately, we reached a point where it was not fulfilling our needs anymore, at least not as easily as we wished.
Migration process
It's not easy to change something that hundreds of developers use. It needs collaboration from both sides. So let me tell you how we are doing it.
Reduce friction
Migrating an entire CI and CD pipeline tool for hundreds of projects is not easy. It's not something you can do in one or two weeks. You'll depend on the collaboration of every developer, and you'll need to run learning sessions to flatten the learning curve of the new tool.
The lucky part is that our developers have loved GitHub Actions since day one. We've reached alignment. They understand why we're migrating and can also benefit from it. It's the perfect situation. Having alignment between you and your stakeholders before starting any migration process is fundamental.
To reduce friction, we decided to keep the same behaviour as the old pipelines. We didn't want to change the developers' experience. All we change is the tool; the only difference for them at the beginning should be the link they open.
Adapt to differences
I don't want to get too technical, but a few technical details will give us good examples and some good thoughts.
BuildKite and GitHub Actions are pretty different, so replicating things requires a lot of effort. For example, we hosted BuildKite runners in our Kubernetes clusters inside our network, so they could easily communicate with the applications in our development and staging environments. GitHub Actions runners are somewhere out in the cloud, far away from us.
Our developers also maintain and evolve their pipelines, so they must change their mindset. They have to start thinking differently to solve the same problems. A new tool or language is always an iceberg. When you think you know a lot, you see how 90% of it is underwater, and you have only scratched the surface. GitHub Actions is an easy-to-use tool, but it still follows this principle.
BuildKite has plugins that work well, but they are far from matching the variety of the Actions Marketplace. That variety also brings security concerns: you shouldn't run unknown code in your private repository pipelines, the way many people do with those heavy NPM dependencies.
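One way to make the "unknown code" concern concrete is to list which third-party Actions a workflow references. The sketch below is only an illustration, not something SumUp describes using: the function name `untrusted_actions`, the example workflow, and the trusted owner set are all hypothetical, and the regular expression only covers the common `uses: owner/repo@ref` form.

```python
import re

# Matches lines such as "- uses: owner/repo@ref" or "uses: owner/repo/path@ref".
# Local actions ("./path") and "docker://" references are deliberately not matched.
USES_PATTERN = re.compile(r"^\s*(?:-\s*)?uses:\s*([\w.-]+)/[^@\s]+@\S+", re.MULTILINE)


def untrusted_actions(workflow_text, trusted_owners):
    """Return the sorted owners of referenced actions that are not in trusted_owners."""
    owners = USES_PATTERN.findall(workflow_text)
    return sorted({owner for owner in owners if owner not in trusted_owners})


# Hypothetical workflow fragment, used only to exercise the function.
example_workflow = """
steps:
  - uses: actions/checkout@v4
  - name: Deploy
    uses: some-stranger/deploy-action@v1
"""

flagged = untrusted_actions(example_workflow, trusted_owners={"actions"})
print(flagged)
```

A check like this only tells you who published an Action; someone still has to review the code before that owner goes on the trusted list.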
As you can see, a migration creates new technical and non-technical challenges. These challenges bring risks that we must mitigate. There are several ways of doing it, for example, by running learning sessions or creating documentation. You could also gather a technical group and have RFCs and discussions on how to overcome these technical challenges.
The important lesson here is to involve as many people as possible without decreasing productivity.
Break things
Even though we're trying to keep things as before to reduce friction, it's impossible to replicate everything. So some things will change, and you should get over it. Embrace these changes.
Having space to create changes is fantastic. It opens doors to better solutions by gathering previous experience data. In addition, we'll have both tools running simultaneously at the beginning, so if one thing doesn't work in Actions, you'll go to BuildKite, and it'll work because we've changed only how it works, not the final result.
Automate it
Are you going to manually migrate a thousand YAML files from BuildKite to Actions? You shouldn't. Automating the migration process is a good option, and it works even better when the company's technology stack doesn't change much from project to project, as is the case at SumUp.
We've created the most basic template for Elixir applications we could have, and then we gave it to the developers. So now, they don't need to wait for us to try it.
Keep the automation as simple as possible. Don't try to make anything marvellous and complex. Developers will bring you feedback and requests, allowing you to evolve your template and make it as flexible as needed; this will be the right time to increase complexity.
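To give an idea of how small such automation can start, here is a minimal sketch of a converter. It is not our actual tooling: the function `convert_pipeline`, the simplified input format (only a `label` and a `command` per step), and the example Elixir steps are assumptions made for illustration, and installing the Elixir toolchain on the runner is left out.

```python
import json


def convert_pipeline(name, steps):
    """Turn a simplified list of pipeline steps into a GitHub Actions workflow dict.

    Each input step is a dict with a "label" and a "command". Any other
    keys a real pipeline might carry are ignored in this sketch.
    """
    # Actions does not check out the repository by default, so add that first.
    job_steps = [{"uses": "actions/checkout@v4"}]
    for step in steps:
        job_steps.append({"name": step["label"], "run": step["command"]})
    return {
        "name": name,
        "on": ["push", "pull_request"],
        "jobs": {"build": {"runs-on": "ubuntu-latest", "steps": job_steps}},
    }


# Hypothetical Elixir pipeline, already parsed from YAML into plain Python data.
elixir_steps = [
    {"label": "Fetch dependencies", "command": "mix deps.get"},
    {"label": "Run tests", "command": "mix test"},
]

workflow = convert_pipeline("CI", elixir_steps)
# Printed as JSON to avoid a third-party YAML dependency in this sketch.
print(json.dumps(workflow, indent=2))
```

Starting from something this plain leaves room to add only what developers actually ask for, such as caching or extra jobs.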
Help the community
The nicest part? SumUp is now contributing to the Actions Marketplace. Whenever we create an Action that can be open sourced without disclosing sensitive information, or one we can strip of sensitive data, we'll make it available. Feel free to contribute to them at any time!
The pricing model change
Specific issues appear depending on the tool and its nature. For example, when migrating CI and CD tools, pricing is always something to consider carefully.
When we decided to migrate, we also considered the pricing. BuildKite bases its pricing model on your active users, while GitHub Actions charges for the minutes you spend. So, again, we had to change our mindset about builds. With BuildKite, we were running entire pipelines on every branch and every PR without caring how many minutes we spent. When we started developing GitHub Actions pipelines, we had to think more about caching and saving time.
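The difference between the two models is easy to see with a bit of arithmetic. The sketch below uses entirely made-up numbers (user count, build count, and both prices are placeholders, not real BuildKite or GitHub rates) and hypothetical function names. It only shows that per-user cost ignores build time, while per-minute cost falls when caching shortens each build.

```python
def per_user_cost(active_users, price_per_user):
    """Monthly cost when the bill depends only on the number of active users."""
    return active_users * price_per_user


def per_minute_cost(builds, minutes_per_build, price_per_minute):
    """Monthly cost when the bill depends on the total build minutes consumed."""
    return builds * minutes_per_build * price_per_minute


# All figures below are invented for illustration only.
users = 100
builds = 10_000

seat_based = per_user_cost(users, price_per_user=10.0)
without_cache = per_minute_cost(builds, minutes_per_build=10, price_per_minute=0.01)
with_cache = per_minute_cost(builds, minutes_per_build=6, price_per_minute=0.01)

print(f"per-user model:            {seat_based:.2f}")
print(f"per-minute, no caching:    {without_cache:.2f}")
print(f"per-minute, with caching:  {with_cache:.2f}")
```

Under the per-user model, shaving four minutes off every build changes nothing on the bill; under the per-minute model, it is the main lever you have.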
Conclusion
The SRE and DevOps team had a lot of issues with BuildKite, so we put a lot of expectations on GitHub Actions. But you shouldn't think of these migration processes as a short-term solution. When we migrate tools that affect several people, we must be careful. Otherwise, they won't want to adapt to the new solution.
Migrating tools slowly means more iterations and more discussions with your stakeholders instead of just typing code and making things happen. You want them to take on the challenge and help. So we must control our anxiety to use all the shiny features the new tool has and keep things as simple as possible.
You'll start the fundamental migration part when everything is ready from a technical perspective. The final step is when users start grasping how the new tool works and evolving it. It is the stage that brings the most value.