TL;DR: this is a discussion of the pros and cons of version control layout alternatives. There is a ton of reading material about monorepo vs. multirepo. In this article, I argue that the decision should be derived mainly from production architecture.
Version control layout in a microservices architecture is a basic strategic decision that will later dictate the options available for the next strategic decisions. Version control layouts are usually depicted as a battle between monorepos and multirepos, and each has a long list of pros and cons, which I'll discuss shortly. However, I'd like to argue that the strongest consideration should be correctly capturing production architecture. This will give us stronger validation and assurances before going live.
A monorepo is a single version control repository containing several applications, libraries and components. Typically, these components will compose a system. A simple example:
my-projects
└── my-monorepo-repository <this is a repository>
    ├── my-backend-application
    ├── my-frontend-application
    └── my-shared-code
The multirepo approach advocates a dedicated version control repository for each individual component. The same example, now with multirepo:
my-projects
├── my-backend-application-repository <this is a repository>
├── my-frontend-application-repository <this is a repository>
└── my-shared-code-repository <this is a repository>
Hybrid approaches combine the two. Tools such as meta and Git-X-Modules allow us to create a dedicated repository for each component plus a parent repository that references all sub-repositories and groups them together. This provides a wrapper for synchronizing operations across more than one repo, which can help reduce the time and complexity of such operations.
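To make the idea of a wrapper concrete, here is a minimal Python sketch of fanning one git command out over several component repositories. It is not how meta or Git-X-Modules are implemented; the repository names are taken from the multirepo example above, and `plan_commands` and `run_everywhere` are hypothetical helpers introduced only for illustration.

```python
import subprocess
from pathlib import Path

# Component repositories from the multirepo example above.
REPOS = [
    "my-backend-application-repository",
    "my-frontend-application-repository",
    "my-shared-code-repository",
]


def plan_commands(root, repos, git_args):
    """Return (working_directory, argv) pairs, one per repository."""
    return [(Path(root) / repo, ["git", *git_args]) for repo in repos]


def run_everywhere(root, repos, git_args):
    """Run the same git command in every repository and collect exit codes."""
    results = {}
    for cwd, argv in plan_commands(root, repos, git_args):
        completed = subprocess.run(argv, cwd=cwd, capture_output=True, text=True)
        results[cwd.name] = completed.returncode
    return results


if __name__ == "__main__":
    # Dry run: show what would be executed without touching any repository.
    for cwd, argv in plan_commands("my-projects", REPOS, ["pull", "--ff-only"]):
        print(cwd, " ".join(argv))
```

Separating the plan from the execution keeps the sketch safe to run anywhere: only `run_everywhere` actually invokes git, and it assumes the repositories are already cloned under the given root.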
Another hybrid approach is to use git subtree split to split a monorepo into several dedicated component repositories and manage them separately, while still allowing explicit merging and syncing.
In the series intro, we've defined a consistent version guarantee between components A and B as:
Each version of component A is guaranteed to interact exclusively with a certain version of component B.
If you haven't already, quickly view the intro section to better understand guaranteed version interaction and to view some examples.
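One way to read this definition is as a check over observed interactions. The sketch below is an illustration under assumptions: the `(a_version, b_version)` pairs, the version strings and the function name `has_consistent_version_guarantee` are hypothetical and not taken from the series intro.

```python
from collections import defaultdict


def has_consistent_version_guarantee(interactions):
    """Return True if every version of A was seen with exactly one version of B.

    `interactions` is an iterable of (a_version, b_version) pairs, for example
    collected from request logs or a deployment manifest.
    """
    seen = defaultdict(set)
    for a_version, b_version in interactions:
        seen[a_version].add(b_version)
    return all(len(b_versions) == 1 for b_versions in seen.values())


# A and B are always released together: the guarantee holds.
lockstep = [("a-1.0", "b-1.0"), ("a-1.1", "b-1.1"), ("a-1.1", "b-1.1")]

# Here a-1.1 talks to both the old and the new version of B.
mixed = [("a-1.0", "b-1.0"), ("a-1.1", "b-1.0"), ("a-1.1", "b-1.1")]

print(has_consistent_version_guarantee(lockstep))  # True
print(has_consistent_version_guarantee(mixed))     # False
```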
So, what's the decision-making process? I believe that matching the version control layout to production architecture will have the strongest ROI. It will empower our development teams with highly trusted automated testing processes and increase production reliability while maintaining maximum simplicity.
In a consistent version environment, monorepos provide everything we need in our automated testing while keeping everything as simple and smooth as possible.
For inconsistent version environments, multirepos are required to automatically validate backward compatibility and provide production-oriented testing to increase reliability.
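For the inconsistent case, validating backward compatibility roughly means testing a candidate version of one component against every version of the other component that it may meet in production. The following is a minimal sketch of building that test matrix. The version strings and the helpers `compatibility_matrix` and `untested_pairs` are hypothetical; a real pipeline would get these lists from its release tooling.

```python
from itertools import product


def compatibility_matrix(candidate_versions, live_versions):
    """Pair every candidate version of A with every version of B still in production."""
    return list(product(candidate_versions, live_versions))


def untested_pairs(matrix, passed):
    """Return the pairs that have no passing test run yet."""
    return [pair for pair in matrix if pair not in passed]


candidates_a = ["a-2.0"]        # version about to be released
live_b = ["b-1.4", "b-1.5"]     # versions currently serving traffic

matrix = compatibility_matrix(candidates_a, live_b)
passed = {("a-2.0", "b-1.5")}   # results reported by the test pipeline

print(untested_pairs(matrix, passed))  # [('a-2.0', 'b-1.4')]
```

Per the argument above, this kind of pairing is more natural to express when each component has its own repository and release history, since each pipeline can pin the other component to a specific released version.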
So what about other considerations? Let's quickly cover them.
Searchability
Monorepos take a clear win here. Cross-component search and navigation are much easier, and jumping to definitions and documentation is straightforward. This is also possible in multirepo environments, even without cloning all the repos: GitHub, for example, provides organization-level search and jump-to-definition. However, monorepos are definitely built for this.
Coding Standards
If consistent coding standards are important, monorepos get a slight advantage here. Automated tools are required to enforce policies and standards either way, and they work the same for monorepos and multirepos. However, teams who share the same repo are more likely to retain a shared culture at scale.
Version History and Information
Multirepos take a strong win here. Each component has its own dedicated release and version control history. There might be tools that give monorepos similar capabilities, but I'm not familiar with any. If you are, please let me know!
Parallelism
Multirepos take a slight win here: with fewer people committing to each repository, there is a somewhat smaller chance of conflicts and concurrent merges, and less need to pull frequently.
Access Control
Multirepos have granular control over permissions and access control per component. Monorepos might support it too, depending on the version control platform and its features.
Cross Component Refactoring
This point is really a double-edged sword. Monorepos definitely make it faster to refactor across multiple components. However, in some cases, deploying such changes can get dangerous.
Cloning Time
Cloning can take many hours in extreme cases of monorepos. On the other hand, it's a one-time process and then you never have to deal with cloning again. I'd say it's a matter of taste.
Dependencies & Code Sharing
Strong win for monorepos. Dependencies and code sharing become as simple and straightforward as possible. Because this is such an important consideration, a wide variety of tooling and techniques exists to make it workable in multirepos as well. However, since there is no truly simple solution for multirepos, monorepos win here mainly on the ability to remain clear and simple.
Generally speaking, tooling exists to support both layouts on every one of these considerations if needed. The strongest point of monorepos is simplicity. However, in inconsistent version environments, it might be worth sacrificing simplicity to gain production reliability, which gives the edge to multirepos.
Have I missed anything? Please share your thoughts and let me know.
Special thanks to my superstar colleague Vlad Mayakov for helping in making sense and putting it all together.