Conf42 JavaScript 2023 - Online

Glorious Monolith: Scaling an application without microfrontends

Abstract

Microfrontends are not the only way to scale a project. In this talk, we will discuss how to build a monolithic application that is easy to understand and allows for quick implementation of new functionality, without suffering from a growing codebase.

Summary

  • Maksim Zemskov is a lead frontend engineer at Yandex. He says monoliths are often compared to microfrontends, giving the impression that microfrontends are the best way to build large-scale applications without turning the development process into a nightmare. The talk then delves into building an effective and scalable monolith.
  • Yandex Direct's code base is over 20 years old, and a massive amount of code has been written. The main contribution to system complexity is high coupling between different parts of the system. As a result, the project becomes difficult to move forward and the development process slows down.
  • A microfrontend architecture is the idea of dividing an application into several smaller applications. This separation provides several advantages. Making changes to smaller applications is easier, resulting in a less significant decrease in the team's productivity over time.
  • Microfrontend architecture offers great advantages and scales better than monoliths, but adopting it also comes at a cost. To connect all microfrontend applications into a unified application, a complex infrastructure is required.
  • Software architecture is mostly about rules and constraints that lead to the creation of a flexible and scalable system. To create a good monolith, modules need to meet certain requirements. These requirements are all about how modules are structured and how they are isolated from each other.
  • When it comes to data isolation, things can get a bit trickier. Each module should have its own data storage. If one module needs some data from another module, it should only obtain it through a public API. Isolating code, styles and data allows us to avoid unexpected breakage in an application.
  • A good monolith is built on three key principles: highly isolated modules, a runtime for convenient module development, and rules for predictable decomposition of the system into modules. Developing a monolith like this requires some additional effort, but it's still a lot less effort than adopting a microfrontend architecture.

Transcript

This transcript was autogenerated. To make changes, submit a PR.
Hello everyone. My name is Maksim Zemskov, and I work as a lead frontend engineer at Yandex. Let's start by thinking about what comes to mind when you hear the word monolith. Most likely you think of something huge, indivisible, and heavy. That's totally fine. But when it comes to software development, it's not exactly something you want to work with. You probably remember a project with a large code base that was very difficult to understand and also challenging to make changes to. The complexity of such projects tends to increase rapidly as the code base grows. In other words, these projects don't scale well.
On the other hand, monoliths are often compared to microfrontends, which are frequently discussed at conferences, giving the impression that microfrontends are the best way to build large-scale applications without turning the development process into a nightmare. In this talk, we'll explore whether that's true. We'll also delve into the main issue of monoliths and how microfrontends can help us address it. The term glorious in the title signifies a good monolith, suggesting that not all monoliths are equally bad, or bad at all. Next, we will delve into the topic of building an effective and scalable monolith. By the end, we aim to understand when it becomes necessary to seriously consider adopting a microfrontend architecture, and to determine the extent to which we can rely on the monolith architecture without experiencing any development issues.
So why did I decide to make this presentation? Well, it's pretty simple. I've been in web development for over 13 years, and during that time I've become proficient in building new products and maintaining large existing code bases. For the past three years, I've been leading the infrastructure team at Yandex Direct, which is a major online advertising platform. One of the challenges we faced was the scalability of the project.
And here's the interesting part: we successfully solved it without adopting a microfrontend architecture. We still have a monolith, and I want to share my experience and what we ultimately came to. So let's dive in and explore the world of monoliths.
But before we start, let me share some numbers about Yandex Direct so you can better understand the scale at which our monolith successfully operates. Yandex Direct's code base is over 20 years old, and during this time a massive amount of code has been written. When I joined the project a couple of years ago, the application had several million lines of code, and that was just the frontend part. At that time, every developer, including myself, keenly felt the pain of working with a large monolith that didn't want to get any bigger. In other words, we found that the project architecture didn't scale well to the size of our code base.
So what challenges does a team face in such a situation? Usually, in such a project, several things happen. First, the project reaches a point where it becomes too complex for one developer to fully grasp. It becomes challenging to understand all the code and system functionality. This complexity makes developers hesitant to make changes to the code, as even a minor modification in one part of the system can potentially break important functionality in another. Writing new code becomes difficult too, as it's not always clear where to add it. Multiple solutions to the same problem may exist, as it's often easier to write new code than to understand the existing code. Moreover, to solve any task, developers need to immerse themselves in a vast amount of code, which takes a significant amount of time. As a result, the project becomes difficult to move forward and the development process slows down, leading to a decrease in productivity and in product quality too. The chart illustrates how team productivity decreases as the code base grows.
The main reason for the decrease in productivity is that the complexity of development grows much faster than the size of the code base. This happens because the main contribution to system complexity is not complex algorithms or complex user interfaces, but the high coupling between different parts of the system. For example, in the worst case, two modules or functions depend on each other, resulting in two connections, or dependencies. With three modules, the number of connections can increase to six, and with four modules it can go up to twelve. In general, n modules can have up to n(n - 1) directed dependencies, so coupling grows roughly quadratically with the number of modules. This pattern continues until we end up with a big ball of mud, where the boundaries of any functionality are completely blurred and, to solve any task, we have to delve into the entire code base. So we need to address this issue and aim to reduce the coupling. However, since a monolith is physically a single code base, there are usually no restrictions on importing code anywhere, leading to a rapid increase in the number of interdependencies. To regain control over system complexity, we must learn how to manage the coupling in our code base.
And here come microfrontends. You may have already heard a lot about them, but if not, it's the idea of dividing an application into several smaller applications, like embedding a widget that displays currency exchange rates on a news website. There are many implementation options from a technical point of view, but I won't go into detail about that. What's important is the concept itself, where we divide one big application into several smaller applications that are independent of each other. This separation provides us with several advantages. Firstly, since each microfrontend is a separate application with its own code base, often in a separate repository, we can no longer simply import a function from a random file when we need it.
These applications usually communicate with each other according to specific rules, such as through a public API, events, or something similar. As a result, the coupling between these applications will be significantly lower than between different parts of a monolith. Also, dividing the code into multiple applications, on the one hand, leads to lower productivity at the beginning of a new project, as it incurs additional constraints and significant expenses in preparing a more advanced infrastructure. But on the other hand, as the project evolves, we will experience fewer scaling problems than we would in a monolith. Making changes to smaller applications is easier, resulting in a less significant decrease in the team's productivity over time compared to a monolith.
Another advantage of dividing a monolith into different applications is the ability to distribute responsibility for different parts of the project among different teams. These teams will have the freedom to develop their own processes, choose technologies, define style guides, and determine release cycles. As a result, teams can work at their own pace, with some applications being released more frequently than others. What's more, if a bug happens in one application, it won't require rolling back releases from other teams, which can be quite a painful process.
Well, it turns out that microfrontend architecture offers great advantages and scales better than monoliths, and there is even a certain buzz around this topic that makes us want to try it in our projects. But does this mean that microfrontends are a good fit for every application? I don't think so, because adopting microfrontends brings not only important benefits, but also comes at a cost that is important to be aware of. Firstly, everything becomes more complicated. To connect all microfrontend applications into a unified application, a complex infrastructure is required.
Bundling becomes more challenging, and you need to have a deep understanding of your bundler's capabilities. It will also be necessary to decide how applications communicate with each other and how to prevent conflicts between them on a single page. And if you need server-side rendering, the complexity of the infrastructure increases significantly, as it requires a distributed system of microservices, which has its own set of challenges. It's also necessary to think more about backward compatibility in releases: it's important to always remember the contracts between microfrontend applications and not break them. It would also be more difficult to update shared dependencies. Trust me, if you have ever tried to update the version of React on a bunch of microfrontends, you know it's not the easiest thing to do. And besides, it's not always necessary to split an application among multiple teams. Maybe you have a relatively small product that is being worked on by just one team, or maybe you want to have consistent processes and technologies across all teams so that you can easily transfer engineers between teams and focus on the tasks that are critical at the moment.
So what should we do if the monolith doesn't scale well and microfrontends are complex and often redundant? Well, there is a solution. We can try to take the best of both worlds. By combining the advantages of monoliths and microfrontends, we can create an architecture that can handle the growth of the code base without incurring significant infrastructure costs, especially at the beginning of a new project. In this case, productivity at the beginning might be slightly lower than in a traditional monolith due to the need to set up the architecture and development tools. But as the project evolves, productivity won't decrease as significantly as with a traditional monolith. It will be more similar to that of microfrontends.
To achieve this goal, we can keep all the code in one repository, which allows us to easily reuse our code, have one release, and not worry much about backward compatibility. It also helps us avoid complicating the development, deployment, build, or maintenance of our application. At the same time, we need to preserve the advantages of microfrontends, which enable us to control coupling and system complexity. In other words, we need to find a way to divide the code into smaller, isolated parts. And if you think about it, what prevents us from doing this in a regular monolith? The only thing that actually stops us is the fact that in the code base of a regular monolith, we have complete freedom of action. A glorious monolith differs from a regular one in that it has a well-thought-out architecture, and compliance with this architecture is controlled by automated tools such as linters.
So it's important to understand what exactly architecture is. Software architecture is mostly about rules and constraints that lead to the creation of a flexible and scalable system. So let's talk about the rules and constraints that are essential for building a good monolith. It's worth starting with the introduction of the module concept. The main idea is that we need to divide our code base into separate and loosely coupled parts, which we'll call modules. The clear boundaries and weak dependencies between these modules are exactly what allow us to create a scalable system. In such a system, each module will be responsible for a specific product or technical functionality, depending on the application. Examples of modules can include a cash balance module, a data filtering module, maybe something bigger like a sidebar of an application, or even an entire page. Typically, modules responsible for a large amount of functionality are assembled from smaller modules. Externally, this looks very similar to a microfrontend architecture, but we are still within the same code base and do not incur the infrastructure tax of microfrontends.
Unfortunately, it won't be sufficient to just distribute the entire code base into different directories and consider them modules. To create a good monolith, modules need to meet certain requirements. These requirements are all about how modules are structured, how they are isolated from each other, and how they communicate with each other. Let's start with how things are structured inside modules. Inside a module, there can be everything that exists in a regular monolith application, for example, UI components, styles, the business logic of an application, and even technical things like libraries and frameworks. In general, all the code that exists in an application should be contained within one of the modules. Additionally, it's important that each module is implemented in a similar way, otherwise the team will have difficulty switching between the development of different modules. It's a good idea to limit both the set of technologies and the high-level directory structure. This will allow us to logically separate the code of each module into several segments, each with its own area of responsibility. For example, you can have four segments like on the slide, which I took from the Feature-Sliced Design methodology. However, depending on the project's needs, you can come up with your own set of segments. For example, you can add a separate segment for server-side code, which can contain API endpoints, database connections, and so on. The key is that the entire team clearly understands where to put new code and where to find existing code. Unfortunately, it might be difficult to perfectly synchronize everyone's understanding of what each segment is responsible for. Each engineer might have a slightly different interpretation.
For instance, can a state manager inside the model segment show the user a pop-up notification about the successful completion of a data saving operation? Well, someone might say yes, why not? While someone else might say no, that's the responsibility of the UI segment. We can resolve this ambiguity by introducing import restrictions between different segments, like in the layers of the clean architecture. In particular, we can put all the segments in order and prevent lower segments from depending on higher segments. For example, the UI segment can use both the model and API segments as well as utility code. On the other hand, the model segment can only use the API segment and utility code, but it cannot use anything from the UI. These simple rules highlight the code's responsibility and help us put it in the right place in the module. This makes decision making much easier and also comes with additional benefits. First, all the related code will be in one place within the module. That makes it easier to understand what the module is supposed to do and how it works. Secondly, it prevents the mixing of business logic and UI, which leads to more flexible, composable, and easy-to-understand code. All of this makes it way simpler to work with a large code base and switch between different modules with ease during development. To sum up, the strict structure of modules lets us effectively solve any task in any part of the system, even in modules we are not familiar with.
The next step is to isolate a module from the rest of the system and from other modules. This is a key part of the rules, and it helps us achieve two main goals. First, it makes the module loosely coupled with the other modules. We achieve this by allowing other modules to use only the functionality that the module developer has specifically prepared for this purpose.
And since other modules won't have access to all the inner workings of the module, the number of dependencies between modules won't grow as quickly as in a regular monolith. At the same time, the dependencies themselves will be more obvious and controlled. Which brings us to the second goal, the ability to make changes to the module safely. The developer can confidently make changes to one part of the system without worrying about unexpected bugs in other parts of the system. And to gain that confidence, modules need to be isolated at every level, from code to styles and data.
When it comes to code isolation, it's basically about two things: making sure there aren't any global side effects in your module, and controlling what functionality is exposed to other modules. What do I mean by global side effects? Basically, it's anything that can implicitly change the behavior of other modules. For example, if our module patches some global objects or loads polyfills, it can cause other modules to rely on this behavior. And if the loading order of modules suddenly changes, or we decide to remove some legacy modules, these dependent modules will stop working correctly. That's why global side effects are highly undesirable and should be avoided.
And when I say controlling what functionality is exposed, what I really mean is a set of rules that give us one entry point for a module, which we treat as a contract for how the module interacts with other modules in the system. We call it the public API of the module. So the first thing we need to do to implement such a public API is to create an entry point in each module, like an index file at the root of the module. In this file, we'll define everything that is available for use in other modules. Then we can set up a linter that will prevent other modules from importing anything except that index file. For this we can use ESLint and ready-made plugins such as eslint-plugin-boundaries.
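To make the entry-point rule concrete, here is a minimal TypeScript sketch of the kind of check such a lint rule performs. It is not the implementation of eslint-plugin-boundaries. The `modules/<name>/...` path layout and the function names `moduleOf`, `isEntryPoint` and `isAllowedImport` are assumptions made for illustration only.

```typescript
// Hypothetical layout: every module lives in "modules/<name>/" and exposes
// its public API through "modules/<name>/index.ts".

/** Returns the module a path belongs to, or null if it is outside "modules/". */
function moduleOf(path: string): string | null {
  const match = /^modules\/([^/]+)(\/|$)/.exec(path);
  return match ? match[1] : null;
}

/** True when `importPath` points at a module root or at its index file. */
function isEntryPoint(importPath: string): boolean {
  return /^modules\/[^/]+(\/index(\.tsx?)?)?$/.test(importPath);
}

/**
 * An import is allowed when it stays inside the importing module,
 * or when it targets another module's entry point.
 */
function isAllowedImport(fromFile: string, importPath: string): boolean {
  const source = moduleOf(fromFile);
  const target = moduleOf(importPath);
  if (target === null) return true; // not a module import, e.g. an npm package
  if (source === target) return true; // internal import within one module
  return isEntryPoint(importPath);
}
```

With this check, a file in `modules/sidebar` may import `modules/balance` but not `modules/balance/model/store`. In a real project the same rule would be expressed as linter configuration, so violations are reported in the editor and in CI.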
And how can we describe the public API of a module inside that index file? There are a few options here, mostly depending on the framework you use in your application and your personal preferences. You can simply use ES modules and re-export part of the module's functionality. Alternatively, you can use the dependency injection principle, especially if you are using the Angular or NestJS frameworks. Or you can use an event-driven architecture and connect all the modules by communicating with events sent to some sort of event bus. Each of these options has its advantages and disadvantages. For example, DI makes dependencies between modules less strong, but it does complicate the infrastructure a bit. On the other hand, an event-driven architecture decouples modules even more, but you need to be careful with the module loading sequence so as not to miss any important events.
Let's say we want to use ES modules and simply re-export some of the module's functionality, as shown on the slide. In this example, we are using the React and Redux stack, so we mainly export React components, Redux selectors, and actions. Using ES modules in this case allows us to save on infrastructure, since no additional development is required for DI or event-based architectures, and it works well with code analyzers out of the box. For example, we can easily build a dependency graph of the system and use it for selective test execution. ES modules can also be loaded both statically and dynamically to implement code splitting, and this is available out of the box too. And sometimes we can still use event emitters exported from a module entry point to make the dependency between two modules as weak as possible, but only in cases when it's not crucial to handle all the events and it's safe to lose some of them. One more thing to keep in mind is to control the size of the public API.
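As a rough illustration of an ES module entry point with a deliberately small public API, consider the sketch below. The `balance` module, its state shape, and the names `selectBalance`, `topUp` and `reduceBalance` are hypothetical and are not taken from the slide. To keep the example self-contained, the internals are declared in the same file; in a real project they would live in the module's own files and the index file would only re-export them.

```typescript
// modules/balance/index.ts (hypothetical module and names)

// --- internals: in a real module these live in model/, ui/ and so on ---
interface BalanceRootState {
  balance: { amountCents: number };
}

type TopUpAction = { type: "balance/topUp"; payload: { amountCents: number } };

/** Reducer: stays private, other modules never call it directly. */
function reduceBalance(
  state: BalanceRootState["balance"],
  action: TopUpAction
): BalanceRootState["balance"] {
  return { amountCents: state.amountCents + action.payload.amountCents };
}

/** Selector: reads the balance in whole currency units. */
function selectBalance(state: BalanceRootState): number {
  return state.balance.amountCents / 100;
}

/** Action creator: describes a top-up without exposing the reducer. */
function topUp(amountCents: number): TopUpAction {
  return { type: "balance/topUp", payload: { amountCents } };
}

// --- public API: the only names other modules are allowed to import ---
export { selectBalance, topUp };
export type { BalanceRootState, TopUpAction };
```

Only one selector, one action creator and two types cross the module boundary here, so the reducer and the internal state layout can change without touching other modules.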
It's better to keep it as small as possible, because a larger public API increases the chances of additional dependencies between modules, which in turn makes the system more complex. In general, there are a few factors that contribute to the size of a public API. These include, of course, the number of exports from the entry point and the number of arguments for each export. For instance, exporting a component with a high number of arguments makes it more difficult to use and also more difficult to understand its functionality. It's also worth paying attention to the complexity of the data structures that a module receives or returns. The more unnecessary data is passed between modules, the harder it is to make changes to the internal implementation of a module. And again, it creates more opportunities for additional dependencies in the system. In most cases, these rules should be enough to isolate code between modules, but the main thing is to enforce these restrictions with automated checks and linters, because it's almost impossible to remember all the rules and perfectly synchronize them across the team.
Since we are talking about frontend applications, we also have style sheets, right? It's important to make sure our styles are isolated too, because style sheets have a global scope by default and can affect everything on the page. For instance, two modules might have the same class name, or one of the modules might add some sort of reset CSS and mess up the layout of all other modules in the system. So in order to avoid any unexpected styling issues, we need to make sure we keep our styles isolated from each other. There are a bunch of ways to make it happen, and we've only shown a few on the slide. Each of these options has its own advantages and disadvantages, and the one you choose will mostly depend on the project's requirements and your personal preferences.
But the important thing is that all of them work pretty well for building a good monolith, as long as we stick to some additional agreements. For example, if we go with CSS modules, they work really well for isolating styles, but only as long as we use just classes and pseudo-classes to select elements. Using other selectors could easily cause styles to leak onto elements of other modules. Also, it's better not to import CSS files between different modules. Instead, if we need to override some styles in another module, we can pass, for example, a class name as a component property and add it to the necessary elements. But it's even better to avoid style overrides altogether, as they introduce a dependency on the module loading order and selector specificity. It's also important to be careful with CSS custom properties, since custom properties have a global scope as well and can easily conflict between different modules. So it's better to avoid creating new custom properties inside module styles to prevent any potential visual bugs. And finally, we can enforce all of these rules with the help of the Stylelint linter. This overall isolation allows us to avoid any unexpected visual breakage in an application and safely make changes to module styles.
So when it comes to style sheets, the rules and constraints are quite simple. But when it comes to data isolation, things can get a bit trickier. In any system, code and data are highly coupled, because data is basically the main reason why we build almost every application, right? So if a module doesn't control access to its data, we will run into a bunch of problems. First, it will be really tough to make changes to the module, because we could easily break other modules in the system when changing the internal data structures of the module. Secondly, it's not an obvious dependency when a module depends on data from another module.
It's more like a global side effect, and changes to the data can implicitly change the behavior of other modules. On top of that, when we are developing a new module, it will be challenging to use data from existing modules, because we would have to dig through the entire code base of the system to figure out what data is available to use. To give you some examples, the two cases on the slide are both incorrect. In the first case, we have one global storage with all the data, allowing every module to have full access to the data of the system. In the second case, the data storage is inside each module, which is correct, but the public API of the modules isn't strict enough, so other modules have full access to the data storage. And that's a problem.
To avoid these kinds of problems, we just need to follow a few simple rules. First, each module should have its own data storage, and we don't want one single global storage for every module. And if one module needs some data from another module, it should only obtain it through a public API. The CQRS pattern is perfect for creating this kind of API. It lets us provide separate operations for reading and mutating the data without exposing the entire storage. Also, when it comes to building user interfaces, it's important to respond quickly to data changes. So the public API should let us subscribe to these changes, not just receive the data once. Another thing to consider is protecting our data from accidental or intentional mutations. We can do this by simply exporting a read-only version of the data in the public API. In TypeScript-based projects we can simply use the Readonly type for this, or we can freeze the objects using Object.freeze, but that will add some runtime overhead. It's worth mentioning that all these rules and restrictions can be applied with almost any state management library, no matter the framework.
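The data rules above can be sketched in plain TypeScript without any state management library. This is only an illustration under stated assumptions: the `balance` module, the `BalanceData` shape and the function names `getBalance`, `subscribeToBalance` and `topUpBalance` are hypothetical, and the module's entry point is assumed to export only those three functions.

```typescript
// Hypothetical "balance" module: the store is private, and the public API
// offers a read-only query, a subscription, and one narrow command.

interface BalanceData {
  amountCents: number;
  currency: string;
}

type BalanceListener = (data: Readonly<BalanceData>) => void;

// Private state: never exported from the module's entry point.
let balanceState: Readonly<BalanceData> = Object.freeze({
  amountCents: 0,
  currency: "USD",
});
const balanceListeners = new Set<BalanceListener>();

/** Query: returns a frozen snapshot that callers cannot mutate. */
function getBalance(): Readonly<BalanceData> {
  return balanceState;
}

/** Subscription: calls `listener` on every change, returns an unsubscribe function. */
function subscribeToBalance(listener: BalanceListener): () => void {
  balanceListeners.add(listener);
  return () => {
    balanceListeners.delete(listener);
  };
}

/** Command: the only way for other modules to change the balance. */
function topUpBalance(amountCents: number): void {
  if (amountCents <= 0) throw new RangeError("amountCents must be positive");
  balanceState = Object.freeze({
    ...balanceState,
    amountCents: balanceState.amountCents + amountCents,
  });
  balanceListeners.forEach((listener) => listener(balanceState));
}
```

The `Readonly` type rejects mutation at compile time, while `Object.freeze` also guards at run time at a small cost. Either one alone may be enough, depending on how much the team trusts the type checker.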
The important thing is controlling which data is accessible to other modules and limiting how the data can be modified. Alright, so now we've got these well-isolated modules, which means we can safely modify the code within them and control the coupling using the public API.
The next thing we need to do is introduce a runtime for these modules. But what is that? When we are creating a new module, it's crucial to understand how it will fit into the application. For example, we need to know what features of the bundler are available, what environment variables we can use, how to import images and styles, and so on. It's also important to know which versions of browsers and Node.js the code will run on, and also which library to use, for example, to provide data access in the public API. So we need some common rules and libraries for all the modules to make module development easier. And for this reason we introduce this thing called a runtime for modules. For example, this is the runtime from Yandex Direct. It consists of versions of the most important libraries, like TypeScript, React, and Redux. It's important to share these kinds of libraries across every module because they have a huge impact on what we can do in the public API. In fact, communication between modules heavily relies on these libraries. There are also some other libraries that don't affect the public API, but it's handy to have them as part of the runtime, for example, an HTTP client, a router library, a component library, and so on. All of these libraries help us with common development tasks when we are working with modules. Although it's possible to remove almost everything from this list to create a highly minimal runtime, there are downsides to doing so. When designing a runtime for modules, it's important to find the balance between the size of the runtime and the convenience of developing modules. Making the runtime bigger will make it more difficult to maintain.
In fact, making changes to the runtime is always risky because it affects all the modules at once. On the other hand, making the runtime smaller will make module development more challenging. There might be a lack of functionality, requiring us to repeatedly develop custom solutions for each module. Let's take React as an example. If we decide to include React in the runtime, we can export components from a module's public API, and they can be easily integrated between modules. But if we decide not to include React in the runtime, on the one hand each module can use a different framework, but on the other hand connecting multiple modules will be more difficult.
So now we have a runtime for modules, which makes module development much easier. But we still need to figure out how to organize our code base into multiple modules. When should we create a new module? How do we distribute existing code among modules? Those are the kinds of questions we need to address. Well, there is no one-size-fits-all solution here, but luckily the problem has been extensively researched and there are plenty of methodologies, like domain-driven design, clean architecture, Feature-Sliced Design, and others. All these methodologies suggest some common rules for breaking code into modules. First and foremost, each module should have just one responsibility, and most of the time it's connected to the product domain, meaning the module is responsible for a certain product functionality. For example, it could be a module for handling payments, user authentication, or editing articles. Secondly, each module should have high cohesion inside and low coupling with other modules. High cohesion is a good sign that we have properly defined the module's responsibility and that the module contains all the related functionality. High cohesion also makes it easier to locate the necessary code and dive into the business logic of a specific part of the application.
It also means that the module will have fewer reasons to interact with other modules, which greatly reduces the coupling and the overall complexity of the system. Besides, each methodology suggests dividing all modules into several meaningful groups and implementing strict rules on how these groups can depend on each other. These limitations help achieve a predictable system breakdown and facilitate faster discovery of existing code. Moreover, they simplify the decision-making process on where to place new code by reducing the number of options available.
And if you don't want to create your own rules and constraints for the module layout, you can simply choose one of the premade options, like the trendy Feature-Sliced Design methodology. This methodology basically outlines an architecture that's pretty similar to a modular monolith when it comes to dividing the code into modules. The main idea of the methodology is to split modules into six layers. Each layer has a different level of understanding about the product domain and a different level of impact on the system. You can find more information in the official documentation on the website, which is really useful. But what is important now is that there are two main rules to follow. First, the layers are strictly ordered, and imports between layers are unidirectional, from the app layer down to the shared layer. For example, a module in the widgets layer can use modules from the features, entities, and shared layers, but cannot use pages. The second rule is that modules on the same layer cannot import each other. For example, two modules in the features layer are unaware of each other and cannot import anything from each other. These two rules require careful consideration when dividing code into modules, but they lead to a predictable decomposition of the system, which in turn simplifies navigation and understanding of a large code base.
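The two layer rules can be written down as a small check. The sketch below is an assumption-laden illustration rather than tooling from the methodology itself: the `ModuleRef` type and the `canImport` function are hypothetical, the layer names are the six mentioned in the talk, and any exceptions described in the official documentation are ignored.

```typescript
// The six layers, ordered from the highest (app) to the lowest (shared).
const LAYERS = ["app", "pages", "widgets", "features", "entities", "shared"] as const;

type Layer = (typeof LAYERS)[number];

interface ModuleRef {
  layer: Layer;
  name: string; // hypothetical module name, e.g. "payment-form"
}

/**
 * Rule 1: a module may import only from layers below its own.
 * Rule 2: modules on the same layer may not import each other,
 *         so a same-layer import is valid only inside one module.
 */
function canImport(from: ModuleRef, to: ModuleRef): boolean {
  if (from.layer === to.layer) {
    return from.name === to.name;
  }
  return LAYERS.indexOf(from.layer) < LAYERS.indexOf(to.layer);
}
```

For example, `canImport({ layer: "widgets", name: "sidebar" }, { layer: "features", name: "filters" })` holds, while the same widget importing from a module in the pages layer does not. In practice this kind of rule is usually enforced by a linter or an import checker rather than by hand-written code.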
This methodology has been used in Yandex Direct for over two years and has proven to be effective. The only drawback I can mention is that there is quite a steep learning curve. It will be necessary to ensure that the team has a synchronized understanding of the different layers and, ideally, to provide documentation with examples for each specific application.
So these were the key principles for building a scalable monolith. It's time to draw some conclusions. First, a good monolith is built on three key principles: highly isolated modules, a runtime for convenient module development, and rules for predictable decomposition of the system into modules. Secondly, a good monolith is scalable, maintainable and adaptable to future change. It does not become a bottleneck in product development. That is because isolated modules and low coupling between them allow us to work with a small piece of the code base at a time, which can be quickly read and understood. It also makes it possible to safely make changes to modules and avoid unexpected bugs and side effects. Developing a monolith like this requires some additional effort, but it's still a lot less effort compared to adopting a microfrontend architecture. And last but not least, it's highly likely that you don't need a microfrontend architecture, because it brings significant expenses for both implementation and ongoing support. Objectively, there is only one reason to adopt it, and that is when you need to completely isolate multiple teams from each other, allowing them to have their own technologies, processes, releases and so on. All the other advantages of microfrontends can be effectively implemented in a monolithic application. In fact, a monolith is perfectly suitable for the vast majority of applications, especially in the early stages of development. And to dive deeper into the topic, you can explore the following subjects.
The modular monolith: there are a lot of articles and presentations available on this topic on the Internet. Clean architecture has an excellent book with the same name. Feature-Sliced Design has excellent documentation on its website. And to start building a glorious monolith in your project, you can use the following tools. TypeScript is essential for building any large-scale application. ESLint and Stylelint are used to enforce architecture rules and constraints, and dependency-cruiser helps in controlling imports within the system. These resources and tools will assist you in developing and maintaining a well-structured and scalable monolith.
And that's all from me. Thank you for joining the presentation. I hope you really enjoyed it. Feel free to leave comments, I'll be happy to answer any questions. Have a nice day.
...

Maksim Zemskov

Lead Software Engineer @ Yandex

Maksim Zemskov's LinkedIn account


