The term “scalability” is often used as a catch-all phrase to suggest that something is poorly designed or broken. It frequently surfaces in arguments as a way to end a discussion, signaling that a system’s architecture is limiting its ability to grow. When used positively, scalability refers to a desired property, as in “our platform needs good scalability.”
In essence, scalability means that when resources are added to a system, performance increases proportionally. Increased performance can mean serving more units of work, or handling larger units of work, such as when datasets grow. In a distributed system, resources may also be added to improve reliability, for example by introducing redundancy to guard against failures. A scalable always-on service can add that redundancy without sacrificing performance.
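To make “proportional” concrete, here is a minimal sketch (with hypothetical function names and made-up measurements) of a scaling-efficiency check: compare the speedup you actually observed against the factor by which resources were increased.

```python
def scaling_efficiency(baseline_throughput: float,
                       scaled_throughput: float,
                       resource_factor: float) -> float:
    """Ratio of observed speedup to resources added.

    1.0 means perfectly proportional (linear) scaling; values well
    below 1.0 mean the extra resources are not translating into
    performance.
    """
    observed_speedup = scaled_throughput / baseline_throughput
    return observed_speedup / resource_factor


# Hypothetical measurements: 4x the nodes, but only ~2.6x the throughput.
print(scaling_efficiency(baseline_throughput=1_000,   # req/s on 2 nodes
                         scaled_throughput=2_600,     # req/s on 8 nodes
                         resource_factor=4.0))        # 8 nodes / 2 nodes
# -> 0.65, i.e. 65% of ideal linear scaling
```

In practice you would feed this from real load-test numbers; the point is simply that “it got faster when we added machines” is not the same as “it scaled.”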
Achieving scalability is not easy; it cannot be an afterthought. Systems must be architected so that adding resources actually improves performance, and so that introducing redundancy does not degrade it. Many algorithms that perform well under low load and on small datasets become prohibitively expensive as request rates or dataset sizes grow.
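As a purely illustrative example (not drawn from any particular system), the sketch below shows how an algorithm that is perfectly adequate on a small dataset can become the bottleneck as the data grows: a pairwise duplicate check is quadratic in the input size, while a hash-set version stays roughly linear.

```python
import time


def has_duplicates_quadratic(items):
    """Pairwise comparison: fine for a few hundred items, O(n^2) overall."""
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            if items[i] == items[j]:
                return True
    return False


def has_duplicates_linear(items):
    """Hash-set membership check: O(n), stays cheap as the dataset grows."""
    seen = set()
    for item in items:
        if item in seen:
            return True
        seen.add(item)
    return False


for n in (1_000, 5_000, 20_000):
    data = list(range(n))  # worst case: no duplicates, so every pair is compared
    start = time.perf_counter()
    has_duplicates_quadratic(data)
    quadratic = time.perf_counter() - start

    start = time.perf_counter()
    has_duplicates_linear(data)
    linear = time.perf_counter() - start

    # The quadratic version grows ~100x when n grows 10x; the linear one ~10x.
    print(f"n={n:>6}: quadratic {quadratic:.3f}s  linear {linear:.5f}s")
```

The same shape of problem appears with request rates rather than dataset sizes: per-request work that scans or locks shared state grows with total load, not with the capacity you add.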
Additionally, a system that grows through scale-out typically becomes more heterogeneous: nodes added later will often process faster or store more data than the nodes already in place. Algorithms that assume uniform nodes either break down under these conditions or leave the newer resources underutilized.
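A small sketch of what “relying on uniformity” can look like, under the assumption (hypothetical here) that each node’s relative capacity is known: plain round-robin gives every node the same share of work, while a capacity-aware split routes proportionally more work to the faster node.

```python
from itertools import cycle

# Hypothetical cluster: relative capacity each node can absorb.
nodes = {"old-1": 1.0, "old-2": 1.0, "new-1": 4.0}


def assign_uniform(num_requests):
    """Round-robin assumes identical nodes: every node gets the same share."""
    counts = {name: 0 for name in nodes}
    for _, name in zip(range(num_requests), cycle(nodes)):
        counts[name] += 1
    return counts


def assign_weighted(num_requests):
    """Capacity-aware assignment gives faster nodes proportionally more work."""
    total = sum(nodes.values())
    return {name: round(num_requests * cap / total) for name, cap in nodes.items()}


print(assign_uniform(6_000))   # {'old-1': 2000, 'old-2': 2000, 'new-1': 2000}
print(assign_weighted(6_000))  # {'old-1': 1000, 'old-2': 1000, 'new-1': 4000}
```

With uniform assignment the old nodes are the ceiling on throughput and the new node sits mostly idle; the weighted split is one simple way to put the extra capacity to work.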
Despite these challenges, good scalability is achievable when systems are architected and engineered for it from the start. Architects and engineers must consider carefully how the system will grow, where redundancy is required, and how heterogeneity will be handled, and they must know the tools available and the pitfalls that come with them.