What Is a Distributed System?

Before answering this question, I like to step back and take a good look at the history of computing. Mainframes and legacy client-server applications were monolithic: all the processing took place on a single set of hardware. As hardware grew cheaper, and especially once hardware virtualization became widespread, monolithic designs gave way to the broad development of distributed systems. I began to see this trend during my time as an architect at a large cable and telecom provider. Physical servers and traditional block-based storage cost more and didn’t deliver the flexibility our development teams needed to meet their continuous delivery goals.

A distributed computing system is a collection of individual software components located on different computers that exchange messages with each other to achieve common goals. This process is transparent to the end user, who simply interacts with an app or a website, while on the back end a collection of different services and systems works together on a common set of tasks.

A distributed system offers several benefits. It improves reliability by aiming to eliminate single points of failure, provides fault tolerance through various forms of clustering and networking, and offers scalability for web applications by delivering greater concurrency through scalability patterns. However, while they increase availability, distributed systems introduce inherent complexity around engineering, monitoring, and troubleshooting.

How Does a Distributed System Work?

Distributed systems don’t have a single design or implementation pattern. Instead, they break down into patterns aligned with various types of business problems. One example is the sharded pattern for stateful services like databases, in which the data is partitioned across multiple nodes. Sharding allows you to horizontally scale your data tier to avoid performance bottlenecks and single points of failure. This pattern is illustrated in Figure 1 below and is commonly referred to as a distributed database.

Figure 1: This image illustrates the sharded pattern of a distributed system.
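
To make the sharded pattern a little more concrete, here is a minimal sketch in Python. It assumes simple hash-based routing, which is only one of several ways to assign keys to shards, and the `ShardedStore` class and its methods are hypothetical names I made up for illustration. Each dictionary stands in for a separate database node.

```python
import hashlib


class ShardedStore:
    """Toy key-value store that spreads keys across a fixed number of shards."""

    def __init__(self, shard_count):
        if shard_count < 1:
            raise ValueError("shard_count must be at least 1")
        # Each dict is a stand-in for a separate database node (an assumption).
        self.shards = [{} for _ in range(shard_count)]

    def shard_for(self, key):
        # Use a stable hash so every client routes a key to the same shard.
        # Python's built-in hash() is randomized per process for strings.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % len(self.shards)

    def put(self, key, value):
        self.shards[self.shard_for(key)][key] = value

    def get(self, key):
        return self.shards[self.shard_for(key)].get(key)


if __name__ == "__main__":
    store = ShardedStore(shard_count=4)
    for i in range(1000):
        store.put(f"customer-{i}", f"record-{i}")
    # Show how many keys landed on each shard.
    print([len(shard) for shard in store.shards])
```

One design caveat: with plain modulo routing, changing the shard count remaps most keys, which is why production systems often use approaches such as consistent hashing or range-based partitioning instead.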

The other element of how distributed systems work is how the components communicate with each other. This is commonly done through REST (Representational State Transfer) APIs, a set of definitions and protocols that allow two systems to communicate with one another. These APIs often serve as an abstraction layer over the native language needed to access the target system (for example, SQL for a database), which allows for easier and faster development. It’s important to understand the latency between these systems, even within a single compute environment, because it has a direct effect on the overall performance of the distributed computing system.
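
As a rough illustration of both points, the sketch below starts a tiny REST-style service and measures the round-trip latency of a single call, using only the Python standard library. The `/orders/<id>` route, the `ORDERS` data, and the helper names are all hypothetical, and the in-memory dictionary stands in for whatever database would sit behind a real API.

```python
import json
import threading
import time
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Hypothetical in-memory data standing in for a real backing database.
ORDERS = {"42": {"id": "42", "status": "shipped"}}


class OrderHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Only one route is supported: GET /orders/<id>
        parts = self.path.strip("/").split("/")
        order = None
        if len(parts) == 2 and parts[0] == "orders":
            order = ORDERS.get(parts[1])
        status = 200 if order is not None else 404
        payload = order if order is not None else {"error": "not found"}
        body = json.dumps(payload).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # Silence the default per-request logging.


def start_server():
    # Port 0 asks the operating system for any free port.
    server = ThreadingHTTPServer(("127.0.0.1", 0), OrderHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def timed_get(url):
    """Return (decoded JSON body, elapsed seconds) for one GET request."""
    # An empty ProxyHandler bypasses any proxy configured in the environment.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    start = time.perf_counter()
    with opener.open(url, timeout=5) as response:
        payload = json.loads(response.read().decode("utf-8"))
    return payload, time.perf_counter() - start


if __name__ == "__main__":
    server = start_server()
    port = server.server_address[1]
    order, seconds = timed_get(f"http://127.0.0.1:{port}/orders/42")
    print(order, f"{seconds * 1000:.2f} ms")
    server.shutdown()
    server.server_close()
```

Even on a single machine the call takes measurable time, and a request that fans out to several services one after another pays that cost for each hop.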

Key Concepts and Architectural Design Elements of Distributed Systems

Early distributed systems were built from the ground up in a custom manner, which led to overly complex systems. However, a few paradigm changes have made distributed architecture easier to deploy. The first is the broad adoption of virtualization, and not just virtual machines. Virtual networking enables key distributed systems concepts like software load balancing, which allows stateless workloads to easily scale horizontally. The other change is the ubiquity of cloud computing from vendors like Amazon and Microsoft, which allows nearly any component to be software-defined. It also enables container orchestration platforms like Kubernetes, which allow for “infrastructure as code” and provide service resiliency in the event of hardware failure.

You may notice a general trend of abstractions here, whether it’s software-defined networking or a SQL tier behind a REST API. Using these abstractions allows for faster development cycles and added flexibility. You can add complex networking rules in your code without ever having to log in to a router, or you can design a web tier to scale to more instances based on the average CPU utilization or memory consumption. These design patterns are helped by containers, which are themselves an abstraction of the host operating system at the application layer and whose lightweight design allows for easy deployment. Adding a new VM to a load-balanced set can take minutes, while adding another pod replica to a Kubernetes ReplicaSet can happen in a matter of seconds.

Beyond these design patterns, a key concept of distributed systems is desired state configuration and service healing. For example, in Kubernetes, the control plane nodes (sometimes called head nodes) keep a record of all the objects defined in the cluster. If a database pod is suddenly unavailable or missing, service healing means the cluster will redeploy it. While this example is specific to Kubernetes, these concepts apply to many distributed systems as a way to eliminate single points of failure.
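
The two ideas above, scaling on average CPU utilization and healing toward a desired state, can be sketched in a few lines of Python. This is a simplified illustration and not how Kubernetes is implemented. The scaling function is modeled on the proportional rule Kubernetes documents for its Horizontal Pod Autoscaler (desired replicas equals the current replica count times the ratio of the current metric to the target, rounded up), while the floor of one replica, the function names, and the action tuples are my own assumptions.

```python
import math


def desired_replicas(current, avg_cpu_percent, target_cpu_percent):
    """Proportional scaling: size the tier so average CPU moves toward the target."""
    # The floor of one replica is an assumption made for this sketch.
    return max(1, math.ceil(current * avg_cpu_percent / target_cpu_percent))


def reconcile(desired, observed):
    """Compare desired and observed replica counts and return corrective actions.

    Both arguments map a workload name to a replica count. The result is a
    list of (action, name, count) tuples, a hypothetical format for this sketch.
    """
    actions = []
    for name, want in sorted(desired.items()):
        have = observed.get(name, 0)
        if have < want:
            actions.append(("start", name, want - have))
        elif have > want:
            actions.append(("stop", name, have - want))
    # Anything running that is not declared in the desired state is stopped.
    for name, have in sorted(observed.items()):
        if name not in desired and have > 0:
            actions.append(("stop", name, have))
    return actions


if __name__ == "__main__":
    # The web tier runs hot, so the proportional rule asks for more replicas.
    web_target = desired_replicas(current=4, avg_cpu_percent=90, target_cpu_percent=60)
    desired = {"web": web_target, "database": 1}
    # The database pod has gone missing, so it does not appear in the observed state.
    observed = {"web": 4}
    for action in reconcile(desired, observed):
        print(action)
```

A real orchestrator runs this comparison continuously, so a failure is corrected on the next pass without anyone having to notice it first.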

Types and Examples of Distributed Systems

Distributed computing systems underpin most of the modern web as well as blockchain systems. Let’s look at some specific examples of systems that have largely replaced traditional client-server systems. One common element of a distributed system is a microservice architecture, where each computing function is broken out into its own service. An example of this is shown in Figure 2 below.

Figure 2: This diagram shows the microservice architecture of an e-commerce application.

This example is a simple e-commerce application designed to process orders from customers using either mobile apps or web browsers. It then handles the order fulfillment process through a series of related APIs, with a database for each function. The benefit of this pattern is that any of these components can be swapped out for a different technology without affecting the overall system, as long as its API stays the same.
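
Here is a small sketch of why that swap is possible. It assumes the order service depends only on an agreed interface and not on a specific implementation. All of the class names are hypothetical, and the in-process objects stand in for services that would normally be reached over REST calls.

```python
from typing import Protocol


class PaymentService(Protocol):
    """The contract the order service relies on (a stand-in for a REST API)."""

    def charge(self, order_id: str, amount_cents: int) -> bool: ...


class CardPayments:
    """One hypothetical implementation of the payment contract."""

    def charge(self, order_id: str, amount_cents: int) -> bool:
        # Pretend any positive amount is approved by the card processor.
        return amount_cents > 0


class InvoicePayments:
    """A replacement implementation with its own private data store."""

    def __init__(self) -> None:
        self.invoices: dict = {}  # Each service owns its own data.

    def charge(self, order_id: str, amount_cents: int) -> bool:
        if amount_cents <= 0:
            return False
        self.invoices[order_id] = amount_cents
        return True


class OrderService:
    def __init__(self, payments: PaymentService) -> None:
        self.payments = payments

    def place_order(self, order_id: str, amount_cents: int) -> str:
        approved = self.payments.charge(order_id, amount_cents)
        return "confirmed" if approved else "rejected"


if __name__ == "__main__":
    # Swapping the payment component requires no change to OrderService.
    for payments in (CardPayments(), InvoicePayments()):
        print(type(payments).__name__, OrderService(payments).place_order("A-100", 2500))
```

The order service never learns which payment component it is talking to, which is the property that lets a team replace one service without redeploying the others.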

Advantages and Disadvantages of Distributed Systems

The major disadvantage of a distributed computing environment is the increased complexity compared to a single-system architecture. Other challenges of distributed systems include monitoring ephemeral items like containers and bringing together monitoring information across technology stacks. While several application monitoring vendors address this problem, it can still be challenging to make sense of all the monitoring information.

The benefits of such a systems architecture are flexibility, resiliency, and scalability. Modern orchestration platforms help eliminate single points of failure and allow for large-scale computing paradigms designed to support nearly any workload. Harnessing computing power from the nearly infinite scale of the cloud, along with real-time scaling, allows distributed applications to scale in a way that simply wasn’t an option on n-tier or single-computer systems.

Why Distributed Systems Are Here to Stay

While distributed systems can be challenging to implement and manage, the benefits far outweigh the costs. There are several use cases where a centralized system simply can’t match the processing power or flexibility of a distributed system. My belief is that while there will always be less flexible systems, modern software development has converged on development patterns that depend on distributed systems. This shift has changed the way IT organizations operate, and it will continue to sustain this model of systems. Additionally, the ubiquity of cloud computing means even the most basic systems are already being built on a distributed system (the cloud framework itself). As organizations mature in their cloud efforts, they will continue to evolve using distributed patterns.

Interested in finding a solution built to help you achieve your customer experience goals while ensuring service availability? See how SolarWinds® Hybrid Cloud Observability can help increase visibility, intelligence, and productivity across your on-premises, hybrid, and cloud environments.

Joey D'Antoni
Joey D'Antoni is a principal consultant at Denny Cherry and Associates Consulting. He is recognized as a VMware vExpert and a Microsoft Data Platform MVP…