In this post, we’ve gathered many of the common (and some not-so-common!) terms, patterns and products that developers and operators reference when describing various aspects of a modern microservices architecture. We’ll look at architecture and components, operational patterns and techniques, and finally, products and projects.
Microservices are an architectural style that decomposes a real or notional monolithic application into individual, loosely coupled services that together deliver the same functionality as the corresponding monolith, but with some additional advantages. A microservice can be developed independently from other services, can have its own CI/CD pipeline and can be scaled individually in production as needed. As a result, the application can evolve more quickly and track changing requirements more closely. On the downside, breaking a monolithic application up into individual components can be a daunting task and incurs the overhead of communication over the network, which can be slow and unreliable. Often, microservice-based applications resemble a “distributed monolith” more than a “true” microservice architecture, which leads to substantial difficulties as the application evolves and scales.
Microservices are sometimes conflated with containerized applications running on a scheduler such as Kubernetes or Mesos, but they are an architectural style that is independent of the technology stack used. Microservice-based applications are sometimes combined with a Service Mesh such as Istio or Linkerd to encapsulate the challenges that arise from having to establish, manage and secure service-to-service communication.
A service mesh is not a “mesh of services.” It is a mesh of API proxies that (micro-)services can plug into to completely abstract the network away. Service meshes are designed to solve the many challenges developers face when talking to service endpoints. However, they do not address operational issues affecting the entire architecture.
In a typical service mesh, service deployments are modified to include a dedicated “Sidecar” proxy. Instead of calling services directly over the network, services then call their local sidecar proxies, which in turn encapsulate the complexities of the service-to-service exchange. The interconnected set of proxies in a service mesh implements what is referred to as its Data Plane.
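To make the call pattern concrete, here is a minimal Python sketch of a workload talking to its local sidecar proxy instead of dialing peers directly. The `Sidecar` class, its `transport` callable and its retry behavior are illustrative assumptions, not any particular mesh’s API.

```python
class Sidecar:
    """Local proxy: the workload only ever talks to this object.

    The proxy, not the workload, resolves the peer and handles
    cross-cutting concerns such as retries on transient errors.
    """

    def __init__(self, transport, retries=2):
        self.transport = transport  # the actual network call, e.g. an HTTP client
        self.retries = retries

    def call(self, service, payload):
        last_error = None
        for _ in range(self.retries + 1):
            try:
                return self.transport(service, payload)
            except ConnectionError as err:
                last_error = err  # retry transparently, invisible to the workload
        raise last_error
```

In a real mesh the proxy would also handle mTLS, routing and telemetry; the point of the sketch is only that the workload addresses `localhost` and the proxy does the rest.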
In contrast, the set of APIs and tools used to control proxy behavior across the service mesh is referred to as the Control Plane. The control plane is where users specify policies and configure the data plane as a whole. Both a data plane and a control plane are needed to implement a service mesh.
A “sidecar” is a software component that provides third-party functionality next to the actual workload, like a sidecar attached to a motorcycle. The most popular use of this deployment pattern today is the injection of a service proxy such as Envoy into a workload deployment to manage the workload’s service communication. In the context of a Service Mesh, a set of such injected proxies implements its Data Plane.
A control plane in the context of Service Meshes is a set of APIs and tools used to control proxy behavior across the mesh. The control plane is where users specify authentication policies, gather metrics and configure the Data Plane as a whole.
In a typical Service Mesh, service deployments are modified to include a dedicated “Sidecar” proxy. Instead of calling services directly over the network, services call their local sidecar proxies, which in turn encapsulate the complexities of the service-to-service exchange. The interconnected set of proxies in a service mesh represents its “data plane.”
An API Gateway is a reverse proxy that maps Microservices to APIs and exposes them to the outside world. As the name implies, it acts as a “gatekeeper” between the clients and microservices. The basic features of a typical API Gateway include the ability to authenticate requests, enforce security policies, load balance requests and throttle them if necessary. For a more detailed discussion about API Gateways see “What is an API Gateway?” Examples of popular API Gateways include Kong, Amazon API Gateway and Express Gateway.
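The basic responsibilities listed above can be sketched in a few lines of Python. This is a toy, in-memory gateway, assuming a dict of routes and a set of valid API keys; it is not how any of the named products are implemented.

```python
import time

class ApiGateway:
    """Toy reverse proxy: authenticate, throttle per client, then route."""

    def __init__(self, routes, api_keys, max_per_window=5, window=1.0):
        self.routes = routes              # path prefix -> backend callable
        self.api_keys = api_keys          # set of accepted keys
        self.max_per_window = max_per_window
        self.window = window              # throttle window in seconds
        self.hits = {}                    # client -> recent request timestamps

    def handle(self, client, key, path):
        if key not in self.api_keys:
            return 401, "unauthorized"
        now = time.monotonic()
        recent = [t for t in self.hits.get(client, []) if now - t < self.window]
        if len(recent) >= self.max_per_window:
            return 429, "throttled"       # push back instead of forwarding
        self.hits[client] = recent + [now]
        for prefix, backend in self.routes.items():
            if path.startswith(prefix):
                return 200, backend(path)
        return 404, "no route"
```

A production gateway would add TLS termination, richer policy enforcement and load balancing across backend instances, but the authenticate-throttle-route pipeline is the same shape.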
Like Microservices, organic architecture is an architectural style rather than a particular architecture. Organic architectures come into being when applications and services are composed in a continual and nimble manner to form a Service Landscape. Service landscapes treat applications as common digital capabilities and are built up by their organic federated growth. As a result, organic architecture is able to adapt to the numerous and rapidly changing needs in an agile enterprise. While growing digital capabilities organically in a federated way allows for rapid adaptations and results in a fast time to market, it also gives rise to complex emergent Behaviors that need to be controlled for organic architecture to be successful.
A service landscape is the topology that results when applications and services are connected to form an assembly of digital capabilities. Service landscapes evolve through organic federated growth of applications. From a behavioral perspective, service landscapes act as Organic Architectures.
In the context of Organic Architecture, emergent behavior arises as federated microservices interact in more significant numbers. Emergent behaviors are typically dynamic and cascading on a large scale, and manifest themselves as communication pathologies such as burstiness, imbalanced interaction characteristics or unpredictable latencies. These communication patterns and other behaviors (both good and bad) reveal themselves when one “zooms out” to observe the architecture as a whole as opposed to looking at the individual services. Because these large-scale behaviors are emergent, they are inherently unpredictable and can present themselves rapidly. As a result, it is essential that operations can detect them quickly and remediate their adverse effects immediately. This inherent unpredictability is also the reason why they are impossible to “squash” in a staging environment. Detection and remediation need to happen in production, in real-time, when scale and time act as catalysts.
Contrary to what the name implies, “serverless” does not mean running application code without servers. Instead, serverless means that cloud resources are being utilized in a truly on-demand or “pay only for what you use” fashion. This allows organizations to avoid the costs associated with idle or underutilized resources that would have otherwise been provisioned to support the application. Serverless applications run code in event-triggered and short-lived “Function-as-a-Service” (FaaS) containers that are run entirely on demand. Because serverless applications implement business logic through a set of distributed functions, they share many similarities with Microservices, with a few notable exceptions. Serverless applications have less operational overhead by design than microservices because the “operations” are being transferred to the cloud provider. Also, because function containers typically don’t provide API management features or runtime controls over execution time, disk or memory usage, serverless applications tend to make extensive use of API Gateways. AWS Lambda is arguably the most popular Function-as-a-Service offering at the moment.
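The “event-triggered function” idea can be illustrated with a Lambda-style handler: stateless business logic that the platform invokes once per event. The event shape below is an illustrative assumption modeled loosely on an HTTP-triggered invocation, not a guaranteed payload format.

```python
def handler(event, context=None):
    """Stateless, event-triggered function in the FaaS style.

    An API Gateway in front of the function would map an incoming HTTP
    request to `event` and the returned dict back to an HTTP response.
    """
    params = event.get("queryStringParameters") or {}
    name = params.get("name", "world")
    return {"statusCode": 200, "body": f"hello, {name}"}
```

Because the function holds no state between invocations, the provider is free to spin containers up and down entirely on demand, which is exactly what makes the “pay only for what you use” model possible.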
In the area of traffic management, “backpressure” is what operators exert against callers when a set of services is under duress or if the aggregate call pattern exhibits too many spikes or is too bursty. The backpressure operational pattern ultimately helps equalize traffic characteristics to protect services from overload, which can lead to a multitude of adverse effects.
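A common way to exert backpressure is a bounded admission queue: callers are rejected (e.g. with an HTTP 429) as soon as the service’s capacity is exhausted, rather than being allowed to pile up. A minimal sketch, with illustrative names:

```python
from collections import deque

class BackpressureQueue:
    """Admit work only while capacity remains; push back on callers otherwise."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.pending = deque()

    def submit(self, request):
        if len(self.pending) >= self.capacity:
            return False  # signal "slow down" (e.g. HTTP 429) to the caller
        self.pending.append(request)
        return True

    def drain_one(self):
        """Worker side: take the next admitted request, if any."""
        return self.pending.popleft() if self.pending else None
```

Rejecting early like this smooths out spikes and bursts at the boundary, protecting the service behind the queue from overload.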
In the area of chaos engineering, “brownout” is an operational pattern that gradually reduces capacity in a service to test the resilience of the overall architecture. This often requires invasive modification of business logic and is therefore costly to implement at the service level. A much cheaper and general approach to creating brownouts is to directly modulate the capacity of the communication routes involved with a cloud traffic controller such as Glasnostic. This allows brownouts to be applied at any time and between arbitrary sets of services.
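The route-level approach can be sketched as a capacity multiplier on a communication route. This is a toy model of the idea, not Glasnostic’s actual mechanism; all names are illustrative.

```python
class Route:
    """A communication route whose effective capacity can be modulated
    to create a brownout without touching any service code."""

    def __init__(self, capacity):
        self.capacity = capacity   # requests admitted per tick at full health
        self.factor = 1.0          # 1.0 = normal, 0.5 = 50% brownout
        self.admitted = 0

    def set_brownout(self, factor):
        self.factor = factor       # dial capacity down (or back up) at runtime

    def admit(self):
        if self.admitted < int(self.capacity * self.factor):
            self.admitted += 1
            return True
        return False               # excess traffic is shed for this tick

    def tick(self):
        self.admitted = 0          # new accounting window
```

Because the brownout lives in the route rather than in the service, it can be applied and removed at any time, between arbitrary sets of services.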
A bulkhead is an operational pattern that seeks to segment an architecture such that a failure in one segment doesn’t ripple across the entire architecture. For instance, to prevent gridlock in one availability zone from spilling over into another, a policy could be instituted that only allows modest traffic volumes to cross zones. The “bulkhead” name comes from naval ship design and refers to the watertight walls that segment a ship’s body so that a hull breach in one section won’t sink the entire ship.
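In code, a bulkhead is often just a cap on concurrency into a segment, so one overloaded zone cannot drain the caller’s entire capacity. A minimal sketch using a semaphore; the class name and error are illustrative.

```python
import threading

class Bulkhead:
    """Cap concurrent calls into a segment; excess calls fail fast
    instead of consuming the caller's threads or connections."""

    def __init__(self, max_concurrent):
        self.slots = threading.Semaphore(max_concurrent)

    def call(self, fn, *args):
        if not self.slots.acquire(blocking=False):
            raise RuntimeError("bulkhead full: shedding cross-segment call")
        try:
            return fn(*args)
        finally:
            self.slots.release()
```

With one bulkhead per availability zone, a gridlocked zone exhausts only its own slots, and traffic to healthy zones continues unaffected.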
Similar to Quarantining, canary deployment is an operational pattern that is used to reduce the risk of deploying an unproven workload. In the case of canary deployment, this workload is typically a new version of an existing service. In this pattern, the new version is introduced as a “canary in the coal mine” to only a tiny fraction of traffic. If it turns out to work correctly, traffic is then increased to full and the previous version decommissioned.
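The traffic split at the heart of a canary deployment is simple weighted routing. A hedged sketch, with the version handlers and weight as illustrative parameters:

```python
import random

def make_canary_router(stable, canary, canary_weight=0.05, rng=random.random):
    """Send a small fraction of requests to the canary version.

    `stable` and `canary` are the two version handlers; `canary_weight`
    is the fraction of traffic the new version receives. Ramping up the
    canary is just raising this weight toward 1.0.
    """
    def route(request):
        version = canary if rng() < canary_weight else stable
        return version(request)
    return route
```

If the canary misbehaves, setting the weight back to zero rolls all traffic onto the proven version without a redeploy.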
When services become unavailable, clients often resort to multiple unsuccessful retries before ultimately giving up, resulting in outsize SLA violations. Traditionally, engineers tried to avoid this from happening by specifying various levels of timeouts. A more intelligent approach, however, is to “circuit-break” the service, i.e., flag the service as unavailable for all clients and only re-engage it (“close the circuit”) when the service becomes available again. Note that circuit breaking can be a developer as well as an operational concern. As an operational pattern, circuit breakers are crucial tools in ensuring the overall Quality of Service levels.
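The breaker itself is a small state machine: closed (normal), open (fail fast) and half-open (allow one trial call after a cooldown). A minimal sketch under those standard assumptions; thresholds and names are illustrative.

```python
import time

class CircuitBreaker:
    """closed -> open after `threshold` consecutive failures;
    open -> half-open after `reset_after` seconds; a success re-closes it."""

    def __init__(self, threshold=3, reset_after=30.0, clock=time.monotonic):
        self.threshold = threshold
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, fn, *args):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise RuntimeError("circuit open: failing fast")
            # half-open: fall through and let one trial call probe the service
        try:
            result = fn(*args)
        except Exception:
            self.failures += 1
            if self.failures >= self.threshold:
                self.opened_at = self.clock()  # trip (or re-trip) the breaker
            raise
        self.failures = 0
        self.opened_at = None  # success closes the circuit
        return result
```

Failing fast while open is what spares clients the cascade of timeouts and retries that the paragraph above describes.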
In chaos engineering, fault injection is the practice of intentionally injecting communication errors or other faults such as unexpected delays, payloads or protocol responses into a service interaction. Depending on the type of fault that is to be injected, this is typically achieved by modifying the service logic. A much cheaper approach, however, is to use a cloud traffic controller such as Glasnostic that is able to inject faults directly and at any time into arbitrary communication streams. Fault injection testing can reveal issues with a service’s availability, with its ability to communicate useful error information, and with its ability to recover gracefully from a failure.
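At its simplest, fault injection is a wrapper around a service call that fails or delays a configurable fraction of requests. A toy sketch of the idea (not any controller’s actual API); the parameters are illustrative.

```python
import random

def inject_faults(fn, error_rate=0.1, delay=0.0, rng=random.random, sleep=None):
    """Wrap a service call so a fraction of requests fail or are delayed."""
    def wrapped(*args):
        if sleep is not None and delay:
            sleep(delay)                     # simulate unexpected latency
        if rng() < error_rate:
            raise ConnectionError("injected fault")
        return fn(*args)
    return wrapped
```

Running tests against the wrapped call quickly shows whether clients retry sensibly, surface useful errors and recover once the faults stop.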
When machines (or automation in general) are left to operate autonomously, small issues can compound in non-linear ways and thus quickly escalate to catastrophic events. The governor operational pattern is applied at critical junctions to allow for human oversight. Human pattern recognition is a powerful antidote to machine-generated chaos when provided with the right amount of visibility and immediate control.
When faced with sudden performance degradations in critical services—for whatever reasons—it is paramount that operators are able to “shed” load against these services in a granular manner. This may involve temporary policies to drop long-running requests, push back on tier-3 service requests, block requests from malicious clients, limit data transfer volumes or the like. Load shedding is a popular operational pattern in availability and denial-of-service (DoS) management.
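One common load-shedding scheme admits requests by priority as load rises: the hotter the service, the fewer priority tiers get through. A minimal sketch, assuming priority 1 means most critical and `utilization` reports current load as a 0–1 fraction; the thresholds are illustrative.

```python
def make_load_shedder(utilization, tiers=((0.9, 1), (0.7, 2))):
    """Drop lower-priority requests as load rises.

    `tiers` maps a load threshold to the lowest-numbered (i.e. least
    critical) priority still admitted at that load, checked hottest first.
    """
    def admit(priority):
        load = utilization()
        for threshold, floor in tiers:
            if load >= threshold:
                return priority <= floor  # only this tier and above survive
        return True                        # normal load: admit everything
    return admit
```

Because the policy lives outside the services, operators can tighten or relax the thresholds on the fly as conditions change.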
In the world of Microservices, ensuring quality of service (QoS) means limiting non-essential interactions to provide the best possible networking performance and user experience. The common challenges to QoS when running microservices include having to monitor and manage diverse technology stacks, understand complex and unexpected service interactions, account for constant change and have visibility into hundreds if not thousands of individual components. Quality of service is an operational pattern that combines other patterns such as Backpressure or Circuit Breakers to ensure critical services are available and degrade gracefully under duress.
In a large-scale, agile service environment such as an Organic Architecture, it is impossible to stage and test every deployment before it enters the production environment. This is especially true for production environments that make use of continuous deployment pipelines. In such cases, the ability to quarantine (or “ringfence”) unproven deployments is essential to mitigate the associated risks by imposing temporary rate-limiting policies on the deployment.
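A quarantine of this kind can be modeled as a rate limit that expires once the deployment has proven itself. A toy sketch with an injectable clock for testability; the class and parameter names are illustrative.

```python
import time

class Quarantine:
    """Rate-limit an unproven deployment until the quarantine period ends."""

    def __init__(self, rate_per_sec, duration, clock=time.monotonic):
        self.rate = rate_per_sec
        self.clock = clock
        self.expires = clock() + duration  # when full traffic is restored
        self.window_start = clock()
        self.count = 0

    def allow(self):
        now = self.clock()
        if now >= self.expires:
            return True                    # quarantine lifted: no limit
        if now - self.window_start >= 1.0:
            self.window_start, self.count = now, 0  # new one-second window
        if self.count < self.rate:
            self.count += 1
            return True
        return False                       # over the quarantine rate: reject
```

In practice the quarantine would be lifted (or tightened) based on observed error rates rather than a fixed timer, but the mechanism is the same.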
Because Microservices and Service Landscapes are developed by numerous groups and teams independently of each other and in parallel, it is vital for operators to be able to partition the landscape into “segments” with defined ingress and egress points that ensure operational concerns such as security and compliance are always satisfied. Segmentation is an operational pattern that helps ensure that security policies are honored, even in volatile environments.
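The core of such a segmentation policy is a simple membership check: traffic flows freely inside a segment and crosses segments only through explicitly permitted pairs. A minimal sketch; the segment names and services are illustrative.

```python
def make_segment_policy(segments, allowed_crossings):
    """Allow intra-segment traffic freely; inter-segment traffic only via
    explicitly permitted (source_segment, destination_segment) pairs."""
    def allowed(src_service, dst_service):
        src_seg = segments[src_service]
        dst_seg = segments[dst_service]
        if src_seg == dst_seg:
            return True                    # same segment: no restriction
        return (src_seg, dst_seg) in allowed_crossings
    return allowed
```

Because crossings are directional, a policy can, for example, let the DMZ reach internal services without ever letting internal services initiate traffic back out.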
The Amazon API Gateway is a fully managed service for developers to create, publish, maintain, monitor and secure APIs. If you want to expose AWS workloads as an API, it is worth considering due to its seamless integration with the AWS Management Console.
Consul is a newer addition to the ecosystem of service mesh control planes that works with multi-datacenter topologies and specializes in service discovery. Consul works with a number of Data Planes and can be used with or without other Control Planes such as Istio. It is sponsored by HashiCorp.
Envoy is a Layer-7 proxy often used as the Data Plane in Service Meshes. Envoy is written in C++ and designed to run as a Sidecar alongside every workload. Envoy is developed at Lyft and quickly displaced other proxies due to its convenient configuration API, which allows Control Planes to adjust its behavior quickly and in real-time.
Glasnostic is a cloud traffic controller that lets digital enterprises detect and remediate the complex emergent behaviors that their connected Service Landscapes exhibit so they can innovate faster and with confidence.
Istio is sponsored jointly by Google, IBM and Lyft, and is arguably the most popular Service Mesh today. Istio is a Control Plane that is typically paired with Envoy as a Data Plane and runs on Kubernetes. For more details on Istio, check out our “The Kubernetes Service Mesh: A Brief Introduction to Istio” blog post.
Kong is a cloud-native API Gateway written mostly in Lua that is extensible through both open source and proprietary plugins. It integrates easily with the Kong Service Control Platform, the API management solution from the same vendor.
Linkerd is an open source project sponsored by Buoyant and “the original” Service Mesh. Initially written in Scala like Twitter’s Finagle, from which it evolved, it has since merged with the lightweight Conduit project and relaunched as Linkerd 2.0.