The Kubernetes Service Mesh: A Brief Introduction to Istio

In this blog we explore what the Istio service mesh is, its architecture, when and where to use it, plus some criticisms of the platform.

Istio is an open source service mesh designed to make it easier to connect, manage and secure traffic between microservices running in containers, and to obtain telemetry about them. Istio is a collaboration between IBM, Google and Lyft. It was originally announced in May 2017, and version 1.0 was released in July 2018.

As we wrote in an earlier post on what a service mesh is:

A service mesh is not a “mesh of services.” It is a mesh of Layer 7 proxies that microservices can use to completely abstract the network away. Service meshes are designed to solve the many challenges developers face when talking to remote endpoints.

Service meshes focus narrowly on the technical issues related to establishing and maintaining reliable service-to-service connections. They do not solve for larger-scale operational issues such as inserting bulkheads, quarantining new services or exerting backpressure against a group of rogues.

As of this writing, Istio focuses mostly on Kubernetes.

Istio architecture

Like all service meshes, an Istio service mesh consists of a data plane and a control plane.

Istio data plane

The Istio data plane is typically composed of Envoy proxies that are deployed as sidecars, that is, as additional containers running alongside the application containers within each Kubernetes pod. These proxies take on the task of establishing connections to other services and managing the communication between them.

Istio control plane

How this communication is managed needs to be configured, of course. Istio’s component that is responsible for configuring the data plane is called Pilot. Apart from defining basic proxy behaviors, it also allows you to specify routing rules between proxies as well as failure recovery features.
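To make the idea of a routing rule more concrete, here is a minimal sketch of weight-based routing of the kind used for gradual rollouts. The `ROUTE_RULE` structure, the `reviews` host, the subset names and the `pick_subset` function are hypothetical and chosen for illustration only. This is not Istio's configuration schema or Pilot's implementation.

```python
import random

# Hypothetical routing rule: send about 90% of traffic to v1 and 10% to v2.
# Illustrative structure only, not Istio's configuration format.
ROUTE_RULE = {
    "host": "reviews",
    "routes": [
        {"subset": "v1", "weight": 90},
        {"subset": "v2", "weight": 10},
    ],
}


def pick_subset(rule, rng):
    """Choose a destination subset in proportion to its configured weight."""
    routes = rule["routes"]
    total = sum(route["weight"] for route in routes)
    point = rng.uniform(0, total)
    cumulative = 0
    for route in routes:
        cumulative += route["weight"]
        if point < cumulative:
            return route["subset"]
    return routes[-1]["subset"]  # guard against floating-point edge cases
```

Calling `pick_subset(ROUTE_RULE, random.Random())` once per request yields roughly a 90/10 split over many requests. In a mesh, the proxy would make this choice, so shifting traffic from one version to another only requires changing the weights, not the application code.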

The Mixer component of Istio collects traffic metrics and can respond to various queries from the data plane such as authorization, access control or quota checks. Depending on which adapters are enabled, it can also interface with logging and monitoring systems.
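As an illustration of what a quota check might involve, the following sketch implements a simple fixed-window request quota per calling service. The `QuotaChecker` class and its parameters are assumptions made for this example. Mixer's actual quota handling is adapter-based and is not shown here.

```python
class QuotaChecker:
    """Fixed-window request quota, keyed by calling service (hypothetical)."""

    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._windows = {}  # caller -> (window_start, request_count)

    def check(self, caller, now):
        """Return True if `caller` may proceed at time `now` (in seconds)."""
        start, count = self._windows.get(caller, (now, 0))
        if now - start >= self.window_seconds:
            start, count = now, 0  # window expired, start a new one
        if count >= self.max_requests:
            self._windows[caller] = (start, count)
            return False  # quota exhausted for this window
        self._windows[caller] = (start, count + 1)
        return True
```

A proxy would call something like `check("frontend", now)` before forwarding a request and reject the request if the result is `False`. Passing `now` explicitly keeps the sketch deterministic and easy to test.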

Citadel is the component that allows developers to build zero-trust environments based on service identity rather than network controls. It is responsible for assigning certificates to each service and can also accept external certificate authority keys when needed.

Istio control and data plane using Envoy
Istio’s separate, centralized control plane is typically paired with Envoy as a data plane.

Why use Istio or any service mesh for that matter?

If you are a developer or architect looking to create a network of deployed services with built-in traffic control features, service-to-service authentication and monitoring, all without having to make changes to your service code, and you don’t mind running them on Kubernetes, then Istio is a good solution. Be aware, however, that it is not easy to install and that troubleshooting it requires a fair amount of knowledge of both its own internals and those of Kubernetes.

Istio’s key benefits include:

  • Traffic control features including routing rules, retries, failovers, and fault injection
  • Policy enforcement including access controls, rate limits and quotas
  • Built-in metrics, logs, and traces for all traffic within a cluster
  • Secure service-to-service communication
  • Layer 7 load balancing
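Two of the traffic control features listed above, fault injection and retries, can be sketched as wrappers around a service call. All names here (`InjectedFault`, `with_fault_injection`, `with_retries`) are hypothetical. In Istio these behaviors are applied by the proxies based on configuration, not by application code, so this only illustrates the logic.

```python
import random


class InjectedFault(Exception):
    """Raised in place of a real response to simulate an HTTP error."""

    def __init__(self, status):
        super().__init__(f"injected HTTP {status}")
        self.status = status


def with_fault_injection(call, abort_percent, status, rng):
    """Wrap `call` so that roughly `abort_percent` of requests fail with `status`."""
    def wrapped(*args, **kwargs):
        if rng.uniform(0, 100) < abort_percent:
            raise InjectedFault(status)
        return call(*args, **kwargs)
    return wrapped


def with_retries(call, attempts):
    """Wrap `call` so that failed requests are tried up to `attempts` times (>= 1)."""
    def wrapped(*args, **kwargs):
        last_error = None
        for _ in range(attempts):
            try:
                return call(*args, **kwargs)
            except InjectedFault as err:
                last_error = err  # remember the failure and try again
        raise last_error
    return wrapped
```

Composing the two, for example `with_retries(with_fault_injection(call, 10, 503, random.Random()), 3)`, gives a rough way to see how a retry policy behaves when a fraction of calls is made to fail on purpose.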

Control plane features by component are given in more detail below.

Pilot: Connectivity and Communication

  • Traffic management: Istio separates traffic management from infrastructure scaling (which is handled by Kubernetes). This separation allows for features that can live outside the application code, like dynamic request routing for A/B testing, gradual rollouts, canary releases, retries, circuit breakers and fault injection.
  • Fault injection: Rather than testing by killing pods or by delaying or corrupting packets at the TCP layer, Istio allows for protocol-specific fault injection into the network.
  • Layer 7 Load balancing: Istio currently supports three load balancing modes: round robin, random, and weighted least request. In contrast to Kubernetes’ own load balancing, Istio’s is based on application layer (Layer 7) and not just on transport layer (Layer 4) information.
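The three load balancing modes named above can be sketched as follows. This is a simplified illustration, not Envoy's implementation. In particular, the least-request variant below ignores host weights and assumes the common "sample two hosts, pick the less busy one" approach. The class names and the `active` counter are hypothetical.

```python
import itertools
import random


class RoundRobin:
    """Cycle through the hosts in order."""

    def __init__(self, hosts):
        self._cycle = itertools.cycle(list(hosts))

    def pick(self):
        return next(self._cycle)


class RandomChoice:
    """Pick a host uniformly at random."""

    def __init__(self, hosts, rng=None):
        self._hosts = list(hosts)
        self._rng = rng or random.Random()

    def pick(self):
        return self._rng.choice(self._hosts)


class LeastRequest:
    """Sample two hosts at random and pick the one with fewer active requests."""

    def __init__(self, hosts, rng=None):
        self._hosts = list(hosts)
        self._rng = rng or random.Random()
        self.active = {host: 0 for host in self._hosts}

    def pick(self):
        if len(self._hosts) < 2:
            chosen = self._hosts[0]
        else:
            a, b = self._rng.sample(self._hosts, 2)
            chosen = a if self.active[a] <= self.active[b] else b
        self.active[chosen] += 1  # request starts
        return chosen

    def done(self, host):
        self.active[host] -= 1  # request finished
```

Note that the least-request strategy needs to know how many requests are in flight per host, which is information available to a Layer 7 proxy that sees individual requests but not to a balancer that only sees connections.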

Mixer: Monitoring and Observability

  • Backend abstraction: As mentioned previously, the Mixer component provides policy controls and telemetry collection, which abstracts away the implementation details of individual infrastructure backends.
  • Intermediation: The Mixer component also allows for fine-grained control over the interactions between the mesh and infrastructure backends.
  • Latency: Mixer can be used as a highly scaled and highly available second-level cache for sidecars.
  • Reliability: In some situations, Mixer could potentially help mask infrastructure backend failures because of its caching capabilities.
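The latency and reliability points above both follow from caching. The sketch below shows the general idea under stated assumptions: a time-limited cache in front of a policy backend that also falls back to a stale decision when the backend is unreachable. The `CachedPolicyCheck` class, its `backend` callable and the use of `ConnectionError` are hypothetical and do not reflect Mixer's actual caching design.

```python
class CachedPolicyCheck:
    """Cache policy decisions for a limited time in front of a backend."""

    def __init__(self, backend, ttl_seconds):
        self._backend = backend   # hypothetical callable: key -> bool
        self._ttl = ttl_seconds
        self._cache = {}          # key -> (expires_at, decision)
        self.backend_calls = 0

    def check(self, key, now):
        """Return the decision for `key` at time `now` (in seconds)."""
        entry = self._cache.get(key)
        if entry is not None and now < entry[0]:
            return entry[1]  # fresh cache hit, backend not contacted
        self.backend_calls += 1
        try:
            decision = self._backend(key)
        except ConnectionError:
            if entry is not None:
                return entry[1]  # backend is down: reuse the stale decision
            raise
        self._cache[key] = (now + self._ttl, decision)
        return decision
```

Whether serving stale decisions is acceptable depends on the policy in question. For access control in particular, failing closed may be the safer choice.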

Citadel: Encryption and Authentication

  • Service authentication: Services can only be accessed from strongly authenticated and authorized clients.
  • Authentication policy: Authentication policies are enforced only on the server side, that is, on the service receiving the request. The matching client-side settings must be configured separately.
  • Role-based access control: Istio supports namespace-level, service-level and method-level access control for services.
  • TLS authentication: Istio supports both service-to-service and end-user-to-service authentication.
  • Key management: Automation of key and certificate generation, distribution, rotation and revocation.
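The three levels of access control mentioned above (namespace, service and method) can be illustrated with a small rule matcher. The `Rule` tuple and `is_allowed` function are hypothetical names for this sketch, which does not use Istio's RBAC resources or syntax.

```python
from collections import namedtuple

# A rule grants a role access within a namespace. Leaving `service` or
# `method` as None widens the rule to any service or any method.
Rule = namedtuple("Rule", ["role", "namespace", "service", "method"],
                  defaults=[None, None])


def is_allowed(rules, roles, namespace, service, method):
    """Return True if any rule grants one of `roles` access to the request."""
    for rule in rules:
        if rule.role not in roles:
            continue
        if rule.namespace != namespace:
            continue
        if rule.service is not None and rule.service != service:
            continue
        if rule.method is not None and rule.method != method:
            continue
        return True
    return False  # deny by default


RULES = [
    Rule("viewer", "default", "reviews", "GET"),  # method-level rule
    Rule("editor", "default", "reviews"),         # service-level rule
    Rule("admin", "default"),                     # namespace-level rule
]
```

With these example rules, `is_allowed(RULES, {"viewer"}, "default", "reviews", "GET")` is allowed, while the same call with `"POST"` is denied because no matching rule exists.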

Criticisms of Istio and other service meshes

Despite their apparent popularity and promise of attractive features, service meshes are not as widely used as one would expect. This is undoubtedly due in part to their relative novelty and the fact that the general space is still evolving. However, service meshes are also not without criticisms. Typical concerns regarding Istio and service meshes include the net-new complexity they introduce, their comparatively poor performance and certain gaps with respect to multi-cluster topologies.

First, service meshes require an up-front investment in a platform that can be difficult to justify when applications are still evolving. Second, their impact on application performance when compared to direct calls across the network can be substantial and difficult to diagnose, let alone remediate. Because Istio currently runs primarily on Kubernetes, users also have to account for the performance overhead that comes with running in a containerized environment. Finally, since most service meshes target individual microservice applications, not entire landscapes of connected applications, multi-cluster and multi-region support tends not to be a focus.

In short, service meshes are no panacea for architects and operators looking to run a growing portfolio of digital services in an agile manner. They are tactical affairs: a “below the ground” upgrade that addresses technical issues of interest predominantly to developers. They are often not a game changer for the business.

Istio and Glasnostic in an organic architecture
Glasnostic is a cloud traffic controller that plays well with Istio.

Combining Istio with Glasnostic

As we point out in “Should I Use a Service Mesh?,” Istio is a powerful technology for establishing and maintaining reliable service-to-service connections, in particular for self-contained microservice architectures that are built on Kubernetes. As a result, it can and likely should be used with any such application, irrespective of whether or not an enterprise-wide control plane such as Glasnostic exists. Because Istio addresses the local concerns around intra-application connectivity and Glasnostic addresses the large-scale, complex behaviors of systems of applications, they are complementary and may be deployed independently of each other.