Kubernetes Is Not Enough

Resiliency is an important feature that any application should possess. A resilient application handles failure gracefully and does not become unavailable when a failure occurs.

Let us take an example of a microservices-based application where Microservice A calls Microservice B, and Microservice B calls Microservice C.

Let us say that Microservice C becomes slow or goes down. This can create a cascading effect on the upstream calling microservices and can affect the whole application's availability and response time.

If Microservice C is slow, it will slow down Microservice B as well as Microservice A. If Microservice C is down or throws an HTTP 5xx error, Microservices B and A will still keep calling it on every subsequent request and may end up consuming all their resources on calls to the faulty microservice.

To handle such situations gracefully, we would want to introduce a timeout at Microservice B so that, when slowness occurs, Microservice B waits only for a defined duration, e.g. 5 seconds.

If Microservice C starts throwing 5xx errors, we would not want all subsequent calls to still go to Microservice C. Microservice B should not spend its server resources on repeated calls to Microservice C. Instead, it should preserve resources such as HTTP/TCP connections so that it remains healthy and can still call other microservices.

When we have hundreds of Microservices, these problems can be very difficult to detect and handle.

These capabilities cannot be implemented with Kubernetes alone without putting a lot of logic inside the microservice itself. A liveness probe in Kubernetes will help restart a buggy container/pod, but it will not do much beyond that.

Kubernetes does not offer built-in capabilities to handle such situations, and that is why we have a service mesh such as Istio.

A service mesh like Istio simplifies the implementation of circuit breakers, timeouts, retries, and canary rollouts with percentage-based traffic splits.

Virtual services and destination rules are the key building blocks of Istio's traffic routing capabilities. Virtual services decouple where clients send their requests from the destination workloads/services that actually implement them. They provide a rich way of specifying different traffic routing rules for sending traffic to those real workloads/services.

So a given request to the virtual service is routed to a real destination service within the mesh based on a URI prefix, an HTTP header value, or some other routing rule.

Destination rules configure what happens to the traffic for the end destination service. Destination rules are applied after virtual service routing rules are evaluated, so they apply to the traffic's "real" destination. We can configure a preferred load balancing model, TLS security mode, or circuit breaker settings using destination rules.
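As a rough illustration of header-based and prefix-based routing, the sketch below borrows the reviews service from Istio's Bookinfo sample. The resource name, the `end-user` header, the `/reviews` prefix, and the v1/v2 subsets are assumptions chosen for the example. The `apiVersion` may also differ between Istio releases.

```yaml
# Illustrative sketch only: names, header, and prefix are assumptions.
apiVersion: networking.istio.io/v1   # older Istio releases use v1alpha3 or v1beta1
kind: VirtualService
metadata:
  name: reviews-routing              # hypothetical resource name
spec:
  hosts:
  - reviews                          # host that clients address
  http:
  # Rule 1: requests carrying the header "end-user: jason" go to subset v2.
  - match:
    - headers:
        end-user:
          exact: jason
    route:
    - destination:
        host: reviews
        subset: v2
  # Rule 2: requests whose path starts with /reviews go to subset v1.
  - match:
    - uri:
        prefix: /reviews
    route:
    - destination:
        host: reviews
        subset: v1
  # Default rule: anything that matched no rule above goes to subset v1.
  - route:
    - destination:
        host: reviews
        subset: v1
```

Rules are evaluated top to bottom and the first match wins, which is why the unconditional default route comes last.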
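A minimal sketch of a destination rule that sets all three kinds of policy mentioned above could look like the following. The host name `reviews` (from Istio's Bookinfo sample), the resource name, and every threshold value are assumptions for illustration, not recommended settings.

```yaml
# Illustrative sketch only: host, name, and thresholds are assumptions.
apiVersion: networking.istio.io/v1   # older Istio releases use v1alpha3 or v1beta1
kind: DestinationRule
metadata:
  name: reviews-policy               # hypothetical resource name
spec:
  host: reviews
  trafficPolicy:
    loadBalancer:
      simple: ROUND_ROBIN            # load balancing model
    tls:
      mode: ISTIO_MUTUAL             # mutual TLS using Istio-issued certificates
    outlierDetection:                # circuit-breaker style ejection of failing hosts
      consecutive5xxErrors: 5        # eject a host after 5 consecutive 5xx responses
      interval: 10s                  # how often hosts are evaluated
      baseEjectionTime: 30s          # minimum time a host stays ejected
      maxEjectionPercent: 50         # never eject more than half of the hosts
```

With outlier detection in place, the sidecar proxy stops sending requests to instances that keep returning 5xx errors, so the calling service does not have to implement that logic itself. In practice, the settings for one host are usually kept in a single DestinationRule.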

A VirtualService can make sure that all traffic for the reviews service goes to reviews:v1.
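A minimal sketch of such a VirtualService is shown here. It assumes that a subset named `v1` has been defined for the `reviews` host in a DestinationRule (typically by selecting pods labelled `version: v1`). The `apiVersion` may differ between Istio releases.

```yaml
# Illustrative sketch: assumes a DestinationRule defines subset "v1" for host "reviews".
apiVersion: networking.istio.io/v1   # older Istio releases use v1alpha3 or v1beta1
kind: VirtualService
metadata:
  name: reviews
spec:
  hosts:
  - reviews
  http:
  - route:
    - destination:
        host: reviews
        subset: v1                   # every request is sent to the v1 workloads
```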

A VirtualService can also split traffic by percentage, for example routing 90% of the traffic to reviews:v1 and 10% to reviews:v2.
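The 90/10 split can be sketched with weighted route destinations. The manifest below assumes that subsets `v1` and `v2` are defined for the `reviews` host in a DestinationRule. The `apiVersion` may differ between Istio releases.

```yaml
# Illustrative sketch: assumes subsets "v1" and "v2" exist for host "reviews".
apiVersion: networking.istio.io/v1   # older Istio releases use v1alpha3 or v1beta1
kind: VirtualService
metadata:
  name: reviews
spec:
  hosts:
  - reviews
  http:
  - route:
    - destination:
        host: reviews
        subset: v1
      weight: 90                     # 90% of requests
    - destination:
        host: reviews
        subset: v2
      weight: 10                     # 10% of requests (the canary)
```

The weights of a route should add up to 100. A canary rollout then becomes a matter of gradually shifting weight from v1 to v2.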

A VirtualService can also add a 2-second request timeout for calls to the reviews service.
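A minimal sketch of the timeout, again assuming a `v1` subset defined for the `reviews` host in a DestinationRule, uses the `timeout` field of the HTTP route. The `apiVersion` may differ between Istio releases.

```yaml
# Illustrative sketch: assumes a DestinationRule defines subset "v1" for host "reviews".
apiVersion: networking.istio.io/v1   # older Istio releases use v1alpha3 or v1beta1
kind: VirtualService
metadata:
  name: reviews
spec:
  hosts:
  - reviews
  http:
  - route:
    - destination:
        host: reviews
        subset: v1
    timeout: 2s                      # the caller's proxy gives up after 2 seconds
```

When the timeout expires, the sidecar proxy returns an error to the caller instead of letting the request hang, which keeps slowness in one service from tying up its callers.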

A DestinationRule can limit the number of concurrent connections to 10 for the reviews service workloads of the v1 subset. This way the calling service will not consume all of its connection resources on calls to the end service.
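A connection limit of this kind can be sketched with a subset-level connection pool policy. The limit of 10 and the `version: v1` pod label used to define the subset are assumptions for illustration. The `apiVersion` may differ between Istio releases.

```yaml
# Illustrative sketch: the limit and the pod label are assumptions.
apiVersion: networking.istio.io/v1   # older Istio releases use v1alpha3 or v1beta1
kind: DestinationRule
metadata:
  name: reviews
spec:
  host: reviews
  subsets:
  - name: v1
    labels:
      version: v1                    # pods carrying this label form subset v1
    trafficPolicy:
      connectionPool:
        tcp:
          maxConnections: 10         # cap on concurrent connections to this subset
```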

To conclude, we have seen how a service mesh like Istio is needed alongside Kubernetes to build a resilient microservices-based application that handles faults gracefully and increases availability.

In this article, we have discussed only some of the benefits that Istio provides. Apart from these, Istio can also help set up fine-grained access control between microservices and provide distributed tracing and observability.

To try Istio labs on Oracle Kubernetes Engine, please follow the step-by-step guide at the link below.

https://istio.io/latest/docs/tasks/traffic-management/