Why We Chose Istio at Jar

Feb 26, 2025

About Jar

At Jar, we aim to make saving effortless. Our app enables users to save money in 24K gold on a daily, weekly, or monthly basis. This not only fosters a daily savings habit but also allows users to invest in gold, benefiting from the appreciation in value over time.

Managing a fast-growing financial application at scale requires a robust and reliable infrastructure. Ensuring security, observability, and seamless communication between microservices was a challenge we faced early on. That’s where Istio came in.


Challenges of Not Using a Service Mesh in Production

Running microservices in production without a service mesh introduces multiple challenges, including:

1. Security & Encryption

  • Ensuring end-to-end encryption between services is difficult without a standardized approach.
  • Implementing mutual TLS (mTLS) across services manually is complex and error-prone.

2. Observability & Debugging

  • Troubleshooting issues like latency spikes or failures requires deep visibility into service-to-service communication.
  • Without built-in observability tools, gathering tracing, logging, and metrics is cumbersome.

3. Traffic Management

  • Routing, load balancing, and retries need to be explicitly handled at the application level.
  • There is no easy way to enforce traffic policies such as circuit breaking, fault injection, or rate limiting.

4. Inter-AZ Traffic Control

  • Without a mesh, restricting cross-AZ traffic is challenging, leading to increased network costs and latency.

5. Developer Overhead

  • Developers end up embedding resilience and security logic into the application code, increasing complexity.

How Istio Overcame These Challenges

We initially adopted Istio in sidecar mode, which addressed most of our challenges. Over time, we transitioned to Istio Ambient Mode for better efficiency and performance.

1. Security & mTLS

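  • Istio automatically upgrades traffic between mesh workloads to mTLS (mutual TLS), and a PeerAuthentication policy in STRICT mode rejects plaintext entirely, securing communication between microservices without requiring app-level changes.
  • Identity-based access control through fine-grained AuthorizationPolicy rules ensures only authorized services can communicate.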
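
As a rough illustration of the two security points above, the sketch below builds a PeerAuthentication and an AuthorizationPolicy as plain Python dictionaries and prints them as a JSON list that Kubernetes can apply. The namespace, app label, and service account names are hypothetical and not Jar's actual configuration; the `security.istio.io/v1` API version assumes a recent Istio release (older releases use `v1beta1`).

```python
import json


def strict_mtls(namespace: str) -> dict:
    """PeerAuthentication that rejects plaintext traffic in one namespace."""
    return {
        "apiVersion": "security.istio.io/v1",
        "kind": "PeerAuthentication",
        "metadata": {"name": "default", "namespace": namespace},
        "spec": {"mtls": {"mode": "STRICT"}},
    }


def allow_caller(namespace: str, app: str, caller_ns: str, caller_sa: str) -> dict:
    """AuthorizationPolicy that admits a single caller identity to `app`."""
    # Istio identities are derived from the caller's Kubernetes service account.
    principal = f"cluster.local/ns/{caller_ns}/sa/{caller_sa}"
    return {
        "apiVersion": "security.istio.io/v1",
        "kind": "AuthorizationPolicy",
        "metadata": {"name": f"{app}-allow-{caller_sa}", "namespace": namespace},
        "spec": {
            "selector": {"matchLabels": {"app": app}},
            "action": "ALLOW",
            "rules": [{"from": [{"source": {"principals": [principal]}}]}],
        },
    }


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    bundle = {
        "apiVersion": "v1",
        "kind": "List",
        "items": [
            strict_mtls("payments"),
            allow_caller("payments", "ledger", "checkout", "checkout-sa"),
        ],
    }
    print(json.dumps(bundle, indent=2))
```

Because an ALLOW policy with a selector denies every request that matches no rule, the second manifest alone is enough to lock the workload down to the one named caller.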

2. Observability & Monitoring

  • Istio integrates with Prometheus, Grafana, and Jaeger, providing deep insights into traffic flow, latency, and failures.
  • Service-to-service tracing enables faster debugging and performance optimization.

3. Advanced Traffic Management

  • Destination rules and virtual services allow us to control inter-AZ traffic and optimize network performance.
  • Canary deployments and traffic mirroring help us safely roll out and test new features.
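
A canary rollout of the kind mentioned above is typically expressed as a weighted route in a VirtualService. The following is a minimal sketch, not our production config: the host name is hypothetical, and the `stable` and `canary` subsets are assumed to be defined in a matching DestinationRule that is not shown here.

```python
import json


def canary_virtual_service(host: str, canary_weight: int) -> dict:
    """VirtualService that sends `canary_weight` percent of traffic to the canary subset."""
    if not 0 <= canary_weight <= 100:
        raise ValueError("canary_weight must be between 0 and 100")
    return {
        "apiVersion": "networking.istio.io/v1",
        "kind": "VirtualService",
        "metadata": {"name": f"{host}-canary"},
        "spec": {
            "hosts": [host],
            "http": [
                {
                    "route": [
                        # The two weights always add up to 100.
                        {
                            "destination": {"host": host, "subset": "stable"},
                            "weight": 100 - canary_weight,
                        },
                        {
                            "destination": {"host": host, "subset": "canary"},
                            "weight": canary_weight,
                        },
                    ]
                }
            ],
        },
    }


if __name__ == "__main__":
    # Hypothetical service name; start the canary at 10% of traffic.
    print(json.dumps(canary_virtual_service("ledger", 10), indent=2))
```

Raising `canary_weight` in small steps and re-applying the manifest shifts traffic gradually, and setting it back to 0 is the rollback.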

4. Moving from Sidecar to Ambient Mode

Previously, we used Istio in sidecar mode, which worked well for:

  • Restricting inter-AZ traffic with destination rules
  • Implementing retries, circuit breakers, and fault injection
  • Enforcing security policies at the service level

However, sidecar mode came with some drawbacks, such as increased CPU and memory overhead due to per-pod sidecars.
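
For readers who have not used destination rules this way, here is a hedged sketch of the sidecar-era pattern listed above: locality-aware load balancing keeps traffic in the caller's zone, and outlier detection (which Istio requires for locality failover) doubles as a simple circuit breaker. The host name and all thresholds are illustrative assumptions, not the values we run in production.

```python
import json


def zone_local_destination_rule(host: str) -> dict:
    """DestinationRule that prefers same-zone endpoints and ejects failing ones."""
    return {
        "apiVersion": "networking.istio.io/v1",
        "kind": "DestinationRule",
        "metadata": {"name": f"{host}-zone-local"},
        "spec": {
            "host": host,
            "trafficPolicy": {
                # Prefer endpoints in the caller's own locality (region/zone).
                "loadBalancer": {"localityLbSetting": {"enabled": True}},
                # Eject endpoints that keep failing; traffic fails over to
                # other zones only when local endpoints are unhealthy.
                "outlierDetection": {
                    "consecutive5xxErrors": 5,
                    "interval": "30s",
                    "baseEjectionTime": "60s",
                },
                # Cap queued requests so overload fails fast instead of piling up.
                "connectionPool": {"http": {"http1MaxPendingRequests": 100}},
            },
        },
    }


if __name__ == "__main__":
    # Hypothetical service name.
    print(json.dumps(zone_local_destination_rule("ledger"), indent=2))
```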

5. The Shift to Ambient Mode with ztunnel and Istio CNI

We moved to Istio Ambient Mode, which provided:

  • Lower resource consumption (no more sidecar per pod)
  • Better security with ztunnel, handling mTLS at the node level
  • Seamless integration with Istio CNI, reducing developer overhead

With Ambient Mode, we achieved everything Istio offered without requiring developers to modify their application code.
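
Enrolling workloads in Ambient Mode comes down to a single namespace label rather than a sidecar in every pod. A minimal sketch, using a hypothetical namespace name:

```python
import json


def ambient_namespace(name: str) -> dict:
    """Namespace manifest whose pods are captured by ztunnel in ambient mode."""
    return {
        "apiVersion": "v1",
        "kind": "Namespace",
        "metadata": {
            "name": name,
            # This label tells Istio to add the namespace's pods to the ambient mesh.
            "labels": {"istio.io/dataplane-mode": "ambient"},
        },
    }


if __name__ == "__main__":
    # Hypothetical namespace name.
    print(json.dumps(ambient_namespace("payments"), indent=2))
```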


Kiali: Observability Made Easy

Managing a service mesh can get complex, especially when dealing with hundreds of microservices. This is where Kiali plays a crucial role: it acts as the dashboard for Istio, providing deep observability, monitoring, and troubleshooting capabilities.

1. Service Mesh Visualization

  • Kiali provides a real-time service graph, helping us visualize how microservices interact with each other.
  • This makes it easier to understand traffic flow, dependencies, and bottlenecks.

2. Traffic & Performance Monitoring

  • With real-time metrics, we can monitor request rates, error percentages, and latency at a glance.
  • This helps us quickly identify and resolve performance issues before they impact users.
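
The figures in these views are derived from Istio's standard Prometheus metrics, so the same numbers can be queried directly. As a sketch, the helper below builds a PromQL expression for the share of requests to one service that returned a 5xx response; the service name and the five-minute window are assumptions chosen for illustration.

```python
def error_ratio_query(service: str, window: str = "5m") -> str:
    """PromQL for the fraction of requests to `service` that returned a 5xx."""
    selector = f'destination_service_name="{service}"'
    # Rate of failed requests over the window.
    errors = (
        f'sum(rate(istio_requests_total{{{selector},response_code=~"5.."}}[{window}]))'
    )
    # Rate of all requests over the same window.
    total = f"sum(rate(istio_requests_total{{{selector}}}[{window}]))"
    return f"{errors} / {total}"


if __name__ == "__main__":
    # Hypothetical service name.
    print(error_ratio_query("ledger"))
```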

3. Istio Configuration Management

  • Kiali allows us to validate Istio configurations (VirtualServices, DestinationRules, Gateways, etc.) and ensures there are no misconfigurations.
  • Built-in linting and error detection help prevent deployment issues.

4. Debugging & Tracing

  • Integrated with Jaeger, Kiali enables distributed tracing, allowing us to analyze request paths and optimize services.
  • Helps in root cause analysis by pinpointing where failures or latencies occur.

5. Security Insights

  • Provides visibility into mTLS status, showing which services are securely communicating.
  • Helps enforce zero-trust security policies across the mesh.

How Kiali Helped Us

With Kiali, we no longer need to dig through logs manually or rely solely on Prometheus/Grafana dashboards for Istio monitoring. It provides a single pane of glass for understanding service-to-service communication, which makes troubleshooting and performance tuning much faster.

Bottom line: Kiali made our Istio adoption smoother, significantly reducing operational complexity and improving real-time visibility into our microservices! 🚀


Conclusion & What’s Next

Adopting Istio helped us improve security, observability, and traffic management while reducing operational overhead. Moving to Ambient Mode further optimized resource usage and streamlined service-to-service communication.

Looking ahead, we’re excited to explore new Istio features, particularly around:

  • L4/L7 policy enhancements in Ambient Mode
  • Zero-trust security models
  • More efficient service discovery mechanisms

Istio has been a game-changer for us, and we’re looking forward to pushing its limits even further! 🚀