Service Mesh 3 with ambient mode in action

I recently deployed a production-ready Red Hat Developer Hub (RHDH) instance at granatengeorg.de (yes, I really bought that URL :-D) with Service Mesh 3 configured in ambient mode. The setup brings real security and observability benefits, but the hardest part was getting health checks to work. In this post, I’ll focus on the solutions I implemented, particularly the routingViaHost configuration that made the deployment work.

The Challenge: Health Checks in Ambient Mode

The biggest obstacle I encountered was getting Kubernetes health checks (liveness and readiness probes) to function properly in the ambient mode environment. Pods were failing to start because the kubelet couldn’t reach the application containers to perform health checks.

Why This Happens: In ambient mode, the Service Mesh redirects pod traffic at the node level through ztunnel (zero-trust tunnel), a proxy that runs on every node. That is good for security, but kubelet probes are plain connections from the node to the pod, and the ambient networking layer can block or interfere with them before they reach the container.

As documented in the Red Hat OpenShift Service Mesh 3.1 release notes, health checks in ambient mode require routing via the host network.

The Solution: Routing Via Host Network

The solution is to switch the cluster network's gateway configuration (the OVN-Kubernetes gateway, not a mesh ingress gateway) to routing via the host network, so that the kubelet can reach application containers for health checks. Application traffic stays in the mesh, so the security benefits of ambient mode are preserved.

Implementation

I found the configuration details in the OpenShift documentation for configuring gateway routing via host and implemented it as a shell script.

The script is available in my repository at rhdh-georg/11-enable-routingViaHost.sh. Here’s what it does:

  1. Configures the gateway routing: Enables routingViaHost in the OVN-Kubernetes gateway configuration so that health check traffic is routed via the host network
  2. Enables direct kubelet access: Allows liveness and readiness probes to reach application containers directly
  3. Preserves ambient mode security: Maintains zero-trust networking for application traffic while bypassing the mesh for health checks

This configuration is critical because without working health checks, Kubernetes cannot properly manage the pod lifecycle. Pods will fail to start, restart continuously, or be marked as unhealthy even when the application is running correctly.
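
The repository script is not reproduced here. As a minimal sketch of the kind of change it applies, assuming the cluster uses OVN-Kubernetes and the `oc` CLI is logged in with sufficient rights, the following Python builds a merge patch for the cluster Network operator and the matching `oc patch` command line. The function names are illustrative and not taken from the repository.

```python
import json
import shlex


def routing_via_host_patch(enabled: bool = True) -> dict:
    """Build a merge patch that sets gatewayConfig.routingViaHost."""
    return {
        "spec": {
            "defaultNetwork": {
                "ovnKubernetesConfig": {
                    "gatewayConfig": {"routingViaHost": enabled}
                }
            }
        }
    }


def oc_patch_command(patch: dict) -> list[str]:
    """Return the argv for `oc patch` against the cluster Network operator."""
    return [
        "oc", "patch", "network.operator.openshift.io", "cluster",
        "--type=merge",
        "-p", json.dumps(patch, separators=(",", ":")),
    ]


# Print the command instead of running it, so it can be reviewed first.
print(shlex.join(oc_patch_command(routing_via_host_patch())))
```

To actually apply it, the argv list could be handed to `subprocess.run(..., check=True)`. This is a cluster-wide network setting, so it is worth reviewing the printed command before doing that.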

How It Works

The routingViaHost setting is part of the OVN-Kubernetes gateway configuration in the cluster Network operator. When it is set to true, OVN-Kubernetes hands traffic to the host's networking stack instead of processing it entirely within OVN. It is a cluster-wide setting, not something scoped to probes. The effect that matters here is a working path from the kubelet (running on the node) to the application container, so probes are no longer caught by the ambient mesh’s ztunnel redirection that would otherwise intercept the traffic.

The key insight is that health checks are infrastructure-level operations that need to work independently of the service mesh. By routing them via the host network, we ensure that Kubernetes can properly monitor and manage pod health while still maintaining all the security and observability benefits of ambient mode for actual application traffic.
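
To confirm the setting is in place, one option is to read the Network operator configuration back (for example with `oc get network.operator.openshift.io cluster -o json`) and check the field. The sketch below assumes that JSON is available as a string. The sample document is hand-written and trimmed, not captured from a real cluster.

```python
import json


def routing_via_host_enabled(network_operator_json: str) -> bool:
    """Return True if gatewayConfig.routingViaHost is true in the given JSON."""
    node = json.loads(network_operator_json)
    # Walk down the nested keys, tolerating any level being absent.
    for key in ("spec", "defaultNetwork", "ovnKubernetesConfig", "gatewayConfig"):
        node = node.get(key) or {}
    return node.get("routingViaHost") is True


# Hand-written sample of the relevant JSON shape (illustrative only).
sample = json.dumps({
    "spec": {
        "defaultNetwork": {
            "type": "OVNKubernetes",
            "ovnKubernetesConfig": {"gatewayConfig": {"routingViaHost": True}},
        }
    }
})
print(routing_via_host_enabled(sample))
```

A missing `gatewayConfig` is treated as "not enabled", which keeps the check safe to run before the setting has ever been applied.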

Other Solutions Implemented

While the routingViaHost solution was the most critical, several other configurations were necessary for a complete deployment:

Keycloak Authentication Integration

Problem: Integrating Keycloak with RHDH for authentication required proper OIDC configuration.

Solution: Configured Keycloak realms and clients with appropriate OIDC/OAuth2 flows, and mapped Keycloak roles to Backstage permissions. The configuration is available in keycloak-secrets.yaml in the repository.
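
The role mapping can be pictured as generating RBAC policy lines in the CSV format that RHDH uses (`p` lines grant a permission to a role, `g` lines assign a member to a role). This is only a sketch: the role and group names are hypothetical, and it assumes the Keycloak roles show up in RHDH as group entity references, which the actual files in the repository may handle differently.

```python
def rbac_policy_lines(role: str, members: list[str],
                      permissions: list[tuple[str, str]]) -> list[str]:
    """Render RBAC CSV lines for one role.

    `members` are entity refs such as "group:default/developers";
    `permissions` are (permission, action) pairs.
    """
    role_ref = f"role:default/{role}"
    # One "p" line per permission granted to the role.
    lines = [f"p, {role_ref}, {perm}, {action}, allow" for perm, action in permissions]
    # One "g" line per member assigned to the role.
    lines += [f"g, {member}, {role_ref}" for member in members]
    return lines


# Hypothetical role and group names, chosen only for illustration.
for line in rbac_policy_lines(
    "catalog-readers",
    ["group:default/developers"],
    [("catalog-entity", "read")],
):
    print(line)
```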

Ambient Mode Pod Labeling

Problem: RHDH components needed to work correctly with ambient mode networking.

Solution: Properly labeled all RHDH pods and namespaces to enable ambient mode, and configured waypoint proxies for L7 policies where needed. This ensures that application traffic benefits from zero-trust security while health checks use the host network routing.
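
As a rough illustration of the labeling step, the sketch below builds the `oc label` commands for enrolling a namespace in ambient mode and, optionally, pointing it at a waypoint. The labels `istio.io/dataplane-mode=ambient` and `istio.io/use-waypoint` are the standard Istio ones. The namespace and waypoint names are placeholders, not values from the repository.

```python
from typing import Optional


def ambient_label_commands(namespace: str,
                           waypoint: Optional[str] = None) -> list[list[str]]:
    """Return `oc label` argv lists that enroll a namespace in ambient mode."""
    commands = [["oc", "label", "namespace", namespace,
                 "istio.io/dataplane-mode=ambient", "--overwrite"]]
    if waypoint is not None:
        # Send the namespace's traffic through a named waypoint for L7 policy.
        commands.append(["oc", "label", "namespace", namespace,
                         f"istio.io/use-waypoint={waypoint}", "--overwrite"])
    return commands


# "rhdh" and "waypoint" are placeholder names used only for illustration.
for argv in ambient_label_commands("rhdh", waypoint="waypoint"):
    print(" ".join(argv))
```

The waypoint itself is a separate Gateway resource that has to exist already. Creating it is outside the scope of this sketch.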

GitOps Configuration with ArgoCD

Problem: Managing configuration changes and ensuring consistency across deployments.

Solution: Set up ArgoCD for GitOps-based deployment management. All configurations are version-controlled in Git, with automated sync ensuring the cluster matches the desired state. This provides an audit trail and easy rollback capability.
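
For readers who have not used it, an automated-sync setup boils down to an Application resource like the one built below. This is a sketch under assumptions: the repository URL is a placeholder, the `openshift-gitops` namespace assumes the OpenShift GitOps operator's default instance, and the project, path, and target namespace are illustrative rather than copied from the repository.

```python
import json


def argocd_application(name: str, repo_url: str, path: str, namespace: str) -> dict:
    """Build an Argo CD Application manifest with automated sync enabled."""
    return {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "Application",
        # Assumption: Argo CD runs in the openshift-gitops namespace.
        "metadata": {"name": name, "namespace": "openshift-gitops"},
        "spec": {
            "project": "default",
            "source": {"repoURL": repo_url, "path": path, "targetRevision": "HEAD"},
            "destination": {
                "server": "https://kubernetes.default.svc",
                "namespace": namespace,
            },
            # prune removes resources deleted from Git; selfHeal reverts drift.
            "syncPolicy": {"automated": {"prune": True, "selfHeal": True}},
        },
    }


# Placeholder repository URL and names, for illustration only.
app = argocd_application("rhdh", "https://git.example.com/org/repo.git", "rhdh-georg", "rhdh")
print(json.dumps(app, indent=2))
```

Because JSON is valid YAML, the printed manifest could be piped straight into `oc apply -f -`.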

Domain and Ingress Configuration

Problem: Setting up secure access via the granatengeorg.de domain.

Solution: Configured appropriate ingress resources with TLS termination, ensuring secure access to the developer hub while maintaining compatibility with ambient mode networking.
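
The post does not say which resource kind was used for ingress. Assuming an OpenShift Route with edge TLS termination, a minimal manifest could be built like this. The Route name, Service name, and port name are hypothetical. Only the host comes from the text above.

```python
import json


def edge_route(name: str, host: str, service: str, target_port: str) -> dict:
    """Build an OpenShift Route that terminates TLS at the router (edge)."""
    return {
        "apiVersion": "route.openshift.io/v1",
        "kind": "Route",
        "metadata": {"name": name},
        "spec": {
            "host": host,
            "to": {"kind": "Service", "name": service},
            "port": {"targetPort": target_port},
            "tls": {
                "termination": "edge",
                # Redirect plain HTTP requests to HTTPS.
                "insecureEdgeTerminationPolicy": "Redirect",
            },
        },
    }


# Service and port names are placeholders, for illustration only.
route = edge_route("rhdh", "granatengeorg.de", "rhdh-backend", "http")
print(json.dumps(route, indent=2))
```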

Repository Structure

All configuration files and solutions are available in the GitHub repository. The repository includes:

  • rhdh-georg/11-enable-routingViaHost.sh - The critical health check routing solution
  • rhdh.yaml - Main Red Hat Developer Hub deployment configuration
  • rhdh-config.yaml - Application-specific settings
  • dynamic-plugins-rhdh.yaml - Dynamic plugin configurations
  • rbac-policies.yaml - Role-based access control policies
  • keycloak-secrets.yaml - Keycloak authentication configuration
  • argocd-secrets.yaml - ArgoCD GitOps configuration

Key Takeaways

  1. Health checks are critical: The routingViaHost solution was the most important fix, enabling the entire deployment to function correctly.

  2. Ambient mode requires specific configurations: While ambient mode simplifies operations by avoiding sidecar injection, it requires careful attention to networking configurations, especially for infrastructure-level operations like health checks.

  3. Documentation is your friend: The Red Hat and OpenShift documentation provided the key insights needed to solve the health check problem.

  4. Script it: Automating the routingViaHost configuration in a shell script makes it easy to apply and maintain.

Conclusion

Deploying Red Hat Developer Hub with Service Mesh 3 in ambient mode provides excellent security and observability benefits, but requires careful attention to networking configurations. The routingViaHost solution for health checks was the critical piece that made everything work together.

If you’re facing similar challenges with health checks in ambient mode, I encourage you to check out the 11-enable-routingViaHost.sh script in the repository. The complete configuration is available for reference and adaptation to your own environments.

The deployment is now live at granatengeorg.de, providing a production-ready developer hub experience with the enhanced security and observability that Service Mesh 3 ambient mode brings to the table.