Istio service mesh add-on ingress gateway troubleshooting

This article discusses how to troubleshoot ingress gateway issues on the Istio service mesh add-on for Azure Kubernetes Service (AKS). The Istio ingress gateway is an Envoy-based reverse proxy that you can use to route incoming traffic to workloads in the mesh.

For the Istio-based service mesh add-on, we offer the following ingress gateway options:

  • An internal ingress gateway that uses a private IP address.

  • An external ingress gateway that uses a publicly accessible IP address.

Note

Microsoft doesn't support customizing the IP address for either the internal or external ingress gateways. Any IP customization changes to the Istio service mesh add-on will be reverted.

The add-on deploys Istio ingress gateway pods and deployments per revision. If you're doing a canary upgrade and have two control plane revisions installed in your cluster, then you might have to troubleshoot multiple ingress gateway pods across both revisions.
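To see which gateway resources belong to which revision, it can help to list everything in the gateway namespace together with its labels. The following is a minimal sketch, not an official procedure: the function name `list_ingress_gateways` is hypothetical, and it assumes that the gateway resources live in the aks-istio-ingress namespace that's used later in this article.

```shell
# Hypothetical helper: list the ingress gateway deployments and pods, with
# labels, so that the resources for each installed revision can be told apart.
# Assumes the gateways run in the aks-istio-ingress namespace.
list_ingress_gateways() {
  kubectl get deployments,pods -n aks-istio-ingress -o wide --show-labels
}
```

Run `list_ingress_gateways` after you source the snippet. During a canary upgrade, you should expect to see one set of gateway pods for each control plane revision.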

Troubleshooting checklist

Step 1: Make sure no firewall or NSG rules block the ingress gateway

Verify that you don't have firewall or Network Security Group (NSG) rules that block traffic to the ingress gateway. If inbound traffic passes through Azure Firewall, you have to explicitly add a Destination Network Address Translation (DNAT) rule so that the traffic can reach the ingress gateway.
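One way to check whether traffic reaches the gateway at all is to probe its load balancer IP address from a client outside the cluster. The following sketch makes several assumptions: the function name `probe_external_gateway` is hypothetical, the external gateway service is named aks-istio-ingressgateway-external in the aks-istio-ingress namespace, and the gateway listens on port 80 for HTTP.

```shell
# Hypothetical helper: probe the external ingress gateway over HTTP.
# Assumes the service name, namespace, and port 80 described in the lead-in.
probe_external_gateway() {
  gateway_ip=$(kubectl get service aks-istio-ingressgateway-external \
    -n aks-istio-ingress \
    -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
  if [ -z "$gateway_ip" ]; then
    echo "No external IP address is assigned to the gateway service" >&2
    return 1
  fi
  # Print only the HTTP status code. Any status code, even 404, means that the
  # request reached a listener. A connection timeout suggests that a firewall
  # or NSG rule dropped the traffic before it reached the gateway.
  curl -sS -o /dev/null --connect-timeout 5 -w '%{http_code}\n' \
    "http://${gateway_ip}:80/"
}
```

If `probe_external_gateway` times out but the gateway pods are healthy, review the firewall and NSG rules on the path to the load balancer.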

Step 2: Configure gateways, virtual services, and destination rules correctly

When you configure gateways, virtual services, and destination rules for traffic routing through the ingress gateway, follow these steps:

  1. Make sure that the ingress gateway selector in the gateway resource is set to one of the following values, depending on whether you're using an external or an internal gateway:

    • istio: aks-istio-ingressgateway-external
    • istio: aks-istio-ingressgateway-internal
  2. Make sure that the ports are set correctly in gateways and virtual services. For the gateway, the port should be set to 80 for HTTP or 443 for HTTPS. For the virtual service, the port should be set to the port that the corresponding service for the application is listening on.

  3. Verify that the service is exposed within the hosts specification for both the gateway and the virtual service. If you experience issues that are related to the Host header in the requests, try temporarily allowing all hosts by setting the hosts specification to an asterisk wildcard ("*"), such as in this example gateway configuration. However, we don't recommend the wildcard as a production practice. In production, configure the hosts specification explicitly.
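The three checks in this step can be illustrated by using a small manifest. The following sketch writes a gateway and a virtual service to a local file. Everything except the selector value is an assumption made for illustration: the resource names, the default namespace, the host app.example.com, the backend service sample-app, its port 9080, and the networking.istio.io/v1beta1 API version.

```shell
# Write an illustrative Gateway and VirtualService to a local file.
# All names, the host, and the backend port are hypothetical placeholders.
cat > gateway-example.yaml <<'EOF'
apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
  name: sample-gateway
  namespace: default
spec:
  selector:
    istio: aks-istio-ingressgateway-external  # or aks-istio-ingressgateway-internal
  servers:
  - port:
      number: 80          # 80 for HTTP, 443 for HTTPS
      name: http
      protocol: HTTP
    hosts:
    - "app.example.com"   # explicit host; use "*" only for temporary debugging
---
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: sample-virtualservice
  namespace: default
spec:
  hosts:
  - "app.example.com"     # must be covered by the gateway hosts
  gateways:
  - sample-gateway
  http:
  - route:
    - destination:
        host: sample-app
        port:
          number: 9080    # the port that the application service listens on
EOF
```

After you replace the placeholders with your own values, you can apply the file by running `kubectl apply -f gateway-example.yaml`.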

Step 3: Fix the health of the ingress gateway pod

If the ingress gateway pod crashes or doesn't appear in the ready state, verify that the Istio daemon (istiod) control plane pod is in the ready state. The ingress gateway depends on the istiod release being ready.
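A quick way to check the control plane is to list its pods. The following sketch assumes that the add-on runs istiod in the aks-istio-system namespace, which this article doesn't state. The function name `check_istiod` is hypothetical.

```shell
# Hypothetical helper: show the status of the istiod control plane pods.
# Assumes the control plane runs in the aks-istio-system namespace.
check_istiod() {
  kubectl get pods -n aks-istio-system -o wide
}
```

In the output of `check_istiod`, look at the READY and STATUS columns for each istiod pod before you continue to troubleshoot the gateway.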

If the istiod pod doesn't appear in the ready state, make sure that the Istio custom resource definitions (CRDs) and the base Helm chart are installed correctly. To do this, run the following command:

helm ls --all --all-namespaces

The output might show a broader add-on installation error that isn't specific to the ingress gateway.

If the istiod pod is healthy, but the ingress gateway pods aren't responding, inspect the following ingress gateway resources in the aks-istio-ingress namespace to collect more information:

  • Helm release
  • Deployment
  • Service
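The resources in this list can be inspected by using standard helm and kubectl commands. The following is a minimal sketch in which the function name `inspect_ingress_gateway` is hypothetical. The event listing is an extra step that isn't part of the list.

```shell
# Hypothetical helper: collect information about the ingress gateway
# resources in the aks-istio-ingress namespace.
inspect_ingress_gateway() {
  ns=aks-istio-ingress
  # Helm release status, including failed or pending releases.
  helm ls --all -n "$ns"
  # Deployments and services, including the load balancer IP address.
  kubectl get deployments,services -n "$ns" -o wide
  # Rollout conditions and replica counts for each gateway deployment.
  kubectl describe deployments -n "$ns"
  # Recent events, oldest first (an addition to the list above).
  kubectl get events -n "$ns" --sort-by=.lastTimestamp
}
```

Run `inspect_ingress_gateway` and look for failed Helm releases, deployments that have unavailable replicas, and services that don't have an assigned IP address.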

Additionally, you can find more information about gateway and sidecar debugging in General Istio service mesh add-on troubleshooting.

Step 4: Configure resource utilization

High resource utilization occurs when the default minimum and maximum replica settings for istiod and the gateways aren't sufficient for the load. In this case, change the horizontal pod autoscaling (HPA) configuration for the affected component.
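The following sketch shows one possible way to review and adjust the replica limits. It assumes that the HPA resources for istiod are in the aks-istio-system namespace and that those for the gateways are in the aks-istio-ingress namespace. Both function names are hypothetical, and the HPA names depend on the installed revision, so look them up first.

```shell
# Hypothetical helper: list the HPA resources for istiod and the gateways.
# Assumes the aks-istio-system and aks-istio-ingress namespaces.
show_istio_hpas() {
  kubectl get hpa -n aks-istio-system
  kubectl get hpa -n aks-istio-ingress
}

# Hypothetical helper: set the replica limits on one HPA.
# Arguments: <namespace> <hpa-name> <min-replicas> <max-replicas>
set_hpa_replicas() {
  kubectl patch hpa "$2" -n "$1" --type merge \
    --patch "{\"spec\": {\"minReplicas\": $3, \"maxReplicas\": $4}}"
}
```

For example, `set_hpa_replicas aks-istio-ingress <hpa-name> 3 6` would raise the limits for one gateway HPA. The values 3 and 6 are placeholders, not recommendations.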

Step 5: Troubleshoot the secure ingress gateway

When an external ingress gateway is configured to expose a secure HTTPS service using simple or mutual TLS, follow these troubleshooting steps:

  1. Verify that the values of the INGRESS_HOST_EXTERNAL and SECURE_INGRESS_PORT_EXTERNAL environment variables are valid based on the output of the following command:

    kubectl -n aks-istio-ingress get service aks-istio-ingressgateway-external
    
  2. Check for error messages in the gateway controller's logs:

    kubectl logs -n aks-istio-ingress <gateway-service-pod>
    
  3. Verify that the secrets are created in the aks-istio-ingress namespace:

    kubectl -n aks-istio-ingress get secrets
    

For the example in Secure ingress gateway for Istio service mesh add-on for Azure Kubernetes Service, the productpage-credential secret should be listed.
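For the first check in the list above, the two environment variables can be derived from the gateway service and then compared with the values in your shell. The following sketch assumes that the service exposes a port that's named https. The function names are hypothetical.

```shell
# Hypothetical helper: derive the external host and secure port from the
# gateway service. Assumes the service has a port that's named "https".
load_secure_ingress_env() {
  svc=aks-istio-ingressgateway-external
  INGRESS_HOST_EXTERNAL=$(kubectl -n aks-istio-ingress get service "$svc" \
    -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
  SECURE_INGRESS_PORT_EXTERNAL=$(kubectl -n aks-istio-ingress get service "$svc" \
    -o jsonpath='{.spec.ports[?(@.name=="https")].port}')
  export INGRESS_HOST_EXTERNAL SECURE_INGRESS_PORT_EXTERNAL
  echo "${INGRESS_HOST_EXTERNAL}:${SECURE_INGRESS_PORT_EXTERNAL}"
}

# Hypothetical helper: confirm that the example credential secret exists.
check_credential_secret() {
  kubectl -n aks-istio-ingress get secret productpage-credential
}
```

Call `load_secure_ingress_env` directly, not in a subshell, so that the exported variables remain available in the current shell. An empty host or port in the printed value points to a service that has no assigned IP address or no matching port.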

After you enable the Azure Key Vault secrets provider add-on, you have to grant access for the user-assigned managed identity of the add-on to the Azure Key Vault. Incorrectly setting up access to Azure Key Vault will prevent the productpage-credential secret from being created.

After you create the SecretProviderClass resource, make sure that the sample pod secrets-store-sync-productpage that references this resource is successfully deployed. The secrets sync from Azure Key Vault to the cluster only if this pod is running.
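The following sketch checks the resources that are involved in the sync. It assumes that the SecretProviderClass resource and the sample pod were created in the aks-istio-ingress namespace. The function name `check_secret_sync` is hypothetical.

```shell
# Hypothetical helper: check the resources that sync the secret from
# Azure Key Vault. Assumes they were created in aks-istio-ingress.
check_secret_sync() {
  kubectl get secretproviderclass -n aks-istio-ingress
  kubectl get pod secrets-store-sync-productpage -n aks-istio-ingress
  # The Events section usually shows volume mount failures, such as
  # access errors against the key vault.
  kubectl describe pod secrets-store-sync-productpage -n aks-istio-ingress
}
```

If the pod isn't running, the events in the output of `check_secret_sync` are a reasonable first place to look for Key Vault access errors.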

Third-party information disclaimer

The third-party products that this article discusses are manufactured by companies that are independent of Microsoft. Microsoft makes no warranty, implied or otherwise, about the performance or reliability of these products.

Third-party contact disclaimer

Microsoft provides third-party contact information to help you find additional information about this topic. This contact information may change without notice. Microsoft does not guarantee the accuracy of third-party contact information.
