Istio service mesh add-on plug-in CA certificate troubleshooting
This article discusses common troubleshooting issues with the Istio add-on plug-in certificate authority (CA) certificates feature, and it offers solutions to fix these issues. The article also reviews the general process of setting up plug-in CA certificates for the service mesh add-on.
Note
This article assumes that Istio revision asm-1-21 is deployed on the cluster.
Prerequisites
The Kubernetes kubectl tool, or a similar tool, to connect to the cluster. To install kubectl by using Azure CLI, run the az aks install-cli command.
The following Linux-style standard shell tools:
grep
sort
tail
awk
xargs
The jq tool for querying JSON data.
General setup process
Before you enable the Istio add-on to use the plug-in CA certificates feature, you have to enable the Azure Key Vault provider for Secrets Store add-on on the cluster. Make sure that the Azure Key Vault and the cluster are on the same Azure tenant.
After the Azure Key Vault secrets provider add-on is enabled, you have to set up access to the Azure Key Vault for the user-assigned managed identity that the add-on creates.
After you grant permission for the user-assigned managed identity to access the Azure Key Vault, you can use the plug-in CA certificates feature together with the Istio add-on. For more information, see the Enable the Istio add-on to use a plug-in CA certificate section.
For the cluster to auto-detect changes in the Azure Key Vault secrets, you have to enable auto-rotation for the Azure Key Vault secrets provider add-on.
Although changes to the intermediate certificate are applied automatically, changes to the root certificate are picked up by the control plane only after the istiod deployment is restarted by a cronjob that the add-on deploys, as explained in the Deployed resources section. This cronjob runs at a 10-minute interval.
Enable the Istio add-on to use a plug-in CA certificate
The Istio add-on plug-in CA certificates feature allows you to configure plug-in root and intermediate certificates for the mesh. To provide plug-in certificate information when you enable the add-on, specify the following parameters for the az aks mesh enable command in Azure CLI.
Parameter | Description |
---|---|
--key-vault-id <resource-id> | The Azure Key Vault resource ID. This resource is expected to be in the same tenant as the managed cluster. This resource ID must be in the Azure Resource Manager template (ARM template) resource ID format. |
--root-cert-object-name <root-cert-obj-name> | The root certificate object name in the Azure Key Vault. |
--ca-cert-object-name <inter-cert-obj-name> | The intermediate certificate object name in the Azure Key Vault. |
--ca-key-object-name <inter-key-obj-name> | The intermediate certificate private key object name in the Azure Key Vault. |
--cert-chain-object-name <cert-chain-obj-name> | The certificate chain object name in the Azure Key Vault. |
If you want to use the plug-in CA certificates feature, you must specify all five parameters. All Azure Key Vault objects are expected to be of the type Secret.
For more information, see Plug in CA certificates for Istio-based service mesh add-on on Azure Kubernetes Service.
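As a rough illustration of supplying all five parameters in one call, the following sketch wraps az aks mesh enable in a shell function. The function name and the order of its positional arguments are assumptions made for this example, and the RESOURCE_GROUP and CLUSTER variables are assumed to be set already. The flags are the ones listed in the parameter table.

```shell
# Hypothetical helper: enables the Istio add-on with plug-in CA certificates.
# Arguments: <key-vault-resource-id> <root-cert-obj> <ca-cert-obj> <ca-key-obj> <cert-chain-obj>
# Assumes RESOURCE_GROUP and CLUSTER are already set in the environment.
enable_mesh_with_plugin_ca() {
  if [ "$#" -ne 5 ]; then
    echo "usage: enable_mesh_with_plugin_ca <key-vault-id> <root-cert> <ca-cert> <ca-key> <cert-chain>" >&2
    return 2
  fi
  az aks mesh enable \
    --resource-group "$RESOURCE_GROUP" \
    --name "$CLUSTER" \
    --key-vault-id "$1" \
    --root-cert-object-name "$2" \
    --ca-cert-object-name "$3" \
    --ca-key-object-name "$4" \
    --cert-chain-object-name "$5"
}
```

The argument-count check reflects the requirement that all five parameters be specified together. The function refuses to call the CLI if any of them is missing.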
Deployed resources
As part of the add-on deployment for the plug-in certificates feature, the following resources are deployed onto the cluster:
The cacerts Kubernetes secret is created in the aks-istio-system namespace at the time of the add-on deployment. This secret contains synchronized Azure Key Vault secrets:
kubectl describe secret cacerts --namespace aks-istio-system
Name: cacerts
Namespace: aks-istio-system
Labels: secrets-store.csi.k8s.io/managed=true
Annotations: <none>
Type: opaque
Data
====
ca-cert.pem: 1968 bytes
ca-key.pem: 3272 bytes
cert-chain.pem: 3786 bytes
root-cert.pem: 3636 bytes
The istio-spc-asm-1-21 SecretProviderClass object is created in the aks-istio-system namespace at the time of the add-on deployment. This resource contains Azure-specific parameters for the Secrets Store Container Storage Interface (CSI) driver:
kubectl get secretproviderclass --namespace aks-istio-system
NAME AGE
istio-spc-asm-1-21 14h
The istio-ca-root-cert configmap is created in the aks-istio-system namespace and all user-managed namespaces. This configmap contains the root certificate that the certificate authority uses, and it's used by workloads in the namespaces to validate workload-to-workload communication, as follows:
kubectl describe configmap istio-ca-root-cert --namespace aks-istio-system
Name: istio-ca-root-cert
Namespace: aks-istio-system
Labels: istio.io/config=true
Annotations: <none>
Data
====
root-cert.pem:
----
-----BEGIN CERTIFICATE-----
<certificate data>
-----END CERTIFICATE-----
The istio-cert-validator-cronjob-asm-1-21 cronjob object is created in the aks-istio-system namespace. This cronjob is scheduled to run every 10 minutes to check for updates on the root certificate. If the root certificate that's in the cacerts Kubernetes secret doesn't match the istio-ca-root-cert configmap in the aks-istio-system namespace, it restarts the istiod-asm-1-21 deployment:
kubectl get cronjob --namespace aks-istio-system
NAME SCHEDULE SUSPEND ACTIVE
istio-cert-validator-cronjob-asm-1-21 */10 * * * * False 0
You can run the following command to check the cronjob logs for the last run:
kubectl logs --namespace aks-istio-system $(kubectl get pods --namespace aks-istio-system | grep 'istio-cert-validator-cronjob-' | sort -k8 | tail -n 1 | awk '{print $1}')
This command generates one of the following output messages, depending on whether a root certificate update was detected:
Root certificate update not detected.
Root certificate update detected. Restarting deployment... deployment.apps/istiod-asm-1-21 restarted Deployment istiod-asm-1-21 restarted.
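To check by hand whether the next cronjob run is likely to trigger a restart, you can compare the two copies of the root certificate yourself. The following is a minimal sketch of that comparison, not the add-on's actual cronjob script. The function names are hypothetical, and the sketch assumes that both objects store the certificate under the root-cert.pem key, as the sample outputs in this section show.

```shell
# Sketch only: approximates the comparison that the article describes.
# It is not the add-on's actual cronjob script.
NAMESPACE="aks-istio-system"

# Prints the root certificate stored in the cacerts secret (base64-decoded).
secret_root_cert() {
  kubectl get secret cacerts --namespace "$NAMESPACE" \
    --output jsonpath='{.data.root-cert\.pem}' | base64 --decode
}

# Prints the root certificate stored in the istio-ca-root-cert configmap.
configmap_root_cert() {
  kubectl get configmap istio-ca-root-cert --namespace "$NAMESPACE" \
    --output jsonpath='{.data.root-cert\.pem}'
}

# Returns 0 when both copies are non-empty and identical, nonzero otherwise.
root_certs_match() {
  local from_secret from_configmap
  from_secret=$(secret_root_cert) || return 2
  from_configmap=$(configmap_root_cert) || return 2
  [ -n "$from_secret" ] && [ "$from_secret" = "$from_configmap" ]
}
```

For example, `root_certs_match && echo "in sync" || echo "restart expected"` gives a quick indication. Trailing newlines are ignored by the comparison, but any other formatting difference between the two copies is reported as a mismatch.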
Determine certificate type in deployment logs
You can view the istiod deployment logs to determine whether you have a self-signed CA certificate or a plug-in CA certificate. To view the logs, run the following command:
kubectl logs deploy/istiod-asm-1-21 --container discovery --namespace aks-istio-system | grep -v validationController
Immediately before each certificate log entry is another log entry that describes that kind of certificate. For a self-signed CA certificate, the entry states "No plugged-in cert at etc/cacerts/ca-key.pem; self-signed cert is used." For a plug-in certificate, the entry states "Use plugged-in cert at etc/cacerts/ca-key.pem." Sample log entries that pertain to the certificates are shown in the following tables.
Log entries for a self-signed CA certificate
Timestamp | Log level | Message |
---|---|---|
2023-11-20T23:27:36.649019Z | info | Using istiod file format for signing ca files |
2023-11-20T23:27:36.649032Z | info | No plugged-in cert at etc/cacerts/ca-key.pem; self-signed cert is used |
2023-11-20T23:27:36.649536Z | info | x509 cert - <certificate-details> |
2023-11-20T23:27:36.649552Z | info | Istiod certificates are reloaded |
2023-11-20T23:27:36.649613Z | info | spiffe Added 1 certs to trust domain cluster.local in peer cert verifier |
Log entries for a plug-in CA certificate
Timestamp | Log level | Message |
---|---|---|
2023-11-21T00:20:25.808396Z | info | Using istiod file format for signing ca files |
2023-11-21T00:20:25.808412Z | info | Use plugged-in cert at etc/cacerts/ca-key.pem |
2023-11-21T00:20:25.808731Z | info | x509 cert - <certificate-details> |
2023-11-21T00:20:25.808764Z | info | x509 cert - <certificate-details> |
2023-11-21T00:20:25.808799Z | info | x509 cert - <certificate-details> |
2023-11-21T00:20:25.808803Z | info | Istiod certificates are reloaded |
2023-11-21T00:20:25.808873Z | info | spiffe Added 1 certs to trust domain cluster.local in peer cert verifier |
The certificate details in a log entry are shown as comma-separated values for the issuer, subject, serial number (SN—a long hexadecimal string), and the beginning and ending timestamp values that define when the certificate is valid.
For a self-signed CA certificate, there's one detail entry. Sample values for this certificate are shown in the following table.
Issuer | Subject | SN | NotBefore | NotAfter |
---|---|---|---|---|
"O=cluster.local" | "" | <32-digit-hex-value> | "2023-11-20T23:25:36Z" | "2033-11-17T23:27:36Z" |
For a plug-in CA certificate, there are three detail entries. The other two entries are for a root certificate update and a change to the intermediate certificate. Sample values for these entries are shown in the following table.
Issuer | Subject | SN | NotBefore | NotAfter |
---|---|---|---|---|
"CN=Intermediate CA - A1,O=Istio,L=cluster-A1" | "" | <32-digit-hex-value> | "2023-11-21T00:18:25Z" | "2033-11-18T00:20:25Z" |
"CN=Root A,O=Istio" | "CN=Intermediate CA - A1,O=Istio,L=cluster-A1" | <40-digit-hex-value> | "2023-11-04T01:40:22Z" | "2033-11-01T01:40:22Z" |
"CN=Root A,O=Istio" | "CN=Root A,O=Istio" | <40-digit-hex-value> | "2023-11-04T01:38:27Z" | "2033-11-01T01:38:27Z" |
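The two marker messages quoted in this section can also be checked mechanically. The following sketch reads istiod log text on standard input and reports which marker appeared most recently. The function name and the three output labels are assumptions made for this example.

```shell
# Hypothetical helper: reads istiod log text on standard input and prints
# "plug-in", "self-signed", or "unknown", based on the most recent marker line.
classify_istiod_ca() {
  local last
  last=$(grep -E 'Use plugged-in cert at|No plugged-in cert at' | tail -n 1) || true
  case "$last" in
    *'Use plugged-in cert at'*) echo "plug-in" ;;
    *'No plugged-in cert at'*)  echo "self-signed" ;;
    *)                          echo "unknown" ;;
  esac
}
```

A possible invocation is `kubectl logs deploy/istiod-asm-1-21 --container discovery --namespace aks-istio-system | classify_istiod_ca`. An "unknown" result only means that neither marker was found in the log text that was supplied, for example because older entries have already rotated out.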
Troubleshoot common issues
Issue 1: Access to Azure Key Vault is set up incorrectly
After you enable the Azure Key Vault secrets provider add-on, you have to grant access for the user-assigned managed identity of the add-on to the Azure Key Vault. Setting up access to Azure Key Vault incorrectly causes the add-on installation to stall. To check for this condition, list the pods in the aks-istio-system namespace:
kubectl get pods --namespace aks-istio-system
In the list of pods, you can see that the istiod-asm-1-21 pods are stuck in an Init:0/2 state.
NAME | READY | STATUS | RESTARTS | AGE |
---|---|---|---|---|
istiod-asm-1-21-6fcfd88478-2x95b | 0/1 | Terminating | 0 | 5m55s |
istiod-asm-1-21-6fcfd88478-6x5hh | 0/1 | Terminating | 0 | 5m40s |
istiod-asm-1-21-6fcfd88478-c48f9 | 0/1 | Init:0/2 | 0 | 54s |
istiod-asm-1-21-6fcfd88478-wl8mw | 0/1 | Init:0/2 | 0 | 39s |
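On a busy cluster, it can help to filter the pod list down to the stuck pods only. The following sketch does this with awk, which is one of the prerequisite tools. The function name is hypothetical, and the filter assumes the default kubectl get pods column order, in which STATUS is the third column.

```shell
# Hypothetical helper: prints the names of istiod pods whose STATUS column
# starts with "Init:". Column 3 is STATUS in the default kubectl output.
list_stuck_istiod_pods() {
  kubectl get pods --namespace aks-istio-system --no-headers \
    | awk '$1 ~ /^istiod-/ && $3 ~ /^Init:/ { print $1 }'
}
```

An empty result means that no istiod pod is currently waiting on its init containers. It doesn't rule out other failure states, such as Terminating or CrashLoopBackOff.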
To verify the Azure Key Vault access issue, run the kubectl get pods command to locate pods that have the app=secrets-store-provider-azure label in the kube-system namespace, and pipe the pod names to kubectl logs to view their logs:
kubectl get pods --selector app=secrets-store-provider-azure --namespace kube-system --output name | xargs -I {} kubectl logs --namespace kube-system {}
The following sample output shows that a "403 Forbidden" error occurred because the identity that the add-on uses doesn't have "get" permission for secrets on the Key Vault:
"failed to process mount request" err="failed to get objectType:secret, objectName:<secret-object-name>, objectVersion:: keyvault.BaseClient#GetSecret: Failure responding to request: StatusCode=403 -- Original Error: autorest/azure: Service returned an error. Status=403 Code=\"Forbidden\" Message=\"The user, group or application 'appid=<appid>;oid=<oid>;iss=<iss>' does not have secrets get permission on key vault 'MyAzureKeyVault;location=eastus'. For help resolving this issue, please see https://go.microsoft.com/fwlink/?linkid=2125287\" InnerError={\"code\":\"AccessDenied\"}"
To fix this problem, grant the user-assigned managed identity of the Azure Key Vault secrets provider add-on Get and List permissions on Azure Key Vault secrets, and then reinstall the Istio add-on. First, get the object ID of the user-assigned managed identity for the Azure Key Vault secrets provider add-on by running the az aks show command:
OBJECT_ID=$(az aks show --resource-group $RESOURCE_GROUP --name $CLUSTER --query 'addonProfiles.azureKeyvaultSecretsProvider.identity.objectId' --output tsv)
To set the access policy, run the following az keyvault set-policy command by specifying the object ID that you obtained:
az keyvault set-policy --name $AKV_NAME --object-id $OBJECT_ID --secret-permissions get list
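After you set the policy, you might want to confirm that it took effect. The following sketch queries the vault's access policies for the given object ID and checks for both permissions. It applies only to vaults that use the Vault Access Policy permission model. The function name is hypothetical, and the query path assumes the accessPolicies layout that az keyvault show returns.

```shell
# Hypothetical check: succeeds when the vault's access policy for the given
# object ID includes both "get" and "list" on secrets.
# Arguments: <key-vault-name> <object-id>
has_secret_get_list() {
  local perms
  perms=$(az keyvault show --name "$1" \
    --query "properties.accessPolicies[?objectId=='$2'].permissions.secrets[]" \
    --output tsv) || return 2
  # Match whole lines, ignoring case, because permission names can be capitalized.
  printf '%s\n' "$perms" | grep -qix 'get' \
    && printf '%s\n' "$perms" | grep -qix 'list'
}
```

For example, `has_secret_get_list "$AKV_NAME" "$OBJECT_ID" && echo "policy OK"`.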
Note
Did you create your Key Vault by using Azure RBAC Authorization for your permission model instead of Vault Access Policy? In this case, see Provide access to Key Vault keys, certificates, and secrets with an Azure role-based access control to create permissions for the managed identity. Add an Azure role assignment for Key Vault Secrets User for the user-assigned managed identity of the add-on. The Key Vault Reader role grants access to metadata only and can't read secret contents.
Issue 2: Auto-detection of Key Vault secret changes isn't set up
For a cluster to auto-detect changes in the Azure Key Vault secrets, you have to enable auto-rotation for the Azure Key Vault provider add-on. Auto-rotation can detect changes in intermediate and root certificates automatically. For a cluster that enables the Azure Key Vault provider add-on, run the following az aks show command to check whether auto-rotation is enabled:
az aks show --resource-group $RESOURCE_GROUP --name $CLUSTER | jq -r '.addonProfiles.azureKeyvaultSecretsProvider.config.enableSecretRotation'
If the cluster enabled the Azure Key Vault provider add-on, run the following az aks show command to determine the rotation poll interval:
az aks show --resource-group $RESOURCE_GROUP --name $CLUSTER | jq -r '.addonProfiles.azureKeyvaultSecretsProvider.config.rotationPollInterval'
Azure Key Vault secrets are synchronized with the cluster when the poll interval time elapses after the previous synchronization. The default interval value is two minutes.
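If the check shows that auto-rotation is turned off, one way to turn it on is to update the add-on configuration. The following sketch wraps az aks addon update for that purpose. The function name is hypothetical, RESOURCE_GROUP and CLUSTER are assumed to be set, and the 2m fallback simply mirrors the default interval mentioned above.

```shell
# Sketch: turns on secret auto-rotation for the Azure Key Vault secrets
# provider add-on. Assumes RESOURCE_GROUP and CLUSTER are set.
# Optional argument: poll interval such as "2m" or "5m".
enable_secret_rotation() {
  local interval="${1:-2m}"
  az aks addon update \
    --resource-group "$RESOURCE_GROUP" \
    --name "$CLUSTER" \
    --addon azure-keyvault-secrets-provider \
    --enable-secret-rotation \
    --rotation-poll-interval "$interval"
}
```

A shorter interval makes certificate changes reach the cluster sooner, at the cost of more frequent requests to the Key Vault.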
Issue 3: Certificate values are missing or are configured incorrectly
If secret objects are missing from Azure Key Vault, or if these objects are configured incorrectly, the istiod-asm-1-21 pods might get stuck in an Init:0/2 status, delaying the installation of the add-on. To find the underlying cause of this problem, run the following kubectl describe command against the istiod deployment and view the output:
kubectl describe deploy/istiod-asm-1-21 --namespace aks-istio-system
The command displays events that might resemble the following output table. In this example, a missing secret is the cause of the problem.
Type | Reason | Age | From | Message |
---|---|---|---|---|
Normal | Scheduled | 3m9s | default-scheduler | Successfully assigned aks-istio-system/istiod-asm-1-21-6fcfd88478-hqdjj to aks-userpool-24672518-vmss000000 |
Warning | FailedMount | 66s | kubelet | Unable to attach or mount volumes: unmounted volumes=[cacerts], unattached volumes=[], failed to process volumes=[]: timed out waiting for the condition |
Warning | FailedMount | 61s (x9 over 3m9s) | kubelet | MountVolume.SetUp failed for volume "cacerts" : rpc error: code = Unknown desc = failed to mount secrets store objects for pod aks-istio-system/istiod-asm-1-21-6fcfd88478-hqdjj, err: rpc error: code = Unknown desc = failed to mount objects, error: failed to get objectType:secret, objectName:test-cert-chain, objectVersion:: keyvault.BaseClient#GetSecret: Failure responding to request: StatusCode=404 -- Original Error: autorest/azure: Service returned an error. Status=404 Code="SecretNotFound" Message="A secret with (name/id) test-cert-chain was not found in this key vault. If you recently deleted this secret you may be able to recover it using the correct recovery command. For help resolving this issue, please see https://go.microsoft.com/fwlink/?linkid=2125182" |
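Before you reinstall the add-on, you can check up front that every object name you plan to pass actually exists in the vault. The following sketch loops over the names and reports any that can't be read. The function name is hypothetical. Note that the check runs with the permissions of the signed-in Azure CLI user, not those of the add-on's managed identity, so it confirms that the objects exist but not that the add-on can read them.

```shell
# Hypothetical helper: reports which of the given secret object names
# can't be read from the vault. Arguments: <key-vault-name> <object-name>...
# Returns 0 when every object is readable, 1 otherwise.
find_missing_secrets() {
  local vault="$1" missing=0 name
  shift
  for name in "$@"; do
    if ! az keyvault secret show --vault-name "$vault" --name "$name" \
         --query id --output tsv >/dev/null 2>&1; then
      echo "missing or unreadable: $name"
      missing=1
    fi
  done
  return "$missing"
}
```

For the example in the events table, `find_missing_secrets "$AKV_NAME" test-cert-chain` would be expected to report test-cert-chain as missing.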
Third-party information disclaimer
The third-party products that this article discusses are manufactured by companies that are independent of Microsoft. Microsoft makes no warranty, implied or otherwise, about the performance or reliability of these products.
Third-party contact disclaimer
Microsoft provides third-party contact information to help you find additional information about this topic. This contact information may change without notice. Microsoft does not guarantee the accuracy of third-party contact information.