Network problems can occur in new installations of Kubernetes or when you increase the Kubernetes load. Other problems that relate back to networking might also occur. Always check the AKS troubleshooting guide to see whether your problem is described there. This article describes additional details and considerations from a network troubleshooting perspective, and specific problems that might arise.
Client can't reach the API server
These errors involve connection problems that occur when you can't reach an Azure Kubernetes Service (AKS) cluster's API server through the Kubernetes cluster command-line tool (kubectl) or any other tool, like the REST API via a programming language.
Error
You might see errors that look like these:
Unable to connect to the server: dial tcp <API-server-IP>:443: i/o timeout
Unable to connect to the server: dial tcp <API-server-IP>:443: connectex: A connection attempt
failed because the connected party did not properly respond after a period of time, or established
connection failed because connected host has failed to respond.
Cause 1
It's possible that authorized IP ranges are enabled on the cluster's API server, but the client's IP address isn't included in those ranges. To determine whether authorized IP ranges are enabled, use the following az aks show command in Azure CLI. If they're enabled, the command produces a list of IP ranges.
az aks show --resource-group <cluster-resource-group> \
--name <cluster-name> \
--query apiServerAccessProfile.authorizedIpRanges
Solution 1
Ensure that your client's IP address is within the ranges authorized by the cluster's API server:
Find your local IP address. For information on how to find it on Windows and Linux, see How to find my IP.
Update the range that's authorized by the API server by using the az aks update command in Azure CLI, and authorize your client's IP address. For instructions, see Update a cluster's API server authorized IP ranges.
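The steps above can be sketched as follows, assuming your workstation has outbound internet access; the resource group and cluster names are placeholders, and ifconfig.me is just one of several services that echo your public IP:

```shell
# Find your workstation's current public IP address.
CURRENT_IP=$(curl -s https://ifconfig.me)

# Add it as a /32 to the cluster's authorized ranges.
az aks update \
  --resource-group <cluster-resource-group> \
  --name <cluster-name> \
  --api-server-authorized-ip-ranges ${CURRENT_IP}/32
```

Note that `--api-server-authorized-ip-ranges` replaces the existing list, so include any ranges that other clients still need.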
Cause 2
If your AKS cluster is a private cluster, the API server endpoint doesn't have a public IP address. You need to use a VM that has network access to the AKS cluster's virtual network.
Solution 2
For information on how to resolve this problem, see options for connecting to a private cluster.
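If placing a VM in the cluster's virtual network isn't practical, `az aks command invoke` is one option for running kubectl commands against a private cluster without direct network access. A sketch with placeholder names:

```shell
# Run a kubectl command inside the cluster's network via the AKS-managed proxy.
az aks command invoke \
  --resource-group <cluster-resource-group> \
  --name <cluster-name> \
  --command "kubectl get pods -A"
```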
Pod fails to allocate the IP address
Error
The Pod is stuck in the ContainerCreating state, and its events report a Failed to allocate address error:
Normal SandboxChanged 5m (x74 over 8m) kubelet, k8s-agentpool-00011101-0 Pod sandbox
changed, it will be killed and re-created.
Warning FailedCreatePodSandBox 21s (x204 over 8m) kubelet, k8s-agentpool-00011101-0 Failed
create pod sandbox: rpc error: code = Unknown desc = NetworkPlugin cni failed to set up pod
"deployment-azuredisk6-874857994-487td_default" network: Failed to allocate address: Failed to
delegate: Failed to allocate address: No available addresses
Check the allocated IP addresses in the plugin IPAM store. You might find that all IP addresses are allocated, even though the number of running Pods is much smaller:
# Kubenet, for example. The actual path of the IPAM store file depends on the network plugin implementation.
cd /var/lib/cni/networks/kubenet
# Count the files that represent allocated IP addresses.
ls -al | wc -l
# Compare with the number of running Pod sandboxes on the node.
docker ps | grep POD | wc -l
Cause 1
This error can be caused by a bug in the network plugin. The plugin can fail to deallocate the IP address when a Pod is terminated.
Solution 1
Contact Microsoft for a workaround or fix.
Cause 2
Pod creation is much faster than garbage collection of terminated Pods.
Solution 2
Configure fast garbage collection for the kubelet. For instructions, see the Kubernetes garbage collection documentation.
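As an illustration only, container garbage collection is tuned through kubelet flags such as the following; the values here are examples, and on AKS these settings are managed through a custom kubelet configuration rather than by invoking the kubelet directly:

```shell
# Example kubelet flags for more aggressive garbage collection of dead containers
# (values are illustrative, not recommendations):
kubelet \
  --minimum-container-ttl-duration=30s \
  --maximum-dead-containers-per-container=1 \
  --maximum-dead-containers=100
```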
Service not accessible within Pods
The first step to resolving this problem is to check whether endpoints have been created automatically for the service:
kubectl get endpoints <service-name>
If you get an empty result, your service's label selector might be wrong. Confirm that the label is correct:
# Query Service LabelSelector.
kubectl get svc <service-name> -o jsonpath='{.spec.selector}'
# Get Pods matching the LabelSelector and check whether they're running.
kubectl get pods -l key1=value1,key2=value2
If the preceding steps return expected values:
Check whether the Pod's containerPort is the same as the service's targetPort.
Check whether podIP:containerPort is working:
# Testing via cURL.
curl -v telnet://<Pod-IP>:<containerPort>
# Testing via Telnet.
telnet <Pod-IP> <containerPort>
These are some other potential causes of service problems:
The container isn't listening on the specified containerPort. (Check the Pod description.)
A CNI plugin error or network route error is occurring.
kube-proxy isn't running, or the iptables rules aren't configured correctly.
A network policy is dropping traffic. For information on applying and testing network policies, see Azure Kubernetes Network Policies overview.
If you're using Calico as your network plugin, you can capture network policy traffic as well. For information on configuring that, see the Calico site.
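The kube-proxy and iptables possibilities above can be checked with something like the following sketch; the `component=kube-proxy` label is typical for AKS but can vary, and the grep target is a placeholder:

```shell
# Confirm that a kube-proxy Pod is running on every node.
kubectl -n kube-system get pods -l component=kube-proxy -o wide

# On a node (for example, through an SSH or debug session), list the
# iptables rules that kube-proxy programmed for your service's cluster IP.
sudo iptables-save | grep <service-cluster-IP>
```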
Nodes can't reach the API server
Many add-ons and containers need to access the Kubernetes API (for example, kube-dns and operator containers). If errors occur during this process, the following steps can help you determine the source of the problem.
First, confirm whether the Kubernetes API is accessible within Pods:
kubectl run curl --image=mcr.microsoft.com/azure-cli -i -t --restart=Never --overrides='[{"op":"add","path":"/spec/containers/0/resources","value":{"limits":{"cpu":"200m","memory":"128Mi"}}}]' --override-type json --command -- sh
Then run the following commands from within the container that you're now shelled into.
# If you don't see a command prompt, try pressing Enter.
KUBE_TOKEN=$(cat /var/run/secrets/kubernetes.io/serviceaccount/token)
curl -sSk -H "Authorization: Bearer $KUBE_TOKEN" https://$KUBERNETES_SERVICE_HOST:$KUBERNETES_SERVICE_PORT/api/v1/namespaces/default/pods
Healthy output will look similar to the following.
{
  "kind": "PodList",
  "apiVersion": "v1",
  "metadata": {
    "selfLink": "/api/v1/namespaces/default/pods",
    "resourceVersion": "2285"
  },
  "items": [
  ...
If an error occurs, check whether the kubernetes-internal service and its endpoints are healthy:
kubectl get service kubernetes-internal
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes-internal ClusterIP 10.96.0.1 <none> 443/TCP 25m
kubectl get endpoints kubernetes-internal
NAME ENDPOINTS AGE
kubernetes-internal 172.17.0.62:6443 25m
If both tests return responses like the preceding ones, and the IP and port returned match the ones for your container, it's likely that kube-apiserver isn't running or is blocked from the network.
There are four main reasons why the access might be blocked:
Your network policies. They might be preventing access to the API management plane. For information on testing Network Policies, see Network Policies overview.
Your API's allowed IP addresses. For information about resolving this problem, see Update a cluster's API server authorized IP ranges.
Your private firewall. If you route the AKS traffic through a private firewall, make sure there are outbound rules as described in Required outbound network rules and FQDNs for AKS clusters.
Your private DNS. If you're hosting a private cluster and you're unable to reach the API server, your DNS forwarders might not be configured properly. To ensure proper communication, complete the steps in Hub and spoke with custom DNS.
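To check the DNS possibility, resolve the API server FQDN from a VM that should have access. A sketch with placeholder names; `privateFqdn` applies to private clusters (use `fqdn` for public ones):

```shell
# Get the cluster's API server FQDN.
API_FQDN=$(az aks show \
  --resource-group <cluster-resource-group> \
  --name <cluster-name> \
  --query privateFqdn -o tsv)

# Confirm that it resolves from inside the network.
nslookup "$API_FQDN"
```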
You can also check kube-apiserver logs by using Container insights. For information on querying kube-apiserver logs, and many other queries, see How to query logs from Container insights.
Finally, you can check the kube-apiserver status and its logs on the cluster itself:
# Check kube-apiserver status.
kubectl -n kube-system get pod -l component=kube-apiserver
# Get kube-apiserver logs.
PODNAME=$(kubectl -n kube-system get pod -l component=kube-apiserver -o jsonpath='{.items[0].metadata.name}')
kubectl -n kube-system logs $PODNAME --tail 100
If a 403 - Forbidden error returns, kube-apiserver is probably configured with role-based access control (RBAC), and your container's ServiceAccount probably isn't authorized to access resources. In this case, you should create appropriate RoleBinding and ClusterRoleBinding objects. For information about roles and role bindings, see Access and identity. For examples of how to configure RBAC on your cluster, see Using RBAC Authorization.
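As a minimal sketch, assuming the Pod runs under the default ServiceAccount in the default namespace, you could grant it read-only access with the built-in view ClusterRole; the binding name here is arbitrary:

```shell
# Grant the default ServiceAccount cluster-wide read access to most resources.
kubectl create clusterrolebinding default-sa-view \
  --clusterrole=view \
  --serviceaccount=default:default
```

Prefer a namespaced RoleBinding over a ClusterRoleBinding when the workload only needs access within its own namespace.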
Contributors
This article is maintained by Microsoft. It was originally written by the following contributors.
Principal author:
Michael Walters | Senior Consultant
Other contributors:
Mick Alberts | Technical Writer
Ayobami Ayodeji | Senior Program Manager
Bahram Rushenas | Architect
Next steps
Network concepts for applications in AKS
Troubleshoot Applications
Debug Services
Kubernetes Cluster Networking
Choose the best networking plugin for AKS
AKS architecture design
Lift and shift to containers with AKS
Baseline architecture for an AKS cluster
AKS baseline for multiregion clusters
AKS day-2 operations guide
Triage practices
Patching and upgrade guidance
Monitoring AKS with Azure Monitor