Kubernetes By Example
Pods
A pod is a collection of containers sharing a network and mount namespace and is the basic unit of deployment in Kubernetes. All containers in a pod are scheduled on the same node.
To launch a pod using the container image mhausenblas/simpleservice:0.5.0 and exposing an HTTP API on port 9876, execute:
$ kubectl run sise --image=mhausenblas/simpleservice:0.5.0 --port=9876
We can now see that the pod is running:
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
sise-3210265840-k705b 1/1 Running 0 1m
$ kubectl describe pod sise-3210265840-k705b | grep IP:
IP: 172.17.0.3
From within the cluster (e.g. via minishift ssh) this pod is accessible via the pod IP 172.17.0.3, which we’ve learned from the kubectl describe command above:
[cluster] $ curl 172.17.0.3:9876/info
{"host": "172.17.0.3:9876", "version": "0.5.0", "from": "172.17.0.1"}
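Note that, in the kubectl version used here, kubectl run creates a deployment, so in order to get rid of the pod you have to execute kubectl delete deployment sise. (Since kubectl 1.18, kubectl run creates a bare pod instead, which you remove with kubectl delete pod sise.)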
You can also create a pod from a configuration file. In this case the pod is running the already known simpleservice image from above along with a generic CentOS container:
$ kubectl apply -f https://raw.githubusercontent.com/openshift-evangelists/kbe/master/specs/pods/pod.yaml
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
twocontainers 2/2 Running 0 7s
Now we can exec into the CentOS container and access the simpleservice on localhost:
$ kubectl exec twocontainers -c shell -i -t -- bash
[root@twocontainers /]# curl -s localhost:9876/info
{"host": "localhost:9876", "version": "0.5.0", "from": "127.0.0.1"}
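The manifest applied above is not reproduced in the text. As a minimal sketch, a two-container pod of this kind could look roughly as follows. Only the pod name twocontainers and the container name shell come from the commands above; the container name sise, the centos:7 tag, and the sleep command are assumptions, and the actual file may differ:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: twocontainers
spec:
  containers:
  # Assumed name for the container running simpleservice.
  - name: sise
    image: mhausenblas/simpleservice:0.5.0
    ports:
    - containerPort: 9876
  # The container we exec into with "-c shell".
  - name: shell
    image: centos:7                          # assumed image tag
    command: ["bash", "-c", "sleep 10000"]   # assumed: keeps the container running
```

Because both containers share the pod's network namespace, the shell container reaches simpleservice on localhost:9876.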
Specify the resources field in the pod to influence how much CPU and/or RAM a container in a pod can use (here: 64Mi of RAM and 0.5 CPUs):
$ kubectl create -f https://raw.githubusercontent.com/openshift-evangelists/kbe/master/specs/pods/constraint-pod.yaml
$ kubectl describe pod constraintpod
...
Containers:
sise:
...
Limits:
cpu: 500m
memory: 64Mi
Requests:
cpu: 500m
memory: 64Mi
...
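A pod spec that would produce output like the above might look like this sketch. The pod name, container name, and values are taken from the output; the rest is an assumption about the referenced file:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: constraintpod
spec:
  containers:
  - name: sise
    image: mhausenblas/simpleservice:0.5.0
    ports:
    - containerPort: 9876
    resources:
      limits:
        memory: "64Mi"   # upper bound on RAM for this container
        cpu: "500m"      # 500 millicores, i.e. half a CPU
```

When only limits are given, Kubernetes uses the same values as the requests, which would be consistent with the identical Limits and Requests shown above.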
Learn more about resource constraints in the Kubernetes docs.
To remove all the pods created, just run:
$ kubectl delete pod twocontainers
$ kubectl delete pod constraintpod
To sum up, launching one or more containers (together) in Kubernetes is simple. However, doing it directly as shown above comes with a serious limitation: you have to manually take care of keeping them running in case of a failure. A better way to supervise pods is to use deployments, giving you much more control over the life cycle, including rolling out a new version.
Labels
Labels are the mechanism you use to organize Kubernetes objects. A label is a key-value pair with certain restrictions concerning length and allowed values but without any pre-defined meaning. So you’re free to choose labels as you see fit, for example, to express environments such as ‘this pod is running in production’ or ownership, like ‘department X owns that pod’.
Let’s create a pod that initially has one label (env=development):
$ kubectl apply -f https://raw.githubusercontent.com/openshift-evangelists/kbe/master/specs/labels/pod.yaml
$ kubectl get pods --show-labels
NAME READY STATUS RESTARTS AGE LABELS
labelex 1/1 Running 0 10m env=development
In the above get pods command, note the --show-labels option, which outputs the labels of an object in an additional column.
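Labels set at creation time live in the metadata section of the manifest. A minimal sketch for a pod like labelex, assuming it runs the simpleservice image (the text does not say which image the file uses):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: labelex
  labels:
    env: development   # the single initial label
spec:
  containers:
  - name: sise         # assumed container name
    image: mhausenblas/simpleservice:0.5.0   # assumed image
    ports:
    - containerPort: 9876
```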
You can add a label to the pod as follows:
$ kubectl label pods labelex owner=michael
$ kubectl get pods --show-labels
NAME READY STATUS RESTARTS AGE LABELS
labelex 1/1 Running 0 16m env=development,owner=michael
To use a label for filtering, for example to list only pods that have an owner that equals michael, use the --selector option:
$ kubectl get pods --selector owner=michael
NAME READY STATUS RESTARTS AGE
labelex 1/1 Running 0 27m
The --selector option can be abbreviated to -l, so to select pods that are labelled with env=development, do:
$ kubectl get pods -l env=development
NAME READY STATUS RESTARTS AGE
labelex 1/1 Running 0 27m
Oftentimes, Kubernetes objects also support set-based selectors.
Let’s launch another pod that has two labels (env=production and owner=michael):
$ kubectl apply -f https://raw.githubusercontent.com/openshift-evangelists/kbe/master/specs/labels/anotherpod.yaml
Now, let’s list all pods that are either labelled with env=development or with env=production:
$ kubectl get pods -l 'env in (production, development)'
NAME READY STATUS RESTARTS AGE
labelex 1/1 Running 0 43m
labelexother 1/1 Running 0 3m
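Set-based selection is not limited to the command line. Objects such as deployments, replica sets, and jobs accept the same kind of expression in their selector field. The following fragment is a sketch of the equivalent of 'env in (production, development)' and is meant to be embedded in such an object's spec:

```yaml
selector:
  matchExpressions:
  - key: env
    operator: In          # other operators: NotIn, Exists, DoesNotExist
    values:
    - production
    - development
```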
Other verbs also support label selection. For example, you could remove both of these pods with:
$ kubectl delete pods -l 'env in (production, development)'
Beware that this will destroy any pods with those labels.
You can also delete them directly, via their names, with:
$ kubectl delete pods labelex
$ kubectl delete pods labelexother
Note that labels are not restricted to pods. In fact you can apply them to all sorts of objects, such as nodes or services.
Deployments
A deployment is a supervisor for pods, giving you fine-grained control over how and when a new pod version is rolled out as well as rolled back to a previous state.
Let’s create a deployment called sise-deploy that supervises two replicas of a pod as well as a replica set:
$ kubectl apply -f https://raw.githubusercontent.com/openshift-evangelists/kbe/master/specs/deployments/d09.yaml
You can have a look at the deployment, as well as the replica set and the pods the deployment looks after, like so:
$ kubectl get deploy
NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE
sise-deploy 2 2 2 2 10s
$ kubectl get rs
NAME DESIRED CURRENT READY AGE
sise-deploy-3513442901 2 2 2 19s
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
sise-deploy-3513442901-cndsx 1/1 Running 0 25s
sise-deploy-3513442901-sn74v 1/1 Running 0 25s
Note the naming of the pods and replica set, derived from the deployment name.
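For orientation, a deployment along these lines could be declared as in the following sketch. The name sise-deploy and the replica count come from the text; the app=sise label, the image tag, and the use of the SIMPLE_SERVICE_VERSION variable to set the reported version are assumptions about the referenced file:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sise-deploy
spec:
  replicas: 2
  selector:
    matchLabels:
      app: sise              # assumed label; must match the template labels
  template:
    metadata:
      labels:
        app: sise
    spec:
      containers:
      - name: sise
        image: mhausenblas/simpleservice:0.5.0   # assumed tag
        ports:
        - containerPort: 9876
        env:
        - name: SIMPLE_SERVICE_VERSION   # assumed mechanism for the version
          value: "0.9"
```

Changing anything inside template, such as the value above, is what triggers a new rollout and a new replica set.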
At this point in time the sise containers running in the pods are configured to return the version 0.9. Let’s verify that from within the cluster (using kubectl describe first to get the IP of one of the pods):
[cluster] $ curl 172.17.0.3:9876/info
{"host": "172.17.0.3:9876", "version": "0.9", "from": "172.17.0.1"}
Let’s now see what happens if we change that version to 1.0 in an updated deployment:
$ kubectl apply -f https://raw.githubusercontent.com/openshift-evangelists/kbe/master/specs/deployments/d10.yaml
deployment "sise-deploy" configured
Note that you could alternatively have used kubectl edit deploy/sise-deploy to achieve the same by manually editing the deployment.
What we now see is the rollout of two new pods with the updated version 1.0 as well as the two old pods with version 0.9 being terminated:
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
sise-deploy-2958877261-nfv28 1/1 Running 0 25s
sise-deploy-2958877261-w024b 1/1 Running 0 25s
sise-deploy-3513442901-cndsx 1/1 Terminating 0 16m
sise-deploy-3513442901-sn74v 1/1 Terminating 0 16m
Also, a new replica set has been created by the deployment:
$ kubectl get rs
NAME DESIRED CURRENT READY AGE
sise-deploy-2958877261 2 2 2 4s
sise-deploy-3513442901 0 0 0 24m
Note that during the deployment you can check the progress using kubectl rollout status deploy/sise-deploy.
To verify that the new 1.0 version is really available, we execute the following from within the cluster (again using kubectl describe to get the IP of one of the pods):
[cluster] $ curl 172.17.0.5:9876/info
{"host": "172.17.0.5:9876", "version": "1.0", "from": "172.17.0.1"}
A history of all deployments is available via:
$ kubectl rollout history deploy/sise-deploy
deployments "sise-deploy"
REVISION CHANGE-CAUSE
1 <none>
2 <none>
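The CHANGE-CAUSE column shows <none> because no change cause was recorded for these revisions. Kubernetes reads this value from the kubernetes.io/change-cause annotation on the deployment, so one way to populate it is to set that annotation in the manifest. The fragment below is a sketch; the annotation text is purely illustrative:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sise-deploy
  annotations:
    # Illustrative text; shown in the CHANGE-CAUSE column for this revision.
    kubernetes.io/change-cause: "set SIMPLE_SERVICE_VERSION to 1.0"
# spec omitted; it stays as in the deployment manifest
```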
Kubernetes does not automatically roll back a deployment that runs into problems; it only reports a stalled rollout in the deployment status. You can, however, explicitly roll back to a specific revision, as in our case to revision 1 (the original pod version):
$ kubectl rollout undo deploy/sise-deploy --to-revision=1
deployment "sise-deploy" rolled back
$ kubectl rollout history deploy/sise-deploy
deployments "sise-deploy"
REVISION CHANGE-CAUSE
2 <none>
3 <none>
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
sise-deploy-3513442901-ng8fz 1/1 Running 0 1m
sise-deploy-3513442901-s8q4s 1/1 Running 0 1m
At this point in time we’re back where we started, with two new pods again serving version 0.9.
Finally, to clean up, we remove the deployment and with it the replica sets and pods it supervises:
$ kubectl delete deploy sise-deploy
deployment "sise-deploy" deleted
See also the docs for more options on deployments and when they are triggered.
Services
A service is an abstraction for pods, providing a stable, so-called virtual IP (VIP) address. While pods may come and go and with them their IP addresses, a service allows clients to reliably connect to the containers running in the pods using the VIP. The virtual in VIP means it is not an actual IP address connected to a network interface; its purpose is purely to forward traffic to one or more pods. Keeping the mapping between the VIP and the pods up-to-date is the job of kube-proxy, a process that runs on every node and queries the API server to learn about new services in the cluster.
Let’s create a pod supervised by a replication controller (RC) and a service along with it:
$ kubectl apply -f https://raw.githubusercontent.com/openshift-evangelists/kbe/master/specs/services/rc.yaml
$ kubectl apply -f https://raw.githubusercontent.com/openshift-evangelists/kbe/master/specs/services/svc.yaml
Now we have the supervised pod running:
$ kubectl get pods -l app=sise
NAME READY STATUS RESTARTS AGE
rcsise-6nq3k 1/1 Running 0 57s
$ kubectl describe pod rcsise-6nq3k
Name: rcsise-6nq3k
Namespace: default
Security Policy: restricted
Node: localhost/192.168.99.100
Start Time: Tue, 25 Apr 2017 14:47:45 +0100
Labels: app=sise
Status: Running
IP: 172.17.0.3
Controllers: ReplicationController/rcsise
Containers:
...
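A replication controller that yields a pod like the one described above might be declared as in this sketch. The name rcsise and the label app=sise appear in the output; the container name and image tag are assumptions:

```yaml
apiVersion: v1
kind: ReplicationController
metadata:
  name: rcsise
spec:
  replicas: 1
  selector:
    app: sise              # the RC supervises pods carrying this label
  template:
    metadata:
      labels:
        app: sise
    spec:
      containers:
      - name: sise         # assumed container name
        image: mhausenblas/simpleservice:0.5.0
        ports:
        - containerPort: 9876
```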
You can, from within the cluster, access the pod directly via its assigned IP 172.17.0.3:
[cluster] $ curl 172.17.0.3:9876/info
{"host": "172.17.0.3:9876", "version": "0.5.0", "from": "172.17.0.1"}
This is, however, as mentioned above, not advisable since the IPs assigned to pods may change. Hence, enter the simpleservice we’ve created:
$ kubectl get svc
NAME CLUSTER-IP EXTERNAL-IP PORT(S) AGE
simpleservice 172.30.228.255 <none> 80/TCP 5m
$ kubectl describe svc simpleservice
Name: simpleservice
Namespace: default
Labels: <none>
Selector: app=sise
Type: ClusterIP
IP: 172.30.228.255
Port: <unset> 80/TCP
Endpoints: 172.17.0.3:9876
Session Affinity: None
No events.
The service keeps track of the pods it forwards traffic to through the label, in our case app=sise.
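The fields reported by kubectl describe svc map directly onto a service manifest. A sketch that matches the output above (the referenced file may contain additional fields):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: simpleservice
spec:
  selector:
    app: sise          # pods with this label become endpoints
  ports:
  - port: 80           # port exposed on the cluster IP
    protocol: TCP
    targetPort: 9876   # port the container listens on
```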
From within the cluster we can now access simpleservice like so:
[cluster] $ curl 172.30.228.255:80/info
{"host": "172.30.228.255", "version": "0.5.0", "from": "10.0.2.15"}
What makes the VIP 172.30.228.255 forward the traffic to the pod? The answer is: iptables, which is essentially a long list of rules that tells the Linux kernel what to do with a certain IP packet.
Looking at the rules that concern our service (executed on a cluster node) yields:
[cluster] $ sudo iptables-save | grep simpleservice
-A KUBE-SEP-4SQFZS32ZVMTQEZV -s 172.17.0.3/32 -m comment --comment "default/simpleservice:" -j KUBE-MARK-MASQ
-A KUBE-SEP-4SQFZS32ZVMTQEZV -p tcp -m comment --comment "default/simpleservice:" -m tcp -j DNAT --to-destination 172.17.0.3:9876
-A KUBE-SERVICES -d 172.30.228.255/32 -p tcp -m comment --comment "default/simpleservice: cluster IP" -m tcp --dport 80 -j KUBE-SVC-EZC6WLOVQADP4IAW
-A KUBE-SVC-EZC6WLOVQADP4IAW -m comment --comment "default/simpleservice:" -j KUBE-SEP-4SQFZS32ZVMTQEZV
Above you can see the four rules that kube-proxy has thankfully added to the iptables rule set, essentially stating that TCP traffic to 172.30.228.255:80 should be forwarded to 172.17.0.3:9876, which is our pod.
Let’s now add a second pod by scaling up the RC supervising it:
$ kubectl scale --replicas=2 rc/rcsise
replicationcontroller "rcsise" scaled
$ kubectl get pods -l app=sise
NAME READY STATUS RESTARTS AGE
rcsise-6nq3k 1/1 Running 0 15m
rcsise-nv8zm 1/1 Running 0 5s
When we now check the relevant iptables rules again, we notice the addition of a bunch of rules:
[cluster] $ sudo iptables-save | grep simpleservice
-A KUBE-SEP-4SQFZS32ZVMTQEZV -s 172.17.0.3/32 -m comment --comment "default/simpleservice:" -j KUBE-MARK-MASQ
-A KUBE-SEP-4SQFZS32ZVMTQEZV -p tcp -m comment --comment "default/simpleservice:" -m tcp -j DNAT --to-destination 172.17.0.3:9876
-A KUBE-SEP-PXYYII6AHMUWKLYX -s 172.17.0.4/32 -m comment --comment "default/simpleservice:" -j KUBE-MARK-MASQ
-A KUBE-SEP-PXYYII6AHMUWKLYX -p tcp -m comment --comment "default/simpleservice:" -m tcp -j DNAT --to-destination 172.17.0.4:9876
-A KUBE-SERVICES -d 172.30.228.255/32 -p tcp -m comment --comment "default/simpleservice: cluster IP" -m tcp --dport 80 -j KUBE-SVC-EZC6WLOVQADP4IAW
-A KUBE-SVC-EZC6WLOVQADP4IAW -m comment --comment "default/simpleservice:" -m statistic --mode random --probability 0.50000000000 -j KUBE-SEP-4SQFZS32ZVMTQEZV
-A KUBE-SVC-EZC6WLOVQADP4IAW -m comment --comment "default/simpleservice:" -j KUBE-SEP-PXYYII6AHMUWKLYX
In the above listing we see rules for the newly created pod serving at 172.17.0.4:9876 as well as an additional rule:
-A KUBE-SVC-EZC6WLOVQADP4IAW -m comment --comment "default/simpleservice:" -m statistic --mode random --probability 0.50000000000 -j KUBE-SEP-4SQFZS32ZVMTQEZV
This causes the traffic to the service to be split equally between our two pods by invoking the statistic module of iptables.
You can remove all the resources created by doing:
$ kubectl delete svc simpleservice
$ kubectl delete rc rcsise
Service Discovery
Service discovery is the process of figuring out how to connect to a service. While there is a service discovery option based on environment variables available, the DNS-based service discovery is preferable. Note that DNS is a cluster add-on so make sure your Kubernetes distribution provides for one or install it yourself.
Let’s create a service named thesvc and an RC supervising some pods along with it:
$ kubectl apply -f https://raw.githubusercontent.com/openshift-evangelists/kbe/master/specs/sd/rc.yaml
$ kubectl apply -f https://raw.githubusercontent.com/openshift-evangelists/kbe/master/specs/sd/svc.yaml
Now we want to connect to the thesvc service from within the cluster, say, from another service. To simulate this, we create a jump pod in the same namespace (default, since we didn’t specify anything else):
$ kubectl apply -f https://raw.githubusercontent.com/openshift-evangelists/kbe/master/specs/sd/jumpod.yaml
The DNS add-on will make sure that our service thesvc is available via the FQDN thesvc.default.svc.cluster.local from other pods in the cluster. Let’s try it out:
$ kubectl exec -it jumpod -c shell -- ping thesvc.default.svc.cluster.local
PING thesvc.default.svc.cluster.local (172.30.251.137) 56(84) bytes of data.
...
The ping output shows that the service name resolves to the cluster IP 172.30.251.137. We can directly connect to and consume the service (in the same namespace) like so:
$ kubectl exec -it jumpod -c shell -- curl http://thesvc/info
{"host": "thesvc", "version": "0.5.0", "from": "172.17.0.5"}
Note that the IP address 172.17.0.5 above is the cluster-internal IP address of the jump pod.
To access a service that is deployed in a different namespace than the one you’re accessing it from, use an FQDN in the form $SVC.$NAMESPACE.svc.cluster.local.
Let’s see how that works by creating:
- a namespace other
- a service thesvc in namespace other
- an RC supervising the pods, also in namespace other
If you’re not familiar with namespaces, check out the namespace examples first.
$ kubectl apply -f https://raw.githubusercontent.com/openshift-evangelists/kbe/master/specs/sd/other-ns.yaml
$ kubectl apply -f https://raw.githubusercontent.com/openshift-evangelists/kbe/master/specs/sd/other-rc.yaml
$ kubectl apply -f https://raw.githubusercontent.com/openshift-evangelists/kbe/master/specs/sd/other-svc.yaml
We’re now in the position to consume the service thesvc in namespace other from the default namespace (again via the jump pod):
$ kubectl exec -it jumpod -c shell -- curl http://thesvc.other/info
{"host": "thesvc.other", "version": "0.5.0", "from": "172.17.0.5"}
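The namespace part of the DNS name comes from the metadata.namespace of the service. A sketch of a service that would be reachable as thesvc.other, assuming the pods carry the label app=sise and listen on port 9876 as in the earlier examples:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: thesvc
  namespace: other     # yields thesvc.other.svc.cluster.local
spec:
  selector:
    app: sise          # assumed pod label
  ports:
  - port: 80
    targetPort: 9876   # assumed container port
```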
Summing up, DNS-based service discovery provides a flexible and generic way to connect to services across the cluster.
You can destroy all the resources created with:
$ kubectl delete pods jumpod
$ kubectl delete svc thesvc
$ kubectl delete rc rcsise
$ kubectl delete ns other
Keep in mind that removing a namespace will destroy every resource inside.
Port Forward
In the context of developing apps on Kubernetes it is often useful to quickly access a service from your local environment without exposing it using, for example, a load balancer or an ingress resource. In this case you can use port forwarding.
Let’s create an app consisting of a deployment and a service called simpleservice, serving on port 80:
$ kubectl apply -f https://raw.githubusercontent.com/openshift-evangelists/kbe/master/specs/pf/app.yaml
Let’s say we want to access the simpleservice service from the local environment, say, your laptop, on port 8080. So we forward the traffic as follows:
$ kubectl port-forward service/simpleservice 8080:80
Forwarding from 127.0.0.1:8080 -> 9876
Forwarding from [::1]:8080 -> 9876
We can see from the above that the traffic eventually gets routed through the service to the pod serving on port 9876.
Now we can invoke the service locally like so (using a separate terminal session):
$ curl localhost:8080/info
{"host": "localhost:8080", "version": "0.5.0", "from": "127.0.0.1"}
Remember that port forwarding is not meant for production traffic but for development and experimentation.
Health Checks
In order to verify if a container in a pod is healthy and ready to serve traffic, Kubernetes provides a range of health checking mechanisms. Health checks, or probes as they are called in Kubernetes, are carried out by the kubelet to determine when to restart a container (for livenessProbe) and are used by services and deployments to determine if a pod should receive traffic (for readinessProbe).
We will focus on HTTP health checks in the following. Note that it is the responsibility of the application developer to expose a URL that the kubelet can use to determine if the container is healthy (and potentially ready).
Let’s create a pod that exposes an endpoint /health, responding with an HTTP 200 status code:
$ kubectl apply -f https://raw.githubusercontent.com/openshift-evangelists/kbe/master/specs/healthz/pod.yaml
In the pod specification we’ve defined the following:
livenessProbe:
  initialDelaySeconds: 2
  periodSeconds: 5
  httpGet:
    path: /health
    port: 9876
This means that Kubernetes will start checking the /health endpoint, after initially waiting 2 seconds, every 5 seconds.
If we now look at the pod we can see that it is considered healthy:
$ kubectl describe pod hc
Name: hc
Namespace: default
Security Policy: anyuid
Node: 192.168.99.100/192.168.99.100
Start Time: Tue, 25 Apr 2017 16:21:11 +0100
Labels: <none>
Status: Running
...
Events:
FirstSeen LastSeen Count From SubobjectPath Type Reason Message
--------- -------- ----- ---- ------------- -------- ------ -------
3s 3s 1 {default-scheduler } Normal Scheduled Successfully assigned hc to 192.168.99.100
3s 3s 1 {kubelet 192.168.99.100} spec.containers{sise} Normal Pulled Container image "mhausenblas/simpleservice:0.5.0" already present on machine
3s 3s 1 {kubelet 192.168.99.100} spec.containers{sise} Normal Created Created container with docker id 8a628578d6ad; Security:[seccomp=unconfined]
2s 2s 1 {kubelet 192.168.99.100} spec.containers{sise} Normal Started Started container with docker id 8a628578d6ad
Now we launch a bad pod, that is, a pod that has a container that randomly (in the time range 1 to 4 sec) does not return a 200 code:
$ kubectl apply -f https://raw.githubusercontent.com/openshift-evangelists/kbe/master/specs/healthz/badpod.yaml
Looking at the events of the bad pod, we can see that the health check failed:
$ kubectl describe pod badpod
...
Events:
FirstSeen LastSeen Count From SubobjectPath Type Reason Message
--------- -------- ----- ---- ------------- -------- ------ -------
1m 1m 1 {default-scheduler } Normal Scheduled Successfully assigned badpod to 192.168.99.100
1m 1m 1 {kubelet 192.168.99.100} spec.containers{sise} Normal Created Created container with docker id 7dd660f04945; Security:[seccomp=unconfined]
1m 1m 1 {kubelet 192.168.99.100} spec.containers{sise} Normal Started Started container with docker id 7dd660f04945
1m 23s 2 {kubelet 192.168.99.100} spec.containers{sise} Normal Pulled Container image "mhausenblas/simpleservice:0.5.0" already present on machine
23s 23s 1 {kubelet 192.168.99.100} spec.containers{sise} Normal Killing Killing container with docker id 7dd660f04945: pod "badpod_default(53e5c06a-29cb-11e7-b44f-be3e8f4350ff)" container "sise" is unhealthy, it will be killed and re-created.
23s 23s 1 {kubelet 192.168.99.100} spec.containers{sise} Normal Created Created container with docker id ec63dc3edfaa; Security:[seccomp=unconfined]
23s 23s 1 {kubelet 192.168.99.100} spec.containers{sise} Normal Started Started container with docker id ec63dc3edfaa
1m 18s 4 {kubelet 192.168.99.100} spec.containers{sise} Warning Unhealthy Liveness probe failed: Get http://172.17.0.4:9876/health: net/http: request canceled (Client.Timeout exceeded while awaiting headers)
This can also be verified as follows:
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
badpod 1/1 Running 4 2m
hc 1/1 Running 0 6m
From the above you can see that the badpod had already been re-launched 4 times, since the health check failed.
In addition to a livenessProbe you can also specify a readinessProbe, which can be configured in the same way but has a different use case and semantics: it’s used to check the start-up phase of a container in the pod. Imagine a container that loads some data from external storage such as S3 or a database that needs to initialize some tables. In this case you want to signal when the container is ready to serve traffic.
Let’s create a pod with a readinessProbe that kicks in after 10 seconds:
$ kubectl apply -f https://raw.githubusercontent.com/openshift-evangelists/kbe/master/specs/healthz/ready.yaml
Looking at the events of the pod, we can see that, eventually, the pod is ready to serve traffic:
$ kubectl describe pod ready
...
Conditions:
Type Status
Initialized True
Ready True
PodScheduled True
...
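A readiness probe with a 10 second initial delay could be declared as in the following sketch. The pod name ready and the delay come from the text; reusing the /health endpoint and the container name sise are assumptions:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: ready
spec:
  containers:
  - name: sise                 # assumed container name
    image: mhausenblas/simpleservice:0.5.0
    ports:
    - containerPort: 9876
    readinessProbe:
      initialDelaySeconds: 10  # wait before the first check
      httpGet:
        path: /health          # assumed: same endpoint as the liveness check
        port: 9876
```

Until the probe succeeds, the Ready condition stays False and services do not route traffic to the pod.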
You can remove all the created pods with:
$ kubectl delete pod/hc pod/ready pod/badpod
Learn more about configuring probes, including TCP and command probes, via the docs.
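To give an impression of the other two probe types, here is a sketch that combines a TCP liveness probe with a command-based readiness probe. The pod name and the marker file /tmp/ready are hypothetical and only serve as illustration:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: probe-variants         # hypothetical name
spec:
  containers:
  - name: sise
    image: mhausenblas/simpleservice:0.5.0
    ports:
    - containerPort: 9876
    livenessProbe:
      tcpSocket:
        port: 9876             # healthy if a TCP connection can be opened
      periodSeconds: 5
    readinessProbe:
      exec:
        # Ready once the command exits with status 0, i.e. once the
        # (hypothetical) marker file exists inside the container.
        command: ["cat", "/tmp/ready"]
      initialDelaySeconds: 10
```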
Environment Variables
You can set environment variables for containers running in a pod. In addition, Kubernetes automatically exposes certain runtime information via environment variables.
Let’s launch a pod to which we pass an environment variable SIMPLE_SERVICE_VERSION with the value 1.0:
$ kubectl apply -f https://raw.githubusercontent.com/openshift-evangelists/kbe/master/specs/envs/pod.yaml
$ kubectl describe pod envs | grep IP:
IP: 172.17.0.3
Now, let’s verify from within the cluster if the application running in the pod has picked up the environment variable SIMPLE_SERVICE_VERSION:
[cluster] $ curl 172.17.0.3:9876/info
{"host": "172.17.0.3:9876", "version": "1.0", "from": "172.17.0.1"}
And indeed it has picked up the user-provided environment variable, since the default response would be "version": "0.5.0".
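In a manifest, user-provided variables go into the env list of a container. A sketch for a pod like envs, with the container name assumed:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: envs
spec:
  containers:
  - name: sise                 # assumed container name
    image: mhausenblas/simpleservice:0.5.0
    ports:
    - containerPort: 9876
    env:
    - name: SIMPLE_SERVICE_VERSION
      value: "1.0"             # quoted: env values must be strings
```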
You can check what environment variables Kubernetes itself provides automatically (from within the cluster, using a dedicated endpoint that the app exposes):
[cluster] $ curl 172.17.0.3:9876/env
{"version": "1.0", "env": "{'HOSTNAME': 'envs', 'DOCKER_REGISTRY_SERVICE_PORT': '5000', 'KUBERNETES_PORT_443_TCP_ADDR': '172.30.0.1', 'ROUTER_PORT_80_TCP_PROTO': 'tcp', 'KUBERNETES_PORT_53_UDP_PROTO': 'udp', 'ROUTER_SERVICE_HOST': '172.30.246.127', 'ROUTER_PORT_1936_TCP_PROTO': 'tcp', 'KUBERNETES_SERVICE_PORT_DNS': '53', 'DOCKER_REGISTRY_PORT_5000_TCP_PORT': '5000', 'PATH': '/usr/local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin', 'ROUTER_SERVICE_PORT_443_TCP': '443', 'KUBERNETES_PORT_53_TCP': 'tcp://172.30.0.1:53', 'KUBERNETES_SERVICE_PORT': '443', 'ROUTER_PORT_80_TCP_ADDR': '172.30.246.127', 'LANG': 'C.UTF-8', 'KUBERNETES_PORT_53_TCP_ADDR': '172.30.0.1', 'PYTHON_VERSION': '2.7.13', 'KUBERNETES_SERVICE_HOST': '172.30.0.1', 'PYTHON_PIP_VERSION': '9.0.1', 'DOCKER_REGISTRY_PORT_5000_TCP_PROTO': 'tcp', 'REFRESHED_AT': '2017-04-24T13:50', 'ROUTER_PORT_1936_TCP': 'tcp://172.30.246.127:1936', 'KUBERNETES_PORT_53_TCP_PROTO': 'tcp', 'KUBERNETES_PORT_53_TCP_PORT': '53', 'HOME': '/root', 'DOCKER_REGISTRY_SERVICE_HOST': '172.30.1.1', 'GPG_KEY': 'C01E1CAD5EA2C4F0B8E3571504C367C218ADD4FF', 'ROUTER_SERVICE_PORT_80_TCP': '80', 'ROUTER_PORT_443_TCP_ADDR': '172.30.246.127', 'ROUTER_PORT_1936_TCP_ADDR': '172.30.246.127', 'ROUTER_SERVICE_PORT': '80', 'ROUTER_PORT_443_TCP_PORT': '443', 'KUBERNETES_SERVICE_PORT_DNS_TCP': '53', 'KUBERNETES_PORT_53_UDP_ADDR': '172.30.0.1', 'KUBERNETES_PORT_53_UDP': 'udp://172.30.0.1:53', 'KUBERNETES_PORT': 'tcp://172.30.0.1:443', 'ROUTER_PORT_1936_TCP_PORT': '1936', 'ROUTER_PORT_80_TCP': 'tcp://172.30.246.127:80', 'KUBERNETES_SERVICE_PORT_HTTPS': '443', 'KUBERNETES_PORT_53_UDP_PORT': '53', 'ROUTER_PORT_80_TCP_PORT': '80', 'ROUTER_PORT': 'tcp://172.30.246.127:80', 'ROUTER_PORT_443_TCP': 'tcp://172.30.246.127:443', 'SIMPLE_SERVICE_VERSION': '1.0', 'ROUTER_PORT_443_TCP_PROTO': 'tcp', 'KUBERNETES_PORT_443_TCP': 'tcp://172.30.0.1:443', 'DOCKER_REGISTRY_PORT_5000_TCP': 'tcp://172.30.1.1:5000', 'DOCKER_REGISTRY_PORT': 
'tcp://172.30.1.1:5000', 'KUBERNETES_PORT_443_TCP_PORT': '443', 'ROUTER_SERVICE_PORT_1936_TCP': '1936', 'DOCKER_REGISTRY_PORT_5000_TCP_ADDR': '172.30.1.1', 'DOCKER_REGISTRY_SERVICE_PORT_5000_TCP': '5000', 'KUBERNETES_PORT_443_TCP_PROTO': 'tcp'}"}
Alternatively, you can also use kubectl exec to connect to the container and list the environment variables directly there:
$ kubectl exec envs -- printenv
PATH=/usr/local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
HOSTNAME=envs
SIMPLE_SERVICE_VERSION=1.0
KUBERNETES_PORT_53_UDP_ADDR=172.30.0.1
KUBERNETES_PORT_53_TCP_PORT=53
ROUTER_PORT_443_TCP_PROTO=tcp
DOCKER_REGISTRY_PORT_5000_TCP_ADDR=172.30.1.1
KUBERNETES_SERVICE_PORT_DNS_TCP=53
ROUTER_PORT=tcp://172.30.246.127:80
...
You can destroy the created pod with:
$ kubectl delete pod/envs
In addition to the above provided environment variables, you can expose more using the downward API.
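With the downward API, a variable takes its value from a field of the pod itself rather than from a literal. The following sketch exposes the pod name and pod IP; the pod name and the variable names MY_POD_NAME and MY_POD_IP are hypothetical:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: envs-downward          # hypothetical name
spec:
  containers:
  - name: sise
    image: mhausenblas/simpleservice:0.5.0
    env:
    - name: MY_POD_NAME        # hypothetical variable name
      valueFrom:
        fieldRef:
          fieldPath: metadata.name
    - name: MY_POD_IP          # hypothetical variable name
      valueFrom:
        fieldRef:
          fieldPath: status.podIP
```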
Namespaces
Namespaces provide a scope for Kubernetes resources, carving up your cluster into smaller units. You can think of a namespace as a workspace you’re sharing with other users. Many resources such as pods and services are namespaced, while some, for example nodes, are not namespaced (but cluster-wide). As a developer you’d usually use an assigned namespace; admins, however, may wish to manage them, for example to set up access control or resource quotas.
Let’s list all namespaces (note that the output will depend on the environment you’re using, I’m using Minishift here):
$ kubectl get ns
NAME STATUS AGE
default Active 13d
kube-system Active 13d
namingthings Active 12d
openshift Active 13d
openshift-infra Active 13d
You can learn more about a namespace using the describe verb, for example:
$ kubectl describe ns default
Name: default
Labels: <none>
Status: Active
No resource quota.
No resource limits.
Let’s now create a new namespace called test:
$ kubectl apply -f https://raw.githubusercontent.com/openshift-evangelists/kbe/master/specs/ns/ns.yaml
namespace "test" created
$ kubectl get ns
NAME STATUS AGE
default Active 13d
kube-system Active 13d
namingthings Active 12d
openshift Active 13d
openshift-infra Active 13d
test Active 3s
Alternatively, we could have created the namespace using the kubectl create namespace test command.
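A namespace manifest is about as small as Kubernetes objects get. A sketch of what a file creating the test namespace needs to contain:

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: test
```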
To launch a pod in the newly created namespace test, do:
$ kubectl apply --namespace=test -f https://raw.githubusercontent.com/openshift-evangelists/kbe/master/specs/ns/pod.yaml
Note that using the above method the namespace becomes a runtime property, that is, you can deploy the same pod or service, etc. into multiple namespaces (for example: dev and prod). Hard-coding the namespace directly in the metadata section as shown in the following is possible but makes deploying your apps less flexible:
apiVersion: v1
kind: Pod
metadata:
  name: podintest
  namespace: test
To list namespaced objects such as our pod podintest, run the following command:
$ kubectl get pods --namespace=test
NAME READY STATUS RESTARTS AGE
podintest 1/1 Running 0 16s
You can remove the namespace (and everything inside) with:
$ kubectl delete ns test
If you’re an admin, you might want to check out the docs for more info on how to handle namespaces.
Volumes
A Kubernetes volume is essentially a directory accessible to all containers running in a pod. In contrast to the container-local filesystem, the data in volumes is preserved across container restarts. The medium backing a volume and its contents are determined by the volume type:
- node-local types such as emptyDir or hostPath
- file-sharing types such as nfs
- cloud provider-specific types like awsElasticBlockStore, azureDisk, or gcePersistentDisk
- distributed file system types, for example glusterfs or cephfs
- special-purpose types like secret, gitRepo
A special type of volume is PersistentVolume, which we will cover elsewhere.
Let’s create a pod with two containers that use an emptyDir volume to exchange data:
$ kubectl apply -f https://raw.githubusercontent.com/openshift-evangelists/kbe/master/specs/volumes/pod.yaml
$ kubectl describe pod sharevol
Name: sharevol
Namespace: default
...
Volumes:
xchange:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
We first exec into one of the containers in the pod, c1, check the volume mount and generate some data:
$ kubectl exec -it sharevol -c c1 -- bash
[root@sharevol /]# mount | grep xchange
/dev/sda1 on /tmp/xchange type ext4 (rw,relatime,data=ordered)
[root@sharevol /]# echo 'some data' > /tmp/xchange/data
When we now exec into c2, the second container running in the pod, we can see the volume mounted at /tmp/data and are able to read the data created in the previous step:
$ kubectl exec -it sharevol -c c2 -- bash
[root@sharevol /]# mount | grep /tmp/data
/dev/sda1 on /tmp/data type ext4 (rw,relatime,data=ordered)
[root@sharevol /]# cat /tmp/data/data
some data
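Putting the pieces together, a pod of this shape could be declared as in the sketch below. The pod, container, and volume names and the two mount paths are taken from the output above; the centos:7 image and the sleep command are assumptions:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: sharevol
spec:
  containers:
  - name: c1
    image: centos:7                          # assumed image
    command: ["bash", "-c", "sleep 10000"]   # assumed: keeps the container alive
    volumeMounts:
    - name: xchange
      mountPath: /tmp/xchange
  - name: c2
    image: centos:7
    command: ["bash", "-c", "sleep 10000"]
    volumeMounts:
    - name: xchange
      mountPath: /tmp/data                   # same volume, different path
  volumes:
  - name: xchange
    emptyDir: {}
```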
Note that in each container you need to decide where to mount the volume. At the time of writing, emptyDir did not let you specify resource consumption limits; newer Kubernetes versions offer a sizeLimit field for this.
You can remove the pod with:
$ kubectl delete pod/sharevol
Since an emptyDir volume shares the pod’s lifetime, this will also destroy the shared volume and all its contents.
Persistent Volumes
A persistent volume (PV) is a cluster-wide resource that you can use to store data in a way that it persists beyond the lifetime of a pod. The PV is not backed by locally-attached storage on a worker node but by a networked storage system such as EBS or NFS or a distributed filesystem like Ceph.
In order to use a PV you need to claim it first, using a persistent volume claim (PVC). The PVC requests a PV with your desired specification (size, speed, etc.) from Kubernetes; once the claim is bound to a PV, a pod can reference the claim and mount it as a volume. So let’s create such a PVC, asking Kubernetes for 1 GB of storage using the default storage class:
$ kubectl apply -f https://raw.githubusercontent.com/openshift-evangelists/kbe/master/specs/pv/pvc.yaml
$ kubectl get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
myclaim Bound pvc-27fed6b6-3047-11e9-84bb-12b5519f9b58 1Gi RWO gp2-encrypted 18m
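A claim that would lead to the output above could look like this sketch. Leaving out storageClassName is what selects the default storage class; the referenced file may differ in detail:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: myclaim
spec:
  accessModes:
  - ReadWriteOnce      # shown as RWO in the output
  resources:
    requests:
      storage: 1Gi
```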
To understand how the persistence plays out, let’s create a deployment that uses the above PVC to mount it as a volume into /tmp/persistent:
$ kubectl apply -f https://raw.githubusercontent.com/openshift-evangelists/kbe/master/specs/pv/deploy.yaml
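For orientation, here is a hedged sketch of how a deployment can reference the claim. The deployment name is inferred from the pod names in this walkthrough; the labels, container name, volume name, image, and command are assumptions and not taken from the actual deploy.yaml.

```shell
# Write a hypothetical deployment that mounts the claim "myclaim" at /tmp/persistent.
cat > pv-deploy.yaml <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: pv-deploy
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mypv                    # assumed label
  template:
    metadata:
      labels:
        app: mypv
    spec:
      containers:
      - name: shell                # assumed container name
        image: centos:7            # assumed image
        command: ["bash", "-c", "sleep 10000"]
        volumeMounts:
        - name: mypd
          mountPath: /tmp/persistent
      volumes:
      - name: mypd
        persistentVolumeClaim:
          claimName: myclaim       # ties the pod's volume to the PVC created earlier
EOF

# To try it against a cluster: kubectl apply -f pv-deploy.yaml
```

Because the pod template only names the claim, any replacement pod the deployment creates mounts the same underlying PV.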
Now we want to test if data in the volume actually persists. For this we find the pod managed by the above deployment, exec into its main container, and create a file called data in the /tmp/persistent directory (where we decided to mount the PV):
$ kubectl get po
NAME READY STATUS RESTARTS AGE
pv-deploy-69959dccb5-jhxxw 1/1 Running 0 16m
$ kubectl exec -it pv-deploy-69959dccb5-jhxxw -- bash
bash-4.2$ touch /tmp/persistent/data
bash-4.2$ ls /tmp/persistent/
data lost+found
It’s time to destroy the pod and let the deployment launch a new pod. The expectation is that the PV is available again in the new pod and the data in /tmp/persistent is still present. Let’s check that:
$ kubectl delete po pv-deploy-69959dccb5-jhxxw
pod pv-deploy-69959dccb5-jhxxw deleted
$ kubectl get po
NAME READY STATUS RESTARTS AGE
pv-deploy-69959dccb5-kwrrv 1/1 Running 0 16m
$ kubectl exec -it pv-deploy-69959dccb5-kwrrv -- bash
bash-4.2$ ls /tmp/persistent/
data lost+found
And indeed, the data file and its content are still where they are expected to be.
Note that the default behavior is that even when the deployment is deleted, the PVC (and the PV) continues to exist. This storage protection feature helps avoid data loss. Once you’re sure you don’t need the data anymore, you can go ahead and delete the PVC and with it eventually destroy the PV:
$ kubectl delete pvc myclaim
persistentvolumeclaim "myclaim" deleted
The types of PV available in your Kubernetes cluster depend on the environment (on-prem or public cloud). Check out the Stateful Kubernetes reference site if you want to learn more about this topic.
Secrets
You don’t want sensitive information such as a database password or an API key kept around in clear text. Secrets provide you with a mechanism to use such information in a safe and reliable way with the following properties:
- Secrets are namespaced objects, that is, exist in the context of a namespace
- You can access them via a volume or an environment variable from a container running in a pod
- The secret data on nodes is stored in tmpfs volumes
- A per-secret size limit of 1MB exists
- By default, the API server stores secrets unencrypted (only base64-encoded) in etcd, unless encryption at rest is configured
Let’s create a secret apikey that holds a (made-up) API key:
$ echo -n "A19fh68B001j" > ./apikey.txt
$ kubectl create secret generic apikey --from-file=./apikey.txt
secret "apikey" created
$ kubectl describe secrets/apikey
Name: apikey
Namespace: default
Labels: <none>
Annotations: <none>
Type: Opaque
Data
====
apikey.txt: 12 bytes
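The describe output only reports the size of the value. In the Secret object itself the value is kept base64-encoded, which is an encoding and not encryption. The following local sketch, which needs no cluster, shows how easily that encoding is reversed; the variable names are illustrative only.

```shell
# The made-up API key from above.
key='A19fh68B001j'

# Encode it the way values appear under .data in a Secret object (base64, not encryption).
encoded=$(printf '%s' "$key" | base64)
echo "encoded: $encoded"

# Anyone who can read the Secret object can reverse the encoding.
decoded=$(printf '%s' "$encoded" | base64 --decode)
echo "decoded: $decoded"

# Against a cluster, the stored value could be fetched and decoded with:
#   kubectl get secret apikey -o jsonpath='{.data.apikey\.txt}' | base64 --decode
```

This is why read access to secrets should be restricted just as carefully as the credentials themselves.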
Now let’s use the secret in a pod via a volume:
$ kubectl apply -f https://raw.githubusercontent.com/openshift-evangelists/kbe/master/specs/secrets/pod.yaml
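As a sketch of what consuming a secret through a volume involves, a pod spec along these lines would mount the apikey secret at /tmp/apikey. The pod and container names come from the commands in this section; the volume name, image, command, and file name are assumptions.

```shell
# Write a hypothetical pod manifest that mounts the secret "apikey" as a read-only volume.
cat > consumesec-pod.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: consumesec
spec:
  containers:
  - name: shell
    image: centos:7                          # assumed image
    command: ["bash", "-c", "sleep 10000"]   # assumed: keep the container alive
    volumeMounts:
    - name: apikeyvol
      mountPath: /tmp/apikey                 # each key of the secret becomes a file here
      readOnly: true
  volumes:
  - name: apikeyvol
    secret:
      secretName: apikey                     # the secret created with kubectl create secret
EOF

# To try it against a cluster: kubectl apply -f consumesec-pod.yaml
```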
If we now exec into the container we see the secret mounted at /tmp/apikey:
$ kubectl exec -it consumesec -c shell -- bash
[root@consumesec /]# mount | grep apikey
tmpfs on /tmp/apikey type tmpfs (ro,relatime)
[root@consumesec /]# cat /tmp/apikey/apikey.txt
A19fh68B001j
Note that for service accounts Kubernetes automatically creates secrets containing credentials for accessing the API and modifies your pods to use this type of secret.
You can remove both the pod and the secret with:
$ kubectl delete pod/consumesec secret/apikey
Logging
Logging is one option to understand what is going on inside your applications and the cluster at large. Basic logging in Kubernetes makes the output a container produces available, which is useful for debugging. More advanced setups aggregate logs across nodes and store them in a central place, either within the cluster or via a dedicated (cloud-based) service.
Let’s create a pod called logme that runs a container writing to stdout and stderr:
$ kubectl apply -f https://raw.githubusercontent.com/openshift-evangelists/kbe/master/specs/logging/pod.yaml
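A pod like this can be very small. The sketch below is an assumption about its shape and not the actual pod.yaml: it uses the pod name logme and container name gen from this section, while the image and the loop that prints the date to both streams are made up for illustration.

```shell
# Write a hypothetical pod manifest whose container logs the date to stdout and stderr.
cat > logme-pod.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: logme
spec:
  containers:
  - name: gen
    image: centos:7      # assumed image
    command:
    - bash
    - -c
    # date goes to stdout; tee copies the same line to stderr
    - "while true; do date | tee /dev/stderr; sleep 1; done"
EOF

# To try it against a cluster: kubectl apply -f logme-pod.yaml
```

Since kubectl logs shows both streams, each timestamp would appear twice in the output.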
To view the five most recent log lines of the gen container in the logme pod, execute:
$ kubectl logs --tail=5 logme -c gen
Thu Apr 27 11:34:40 UTC 2017
Thu Apr 27 11:34:41 UTC 2017
Thu Apr 27 11:34:41 UTC 2017
Thu Apr 27 11:34:42 UTC 2017
Thu Apr 27 11:34:42 UTC 2017
To stream the log of the gen container in the logme pod (like tail -f), do:
$ kubectl logs -f --since=10s logme -c gen
Thu Apr 27 11:43:11 UTC 2017
Thu Apr 27 11:43:11 UTC 2017
Thu Apr 27 11:43:12 UTC 2017
Thu Apr 27 11:43:12 UTC 2017
Thu Apr 27 11:43:13 UTC 2017
...
Note that if you had not specified --since=10s in the above command, you would have gotten all log lines from the start of the container.
You can also view logs of pods that have already completed their lifecycle.
For this we create a pod called oneshot that counts down from 9 to 1 and then exits. Using the -p option you can print the logs for previous instances of the container in a pod:
$ kubectl apply -f https://raw.githubusercontent.com/openshift-evangelists/kbe/master/specs/logging/oneshotpod.yaml
$ kubectl logs -p oneshot -c gen
9
8
7
6
5
4
3
2
1
You can remove the created pods with:
$ kubectl delete pod/logme pod/oneshot
Jobs
A job in Kubernetes is a supervisor for pods carrying out batch processes, that is, a process that runs for a certain time to completion, for example a calculation or a backup operation.
Let’s create a job called countdown that supervises a pod counting from 9 down to 1:
$ kubectl apply -f https://raw.githubusercontent.com/openshift-evangelists/kbe/master/specs/jobs/job.yaml
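To make the job concept more concrete, here is a hedged sketch of a Job manifest for such a countdown. The job name and the centos:7 image match what this section shows; the container name, the exact command, and the file name are assumptions.

```shell
# Write a hypothetical Job manifest that counts down from 9 to 1 and then exits.
cat > countdown-job.yaml <<'EOF'
apiVersion: batch/v1
kind: Job
metadata:
  name: countdown
spec:
  template:
    metadata:
      name: countdown
    spec:
      containers:
      - name: counter              # assumed container name
        image: centos:7
        command:
        - bash
        - -c
        - "for i in 9 8 7 6 5 4 3 2 1; do echo $i; done"
      restartPolicy: Never         # a finished pod is not restarted; the job records completion
EOF

# The command itself can be tried locally, without a cluster:
bash -c 'for i in 9 8 7 6 5 4 3 2 1; do echo $i; done'

# To try the job against a cluster: kubectl apply -f countdown-job.yaml
```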
You can see the job and the pod it looks after like so:
$ kubectl get jobs
NAME DESIRED SUCCESSFUL AGE
countdown 1 1 5s
$ kubectl get pods --show-all
NAME READY STATUS RESTARTS AGE
countdown-lc80g 0/1 Completed 0 16s
To learn more about the status of the job, do:
$ kubectl describe jobs/countdown
Name: countdown
Namespace: default
Image(s): centos:7
Selector: controller-uid=ff585b92-2b43-11e7-b44f-be3e8f4350ff
Parallelism: 1
Completions: 1
Start Time: Thu, 27 Apr 2017 13:21:10 +0100
Labels: controller-uid=ff585b92-2b43-11e7-b44f-be3e8f4350ff
job-name=countdown
Pods Statuses: 0 Running / 1 Succeeded / 0 Failed
No volumes.
Events:
FirstSeen LastSeen Count From SubobjectPath Type Reason Message
--------- -------- ----- ---- ------------- -------- ------ -------
2m 2m 1 {job-controller } Normal SuccessfulCreate Created pod: countdown-lc80g
And to see the output of the job via the pod it supervised, execute:
$ kubectl logs countdown-lc80g
9
8
7
6
5
4
3
2
1
To clean up, use the delete verb on the job object, which will remove all the supervised pods:
$ kubectl delete job countdown
job "countdown" deleted
Note that there are also more advanced ways to use jobs, for example, by utilizing a work queue or scheduling the execution at a certain time via cron jobs.
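As an illustration of the cron job variant mentioned above, the same countdown could be wrapped in a CronJob. This is a sketch under assumptions: the name countdown-nightly and the 02:00 daily schedule are made up, and the batch/v1 API version for CronJob requires a reasonably recent cluster.

```shell
# Write a hypothetical CronJob manifest that runs the countdown every day at 02:00.
cat > countdown-cronjob.yaml <<'EOF'
apiVersion: batch/v1
kind: CronJob
metadata:
  name: countdown-nightly          # hypothetical name
spec:
  schedule: "0 2 * * *"            # minute hour day-of-month month day-of-week
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: counter
            image: centos:7
            command:
            - bash
            - -c
            - "for i in 9 8 7 6 5 4 3 2 1; do echo $i; done"
          restartPolicy: Never
EOF

# To try it against a cluster: kubectl apply -f countdown-cronjob.yaml
```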
StatefulSet
If you have a stateless app you want to use a deployment. However, for a stateful app you might want to use a StatefulSet. Unlike a deployment, the StatefulSet provides certain guarantees about the identity of the pods it is managing (that is, predictable names) and about the startup order. Two more things are different compared to a deployment: for network communication you need to create a headless service, and for persistency the StatefulSet manages a persistent volume per pod.
In order to see how this all plays together, we will be using an educational Kubernetes-native NoSQL datastore.
Let’s start with creating the stateful app, that is, the StatefulSet along with the persistent volumes and the headless service:
$ kubectl apply -f https://raw.githubusercontent.com/mhausenblas/mehdb/master/app.yaml
After a minute or so, you can have a look at all the resources that have been created:
$ kubectl get sts,po,pvc,svc
NAME DESIRED CURRENT AGE
statefulset.apps/mehdb 2 2 1m
NAME READY STATUS RESTARTS AGE
pod/mehdb-0 1/1 Running 0 1m
pod/mehdb-1 1/1 Running 0 56s
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
persistentvolumeclaim/data-mehdb-0 Bound pvc-bc2d9b3b-310d-11e9-aeff-123713f594ec 1Gi RWO ebs 1m
persistentvolumeclaim/data-mehdb-1 Bound pvc-d4b7620f-310d-11e9-aeff-123713f594ec 1Gi RWO ebs 56s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/mehdb ClusterIP None <none> 9876/TCP 1m
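The listing above reflects the two StatefulSet specifics: a headless service (CLUSTER-IP None) and one claim per pod (data-mehdb-0, data-mehdb-1). The following simplified sketch shows how such a setup can be declared. It is not the actual app.yaml: the image is a placeholder, and the labels, container name, and mount path are assumptions.

```shell
# Write a hypothetical, simplified headless service plus StatefulSet manifest.
cat > mehdb-sketch.yaml <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: mehdb
spec:
  clusterIP: None                  # headless: DNS resolves to the individual pods
  selector:
    app: mehdb
  ports:
  - port: 9876
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mehdb
spec:
  serviceName: mehdb               # links the pods to the headless service
  replicas: 2                      # pods are named mehdb-0 and mehdb-1
  selector:
    matchLabels:
      app: mehdb
  template:
    metadata:
      labels:
        app: mehdb
    spec:
      containers:
      - name: shard                          # assumed container name
        image: example.com/mehdb:latest      # placeholder, not the real image
        ports:
        - containerPort: 9876
        volumeMounts:
        - name: data
          mountPath: /mehdbdata              # assumed mount path
  volumeClaimTemplates:
  - metadata:
      name: data                   # yields claims named data-mehdb-0, data-mehdb-1
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 1Gi
EOF
```

The claim names in the listing follow the pattern of template name plus pod name, which is how each pod keeps finding its own volume after a restart.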
Now we can check if the stateful app is working properly. To do this, we use the /status endpoint of the headless service mehdb:9876, and since we haven’t put any data into the datastore yet, we’d expect that 0 keys are reported:
$ kubectl run -it --rm jumpod --restart=Never --image=quay.io/mhausenblas/jump:0.2 -- curl 'mehdb:9876/status?level=full'
If you don't see a command prompt, try pressing enter.
0
pod "jumpod" deleted
And indeed, 0 keys are reported, as shown above.
Note that sometimes a StatefulSet is not the best fit for your stateful app. You might be better off defining a custom resource along with writing a custom controller to have finer-grained control over your workload.
Init Containers
It’s sometimes necessary to prepare a container running in a pod. For example, you might want to wait for a service to become available, configure things at runtime, or initialize some data in a database. In all of these cases, init containers are useful. Note that Kubernetes will execute all init containers (and they must all exit successfully) before the main container(s) are executed.
So let’s create a deployment consisting of an init container that writes a message into a file at /ic/this and a main (long-running) container that reads out this file:
$ kubectl apply -f https://raw.githubusercontent.com/openshift-evangelists/kbe/master/specs/ic/deploy.yaml
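The mechanics can be sketched as follows: the init container and the main container share a volume, the init container writes the file and exits, and only then does the main container start. The deployment name and the path /ic/this come from this section; the labels, container names, image, volume type, and commands are assumptions and not the actual deploy.yaml.

```shell
# Write a hypothetical deployment with one init container and one main container.
cat > ic-deploy.yaml <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ic-deploy
spec:
  replicas: 1
  selector:
    matchLabels:
      app: ic                      # assumed label
  template:
    metadata:
      labels:
        app: ic
    spec:
      initContainers:
      - name: msginit              # runs to completion before "main" starts
        image: centos:7
        command: ["bash", "-c", "echo INIT_DONE > /ic/this"]
        volumeMounts:
        - name: msg
          mountPath: /ic
      containers:
      - name: main                 # long-running: prints the file every 5 seconds
        image: centos:7
        command: ["bash", "-c", "while true; do cat /ic/this; sleep 5; done"]
        volumeMounts:
        - name: msg
          mountPath: /ic
      volumes:
      - name: msg
        emptyDir: {}               # shared scratch space between init and main container
EOF

# To try it against a cluster: kubectl apply -f ic-deploy.yaml
```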
Now we can check the output of the main container:
$ kubectl get deploy,po
NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE
deployment.extensions/ic-deploy 1 1 1 1 11m
NAME READY STATUS RESTARTS AGE
pod/ic-deploy-bf75cbf87-8zmrb 1/1 Running 0 59s
$ kubectl logs ic-deploy-bf75cbf87-8zmrb -f
INIT_DONE
INIT_DONE
INIT_DONE
INIT_DONE
INIT_DONE
^C
If you want to learn more about init containers and related topics, check out the blog post Kubernetes: A Pod’s Life.
Nodes
In Kubernetes, nodes are the (virtual) machines where your workloads, in the shape of pods, run. As a developer you typically don’t deal with nodes directly; however, as an admin you might want to familiarize yourself with node operations.
To list the available nodes in your cluster, execute the following (note that the output will depend on the environment you’re using; I’m using Minishift):
$ kubectl get nodes
NAME STATUS AGE
192.168.99.100 Ready 14d
One interesting task, from a developer point of view, is to make Kubernetes schedule a pod on a certain node. For this, we first need to label the node we want to target:
$ kubectl label nodes 192.168.99.100 shouldrun=here
node "192.168.99.100" labeled
Now we can create a pod that gets scheduled on the node with the label shouldrun=here:
$ kubectl apply -f https://raw.githubusercontent.com/openshift-evangelists/kbe/master/specs/nodes/pod.yaml
$ kubectl get pods --output=wide
NAME READY STATUS RESTARTS AGE IP NODE
onspecificnode 1/1 Running 0 8s 172.17.0.3 192.168.99.100
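The piece of the pod spec that does the targeting is the nodeSelector field. A minimal sketch, assuming the simpleservice image used earlier in this guide and a container name chosen for illustration, could look like this:

```shell
# Write a hypothetical pod manifest that is only scheduled on nodes labeled shouldrun=here.
cat > onspecificnode-pod.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: onspecificnode
spec:
  containers:
  - name: sise                               # assumed container name
    image: mhausenblas/simpleservice:0.5.0   # image used earlier in this guide
    ports:
    - containerPort: 9876
  nodeSelector:
    shouldrun: here                          # must match the label set with kubectl label nodes
EOF

# To try it against a cluster: kubectl apply -f onspecificnode-pod.yaml
```

If no node carries the label, the pod stays in the Pending state until one does.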
To learn more about a specific node, 192.168.99.100 in our case, do:
$ kubectl describe node 192.168.99.100
Name: 192.168.99.100
Labels: beta.kubernetes.io/arch=amd64
beta.kubernetes.io/os=linux
kubernetes.io/hostname=192.168.99.100
shouldrun=here
Taints: <none>
CreationTimestamp: Wed, 12 Apr 2017 17:17:13 +0100
Phase:
Conditions:
Type Status LastHeartbeatTime LastTransitionTime Reason Message
---- ------ ----------------- ------------------ ------ -------
OutOfDisk False Thu, 27 Apr 2017 14:55:49 +0100 Thu, 27 Apr 2017 09:18:13 +0100 KubeletHasSufficientDisk kubelet has sufficient disk space available
MemoryPressure False Thu, 27 Apr 2017 14:55:49 +0100 Wed, 12 Apr 2017 17:17:13 +0100 KubeletHasSufficientMemory kubelet has sufficient memory available
DiskPressure False Thu, 27 Apr 2017 14:55:49 +0100 Wed, 12 Apr 2017 17:17:13 +0100 KubeletHasNoDiskPressure kubelet has no disk pressure
Ready True Thu, 27 Apr 2017 14:55:49 +0100 Thu, 27 Apr 2017 09:18:24 +0100 KubeletReady kubelet is posting ready status
Addresses: 192.168.99.100,192.168.99.100,192.168.99.100
Capacity:
alpha.kubernetes.io/nvidia-gpu: 0
cpu: 2
memory: 2050168Ki
pods: 20
Allocatable:
alpha.kubernetes.io/nvidia-gpu: 0
cpu: 2
memory: 2050168Ki
pods: 20
System Info:
Machine ID: 896b6d970cd14d158be1fd1c31ff1a8a
System UUID: F7771C31-30B0-44EC-8364-B3517DBC8767
Boot ID: 1d589b36-3413-4e82-af80-b2756342eed4
Kernel Version: 4.4.27-boot2docker
OS Image: CentOS Linux 7 (Core)
Operating System: linux
Architecture: amd64
Container Runtime Version: docker://1.12.3
Kubelet Version: v1.5.2+43a9be4
Kube-Proxy Version: v1.5.2+43a9be4
ExternalID: 192.168.99.100
Non-terminated Pods: (3 in total)
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits
--------- ---- ------------ ---------- --------------- -------------
default docker-registry-1-hfpzp 100m (5%) 0 (0%) 256Mi (12%) 0 (0%)
default onspecificnode 0 (0%) 0 (0%) 0 (0%) 0 (0%)
default router-1-cdglk 100m (5%) 0 (0%) 256Mi (12%) 0 (0%)
Allocated resources:
(Total limits may be over 100 percent, i.e., overcommitted.
CPU Requests CPU Limits Memory Requests Memory Limits
------------ ---------- --------------- -------------
200m (10%) 0 (0%) 512Mi (25%) 0 (0%)
No events.
Note that there are more sophisticated methods than shown above, such as using affinity, to assign pods to nodes and depending on your use case, you might want to check those out as well.
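For comparison, the same placement rule can be written with node affinity. The sketch below is equivalent to the shouldrun=here selector; the pod name, container name, and file name are hypothetical.

```shell
# Write a hypothetical pod manifest that uses node affinity for the same placement rule.
cat > affinity-pod.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: onspecificnode-affinity              # hypothetical name
spec:
  containers:
  - name: sise
    image: mhausenblas/simpleservice:0.5.0
  affinity:
    nodeAffinity:
      # hard requirement at scheduling time; running pods are not evicted if labels change
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: shouldrun
            operator: In
            values:
            - here
EOF

# To try it against a cluster: kubectl apply -f affinity-pod.yaml
```

Affinity is more verbose, but it also allows operators such as NotIn and soft preferences, which a plain nodeSelector cannot express.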
API Server access
Sometimes it’s useful or necessary to directly access the Kubernetes API server, for exploratory or testing purposes.
In order to do this, one option is to proxy the API to your local environment, using:
$ kubectl proxy --port=8080
Starting to serve on 127.0.0.1:8080
Now you can query the API (in a separate terminal session) like so:
$ curl http://localhost:8080/api/v1
{
"kind": "APIResourceList",
"groupVersion": "v1",
"resources": [
{
...
{
"name": "services/status",
"singularName": "",
"namespaced": true,
"kind": "Service",
"verbs": [
"get",
"patch",
"update"
]
}
]
}
Alternatively, without proxying, you can use kubectl directly as follows to achieve the same:
$ kubectl get --raw=/api/v1
Further, if you want to explore the supported API versions and/or resources, you can use the following commands:
$ kubectl api-versions
admissionregistration.k8s.io/v1beta1
...
v1
$ kubectl api-resources
NAME SHORTNAMES APIGROUP NAMESPACED KIND
bindings true Binding
...
storageclasses sc storage.k8s.io false StorageClass