Kubernetes: Difference between revisions
alias k="kubectl"
alias kns="kubectl --namespace $KNS"
alias ks="kubectl --namespace kube-system" # Kubernetes System stuff
alias ke="kubectl get events --sort-by='{.lastTimestamp}'" # Kubernetes Events
alias kse="kubectl --namespace kube-system get events --sort-by='{.lastTimestamp}'" # Kubernetes System Events
With the advent of this, my knX aliases are not as useful:
kubectl config set-context --current --namespace=NAMESPACE
For moving around namespaces fast:
alias kX="kubectl --namespace nameX"
dump all:
kubectl get all --export=true -o yaml
( namespace kube-system not dumped; --export was deprecated in kubectl 1.14 and removed in 1.18, so drop the flag on newer clients )
also needed:
kns get secrets
kns get pvc
kns get pv
kns get cm
kns get sa
kns get role
kns get RoleBinding
list form:
kubectl api-resources
kubectl api-versions
events sorted by time:
kubectl get events --sort-by=.metadata.creationTimestamp
show events with timestamps, rather than relative time:
kubectl get events -o custom-columns=FirstSeen:.firstTimestamp,LastSeen:.lastTimestamp,Count:.count,From:.source.component,Type:.type,Reason:.reason,Message:.message -n $KNS
kubectl get events -o custom-columns=FirstSeen:.firstTimestamp,LastSeen:.lastTimestamp,Count:.count,From:.source.component,Type:.type,Reason:.reason,Message:.message
what storage classes does my cluster support?
k get storageclass
how are pods spread out over nodes:
k describe node | grep -E '(^Non-t|m |^Name)' | more
( doesn't scale well, no indication of )
what deployments should be on what instance groups:
kubectl get deploy --all-namespaces -o=jsonpath='{range .items[*]}{.metadata.name}{"\t"}{..nodeSelector}{"\n"}{end}' |sort
or the other way round ( ig first ):
kubectl get deploy --all-namespaces -o=jsonpath='{range .items[*]}{..nodeSelector}{"\t"}{.metadata.name}{"\n"}{end}' |sort
if you used kops to deploy the cluster then nodes are labeled with their instance groups, so you can be more specific like this:
k describe node -l kops.k8s.io/instancegroup=<instance group name> | grep -E '(^Non-t|m |^Name)' | more
how many nodes in each instance group? ( tested under kops )
for i in `kops get ig 2>/dev/null| grep -v NAME | awk '{print $1}'`
do
echo $i
kubectl get nodes -l kops.k8s.io/instancegroup=$i
done
how many pods per node ( NODE is column 7 without the NAMESPACE column, column 8 with it ):
k get pod -o wide | grep -v NAME | awk '{print $7}' | sort | uniq -c | sort -rn
k get pod --all-namespaces -o wide | grep -v NAME | awk '{print $8}' | sort | uniq -c | sort -rn
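Counting columns with awk is fragile when kubectl changes its table layout. A minimal sketch of the same count done on the JSON output instead, assuming the document comes from something like kubectl get pods -A -o json. The function name pods_per_node and the sample data are made up for illustration.

```python
from collections import Counter


def pods_per_node(pod_list):
    """Count pods per node from a parsed `kubectl get pods -o json` document."""
    counts = Counter()
    for pod in pod_list.get("items", []):
        # Pods that are not scheduled yet have no spec.nodeName; bucket them separately.
        node = pod.get("spec", {}).get("nodeName", "<unscheduled>")
        counts[node] += 1
    return counts


# Hypothetical sample standing in for real kubectl output.
sample = {
    "items": [
        {"spec": {"nodeName": "node-a"}},
        {"spec": {"nodeName": "node-a"}},
        {"spec": {"nodeName": "node-b"}},
        {"spec": {}},
    ]
}

for node, count in pods_per_node(sample).most_common():
    print(count, node)
```

To use it on a live cluster you would feed it json.load(sys.stdin) from the kubectl pipe rather than the sample.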
'''audit''': who tried to do what?
ks logs $podname
kns logs -f --timestamps podname
-f "follow"
--timestamps , show timestamps
the pod restarted, and I want the logs from the previous start of this pod:
kns logs --previous podname
who tried to scale unsuccessfully?
</pre>
CURL_CA_BUNDLE - Kubernetes is its own CA, and presents each pod with a CA bundle that makes SSL "in" the cluster valid.
This was the role that did it. FIXME: pare it down
</pre>
==== On patching ====
There are a couple of ways to change an object.
curl \
-X PATCH \
-d "${PAYLOAD}" \
-H 'Content-Type: application/json-patch+json' \
-H "Authorization: Bearer ${TOKEN}" \
-d '[{"op":"replace","path":"/spec/replicas","value":3}]' <- works
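To make the payload shape concrete, here is a small sketch that builds the same JSON Patch body and applies a "replace" operation to a local dict. The helper names are hypothetical, and the path handling is simplified: it only walks object keys and skips the "~0" / "~1" escaping that RFC 6902 paths allow.

```python
import json


def replace_op(path, value):
    """Build one JSON Patch 'replace' operation."""
    return {"op": "replace", "path": path, "value": value}


def build_payload(ops):
    """Serialize a list of operations compactly, as passed to curl -d."""
    return json.dumps(ops, separators=(",", ":"))


def apply_replace(doc, op):
    """Apply a single 'replace' op to a nested dict (object keys only)."""
    keys = [k for k in op["path"].split("/") if k]
    target = doc
    for key in keys[:-1]:
        target = target[key]
    if keys[-1] not in target:
        # 'replace' requires the target location to exist already.
        raise KeyError(op["path"])
    target[keys[-1]] = op["value"]
    return doc


ops = [replace_op("/spec/replicas", 3)]
print(build_payload(ops))
print(apply_replace({"spec": {"replicas": 1}}, ops[0]))
```

Note that the body is a JSON array of operations, which is what the application/json-patch+json content type expects.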
=== resource definitions ===
You did a get -o yaml to see the object, but you want to know what _all_ the attributes are.
For custom resource definitions you know about:
k get CustomResourceDefinition
but what about vanilla, non-custom, definitions? API reference?
FIXME, dunno. ( kubectl explain <resource> --recursive lists the documented fields of built-in types. )
Maybe the thing you want to know more about is a custom resource and you just didn't know it was custom:
k get CustomResourceDefinition -A
for example:
k get CustomResourceDefinition backendconfigs.cloud.google.com -o yaml
show recommended labels on deployments:
k get deploy -o=custom-columns='Deployment_name:.metadata.name,label_NAME:.metadata.labels.app\.kubernetes\.io/name,INSTANCE:.metadata.labels.app\.kubernetes\.io/instance,VERSION:.metadata.labels.app\.kubernetes\.io/version,COMPONENT:.metadata.labels.app\.kubernetes\.io/component,MANAGEBY:.metadata.labels.app\.kubernetes\.io/managed-by'
( sorry no line breaks allowed )
== Template Examples ==
list images by pod:
kubectl get pods -o=jsonpath='{range .items[*]}{"\n"}{.metadata.name}{":\t"}{range .spec.containers[*]}{.image}{", "}{end}{end}{"\n"}'
kubectl get pods --all-namespaces -o=jsonpath='{range .items[*]}{"\n"}{.metadata.name}{":\t"}{range .spec.containers[*]}{.image}{", "}{end}{end}{"\n"}'
kns version:
kns get pods -o=jsonpath='{range .items[*]}{"\n"}{.metadata.name}{":\t"}{range .spec.containers[*]}{.image}{", "}{end}{end}{"\n"}'
pods and ips:
kubectl get pods -o=jsonpath='{range .items[*]}{"\n"}{.metadata.name}{":\t"}{range .status.podIPs[*]}{ .ip } {", "}{end}{end}{"\n"}'
kubectl get pods -n mynamespace -o=jsonpath='{range .items[*]}{"\n"}{.metadata.name}{":\t"}{range .status.podIPs[*]}{ .ip } {", "}{end}{end}{"\n"}'
list images by deploy:
kubectl get deploy -o=jsonpath='{range .items[*]}{"\n"}{.metadata.name}{":\t"}{range .spec.template.spec.containers[*]}{.image}{", "}{end}{end}{"\n"}'
kns get deploy -o=jsonpath='{range .items[*]}{"\n"}{.metadata.name}{":\t"}{range .spec.template.spec.containers[*]}{.image}{", "}{end}{end}{"\n"}'
list all deployment cpu and mem requests:
kubectl get deploy -o=jsonpath='{range .items[*]}{"\n"}{.metadata.name}{":\t"}{range .spec.template.spec.containers[*]}{.resources.requests.cpu}{", "}{.resources.requests.memory}{end}{end}{"\n"}'
list nodeSelectors for all deploys:
kubectl get deploy -o=jsonpath='{range .items[*]}{"\n"}{.metadata.name}{":\t"}{.spec.template.spec.nodeSelector}{end}{"\n"}'
all nodes by their condition statuses:
kubectl get nodes -o=jsonpath='{range .items[*]}{@.metadata.name}:{"\n"}{range @.status.conditions[*]}{"\t"}{@.type}={@.status};{"\n"}{end}{end}{"\n"}'
Note the double loop.
Examine the resource requests and limits for all deploys ( the field is "requests", plural ):
kubectl get deploy -o=jsonpath='{range .items[*]}{@.metadata.name}{range @.spec.template.spec.containers[*]}{"\t"}{@.resources.requests.cpu}{"\t"}{@.resources.requests.memory}{"\t"}{@.resources.limits.cpu}{"\t"}{@.resources.limits.memory}{"\n"}{end}{end}{"\n"}'
Container IDs by host: ( accounts for multiple containers per pod )
<pre>
for i in `kns describe node NODENAME | grep "m (" | grep -v cpu | awk '{print $2}'`
do
  echo -n $i " "
  ns=$(kns get pod -A | grep $i | awk '{print $1}')
  echo -n $ns " "
  # k get pod $i -n $ns -o jsonpath='{.metadata.name}{" "}{.status.containerStatuses[*].containerID}{"\n"}'
  k get pod $i -n $ns -o jsonpath='{"\n"}{"\t"}{range .status.containerStatuses[*]}{.name}{" "}{.containerID}{"\n"}{end}'
done
</pre>
requests for every replicaset, across all namespaces:
kubectl get rs -A -o=jsonpath='{range .items[*]}{"\n"}{.metadata.namespace}{","}{.metadata.name}{","}{.status.readyReplicas}{","}{range .spec.template.spec.containers[*]}{.resources.requests.cpu}{","}{.resources.requests.memory}{end}{end}{"\n"}' | sort -n -r -k 3 -t , | head
of the form:
namespace,name,replicasready,cpu,mem
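The cpu and mem columns come back as Kubernetes quantities ( "100m", "500Mi" ), so they cannot be summed directly. A rough sketch of turning rows of that form into totals, assuming one container per row and only the common suffixes. The function names and the sample rows are made up; empty fields ( no request set, no ready replicas ) count as zero.

```python
from decimal import Decimal

BINARY = {"Ki": 1024, "Mi": 1024**2, "Gi": 1024**3, "Ti": 1024**4}
DECIMAL = {"k": 1000, "M": 1000**2, "G": 1000**3, "T": 1000**4}


def parse_cpu(quantity):
    """Return CPU cores as a Decimal; '100m' means 0.1 core."""
    if not quantity:
        return Decimal(0)
    if quantity.endswith("m"):
        return Decimal(quantity[:-1]) / 1000
    return Decimal(quantity)


def parse_memory(quantity):
    """Return bytes as an int; handles Ki/Mi/Gi/Ti and k/M/G/T suffixes."""
    if not quantity:
        return 0
    for suffix, factor in BINARY.items():
        if quantity.endswith(suffix):
            return int(Decimal(quantity[:-2]) * factor)
    for suffix, factor in DECIMAL.items():
        if quantity.endswith(suffix):
            return int(Decimal(quantity[:-1]) * factor)
    return int(Decimal(quantity))


def total_requests(rows):
    """Sum requests over 'namespace,name,replicasready,cpu,mem' rows."""
    cpu_total, mem_total = Decimal(0), 0
    for row in rows:
        _namespace, _name, ready, cpu, mem = row.split(",")
        replicas = int(ready) if ready else 0
        cpu_total += replicas * parse_cpu(cpu)
        mem_total += replicas * parse_memory(mem)
    return cpu_total, mem_total


rows = ["testing,server-abc,2,100m,500Mi", "kube-system,dns-xyz,,,"]
print(total_requests(rows))
```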
== metrics ==
wget "$(kubectl config view -o jsonpath='{range .clusters[*]}{@.cluster.server}{"\n"}{end}')"
also:
k top nodes
k top pods
== exec notes ==
k exec -it $i -- bash -c "set | grep ENVVARX"
== Monitoring ==
=== Prometheus queries of note: ===
Cluster wide CPU usage percent:
sum (rate (container_cpu_usage_seconds_total{id="/"}[5m])) / sum (machine_cpu_cores{}) * 100
Cluster wide memory usage percent:
sum (container_memory_working_set_bytes{id="/"}) / sum (machine_memory_bytes{}) * 100
Cluster wide cpu and memory requests versus available, as a percent ( keep this below 100% )
CPU:
sum(kube_pod_container_resource_requests_cpu_cores) / sum(kube_node_status_capacity_cpu_cores) * 100
Memory:
sum(kube_pod_container_resource_requests_memory_bytes) / sum(kube_node_status_capacity_memory_bytes) * 100
== Troubleshooting ==
=== Pod ===
The pod starts and dies too quickly. First, look at the logs. If that is not enough, override the command and args in the deployment, statefulset, or daemonset with this, so that you can exec into the failing pod and see what it's up to:
command: ["/bin/sh"]
args: ["-c", "while true; do date; sleep 10;done"]
The pod will come up and stay up long enough for you to get in and look around.
This only works with a pod that has enough of an operating system for you to do that, such as bash, ps, ls, cd and similar tools. Some very slim containers do not have those tools.
=== One off pod ===
kubectl run tomcat --image=ubuntu:latest
kubectl run -i -t busybox --image=alpine:3.8 --restart=Never
=== kube Api server ===
You want to see what is up with the API server. If there is a problem with the cluster, the API server logs are going to give you a better view.
By default -v ( verbose ) is not set.
In this example we are using kops to make the Kubernetes cluster, so to change the API server config we need to edit the cluster spec.
kops edit cluster
add a section
kubeAPIServer:
  logLevel: X
This adds --v=X to the API server command line.
I had something pounding the server with a bad token, but the log only said: "Bad token"... who?
Had to turn the log level up to 20 to get to the bottom of it... it was a bad fluentd token ( wish fluentd had some sort of back-off ).
Note that a kops update showed "no changes required".
I could force it, but instead I used the "small change" trick [[Kops#Tricks_for_making_small_updates_to_a_host]]
=== Running a one off job from a cronjob ===
kns create job --from=cronjob/cronjob-name job-name-dthornton
=== Echo Server ===
aka printenv
aka python server
For troubleshooting load balancer, service, and ingress issues, an "echo" server is useful:
server.py
<pre>
#!/usr/bin/env python3
# ./server.py [<port>]
from http.server import BaseHTTPRequestHandler, HTTPServer
import logging

class S(BaseHTTPRequestHandler):
    def _set_response(self):
        self.send_response(200)
        self.send_header('Content-type', 'text/html')
        self.end_headers()

    def do_GET(self):
        logging.info("GET request,\nPath: %s\nHeaders:\n%s\n", str(self.path), str(self.headers))
        self._set_response()
        # echo every request header back to the client
        for e in self.headers:
            # self.response.write(e + "<br />")
            self.wfile.write("{}: {}<br>".format(e,self.headers[e]).encode('utf-8'))
        # self.wfile.write("type is {}".format(type(self.headers)).encode('utf-8'))
        # self.wfile.write("GET request for {}".format(self.path).encode('utf-8'))

    def do_POST(self):
        content_length = int(self.headers['Content-Length']) # <--- Gets the size of data
        post_data = self.rfile.read(content_length) # <--- Gets the data itself
        logging.info("POST request,\nPath: %s\nHeaders:\n%s\n\nBody:\n%s\n",
                str(self.path), str(self.headers), post_data.decode('utf-8'))
        self._set_response()
        self.wfile.write("POST request for {}".format(self.path).encode('utf-8'))

def run(server_class=HTTPServer, handler_class=S, port=8080):
    logging.basicConfig(level=logging.INFO)
    server_address = ('', port)
    httpd = server_class(server_address, handler_class)
    logging.info('Starting httpd...')
    try:
        httpd.serve_forever()
    except KeyboardInterrupt:
        pass
    httpd.server_close()
    logging.info('Stopping httpd...')

if __name__ == '__main__':
    from sys import argv
    if len(argv) == 2:
        run(port=int(argv[1]))
    else:
        run()
</pre>
make a configmap from this ( in the same namespace as the deployment ):
kubectl create configmap server-script --from-file=server.py --namespace testing
then make a deployment:
deploy.yaml
<pre>
apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
  labels:
    app: server
    env: staging
  name: server
  namespace: testing
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 1
  selector:
    matchLabels:
      app: server
      env: staging
  strategy:
    rollingUpdate:
      maxSurge: 34%
      maxUnavailable: 0
    type: RollingUpdate
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: server
        env: staging
    spec:
      automountServiceAccountToken: false
      containers:
      - args:
        - /server.py
        - "9011"
        command:
        - python
        env:
        - name: MYENVVAR
          value: fun
        image: python:alpine3.15
        imagePullPolicy: IfNotPresent
        livenessProbe:
          failureThreshold: 5
          httpGet:
            path: /
            port: 9011
            scheme: HTTP
          initialDelaySeconds: 120
          periodSeconds: 60
          successThreshold: 1
          timeoutSeconds: 60
        name: fusionauth
        ports:
        - containerPort: 9011
          protocol: TCP
        resources:
          limits:
            cpu: 300m
            memory: 700Mi
          requests:
            cpu: 100m
            memory: 500Mi
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /server.py
          name: server-script
          subPath: server.py
      dnsConfig:
        options:
        - name: ndots
          value: "1"
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 60
      volumes:
      - configMap:
          defaultMode: 420
          name: server-script
        name: server-script
</pre>
ingress.yaml
<pre>
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    ingress.gcp.kubernetes.io/pre-shared-cert: mycert
    ingress.kubernetes.io/ssl-cert: mycert
    kubernetes.io/ingress.allow-http: "false"
    kubernetes.io/ingress.class: gce-internal
  labels:
    app: server
    env: staging
  name: server
  namespace: testing
spec:
  defaultBackend:
    service:
      name: server-service
      port:
        # must match the port name defined in the service
        name: myservice
</pre>
Then a service and ingress. In this case I'm using a GKE internal load balancer:
<pre>
apiVersion: v1
kind: Service
metadata:
  annotations:
    cloud.google.com/neg: '{"ingress": true}'
  labels:
    app: server
    env: staging
  name: server-service
  namespace: testing
spec:
  externalTrafficPolicy: Cluster
  ipFamilies:
  - IPv4
  ipFamilyPolicy: SingleStack
  ports:
  - name: myservice
    port: 8080
    protocol: TCP
    # must match the containerPort the echo server listens on
    targetPort: 9011
  selector:
    app: server
    env: staging
  sessionAffinity: None
  type: NodePort
</pre>
== which pods are cycling? ==
podnewandold.sh
<pre>
#!/bin/bash
export ARG="-l app.kubernetes.io/instance=myapp"
echo getting current
CURRENT=$(kubectl get pod ${ARG} -o name | sort )
while true
do
  echo getting new
  NEW=$(kubectl get pod ${ARG} -o name | sort )
  echo diff question mark
  # old list first, so "+" lines are pods that just appeared and "-" lines are pods that went away
  diff -u <(echo "$CURRENT") <(echo "$NEW")
  echo sleep
  sleep 1
  CURRENT=${NEW}
done
</pre>
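The comparison the script leans on diff for is just a set difference between two snapshots of pod names. A minimal sketch of that step, with a hypothetical function name and made-up pod names:

```python
def pod_changes(current, new):
    """Return (appeared, disappeared) pod names between two snapshots."""
    current_set, new_set = set(current), set(new)
    appeared = sorted(new_set - current_set)
    disappeared = sorted(current_set - new_set)
    return appeared, disappeared


# Hypothetical snapshots in the `kubectl get pod -o name` format.
before = ["pod/myapp-7d9f-abcde", "pod/myapp-7d9f-fghij"]
after = ["pod/myapp-7d9f-fghij", "pod/myapp-7d9f-klmno"]

appeared, disappeared = pod_changes(before, after)
print("appeared:", appeared)
print("disappeared:", disappeared)
```

A pod that shows up in both lists over a short window is one that is cycling.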
== Practices and Guidelines ==
https://medium.com/devopslinks/security-problems-of-kops-default-deployments-2819c157bc90
* Do not use replication controllers, instead use replica sets
* When changing the shape of the cluster ( number and type of instance groups ) you will use kops edit ig <ig name>, but don't forget to update the cluster-autoscaler config ( ks edit deploy cluster-autoscaler )
* Stay up to date, read release notes, just like you do for all the other stuff you manage, right? https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.14.md
== ConfigMaps ==
All things configmaps:
https://kubernetes.io/docs/tasks/configure-pod-container/configure-pod-configmap/
== Cgroup / slice errors ==
Todo / read:
* https://www.nccgroup.com/us/about-us/newsroom-and-events/blog/2019/august/tools-and-methods-for-auditing-kubernetes-rbac-policies/
* https://github.com/aquasecurity/kube-hunter/blob/master/README.md
* https://www.arctiq.ca/events/2018/10/5/building-a-secure-container-strategy-with-aqua-security-microsoft-azure-and-hashicorp-vault/
* https://koudingspawn.de/secure-kubernetes-with-vault/
* https://www.appvia.io/blog/how-can-i-secure-my-kubernetes-cluster-on-gke
Container security:
can I break out of this container? https://github.com/brompwnie/botb
is this container reasonably safe? https://github.com/aquasecurity/trivy
How does my cluster stand up to the security benchmarks? https://github.com/aquasecurity/kube-bench
== References and Reading ==
; Replica set versus Replication controller
: https://www.mirantis.com/blog/kubernetes-replication-controller-replica-set-and-deployments-understanding-replication-options/
; Publishing services - service types
: https://kubernetes.io/docs/concepts/services-networking/service/#publishing-services-service-types
; Kubernetes the hard way
: https://github.com/kelseyhightower/kubernetes-the-hard-way
; Hadolint - A smarter Dockerfile linter that helps you build best practice Docker images.
: https://github.com/hadolint/hadolint
== HPA broken ==
Blue is test.
Blue env:
Client Version: v1.12.2
Server Version: v1.10.6
Prod env:
Client Version: v1.12.2
Server Version: v1.9.8
In prod, HPAs work. When I ask for them I see:
NAME        REFERENCE              TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
adjust      Deployment/adjust      0%/70%    1         5         1          1d
web-admin   Deployment/web-admin   0%/70%    1         3         1          2h
In the blue env they don't work. I see:
NAME        REFERENCE              TARGETS         MINPODS   MAXPODS   REPLICAS   AGE
adjust      Deployment/adjust      <unknown>/70%   1         5         1          1d
web-admin   Deployment/web-admin   <unknown>/70%   1         3         1          2h
In Kubernetes events we see:
HorizontalPodAutoscaler Warning FailedGetResourceMetric horizontal-pod-autoscaler unable to get metrics for resource cpu: no metrics returned from resource metrics API
Note that the metrics server is running in kube-system, but there are no repo files for that in /third-party in prod.
In blue we store all metrics-server related files in /thirdpary/metrics-server ( taken from git@github.com:kubernetes-incubator/metrics-server.git )
In prod the deployment has:
<pre>
- command:
  - /metrics-server
  - --source=kubernetes.summary_api:''
</pre>
In blue this seemed to do the trick:
<pre>
- /metrics-server
- --kubelet-preferred-address-types=InternalIP
- --kubelet-insecure-tls
</pre>
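Why <unknown> matters: the HPA needs the current metric to compute a replica count. A sketch of the scaling rule as described in the Kubernetes HPA documentation ( desired = ceil( current replicas * current metric / target ) ), clamped to MINPODS and MAXPODS. This is an illustration only: the function name is made up, the controller's tolerance and stabilization behaviour are left out, and treating a missing metric as "leave the replica count alone" is an assumption that matches what the tables above show.

```python
import math


def desired_replicas(current, metric, target, min_pods, max_pods):
    """Simplified HPA rule; metric=None models the <unknown> case."""
    if metric is None:
        # No metric returned: nothing to compute, so keep the current count.
        return current
    wanted = math.ceil(current * metric / target)
    return max(min_pods, min(max_pods, wanted))


# Prod: 0% usage against a 70% target stays at MINPODS.
print(desired_replicas(1, 0, 70, 1, 5))
# Blue: <unknown> usage, the HPA cannot act.
print(desired_replicas(1, None, 70, 1, 5))
# Hypothetical load: 2 replicas at 140% of request against a 70% target.
print(desired_replicas(2, 140, 70, 1, 5))
```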
== Cluster Scaling ==
ks get configmap cluster-autoscaler-status -o yaml
https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/FAQ.md
=== Steps to move hardware around ===
In this case we are removing the last node from an instance group and then removing the instance group.
Reference: https://kubernetes.io/docs/concepts/architecture/nodes/
1. '''Cordon the node'''
k cordon ip-xx-xx-xx-xx.region.compute.internal
No new pods will be deployed here.
2. '''Drain''' ( move the pods here to somewhere else )
k drain ip-xx-xx-xx-xx.region.compute.internal
You may need to add "--ignore-daemonsets" if you have daemonsets running ( datadog, localredis ).
You may need "--delete-local-data" ( renamed "--delete-emptydir-data" in kubectl 1.20 and later ) if you have a metrics server on this node. BE CAREFUL. You will lose metrics, but you probably have an "out of cluster" place where metrics are stored ( datadog, elastic search, etc ).
3. Remove the node group from the autoscaler:
ks edit deploy cluster-autoscaler
4. Tell kops to delete the instance group:
kops delete ig myig
At this point the VMs will be shut down.
k get nodes
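The steps above, written out as an ordered command list so the sequence can be reviewed before anything is run. This is only a sketch: it prints the commands rather than executing them, the function name is made up, and the autoscaler step is an interactive edit that still has to be done by hand.

```python
import shlex


def decommission_commands(node, instance_group, ignore_daemonsets=True):
    """Return the cordon / drain / autoscaler / kops commands in order."""
    drain = ["kubectl", "drain", node]
    if ignore_daemonsets:
        drain.append("--ignore-daemonsets")
    return [
        ["kubectl", "cordon", node],
        drain,
        # Interactive: remove the node group from the autoscaler arguments.
        ["kubectl", "--namespace", "kube-system", "edit", "deploy", "cluster-autoscaler"],
        ["kops", "delete", "ig", instance_group],
        ["kubectl", "get", "nodes"],
    ]


for cmd in decommission_commands("ip-xx-xx-xx-xx.region.compute.internal", "myig"):
    print(shlex.join(cmd))
```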
== Downing nodes ==
kubectl drain <node name> --delete-local-data --force --ignore-daemonsets
kubectl delete node <node name>
== Kubeadm way ==
1. light up some instances:
if you are using Amazon Linux you can cloud-init like this:
<pre>
yum_repos:
  # The name of the repository
  kubernetes:
    # Any repository configuration options
    # See: man yum.conf
    #
    # This one is required!
    baseurl: https://packages.cloud.google.com/yum/repos/kubernetes-el7-x86_64
    enabled: true
    gpgcheck: true
    gpgkey:
      - https://packages.cloud.google.com/yum/doc/yum-key.gpg
      - https://packages.cloud.google.com/yum/doc/rpm-package-key.gpg
    name: kubernetes
packages:
  - curl
  - git
  - iproute-tc
  - jq
  - kubeadm
  - kubectl
  - kubelet
  - lsof
  - mlocate
  - ntp
  - screen
  - strace
  - sysstat
  - tcpdump
  - telnet
  - traceroute
  - tree
  - unzip
  - wget
runcmd:
  - [ /usr/bin/updatedb ]
  - [ 'amazon-linux-extras', 'install', 'docker', '-y' ]
  - [ 'setenforce', '0']
  - [ 'systemctl', 'enable', 'docker']
  - [ 'systemctl', 'start', 'docker' ]
  - [ 'systemctl', 'enable', 'kubelet']
  - [ 'systemctl', 'start', 'kubelet' ]
write_files:
  - content: |
      net.bridge.bridge-nf-call-ip6tables = 1
      net.bridge.bridge-nf-call-iptables = 1
    path: /etc/sysctl.d/k8s.conf
    permissions: '0755'
    owner: root:root
  - content: |
      SERVER=$(/usr/bin/aws ec2 describe-tags --region us-east-1 --filters "Name=resource-id,Values=$(wget -q -O - http://169.254.169.254/latest/meta-data/instance-id)" "Name=key,Values=Name" --query 'Tags[*].Value' --output text)
      PRIVATE_IP=$(curl http://169.254.169.254/latest/meta-data/local-ipv4)
      # if hostname was set this would work, but hostname is not set
      # sed -i "s/^\(HOSTNAME\s*=\s*\).*$/\1$SERVER/" /etc/sysconfig/network
      echo "HOSTNAME=$SERVER" >> /etc/sysconfig/network
      echo "$PRIVATE_IP $SERVER" >> /etc/hosts
      echo "$SERVER" > /etc/hostname
      hostname $SERVER
    path: /root/sethostname.sh
    permissions: '0755'
    owner: root:root
</pre>
on the first instance, do a
kubeadm init
and save the output ( it includes the kubeadm join command ).
run that join command on the other instances.
boom! a kubernetes cluster...
what is it missing?
* your app
* logging
* monitoring
* dashboard
== Cluster status ==
kubectl get componentstatuses
== Interrogate the cluster ==
<pre>
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: dthornton-diag
  namespace: kube-system
  labels:
    app: conntrack-adjuster
spec:
  selector:
    matchLabels:
      app: conntrack-adjuster
  template:
    metadata:
      labels:
        app: conntrack-adjuster
    spec:
      hostNetwork: true
      hostPID: true
      hostIPC: true
      containers:
      - name: sysctl
        image: alpine:3.6
        imagePullPolicy: IfNotPresent
        command: ["sh", "-c"]
        args: ["while true; do echo NOW ; cat /proc/net/nf_conntrack ; sleep 60; done;"]
        securityContext:
          privileged: true
        volumeMounts:
        - name: sys
          mountPath: /sys
      volumes:
      - name: sys
        hostPath:
          path: /sys
      tolerations:
      - effect: "NoExecute"
        operator: "Exists"
      - effect: "NoSchedule"
        operator: "Exists"
</pre>
== The helm way ==
[[How the hell does helm work?]] <- this is me learning, disregard.
=== Setup ===
prereq: the Kubernetes cluster is up already.
step 1. install the local helm 3 binary.
https://github.com/helm/helm/releases/tag/v3.0.0-alpha.1
MacOS
cd ~/work
mkdir helm
cd helm
wget https://get.helm.sh/helm-v3.0.0-alpha.1-darwin-amd64.tar.gz
tar zxvf helm-v3.0.0-alpha.1-darwin-amd64.tar.gz
cp darwin-amd64/helm ~/bin
Loonix
cd ~/work
mkdir helm
cd helm
wget https://get.helm.sh/helm-v3.0.0-alpha.1-linux-amd64.tar.gz
tar zxvf helm-v3.0.0-alpha.1-linux-amd64.tar.gz
cp linux-amd64/helm ~/bin/
helm can use different kubectl contexts, but we just use one context so simple is fine.
to get the kubernetes side of helm set up you do this:
<pre>
blue@kubernetescluster:~$ helm init --debug
Creating /home/blue/.helm
Creating /home/blue/.helm/repository
Creating /home/blue/.helm/repository/cache
Creating /home/blue/.helm/plugins
Creating /home/blue/.helm/starters
Creating /home/blue/.helm/cache/archive
Creating /home/blue/.helm/repository/repositories.yaml
Adding stable repo with URL: https://kubernetes-charts.storage.googleapis.com
$HELM_HOME has been configured at /home/blue/.helm.
Happy Helming!
blue@kubernetescluster:~$
</pre>
note that "--dry-run" worked in helm 2 but doesn't work in helm 3.
=== On HPA versus deployments ===
If you set replicas in a deployment and also deploy an HPA, replicas in the deployment will fight with whatever the HPA wants.
Fix this by _not_ setting it in the deployment.
=== commands of note ===
list available versions of charts:
helm search repo <reponame>/<chartname> --versions
examine the manifest for a release:
helm get manifest <release> -n <namespace>
=== Helm best practice notes ===
* Use a ''community chart'' when you can to save yourself time. But also ''Keep up with changes'' so that you are not left with a chart version so old that an upgrade will be painful.
* In charts you make yourself ''Don't mention namespace'' in any manifest. This way your chart can be deployed easily to any namespace. You could make it a value in the values.yaml file, and reference it in the manifests, but why do all that extra typing? Use -n <namespace> at install time.
* Install the chart in the same namespace as where the app will be deployed.
* Use <code>--wait</code> in your pipelines so that if something doesn't get deployed your pipeline goes red, or ensure you have a task that checks the status afterwards. Which one depends on: 1. how long you want to wait "in pipeline" to know that the deploy worked; large deploys can take a long time, think 1000s of pods. 2. whether having helm return quickly while a "wait for healthy" task lives longer is preferable for reporting and metrics.
=== Brain Surgery ===
do not do this.
<pre>
export app="myapp"
helm history $app
# pick the last "good" release secret and put its name in $secret
k get secret $secret -o=jsonpath='{.data.release}' | base64 -d | base64 -d | gzip -c -d > ~/tmp/${app}.manifests
# edit the manifests:
vi ~/tmp/${app}.manifests
# bashism
# re-encode; tr strips the line wraps that GNU base64 adds, which would otherwise break the JSON patch
VALUE=$(cat ~/tmp/${app}.manifests | gzip -c | base64 | tr -d '\n' | base64 | tr -d '\n')
kubectl patch secret $secret -p "{\"data\":{\"release\":\"${VALUE}\"}}"
</pre>
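The shell pipeline implies the layering of the stored release: JSON, gzipped, base64 encoded by helm, then base64 encoded again by the Kubernetes secret. A sketch of the same round trip, assuming that layering holds for your helm version. The function names are made up, and the sample release is a stand-in, not a real helm record.

```python
import base64
import gzip
import json


def decode_release(data_release):
    """Undo the secret layer, the helm layer, and the gzip, then parse the JSON."""
    helm_layer = base64.b64decode(data_release)   # Kubernetes secret encoding
    gzipped = base64.b64decode(helm_layer)        # helm's own encoding
    return json.loads(gzip.decompress(gzipped))


def encode_release(release):
    """Reverse of decode_release; output has no line wraps."""
    gzipped = gzip.compress(json.dumps(release).encode("utf-8"))
    helm_layer = base64.b64encode(gzipped)
    return base64.b64encode(helm_layer).decode("ascii")


# Hypothetical, heavily trimmed release document.
release = {"name": "myapp", "version": 3, "manifest": "kind: Deployment\n"}
encoded = encode_release(release)
print(decode_release(encoded) == release)
```

Python's b64encode never inserts newlines, which sidesteps the wrapping problem the shell version has to strip with tr.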
== taints ==
Still learning about this.
kubectl get nodes -o=jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.taints}{"\n"}{end}'
== Kubernetes dashboard ==
get the token for the service account to log in to the web UI:
ks get secret `ks get sa kubernetes-dashboard -o=jsonpath='{.secrets[0].name}'` -o=jsonpath='{.data.token}' | base64 -d ; echo
== Kubernetes Quiz Links ==
#Pods
https://kodekloud.com/p/practice-test-kubernetes-ckad-pods
#ReplicaSets
https://kodekloud.com/p/practice-test-kubernetes-ckad-replicasets
#Deployments
https://kodekloud.com/p/practice-test-kubernetes-ckad-deployments
#Namespaces
https://kodekloud.com/p/practice-test-kubernetes-ckad-namespaces
#Commands and Arguments
https://kodekloud.com/p/practice-test-kubernetes-ckad-commands-and-arguments
#ConfigMaps
https://kodekloud.com/p/practice-test-kubernetes-ckad-configmaps
#Secrets
https://kodekloud.com/p/practice-test-kubernetes-ckad-secrets
#Security Contexts
https://kodekloud.com/p/practice-test-kubernetes-ckad-security-contexts
#Service Accounts
https://kodekloud.com/p/practice-test-kubernetes-ckad-service-account
#Taints and Tolerations
https://kodekloud.com/p/practice-test-kubernetes-ckad-taints-tolerations
#Node Affinity
https://kodekloud.com/p/practice-test-kubernetes-ckad-node-affinity
#Multi-Container Pods
https://kodekloud.com/p/practice-test-kubernetes-ckad-multicontainer-pods
#Readiness and Liveness Probes
https://kodekloud.com/p/practice-test-kubernetes-ckad-readiness-probes
#Container Logging
https://kodekloud.com/p/practice-test-kubernetes-ckad-logging
#Monitoring
https://kodekloud.com/p/practice-test-kubernetes-ckad-monitoring
#Labels & Selectors
https://kodekloud.com/p/practice-test-kubernetes-ckad-labels-and-selectors
#Rolling Updates And Rollbacks
https://kodekloud.com/p/practice-test-kubernetes-ckad-rolling-updates-and-rollbacks
#Services
https://kodekloud.com/p/kubernetes-for-beginners-services-493847859
== Datadog ==
what version of datadog am I running?
do this:
kubectl get pods -l app=datadog-agent -o=jsonpath='{range .items[*]}{"\n"}{.metadata.name}{":\t"}{range .spec.containers[*]}{.image}{", "}{end}{end}{"\n"}'
and get "latest"! except what does that mean?
instead ask the agent itself:
for i in `kubectl get pods -l app=datadog-agent | awk '{print $1}' | grep -v NAME `; do echo $i; k exec -it $i -- /opt/datadog-agent/bin/agent/agent version; done
datadog-agent-XXXX
Agent X.X.X - Commit: XXX - Serialization version: X.X.X
datadog-agent-YYYY
Agent X.X.Y - Commit: XXX - Serialization version: X.X.X
ah ha! inconsistent versions! can be fixed with a ds delete -> k apply, or even just a pod kill.
== TCPDump a container == | |||
or pod. | |||
Reference: | |||
https://community.pivotal.io/s/article/How-to-get-tcpdump-for-containers-inside-Kubernetes-pods | |||
;Get the container ID and host. | |||
k get pod XXX -o=jsonpath='{.status.containerStatuses[0].containerID}{"\n"}{.status.hostIP}{"\n"}' | |||
docker://YYYYYYYYYYYYYYYYYYYYYY | |||
Z.Z.Z.Z | |||
; get the interface index: | |||
docker exec XXX cat /sys/class/net/eth0/iflink | |||
<NUMBER> | |||
; find the interface on the host: | |||
ip link |grep ^<NUMBER>: | |||
; On those dump that interface: | |||
tcpdump -i veth235ab8ff | |||
== Disk usage of container == | |||
<pre> | |||
kns get pod -l app=myapp,env=datacenter-production -o=jsonpath='{range .items[*]}{.status.containerStatuses[0].containerID}{"\t"}{.status.hostIP}{"\n"}{end}' | \ | |||
awk 'BEGIN{FS="//"}{print $2}' | \ | |||
while read a b | |||
do | |||
echo ssh ${b} sudo du -sh /var/lib/docker/containers/${a} | |||
done | |||
</pre> | |||
Note that by default in k8s the docker json log driver is used so if the log is big for a container, that log will be in the container's directory. I'm not sure how to fix that . there is a config for it , but it looks like kubernetes is ont honouring it. | |||
== Ingress == | |||
List all ingresses and their api version: | |||
<pre> | |||
k get ingress -A -o jsonpath='{range .items[*]}{.metadata.namespace}{","}{.metadata.name}{","}{.apiVersion}{"\n"}{end}' | |||
</pre> | |||
== Diagram with python and D2 == | |||
[[/k8s2d2.py - Diagram with python and D2]] | |||
== Also See ==
* [[kops]] - automated Kubernetes cluster build.
* [[What I learned today Nov 2nd 2018]]
;Ingress Networking - 1
:https://kodekloud.com/p/practice-test-kubernetes-ckad-ingress-1
;Ingress Networking - 2
:https://kodekloud.com/p/practice-test-kubernetes-ckad-ingress-2-deploy-controller
* [[/AWS EKS Aspects]]
* [[/GCP GKE Aspects]]
[[Category:Kubernetes]]
Latest revision as of 13:39, 20 September 2024
Useful
alias:
alias k="kubectl"
alias kns="kubectl --namespace $KNS"
alias ks="kubectl --namespace kube-system" # Kubernetes System stuff
alias ke="kubectl get events --sort-by='{.lastTimestamp}'" # Kubernetes Events
alias kse="kubectl --namespace kube-system get events --sort-by='{.lastTimestamp}'" # Kubernetes System Events
With the advent of this, my knX aliases are not as useful:
kubectl config set-context --current --namespace=NAMESPACE
For moving around namespaces fast:
alias kX="kubectl --namespace nameX"
dump all:
kubectl get all --export=true -o yaml
( namespace kube-system not dumped; --export was deprecated in kubectl 1.14 and removed in 1.18, so drop the flag on newer versions )
also needed:
kns get secrets
kns get pvc
kns get pv
kns get cm
kns get sa
kns get role
kns get RoleBinding
list form:
k get pods
k get rs # replica set
k get rc # replication controller
what are all the things ?
kubectl api-resources
kubectl api-versions
event sorted by time
kubectl get events --sort-by=.metadata.creationTimestamp
show events with timestamp , rather than relative time:
kubectl get events -o custom-columns=FirstSeen:.firstTimestamp,LastSeen:.lastTimestamp,Count:.count,From:.source.component,Type:.type,Reason:.reason,Message:.message -n $KNS
kubectl get events -o custom-columns=FirstSeen:.firstTimestamp,LastSeen:.lastTimestamp,Count:.count,From:.source.component,Type:.type,Reason:.reason,Message:.message
what storage classes does my cluster support?
k get storageclass
how are pods spread out over nodes:
k describe node | grep -E '(^Non-t|m |^Name)' | more
( doesn't scale well, no indication of )
what deployments should be on what instance groups:
kubectl get deploy --all-namespaces -o=jsonpath='{range .items[*]}{.metadata.name}{"\t"}{..nodeSelector}{"\n"}{end}' |sort
or the other way round ( ig first )
kubectl get deploy --all-namespaces -o=jsonpath='{range .items[*]}{..nodeSelector}{"\t"}{.metadata.name}{"\n"}{end}' |sort
if you used kops to deploy the cluster then nodes are labeled with their instance groups, so you can be more specific like this:
k describe node -l kops.k8s.io/instancegroup=<instance group name> | grep -E '(^Non-t|m |^Name)' | more
how many nodes in each instance group? ( tested under kops )
for i in `kops get ig 2>/dev/null | grep -v NAME | awk '{print $1}'`
do
  echo $i
  kubectl get nodes -l kops.k8s.io/instancegroup=$i
done
how many pods per node:
k get pod -o wide | grep -v NAME | awk '{print $8}' | sort | uniq -c | sort -rn
k get pod --all-namespaces -o wide | grep -v NAME | awk '{print $8}' | sort | uniq -c | sort -rn
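The `$8` column index above depends on the kubectl version and on whether `--all-namespaces` adds a NAMESPACE column. A minimal sketch that avoids column positions, assuming the node name is read from `.spec.nodeName` through `custom-columns`; the helper name `count_by_node` and the sample node names are made up for illustration:

```shell
# Count how many times each node name appears on stdin, busiest node first.
count_by_node() {
  sort | uniq -c | sort -rn | awk '{print $1, $2}'
}

# Against a live cluster (not run here):
#   kubectl get pod -A -o custom-columns=NODE:.spec.nodeName --no-headers | count_by_node

# Offline demonstration with hypothetical node names:
printf '%s\n' node-a node-b node-a node-c node-a node-b | count_by_node
```

Pods that are not scheduled yet show up under `<none>`, which is a handy side effect when looking for pending pods.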
audit: who tried to do what?
ks get pod | grep kube-apiserver-ip
ks logs $podname
kns logs -f --timestamps podname
-f : "follow" ; --timestamps : show timestamps
the pod restarted , I want the logs from the previous start of this pod:
kns logs --previous podname
who tried to scale unsuccessfully?
ks logs $podname | grep scale | grep cloud | awk '$8!=200{print $0}'
Where is the service account token that I gave this pod?
It's in here: /var/run/secrets/kubernetes.io/serviceaccount/token
Scripting Scaling
Manually edit the replicas of a deployment from within the same namespace, but in a different pod:
- give the actor pod a service account ( possibly via its deployment ).
- create a Role as below.
- create the RoleBinding to connect the ServiceAccount to the Role.
Now you have: Pod -> Deployment -> ServiceAccount -> RoleBinding -> Role
Now the Pod has permission to do what it needs. Very similar to AWS's "IAM Role" where you give an instance a role that has the permissions that it needs to operate.
Note that in this case "ClusterRole" and ClusterRoleBinding are not required. It's all namespaced to the namespace that your deployment is in. In this case: "default".
export API_URL="https://${KUBERNETES_SERVICE_HOST}:${KUBERNETES_SERVICE_PORT}/${KUBE_ENDPOINT}"
export TOKEN=`cat /var/run/secrets/kubernetes.io/serviceaccount/token`
export CURL_CA_BUNDLE=/var/run/secrets/kubernetes.io/serviceaccount/ca.crt
curl \
  -H 'Accept: application/json' \
  -H "Authorization: Bearer $TOKEN" \
  $API_URL \
  > scale.json
# edit scale.json, set replicas to 4
curl -X PUT \
  -d@scale.json \
  -H 'Content-Type: application/json' \
  -H "Authorization: Bearer $TOKEN" \
  $API_URL
CURL_CA_BUNDLE - Kubernetes is its own CA, and presents each pod with a CA bundle that makes TLS "in" the cluster valid.
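The script above never says what KUBE_ENDPOINT holds. A minimal sketch of one plausible value, assuming the target is the `scale` subresource of an `apps/v1` deployment; the helper name `kube_scale_endpoint`, the namespace `default` and the deployment `myapp` are placeholders, not values from these notes:

```shell
# Print the API path of a deployment's scale subresource (apps/v1).
# $1 = namespace, $2 = deployment name; both are placeholders here.
kube_scale_endpoint() {
  printf 'apis/apps/v1/namespaces/%s/deployments/%s/scale\n' "$1" "$2"
}

# Hypothetical values: namespace "default", deployment "myapp".
KUBE_ENDPOINT="$(kube_scale_endpoint default myapp)"
echo "$KUBE_ENDPOINT"
```

There is no leading slash because API_URL above already puts one between the port and KUBE_ENDPOINT.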
This was the role that did it. FIXME: pare it down
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: kube-cloudwatch-autoscaler
  labels:
    app: kube-cloudwatch-autoscaler
rules:
- apiGroups:
  - ""
  resources:
  - nodes
  verbs:
  - list
- apiGroups:
  - apps
  resources:
  - deployments
  - deployments.apps
  - deployments.apps/scale
  - "*/scale"
  verbs:
  - get
  - update
  - patch
  - put
- apiGroups:
  - ""
  resources:
  - configmaps
  verbs:
  - get
  - create
On patching
There are a couple of ways to change an object.
export TOKEN=`cat /var/run/secrets/kubernetes.io/serviceaccount/token`
export CURL_CA_BUNDLE=/var/run/secrets/kubernetes.io/serviceaccount/ca.crt
1. dump the whole "thing", make the change, post the object back ( as above ) GET -> PUT
curl \
  -v \
  -H 'Accept: application/json' \
  -H "Authorization: Bearer $TOKEN" \
  $API_URL \
  > scale.json
# edit scale.json, set replicas to 4
curl -X PUT \
  -d@scale.json \
  -H 'Content-Type: application/json' \
  -H "Authorization: Bearer $TOKEN" \
  $API_URL
2. terse PATCH
curl -sS \
  -X 'PATCH' \
  -H "Authorization: Bearer ${TOKEN}" \
  -H 'Content-Type: application/merge-patch+json' \
  $API_URL \
  -d '{"spec": {"replicas": 1}}'
3. old / full PATCH ?
reference: https://stackoverflow.com/questions/41792851/manage-replicas-count-for-deployment-using-kubernetes-api ( 1 year 8 months old at time of _this_ writing )
Careful, compare:
BORKEN!
PAYLOAD='[{"op":"replace","path":"/spec/replicas","value":"3"}]'
curl \
  -X PATCH \
  -d ${PAYLOAD} \
  -H 'Content-Type: application/json-patch+json' \
  -H "Authorization: Bearer ${TOKEN}" \
  $API_URL
WERKS!
curl \
  -X PATCH \
  -d '[{"op":"replace","path":"/spec/replicas","value":3}]' \
  -H 'Content-Type: application/json-patch+json' \
  -H "Authorization: Bearer ${TOKEN}" \
  $API_URL
Closely:
-d '[{"op":"replace","path":"/spec/replicas","value":"3"}]' <- broken
-d '[{"op":"replace","path":"/spec/replicas","value":3}]' <- works
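The only difference between the two payloads is the JSON type of `value`: replicas is an integer field, so the quoted "3" is a string and gets rejected. A small sketch of a guard that builds the payload with a bare number; the function name `json_patch_replicas` is made up for this example:

```shell
# Emit a JSON Patch that sets /spec/replicas to an unquoted integer.
# Refuses anything that is not a plain non-negative integer.
json_patch_replicas() {
  case "$1" in
    ''|*[!0-9]*)
      echo "replicas must be a non-negative integer" >&2
      return 1
      ;;
  esac
  printf '[{"op":"replace","path":"/spec/replicas","value":%s}]\n' "$1"
}

json_patch_replicas 3
```

The output can be handed to curl with `-d "$(json_patch_replicas 3)"`; quoting the substitution keeps the payload in one argument.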
resource definitions
You did a get -o yaml to see the object, but you want to know what _all_ the attributes are.
For custom resource definitions you know about:
k get CustomResourceDefinition
but what about vanilla , non-custom, definitions? API reference?
FIXME , dunno.
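One possible answer for the built-in kinds is `kubectl explain`, which prints the documented fields of a resource from the cluster's own API schema. A minimal sketch; the wrapper name `explain_fields` is made up, and the resource paths in the comments are only examples:

```shell
# Show every documented field under a resource or dotted field path.
# $1 is for example "deployment" or "deployment.spec.strategy".
explain_fields() {
  kubectl explain "$1" --recursive
}

# Usage against a live cluster (not run here):
#   explain_fields deployment.spec.strategy
#   kubectl explain pod.spec.containers.livenessProbe
```

Without `--recursive` the command prints the description of one level only, which is easier to read when you already know roughly where to look.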
Maybe the thing you want to know more about is a custom resource and you just didn't know it was custom:
k get CustomResourceDefinition -A
for example:
k get CustomResourceDefinition backendconfigs.cloud.google.com -o yaml
show recommended labels on deployments
k get deploy -o=custom-columns='Deployment_name:.metadata.name,label_NAME:.metadata.labels.app\.kubernetes\.io/name,INSTANCE:.metadata.labels.app\.kubernetes\.io/instance,VERSION:.metadata.labels.app\.kubernetes\.io/version,COMPONENT:.metadata.labels.app\.kubernetes\.io/component,MANAGEBY:.metadata.labels.app\.kubernetes\.io/managed-by'
( sorry no line breaks allowed )
Template Examples
list images by pod:
kubectl get pods -o=jsonpath='{range .items[*]}{"\n"}{.metadata.name}{":\t"}{range .spec.containers[*]}{.image}{", "}{end}{end}{"\n"}'
kubectl get pods --all-namespaces -o=jsonpath='{range .items[*]}{"\n"}{.metadata.name}{":\t"}{range .spec.containers[*]}{.image}{", "}{end}{end}{"\n"}'
kns version:
kns get pods -o=jsonpath='{range .items[*]}{"\n"}{.metadata.name}{":\t"}{range .spec.containers[*]}{.image}{", "}{end}{end}{"\n"}'
pod and ips:
kubectl get pods -o=jsonpath='{range .items[*]}{"\n"}{.metadata.name}{":\t"}{range .status.podIPs[*]}{ .ip } {", "}{end}{end}{"\n"}'
kubectl get pods -n mynamespace -o=jsonpath='{range .items[*]}{"\n"}{.metadata.name}{":\t"}{range .status.podIPs[*]}{ .ip } {", "}{end}{end}{"\n"}'
list images by deploy
kubectl get deploy -o=jsonpath='{range .items[*]}{"\n"}{.metadata.name}{":\t"}{range .spec.template.spec.containers[*]}{.image}{", "}{end}{end}{"\n"}'
kns get deploy -o=jsonpath='{range .items[*]}{"\n"}{.metadata.name}{":\t"}{range .spec.template.spec.containers[*]}{.image}{", "}{end}{end}{"\n"}'
list all deployment cpu and mem request:
kubectl get deploy -o=jsonpath='{range .items[*]}{"\n"}{.metadata.name}{":\t"}{range .spec.template.spec.containers[*]}{.resources.requests.cpu}{", "}{.resources.requests.memory}{end}{end}{"\n"}'
list nodeSelectors for all deploys:
kubectl get deploy -o=jsonpath='{range .items[*]}{"\n"}{.metadata.name}{":\t"}{.spec.template.spec.nodeSelector}{end}{"\n"}'
all nodes by their condition statuses:
kubectl get nodes -o=jsonpath='{range .items[*]}{@.metadata.name}:{"\n"}{range @.status.conditions[*]}{"\t"}{@.type}={@.status};{"\n"}{end}{end}{"\n"}'
Note the double loop.
Examine the resource request and limit for all deploys:
kubectl get deploy -o=jsonpath='{range .items[*]}{@.metadata.name}{range @.spec.template.spec.containers[*]}{"\t"}{@.resources.requests.cpu}{"\t"}{@.resources.requests.memory}{"\t"}{@.resources.limits.cpu}{"\t"}{@.resources.limits.memory}{"\n"}{end}{end}{"\n"}'
ContainersIDs by host: ( accounts for multiple containers per pod )
for i in `kns describe node NODENAME | grep "m (" | grep -v cpu | awk '{print $2}'`
do
  echo -n $i " "
  ns=$(kns get pod -A | grep $i | awk '{print $1}')
  echo -n $ns " "
  # k get pod $i -n $ns -o jsonpath='{.metadata.name}{" "}{.status.containerStatuses[*].containerID}{"\n"}'
  k get pod $i -n $ns -o jsonpath='{"\n"}{"\t"}{range .status.containerStatuses[*]}{.name}{" "}{.containerID}{"\n"}{end}'
done
request for every replicaset ,across all name spaces
kubectl get rs -A -o=jsonpath='{range .items[*]}{"\n"}{.metadata.namespace}{","}{.metadata.name}{","}{.status.readyReplicas}{","}{range .spec.template.spec.containers[*]}{.resources.requests.cpu}{","}{.resources.requests.memory}{end}{end}{"\n"}' | sort -n -r -k 3 -t , | head
of the form:
namespace,name,replicasready,cpu,mem
metrics
wget "$(kubectl config view -o jsonpath='{range .clusters[*]}{@.cluster.server}{"\n"}{end}')"
also:
k top nodes
k top pods
exec notes
k exec -it $i -- bash -c "set | grep ENVVARX"
Monitoring
Prometheus queries of note:
Cluster wide CPU usage percent:
sum (rate (container_cpu_usage_seconds_total{id="/"}[5m])) / sum (machine_cpu_cores{}) * 100
Cluster wide memory usage:
sum (container_memory_working_set_bytes{id="/"}) / sum (machine_memory_bytes{}) * 100
Cluster wide cpu and memory request v available as a percent ( keep this below 100% )
CPU:
sum(kube_pod_container_resource_requests_cpu_cores) / sum(kube_node_status_capacity_cpu_cores) * 100
Memory:
sum(kube_pod_container_resource_requests_memory_bytes) / sum(kube_node_status_capacity_memory_bytes) * 100
Troubleshooting
Pod
pod starts and dies too quickly: first look at the logs. If that is not enough, override the command and args in the deployment, statefulset, or daemonset with this, so that you can exec into the failing pod and see what it's up to:
command: ["/bin/sh"]
args: ["-c", "while true; do date; sleep 10;done"]
the pod will come up and stay up long enough for you to get in and look around.
this only works with a pod that has enough of an operating system for you to do that, like bash, ps, ls, cd and such tools. Some very slim containers do not have those tools.
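For the very slim containers, one alternative on clusters that support ephemeral containers (stable since Kubernetes 1.25) is `kubectl debug`, which attaches a throwaway container that brings its own tools. A hedged sketch; the wrapper name `debug_pod` and the default `busybox` image are choices made for this example, not something from these notes:

```shell
# Attach an ephemeral debug container to a running pod.
# $1 = pod name, $2 = container to target, $3 = debug image (optional).
debug_pod() {
  kubectl debug -it "$1" --image="${3:-busybox}" --target="$2"
}

# Usage against a live cluster (not run here):
#   debug_pod mypod mycontainer
#   debug_pod mypod mycontainer alpine:3.19
```

This does not help with a pod that dies before you can attach; for that case the command override above is still the way in.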
One off pod
kubectl run tomcat --image=ubuntu:latest
kubectl run -i -t busybox --image=alpine:3.8 --restart=Never
kube Api server
You want to see what is up with the api server. If there is a problem with the cluster the api server is going to give you a better view.
by default -v ( verbose ) is not set.
in this example we are using kops to make the Kubernetes cluster, so to change the api server config we need to edit the cluster spec.
kops edit cluster
add a section
kubeAPIServer:
  logLevel: X
This adds --v=X to the api server command line.
I had something pounding the server with a bad token, but the log only said: "Bad token"... who ?
Had to turn loglevel up to 20 to get to the bottom of it... it was a bad fluentd token ( wish fluentd had some sort of back-off. )
Note that a kops update showed "no changed required".
I could force it, but instead, I used the "small change" trick Kops#Tricks_for_making_small_updates_to_a_host
Running a one off job from a cronjob
kns create job --from=cronjob/cronjob-name job-name-dthornton
Echo Server
aka printenv
aka python server
For troubleshooting load balancer, service, ingress issues, an "echo" server is useful:
server.py
#!/usr/bin/env python3
# ./server.py [<port>]
from http.server import BaseHTTPRequestHandler, HTTPServer
import logging

class S(BaseHTTPRequestHandler):
    def _set_response(self):
        self.send_response(200)
        self.send_header('Content-type', 'text/html')
        self.end_headers()

    def do_GET(self):
        logging.info("GET request,\nPath: %s\nHeaders:\n%s\n", str(self.path), str(self.headers))
        self._set_response()
        for e in self.headers:
            # self.response.write(e + "<br />")
            self.wfile.write("{}: {}<br>".format(e,self.headers[e]).encode('utf-8'))
        # self.wfile.write("type is {}".format(type(self.headers)).encode('utf-8'))
        # self.wfile.write("GET request for {}".format(self.path).encode('utf-8'))

    def do_POST(self):
        content_length = int(self.headers['Content-Length']) # <--- Gets the size of data
        post_data = self.rfile.read(content_length) # <--- Gets the data itself
        logging.info("POST request,\nPath: %s\nHeaders:\n%s\n\nBody:\n%s\n",
                str(self.path), str(self.headers), post_data.decode('utf-8'))
        self._set_response()
        self.wfile.write("POST request for {}".format(self.path).encode('utf-8'))

def run(server_class=HTTPServer, handler_class=S, port=8080):
    logging.basicConfig(level=logging.INFO)
    server_address = ('', port)
    httpd = server_class(server_address, handler_class)
    logging.info('Starting httpd...')
    try:
        httpd.serve_forever()
    except KeyboardInterrupt:
        pass
    httpd.server_close()
    logging.info('Stopping httpd...')

if __name__ == '__main__':
    from sys import argv
    if len(argv) == 2:
        run(port=int(argv[1]))
    else:
        run()
make a configmap from this:
kubectl create configmap server-script --from-file=server.py
then make a deployment:
deploy.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
  labels:
    app: server
    env: staging
  name: server
  namespace: testing
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 1
  selector:
    matchLabels:
      app: server
      env: staging
  strategy:
    rollingUpdate:
      maxSurge: 34%
      maxUnavailable: 0
    type: RollingUpdate
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: server
        env: staging
    spec:
      automountServiceAccountToken: false
      containers:
      - args:
        - /server.py
        - "9011"
        command:
        - python
        env:
        - name: MYENVVAR
          value: fun
        image: python:alpine3.15
        imagePullPolicy: IfNotPresent
        livenessProbe:
          failureThreshold: 5
          httpGet:
            path: /
            port: 9011
            scheme: HTTP
          initialDelaySeconds: 120
          periodSeconds: 60
          successThreshold: 1
          timeoutSeconds: 60
        name: fusionauth
        ports:
        - containerPort: 9011
          protocol: TCP
        resources:
          limits:
            cpu: 300m
            memory: 700Mi
          requests:
            cpu: 100m
            memory: 500Mi
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /server.py
          name: server-script
          subPath: server.py
      dnsConfig:
        options:
        - name: ndots
          value: "1"
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 60
      volumes:
      - configMap:
          defaultMode: 420
          name: server-script
        name: server-script
ingress.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    ingress.gcp.kubernetes.io/pre-shared-cert: mycert
    ingress.kubernetes.io/ssl-cert: mycert
    kubernetes.io/ingress.allow-http: "false"
    kubernetes.io/ingress.class: gce-internal
  labels:
    app: server
    env: staging
  name: server
  namespace: testing
spec:
  defaultBackend:
    service:
      name: server-service
      port:
        name: myservice # must match the port name in the Service
then a service and ingress.
in this case I'm using gke internal lb:
apiVersion: v1
kind: Service
metadata:
  annotations:
    cloud.google.com/neg: '{"ingress": true}'
  labels:
    app: server
    env: staging
  name: server-service
  namespace: testing
spec:
  externalTrafficPolicy: Cluster
  ipFamilies:
  - IPv4
  ipFamilyPolicy: SingleStack
  ports:
  - name: myservice
    port: 8080
    protocol: TCP
    targetPort: 9011 # must match the containerPort in deploy.yaml
  selector:
    app: server
    env: staging
  sessionAffinity: None
  type: NodePort
which pods are cycling?
podnewandold.sh
#!/bin/bash
export ARG="-l app.kubernetes.io/instance=myapp"
echo getting current
CURRENT=$(kubectl get pod ${ARG} -o name | sort )
while true
do
  echo getting new
  NEW=$(kubectl get pod ${ARG} -o name | sort )
  echo diff question mark
  diff -u <(echo "$NEW") <(echo "$CURRENT")
  echo sleep
  sleep 1
  CURRENT=${NEW}
done
Practices and Guidelines
https://medium.com/devopslinks/security-problems-of-kops-default-deployments-2819c157bc90
- Do not use replication controllers, instead use replica sets
- When changing the shape of the cluster ( number and type of instance groups ) you will use kops edit ig <ig name>, but don't forget to update the cluster-autoscaler config ( ks edit deploy cluster-autoscaler )
- Stay up to date, read release notes, just like you do for all the other stuff you manage right ? https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.14.md
ConfigMaps
All things configmaps:
https://kubernetes.io/docs/tasks/configure-pod-container/configure-pod-configmap/
Cgroup / slice errors
https://github.com/kubernetes/kubernetes/issues/56850
log message:
Sep 18 21:32:37 ip-10-10-37-50 kubelet[1681]: E0918 21:32:37.901058 1681 summary.go:92] Failed to get system container stats for "/system.slice/docker.service": failed to get cgroup stats for "/system.slice/docker.service": failed to get container info for "/system.slice/docker.service": unknown container "/system.slice/docker.service"
MAAS ubuntu
https://stripe.com/blog/operating-kubernetes
https://medium.com/@adriaandejonge/moving-from-docker-to-rkt-310dc9aec938
https://coreos.com/rkt/docs/latest/rkt-vs-other-projects.html#rkt-vs-docker
Security
Todo / read:
- https://www.nccgroup.com/us/about-us/newsroom-and-events/blog/2019/august/tools-and-methods-for-auditing-kubernetes-rbac-policies/
- https://github.com/aquasecurity/kube-hunter/blob/master/README.md
- https://www.arctiq.ca/events/2018/10/5/building-a-secure-container-strategy-with-aqua-security-microsoft-azure-and-hashicorp-vault/
- https://koudingspawn.de/secure-kubernetes-with-vault/
Container security:
can I break out of this container? https://github.com/brompwnie/botb
is this container reasonably safe? https://github.com/aquasecurity/trivy
How does my cluster stand up to the security bench-marks? https://github.com/aquasecurity/kube-bench
References and Reading
- Replica set versus Replication controller
- https://www.mirantis.com/blog/kubernetes-replication-controller-replica-set-and-deployments-understanding-replication-options/
- Publishing services - service types
- https://kubernetes.io/docs/concepts/services-networking/service/#publishing-services-service-types
- Kubernetes the hard way
- https://github.com/kelseyhightower/kubernetes-the-hard-way
- Hadolint - A smarter Dockerfile linter that helps you build best practice Docker images.
- https://github.com/hadolint/hadolint
HPA broken
Blue is test
Blue env:
Client Version: v1.12.2
Server Version: v1.10.6
Prod env:
Client Version: v1.12.2
Server Version: v1.9.8
In prod HPAs work. When I ask for them I see:
NAME        REFERENCE              TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
adjust      Deployment/adjust      0%/70%    1         5         1          1d
web-admin   Deployment/web-admin   0%/70%    1         3         1          2h
In blue env they don't work, I see:
NAME        REFERENCE              TARGETS         MINPODS   MAXPODS   REPLICAS   AGE
adjust      Deployment/adjust      <unknown>/70%   1         5         1          1d
web-admin   Deployment/web-admin   <unknown>/70%   1         3         1          2h
in Kubernetes events we see:
HorizontalPodAutoscaler Warning FailedGetResourceMetric horizontal-pod-autoscaler unable to get metrics for resource cpu: no metrics returned from resource metrics API
Note that the metrics server is running in kube-system, but there are no repo files for that in "/third-party" in prod.
In blue we store all metrics-server related files in /third-party/metrics-server ( taken from git@github.com:kubernetes-incubator/metrics-server.git )
In prod the deployment has:
- command:
  - /metrics-server
  - --source=kubernetes.summary_api:''
In blue this seemed to do the trick
  - /metrics-server
  - --kubelet-preferred-address-types=InternalIP
  - --kubelet-insecure-tls
Cluster Scaling
ks get configmap cluster-autoscaler-status -o yaml
https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/FAQ.md
Steps to move hardware around
In this case we are removing the last node from an instance group and then removing the instance group.
Reference: https://kubernetes.io/docs/concepts/architecture/nodes/
1. Cordon the node
k cordon ip-xx-xx-xx-xx.region.compute.internal
No new pods will be deployed here.
2. drain ( move pods here to somewhere else )
k drain ip-xx-xx-xx-xx.region.compute.internal
You may need to add "--ignore-daemonsets" if you have daemonsets running ( datadog, localredis )
You may need "--delete-local-data" ( renamed "--delete-emptydir-data" from kubectl 1.20 on ) if you have a metrics server on this node. BE CAREFUL. You will lose metrics, but you probably have an "out of cluster" place where metrics are stored ( datadog, elastic search, etc )
3. remove the nodegroup from the autoscaler:
ks edit deploy cluster-autoscaler
4. tell kops to delete the instance group.
kops delete ig myig
at this point the vms will be shut down.
k get nodes
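The first two steps above can be sketched as one helper that stops at the first failure, so a failed cordon never leads to a drain. The function name `retire_node` is made up; steps 3 and 4 stay manual because they edit the autoscaler and the kops state:

```shell
# Cordon and drain a node, stopping at the first failure.
# $1 = node name. Extra arguments are passed to "kubectl drain",
# for example --ignore-daemonsets.
retire_node() {
  node="$1"
  shift
  kubectl cordon "$node" || return 1
  kubectl drain "$node" "$@" || return 1
  echo "drained $node; now edit the autoscaler and delete the instance group"
}

# Usage against a live cluster (not run here):
#   retire_node ip-xx-xx-xx-xx.region.compute.internal --ignore-daemonsets
```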
Downing nodes
kubectl drain <node name> --delete-local-data --force --ignore-daemonsets
kubectl delete node <node name>
Kubeadm way
1. light up some instances:
if you are using amzn linux you can cloud-init like this:
yum_repos:
  # The name of the repository
  kubernetes:
    # Any repository configuration options
    # See: man yum.conf
    #
    # This one is required!
    baseurl: https://packages.cloud.google.com/yum/repos/kubernetes-el7-x86_64
    enabled: true
    gpgcheck: true
    gpgkey:
      - https://packages.cloud.google.com/yum/doc/yum-key.gpg
      - https://packages.cloud.google.com/yum/doc/rpm-package-key.gpg
    name: kubernetes
packages:
  - curl
  - git
  - iproute-tc
  - jq
  - kubeadm
  - kubectl
  - kubelet
  - lsof
  - mlocate
  - ntp
  - screen
  - strace
  - sysstat
  - tcpdump
  - telnet
  - traceroute
  - tree
  - unzip
  - wget
runcmd:
  - [ /usr/bin/updatedb ]
  - [ 'amazon-linux-extras', 'install', 'docker', '-y' ]
  - [ 'setenforce', '0']
  - [ 'systemctl', 'enable', 'docker']
  - [ 'systemctl', 'start', 'docker' ]
  - [ 'systemctl', 'enable', 'kubelet']
  - [ 'systemctl', 'start', 'kubelet' ]
write_files:
  - content: |
      net.bridge.bridge-nf-call-ip6tables = 1
      net.bridge.bridge-nf-call-iptables = 1
    path: /etc/sysctl.d/k8s.conf
    permissions: '0755'
    owner: root:root
  - content: |
      SERVER=$(/usr/bin/aws ec2 describe-tags --region us-east-1 --filters "Name=resource-id,Values=$(wget -q -O - http://169.254.169.254/latest/meta-data/instance-id)" "Name=key,Values=Name" --query 'Tags[*].Value' --output text)
      PRIVATE_IP=$(curl http://169.254.169.254/latest/meta-data/local-ipv4)
      # if hostname was set this would work, but hostname is not set
      # sed -i "s/^\(HOSTNAME\s*=\s*\).*$/\1$SERVER/" /etc/sysconfig/network
      echo "HOSTNAME=$SERVER" >> /etc/sysconfig/network
      echo "$PRIVATE_IP $SERVER" >> /etc/hosts
      echo "$SERVER" > /etc/hostname
      hostname $SERVER
    path: /root/sethostname.sh
    permissions: '0755'
    owner: root:root
on the first instance, do a
kubeadm init
and save the output.
run that output on the other instances.
boom! a kubernetes cluster...
what is it missing ?
- your app
- logging
- monitoring
- dashboard
Cluster status
kubectl get componentstatuses
Interrogate the cluster
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: dthornton-diag
  namespace: kube-system
  labels:
    app: conntrack-adjuster
spec:
  selector:
    matchLabels:
      app: conntrack-adjuster
  template:
    metadata:
      labels:
        app: conntrack-adjuster
    spec:
      hostNetwork: true
      hostPID: true
      hostIPC: true
      containers:
      - name: sysctl
        image: alpine:3.6
        imagePullPolicy: IfNotPresent
        command: ["sh", "-c"]
        args: ["while true; do echo NOW ; cat /proc/net/nf_conntrack ; sleep 60; done;"]
        securityContext:
          privileged: true
        volumeMounts:
        - name: sys
          mountPath: /sys
      volumes:
      - name: sys
        hostPath:
          path: /sys
      tolerations:
      - effect: "NoExecute"
        operator: "Exists"
      - effect: "NoSchedule"
        operator: "Exists"
The helm way
How the hell does helm work? <- this is me learning, disregard.
Setup
prereq: k cluster is up already.
step 1. install local helm 3 binary.
https://github.com/helm/helm/releases/tag/v3.0.0-alpha.1
MacOS
cd ~/work
mkdir helm
cd helm
wget https://get.helm.sh/helm-v3.0.0-alpha.1-darwin-amd64.tar.gz
tar zxvf helm-v3.0.0-alpha.1-darwin-amd64.tar.gz
cp darwin-amd64/helm ~/bin
Loonix
cd ~/work
mkdir helm
cd helm
wget https://get.helm.sh/helm-v3.0.0-alpha.1-linux-amd64.tar.gz
tar zxvf helm-v3.0.0-alpha.1-linux-amd64.tar.gz
cp linux-amd64/helm ~/bin/
helm can use different kubectl contexts, but we just use one context so simple is fine.
to get the kubernetes side of helm set up you do this:
blue@kubernetescluster:~$ helm init --debug
Creating /home/blue/.helm
Creating /home/blue/.helm/repository
Creating /home/blue/.helm/repository/cache
Creating /home/blue/.helm/plugins
Creating /home/blue/.helm/starters
Creating /home/blue/.helm/cache/archive
Creating /home/blue/.helm/repository/repositories.yaml
Adding stable repo with URL: https://kubernetes-charts.storage.googleapis.com
$HELM_HOME has been configured at /home/blue/.helm.
Happy Helming!
blue@kubernetescluster:~$
note that "--dry-run" worked in helm 2 but doesn't work in helm 3.
On HPA versus deployments
if you set replicas in a deployment and deploy an HPA, replicas in the deployment will fight with whatever the HPA wants.
fix this by _not_ setting it_ in the deployment.
commands of note
list available versions of charts:
helm search repo <reponame>/<chartname> --versions
examine the manifest for a release:
helm get manifest <release> -n <namespace>
Helm best practice notes
- Use community charts when you can to save yourself time. But also keep up with changes so that you are not left with a chart version so old that an upgrade will be painful.
- In charts you make yourself, don't mention namespace in any manifest. This way your chart can be deployed easily to any namespace. You could make it a value in the values.yaml file and reference it in the manifests, but why do all that extra typing? Use -n <namespace> at install time.
- Install the chart in the same namespace as where the app will be deployed.
- Use --wait in your pipelines so that if something doesn't get deployed your pipeline will go red, or ensure you have a task that checks the status afterwards. This depends on: 1. how long you want to wait "in pipeline" to know that the deploy worked. Large deploys can take a long time, think 1000s of pods. 2. having helm return quickly while a "wait for healthy" task lives longer might be preferable for reporting and metrics.
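A minimal sketch of the --wait practice as a pipeline step, assuming Helm 3 and that a non-zero exit code is what turns the pipeline red. The wrapper name `deploy_and_wait` and the 10 minute default timeout are choices made for this example:

```shell
# Install or upgrade a release and block until its resources are ready.
# $1 = release, $2 = chart, $3 = namespace, $4 = timeout (optional).
deploy_and_wait() {
  helm upgrade --install "$1" "$2" -n "$3" --wait --timeout "${4:-10m}"
}

# Usage in a pipeline (not run here); release, chart and namespace are placeholders:
#   deploy_and_wait myapp ./charts/myapp staging 15m || exit 1
```

For very large deploys it may be better to drop --wait here and let a separate status-check task carry the long timeout, as noted above.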
Brain Surgery
do not do this.
export app="myapp"
helm history $app
# pick last "good" secret
k get secret $secret -o=jsonpath='{.data.release}' | base64 -d | base64 -d | gzip -c -d > ~/tmp/${app}.manifests
# edit the manifests, then:
vi ~/tmp/${app}.manifests
# bashism
VALUE=$(cat ~/tmp/${app}.manifests | gzip -c | base64 | base64 )
kubectl patch secret $secret -p "{\"data\":{\"release\":\"${VALUE}\"}}"
taints
Still learning about this .
kubectl get nodes -o=jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.taints}{"\n"}{end}'
Kubernetes dashboard
get the token for the service account to log in to the web ui:
ks get secret `ks get sa kubernetes-dashboard -o=jsonpath='{.secrets[0].name}'` -o=jsonpath='{.data.token}' | base64 -d ; echo
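From Kubernetes 1.24 on, service accounts no longer get a long-lived token secret created automatically, so `.secrets[0].name` above may come back empty. A hedged alternative is to request a short-lived token with `kubectl create token`; the wrapper name `dashboard_token` is made up, and the service account name is the one used above:

```shell
# Request a short-lived token for the dashboard service account.
# $1 = service account name (optional, defaults to kubernetes-dashboard).
dashboard_token() {
  kubectl -n kube-system create token "${1:-kubernetes-dashboard}"
}

# Usage against a live cluster (not run here):
#   dashboard_token
```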
Kubernetes Quiz Links
- Pods
https://kodekloud.com/p/practice-test-kubernetes-ckad-pods
- ReplicaSets
https://kodekloud.com/p/practice-test-kubernetes-ckad-replicasets
- Deployments
https://kodekloud.com/p/practice-test-kubernetes-ckad-deployments
- Namespaces
https://kodekloud.com/p/practice-test-kubernetes-ckad-namespaces
- Commands and Arguments
https://kodekloud.com/p/practice-test-kubernetes-ckad-commands-and-arguments
- ConfigMaps
https://kodekloud.com/p/practice-test-kubernetes-ckad-configmaps
- Secrets
https://kodekloud.com/p/practice-test-kubernetes-ckad-secrets
- Security Contexts
https://kodekloud.com/p/practice-test-kubernetes-ckad-security-contexts
- Service Accounts
https://kodekloud.com/p/practice-test-kubernetes-ckad-service-account
- Taints and Tolerations
https://kodekloud.com/p/practice-test-kubernetes-ckad-taints-tolerations
- Node Affinity
https://kodekloud.com/p/practice-test-kubernetes-ckad-node-affinity
- Multi-Container Pods
https://kodekloud.com/p/practice-test-kubernetes-ckad-multicontainer-pods
- Readiness and Liveness Probes
https://kodekloud.com/p/practice-test-kubernetes-ckad-readiness-probes
- Container Logging
https://kodekloud.com/p/practice-test-kubernetes-ckad-logging
- Monitoring
https://kodekloud.com/p/practice-test-kubernetes-ckad-monitoring
- Labels & Selectors
https://kodekloud.com/p/practice-test-kubernetes-ckad-labels-and-selectors
- Rolling Updates And Rollbacks
https://kodekloud.com/p/practice-test-kubernetes-ckad-rolling-updates-and-rollbacks
- Services
https://kodekloud.com/p/kubernetes-for-beginners-services-493847859
Datadog
what version of data dog am I running ?
do this:
kubectl get pods -l app=datadog-agent -o=jsonpath='{range .items[*]}{"\n"}{.metadata.name}{":\t"}{range .spec.containers[*]}{.image}{", "}{end}{end}{"\n"}'
and get "latest"! except what does that mean?
instead ask the agent itself:
for i in `kubectl get pods -l app=datadog-agent | awk '{print $1}' | grep -v NAME `; do echo $i; k exec -it $i -- /opt/datadog-agent/bin/agent/agent version; done
datadog-agent-XXXX
Agent X.X.X - Commit: XXX - Serialization version: X.X.X
datadog-agent-YYYY
Agent X.X.Y - Commit: XXX - Serialization version: X.X.X
ah ha! inconsistent versions! can be fixed with a ds delete -> k apply, or even just a pod kill.
TCPDump a container
or pod.
Reference:
https://community.pivotal.io/s/article/How-to-get-tcpdump-for-containers-inside-Kubernetes-pods
- Get the container ID and host.
k get pod XXX -o=jsonpath='{.status.containerStatuses[0].containerID}{"\n"}{.status.hostIP}{"\n"}'
docker://YYYYYYYYYYYYYYYYYYYYYY
Z.Z.Z.Z
- get the interface index ( on that host )
docker exec XXX cat /sys/class/net/eth0/iflink
<NUMBER>
- find the interface on the host
ip link |grep ^<NUMBER>:
- On the host, dump that interface
tcpdump -i veth235ab8ff
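The containerID from the first step carries a runtime prefix such as `docker://`, while `docker exec` wants the bare ID. A small sketch of stripping it with shell parameter expansion; the function name `bare_container_id` and the sample ID are made up:

```shell
# Strip the runtime prefix ("docker://", "containerd://", ...) from a container ID.
bare_container_id() {
  printf '%s\n' "${1#*://}"
}

# Offline demonstration with a hypothetical ID:
bare_container_id "docker://0123abcd"
```

An ID without a prefix passes through unchanged, so the helper is safe to apply to either form.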
Disk usage of container
kns get pod -l app=myapp,env=datacenter-production -o=jsonpath='{range .items[*]}{.status.containerStatuses[0].containerID}{"\t"}{.status.hostIP}{"\n"}{end}' | \
awk 'BEGIN{FS="//"}{print $2}' | \
while read a b
do
  echo ssh ${b} sudo du -sh /var/lib/docker/containers/${a}
done
Note that by default in k8s the docker json log driver is used, so if the log is big for a container, that log will be in the container's directory. I'm not sure how to fix that. There is a config for it, but it looks like kubernetes is not honouring it.
Ingress
List all ingresses and their api version:
k get ingress -A -o jsonpath='{range .items[*]}{.metadata.namespace}{","}{.metadata.name}{","}{.apiVersion}{"\n"}{end}'
Diagram with python and D2
/k8s2d2.py - Diagram with python and D2
Also See
- kops - automated Kubernetes cluster build.
- What I learned today Nov 2nd 2018
- Ingress Networking - 1
- https://kodekloud.com/p/practice-test-kubernetes-ckad-ingress-1
- Ingress Networking - 2
- https://kodekloud.com/p/practice-test-kubernetes-ckad-ingress-2-deploy-controller