Prometheus Notes

From Federal Burro of Information
Revision as of 04:01, 23 November 2021 by David (talk | contribs) (→‎help)

PromQL

node exporter:

node_memory_MemAvailable_bytes{job=~"myjob.*"} / on ( instance ) node_memory_MemTotal_bytes{job=~"myjob.*"}
node_memory_MemFree_bytes{job=~"myjob.*"} / on ( instance ) node_memory_MemTotal_bytes{job=~"myjob.*"}
sum(kube_pod_container_resource_requests_cpu_cores) / sum(kube_node_status_capacity_cpu_cores) * 100
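The first two queries above divide one metric by another and pair the samples up with `on ( instance )`. As a rough illustration only, the toy model below mimics that one-to-one matching in plain Python. The instance names and byte values are invented, and the function `divide_on` is a hypothetical helper, not part of any Prometheus library. It also ignores the error Prometheus raises when several samples share the same match key.

```python
# Toy model of one-to-one vector matching with `on (instance)`.
# Sample values and instance names are invented for illustration.

def divide_on(left, right, on):
    """Divide samples whose `on` labels agree; unmatched samples are dropped."""
    def key(labels):
        return tuple(labels.get(name, "") for name in on)

    right_by_key = {key(labels): value for labels, value in right}
    result = {}
    for labels, value in left:
        k = key(labels)
        if k in right_by_key:
            result[k] = value / right_by_key[k]
    return result


available = [
    ({"instance": "node-a:9100", "job": "myjob-1"}, 2.0e9),
    ({"instance": "node-b:9100", "job": "myjob-2"}, 6.0e9),
]
total = [
    ({"instance": "node-a:9100", "job": "myjob-1"}, 8.0e9),
    ({"instance": "node-b:9100", "job": "myjob-2"}, 8.0e9),
    ({"instance": "node-c:9100", "job": "myjob-3"}, 8.0e9),  # no partner on the left
]

ratios = divide_on(available, total, on=["instance"])
for (instance,), ratio in sorted(ratios.items()):
    print(f"{instance}: {ratio:.2f} of memory available")
```

Only instances present on both sides appear in the result, which is why node-c is missing from the output.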


topk(
10,
count({job="prometheus"}) by (__name__)
)

renaming metrics

scrape_configs:
- job_name: sql
  static_configs:
  - targets: ['172.21.132.39:41212']
  metric_relabel_configs:
  - source_labels: ['prometheus_metric_name']
    target_label: '__name__'
    regex: '(.*[^_])_*'
    replacement: '${1}'
  - regex: prometheus_metric_name
    action: labeldrop

turns this:

query_result_dm_os_performance_counters{
  counter_instance="ex01",
  counter_name="log file(s) size (kb)",
  prometheus_metric_name="sqlserver_databases",
}

into:

sqlserver_databases{
  counter_instance="ex01",
  counter_name="log file(s) size (kb)",
}
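To see why the rename works, the two relabel rules can be imitated in a few lines of Python. This is a sketch under assumptions, not Prometheus code: `apply_rules` is a hypothetical helper, and it relies on the documented behaviour that relabel regexes are anchored at both ends (hence `re.fullmatch`) and that a `replace` rule leaves the labels alone when the regex does not match.

```python
import re


def apply_rules(labels):
    """Apply the two metric_relabel_configs rules to one label set."""
    labels = dict(labels)

    # Rule 1 (default action "replace"): copy prometheus_metric_name into
    # __name__, dropping any trailing underscores. A missing source label
    # is treated as the empty string, which this regex does not match.
    source = labels.get("prometheus_metric_name", "")
    match = re.fullmatch(r"(.*[^_])_*", source)
    if match:
        labels["__name__"] = match.group(1)

    # Rule 2 (action "labeldrop"): remove every label whose name matches.
    return {
        name: value
        for name, value in labels.items()
        if not re.fullmatch(r"prometheus_metric_name", name)
    }


before = {
    "__name__": "query_result_dm_os_performance_counters",
    "counter_instance": "ex01",
    "counter_name": "log file(s) size (kb)",
    "prometheus_metric_name": "sqlserver_databases",
}
after = apply_rules(before)
print(after["__name__"])
```

Series without a `prometheus_metric_name` label keep their original name, because the regex requires at least one character that is not an underscore.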

dirty install node exporter

# amd64 (run only the block that matches your architecture)
curl -L -o /tmp/node_exporter-1.0.1.linux-amd64.tar.gz https://github.com/prometheus/node_exporter/releases/download/v1.0.1/node_exporter-1.0.1.linux-amd64.tar.gz
tar zxvf /tmp/node_exporter-1.0.1.linux-amd64.tar.gz -C /tmp/
cp /tmp/node_exporter-1.0.1.linux-amd64/node_exporter /usr/bin/prometheus-node-exporter

# armv6
curl -L -o /tmp/node_exporter-1.0.1.linux-armv6.tar.gz https://github.com/prometheus/node_exporter/releases/download/v1.0.1/node_exporter-1.0.1.linux-armv6.tar.gz
tar zxvf /tmp/node_exporter-1.0.1.linux-armv6.tar.gz -C /tmp/
cp /tmp/node_exporter-1.0.1.linux-armv6/node_exporter /usr/bin/prometheus-node-exporter

chmod 755 /usr/bin/prometheus-node-exporter
chown root:root /usr/bin/prometheus-node-exporter

cat << EOF > /etc/default/prometheus-node-exporter
ARGS="--collector.diskstats.ignored-devices=^(ram|loop|fd|(h|s|v|xv)d[a-z]|nvme\d+n\d+p)\d+$  \
      --collector.filesystem.ignored-mount-points=^/(sys|proc|dev|run)($|/) \
      --collector.netclass.ignored-devices=^lo$  \
      --collector.systemd"
EOF
  
chown root:root /etc/default/prometheus-node-exporter
chmod 644 /etc/default/prometheus-node-exporter

cat << 'EOF' > /lib/systemd/system/prometheus-node-exporter.service
[Unit]
Description=Prometheus exporter for machine metrics
Documentation=https://github.com/prometheus/node_exporter
[Service]
Restart=always
User=nobody  
EnvironmentFile=/etc/default/prometheus-node-exporter
ExecStart=/usr/bin/prometheus-node-exporter $ARGS
ExecReload=/bin/kill -HUP $MAINPID
TimeoutStopSec=20s
SendSIGKILL=no
[Install]
WantedBy=multi-user.target
EOF

chown root:root /lib/systemd/system/prometheus-node-exporter.service
chmod 644 /lib/systemd/system/prometheus-node-exporter.service

systemctl daemon-reload
systemctl enable prometheus-node-exporter.service
systemctl start prometheus-node-exporter.service

Exposing metrics

with python use:

import prometheus_client
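A minimal sketch of what that looks like in practice, assuming the `prometheus_client` package is installed. The metric names `demo_jobs` and `demo_queue_depth` are invented for illustration. The example renders the exposition text in memory instead of starting a server, so it can be run as a plain script.

```python
from prometheus_client import CollectorRegistry, Counter, Gauge, generate_latest

# A private registry keeps the example free of the default process metrics.
registry = CollectorRegistry()

# Hypothetical metrics; the client appends "_total" to counter names.
jobs = Counter("demo_jobs", "Total number of jobs processed.", registry=registry)
queue_depth = Gauge("demo_queue_depth", "Jobs currently waiting in the queue.", registry=registry)

jobs.inc(3)
queue_depth.set(7)

# Render the text format that a /metrics endpoint would return.
exposition = generate_latest(registry).decode("utf-8")
print(exposition)
```

To serve the same registry over HTTP instead of printing it, `prometheus_client.start_http_server(8000, registry=registry)` can be used; the port number here is arbitrary.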


example output:

# HELP go_memstats_frees_total Total number of frees.
# TYPE go_memstats_frees_total counter
go_memstats_frees_total 21217
# HELP go_memstats_gc_sys_bytes Number of bytes used for garbage collection system metadata.
# TYPE go_memstats_gc_sys_bytes gauge
go_memstats_gc_sys_bytes 307200

cpu usage from cpu seconds

usage by job:

100 - (avg by (job) (irate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)

reference: https://www.robustperception.io/understanding-machine-cpu-usage
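The arithmetic behind the query can be checked by hand. The sketch below is a simplified model, not Prometheus code: it computes an instant rate from the last two samples of each idle counter, averages the idle fractions, and subtracts from 100. The sample values, the two-CPU host, and the 15 second scrape interval are all invented.

```python
def irate(prev, last):
    """Per-second rate from the last two (timestamp, value) samples."""
    (t0, v0), (t1, v1) = prev, last
    return (v1 - v0) / (t1 - t0)


# Hypothetical idle-second counters for a 2-CPU host, scraped 15 s apart.
idle_samples = {
    "cpu0": ((0.0, 1000.0), (15.0, 1012.0)),  # idle for 12 of 15 s
    "cpu1": ((0.0, 2000.0), (15.0, 2009.0)),  # idle for 9 of 15 s
}

# Each rate is the fraction of a second the CPU spent idle, per second.
idle_fractions = [irate(prev, last) for prev, last in idle_samples.values()]

# Same shape as: 100 - (avg(irate(...idle...)) * 100)
usage_percent = 100 - (sum(idle_fractions) / len(idle_fractions)) * 100
print(f"CPU usage: {usage_percent:.1f}%")
```

With idle fractions of 0.8 and 0.6 the average is 0.7, so the host is about 30% busy.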

count CPUs on system by job:

avg(count(node_cpu_seconds_total) without (cpu)) by (job)

when you have tagged your node pools by colour:

100 - (avg by (colour) (irate(node_cpu_seconds_total{job="kubernetes-node-exporter",mode="idle"}[5m])) * 100)

Plotting more than one metric

	label_replace(
		max(process_open_handles{kubernetes_namespace="mynamespace"}), 
		"aggregation", "max", "", ""
	)
	or
	label_replace(
		quantile(0.95, process_open_handles{kubernetes_namespace="mynamespace"}), 
		"aggregation", "p95", "", ""
	)
	or
	label_replace(
		quantile(0.5, process_open_handles{kubernetes_namespace="mynamespace"}),  
		"aggregation", "p50", "", ""
	)
	or
	label_replace(
		avg(process_open_handles{kubernetes_namespace="mynamespace"}),  
		"aggregation", "avg", "", ""
	)
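The `label_replace` calls matter because each aggregation above returns a single series with no labels, and `or` only adds right-hand series whose label set is not already present on the left. Without a distinguishing label, only the first aggregation would survive. The toy model below illustrates this. It is a sketch under assumptions: both functions are simplified stand-ins for the PromQL behaviour, the replacement is treated as a literal string, and the handle counts are invented.

```python
import re


def label_replace(vector, dst, replacement, src, regex):
    """Toy label_replace: set dst when the anchored regex matches src's value.

    Only literal replacements are modelled; "$1" style references are not.
    """
    out = []
    for labels, value in vector:
        labels = dict(labels)
        # A missing source label reads as "", which the empty regex matches.
        if re.fullmatch(regex, labels.get(src, "")):
            labels[dst] = replacement
        out.append((labels, value))
    return out


def union(left, right):
    """Toy `or`: keep left, then add right elements with unseen label sets."""
    seen = {frozenset(labels.items()) for labels, _ in left}
    extra = [(l, v) for l, v in right if frozenset(l.items()) not in seen]
    return list(left) + extra


handles = [10.0, 20.0, 60.0]                 # invented per-pod values
max_vec = [({}, max(handles))]               # aggregation drops all labels
avg_vec = [({}, sum(handles) / len(handles))]

collided = union(max_vec, avg_vec)
labelled = union(
    label_replace(max_vec, "aggregation", "max", "", ""),
    label_replace(avg_vec, "aggregation", "avg", "", ""),
)
print(len(collided), len(labelled))
```

The unlabelled union keeps one series, while the labelled one keeps both, each tagged with its `aggregation` value.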

help

/prometheus help

resources

https://timber.io/blog/promql-for-humans/

https://www.weave.works/blog/promql-queries-for-the-rest-of-us/

https://promcon.io/2018-munich/slides/taking-advantage-of-relabeling.pdf

https://medium.com/@valyala/promql-tutorial-for-beginners-9ab455142085

https://www.robustperception.io/extracting-full-labels-from-consul-tags

https://blog.freshtracks.io/prometheus-relabel-rules-and-the-action-parameter-39c71959354a

/Prometheus Internal Metrics