Prometheus Notes: Difference between revisions

From Federal Burro of Information

Revision as of 22:24, 31 July 2020

PromQL

node exporter:

node_memory_MemAvailable_bytes{job=~"myjob.*"} / on ( instance ) node_memory_MemTotal_bytes{job=~"myjob.*"}
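node_memory_MemFree_bytes{job=~"myjob.*"} / on ( instance ) node_memory_MemTotal_bytes{job=~"myjob.*"}

kube-state-metrics (CPU requests as a percentage of cluster capacity):

sum(kube_pod_container_resource_requests_cpu_cores) / sum(kube_node_status_capacity_cpu_cores) * 100

top 10 metric names by series count:
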
topk(
10,
count({job="prometheus"}) by (__name__)
)

renaming metrics

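scrape_configs:
- job_name: sql
  static_configs:
  - targets: ['172.21.132.39:41212']
  metric_relabel_configs:
  # copy the value of the prometheus_metric_name label into the metric name,
  # dropping any trailing underscores
  - source_labels: ['prometheus_metric_name']
    target_label: '__name__'
    regex: '(.*[^_])_*'
    replacement: '${1}'
  # then remove the now redundant label
  - regex: prometheus_metric_name
    action: labeldrop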

turns this:

query_result_dm_os_performance_counters{
  counter_instance="ex01",
  counter_name="log file(s) size (kb)",
  prometheus_metric_name="sqlserver_databases",
}

into:

sqlserver_databases{
  counter_instance="ex01",
  counter_name="log file(s) size (kb)",
}
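
A quick way to sanity-check the regex before reloading Prometheus is to run it over a few candidate names. The sketch below uses sed as a stand-in for the relabel engine, which is an approximation: Prometheus anchors relabel regexes at both ends, so the sed pattern adds ^ and $ explicitly. The function name strip_trailing_underscores and the sample values are made up for illustration.

```shell
# Emulates: regex '(.*[^_])_*' with replacement '${1}'.
# Prometheus anchors the regex on both sides, hence the explicit ^ and $.
# If the regex does not match, the replace action leaves the target label
# unchanged; sed behaves the same way because it prints unmatched input as-is.
strip_trailing_underscores() {
  printf '%s\n' "$1" | sed -E 's/^(.*[^_])_*$/\1/'
}

strip_trailing_underscores 'sqlserver_databases'
strip_trailing_underscores 'sqlserver_databases__'
```

Both calls should print the same name, since only trailing underscores are stripped.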

dirty install node exporter

# run only the block that matches the host architecture

# amd64
curl -L -o /tmp/node_exporter-1.0.1.linux-amd64.tar.gz https://github.com/prometheus/node_exporter/releases/download/v1.0.1/node_exporter-1.0.1.linux-amd64.tar.gz
tar zxvf /tmp/node_exporter-1.0.1.linux-amd64.tar.gz -C /tmp/
cp /tmp/node_exporter-1.0.1.linux-amd64/node_exporter /usr/bin/prometheus-node-exporter

# armv6
curl -L -o /tmp/node_exporter-1.0.1.linux-armv6.tar.gz https://github.com/prometheus/node_exporter/releases/download/v1.0.1/node_exporter-1.0.1.linux-armv6.tar.gz
tar zxvf /tmp/node_exporter-1.0.1.linux-armv6.tar.gz -C /tmp/
cp /tmp/node_exporter-1.0.1.linux-armv6/node_exporter /usr/bin/prometheus-node-exporter

chmod 755 /usr/bin/prometheus-node-exporter
chown root:root /usr/bin/prometheus-node-exporter

# quote the delimiter so the shell leaves the backslashes and $ characters alone
cat << 'EOF' > /etc/default/prometheus-node-exporter
ARGS="--collector.diskstats.ignored-devices=^(ram|loop|fd|(h|s|v|xv)d[a-z]|nvme\d+n\d+p)\d+$  \
      --collector.filesystem.ignored-mount-points=^/(sys|proc|dev|run)($|/) \
      --collector.netclass.ignored-devices=^lo$  \
      --collector.systemd"
EOF
  
chown root:root /etc/default/prometheus-node-exporter
chmod 644 /etc/default/prometheus-node-exporter

# quote the delimiter so the shell does not expand $ARGS and $MAINPID while writing the unit
cat << 'EOF' > /lib/systemd/system/prometheus-node-exporter.service
[Unit]
Description=Prometheus exporter for machine metrics
Documentation=https://github.com/prometheus/node_exporter
[Service]
Restart=always
User=nobody
EnvironmentFile=/etc/default/prometheus-node-exporter
ExecStart=/usr/bin/prometheus-node-exporter $ARGS
ExecReload=/bin/kill -HUP $MAINPID
TimeoutStopSec=20s
SendSIGKILL=no
[Install]
WantedBy=multi-user.target
EOF

chown root:root /lib/systemd/system/prometheus-node-exporter.service
chmod 644 /lib/systemd/system/prometheus-node-exporter.service

systemctl daemon-reload
systemctl enable prometheus-node-exporter.service
systemctl start prometheus-node-exporter.service
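
To confirm the exporter came up, one option is to check the unit state and then look for a known metric on node_exporter's default port 9100. has_metric below is a hypothetical helper, not part of node_exporter; it only greps the text exposition format read from stdin.

```shell
# has_metric NAME: succeed if stdin (Prometheus text exposition format)
# contains at least one sample line for metric NAME.
# Hypothetical helper for illustration only.
has_metric() {
  grep -Eq "^$1(\{| )"
}

# Typical use on the host, assuming the default listen address :9100:
#   systemctl is-active prometheus-node-exporter.service
#   curl -s http://localhost:9100/metrics | has_metric node_cpu_seconds_total && echo ok
```

Comment lines such as "# HELP ..." start with a hash, so they are not mistaken for samples.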

cpu usage from cpu seconds

usage by job:

100 - (avg by (job) (irate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)

reference: https://www.robustperception.io/understanding-machine-cpu-usage
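
As a worked example of what the expression computes: the idle counter grows by at most one second per second per core, so its per-second rate is the idle fraction, and 100 minus that fraction times 100 is the busy percentage. The sketch below does the same arithmetic for a single core from two hypothetical samples, much as irate works from the last two samples in the range window. The function name cpu_usage_pct and the numbers are invented for illustration.

```shell
# cpu_usage_pct IDLE_START IDLE_END ELAPSED_SECONDS
# The idle-counter delta divided by elapsed time is the idle fraction for one
# core; usage is what remains of 100%.
cpu_usage_pct() {
  LC_ALL=C awk -v a="$1" -v b="$2" -v t="$3" \
    'BEGIN { printf "%.1f\n", 100 - ((b - a) / t) * 100 }'
}

# Idle counter went from 1000.0 to 1012.0 seconds over a 15 second interval:
# 12 of the 15 seconds were idle, so the core was 20% busy.
cpu_usage_pct 1000.0 1012.0 15
```

The query itself averages the idle rate over every core and instance in a job before subtracting from 100.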

count of CPUs per system, averaged by job:

avg by (job) (count without (cpu) (node_cpu_seconds_total))

resources

https://timber.io/blog/promql-for-humans/

https://www.weave.works/blog/promql-queries-for-the-rest-of-us/

https://promcon.io/2018-munich/slides/taking-advantage-of-relabeling.pdf

https://medium.com/@valyala/promql-tutorial-for-beginners-9ab455142085

https://www.robustperception.io/extracting-full-labels-from-consul-tags

https://blog.freshtracks.io/prometheus-relabel-rules-and-the-action-parameter-39c71959354a