Prometheus
"Prometheus, a Cloud Native Computing Foundation project, is a systems and service monitoring system. It collects metrics from configured targets at given intervals, evaluates rule expressions, displays the results, and can trigger alerts if some condition is observed to be true." - https://github.com/prometheus/prometheus
Tips
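Reload the Prometheus config on every pod in Kubernetes
This triggers a config reload through the HTTP lifecycle endpoint rather than restarting the pods. The endpoint only responds if Prometheus was started with --web.enable-lifecycle.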
kubectl get pods -l component=prometheus -o name |
while read -r pod ; do
  echo "$pod"
  kubectl port-forward "$pod" 9090 &
  sleep 10 # to let the port-forward establish before using it
  curl -X POST localhost:9090/-/reload
  kill %%
  sleep 5 # to let the previous process exit before starting another port forward
done
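The fixed sleep 10 in the loop above is a guess at how long the port-forward takes to come up. A minimal sketch of polling instead, assuming a hypothetical helper named wait_until (not part of kubectl or Prometheus) and the Prometheus /-/ready endpoint being reachable on the forwarded port:

```shell
# wait_until is a hypothetical helper, not a kubectl or Prometheus command.
# Usage: wait_until <max_attempts> <delay_seconds> <command> [args...]
# Runs the command repeatedly until it succeeds or the attempts run out.
wait_until() {
  local attempts=$1 delay=$2
  shift 2
  local i=1
  while [ "$i" -le "$attempts" ]; do
    # Discard the command's output; only its exit status matters here.
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  return 1
}

# Possible use inside the loop, in place of the fixed sleep 10:
#   wait_until 20 1 curl -fsS localhost:9090/-/ready
```

This only shortens the wait when the port-forward comes up quickly. The attempt count and delay are arbitrary and may need tuning for a slow cluster.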
Validate a Prometheus config
promtool check config --syntax-only prometheus-config.yaml
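The check can also gate a reload, so a broken file never reaches a running server. A rough sketch, assuming the local file is the same config the server will load and that the server is reachable on localhost:9090. The function name check_then_reload and the PROMTOOL and CURL override variables are hypothetical and exist only to make the sketch easy to try out:

```shell
# check_then_reload is a hypothetical helper, not part of promtool.
# Usage: check_then_reload <config_file> [reload_url]
# PROMTOOL and CURL are assumed override variables for the two binaries.
check_then_reload() {
  local config=$1
  local url=${2:-http://localhost:9090/-/reload}

  # Same check as above; stop here if the config does not parse.
  if ! "${PROMTOOL:-promtool}" check config --syntax-only "$config"; then
    echo "config check failed, not reloading: $config" >&2
    return 1
  fi

  # -f makes curl exit non-zero on an HTTP error response.
  "${CURL:-curl}" -fsS -X POST "$url"
}

# Possible use:
#   check_then_reload prometheus-config.yaml
```

Dropping --syntax-only makes promtool also validate the files the config references, which is stricter but requires those files to be present locally.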
Links
- https://prometheus.io/docs/introduction/overview
- https://prometheus.io/docs/prometheus/latest/querying/basics: Good intro to PromQL fundamentals.
- https://the-zen-of-prometheus.netlify.app
- https://www.robustperception.io/cardinality-is-key
- https://github.com/cortexproject/cortex: "Horizontally scalable, highly available, multi-tenant, long term storage for Prometheus."
- https://github.com/thanos-io/thanos: "Thanos is a set of components that can be composed into a highly available metric system with unlimited storage capacity, which can be added seamlessly on top of existing Prometheus deployments."
- https://github.com/prometheus/prometheus/blob/release-2.42/tsdb/docs/format
- https://www.robustperception.io/using-tsdb-analyze-to-investigate-churn-and-cardinality
- https://fiberplane.com/blog/why-are-prometheus-queries-hard
- https://blog.cloudflare.com/how-cloudflare-runs-prometheus-at-scale