In my Kubernetes cluster, I sometimes need to run a pod on a specific worker node. Maybe one of them has a folder that I need or a specific hardware characteristic. Historically, I’ve set spec.nodeName: srv5 on the Pod. However, when that node becomes unavailable, say because it has run out of disk space and is reporting DiskPressure, Kubernetes will continually try to spin up pods on it, and they pile up by the thousands.
A screenshot of CPU usage growing until Prometheus falls over and can’t scrape anymore.
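For reference, pinning a pod this way looks roughly like the manifest below. It is only a sketch: spec.nodeName: srv5 is the part taken from my setup, while the pod name, container name, and image are placeholders made up for illustration.

```yaml
# Minimal Pod pinned to one node via spec.nodeName.
# "pinned-example" and the busybox image are hypothetical placeholders.
apiVersion: v1
kind: Pod
metadata:
  name: pinned-example
spec:
  # Hard-pins the pod to the worker node named srv5.
  nodeName: srv5
  containers:
    - name: app
      image: busybox:1.36
      # Keep the container alive so the pod stays Running.
      command: ["sleep", "3600"]
```

Because nodeName is set, the scheduler is skipped entirely and the pod is handed straight to the kubelet on srv5, so nothing checks beforehand whether the node is healthy enough to take it.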
