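1 Configuring AlertManager to send alerts to DingTalk
DingTalk alerts can be sent with the prometheus-webhook-dingtalk tool. GitHub: https://github.com/timonwong/prometheus-webhook-dingtalk
Prometheus --> AlertManager --> custom webhook program --> DingTalk robot --> notification delivered to the target group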
2 webhook_dingding
2.1 Download and run webhook_dingding
mkdir -p /app/module/webhook_dingding
mv webhook_dingding /app/module/webhook_dingding
chmod a+x /app/module/webhook_dingding/webhook_dingding
2.2 Create the systemd service file
vim /usr/lib/systemd/system/webhook_dingding.service
[Unit]
Description=webhook_dingding
Documentation=https://prometheus.io/
After=network.target
[Service]
ExecStart=/app/module/webhook_dingding/webhook_dingding --port 5002
ExecReload=/bin/kill -HUP $MAINPID
TimeoutStopSec=20s
Restart=always
[Install]
WantedBy=multi-user.target
systemctl daemon-reload
systemctl start webhook_dingding.service
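2.3 Send test data
Check that webhook_dingding can deliver a message to DingTalk (note that the port is 5002). Replace <DINGTALK_TOKEN> with the robot's token.
curl -X POST 'http://localhost:5002/alert?token=<DINGTALK_TOKEN>' \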
-H "Content-Type: application/json" \
-d '{
"alerts": [
{
"status": "firing",
"labels": {
"severity": "critical",
"alertname": "InstanceDown",
"instance": "example1"
},
"annotations": {
"summary": "Instance example1 down",
"description": "The instance example1 is down."
},
"startsAt": "2024-12-20T15:04:05Z",
"endsAt": "0001-01-01T00:00:00Z"
},
{
"status": "resolved",
"labels": {
"severity": "critical",
"alertname": "InstanceDown",
"instance": "example1"
},
"annotations": {
"summary": "Instance example1 is back up",
"description": "The instance example1 has recovered."
},
"startsAt": "2024-12-20T15:04:05Z",
"endsAt": "2024-12-20T16:04:05Z"
}
]
}'
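The hand-written payload above uses fixed timestamps. As a convenience, the sketch below builds a single-alert payload with the current UTC time. The function name build_alert_payload is hypothetical and not part of webhook_dingding, and the field set simply mirrors the test request above.

```shell
#!/bin/sh
# Sketch: build a single-alert test payload stamped with the current UTC time.
# build_alert_payload is a hypothetical helper, not part of webhook_dingding.
build_alert_payload() {
  status="$1"      # "firing" or "resolved"
  alertname="$2"   # e.g. InstanceDown
  instance="$3"    # e.g. example1
  now="$(date -u +%Y-%m-%dT%H:%M:%SZ)"
  if [ "$status" = "resolved" ]; then
    ends_at="$now"
  else
    ends_at="0001-01-01T00:00:00Z"   # zero time: the alert is still firing
  fi
  cat <<EOF
{
  "alerts": [
    {
      "status": "${status}",
      "labels": {
        "severity": "critical",
        "alertname": "${alertname}",
        "instance": "${instance}"
      },
      "annotations": {
        "summary": "Instance ${instance} ${status}",
        "description": "Test alert ${alertname} for ${instance}."
      },
      "startsAt": "${now}",
      "endsAt": "${ends_at}"
    }
  ]
}
EOF
}

# Usage (the token value is a placeholder):
#   build_alert_payload firing InstanceDown example1 |
#     curl -X POST 'http://localhost:5002/alert?token=<DINGTALK_TOKEN>' \
#       -H 'Content-Type: application/json' -d @-
```

Piping the output into curl with -d @- keeps the payload and the request separate, so the same helper can produce both a firing and a resolved message.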
2.4 Check the result in DingTalk
2.5 Configure AlertManager to use webhook_dingding
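The source does not show the configuration for this step. The following is a minimal sketch only, assuming webhook_dingding accepts AlertManager's webhook POST on the same /alert endpoint and port used in the test in 2.3. The receiver name, host, and token below are placeholders, and the fragment is written to a scratch file so that it can be reviewed before being merged into the real AlertManager configuration.

```shell
#!/bin/sh
# Sketch: generate a receiver fragment for AlertManager.
# WEBHOOK_HOST, DINGTALK_TOKEN and the receiver name are hypothetical values.
WEBHOOK_HOST="127.0.0.1"
DINGTALK_TOKEN="your-dingtalk-token"

fragment="$(mktemp)"
cat > "$fragment" <<EOF
route:
  receiver: 'webhook-dingding'
receivers:
- name: 'webhook-dingding'
  webhook_configs:
  - url: 'http://${WEBHOOK_HOST}:5002/alert?token=${DINGTALK_TOKEN}'
    send_resolved: true
EOF

# Print the fragment for review
cat "$fragment"
```

send_resolved: true asks AlertManager to also notify when an alert recovers, which matches the firing/resolved pair used in the test payload.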
2.6 Reload AlertManager
curl -X POST http://192.168.137.131:9093/-/reload
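The plain curl call prints nothing on success, so a failed reload (for example, from an invalid configuration) is easy to miss. A small sketch that checks the HTTP status code is shown below. The function name reload_alertmanager is hypothetical, and it assumes a successful reload answers with HTTP 200.

```shell
#!/bin/sh
# Sketch: reload AlertManager and report whether the request was accepted.
# reload_alertmanager is a hypothetical helper name.
reload_alertmanager() {
  base_url="$1"   # e.g. http://192.168.137.131:9093
  # -w prints only the status code; "|| true" keeps going if curl cannot connect
  code="$(curl -s -o /dev/null -w '%{http_code}' -X POST "${base_url}/-/reload" || true)"
  if [ "$code" = "200" ]; then
    echo "reload ok"
  else
    echo "reload failed (HTTP ${code})" >&2
    return 1
  fi
}

# Usage:
#   reload_alertmanager http://192.168.137.131:9093
```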
2.7 Trigger an alert
Stop node_exporter and then start it again to verify that the messages arrive in DingTalk:
systemctl stop node_exporter.service
systemctl start node_exporter.service
3 prometheus-webhook-dingtalk
3.1 Download prometheus-webhook-dingtalk
# Download through a mirror for faster access
wget https://mirror.ghproxy.com/https://github.com/timonwong/prometheus-webhook-dingtalk/releases/download/v2.1.0/prometheus-webhook-dingtalk-2.1.0.linux-amd64.tar.gz
3.2 Extract and install
tar -xf prometheus-webhook-dingtalk-2.1.0.linux-amd64.tar.gz -C /app/module/
ln -s /app/module/prometheus-webhook-dingtalk-2.1.0.linux-amd64/ /app/module/prometheus-webhook-dingtalk
3.3 Add the alert message template
vim /app/module/prometheus-webhook-dingtalk/template.tmpl
{{ define "__subject" }}
[{{ .Status | toUpper }}{{ if eq .Status "firing" }}:{{ .Alerts.Firing | len }}{{ end }}]
{{ end }}
{{ define "__alert_list" }}{{ range . }}
---
{{ if .Labels.owner }}@{{ .Labels.owner }}{{ end }}
**Alert source**: AlertManager <br>
**Alert type**: {{ .Labels.alertname }} <br>
**Severity**: {{ .Labels.severity }} <br>
**Host**: {{ .Labels.instance }} <br>
**Summary**: {{ .Annotations.summary }} <br>
**Description**: {{ index .Annotations "description" }} <br>
**Started at**: <font color='#FF0000'>{{ dateInZone "2006.01.02 15:04:05" (.StartsAt) "Asia/Shanghai" }} </font><br>
{{ end }}{{ end }}
{{ define "__resolved_list" }}{{ range . }}
---
{{ if .Labels.owner }}@{{ .Labels.owner }}{{ end }}
**Alert source**: AlertManager <br>
**Alert type**: {{ .Labels.alertname }} <br>
**Severity**: {{ .Labels.severity }} <br>
**Host**: {{ .Labels.instance }} <br>
**Summary**: {{ .Annotations.summary }} <br>
**Description**: {{ index .Annotations "description" }} <br>
**Started at**: <font color='#FF0000'>{{ dateInZone "2006.01.02 15:04:05" (.StartsAt) "Asia/Shanghai" }} </font><br>
**Resolved at**: <font color='#FF0000'>{{ dateInZone "2006.01.02 15:04:05" (.EndsAt) "Asia/Shanghai" }} </font><br>
{{ end }}{{ end }}
{{ define "default.title" }}
{{ template "__subject" . }}
{{ end }}
{{ define "default.content" }}
{{ if gt (len .Alerts.Firing) 0 }}
**<h2><font color='#FF7F00'>{{ .Alerts.Firing | len }} alert(s) firing</font></h2>**
{{ template "__alert_list" .Alerts.Firing }}
---
{{ end }}
{{ if gt (len .Alerts.Resolved) 0 }}
**<h2><font color='#00FF00'>{{ .Alerts.Resolved | len }} alert(s) resolved</font></h2>**
{{ template "__resolved_list" .Alerts.Resolved }}
{{ end }}
{{ end }}
{{ define "ding.link.title" }}{{ template "default.title" . }}{{ end }}
{{ define "ding.link.content" }}{{ template "default.content" . }}{{ end }}
{{ template "default.title" . }}
{{ template "default.content" . }}
3.4 Add the configuration file
cd /app/module/prometheus-webhook-dingtalk
cp config.example.yml config.yml
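The copied example file still has to point at the custom template and at a real DingTalk robot. The sketch below writes a minimal configuration to a scratch file for review. It assumes the templates and targets keys as they appear in the project's config.example.yml, and the access token and secret are placeholders. The target name webhook1 is kept because it is the name used in the AlertManager URL in 3.6.

```shell
#!/bin/sh
# Sketch: a minimal config.yml written to a scratch file for review.
# <ACCESS_TOKEN> and <SECRET> are placeholders for the robot's real values.
cfg="$(mktemp)"
cat > "$cfg" <<'EOF'
# Load the custom template created in section 3.3
templates:
  - /app/module/prometheus-webhook-dingtalk/template.tmpl

targets:
  # The target name becomes part of the URL path: /dingtalk/<name>/send
  webhook1:
    url: https://oapi.dingtalk.com/robot/send?access_token=<ACCESS_TOKEN>
    # Only needed if the robot uses signature verification
    secret: <SECRET>
EOF

# Print the result; copy it over config.yml once the values are filled in
cat "$cfg"
```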
3.5 Create the systemd service file
vim /usr/lib/systemd/system/webhook.service
[Unit]
Description=prometheus-webhook-dingtalk
After=network.target
[Service]
ExecStart=/app/module/prometheus-webhook-dingtalk/prometheus-webhook-dingtalk \
--config.file=/app/module/prometheus-webhook-dingtalk/config.yml
ExecReload=/bin/kill -HUP $MAINPID
TimeoutStopSec=20s
Restart=always
[Install]
WantedBy=multi-user.target
systemctl daemon-reload
systemctl start webhook.service
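Before wiring up AlertManager, the service can be exercised by hand with a body shaped like AlertManager's webhook payload. The sketch below only prepares such a body in a scratch file. The field values are illustrative, and the target name webhook1 and port 8060 are taken from the AlertManager URL used in 3.6.

```shell
#!/bin/sh
# Sketch: an AlertManager-style webhook body (format version 4) for a manual test.
# All label and annotation values are illustrative.
payload="$(mktemp)"
cat > "$payload" <<'EOF'
{
  "version": "4",
  "groupKey": "manual-test",
  "status": "firing",
  "receiver": "webhook-dingtalk",
  "groupLabels": {"alertname": "InstanceDown"},
  "commonLabels": {"alertname": "InstanceDown", "severity": "critical"},
  "commonAnnotations": {},
  "externalURL": "http://192.168.137.131:9093",
  "alerts": [
    {
      "status": "firing",
      "labels": {"alertname": "InstanceDown", "severity": "critical", "instance": "example1"},
      "annotations": {"summary": "Instance example1 down", "description": "The instance example1 is down."},
      "startsAt": "2024-12-20T15:04:05Z",
      "endsAt": "0001-01-01T00:00:00Z"
    }
  ]
}
EOF

# Send it to the local service:
#   curl -X POST http://localhost:8060/dingtalk/webhook1/send \
#     -H 'Content-Type: application/json' -d @"$payload"
```

If the message shows up in the DingTalk group with the custom layout, both the template and the target are loaded correctly.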
3.6 Configure AlertManager to use webhook-dingtalk
- name: 'webhook-dingtalk'
  webhook_configs:
  - url: 'http://192.168.137.131:8060/dingtalk/webhook1/send'
3.7 Reload AlertManager
curl -X POST http://192.168.137.131:9093/-/reload
3.8 Trigger an alert
Stop the exporter and then start it again to verify that the messages arrive in DingTalk:
systemctl stop node_exporter.service
systemctl start node_exporter.service