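I. Installing Prometheus + snmp_exporter + Alertmanager + Grafana
Overview:
Installation and deployment of Prometheus + snmp_exporter + Alertmanager + Grafana
Components
Prometheus: the main monitoring server; it collects the data exposed by snmp_exporter
snmp_exporter: monitors the switches and collects their data over SNMP
Alertmanager: the alerting component; it sends alerts (by email) when a switch goes down or when CPU or memory usage is too high, and it can also alert on interface traffic thresholds
Grafana: presents the data collected by Prometheus in a friendlier interface
Versions used
CentOS 7.8-2003-Minimal (4 cores, 8 GB RAM, 100 GB disk)
prometheus 2.21.0, listening on port 9090
alertmanager 0.21.0, listening on ports 9093 and 9094
snmp_exporter 0.19.0, listening on port 9116; SNMP uses UDP ports 161 and 162, so deploying it in docker is not recommended
Grafana 7.2.0 (docker deployment), listening on port 3000
Official sites: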
https://prometheus.io/docs/prometheus/latest/
https://github.com/prometheus
1. Initial steps after installing the OS (server tuning):
yum install wget yum-utils net-tools vim unzip bash-completion -y
yum update -y
ulimit -n
sed -i "s/* soft nofile 65535/ /g" /etc/security/limits.conf
sed -i "s/* hard nofile 65535/ /g" /etc/security/limits.conf
echo "* soft nofile 65535" >>/etc/security/limits.conf
echo "* hard nofile 65535" >>/etc/security/limits.conf
ulimit -n 65535
echo "Open-file limit after the change:"
ulimit -n
echo "Tuning kernel parameters"
echo "net.ipv4.ip_local_port_range = 1024 65535" >>/etc/sysctl.conf
echo "net.ipv4.tcp_syncookies = 1" >>/etc/sysctl.conf
echo "net.ipv4.tcp_tw_reuse = 1" >>/etc/sysctl.conf
echo "net.ipv4.tcp_tw_recycle = 1" >>/etc/sysctl.conf
echo "net.ipv4.tcp_fin_timeout = 30" >>/etc/sysctl.conf
echo "net.core.somaxconn = 20480" >>/etc/sysctl.conf
echo "net.core.netdev_max_backlog = 20480" >>/etc/sysctl.conf
echo "net.ipv4.tcp_max_syn_backlog = 20480" >>/etc/sysctl.conf
echo "net.ipv4.tcp_max_tw_buckets = 800000" >>/etc/sysctl.conf
sysctl -p
sed -i 's/=enforcing/=disabled/g' /etc/selinux/config
setenforce 0
systemctl disable firewalld
systemctl stop firewalld
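One thing to watch with the `echo ... >>` lines above: every re-run of the script appends the same settings again. Below is a minimal sketch of a helper that appends a line only when it is missing. The name `ensure_line` is hypothetical (not from the original notes), and the demo writes to a scratch file rather than to anything under /etc.

```shell
#!/usr/bin/env bash
# ensure_line FILE LINE: append LINE to FILE only if an identical line
# is not already present. (Hypothetical helper, for illustration only.)
ensure_line() {
    local file="$1" line="$2"
    # -q quiet, -x match the whole line, -F treat LINE as a fixed string
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Demo against a scratch file so no system file is modified.
demo_conf="$(mktemp)"
ensure_line "$demo_conf" "net.core.somaxconn = 20480"
ensure_line "$demo_conf" "net.core.somaxconn = 20480"   # second call is a no-op
ensure_line "$demo_conf" "* soft nofile 65535"
cat "$demo_conf"
```

On a real server the first argument would be /etc/sysctl.conf or /etc/security/limits.conf, which would presumably also make the two sed cleanup lines unnecessary.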
2. Download the required packages
1. Download the software
Download prometheus
wget https://github.com/prometheus/prometheus/releases/download/v2.21.0/prometheus-2.21.0.linux-amd64.tar.gz
Download snmp_exporter
wget https://github.com/prometheus/snmp_exporter/releases/download/v0.19.0/snmp_exporter-0.19.0.linux-amd64.tar.gz
Download alertmanager
wget https://github.com/prometheus/alertmanager/releases/download/v0.21.0/alertmanager-0.21.0.linux-amd64.tar.gz
2. Unpack into /opt/
Unpack prometheus
tar -xvf prometheus-2.21.0.linux-amd64.tar.gz
mv prometheus-2.21.0.linux-amd64 /opt/prometheus
Unpack snmp_exporter
tar -xf snmp_exporter-0.19.0.linux-amd64.tar.gz
mv snmp_exporter-0.19.0.linux-amd64 /opt/snmp_exporter
Unpack alertmanager
tar -xf alertmanager-0.21.0.linux-amd64.tar.gz
mv alertmanager-0.21.0.linux-amd64 /opt/alertmanager
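Before unpacking, it can be worth checking that a download is intact. The sketch below compares a file's SHA-256 digest with an expected value. It assumes the expected digest is copied by hand from the checksum list published alongside the release. The function name `verify_sha256` is hypothetical, and the demo uses a scratch file instead of a real tarball.

```shell
#!/usr/bin/env bash
# verify_sha256 FILE EXPECTED_HEX: succeed only if FILE's SHA-256 digest
# equals EXPECTED_HEX. (Hypothetical helper, for illustration only.)
verify_sha256() {
    local file="$1" expected="$2" actual
    actual="$(sha256sum "$file" | awk '{print $1}')"
    [ "$actual" = "$expected" ]
}

# Demo with a scratch file; for a real tarball the expected digest
# would come from the release page, not be computed locally.
sample="$(mktemp)"
printf 'hello\n' > "$sample"
good="$(sha256sum "$sample" | awk '{print $1}')"
verify_sha256 "$sample" "$good" && echo "checksum OK"
verify_sha256 "$sample" "deadbeef" || echo "checksum mismatch"
```

For the real files the call would look like `verify_sha256 prometheus-2.21.0.linux-amd64.tar.gz <digest from the release page>`.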
3. Register the services with systemd so they can start at boot
# prometheus service
cat > /etc/systemd/system/prometheus.service <<EOF
[Unit]
Description=Prometheus
After=network.target
[Service]
ExecStart=/opt/prometheus/prometheus --config.file=/opt/prometheus/prometheus.yml --storage.tsdb.path=/opt/prometheus/data
User=prometheus
[Install]
WantedBy=multi-user.target
EOF
# snmp_exporter service
cat > /etc/systemd/system/snmp_exporter.service <<EOF
[Unit]
Description=snmp_exporter
After=network.target
[Service]
ExecStart=/opt/snmp_exporter/snmp_exporter --config.file=/opt/snmp_exporter/snmp.yml
Restart=on-failure
[Install]
WantedBy=multi-user.target
EOF
# alertmanager service
cat > /etc/systemd/system/alertmanager.service <<EOF
[Unit]
Description=Alertmanager
After=network.target
[Service]
ExecStart=/opt/alertmanager/alertmanager --config.file=/opt/alertmanager/alertmanager.yml
Restart=on-failure
[Install]
WantedBy=multi-user.target
EOF
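The three unit files above share the same skeleton, so a copy-and-paste slip (such as a wrong Description) is easy to make. As a sketch only, the small generator below prints a unit to stdout from a description, a command line, and an optional user. `render_unit` is a hypothetical name, and unlike the original prometheus unit this version always adds Restart=on-failure.

```shell
#!/usr/bin/env bash
# render_unit DESCRIPTION EXEC_START [USER]: print a minimal systemd
# service unit to stdout. (Hypothetical helper, for illustration only.)
render_unit() {
    local description="$1" exec_start="$2" user="${3:-}"
    printf '[Unit]\nDescription=%s\nAfter=network.target\n\n' "$description"
    printf '[Service]\nExecStart=%s\nRestart=on-failure\n' "$exec_start"
    # Only emit a User= line when a user name was given.
    if [ -n "$user" ]; then
        printf 'User=%s\n' "$user"
    fi
    printf '\n[Install]\nWantedBy=multi-user.target\n'
}

# Demo: print the snmp_exporter unit instead of writing it to /etc.
render_unit "snmp_exporter" \
    "/opt/snmp_exporter/snmp_exporter --config.file=/opt/snmp_exporter/snmp.yml"
```

To install a unit this way, the output would be redirected to a file such as /etc/systemd/system/snmp_exporter.service.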
4. Create the prometheus user and make it the owner and group of the install directories
useradd prometheus
chown -R prometheus:prometheus /opt/{snmp_exporter,prometheus,alertmanager}
5. Reload systemd
systemctl daemon-reload
6. Enable the services at boot
systemctl enable prometheus && systemctl enable snmp_exporter && systemctl enable alertmanager
7. Start the services
systemctl start prometheus && systemctl start snmp_exporter && systemctl start alertmanager
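After starting the services, a quick HTTP check shows whether each one is answering on the port listed earlier. The sketch below uses the /-/healthy endpoints of Prometheus and Alertmanager and the /metrics page of snmp_exporter. The function name `check_http` is hypothetical, and localhost is an assumption (run it on the monitoring server itself).

```shell
#!/usr/bin/env bash
# check_http URL: print "UP URL" if the endpoint answers without an HTTP
# error, otherwise print "DOWN URL" and return 1.
# (Hypothetical helper, for illustration only.)
check_http() {
    local url="$1"
    # -f makes curl fail on HTTP errors (status 400 and above)
    if curl -fsS --max-time 3 -o /dev/null "$url" 2>/dev/null; then
        echo "UP $url"
    else
        echo "DOWN $url"
        return 1
    fi
}

check_http "http://localhost:9090/-/healthy" || true   # prometheus
check_http "http://localhost:9093/-/healthy" || true   # alertmanager
check_http "http://localhost:9116/metrics"   || true   # snmp_exporter
```

A DOWN line is a cue to look at `systemctl status <service>` and `journalctl -u <service>` for the failing unit.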
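3. Install Grafana
1. Download grafana-7.4.0-1.x86_64.rpm from:
https://mirrors.bfsu.edu.cn/grafana/yum/rpm/
Upload grafana-7.4.0-1.x86_64.rpm to the /usr/local directory on the server
2. Change into the directory
cd /usr/local
3. Install grafana
yum install -y grafana-7.4.0-1.x86_64.rpm
4. Start the service
systemctl start grafana-server
5. Enable the service at boot
systemctl enable grafana-server
6. Check the grafana port
netstat -natp | grep :3000
7. Open http://10.10.201.86:3000 in a browser; the default username and password are admin/admin
On first login you are asked to change the password; make a note of the new one
Other useful commands
Check the installed version
rpm -qa | grep grafana
Uninstall
yum remove grafana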
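### The following is optional; I did not install it ###
# Install grafana-image-renderer
grafana-cli plugins install grafana-image-renderer
# Install the libraries needed for rendering screenshots; without them no image is captured
yum -y install libatk-bridge* libXss* libgtk*
# Edit the configuration file to prevent garbled Chinese characters
vim /etc/grafana/grafana.ini
# Change the following setting
rendering_language = zh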
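(II) Monitoring Cisco switches with Prometheus: editing the snmp_exporter configuration file
Overview:
I. Notes
This setup uses the if_mib module, so the if_mib section of the configuration file has to be edited to add the switch's SNMP community string (the default is public). The module does not include OIDs for CPU and memory by default, so they have to be added by hand.
Full series on monitoring Cisco switches with Prometheus: https://blog.51cto.com/liujingyu/category9.html
1. If you do not know the switch's community string, look it up on the switch and pick an RW-type string (for polling alone, a read-only RO string is sufficient)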
show running-config | include snmp
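Run this on the switch (if SNMP is not enabled on the switch, enable it there first)
2. Test from the server with the snmpwalk command
yum -y install net-snmp-utils
# Walk the switch's interface table; if interface data comes back, the community string is correct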
[root@localhost ~]# snmpwalk -v 2c -c public 192.168.100.151 1.3.6.1.2.1.2
IF-MIB::ifNumber.0 = INTEGER: 146
IF-MIB::ifIndex.1 = INTEGER: 1
IF-MIB::ifIndex.4 = INTEGER: 4
IF-MIB::ifIndex.5 = INTEGER: 5
····
For details on specific OIDs, see:
http://oid-info.com/
https://www.cisco.com/c/en/us/support/web/tools-catalog.html
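The snmpwalk check above can be turned into a quick sanity test by counting the ifIndex rows that come back. This is a minimal sketch: `count_ifindex` is a hypothetical helper, and the sample input is just the few lines shown in the output excerpt, not a full walk.

```shell
#!/usr/bin/env bash
# count_ifindex: read snmpwalk output on stdin and print how many
# IF-MIB::ifIndex rows it contains. (Hypothetical helper.)
count_ifindex() {
    # grep -c exits non-zero when the count is 0, so keep the status clean
    grep -c '^IF-MIB::ifIndex\.' || true
}

# Sample lines copied from the snmpwalk output excerpt.
sample='IF-MIB::ifNumber.0 = INTEGER: 146
IF-MIB::ifIndex.1 = INTEGER: 1
IF-MIB::ifIndex.4 = INTEGER: 4
IF-MIB::ifIndex.5 = INTEGER: 5'

printf '%s\n' "$sample" | count_ifindex
```

Against a real switch the input would come from the command itself, for example `snmpwalk -v 2c -c public 192.168.100.151 1.3.6.1.2.1.2 | count_ifindex`. A count of 0 suggests a wrong community string or an unreachable device.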
(II) Edit snmp.yml: add the community string to the if_mib module, and add the OIDs for monitoring CPU and memory
1. Edit the configuration file
vim /opt/snmp_exporter/snmp.yml
····
if_mib:
  auth:
    community: ABCDEFG
  walk:
  - 1.3.6.1.2.1.2
  - 1.3.6.1.2.1.31.1.1
  - 1.3.6.1.4.1.9.2.1 # switch CPU information
  - 1.3.6.1.4.1.9.9.48 # switch memory information
  get:
  - 1.3.6.1.2.1.1.3.0
  metrics:
  - name: busyPer
    oid: 1.3.6.1.4.1.9.2.1.56.0
    type: gauge
    help: CPU utilization
  - name: avgBusy1
    oid: 1.3.6.1.4.1.9.2.1.57.0
    type: gauge
    help: CPU utilization in the past 1 minute
  - name: avgBusy2
    oid: 1.3.6.1.4.1.9.2.1.58.0
    type: gauge
    help: CPU utilization in the past 5 minutes
  - name: MemoryPoolFree
    oid: 1.3.6.1.4.1.9.9.48.1.1.1.6.1
    type: gauge
    help: ciscoMemoryPoolFree
  - name: MemoryPoolUsed
    oid: 1.3.6.1.4.1.9.9.48.1.1.1.5.1
    type: gauge
    help: ciscoMemoryPoolUsed
···
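The two memory metrics added above are raw used and free values, so a usage percentage has to be derived as used / (used + free). The sketch below shows that arithmetic. `mem_used_percent` is a hypothetical helper and the input numbers are made up for the demo, not readings from a real switch.

```shell
#!/usr/bin/env bash
# mem_used_percent USED FREE: print used/(used+free) as a percentage
# with one decimal place. (Hypothetical helper, for illustration only.)
mem_used_percent() {
    awk -v used="$1" -v free="$2" 'BEGIN {
        total = used + free
        if (total <= 0) { print "n/a"; exit 1 }   # avoid dividing by zero
        printf "%.1f\n", used * 100 / total
    }'
}

# Made-up values standing in for MemoryPoolUsed and MemoryPoolFree.
mem_used_percent 30000000 70000000
```

With these inputs the result is 30.0. The same ratio is what a memory alert threshold would be compared against.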
2. Restart the service
[root@localhost ~]# systemctl restart snmp_exporter
3. Check that snmp_exporter is listening on port 9116
[root@localhost ~]# netstat -tnlp
The output looks like this:
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      20451/sshd
tcp        0      0 127.0.0.1:25            0.0.0.0:*               LISTEN      20561/master
tcp6       0      0 :::22                   :::*                    LISTEN      20451/sshd
tcp6       0      0 ::1:25                  :::*                    LISTEN      20561/master
tcp6       0      0 :::9116                 :::*                    LISTEN      27273/snmp_exporter
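Once the port is listening, the exporter can be asked to scrape a switch directly through its /snmp endpoint, which takes the device address and the module name as query parameters. The sketch below only builds that URL. `snmp_probe_url` is a hypothetical helper, and the localhost:9116 default assumes it runs on the monitoring server.

```shell
#!/usr/bin/env bash
# snmp_probe_url TARGET [MODULE] [EXPORTER]: print the snmp_exporter URL
# that scrapes TARGET with MODULE. (Hypothetical helper.)
snmp_probe_url() {
    local target="$1" module="${2:-if_mib}" exporter="${3:-localhost:9116}"
    printf 'http://%s/snmp?target=%s&module=%s\n' "$exporter" "$target" "$module"
}

# Switch address taken from the snmpwalk example earlier in these notes.
snmp_probe_url 192.168.100.151
```

Fetching it with `curl -s "$(snmp_probe_url 192.168.100.151)" | grep -E '^(busyPer|avgBusy|MemoryPool)'` should show the CPU and memory metrics added to snmp.yml, provided the community string in the if_mib module matches the switch.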