Prometheus Installation and Deployment

书坊尚 2024-01-09

1. Deploy Prometheus

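###Create the install directory
mkdir -p /usr/local/prometheus

###Download the Prometheus release tarball
wget https://github.com/prometheus/prometheus/releases/download/v2.48.1/prometheus-2.48.1.linux-amd64.tar.gz
##The download can be slow; if so, use the package linked below instead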

Link: install package (extraction code: q5t0)

Or upload the package to the server manually.
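
Whichever way the tarball reaches the server, it is worth checking that it is intact before extracting it. The sketch below assumes a checksum list in the usual `<hash>  <filename>` format (for example a sha256sums.txt file saved from the same release page); the function name verify_sha256 is hypothetical.

```shell
# Compare a tarball's SHA-256 hash with the entry for it in a checksum list.
# Assumption: $2 is a file with lines of the form "<hash>  <filename>".
verify_sha256() (
  file=$1   # tarball to check
  sums=$2   # checksum list
  name=$(basename "$file")
  # Look up the expected hash by file name.
  expected=$(awk -v n="$name" '$2 == n { print $1 }' "$sums")
  # Compute the actual hash of the downloaded file.
  actual=$(sha256sum "$file" | awk '{ print $1 }')
  # Fail when the file is not listed or the hashes differ.
  [ -n "$expected" ] && [ "$expected" = "$actual" ]
)

# Usage:
#   verify_sha256 prometheus-2.48.1.linux-amd64.tar.gz sha256sums.txt && echo OK
```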

###Extract the tarball into the install directory
tar -xf prometheus-2.48.1.linux-amd64.tar.gz  -C  /usr/local/prometheus

###Rename the extracted directory
cd /usr/local/prometheus
mv  prometheus-2.48.1.linux-amd64   prometheus

###Configure Prometheus
##Back up the default configuration file
cd /usr/local/prometheus/prometheus
cp prometheus.yml prometheus.yml-bak

###Rewrite the Prometheus configuration file (prometheus.yml)
# my global config
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
          # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: "prometheus"

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
      - targets: ["localhost:9090"]

###Validate the Prometheus configuration file
cd  /usr/local/prometheus/prometheus
./promtool check config ./prometheus.yml
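
The configuration above leaves rule_files commented out, so no alerts are defined yet. As a minimal sketch, the helper below writes one example alerting rule; the function name, the group name, and the InstanceDown rule are illustrative assumptions, and the file name first_rules.yml is taken from the commented entry under rule_files, which would need to be uncommented for the rule to load.

```shell
# Write a one-rule example file for Prometheus (illustrative content only).
write_example_rules() (
  dest=$1
  # Quoted heredoc: keeps {{ $labels.instance }} literal for Prometheus to expand.
  cat > "$dest" <<'EOF'
groups:
  - name: example
    rules:
      - alert: InstanceDown
        expr: up == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Instance {{ $labels.instance }} is down"
EOF
)

# Usage (run from /usr/local/prometheus/prometheus):
#   write_example_rules ./first_rules.yml
#   ./promtool check rules ./first_rules.yml
```

The severity label matters later: the Alertmanager inhibit rule in section 3 matches on it.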

###Start Prometheus. There are two ways: start it manually, or manage it with systemctl
###Method 1: start manually
cd  /usr/local/prometheus/prometheus

##--web.enable-lifecycle enables the /-/reload and /-/quit HTTP endpoints used below
nohup ./prometheus --config.file=./prometheus.yml \
--web.listen-address=0.0.0.0:9090 \
--web.enable-lifecycle \
--storage.tsdb.retention.time=90d \
--storage.tsdb.path=./data &

Check the startup log (nohup.out) to confirm that Prometheus started successfully.

Check the process and confirm that port 9090 is listening.
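
The same check can be scripted against Prometheus's readiness endpoint, /-/ready, which answers with HTTP 200 once the server can serve traffic. A minimal sketch, assuming curl is installed; the function name wait_ready and the default of 30 tries are arbitrary choices.

```shell
# Poll a readiness URL once per second until it answers or the tries run out.
wait_ready() (
  url=${1:-http://localhost:9090/-/ready}
  tries=${2:-30}
  i=0
  while [ "$i" -lt "$tries" ]; do
    # -f makes curl exit non-zero on HTTP errors (e.g. 503 while starting up).
    if curl -fs -o /dev/null "$url"; then
      exit 0
    fi
    i=$((i + 1))
    sleep 1
  done
  exit 1
)

# Usage:
#   wait_ready && echo "Prometheus is ready"
```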

###Force-kill the service (kill -9 skips a clean shutdown; prefer the curl command below)
ps -ef|grep prometheus|grep -v grep |awk '{print $2}'|xargs kill -9

###Or stop it gracefully with curl
curl -XPOST http://localhost:9090/-/quit

###Reload the Prometheus configuration
curl -XPOST http://localhost:9090/-/reload
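
It can help to validate the file before asking the running server to reload it. The sketch below chains the promtool check from earlier with the reload call; the function name reload_if_valid is hypothetical, and the default paths are the ones used in this guide.

```shell
# Reload Prometheus only if promtool accepts the configuration file.
reload_if_valid() (
  config=${1:-/usr/local/prometheus/prometheus/prometheus.yml}
  url=${2:-http://localhost:9090/-/reload}
  promtool=${3:-/usr/local/prometheus/prometheus/promtool}
  # Stop here when the configuration does not validate.
  "$promtool" check config "$config" || exit 1
  # The reload endpoint requires --web.enable-lifecycle on the server.
  curl -fsS -X POST "$url"
)

# Usage:
#   reload_if_valid && echo "reloaded"
```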

###Method 2: manage the service with systemctl
##Create the systemd unit file for Prometheus

cat > /usr/lib/systemd/system/prometheus.service <<'EOF'
[Unit]
Description=The Prometheus Server
After=network.target
[Service]
ExecStart=/usr/local/prometheus/prometheus/prometheus \
  --config.file=/usr/local/prometheus/prometheus/prometheus.yml \
  --web.listen-address=0.0.0.0:9090 \
  --web.enable-lifecycle \
  --storage.tsdb.retention.time=90d \
  --storage.tsdb.path=/usr/local/prometheus/prometheus/data/
Restart=on-failure
RestartSec=15s
[Install]
WantedBy=multi-user.target
EOF

###Reload systemd so it picks up the new unit file
systemctl  daemon-reload

##Enable the service at boot
systemctl  enable prometheus

##Start / stop the Prometheus service
systemctl  start  prometheus
systemctl  stop  prometheus

Once the service is running, open http://IP:9090 in a browser.

2. Deploy node_exporter

###Download the release tarball
wget https://github.com/prometheus/node_exporter/releases/download/v1.7.0/node_exporter-1.7.0.linux-amd64.tar.gz
###Or download it elsewhere and upload it to the server

###Extract into the install directory
tar xf node_exporter-1.7.0.linux-amd64.tar.gz -C /usr/local/prometheus/
##Rename the extracted directory
cd /usr/local/prometheus
mv node_exporter-1.7.0.linux-amd64 node_exporter

###Start / stop the service. As before, either run it manually or manage it with systemd
##Start manually
cd  /usr/local/prometheus/node_exporter
nohup ./node_exporter --web.listen-address=0.0.0.0:4220 &
###--web.listen-address=0.0.0.0:4220 sets the address and port node_exporter listens on (the default port is 9100)

###Force-kill the service
ps -ef|grep node_exporter|grep -v grep |awk '{print $2}'|xargs kill -9

###Manage node_exporter with systemd
##Create the systemd unit file

cat > /usr/lib/systemd/system/node_exporter.service   <<'EOF'
[Unit]
Description=The node_exporter Server
After=network.target
[Service]
ExecStart=/usr/local/prometheus/node_exporter/node_exporter \
  --web.listen-address=0.0.0.0:4220 \
  --collector.systemd \
  --collector.systemd.unit-include=(sshd|docker).service
Restart=on-failure
RestartSec=15s
SyslogIdentifier=node_exporter
[Install]
WantedBy=multi-user.target
EOF

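###Reload systemd so it picks up the new unit file
systemctl daemon-reload

##Enable the service at boot
systemctl enable node_exporter
##Start / stop the service
systemctl start node_exporter
systemctl stop node_exporter

After starting the service, port 4220 should be listening.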
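
The prometheus.yml from section 1 only scrapes Prometheus itself, so node_exporter's metrics are not collected until a scrape job is added for it. A minimal sketch follows; it assumes scrape_configs is the last section of the file (as in the configuration shown earlier), and the function name add_node_job and the job name node_exporter are illustrative.

```shell
# Append a node_exporter scrape job to the end of prometheus.yml.
# Assumption: scrape_configs is the last top-level section in the file.
add_node_job() (
  config=$1
  target=${2:-localhost:4220}
  # Do nothing if the job already exists, so repeated runs are harmless.
  grep -q 'job_name: "node_exporter"' "$config" && exit 0
  cat >> "$config" <<EOF

  - job_name: "node_exporter"
    static_configs:
      - targets: ["$target"]
EOF
)

# Usage:
#   add_node_job /usr/local/prometheus/prometheus/prometheus.yml
#   curl -XPOST http://localhost:9090/-/reload
```

After the reload, the new target should appear on the Status > Targets page of the Prometheus web UI.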

3. Deploy Alertmanager

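###Download the release tarball
cd  /usr/local/prometheus
wget https://github.com/prometheus/alertmanager/releases/download/v0.26.0/alertmanager-0.26.0.linux-amd64.tar.gz
##Or download it elsewhere and upload it to the server

###Extract the tarball
tar -xf alertmanager-0.26.0.linux-amd64.tar.gz -C /usr/local/prometheus

##Rename the extracted directory
mv alertmanager-0.26.0.linux-amd64 alertmanager

###The configuration has six parts: global, templates, route, receivers, inhibit_rules, and silences
##Back up the default configuration file
cd /usr/local/prometheus/alertmanager
cp alertmanager.yml  alertmanager.yml-bak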

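###Full configuration; adjust it to your environment
#global section
global:
  resolve_timeout: 5m  #If an alert carries no end time and is not updated within this period, Alertmanager declares it resolved
  smtp_smarthost: 'smtp.exmail.qq.com:465'  #SMTP server of the sender's mail provider; this is the Tencent Exmail SMTP endpoint
  smtp_from: 'xxx@company.com'  #Sender address
  smtp_auth_username: 'xxx@company.com'  #SMTP login user name, usually the same as the sender address
  smtp_auth_password: 'xxxxxxx'  #SMTP login password; may also be an authorization code
  smtp_require_tls: false  #Whether TLS is required; the default is true
#templates section
templates:
- '/usr/local/soft/alertmanager/email.tmpl'  #Directory or file holding the custom notification templates; it must match where the template is actually stored
#route section
route:  #Every incoming alert enters at the root route
  group_by: ['alertname','cluster','service']  #Alerts that share the same values for these labels are grouped together; for example, alerts with cluster=A and alertname=LatencyHigh form one group
  group_wait: 30s  #How long to wait before sending the first notification for a new group, so that more alerts can be batched into it
  group_interval: 5m  #How long to wait before notifying about new alerts added to a group that has already been notified
  repeat_interval: 1h  #How long to wait before re-sending a notification that was already sent successfully
  receiver: 'email'  #Default receiver
#receivers section
receivers:
- name: 'email'  #Receiver name
  email_configs:
  - to: 'xxx@xxx.com'  #Address that receives the alerts
    send_resolved: true  #Whether to send a notification when an alert is resolved
    html: '{{ template "email.htm" . }}'  #HTML body, rendered from the named template
    headers: { Subject: "{{ .CommonLabels.severity }} {{ .CommonAnnotations.summary }}" }  #Subject line
#Inhibition rules
inhibit_rules:
  - source_match:
      severity: 'critical'
    target_match:
      severity: 'warning'
    equal: ['alertname', 'dev', 'instance']  #With this rule, a critical alert suppresses warning alerts that share the same alertname, dev, and instance label values
#Silences
#Silences are configured in the web UI, typically during upgrades or long outages, so that the same alerts are not sent again during that period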
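
The configuration above renders the mail body from a template named "email.htm", but the template file itself is not shown. The helper below writes a minimal one as a sketch; the HTML layout and the function name write_email_template are assumptions, while the template name and the target path come from the configuration above.

```shell
# Write a minimal Alertmanager notification template that defines "email.htm".
write_email_template() (
  dest=$1
  # Quoted heredoc: the Go template syntax must reach the file unchanged.
  cat > "$dest" <<'EOF'
{{ define "email.htm" }}
{{ range .Alerts }}
<p>
  <b>Alert:</b> {{ .Labels.alertname }}<br>
  <b>Severity:</b> {{ .Labels.severity }}<br>
  <b>Instance:</b> {{ .Labels.instance }}<br>
  <b>Summary:</b> {{ .Annotations.summary }}<br>
  <b>Started:</b> {{ .StartsAt }}
</p>
{{ end }}
{{ end }}
EOF
)

# Usage (path as referenced under templates: in alertmanager.yml):
#   write_email_template /usr/local/soft/alertmanager/email.tmpl
#   /usr/local/prometheus/alertmanager/amtool check-config /usr/local/prometheus/alertmanager/alertmanager.yml
```

amtool ships in the Alertmanager tarball; its check-config command validates the configuration and the templates it references.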

###Start / stop Alertmanager manually
cd  /usr/local/prometheus/alertmanager
nohup ./alertmanager --config.file="alertmanager.yml" --web.listen-address=":9093" &

##Force-kill the service
ps -ef |grep alertmanager |grep -v grep  |awk '{print $2}' | xargs kill -9

##Reload the configuration
curl -XPOST http://localhost:9093/-/reload

###Manage Alertmanager with systemd
##Create the systemd unit file

cat > /usr/lib/systemd/system/alertmanager.service   <<'EOF'
[Unit]
Description=The Alertmanager Server
After=network.target
[Service]
ExecStart=/usr/local/prometheus/alertmanager/alertmanager \
  --config.file=/usr/local/prometheus/alertmanager/alertmanager.yml \
  --web.listen-address=0.0.0.0:9093
Restart=on-failure
RestartSec=15s
[Install]
WantedBy=multi-user.target
EOF

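###Reload systemd so it picks up the new unit file
systemctl daemon-reload
##Enable the service at boot
systemctl enable alertmanager
##Start / stop the service
systemctl start alertmanager
systemctl stop alertmanager

Open http://IP:9093 in a browser to reach the web UI.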
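
Prometheus does not send alerts to Alertmanager until a target is listed under alerting in prometheus.yml; in the configuration from section 1 that entry is still the commented placeholder "# - alertmanager:9093". A minimal sketch that turns the placeholder into a live entry, assuming GNU sed and that both services run on the same host; the function name is hypothetical.

```shell
# Replace the commented Alertmanager placeholder in prometheus.yml with a real target.
enable_alertmanager_target() (
  config=$1
  target=${2:-localhost:9093}
  # Keep the original indentation (\1) so the YAML list stays aligned.
  sed -i "s|^\( *\)# - alertmanager:9093|\1- $target|" "$config"
)

# Usage:
#   enable_alertmanager_target /usr/local/prometheus/prometheus/prometheus.yml
#   curl -XPOST http://localhost:9090/-/reload
```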


4. Deploy Grafana

###Download the release tarball
wget https://dl.grafana.com/enterprise/release/grafana-enterprise-10.2.3.linux-amd64.tar.gz
##Or download it elsewhere and upload it to the server
##Extract into the install directory
tar xf grafana-enterprise-10.2.3.linux-amd64.tar.gz -C /usr/local/prometheus/
##Rename the extracted directory
cd /usr/local/prometheus
mv grafana-v10.2.3 grafana

##Start Grafana manually
cd  /usr/local/prometheus/grafana/bin/
nohup /usr/local/prometheus/grafana/bin/grafana-server &
##Force-kill the service
ps -ef |grep grafana |grep -v grep  |awk '{print $2}'|xargs kill -9
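
The other three services also get a systemd unit, so for consistency here is a sketch of one for Grafana, modeled on those units. The unit name grafana.service and the function name write_grafana_unit are assumptions; --homepath points Grafana at the extracted directory so it can find its conf and public folders regardless of the working directory.

```shell
# Write a systemd unit for the tarball-installed Grafana (sketch, modeled on the units above).
write_grafana_unit() (
  dest=${1:-/usr/lib/systemd/system/grafana.service}
  cat > "$dest" <<'EOF'
[Unit]
Description=The Grafana Server
After=network.target
[Service]
WorkingDirectory=/usr/local/prometheus/grafana
ExecStart=/usr/local/prometheus/grafana/bin/grafana-server \
  --homepath=/usr/local/prometheus/grafana
Restart=on-failure
RestartSec=15s
[Install]
WantedBy=multi-user.target
EOF
)

# Usage (as root):
#   write_grafana_unit
#   systemctl daemon-reload
#   systemctl enable grafana
#   systemctl start grafana
```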

Open Grafana in a browser:

http://IP:3000

Default login (user name/password): admin/admin

On first login, Grafana prompts you to change the password.
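
Grafana still needs to be told where Prometheus is. This can be done in the web UI, or with a provisioning file that Grafana reads at startup. A minimal sketch of the latter, assuming Prometheus runs on the same host on port 9090; the function name and the file name prometheus.yaml are arbitrary.

```shell
# Provision Prometheus as Grafana's default data source.
write_prometheus_datasource() (
  home=${1:-/usr/local/prometheus/grafana}   # Grafana home directory
  dir=$home/conf/provisioning/datasources
  mkdir -p "$dir" || exit 1
  cat > "$dir/prometheus.yaml" <<'EOF'
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://localhost:9090
    isDefault: true
EOF
)

# Usage:
#   write_prometheus_datasource
#   (then restart Grafana so it picks up the provisioning file)
```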
