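HA Cluster:
    Cluster types: LB (lvs/nginx (http/upstream, stream/upstream)), HA, HP
    SPoF: Single Point of Failure
    System availability formula: A = MTBF / (MTBF + MTTR)
        A lies in (0, 1), e.g. 95%
        "Number of nines" metric: 99%, 99.5%, ..., 99.999%, 99.9999%
        99% allows 1% downtime; 99.9% allows 0.1% downtime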
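As a rough illustration of the formula, the sketch below computes A from MTBF and MTTR and converts an availability percentage into allowed downtime per year. The function names `availability` and `downtime_minutes_per_year` and the sample figures are assumptions made for this example, not part of the notes.

```shell
#!/bin/bash
# Hypothetical helpers; names and sample values are illustrative only.

# availability MTBF MTTR -> prints A = MTBF / (MTBF + MTTR) as a percentage
availability() {
    awk -v mtbf="$1" -v mttr="$2" 'BEGIN { printf "%.3f\n", 100 * mtbf / (mtbf + mttr) }'
}

# downtime_minutes_per_year PERCENT -> allowed downtime in minutes for a 365-day year
downtime_minutes_per_year() {
    awk -v a="$1" 'BEGIN { printf "%.1f\n", (100 - a) / 100 * 365 * 24 * 60 }'
}

availability 999 1               # MTBF 999 h, MTTR 1 h -> 99.900
downtime_minutes_per_year 99     # -> 5256.0
downtime_minutes_per_year 99.9   # -> 525.6
```

Each extra nine cuts the allowed downtime by a factor of ten, which is why shortening MTTR matters so much at the high end.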
System failures:
    Hardware failures: design defects, wear out, natural disasters, ...
    Software failures: design defects, ...
Improving availability by reducing MTTR:
    Approach: redundancy (redundant)
        active/passive (primary/standby), active/active (dual primary)
        active --> HEARTBEAT --> passive
        active <--> HEARTBEAT <--> active
What is made highly available is the "service":
    HA nginx service:
        vip / nginx process [/ shared storage]
    Resource: a "component" that makes up a highly available service
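keepalived:
    A software implementation of the VRRP protocol, originally designed to make ipvs services highly available. It:
        moves the VIP between nodes via the VRRP protocol;
        generates ipvs rules on the node that holds the VIP (predefined in the configuration file);
        performs health checks on each RS of the ipvs cluster;
        exposes a script-invocation interface: running a script carries out whatever the script defines, which in turn influences cluster behavior;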
Components:
    Core components:
        vrrp stack
        ipvs wrapper
        checkers
    Control component: configuration file parser
    IO multiplexer
    Memory management component
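Prerequisites for configuring an HA Cluster:
    (1) Time must be synchronized across all nodes;
        ntp, chrony
    (2) Make sure iptables and SELinux do not get in the way;
    (3) Nodes should be able to reach each other by hostname (not strictly required for keepalived);
        using the /etc/hosts file is recommended;
    (4) Make sure the interface each node uses for the cluster service supports MULTICAST communication;
        Class D addresses: first octet 224-239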
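Prerequisite (4) can be checked by looking at the interface flags. The sketch below is one way to do it; the helper name `has_multicast` is an assumption made for this example, and the sample line imitates the first line of `ip link show dev ens33` output.

```shell
#!/bin/bash
# has_multicast FLAGS_LINE -> succeeds if the interface flags include MULTICAST.
# FLAGS_LINE is expected to be the first line printed by `ip link show dev IFACE`.
has_multicast() {
    case "$1" in
        *"<"*MULTICAST*">"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Captured sample line (ens33 is the interface name used in the configs in these notes).
sample='2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP'
if has_multicast "$sample"; then
    echo "multicast enabled"
else
    echo "multicast disabled; enable with: ip link set dev ens33 multicast on"
fi
```

On a live node the function could be fed real output, for example `has_multicast "$(ip link show dev ens33 | head -n 1)"`.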
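Installing and configuring keepalived:
    CentOS 6.4+: shipped in the base repository
    Program environment:
        Main configuration file: /etc/keepalived/keepalived.conf
        Main program file: /usr/sbin/keepalived
        Unit file: keepalived.service
        Environment file for the unit file: /etc/sysconfig/keepalived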
Configuration file sections:
    TOP HIERARCHY
        GLOBAL CONFIGURATION
            Global definitions
            Static routes/addresses
        VRRPD CONFIGURATION
            VRRP synchronization group(s): VRRP sync groups;
            VRRP instance(s): each VRRP instance is one VRRP router;
        LVS CONFIGURATION
            Virtual server group(s)
            Virtual server(s): the VS and RS of the ipvs cluster;
Example: a highly available ipvs cluster
! Configuration File for keepalived
global_defs {
    notification_email {
        root@xiang
    }
    notification_email_from keepalived@xiang
    smtp_server 127.0.0.1
    smtp_connect_timeout 30
    router_id xiang
    vrrp_mcast_group4 224.1.101.33
}
vrrp_instance VI_1 {
    state MASTER
    interface ens33
    virtual_router_id 33
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass kav5hsNF
    }
    virtual_ipaddress {
        192.168.0.111/16 dev ens33 label ens33:0
    }
    notify_master "/etc/keepalived/notify.sh master"
    notify_backup "/etc/keepalived/notify.sh backup"
    notify_fault "/etc/keepalived/notify.sh fault"
}
Using the notification script
Sample notification script:
#!/bin/bash
# Mail a notice whenever this node's VRRP state changes.
contact='root@localhost'
notify() {
    local mailsubject="$(hostname) to be $1, vip floating"
    local mailbody="$(date +'%F %T'): vrrp transition, $(hostname) changed to be $1"
    echo "$mailbody" | mail -s "$mailsubject" "$contact"
}
case $1 in
master)
    notify master
    ;;
backup)
    notify backup
    ;;
fault)
    notify fault
    ;;
*)
    echo "Usage: $(basename $0) {master|backup|fault}"
    exit 1
    ;;
esac
How the script is invoked:
    notify_master "/etc/keepalived/notify.sh master"
    notify_backup "/etc/keepalived/notify.sh backup"
    notify_fault "/etc/keepalived/notify.sh fault"
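When no mail setup is available, a variant of the notification logic that appends to a log file can be easier to try out. The following is only a sketch: `notify_log` and the `NOTIFY_LOG` variable are names invented for this example and are not keepalived settings.

```shell
#!/bin/bash
# Hypothetical variant of notify(): append to a log file instead of sending mail.
# NOTIFY_LOG is an assumed variable name; the default path is a placeholder.
NOTIFY_LOG=${NOTIFY_LOG:-/tmp/keepalived-notify.log}

notify_log() {
    local state=$1
    case $state in
        master|backup|fault)
            # Same message body as the mail-based script in the notes.
            echo "$(date +'%F %T'): vrrp transition, $(hostname) changed to be $state" >> "$NOTIFY_LOG"
            ;;
        *)
            echo "Usage: notify_log {master|backup|fault}" >&2
            return 1
            ;;
    esac
}

# Simulate a transition to master and show the resulting log line.
notify_log master && tail -n 1 "$NOTIFY_LOG"
```

Running the script by hand with each of the three states is a quick way to check it before wiring it into `notify_master`, `notify_backup` and `notify_fault`.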
Virtual server:
    Configuration syntax:
        virtual_server IP port |
        virtual_server fwmark int
        {
            ....
            real_server {
                ....
            }
            ....
        }
Common parameters:
    delay_loop <INT>: interval between service polls;
    lb_algo rr | wrr | lc | wlc | lblc | sh | dh: scheduling method;
    lb_kind NAT | DR | TUN: cluster type;
    persistence_timeout <INT>: persistent connection duration;
    protocol TCP: service protocol; only TCP is supported;
    sorry_server <IPADDR> <PORT>: fallback server address;
    real_server <IPADDR> <PORT>
    {
        weight <INT>
        notify_up <STRING> | <QUOTED-STRING>
        notify_down <STRING> | <QUOTED-STRING>
        HTTP_GET | SSL_GET | TCP_CHECK | SMTP_CHECK | MISC_CHECK { ... }: health check method for this host
    }
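To see how these parameters fit together, the sketch below prints a complete `virtual_server` block for a VIP and a list of real servers. It uses only directives listed in these notes; the generator function `gen_virtual_server`, the chosen values (`delay_loop 2`, `lb_algo rr`, `lb_kind DR`) and the addresses are assumptions made for illustration.

```shell
#!/bin/bash
# gen_virtual_server VIP PORT RS_IP... -> print a virtual_server block on stdout.
# Hypothetical helper; all values below are placeholders, not recommendations.
gen_virtual_server() {
    local vip=$1 port=$2 rs
    shift 2
    echo "virtual_server $vip $port {"
    echo "    delay_loop 2"
    echo "    lb_algo rr"
    echo "    lb_kind DR"
    echo "    protocol TCP"
    # One real_server block per remaining argument, each with a TCP health check.
    for rs in "$@"; do
        echo "    real_server $rs $port {"
        echo "        weight 1"
        echo "        TCP_CHECK {"
        echo "            connect_timeout 3"
        echo "        }"
        echo "    }"
    done
    echo "}"
}

gen_virtual_server 10.1.0.93 80 10.1.0.71 10.1.0.72
```

The printed block would be appended to /etc/keepalived/keepalived.conf after review; generating it keeps the per-RS sections identical.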
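HTTP_GET | SSL_GET: application-layer checks
    HTTP_GET | SSL_GET {
        url {
            path <URL_PATH>: the URL to monitor;
            status_code <INT>: response code that counts as healthy;
            digest <STRING>: checksum of the response content that counts as healthy;
        }
        nb_get_retry <INT>: number of retries;
        delay_before_retry <INT>: delay before each retry;
        connect_ip <IP ADDRESS>: which IP address of this RS receives the health check request;
        connect_port <PORT>: which port of this RS receives the health check request;
        bindto <IP ADDRESS>: source address used when sending the health check request;
        bind_port <PORT>: source port used when sending the health check request;
        connect_timeout <INTEGER>: timeout for the connection request;
    }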
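The two `url` criteria can be reproduced by hand when debugging a check. The sketch below assumes that the `digest` value is the MD5 of the response body, as produced by keepalived's genhash tool; that assumption, and the helper names `body_digest`, `http_status` and `is_healthy`, are not taken from the notes.

```shell
#!/bin/bash
# body_digest: read a response body on stdin and print its MD5 hex digest
# (assumed here to correspond to the `digest` value).
body_digest() {
    md5sum | awk '{ print $1 }'
}

# http_status URL [TIMEOUT]: print the HTTP status code returned for URL (needs curl).
http_status() {
    curl -s -o /dev/null -w '%{http_code}' --connect-timeout "${2:-3}" "$1"
}

# is_healthy ACTUAL_STATUS EXPECTED_STATUS: mirrors the status_code comparison.
is_healthy() {
    [ "$1" = "$2" ]
}

printf 'hello' | body_digest   # -> 5d41402abc4b2a76b9719d911017c592
```

Against a real server this might be used as `is_healthy "$(http_status http://10.1.0.71/)" 200`, where the address is a placeholder.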
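TCP_CHECK {
    connect_ip <IP ADDRESS>: which IP address of this RS receives the health check request;
    connect_port <PORT>: which port of this RS receives the health check request;
    bindto <IP ADDRESS>: source address used when sending the health check request;
    bind_port <PORT>: source port used when sending the health check request;
    connect_timeout <INTEGER>: timeout for the connection request;
}
TCP_CHECK usage example:
TCP_CHECK {
    nb_get_retry 3
    delay_before_retry 2
    connect_timeout 3
}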
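What a TCP check does can be approximated from the shell, which helps when deciding timeout and retry values. This is a rough analogue, not keepalived's implementation: `tcp_check` and `tcp_check_retry` are names invented for this sketch, and it relies on bash's /dev/tcp pseudo-device and the coreutils `timeout` command.

```shell
#!/bin/bash
# tcp_check IP PORT [TIMEOUT_SECONDS]: succeed if a TCP connection can be opened.
tcp_check() {
    local ip=$1 port=$2 timeout_s=${3:-3}
    timeout "$timeout_s" bash -c "exec 3<>/dev/tcp/$ip/$port" 2>/dev/null
}

# tcp_check_retry IP PORT RETRIES DELAY TIMEOUT: one attempt plus up to RETRIES
# further attempts, sleeping DELAY seconds between them.
tcp_check_retry() {
    local ip=$1 port=$2 retries=$3 delay=$4 timeout_s=$5 i
    for (( i = 0; i <= retries; i++ )); do
        tcp_check "$ip" "$port" "$timeout_s" && return 0
        (( i < retries )) && sleep "$delay"
    done
    return 1
}

# Quick single probe against the local machine.
if tcp_check 127.0.0.1 80 1; then echo "port 80 reachable"; else echo "port 80 not reachable"; fi
```

A call such as `tcp_check_retry 10.1.0.71 80 3 2 3` (placeholder address) would mirror the retry, delay and timeout values of the example above.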
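keepalived can call external helper scripts to monitor resources and dynamically adjust the node's priority according to the monitoring result.
Two steps: (1) define a script; (2) reference the script in a vrrp instance.
vrrp_script <SCRIPT_NAME> {
    script ""
    interval INT
    weight -INT
}
track_script {
    SCRIPT_NAME_1
    SCRIPT_NAME_2
    ...
}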
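The effect of a negative `weight` can be worked through with a little arithmetic. The sketch below models only that case (a failing script lowers the priority by the absolute value of the weight); the function name `effective_priority` and the backup priority of 96 are assumptions made for this example.

```shell
#!/bin/bash
# effective_priority BASE WEIGHT SCRIPT_EXIT
# Models only negative weights: a non-zero script exit lowers the priority by |WEIGHT|.
effective_priority() {
    local base=$1 weight=$2 script_exit=$3
    if (( weight < 0 && script_exit != 0 )); then
        echo $(( base + weight ))
    else
        echo "$base"
    fi
}

master=$(effective_priority 100 -5 1)   # tracked script failed -> 95
backup=96                               # hypothetical peer priority
if (( master < backup )); then
    echo "master drops to $master, backup ($backup) takes over the VIP"
else
    echo "master stays at $master, no failover"
fi
```

The takeaway is that the absolute weight has to exceed the priority gap between the two nodes; with a backup priority of 90, a weight of -5 would never trigger a failover.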
Example: a highly available nginx service
! Configuration File for keepalived
global_defs {
    notification_email {
        root@xiang
    }
    notification_email_from keepalived@xiang
    smtp_server 127.0.0.1
    smtp_connect_timeout 30
    router_id xiang
    vrrp_mcast_group4 224.0.100.19
}
vrrp_script chk_down {
    script "[[ -f /etc/keepalived/down ]] && exit 1 || exit 0"
    interval 1
    weight -5
}
vrrp_script chk_nginx {
    script "killall -0 nginx && exit 0 || exit 1"
    interval 1
    weight -5
    fall 2
    rise 1
}
vrrp_instance VI_1 {
    state MASTER
    interface ens33
    virtual_router_id 14
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 561f97b2
    }
    virtual_ipaddress {
        10.1.0.93/16 dev ens33
    }
    track_script {
        chk_down
        chk_nginx
    }
    notify_master "/etc/keepalived/notify.sh master"
    notify_backup "/etc/keepalived/notify.sh backup"
    notify_fault "/etc/keepalived/notify.sh fault"
}
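The `chk_down` check acts as a manual maintenance switch: creating the flag file makes the check fail, which lowers the node's priority. The sketch below reproduces that logic as a function so it can be tried against a temporary file; the function form and the optional path argument are assumptions made for this example.

```shell
#!/bin/bash
# chk_down [FLAG_FILE]: return 1 if the flag file exists, 0 otherwise.
# Defaults to the path used in the configuration above.
chk_down() {
    local flag=${1:-/etc/keepalived/down}
    if [[ -f $flag ]]; then
        return 1
    fi
    return 0
}

flag=$(mktemp)   # temporary stand-in for /etc/keepalived/down
if chk_down "$flag"; then
    echo "no flag: node stays in service"
else
    echo "flag present: check fails, priority is lowered"
fi
rm -f "$flag"
```

On a real node, `touch /etc/keepalived/down` would force the failover and removing the file would let the node recover its priority.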