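Using Alibaba Cloud HaVip for keepalived high availability

0 Preface

keepalived is a high-availability solution built on VRRP and originally designed to work with LVS; it is straightforward to deploy on physical machines.

In a cloud environment, however, the security restrictions of the VPC network mean it cannot be deployed directly.

Recently the major cloud vendors have been beta testing an HaVip feature. The rest of this post uses Alibaba Cloud as the example.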

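1 Apply for HaVip access

Alibaba Cloud's HaVip is currently in private beta, so access has to be requested separately.

Beta requests are usually approved the same day.

Once approved, the HaVip entry appears under VPC in the console.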

2 Environment setup

Deploy 2 ECS instances in the same vSwitch of the same VPC. Their IPs are:

  • 172.20.1.34
  • 172.20.1.35

Install keepalived and lighttpd on both machines

sudo apt-get install -y keepalived lighttpd

Check that lighttpd started correctly

curl "127.0.0.1"

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<title>Welcome page 1</title>
....

To make the experiment easier to follow, I edited the title in /var/www/html/index.lighttpd.html: page 1 on one machine, page 2 on the other.
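
The title edit can also be scripted instead of done by hand. This is only a minimal sketch: it assumes GNU sed and a page whose <title> element sits on a single line, and the helper name set_title is made up for illustration.

```shell
#!/bin/bash
# set_title FILE N: rewrite the <title> element of FILE to "Welcome page N".
# Hypothetical helper; assumes GNU sed and <title>...</title> on one line.
set_title() {
    local file="$1" n="$2"
    sed -i "s|<title>.*</title>|<title>Welcome page ${n}</title>|" "$file"
}

# Usage (as root), 1 on the first machine and 2 on the second:
#   set_title /var/www/html/index.lighttpd.html 1
```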

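3 Create the HaVip

Be sure to choose the same vSwitch as the ECS instances above, and specify an IP that does not conflict with the existing machines. I picked 172.20.1.10:

After it is created, open the HaVip and bind both ECS instances to it.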

4 Configure keepalived

/etc/keepalived/keepalived.conf

First machine

sudo chmod a+x /etc/keepalived/notify_action.sh
! Configuration File for keepalived
global_defs {
   notification_email {
     acassen@firewall.loc
     failover@firewall.loc
     sysadmin@firewall.loc
   }
   notification_email_from Alexandre.Cassen@firewall.loc
   smtp_server 192.168.200.1
   smtp_connect_timeout 30
   router_id LVS_DEVEL
   vrrp_skip_check_adv_addr
   vrrp_garp_interval 0
   vrrp_gna_interval 0
}

vrrp_script chk_http {
    script "</dev/tcp/127.0.0.1/80"
    interval 1
    weight -30 # priority penalty when the check fails; must exceed the priority gap between the two nodes
}

vrrp_instance VI_1 {
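    state MASTER                # make ECS1 the master instance
    interface eth0              # NIC name; eth0 in this example
    virtual_router_id 51
    nopreempt                   # only takes effect when state is BACKUP
    priority 90                 # higher number = higher priority; must be above the peer's 80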
    advert_int 1        
    authentication {
        auth_type PASS
        auth_pass mypass
    }
    unicast_src_ip 172.20.1.34  # IP of this ECS instance
    unicast_peer {
        172.20.1.35             # IP of the peer ECS instance
    }
    virtual_ipaddress {
        172.20.1.10             # IP address of the HaVip
    }
    notify_master "/etc/keepalived/notify_action.sh MASTER"
    notify_backup "/etc/keepalived/notify_action.sh BACKUP"
    notify_fault "/etc/keepalived/notify_action.sh FAULT"
    notify_stop "/etc/keepalived/notify_action.sh STOP"
    garp_master_delay 1
    garp_master_refresh 5

    track_interface {
        eth0                    # NIC name of the ECS instance; eth0 in this example
    }

    track_script {
        chk_http 
    }
}

Second machine

sudo chmod a+x /etc/keepalived/notify_action.sh
! Configuration File for keepalived
global_defs {
   notification_email {
     acassen@firewall.loc
     failover@firewall.loc
     sysadmin@firewall.loc
   }
   notification_email_from Alexandre.Cassen@firewall.loc
   smtp_server 192.168.200.1
   smtp_connect_timeout 30
   router_id LVS_DEVEL
   vrrp_skip_check_adv_addr
   vrrp_garp_interval 0
   vrrp_gna_interval 0
}

vrrp_script chk_http {
    script "</dev/tcp/127.0.0.1/80"
    interval 1
    weight -30  # priority penalty when the check fails; must exceed the priority gap between the two nodes
}

vrrp_instance VI_1 {
    state BACKUP                # make ECS2 the backup instance
    interface eth0              # NIC name; eth0 in this example
    virtual_router_id 51
    nopreempt
    priority 80                 # higher number = higher priority
    advert_int 1        
    authentication {
        auth_type PASS
        auth_pass mypass
    }
    unicast_src_ip 172.20.1.35  # IP of this ECS instance
    unicast_peer {
        172.20.1.34             # IP of the peer ECS instance
    }
    virtual_ipaddress {
        172.20.1.10             # IP address of the HaVip
    }
    notify_master "/etc/keepalived/notify_action.sh MASTER"
    notify_backup "/etc/keepalived/notify_action.sh BACKUP"
    notify_fault "/etc/keepalived/notify_action.sh FAULT"
    notify_stop "/etc/keepalived/notify_action.sh STOP"
    garp_master_delay 1
    garp_master_refresh 5

    track_interface {
        eth0                    # NIC name of the ECS instance; eth0 in this example
    }

    track_script {
        chk_http 
    }
}
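
The comment on weight says the penalty must exceed the priority gap between the two nodes. The arithmetic behind that rule can be sketched as follows. The function name effective_priority and the input values are illustrative only; the sketch assumes keepalived's behaviour of adding a negative weight to the base priority while the tracked script fails.

```shell
#!/bin/bash
# effective_priority BASE WEIGHT CHECK_OK
# Prints the priority a node advertises: BASE when the tracked script
# succeeds (CHECK_OK=1), BASE + WEIGHT when it fails (WEIGHT is negative).
# Hypothetical helper for illustration; not part of keepalived.
effective_priority() {
    local base="$1" weight="$2" ok="$3"
    if [ "$ok" -eq 1 ]; then
        echo "$base"
    else
        echo $(( base + weight ))
    fi
}

# Illustrative values: nodes at 80 and 70, penalty -30.
high=$(effective_priority 80 -30 0)   # higher node, port 80 down
low=$(effective_priority 70 -30 1)    # lower node, still healthy
echo "failed high node: $high, healthy low node: $low"
```

Because the failed node drops below the healthy one, the healthy node wins the election. With a penalty smaller than the gap, the failed node would still advertise the higher priority and keep the HaVip.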

The failover notification script /etc/keepalived/notify_action.sh is kept simple here; it only writes a log entry and records the current state.

#!/bin/bash
# Log keepalived state transitions and record the current state on disk.
log_file=/var/log/keepalived.log
state_dir=/var/keepalived

log_write()
{
    # Append (>>) so that earlier transitions are not overwritten
    echo "[$(date '+%Y-%m-%d %T')] $1" >> "$log_file"
}

[ ! -d "$state_dir" ] && mkdir -p "$state_dir"

case "$1" in
"MASTER" )
    echo -n "$1" > "$state_dir/state"
    log_write " notify_master"
    echo -n "0" > "$state_dir/vip_check_failed_count"
    ;;
"BACKUP" )
    echo -n "$1" > "$state_dir/state"
    log_write " notify_backup"
    ;;
"FAULT" )
    echo -n "$1" > "$state_dir/state"
    log_write " notify_fault"
    ;;
"STOP" )
    echo -n "$1" > "$state_dir/state"
    log_write " notify_stop"
    ;;
*)
    log_write "notify_action.sh: STATE ERROR!!!"
    ;;
esac

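5 Restart and check

Restart the keepalived service on both machines and verify the state. Because machine 1 is configured with the higher priority, machine 1 becomes master and machine 2 is the backup.

Machine 1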

sudo service keepalived status
● keepalived.service - Keepalive Daemon (LVS and VRRP)
     Loaded: loaded (/lib/systemd/system/keepalived.service; enabled; vendor preset: enabled)
     Active: active (running) since Sat 2021-09-18 03:53:15 CST; 9min ago
   Main PID: 537 (keepalived)
      Tasks: 2 (limit: 4482)
     Memory: 5.7M
     CGroup: /system.slice/keepalived.service
             ├─537 /usr/sbin/keepalived --dont-fork
             └─567 /usr/sbin/keepalived --dont-fork

Sep 18 03:53:18 h1 Keepalived_vrrp[567]: (VI_1) received lower priority (40) advert from 172.20.1.35 - discarding
Sep 18 03:53:19 h1 Keepalived_vrrp[567]: (VI_1) received lower priority (40) advert from 172.20.1.35 - discarding
Sep 18 03:53:19 h1 Keepalived_vrrp[567]: (VI_1) Entering MASTER STATE
Sep 18 03:53:49 h1 Keepalived_vrrp[567]: Netlink reports eth0 down
Sep 18 03:53:49 h1 Keepalived_vrrp[567]: (VI_1) Entering FAULT STATE
Sep 18 03:53:49 h1 Keepalived_vrrp[567]: (VI_1) sent 0 priority
Sep 18 03:53:49 h1 Keepalived_vrrp[567]: Netlink reports eth0 up
Sep 18 03:53:49 h1 Keepalived_vrrp[567]: (VI_1) Entering BACKUP STATE
Sep 18 03:53:52 h1 Keepalived_vrrp[567]: (VI_1) received lower priority (40) advert from 172.20.1.35 - discarding
Sep 18 03:53:53 h1 Keepalived_vrrp[567]: (VI_1) Entering MASTER STATE

 

Machine 2

● keepalived.service - Keepalive Daemon (LVS and VRRP)
     Loaded: loaded (/lib/systemd/system/keepalived.service; enabled; vendor preset: enabled)
     Active: active (running) since Sat 2021-09-18 03:02:17 CST; 2s ago
   Main PID: 3688 (keepalived)
      Tasks: 2 (limit: 4482)
     Memory: 1.8M
     CGroup: /system.slice/keepalived.service
             ├─3688 /usr/sbin/keepalived --dont-fork
             └─3699 /usr/sbin/keepalived --dont-fork

Sep 18 03:02:17 h2 Keepalived[3688]: (Line 14) vrrp_gna_interval '0' is invalid
Sep 18 03:02:17 h2 Keepalived[3688]: Starting VRRP child process, pid=3699
Sep 18 03:02:17 h2 Keepalived_vrrp[3699]: Registering Kernel netlink reflector
Sep 18 03:02:17 h2 Keepalived_vrrp[3699]: Registering Kernel netlink command channel
Sep 18 03:02:17 h2 Keepalived_vrrp[3699]: Opening file '/etc/keepalived/keepalived.conf'.
Sep 18 03:02:17 h2 Keepalived_vrrp[3699]: WARNING - default user 'keepalived_script' for script execution does not exist - please create.
Sep 18 03:02:17 h2 Keepalived_vrrp[3699]: SECURITY VIOLATION - scripts are being executed but script_security not enabled.
Sep 18 03:02:17 h2 Keepalived_vrrp[3699]: (VI_1) Ignoring track_interface eth0 since own interface
Sep 18 03:02:17 h2 Keepalived_vrrp[3699]: Registering gratuitous ARP shared channel
Sep 18 03:02:17 h2 Keepalived_vrrp[3699]: (VI_1) Entering BACKUP STATE (init)

Requests sent to the HaVip succeed, and they are currently served by machine 1:

curl "172.20.1.10"
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<title>Welcome page 1</title>
...
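
Besides checking the page title, you can confirm which node currently holds the address by looking at its interface. A minimal sketch, assuming the iproute2 ip tool is installed; the helper name holds_vip is made up for illustration.

```shell
#!/bin/bash
# holds_vip VIP: succeeds if VIP appears as an IPv4 address in the
# `ip -4 -o addr show` output supplied on stdin.
# Hypothetical helper; the trailing "/" avoids matching 172.20.1.100 etc.
holds_vip() {
    local vip="$1"
    grep -qF "inet ${vip}/"
}

# Usage on either machine:
#   ip -4 -o addr show dev eth0 | holds_vip 172.20.1.10 \
#       && echo "this node holds the HaVip"
```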

6 Verify high availability

Stop keepalived on machine 1

Requests to the HaVip still succeed

The test script:

#!/bin/bash
while true
do
  curl "172.20.1.10"
  sleep 1
done
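
The raw loop prints whole pages, which makes the moment of failover hard to spot. A variant that prints one timestamped line per request might look like this. It is a sketch only: the helper names extract_title and probe are made up, and the 2-second timeout is an arbitrary choice.

```shell
#!/bin/bash
# extract_title: print the text of the first <title> element read from stdin.
extract_title() {
    sed -n 's|.*<title>\(.*\)</title>.*|\1|p' | head -n 1
}

# probe URL: print a timestamped line with the page title, or FAILED if no
# title comes back within 2 seconds (timeout value is an assumption).
probe() {
    local url="$1" title
    title=$(curl -s --max-time 2 "$url" | extract_title)
    if [ -n "$title" ]; then
        echo "$(date '+%T') $title"
    else
        echo "$(date '+%T') FAILED"
    fi
}

# Usage: while true; do probe "172.20.1.10"; sleep 1; done
```

During a failover the output should switch from one page title to the other, with any FAILED lines in between showing how long the HaVip was unreachable.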

Watching the logs at this point also shows that a failover took place.

Requests are now served by page 2

curl "172.20.1.10" | grep Wel
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  3373  100  3373    0     0  3293k      0 --:--:-- --:--:-- --:--:-- 3293k
<title>Welcome page 2</title>

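In my test, when port 80 on machine 1 goes down, keepalived on machine 1 also exits automatically. After restoring port 80 you therefore need to restart keepalived so that machine 1 can take master back.

You can also test by bouncing the eth interface; the effect is the same.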

ifconfig eth0 down && ifconfig eth0 up

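If you do not want a recovered node to preempt again, configure both nodes with the same priority, state BACKUP on both, and the nopreempt parameter.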
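
Spelled out, that non-preempting variant would change only the lines below in vrrp_instance VI_1, identically on both machines. This is a sketch of the settings just listed, not a tested configuration, and the priority value 80 is an arbitrary example.

```
vrrp_instance VI_1 {
    state BACKUP        # both nodes start as BACKUP; nopreempt requires this
    nopreempt           # a recovered node does not take master back
    priority 80         # same value on both nodes (80 is an arbitrary example)
    # ... remaining settings unchanged from section 4 ...
}
```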
