This article walks through the steps for setting up Redis Sentinel on top of a master-replica deployment, with the commands and configuration used at each step.
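1. Overview of Redis Sentinel
(1) What Sentinel is
Sentinel is a distributed system whose processes monitor the state of the master in a Redis master-replica deployment. When the master fails, Sentinel moves the master role to a replica (slave) within seconds, so the deployment always has a master. This is what gives Redis high availability. Sentinel was introduced in Redis 2.6 and became stable in Redis 2.8.
Difference between Redis Sentinel and plain master-replica replication:
With Sentinel: after the master fails, a replica is promoted to take its place.
With replication only: after the master fails, the replicas take no action.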
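(2) How Sentinel works
In this lab a single Sentinel is deployed on the master host. For production use, the Redis documentation recommends at least three Sentinel instances on independent machines, because a lone Sentinel on the master host goes down together with that host.
Functions:
Monitoring: Sentinels continuously check whether every server in the deployment is working properly, and share what they observe with each other through a gossip protocol.
Notification: when a monitored Redis server has a problem, Sentinel can notify an administrator or another application through an API.
Automatic failover: when the master fails, the Sentinels use an agreement protocol (voting) to start an automatic failover. They choose a replica with relatively complete data and promote it to master. Clients that ask Sentinel for the master's address are then given the address of the new master, so the new master replaces the failed one.
After a master-replica switch, the Redis configuration files of the master and the replicas and the Sentinel configuration file are all rewritten accordingly. For example, the old master's Redis configuration file gains a line that makes it a replica of the new master, and Sentinel's monitoring target changes to the new master.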
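The choice of "a replica with relatively complete data" can be sketched as follows. This is a minimal illustration, not Sentinel's actual code: the `Replica` class and `select_replica` function are hypothetical names, and the ordering (lower replica-priority first, then the larger replication offset, then the smaller run ID, with priority 0 never promoted) follows the rule described in the Redis Sentinel documentation.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Replica:
    """Hypothetical snapshot of what a Sentinel knows about one replica."""
    addr: str
    priority: int      # replica-priority from redis.conf; 0 means "never promote"
    repl_offset: int   # how much of the master's replication stream was processed
    run_id: str
    reachable: bool = True


def select_replica(replicas: Sequence[Replica]) -> Optional[Replica]:
    """Return the preferred promotion candidate, or None if none qualifies."""
    candidates = [r for r in replicas if r.reachable and r.priority > 0]
    if not candidates:
        return None
    # Lower priority wins, then the larger offset, then the smaller run ID.
    return min(candidates, key=lambda r: (r.priority, -r.repl_offset, r.run_id))


# Illustrative values only: the offsets and run IDs are made up.
replicas = [
    Replica("192.168.100.203:6379", priority=100, repl_offset=5120, run_id="b2"),
    Replica("192.168.100.204:6379", priority=100, repl_offset=4096, run_id="a1"),
]
chosen = select_replica(replicas)
print(chosen.addr if chosen else "no suitable replica")
```

With equal priorities, the replica that has processed more of the replication stream is preferred, which is what "more complete data" means in practice.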
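(3) Sentinel's three periodic monitoring tasks
Every 10 seconds, each Sentinel sends the INFO command to the master and the replicas to obtain information about the Redis data nodes.
Purpose:
Running INFO against the master returns the list of its replicas, which is why Sentinel does not need the replicas to be configured explicitly. A newly added replica is discovered automatically, and after a node becomes unreachable or a failover completes, INFO keeps the topology information up to date.
Every 2 seconds, each Sentinel publishes its own view of the master to the __sentinel__:hello channel on the master and the replicas. This is how Sentinels discover one another and exchange information.
Every 1 second, each Sentinel sends a PING command to the master, the replicas and the other Sentinels as a heartbeat check, to confirm that these nodes are currently reachable. If the master goes down, Sentinel chooses a replica with relatively complete data from the remaining replicas to become the new master.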
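The heartbeat check leads to two states that appear later in the Sentinel log: subjectively down (sdown, one Sentinel's own opinion) and objectively down (odown, enough Sentinels agree). The sketch below shows the idea under simplifying assumptions: the function names and the vote dictionary are hypothetical, and timestamps are plain millisecond integers.

```python
from typing import Dict


def is_subjectively_down(last_valid_reply_ms: int, now_ms: int,
                         down_after_ms: int) -> bool:
    """SDOWN: no valid PING reply for longer than down-after-milliseconds."""
    return now_ms - last_valid_reply_ms > down_after_ms


def is_objectively_down(sdown_votes: Dict[str, bool], quorum: int) -> bool:
    """ODOWN: at least `quorum` Sentinels report the master as SDOWN."""
    return sum(1 for down in sdown_votes.values() if down) >= quorum


# The master last answered at t=0 ms, it is now t=3500 ms, and the
# threshold is 3000 ms (an illustrative down-after-milliseconds value).
print(is_subjectively_down(0, 3500, 3000))

# With one Sentinel and quorum 1, its own opinion is enough for ODOWN.
# "sentinel-1" is a made-up label for illustration.
print(is_objectively_down({"sentinel-1": True}, quorum=1))
```

With several Sentinels and a higher quorum, a single Sentinel that loses its network link cannot mark the master as objectively down on its own.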
2. Deploying Redis Sentinel
(1) Lab environment
System      IP               Hostname  Redis version  Ports                        Role
CentOS 7.4  192.168.100.202  master    redis-5.0.4    redis:6379 sentinel:26379    master
CentOS 7.4  192.168.100.203  slave1    redis-5.0.4    redis:6379                   slave
CentOS 7.4  192.168.100.204  slave2    redis-5.0.4    redis:6379                   slave
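(2) Steps
- Install Redis on every server
The installation steps are identical on all three hosts apart from the hostname and IP address, so only the master is shown below.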
[root@centos7 ~]# hostnamectl set-hostname master
[root@centos7 ~]# su
[root@master ~]# systemctl stop firewalld
[root@master ~]# setenforce 0
setenforce: SELinux is disabled
[root@master ~]# mount /dev/cdrom /mnt/
mount: /dev/sr0 is write-protected, mounting read-only
mount: /dev/sr0 is already mounted or /mnt busy
       /dev/sr0 is already mounted on /mnt
[root@master ~]# ll
total 1928
-rw-------. 1 root root    1264 Jan 12 18:27 anaconda-ks.cfg
-rw-r--r--  1 root root 1966337 Jun  9 01:16 redis-5.0.4.tar.gz
[root@master ~]# tar xf redis-5.0.4.tar.gz
[root@master ~]# cd redis-5.0.4
[root@master redis-5.0.4]# make
[root@master redis-5.0.4]# mkdir -p /usr/local/redis
[root@master redis-5.0.4]# cp /root/redis-5.0.4/src/redis-server /usr/local/redis/
[root@master redis-5.0.4]# cp /root/redis-5.0.4/src/redis-cli /usr/local/redis/
[root@master redis-5.0.4]# cp /root/redis-5.0.4/redis.conf /usr/local/redis/
[root@master redis-5.0.4]# vim /usr/local/redis/redis.conf
Change the settings below. The explanations are written as separate comment lines because redis.conf does not allow a comment on the same line as a directive.
......
# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# Change to this host's own address; with 127.0.0.1 only local clients can connect
bind 192.168.100.202
......
# are explicitly listed using the "bind" directive.
# Turn off protected mode. With yes, Redis refuses remote clients when neither bind nor a password is configured
protected-mode no
......
# Note that Redis will write a pid file in /var/run/redis.pid when daemonized.
# Run Redis as a background daemon
daemonize yes
......
# Note that you must specify a directory here, not a file name.
dir /usr/local/redis/rdb
......
# Uncomment this line and set the Redis password to 123123
requirepass 123123
......
Save and exit, then create the data directory and the init script:
[root@master redis-5.0.4]# mkdir /usr/local/redis/rdb
[root@master redis-5.0.4]# vim /etc/init.d/redis
#!/bin/sh
# chkconfig: 2345 80 90
# description: Start and Stop redis
#PATH=/usr/local/bin:/sbin:/usr/bin:/bin
REDISPORT=6379
EXEC=/usr/local/redis/redis-server
REDIS_CLI=/usr/local/redis/redis-cli
PIDFILE=/var/run/redis_6379.pid
CONF="/usr/local/redis/redis.conf"
AUTH="123123"
# Address redis-server is listening on (field 4 of the netstat output is ip:port)
LISTEN_IP=$(netstat -utpln | grep redis-server | awk '{print $4}' | awk -F':' '{print $1}')

case "$1" in
    start)
        if [ -f $PIDFILE ]
        then
            echo "$PIDFILE exists, process is already running or crashed"
        else
            echo "Starting Redis server..."
            $EXEC $CONF
        fi
        if [ "$?" = "0" ]
        then
            echo "Redis is running..."
        fi
        ;;
    stop)
        if [ ! -f $PIDFILE ]
        then
            echo "$PIDFILE does not exist, process is not running"
        else
            PID=$(cat $PIDFILE)
            echo "Stopping ..."
            $REDIS_CLI -h $LISTEN_IP -p $REDISPORT -a $AUTH SHUTDOWN
            # Wait until the process has really exited
            while [ -x /proc/${PID} ]
            do
                echo "Waiting for Redis to shutdown ..."
                sleep 1
            done
            echo "Redis stopped"
        fi
        ;;
    restart|force-reload)
        ${0} stop
        ${0} start
        ;;
    *)
        echo "Usage: /etc/init.d/redis {start|stop|restart|force-reload}" >&2
        exit 1
        ;;
esac
[root@master redis-5.0.4]# chkconfig --add redis
[root@master redis-5.0.4]# chmod 755 /etc/init.d/redis
[root@master redis-5.0.4]# ln -s /usr/local/redis/* /usr/local/bin/
[root@master redis-5.0.4]# /etc/init.d/redis start
Starting Redis server...
5233:C 09 Jun 2021 01:25:53.069 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
5233:C 09 Jun 2021 01:25:53.069 # Redis version=5.0.4, bits=64, commit=00000000, modified=0, pid=5233, just started
5233:C 09 Jun 2021 01:25:53.069 # Configuration loaded
Redis is running...
[root@master redis-5.0.4]# netstat -anpt | grep 6379
tcp        0      0 192.168.100.202:6379    0.0.0.0:*               LISTEN      5234/redis-server 1
- Set up Redis master-replica replication
(1) Master configuration
[root@master redis-5.0.4]# vim /usr/local/redis/redis.conf
......
# Password used to authenticate against a master. Set it on the current master as well:
# after a failover the old master rejoins as a replica of the new master and has to
# authenticate, and without masterauth it cannot rejoin. With Sentinel, configure this
# on every node, and preferably use the same password everywhere.
masterauth 123123
......
# Minimum number of healthy replicas; with fewer than this, the master refuses all client writes
min-replicas-to-write 1
# Maximum replication lag in seconds; a replica lagging more than this no longer counts as healthy
min-replicas-max-lag 10
......
[root@master redis-5.0.4]# /etc/init.d/redis restart    # restart Redis
Stopping ...
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
Redis stopped
Starting Redis server...
5291:C 09 Jun 2021 02:04:39.132 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
5291:C 09 Jun 2021 02:04:39.132 # Redis version=5.0.4, bits=64, commit=00000000, modified=0, pid=5291, just started
5291:C 09 Jun 2021 02:04:39.132 # Configuration loaded
Redis is running...
(2) slave1 configuration
[root@slave1 redis-5.0.4]# vim /usr/local/redis/redis.conf
......
# On the replica, point to the master's IP and port
replicaof 192.168.100.202 6379
......
# Password of Redis on the master
masterauth 123123
......
Save and exit.
[root@slave1 redis-5.0.4]# /etc/init.d/redis restart    # restart the service
Stopping ...
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
Redis stopped
Starting Redis server...
5304:C 09 Jun 2021 02:11:32.241 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
5304:C 09 Jun 2021 02:11:32.241 # Redis version=5.0.4, bits=64, commit=00000000, modified=0, pid=5304, just started
5304:C 09 Jun 2021 02:11:32.241 # Configuration loaded
Redis is running...
(3) slave2 configuration
[root@slave2 redis-5.0.4]# vim /usr/local/redis/redis.conf
......
# The target must be the master (192.168.100.202), not this host's own address
replicaof 192.168.100.202 6379

# If the master is password protected (using the "requirepass" configuration
# directive below) it is possible to tell the replica to authenticate before
# starting the replication synchronization process, otherwise the master will
# refuse the replica request.
#
masterauth 123123
......
Save and exit.
[root@slave2 redis-5.0.4]# /etc/init.d/redis restart
Stopping ...
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
Redis stopped
Starting Redis server...
5253:C 09 Jun 2021 17:50:25.680 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
5253:C 09 Jun 2021 17:50:25.680 # Redis version=5.0.4, bits=64, commit=00000000, modified=0, pid=5253, just started
5253:C 09 Jun 2021 17:50:25.680 # Configuration loaded
Redis is running...
(4) Verify that replication works
[root@master ~]# redis-cli -h 192.168.100.202 -a 123123
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
192.168.100.202:6379> set aaa bbb
OK
192.168.100.202:6379> set bbb ccc
OK
192.168.100.202:6379> keys *
1) "aaa"
2) "bbb"
[root@slave1 ~]# redis-cli -h 192.168.100.203 -a 123123
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
192.168.100.203:6379> keys *
1) "bbb"
2) "aaa"
192.168.100.203:6379> set ttt fff
(error) READONLY You can't write against a read only replica.
# A replica does not accept writes
[root@slave2 redis-5.0.4]# redis-cli -h 192.168.100.204 -a 123123
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
192.168.100.204:6379> keys *
1) "aaa"
2) "bbb"
192.168.100.204:6379> set ggg aaa
(error) READONLY You can't write against a read only replica.
# Master-replica replication is now configured
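The effect of min-replicas-to-write and min-replicas-max-lag can be illustrated with the replica lines that INFO replication reports on the master. The sketch below is only an approximation of the rule: the sample text, offsets and lag values are made up for illustration, and `parse_replicas` and `writes_accepted` are hypothetical helper names.

```python
import re
from typing import Dict, List

# Made-up sample in the format printed by "INFO replication" on a master.
SAMPLE_INFO = """\
# Replication
role:master
connected_slaves:2
slave0:ip=192.168.100.203,port=6379,state=online,offset=1234,lag=0
slave1:ip=192.168.100.204,port=6379,state=online,offset=1234,lag=12
"""

_SLAVE_LINE = re.compile(r"^slave\d+:(.+)$")


def parse_replicas(info_text: str) -> List[Dict[str, str]]:
    """Extract the key=value fields of every slaveN line."""
    replicas = []
    for line in info_text.splitlines():
        match = _SLAVE_LINE.match(line.strip())
        if match:
            pairs = match.group(1).split(",")
            replicas.append(dict(pair.split("=", 1) for pair in pairs))
    return replicas


def writes_accepted(replicas: List[Dict[str, str]],
                    min_replicas: int, max_lag: int) -> bool:
    """True if enough replicas are online with lag <= max_lag seconds."""
    healthy = [r for r in replicas
               if r["state"] == "online" and int(r["lag"]) <= max_lag]
    return len(healthy) >= min_replicas


replicas = parse_replicas(SAMPLE_INFO)
# One healthy replica is required and the lag limit is 10 seconds.
print(writes_accepted(replicas, min_replicas=1, max_lag=10))
```

In the sample, the second replica lags by 12 seconds and is not counted, so the master would still accept writes with a minimum of 1 but would refuse them with a minimum of 2.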
- Configure Sentinel
(1) Configure Sentinel on the master
[root@master ~]# cp redis-5.0.4/src/redis-sentinel /usr/local/redis/    # copy the Sentinel executable
[root@master ~]# cp redis-5.0.4/sentinel.conf /usr/local/redis/         # copy the Sentinel configuration file
[root@master ~]# mkdir -p /var/redis/data                                # create Sentinel's working directory
[root@master ~]# vim /usr/local/redis/sentinel.conf                      # edit the Sentinel configuration file
As with redis.conf, the explanations are written as separate comment lines because a directive line cannot carry a trailing comment.
......
# Port Sentinel listens on; the default is 26379
port 26379

# By default Redis Sentinel does not run as a daemon. Use 'yes' if you need it.
# Note that Redis will write a pid file in /var/run/redis-sentinel.pid when
# daemonized.
# yes runs Sentinel in the background; no runs it in the foreground, where the failover messages are visible
daemonize yes
......
# unmounting filesystems.
# Working directory of Sentinel (the path created above)
dir /var/redis/data
......
# The valid charset is A-z 0-9 and the three characters ".-_".
# Monitor the master named mymaster at 192.168.100.202:6379. The final 1 is the quorum:
# the number of Sentinels that must agree the master is down before a failover starts
sentinel monitor mymaster 192.168.100.202 6379 1
......
# Password used to connect to the master and the replicas
sentinel auth-pass mymaster 123123
# Configuration epoch of this master; Sentinel maintains this value itself.
# (The limit on how many replicas resynchronize with the new master at the same time
# is set by a different directive, sentinel parallel-syncs.)
sentinel config-epoch mymaster 1
......
# Default is 30 seconds.
# Time in milliseconds without a valid reply after which the master is considered down (3000 = 3 seconds)
sentinel down-after-milliseconds mymaster 3000
......
# Default is 3 minutes.
# Failover timeout in milliseconds (180000 = 180 seconds); a failover that takes longer is considered failed
sentinel failover-timeout mymaster 180000
......
Save and exit.
[root@master ~]# /usr/local/redis/redis-sentinel /usr/local/redis/sentinel.conf    # start Sentinel
1118:X 09 Jun 2021 18:09:29.027 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
1118:X 09 Jun 2021 18:09:29.027 # Redis version=5.0.4, bits=64, commit=00000000, modified=0, pid=1118, just started
1118:X 09 Jun 2021 18:09:29.027 # Configuration loaded
[root@master ~]# netstat -anpt | grep 26379
tcp        0      0 0.0.0.0:26379           0.0.0.0:*               LISTEN      1119/redis-sentinel
tcp6       0      0 :::26379                :::*                    LISTEN      1119/redis-sentinel
[root@master ~]# kill -9 1119    # stop Sentinel first
[root@master ~]# netstat -anpt | grep 26379
[root@master ~]# sed -i 's/^daemonize yes/daemonize no/' /usr/local/redis/sentinel.conf    # switch to foreground mode
[root@master ~]# /usr/local/redis/redis-sentinel /usr/local/redis/sentinel.conf    # start Sentinel again; messages appear after a short wait
1129:X 09 Jun 2021 18:11:02.585 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
1129:X 09 Jun 2021 18:11:02.585 # Redis version=5.0.4, bits=64, commit=00000000, modified=0, pid=1129, just started
1129:X 09 Jun 2021 18:11:02.585 # Configuration loaded
1129:X 09 Jun 2021 18:11:02.586 * Increased maximum number of open files to 10032 (it was originally set to 1024).
(Redis ASCII-art startup banner: Redis 5.0.4 (00000000/0) 64 bit, Running in sentinel mode, Port: 26379, PID: 1129, http://redis.io)
1129:X 09 Jun 2021 18:11:02.586 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
1129:X 09 Jun 2021 18:11:02.586 # Sentinel ID is fce7776020cf12792fd239f6f9d34f2d3fdef98c
1129:X 09 Jun 2021 18:11:02.586 # +monitor master mymaster 192.168.100.202 6379 quorum 1
1129:X 09 Jun 2021 18:18:04.434 * +reboot slave 192.168.100.204:6379 192.168.100.204 6379 @ mymaster 192.168.100.202 6379
1129:X 09 Jun 2021 18:18:14.478 * +reboot slave 192.168.100.203:6379 192.168.100.203 6379 @ mymaster 192.168.100.202 6379
# The last two messages show that Sentinel knows replicas 203 and 204, with 202 as the master
# Sentinel configuration is complete
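The sentinel monitor directive packs four values into one line, and the last one (the quorum) is easy to misread. The sketch below splits the directive into named fields and also shows the related majority rule: per the Redis Sentinel documentation, the quorum only decides when the master is flagged as down, while the failover itself must be authorized by a majority of all known Sentinels. `MonitorSpec`, `parse_monitor` and `majority` are hypothetical names used for illustration.

```python
from typing import NamedTuple


class MonitorSpec(NamedTuple):
    name: str     # logical name of the monitored master, e.g. "mymaster"
    ip: str
    port: int
    quorum: int   # Sentinels that must agree the master is down


def parse_monitor(line: str) -> MonitorSpec:
    """Parse 'sentinel monitor <name> <ip> <port> <quorum>'."""
    parts = line.split()
    if len(parts) != 6 or parts[:2] != ["sentinel", "monitor"]:
        raise ValueError(f"not a sentinel monitor directive: {line!r}")
    name, ip, port, quorum = parts[2:]
    return MonitorSpec(name, ip, int(port), int(quorum))


def majority(total_sentinels: int) -> int:
    """Sentinels needed to authorize a failover: more than half of them."""
    return total_sentinels // 2 + 1


spec = parse_monitor("sentinel monitor mymaster 192.168.100.202 6379 1")
print(spec)
# A single Sentinel is its own majority; three Sentinels need two votes.
print(majority(1), majority(3))
```

With the single Sentinel used in this lab, quorum and majority are both 1, which is why one process can detect the failure and carry out the failover by itself.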
- Test Sentinel failover
(1) Open a second terminal on the master and stop Redis there, to test whether the master role is switched
[root@master ~]# /etc/init.d/redis stop
Stopping ...
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
Redis stopped
(2) Switch back to the terminal running Sentinel and look at the new messages
1129:X 09 Jun 2021 18:20:36.588 # +failover-end master mymaster 192.168.100.202 6379
1129:X 09 Jun 2021 18:20:36.588 # +switch-master mymaster 192.168.100.202 6379 192.168.100.203 6379
1129:X 09 Jun 2021 18:20:36.588 * +slave slave 192.168.100.204:6379 192.168.100.204 6379 @ mymaster 192.168.100.203 6379
1129:X 09 Jun 2021 18:20:36.588 * +slave slave 192.168.100.202:6379 192.168.100.202 6379 @ mymaster 192.168.100.203 6379
1129:X 09 Jun 2021 18:20:39.607 # +sdown slave 192.168.100.202:6379 192.168.100.202 6379 @ mymaster 192.168.100.203 6379
# The master is now 203
(3) On 203, test whether replication still works
[root@slave1 ~]# redis-cli -h 192.168.100.203 -a 123123
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
192.168.100.203:6379> keys *
1) "aaa"
2) "bbb"
192.168.100.203:6379> set yyy aaa
OK
192.168.100.203:6379> keys *
1) "yyy"
2) "aaa"
3) "bbb"
[root@slave2 redis-5.0.4]# redis-cli -h 192.168.100.204 -a 123123
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
192.168.100.204:6379> keys *
1) "yyy"
2) "bbb"
3) "aaa"
# The new key was replicated successfully
(4) Start Redis on 202 again and watch the Sentinel messages
[root@master ~]# /etc/init.d/redis start
Starting Redis server...
1167:C 09 Jun 2021 18:23:39.756 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
1167:C 09 Jun 2021 18:23:39.756 # Redis version=5.0.4, bits=64, commit=00000000, modified=0, pid=1167, just started
1167:C 09 Jun 2021 18:23:39.756 # Configuration loaded
Redis is running...
In the Sentinel terminal:
1129:X 09 Jun 2021 18:23:50.324 * +convert-to-slave slave 192.168.100.202:6379 192.168.100.202 6379 @ mymaster 192.168.100.203 6379
# 202 has been added back as a replica
(5) In the second terminal on 202, check whether the data has been synchronized
[root@master ~]# redis-cli -h 192.168.100.202 -a 123123
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
192.168.100.202:6379> keys *
1) "bbb"
2) "aaa"
3) "yyy"
# The data was synchronized successfully
# Result of the failover test: after the failed master (202) recovers and rejoins, it stays a replica; the master role does not move back to it
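The +switch-master message is the one that tells a client or a monitoring script where the master now lives. A minimal sketch of extracting the old and new addresses from such a log line is shown below; `parse_switch_master` is a hypothetical helper, and the sample line has the same shape as the message printed by Sentinel during the failover test.

```python
import re
from typing import Optional, Tuple

Address = Tuple[str, int]

# +switch-master <master name> <old ip> <old port> <new ip> <new port>
_SWITCH = re.compile(r"\+switch-master (\S+) (\S+) (\d+) (\S+) (\d+)\s*$")


def parse_switch_master(line: str) -> Optional[Tuple[str, Address, Address]]:
    """Return (master name, old address, new address), or None if no match."""
    match = _SWITCH.search(line)
    if match is None:
        return None
    name, old_ip, old_port, new_ip, new_port = match.groups()
    return name, (old_ip, int(old_port)), (new_ip, int(new_port))


line = ("1129:X 09 Jun 2021 18:20:36.588 # +switch-master mymaster "
        "192.168.100.202 6379 192.168.100.203 6379")
result = parse_switch_master(line)
if result is not None:
    name, old, new = result
    print(f"{name}: {old[0]}:{old[1]} -> {new[0]}:{new[1]}")
```

Anchoring the pattern on the event name rather than on the timestamp prefix keeps the parser usable for both log files and terminal output.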
- Sentinel log events
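When Sentinel runs in the foreground, log messages are printed directly to the terminal. When it runs in the background, they go to the file set by the logfile directive; with daemonize yes and no logfile configured, the output is discarded.
+reset-master <instance details>    # The master was reset.
+slave <instance details>    # A replica was detected and added to the replica list.
+failover-state-reconf-slaves <instance details>    # The failover state changed to reconf-slaves.
+failover-detected <instance details>    # A failover started by another Sentinel or an external entity was detected.
+slave-reconf-sent <instance details>    # Sentinel sent the SLAVEOF command to this instance to reconfigure it.
+slave-reconf-inprog <instance details>    # The replica is being reconfigured as a replica of another master, but replication has not started yet.
+slave-reconf-done <instance details>    # The replica was reconfigured as a replica of another master and is now synchronized with it.
-dup-sentinel <instance details>    # Redundant Sentinels for the given master were removed; this can happen when a Sentinel restarts.
+sentinel <instance details>    # A new Sentinel for this master was detected.
+sdown <instance details>    # The instance entered the sdown (subjectively down) state.
-sdown <instance details>    # The instance left the sdown state.
+odown <instance details>    # The instance entered the odown (objectively down) state.
-odown <instance details>    # The instance left the odown state.
+new-epoch <instance details>    # The current configuration epoch was updated.
+try-failover <instance details>    # The failover conditions are met; waiting to be elected by the other Sentinels.
+elected-leader <instance details>    # This Sentinel was elected to perform the failover.
+failover-state-select-slave <instance details>    # Sentinel starts selecting a replica to become the new master.
no-good-slave <instance details>    # No suitable replica is available to become the new master (Redis 5 logs this as -failover-abort-no-good-slave).
+selected-slave <instance details>    # A suitable replica was found to become the new master.
+failover-state-send-slaveof-noone <instance details>    # The selected replica is being switched to the master role.
+failover-end-for-timeout <instance details>    # The failover was terminated because of a timeout.
+failover-end <instance details>    # The failover completed successfully.
+switch-master <master name> <oldip> <oldport> <newip> <newport>    # The master's address changed. This is usually the message clients care about most.
+tilt    # Sentinel entered tilt mode.
-tilt    # Sentinel left tilt mode.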
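Most of the events above carry an `<instance details>` payload. According to the Redis Sentinel documentation its layout is `<instance-type> <name> <ip> <port> @ <master-name> <master-ip> <master-port>`, where the part starting at `@` appears only when the instance is not itself a master. The sketch below parses that layout; `SentinelEvent` and `parse_event` are hypothetical names, and events with a different payload (such as +switch-master or +tilt) are deliberately rejected.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SentinelEvent:
    event: str
    instance_type: str
    name: str
    ip: str
    port: int
    master_name: Optional[str] = None
    master_ip: Optional[str] = None
    master_port: Optional[int] = None


def parse_event(payload: str) -> SentinelEvent:
    """Parse '<event> <type> <name> <ip> <port> [@ <mname> <mip> <mport>]'."""
    tokens = payload.split()
    if len(tokens) == 5:
        event, itype, name, ip, port = tokens
        return SentinelEvent(event, itype, name, ip, int(port))
    if len(tokens) == 9 and tokens[5] == "@":
        event, itype, name, ip, port, _, mname, mip, mport = tokens
        return SentinelEvent(event, itype, name, ip, int(port),
                             mname, mip, int(mport))
    raise ValueError(f"unexpected event payload: {payload!r}")


# Payloads with the same shape as the messages in the failover test.
down = parse_event("+sdown slave 192.168.100.202:6379 192.168.100.202 6379 "
                   "@ mymaster 192.168.100.203 6379")
print(down.event, down.ip, "replica of", down.master_ip)

done = parse_event("+failover-end master mymaster 192.168.100.202 6379")
print(done.event, done.name, done.master_name)
```

The payload passed in is the text after the `#` or `*` marker of a log line, so a caller would strip the process ID and timestamp prefix first.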