Hands-On Case: Dynamically Scaling Down a Redis Cluster

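Once a Redis cluster is in production, hardware failures, network replanning, or business changes will sooner or later force adjustments to it: adding nodes, removing nodes, migrating nodes, replacing servers, and so on. Adding or removing a node means reassigning existing slots and migrating data.
Scenario for this exercise: because of shrinking business or a host failure, a company needs to shrink its existing five-master/five-replica Redis Cluster. Without affecting business traffic and without losing data, we scale it down online to a four-master/four-replica cluster.
This experiment builds on the five-master/five-replica cluster from the previous article, "实战案例:Redis集群动态扩容" (Hands-On Case: Dynamically Scaling Up a Redis Cluster, https://blog.51cto.com/shone/5278001).
Special note: a Redis cluster normally runs an odd number of masters, three or more; otherwise it is prone to split-brain. This exercise only demonstrates removing one master/replica pair; the same method removes a second pair to reach a three-master/three-replica layout.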

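Cluster maintenance: the dynamic scale-down procedure:

  • First migrate the slots held by the Redis node to be removed (IP58) to other Redis nodes in the cluster.
  • Then delete it (both the master and its replica). If the slots on a Redis node have not been completely migrated, deleting that node reports that it still holds data and the deletion fails.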
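The "slots must be fully migrated before deletion" rule can be checked mechanically. The following is a minimal sketch, not part of the original procedure: `slots_on_node` and `safe_to_delete` are hypothetical helper names, and the sketch assumes the per-master summary lines keep the format shown in this article's `--cluster check` output (and that an emptied master still appears there).

```shell
# Hypothetical helper: reads `redis-cli --cluster check` output on stdin and
# prints the slot count from the summary line of the given host:port.
slots_on_node() {
    local addr="$1"
    awk -v addr="$addr" '
        $1 == addr {
            # Summary line looks like: "<addr> (<id>...) -> 0 keys | 3274 slots | 1 slaves."
            for (i = 2; i <= NF; i++) if ($i == "slots") print $(i - 1)
        }'
}

# Hypothetical helper: succeeds only when the node reports exactly 0 slots.
safe_to_delete() {
    local n
    n="$(slots_on_node "$1")"
    [ "$n" = "0" ]
}
```

A possible use is `redis-cli -a 123456 --no-auth-warning --cluster check 172.16.0.18:6379 | safe_to_delete 172.16.0.58:6379 && echo "IP58 is empty"`, run before attempting the deletion.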
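1. Migrate the slots to other nodes

Note: plan the migration and back up the data before migrating anything! Every SRE should treat this as an iron rule. As far as possible, make sure the source Redis master being drained is no longer serving live data; otherwise the migration is error-prone and may be forcibly interrupted and fail.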
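As one possible way to take the backup mentioned above, `redis-cli --rdb <file>` pulls an RDB snapshot from a node. The sketch below only prints the commands for review; `backup_cmds` and the `backup-<host>-<port>.rdb` file names are illustrative assumptions, not something the original article prescribes.

```shell
# Hypothetical helper: prints (does not run) one `redis-cli --rdb` command per
# node address so the commands can be reviewed before anything is executed.
# First argument is the password; the remaining arguments are host:port pairs.
backup_cmds() {
    local pass="$1" addr host port
    shift
    for addr in "$@"; do
        host="${addr%:*}"     # text before the last colon
        port="${addr##*:}"    # text after the last colon
        printf 'redis-cli -h %s -p %s -a %s --no-auth-warning --rdb backup-%s-%s.rdb\n' \
            "$host" "$port" "$pass" "$host" "$port"
    done
}

# Example: print the backup commands for the master to be drained and one receiver.
backup_cmds 123456 172.16.0.58:6379 172.16.0.18:6379
```

Printing instead of executing keeps the sketch side-effect free; piping the output to `sh` would run the commands.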
#### List all nodes and their master/replica relationships, and decide which nodes to delete
[root@CentOS84-IP172-18 ]#redis-cli -a 123456 --no-auth-warning --cluster check 172.16.0.18:6379
172.16.0.18:6379 (d5462f69...) -> 0 keys | 3278 slots | 1 slaves.
172.16.0.28:6379 (5163e9ab...) -> 0 keys | 3278 slots | 1 slaves.
172.16.0.58:6379 (c4af78bc...) -> 0 keys | 3274 slots | 1 slaves.
172.16.0.48:6379 (aaa956c2...) -> 1 keys | 3275 slots | 1 slaves.
172.16.0.38:6379 (4c429a48...) -> 0 keys | 3279 slots | 1 slaves.
[OK] 1 keys in 5 masters.
0.00 keys per slot on average.
>>> Performing Cluster Check (using node 172.16.0.18:6379)
M: d5462f6961c0f45ecbdf12d6606e6993c33e3e29 172.16.0.18:6379
slots:[2184-5461] (3278 slots) master
1 additional replica(s)
S: 93a4ba65d181a756c08bd3b6c2a7a4d24cf5855d 172.16.0.148:6379
slots: (0 slots) slave
replicates aaa956c280b5d9fe18fca48a910c1085b5f22122
S: aeee686e355fa7784b383fec3543232126dcfbad 172.16.0.158:6379
slots: (0 slots) slave
replicates c4af78bc4a26490d51edc78b6c547a7abaf4e1aa
M: 5163e9abbf42bd3540d9c04f6fb384ea23a1f58e 172.16.0.28:6379
slots:[7645-10922] (3278 slots) master
1 additional replica(s)
M: c4af78bc4a26490d51edc78b6c547a7abaf4e1aa 172.16.0.58:6379
slots:[1093-2183],[6553-7644],[12014-13104] (3274 slots) master
1 additional replica(s)
S: a7583f69703921c6b3d14a97f54b3015966155ab 172.16.0.118:6379
slots: (0 slots) slave
replicates d5462f6961c0f45ecbdf12d6606e6993c33e3e29
S: 3d69cddc61df9443ff7de9850c220fc9e9187c03 172.16.0.138:6379
slots: (0 slots) slave
replicates 4c429a48054a771cbc154319182a3d16cf4ce7a1
M: aaa956c280b5d9fe18fca48a910c1085b5f22122 172.16.0.48:6379
slots:[0-1092],[5462-6552],[10923-12013] (3275 slots) master
1 additional replica(s)
S: 3bbdbc3ab34b67161655974fed9de5667def8ed0 172.16.0.128:6379
slots: (0 slots) slave
replicates 5163e9abbf42bd3540d9c04f6fb384ea23a1f58e
M: 4c429a48054a771cbc154319182a3d16cf4ce7a1 172.16.0.38:6379
slots:[13105-16383] (3279 slots) master
1 additional replica(s)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
[root@CentOS84-IP172-18 ]#
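The reshard prompt later asks how many slots to move, so it helps to know how many slots each range on the departing master contains. A minimal sketch, assuming the bracketed range format printed by `--cluster check` above; `count_slots` is a hypothetical helper name.

```shell
# Hypothetical helper: counts the slots in a range list such as
# "[1093-2183],[6553-7644]"; a single-slot entry like "[5]" counts as one slot.
count_slots() {
    printf '%s\n' "$1" | sed 's/[][]//g' | tr ',' '\n' |
        awk -F- '
            NF == 2              { total += $2 - $1 + 1 }   # inclusive range
            NF == 1 && $1 != ""  { total += 1 }             # single slot
            END                  { print total + 0 }'
}

count_slots "[1093-2183]"                              # prints 1091
count_slots "[1093-2183],[6553-7644],[12014-13104]"   # prints 3274
```

The three ranges on IP58 contain 1091, 1092, and 1091 slots, which matches the 3274 slots reported for that master.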

#### Remove the following master/replica pair (IP58) from the cluster to demonstrate the scale-down
M: c4af78bc4a26490d51edc78b6c547a7abaf4e1aa 172.16.0.58:6379
slots:[1093-2183],[6553-7644],[12014-13104] (3274 slots) master
1 additional replica(s)
S: aeee686e355fa7784b383fec3543232126dcfbad 172.16.0.158:6379
slots: (0 slots) slave
replicates c4af78bc4a26490d51edc78b6c547a7abaf4e1aa

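#### To address the split-brain concern when going from five pairs to four, the following master/replica pair (IP48) can be removed next. This exercise only demonstrates removing IP58; IP48 is removed the same way.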
M: aaa956c280b5d9fe18fca48a910c1085b5f22122 172.16.0.48:6379
slots:[0-1092],[5462-6552],[10923-12013] (3275 slots) master
1 additional replica(s)
S: 93a4ba65d181a756c08bd3b6c2a7a4d24cf5855d 172.16.0.148:6379
slots: (0 slots) slave
replicates aaa956c280b5d9fe18fca48a910c1085b5f22122

## Step 1: move slots [1093-2183] to IP18
###################################################################################
## Migrate slots [1093-2183] from 172.16.0.58 to 172.16.0.18
[root@CentOS84-IP172-18 ]#redis-cli -a 123456 --no-auth-warning --cluster reshard 172.16.0.18:6379

>>> Performing Cluster Check (using node 172.16.0.18:6379)
M: d5462f6961c0f45ecbdf12d6606e6993c33e3e29 172.16.0.18:6379
slots:[2184-5461] (3278 slots) master
1 additional replica(s)
S: 93a4ba65d181a756c08bd3b6c2a7a4d24cf5855d 172.16.0.148:6379
slots: (0 slots) slave
replicates aaa956c280b5d9fe18fca48a910c1085b5f22122
S: aeee686e355fa7784b383fec3543232126dcfbad 172.16.0.158:6379
slots: (0 slots) slave
replicates c4af78bc4a26490d51edc78b6c547a7abaf4e1aa
M: 5163e9abbf42bd3540d9c04f6fb384ea23a1f58e 172.16.0.28:6379
slots:[7645-10922] (3278 slots) master
1 additional replica(s)
M: c4af78bc4a26490d51edc78b6c547a7abaf4e1aa 172.16.0.58:6379
slots:[1093-2183],[6553-7644],[12014-13104] (3274 slots) master
1 additional replica(s)
S: a7583f69703921c6b3d14a97f54b3015966155ab 172.16.0.118:6379
slots: (0 slots) slave
replicates d5462f6961c0f45ecbdf12d6606e6993c33e3e29
S: 3d69cddc61df9443ff7de9850c220fc9e9187c03 172.16.0.138:6379
slots: (0 slots) slave
replicates 4c429a48054a771cbc154319182a3d16cf4ce7a1
M: aaa956c280b5d9fe18fca48a910c1085b5f22122 172.16.0.48:6379
slots:[0-1092],[5462-6552],[10923-12013] (3275 slots) master
1 additional replica(s)
S: 3bbdbc3ab34b67161655974fed9de5667def8ed0 172.16.0.128:6379
slots: (0 slots) slave
replicates 5163e9abbf42bd3540d9c04f6fb384ea23a1f58e
M: 4c429a48054a771cbc154319182a3d16cf4ce7a1 172.16.0.38:6379
slots:[13105-16383] (3279 slots) master
1 additional replica(s)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
How many slots do you want to move (from 1 to 16384)? 1091
What is the receiving node ID? d5462f6961c0f45ecbdf12d6606e6993c33e3e29
Please enter all the source node IDs.
Type all to use all the nodes as source nodes for the hash slots.
Type done once you entered all the source nodes IDs.
Source node #1: c4af78bc4a26490d51edc78b6c547a7abaf4e1aa
Source node #2: done

Ready to move 1091 slots.
Source nodes:
M: c4af78bc4a26490d51edc78b6c547a7abaf4e1aa 172.16.0.58:6379
slots:[1093-2183],[6553-7644],[12014-13104] (3274 slots) master
1 additional replica(s)
Destination node:
M: d5462f6961c0f45ecbdf12d6606e6993c33e3e29 172.16.0.18:6379
slots:[2184-5461] (3278 slots) master
1 additional replica(s)
Resharding plan:
Moving slot 1093 from c4af78bc4a26490d51edc78b6c547a7abaf4e1aa
.....................
Moving slot 2183 from c4af78bc4a26490d51edc78b6c547a7abaf4e1aa
Do you want to proceed with the proposed reshard plan (yes/no)? yes
Moving slot 1093 from 172.16.0.58:6379 to 172.16.0.18:6379:
.........................
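The interactive answers given above (slot count, receiving node ID, source node ID, `yes`) can also be passed on the command line with redis-cli's `--cluster-from`, `--cluster-to`, `--cluster-slots`, and `--cluster-yes` options. The sketch below only builds and prints such a command; `reshard_cmd` is a hypothetical helper name, and the password default is the one used in this article.

```shell
# Hypothetical helper: prints (does not run) a non-interactive reshard command.
# Arguments: entry node host:port, source node ID, receiving node ID, slot count.
# REDIS_PASS is an assumed environment variable; it defaults to the demo password.
reshard_cmd() {
    local entry="$1" from="$2" to="$3" slots="$4"
    printf 'redis-cli -a %s --no-auth-warning --cluster reshard %s --cluster-from %s --cluster-to %s --cluster-slots %s --cluster-yes\n' \
        "${REDIS_PASS:-123456}" "$entry" "$from" "$to" "$slots"
}

# Step 1 of this article: 1091 slots from IP58 to IP18.
reshard_cmd 172.16.0.18:6379 \
    c4af78bc4a26490d51edc78b6c547a7abaf4e1aa \
    d5462f6961c0f45ecbdf12d6606e6993c33e3e29 \
    1091
```

Because `--cluster-yes` skips the confirmation prompt, it is worth reviewing the printed command, and running a `--cluster check` first, before executing it.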

###################################################################################
## Step 2: migrate slots [6553-7644] from 172.16.0.58 to 172.16.0.28
[root@CentOS84-IP172-18 ]#redis-cli -a 123456 --no-auth-warning --cluster check 172.16.0.18:6379
172.16.0.18:6379 (d5462f69...) -> 0 keys | 4369 slots | 1 slaves.
172.16.0.28:6379 (5163e9ab...) -> 0 keys | 3278 slots | 1 slaves.
172.16.0.58:6379 (c4af78bc...) -> 0 keys | 2183 slots | 1 slaves.
172.16.0.48:6379 (aaa956c2...) -> 1 keys | 3275 slots | 1 slaves.
172.16.0.38:6379 (4c429a48...) -> 0 keys | 3279 slots | 1 slaves.
[OK] 1 keys in 5 masters.
0.00 keys per slot on average.
>>> Performing Cluster Check (using node 172.16.0.18:6379)
M: d5462f6961c0f45ecbdf12d6606e6993c33e3e29 172.16.0.18:6379
slots:[1093-5461] (4369 slots) master
1 additional replica(s)
S: 93a4ba65d181a756c08bd3b6c2a7a4d24cf5855d 172.16.0.148:6379
slots: (0 slots) slave
replicates aaa956c280b5d9fe18fca48a910c1085b5f22122
S: aeee686e355fa7784b383fec3543232126dcfbad 172.16.0.158:6379
slots: (0 slots) slave
replicates c4af78bc4a26490d51edc78b6c547a7abaf4e1aa
M: 5163e9abbf42bd3540d9c04f6fb384ea23a1f58e 172.16.0.28:6379
slots:[7645-10922] (3278 slots) master
1 additional replica(s)
M: c4af78bc4a26490d51edc78b6c547a7abaf4e1aa 172.16.0.58:6379
slots:[6553-7644],[12014-13104] (2183 slots) master
1 additional replica(s)
S: a7583f69703921c6b3d14a97f54b3015966155ab 172.16.0.118:6379
slots: (0 slots) slave
replicates d5462f6961c0f45ecbdf12d6606e6993c33e3e29
S: 3d69cddc61df9443ff7de9850c220fc9e9187c03 172.16.0.138:6379
slots: (0 slots) slave
replicates 4c429a48054a771cbc154319182a3d16cf4ce7a1
M: aaa956c280b5d9fe18fca48a910c1085b5f22122 172.16.0.48:6379
slots:[0-1092],[5462-6552],[10923-12013] (3275 slots) master
1 additional replica(s)
S: 3bbdbc3ab34b67161655974fed9de5667def8ed0 172.16.0.128:6379
slots: (0 slots) slave
replicates 5163e9abbf42bd3540d9c04f6fb384ea23a1f58e
M: 4c429a48054a771cbc154319182a3d16cf4ce7a1 172.16.0.38:6379
slots:[13105-16383] (3279 slots) master
1 additional replica(s)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
[root@CentOS84-IP172-18 ]#
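After each migration step the per-master slot counts should still add up to 16384. redis-cli already reports this as "All 16384 slots covered", so the following is only an independent sanity check; `total_slots` is a hypothetical helper name, and the sketch assumes the summary-line format shown above.

```shell
# Hypothetical helper: reads `redis-cli --cluster check` output on stdin and
# sums the slot counts found on the per-master summary lines (the ones
# containing "->"), ignoring the detailed M:/S: listing and the [OK] lines.
total_slots() {
    awk '
        /->/ {
            for (i = 2; i <= NF; i++) if ($i == "slots") total += $(i - 1)
        }
        END { print total + 0 }'
}
```

For the check output above, `4369 + 3278 + 2183 + 3275 + 3279` gives 16384, so piping that output into `total_slots` should print 16384.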


[root@CentOS84-IP172-18 ]#redis-cli -a 123456 --no-auth-warning --cluster reshard 172.16.0.18:6379
>>> Performing Cluster Check (using node 172.16.0.18:6379)
M: d5462f6961c0f45ecbdf12d6606e6993c33e3e29 172.16.0.18:6379
slots:[1093-5461] (4369 slots) master
1 additional replica(s)
S: 93a4ba65d181a756c08bd3b6c2a7a4d24cf5855d 172.16.0.148:6379
slots: (0 slots) slave
replicates aaa956c280b5d9fe18fca48a910c1085b5f22122
S: aeee686e355fa7784b383fec3543232126dcfbad 172.16.0.158:6379
slots: (0 slots) slave
replicates c4af78bc4a26490d51edc78b6c547a7abaf4e1aa
M: 5163e9abbf42bd3540d9c04f6fb384ea23a1f58e 172.16.0.28:6379
slots:[7645-10922] (3278 slots) master
1 additional replica(s)
M: c4af78bc4a26490d51edc78b6c547a7abaf4e1aa 172.16.0.58:6379
slots:[6553-7644],[12014-13104] (2183 slots) master
1 additional replica(s)
S: a7583f69703921c6b3d14a97f54b3015966155ab 172.16.0.118:6379
slots: (0 slots) slave
replicates d5462f6961c0f45ecbdf12d6606e6993c33e3e29
S: 3d69cddc61df9443ff7de9850c220fc9e9187c03 172.16.0.138:6379
slots: (0 slots) slave
replicates 4c429a48054a771cbc154319182a3d16cf4ce7a1
M: aaa956c280b5d9fe18fca48a910c1085b5f22122 172.16.0.48:6379
slots:[0-1092],[5462-6552],[10923-12013] (3275 slots) master
1 additional replica(s)
S: 3bbdbc3ab34b67161655974fed9de5667def8ed0 172.16.0.128:6379
slots: (0 slots) slave
replicates 5163e9abbf42bd3540d9c04f6fb384ea23a1f58e
M: 4c429a48054a771cbc154319182a3d16cf4ce7a1 172.16.0.38:6379
slots:[13105-16383] (3279 slots) master
1 additional replica(s)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
How many slots do you want to move (from 1 to 16384)? 1092
What is the receiving node ID? 5163e9abbf42bd3540d9c04f6fb384ea23a1f58e
Please enter all the source node IDs.
Type all to use all the nodes as source nodes for the hash slots.
Type done once you entered all the source nodes IDs.
Source node #1: c4af78bc4a26490d51edc78b6c547a7abaf4e1aa
Source node #2: done

......................
Do you want to proceed with the proposed reshard plan (yes/no)? yes
......................

###################################################################################
## Step 3: migrate slots [12014-13104] from 172.16.0.58 to 172.16.0.38

[root@CentOS84-IP172-18 ]#redis-cli -a 123456 --no-auth-warning --cluster check 172.16.0.18:6379
172.16.0.18:6379 (d5462f69...) -> 0 keys | 4369 slots | 1 slaves.
172.16.0.28:6379 (5163e9ab...) -> 0 keys | 4370 slots | 1 slaves.
172.16.0.58:6379 (c4af78bc...) -> 0 keys | 1091 slots | 1 slaves.
172.16.0.48:6379 (aaa956c2...) -> 1 keys | 3275 slots | 1 slaves.
172.16.0.38:6379 (4c429a48...) -> 0 keys | 3279 slots | 1 slaves.
[OK] 1 keys in 5 masters.
0.00 keys per slot on average.
>>> Performing Cluster Check (using node 172.16.0.18:6379)
M: d5462f6961c0f45ecbdf12d6606e6993c33e3e29 172.16.0.18:6379
slots:[1093-5461] (4369 slots) master
1 additional replica(s)
S: 93a4ba65d181a756c08bd3b6c2a7a4d24cf5855d 172.16.0.148:6379
slots: (0 slots) slave
replicates aaa956c280b5d9fe18fca48a910c1085b5f22122
S: aeee686e355fa7784b383fec3543232126dcfbad 172.16.0.158:6379
slots: (0 slots) slave
replicates c4af78bc4a26490d51edc78b6c547a7abaf4e1aa
M: 5163e9abbf42bd3540d9c04f6fb384ea23a1f58e 172.16.0.28:6379
slots:[6553-10922] (4370 slots) master
1 additional replica(s)
M: c4af78bc4a26490d51edc78b6c547a7abaf4e1aa 172.16.0.58:6379
slots:[12014-13104] (1091 slots) master
1 additional replica(s)
S: a7583f69703921c6b3d14a97f54b3015966155ab 172.16.0.118:6379
slots: (0 slots) slave
replicates d5462f6961c0f45ecbdf12d6606e6993c33e3e29
S: 3d69cddc61df9443ff7de9850c220fc9e9187c03 172.16.0.138:6379
slots: (0 slots) slave
replicates 4c429a48054a771cbc154319182a3d16cf4ce7a1
M: aaa956c280b5d9fe18fca48a910c1085b5f22122 172.16.0.48:6379
slots:[0-1092],[5462-6552],[10923-12013] (3275 slots) master
1 additional replica(s)
S: 3bbdbc3ab34b67161655974fed9de5667def8ed0 172.16.0.128:6379
slots: (0 slots) slave
replicates 5163e9abbf42bd3540d9c04f6fb384ea23a1f58e
M: 4c429a48054a771cbc154319182a3d16cf4ce7a1 172.16.0.38:6379
slots:[13105-16383] (3279 slots) master
1 additional replica(s)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
[root@CentOS84-IP172-18 ]#redis-cli -a 123456 --no-auth-warning --cluster reshard 172.16.0.18:6379
>>> Performing Cluster Check (using node 172.16.0.18:6379)
M: d5462f6961c0f45ecbdf12d6606e6993c33e3e29 172.16.0.18:6379
slots:[1093-5461] (4369 slots) master
1 additional replica(s)
S: 93a4ba65d181a756c08bd3b6c2a7a4d24cf5855d 172.16.0.148:6379
slots: (0 slots) slave
replicates aaa956c280b5d9fe18fca48a910c1085b5f22122
S: aeee686e355fa7784b383fec3543232126dcfbad 172.16.0.158:6379
slots: (0 slots) slave
replicates c4af78bc4a26490d51edc78b6c547a7abaf4e1aa
M: 5163e9abbf42bd3540d9c04f6fb384ea23a1f58e 172.16.0.28:6379
slots:[6553-10922] (4370 slots) master
1 additional replica(s)
M: c4af78bc4a26490d51edc78b6c547a7abaf4e1aa 172.16.0.58:6379
slots:[12014-13104] (1091 slots) master
1 additional replica(s)
S: a7583f69703921c6b3d14a97f54b3015966155ab 172.16.0.118:6379
slots: (0 slots) slave
replicates d5462f6961c0f45ecbdf12d6606e6993c33e3e29
S: 3d69cddc61df9443ff7de9850c220fc9e9187c03 172.16.0.138:6379
slots: (0 slots) slave
replicates 4c429a48054a771cbc154319182a3d16cf4ce7a1
M: aaa956c280b5d9fe18fca48a910c1085b5f22122 172.16.0.48:6379
slots:[0-1092],[5462-6552],[10923-12013] (3275 slots) master
