Data Transfer | Syncing MySQL Data to DBLE with DTLE

Author: Ren Zhongyu
A member of the ActionTech DBA team, specializing in fault analysis and performance optimization. Questions about the technical content of this article are welcome.
Source: original contribution
*Produced by the ActionTech open source community. Original content may not be used without authorization; to repost, contact the editor and credit the source.
Background

This work came out of a customer requirement: an online business MySQL database was under heavy pressure from both data volume and read/write load, so the business data needed to be migrated to a DBLE distributed database. Because the business runs 7x24 and the available downtime window was short, a real-time data synchronization scheme was required.
In the past, DBLE went live almost exclusively in greenfield deployments, and real-time synchronization into DBLE was rarely implemented. The release of DTLE last year improved this situation; let's put it into practice.
Environment Preparation

1. Deploy the target DBLE cluster
  • Installation of the DBLE software and the backend sharded MySQL instances is omitted
    • DBLE version 3.20.10.8, MySQL version 5.7.25
  • sharding.xml
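The original sharding.xml listing did not survive formatting. As a rough illustration only (the node names, sharding function, and sharding column below are our assumptions, not the article's actual config; the later `explain` output does show a sharding node named dn_01), a DBLE 3.x sharding.xml for this scenario could look like:

```xml
<?xml version="1.0"?>
<!DOCTYPE dble:sharding SYSTEM "sharding.dtd">
<dble:sharding xmlns:dble="http://dble.cloud/" version="4.0">
    <!-- logical schema "ren" containing the sharded test table -->
    <schema name="ren">
        <shardingTable name="test" shardingNode="dn_01,dn_02"
                       function="func_hash" shardingColumn="id"/>
    </schema>
    <shardingNode name="dn_01" dbGroup="dbGroup1" database="db_1"/>
    <shardingNode name="dn_02" dbGroup="dbGroup1" database="db_2"/>
    <function name="func_hash" class="Hash">
        <property name="partitionCount">2</property>
        <property name="partitionLength">512</property>
    </function>
</dble:sharding>
```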

  • db.xml (the original listing is garbled; only the dbGroup heartbeat statement `show slave status` survives)
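A hedged sketch of what the db.xml may have looked like (instance name, backend URL, pool sizes, and credentials are placeholders; only the heartbeat statement is taken from the original listing):

```xml
<?xml version="1.0"?>
<!DOCTYPE dble:db SYSTEM "db.dtd">
<dble:db xmlns:dble="http://dble.cloud/" version="4.0">
    <dbGroup name="dbGroup1" rwSplitMode="0" delayThreshold="100">
        <!-- the surviving fragment shows "show slave status" used as the heartbeat -->
        <heartbeat>show slave status</heartbeat>
        <dbInstance name="backend_m1" url="10.186.61.100:3306" user="dble"
                    password="password" maxCon="1000" minCon="10" primary="true"/>
    </dbGroup>
</dble:db>
```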

  • user.xml
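The user.xml content is also missing. The DTLE job below connects to DBLE as user `ren` with password `ren` on schema `ren`, so a minimal sketch consistent with that (the manager-user credentials are placeholders) would be:

```xml
<?xml version="1.0"?>
<!DOCTYPE dble:user SYSTEM "user.dtd">
<dble:user xmlns:dble="http://dble.cloud/" version="4.0">
    <managerUser name="man1" password="654321"/>
    <!-- the sync user referenced by the DTLE job's dest ConnectionConfig -->
    <shardingUser name="ren" password="ren" schemas="ren"/>
</dble:user>
```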

2. Create the test table on the source and the target
  • Installation of the source MySQL database software is omitted
  • The same test table, named test, must be created on both the source MySQL and the target DBLE
```sql
use ren;
CREATE TABLE `test` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `name` varchar(20) COLLATE utf8mb4_bin DEFAULT NULL,
  `city` varchar(20) COLLATE utf8mb4_bin DEFAULT NULL,
  `dt` datetime DEFAULT CURRENT_TIMESTAMP,
  PRIMARY KEY (`id`),
  KEY `idx_ctiy` (`city`)
) ENGINE=InnoDB;
```

3. Deploy a single-node DTLE
  • DTLE Community Edition GitHub download: https://github.com/actiontech...
  • After downloading, install directly with rpm (this example uses an internal QA build)
    • rpm -ivh --prefix=/data/dtle dtle-ee-9.9.9.9.x86_64.rpm
  • After installation, confirm it started normally
```shell
# curl -XGET "127.0.0.1:4646/v1/nodes" -s | jq
[
  {
    "Address": "127.0.0.1",
    "ID": "223c31b4-05cd-a763-b3e7-dbea6d416576",
    "Datacenter": "dc1",
    "Name": "nomad0",
    "NodeClass": "",
    "Version": "1.1.2",
    "Drain": false,
    "SchedulingEligibility": "eligible",
    "Status": "ready",
    "StatusDescription": "",
    "Drivers": {
      "dtle": {
        "Attributes": {
          "driver.dtle.full_version": "9.9.9.9-master-a65ee13",
          "driver.dtle": "true",
          "driver.dtle.version": "9.9.9.9"
        },
        "Detected": true,
        "Healthy": true,
        "HealthDescription": "Healthy",
        "UpdateTime": "2022-02-28T07:45:15.650289984Z"
·········
]
```

Creating a MySQL-To-DBLE Task

I. Full synchronization

1. Prepare the job file
```shell
# cat job.json
{
  "Job": {
    "ID": "mysqlToDBLE",
    "Datacenters": ["dc1"],
    "TaskGroups": [{
      "Name": "src",
      "Tasks": [{
        "Name": "src",
        "Driver": "dtle",
        "Config": {
          "Gtid": "",
          "ReplicateDoDb": [{
            "TableSchema": "ren",
            "Tables": [{ "TableName": "test" }]
          }],
          "ConnectionConfig": {
            "Host": "10.186.61.11",
            "Port": 3306,
            "User": "root",
            "Password": "root"
          }
        }
      }]
    }, {
      "Name": "dest",
      "Tasks": [{
        "Name": "dest",
        "Driver": "dtle",
        "Config": {
          "ConnectionConfig": {
            "Host": "10.186.61.10",
            "Port": 8066,
            "User": "ren",
            "Password": "ren"
          }
        }
      }]
    }]
  }
}
```

2. Prepare data for the full copy
  • Run on the source MySQL
```sql
mysql> insert into test values(1,'ren','sh',now());
mysql> insert into test values(2,'jack','bj',now());
mysql> insert into test values(3,'tom','sz',now());
```

3. Start the synchronization job
```shell
# curl -XPOST "http://127.0.0.1:4646/v1/jobs" -d @job.json -s | jq
{
  "EvalID": "88ab4a42-98b7-696e-0f98-08c1fe3ee4bd",
  "EvalCreateIndex": 12310,
  "JobModifyIndex": 12310,
  "Warnings": "",
  "Index": 12310,
  "LastContact": 0,
  "KnownLeader": false
}
```

4. Check the synchronization
  • Confirm the full copy has completed
```shell
# Run on the target DBLE
mysql> use ren;
Database changed
mysql> show tables;
+------------------+
| Tables_in_ren    |
+------------------+
| test             |
| gtid_executed_v4 |
+------------------+
2 rows in set (0.01 sec)

mysql> select * from test;
+----+------+------+---------------------+
| id | name | city | dt                  |
+----+------+------+---------------------+
|  1 | ren  | sh   | 2022-03-07 06:53:30 |
|  2 | jack | bj   | 2022-03-07 06:53:41 |
|  3 | tom  | sz   | 2022-03-07 06:53:59 |
+----+------+------+---------------------+
3 rows in set (0.01 sec)

# Write incremental test data on the source MySQL
mysql> insert into test select null,'mike','nj',now();
Query OK, 1 row affected (0.01 sec)
Records: 1  Duplicates: 0  Warnings: 0
········
mysql> insert into test select null,'mike4','nj',now();
Query OK, 1 row affected (0.01 sec)
Records: 1  Duplicates: 0  Warnings: 0

mysql> update test set city = 'sh' where name like 'mike%';
Query OK, 4 rows affected (0.01 sec)
Rows matched: 4  Changed: 4  Warnings: 0

mysql> select * from test;
+----+-------+------+---------------------+
| id | name  | city | dt                  |
+----+-------+------+---------------------+
|  1 | ren   | sh   | 2022-03-07 06:53:30 |
|  2 | jack  | bj   | 2022-03-07 06:53:41 |
|  3 | tom   | sz   | 2022-03-07 06:53:59 |
| 45 | mike  | sh   | 2022-03-07 08:03:57 |
| 46 | mike2 | sh   | 2022-03-07 08:04:02 |
| 47 | mike3 | sh   | 2022-03-07 08:04:05 |
| 48 | mike4 | sh   | 2022-03-07 08:04:09 |
+----+-------+------+---------------------+
7 rows in set (0.01 sec)

# Check the incremental sync on the target DBLE
mysql> select * from test;
+----+-------+------+---------------------+
| id | name  | city | dt                  |
+----+-------+------+---------------------+
|  1 | ren   | sh   | 2022-03-07 06:53:30 |
|  2 | jack  | bj   | 2022-03-07 06:53:41 |
|  3 | tom   | sz   | 2022-03-07 06:53:59 |
| 45 | mike  | sh   | 2022-03-07 08:03:57 |
| 46 | mike2 | sh   | 2022-03-07 08:04:02 |
| 47 | mike3 | sh   | 2022-03-07 08:04:05 |
| 48 | mike4 | sh   | 2022-03-07 08:04:09 |
+----+-------+------+---------------------+
7 rows in set (0.04 sec)

mysql> explain select * from test where id= 1;
+---------------+----------+--------------------------------+
| SHARDING_NODE | TYPE     | SQL/REF                        |
+---------------+----------+--------------------------------+
| dn_01         | BASE SQL | select * from test where id= 1 |
+---------------+----------+--------------------------------+
1 row in set (0.03 sec)
```

II. Incremental synchronization from a GTID position

1. Destroy the full-sync job
```shell
# cd /data/dtle/usr/bin/
# ll
total 188836
-rwxr-xr-x 1 root root 107811060 Mar 17  2020 consul
-rwxr-xr-x 1 root root  85550512 Jun 22  2021 nomad
# ./nomad job status
ID                 Type     Priority  Status   Submit Date
mysqlToDBLE        service  50        running  2022-03-07T15:47:31+08:00
mysqltoMysql-sync  service  50        running  2022-03-03T16:06:10+08:00
# ./nomad job stop -purge mysqlToDBLE
·······
? Deployment "433ed3d4" successful
·······
# ./nomad job status
ID                 Type     Priority  Status   Submit Date
mysqltoMysql-sync  service  50        running  2022-03-03T16:06:10+08:00
```

2. Record the source GTID position
```shell
# Record the GTID position on the source MySQL from which sync should start
mysql> show master status\G
*************************** 1. row ***************************
             File: mysql-bin.000178
········
Executed_Gtid_Set: 442dbe92-00c3-11ec-a0cf-02000aba3d0b:1-49705119,
cdc6fb62-00c2-11ec-a259-02000aba3d0a:1-3555
1 row in set (0.01 sec)

# Insert incremental data (simulating new business writes)
mysql> insert into test select 88,'sync01','wh',now();
mysql> insert into test select 99,'sync02','wh',now();

# Confirm on the source MySQL that the rows were inserted
mysql> select * from test;
+----+--------+------+---------------------+
| id | name   | city | dt                  |
+----+--------+------+---------------------+
|  1 | ren    | sh   | 2022-03-07 06:53:30 |
········
| 48 | mike4  | sh   | 2022-03-07 08:04:09 |
| 88 | sync01 | wh   | 2022-03-07 08:24:20 |
| 99 | sync02 | wh   | 2022-03-07 08:24:31 |
+----+--------+------+---------------------+
9 rows in set (0.00 sec)

# On the target DBLE, the new rows are absent because the sync job was destroyed
mysql> select * from test;
+----+-------+------+---------------------+
| id | name  | city | dt                  |
+----+-------+------+---------------------+
|  1 | ren   | sh   | 2022-03-07 06:53:30 |
········
| 48 | mike4 | sh   | 2022-03-07 08:04:09 |
+----+-------+------+---------------------+
7 rows in set (0.00 sec)
```
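The Executed_Gtid_Set value above is exactly what goes into the job's "Gtid" field, but the mysql client may wrap it across lines with leading spaces. A small helper sketch (ours, not a DTLE tool) to collapse a copied value into the single-line form the job file expects:

```shell
# Executed_Gtid_Set as copied from SHOW MASTER STATUS (line-wrapped)
gtid_set='442dbe92-00c3-11ec-a0cf-02000aba3d0b:1-49705119,
 cdc6fb62-00c2-11ec-a259-02000aba3d0a:1-3555'
# Strip newlines and spaces to get the value for the job's "Gtid" field
printf '%s' "$gtid_set" | tr -d '\n '
```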

3. Prepare the incremental-sync job file
```shell
# cat job.json
{
  "Job": {
    "ID": "mysqlToDBLE",
    "Datacenters": ["dc1"],
    "TaskGroups": [{
      "Name": "src",
      "Tasks": [{
        "Name": "src",
        "Driver": "dtle",
        "Config": {
          "Gtid": "442dbe92-00c3-11ec-a0cf-02000aba3d0b:1-49705119,cdc6fb62-00c2-11ec-a259-02000aba3d0a:1-3555",
          "ReplicateDoDb": [{
            "TableSchema": "ren",
            "Tables": [{ "TableName": "test" }]
          }],
          "ConnectionConfig": {
            "Host": "10.186.61.11",
            "Port": 3306,
            "User": "root",
            "Password": "root"
          }
        }
      }]
    }, {
      "Name": "dest",
      "Tasks": [{
        "Name": "dest",
        "Driver": "dtle",
        "Config": {
          "ConnectionConfig": {
            "Host": "10.186.61.10",
            "Port": 8066,
            "User": "ren",
            "Password": "ren"
          }
        }
      }]
    }]
  }
}
```

4. Start the incremental-sync job
```shell
# curl -XPOST "http://127.0.0.1:4646/v1/jobs" -d @job.json -s | jq
{
  "EvalID": "cad6fb19-62d3-67aa-6f5c-fbb79f8016d2",
  "EvalCreateIndex": 12855,
  "JobModifyIndex": 12855,
  "Warnings": "",
  "Index": 12855,
  "LastContact": 0,
  "KnownLeader": false
}
```

5. Check the synchronization
```shell
# On the target DBLE, the rows after the GTID position have been synced
mysql> select * from test;
+-----+--------+------+---------------------+
| id  | name   | city | dt                  |
+-----+--------+------+---------------------+
|   1 | ren    | sh   | 2022-03-07 06:53:30 |
|  48 | mike4  | sh   | 2022-03-07 08:04:09 |
·········
|  88 | sync01 | wh   | 2022-03-07 08:24:20 |
|  99 | sync02 | wh   | 2022-03-07 08:24:31 |
+-----+--------+------+---------------------+
11 rows in set (0.06 sec)
```

6. 其它DML及DDL同步
  • 验证下其它update、delete语句及DDL语句同步情况
```shell
# Run on the source MySQL
mysql> delete from test where id >= 100;
Query OK, 2 rows affected (0.01 sec)
mysql> delete from test where id > 3;
Query OK, 6 rows affected (0.01 sec)
mysql> update test set name = 'actionsky' where id = 3;
Query OK, 1 row affected (0.00 sec)
Rows matched: 1  Changed: 1  Warnings: 0

# Check the result on the target DBLE
mysql> select * from test;
+----+-----------+------+---------------------+
| id | name      | city | dt                  |
+----+-----------+------+---------------------+
|  1 | ren       | sh   | 2022-03-07 06:53:30 |
|  2 | jack      | bj   | 2022-03-07 06:53:41 |
|  3 | actionsky | sz   | 2022-03-07 06:53:59 |
+----+-----------+------+---------------------+
3 rows in set (0.01 sec)

# Run DDL on the source MySQL
mysql> alter table test add column info varchar(20) default 'hello';
mysql> update test set info = 'thanks' where id = 3;
mysql> alter table test add index idx_info(`info`);

# DDL is synced to the target DBLE (space permitting: in practice, any DDL
# statement that DBLE supports syncs successfully)
mysql> select * from test;
+----+-----------+------+---------------------+--------+
| id | name      | city | dt                  | info   |
+----+-----------+------+---------------------+--------+
|  1 | ren       | sh   | 2022-03-07 06:53:30 | hello  |
|  2 | jack      | bj   | 2022-03-07 06:53:41 | hello  |
|  3 | actionsky | sz   | 2022-03-07 06:53:59 | thanks |
+----+-----------+------+---------------------+--------+
3 rows in set (0.02 sec)

mysql> show create table test\G
*************************** 1. row ***************************
       Table: test
Create Table: CREATE TABLE `test` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `name` varchar(20) COLLATE utf8mb4_bin DEFAULT NULL,
  `city` varchar(20) COLLATE utf8mb4_bin DEFAULT NULL,
  `dt` datetime DEFAULT CURRENT_TIMESTAMP,
  `info` varchar(20) COLLATE utf8mb4_bin DEFAULT 'hello',
  PRIMARY KEY (`id`),
  KEY `idx_ctiy` (`city`),
  KEY `idx_info` (`info`)
) ENGINE=InnoDB AUTO_INCREMENT=89 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin
1 row in set (0.01 sec)
```

Notes on Configuring MySQL-To-DBLE

1. "Test connection" failure
  • Symptom: while configuring a job in DTLE, the "test connection" step fails, and the nomad log reports ERROR 1064 (HY000): java.sql.SQLSyntaxErrorException: illegal value[TRUE]
  • Cause: DTLE's connection check issues set autocommit=true, which some DBLE versions do not support
  • Fix: upgrade DBLE to 3.20.10.6 or later
2. Sync fails after job start: 'dtle' does not exist
  • Symptom: after the DTLE sync job starts it fails, and the nomad log shows Can't create database 'dtle' that doesn't exists.
  • Cause:
    • DTLE to MySQL does not hit this error
    • DTLE to DBLE does, because schemas in the DBLE middleware are created differently than in plain MySQL, so this create statement is not supported
  • Fix:
    • Configure the required schema/table in DBLE beforehand; see the sharding.xml and user.xml configuration earlier in this article
3. Sync fails after job start: 'Data too long'
  • Symptom: after the DTLE sync job starts it fails, and the nomad log shows "applier error/restart: insert gno: Error 1406: Data too long for column 'source_uuid' at row 1"
  • Cause:
    • In the gtid_executed_v4 table that DTLE creates in DBLE, the binary type of the source_uuid column is too short
    • The DBLE middleware log (core/log/dble.log) shows the matching error: "execute sql err : errNo:1406 Data too long for column 'source_uuid' at row 1"
  • Fix:
    • In DBLE, widen the column:
    • alter table gtid_executed_v4 modify column source_uuid binary(60);
Conclusion
  • DTLE's current feature set basically meets the need for real-time MySQL -> DBLE data synchronization. Note, however, that the full synchronization approach shown in this article is not recommended
    • In production, where the legacy MySQL database is large, first take a full logical backup (recording the GTID position), split it with DBLE's bundled split tool and import it, and then use DTLE's GTID-based incremental synchronization to keep the data in sync
  • Before creating a To-DBLE task in DTLE, review the notes above; in particular, prepare the sharding.xml and user.xml files in advance and create DTLE's metadata table gtid_executed_v4.
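The recommended production flow above can be sketched as follows (the dump file name is illustrative, and the split-tool invocation is deliberately left as a comment; check the DBLE documentation for its actual options):

```shell
# 1. Full logical dump from the source, recording the GTID set in the dump
#    header (--set-gtid-purged=ON writes SET @@GLOBAL.GTID_PURGED=... into it)
mysqldump -h 10.186.61.11 -uroot -p --single-transaction \
  --set-gtid-purged=ON ren test > ren_test.sql

# 2. Split the dump according to the sharding rules with DBLE's split tool,
#    then import each piece into its backend MySQL shard

# 3. Create a DTLE job whose "Gtid" field holds the GTID set recorded in
#    step 1, so incremental sync resumes exactly where the dump left off
curl -XPOST "http://127.0.0.1:4646/v1/jobs" -d @job.json -s | jq
```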
