title: 05.MySQL-MGR和主从分离 CreateTime: 2019-06-19 14:07:00 UpdateTime: 2024-09-10 14:09:05 CategoryName: Web --- ## 1. 参考资料 http://blog.csdn.net/jssg_tzw/article/details/69791330 http://blog.itpub.net/27067062/viewspace-2142098 http://www.sohu.com/a/165178082_505827 ## 2. 卸载MySQL ```shell #最简单的方法就是yum卸载自带的MySQL5.1 yum remove mysql mysql-server mysql-libs mariadb-libs compat-mysql51 rm -rf /var/lib/mysql rm /etc/my.cnf #查看是否还有mysql软件: rpm -qa|grep mysql #有的话继续删除,这样会把系统定时任务的 crontab也删除,需要的话需要重新安装. yum install vixie-cron crontabs systemctl enable crond #设置crond服务开机自动启动 systemctl start crond #启动crond服务 ``` ## 3. 安装MySQL 安装MySQL的依赖包 ```shell yum install libaio libaio-devel numactl net-tools perl perl-devel openssl openssl-devel ``` 下载RPM整体包 https://dev.mysql.com/get/Downloads/MySQL-8.4/mysql-8.4.2-1.el9.x86_64.rpm-bundle.tar 这个压缩包里包含所有的MySQL的RPM包,安装rpm包 ```shell rpm -ivh mysql-community-client-plugins-8.4.2-1.el9.x86_64.rpm mysql-community-common-8.4.2-1.el9.x86_64.rpm mysql-community-libs-8.4.2-1.el9.x86_64.rpm mysql-community-icu-data-files-8.4.2-1.el9.x86_64.rpm mysql-community-client-8.4.2-1.el9.x86_64.rpm mysql-community-server-8.4.2-1.el9.x86_64.rpm mysql-community-devel-8.4.2-1.el9.x86_64.rpm ``` 卸载 ```shell yum remove mysql-community-client-plugins-8.4.2-1.el9 mysql-community-common-8.4.2-1.el9 mysql-community-server-8.4.2-1.el9 mysql-community-icu-data-files-8.4.2-1.el9 ``` ## 4. 启动MySQL ```shell ##修改/etc/selinux/config文件中设置SELINUX=disabled setenforce 0 #删除上次的遗留文件 rm -rf /var/lib/mysql/* #修改日志文件权限 echo "" > /var/log/mysqld.log chown -R mysql:mysql /var/log/mysqld.log #初始化mysql,安装错误可以删除data目录,重新初始化,加上lower_case_table_names就会启动异常,需要在/etc/my.conf里配置 lower_case_table_names=1 mysqld --initialize --lower_case_table_names=1 #mysqld --initialize #修改文件权限 chown -R mysql:mysql /var/lib/mysql #如果做了软连接,修改实际目录的文件权限,/var/lib/mysql,mysql-files,mysql-keyring 三个文件夹 #chown -R mysql:mysql /data/mysql #启动MySQL systemctl enable mysqld systemctl start mysqld #查找初始化密码 grep password /var/log/mysqld.log #登录MySQL: mysql -uroot -p初始化密码 #更新root密码为Root+1qazxsw2 #SET PASSWORD=PASSWORD('Root+1qazxsw2'); ALTER USER 'root'@'localhost' IDENTIFIED BY 'Root+1qazxsw2'; #ALTER USER 'root'@'%' IDENTIFIED BY 'Root+1qazxsw2'; #不限制root用户的host update mysql.user set host = '%' where user = 'root'; flush privileges; ``` ## 5. 创建测试的数据库(非必须) ```sql -- 创建数据库 create database db1; -- 创建用户 CREATE USER test1@'127.0.0.1' IDENTIFIED BY 'Test1+12345'; -- 给数据库分配所有权 GRANT ALL PRIVILEGES ON db1.* TO test1@'127.0.0.1'; -- 执行测试的t_user.sql语句 source /root/t_user.sql; ``` ## 6. 配置MGR ### 6.1.修改host文件 修改每台机器的host文件,确保可以正常解析主机名. mysql> select * from performance_schema.replication_group_members; MySQL MGR members记录使用了服务器的host,而不是IP. ![修改host文件](/public/05/05-mysql-mgr-02.jpg) ### 6.2.创建复制用户(第一个Master执行) 创建复制用户binlog不能记录,否则会引起START GROUP_REPLICATION执行报错.先关闭binlog日志,添加好再打开binlog日志. ```sql #SET SQL_LOG_BIN=0; create user 'mysqlmgruser'@'%' identified with 'mysql_native_password' by '5j2;hoqbkbUF'; grant replication slave on *.* to 'mysqlmgruser'@'%' with grant option; #FLUSH PRIVILEGES; #SET SQL_LOG_BIN=1; reset master; change master to master_user='mysqlmgruser',master_password='5j2;hoqbkbUF' for channel 'group_replication_recovery'; ``` ### 6.3.安装group replication插件(三台都要安装) ```sql install PLUGIN group_replication SONAME 'group_replication.so'; -- 查看group replication组件 show plugins; ``` | Name | Status |Type |Library |License | | ---- | ---- |------------ |---- |---- | | group_replication| ACTIVE |GROUP REPLICATION| group_replication.so |GPL | ### 6.4.配置 **[下载MGR配置示例](/public/05/mgr-my.cnf)** ### 6.5.启动group replication -- 设置group_replication_bootstrap_group为ON是为了标示作为首个节点启动mgr集群,以后加入的就不能设置.如果是开机自启动,start group_replication就无需执行了. ``` mysql> set global group_replication_bootstrap_group=ON; mysql> start group_replication; mysql> set global group_replication_bootstrap_group=OFF; ``` ### 6.6.查看MGR状态 -- 查询表performance_schema.replication_group_members mysql> select * from performance_schema.replication_group_members; +---------------------------+--------------------------------------+------------------+-------------+--------------+ | CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE | +---------------------------+--------------------------------------+------------------+-------------+--------------+ | group_replication_applier | 1d105439-d041-11e7-960b-fa163ebd7d48 | host-10-0-70-125 | 3306 | ONLINE | +---------------------------+--------------------------------------+------------------+-------------+--------------+ ### 6.7.添加组内其他成员 参照以上过程和配置,配置其他组的MySQL,记得修改my.cnf ``` server-id loose-group_replication_local_address loose-group_replication_group_seeds ``` 启动group replication 时,执行命令是: mysql> start group_replication; 若失败可以强制加入复制组,MySQL8不再支持支持这个参数 mysql> set global group_replication_allow_local_disjoint_gtids_join=ON; 其他不变 ### 6.8.错误问题及汇总 参考:https://blog.csdn.net/wangxiaotongfan/article/details/81870258 1.错误案例01 错误信息:2017-07-15T01:36:06.929941Z 4 [ERROR] Plugin group_replication reported: 'The group name 'group-replication-test' is not a valid UUID' 错误原因:loose-group_replication_group_name参数没有按照UUID格式指定,被认为设置该参数无效 解决方案:更改loose-group_replication_group_name参数值为,loose-group_replication_group_name="2faa74b8-66bb-11e7-ab7e-b8ca3a6a61ec" 2.错误案例02 错误信息: 2017-07-15T01:29:27.271909Z 0 [Warning] unknown variable 'loose-group_replication_group_name=group-replication-test' 2017-07-15T01:29:27.271926Z 0 [Warning] unknown variable 'loose-group_replication_start_on_boot=off' 2017-07-15T01:29:27.271930Z 0 [Warning] unknown variable 'loose-group_replication_local_address=10.26.7.129:3306' 2017-07-15T01:29:27.271935Z 0 [Warning] unknown variable 'loose-group_replication_group_seeds=10.26.7.129:3306,10.26.7.142:3306,10.26.7.166:3306' 2017-07-15T01:29:27.271939Z 0 [Warning] unknown variable 'loose-group_replication_bootstrap_group=off' 错误原因:因为先设置了这些参数,而没有装group_replication插件,导致数据库实例无法识别这些参数 解决方案:安装group replication插件,install plugin group_replication soname 'group_replication.so'; (uninstall plugin group_replication 卸载,show plugins查看) 3.错误案例03 错误信息: 2017-07-15T01:54:54.447829Z 0 [Note] Plugin group_replication reported: 'Unable to bind to 0.0.0.0:3306 (socket=60, errno=98)!' 2017-07-15T01:54:54.447948Z 0 [ERROR] Plugin group_replication reported: 'Unable to announce tcp port 3306. Port already in use?' 2017-07-15T01:54:54.448101Z 0 [ERROR] Plugin group_replication reported: '[GCS] Error joining the group while waiting for the network layer to become ready.' 错误原因:配置的组复制监听端口和MYSQL实例端口冲突 解决方案:调整下面参数 loose-group_replication_local_address ="10.26.7.129:24001" #不同节点配置不同节点本身的IP地址和端口,区分MYSQL自身的3306端口 loose-group_replication_group_seeds ="10.26.7.129:24001,10.26.7.142:24001,10.26.7.166:24001" 4.错误案例04 错误信息: 2017-07-15T04:20:01.249529Z 21 [ERROR] Slave I/O for channel 'group_replication_recovery': error connecting to master 'rpl_user@node2:3306' - retry-time: 60 retries: 1, Error_code: 2005 错误原因:没有配置DNS解析或者hosts解析,节点无法连接其他数据库 解决方案:配置hosts解析,每个节点/etc/hosts添加如下内容 ``` 10.26.7.166 node3 10.26.7.142 node2 10.26.7.129 node1 ``` 5.错误案例05 错误信息 2017-07-15T03:42:45.395407Z 288 [ERROR] Slave SQL for channel 'group_replication_recovery': Error 'Can't create database 'db01'; database exists' on query. Default database: 'db01'. Query: 'create database db01', Error_code: 1007 2017-07-15T03:42:45.395472Z 288 [Warning] Slave: Can't create database 'db01'; database exists Error_code: 1007 2017-07-15T03:42:45.395503Z 288 [ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'mysql-bin.000005' position 434 错误原因:这个错误是由于节点由于各种原因退出mgr组,后面又加入了mgr组,但之前存在的数据依旧存在 解决方案:删除要加入组复制节点存在的数据库即可,但其他节点不是主写节点,需要先调整参数set global super_read_only=0;然后执行drop database db01;再重新加入组 set global group_replication_allow_local_disjoint_gtids_join=ON; start group_replication; 6.错误案例06 错误信息: 2017-07-15T03:44:09.982428Z 18 [ERROR] Slave SQL for channel 'group_replication_recovery': Error 'Can't create database 'db01'; database exists' on query. Default database: 'db01'. Query: 'create database db01', Error_code: 1007 2017-07-15T03:44:09.982493Z 18 [Warning] Slave: Can't create database 'db01'; database exists Error_code: 1007 2017-07-15T03:44:09.982522Z 18 [ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'mysql-bin.000005' position 434 错误原因:同上错误案例05 解决方案:同上错误案例05 7.错误案例07 错误信息: 2017-07-15T03:49:10.370846Z 0 [ERROR] Plugin group_replication reported: 'This member has more executed transactions than those present in the group. Local transactions: 2faa74b8-66bb-11e7-ab7e-b8ca3a6a61ec:1-4, 35e38786-66bb-11e7-bcc3-b8ca3a6a61a4:1 Group transactions: 2faa74b8-66bb-11e7-ab7e-b8ca3a6a61ec:1-16' 错误原因:同上错误案例05,在从库执行了多余的事务 解决方案:同上错误案例05,直接重新加入节点即可 set global group_replication_allow_local_disjoint_gtids_join=ON; start group_replication; 8.错误案例08 错误信息 ERROR 3098 (HY000): The table does not comply with the requirements by an external plugin. 错误原因:由于主节点创建了表t1,但没有指定主键(此时表结构可以复制到各节点,一旦插入数据DML操作即会报错) 解决方案:为表增加主键,然后做DML操作(MGR需要各表都有主键) alter table t1 add primary key(id); insert into t1 values(1),(2); 9.错误案例09 错误信息: mysqldump -R -E --triggers --single-transaction --master-data=2 -B db01 >db01.sql Warning: A partial dump from a server that has GTIDs will by default include the GTIDs of all transactions, even those that changed suppressed parts of the database. If you don't want to restore GTIDs, pass --set-gtid-purged=OFF. To make a complete dump, pass --all-databases --triggers --routines --events. mysqldump: Couldn't execute 'SAVEPOINT sp': The MySQL server is running with the --transaction-write-set-extraction!=OFF option so it cannot execute this statement (1290) 错误原因:mgr不支持mysqldump的事务一致性备份,因为其不支持savepoint 解决方案:通过xtrabackup或者不加--single-transaction备份 10.错误案例10 错误信息: create table t2 as select * from t1; ERROR 1786 (HY000): Statement violates GTID consistency: CREATE TABLE ... SELECT. 错误原因:配置MGR,开启了GTID,所有GTID不支持的操作,MGR也不支持 解决方案:使用create table t2 like t1; insert into t2 select * from t;分开两个事务执行 ## 7. ~~数据库主从~~(建议使用MGR) ### 7.1.安装从数据库 按照上述步骤,在另一台服务器上安装mysql数据库,并创建db1测试数据库. ### 7.2.master数据库 主数据库服务器:10.0.8.137,从数据库服务器:10.0.8.138 修改主服务器的配置文件: vi /etc/my.cnf 添加以下内容: ``` log_bin = mysql-bin log_bin_index = mysql-bin.index server-id=1 default_password_lifetime=0 expire_logs_days=300 max_connections=5000 character_set_erver=utf8mb4 ##需要同步的数据库名字,如果是多个,就以此格式再写一行即可 ##建议暂时不要使用 #binlog-do-db = db1 #binlog-ignore-db=mysql ``` log-bin 配置的是开启二进制日志,并将日志写在配置的路径上.记录的日志将以mysql-bin.000001 的方式进行记录. log-bin-index 配置的是二进制日志记录文件的目录.该文件中每一行都是二进制日志文件的路径. server-id是该MySQL服务器的服务ID,用于区分在主从配置中的其他服务器,配置成功后,重启mysqld服务. 注意:如果不定义log-bin的文件名,则会以hostname主机名命名,一旦主机名修改,则会因为找不见二进制文件报错.所以最好还是显示指定文件名. ### 7.3.slave数据库 修改slave的/etc/my.cnf配置文件,在[mysqld]下添加如下配置: ``` server-id = 2 relay_log = relay-bin relay_log_index = relay-bin.index read_only=on default_password_lifetime=0 expire_logs_days=300 max_connections=1000 character-set-server=utf8mb4 ##需要同步的数据库名字,如果是多个,就以此格式再写一行即可 ##建议暂时不要使用 #replicate-do-db = db1 ##不需要同步的数据库名字,如果是多个,就以此格式再写一行即可 ##建议暂时不要使用 #replicate-ignore-db = mysql #需要注释二进制日志 #log_bin = mysql-bin ``` server_id和master意义一样,用于区分不同的mysql服务器 relay_log用以记录收到的中继二进制日志. relay_log_index用以保存收到的日志路径索引.如果不显示指定文件名,则以hostname值命名.保存后可以重启mysqld服务. read_only=on 是保证从库是只读状态,避免手误进行写入操作,造成数据异常. ### 7.4.连接Master服务器 创建在master上创建备份用户 mysql>GRANT REPLICATION SLAVE ON *.* TO 'backup'@'10.0.8.138' IDENTIFIED BY 'Backup+1qazxsw2'; mysql>FLUSH PRIVILEGES; 在slave上启动slave与master的连接.在slave上登录到mysql,在mysql>:下进行操作 mysql> stop slave; //停止slave 同步 mysql> change master to master_host = '10.0.8.137', master_user='backup', master_password='Backup+1qazxsw2', MASTER_AUTO_POSITION = 1; mysql> start slave; // 开始slave 复制 检查主从是否配置成功,在slave的服务器上查看 mysql> show slave status\G; 在打印出来的状态下,查看以下两个参数的值: ``` Slave_IO_Running: Yes Slave_SQL_Running: Yes ``` 如果这两个参数的值都是yes,则运行正常,可以在master数据库上添加一个数据库或者添加一张表,检查slave数据库上是否存在,如果这两个参数有任何一个不是 Yes,则说明存在问题.可以查看slave上的数据库错误日志文件查看错误原因. ### 7.5.常见错误原因 1.server-id一致 2.用户权限不够.需要的权限包括: ```sql GRANT REPLICATION SLAVE,RELOAD,CREATE USER,SUPER ON *.* TO USER@'IP_ADDR' IDENTIFIED BY 'PASSWORD' WITH GRANT OPTION; ``` 3.数据库UUID一致,如果mysql安装时是通过批量复制安装的,则有可能数据库的UUID一致,进入数据库的datadir目录,修改auto.cnf.随意修改下uuid中的值,重启mysqld服务即可 4.Master数据库端口被防火墙阻挡 ### 7.6.主从配置文件 **[下载主从配置示例](/public/05/m-s-my.cnf)** ## 8. 常用维护 show variables like '%log_bin%'; //master数据库,log_bin=on show master status; //master数据库 show slave status; //slave数据库 开机自启动 chkconfig add mysql on 自动备份: 在my.cnf中添加自动备份使用的账号密码, ``` [mysqldump] quick max_allowed_packet = 16M user=root password=Root+1qazxsw2 ``` 备份脚本: ```shell set-gtid-purged=OFF #(不备份GTID) /usr/bin/mysqldump --set-gtid-purged=OFF --add-drop-table db1 | gzip > /data/backup/db1_$(date +%Y%m%d%H%m%S).sql.gz #删除db1数据库100天以前的备份. find /data/backup -type f -mtime +100 -name db1*.sql.gz -exec rm -rf {} ; #删除所有数据库100天以前的备份 find /data/backup -type f -mtime +100 -name *.sql.gz -exec rm -rf {} ; ``` 定时任务 ```shell crontab -e 0 2 * * * /bin/sh /data/mysqlbackup. ```