05.MySQL-MGR和主从分离
文章目录
1. 参考资料
http://blog.csdn.net/jssg_tzw/article/details/69791330
http://blog.itpub.net/27067062/viewspace-2142098
http://www.sohu.com/a/165178082_505827
2. 卸载MySQL
#最简单的方法就是yum卸载自带的MySQL5.1
yum remove mysql mysql-server mysql-libs mariadb-libs compat-mysql51
rm -rf /var/lib/mysql
rm /etc/my.cnf
#查看是否还有mysql软件:
rpm -qa|grep mysql
#有的话继续删除,这样会把系统定时任务的 crontab也删除,需要的话需要重新安装.
yum install vixie-cron crontabs
systemctl enable crond #设置crond服务开机自动启动
systemctl start crond #启动crond服务
3. 安装MySQL
安装MySQL的依赖包
yum install libaio libaio-devel numactl net-tools perl perl-devel openssl openssl-devel
下载RPM整体包 https://dev.mysql.com/get/Downloads/MySQL-8.4/mysql-8.4.2-1.el9.x86_64.rpm-bundle.tar
这个压缩包里包含所有的MySQL的RPM包,安装rpm包
rpm -ivh mysql-community-client-plugins-8.4.2-1.el9.x86_64.rpm mysql-community-common-8.4.2-1.el9.x86_64.rpm mysql-community-libs-8.4.2-1.el9.x86_64.rpm mysql-community-icu-data-files-8.4.2-1.el9.x86_64.rpm mysql-community-client-8.4.2-1.el9.x86_64.rpm mysql-community-server-8.4.2-1.el9.x86_64.rpm mysql-community-devel-8.4.2-1.el9.x86_64.rpm
卸载
yum remove mysql-community-client-plugins-8.4.2-1.el9 mysql-community-common-8.4.2-1.el9 mysql-community-server-8.4.2-1.el9 mysql-community-icu-data-files-8.4.2-1.el9
4. 启动MySQL
##修改/etc/selinux/config文件中设置SELINUX=disabled
setenforce 0
#删除上次的遗留文件
rm -rf /var/lib/mysql/*
#修改日志文件权限
echo "" > /var/log/mysqld.log
chown -R mysql:mysql /var/log/mysqld.log
#初始化mysql,安装错误可以删除data目录,重新初始化,加上lower_case_table_names就会启动异常,需要在/etc/my.conf里配置 lower_case_table_names=1
mysqld --initialize --lower_case_table_names=1
#mysqld --initialize
#修改文件权限
chown -R mysql:mysql /var/lib/mysql
#如果做了软连接,修改实际目录的文件权限,/var/lib/mysql,mysql-files,mysql-keyring 三个文件夹
#chown -R mysql:mysql /data/mysql
#启动MySQL
systemctl enable mysqld
systemctl start mysqld
#查找初始化密码
grep password /var/log/mysqld.log
#登录MySQL: mysql -uroot -p初始化密码
#更新root密码为Root+1qazxsw2
#SET PASSWORD=PASSWORD('Root+1qazxsw2');
ALTER USER 'root'@'localhost' IDENTIFIED BY 'Root+1qazxsw2';
#ALTER USER 'root'@'%' IDENTIFIED BY 'Root+1qazxsw2';
#不限制root用户的host
update mysql.user set host = '%' where user = 'root';
flush privileges;
5. 创建测试的数据库(非必须)
-- 创建数据库
create database db1;
-- 创建用户
CREATE USER test1@'127.0.0.1' IDENTIFIED BY 'Test1+12345';
-- 给数据库分配所有权
GRANT ALL PRIVILEGES ON db1.* TO test1@'127.0.0.1';
-- 执行测试的t_user.sql语句
source /root/t_user.sql;
6. 配置MGR
6.1.修改host文件
修改每台机器的host文件,确保可以正常解析主机名.
mysql> select * from performance_schema.replication_group_members;
MySQL MGR members记录使用了服务器的host,而不是IP.
6.2.创建复制用户(第一个Master执行)
创建复制用户binlog不能记录,否则会引起START GROUP_REPLICATION执行报错.先关闭binlog日志,添加好再打开binlog日志.
#SET SQL_LOG_BIN=0;
create user 'mysqlmgruser'@'%' identified with 'mysql_native_password' by '5j2;hoqbkbUF';
grant replication slave on *.* to 'mysqlmgruser'@'%' with grant option;
#FLUSH PRIVILEGES;
#SET SQL_LOG_BIN=1;
reset master;
change master to master_user='mysqlmgruser',master_password='5j2;hoqbkbUF' for channel 'group_replication_recovery';
6.3.安装group replication插件(三台都要安装)
install PLUGIN group_replication SONAME 'group_replication.so';
-- 查看group replication组件
show plugins;
Name | Status | Type | Library | License |
---|---|---|---|---|
group_replication | ACTIVE | GROUP REPLICATION | group_replication.so | GPL |
6.4.配置
6.5.启动group replication
-- 设置group_replication_bootstrap_group为ON是为了标示作为首个节点启动mgr集群,以后加入的就不能设置.如果是开机自启动,start group_replication就无需执行了.
mysql> set global group_replication_bootstrap_group=ON;
mysql> start group_replication;
mysql> set global group_replication_bootstrap_group=OFF;
6.6.查看MGR状态
-- 查询表performance_schema.replication_group_members
mysql> select * from performance_schema.replication_group_members;
+---------------------------+--------------------------------------+------------------+-------------+--------------+
| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+------------------+-------------+--------------+
| group_replication_applier | 1d105439-d041-11e7-960b-fa163ebd7d48 | host-10-0-70-125 | 3306 | ONLINE |
+---------------------------+--------------------------------------+------------------+-------------+--------------+
6.7.添加组内其他成员
参照以上过程和配置,配置其他组的MySQL,记得修改my.cnf
server-id
loose-group_replication_local_address
loose-group_replication_group_seeds
启动group replication 时,执行命令是:
mysql> start group_replication;
若失败可以强制加入复制组,MySQL8不再支持支持这个参数
mysql> set global group_replication_allow_local_disjoint_gtids_join=ON;
其他不变
6.8.错误问题及汇总
参考:https://blog.csdn.net/wangxiaotongfan/article/details/81870258
1.错误案例01
错误信息:2017-07-15T01:36:06.929941Z 4 [ERROR] Plugin group_replication reported: 'The group name 'group-replication-test' is not a valid UUID'
错误原因:loose-group_replication_group_name参数没有按照UUID格式指定,被认为设置该参数无效
解决方案:更改loose-group_replication_group_name参数值为,loose-group_replication_group_name="2faa74b8-66bb-11e7-ab7e-b8ca3a6a61ec"
2.错误案例02
错误信息:
2017-07-15T01:29:27.271909Z 0 [Warning] unknown variable 'loose-group_replication_group_name=group-replication-test'
2017-07-15T01:29:27.271926Z 0 [Warning] unknown variable 'loose-group_replication_start_on_boot=off'
2017-07-15T01:29:27.271930Z 0 [Warning] unknown variable 'loose-group_replication_local_address=10.26.7.129:3306'
2017-07-15T01:29:27.271935Z 0 [Warning] unknown variable 'loose-group_replication_group_seeds=10.26.7.129:3306,10.26.7.142:3306,10.26.7.166:3306'
2017-07-15T01:29:27.271939Z 0 [Warning] unknown variable 'loose-group_replication_bootstrap_group=off'
错误原因:因为先设置了这些参数,而没有装group_replication插件,导致数据库实例无法识别这些参数
解决方案:安装group replication插件,install plugin group_replication soname 'group_replication.so'; (uninstall plugin group_replication 卸载,show plugins查看)
3.错误案例03
错误信息:
2017-07-15T01:54:54.447829Z 0 [Note] Plugin group_replication reported: 'Unable to bind to 0.0.0.0:3306 (socket=60, errno=98)!'
2017-07-15T01:54:54.447948Z 0 [ERROR] Plugin group_replication reported: 'Unable to announce tcp port 3306. Port already in use?'
2017-07-15T01:54:54.448101Z 0 [ERROR] Plugin group_replication reported: '[GCS] Error joining the group while waiting for the network layer to become ready.'
错误原因:配置的组复制监听端口和MYSQL实例端口冲突
解决方案:调整下面参数
loose-group_replication_local_address ="10.26.7.129:24001" #不同节点配置不同节点本身的IP地址和端口,区分MYSQL自身的3306端口
loose-group_replication_group_seeds ="10.26.7.129:24001,10.26.7.142:24001,10.26.7.166:24001"
4.错误案例04
错误信息:
2017-07-15T04:20:01.249529Z 21 [ERROR] Slave I/O for channel 'group_replication_recovery': error connecting to master 'rpl_user@node2:3306' - retry-time: 60 retries: 1, Error_code: 2005
错误原因:没有配置DNS解析或者hosts解析,节点无法连接其他数据库
解决方案:配置hosts解析,每个节点/etc/hosts添加如下内容
10.26.7.166 node3
10.26.7.142 node2
10.26.7.129 node1
5.错误案例05
错误信息
2017-07-15T03:42:45.395407Z 288 [ERROR] Slave SQL for channel 'group_replication_recovery': Error 'Can't create database 'db01'; database exists' on query. Default database: 'db01'. Query: 'create database db01', Error_code: 1007
2017-07-15T03:42:45.395472Z 288 [Warning] Slave: Can't create database 'db01'; database exists Error_code: 1007
2017-07-15T03:42:45.395503Z 288 [ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'mysql-bin.000005' position 434
错误原因:这个错误是由于节点由于各种原因退出mgr组,后面又加入了mgr组,但之前存在的数据依旧存在
解决方案:删除要加入组复制节点存在的数据库即可,但其他节点不是主写节点,需要先调整参数set global super_read_only=0;然后执行drop database db01;再重新加入组
set global group_replication_allow_local_disjoint_gtids_join=ON;
start group_replication;
6.错误案例06
错误信息:
2017-07-15T03:44:09.982428Z 18 [ERROR] Slave SQL for channel 'group_replication_recovery': Error 'Can't create database 'db01'; database exists' on query. Default database: 'db01'. Query: 'create database db01', Error_code: 1007
2017-07-15T03:44:09.982493Z 18 [Warning] Slave: Can't create database 'db01'; database exists Error_code: 1007
2017-07-15T03:44:09.982522Z 18 [ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'mysql-bin.000005' position 434
错误原因:同上错误案例05
解决方案:同上错误案例05
7.错误案例07
错误信息:
2017-07-15T03:49:10.370846Z 0 [ERROR] Plugin group_replication
reported: 'This member has more executed transactions than those
present in the group. Local transactions:
2faa74b8-66bb-11e7-ab7e-b8ca3a6a61ec:1-4, 35e38786-66bb-11e7-bcc3-b8ca3a6a61a4:1
Group transactions: 2faa74b8-66bb-11e7-ab7e-b8ca3a6a61ec:1-16'
错误原因:同上错误案例05,在从库执行了多余的事务
解决方案:同上错误案例05,直接重新加入节点即可
set global group_replication_allow_local_disjoint_gtids_join=ON;
start group_replication;
8.错误案例08
错误信息
ERROR 3098 (HY000): The table does not comply with the requirements by an external plugin.
错误原因:由于主节点创建了表t1,但没有指定主键(此时表结构可以复制到各节点,一旦插入数据DML操作即会报错)
解决方案:为表增加主键,然后做DML操作(MGR需要各表都有主键)
alter table t1 add primary key(id);
insert into t1 values(1),(2);
9.错误案例09
错误信息:
mysqldump -R -E --triggers --single-transaction --master-data=2 -B db01 >db01.sql
Warning: A partial dump from a server that has GTIDs will by default include the GTIDs of all transactions, even those that changed suppressed parts of the database. If you don't want to restore GTIDs, pass --set-gtid-purged=OFF. To make a complete dump, pass --all-databases --triggers --routines --events.
mysqldump: Couldn't execute 'SAVEPOINT sp': The MySQL server is running with the --transaction-write-set-extraction!=OFF option so it cannot execute this statement (1290)
错误原因:mgr不支持mysqldump的事务一致性备份,因为其不支持savepoint
解决方案:通过xtrabackup或者不加--single-transaction备份
10.错误案例10
错误信息:
create table t2 as select * from t1;
ERROR 1786 (HY000): Statement violates GTID consistency: CREATE TABLE ... SELECT.
错误原因:配置MGR,开启了GTID,所有GTID不支持的操作,MGR也不支持
解决方案:使用create table t2 like t1; insert into t2 select * from t;分开两个事务执行
7. 数据库主从(建议使用MGR)
7.1.安装从数据库
按照上述步骤,在另一台服务器上安装mysql数据库,并创建db1测试数据库.
7.2.master数据库
主数据库服务器:10.0.8.137,从数据库服务器:10.0.8.138
修改主服务器的配置文件:
vi /etc/my.cnf
添加以下内容:
log_bin = mysql-bin
log_bin_index = mysql-bin.index
server-id=1
default_password_lifetime=0
expire_logs_days=300
max_connections=5000
character_set_erver=utf8mb4
##需要同步的数据库名字,如果是多个,就以此格式再写一行即可
##建议暂时不要使用
#binlog-do-db = db1
#binlog-ignore-db=mysql
log-bin 配置的是开启二进制日志,并将日志写在配置的路径上.记录的日志将以mysql-bin.000001 的方式进行记录.
log-bin-index 配置的是二进制日志记录文件的目录.该文件中每一行都是二进制日志文件的路径.
server-id是该MySQL服务器的服务ID,用于区分在主从配置中的其他服务器,配置成功后,重启mysqld服务.
注意:如果不定义log-bin的文件名,则会以hostname主机名命名,一旦主机名修改,则会因为找不见二进制文件报错.所以最好还是显示指定文件名.
7.3.slave数据库
修改slave的/etc/my.cnf配置文件,在[mysqld]下添加如下配置:
server-id = 2
relay_log = relay-bin
relay_log_index = relay-bin.index
read_only=on
default_password_lifetime=0
expire_logs_days=300
max_connections=1000
character-set-server=utf8mb4
##需要同步的数据库名字,如果是多个,就以此格式再写一行即可
##建议暂时不要使用
#replicate-do-db = db1
##不需要同步的数据库名字,如果是多个,就以此格式再写一行即可
##建议暂时不要使用
#replicate-ignore-db = mysql
#需要注释二进制日志
#log_bin = mysql-bin
server_id和master意义一样,用于区分不同的mysql服务器
relay_log用以记录收到的中继二进制日志.
relay_log_index用以保存收到的日志路径索引.如果不显示指定文件名,则以hostname值命名.保存后可以重启mysqld服务.
read_only=on 是保证从库是只读状态,避免手误进行写入操作,造成数据异常.
7.4.连接Master服务器
创建在master上创建备份用户
mysql>GRANT REPLICATION SLAVE ON . TO 'backup'@'10.0.8.138' IDENTIFIED BY 'Backup+1qazxsw2';
mysql>FLUSH PRIVILEGES;
在slave上启动slave与master的连接.在slave上登录到mysql,在mysql>:下进行操作
mysql> stop slave; //停止slave 同步
mysql> change master to master_host = '10.0.8.137', master_user='backup', master_password='Backup+1qazxsw2', MASTER_AUTO_POSITION = 1;
mysql> start slave; // 开始slave 复制
检查主从是否配置成功,在slave的服务器上查看
mysql> show slave status\G;
在打印出来的状态下,查看以下两个参数的值:
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
如果这两个参数的值都是yes,则运行正常,可以在master数据库上添加一个数据库或者添加一张表,检查slave数据库上是否存在,如果这两个参数有任何一个不是 Yes,则说明存在问题.可以查看slave上的数据库错误日志文件查看错误原因.
7.5.常见错误原因
1.server-id一致
2.用户权限不够.需要的权限包括:
GRANT REPLICATION SLAVE,RELOAD,CREATE USER,SUPER ON *.* TO USER@'IP_ADDR' IDENTIFIED BY 'PASSWORD' WITH GRANT OPTION;
3.数据库UUID一致,如果mysql安装时是通过批量复制安装的,则有可能数据库的UUID一致,进入数据库的datadir目录,修改auto.cnf.随意修改下uuid中的值,重启mysqld服务即可
4.Master数据库端口被防火墙阻挡
7.6.主从配置文件
8. 常用维护
show variables like '%log_bin%'; //master数据库,log_bin=on
show master status; //master数据库
show slave status; //slave数据库
开机自启动
chkconfig add mysql on
自动备份:
在my.cnf中添加自动备份使用的账号密码,
[mysqldump]
quick
max_allowed_packet = 16M
user=root
password=Root+1qazxsw2
备份脚本:
set-gtid-purged=OFF #(不备份GTID)
/usr/bin/mysqldump --set-gtid-purged=OFF --add-drop-table db1 | gzip > /data/backup/db1_$(date +%Y%m%d%H%m%S).sql.gz
#删除db1数据库100天以前的备份.
find /data/backup -type f -mtime +100 -name db1*.sql.gz -exec rm -rf {} ;
#删除所有数据库100天以前的备份
find /data/backup -type f -mtime +100 -name *.sql.gz -exec rm -rf {} ;
定时任务
crontab -e
0 2 * * * /bin/sh /data/mysqlbackup.
文章作者 springrain
上次更新 2024-09-10