bitscn.com
mysql_upgrade引起的master/slave replication中断解决
在生产环境master服务器上处理完《1548-cannot loadfrom mysql.proc. the table is probably corrupted》后,接到报警信息,slave服务器复制中断查看slave 状态
mysql>show slave status
发现如下语句执行错误
drop databaseif exists performance_schema
performance_schema是mysql自带的性能信息相关的库,mysql怎么会执行这个操作,看看错误日志吧
[root@db25522]# tail -n 500/data/my2/mysql/db25522.err
13051310:29:54 [note] error reading relay log event: slave sql thread was killed
13051310:29:54 [error] error reading packet from server: lost connection to mysqlserver during query ( server_errno=2013)
13051310:29:54 [note] slave i/o thread killed while reading event
13051310:29:54 [note] slave i/o thread exiting, read up to log 'mysql-bin.002734',position 1017737307
13051310:29:57 [note] slave i/o thread: connected to master 'repl@xxx.xxx.xxx.xxx:3306',replicationstarted in log 'mysql-bin.002734' at position 1017737307
13051310:29:57 [note] slave sql thread initialized, starting replication in log'mysql-bin.002734' at position 1017729051, relay log'/my/rlog/relay-bin.000764' position: 1017729197
13053015:00:53 [error] incorrect definition of table mysql.proc: expected column'comment' at position 15 to have type text, found type char(64).
13053015:00:53 [error] slave sql: query caused differenterrors on master and slave. error on master: message (format)='cannot load from mysql.%s. the tableis probably corrupted' error code=1548 ; error on slave: actual message='noerror', error code=0. default database: 'performance_schema'. query: 'drop database if exists performance_schema',error_code: 0
13053015:00:53 [warning] slave: cannot load from mysql.proc. the table is probablycorrupted error_code: 1548
13053015:00:53 [error] error running query, slave sql thread aborted. fix theproblem, and restart the slave sql thread with slave start. we stoppedat log 'mysql-bin.002947' position 721651903
莫非是数据不一致导致的?发现master服务器有performance_schema这个库,但是slave服务器没有。在执行 mysql_upgrade -uroot 之前,主从复制在运行,判断操作发生在mysql_upgrade-uroot之后,分析master上日志,在这个时间段内mysql进行了那些操作
[root@db25522]# mysqlbinlog --no-defaults --start-date='2013-05-3015:00:00' --end-date='2013-05-3015:03:00' mysql-bin.002947 >/root/tmp.log
查询日志发现
[root@db25522]#vi /root/tmp.log
/drop
/*!*/;
# at721651876
#13053015:00:25 server id 13084 end_log_pos721651903 xid = 435509540
commit/*!*/;
# at 721651903
#13053015:00:53 server id 13084 end_log_pos721652022 query thread_id=418930 exec_time=0 error_code=1548
settimestamp=1369897253/*!*/;
/*!/cutf8 *//*!*/;
set@@session.character_set_client=33,@@session.collation_connection=33,@@session.collation_server=28/*!*/;
drop database if exists performance_schema
/*!*/;
# at721652022
#13053015:00:53 server id 13084 end_log_pos721652152 query thread_id=418930 exec_time=0 error_code=0
settimestamp=1369897253/*!*/;
createdatabase performance_schema character set utf8
/*!*/;
# at721652152
#13053015:03:19 server id 13084 end_log_pos721652223 query thread_id=418956 exec_time=0 error_code=0
settimestamp=1369897399/*!*/;
/*!/cgbk *//*!*/;
set@@session.character_set_client=28,@@session.collation_connection=28,@@session.collation_server=28/*!*/;
begin
/*!*/;
# at721652223
#13053015:03:19 server id 13084 end_log_pos721652336 query thread_id=418956 exec_time=0 error_code=0
usepricedb/*!*/;
settimestamp=1369897399/*!*/;
好吧,上面红色部分,执行了这个操作,再看看slave错误日志
13053015:00:53 [error] incorrect definition of table mysql.proc: expected column'comment' at position 15 to have type text, found type char(64).
13053015:00:53 [error] slave sql: query caused different errors on master andslave. error on master: message(format)='cannot load from mysql.%s. the table is probably corrupted' errorcode=1548 ; error on slave: actual message='no error', error code=0. defaultdatabase: 'performance_schema'. query: 'drop database if existsperformance_schema', error_code: 0
13053015:00:53 [warning] slave: cannot load from mysql.proc. the table is probablycorrupted error_code: 1548
13053015:00:53 [error] error running query, slave sql thread aborted. fix the problem,and restart the slave sql thread with slave start. we stopped atlog 'mysql-bin.002947' position 721651903
5.5的日志错误还是很人性化的,slave停止时读取的binlog 日志文件,位置都很清楚。这为我们restart slave提供了方便。既然是drop database ifexists performance_schema 导致的错误,那么跳过这条event。
slave服务器:
mysql>show variables like '%skip%';
mysql>setglobal sql_slave_skip_counter =1;
mysql>slave start ;
复制正常
总结:复制虽然正常了。为什么 mysql_upgrade 会做 drop database if existsperformance_schema 这个操作?希望遇到类似问题的朋友,一起交流。
bitscn.com