A backup taken with innobackupex failed with the following error:
xtrabackup: error: log block numbers mismatch:
xtrabackup: error: expected log block no. 665497466, but got no. 673689450 from the log file.
xtrabackup: error: it looks like InnoDB log has wrapped around before xtrabackup could process all records due to either log copying being too slow, or log files being too small.
xtrabackup: Error: xtrabackup_copy_logfile() failed.
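The innobackupex output already explains why the backup failed:
log block numbers mismatch: during the backup, xtrabackup tried to read a specific redo block and could not find it;
expected log block no. 665497466, but got no. 673689450 from the log file: the redo block it should have read was no. 665497466, but the block it actually got was no. 673689450;
it looks like InnoDB log has wrapped around before xtrabackup could process all records due to either log copying being too slow, or log files being too small:
the redo read failed either because innobackupex was copying redo too slowly or because the redo files are too small. In both cases the copy could not keep up with the rate at which the redo files were being reused, so the required redo block had already been overwritten before it was read.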
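I checked the I/O at the time the backup failed. Write I/O was indeed fairly high, but disk utilization (%util) was only 10% and CPU iowait was 0, so the disk was lightly loaded and I/O performance was not the problem;
I also checked the redo files: there are four of them, 1G each, and the instance was doing little DML at the time of the backup, so file size should not have been the cause of the failure.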
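As a rough sanity check, the two block numbers in the error message can be compared with the size of the redo log group. The sketch below assumes 512-byte redo log blocks and a 2048-byte header (4 blocks) at the start of each redo file, which is how InnoDB lays out its redo files as far as I know. The exact file size of 1000 MiB is a guess, since the only figure given above is "1G". The helper name `group_capacity_blocks` is made up for illustration.

```python
BLOCK_SIZE = 512          # bytes per redo log block (assumed InnoDB value)
FILE_HEADER_BLOCKS = 4    # assumed 2048-byte header at the start of each redo file

expected_block = 665497466   # "expected log block no." from the error message
got_block = 673689450        # "but got no." from the error message

# Distance between the block xtrabackup wanted and the block it found.
gap_blocks = got_block - expected_block
gap_bytes = gap_blocks * BLOCK_SIZE


def group_capacity_blocks(n_files, file_size_bytes):
    """Usable data blocks in a redo log group (hypothetical helper)."""
    return n_files * (file_size_bytes // BLOCK_SIZE - FILE_HEADER_BLOCKS)


# Assumption: four redo files of 1000 MiB each.
capacity = group_capacity_blocks(4, 1000 * 1024 * 1024)

print(gap_blocks)                    # 8191984
print(round(gap_bytes / 2**30, 2))   # 3.91 (GiB)
print(gap_blocks == capacity)        # True
```

Under these assumptions the gap is exactly one full lap of the redo log group, which would be consistent with the "wrapped around" wording in the error: the position that used to hold block 665497466 had been reused once by the time it was read.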
Somewhat at a loss, I went through the official xtrabackup documentation and found this note:
Log copying thread checks the transactional log every second to see if there were any new log records written
that need to be copied, but there is a chance that the log copying thread might not be able to keep up with the amount
of writes that go to the transactional logs, and will hit an error when the log records are overwritten before they could
be read.
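So innobackupex checks the redo log at one-second intervals, and it is quite possible that MySQL overwrote part of the redo files within that one second. For example, the next block to read was no. 665497466, but a second later it had been overwritten, and when the copy thread came back to read it, the block it found was already no. 673689450.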
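The race described above can be illustrated with a toy model. This is only a sketch of a polling reader chasing a circular log, not xtrabackup's actual implementation. The function name `first_overwrite` and all of its parameters are hypothetical, and the unit is an abstract "block".

```python
def first_overwrite(capacity_blocks, writes_per_sec, copy_per_poll,
                    poll_interval=1, duration=60):
    """Return the second at which the copier finds its next block
    already overwritten, or None if it keeps up for `duration` seconds."""
    written = 0   # total blocks the server has written so far
    copied = 0    # total blocks the copier has saved so far
    for now in range(poll_interval, duration + 1, poll_interval):
        # The server keeps writing between two polls.
        written += writes_per_sec * poll_interval
        # A circular log only retains the newest `capacity_blocks` blocks.
        oldest_available = max(0, written - capacity_blocks)
        if copied < oldest_available:
            # The next block the copier needs has been overwritten.
            return now
        # The copier saves at most `copy_per_poll` blocks per wake-up.
        copied = min(written, copied + copy_per_poll)
    return None


print(first_overwrite(100, 10, 20))     # None: the copier keeps up
print(first_overwrite(100, 30, 20))     # 9: the copier slowly falls behind
print(first_overwrite(100, 150, 1000))  # 1: a burst laps the log within one poll
```

The last call is the scenario suggested above: even a copier with plenty of throughput loses blocks if more than one full log's worth of redo is written between two consecutive polls.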