《MySQL主库xtrabackup备份导致应用异常》遗留问题(一)

先从xtrabackup代码分析一下

# flush tables with read lock
        mysql_check();
        mysql_lockall() if !$option_no_lock;

因为mysql_check()去检查一下mysql是否还活着,如果活着,就调用mysql_lockall,这个案例中mysql_check()还活着,走到mysql_lockall

sub mysql_lockall {
    $now = current_time();
    print STDERR "$now  $prefix Starting to lock all tables...\n";
    mysql_send "USE mysql;";
    mysql_send "SET AUTOCOMMIT=0;";
    if (compare_versions($mysql_server_version, '4.0.22') == 0
        || compare_versions($mysql_server_version, '4.1.7') == 0) {
        # MySQL server version is 4.0.22 or 4.1.7
        mysql_send "COMMIT;";
        mysql_send "FLUSH TABLES WITH READ LOCK;";
    } else {
        # MySQL server version is other than 4.0.22 or 4.1.7
        mysql_send "FLUSH TABLES WITH READ LOCK;";
        mysql_send "COMMIT;";
    }

Mysql执行flush tables with read lock(简称FTWRL)

sub mysql_send {
    my $request = shift;
    $current_mysql_request = $request;
    print MYSQL_WRITER "$request\n";
    mysql_check();
    $current_mysql_request = '';
}

在xtrabackup mysql_open时,会执行:$mysql_pid = open(*MYSQL_WRITER, “| mysql $options >$mysql_stdout 2>$mysql_stderr “);来得到执行FTWRL的mysql线程ID,再执行mysql_check()

sub mysql_check {
    my $mysql_pid_copy = $mysql_pid;
 
    # send a dummy query to mysql child process
    $hello_id++;
    my $hello_message = "innobackup hello $hello_id";
    print MYSQL_WRITER "select '$hello_message';\n"
        or Die "Connection to mysql child process failed: $!";
 
    # wait for reply
    eval {
        local $SIG{ALRM} = sub { die "alarm clock restart" };
        my $stdout = '';
        my $stderr = '';
        alarm $mysql_response_timeout;
        while (index($stdout, $hello_message) < 0) {
            sleep 2;
            if ($mysql_pid && $mysql_pid == waitpid($mysql_pid, &WNOHANG)) {
                my $reason = `cat $mysql_stderr`;
                $mysql_pid = '';
                die "mysql child process has died: $reason";
            }
            $stdout = `cat $mysql_stdout`;
            $stderr = `cat $mysql_stderr | grep -v ^Warning`;
            if ($stderr) {
                # mysql has reported an error, do exit
                die "mysql error: $stderr";
            }
        }
        alarm 0;
    };
    if ($@ =~ /alarm clock restart/) {
        Die "Connection to mysql child process (pid=$mysql_pid_copy) timedout."
            . " (Time limit of $mysql_response_timeout seconds exceeded."
            . " You may adjust time limit by editing the value of parameter"
            . " \"\$mysql_response_timeout\" in this script.)";
} elsif ($@) { Die $@; }
$mysql_last_access_time = time();
}

因为eval块写出到mysql-stdout中,未找到“select $hello_message”,返回-1,进入while,但waitpid($mysql_pid, &WNOHANG)返回-1(FTWRL还在running),退出if。直到超时900S后,xtrabackup备份退出,退时调用override的Die函数

sub Die {
    my $message = shift;
    my $extra_info = '';
 
    # kill all child processes of this process
    kill_child_processes();
 
    if ($current_mysql_request) {
        $extra_info = " while waiting for reply to MySQL request:" .
            " '$current_mysql_request'";
    }
    die "$prefix Error: $message$extra_info";
}

会杀掉所有mysql的线程

sub kill_child_processes {
    if ($ibbackup_pid) {
        kill($kill_signal, $ibbackup_pid);
        $ibbackup_pid = '';
    }
 
    if ($mysql_pid) {
        kill($kill_signal, $mysql_pid);
        $mysql_pid = '';
    }
}

因为$mysql_pid就是执行FTWRL的mysql process id。


Post a Comment