mysqlfailover

mysqlfailover监控主备复制情况及故障转移,只支持5.6版本,要用到gtid_mode为ON

$ ./mysqlfailover --help
Usage: mysqlfailover --master=root@localhost --discover-slaves-login=root --candidates=root@host123:3306,root@host456:3306 
 
mysqlfailover - automatic replication health monitoring and failover
 
Options:
  --version             show program's version number and exit
  --help                
  --candidates=CANDIDATES
                        connection information for candidate slave servers for
                        failover in the form:
                        <user>[:<password>]@<host>[:<port>][:<socket>] or
                        <login-path>[:<port>][:<socket>]. Valid only with
                        failover command. List multiple slaves in comma-
                        separated list.
  --discover-slaves-login=DISCOVER
                        at startup, query master for all registered slaves and
                        use the user name and password specified to connect.
                        Supply the user and password in the form
                        <user>[:<password>] or <login-path>. For example,
                        --discover-slaves-login=joe:secret will use 'joe' as
                        the user and 'secret' as the password for each
                        discovered slave.
  --exec-after=EXEC_AFTER
                        name of script to execute after failover or switchover
  --exec-before=EXEC_BEFORE
                        name of script to execute before failover or
                        switchover
  --log=LOG_FILE        specify a log file to use for logging messages
  --log-age=LOG_AGE     specify maximum age of log entries in days. Entries
                        older than this will be purged on startup. Default = 7
                        days.
  --master=MASTER       connection information for master server in the form:
                        <user>[:<password>]@<host>[:<port>][:<socket>] or
                        <login-path>[:<port>][:<socket>]
  --max-position=MAX_POSITION
                        Used to detect slave delay. The maximum difference
                        between the master's log position and the slave's
                        reported read position of the master. A value greater
                        than this means the slave is too far behind the
                        master. Default is 0.
  --ping=PING           Number of ping attempts for detecting downed server.
  --seconds-behind=MAX_DELAY
                        Used to detect slave delay. The maximum number of
                        seconds behind the master permitted before slave is
                        considered behind the master. Default is 0.
  --slaves=SLAVES       connection information for slave servers in the form:
                        <user>[:<password>]@<host>[:<port>][:<socket>] or
                        <login-path>[:<port>][:<socket>]. List multiple slaves
                        in comma-separated list.
  --timeout=TIMEOUT     Maximum timeout in seconds to wait for each
                        replication command to complete. For example, timeout
                        for slave waiting to catch up to master. Default = 3.
  -i INTERVAL, --interval=INTERVAL
                        interval in seconds for polling the master for failure
                        and reporting health. Default = 15 seconds. Lowest
                        value is 5 seconds.
  -f FAILOVER_MODE, --failover-mode=FAILOVER_MODE
                        action to take when the master fails. 'auto' =
                        automatically fail to best slave, 'elect' = fail to
                        candidate list or if no candidate meets criteria fail,
                        'fail' = take no action and stop when master fails.
                        Default = 'auto'.
  --exec-fail-check=EXEC_FAIL
                        name of script to execute on each interval to invoke
                        failover
  --force               override the registration check on master for multiple
                        instances of the console monitoring the same master.
  --exec-post-failover=EXEC_POST_FAIL
                        name of script to execute after failover is complete
                        and the utility has refreshed the health report.
  --rediscover          Rediscover slaves on interval. Allows console to
                        detect when slaves have been removed or added.
  -v, --verbose         control how much information is displayed. e.g., -v =
                        verbose, -vv = more verbose, -vvv = debug
  --rpl-user=RPL_USER   the user and password for the replication user
                        requirement, in the form: <user>[:<password>] or
                        <login-path>. E.g. rpl:passwd - By default = none

注意1:需要权限还是有点大的,最后把all权限都给slave,当然这里为省事

$ ./mysqlfailover --master=slave:slave@xx.xx.xx.44:7038 --discover-slaves-login=slave:slave --log=log.txt
# Discovering slaves for master at xx.xx.xx.44:7038
WARNING: There are slaves that have not been registered with --report-host or --report-port.
ERROR: Query failed. 1227: Access denied; you need (at least one of) the SUPER privilege(s) for this operation
ERROR: Query failed. 1142: CREATE command denied to user 'slave'@'xx.xx.xx.xx' for table 'failover_console'
ERROR: Query failed. 1142: SELECT command denied to user 'slave'@'xx.xx.xx.xx' for table 'failover_console'
ERROR: Query failed. 1142: INSERT command denied to user 'slave'@'xx.xx.xx.xx' for table 'failover_console'

注意2:依赖的参数还是比较多的 gtid_mode(主备) master-info-repository(主) report-host(备) report-port(备)

$ ./mysqlfailover --master=slave:slave@xx.xx.xx.44:7038 --discover-slaves-login=slave:slave --log=log.txt
# Discovering slaves for master at xx.xx.xx.44:7038
Multiple instances of failover console found for master xx.xx.xx.44:7038.
If this is an error, restart the console with --force. 
Failover mode changed to 'FAIL' for this instance. 
Console will start in 10 seconds..........starting Console.
# Checking privileges.
ERROR: Failover requires --master-info-repository=TABLE for all slaves.

eg1:查看主备健康情况

$ ./mysqlfailover --master=slave:slave@xx.xx.xx.44:7038 --discover-slaves-login=slave:slave --log=log.txt -i 5
# Discovering slaves for master at xx.xx.xx.44:7038
Multiple instances of failover console found for master xx.xx.xx.44:7038.
If this is an error, restart the console with --force. 
Failover mode changed to 'FAIL' for this instance. 
Console will start in 10 seconds..........starting Console.
# Checking privileges.
 
MySQL Replication Failover Utility
Failover Mode = fail     Next Interval = Sun Mar 17 17:27:31 2013
 
NOTICE: Failover mode changed to fail due to another
        instance of the console running against master.
 
Master Information
------------------
Binary Log File   Position  Binlog_Do_DB  Binlog_Ignore_DB  
mysql-bin.000014  326                                       
 
GTID Executed Set
1BCC5C6A-814B-11E2-88A0-00188B77085E:1
 
Replication Health Status
+----------------+-------+---------+--------+------------+---------+
| host           | port  | role    | state  | gtid_mode  | health  |
+----------------+-------+---------+--------+------------+---------+
| xx.xx.xx.44   | 7038  | MASTER  | UP     | ON         | OK      |
| xx.xx.xx.11  | 7039  | SLAVE   | UP     | ON         | OK      |
+----------------+-------+---------+--------+------------+---------+
Q-quit R-refresh H-health G-GTID Lists U-UUIDs L-log entries
 
MySQL Replication Failover Utility
Failover Mode = fail     Next Interval = Sun Mar 17 17:27:31 2013
 
NOTICE: Failover mode changed to fail due to another
        instance of the console running against master.
 
Master Information
------------------
Binary Log File   Position  Binlog_Do_DB  Binlog_Ignore_DB  
mysql-bin.000014  326                                       
 
GTID Executed Set
1BCC5C6A-814B-11E2-88A0-00188B77085E:1
 
Replication Health Status
+----------------+-------+---------+--------+------------+---------+
| host           | port  | role    | state  | gtid_mode  | health  |
+----------------+-------+---------+--------+------------+---------+
| xx.xx.xx.44   | 7038  | MASTER  | UP     | ON         | OK      |
| xx.xx.xx.11  | 7039  | SLAVE   | UP     | ON         | OK      |
+----------------+-------+---------+--------+------------+---------+
 
Q-quit R-refresh H-health G-GTID Lists U-UUIDs L-log entries

按L键效果:

MySQL Replication Failover Utility
Failover Mode = fail     Next Interval = Sun Mar 17 17:27:31 2013
 
NOTICE: Failover mode changed to fail due to another
        instance of the console running against master.
 
Master Information
------------------
Binary Log File   Position  Binlog_Do_DB  Binlog_Ignore_DB  
mysql-bin.000014  326                                       
 
GTID Executed Set
1BCC5C6A-814B-11E2-88A0-00188B77085E:1
 
Log File
+-------------------------+-------------------------------------------------------------------------------+
| Date                    | Entry                                                                         |
+-------------------------+-------------------------------------------------------------------------------+
| 2013-03-17 12:13:21 PM  | INFO Discovering slaves for master at localhost:3306                          |
| 2013-03-17 12:13:21 PM  | INFO Registering instance on master.                                          |
| 2013-03-17 12:13:21 PM  | CRITICAL Topology must support global transaction ids and have GTID_MODE=ON.  |
| 2013-03-17 12:13:21 PM  | INFO Unregistering instance on master.                                        |
| 2013-03-17 12:13:21 PM  | INFO Failover console stopped.                                                |
| 2013-03-17 12:13:21 PM  | CRITICAL Topology must support global transaction ids and have GTID_MODE=ON.  |
| 2013-03-17 12:14:12 PM  | INFO Discovering slaves for master at localhost:3306                          |
| 2013-03-17 12:14:12 PM  | INFO Discovering slave at xxxxxx.hostname:3406                               |
| 2013-03-17 12:14:12 PM  | INFO Found slave: xxxxxx.hostname:3406                                       |
| 2013-03-17 12:14:12 PM  | INFO Discovering slave at xxxxxx.hostname:3506                               |
| 2013-03-17 12:14:12 PM  | INFO Found slave: xxxxxx.hostname:3506                                       |
| 2013-03-17 12:14:12 PM  | INFO Registering instance on master.                                          |
| 2013-03-17 12:14:12 PM  | CRITICAL Topology must support global transaction ids and have GTID_MODE=ON.  |
| 2013-03-17 12:14:12 PM  | INFO Unregistering instance on master.                                        |
| 2013-03-17 12:14:12 PM  | INFO Failover console stopped.                                                |
| 2013-03-17 12:14:12 PM  | CRITICAL Topology must support global transaction ids and have GTID_MODE=ON.  |
+-------------------------+-------------------------------------------------------------------------------+
Q-quit R-refresh H-health G-GTID Lists U-UUIDs L-log entries Up|Down-scroll

eg2:演示failover,当44挂掉时,11会升级为主库,继续提供服务

[Mysql@xx.hostname ~]
$mysqladmin -uroot shutdown -S /tmp/mysql.sock 
 
Failover starting in 'auto' mode...
# Candidate slave xx.xx.xx.11:7039 will become the new master.
# Preparing candidate for failover.
# Creating replication user if it does not exist.
# Stopping slaves.
# Performing STOP on all slaves.
# Switching slaves to new master.
# Starting slaves.
# Performing START on all slaves.
# Checking slaves for errors.
# Failover complete.
# Discovering slaves for master at xx.xx.xx.11:7039
 
Failover console will restart in 5 seconds.
 
MySQL Replication Failover Utility
Failover Mode = fail     Next Interval = Sun Mar 17 17:30:03 2013
 
NOTICE: Failover mode changed to fail due to another
        instance of the console running against master.
 
Master Information
------------------
Binary Log File   Position  Binlog_Do_DB  Binlog_Ignore_DB  
mysql-bin.000025  191                                       
 
GTID Executed Set
1BCC5C6A-814B-11E2-88A0-00188B77085E:1
 
Log File
+----------------+-------+---------+--------+------------+---------+
| host           | port  | role    | state  | gtid_mode  | health  |
+----------------+-------+---------+--------+------------+---------+
| xx.xx.xx.11    | 7039  | MASTER  | UP     | ON         | OK      |
+----------------+-------+---------+--------+------------+---------+

2 Responses to “mysqlfailover”

  1. 拉风_zhang说道:

    hi stronghearted…

    当主库当掉,需要提升一个从库为新主库时,如果被选择的从库不是最新的那个,然后再让其他的从库指向它,这种情况mysqlfailover怎么处理呢?其实,我关心的是,能不能进行差异补全。

  2. admin说道:

    hi
    拉风_zhang,目前怎么去补全数据,还是要自己去实现。google的semi-sync patch考虑更多一些

Post a Comment