mirror of
https://github.com/percona/percona-toolkit.git
synced 2025-09-09 18:30:16 +00:00
Document REPLICA CHECKS and make --check-slave-lag clearer.
@@ -8927,17 +8927,21 @@ won't break replication (or simply fail to replicate). If you are sure that
|
|||||||
it's OK to run the checksum queries, you can negate this option to disable the
|
it's OK to run the checksum queries, you can negate this option to disable the
|
||||||
checks. See also L<"--replicate-database">.
|
checks. See also L<"--replicate-database">.
|
||||||
|
|
||||||
|
See also L<"REPLICA CHECKS">.
|
||||||
|
|
||||||
=item --check-slave-lag
|
=item --check-slave-lag
|
||||||
|
|
||||||
type: string; group: Throttle
|
type: string; group: Throttle
|
||||||
|
|
||||||
Pause checksumming until this replica's lag is less than L<"--max-lag">. The
|
Pause checksumming until this replica's lag is less than L<"--max-lag">. The
|
||||||
value is a DSN that inherits properties from the master host and the connection
|
value is a DSN that inherits properties from the master host and the connection
|
||||||
options (L<"--port">, L<"--user">, etc.). This option overrides the normal
|
options (L<"--port">, L<"--user">, etc.). By default, pt-table-checksum
|
||||||
behavior of finding and continually monitoring replication lag on ALL connected
|
monitors lag on all connected replicas, but this option limits lag monitoring
|
||||||
replicas. If you don't want to monitor ALL replicas, but you want more than
|
to the specified replica. This is useful if certain replicas are intentionally
|
||||||
just one replica to be monitored, then use the DSN option to the
|
lagged (with L<pt-slave-delay> for example), in which case you can specify
|
||||||
L<"--recursion-method"> option instead of this option.
|
a normal replica to monitor.
|
||||||
|
|
||||||
|
See also L<"REPLICA CHECKS">.
|
||||||
|
|
||||||
=item --chunk-index
|
=item --chunk-index
|
||||||
|
|
||||||
@@ -9208,8 +9212,7 @@ all replicas to which it connects, using Seconds_Behind_Master. If any replica
|
|||||||
is lagging more than the value of this option, then pt-table-checksum will sleep
|
is lagging more than the value of this option, then pt-table-checksum will sleep
|
||||||
for L<"--check-interval"> seconds, then check all replicas again. If you
|
for L<"--check-interval"> seconds, then check all replicas again. If you
|
||||||
specify L<"--check-slave-lag">, then the tool only examines that server for
|
specify L<"--check-slave-lag">, then the tool only examines that server for
|
||||||
lag, not all servers. If you want to control exactly which servers the tool
|
lag, not all servers.
|
||||||
monitors, use the DSN value to L<"--recursion-method">.
|
|
||||||
|
|
||||||
The tool waits forever for replicas to stop lagging. If any replica is
|
The tool waits forever for replicas to stop lagging. If any replica is
|
||||||
stopped, the tool waits forever until the replica is started. Checksumming
|
stopped, the tool waits forever until the replica is started. Checksumming
|
||||||
@@ -9219,6 +9222,8 @@ The tool prints progress reports while waiting. If a replica is stopped, it
|
|||||||
prints a progress report immediately, then again at every progress report
|
prints a progress report immediately, then again at every progress report
|
||||||
interval.
|
interval.
|
||||||
|
|
||||||
|
See also L<"REPLICA CHECKS">.
|
||||||
|
|
||||||
=item --max-load
|
=item --max-load
|
||||||
|
|
||||||
type: Array; default: Threads_running=25; group: Throttle
|
type: Array; default: Threads_running=25; group: Throttle
|
||||||
@@ -9300,13 +9305,15 @@ or checksum differences.
|
|||||||
type: int
|
type: int
|
||||||
|
|
||||||
Number of levels to recurse in the hierarchy when discovering replicas.
|
Number of levels to recurse in the hierarchy when discovering replicas.
|
||||||
Default is infinite. See also L<"--recursion-method">.
|
Default is infinite. See also L<"--recursion-method"> and L<"REPLICA CHECKS">.
|
||||||
|
|
||||||
=item --recursion-method
|
=item --recursion-method
|
||||||
|
|
||||||
type: string
|
type: string
|
||||||
|
|
||||||
Preferred recursion method for discovering replicas. Possible methods are:
|
Preferred recursion method for discovering replicas. pt-table-checksum
|
||||||
|
performs several L<"REPLICA CHECKS"> before and while running.
|
||||||
|
Possible methods are:
|
||||||
|
|
||||||
METHOD USES
|
METHOD USES
|
||||||
=========== ==================
|
=========== ==================
|
||||||
@@ -9315,18 +9322,21 @@ Preferred recursion method for discovering replicas. Possible methods are:
|
|||||||
dsn=DSN DSNs from a table
|
dsn=DSN DSNs from a table
|
||||||
none Do not find slaves
|
none Do not find slaves
|
||||||
|
|
||||||
The processlist method is the default, because SHOW SLAVE HOSTS is not
|
The C<processlist> method is the default, because C<SHOW SLAVE HOSTS> is not
|
||||||
reliable. However, the hosts method can work better if the server uses a
|
reliable. However, if the server uses a non-standard port (not 3306), then
|
||||||
non-standard port (not 3306). The tool usually does the right thing and
|
the C<hosts> method becomes the default because it works better in this case.
|
||||||
finds all replicas, but you may give a preferred method and it will be used
|
|
||||||
first.
|
|
||||||
|
|
||||||
The hosts method requires replicas to be configured with report_host,
|
The C<hosts> method requires replicas to be configured with C<report_host>,
|
||||||
report_port, etc.
|
C<report_port>, etc.
|
||||||
|
|
||||||
The dsn method is special: it specifies a table from which other DSN strings
|
The C<dsn> method is special: rather than automatically discovering replicas,
|
||||||
are read. The specified DSN must specify a D and t, or a database-qualified
|
this method specifies a table with replica DSNs. The tool will only connect
|
||||||
t. The DSN table should have the following structure:
|
to these replicas. This method works best when replicas do not use the same
|
||||||
|
MySQL username or password as the master, or when you want to prevent the tool
|
||||||
|
from connecting to certain replicas. The C<dsn> method is specified like:
|
||||||
|
C<--recursion-method dsn=h=host,D=percona,t=dsns>. The specified DSN must
|
||||||
|
have D and t parts, or just a database-qualified t part, which specify the
|
||||||
|
DSN table. The DSN table must have the following structure:
|
||||||
|
|
||||||
CREATE TABLE `dsns` (
|
CREATE TABLE `dsns` (
|
||||||
`id` int(11) NOT NULL AUTO_INCREMENT,
|
`id` int(11) NOT NULL AUTO_INCREMENT,
|
||||||
@@ -9335,10 +9345,13 @@ t. The DSN table should have the following structure:
|
|||||||
PRIMARY KEY (`id`)
|
PRIMARY KEY (`id`)
|
||||||
);
|
);
|
||||||
|
|
||||||
To make the tool monitor only the hosts 10.10.1.16 and 10.10.1.17 for
|
DSNs are ordered by C<id>, but C<id> and C<parent_id> are otherwise ignored.
|
||||||
replication lag and checksum differences, insert the values C<h=10.10.1.16> and
|
The C<dsn> column contains a replica DSN like it would be given on the command
|
||||||
C<h=10.10.1.17> into the table. Currently, the DSNs are ordered by id, but id
|
line, for example: C<"h=replica_host,u=repl_user,p=repl_pass">.
|
||||||
and parent_id are otherwise ignored.
|
|
||||||
|
The C<none> method prevents the tool from connecting to any replicas.
|
||||||
|
This effectively disables all the L<"REPLICA CHECKS"> because there will
|
||||||
|
not be any replicas to check. Therefore, this method is not recommended.
|
||||||
|
|
||||||
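As an illustration of the DSN syntax described for the C<dsn> method, here is a minimal Python sketch that splits a DSN string such as C<h=host,D=percona,t=dsns> into its parts and works out which DSN table it names. The helpers C<parse_dsn> and C<dsn_table> are hypothetical names, not part of pt-table-checksum. The sketch also assumes that a database-qualified C<t> takes precedence over C<D>, and it ignores edge cases (such as commas inside values) that the real tool may treat differently.

```python
def parse_dsn(dsn):
    """Split a DSN string like 'h=host,D=percona,t=dsns' into a dict."""
    parts = {}
    for item in dsn.split(","):
        key, sep, value = item.partition("=")
        if not sep or not key:
            raise ValueError(f"malformed DSN part: {item!r}")
        parts[key.strip()] = value
    return parts


def dsn_table(dsn):
    """Return (database, table) for the DSN table named by D and t.

    Assumption: a database-qualified t ("db.tbl") wins over a separate D part.
    """
    parts = parse_dsn(dsn)
    table = parts.get("t")
    if table is None:
        raise ValueError("DSN must have a t part")
    if "." in table:
        database, _, name = table.partition(".")
        return database, name
    if "D" not in parts:
        raise ValueError("DSN needs a D part or a database-qualified t part")
    return parts["D"], table
```

Under these assumptions, C<h=host,D=percona,t=dsns> and C<h=host,t=percona.dsns> name the same DSN table.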
=item --replicate
|
=item --replicate
|
||||||
|
|
||||||
@@ -9494,6 +9507,60 @@ keyword. You might need to quote the value. Here is an example:
|
|||||||
|
|
||||||
=back
|
=back
|
||||||
|
|
||||||
|
=head1 REPLICA CHECKS
|
||||||
|
|
||||||
|
By default, pt-table-checksum attempts to find and connect to all replicas
|
||||||
|
connected to the master host. This automated process is called
|
||||||
|
"slave recursion" and is controlled by the L<"--recursion-method"> and
|
||||||
|
L<"--recurse"> options. The tool performs these checks on all replicas:
|
||||||
|
|
||||||
|
=over
|
||||||
|
|
||||||
|
=item 1. L<"--[no]check-replication-filters">
|
||||||
|
|
||||||
|
pt-table-checksum checks for replication filters on all replicas because
|
||||||
|
they can complicate or break the checksum process. By default, the tool
|
||||||
|
will exit if any replication filters are found, but this check can be
|
||||||
|
disabled by specifying C<--no-check-replication-filters>.
|
||||||
|
|
||||||
|
=item 2. L<"--replicate"> table
|
||||||
|
|
||||||
|
pt-table-checksum checks that the L<"--replicate"> table exists on all
|
||||||
|
replicas, else checksumming can break replication when updates to the table
|
||||||
|
on the master replicate to a replica that doesn't have the table. This
|
||||||
|
check cannot be disabled, and the tool waits forever until the table
|
||||||
|
exists on all replicas, printing L<"--progress"> messages while it waits.
|
||||||
|
|
||||||
|
=item 3. Single chunk size
|
||||||
|
|
||||||
|
If a table can be checksummed in a single chunk on the master,
|
||||||
|
pt-table-checksum will check that the table size on all replicas is
|
||||||
|
approximately the same. This prevents a rare problem where the table
|
||||||
|
on the master is empty or small, but on a replica it is much larger.
|
||||||
|
In this case, the single chunk checksum on the master would overload
|
||||||
|
the replica. This check cannot be disabled.
|
||||||
|
|
||||||
|
=item 4. Lag
|
||||||
|
|
||||||
|
After each chunk, pt-table-checksum checks the lag on all replicas, or only
|
||||||
|
the replica specified by L<"--check-slave-lag">. This helps the tool
|
||||||
|
not to overload the replicas with checksum data. There is no way to
|
||||||
|
disable this check, but you can specify a single replica to check with
|
||||||
|
L<"--check-slave-lag">, and if that replica is the fastest, it will help
|
||||||
|
prevent the tool from waiting too long for replica lag to abate.
|
||||||
|
|
||||||
|
=item 5. Checksum chunks
|
||||||
|
|
||||||
|
When pt-table-checksum finishes checksumming a table, it waits for the last
|
||||||
|
checksum chunk to replicate to all replicas so it can perform the
|
||||||
|
L<"--[no]replicate-check">. Disabling that option by specifying
|
||||||
|
C<--no-replicate-check> disables this check, but it also disables
|
||||||
|
immediate reporting of checksum differences, thereby requiring a second run
|
||||||
|
of the tool with L<"--replicate-check-only"> to find and print checksum
|
||||||
|
differences.
|
||||||
|
|
||||||
|
=back
|
||||||
|
|
||||||
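The lag check described in REPLICA CHECKS can be pictured as a simple polling loop. The following Python sketch is only an illustration under stated assumptions, not the tool's actual implementation: C<wait_for_replicas> and its C<get_lag> callback are hypothetical names, a lag of C<None> stands in for a stopped replica, and a replica is treated as caught up when its lag is at or below the maximum.

```python
import time


def wait_for_replicas(get_lag, replicas, max_lag, check_interval, sleep=time.sleep):
    """Poll replica lag until every monitored replica is within max_lag.

    get_lag(replica) is a caller-supplied function returning the lag in
    seconds, or None if the replica is stopped (assumption for this sketch).
    Returns the number of polling rounds that were needed.
    """
    rounds = 0
    while True:
        rounds += 1
        lags = [get_lag(replica) for replica in replicas]
        # A stopped replica (None) counts as lagging, so the loop keeps waiting.
        if all(lag is not None and lag <= max_lag for lag in lags):
            return rounds
        sleep(check_interval)
```

Passing a single-element list for C<replicas> corresponds to monitoring only the replica given by L<"--check-slave-lag">, while passing every discovered replica corresponds to the default behavior.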
=head1 DSN OPTIONS
|
=head1 DSN OPTIONS
|
||||||
|
|
||||||
These DSN options are used to create a DSN. Each option is given like
|
These DSN options are used to create a DSN. Each option is given like
|
||||||
|