Don't require DSN. Begin rewriting docs.

This commit is contained in:
Daniel Nichter
2011-09-29 13:02:14 -06:00
parent ed68f3ef5c
commit 78b4f36d71

@@ -5231,7 +5231,7 @@ sub main {
my $o = new OptionParser();
$o->get_specs();
$o->get_opts();
my $dp = $o->DSNParser();
$dp->prop('set-vars', $o->get('set-vars'));
@@ -5243,8 +5243,8 @@ sub main {
$o->set('ignore-tables', \%ignore_tables);
if ( !$o->get('help') ) {
if ( !@ARGV ) {
$o->save_error("No host specified");
if ( @ARGV > 1 ) {
$o->save_error("More than one host specified; only one allowed");
}
if ( ($o->get('replicate') || '') !~ m/[\w`]\.[\w`]/ ) {
@@ -6366,30 +6366,29 @@ if ( !caller ) { exit main(@ARGV); }
=head1 NAME
pt-table-checksum - Perform an online replication consistency check, or checksum MySQL tables efficiently on one or many servers.
pt-table-checksum - Perform an online replication consistency check.
=head1 SYNOPSIS
Usage: pt-table-checksum [OPTION...] DSN
Usage: pt-table-checksum [OPTION...] [DSN]
pt-table-checksum checksums MySQL tables efficiently on one or more hosts.
Each host is specified as a DSN and missing values are inherited from the
first host. If you specify multiple hosts, the first is assumed to be the
master.
pt-table-checksum performs an online replication consistency check by
replicating checksum queries. By default, all tables on all replicas
are checked. The C<DSN>, if specified, must be the master host. The
tool exits non-zero if any differences are found, or if any warnings
or errors occur.
Checksum all slaves against the master:
First create a C<percona> database on the master:
pt-table-checksum \
h=master-host \
--replicate mydb.checksums
CREATE DATABASE percona;
# Wait for first command to complete and replication to catch up
# on all slaves, then...
Then run the tool on the master to check that all data on all replicas
is consistent:
pt-table-checksum \
h=master-host \
--replicate mydb.checksums \
--replicate-check 2
pt-table-checksum --create-replicate-table
The L<"--create-replicate-table"> option can be dropped once the
L<"--replicate"> table has been created.
=head1 RISKS
@@ -6399,9 +6398,8 @@ are those created by the nature of the tool (e.g. read-only tools vs. read-write
tools) and those created by bugs.
pt-table-checksum executes queries that cause the MySQL server to checksum its
data. This can cause significant server load. It is read-only unless you use
the L<"--replicate"> option, in which case it inserts a small amount of data
into the specified table.
data. This can cause significant server load. The tool also inserts a small
amount of data into the L<"--replicate"> table.
At the time of this release, we know of no bugs that could cause serious harm to
users.
@@ -6424,48 +6422,7 @@ Checksums typically take about twice as long as COUNT(*) on very large InnoDB
tables in my tests. For smaller tables, COUNT(*) is a good bit faster than
the checksums.
If you specify more than one server, pt-table-checksum assumes the first
server is the master and others are slaves. Checksums are parallelized for
speed, forking off a child process for each table. Duplicate server names are
ignored, but if you want to checksum a server against itself you can use two
different forms of the hostname (for example, "localhost 127.0.0.1", or
"h=localhost,P=3306 h=localhost,P=3307").
If you want to compare the tables in one database to those in another database
on the same server, just checksum both databases:
pt-table-checksum --databases db1,db2
You can then use L<pt-checksum-filter> to compare the results in both databases
easily.
pt-table-checksum examines table structure only on the first host specified,
so if anything differs on the others, it won't notice. It ignores views.
The checksums work on MySQL version 3.23.58 through 6.0-alpha. They will not
necessarily produce the same values on all versions. Differences in
formatting and/or space-padding between 4.1 and 5.0, for example, will cause
the checksums to be different.
=head1 SPECIFYING HOSTS
Each host is specified on the command line as a DSN. A DSN is a comma-separated
list of C<option=value> pairs. The most basic DSN is C<h=host> to specify
the hostname of the server and use defaults for everything else (port, etc.).
See L<"DSN OPTIONS"> for more information.
DSN options that are listed as C<copy: yes> are copied from the first DSN
to subsequent DSNs that do not specify the DSN option. For example,
C<h=host1,P=12345 h=host2> is equivalent to C<h=host1,P=12345 h=host2,P=12345>.
This allows you to avoid repeating DSN options that have the same value
for all DSNs.
Connection-related command-line options like L<"--user"> and L<"--password">
provide default DSN values for the corresponding DSN options indicated by
the short form of each option. For example, the short form of L<"--user">
is C<-u> which corresponds to the C<u> DSN option, so C<--user bob h=host>
is equivalent to C<h=host,u=bob>. These defaults apply to all DSNs that
do not specify the DSN option.
TODO
=head1 HOW FAST IS IT?
@@ -6524,64 +6481,9 @@ C<CRC32> is the default checksum function to use, and should be enough for most
cases. If you need stronger guarantees that your data is identical, you should
use one of the other functions.
=head1 CONSISTENT CHECKSUMS
If you are using this tool to verify your slaves still have the same data as the
master, which is why I wrote it, you should read this section.
The best way to do this with replication is to use the L<"--replicate"> option.
When the queries are finished running on the master and its slaves, you can go
to the slaves and issue SQL queries to see if any tables are different from the
master. Try the following:
SELECT db, tbl, chunk, this_cnt-master_cnt AS cnt_diff,
this_crc <> master_crc OR ISNULL(master_crc) <> ISNULL(this_crc)
AS crc_diff
FROM checksum
WHERE master_cnt <> this_cnt OR master_crc <> this_crc
OR ISNULL(master_crc) <> ISNULL(this_crc);
The L<"--replicate-check"> option can do this query for you.
=head1 OUTPUT
Output is to STDOUT, one line per server and table, with header lines for each
database. I tried to make the output easy to process with awk. For this reason
columns are always present. If there's no value, pt-table-checksum prints
'NULL'.
Output is unsorted, though all lines for one table should be output together.
For speed, all checksums are done in parallel (as much as possible) and may
complete out of the order in which they were started. You might want to run
them through another script or command-line utility to make sure they are in the
order you want. If you pipe the output through L<pt-checksum-filter>, you
can sort the output and/or avoid seeing output about tables that have no
differences.
=head1 REPLICATE TABLE MAINTENANCE
If you use L<"--replicate"> to store and replicate checksums, you may need to
perform maintenance on the replicate table from time to time to remove old
checksums. This section describes when checksums in the replicate table are
deleted automatically by pt-table-checksum and when you must manually delete
them.
Before starting, pt-table-checksum calculates chunks for each table, even
if L<"--chunk-size"> is not specified (in that case there is one chunk: "1=1").
Then, before checksumming each table, the tool deletes checksum chunks in the
replicate table greater than the current number of chunks. For example,
if a table is chunked into 100 chunks, 0-99, then pt-table-checksum does:
DELETE FROM replicate table WHERE db=? AND tbl=? AND chunk > 99
That removes any high-end chunks from previous runs which no longer exist.
Currently, this operation cannot be disabled.
If the replicate table becomes cluttered with old or invalid checksums
and the auto-delete operation is not deleting them, then you will need to
manually clean up the replicate table. Alternatively, if you specify
L<"--[no]empty-replicate-table">, then the tool deletes every row in the
replicate table.
TODO
=head1 EXIT STATUS
@@ -6598,7 +6500,7 @@ If you are using innotop (see L<http://code.google.com/p/innotop>),
mytop, or another tool to watch currently running MySQL queries, you may see
the checksum queries. They look similar to this:
REPLACE /*test.test_tbl:'2'/'5'*/ INTO test.checksum(db, ...
TODO
Since pt-table-checksum's queries run for a long time and tend to be
textually very long, and thus won't fit on one screen of these monitoring
@@ -7042,7 +6944,9 @@ rules, see L<http://dev.mysql.com/doc/en/replication-rules.html>.
The table specified by L<"--replicate"> will never be checksummed itself.
=item --replicate-check
=item --[no]replicate-check
default: yes
Check results in L<"--replicate"> table, to the specified depth. You must use
this after you run the tool normally; it skips the checksum step and only checks