mirror of
https://github.com/percona/percona-toolkit.git
synced 2025-09-14 07:06:28 +00:00
774 lines
20 KiB
ReStructuredText
774 lines
20 KiB
ReStructuredText
.. program:: pt-find
|
|
|
|
====================
|
|
:program:`pt-find`
|
|
====================
|
|
|
|
.. highlight:: perl
|
|
|
|
|
|
NAME
|
|
====
|
|
|
|
:program:`pt-find` - Find |MySQL| tables and execute actions, like GNU find.
|
|
|
|
|
|
SYNOPSIS
|
|
========
|
|
|
|
Usage
|
|
-----
|
|
|
|
::
|
|
|
|
pt-find [OPTION...] [DATABASE...]
|
|
|
|
:program:`pt-find` searches for |MySQL| tables and executes actions, like GNU find. The
|
|
default action is to print the database and table name.
|
|
|
|
Find all tables created more than a day ago, which use the MyISAM engine, and
|
|
print their names:
|
|
|
|
.. code-block:: perl
|
|
|
|
pt-find --ctime +1 --engine MyISAM
|
|
|
|
Find InnoDB tables that haven't been updated in a month, and convert them to
|
|
MyISAM storage engine (data warehousing, anyone?):
|
|
|
|
|
|
.. code-block:: perl
|
|
|
|
pt-find --mtime +30 --engine InnoDB --exec "ALTER TABLE %D.%N ENGINE=MyISAM"
|
|
|
|
|
|
Find tables created by a process that no longer exists, following the
|
|
name_sid_pid naming convention, and remove them.
|
|
|
|
|
|
.. code-block:: perl
|
|
|
|
pt-find --connection-id '\D_\d+_(\d+)$' --server-id '\D_(\d+)_\d+$' --exec-plus "DROP TABLE %s"
|
|
|
|
|
|
Find empty tables in the test and junk databases, and delete them:
|
|
|
|
|
|
.. code-block:: perl
|
|
|
|
pt-find --empty junk test --exec-plus "DROP TABLE %s"
|
|
|
|
|
|
Find tables more than five gigabytes in total size:
|
|
|
|
|
|
.. code-block:: perl
|
|
|
|
pt-find --tablesize +5G
|
|
|
|
Find all tables and print their total data and index size, and sort largest
|
|
tables first (sort is a different program, by the way).
|
|
|
|
|
|
.. code-block:: perl
|
|
|
|
pt-find --printf "%T\t%D.%N\n" | sort -rn
|
|
|
|
|
|
As above, but this time, insert the data back into the database for posterity:
|
|
|
|
|
|
.. code-block:: perl
|
|
|
|
pt-find --noquote --exec "INSERT INTO sysdata.tblsize(db, tbl, size) VALUES('%D', '%N', %T)"
|
|
|
|
|
|
RISKS
|
|
=====
|
|
|
|
The following section is included to inform users about the potential risks,
|
|
whether known or unknown, of using this tool. The two main categories of risks
|
|
are those created by the nature of the tool (e.g. read-only tools vs. read-write
|
|
tools) and those created by bugs.
|
|
|
|
:program:`pt-find` only reads and prints information by default, but :option:`--exec` and :option:`--exec-plus` can execute user-defined SQL. You should be as careful with it as you are with any command-line tool that can execute queries against your
|
|
database.
|
|
|
|
At the time of this release, we know of no bugs that could cause serious harm to
|
|
users.
|
|
|
|
The authoritative source for updated information is always the online issue
|
|
tracking system. Issues that affect this tool will be marked as such. You can
|
|
see a list of such issues at the following URL:
|
|
`http://www.percona.com/bugs/pt-find <http://www.percona.com/bugs/pt-find>`_.
|
|
|
|
See also :ref:`bugs` for more information on filing bugs and getting help.
|
|
|
|
|
|
DESCRIPTION
|
|
===========
|
|
|
|
:program:`pt-find` looks for |MySQL| tables that pass the tests you specify, and executes the actions you specify. The default action is to print the database and table
|
|
name to ``STDOUT``.
|
|
|
|
:program:`pt-find` is simpler than *GNU* :command:`find`. It doesn't allow you to specify complicated expressions on the command line. :program:`pt-find` uses ``SHOW TABLES`` when possible, and ``SHOW TABLE STATUS`` when needed.
|
|
|
|
OPTION TYPES
|
|
============
|
|
|
|
There are three types of options: normal options, which determine some behavior
|
|
or setting; tests, which determine whether a table should be included in the
|
|
list of tables found; and actions, which do something to the tables:program:`pt-find`
|
|
finds.
|
|
|
|
:program:`pt-find` uses standard ``Getopt::Long`` option parsing, so you should use double dashes in front of long option names, unlike ``GNU`` :command:`find`.
|
|
|
|
OPTIONS
|
|
=======
|
|
|
|
This tool accepts additional command-line arguments. Refer to the "SYNOPSIS" and usage information for details.
|
|
|
|
.. option:: --ask-pass
|
|
|
|
Prompt for a password when connecting to |MySQL|.
|
|
|
|
.. option:: --case-insensitive
|
|
|
|
Specifies that all regular expression searches are case-insensitive.
|
|
|
|
.. option:: --charset
|
|
|
|
short form: -A; type: string
|
|
|
|
Default character set. If the value is utf8, sets *Perl* 's binmode on
|
|
``STDOUT`` to utf8, passes the mysql_enable_utf8 option to ``DBD::mysql``, and runs SET
|
|
NAMES UTF8 after connecting to |MySQL|. Any other value sets binmode on ``STDOUT``
|
|
without the utf8 layer, and runs SET NAMES after connecting to |MySQL|.
|
|
|
|
.. option:: --config
|
|
|
|
type: Array
|
|
|
|
Read this comma-separated list of config files; if specified, this must be the
|
|
first option on the command line.
|
|
|
|
.. option:: --day-start
|
|
|
|
Measure times (for :option:`--mmin`, etc) from the beginning of today rather than
|
|
from the current time.
|
|
|
|
.. option:: --defaults-file
|
|
|
|
short form: -F; type: string
|
|
|
|
Only read mysql options from the given file. You must give an absolute
|
|
pathname.
|
|
|
|
.. option:: --help
|
|
|
|
Show help and exit.
|
|
|
|
.. option:: --host
|
|
|
|
short form: -h; type: string
|
|
|
|
Connect to host.
|
|
|
|
.. option:: --or
|
|
|
|
Combine tests with OR, not AND.
|
|
|
|
By default, tests are evaluated as though there were an AND between them. This
|
|
option switches it to OR.
|
|
|
|
Option parsing is not implemented by:program:`pt-find` itself, so you cannot specify
|
|
complicated expressions with parentheses and mixtures of OR and AND.
|
|
|
|
.. option:: --password
|
|
|
|
short form: -p; type: string
|
|
|
|
Password to use when connecting.
|
|
|
|
.. option:: --pid
|
|
|
|
type: string
|
|
|
|
Create the given PID file. The file contains the process ID of the script.
|
|
The PID file is removed when the script exits. Before starting, the script
|
|
checks if the PID file already exists. If it does not, then the script creates
|
|
and writes its own PID to it. If it does, then the script checks the following:
|
|
if the file contains a PID and a process is running with that PID, then
|
|
the script dies; or, if there is no process running with that PID, then the
|
|
script overwrites the file with its own PID and starts; else, if the file
|
|
contains no PID, then the script dies.
|
|
|
|
.. option:: --port
|
|
|
|
short form: -P; type: int
|
|
|
|
Port number to use for connection.
|
|
|
|
.. option:: --[no]quote
|
|
|
|
default: yes
|
|
|
|
Quotes |MySQL| identifier names with |MySQL|'s standard backtick character.
|
|
|
|
Quoting happens after tests are run, and before actions are run.
|
|
|
|
.. option:: --set-vars
|
|
|
|
type: string; default: wait_timeout=10000
|
|
|
|
Set these |MySQL| variables. Immediately after connecting to |MySQL|, this string
|
|
will be appended to SET and executed.
|
|
|
|
.. option:: --socket
|
|
|
|
short form: -S; type: string
|
|
|
|
Socket file to use for connection.
|
|
|
|
.. option:: --user
|
|
|
|
short form: -u; type: string
|
|
|
|
User for login if not current user.
|
|
|
|
.. option:: --version
|
|
|
|
Show version and exit.
|
|
|
|
TESTS
|
|
=====
|
|
|
|
Most tests check some criterion against a column of ``SHOW TABLE STATUS`` output.
|
|
Numeric arguments can be specified as +n for greater than n, -n for less than n,
|
|
and n for exactly n. All numeric options can take an optional suffix multiplier
|
|
of k, M or G (1_024, 1_048_576, and 1_073_741_824 respectively). All patterns
|
|
are *Perl* regular expressions (see 'man perlre') unless specified as SQL LIKE
|
|
patterns.
|
|
|
|
Dates and times are all measured relative to the same instant, when:program:`pt-find`
|
|
first asks the database server what time it is. All date and time manipulation
|
|
is done in SQL, so if you say to find tables modified 5 days ago, that
|
|
translates to SELECT DATE_SUB(CURRENT_TIMESTAMP, INTERVAL 5 DAY). If you
|
|
specify :option:`--day-start`, if course it's relative to CURRENT_DATE instead.
|
|
|
|
However, table sizes and other metrics are not consistent at an instant in
|
|
time. It can take some time for |MySQL| to process all the SHOW queries, and:program:`pt-find` can't do anything about that. These measurements are as of the
|
|
time they're taken.
|
|
|
|
If you need some test that's not in this list, file a bug report and I'll
|
|
enhance:program:`pt-find` for you. It's really easy.
|
|
|
|
|
|
.. option:: --autoinc
|
|
|
|
type: string; group: Tests
|
|
|
|
Table's next AUTO_INCREMENT is n. This tests the Auto_increment column.
|
|
|
|
|
|
.. option:: --avgrowlen
|
|
|
|
type: size; group: Tests
|
|
|
|
Table avg row len is n bytes. This tests the Avg_row_length column.
|
|
The specified size can be "NULL" to test where Avg_row_length IS NULL.
|
|
|
|
|
|
.. option:: --checksum
|
|
|
|
type: string; group: Tests
|
|
|
|
Table checksum is n. This tests the Checksum column.
|
|
|
|
|
|
.. option:: --cmin
|
|
|
|
type: size; group: Tests
|
|
|
|
Table was created n minutes ago. This tests the Create_time column.
|
|
|
|
|
|
.. option:: --collation
|
|
|
|
type: string; group: Tests
|
|
|
|
Table collation matches pattern. This tests the Collation column.
|
|
|
|
|
|
.. option:: --column-name
|
|
|
|
type: string; group: Tests
|
|
|
|
A column name in the table matches pattern.
|
|
|
|
|
|
.. option:: --column-type
|
|
|
|
type: string; group: Tests
|
|
|
|
A column in the table matches this type (case-insensitive).
|
|
|
|
Examples of types are: varchar, char, int, smallint, bigint, decimal, year,
|
|
timestamp, text, enum.
|
|
|
|
|
|
.. option:: --comment
|
|
|
|
type: string; group: Tests
|
|
|
|
Table comment matches pattern. This tests the Comment column.
|
|
|
|
|
|
.. option:: --connection-id
|
|
|
|
type: string; group: Tests
|
|
|
|
Table name has nonexistent |MySQL| connection ID. This tests the table name for
|
|
a pattern. The argument to this test must be a *Perl* regular expression that
|
|
captures digits like this: (\d+). If the table name matches the pattern,
|
|
these captured digits are taken to be the |MySQL| connection ID of some process.
|
|
If the connection doesn't exist according to SHOW FULL PROCESSLIST, the test
|
|
returns true. If the connection ID is greater than:program:`pt-find`'s own
|
|
connection ID, the test returns false for safety.
|
|
|
|
Why would you want to do this? If you use |MySQL| statement-based replication,
|
|
you probably know the trouble temporary tables can cause. You might choose to
|
|
work around this by creating real tables with unique names, instead of
|
|
temporary tables. One way to do this is to append your connection ID to the
|
|
end of the table, thusly: scratch_table_12345. This assures the table name is
|
|
unique and lets you have a way to find which connection it was associated
|
|
with. And perhaps most importantly, if the connection no longer exists, you
|
|
can assume the connection died without cleaning up its tables, and this table
|
|
is a candidate for removal.
|
|
|
|
This is how I manage scratch tables, and that's why I included this test in
|
|
:program:`pt-find`.
|
|
|
|
The argument I use to :option:`--connection-id" is "\D_(\d+)$`. That finds tables
|
|
with a series of numbers at the end, preceded by an underscore and some
|
|
non-number character (the latter criterion prevents me from examining tables
|
|
with a date at the end, which people tend to do: baron_scratch_2007_05_07 for
|
|
example). It's better to keep the scratch tables separate of course.
|
|
|
|
If you do this, make sure the user:program:`pt-find` runs as has the PROCESS privilege!
|
|
Otherwise it will only see connections from the same user, and might think some
|
|
tables are ready to remove when they're still in use. For safety,:program:`pt-find`
|
|
checks this for you.
|
|
|
|
See also :option:`--server-id`.
|
|
|
|
|
|
.. option:: --createopts
|
|
|
|
type: string; group: Tests
|
|
|
|
Table create option matches pattern. This tests the Create_options column.
|
|
|
|
|
|
.. option:: --ctime
|
|
|
|
type: size; group: Tests
|
|
|
|
Table was created n days ago. This tests the Create_time column.
|
|
|
|
|
|
.. option:: --datafree
|
|
|
|
type: size; group: Tests
|
|
|
|
Table has n bytes of free space. This tests the Data_free column.
|
|
The specified size can be "NULL" to test where Data_free IS NULL.
|
|
|
|
|
|
.. option:: --datasize
|
|
|
|
type: size; group: Tests
|
|
|
|
Table data uses n bytes of space. This tests the Data_length column.
|
|
The specified size can be "NULL" to test where Data_length IS NULL.
|
|
|
|
|
|
.. option:: --dblike
|
|
|
|
type: string; group: Tests
|
|
|
|
Database name matches SQL LIKE pattern.
|
|
|
|
|
|
.. option:: --dbregex
|
|
|
|
type: string; group: Tests
|
|
|
|
Database name matches this pattern.
|
|
|
|
|
|
.. option:: --empty
|
|
|
|
group: Tests
|
|
|
|
Table has no rows. This tests the Rows column.
|
|
|
|
|
|
.. option:: --engine
|
|
|
|
type: string; group: Tests
|
|
|
|
Table storage engine matches this pattern. This tests the Engine column, or in
|
|
earlier versions of |MySQL|, the Type column.
|
|
|
|
|
|
.. option:: --function
|
|
|
|
type: string; group: Tests
|
|
|
|
Function definition matches pattern.
|
|
|
|
|
|
.. option:: --indexsize
|
|
|
|
type: size; group: Tests
|
|
|
|
Table indexes use n bytes of space. This tests the Index_length column.
|
|
The specified size can be "NULL" to test where Index_length IS NULL.
|
|
|
|
|
|
.. option:: --kmin
|
|
|
|
type: size; group: Tests
|
|
|
|
Table was checked n minutes ago. This tests the Check_time column.
|
|
|
|
|
|
.. option:: --ktime
|
|
|
|
type: size; group: Tests
|
|
|
|
Table was checked n days ago. This tests the Check_time column.
|
|
|
|
|
|
.. option:: --mmin
|
|
|
|
type: size; group: Tests
|
|
|
|
Table was last modified n minutes ago. This tests the Update_time column.
|
|
|
|
|
|
.. option:: --mtime
|
|
|
|
type: size; group: Tests
|
|
|
|
Table was last modified n days ago. This tests the Update_time column.
|
|
|
|
|
|
.. option:: --procedure
|
|
|
|
type: string; group: Tests
|
|
|
|
Procedure definition matches pattern.
|
|
|
|
|
|
.. option:: --rowformat
|
|
|
|
type: string; group: Tests
|
|
|
|
Table row format matches pattern. This tests the Row_format column.
|
|
|
|
|
|
.. option:: --rows
|
|
|
|
type: size; group: Tests
|
|
|
|
Table has n rows. This tests the Rows column.
|
|
The specified size can be "NULL" to test where Rows IS NULL.
|
|
|
|
|
|
.. option:: --server-id
|
|
|
|
type: string; group: Tests
|
|
|
|
Table name contains the server ID. If you create temporary tables with the
|
|
naming convention explained in :option:`--connection-id`, but also add the server ID of the
|
|
server on which the tables are created, then you can use this pattern match to
|
|
ensure tables are dropped only on the server they're created on. This prevents
|
|
a table from being accidentally dropped on a slave while it's in use (provided
|
|
that your server IDs are all unique, which they should be for replication to
|
|
work).
|
|
|
|
For example, on the master (server ID 22) you create a table called
|
|
scratch_table_22_12345. If you see this table on the slave (server ID 23), you
|
|
might think it can be dropped safely if there's no such connection 12345. But
|
|
if you also force the name to match the server ID with \ ``--server-id '\D_(\d+)_\d+$'``\ ,
|
|
the table won't be dropped on the slave.
|
|
|
|
|
|
.. option:: --tablesize
|
|
|
|
type: size; group: Tests
|
|
|
|
Table uses n bytes of space. This tests the sum of the Data_length and
|
|
Index_length columns.
|
|
|
|
|
|
.. option:: --tbllike
|
|
|
|
type: string; group: Tests
|
|
|
|
Table name matches SQL LIKE pattern.
|
|
|
|
|
|
.. option:: --tblregex
|
|
|
|
type: string; group: Tests
|
|
|
|
Table name matches this pattern.
|
|
|
|
|
|
.. option:: --tblversion
|
|
|
|
type: size; group: Tests
|
|
|
|
Table version is n. This tests the Version column.
|
|
|
|
|
|
.. option:: --trigger
|
|
|
|
type: string; group: Tests
|
|
|
|
Trigger action statement matches pattern.
|
|
|
|
|
|
.. option:: --trigger-table
|
|
|
|
type: string; group: Tests
|
|
|
|
:option:`--trigger` is defined on table matching pattern.
|
|
|
|
|
|
.. option:: --view
|
|
|
|
type: string; group: Tests
|
|
|
|
CREATE VIEW matches this pattern.
|
|
|
|
|
|
ACTIONS
|
|
=======
|
|
|
|
|
|
The :option:`--exec-plus` action happens after everything else, but otherwise actions
|
|
happen in an indeterminate order. If you need determinism, file a bug report
|
|
and I'll add this feature.
|
|
|
|
.. option:: --exec
|
|
|
|
type: string; group: Actions
|
|
|
|
Execute this SQL with each item found. The SQL can contain escapes and
|
|
formatting directives (see :option:`--printf`).
|
|
|
|
|
|
.. option:: --exec-dsn
|
|
|
|
type: string; group: Actions
|
|
|
|
Specify a DSN in key-value format to use when executing SQL with :option:`--exec` and
|
|
:option:`--exec-plus`. Any values not specified are inherited from command-line
|
|
arguments.
|
|
|
|
|
|
.. option:: --exec-plus
|
|
|
|
type: string; group: Actions
|
|
|
|
Execute this SQL with all items at once. This option is unlike :option:`--exec`. There
|
|
are no escaping or formatting directives; there is only one special placeholder
|
|
for the list of database and table names, %s. The list of tables found will be
|
|
joined together with commas and substituted wherever you place %s.
|
|
|
|
You might use this, for example, to drop all the tables you found:
|
|
|
|
|
|
.. code-block:: perl
|
|
|
|
DROP TABLE %s
|
|
|
|
|
|
This is sort of like GNU find's "-exec command {} +" syntax. Only it's not
|
|
totally cryptic. And it doesn't require me to write a command-line parser.
|
|
|
|
|
|
.. option:: --print
|
|
|
|
group: Actions
|
|
|
|
Print the database and table name, followed by a newline. This is the default
|
|
action if no other action is specified.
|
|
|
|
|
|
.. option:: --printf
|
|
|
|
type: string; group: Actions
|
|
|
|
Print format on the standard output, interpreting '\' escapes and '%'
|
|
directives. Escapes are backslashed characters, like \n and \t. *Perl*
|
|
interprets these, so you can use any escapes *Perl* knows about. Directives are
|
|
replaced by %s, and as of this writing, you can't add any special formatting
|
|
instructions, like field widths or alignment (though I'm musing over ways to do
|
|
that).
|
|
|
|
Here is a list of the directives. Note that most of them simply come from
|
|
columns of SHOW TABLE STATUS. If the column is NULL or doesn't exist, you get
|
|
an empty string in the output. A % character followed by any character not in
|
|
the following list is discarded (but the other character is printed).
|
|
|
|
|
|
.. code-block:: perl
|
|
|
|
CHAR DATA SOURCE NOTES
|
|
---- ------------------ ------------------------------------------
|
|
a Auto_increment
|
|
A Avg_row_length
|
|
c Checksum
|
|
C Create_time
|
|
D Database The database name in which the table lives
|
|
d Data_length
|
|
E Engine In older versions of |MySQL|, this is Type
|
|
F Data_free
|
|
f Innodb_free Parsed from the Comment field
|
|
I Index_length
|
|
K Check_time
|
|
L Collation
|
|
M Max_data_length
|
|
N Name
|
|
O Comment
|
|
P Create_options
|
|
R Row_format
|
|
S Rows
|
|
T Table_length Data_length+Index_length
|
|
U Update_time
|
|
V Version
|
|
|
|
|
|
DSN OPTIONS
|
|
===========
|
|
|
|
|
|
These DSN options are used to create a DSN. Each option is given like
|
|
\ ``option=value``\ . The options are case-sensitive, so P and p are not the
|
|
same option. There cannot be whitespace before or after the \ ``=``\ and
|
|
if the value contains whitespace it must be quoted. DSN options are
|
|
comma-separated. See the percona-toolkit manpage for full details.
|
|
|
|
|
|
* ``A``
|
|
|
|
dsn: charset; copy: yes
|
|
|
|
Default character set.
|
|
|
|
|
|
|
|
* ``D``
|
|
|
|
dsn: database; copy: yes
|
|
|
|
Default database.
|
|
|
|
|
|
* ``F``
|
|
|
|
dsn: mysql_read_default_file; copy: yes
|
|
|
|
Only read default options from the given file
|
|
|
|
|
|
* ``h``
|
|
|
|
dsn: host; copy: yes
|
|
|
|
Connect to host.
|
|
|
|
|
|
* ``p``
|
|
|
|
dsn: password; copy: yes
|
|
|
|
Password to use when connecting.
|
|
|
|
|
|
* ``p``
|
|
|
|
dsn: port; copy: yes
|
|
|
|
Port number to use for connection.
|
|
|
|
|
|
* ``S``
|
|
|
|
dsn: mysql_socket; copy: yes
|
|
|
|
Socket file to use for connection.
|
|
|
|
|
|
* ``u``
|
|
|
|
dsn: user; copy: yes
|
|
|
|
User for login if not current user.
|
|
|
|
|
|
ENVIRONMENT
|
|
===========
|
|
|
|
|
|
The environment variable \ ``PTDEBUG``\ enables verbose debugging output to ``STDERR``.
|
|
To enable debugging and capture all output to a file, run the tool like:
|
|
|
|
|
|
.. code-block:: perl
|
|
|
|
PTDEBUG=1: pt-find ... > FILE 2>&1
|
|
|
|
|
|
Be careful: debugging output is voluminous and can generate several megabytes
|
|
of output.
|
|
|
|
|
|
SYSTEM REQUIREMENTS
|
|
===================
|
|
|
|
|
|
You need *Perl* , ``DBI``, ``DBD::mysql``, and some core packages that ought to be
|
|
installed in any reasonably new version of *Perl* .
|
|
|
|
|
|
BUGS
|
|
====
|
|
|
|
|
|
For a list of known bugs, see `http://www.percona.com/bugs/pt-find <http://www.percona.com/bugs/pt-find>`_.
|
|
|
|
Please report bugs at `https://bugs.launchpad.net/percona-toolkit <https://bugs.launchpad.net/percona-toolkit>`_.
|
|
|
|
AUTHORS
|
|
=======
|
|
|
|
Baron Schwartz
|
|
|
|
COPYRIGHT, LICENSE, AND WARRANTY
|
|
================================
|
|
|
|
This program is copyright 2007-2011 Baron Schwartz, 2011 Percona Inc.
|
|
Feedback and improvements are welcome.
|
|
|
|
VERSION
|
|
=======
|
|
|
|
:program:`pt-find` 1.0.1
|
|
|