initial documentation

This commit is contained in:
baron@percona.com
2012-01-25 16:11:09 -05:00
parent ad552756b2
commit 5acec0d38e

@@ -3193,15 +3193,6 @@ sub help {
------------------- Press any key to continue ----------------------- ------------------- Press any key to continue -----------------------
HELP HELP
print $help; print $help;
=begin IGNORE
my $lines = $help =~ tr/\n//;
while ( $lines-- ) {
$Diskstats::printed_lines--;
print_header(%args) unless $Diskstats::printed_lines;
}
=cut
pause(%args); pause(%args);
return; return;
} }
@@ -3421,14 +3412,14 @@ if ( !caller ) { exit main(@ARGV); }
=head1 NAME =head1 NAME
pt-diskstats - Aggregate and summarize F</proc/diskstats>. pt-diskstats - An interactive I/O monitoring tool for GNU/Linux.
=head1 SYNOPSIS =head1 SYNOPSIS
Usage: pt-diskstats [OPTION...] [FILES] Usage: pt-diskstats [OPTION...] [FILES]
pt-diskstats reads F</proc/diskstats> periodically, or files with the pt-diskstats prints disk I/O statistics for GNU/Linux. It is somewhat similar
contents of F</proc/diskstats>, aggregates the data, and prints it nicely. to iostat, but it is interactive and more detailed.
=head1 RISKS =head1 RISKS
@@ -3437,7 +3428,7 @@ whether known or unknown, of using this tool. The two main categories of risks
are those created by the nature of the tool (e.g. read-only tools vs. read-write are those created by the nature of the tool (e.g. read-only tools vs. read-write
tools) and those created by bugs. tools) and those created by bugs.
pt-diskstats is a read-only tool. It should be very low-risk. pt-diskstats simply reads /proc/diskstats. It should be very low-risk.
At the time of this release, we know of no bugs that could cause serious harm At the time of this release, we know of no bugs that could cause serious harm
to users. to users.
@@ -3451,87 +3442,133 @@ See also L<"BUGS"> for more information on filing bugs and getting help.
=head1 DESCRIPTION =head1 DESCRIPTION
pt-diskstats tool is similar to iostat, but has some advantages. It separates The pt-diskstats tool is similar to iostat, but has some advantages. It prints
reads and writes, for example, and computes some things that iostat does in read and write statistics separately, and has more columns. It is menu-driven
either incorrect or confusing ways. It is also menu-driven and interactive and interactive, with several different ways to aggregate the data. It
with several different ways to aggregate the data, and integrates well with integrates well with the L<pt-stalk> tool. It also does the "right thing" by
the L<pt-collect> tool. These properties make it very convenient for quickly default, such as hiding disks that are idle. These properties make it very
drilling down into I/O performance at the desired level of granularity. convenient for quickly drilling down into I/O performance and inspecting disk
behavior.
This program works in two main modes. One way is to process a file with saved This program works in two modes. The default is to collect samples of
disk statistics, which you specify on the command line. The other way is to /proc/diskstats and print out the formatted statistics at intervals. The other
start a background process gathering samples at intervals and saving them into mode is to process a file that contains saved samples of /proc/diskstats; there
a file, and process this file in the foreground. In both cases, the tool is is a shell script later in this documentation that shows how to collect such a
interactively controlled by keystrokes, so you can redisplay and slice the file.
data flexibly and easily. If the tool is not attached to a terminal, it
doesn't run interactively; it just processes and prints its output, then exits.
Otherwise it loops until you exit with the 'q' key.
If you press the '?' key, you will bring up the interactive help menu that In both cases, the tool is interactively controlled by keystrokes, so you can
shows which keys control the program. redisplay and slice the data flexibly and easily. It loops forever, until you
exit with the 'q' key. If you press the '?' key, you will bring up the
interactive help menu that shows which keys control the program.
Files should have this format: When the program is gathering samples of /proc/diskstats and refreshing its
display, it prints information about the newest sample each time it refreshes.
When it is operating on a file of saved samples, it redraws the entire file's
contents every time you change an option.
TS <timestamp> <-- must start with a TS line. The program doesn't print information about every disk device on the system. It
<contents of /proc/diskstats> hides devices that it has never observed to have any activity. You can enable
TS <timestamp> and disable this by pressing the 'i' key.
<contents of /proc/diskstats>
... et cetera
Note that previously the format was backwards -- It would put the timestamp
at the bottom of each sample, not the top. This was doubly troublesome:
It was inconsistent with how the rest of the Toolkit deals with timestamps,
and allowed malformed data to sit in the bottom of the file and give incorrect
results.
See L<http://aspersa.googlecode.com/svn/html/diskstats.html> for a detailed
example of using the tool.
=head1 OUTPUT =head1 OUTPUT
The program's output looks like the following sample, which is too wide for this
manual page, so we have formatted it as several samples with continuations:
  #ts  device    rd_s rd_avkb rd_mb_s rd_io_s  rd_mrg  rd_cnc   rd_rt ...
  {10} sda        0.5     4.0     0.0     0.1      0%     0.0    15.6 ...
  {10} sdb        0.0     0.0     0.0     0.0      0%     0.0     0.0 ...
  {10} dm-0       0.0     0.0     0.0     0.0      0%     0.0     0.0 ...
  {10} dm-1       0.5     4.0     0.0     0.1      0%     0.0    15.6 ...
  #ts  device ...    wr_s wr_avkb wr_mb_s wr_io_s  wr_mrg  wr_cnc   wr_rt ...
  {10} sda    ...    30.6     6.7     0.2     6.5     40%     0.7    22.8 ...
  {10} sdb    ...     1.7    17.8     0.0     0.0     77%     0.0     0.8 ...
  {10} dm-0   ...     2.5     4.0     0.0     0.1      0%     0.0     2.6 ...
  {10} dm-1   ...    38.2     4.0     0.1     7.6      0%     0.8    21.2 ...
  #ts  device ...   busy in_prg   io_s  qtime  stime
  {10} sda    ...     2%      0    6.6    0.0    0.0
  {10} sdb    ...     0%      0    0.0    0.0    0.0
  {10} dm-0   ...     0%      0    0.1    0.0    0.0
  {10} dm-1   ...     2%      0    7.7    0.0    0.0
The columns are as follows: The columns are as follows:
=over =over
=item #ts =item #ts
The number of seconds of samples in the line. If there is only one, then This column's contents vary depending on the tool's aggregation mode. In the
the timestamp itself is shown, without the {curly braces}. default mode, when each line contains information about a single disk but
possibly aggregates across several samples from that disk, this column shows the
number of samples that were included into the line of output, in {curly braces}.
In the example shown, each line of output aggregates {10} samples of
/proc/diskstats.
In the "all" group-by mode, this column shows timestamp offsets, relative to the
time the tool began aggregating or the timestamp of the previous lines printed,
depending on the mode. The output can be confusing to explain, but it's rather
intuitive when you see the lines appearing on your screen periodically.
Similarly, in "sample" group-by mode, the number indicates the total time span
that is grouped into each sample.
=item device =item device
The device name. If there is more than one device, then instead the number The device name. If there is more than one device, then instead the number
of devices aggregated into the line is shown, in {curly braces}. of devices aggregated into the line is shown, in {curly braces}.
=item rd_s
The average number of reads per second. This is the number of I/O requests that
were sent to the block device. However, the requests may be merged by the I/O
scheduler, so they might be sent to the physical device differently.
=item rd_avkb
The average size of the reads, in kilobytes.
=item rd_mb_s
The average number of megabytes read per second.
=item rd_io_s =item rd_io_s
The number of IO reads per second, average, during the sampled interval. The average number of IO reads per second. This is the number that is actually
sent to the physical device after merging adjacent requests and any other
processing in the queue.
=item rd_mrg
The percentage of read requests that were merged together in the disk
scheduler before reaching the physical device.
=item rd_cnc =item rd_cnc
The average concurrency of the read operations, as computed by Little's Law The average concurrency of the read operations, as computed by Little's Law.
(a.k.a. queueing theory). This is the end-to-end concurrency, including time spent in the queue.
=item rd_rt =item rd_rt
The average response time of the read operations, in milliseconds. The average response time of the read operations, in milliseconds. This is the
end-to-end response time, including time spent in the queue. It is the response
time that the application making I/O requests sees.
=item wr_mb_s =item wr_s, wr_avkb, wr_mb_s, wr_io_s, wr_mrg, wr_cnc, wr_rt
IO writes per second, average. These columns show write activity, and they match the corresponding columns for
read activity.
=item wr_cnc
Write concurrency, similar to read concurrency.
=item wr_rt
Write response time, similar to read response time.
=item busy =item busy
The fraction of time that the device had at least one request in progress; The fraction of time that the device had at least one request in progress;
this is what iostat calls %util (which is a misleading name). this is what iostat calls %util. It cannot exceed 100% unless there is a
rounding error, but it is a common mistake to think that a device that's busy
all the time is saturated. A device such as a RAID volume should support
concurrency higher than 1, and solid-state drives can support very high
concurrency. Concurrency can grow without bound, and is a more reliable
indicator of how loaded the device really is.
=item in_prg =item in_prg
@@ -3540,38 +3577,58 @@ concurrencies, which are averages that are generated from reliable numbers,
this number is an instantaneous sample, and you can see that it might this number is an instantaneous sample, and you can see that it might
represent a spike of requests, rather than the true long-term average. represent a spike of requests, rather than the true long-term average.
=back =item io_s
In addition to the above columns, there are a few columns that are hidden by The average throughput of the physical device, in I/O operations per second.
default. If you press the 'c' key, and then press Enter, you will blank out This column can be used to help you understand how much activity the underlying
the regular expression pattern that selects columns to display, and you will device is actually doing.
then see the extra columns:
=over =item qtime
=item rd_s The average queue time; that is, time a request spends in the device scheduler
queue before being sent to the physical device. This is an average over reads
and writes.
The number of reads per second. =item stime
=item rd_avkb The average service time; that is, the time elapsed while the physical device
processes the request, after the request leaves the queue. This is an average
over reads and writes.
The average size of the reads, in kilobytes. You can compare the stime and qtime columns to see whether the response time for
reads and writes is spent in the queue or on the physical device. However, you
=item rd_mrg cannot see the difference between reads and writes. Changing the block device
scheduler algorithm might improve queue time greatly. The default algorithm,
The percentage of read requests that were merged together in the disk cfq, is very bad for servers, and should only be used on laptops and
scheduler before reaching the device. workstations that perform tasks such as working with spreadsheets and surfing
the Internet.
=item rd_mb_s
The number of megabytes read per second, average, during the sampled interval.
=item wr_s, wr_avgkb, and wr_mrg, wr_mb_s
These are analogous to their C<rd_*> cousins.
=back =back
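As a worked illustration of the concurrency columns, Little's Law says that average concurrency equals the request rate multiplied by the average response time. The sketch below applies it to the sample output shown earlier. The function name littles_law_cnc is hypothetical and is not part of pt-diskstats. Using wr_s as the request rate is an assumption: it is the rate that reproduces the wr_cnc values in the sample, but this documentation does not state which rate the tool uses internally.

```shell
# littles_law_cnc is a hypothetical helper for illustration only;
# it is not part of pt-diskstats.
# Little's Law: concurrency = request rate (per second) * response time (seconds).
littles_law_cnc() {
    # $1 = requests per second (e.g. wr_s)
    # $2 = average response time in milliseconds (e.g. wr_rt)
    awk -v rate="$1" -v rt_ms="$2" 'BEGIN { printf "%.1f\n", rate * rt_ms / 1000 }'
}

littles_law_cnc 30.6 22.8   # sda writes in the sample: prints 0.7, matching wr_cnc
littles_law_cnc 38.2 21.2   # dm-1 writes in the sample: prints 0.8, matching wr_cnc
```

The same arithmetic explains why busy and concurrency differ: busy is capped at 100%, while the product of rate and response time keeps growing as requests queue up.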
=head1 COLLECTING DATA
It is straightforward to gather a sample of data for this tool. Files should
have this format:
   TS <timestamp>                 <-- must start with a TS line.
   <contents of /proc/diskstats>
   TS <timestamp>
   <contents of /proc/diskstats>
   ... et cetera
You can simply use pt-diskstats with L<"--save-samples"> to collect this data
for you. If you wish to capture samples as part of some other tool, and use
pt-diskstats to analyze them, you can include a snippet of shell script such as
the following:
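   INTERVAL=1
   while true; do
      # Sleep until the next multiple of INTERVAL seconds.
      sleep=$(date +%s.%N | awk "{print $INTERVAL - (\$1 % $INTERVAL)}")
      sleep $sleep
      # Write the TS line first, then the sample that belongs to it.
      date +"TS %s.%N %F %T" >> diskstats-samples.txt
      cat /proc/diskstats >> diskstats-samples.txt
   done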
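If samples are captured by some other tool, it may be worth checking that the file follows the format above before analyzing it. The following is a minimal sketch under stated assumptions: check_samples_file is a hypothetical helper, not part of pt-diskstats, and it checks only that the first line is a TS line and that at least one non-TS line follows. It does not validate the contents of each /proc/diskstats sample.

```shell
# check_samples_file is a hypothetical helper for illustration only;
# it is not part of pt-diskstats.
# It returns success only if the first line of the file is a TS line
# and the file contains at least one non-TS (sample data) line.
check_samples_file() {
    awk '
        NR == 1 && $1 != "TS" { bad = 1; exit }   # file must start with a TS line
        $1 == "TS"            { ts++; next }      # count timestamp lines
                              { data++ }          # count sample data lines
        END { exit (bad || ts == 0 || data == 0) ? 1 : 0 }
    ' "$1"
}
```

A typical use would be to run check_samples_file diskstats-samples.txt and pass the file to pt-diskstats only if the check succeeds.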
=head1 OPTIONS =head1 OPTIONS
This tool accepts additional command-line arguments. Refer to the This tool accepts additional command-line arguments. Refer to the
@@ -3588,31 +3645,30 @@ first option on the command line.
=item --columns-regex =item --columns-regex
type: string; default: cnc|rt|busy|prg|time|io_s type: string; default: .
Perl regex of which columns to include. Print columns that match this Perl regex.
=item --devices-regex =item --devices-regex
type: string type: string
Perl regex of which devices to include. Print devices that match this Perl regex.
=item --group-by =item --group-by
type: string; default: disk type: string; default: disk
Group-by mode (default disk); specify one of the following: Group-by mode: disk, sample, or all. In B<disk> mode, each line of output shows
one disk device. In B<sample> mode, each line of output shows one sample of
disk - Each line of output shows one disk device. statistics. In B<all> mode, each line of output shows one sample and one disk
sample - Each line of output shows one sample of statistics. device.
all - Each line of output shows one sample and one disk device.
=item --sample-time =item --sample-time
type: int; default: 1 type: int; default: 1
In --group-by sample mode, include INTERVAL seconds of samples per group. In --group-by sample mode, include N seconds of samples per group.
=item --save-samples =item --save-samples
@@ -3624,7 +3680,7 @@ File to save diskstats samples in; these can be used for later analysis.
type: int type: int
When in interactive mode, stop after N samples. When in interactive mode, stop after N samples. Run forever by default.
=item --refresh-interval =item --refresh-interval
@@ -3640,7 +3696,8 @@ Show inactive devices.
default: yes default: yes
Print the headers as often as needed to prevent it from scrolling out of view. Print the headers as often as needed to prevent them from scrolling out of view.
You can press the space bar to reprint headers at will.
=item --help =item --help
@@ -3722,4 +3779,21 @@ This program is copyright 2010-2011 Baron Schwartz, 2011 Percona Inc.
Feedback and improvements are welcome. Feedback and improvements are welcome.
THIS PROGRAM IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED THIS PROGRAM IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED
WARRANTIES, INCLUDING, WITHOUT LIMITATION, TH WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
This program is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free Software
Foundation, version 2; OR the Perl Artistic License. On UNIX and similar
systems, you can issue `man perlgpl' or `man perlartistic' to read these
licenses.
You should have received a copy of the GNU General Public License along with
this program; if not, write to the Free Software Foundation, Inc., 59 Temple
Place, Suite 330, Boston, MA 02111-1307 USA.
=head1 VERSION
pt-diskstats 1.0.1
=cut