diff --git a/bin/pt-diskstats b/bin/pt-diskstats
index 4741358e..6c52b73c 100755
--- a/bin/pt-diskstats
+++ b/bin/pt-diskstats
@@ -3193,15 +3193,6 @@ sub help {
    ------------------- Press any key to continue -----------------------
 HELP
    print $help;
-=begin IGNORE
-
-   my $lines = $help =~ tr/\n//;
-
-   while ( $lines-- ) {
-      $Diskstats::printed_lines--;
-      print_header(%args) unless $Diskstats::printed_lines;
-   }
-=cut
    pause(%args);
    return;
 }
@@ -3421,14 +3412,14 @@ if ( !caller ) { exit main(@ARGV); }
 
 =head1 NAME
 
-pt-diskstats - Aggregate and summarize F</proc/diskstats>.
+pt-diskstats - An interactive I/O monitoring tool for GNU/Linux.
 
 =head1 SYNOPSIS
 
 Usage: pt-diskstats [OPTION...] [FILES]
 
-pt-diskstats reads F</proc/diskstats> periodically, or files with the
-contents of F</proc/diskstats>, aggregates the data, and prints it nicely.
+pt-diskstats prints disk I/O statistics for GNU/Linux. It is somewhat similar
+to iostat, but it is interactive and more detailed.
 
 =head1 RISKS
 
@@ -3437,7 +3428,7 @@ whether known or unknown, of using this tool. The two main categories of risks
 are those created by the nature of the tool (e.g. read-only tools vs. read-write
 tools) and those created by bugs.
 
-pt-diskstats is a read-only tool. It should be very low-risk.
+pt-diskstats simply reads /proc/diskstats. It should be very low-risk.
 
 At the time of this release, we know of no bugs that could cause serious harm to
 users.
@@ -3451,87 +3442,133 @@ See also L<"BUGS"> for more information on filing bugs and getting help.
 
 =head1 DESCRIPTION
 
-pt-diskstats tool is similar to iostat, but has some advantages. It separates
-reads and writes, for example, and computes some things that iostat does in
-either incorrect or confusing ways. It is also menu-driven and interactive
-with several different ways to aggregate the data, and integrates well with
-the L tool. These properties make it very convenient for quickly
-drilling down into I/O performance at the desired level of granularity.
+The pt-diskstats tool is similar to iostat, but has some advantages. It prints
+read and write statistics separately, and has more columns. It is menu-driven
+and interactive, with several different ways to aggregate the data. It
+integrates well with the L tool. It also does the "right thing" by
+default, such as hiding disks that are idle. These properties make it very
+convenient for quickly drilling down into I/O performance and inspecting disk
+behavior.
 
-This program works in two main modes. One way is to process a file with saved
-disk statistics, which you specify on the command line. The other way is to
-start a background process gathering samples at intervals and saving them into
-a file, and process this file in the foreground. In both cases, the tool is
-interactively controlled by keystrokes, so you can redisplay and slice the
-data flexibly and easily. If the tool is not attached to a terminal, it
-doesn't run interactively; it just processes and prints its output, then exits.
-Otherwise it loops until you exit with the 'q' key.
+This program works in two modes. The default is to collect samples of
+/proc/diskstats and print out the formatted statistics at intervals. The other
+mode is to process a file that contains saved samples of /proc/diskstats; there
+is a shell script later in this documentation that shows how to collect such a
+file.
 
-If you press the '?' key, you will bring up the interactive help menu that
-shows which keys control the program.
+In both cases, the tool is interactively controlled by keystrokes, so you can
+redisplay and slice the data flexibly and easily. It loops forever, until you
+exit with the 'q' key. If you press the '?' key, you will bring up the
+interactive help menu that shows which keys control the program.
 
-Files should have this format:
+When the program is gathering samples of /proc/diskstats and refreshing its
+display, it prints information about the newest sample each time it refreshes.
+When it is operating on a file of saved samples, it redraws the entire file's
+contents every time you change an option.
 
-   TS <-- must start with a TS line.
-
-   TS
-
-   ... et cetera
-
-Note that previously the format was backwards -- It would put the timestamp
-at the bottom of each sample, not the top. This was doubly troublesome:
-It was inconsistent with how the rest of the Toolkit deals with timestamps,
-and allowed malformed data to sit in the bottom of the file and give incorrect
-results.
-
-See L for a detailed
-example of using the tool.
+The program doesn't print information about every disk device on the system. It
+hides devices that it has never observed to have any activity. You can enable
+and disable this by pressing the 'i' key.
 
 =head1 OUTPUT
 
+The program's output looks like the following sample, which is too wide for this
+manual page, so we have formatted it as several samples with continuations:
+
+   #ts device rd_s rd_avkb rd_mb_s rd_io_s rd_mrg rd_cnc rd_rt ...
+  {10} sda     0.5     4.0     0.0     0.1     0%    0.0  15.6 ...
+  {10} sdb     0.0     0.0     0.0     0.0     0%    0.0   0.0 ...
+  {10} dm-0    0.0     0.0     0.0     0.0     0%    0.0   0.0 ...
+  {10} dm-1    0.5     4.0     0.0     0.1     0%    0.0  15.6 ...
+
+   #ts device ... wr_s wr_avkb wr_mb_s wr_io_s wr_mrg wr_cnc wr_rt ...
+  {10} sda    ... 30.6     6.7     0.2     6.5    40%    0.7  22.8 ...
+  {10} sdb    ...  1.7    17.8     0.0     0.0    77%    0.0   0.8 ...
+  {10} dm-0   ...  2.5     4.0     0.0     0.1     0%    0.0   2.6 ...
+  {10} dm-1   ... 38.2     4.0     0.1     7.6     0%    0.8  21.2 ...
+
+   #ts device ... busy in_prg io_s qtime stime
+  {10} sda    ...   2%      0  6.6   0.0   0.0
+  {10} sdb    ...   0%      0  0.0   0.0   0.0
+  {10} dm-0   ...   0%      0  0.1   0.0   0.0
+  {10} dm-1   ...   2%      0  7.7   0.0   0.0
+
 The columns are as follows:
 
 =over
 
 =item #ts
 
-The number of seconds of samples in the line. If there is only one, then
-the timestamp itself is shown, without the {curly braces}.
+This column's contents vary depending on the tool's aggregation mode. In the
+default mode, when each line contains information about a single disk but
+possibly aggregates across several samples from that disk, this column shows the
+number of samples that were included into the line of output, in {curly braces}.
+In the example shown, each line of output aggregates {10} samples of
+/proc/diskstats.
+
+In the "all" group-by mode, this column shows timestamp offsets, relative to the
+time the tool began aggregating or the timestamp of the previous lines printed,
+depending on the mode. The output can be confusing to explain, but it's rather
+intuitive when you see the lines appearing on your screen periodically.
+
+Similarly, in "sample" group-by mode, the number indicates the total time span
+that is grouped into each sample.
 
 =item device
 
 The device name. If there is more than one device, then instead the number
 of devices aggregated into the line is shown, in {curly braces}.
 
+=item rd_s
+
+The average number of reads per second. This is the number of I/O requests that
+were sent to the block device. However, the requests may be merged by the I/O
+scheduler, so they might be sent to the physical device differently.
+
+=item rd_avkb
+
+The average size of the reads, in kilobytes.
+
+=item rd_mb_s
+
+The average number of megabytes read per second.
+
 =item rd_io_s
 
-The number of IO reads per second, average, during the sampled interval.
+The average number of IO reads per second. This is the number that is actually
+sent to the physical device after merging adjacent requests and any other
+processing in the queue.
+
+=item rd_mrg
+
+The percentage of read requests that were merged together in the disk
+scheduler before reaching the physical device.
 
 =item rd_cnc
 
-The average concurrency of the read operations, as computed by Little's Law
-(a.k.a. queueing theory).
+The average concurrency of the read operations, as computed by Little's Law.
+This is the end-to-end concurrency, including time spent in the queue.
 =item rd_rt
 
-The average response time of the read operations, in milliseconds.
+The average response time of the read operations, in milliseconds. This is the
+end-to-end response time, including time spent in the queue. It is the response
+time that the application making I/O requests sees.
 
-=item wr_mb_s
+=item wr_s, wr_avkb, wr_mb_s, wr_io_s, wr_mrg, wr_cnc, wr_rt
 
-IO writes per second, average.
-
-=item wr_cnc
-
-Write concurrency, similar to read concurrency.
-
-=item wr_rt
-
-Write response time, similar to read response time.
+These columns show write activity, and they match the corresponding columns for
+read activity.
 
 =item busy
 
 The fraction of time that the device had at least one request in progress;
-this is what iostat calls %util (which is a misleading name).
+this is what iostat calls %util. It cannot exceed 100% unless there is a
+rounding error, but it is a common mistake to think that a device that's busy
+all the time is saturated. A device such as a RAID volume should support
+concurrency higher than 1, and solid-state drives can support very high
+concurrency. Concurrency can grow without bound, and is a more reliable
+indicator of how loaded the device really is.
 
 =item in_prg
 
@@ -3540,38 +3577,58 @@ concurrencies, which are averages that are generated from reliable numbers,
 this number is an instantaneous sample, and you can see that it might represent
 a spike of requests, rather than the true long-term average.
 
-=back
+=item ios_s
 
-In addition to the above columns, there are a few columns that are hidden by
-default. If you press the 'c' key, and then press Enter, you will blank out
-the regular expression pattern that selects columns to display, and you will
-then see the extra columns:
+The average throughput of the physical device, in I/O operations per second.
+This column can be used to help you understand how much activity the underlying
+device is actually doing.
-=over
+=item qtime
 
-=item rd_s
+The average queue time; that is, time a request spends in the device scheduler
+queue before being sent to the physical device. This is an average over reads
+and writes.
 
-The number of reads per second.
+=item stime
 
-=item rd_avkb
+The average service time; that is, the time elapsed while the physical device
+processes the request, after the request leaves the queue. This is an average
+over reads and writes.
 
-The average size of the reads, in kilobytes.
-
-=item rd_mrg
-
-The percentage of read requests that were merged together in the disk
-scheduler before reaching the device.
-
-=item rd_mb_s
-
-The number of megabytes read per second, average, during the sampled interval.
-
-=item wr_s, wr_avgkb, and wr_mrg, wr_mb_s
-
-These are analogous to their C cousins.
+You can compare the stime and qtime columns to see whether the response time for
+reads and writes is spent in the queue or on the physical device. However, you
+cannot see the difference between reads and writes. Changing the block device
+scheduler algorithm might improve queue time greatly. The default algorithm,
+cfq, is very bad for servers, and should only be used on laptops and
+workstations that perform tasks such as working with spreadsheets and surfing
+the Internet.
 
 =back
 
+=head1 COLLECTING DATA
+
+It is straightforward to gather a sample of data for this tool. Files should
+have this format:
+
+   TS <-- must start with a TS line.
+
+   TS
+
+   ... et cetera
+
+You can simply use pt-diskstats with L<"--save-samples"> to collect this data
+for you. If you wish to capture samples as part of some other tool, and use
+pt-diskstats to analyze them, you can include a snippet of shell script such as
+the following:
+
+   INTERVAL=1
+   while true; do
+      sleep=$(date +%s.%N | awk "{print $INTERVAL - (\$1 % $INTERVAL)}")
+      sleep $sleep
+      date +"TS %s.%N %F %T" >> diskstats-samples.txt
+      cat /proc/diskstats >> diskstats-samples.txt
+   done
+
 =head1 OPTIONS
 
 This tool accepts additional command-line arguments. Refer to the
@@ -3588,31 +3645,30 @@ first option on the command line.
 
 =item --columns-regex
 
-type: string; default: cnc|rt|busy|prg|time|io_s
+type: string; default: .
 
-Perl regex of which columns to include.
+Print columns that match this Perl regex.
 
 =item --devices-regex
 
 type: string
 
-Perl regex of which devices to include.
+Print devices that match this Perl regex.
 
 =item --group-by
 
 type: string; default: disk
 
-Group-by mode (default disk); specify one of the following:
-
-   disk - Each line of output shows one disk device.
-   sample - Each line of output shows one sample of statistics.
-   all - Each line of output shows one sample and one disk device.
+Group-by mode: disk, sample, or all. In B<disk> mode, each line of output shows
+one disk device. In B<sample> mode, each line of output shows one sample of
+statistics. In B<all> mode, each line of output shows one sample and one disk
+device.
 
 =item --sample-time
 
 type: int; default: 1
 
-In --group-by sample mode, include INTERVAL seconds of samples per group.
+In --group-by sample mode, include N seconds of samples per group.
 
 =item --save-samples
 
@@ -3624,7 +3680,7 @@ File to save diskstats samples in; these can be used for later analysis.
 
 type: int
 
-When in interactive mode, stop after N samples.
+When in interactive mode, stop after N samples. Run forever by default.
 
 =item --refresh-interval
 
@@ -3640,7 +3696,8 @@ Show inactive devices.
 
 default: yes
 
-Print the headers as often as needed to prevent it from scrolling out of view.
+Print the headers as often as needed to prevent them from scrolling out of view.
+You can press the space bar to reprint headers at will.
 
 =item --help
 
@@ -3722,4 +3779,21 @@ This program is copyright 2010-2011 Baron Schwartz, 2011 Percona Inc.
 Feedback and improvements are welcome.
 
 THIS PROGRAM IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED
-WARRANTIES, INCLUDING, WITHOUT LIMITATION, TH
\ No newline at end of file
+WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF
+MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
+
+This program is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free Software
+Foundation, version 2; OR the Perl Artistic License. On UNIX and similar
+systems, you can issue `man perlgpl' or `man perlartistic' to read these
+licenses.
+
+You should have received a copy of the GNU General Public License along with
+this program; if not, write to the Free Software Foundation, Inc., 59 Temple
+Place, Suite 330, Boston, MA 02111-1307 USA.
+
+=head1 VERSION
+
+pt-diskstats 1.0.1
+
+=cut
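
The rd_cnc and wr_cnc descriptions in this patch say that concurrency is computed by Little's Law. As a rough sanity check on the sample output in the OUTPUT section, the sketch below multiplies requests per second by response time. This is an illustration only: the helper name littles_law_concurrency is hypothetical and not part of pt-diskstats, and the choice of wr_s and wr_rt as inputs is an assumption inferred from the sample figures, not taken from the tool's source.

```shell
# Hypothetical helper, not part of pt-diskstats.
# Little's Law: concurrency = arrival rate x response time.
#   $1 = requests per second (assumed here to be the rd_s / wr_s column)
#   $2 = response time in milliseconds (the rd_rt / wr_rt column)
littles_law_concurrency() {
    # Convert milliseconds to seconds, then round to one decimal place
    # to match the precision of the sample output.
    awk -v rate="$1" -v rt_ms="$2" 'BEGIN { printf "%.1f\n", rate * rt_ms / 1000 }'
}

littles_law_concurrency 30.6 22.8   # sda writes in the sample; prints 0.7
littles_law_concurrency 38.2 21.2   # dm-1 writes in the sample; prints 0.8
```

Both results agree with the wr_cnc values in the sample rows for sda and dm-1, which suggests the documented columns are at least consistent with this reading of Little's Law.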