percona-toolkit/t/pt-table-checksum/pt-1637.t
Sveta Smirnova f9726e75cc PT-1059 tools cannot parse index names containing newlines (#578)
* PT-1059 - Tools can't parse index names containing newlines

Fixed regular expressions in TableParser.
Added a test case, including a test for newlines in column names.

* PT-1059 - Tools can't parse index names containing newlines

Disabled pt-1637.t until PT-2174 is fixed.
Updated number of tables in b/t/pt-table-checksum/issue_1485195.t

* Patch newlines in table columns (#369)

Will accept this change as part of the fix for PT-1059 - Tools cannot parse index names containing new lines. We will later fix the issue with the patch ourselves.

MySQL 5.6.40 allows newlines in column names; however, the following code:

my @defs = $ddl =~ m/^(\s+`.*?),?$/gm;

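breaks because it treats every newline as the end of a line. The 'm' modifier at the end causes this: it makes ^ and $ match at each newline inside the string, and '.' does not match a newline, so a definition whose name contains a newline is cut off at that newline.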
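
As a minimal sketch of the failure described above, the snippet below runs the original pattern against a made-up table definition. The DDL string and the table and column names are assumptions for illustration only; they are not taken from the test samples.

```perl
use strict;
use warnings;

# Hypothetical SHOW CREATE TABLE output (an assumption for illustration).
# The second column name contains a literal newline.
my $ddl = join "\n",
   'CREATE TABLE `t` (',
   '  `id` int(11) NOT NULL,',
   "  `col\nname` varchar(10) DEFAULT NULL,",
   '  PRIMARY KEY (`id`)',
   ') ENGINE=InnoDB';

# Original pattern: with /m, $ matches before every embedded newline.
my @defs = $ddl =~ m/^(\s+`.*?),?$/gm;

# Show what was captured; the second definition stops at the newline
# inside the column name.
print "[$_]\n" for @defs;
```

The second captured element ends right after the first half of the column name, so anything parsing it downstream sees an incomplete definition.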

To correct this issue I've made use of a zero-length assertion known as "positive lookbehind":

https://www.regular-expressions.info/lookaround.html

What does it do?

m/(?:(?<=,\n)|(?<=\(\n))(\s+`(?:.|\n)+?`.+?),?\n/g;

TLDR:

Treat the string as one long string and don't treat \n as the end of a line.

Look for (\s+`(?:.|\n)+?`.+?),?\n

If that matches, look at what precedes the matched string.

If it's ',\n' or '(\n', the string matches. Only save what's inside the capturing parentheses of (\s+`(?:.|\n)+?`.+?),?\n

m/ declares this a match operator.

(?:(?<=,\n)|(?<=\(\n)) This is an alternation (OR) of two look-behind assertions. The ?: tells the enclosing parentheses not to capture the result. The two look-behinds are explained in the next two paragraphs:

(?<=,\n) Look behind the matched string for a comma followed by a newline; the comma must be there for this look-behind to match.

(?<=\(\n) Look behind the matched string for an open parenthesis followed by a newline; the open parenthesis must be there.

(\s+`(?:.|\n)+?`.+?),?\n This is the actual match. Match one or more whitespace characters, followed by a back-tick, followed by one or more characters, each of which may be any character or a newline. This part is not greedy, so it takes as little as possible while still letting the rest of the pattern match. Then match a closing back-tick and one or more further characters, again non-greedily. The match stops at a comma or, failing that, at a newline following the back-tick and some characters. Only the part inside the outer parentheses is captured.

,?\n Match a comma that may not be there, followed by a newline.
/g Don't stop at the first match; keep looking for more matches to the end of the string.
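
The walkthrough above can be checked with a short sketch that applies the look-behind pattern, as quoted in this message, to a made-up table definition. The DDL string and its names are assumptions for illustration, and the pattern that finally landed in TableParser may differ from the one quoted here.

```perl
use strict;
use warnings;

# Hypothetical SHOW CREATE TABLE output (an assumption for illustration).
# The second column name contains a literal newline.
my $ddl = join "\n",
   'CREATE TABLE `t` (',
   '  `id` int(11) NOT NULL,',
   "  `col\nname` varchar(10) DEFAULT NULL,",
   '  PRIMARY KEY (`id`)',
   ') ENGINE=InnoDB';

# Pattern from the commit message: no /m, so a newline inside the
# back-ticks is ordinary text. A definition may start only right after
# ",\n" or "(\n".
my @defs = $ddl =~ m/(?:(?<=,\n)|(?<=\(\n))(\s+`(?:.|\n)+?`.+?),?\n/g;

# Each captured definition keeps the newline inside the column name.
print "[$_]\n" for @defs;
```

With this input both column definitions are captured whole. The PRIMARY KEY line is not captured because no back-tick follows its leading whitespace.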

* PT-1059 - Tools can't parse index names containing newlines

Placed fix from PR-369 into proper place and created test case for this fix.

---------

Co-authored-by: geneguido <31323560+geneguido@users.noreply.github.com>
2023-02-02 17:09:13 +03:00

93 lines
2.4 KiB
Perl

#!/usr/bin/env perl
BEGIN {
   die "The PERCONA_TOOLKIT_BRANCH environment variable is not set.\n"
      unless $ENV{PERCONA_TOOLKIT_BRANCH} && -d $ENV{PERCONA_TOOLKIT_BRANCH};
   unshift @INC, "$ENV{PERCONA_TOOLKIT_BRANCH}/lib";
};
use strict;
use warnings FATAL => 'all';
use English qw(-no_match_vars);
use Test::More;
use PerconaTest;
use Sandbox;
use SqlModes;
require "$trunk/bin/pt-table-checksum";
plan skip_all => 'Disabled until PT-2174 is fixed';
my $dp = new DSNParser(opts=>$dsn_opts);
my $sb = new Sandbox(basedir => '/tmp', DSNParser => $dp);
diag ('Starting second sandbox master');
my ($master1_dbh, $master1_dsn) = $sb->start_sandbox(
   server => 'chan_master1',
   type   => 'master',
);
diag ('Starting second sandbox slave 1');
my ($slave1_dbh, $slave1_dsn) = $sb->start_sandbox(
   server => 'chan_slave1',
   type   => 'slave',
   master => 'chan_master1',
);
diag ('Starting second sandbox slave 2');
my ($slave2_dbh, $slave2_dsn) = $sb->start_sandbox(
   server => 'chan_slave2',
   type   => 'slave',
   master => 'chan_master1',
);
my $dbh = $sb->get_dbh_for('master');
if ( !$dbh ) {
   plan skip_all => 'Cannot connect to sandbox master';
}
else {
   plan tests => 2;
}
diag("loading samples");
$sb->load_file('chan_master1', 't/pt-table-checksum/samples/pt-1637.sql');
my @args = ($master1_dsn,
   "--set-vars", "innodb_lock_wait_timeout=50",
   "--ignore-databases", "mysql", "--no-check-binlog-format",
   "--recursion-method", "dsn=h=127.0.0.1,D=test,t=dsns",
   "--run-time", "5", "--fail-on-stopped-replication",
);
# The sandbox servers run with lock_wait_timeout=3 and it's not dynamic,
# so we need to pass --set-vars innodb_lock_wait_timeout (set to 50 in @args above), else the tool will die.
$sb->do_as_root("chan_slave1", 'stop slave IO_thread;');
my $output;
my $exit_status;
($output, $exit_status) = full_output(
   sub { $exit_status = pt_table_checksum::main(@args) },
   stderr => 1,
);
is(
   $exit_status,
   128,
   "PT-1637 exit status 128 if replication is stopped and --fail-on-stopped-replication",
);
$sb->do_as_root("chan_slave1", 'start slave IO_thread;');
sleep(2);
$sb->stop_sandbox(qw(chan_master1 chan_slave2 chan_slave1));
# #############################################################################
# Done.
# #############################################################################
$sb->wipe_clean($dbh);
ok($sb->ok(), "Sandbox servers") or BAIL_OUT(__FILE__ . " broke the sandbox");
exit;