percona-toolkit/lib
Sveta Smirnova f9726e75cc PT-1059 tools cannot parse index names containing newlines (#578)
* PT-1059 - Tools can't parse index names containing newlines

Fixed regular expressions in TableParser.
Added a test case, including a test for newlines in column names.

* PT-1059 - Tools can't parse index names containing newlines

Disabled pt-1637.t until PT-2174 is fixed.
Updated number of tables in b/t/pt-table-checksum/issue_1485195.t

* Patch newlines in table columns (#369)

Will accept this change as part of the fix for PT-1059 - Tools cannot parse index names containing newlines. We will later fix the issue with the patch ourselves.

MySQL 5.6.40 allows newlines in column names; however, the following code:

my @defs = $ddl =~ m/^(\s+`.*?),?$/gm;

breaks because it treats the newlines inside a name as line ends. The 'm' modifier at the end causes this: it makes ^ and $ match at every newline inside the string, not only at the start and end of the string.
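
A minimal sketch of the failure, assuming a hypothetical table whose second column name contains a newline. The DDL string is made up for illustration and is not taken from the test suite:

```perl
use strict;
use warnings;

# Hypothetical SHOW CREATE TABLE output (an assumption for this sketch).
# The second column is named "a<newline>b".
my $ddl = "CREATE TABLE `t` (\n"
        . "  `id` int(11) NOT NULL,\n"
        . "  `a\nb` varchar(10) DEFAULT NULL,\n"
        . "  PRIMARY KEY (`id`)\n"
        . ") ENGINE=InnoDB";

# Original pattern: with /m, $ matches before the newline inside the name.
my @defs = $ddl =~ m/^(\s+`.*?),?$/gm;

printf "%d definitions\n", scalar @defs;
print "[$_]\n" for @defs;
# Prints:
# 2 definitions
# [  `id` int(11) NOT NULL]
# [  `a]
```

The second definition is cut off at the newline inside the column name, so the column's real name and type are lost.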

To correct this issue I've made use of a zero-length assertion known as a "positive lookbehind":

https://www.regular-expressions.info/lookaround.html

What does it do?

m/(?:(?<=,\n)|(?<=\(\n))(\s+`(?:.|\n)+?`.+?),?\n/g;

TLDR:

Treat the string as one long string and don't treat \n as the end of a line.

Look for (\s+`(?:.|\n)+?`.+?),?\n

If that matches, look at what precedes the match.

If it's ',\n' or '(\n', the string matches. Only what's inside the capturing group (\s+`(?:.|\n)+?`.+?) is saved; the trailing ,?\n is not.

m/ declares this a match operator.

(?:(?<=,\n)|(?<=\(\n)) This is an alternation (OR) of two look-behind assertions. The ?: tells the enclosing parentheses not to capture the result. The two look-behinds are described below:

(?<=,\n) Look behind the matched string for a comma followed by a newline; the comma must be there for this look-behind to match.

(?<=\(\n) Look behind the matched string for an open parenthesis followed by a newline; the open parenthesis must be there. It is escaped as \( so that it is taken literally.

(\s+`(?:.|\n)+?`.+?),?\n This is the actual match. Match one or more whitespace characters, followed by a back-tick, followed by one or more characters, each of which can be any character or a newline. That part is non-greedy, so it takes as little as possible while still letting the rest of the pattern match. It is followed by the closing back-tick and then one or more characters, again non-greedy. The captured text stops before a comma or, failing that, before the newline that follows the back-tick and some characters.

,?\n Match an optional comma followed by a newline.
/g Don't stop after the first match; keep looking for more matches to the end of the string.
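
Putting the pieces together, a minimal sketch that applies the pattern above to a hypothetical DDL string with a newline in a column name. The table definition is an assumption made for illustration; only the regular expression comes from this patch:

```perl
use strict;
use warnings;

# Hypothetical SHOW CREATE TABLE output (an assumption for this sketch).
# The second column is named "a<newline>b".
my $ddl = "CREATE TABLE `t` (\n"
        . "  `id` int(11) NOT NULL,\n"
        . "  `a\nb` varchar(10) DEFAULT NULL,\n"
        . "  PRIMARY KEY (`id`)\n"
        . ") ENGINE=InnoDB";

# A definition must be preceded by ",\n" or "(\n"; no /m, so a newline
# inside a back-ticked name no longer ends the definition.
my @defs = $ddl =~ m/(?:(?<=,\n)|(?<=\(\n))(\s+`(?:.|\n)+?`.+?),?\n/g;

printf "%d definitions\n", scalar @defs;
for my $def (@defs) {
    (my $shown = $def) =~ s/\n/\\n/g;   # make embedded newlines visible
    print "[$shown]\n";
}
# Prints:
# 2 definitions
# [  `id` int(11) NOT NULL]
# [  `a\nb` varchar(10) DEFAULT NULL]
```

The PRIMARY KEY line is not captured in either version of the pattern, because it does not start with a back-tick after the leading whitespace.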

* PT-1059 - Tools can't parse index names containing newlines

Placed fix from PR-369 into proper place and created test case for this fix.

---------

Co-authored-by: geneguido <31323560+geneguido@users.noreply.github.com>
2023-02-02 17:09:13 +03:00