Negative regex for Perl string pattern matching
I have this regular expression:
if($string =~ m/^(Clinton|[^Bush]|Reagan)/i) {print "$string\n"};
I want to match Clinton and Reagan, but not Bush.
It doesn't work.
Sample text:
Clinton said
Bush used crayons
Reagan forgot
Just leave Bush out of the match:
    $ perl -ne 'print if /^(Clinton|Reagan)/' textfile
    Clinton said
    Reagan forgot
Or, if you really want to spell out the exclusion:
    $ perl -ne 'print if /^(?!Bush)(Clinton|Reagan)/' textfile
    Clinton said
    Reagan forgot
Your regex doesn't work because [] defines a character class, but what you want is a lookahead:
    (?=)    - Positive lookahead assertion:   foo(?=bar) matches foo when followed by bar
    (?!)    - Negative lookahead assertion:   foo(?!bar) matches foo when not followed by bar
    (?<=)   - Positive lookbehind assertion:  (?<=foo)bar matches bar when preceded by foo
    (?<!)   - Negative lookbehind assertion:  (?<!foo)bar matches bar when NOT preceded by foo
    (?>)    - Once-only subpatterns:          (?>\d+)bar is a performance gain when bar is not present
    (?(x))  - Conditional subpatterns:        (?(3)foo|fu)bar matches foo if the 3rd subpattern has matched, fu if not
    (?#)    - Comment:                        (?# Pattern does x y or z)
So try: (?!bush)
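To make the difference concrete, here is a minimal sketch comparing the two constructs at the start of a line. The sub names `class_match` and `lookahead_match` and the line "Sam said" are made up for illustration; they do not come from the question.

```perl
use strict;
use warnings;

# [^Bush] consumes exactly ONE character that is not b, u, s or h
# (in either case, because of /i).
sub class_match     { return $_[0] =~ /^[^Bush]/i ? 1 : 0 }

# (?!Bush) consumes nothing; it only asserts that the text at this
# position does not start with "Bush".
sub lookahead_match { return $_[0] =~ /^(?!Bush)/i ? 1 : 0 }

for my $line ('Bush used crayons', 'Sam said', 'Clinton said') {
    printf "%-18s class=%d lookahead=%d\n",
        $line, class_match($line), lookahead_match($line);
}
```

The character class rejects "Sam said" merely because it starts with an "S", while the lookahead rejects only lines that actually begin with "Bush".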
Your regex reads as follows:
    /^        - if the line starts with
    (         - start a capture group
    Clinton   - "Clinton"
    |         - or
    [^Bush]   - any single character except "B", "u", "s" or "h"
    |         - or
    Reagan)   - "Reagan". End capture group.
    /i        - make matches case-insensitive
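In other words, the middle part of your regex is what trips you up. It acts as a catch-all alternative: it accepts any line whose first character is not one of the letters in "Bush", in either upper or lower case. For example, these lines all match your regex: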
    Our president, George Bush
    In the news today, pigs can fly
    012-3123 33
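A quick way to check this is to run the regex from the question against a few lines. This is only a sketch: the helper name `matches_original` and the line "Sam said" are illustrative additions.

```perl
use strict;
use warnings;

# The regex from the question, unchanged.
my $original = qr/^(Clinton|[^Bush]|Reagan)/i;

sub matches_original { return $_[0] =~ $original ? 1 : 0 }

for my $line ('Our president, George Bush', '012-3123 33',
              'Bush used crayons', 'Sam said') {
    printf "%-28s %s\n", $line,
        matches_original($line) ? 'matches' : 'no match';
}
```

The first two lines match through the `[^Bush]` alternative. The last two fail only because their first letter happens to be one of b, u, s, h.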
You can either use a negative lookahead, as suggested earlier, or simply use two regexes:
    if ( ($string =~ m/^(Clinton|Reagan)/i) and ($string !~ m/^Bush/i) ) {
        print "$string\n";
    }
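As mirod pointed out in the comments, the second check is unnecessary when the ^ (caret) anchors the match to the start of the line, because a line that starts with "Clinton" or "Reagan" cannot also start with "Bush".
Without the carets, however, the second check would be needed.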
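The following sketch compares the one-regex and two-regex checks, first anchored and then unanchored. The sub name `checks` and the two sample lines containing "met" are invented for this illustration.

```perl
use strict;
use warnings;

# Returns four flags for a line:
#   anchored, one regex / anchored, two regexes /
#   unanchored, one regex / unanchored, two regexes
sub checks {
    my ($s) = @_;
    my $anch1  = $s =~ /^(Clinton|Reagan)/i   ? 1 : 0;
    my $anch2  = ($anch1 && $s !~ /^Bush/i)   ? 1 : 0;
    my $loose1 = $s =~ /(Clinton|Reagan)/i    ? 1 : 0;
    my $loose2 = ($loose1 && $s !~ /Bush/i)   ? 1 : 0;
    return ($anch1, $anch2, $loose1, $loose2);
}

for my $line ('Clinton said', 'Clinton met Bush', 'Bush met Clinton') {
    printf "%-18s %s\n", $line, join ' ', checks($line);
}
```

The first two flags agree on every line, which is why the anchored second check is redundant. The last two differ as soon as "Bush" appears anywhere in a line that also mentions Clinton or Reagan.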
What's wrong with using two regexes (or three)? It makes your intent clearer and may even improve performance:
    if ($string =~ /^(Clinton|Reagan)/i && $string !~ /Bush/i) { ... }

    if (($string =~ /^Clinton/i || $string =~ /^Reagan/i) && $string !~ /Bush/i) {
        print "$string\n"
    }
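If I understand correctly, you want to match any line that contains both Clinton and Reagan, in either order, but not Bush. As Stuck suggested, here is a version with lookahead assertions: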
    #!/usr/bin/perl
    use strict;
    use warnings;

    my $regex = qr/
        ^              # anchor, so the lookaheads are only tried from the start of the line
        (?=.*clinton)  # clinton must appear somewhere in the line
        (?!.*bush)     # bush must not appear anywhere in the line
        .*reagan       # reagan must appear somewhere in the line
    /ix;

    while (<DATA>) {
        chomp;
        next unless (/$regex/);
        print $_, "\n";
    }

    __DATA__
    shouldn't match - reagan came first, then clinton, finally bush
    first match - first two: reagan and clinton
    second match - first two reverse: clinton and reagan
    shouldn't match - last two: clinton and bush
    shouldn't match - reverse: bush and clinton
    shouldn't match - and then came obama, along comes mary
    shouldn't match - to clinton with perl
Results:
    first match - first two: reagan and clinton
    second match - first two reverse: clinton and reagan
As desired, it matches any line that contains both Reagan and Clinton, in either order.
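One subtlety with combined lookaheads of this kind: unless the pattern is anchored at the start of the line, the engine is free to retry from a later offset, where the negative lookahead may no longer see "bush". The sketch below illustrates this with a made-up sample line; the names `$loose`, `$anchored`, `loose_match` and `anchored_match` are chosen for this example only.

```perl
use strict;
use warnings;

# Same lookaheads, without and with a start-of-line anchor.
my $loose    = qr/   (?=.*clinton) (?!.*bush) .*reagan /ix;
my $anchored = qr/ ^ (?=.*clinton) (?!.*bush) .*reagan /ix;

sub loose_match    { return $_[0] =~ $loose    ? 1 : 0 }
sub anchored_match { return $_[0] =~ $anchored ? 1 : 0 }

# Made-up sample line where "bush" sits at the very start.
my $line = 'bush, then clinton and reagan';

# Unanchored: the attempt at offset 0 fails, but the engine retries at
# offset 1 ("ush, ..."), where no "bush" lies ahead, so the line matches.
print "loose:    ", loose_match($line), "\n";

# Anchored: only offset 0 is tried, so "bush" is always seen.
print "anchored: ", anchored_match($line), "\n";
```

Under these assumptions the unanchored pattern accepts the line and the anchored one rejects it, so anchoring with `^` is the safer choice when "bush" should be excluded anywhere in the line.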
You may want to read up on how lookahead assertions work, with examples, at http://www252.pair.com/comdog/mastering_perl/Chapters/02.advanced_regular_expressions.html
They are very tasty :)