PHP regular expression to match a keyword outside of HTML <a> tags
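I have been trying to write a regular expression that matches and replaces occurrences of a keyword in a piece of HTML:
- I want to match keyword and <strong>keyword</strong>
- but <a href="someurl.html" target="_blank">keyword</a> and <a href="someur2.html">already linked keyword </a> should not be matched

I am only interested in matching (and replacing) the keyword occurrences from the first item.
The reason I want this is to replace keyword with <a href="dictionary.php?k=keyword">keyword</a>, but only where keyword is not already inside an <a> tag.
Any help would be greatly appreciated!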
$str = preg_replace('~Moses(?!(?>[^<]*(?:<(?!/?a\b)[^<]*)*)</a>)~i', '<a href="novo-mega-link.php">$0</a>', $str);
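The expression inside the negative lookahead matches up to the next closing </a> tag, but only if it does not see an opening <a> tag first. If it succeeds, the word Moses is inside an anchor element, so the lookahead fails and no match occurs.
Here is a demo.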
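To make the behaviour of that pattern easier to check, here is a minimal sketch that wraps it in a function and runs it on a short sample. The function name link_keyword_outside_anchors, its parameters, and the sample string are hypothetical and not part of the original answer; only the lookahead itself is taken from it.

```php
<?php
// Hypothetical helper (not from the original answer): wraps every occurrence
// of $keyword in a link unless it already sits inside an <a> element.
function link_keyword_outside_anchors(string $html, string $keyword, string $url): string
{
    // Same negative lookahead as in the answer, with the keyword escaped
    // so that it is matched literally.
    $pattern = '~' . preg_quote($keyword, '~')
        . '(?!(?>[^<]*(?:<(?!/?a\b)[^<]*)*)</a>)~i';

    // A callback avoids having "$" or "\" in the URL read as backreferences.
    return preg_replace_callback($pattern, function (array $m) use ($url): string {
        return '<a href="' . htmlspecialchars($url) . '">' . $m[0] . '</a>';
    }, $html);
}

$html = 'Moses supposes, but <a href="original-moses1.html">Moses</a> supposes erroneously. '
      . '<strong>Moses</strong>!';

$linked = link_keyword_outside_anchors($html, 'Moses', 'novo-mega-link.php');
echo $linked, "\n";
```

The plain and <strong> occurrences get linked, while the one that is already inside an anchor is left alone. Note that the pattern knows nothing about attributes, so a keyword that appears inside an attribute value outside of any <a> element would be wrapped as well.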
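I managed to do what I wanted (without using a regular expression) by:
- parsing every character of my string
- removing all the <a> tags (copying them into a temporary array and leaving a placeholder in the string)
- running str_replace on the new string to replace every keyword
- putting the original <a> tags back in place of the placeholders

Here is the code I used, in case anyone else needs it: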
$str = <<<STRA
Moses supposes his toeses are roses, but <a href="original-moses1.html">Moses</a> supposes erroneously; for nobody's toeses are posies of roses, as Moses supposes his toeses to be. Ganda <span class="cenas"><a href="original-moses2.html" target="_blank">Moses</a></span>!
STRA;

$arr1 = str_split($str);
$arr_links = array();      // the original <a>...</a> elements, in order
$phrase_holder = '';       // the text with each link replaced by {%n%}
$current_a = 0;            // index of the link currently being collected
$goto_arr_links = false;   // true while we are inside an <a> element
$close_a = false;          // true while skipping the rest of "</a>"

foreach ($arr1 as $k => $v) {
    if ($close_a == true) {
        // skip the remaining characters of the closing tag
        if ($v == '>') {
            $close_a = false;
        }
        continue;
    }
    if ($goto_arr_links == true) {
        $arr_links[$current_a] .= $v;
    }
    // substr() avoids reading past the end of the array
    // (note: "<a" also matches tags such as <abbr>)
    if ($goto_arr_links == false && $v == '<' && substr($str, $k + 1, 1) == 'a') { /* <a */
        // keep collecting every char until </a>
        $arr_links[$current_a] = $v;
        $goto_arr_links = true;
    } elseif ($goto_arr_links == true && $v == '<' && substr($str, $k + 1, 3) == '/a>') { /* </a> */
        $arr_links[$current_a] .= "/a>";
        $goto_arr_links = false;
        $close_a = true;
        $phrase_holder .= "{%$current_a%}"; /* put a parameter holder on the phrase */
        $current_a++;
    } elseif ($goto_arr_links == false) {
        $phrase_holder .= $v;
    }
}

echo "Links Array:\n";
print_r($arr_links);
echo "\n\n\nPhrase Holder:\n";
echo $phrase_holder;

echo "\n\n\n(pre) Final Phrase (with my keyword replaced):\n";
$final_phrase = str_replace("Moses", "<a href=\"novo-mega-link.php\">Moses</a>", $phrase_holder);
echo $final_phrase;

echo "\n\n\nFinal Phrase:\n";
foreach ($arr_links as $k => $v) {
    $final_phrase = str_replace("{%$k%}", $v, $final_phrase);
}
echo $final_phrase;
Output:

Links Array:
Array
(
    [0] => <a href="original-moses1.html">Moses</a>
    [1] => <a href="original-moses2.html" target="_blank">Moses</a>
)

Phrase Holder:
Moses supposes his toeses are roses, but {%0%} supposes erroneously; for nobody's toeses are posies of roses, as Moses supposes his toeses to be. Ganda <span class="cenas">{%1%}</span>!

(pre) Final Phrase (with my keyword replaced):
<a href="novo-mega-link.php">Moses</a> supposes his toeses are roses, but {%0%} supposes erroneously; for nobody's toeses are posies of roses, as <a href="novo-mega-link.php">Moses</a> supposes his toeses to be. Ganda <span class="cenas">{%1%}</span>!

Final Phrase:
<a href="novo-mega-link.php">Moses</a> supposes his toeses are roses, but <a href="original-moses1.html">Moses</a> supposes erroneously; for nobody's toeses are posies of roses, as <a href="novo-mega-link.php">Moses</a> supposes his toeses to be. Ganda <span class="cenas"><a href="original-moses2.html" target="_blank">Moses</a></span>!
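The same placeholder idea (set the links aside, replace, put them back) can probably be written more compactly with preg_replace_callback. The following is only a sketch under a few assumptions: the function name replace_outside_anchors is made up for illustration, <a> elements are assumed not to be nested, and the text is assumed not to contain anything that already looks like a {%n%} placeholder.

```php
<?php
// Hypothetical helper: replaces $keyword everywhere except inside <a> elements.
function replace_outside_anchors(string $html, string $keyword, string $replacement): string
{
    $links = [];

    // 1. Move every <a>...</a> element into $links, leaving {%n%} behind.
    $held = preg_replace_callback(
        '~<a\b[^>]*>.*?</a>~is',
        function (array $m) use (&$links): string {
            $links[] = $m[0];
            return '{%' . (count($links) - 1) . '%}';
        },
        $html
    );

    // 2. Replace the keyword in what is left.
    $held = str_replace($keyword, $replacement, $held);

    // 3. Put the original links back in a single pass.
    return preg_replace_callback(
        '~\{%(\d+)%\}~',
        function (array $m) use ($links): string {
            return $links[(int) $m[1]] ?? $m[0];
        },
        $held
    );
}

$sample = 'Moses supposes, but <a href="original-moses1.html">Moses</a> supposes erroneously. '
        . 'Ganda <span class="cenas"><a href="original-moses2.html" target="_blank">Moses</a></span>!';

$result = replace_outside_anchors($sample, 'Moses', '<a href="novo-mega-link.php">Moses</a>');
echo $result, "\n";
```

Unlike the character loop, the \b after <a keeps tags such as <abbr> from being treated as links.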
$lines = explode( "\n", $content );
$lines[0] = str_ireplace( "keyword", "replacement", $lines[0] );
$content = implode( "\n", $lines );
Or, if you explicitly want to use a regular expression:
$lines = explode( "\n", $content );
$lines[0] = preg_replace( "/keyword/i", "replacement", $lines[0] );
$content = implode( "\n", $lines );
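Consider using an HTML parsing library such as simplehtmldom instead of regular expressions. You can use it to update the contents of specific HTML tags only (and so ignore the ones you don't want to change). That way you don't need a regular expression at all: once you have filtered for the appropriate tags, a function like str_replace is enough.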