MySQL – Return the matched pattern from a REGEXP query

I have a MySQL database and I have this query:

SELECT `id`, `originaltext` FROM `source` WHERE `originaltext` regexp '[0-9][0-9]'

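This finds every originaltext that contains two consecutive digits.

I need MySQL to return those digits as a field, so I can manipulate them further.

Ideally I would also add a condition that the number must be > 20, but I can do that separately.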

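If you want more regular expression power in your database, you can consider using LIB_MYSQLUDF_PREG. This is an open source library of MySQL user functions that imports the PCRE library. LIB_MYSQLUDF_PREG is delivered in source code form only. To use it, you need to be able to compile it and install it into your MySQL server. Installing this library does not change MySQL's built-in regular expression support in any way. It merely makes the following additional functions available:

PREG_CAPTURE extracts a regular expression match from a string.
PREG_POSITION returns the position at which a regular expression matches a string.
PREG_REPLACE performs a search-and-replace on a string.
PREG_RLIKE tests whether a regular expression matches a string.

All of these functions take a regular expression as their first parameter. The regular expression must be formatted like a Perl regular expression operator. For example, to test whether regex matches the subject case-insensitively, you would use the MySQL code PREG_RLIKE('/regex/i', subject). This is similar to PHP's preg functions, which also require the extra // delimiters for regular expressions inside the PHP string.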

If you want something simpler, you could adapt this function to better suit your needs.

    DELIMITER $$

    CREATE FUNCTION REGEXP_EXTRACT(string TEXT, exp TEXT)
    -- Extract the first longest string that matches the regular expression.
    -- If the string is 'ABCD', check all substrings and see what matches:
    -- 'ABCD', 'ABC', 'AB', 'A', 'BCD', 'BC', 'B', 'CD', 'C', 'D'
    -- It's not smart enough to handle things like (A)|(BCD) correctly, in that
    -- it will return the whole string, not just the matching token.
    RETURNS TEXT
    DETERMINISTIC
    BEGIN
      DECLARE s INT DEFAULT 1;
      DECLARE e INT;
      DECLARE adjustStart TINYINT DEFAULT 1;
      DECLARE adjustEnd TINYINT DEFAULT 1;

      -- Because REGEXP matches anywhere in the string, and we only want the part
      -- that matches, adjust the expression to add '^' and '$'.
      -- Of course, if those are already there, don't add them, but change the
      -- method of extraction accordingly.
      IF LEFT(exp, 1) = '^' THEN
        SET adjustStart = 0;
      ELSE
        SET exp = CONCAT('^', exp);
      END IF;
      IF RIGHT(exp, 1) = '$' THEN
        SET adjustEnd = 0;
      ELSE
        SET exp = CONCAT(exp, '$');
      END IF;

      -- Loop through the string, moving the end pointer back towards the start
      -- pointer, then advance the start pointer and repeat.
      -- Bail out of the loops early if the original expression started with '^'
      -- or ended with '$', since that means the pointers can't move.
      -- s and e are character positions, so use CHAR_LENGTH (LENGTH counts bytes).
      WHILE (s <= CHAR_LENGTH(string)) DO
        SET e = CHAR_LENGTH(string);
        WHILE (e >= s) DO
          -- The third argument of SUBSTRING is a length, not an end position.
          IF SUBSTRING(string, s, e - s + 1) REGEXP exp THEN
            RETURN SUBSTRING(string, s, e - s + 1);
          END IF;
          IF adjustEnd THEN
            SET e = e - 1;
          ELSE
            SET e = s - 1; -- ugh, such a hack to end it early
          END IF;
        END WHILE;
        IF adjustStart THEN
          SET s = s + 1;
        ELSE
          SET s = CHAR_LENGTH(string) + 1; -- ugh, such a hack to end it early
        END IF;
      END WHILE;

      RETURN NULL;
    END$$

    DELIMITER ;
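To see why the function above stumbles on something like (A)|(BCD), it may help to mirror its search order outside the database. The following is only a rough Python sketch of the same idea, not the SQL function itself: `first_longest_match` and its `wrap` flag are hypothetical names, it skips the special handling of patterns that already begin with '^' or end with '$', and Python's `re` engine is not the same as MySQL's. The point it illustrates is that '^' and '$' bind tighter than '|', so the anchored pattern becomes "starts with A, or ends with BCD" unless the original expression is wrapped in a group first.

```python
import re

def first_longest_match(text, exp, wrap=False):
    """Try every substring, longest first from each start position,
    and return the first one that the anchored pattern accepts."""
    # wrap=True puts the expression in a group before anchoring it,
    # so that a top-level '|' cannot escape the '^' and '$'.
    body = "(" + exp + ")" if wrap else exp
    anchored = re.compile("^" + body + "$")
    for start in range(len(text)):
        for end in range(len(text), start, -1):
            candidate = text[start:end]
            if anchored.search(candidate):
                return candidate
    return None

print(first_longest_match("abc 42 def", "[0-9][0-9]"))     # 42
print(first_longest_match("ABCD", "(A)|(BCD)"))             # ABCD
print(first_longest_match("ABCD", "(A)|(BCD)", wrap=True))  # A
```

If the same grouping were applied inside the SQL function (for example CONCAT('^(', exp, ')$')), the alternation case would presumably behave the same way, but that is an untested assumption.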

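MySQL (before version 8.0, which added REGEXP_SUBSTR()) has no syntax for extracting text with a regular expression. You can use REGEXP to identify the rows that contain two consecutive digits, but to extract them you have to fall back on the ordinary string manipulation functions, which is very difficult in this case.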

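Alternatives:

Select the entire value from the database, then use a regular expression on the client.
Use a different database that has better support for the SQL standard (may not be an option, I know). Then you could use this: SUBSTRING(originaltext from '%#"[0-9]{2}#"%' for '#').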
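As a rough illustration of the first alternative, here is a minimal Python sketch of doing the extraction on the client. The rows are hard-coded stand-ins for whatever `SELECT id, originaltext FROM source` would return (the sample values are made up), and the helper name `two_digit_numbers` is hypothetical. It also applies the "> 20" condition from the question.

```python
import re

# Two consecutive ASCII digits, like the REGEXP '[0-9][0-9]' in the query.
TWO_DIGITS = re.compile(r"[0-9][0-9]")

def two_digit_numbers(text, minimum=None):
    """Return every non-overlapping two-digit match in text as an int.

    If minimum is given, keep only values strictly greater than it.
    """
    values = [int(m) for m in TWO_DIGITS.findall(text)]
    if minimum is not None:
        values = [v for v in values if v > minimum]
    return values

# Hypothetical rows standing in for the result of
# SELECT id, originaltext FROM source
rows = [
    (1, "order 42 shipped"),
    (2, "only 7 left"),
    (3, "rooms 15 and 38"),
]

for row_id, originaltext in rows:
    print(row_id, two_digit_numbers(originaltext, minimum=20))
```

Note that, like the original REGEXP, this pattern does not care about surrounding digits: a longer run such as "1234" yields 12 and 34. If only standalone two-digit numbers are wanted, the pattern would need word boundaries or lookarounds.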

I had the same problem, and this is the solution I found (but it won't work in all cases):

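Use LOCATE() to find the beginning and the end of the string you want to match.
Use MID() to extract the substring in between.
Keep the REGEXP condition so that you only process rows where you are sure to find a match.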

I use my code as a stored function; it extracts any number made up of digits in a single contiguous block. It is part of my wider library.

    DELIMITER $$
    -- 2013.04 michal@glebowski.pl
    -- FindNumberInText("ab 234 95 cd", TRUE) => 234
    -- FindNumberInText("ab 234 95 cd", FALSE) => 95
    DROP FUNCTION IF EXISTS FindNumberInText$$
    CREATE FUNCTION FindNumberInText(_input VARCHAR(64), _fromLeft BOOLEAN) RETURNS VARCHAR(32)
    BEGIN
      DECLARE _r VARCHAR(32) DEFAULT '';
      DECLARE _i INTEGER DEFAULT 1;
      DECLARE _start INTEGER DEFAULT 0;
      DECLARE _IsCharNumeric BOOLEAN;

      IF NOT _fromLeft THEN SET _input = REVERSE(_input); END IF;

      _loop: REPEAT
        SET _IsCharNumeric = LOCATE(MID(_input, _i, 1), "0123456789") > 0;
        IF _IsCharNumeric THEN
          IF _start = 0 THEN SET _start = _i; END IF;
        ELSE
          IF _start > 0 THEN LEAVE _loop; END IF;
        END IF;
        SET _i = _i + 1;
      UNTIL _i > length(_input) END REPEAT;

      IF _start > 0 THEN
        SET _r = MID(_input, _start, _i - _start);
        IF NOT _fromLeft THEN SET _r = REVERSE(_r); END IF;
      END IF;

      RETURN _r;
    END$$
