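Validating IPv4 addresses with regular expressions
I've been trying to get an efficient regex for IPv4 validation, but without much luck. At one point it seemed I had it with (25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?(\.|$)){4}, but it produces some strange results: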
[chris@helios bashscripts]$ grep --version
grep (GNU grep) 2.7
[chris@helios bashscripts]$ grep -E '\b(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?(\.|$)){4}\b' <<< 192.168.1.1
192.168.1.1
[chris@helios bashscripts]$ grep -E '\b(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?(\.|$)){4}\b' <<< 192.168.1.255
192.168.1.255
[chris@helios bashscripts]$ grep -E '\b(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?(\.|$)){4}\b' <<< 192.168.255.255
[chris@helios bashscripts]$ grep -E '\b(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?(\.|$)){4}\b' <<< 192.168.1.2555
192.168.1.2555
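I did a search to see whether this had already been asked and answered, but the other answers either simply show how to match 4 groups of 1-3 digits or do not work for me.
Any ideas? Thanks!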
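You've already got a working answer, but in case you are curious about what was wrong with your original approach: you need parentheses around your alternation. Without them, the (\.|$) belongs only to the last alternative, [01]?[0-9][0-9]?, so the separator is only required when the number is less than 200.
'\b((25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)(\.|$)){4}\b'
    ^                                    ^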
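To see the effect of the extra parentheses, here is a minimal Python sketch that runs both patterns against the two inputs from the question that misbehaved. The names ORIGINAL, FIXED and found are made up for this illustration, and re.search is used as a rough stand-in for grep -E; it is an approximation of the grep session, not a replacement for it.

```python
import re

# Pattern from the question: "(\.|$)" is part of the third alternative only.
ORIGINAL = r'\b(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?(\.|$)){4}\b'
# Pattern from this answer: the separator applies to every alternative.
FIXED = r'\b((25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)(\.|$)){4}\b'


def found(pattern, text):
    """Return True if the pattern matches anywhere in text (like grep)."""
    return re.search(pattern, text) is not None


for address in ("192.168.255.255", "192.168.1.2555"):
    # Print the input, then the result for the original and the fixed pattern.
    print(address, found(ORIGINAL, address), found(FIXED, address))
```

With the original pattern, an octet of 200 or more can never be followed by a dot, which is why 192.168.255.255 is missed, while 168.1.2555 is accepted as "168." "1." "255" "5".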
^((25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)$
Accept:
127.0.0.1
192.168.1.1
192.168.1.255
255.255.255.255
0.0.0.0
1.1.1.01
Reject:
30.168.1.255.1
127.1
192.168.1.256
-1.2.3.4
3...3
Try it online with unit tests: https://www.debuggex.com/r/-EDZOqxTxhiTncN6/1
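Regex is not the tool for this job. It would be better to write a parser that separates the four numbers and checks that each of them is in the range [0, 255]. Even the non-working regexes are already unreadable!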
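A parser of that kind could look roughly like the following Python sketch. The function name is_dotted_quad is made up here, and rejecting leading zeros such as "01" is an assumption of this sketch rather than something the answer asks for.

```python
def is_dotted_quad(text):
    """Return True if text is four dot-separated decimal numbers in [0, 255]."""
    parts = text.split(".")
    if len(parts) != 4:
        return False
    for part in parts:
        # Only plain ASCII digits are allowed; this rejects "", "-1" and " 1".
        if not (part.isascii() and part.isdigit()):
            return False
        # Assumption of this sketch: leading zeros such as "01" are rejected.
        if len(part) > 1 and part[0] == "0":
            return False
        if int(part) > 255:
            return False
    return True


for candidate in ("192.168.1.1", "192.168.1.256", "127.1", "3...3"):
    print(candidate, is_dotted_quad(candidate))
```

Each rule is one readable line, so changing the policy (for example, allowing leading zeros) means deleting one check instead of reworking a pattern.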
IPv4 address (accurate capture). Matches 0.0.0.0 through 255.255.255.255. Use this regex to match IP numbers with accuracy. Each of the 4 numbers is stored in a capturing group, so you can access them for further processing.
\b(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b
Taken from the JGsoft RegexBuddy library.
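As an illustration of "further processing", this small Python sketch builds the same four-group pattern and reads the captured octets back. The names OCTET, IPV4 and octets are chosen for this example only.

```python
import re

# One octet exactly as in the pattern above; repeated four times with dots.
OCTET = r'(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)'
IPV4 = re.compile(r'\b' + r'\.'.join([OCTET] * 4) + r'\b')


def octets(text):
    """Return the four captured octets of the first address in text, or None."""
    match = IPV4.search(text)
    return match.groups() if match else None


print(octets("host 192.168.1.255 is up"))
print(octets("no address here"))
```

The groups come back as strings, so convert them with int() if you need numeric comparisons.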
Edit: this (\.|$) part seems rather odd.
I was looking for something similar for IPv4 addresses: a regex that also stops commonly used private IP addresses from being validated (192.168.x.y, 10.x.y.z, 172.16.x.y), so I used negative lookaheads to accomplish this:
(?!(10\.|172\.(1[6-9]|2\d|3[01])\.|192\.168\.).*)
(?!255\.255\.255\.255)(25[0-5]|2[0-4]\d|[1]\d\d|[1-9]\d|[1-9])
(\.(25[0-5]|2[0-4]\d|[1]\d\d|[1-9]\d|\d)){3}
(These should be on one line, of course; they are split over 3 lines here for readability.)
Debuggex Demo
It may not be optimized for speed, but it works well when you are only looking for "real" internet addresses.
Things that will (and should) fail:
0.1.2.3 (0.0.0.0/8 is reserved for some broadcasts)
10.1.2.3 (10.0.0.0/8 is considered private)
172.16.1.2 (172.16.0.0/12 is considered private)
172.31.1.2 (same as previous, but near the end of that range)
192.168.1.2 (192.168.0.0/16 is considered private)
255.255.255.255 (reserved broadcast is not an IP)
.2.3.4
1.2.3.
1.2.3.256
1.2.256.4
1.256.3.4
256.2.3.4
1.2.3.4.5
1..3.4
IPs that will (and should) work:
1.0.1.0 (China)
8.8.8.8 (Google DNS in USA)
100.1.2.3 (USA)
172.15.1.2 (USA)
172.32.1.2 (USA)
192.167.1.2 (Italy)
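The two lists above can be turned into a quick self-check. This Python sketch joins the three regex lines into one pattern and applies it with fullmatch, which is an assumption on my part (the answer does not say how the pattern is anchored); the names PUBLIC_IPV4, is_public_ipv4, SHOULD_FAIL and SHOULD_PASS are made up for the example.

```python
import re

# The three lines of the regex above, joined into a single pattern.
# re.ASCII keeps \d limited to the digits 0-9.
PUBLIC_IPV4 = re.compile(
    r'(?!(10\.|172\.(1[6-9]|2\d|3[01])\.|192\.168\.).*)'
    r'(?!255\.255\.255\.255)(25[0-5]|2[0-4]\d|[1]\d\d|[1-9]\d|[1-9])'
    r'(\.(25[0-5]|2[0-4]\d|[1]\d\d|[1-9]\d|\d)){3}',
    re.ASCII,
)


def is_public_ipv4(text):
    """Return True if the whole string matches the public-address pattern."""
    return PUBLIC_IPV4.fullmatch(text) is not None


SHOULD_FAIL = [
    "0.1.2.3", "10.1.2.3", "172.16.1.2", "172.31.1.2", "192.168.1.2",
    "255.255.255.255", ".2.3.4", "1.2.3.", "1.2.3.256", "1.2.256.4",
    "1.256.3.4", "256.2.3.4", "1.2.3.4.5", "1..3.4",
]
SHOULD_PASS = [
    "1.0.1.0", "8.8.8.8", "100.1.2.3", "172.15.1.2", "172.32.1.2",
    "192.167.1.2",
]

for address in SHOULD_FAIL + SHOULD_PASS:
    print(address, is_public_ipv4(address))
```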
Provided in case anybody else is looking for a way to validate "internet IP addresses, not including the common private addresses".
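New and improved shorter version
^(?:(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]|[0-9])(\.(?!$)|$)){4}$
It uses the negative lookahead (?!$) to rule out the case where the IP ends with a ".".
Old answer
^(?:(?:25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]|[0-9])\.){3}(?:25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]|[0-9])$
I think this is the most accurate and strict regex: it does not accept things like 000.021.01.0. as most other answers here seem to do, which then need an additional regex to reject such cases, i.e. numbers starting with 0 and an IP that ends with a ".".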
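A short Python check of the shorter pattern, as a sketch only; STRICT_IPV4 and is_strict_ipv4 are names invented for this example, and the inputs are taken from the cases discussed in this thread.

```python
import re

# The shorter pattern from this answer: a dot may follow an octet only if the
# string does not end right after it; otherwise the octet must end the string.
STRICT_IPV4 = re.compile(
    r'^(?:(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]|[0-9])(\.(?!$)|$)){4}$'
)


def is_strict_ipv4(text):
    """Return True for four octets without leading zeros or a trailing dot."""
    return STRICT_IPV4.match(text) is not None


for candidate in ("192.168.100.9", "192.168.100.009", "000.021.01.0.", "1.2.3.4."):
    print(candidate, is_strict_ipv4(candidate))
```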
const char *ipv4_regexp =
    "\\b(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\."
    "(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\."
    "(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\."
    "(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\b";
I adapted the regular expression taken from the JGsoft RegexBuddy library to C (regcomp/regexec) and found that it works, but it has a small problem on some operating systems such as Linux. That regular expression accepts IPv4 addresses like 192.168.100.009, where the leading zeros make Linux treat 009 as an octal value, so the address is not the one you think it is. I changed the regular expression as follows:
const char *ipv4_regex =
    "\\b(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]|[0-9])\\."
    "(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]|[0-9])\\."
    "(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]|[0-9])\\."
    "(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]|[0-9])\\b";
With this regular expression, 192.168.100.009 is no longer a valid IPv4 address, while 192.168.100.9 is fine.
I also modified the regular expression for multicast addresses; it is the following:
const char *mcast_ipv4_regex =
    "\\b(22[4-9]|23[0-9])\\."
    "(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]|[0-9])\\."
    "(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]|[0-9])\\."
    "(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]|[0-9])\\b";
I think you have to adapt the regular expression to the language you are using to develop your application.
Here is an example in Java:
package utility;

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NetworkUtility {

    // Each octet is 0-255 without leading zeros; an octet needs at least one digit.
    private static String ipv4RegExp = "\\b(?:(?:25[0-5]|2[0-4]\\d|1\\d\\d|[1-9]?\\d)\\.){3}(?:25[0-5]|2[0-4]\\d|1\\d\\d|[1-9]?\\d)\\b";

    // First octet 224-239, followed by three ordinary octets.
    private static String ipv4MulticastRegExp = "2(?:2[4-9]|3\\d)(?:\\.(?:25[0-5]|2[0-4]\\d|1\\d\\d|[1-9]\\d?|0)){3}";

    public NetworkUtility() {
    }

    public static boolean isIpv4Address(String address) {
        Pattern pattern = Pattern.compile(ipv4RegExp);
        Matcher matcher = pattern.matcher(address);
        // matches() requires the whole string to be an address.
        return matcher.matches();
    }

    public static boolean isIpv4MulticastAddress(String address) {
        Pattern pattern = Pattern.compile(ipv4MulticastRegExp);
        Matcher matcher = pattern.matcher(address);
        return matcher.matches();
    }
}
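For comparison, and not as the method used in this answer, the multicast test can also be done without any regex. This Python sketch relies on the standard library ipaddress module, whose is_multicast property covers the 224.0.0.0/4 range; the function name is_ipv4_multicast is made up for the example.

```python
import ipaddress


def is_ipv4_multicast(text):
    """Return True if text parses as an IPv4 address in 224.0.0.0/4."""
    try:
        address = ipaddress.IPv4Address(text)
    except ValueError:
        # Not a well-formed dotted-quad IPv4 address at all.
        return False
    return address.is_multicast


for candidate in ("224.0.0.1", "239.255.255.255", "240.0.0.1", "192.168.1.1"):
    print(candidate, is_ipv4_multicast(candidate))
```

This can serve as a cross-check when porting the patterns above to another regex dialect.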
I managed to construct a regex from all the other answers.
(25[0-5]|2[0-4][0-9]|[1][0-9][0-9]|[1-9][0-9]|[0-9]?)(\.(25[0-5]|2[0-4][0-9]|[1][0-9][0-9]|[1-9][0-9]|[0-9]?)){3}
-bash-3.2$ echo "191.191.191.39" | egrep '(^|[^0-9])((2([6-9]|5[0-5]?|[0-4][0-9]?)?|1([0-9][0-9]?)?|[3-9][0-9]?|0)\.){3}(2([6-9]|5[0-5]?|[0-4][0-9]?)?|1([0-9][0-9]?)?|[3-9][0-9]?|0)($|[^0-9])'
>> 191.191.191.39
(This is a DFA that matches the entire address space, including broadcasts, etc.)
I think this one is the shortest.
^(([01]?\d\d?|2[0-4]\d|25[0-5])\.){3}([01]?\d\d?|2[0-4]\d|25[0-5])$
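One detail worth checking in a pattern this compact is that the separator is written as \. and not as a bare dot, because an unescaped dot matches any character. A minimal Python sketch of the difference follows; OCTET, UNESCAPED and ESCAPED are names chosen for this illustration.

```python
import re

OCTET = r'([01]?\d\d?|2[0-4]\d|25[0-5])'
# Separator written as a bare ".", which matches any character.
UNESCAPED = re.compile(r'^(' + OCTET + r'.){3}' + OCTET + r'$')
# Separator written as "\.", which matches only a literal dot.
ESCAPED = re.compile(r'^(' + OCTET + r'\.){3}' + OCTET + r'$')

for candidate in ("10.0.0.1", "1a2b3c4"):
    print(candidate,
          UNESCAPED.match(candidate) is not None,
          ESCAPED.match(candidate) is not None)
```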
This is the regex that works for me:
"\<((([1-9]|1[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])\.){3}([1-9]|1[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-4]))\>"
I found this sample very useful; furthermore, it allows different IPv4 notations.
Sample code using Python:
def is_valid_ipv4(ip4):
    """Validates IPv4 addresses.
    """
    import re
    pattern = re.compile(r"""
        ^
        (?:
          # Dotted variants:
          (?:
            # Decimal 1-255 (no leading 0's)
            [3-9]\d?|2(?:5[0-5]|[0-4]?\d)?|1\d{0,2}
          |
            0x0*[0-9a-f]{1,2}  # Hexadecimal 0x0 - 0xFF (possible leading 0's)
          |
            0+[1-3]?[0-7]{0,2} # Octal 0 - 0377 (possible leading 0's)
          )
          (?:                  # Repeat 0-3 times, separated by a dot
            \.
            (?:
              [3-9]\d?|2(?:5[0-5]|[0-4]?\d)?|1\d{0,2}
            |
              0x0*[0-9a-f]{1,2}
            |
              0+[1-3]?[0-7]{0,2}
            )
          ){0,3}
        |
          0x0*[0-9a-f]{1,8}    # Hexadecimal notation, 0x0 - 0xffffffff
        |
          0+[0-3]?[0-7]{0,10}  # Octal notation, 0 - 037777777777
        |
          # Decimal notation, 1-4294967295:
          429496729[0-5]|42949672[0-8]\d|4294967[01]\d\d|429496[0-6]\d{3}|
          42949[0-5]\d{4}|4294[0-8]\d{5}|429[0-3]\d{6}|42[0-8]\d{7}|
          4[01]\d{8}|[1-3]\d{0,9}|[4-9]\d{0,8}
        )
        $
    """, re.VERBOSE | re.IGNORECASE)
    return pattern.match(ip4) is not None
((\.|^)(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]?|0$)){4}
This regex will not accept 08.8.8.8, 8.08.8.8, 8.8.08.8 or 8.8.8.08.
This finds valid IP addresses as long as the IP is surrounded by any character other than a digit (before or after the IP). Four named groups are created for back-references: $+{first}.$+{second}.$+{third}.$+{forth}
Find String:
#any valid IP address
(?<IP>(?<![\d])(?<first>(?:\d)|(?:[1-9]\d)|(?:1\d{2})|(?:2[0-4]\d)|(?:25[0-5]))[\.](?<second>(?:\d)|(?:[1-9]\d)|(?:1\d{2})|(?:2[0-4]\d)|(?:25[0-5]))[\.](?<third>(?:\d)|(?:[1-9]\d)|(?:1\d{2})|(?:2[0-4]\d)|(?:25[0-5]))[\.](?<forth>(?:\d)|(?:[1-9]\d)|(?:1\d{2})|(?:2[0-4]\d)|(?:25[0-5]))(?![\d]))
#only valid private IP address RFC1918
(?<IP>(?<![\d])(?:(?:(?<first>10)[\.](?<second>(?:\d)|(?:[1-9]\d)|(?:1\d{2})|(?:2[0-4]\d)|(?:25[0-5])))|(?:(?<first>172)[\.](?<second>(?:1[6-9])|(?:2[0-9])|(?:3[0-1])))|(?:(?<first>192)[\.](?<second>168)))[\.](?<third>(?:\d)|(?:[1-9]\d)|(?:1\d{2})|(?:2[0-4]\d)|(?:25[0-5]))[\.](?<forth>(?:\d)|(?:[1-9]\d)|(?:1\d{2})|(?:2[0-4]\d)|(?:25[0-5]))(?![\d]))

Notepad++ Replace String Option 1: Replaces the whole IP (NO change):
$+{IP}
Notepad++ Replace String Option 2: Replaces the whole IP octet by octet (NO change):
$+{first}.$+{second}.$+{third}.$+{forth}
Notepad++ Replace String Option 3: Replaces the whole IP octet by octet (replaces the 3rd octet value with 0):
$+{first}.$+{second}.0.$+{forth}
NOTE: The above will match any valid IP, including 255.255.255.255 for example, and change it to 255.255.0.255, which is wrong and not very useful, of course.
The options above replace a portion of each octet with an actual value. You can, however, build your own find and replace, which is really useful for amending IPs in text files:
For example, replace the first octet group of the original Find regex above:
(?<first>(?:\d)|(?:[1-9]\d)|(?:1\d{2})|(?:2[0-4]\d)|(?:25[0-5]))
with
(?<first>10)
and
(?<second>(?:\d)|(?:[1-9]\d)|(?:1\d{2})|(?:2[0-4]\d)|(?:25[0-5]))
with
(?<second>216)
and you are now matching only addresses that start with 10.216.
Find in Notepad++:
(?<IP>(?<![\d])(?<first>10)[\.](?<second>216)[\.](?<third>(?:\d)|(?:[1-9]\d)|(?:1\d{2})|(?:2[0-4]\d)|(?:25[0-5]))[\.](?<forth>(?:\d)|(?:[1-9]\d)|(?:1\d{2})|(?:2[0-4]\d)|(?:25[0-5]))(?![\d]))
You can still perform the Replace using the back-reference groups in exactly the same way as before.
The following test file gives an idea of what the patterns above match:
cat ipv4_validation_test.txt
Full Match:
0.0.0.1
12.108.1.34
192.168.1.1
10.249.24.212
10.216.1.212
192.168.1.255
255.255.255.255
0.0.0.0
Partial Match (IP Extraction from line)
30.168.1.0.1
-1.2.3.4
sfds10.216.24.23kgfd
da11.15.112.255adfdsfds
sfds10.216.24.23kgfd
NO Match
1.1.1.01
3...3
127.1.
192.168.1..
192.168.1.256
da11.15.112.2554adfdsfds
da311.15.112.255adfdsfds
Using grep you can see the results below:
From grep:
grep -oP '(?<IP>(?<![\d])(?<first>(?:\d)|(?:[1-9]\d)|(?:1\d{2})|(?:2[0-4]\d)|(?:25[0-5]))[\.](?<second>(?:\d)|(?:[1-9]\d)|(?:1\d{2})|(?:2[0-4]\d)|(?:25[0-5]))[\.](?<third>(?:\d)|(?:[1-9]\d)|(?:1\d{2})|(?:2[0-4]\d)|(?:25[0-5]))[\.](?<forth>(?:\d)|(?:[1-9]\d)|(?:1\d{2})|(?:2[0-4]\d)|(?:25[0-5]))(?![\d]))' ipv4_validation_test.txt
0.0.0.1
12.108.1.34
192.168.1.1
10.249.24.212
10.216.1.212
192.168.1.255
255.255.255.255
0.0.0.0
30.168.1.0
1.2.3.4
10.216.24.23
11.15.112.255
10.216.24.23

grep -P '(?<IP>(?<![\d])(?<first>(?:\d)|(?:[1-9]\d)|(?:1\d{2})|(?:2[0-4]\d)|(?:25[0-5]))[\.](?<second>(?:\d)|(?:[1-9]\d)|(?:1\d{2})|(?:2[0-4]\d)|(?:25[0-5]))[\.](?<third>(?:\d)|(?:[1-9]\d)|(?:1\d{2})|(?:2[0-4]\d)|(?:25[0-5]))[\.](?<forth>(?:\d)|(?:[1-9]\d)|(?:1\d{2})|(?:2[0-4]\d)|(?:25[0-5]))(?![\d]))' ipv4_validation_test.txt
0.0.0.1
12.108.1.34
192.168.1.1
10.249.24.212
10.216.1.212
192.168.1.255
255.255.255.255
0.0.0.0
30.168.1.0.1
-1.2.3.4
sfds10.216.24.23kgfd
da11.15.112.255adfdsfds
sfds10.216.24.23kgfd

#matching ip addresses starting with 10.216
grep -oP '(?<IP>(?<![\d])(?<first>10)[\.](?<second>216)[\.](?<third>(?:\d)|(?:[1-9]\d)|(?:1\d{2})|(?:2[0-4]\d)|(?:25[0-5]))[\.](?<forth>(?:\d)|(?:[1-9]\d)|(?:1\d{2})|(?:2[0-4]\d)|(?:25[0-5]))(?![\d]))' ipv4_validation_test.txt
10.216.1.212
10.216.24.23
10.216.24.23
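The same find-and-replace idea can be sketched outside Notepad++. The Python version below is an approximation under a few assumptions: Python spells named groups (?P<name>...) and back-references \g<name>, the octet alternatives are reordered so the longest comes first, and the names OCTET, IPV4 and zero_third_octet are invented for this example. It mirrors Replace String Option 3 (third octet set to 0).

```python
import re

# One octet, 0-255, without leading zeros.
OCTET = r'(?:25[0-5]|2[0-4]\d|1\d{2}|[1-9]\d|\d)'
# Four named octets, not preceded or followed by another digit.
IPV4 = re.compile(
    r'(?<!\d)(?P<first>' + OCTET + r')\.(?P<second>' + OCTET + r')'
    r'\.(?P<third>' + OCTET + r')\.(?P<forth>' + OCTET + r')(?!\d)'
)


def zero_third_octet(line):
    """Rewrite every IPv4 address found in line so its third octet is 0."""
    return IPV4.sub(r'\g<first>.\g<second>.0.\g<forth>', line)


for line in ("sfds10.216.24.23kgfd", "192.168.1.256", "1.1.1.01"):
    print(line, "->", zero_third_octet(line))
```

Lines without a valid address are returned unchanged, which matches the "NO Match" section of the test file.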
String zeroTo255 = "([0-9]|[0-9][0-9]|(0|1)[0-9][0-9]|2[0-4][0-9]|25[0-5])";
It can contain a single digit, i.e. ([0-9]);
it can contain two digits, i.e. ([0-9][0-9]);
the range 000 to 199, i.e. ((0|1)[0-9][0-9]);
the range 200 to 249, i.e. (2[0-4][0-9]);
the range 250 to 255, i.e. (25[0-5]).
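The answer only shows the single-octet fragment. Assuming it is meant to be repeated four times with literal dots in between, a Python sketch of the full pattern could look like this; ZERO_TO_255, IP_REGEXP and is_ipv4 are names chosen for the example, and whole-string matching via fullmatch is likewise an assumption.

```python
import re

# The octet fragment from the answer above, unchanged.
ZERO_TO_255 = r'([0-9]|[0-9][0-9]|(0|1)[0-9][0-9]|2[0-4][0-9]|25[0-5])'
# Assumption: four octets joined by literal dots make up the full pattern.
IP_REGEXP = re.compile(r'\.'.join([ZERO_TO_255] * 4))


def is_ipv4(text):
    """Return True if the whole string is four 0-255 octets joined by dots."""
    return IP_REGEXP.fullmatch(text) is not None


for candidate in ("255.255.255.255", "000.12.12.034", "256.1.1.1"):
    print(candidate, is_ipv4(candidate))
```

Note that this fragment accepts leading zeros such as 000 and 034, unlike the stricter patterns elsewhere in this thread.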
mysql> select ip from foo where ip regexp '^\\s*[0-9]+\\.[0-9]+\\.[0-9]+\\.[0-9]+\\s*$';