get_headers inconsistency
Run the code below:
var_dump(get_headers("http://www.domainnnnnnnnnnnnnnnnnnnnnnnnnnnn.com/CraxyFile.jpg"));
It returns HTTP 200 instead of 404 for any domain or URL that does not exist:
Array ( [0] => HTTP/1.1 200 OK [1] => Server: nginx/1.1.15 [2] => Date: Mon, 08 Oct 2012 12:29:13 GMT [3] => Content-Type: text/html; charset=utf-8 [4] => Connection: close [5] => Set-Cookie: PHPSESSID=3iucojet7bt2peub72rgo0iu21; path=/; HttpOnly [6] => Expires: Thu, 19 Nov 1981 08:52:00 GMT [7] => Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0 [8] => Pragma: no-cache [9] => Set-Cookie: bypassStaticCache=deleted; expires=Thu, 01-Jan-1970 00:00:01 GMT; path=/; httponly [10] => Set-Cookie: bypassStaticCache=deleted; expires=Thu, 01-Jan-1970 00:00:01 GMT; path=/; httponly [11] => Vary: Accept )
If you run
var_dump(get_headers("http://www.domain.com/CraxyFile.jpg"));
you get
Array ( [0] => HTTP/1.1 404 Not Found [1] => Date: Mon, 08 Oct 2012 12:32:18 GMT [2] => Content-Type: text/html [3] => Content-Length: 8727 [4] => Connection: close [5] => Server: Apache [6] => Vary: Accept-Encoding )
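There are many cases where get_headers has been accepted as the solution for checking that a URL exists:
- What is the best way to check if a URL exists in PHP?
- How can I check if a URL exists via PHP?
Is this a bug, or is get_headers not a reliable way to validate a URL?
See the live demo.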
Update 1
I found out that cURL has the same problem:
$curl = curl_init();
curl_setopt_array($curl, array(
    CURLOPT_RETURNTRANSFER => true,
    CURLOPT_URL => 'idontexist.tld'
));
curl_exec($curl);
$info = curl_getinfo($curl);
curl_close($curl);
var_dump($info);
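It returns the same result.
The problem has nothing to do with the length of the domain name, only with whether the domain exists.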
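You are using a DNS service that resolves non-existent domains to a server which gives you a "friendly" error page and returns a 200 response code. This means it is not a problem with get_headers() specifically; it affects any process that relies on sane DNS lookups.
One way to handle this without hard-coding a workaround for every environment you work in might look something like this: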
// A domain that definitely does not exist. The easiest way to guarantee that
// this continues to work is to use an illegal top-level domain (TLD) suffix
$testDomain = 'idontexist.tld';

// If this resolves to an IP, we know that we are behind a service such as this
// We can simply compare the actual domain we test with the result of this
$badIP = gethostbyname($testDomain);

// Then when you want to get_headers()
$url = 'http://www.domainnnnnnnnnnnnnnnnnnnnnnnnnnnn.com/CraxyFile.jpg';
$host = parse_url($url, PHP_URL_HOST);
if (gethostbyname($host) === $badIP) {
    // The domain does not exist - probably handle this as if it were a 404
} else {
    // do the actual get_headers() stuff here
}
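The comparison above could be wrapped in a small helper, sketched here under some assumptions. The function name domainResolves() and the injectable $resolver parameter are hypothetical and not part of the original answer; the resolver is only there so the logic can be exercised without real DNS. The sketch also covers the case where gethostbyname() fails outright, in which case it returns the hostname unchanged.

```php
<?php
// Hypothetical helper, not part of the original answer.
// $badIP is the result of resolving a known non-existent domain.
// $resolver defaults to gethostbyname() and is injectable for testing.
function domainResolves(string $host, string $badIP, ?callable $resolver = null): bool
{
    $resolver = $resolver ?? 'gethostbyname';

    // A literal IP address needs no DNS lookup.
    if (filter_var($host, FILTER_VALIDATE_IP) !== false) {
        return true;
    }

    $ip = $resolver($host);

    // gethostbyname() returns the unmodified hostname when the lookup fails.
    if ($ip === $host) {
        return false;
    }

    // Same address as the sentinel: the DNS service is redirecting the name.
    return $ip !== $badIP;
}
```

Only when domainResolves($host, $badIP) returns true would you go on to call get_headers(); otherwise the URL can probably be treated as a 404.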
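You may want to cache the return value of the first call to gethostbyname() somehow, since you know you are looking up a name that does not exist, and that can often take a few seconds.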
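One possible way to do that caching is a small file-based cache, sketched below. The function name cachedBadIP(), the one-hour default lifetime, the cache file location and the injectable $resolver are all assumptions made for illustration; a shared cache such as APCu or a config value would serve equally well.

```php
<?php
// Hypothetical helper: stores the sentinel lookup result in a temp file
// so the slow lookup of a non-existent name happens at most once per $ttl.
function cachedBadIP(
    string $testDomain = 'idontexist.tld',
    int $ttl = 3600,
    ?callable $resolver = null,
    ?string $cacheFile = null
): string {
    $resolver  = $resolver ?? 'gethostbyname';
    $cacheFile = $cacheFile
        ?? sys_get_temp_dir() . '/bad_ip_' . md5($testDomain) . '.cache';

    // Avoid stale file metadata from PHP's stat cache.
    clearstatcache(true, $cacheFile);

    // Reuse the stored value while it is younger than $ttl seconds.
    if (is_file($cacheFile) && (time() - filemtime($cacheFile)) < $ttl) {
        $cached = file_get_contents($cacheFile);
        if ($cached !== false && $cached !== '') {
            return $cached;
        }
    }

    // Cache miss: do the slow lookup once and store the result.
    $badIP = $resolver($testDomain);
    file_put_contents($cacheFile, $badIP, LOCK_EX);

    return $badIP;
}
```

With this in place, $badIP = cachedBadIP(); would replace the direct gethostbyname($testDomain) call. The lifetime is kept short on the assumption that the DNS service in use can change, for example when the code moves to another network.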