Timeout for python requests.get entire response

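I'm gathering statistics on a list of websites, and I'm using requests for it for simplicity. Here is my code:

    import requests

    data = []
    websites = ['http://google.com', 'http://bbc.co.uk']
    for w in websites:
        r = requests.get(w, verify=False)
        data.append((r.url, len(r.content), r.elapsed.total_seconds(),
                     str([(l.status_code, l.url) for l in r.history]),
                     str(r.headers.items()), str(r.cookies.items())))

Now, I want requests.get to time out after 10 seconds so the loop doesn't get stuck.

This question has attracted interest before, but none of the answers are clean. I will be putting a bounty on this to get a good answer.

I hear that maybe not using requests is a good idea, but then how would I get the nice things requests offers (the ones in the tuple)?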

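What about using eventlet? If you want to time out the request after 10 seconds, even while data is still being received, this snippet will work for you:

    import requests
    import eventlet
    eventlet.monkey_patch()

    with eventlet.Timeout(10):
        requests.get("http://ipv4.download.thinkbroadband.com/1GB.zip", verify=False)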

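Set the timeout parameter:

    r = requests.get(w, verify=False, timeout=10)

As long as you don't set stream=True on that request, this will cause the call to requests.get() to time out if the connection takes more than ten seconds, or if the server doesn't send any data for more than ten seconds. Note that this is a limit on each wait, not on the total time the response takes.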
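Because the timeout parameter bounds each individual wait rather than the whole download, a server that keeps trickling data can still hold the loop far longer than ten seconds. One possible way to cap the total time is to stream the body and check a deadline between chunks. The sketch below is only an illustration under stated assumptions: `TotalTimeout`, `read_with_deadline` and `slow_chunks` are hypothetical names, not part of requests, and `slow_chunks` merely stands in for a slow server. The deadline is checked only when a chunk arrives, so a completely stalled connection is still bounded by the per-read timeout, not by this check.

```python
import time


class TotalTimeout(Exception):
    """Raised when the whole download exceeds the overall deadline (hypothetical name)."""


def read_with_deadline(chunks, total_timeout):
    """Join byte chunks from `chunks`, giving up once `total_timeout` seconds have passed.

    `chunks` can be any iterable of bytes objects.
    """
    deadline = time.monotonic() + total_timeout
    parts = []
    for chunk in chunks:
        # The check runs after each chunk arrives, not while waiting for one.
        if time.monotonic() > deadline:
            raise TotalTimeout("no complete response after %s seconds" % total_timeout)
        parts.append(chunk)
    return b"".join(parts)


def slow_chunks(count, delay):
    """Stand-in for a slow server: yields `count` one-byte chunks, `delay` seconds apart."""
    for _ in range(count):
        time.sleep(delay)
        yield b"x"


if __name__ == "__main__":
    print(read_with_deadline(slow_chunks(3, 0.01), total_timeout=1))
    try:
        read_with_deadline(slow_chunks(50, 0.05), total_timeout=0.2)
    except TotalTimeout as exc:
        print("gave up:", exc)
```

With requests, the same helper could presumably be fed from a streamed response, along the lines of `resp = requests.get(w, stream=True, timeout=10)` followed by `read_with_deadline(resp.iter_content(8192), 10)`.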

To create a timeout you can use signals.

The best way to solve this case is probably to

Set an exception as the handler for the alarm signal
Schedule the alarm signal with a ten-second delay
Call the function inside a try-except-finally block.
The except block is reached if the function timed out.
In the finally block you cancel the alarm, so it isn't delivered later.

Here is some example code:

    import signal
    from time import sleep

    class TimeoutException(Exception):
        """ Simple Exception to be called on timeouts. """
        pass

    def _timeout(signum, frame):
        """ Raise a TimeoutException.

        This is intended for use as a signal handler.
        The signum and frame arguments passed to this are ignored.

        """
        # Raise TimeoutException with system default timeout message
        raise TimeoutException()

    # Set the handler for the SIGALRM signal:
    signal.signal(signal.SIGALRM, _timeout)
    # Send the SIGALRM signal in 10 seconds:
    signal.alarm(10)

    try:
        # Do our code:
        print('This will take 11 seconds...')
        sleep(11)
        print('done!')
    except TimeoutException:
        print('It timed out!')
    finally:
        # Abort the sending of the SIGALRM signal:
        signal.alarm(0)

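There are some caveats to this:

It is not thread-safe: signals are always delivered to the main thread, so you can't put this in any other thread.
There is a slight delay between the scheduling of the signal and the execution of the actual code. This means that the example would time out even if it only slept for ten seconds.

But it's all in the standard Python library! Apart from the sleep function import, it's only one import. If you are going to use timeouts in many places, you can easily put the TimeoutException, _timeout and the signaling in a function and just call that. Or you can make a decorator and put it on functions; see the answer linked below.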
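As a rough illustration of the decorator idea, here is a minimal sketch. It is not the code from the linked answer; the `timeout` decorator and the `slow` and `fast` functions are hypothetical names chosen for this example. It assumes a Unix system and the main thread, like the signal code above, and it uses `signal.setitimer` instead of `signal.alarm` so that fractional seconds are allowed.

```python
import functools
import signal
import time


class TimeoutException(Exception):
    """Raised when the decorated function runs longer than allowed."""


def timeout(seconds):
    """Decorator factory: abort the wrapped call after `seconds` (Unix, main thread only)."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            def _handler(signum, frame):
                raise TimeoutException()

            # Install our handler and remember the previous one.
            old_handler = signal.signal(signal.SIGALRM, _handler)
            # setitimer accepts fractional seconds, unlike signal.alarm().
            signal.setitimer(signal.ITIMER_REAL, seconds)
            try:
                return func(*args, **kwargs)
            finally:
                # Cancel the timer and restore the previous handler.
                signal.setitimer(signal.ITIMER_REAL, 0)
                signal.signal(signal.SIGALRM, old_handler)
        return wrapper
    return decorator


@timeout(0.2)
def slow():
    time.sleep(1)
    return "finished"


@timeout(1)
def fast():
    return "finished"
```

Calling `fast()` returns normally, while `slow()` is interrupted by `TimeoutException` partway through its sleep. The caveats listed above (main thread only, not thread-safe) apply unchanged.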

You can also set this up as a "context manager" so you can use it with the with statement:

    import signal

    class Timeout():
        """ Timeout for use with the `with` statement. """

        class TimeoutException(Exception):
            """ Simple Exception to be called on timeouts. """
            pass

        @staticmethod
        def _timeout(signum, frame):
            """ Raise a TimeoutException.

            This is intended for use as a signal handler.
            The signum and frame arguments passed to this are ignored.

            """
            raise Timeout.TimeoutException()

        def __init__(self, timeout=10):
            # Store the requested timeout (the original hard-coded 10 here).
            self.timeout = timeout
            signal.signal(signal.SIGALRM, Timeout._timeout)

        def __enter__(self):
            signal.alarm(self.timeout)

        def __exit__(self, exc_type, exc_value, traceback):
            signal.alarm(0)
            # Suppress only our own timeout exception.
            return exc_type is Timeout.TimeoutException

    # Demonstration:
    from time import sleep

    print('This is going to take maximum 10 seconds...')
    with Timeout(10):
        sleep(15)
        print('No timeout?')
    print('Done')

One possible downside of this context manager approach is that you can't tell whether the code actually timed out or not.
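One way around that downside, sketched here as an assumption rather than as part of the original answer, is to have the context manager record the outcome on itself and return itself from `__enter__`. The `timed_out` attribute and the `seconds` parameter are hypothetical names for this illustration; the Unix and main-thread restrictions still apply.

```python
import signal
import time


class Timeout:
    """Context manager that swallows its own timeout and records it in `timed_out`."""

    class TimeoutException(Exception):
        pass

    def __init__(self, seconds=10):
        self.seconds = seconds
        self.timed_out = False

    def _handler(self, signum, frame):
        raise Timeout.TimeoutException()

    def __enter__(self):
        # Remember the previous handler so it can be restored on exit.
        self._old_handler = signal.signal(signal.SIGALRM, self._handler)
        signal.setitimer(signal.ITIMER_REAL, self.seconds)
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        signal.setitimer(signal.ITIMER_REAL, 0)
        signal.signal(signal.SIGALRM, self._old_handler)
        self.timed_out = exc_type is Timeout.TimeoutException
        return self.timed_out  # suppress only our own exception


if __name__ == "__main__":
    with Timeout(0.2) as t:
        time.sleep(1)
    print("timed out:", t.timed_out)
```

After the block, the caller can branch on `t.timed_out` instead of guessing whether the body ran to completion.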

Sources and recommended reading:

The documentation on signals
This answer on timeouts by @David Narayan. He has organized the above code as a decorator.

Update: http://docs.python-requests.org/en/master/user/advanced/#timeouts

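In newer versions of requests:

If you specify a single value for the timeout, like this:

    r = requests.get('https://github.com', timeout=5)

The timeout value will be applied to both the connect and the read timeouts. Specify a tuple if you would like to set the values separately:

    r = requests.get('https://github.com', timeout=(3.05, 27))

If the remote server is very slow, you can tell Requests to wait forever for a response by passing None as the timeout value and then retrieving a cup of coffee.

    r = requests.get('https://github.com', timeout=None)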

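My old (probably outdated) answer (posted a long time ago):

There are other ways to overcome this problem:

1. Use the TimeoutSauce internal class

From: https://github.com/kennethreitz/requests/issues/1928#issuecomment-35811896

    import requests
    from requests.adapters import TimeoutSauce

    class MyTimeout(TimeoutSauce):
        def __init__(self, *args, **kwargs):
            connect = kwargs.get('connect', 5)
            read = kwargs.get('read', connect)
            super(MyTimeout, self).__init__(connect=connect, read=read)

    requests.adapters.TimeoutSauce = MyTimeout

This code should cause us to set the read timeout equal to the connect timeout, which is the timeout value you pass on your Session.get() call. (Note that I haven't actually tested this code, so it may need some quick debugging; I just wrote it straight into the GitHub window.)

2. Use a fork of requests from kevinburke: https://github.com/kevinburke/requests/tree/connect-timeout

From its documentation: https://github.com/kevinburke/requests/blob/connect-timeout/docs/user/advanced.rst

If you specify a single value for the timeout, like this:

    r = requests.get('https://github.com', timeout=5)

The timeout value will be applied to both the connect and the read timeouts. Specify a tuple if you would like to set the values separately:

    r = requests.get('https://github.com', timeout=(3.05, 27))

kevinburke has requested that it be merged into the main requests project, but it hasn't been accepted yet.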

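This may be overkill, but the Celery distributed task queue has good support for timeouts.

In particular, you can define a soft time limit that just raises an exception in your process (so you can clean up) and/or a hard time limit that terminates the task when the time limit has been exceeded.

Under the covers, this uses the same signals approach as referenced in your "before" post, but in a more usable and manageable way. And if the list of websites you are monitoring is long, you might benefit from its primary feature: all kinds of ways to manage the execution of a large number of tasks.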

I believe you can use multiprocessing and not depend on a third-party package:

    import multiprocessing
    import requests

    def _worker(return_dict, func, args, kwargs):
        # Runs in the child process and stores the result in `return_dict`.
        # Defined at module level so it can be pickled on platforms that spawn.
        return_dict['value'] = func(*args, **kwargs)

    def call_with_timeout(func, args, kwargs, timeout):
        manager = multiprocessing.Manager()
        return_dict = manager.dict()

        p = multiprocessing.Process(target=_worker,
                                    args=(return_dict, func, args, kwargs))
        p.start()

        # Force a max. `timeout` or wait for the process to finish
        p.join(timeout)

        # If the process is still active, it didn't finish: raise TimeoutError
        if p.is_alive():
            p.terminate()
            p.join()
            raise TimeoutError
        else:
            return return_dict['value']

    if __name__ == '__main__':
        url = 'http://google.com'  # placeholder: any URL from the question's list
        call_with_timeout(requests.get, args=(url,), kwargs={'timeout': 10}, timeout=60)

The timeout passed in kwargs is the timeout for getting any response from the server; the timeout argument is the timeout for getting the complete response.

This code works for socketError 11004 and 10060...

    # -*- encoding:UTF-8 -*-
    __author__ = 'ACE'
    import requests
    from PyQt4.QtCore import *
    from PyQt4.QtGui import *


    class TimeOutModel(QThread):
        Existed = pyqtSignal(bool)
        TimeOut = pyqtSignal()

        def __init__(self, fun, timeout=500, parent=None):
            """
            @param fun: function or lambda
            @param timeout: ms
            """
            super(TimeOutModel, self).__init__(parent)
            self.fun = fun

            self.timeer = QTimer(self)
            self.timeer.setInterval(timeout)
            self.timeer.timeout.connect(self.time_timeout)
            self.Existed.connect(self.timeer.stop)
            self.timeer.start()

            self.setTerminationEnabled(True)

        def time_timeout(self):
            self.timeer.stop()
            self.TimeOut.emit()
            self.quit()
            self.terminate()

        def run(self):
            self.fun()


    bb = lambda: requests.get("http://ipv4.download.thinkbroadband.com/1GB.zip")

    a = QApplication([])

    z = TimeOutModel(bb, 500)
    z.start()  # added: without this, run() (and so the request) never executes
    print('timeout')

    a.exec_()

Forgive me, but I wonder why nobody suggested the following simpler solution? :-o

    requests.get('http://www.mypage.com', timeout=20)

Although the question is about requests, I find this very easy to do with pycurl's CURLOPT_TIMEOUT or CURLOPT_TIMEOUT_MS.

No threading or signaling required:

    import traceback
    from io import BytesIO

    import pycurl

    url = 'http://www.example.com/example.zip'
    timeout_ms = 1000
    raw = BytesIO()
    c = pycurl.Curl()
    c.setopt(pycurl.TIMEOUT_MS, timeout_ms)  # total timeout in milliseconds
    c.setopt(pycurl.WRITEFUNCTION, raw.write)
    c.setopt(pycurl.NOSIGNAL, 1)
    c.setopt(pycurl.URL, url)
    c.setopt(pycurl.HTTPGET, 1)
    try:
        c.perform()
    except pycurl.error:
        traceback.print_exc()  # error generated on timeout
        pass  # or just pass if you don't want to print the error

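Well, I tried many of the solutions on this page and still faced instabilities, random hangs and poor connection performance.

I'm now using Curl, and I'm really happy with its "max time" functionality and with the overall performance, even with such a poor implementation:

    import subprocess  # the original used commands.getoutput, which exists only in Python 2

    content = subprocess.getoutput('curl -m6 -Ss "http://mywebsite.xyz"')

Here, I defined a 6-second max time parameter, which covers both the connection and the transfer time.

I'm sure Curl has a nice Python binding, if you prefer to stick to Pythonic syntax :)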

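If it comes to that, create a watchdog thread that messes up requests' internal state after 10 seconds, e.g.:

closes the underlying socket, and ideally
triggers an exception if requests retries the operation

Note that, depending on the system libraries, you may be unable to set a deadline on DNS resolution.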
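The watchdog idea can be sketched on a plain socket, which keeps the example free of requests' private attributes. Everything below is an assumption made for illustration: `recv_with_watchdog` and `silent_server` are hypothetical helpers, and the local server that never sends anything only stands in for an unresponsive site. The sketch relies on `shutdown()` called from another thread waking up a blocked `recv()`, which is the usual behavior on Linux but may differ on other platforms, and it does nothing about DNS resolution.

```python
import socket
import threading
import time


def recv_with_watchdog(sock, deadline_seconds):
    """Read from `sock` until EOF; a watchdog shuts the socket down after the deadline.

    Returns (received_bytes, timed_out).
    """
    fired = threading.Event()

    def kill():
        fired.set()
        try:
            # Shutting down the socket wakes up a recv() blocked in another thread.
            sock.shutdown(socket.SHUT_RDWR)
        except OSError:
            pass  # the socket was already closed

    watchdog = threading.Timer(deadline_seconds, kill)
    watchdog.start()
    received = b""
    try:
        while True:
            chunk = sock.recv(4096)
            if not chunk:
                break
            received += chunk
    except OSError:
        pass  # some platforms raise instead of returning b""
    finally:
        watchdog.cancel()
    return received, fired.is_set()


def silent_server():
    """Hypothetical stand-in: accepts one connection and sends nothing for 3 seconds."""
    listener = socket.socket()
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)

    def serve():
        conn, _ = listener.accept()
        time.sleep(3)
        conn.close()

    threading.Thread(target=serve, daemon=True).start()
    return listener.getsockname()


if __name__ == "__main__":
    client = socket.create_connection(silent_server())
    data, timed_out = recv_with_watchdog(client, 0.3)
    client.close()
    print("timed out:", timed_out)
```

Applying this to requests would mean reaching the socket underneath a response, which requires private attributes that vary between versions, so that part is deliberately left out.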

I came up with a more direct solution that is admittedly ugly but fixes the real problem. It goes a bit like this:

    resp = requests.get(some_url, stream=True)
    resp.raw._fp.fp._sock.settimeout(read_timeout)
    # This will load the entire response even though stream is set
    content = resp.content

You can read the full explanation here
