Python: urllib/urllib2/httplib confusion

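I'm trying to test the functionality of a web app by scripting a login sequence in Python, but I'm having some trouble.

Here's what I need to do:

Do a POST with a few parameters and headers.
Follow a redirect.
Retrieve the HTML body.

Now, I'm relatively new to Python, and the two things I've tried so far haven't worked. First I used httplib, with putrequest() (passing the parameters in the URL) and putheader(). This didn't seem to follow the redirects.

Then I tried urllib and urllib2, passing both headers and parameters as dicts. This seems to return the login page instead of the page I'm trying to log in to. I guess that's because of missing cookies or something.

Am I missing something simple?

Thanks.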

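Focus on urllib2 for this; it works quite well. Don't mess with httplib, it's not the top-level API.

What you're noting is that urllib2 doesn't follow the redirect.

You need to fold in an instance of HTTPRedirectHandler that will catch and follow the redirects.

Further, you may want to subclass the default HTTPRedirectHandler to capture information that you'll then check as part of your unit testing.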

    # self.cookies is expected to be a cookielib.CookieJar owned by the enclosing class
    cookie_handler = urllib2.HTTPCookieProcessor(self.cookies)
    redirect_handler = urllib2.HTTPRedirectHandler()
    opener = urllib2.build_opener(redirect_handler, cookie_handler)

You can then use this opener object to POST and GET, handling redirects and cookies properly.
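
The suggestion to subclass HTTPRedirectHandler could be sketched roughly as follows. This is only an illustration, written against Python 3, where urllib2 lives on as urllib.request; the names RecordingRedirectHandler, hops and build_recording_opener are made up for the example. It also relies on build_opener() skipping its stock redirect handler when it is given an instance of a subclass.

```python
import urllib.request


class RecordingRedirectHandler(urllib.request.HTTPRedirectHandler):
    """Follows redirects as usual, but remembers every hop (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.hops = []  # (status_code, new_url) tuples, in the order they happened

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        # Record the hop, then let the stock implementation decide how to follow it.
        self.hops.append((code, newurl))
        return super().redirect_request(req, fp, code, msg, headers, newurl)


def build_recording_opener():
    """Return (opener, handler) so a test can inspect handler.hops afterwards."""
    handler = RecordingRedirectHandler()
    opener = urllib.request.build_opener(
        handler, urllib.request.HTTPCookieProcessor()
    )
    return opener, handler
```

After something like opener.open(login_url, data), a unit test could assert on handler.hops to confirm that the login really ended with a redirect to the expected page.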

You may also want to add your own subclass of HTTPHandler to capture and log various error codes.
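
One way to approach the logging idea, shown here as a sketch rather than the answer's exact method: instead of subclassing HTTPHandler itself, this Python 3 example plugs a small BaseHandler subclass into the opener as a response processor. StatusLogger, seen and the "httpbot" logger name are invented for the illustration.

```python
import logging
import urllib.request

log = logging.getLogger("httpbot")  # hypothetical logger name


class StatusLogger(urllib.request.BaseHandler):
    """Records the status code of every HTTP(S) response passing through the opener."""

    # Lower than HTTPErrorProcessor's 1000, so this handler sees 3xx/4xx/5xx
    # responses before they are turned into redirects or HTTPError exceptions.
    handler_order = 400

    def __init__(self):
        super().__init__()
        self.seen = []  # (url, status) tuples

    def http_response(self, request, response):
        self.seen.append((request.full_url, response.status))
        log.info("%s -> %s", request.full_url, response.status)
        return response  # pass the response on unchanged

    https_response = http_response
```

It would be installed alongside the other handlers, for example urllib.request.build_opener(StatusLogger(), urllib.request.HTTPCookieProcessor()).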

Here's my take on this issue.

    #!/usr/bin/env python

    import urllib
    import urllib2


    class HttpBot:
        """an HttpBot represents one browser session, with cookies."""
        def __init__(self):
            cookie_handler= urllib2.HTTPCookieProcessor()
            redirect_handler= urllib2.HTTPRedirectHandler()
            self._opener = urllib2.build_opener(redirect_handler, cookie_handler)

        def GET(self, url):
            return self._opener.open(url).read()

        def POST(self, url, parameters):
            return self._opener.open(url, urllib.urlencode(parameters)).read()


    if __name__ == "__main__":
        bot = HttpBot()
        ignored_html = bot.POST('https://example.com/authenticator', {'passwd':'foo'})
        print bot.GET('https://example.com/interesting/content')
        ignored_html = bot.POST('https://example.com/deauthenticator',{})
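
The code above targets Python 2. For readers on Python 3, a rough port might look like the sketch below; it assumes the same design and only swaps urllib2 for urllib.request and cookielib for http.cookiejar. The encode helper and the exposed cookies attribute are additions for the example, not part of the original answer.

```python
import urllib.parse
import urllib.request
from http.cookiejar import CookieJar


class HttpBot:
    """One browser-like session: cookies persist across requests."""

    def __init__(self):
        self.cookies = CookieJar()  # kept on the instance so tests can inspect it
        self._opener = urllib.request.build_opener(
            urllib.request.HTTPRedirectHandler(),
            urllib.request.HTTPCookieProcessor(self.cookies),
        )

    @staticmethod
    def encode(parameters):
        # In Python 3 the POST body handed to open() must be bytes, not str.
        return urllib.parse.urlencode(parameters).encode("ascii")

    def get(self, url):
        return self._opener.open(url).read()

    def post(self, url, parameters):
        return self._opener.open(url, self.encode(parameters)).read()
```

Usage mirrors the original: bot.post('https://example.com/authenticator', {'passwd': 'foo'}) followed by bot.get('https://example.com/interesting/content'). Note that read() returns bytes here, so the body needs decoding before any string comparison.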

@S.Lott, thank you. Your suggestion worked for me, with some modifications. Here's how I did it.

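    import urllib
    import urllib2
    from cookielib import CookieJar

    # params, headers, host and page are defined elsewhere in the script
    data = urllib.urlencode(params)
    url = host+page
    request = urllib2.Request(url, data, headers)
    response = urllib2.urlopen(request)

    # pull the cookies set by the first response into a jar
    cookies = CookieJar()
    cookies.extract_cookies(response,request)

    cookie_handler= urllib2.HTTPCookieProcessor( cookies )
    redirect_handler= urllib2.HTTPRedirectHandler()
    opener = urllib2.build_opener(redirect_handler,cookie_handler)

    # repeat the request, this time sending the cookies along
    response = opener.open(request)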

I had to do this exact thing myself recently. I only needed classes from the standard library. Here's an excerpt from my code:

    from urllib import urlencode
    from urllib2 import urlopen, Request

    # URL_BASE, USERNAME and PASSWORD are defined elsewhere in the full script

    # encode my POST parameters for the login page
    login_qs = urlencode( [("username",USERNAME), ("password",PASSWORD)] )

    # extract my session id by loading a page from the site
    set_cookie = urlopen(URL_BASE).headers.getheader("Set-Cookie")
    sess_id = set_cookie[set_cookie.index("=")+1:set_cookie.index(";")]

    # construct headers dictionary using the session id
    headers = {"Cookie": "session_id="+sess_id}

    # perform login and make sure it worked
    if "Announcements:" not in urlopen(Request(URL_BASE+"login",headers=headers), login_qs).read():
        print "Didn't log in properly"
        exit(1)

    # here's the function I used after this for loading pages
    def download(page=""):
        return urlopen(Request(URL_BASE+page, headers=headers)).read()

    # for example (download() already prepends URL_BASE):
    print download("config")
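
Slicing the Set-Cookie header by hand works when the session cookie comes first and is followed by a semicolon. A somewhat more forgiving variant, sketched here for Python 3 with the standard http.cookies module, looks the cookie up by name. The function name session_cookie_header is invented, and the cookie name session_id is taken from the headers dictionary above.

```python
from http.cookies import SimpleCookie


def session_cookie_header(set_cookie_value, name="session_id"):
    """Build a Cookie request header from a raw Set-Cookie header value."""
    parsed = SimpleCookie()
    parsed.load(set_cookie_value)  # attributes such as Path are parsed, not returned
    if name not in parsed:
        raise KeyError("no %r cookie in Set-Cookie header" % name)
    return {"Cookie": "%s=%s" % (name, parsed[name].value)}
```

The returned dictionary can be passed as the headers argument of Request, just like the hand-built one.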

I'd give Mechanize (http://wwwsearch.sourceforge.net/mechanize/) a shot. It may well handle your cookies and headers transparently.

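Try twill, a simple language that allows users to browse the Web from a command-line interface. With twill, you can navigate through Web sites that use forms, cookies, and most standard Web features. More to the point, twill is written in Python and has a Python API, e.g.: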

    from twill import get_browser
    b = get_browser()

    b.go("http://www.python.org/")
    b.showforms()

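Besides the fact that you may be missing a cookie, there might be some fields in the form that you are not POSTing to the web server. The best way to find out is to capture the actual POST from a web browser. You can use LiveHTTPHeaders or WireShark to snoop the traffic and then mimic the same behaviour in your script.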
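
Once the browser's POST body has been captured, comparing it with what the script sends can be automated. The following Python 3 sketch assumes the body is ordinary application/x-www-form-urlencoded text; the function name missing_fields and the sample field names in the usage note are hypothetical.

```python
from urllib.parse import parse_qsl


def missing_fields(captured_body, script_params):
    """Return the field names the browser posted that the script does not send."""
    # keep_blank_values=True so that empty fields such as "remember=" still count.
    captured = dict(parse_qsl(captured_body, keep_blank_values=True))
    return sorted(set(captured) - set(script_params))
```

For instance, if the captured body were "username=bob&password=x&csrf_token=abc" and the script only sent username and password, the function would report csrf_token as the field to add.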

Funkload is a great web app testing tool as well. It wraps webunit to handle the browser emulation, then gives you both functional and load testing features on top.
