HttpClient 4 – How to capture the last redirect URL
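I have fairly simple HttpClient 4 code that calls HttpGet to get HTML output. The HTML comes back with script and image locations all set to be local (e.g. <img src="/images/foo.jpg"/>), so I need the calling URL to make these absolute (<img src="http://foo.com/images/foo.jpg"/>). Now comes the problem: during the call there may be one or two 302 redirects, so the original URL no longer reflects the location of the HTML.
How do I get the latest URL of the returned content, given all the redirects I may (or may not) have?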
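For context, once the final URL is known, relative references in the HTML can be made absolute with the JDK's java.net.URI#resolve. The following is only a minimal sketch: the class name, the method name, and the URLs are made up for illustration.

```java
import java.net.URI;

class ResolveAgainstFinalUrl {
    // Resolve a (possibly relative) reference found in the HTML against
    // the URL the page was actually served from.
    static String toAbsolute(String baseUrl, String reference) {
        return URI.create(baseUrl).resolve(reference).toString();
    }

    public static void main(String[] args) {
        // Assumed final URL after all redirects (hypothetical value).
        String finalUrl = "http://foo.com/pages/index.html";
        System.out.println(toAbsolute(finalUrl, "/images/foo.jpg")); // prints http://foo.com/images/foo.jpg
        System.out.println(toAbsolute(finalUrl, "foo.jpg"));         // prints http://foo.com/pages/foo.jpg
    }
}
```

If the original (pre-redirect) URL were used as the base instead, references that are relative to the page's directory could resolve to the wrong location, which is why the final URL matters here.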
I looked at HttpGet#getAllHeaders() and HttpResponse#getAllHeaders() but couldn't find anything.
Edit: HttpGet#getURI() returns the original calling address.
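That would be the current URL, which you can get by calling HttpGet#getURI().
EDIT: You didn't mention how you do the redirect. This works for us because we handle the 302 ourselves.
It sounds like you are using DefaultRedirectHandler. We used to do that. Getting the current URL is a bit tricky: you need to use your own context. Here is the relevant code snippet: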
HttpGet httpget = new HttpGet(url);
HttpContext context = new BasicHttpContext();
HttpResponse response = httpClient.execute(httpget, context);
if (response.getStatusLine().getStatusCode() != HttpStatus.SC_OK)
    throw new IOException(response.getStatusLine().toString());
// After execution, the context holds the last request sent and its target host.
HttpUriRequest currentReq = (HttpUriRequest) context.getAttribute(
        ExecutionContext.HTTP_REQUEST);
HttpHost currentHost = (HttpHost) context.getAttribute(
        ExecutionContext.HTTP_TARGET_HOST);
// The request URI may be relative; in that case prepend the host.
String currentUrl = (currentReq.getURI().isAbsolute())
        ? currentReq.getURI().toString()
        : (currentHost.toURI() + currentReq.getURI());
The default redirect didn't work for us, so we changed it, but I forget what the problem was.
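The last statement of the snippet above exists because the request stored in the context may carry only a path and query, so it has to be joined with the target host. Here is the same rule in isolation, using only the JDK. The class and method names are hypothetical, and the plain string stands in for the value of HttpHost#toURI().

```java
import java.net.URI;

class CurrentUrl {
    // hostUri plays the role of HttpHost#toURI(), e.g. "http://foo.com".
    static String currentUrl(String hostUri, URI requestUri) {
        return requestUri.isAbsolute()
                ? requestUri.toString()   // already complete, use as is
                : hostUri + requestUri;   // relative: prepend scheme and host
    }

    public static void main(String[] args) {
        System.out.println(currentUrl("http://foo.com", URI.create("/a/b?x=1")));
        System.out.println(currentUrl("http://foo.com", URI.create("http://bar.com/c")));
    }
}
```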
In HttpClient 4, if you are using LaxRedirectStrategy or any subclass of DefaultRedirectStrategy, this is the recommended way (see the DefaultRedirectStrategy source code):
HttpContext context = new BasicHttpContext();
HttpResult<T> result = client.execute(request, handler, context);
URI finalUrl = request.getURI();
RedirectLocations locations = (RedirectLocations) context.getAttribute(DefaultRedirectStrategy.REDIRECT_LOCATIONS);
if (locations != null) {
    finalUrl = locations.getAll().get(locations.getAll().size() - 1);
}
As of HttpClient 4.3.x, the above code can be simplified to:
HttpClientContext context = HttpClientContext.create();
HttpResult<T> result = client.execute(request, handler, context);
URI finalUrl = request.getURI();
List<URI> locations = context.getRedirectLocations();
if (locations != null) {
    finalUrl = locations.get(locations.size() - 1);
}
// A HEAD request avoids downloading the body; switch to HttpGet if the content is needed as well.
HttpHead httpHead = new HttpHead("<put your URL here>");
HttpClient httpClient = HttpClients.createDefault();
HttpClientContext context = HttpClientContext.create();
httpClient.execute(httpHead, context);
List<URI> redirectURIs = context.getRedirectLocations();
if (redirectURIs != null && !redirectURIs.isEmpty()) {
    for (URI redirectURI : redirectURIs) {
        System.out.println("Redirect URI: " + redirectURI);
    }
    // The last entry is the location the client finally ended up at.
    URI finalURI = redirectURIs.get(redirectURIs.size() - 1);
}
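The snippets above reduce to one rule: use the last recorded redirect location if there is one, otherwise fall back to the URI that was requested. A minimal JDK-only sketch of that rule follows. The class and method names are hypothetical, and the list stands in for whatever getRedirectLocations() returned.

```java
import java.net.URI;
import java.util.Arrays;
import java.util.List;

class FinalLocation {
    // redirects may be null or empty when no redirect happened.
    static URI finalUri(URI requested, List<URI> redirects) {
        if (redirects == null || redirects.isEmpty()) {
            return requested;
        }
        return redirects.get(redirects.size() - 1);
    }

    public static void main(String[] args) {
        URI requested = URI.create("http://foo.com/start");
        List<URI> hops = Arrays.asList(
                URI.create("http://foo.com/step"),
                URI.create("http://bar.com/end"));
        System.out.println(finalUri(requested, hops)); // last hop
        System.out.println(finalUri(requested, null)); // no redirect: original URI
    }
}
```

Checking for an empty list as well as null keeps the helper safe regardless of which of the two a given client version reports when nothing was redirected.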
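Building on ZZ Coder's solution, an IMHO improved way is to use a ResponseInterceptor to simply track the last redirect location. That way you don't lose information such as whatever comes after the hashtag. Without the response interceptor you lose the hashtag. Example: http://j.mp/OxbI23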
private static HttpClient createHttpClient() throws NoSuchAlgorithmException, KeyManagementException {
    SSLContext sslContext = SSLContext.getInstance("SSL");
    // TrustAllTrustManager is not shown in this answer; judging by its name it trusts every certificate.
    TrustManager[] trustAllCerts = new TrustManager[] { new TrustAllTrustManager() };
    sslContext.init(null, trustAllCerts, new java.security.SecureRandom());
    SSLSocketFactory sslSocketFactory = new SSLSocketFactory(sslContext);
    SchemeRegistry schemeRegistry = new SchemeRegistry();
    schemeRegistry.register(new Scheme("https", 443, sslSocketFactory));
    schemeRegistry.register(new Scheme("http", 80, new PlainSocketFactory()));
    HttpParams params = new BasicHttpParams();
    ClientConnectionManager cm = new org.apache.http.impl.conn.SingleClientConnManager(schemeRegistry);
    // some pages require a user agent
    AbstractHttpClient httpClient = new DefaultHttpClient(cm, params);
    HttpProtocolParams.setUserAgent(httpClient.getParams(),
            "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:13.0) Gecko/20100101 Firefox/13.0.1");
    // DefaultRedirectStrategy is assumed here; RedirectStrategy itself is only an interface.
    httpClient.setRedirectStrategy(new DefaultRedirectStrategy());
    // Remember the raw Location header of every redirect response.
    httpClient.addResponseInterceptor(new HttpResponseInterceptor() {
        @Override
        public void process(HttpResponse response, HttpContext context) throws HttpException, IOException {
            if (response.containsHeader("Location")) {
                Header[] locations = response.getHeaders("Location");
                if (locations.length > 0)
                    context.setAttribute(LAST_REDIRECT_URL, locations[0].getValue());
            }
        }
    });
    return httpClient;
}

private String getUrlAfterRedirects(HttpContext context) {
    String lastRedirectUrl = (String) context.getAttribute(LAST_REDIRECT_URL);
    if (lastRedirectUrl != null)
        return lastRedirectUrl;
    else {
        // No redirect was recorded: rebuild the URL from the last request and its host.
        HttpUriRequest currentReq = (HttpUriRequest) context.getAttribute(ExecutionContext.HTTP_REQUEST);
        HttpHost currentHost = (HttpHost) context.getAttribute(ExecutionContext.HTTP_TARGET_HOST);
        String currentUrl = (currentReq.getURI().isAbsolute())
                ? currentReq.getURI().toString()
                : (currentHost.toURI() + currentReq.getURI());
        return currentUrl;
    }
}

public static final String LAST_REDIRECT_URL = "last_redirect_url";
Use it just like ZZ Coder's solution:
HttpResponse response = httpClient.execute(httpGet, context);
String url = getUrlAfterRedirects(context);
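To make "losing the hashtag" concrete: a fragment (the part after #) survives only as long as the URL is kept as the raw string. The JDK-only sketch below rebuilds a URI without its fragment to show the difference. The URL and the class name are made up, and this only illustrates the kind of normalization involved; it is not HttpClient's actual code.

```java
import java.net.URI;
import java.net.URISyntaxException;

class FragmentLoss {
    // Rebuild the URI from its components, deliberately leaving the fragment out.
    static URI withoutFragment(URI uri) {
        try {
            return new URI(uri.getScheme(), uri.getAuthority(),
                    uri.getPath(), uri.getQuery(), null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical raw Location header value.
        String rawLocation = "http://example.com/page?x=1#section";
        System.out.println(rawLocation);                              // fragment kept
        System.out.println(withoutFragment(URI.create(rawLocation))); // fragment gone
    }
}
```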
I think an easier way to find the last URL is to use DefaultRedirectHandler.
package ru.test.test;

import java.net.URI;

import org.apache.http.HttpResponse;
import org.apache.http.ProtocolException;
import org.apache.http.impl.client.DefaultRedirectHandler;
import org.apache.http.protocol.HttpContext;

public class MyRedirectHandler extends DefaultRedirectHandler {

    public URI lastRedirectedUri;

    @Override
    public boolean isRedirectRequested(HttpResponse response, HttpContext context) {
        return super.isRedirectRequested(response, context);
    }

    @Override
    public URI getLocationURI(HttpResponse response, HttpContext context) throws ProtocolException {
        lastRedirectedUri = super.getLocationURI(response, context);
        return lastRedirectedUri;
    }
}
Code that uses this handler:
DefaultHttpClient httpclient = new DefaultHttpClient();
MyRedirectHandler handler = new MyRedirectHandler();
httpclient.setRedirectHandler(handler);
HttpGet get = new HttpGet(url);
HttpResponse response = httpclient.execute(get);
HttpEntity entity = response.getEntity();
// Fall back to the requested URL when no redirect happened.
String lastUrl = url;
if (handler.lastRedirectedUri != null) {
    lastUrl = handler.lastRedirectedUri.toString();
}
I found this in the HttpComponents Client documentation:
CloseableHttpClient httpclient = HttpClients.createDefault();
HttpClientContext context = HttpClientContext.create();
HttpGet httpget = new HttpGet("http://localhost:8080/");
CloseableHttpResponse response = httpclient.execute(httpget, context);
try {
    HttpHost target = context.getTargetHost();
    List<URI> redirectLocations = context.getRedirectLocations();
    URI location = URIUtils.resolve(httpget.getURI(), target, redirectLocations);
    System.out.println("Final HTTP location: " + location.toASCIIString());
    // Expected to be an absolute URI
} finally {
    response.close();
}
As of version 2.3, Android still doesn't support following redirects (HTTP code 302). I just read the Location header and download again:
if (statusCode != HttpStatus.SC_OK) {
    Header[] headers = response.getHeaders("Location");
    if (headers != null && headers.length != 0) {
        String newUrl = headers[headers.length - 1].getValue();
        // call again the same downloading method with new URL
        return downloadBitmap(newUrl);
    } else {
        return null;
    }
}
There is no circular redirect protection here, so be careful. More on the blog: Follow 302 redirects with AndroidHttpClient
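A hop limit is one common way to guard a manual redirect loop like the one above against circular redirects. The sketch below keeps the network out of the picture so it can be run on its own: the map is a hypothetical stand-in for "request this URL and read its Location header", and MAX_HOPS is an arbitrary value chosen for illustration.

```java
import java.util.HashMap;
import java.util.Map;

class RedirectFollower {
    static final int MAX_HOPS = 5; // arbitrary limit for this sketch

    // locations maps a URL to the URL it redirects to; a missing key means
    // the response was not a redirect.
    static String follow(String startUrl, Map<String, String> locations) {
        String current = startUrl;
        for (int hops = 0; hops <= MAX_HOPS; hops++) {
            String next = locations.get(current);
            if (next == null) {
                return current; // final, non-redirecting URL reached
            }
            current = next;
        }
        throw new IllegalStateException("Too many redirects starting from " + startUrl);
    }

    public static void main(String[] args) {
        Map<String, String> table = new HashMap<String, String>();
        table.put("http://a.example/", "http://b.example/");
        table.put("http://b.example/", "http://c.example/");
        System.out.println(follow("http://a.example/", table)); // prints http://c.example/
    }
}
```

In the recursive downloadBitmap version, the same idea could be expressed by passing a remaining-hops counter as an extra parameter and returning null once it reaches zero.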
This is how I managed to get the redirect URL:
Header[] arr = httpResponse.getHeaders("Location");
for (Header head : arr) {
    // Read the value from the current header, not from the array.
    String whatever = head.getValue();
}
Or, if you are sure that there is only one redirect location, do this:
httpResponse.getFirstHeader("Location").getValue();