HTTP Requests in Java with Proxies

anderrv

Ander Rodriguez

Posted on March 29, 2022

Accessing data over HTTP is more common every day. Be it through APIs or webpages, communication between applications keeps growing, and so does website scraping.

Performing HTTP calls in Java is not as straightforward as it could be. Many packages offer related functionality, but picking one is not easy, especially if you need extra features such as connecting via authenticated proxies.

We'll go from the basic request to advanced features using fluent.Request, part of the Apache HttpComponents project.

Direct Request

The first step is to request the desired page. We will use httpbin for the demo. It shows headers and origin IP, allowing us to check if the request was successful.

We need to import Request, get the target page and extract the result as a string. The package provides methods for those cases and many more. Lastly, print the response.

import org.apache.hc.client5.http.fluent.Request;

public class TestRequest {
    public static void main(final String... args) throws Exception {
        String url = "http://httpbin.org/anything";

        String response = Request
                .get(url) // use GET HTTP method
                .execute() // perform the call
                .returnContent() // handle and return response
                .asString(); // convert response to string

        System.out.println(response);
    }
}

We are neither handling the response nor checking for errors; it is a simplified version of a real use case.

But we can see in the result that the request was successful and that our IP shows as the origin. We'll solve that in a moment.
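
To compare the origin between requests without reading the whole body, the field can be pulled out of the response text. The following is a minimal sketch, assuming the body contains an "origin" entry the way httpbin returns it. OriginExtractor, extractOrigin, and the sample body are hypothetical, and a regular expression stands in for a proper JSON parser.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class OriginExtractor {
    // Matches "origin": "<value>" and captures the value (assumes no escaped quotes inside it)
    private static final Pattern ORIGIN = Pattern.compile("\"origin\"\\s*:\\s*\"([^\"]*)\"");

    // Returns the origin found in the body, or an empty Optional when there is none
    static Optional<String> extractOrigin(String body) {
        Matcher matcher = ORIGIN.matcher(body);
        return matcher.find() ? Optional.of(matcher.group(1)) : Optional.empty();
    }

    public static void main(String... args) {
        // Illustrative body, not a real response
        String sample = "{\"method\": \"GET\", \"origin\": \"203.0.113.7\", \"url\": \"http://httpbin.org/anything\"}";
        System.out.println(extractOrigin(sample).orElse("origin not found")); // prints 203.0.113.7
    }
}
```

Running the same extraction on a direct and on a proxied response makes it easy to confirm that the origin actually changed.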

Proxy Request

There are many reasons to add proxies to an HTTP request, such as security or anonymity. In any case, Java libraries (usually) make adding proxies complicated.

In our case, we can use viaProxy with the proxy URL as long as we don't need authentication. More on that later.

For now, we'll use a proxy from a free list. Note that these free proxies might not work for you, since they are short-lived.

import org.apache.hc.client5.http.fluent.Request;

public class TestRequest {
    public static void main(final String... args) throws Exception {
        String url = "http://httpbin.org/anything";
        String proxy = "http://169.57.1.85:8123"; // Free proxy

        String response = Request.get(url)
                .viaProxy(proxy) // will set the passed proxy
                .execute().returnContent().asString();

        System.out.println(response);
    }
}
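
Since free proxies come and go, it can save time to check whether one accepts connections before sending a request through it. This is a rough sketch using only the JDK; ProxyCheck and isReachable are hypothetical names, and the timeout is an arbitrary choice. It only shows that the port is open, not that the proxy will forward the request.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.net.URI;

class ProxyCheck {
    // Returns true if a TCP connection to the proxy's host and port opens within the timeout
    static boolean isReachable(String proxyUrl, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            URI uri = URI.create(proxyUrl);
            socket.connect(new InetSocketAddress(uri.getHost(), uri.getPort()), timeoutMillis);
            return true;
        } catch (IOException | IllegalArgumentException e) {
            // Refused, timed out, or the URL had no usable host and port
            return false;
        }
    }

    public static void main(String... args) {
        String proxy = args.length > 0 ? args[0] : "http://169.57.1.85:8123"; // free proxy from the example
        System.out.println(proxy + " reachable: " + isReachable(proxy, 3000));
    }
}
```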

Proxy with Authentication

Paid or private proxy providers, such as ZenRows, frequently require authentication on each call. Sometimes it is done via IP allowlists, but other means such as the Proxy-Authorization header are also common.

Calling the proxy without the proper auth method will result in an error: Exception in thread "main" org.apache.hc.client5.http.HttpResponseException: status code: 407, reason phrase: Proxy Authentication Required.

Continuing with the example, we need two things: the auth header and passing the proxy as an HttpHost.

The Proxy-Authorization header contains the word Basic followed by the user and password, joined by a colon and Base64-encoded.
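
As a small illustration of that encoding, the header value can be built with the JDK alone. ProxyAuthHeader, basicAuth, and the proxy URL below are hypothetical; the sketch assumes the credentials are embedded in the URL as user:password.

```java
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class ProxyAuthHeader {
    // Builds the Proxy-Authorization value from the "user:password" part of a proxy URL
    static String basicAuth(URI proxyUri) {
        String userInfo = proxyUri.getUserInfo(); // null when the URL carries no credentials
        if (userInfo == null) {
            throw new IllegalArgumentException("Proxy URL has no user info: " + proxyUri);
        }
        String encoded = Base64.getEncoder().encodeToString(userInfo.getBytes(StandardCharsets.UTF_8));
        return "Basic " + encoded;
    }

    public static void main(String... args) {
        URI proxyUri = URI.create("http://user:pass@proxy.example.com:8001"); // hypothetical proxy
        System.out.println(basicAuth(proxyUri)); // prints "Basic dXNlcjpwYXNz"
    }
}
```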

Then, we need to change how viaProxy gets the proxy since it does not allow URLs with user and password. For that, we will create a new HttpHost passing in the whole URL. It will internally handle the problem and omit the unneeded parts.

import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

import org.apache.hc.client5.http.fluent.Request;
import org.apache.hc.core5.http.HttpHost;

public class TestRequest {
    public static void main(final String... args) throws Exception {
        String url = "http://httpbin.org/anything";
        URI proxyURI = new URI("http://YOUR_API_KEY:@proxy.zenrows.com:8001"); // Proxy URL as given by the provider
        String basicAuth = Base64.getEncoder() // get the base64 encoder
            .encodeToString(
                proxyURI.getUserInfo().getBytes(StandardCharsets.UTF_8) // get user and password from the proxy URL
            );
        String response = Request.get(url)
                .addHeader("Proxy-Authorization", "Basic " + basicAuth) // add auth
                .viaProxy(HttpHost.create(proxyURI)) // will set the passed proxy as a host
                .execute().returnContent().asString();

        System.out.println(response);
    }
}
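
To see which parts of the proxy URL are credentials and which identify the host, plain java.net.URI is enough. This sketch is not how HttpHost is implemented; it only illustrates the idea of keeping scheme, host, and port while leaving the user info out. ProxyUrlParts, withoutCredentials, and the URL are hypothetical.

```java
import java.net.URI;
import java.net.URISyntaxException;

class ProxyUrlParts {
    // Rebuilds the proxy URL from scheme, host and port only, leaving the user info out
    static URI withoutCredentials(URI proxyUri) throws URISyntaxException {
        return new URI(proxyUri.getScheme(), null, proxyUri.getHost(), proxyUri.getPort(), null, null, null);
    }

    public static void main(String... args) throws Exception {
        URI proxyUri = new URI("http://user:pass@proxy.example.com:8001"); // hypothetical proxy
        System.out.println("user info: " + proxyUri.getUserInfo());
        System.out.println("host:      " + proxyUri.getHost());
        System.out.println("port:      " + proxyUri.getPort());
        System.out.println("cleaned:   " + withoutCredentials(proxyUri));
    }
}
```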

Ignore SSL Certificates

When adding proxies to SSL (https) connections, libraries tend to raise a warning/error about the certificate. From a security perspective, that is awesome! We avoid being shown or redirected to sites we prefer to avoid.

But what about forcing our connections through our own proxies? We already trust them, so we want to ignore those warnings. That is, again, not an easy task in Java.

The error goes something like this: Exception in thread "main" javax.net.ssl.SSLHandshakeException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target.

For this case, we will modify the target URL by switching it to https and call a helper method that we'll create next. Nothing else changes in the main function.

public class TestRequest {
    public static void main(final String... args) throws Exception {
        ignoreCertWarning(); // new method that will ignore certificate warnings

        String url = "https://httpbin.org/anything"; // switch to https
        // ...
    }
}

Now to the complicated and verbose part. We need to create an SSL context with a trust manager that accepts any certificate. As you can see, the trust manager's methods do nothing, so no validation takes place and the errors go away. Lastly, we initialize the context with that trust manager and set it as the default. And we are good to go!

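import java.security.GeneralSecurityException;
import java.security.cert.X509Certificate;
import javax.net.ssl.*;

public class TestRequest {
    // ...
    private static void ignoreCertWarning() {
        // Trust manager that accepts every certificate chain without checking it
        TrustManager[] trustAllCerts = new TrustManager[] { new X509TrustManager() {
            public X509Certificate[] getAcceptedIssuers() { return new X509Certificate[0]; } // must not be null
            public void checkClientTrusted(X509Certificate[] certs, String authType) {}
            public void checkServerTrusted(X509Certificate[] certs, String authType) {}
        } };

        try {
            SSLContext ctx = SSLContext.getInstance("TLS");
            ctx.init(null, trustAllCerts, null);
            SSLContext.setDefault(ctx); // applies to the whole JVM
        } catch (GeneralSecurityException e) {
            // Fail loudly instead of continuing with certificate checks still on
            throw new IllegalStateException("Could not set up the trust-all SSL context", e);
        }
    }
}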
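
Setting the default context affects every connection in the JVM. One possible variation, sketched here with hypothetical names (TrustAllContext, create), is to build the context in a method that returns it and to restore the previous default once the proxied requests are done. Whether a given HTTP client picks up a changed default depends on when that client is built, which this sketch does not cover.

```java
import java.security.GeneralSecurityException;
import java.security.cert.X509Certificate;
import javax.net.ssl.SSLContext;
import javax.net.ssl.TrustManager;
import javax.net.ssl.X509TrustManager;

class TrustAllContext {
    // Builds a TLS context whose trust manager accepts any certificate chain
    static SSLContext create() throws GeneralSecurityException {
        TrustManager[] trustAll = { new X509TrustManager() {
            @Override public X509Certificate[] getAcceptedIssuers() { return new X509Certificate[0]; }
            @Override public void checkClientTrusted(X509Certificate[] chain, String authType) { }
            @Override public void checkServerTrusted(X509Certificate[] chain, String authType) { }
        } };
        SSLContext ctx = SSLContext.getInstance("TLS");
        ctx.init(null, trustAll, null);
        return ctx;
    }

    public static void main(String... args) throws Exception {
        SSLContext previous = SSLContext.getDefault();
        SSLContext.setDefault(create());
        try {
            // Requests that should skip certificate validation would go here
            System.out.println("trust-all context active: " + (SSLContext.getDefault() != previous));
        } finally {
            SSLContext.setDefault(previous); // restore normal validation for the rest of the program
        }
    }
}
```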

Conclusion

Accessing data (or scraping) in Java can get complicated and verbose. But with the right tools and libraries, we managed to tame its verbosity, except for the certificate handling.

We might get back to this topic in the future. The HttpComponents library offers attractive functionalities such as async and multi-threaded execution.
