Java URI Library Compliant with RFC 3986

hidebike712

Hideki Ikeda

Posted on June 4, 2024

Overview

Java developers are well aware that Java's standard URI class, java.net.URI, still follows the obsolete RFC 2396 and does not fully comply with RFC 3986, which is the current URI standard. Major specifications such as RFC 6749 (The OAuth 2.0 Authorization Framework) and OpenID Connect rely on RFC 3986 for their URI syntax, yet the standard library cannot handle every URI those specifications allow. Developers therefore have to work around the deficiencies of the standard library, for example by using another open-source library or by patching their own code.

In my case, I had been using another open-source URI library to address these issues. However, its owner has stopped maintaining it, it does not support IPv6, and I was not convinced by the implementations of the other open-source libraries I looked at. This led me to implement my own URI library that complies with RFC 3986.

What’s wrong with java.net.URI?

🔴 Host containing underscores (_)

RFC 3986 permits underscores (_) in the host part of URI references, whereas RFC 2396 does not. The java.net.URI class does not recognize such hosts as valid, which leads to unexpected behavior:

// Create a URI whose host contains an underscore.
java.net.URI u = new java.net.URI("http://my_host.com");

// This outputs 'null'.
System.out.println(u.getHost());

This issue has been acknowledged in multiple bug reports, such as JDK-8019345 and JDK-8221675, yet it remains unresolved, which makes such URIs hard to work with because their host component cannot be retrieved.
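
One way to notice this case at runtime is to compare getHost() with getAuthority(). The following is a minimal sketch that uses only java.net.URI; the class name UnderscoreHostCheck and the helper hostWasDropped are names made up for this illustration.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class UnderscoreHostCheck {

    // Returns true when java.net.URI kept an authority but could not
    // extract a host from it (hypothetical helper for illustration).
    static boolean hostWasDropped(URI uri) {
        return uri.getHost() == null && uri.getAuthority() != null;
    }

    public static void main(String[] args) throws URISyntaxException {
        URI u = new URI("http://my_host.com");

        System.out.println(u.getHost());       // null
        System.out.println(u.getAuthority());  // my_host.com
        System.out.println(hostWasDropped(u)); // true
    }
}
```

The authority string is still available, so the host is not lost entirely, but code that relies on getHost() silently receives null.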

🔴 IPvFuture Host

RFC 3986 introduces IPvFuture as a valid host type, for example [v9.abc:def], but java.net.URI cannot parse such hosts and throws an exception:

// This throws "java.net.URISyntaxException".
new java.net.URI("http://[v9.abc:def]");

🔴 Scheme-only URI

RFC 3986 allows scheme-only URIs such as data:, but the java.net.URI class cannot parse them:

// This throws "java.net.URISyntaxException".
new java.net.URI("data:");
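
To check the two rejections above side by side, a small test harness such as the following can be used. It is only a sketch built on java.net.URI; the class name JavaNetUriRejections and the helper accepts are hypothetical.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class JavaNetUriRejections {

    // Returns true if java.net.URI accepts the input, false if it throws.
    static boolean accepts(String input) {
        try {
            new URI(input);
            return true;
        } catch (URISyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String[] inputs = { "http://example.com", "http://[v9.abc:def]", "data:" };
        for (String input : inputs) {
            // Prints "accepted" for the first input and "rejected" for the other two.
            System.out.println(input + " -> " + (accepts(input) ? "accepted" : "rejected"));
        }
    }
}
```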

org.czeal.rfc3986 Library

org.czeal.rfc3986 is a library that I developed to address these issues. It complies with RFC 3986 and avoids the problems described above, among others.

The library offers four key functionalities for URI management: parsing, resolving, normalizing, and constructing URI references. Each one is designed to follow RFC 3986 so that URIs are handled accurately across application contexts.

Installation

<dependency>
    <groupId>org.czeal</groupId>
    <artifactId>rfc3986</artifactId>
    <version>{version}</version>
</dependency>

Usage

✅ Parsing

To parse URI references, use URIReference.parse(String uriRef) or URIReference.parse(String uriRef, Charset charset). Below are some examples of using URIReference.parse(String uriRef).

Example 1: Parse Basic URI

URIReference uriRef = URIReference.parse("http://example.com/a/b"); // Parse.

System.out.println(uriRef.toString());                // "http://example.com/a/b"
System.out.println(uriRef.isRelativeReference());     // false
System.out.println(uriRef.getScheme());               // "http"
System.out.println(uriRef.hasAuthority());            // true
System.out.println(uriRef.getAuthority().toString()); // "example.com"
System.out.println(uriRef.getUserinfo());             // null
System.out.println(uriRef.getHost().getType());       // "REGNAME"
System.out.println(uriRef.getHost().getValue());      // "example.com"
System.out.println(uriRef.getPort());                 // -1
System.out.println(uriRef.getPath());                 // "/a/b"
System.out.println(uriRef.getQuery());                // null
System.out.println(uriRef.getFragment());             // null

Example 2: Parse Relative Reference

URIReference uriRef = URIReference.parse("//example.com/a/b"); // Parse.

System.out.println(uriRef.toString());                // "//example.com/a/b"
System.out.println(uriRef.isRelativeReference());     // true
System.out.println(uriRef.getScheme());               // null
System.out.println(uriRef.hasAuthority());            // true
System.out.println(uriRef.getAuthority().toString()); // "example.com"
System.out.println(uriRef.getUserinfo());             // null
System.out.println(uriRef.getHost().getType());       // "REGNAME"
System.out.println(uriRef.getHost().getValue());      // "example.com"
System.out.println(uriRef.getPort());                 // -1
System.out.println(uriRef.getPath());                 // "/a/b"
System.out.println(uriRef.getQuery());                // null
System.out.println(uriRef.getFragment());             // null

Example 3: Parse URI with IPv4 Host

URIReference uriRef = URIReference.parse("http://101.102.103.104"); // Parse.

System.out.println(uriRef.toString());                // "http://101.102.103.104"
System.out.println(uriRef.isRelativeReference());     // false
System.out.println(uriRef.getScheme());               // "http"
System.out.println(uriRef.hasAuthority());            // true
System.out.println(uriRef.getAuthority().toString()); // "101.102.103.104"
System.out.println(uriRef.getUserinfo());             // null
System.out.println(uriRef.getHost().getType());       // "IPV4"
System.out.println(uriRef.getHost().getValue());      // "101.102.103.104"
System.out.println(uriRef.getPort());                 // -1
System.out.println(uriRef.getPath());                 // null
System.out.println(uriRef.getQuery());                // null
System.out.println(uriRef.getFragment());             // null

Example 4: Parse URI with IPv6 Host

URIReference uriRef = URIReference.parse("http://[2001:0db8:0001:0000:0000:0ab9:C0A8:0102]"); // Parse.

System.out.println(uriRef.toString());                // "http://[2001:0db8:0001:0000:0000:0ab9:C0A8:0102]"
System.out.println(uriRef.isRelativeReference());     // false
System.out.println(uriRef.getScheme());               // "http"
System.out.println(uriRef.hasAuthority());            // true
System.out.println(uriRef.getAuthority().toString()); // "[2001:0db8:0001:0000:0000:0ab9:C0A8:0102]"
System.out.println(uriRef.getUserinfo());             // null
System.out.println(uriRef.getHost().getType());       // "IPV6"
System.out.println(uriRef.getHost().getValue());      // "[2001:0db8:0001:0000:0000:0ab9:C0A8:0102]"
System.out.println(uriRef.getPort());                 // -1
System.out.println(uriRef.getPath());                 // null
System.out.println(uriRef.getQuery());                // null
System.out.println(uriRef.getFragment());             // null

Example 5: Parse URI with IPvFuture Host

URIReference uriRef = URIReference.parse("http://[v9.abc:def]"); // Parse.

System.out.println(uriRef.toString());                // "http://[v9.abc:def]"
System.out.println(uriRef.isRelativeReference());     // false
System.out.println(uriRef.getScheme());               // "http"
System.out.println(uriRef.hasAuthority());            // true
System.out.println(uriRef.getAuthority().toString()); // "[v9.abc:def]"
System.out.println(uriRef.getUserinfo());             // null
System.out.println(uriRef.getHost().getType());       // "IPVFUTURE"
System.out.println(uriRef.getHost().getValue());      // "[v9.abc:def]"
System.out.println(uriRef.getPort());                 // -1
System.out.println(uriRef.getPath());                 // null
System.out.println(uriRef.getQuery());                // null
System.out.println(uriRef.getFragment());             // null

Example 6: Parse URI with Percent-encoded Host

URIReference uriRef = URIReference.parse("http://%65%78%61%6D%70%6C%65.com"); // Parse.

System.out.println(uriRef.toString());                // "http://%65%78%61%6D%70%6C%65.com"
System.out.println(uriRef.isRelativeReference());     // false
System.out.println(uriRef.getScheme());               // "http"
System.out.println(uriRef.hasAuthority());            // true
System.out.println(uriRef.getAuthority().toString()); // "%65%78%61%6D%70%6C%65.com"
System.out.println(uriRef.getUserinfo());             // null
System.out.println(uriRef.getHost().getType());       // "REGNAME"
System.out.println(uriRef.getHost().getValue());      // "%65%78%61%6D%70%6C%65.com"
System.out.println(uriRef.getPort());                 // -1
System.out.println(uriRef.getPath());                 // null
System.out.println(uriRef.getQuery());                // null
System.out.println(uriRef.getFragment());             // null

⚠️ If parsing fails, these methods throw NullPointerException or IllegalArgumentException. See the Javadoc for more details.
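
For readers who want to see where the component boundaries in the examples above come from, RFC 3986, Appendix B gives a regular expression that splits any URI reference into scheme, authority, path, query, and fragment. The sketch below applies that expression with java.util.regex. It is not how org.czeal.rfc3986 is implemented, it performs no validation, and the class name AppendixBSplit is made up for this illustration.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AppendixBSplit {

    // Regular expression from RFC 3986, Appendix B.
    private static final Pattern URI_REF = Pattern.compile(
        "^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\\?([^#]*))?(#(.*))?");

    // Returns { scheme, authority, path, query, fragment }.
    // Components that are absent from the input are null.
    static String[] split(String uriRef) {
        Matcher m = URI_REF.matcher(uriRef);
        if (!m.matches()) {
            throw new IllegalArgumentException("Cannot split: " + uriRef);
        }
        return new String[] {
            m.group(2), m.group(4), m.group(5), m.group(7), m.group(9)
        };
    }

    public static void main(String[] args) {
        // [http, example.com, /a/b, k1=v1, top]
        System.out.println(Arrays.toString(split("http://example.com/a/b?k1=v1#top")));

        // [null, example.com, /a/b, null, null]
        System.out.println(Arrays.toString(split("//example.com/a/b")));
    }
}
```

The expression only finds boundaries. Checking that each component uses permitted characters, or classifying the host as a registered name or an IP literal, requires additional work of the kind a full parser performs.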

✅ Resolving

To resolve a relative reference against a URI reference, use resolve(String uriRef) or resolve(URIReference uriRef). Below is an example demonstrating how to resolve a relative reference against a base URI.

// A base URI.
URIReference baseUri = URIReference.parse("http://example.com");

// A relative reference.
URIReference relRef = URIReference.parse("/a/b");

// Resolve the relative reference against the base URI.
URIReference resolved = baseUri.resolve(relRef);

System.out.println(resolved.toString());                // "http://example.com/a/b"
System.out.println(resolved.isRelativeReference());     // false
System.out.println(resolved.getScheme());               // "http"
System.out.println(resolved.hasAuthority());            // true
System.out.println(resolved.getAuthority().toString()); // "example.com"
System.out.println(resolved.getUserinfo());             // null
System.out.println(resolved.getHost().getType());       // "REGNAME"
System.out.println(resolved.getHost().getValue());      // "example.com"
System.out.println(resolved.getPort());                 // -1
System.out.println(resolved.getPath());                 // "/a/b"
System.out.println(resolved.getQuery());                // null
System.out.println(resolved.getFragment());             // null
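
The example above uses a reference whose path starts with "/", so the reference path simply replaces the base path. When the reference path does not start with "/", RFC 3986, Section 5.2.3 describes a merge step instead. The following sketch shows that step in isolation; it is written from the RFC text and is not taken from the library, and the class name MergePaths is hypothetical.

```java
public class MergePaths {

    // Merge routine from RFC 3986, Section 5.2.3.
    static String merge(boolean baseHasAuthority, String basePath, String refPath) {
        // A base with an authority and an empty path behaves as if its path were "/".
        if (baseHasAuthority && basePath.isEmpty()) {
            return "/" + refPath;
        }
        // Otherwise drop everything after the last "/" of the base path
        // (the whole base path if it contains no "/") and append the reference path.
        int lastSlash = basePath.lastIndexOf('/');
        return basePath.substring(0, lastSlash + 1) + refPath;
    }

    public static void main(String[] args) {
        System.out.println(merge(true, "", "a/b"));     // "/a/b"
        System.out.println(merge(true, "/a/b/c", "d")); // "/a/b/d"
    }
}
```

In the full algorithm of Section 5.2.2, the merged path is then passed through dot-segment removal before the target URI is assembled.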

✅ Normalizing

To normalize a URI, invoke normalize() on a URIReference instance.

Example 1: Normalize URI with Mixed-Case Scheme

URIReference normalized = URIReference.parse("hTTp://example.com") // Parse.
                                      .normalize();                // Normalize.

System.out.println(normalized.toString());                // "http://example.com/"
System.out.println(normalized.isRelativeReference());     // false
System.out.println(normalized.getScheme());               // "http"
System.out.println(normalized.hasAuthority());            // true
System.out.println(normalized.getAuthority().toString()); // "example.com"
System.out.println(normalized.getUserinfo());             // null
System.out.println(normalized.getHost().getType());       // "REGNAME"
System.out.println(normalized.getHost().getValue());      // "example.com"
System.out.println(normalized.getPort());                 // -1
System.out.println(normalized.getPath());                 // "/"
System.out.println(normalized.getQuery());                // null
System.out.println(normalized.getFragment());             // null

Example 2: Normalize URI with Percent-Encoded Host

URIReference normalized = URIReference.parse("http://%65%78%61%6D%70%6C%65.com") // Parse.
                                      .normalize();                              // Normalize.

System.out.println(normalized.toString());                // "http://example.com/"
System.out.println(normalized.isRelativeReference());     // false
System.out.println(normalized.getScheme());               // "http"
System.out.println(normalized.hasAuthority());            // true
System.out.println(normalized.getAuthority().toString()); // "example.com"
System.out.println(normalized.getUserinfo());             // null
System.out.println(normalized.getHost().getType());       // "REGNAME"
System.out.println(normalized.getHost().getValue());      // "example.com"
System.out.println(normalized.getPort());                 // -1
System.out.println(normalized.getPath());                 // "/"
System.out.println(normalized.getQuery());                // null
System.out.println(normalized.getFragment());             // null
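
The decoding seen in this example corresponds to percent-encoding normalization in RFC 3986, Section 6.2.2.2: percent-encoded octets that represent unreserved characters (letters, digits, "-", ".", "_", "~") are decoded, and the hexadecimal digits of the remaining triplets are uppercased as described in Section 6.2.2.1. The sketch below shows only that step on a plain string. It is an illustration written from the RFC, not the library's code, and the class name UnreservedDecoding is made up here.

```java
public class UnreservedDecoding {

    // unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"  (RFC 3986, Section 2.3)
    static boolean isUnreserved(char c) {
        return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')
            || (c >= '0' && c <= '9')
            || c == '-' || c == '.' || c == '_' || c == '~';
    }

    // Returns the value of an ASCII hex digit, or -1 if c is not one.
    static int hexValue(char c) {
        if (c >= '0' && c <= '9') return c - '0';
        if (c >= 'a' && c <= 'f') return c - 'a' + 10;
        if (c >= 'A' && c <= 'F') return c - 'A' + 10;
        return -1;
    }

    static String decodeUnreserved(String s) {
        StringBuilder out = new StringBuilder(s.length());
        int i = 0;
        while (i < s.length()) {
            char c = s.charAt(i);
            if (c == '%' && i + 2 < s.length()) {
                int hi = hexValue(s.charAt(i + 1));
                int lo = hexValue(s.charAt(i + 2));
                if (hi >= 0 && lo >= 0) {
                    char decoded = (char) (hi * 16 + lo);
                    if (isUnreserved(decoded)) {
                        out.append(decoded);
                    } else {
                        // Keep the triplet, but uppercase its hex digits.
                        out.append('%')
                           .append(Character.toUpperCase(s.charAt(i + 1)))
                           .append(Character.toUpperCase(s.charAt(i + 2)));
                    }
                    i += 3;
                    continue;
                }
            }
            out.append(c);
            i++;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(decodeUnreserved("%65%78%61%6D%70%6C%65.com")); // "example.com"
        System.out.println(decodeUnreserved("a%2fb"));                     // "a%2Fb"
    }
}
```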

Example 3: Normalize URI with Unresolved Path

URIReference normalized = URIReference.parse("http://example.com/a/b/c/../d/") // Parse.
                                      .normalize();                            // Normalize.

System.out.println(normalized.toString());                // "http://example.com/a/b/d/"
System.out.println(normalized.isRelativeReference());     // false
System.out.println(normalized.getScheme());               // "http"
System.out.println(normalized.hasAuthority());            // true
System.out.println(normalized.getAuthority().toString()); // "example.com"
System.out.println(normalized.getUserinfo());             // null
System.out.println(normalized.getHost().getType());       // "REGNAME"
System.out.println(normalized.getHost().getValue());      // "example.com"
System.out.println(normalized.getPort());                 // -1
System.out.println(normalized.getPath());                 // "/a/b/d/"
System.out.println(normalized.getQuery());                // null
System.out.println(normalized.getFragment());             // null
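
The path handling in this example follows the remove_dot_segments algorithm of RFC 3986, Section 5.2.4. As a rough sketch of that algorithm, written from the RFC and not taken from the library, the steps can be expressed as follows. The class name RemoveDotSegments and its helper are hypothetical.

```java
public class RemoveDotSegments {

    // Sketch of remove_dot_segments (RFC 3986, Section 5.2.4).
    static String removeDotSegments(String path) {
        StringBuilder in = new StringBuilder(path);
        StringBuilder out = new StringBuilder();

        while (in.length() > 0) {
            String s = in.toString();
            if (s.startsWith("../")) {                    // Step A
                in.delete(0, 3);
            } else if (s.startsWith("./")) {              // Step A
                in.delete(0, 2);
            } else if (s.startsWith("/./")) {             // Step B: "/./" becomes "/"
                in.delete(0, 2);
            } else if (s.equals("/.")) {                  // Step B: "/." becomes "/"
                in.replace(0, 2, "/");
            } else if (s.startsWith("/../")) {            // Step C: "/../" becomes "/"
                in.delete(0, 3);
                removeLastSegment(out);
            } else if (s.equals("/..")) {                 // Step C: "/.." becomes "/"
                in.replace(0, 3, "/");
                removeLastSegment(out);
            } else if (s.equals(".") || s.equals("..")) { // Step D
                in.setLength(0);
            } else {                                      // Step E: move one segment
                int next = s.indexOf('/', 1);
                if (next < 0) {
                    next = s.length();
                }
                out.append(s, 0, next);
                in.delete(0, next);
            }
        }
        return out.toString();
    }

    // Removes the last segment and its preceding "/" (if any) from the output.
    private static void removeLastSegment(StringBuilder out) {
        int lastSlash = out.lastIndexOf("/");
        out.setLength(lastSlash < 0 ? 0 : lastSlash);
    }

    public static void main(String[] args) {
        System.out.println(removeDotSegments("/a/b/c/../d/"));     // "/a/b/d/"
        System.out.println(removeDotSegments("/a/b/c/./../../g")); // "/a/g"
    }
}
```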

Example 4: Normalize Relative Reference

// Parse a relative reference.
URIReference relRef = URIReference.parse("/a/b/c/../d/");

// Resolve the relative reference against "http://example.com".
// NOTE: Relative references must be resolved before normalization.
URIReference resolved = URIReference.parse("http://example.com").resolve(relRef);

// Normalize the resolved URI.
URIReference normalized = resolved.normalize();

System.out.println(normalized.toString());                // "http://example.com/a/b/d/"
System.out.println(normalized.isRelativeReference());     // false
System.out.println(normalized.getScheme());               // "http"
System.out.println(normalized.hasAuthority());            // true
System.out.println(normalized.getAuthority().toString()); // "example.com"
System.out.println(normalized.getUserinfo());             // null
System.out.println(normalized.getHost().getType());       // "REGNAME"
System.out.println(normalized.getHost().getValue());      // "example.com"
System.out.println(normalized.getPort());                 // -1
System.out.println(normalized.getPath());                 // "/a/b/d/"
System.out.println(normalized.getQuery());                // null
System.out.println(normalized.getFragment());             // null

⚠️ A relative reference must be resolved before normalization, as stated in RFC 3986, Section 5.2.1:

RFC 3986, 5.2.1. Pre-parse the Base URI

A URI reference must be transformed to its target URI before
it can be normalized.

✅ Constructing

To construct URI references, use the URIReferenceBuilder class.

Example 1: Construct Basic URI

URIReference uriRef = new URIReferenceBuilder()
                          .setScheme("http")
                          .setHost("example.com")
                          .setPath("/a/b/c")
                          .appendQueryParam("k1", "v1")
                          .build();

System.out.println(uriRef.toString());                // "http://example.com/a/b/c?k1=v1"
System.out.println(uriRef.isRelativeReference());     // false
System.out.println(uriRef.getScheme());               // "http"
System.out.println(uriRef.hasAuthority());            // true
System.out.println(uriRef.getAuthority().toString()); // "example.com"
System.out.println(uriRef.getUserinfo());             // null
System.out.println(uriRef.getHost().getType());       // "REGNAME"
System.out.println(uriRef.getHost().getValue());      // "example.com"
System.out.println(uriRef.getPort());                 // -1
System.out.println(uriRef.getPath());                 // "/a/b/c"
System.out.println(uriRef.getQuery());                // "k1=v1"
System.out.println(uriRef.getFragment());             // null

Example 2: Construct URI from Existing URI

URIReference uriRef = new URIReferenceBuilder()
                          .fromURIReference("http://example.com/a/b/c?k1=v1")
                          .appendPath("d", "e", "f")
                          .appendQueryParam("k2", "v2")
                          .build();

System.out.println(uriRef.toString());                // "http://example.com/a/b/c/d/e/f?k1=v1&k2=v2"
System.out.println(uriRef.isRelativeReference());     // false
System.out.println(uriRef.getScheme());               // "http"
System.out.println(uriRef.hasAuthority());            // true
System.out.println(uriRef.getAuthority().toString()); // "example.com"
System.out.println(uriRef.getUserinfo());             // null
System.out.println(uriRef.getHost().getType());       // "REGNAME"
System.out.println(uriRef.getHost().getValue());      // "example.com"
System.out.println(uriRef.getPort());                 // -1
System.out.println(uriRef.getPath());                 // "/a/b/c/d/e/f"
System.out.println(uriRef.getQuery());                // "k1=v1&k2=v2"
System.out.println(uriRef.getFragment());             // null

⚠️ URIReferenceBuilder validates each URI component only when build() is called. In the current implementation, invalid input therefore does not cause an exception until build() is invoked.
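
To make the consequence of this design concrete, here is a toy builder that defers validation in the same way. SimpleUriBuilder is a hypothetical class written only for this illustration; it is not part of org.czeal.rfc3986 and checks far less than a real builder would.

```java
public class DeferredValidationSketch {

    // Hypothetical builder: setters only store values, build() validates.
    static final class SimpleUriBuilder {
        private String scheme;
        private String host;

        SimpleUriBuilder setScheme(String scheme) {
            this.scheme = scheme; // no validation here
            return this;
        }

        SimpleUriBuilder setHost(String host) {
            this.host = host;     // no validation here
            return this;
        }

        String build() {
            // scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )  (RFC 3986, Section 3.1)
            if (scheme == null || !scheme.matches("[A-Za-z][A-Za-z0-9+.-]*")) {
                throw new IllegalArgumentException("Invalid scheme: " + scheme);
            }
            if (host == null) {
                throw new IllegalArgumentException("Host is required");
            }
            return scheme + "://" + host;
        }
    }

    public static void main(String[] args) {
        // The invalid scheme is accepted silently at this point.
        SimpleUriBuilder builder = new SimpleUriBuilder()
            .setScheme("1http")
            .setHost("example.com");

        try {
            builder.build();
        } catch (IllegalArgumentException e) {
            // The error surfaces only here.
            System.out.println("build() rejected the input: " + e.getMessage());
        }
    }
}
```

With this style, a single try/catch around build() is enough, but the exception is raised away from the setter call that supplied the bad value.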

Note

📌 Immutable class

Most classes in this library, such as URIReference, are designed to be immutable. Here are some examples.

// Example 1: Invoking the "normalize()" method creates a new URIReference instance.
URIReference normalized = URIReference.parse("hTTp://example.com").normalize();

// Example 2: Invoking the "resolve(String uriRef)" method creates a new URIReference instance.
URIReference resolved = URIReference.parse("http://example.com").resolve("/a/b");
