Day35:Parse URL - 100DayOfRust
BC
Posted on February 23, 2020
We can use the url library to parse a URL:
Cargo.toml:
[dependencies]
url = "2.1.1"
Example:
use url::Url;

fn main() {
    let url = "https://github.com:8443/rust/issues?labels=E-easy&state=open";
    let parsed = Url::parse(url).unwrap();
    println!("scheme: {}", parsed.scheme());
    println!("host: {}", parsed.host().unwrap());
    if let Some(port) = parsed.port() {
        println!("port: {}", port);
    }
    println!("path: {}", parsed.path());
    println!("query: {}", parsed.query().unwrap());
}
Run the code:
scheme: https
host: github.com
port: 8443
path: /rust/issues
query: labels=E-easy&state=open
One thing I noticed, and at first was not sure whether it was a bug in the url library: if I change the URL to
https://github.com:443/rust/issues?labels=E-easy&state=open
then port() returns None instead of Some(443). The library seems to drop the port when it is the default port for the scheme.
This differs from Python's behavior:
from urllib.parse import urlparse
parsed = urlparse("https://github.com:443/rust/issues?labels=E-easy&state=open")
print(parsed.port)
# prints 443
In this case, if we still want to get 443, we should use the port_or_known_default method instead of port:
if let Some(port) = parsed.port_or_known_default() {
    println!("port: {}", port);
}
The caveat is that with this method, even if the URL does not contain ":443", it still prints 443.
[Update] Apparently this port behavior is intentional, according to the doc comment in the source code:
/// Note that default port numbers are never reflected by the serialization,
/// use the `port_or_known_default()` method if you want a default port number returned.
But I would still prefer that when the URL contains the port number explicitly, even if it is the standard port, port() returns it; returning None is just weird.
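If the explicitly written port really matters, one workaround is to look at the raw string before handing it to the library. The sketch below uses only the standard library; explicit_port is a hypothetical helper, not part of the url library, and it assumes a simple scheme://host:port/... shape rather than implementing the full URL standard.

```rust
// Hypothetical helper: returns the port only if it is literally written
// in the raw URL string. A simplistic sketch, not a full URL parser.
fn explicit_port(raw: &str) -> Option<u16> {
    // Skip the scheme ("https://"), if present.
    let after_scheme = raw.split_once("://").map_or(raw, |(_, rest)| rest);
    // The authority part ends at the first '/', '?' or '#'.
    let authority = after_scheme
        .split(|c: char| matches!(c, '/' | '?' | '#'))
        .next()
        .unwrap_or("");
    // Drop a "user:password@" prefix, if any.
    let host_port = authority.rsplit_once('@').map_or(authority, |(_, hp)| hp);
    // For an IPv6 literal such as "[::1]:443" the port follows "]:";
    // otherwise it follows the last ':'.
    let port_str = if host_port.starts_with('[') {
        host_port.rsplit_once("]:").map(|(_, p)| p)?
    } else {
        host_port.rsplit_once(':').map(|(_, p)| p)?
    };
    port_str.parse().ok()
}

fn main() {
    let with_port = "https://github.com:443/rust/issues?labels=E-easy&state=open";
    let without_port = "https://github.com/rust/issues?labels=E-easy&state=open";
    println!("{:?}", explicit_port(with_port)); // Some(443)
    println!("{:?}", explicit_port(without_port)); // None
}
```

A helper like this could be combined with port_or_known_default() when both the written port and the effective port are needed.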