A Gemini Client in Rust - 04 The Visit Function

krowemoh

Nivethan

Posted on November 9, 2020

Hello! We just wrote our infinite loop to process user commands. In this chapter we will implement 2 commands: the quit option, so users can leave our client, and the visit function, which will let them visit gemini space!

Let's jump right in.

Quitting

Pack it up, tutorial is over friends!

Bad joke, moving on. Currently we print our command to the screen; now we're going to do some very light tokenization. Tokenization means breaking the command into sub pieces and doing different things based on those pieces. Luckily our client is going to be simple: we will split on white space and the resulting array will be our tokens.

./src/main.rs

use std::io;
use std::io::{Write};

fn main() {
    let prompt = ">";
    loop {
        print!("{} ", prompt);
        io::stdout().flush().unwrap();

        let mut command = String::new();
        io::stdin().read_line(&mut command).unwrap();

        let tokens: Vec<&str> = command.trim().split(" ").collect();
        println!("{:?}", tokens);
    }
}

We take our command and do some very light processing. Real tokenization will go character by character and will have rules on how to split a line into sub pieces. We are getting away with splitting on white space because we are writing a basic client.

To learn more, I highly recommend the following book.

http://craftinginterpreters.com/

I used TypeScript for the first half of the book, whereas the author used Java.

Back to Rust! We first trim the command to remove the trailing newline character, then split on a single space. At this point the split function has returned an iterator. We then collect it, and the result is a vector of string slices (Vec<&str>) that borrow from command.
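One thing worth seeing for yourself is what happens when the user types more than one space between words. Here is a small sketch comparing the split(" ") approach with the standard library's split_whitespace. The tokenize helper is a made-up name for illustration and is not part of our client.

```rust
/// Splits a raw input line into whitespace-separated tokens.
/// `tokenize` is a hypothetical helper, not part of the tutorial's code.
fn tokenize(line: &str) -> Vec<&str> {
    // split_whitespace ignores leading and trailing whitespace and treats
    // runs of spaces or tabs as one separator, so it never yields "".
    line.split_whitespace().collect()
}

fn main() {
    // Roughly what read_line hands us: the typed text plus a newline.
    let line = "visit   gemini.circumlunar.space\n";

    // Splitting on a single space keeps an empty string for each extra space.
    let naive: Vec<&str> = line.trim().split(" ").collect();
    println!("{:?}", naive);

    // The helper collapses the repeated spaces.
    println!("{:?}", tokenize(line));
}
```

There is a trade-off: with split_whitespace an empty line gives an empty vector, so code that indexes tokens[0] would need a length check first.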

Now for another rule. Our first token will dictate what we will do, so all we need to do now is match against it.

./src/main.rs

use std::io;
use std::io::{Write};

fn main() {
    let prompt = ">";
    loop {
        print!("{} ", prompt);
        io::stdout().flush().unwrap();

        let mut command = String::new();
        io::stdin().read_line(&mut command).unwrap();

        let tokens: Vec<&str> = command.trim().split(" ").collect();

        match tokens[0] {
            "q" => break,
            _ => println!("{:?}", tokens),

        }
    }
}

Voila! All we do is match on the first token. If it is q, we break out of our infinite loop, and since nothing follows the loop in main, the program ends. Every other token falls through to the _ arm, which prints our tokens.
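As the list of commands grows, one option is to separate "figuring out what the user meant" from "doing it". The sketch below shows that idea with an enum. Command and parse_command are hypothetical names, and this is just one possible shape, not how our client is written.

```rust
/// The actions our prompt understands so far.
/// `Command` and `parse_command` are hypothetical names for illustration.
#[derive(Debug, PartialEq)]
enum Command<'a> {
    Quit,
    Unknown(&'a str),
}

fn parse_command<'a>(tokens: &[&'a str]) -> Command<'a> {
    // first() returns None for an empty slice instead of panicking.
    match tokens.first().copied() {
        Some("q") => Command::Quit,
        Some(other) => Command::Unknown(other),
        None => Command::Unknown(""),
    }
}

fn main() {
    println!("{:?}", parse_command(&["q"]));
    println!("{:?}", parse_command(&["hello", "world"]));
}
```

The loop in main could then match on the returned Command and break on Command::Quit.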

Now let's work on the core of our client, the visit function!

Visit Function

Pack your bags, we're going to see grandma!

Another joke...

Similar to our quit option, let's add our visit command.

./src/main.rs

...
fn visit(tokens: Vec<&str>) {
    println!("Attempting to visit....{}", tokens[1]);
}
...
...
            "q" => break,
            "visit" => visit(tokens),
...

First thing: from here on we'll only go over what has changed. If you spot any bugs or errors, please leave them in the comments below and I'll update as soon as possible. Thank you!

The first thing to note is that we added a visit command underneath our quit option. It calls our visit function, which for now only prints where we are trying to go, but we'll fix that shortly.

Rust is a safe language but it isn't idiot proof! If we enter visit all by itself, our program will panic, because the function references tokens[1], which doesn't exist.

We need to add some array verification so we don't cause a panic!

./src/main.rs

...
fn visit(tokens: Vec<&str>) {
    if tokens.len() < 2 {
        println!("Nowhere to visit.");
        return;
    }

    let destination = tokens[1];

    println!("Attempting to visit....{}", destination);
}
...

Wonderful! We will now print an error message for the user if they try typing visit without a destination.
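The length check works fine. Another way to get the same safety is to ask the slice for the element with get, which returns an Option instead of panicking. This is only a sketch, and destination_of is a hypothetical helper name.

```rust
/// Returns the destination token if the user supplied one.
/// `destination_of` is a hypothetical helper, not code from the tutorial.
fn destination_of<'a>(tokens: &[&'a str]) -> Option<&'a str> {
    // get(1) yields None when index 1 is out of range,
    // where tokens[1] would panic.
    tokens.get(1).copied()
}

fn report(tokens: &[&str]) {
    match destination_of(tokens) {
        Some(destination) => println!("Attempting to visit....{}", destination),
        None => println!("Nowhere to visit."),
    }
}

fn main() {
    report(&["visit"]);
    report(&["visit", "gemini.circumlunar.space"]);
}
```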

Now we're going to work on the big step. We're going to connect to a Gemini server and actually get data!

Connecting

The more we connect, the more we hurt each other.

https://en.wikipedia.org/wiki/Hedgehog%27s_dilemma


The next part we're going to work on isn't going to work, but it's just as much fun to see how things break as it is to see it all work!

./src/main.rs

...
use std::io::{Read, Write};
use std::net::TcpStream; // needed for TcpStream::connect below
...
fn visit(tokens: Vec<&str>) {
    if tokens.len() < 2 {
        println!("Nowhere to visit.");
        return;
    }

    let destination = format!("{}:1965", tokens[1]);
    println!("Attempting to visit....{}", destination);

    let mut socket = TcpStream::connect(&destination).unwrap();
    socket.write_all(tokens[1].as_bytes()).unwrap();

    let mut raw_data: Vec<u8> = vec![];
    socket.read_to_end(&mut raw_data).unwrap();

    let content = String::from_utf8_lossy(&raw_data);
    println!("{}", content);
}
...

We first bring the Read trait into scope from the standard library's io module. This is what lets us call read methods such as read_to_end on sockets.

Next we update our destination variable, instead of just where we want to go, we are now adding the port number as well. Port 1965 is what is used by Gemini servers to host content.

Next we connect to the destination on that port.

We then write out the location we want as that is what the Gemini server will serve us. We need to convert this to bytes.

Next we read the data, we don't know how much data the Gemini server is going to send back so we set up a vector of bytes and then read_to_end. This will give us a growing buffer to hold the data we receive.

Next we convert the data we received into a String using the from_utf8_lossy function. We use the lossy version because we don't know the encoding at this point; any bytes that aren't valid UTF-8 are replaced with the replacement character, U+FFFD, instead of causing an error.
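To see the lossy conversion on its own, here is a tiny sketch that needs no network at all. The decode helper is a hypothetical name, and the byte values are made up for the example.

```rust
/// Converts raw bytes to a String, replacing invalid UTF-8 with U+FFFD.
/// `decode` is a hypothetical helper name.
fn decode(raw: &[u8]) -> String {
    String::from_utf8_lossy(raw).into_owned()
}

fn main() {
    // Valid UTF-8 passes through unchanged.
    println!("{}", decode(b"20 text/gemini"));

    // The byte 0xFF never appears in valid UTF-8, so it gets replaced.
    println!("{}", decode(&[b'h', b'i', 0xFF]));
}
```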

Finally we print our data to the screen.

You can see all this in action by trying to visit gemini.circumlunar.space

Now this function should print just a blank line. This is because of a key part of the specification. One mandatory rule for Gemini is that we must use TLS. Right now, we aren't and so we aren't really talking to the Gemini server yet.

TLS

People often say SSL when they mean TLS. SSL is the older, now deprecated predecessor of TLS, Transport Layer Security, so in casual use you can read the two words as meaning the same thing.

Now we will include some crates into our project. We will be using the rustls crate to do our TLS connections.

The first step is to add rustls to our Cargo.toml file.

./Cargo.toml

[dependencies]
rustls = { version="0.18", features=["dangerous_configuration"] }
webpki = "0.21"

I would have really liked to not use any dependencies but that's okay!

rustls is the crate we will rely on, and in turn we also need webpki for one type. We may be able to strip it out later, but for now let's just pull in the whole crate.

The big thing here is that in our features flags for rustls, we enable dangerous_configuration. This is because we are about to do something dangerous. Gemini, as part of its spec, allows us to do anything with the TLS certificate, including not verifying. This is what we'll be implementing so we need to be able to turn off certificate validation in rustls and this is only done via the dangerous_configuration feature.

Let's get to the code!

We're going to split this up into two sections and go over them separately.

use std::io;
use std::io::{Read, Write};
use std::net::TcpStream;
use std::sync::Arc;
use rustls::Session;
use rustls::{RootCertStore, Certificate, ServerCertVerified, TLSError, ServerCertVerifier};
use webpki::{DNSNameRef};

struct DummyVerifier { }

impl DummyVerifier {
    fn new() -> Self {
        DummyVerifier { }
    }
}

impl ServerCertVerifier for DummyVerifier {
    fn verify_server_cert(
        &self,
        _: &RootCertStore,
        _: &[Certificate],
        _: DNSNameRef,
        _: &[u8]
    ) -> Result<ServerCertVerified, TLSError> {
        Ok(ServerCertVerified::assertion())
    }
}

fn visit(tokens: Vec<&str>) {
...

Whew! We have a whole slew of things to get through. The first thing to note is that we have a few more crates and utilities added to our little client. We could, and probably should, move all this into a separate file for TLS connections, but I think it's mentally easier if everything is in one file so you can hold it all in your head.

The next thing to note is that we create a struct called DummyVerifier. rustls comes with a default certificate verification utility that will use a certificate root store to verify server certificates. What this means is that rustls has a way of verifying that a server certificate is valid.

In Gemini, we don't need this, and in our basic client we can actually skip TLS validation. To do this we will need to override rustls' default behavior. This is why we enabled dangerous_configuration in our Cargo.toml file.

Once we have our DummyVerifier struct, we add the ability to create new instances of it.

Next is the key part of our override: we implement the ServerCertVerifier trait for our dummy. For Gemini, we simply ignore all the parameters and return Ok(ServerCertVerified::assertion()). This means that regardless of what certificate the server uses, we will always report that it was valid.
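If the trait business feels abstract, the pattern can be shown without any TLS at all. The sketch below uses a made-up CertCheck trait, which is NOT the rustls API, just a simplified stand-in. StrictCheck, AcceptAll and handshake are hypothetical names too.

```rust
/// Stand-in for a verification trait. This is NOT the rustls API, only a
/// simplified, hypothetical shape to show the pattern.
trait CertCheck {
    fn check(&self, cert: &[u8]) -> Result<(), String>;
}

/// Rejects empty certificates; stands in for "real" validation.
struct StrictCheck;

impl CertCheck for StrictCheck {
    fn check(&self, cert: &[u8]) -> Result<(), String> {
        if cert.is_empty() {
            Err("empty certificate".to_string())
        } else {
            Ok(())
        }
    }
}

/// Accepts everything, in the same spirit as our DummyVerifier.
struct AcceptAll;

impl CertCheck for AcceptAll {
    fn check(&self, _cert: &[u8]) -> Result<(), String> {
        Ok(())
    }
}

/// The caller only sees the trait, so either implementation can be swapped in.
fn handshake(checker: &dyn CertCheck, cert: &[u8]) -> bool {
    checker.check(cert).is_ok()
}

fn main() {
    println!("strict accepts empty cert: {}", handshake(&StrictCheck, &[]));
    println!("dummy accepts empty cert: {}", handshake(&AcceptAll, &[]));
}
```

The library calls whatever verifier it was given, so swapping in one that always says yes turns validation off without touching the library itself.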

Voila! We now have a verifier that rustls can use and that will ignore TLS certificate issues.

The next step is to use this verifier with rustls to make our connection.

Buckle in!

...
fn visit(tokens: Vec<&str>) {
    if tokens.len() < 2 {
        println!("Nowhere to visit.");
        return;
    }

    let destination = format!("{}:1965", tokens[1]);
    let dns_request = tokens[1];
    let request = format!("gemini://{}/\r\n", tokens[1]);

    println!("Attempting to visit....{}", destination);

    let mut cfg = rustls::ClientConfig::new();
    let mut config = rustls::DangerousClientConfig {cfg: &mut cfg};
    let dummy_verifier = Arc::new(DummyVerifier::new());
    config.set_certificate_verifier(dummy_verifier);

    let rc_config = Arc::new(cfg);
    let dns_name = webpki::DNSNameRef::try_from_ascii_str(dns_request).unwrap();

    let mut client = rustls::ClientSession::new(&rc_config, dns_name);
    let mut socket = TcpStream::connect(destination).unwrap();

    let mut stream = rustls::Stream::new(&mut client, &mut socket);
    stream.write_all(request.as_bytes()).unwrap();

    while client.wants_read() {
        client.read_tls(&mut socket).unwrap();
        client.process_new_packets().unwrap();
    }
    let mut data = Vec::new();
    let _ = client.read_to_end(&mut data);

    let status =  String::from_utf8_lossy(&data);

    println!("{}", status);

    client.read_tls(&mut socket).unwrap();
    client.process_new_packets().unwrap();
    let mut data = Vec::new();
    let _ = client.read_to_end(&mut data);

    let content =  String::from_utf8_lossy(&data);

    println!("{}", content);
}
...

Let's go over everything line by line.

The command we are entering is visit gemini.circumlunar.space

The first step is to make sure we have a place to visit; this is the if statement.

Next we're going to format our destination in a few ways. The first is the address we connect to: the hostname plus our port number. This is the destination variable.

The next is the name used for the TLS session, which is just the hostname without a port number. This is dns_request.

The last variable we need is a gemini URL, which we're calling request. This adds the gemini:// prefix, the closing /, and the carriage return line feed.
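Here are the three strings side by side, built from a single hostname. This is only a sketch: the Target struct and Target::from_host are hypothetical names, and our client keeps these as three plain variables instead.

```rust
/// The three strings the visit function derives from one hostname.
/// `Target` and `Target::from_host` are hypothetical names for illustration.
struct Target {
    address: String, // "host:1965", the form TcpStream::connect accepts
    host: String,    // bare hostname, used for the TLS server name
    request: String, // the Gemini request: a URL followed by CRLF
}

impl Target {
    fn from_host(host: &str) -> Target {
        Target {
            address: format!("{}:1965", host),
            host: host.to_string(),
            request: format!("gemini://{}/\r\n", host),
        }
    }
}

fn main() {
    let target = Target::from_host("gemini.circumlunar.space");
    println!("address: {}", target.address);
    println!("host:    {}", target.host);
    // {:?} shows the trailing \r\n instead of printing blank lines.
    println!("request: {:?}", target.request);
}
```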

Now we begin the TLS dance.

We set up cfg which is how rustls will do the TLS validation. We create a new configuration. Normally we would add a root store to this config object which rustls will in turn use to validate certificates.

In our case, we need to use DangerousClientConfig to mutate the cfg object into something that can use our custom verifier.

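Next we initialize our dummy_verifier and wrap it in an Arc. Arc is the standard library's atomically reference-counted smart pointer; rustls asks for one so that the verifier can be shared safely. If you know more about why, feel free to leave a comment!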
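For anyone who hasn't met Arc before, here is a small standalone sketch of what it does, using only the standard library. The string being shared is just a placeholder value.

```rust
use std::sync::Arc;
use std::thread;

fn main() {
    // Wrap a value so that several owners can share it.
    let shared = Arc::new(String::from("shared config"));

    // Arc::clone copies the pointer and bumps a counter;
    // the String itself is not copied.
    let for_worker = Arc::clone(&shared);
    println!("owners: {}", Arc::strong_count(&shared));

    // The clone can be moved into another thread because the
    // counter is updated atomically.
    let worker = thread::spawn(move || for_worker.len());
    let length = worker.join().unwrap();
    println!("worker saw {} bytes", length);

    // The original handle is still usable here.
    println!("main still has: {}", shared);
}
```

The value is freed only once the last Arc pointing at it goes away.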

Next we use set_certificate_verifier to override rustls' default certificate verifier.

Now we have a config object that won't bother validating certificates!

Now we can start working on the connection.

The next line wraps cfg in another Arc, which is the form ClientSession expects.

Then we create a DNSNameRef using the dns_request variable we set up earlier. This step seems to be superfluous and there probably is a way to remove the webpki dependency.

Now we will create the TLS connection!

We use ClientSession to start a TLS session using our cfg object and DNSNameRef.

We have our session, next we need our socket! We use TcpStream to connect to our destination variable, this is why we added port number to our destination variable earlier as TcpStream needs a port number to connect to.

At this point we have a socket and we have a TLS session! We are going to marry the two by using the Stream type in rustls.

Now we can use the stream object to write our request variable, converted to bytes. Because the stream wraps both the session and the socket, rustls will encrypt our data and send it to the server in one step!

Now, ideally we would be able to use the stream.read option to simply read the stream for the server's response however for reasons unknown to me, we can't!

This part is a bit hacky, and unfortunately I can't explain it. It is a bit frustrating to not know why something is working or if it'll even work tomorrow. It very much exists on a prayer.

We start by checking our TLS session for data. This appears to take a variable amount of time, and I haven't figured out what it depends on. Because we don't know when the data will come in, we poll the session in a while loop.

Each time we poll, we read from our session and process the packets on it.

process_new_packets works through the TLS messages we just read; during the handshake, this is where our DummyVerifier gets asked about the certificate. Once that is done, we can use read_to_end() to read the decrypted data from our TLS session.

Normally we would unwrap the result, but here read_to_end can return an error, for example when the session has no data to hand back, and unwrap would turn that error into a panic. We don't want the client to panic; we want it to keep checking the stream. This is why the result of read_to_end() goes into an unused variable.
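Throwing the result away works, but it also hides real problems. The sketch below shows a middle ground: keep whatever bytes arrived and hold on to the error so it can be inspected. AbruptReader is a hypothetical stand-in for a connection that ends with an error, and read_all_tolerant is a made-up helper name; neither is part of rustls.

```rust
use std::io::{self, Read};

/// A hypothetical reader that hands out its bytes and then fails,
/// standing in for a connection that ends with an error.
struct AbruptReader {
    data: Vec<u8>,
    pos: usize,
}

impl Read for AbruptReader {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        if self.pos >= self.data.len() {
            return Err(io::Error::new(io::ErrorKind::ConnectionAborted, "closed"));
        }
        let n = (self.data.len() - self.pos).min(buf.len());
        buf[..n].copy_from_slice(&self.data[self.pos..self.pos + n]);
        self.pos += n;
        Ok(n)
    }
}

/// Reads everything it can and returns the bytes plus the error, if any.
fn read_all_tolerant<R: Read>(reader: &mut R) -> (Vec<u8>, Option<io::Error>) {
    let mut data = Vec::new();
    // read_to_end appends the bytes it read before an error occurred,
    // so `data` is still useful when the call fails.
    let error = reader.read_to_end(&mut data).err();
    (data, error)
}

fn main() {
    let mut reader = AbruptReader { data: b"20 text/gemini\r\n".to_vec(), pos: 0 };
    let (data, error) = read_all_tolerant(&mut reader);
    println!("got {} bytes", data.len());
    if let Some(e) = error {
        println!("read ended with: {}", e);
    }
}
```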

Then we read the data we received and convert it to a string using from_utf8_lossy.

The first part is the status that the Gemini server will send. For now we'll only worry about successful requests, this means that we will get a response body following our status. We then read from our session once more, decrypt the data using process_new_packets() and then we read the data into a variable.
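Our client relies on the status and the body arriving in two separate reads. A more defensive approach, sketched here, is to take all the bytes and split them at the first carriage return line feed: a Gemini response header is a two digit status, a space, a meta string, and then CRLF. split_response is a hypothetical helper and not something our client uses yet.

```rust
/// Splits a raw Gemini response into (status, meta, body).
/// `split_response` is a hypothetical helper. It returns None when there is
/// no CRLF-terminated header or the status is not two ASCII digits.
fn split_response(raw: &[u8]) -> Option<(u8, String, &[u8])> {
    // The header ends at the first CRLF.
    let end = raw.windows(2).position(|pair| pair == b"\r\n")?;
    let header = std::str::from_utf8(&raw[..end]).ok()?;

    // The first two characters are the status code.
    let code = header.get(..2)?;
    if !code.bytes().all(|b| b.is_ascii_digit()) {
        return None;
    }
    let status: u8 = code.parse().ok()?;

    // Whatever follows the status (after a space) is the meta string.
    let meta = header.get(2..)?.trim_start().to_string();

    // Everything after the CRLF is the response body.
    Some((status, meta, &raw[end + 2..]))
}

fn main() {
    let raw = b"20 text/gemini\r\n# Hello from gemini space\n";
    match split_response(raw) {
        Some((status, meta, body)) => {
            println!("status: {}", status);
            println!("meta:   {}", meta);
            println!("body:   {}", String::from_utf8_lossy(body));
        }
        None => println!("Not a valid response header."),
    }
}
```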

We now have the response body as well!

Voila! We have the tiniest hint of a real working Gemini client now! That was a slog so feel free to bask in it. I apologize that the above explanation isn't very good and may be incorrect; it is based on how I think TLS and sockets work, and the fact that it works is a bit of a fluke. I would love to tighten this up. The code feels brittle, but that's okay!

Now if you play with the client a little, you'll quickly find that it isn't very useful: we are stuck going only to the base level of gemini spaces right now. What we need to do, and will do in the next chapter, is handle URLs! Once we can do that, we'll have the bare minimum needed to traverse gemini space!

See you soon.
