My Journey Building a WebSocket Server in Go

hiro_111

hiro

Posted on February 24, 2024

A favorite YouTuber of mine once said that the best way to learn a new programming language is to build projects with it.

Inspired by this video, I decided to dive in and create my own WebSocket server using Go.

(Plenty of articles already explain what the WebSocket protocol is, so I'll leave that to them.)

Reading RFC 6455

Before starting, my knowledge of the WebSocket protocol was pretty limited:

  • It's built on top of HTTP.
  • Communication starts with a client sending an HTTP "Upgrade" request to switch to WebSocket.
  • The server responds, and boom! A WebSocket connection is established.

Not knowing much else, I decided to consult the source: RFC 6455. This is where I learned that WebSocket really only uses HTTP for the initial handshake. After that, the client and server exchange binary data frames, as defined by the RFC, directly over the same TCP connection.

The Challenge: Mixing HTTP and TCP

This is where things got tricky. Normally in Go, an HTTP handler only sees http.Request and http.ResponseWriter; it doesn't get direct access to the underlying TCP connection. So, how do we combine HTTP and raw TCP communication in our server?

One straightforward approach is running a separate TCP server:

[Figure: separated servers]

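// Here, conn is a raw TCP connection (net.Conn)
// ...
const delimiter = "\r\n" // CR LF, the HTTP line terminator
buf := make([]byte, 2048)
// NOTE: a single Read may return only part of the request. That's fine for a demo,
// but real code should keep reading until the blank line that ends the headers.
n, err := conn.Read(buf)
if err != nil {
    log.Println("error reading payload to buffer:", err)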
    return
}
payload := strings.Split(string(buf[:n]), delimiter)
reqLine := strings.Split(payload[0], " ")
// validate HTTP method, version
headers := getHTTPHeaders(payload[1:])
// check Host, Upgrade, Connection, and WebSocket version headers
// derive the Sec-WebSocket-Accept value from the client's Sec-WebSocket-Key
key := hash(headers["Sec-WebSocket-Key"])
// respond
conn.Write([]byte("HTTP/1.1 101 Switching Protocols\r\nUpgrade: websocket\r\n..."))
fmt.Println("=== handshake done! ===")

// from now on, the server and client can exchange WebSocket data frames.
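
The hash step in the snippet above is left undefined. Per RFC 6455, the server derives the Sec-WebSocket-Accept value by appending a fixed GUID to the client's Sec-WebSocket-Key, taking the SHA-1 digest, and base64-encoding the result. Here is a minimal sketch, assuming a hypothetical acceptKey helper standing in for hash (the name and signature are my own, not from the original code):

```go
package main

import (
	"crypto/sha1"
	"encoding/base64"
	"fmt"
)

// wsGUID is the fixed GUID defined by RFC 6455 for the opening handshake.
const wsGUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"

// acceptKey computes the Sec-WebSocket-Accept value for a given
// Sec-WebSocket-Key: base64(SHA-1(clientKey + GUID)).
func acceptKey(clientKey string) string {
	sum := sha1.Sum([]byte(clientKey + wsGUID))
	return base64.StdEncoding.EncodeToString(sum[:])
}

func main() {
	// Sample key taken from the handshake example in RFC 6455.
	fmt.Println(acceptKey("dGhlIHNhbXBsZSBub25jZQ=="))
	// prints: s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
}
```

The client performs the same computation and compares the result with the response header, which is how it confirms the server actually understood the WebSocket handshake.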

This is great for learning, as I got to manually construct HTTP responses over TCP. However, it's not the most practical solution. This is where Go's HTTP Hijack API saves the day!

Hijacking for the Win!

Go's http.Hijacker interface lets us pull the raw TCP connection right out of an HTTP handler's ResponseWriter, which is perfect for our use case. Even popular libraries like Gorilla WebSocket use this technique under the hood.

This means we can have our cake and eat it too - a single server handling both HTTP and WebSocket traffic:

[Figure: combined server]

Here's how the Hijack API fits into my WebSocket server:

// after validating the HTTP headers in http.Request...
hj, ok := w.(http.Hijacker)
if !ok {
    // not every ResponseWriter supports hijacking (HTTP/2 ones intentionally don't)
    http.Error(w, "hijacking not supported", http.StatusInternalServerError)
    return
}
conn, _, err := hj.Hijack()
if err != nil {
    fmt.Println("error hijacking http response writer:", err)
    http.Error(w, "internal server error", http.StatusInternalServerError)
    return
}
// After Hijack, net/http no longer manages the connection,
// so closing it is our responsibility.
defer conn.Close()
if _, err := conn.Write([]byte(WSHandshakeResponse(key))); err != nil {
    fmt.Println("error writing handshake response:", err)
    return
}
fmt.Println("=== handshake done! ===")

// from now on, the server and client can exchange WebSocket data frames.
// ...


That's the Handshake!

That's a whirlwind tour of the WebSocket handshake process. If I have the time, I'll follow up with an article about how to implement WebSocket data frames in Go.

Thanks for reading ✌️
