Essential guide to WebSocket authentication

bookercodes

Alex Booker

Posted on December 13, 2023

Authenticating WebSocket connections from the browser is a lot trickier than it should be.

Cookie authentication isn’t suitable for every app, and the WebSocket browser API makes it impossible to set an Authorization header with a token.

It’s actually all a bit of a mess!

That is the last thing you want to hear when it comes to security, so I’ve done the research to present this tidy list of methods to send credentials from the browser.

The challenge with WebSocket authentication

Even though WebSocket and HTTP are completely separate protocols, every WebSocket connection begins with an HTTP handshake.

The WebSocket specification does not prescribe any particular way to authenticate a WebSocket connection once it’s established but suggests you authenticate the HTTP handshake before establishing the connection:

This protocol doesn’t prescribe any particular way that servers can authenticate clients during the WebSocket handshake. The WebSocket server can use any client authentication mechanism available to a generic HTTP server, such as cookies or HTTP authentication.

HTTP has dedicated fields for authentication (like the Authorization header), and you likely have a standard way to authenticate HTTP requests already so this makes a lot of sense on the surface!

Surprisingly, though, the WebSocket browser API doesn’t allow you to set arbitrary headers, like Authorization, on the HTTP handshake. 😱

HTTP cookie authentication is an option, but, as you will see in this post, it’s not always suitable, and potentially even vulnerable to CSRF.

In this post, I’ll outline your options to work around this remarkable limitation of modern browsers to securely and reliably send credentials to the server.

Authentication methods for securing WebSocket connections

Send access token in the query parameter

One of the simplest methods to pass credentials from the client to a WebSocket server is to pass the access token via the URL like this:

wss://website.com?token=your_token_here

Then, on the server, you can authenticate the request.

Here’s an example with Node:

import { createServer } from 'http'
import { WebSocketServer } from 'ws'
import { parse } from 'url'

const PORT = 8000
const server = createServer()
// noServer: Tells WebSocketServer not to create an HTTP server 
// but to instead handle upgrade requests from the existing 
// server (above).
const wsServer = new WebSocketServer({ noServer: true })

const authenticate = request => {
    const { token } = parse(request.url, true).query
    // TODO: Actually authenticate token
    if (token === "abc") {
        return true
    }
}

server.on('upgrade', (request, socket, head) => {

    const authed = authenticate(request)

    if (!authed) {
        // \r\n\r\n: These are control characters used in HTTP to
        // denote the end of the HTTP headers section.
        socket.write('HTTP/1.1 401 Unauthorized\r\n\r\n')
        socket.destroy()
        return
    }

    wsServer.handleUpgrade(request, socket, head, connection => {
        // Manually emit the 'connection' event on a WebSocket 
        // server (we subscribe to this event below).
        wsServer.emit('connection', connection, request)
    })
})

wsServer.on('connection', connection => {
    console.log('connection')

    connection.on('message', bytes => {
        // %s: Convert the bytes (buffer) into a string using
        // utf-8 encoding.
        console.log('received %s', bytes)
    })
})

server.listen(PORT, () => console.log(`Server started on port ${PORT}`))

If the token is valid, you move ahead with the upgrade. Otherwise, send a standard 401 “Unauthorized” response code and close the underlying socket.

This approach is easy to reason about and it only takes a few lines of code to implement on the client.
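As a rough illustration of the client side, the sketch below builds the connection URL with the standard `URL` API so the token is percent-encoded properly. The helper name `buildSocketUrl` is hypothetical, and the host and token are the same placeholders used above.

```javascript
// Hypothetical helper: append the access token as a query parameter.
// URLSearchParams percent-encodes the value, so tokens containing
// characters like "&" or "+" survive the round trip.
const buildSocketUrl = (baseUrl, token) => {
  const url = new URL(baseUrl)
  url.searchParams.set('token', token)
  return url.toString()
}

const socketUrl = buildSocketUrl('wss://website.com', 'your_token_here')

// In the browser, you would then open the connection:
// const ws = new WebSocket(socketUrl)
```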

The downside we need to explore is the security implication of encoding the token in the query string in this way.

Some developers on public forums will argue this isn’t so bad:

  • When you use TLS (WSS), query strings are encrypted in transit.
  • Compared to HTTP, WebSocket URLs aren’t really exposed to the user. Users can't bookmark or copy-and-paste them. This minimizes the risk of accidental sharing.

However, it is important to acknowledge query parameters will still show up in plaintext on the server where they will likely get logged.

Even if your code doesn’t, the framework or cloud host likely will.

This is precarious because logs can leak, through error messages for example (accidental information disclosure).

Should a malicious actor attain access to the logs, they would have access to all the data and functionality that the user behind the WebSocket connection has.

In the next section, let’s explore an evolution of this method that’s more secure, albeit more work to implement.

Send an ephemeral access token in the query parameter

As we covered in the section above, sending your main access token in the query parameter is not sufficiently secure because it might be logged in plaintext on the server.

To dramatically reduce the risk, we could use the main access token to request an ephemeral single-use token from an authentication service, then send that short-lived token in the query parameter.

This way, by the time the token is logged on the server, it will likely be useless since it will either have already been used or expired.

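The basic flow looks like this: the client uses its main access token to request an ephemeral token from the authentication service, then opens the WebSocket connection with that ephemeral token in the query parameter, and the server validates the token before accepting the connection.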

While inherently more secure than sending the main access token, you now need to implement a custom and stateful authentication service specifically for WebSockets, which is a pretty significant downside considering we first started exploring sending the token in the query parameter because of its convenience!

Send access token over WebSocket

Another option to authenticate a WebSocket connection is to send credentials in the first message post-connection.

The server must then validate the token before allowing the client to do anything else.

Here’s a contrived server implementation I wrote with Node to illustrate the model:

import { WebSocketServer } from 'ws'
import { createServer } from 'http'
import { randomUUID } from 'crypto'

const server = createServer()
const wsServer = new WebSocketServer({ server })

const PORT = 8000
const connections = {}

const authenticate = token => {
    // TODO: Actually authenticate token
    if (token === "abc") {
        return true
    }
}

const handleMessage = (bytes, uuid) => {
  const message = JSON.parse(bytes.toString())
  const connection = connections[uuid]

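  if (message.type === "authenticate") {
    connection.authenticated = authenticate(message.token)
    // Terminate straight away if the token is invalid.
    if (!connection.authenticated) {
      connection.terminate()
    }
    return
  }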

  if (connection.authenticated) {
    // Process the message
  } else {
    connection.terminate()
  }
}

const handleClose = uuid => delete connections[uuid]

wsServer.on('connection', (connection, request) => {
  const uuid = randomUUID()
  connections[uuid] = connection

  connection.on('message', message => handleMessage(message, uuid))
  connection.on('close', () => handleClose(uuid))
});

server.listen(PORT, () => console.log(`Server started on port ${PORT}`))


If the token is invalid, the server terminates the connection. Otherwise, it tracks the connection as “authenticated” and processes subsequent messages.
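On the client, the matching step is to send the credentials as soon as the connection opens. A minimal sketch, assuming the `{ type: "authenticate", token }` message shape used by the server above; the helper name `authenticateOnOpen` is hypothetical:

```javascript
// Hypothetical helper: send the token as the very first message.
// `socket` is expected to be a browser WebSocket (or anything exposing
// the same addEventListener/send interface).
const authenticateOnOpen = (socket, token) => {
  socket.addEventListener('open', () => {
    socket.send(JSON.stringify({ type: 'authenticate', token }))
  })
}

// In the browser:
// const ws = new WebSocket('wss://example.com/path')
// authenticateOnOpen(ws, 'your_token_here')
```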

When implemented correctly, this method is secure. However, it requires you to define your own custom authentication protocol and implement it securely and correctly.

Sending credentials over WebSockets in this way has some other downsides:

  • Implementing a custom stateful protocol will increase complexity. The example above looks simple but in reality you now need to manage session lifetimes, handle synchronization issues when you scale, and deal with potential inconsistencies. Should something go wrong, your users might not be able to log in or, worse yet, you might introduce a security vulnerability.

  • You become vulnerable to DoS attacks. With this method, anyone can open a WebSocket connection. An attacker might open a bunch of WebSocket connections and refuse to authenticate, tying up server resources like memory indefinitely, potentially overloading your server until it becomes sluggish. To counteract this, you’ll need to implement rigorous timeouts, further contributing to the complexity compared to HTTP-based authentication methods.
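The timeout mentioned in the last point could be sketched roughly like this. It assumes the `authenticated` flag from the server example above and a connection object with `on` and `terminate` methods, as provided by the ws library. The helper name `enforceAuthTimeout` and the 5-second default are hypothetical.

```javascript
// Assumption: 5 seconds is long enough for a legitimate client to authenticate.
const AUTH_TIMEOUT_MS = 5000

// Hypothetical helper: drop the connection if it has not authenticated in time.
const enforceAuthTimeout = (connection, timeoutMs = AUTH_TIMEOUT_MS) => {
  const timer = setTimeout(() => {
    if (!connection.authenticated) {
      connection.terminate()
    }
  }, timeoutMs)

  // Don't leave the timer running once the connection has gone away.
  connection.on('close', () => clearTimeout(timer))
}
```

You would call `enforceAuthTimeout(connection)` inside the `connection` handler, right after storing the connection. A timeout alone does not cap how many connections one client can open, so it is only one part of the mitigation.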

Send credentials in an HTTP cookie

The WebSocket handshake is done with a standard HTTP request and response cycle, which supports cookies, allowing you to authenticate the request.
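As a rough sketch of what that looks like on the server, the upgrade request carries the usual `Cookie` header, which you can read before deciding whether to proceed. The cookie name `sessionId` and both helper names are assumptions for illustration. In practice you would likely reuse whatever cookie or session library your HTTP stack already relies on.

```javascript
// Hypothetical helper: turn a Cookie header into a plain object.
// Deliberately simplistic (no decoding or signature checks).
const parseCookies = (cookieHeader = '') =>
  Object.fromEntries(
    cookieHeader
      .split(';')
      .map(pair => pair.trim())
      .filter(Boolean)
      .map(pair => {
        const index = pair.indexOf('=')
        return index === -1
          ? [pair, '']
          : [pair.slice(0, index), pair.slice(index + 1)]
      })
  )

// Assumption: the session identifier lives in a cookie named "sessionId".
const sessionIdFromRequest = request =>
  parseCookies(request.headers.cookie).sessionId ?? null

// Inside server.on('upgrade', ...), you could then look the session up
// and reject the handshake with a 401 if it is missing or invalid.
```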

Authentication using cookies has been widely adopted since the early days of the internet. It's a reliable method that offers security you can trust. However, there are some limitations you should be aware of:

  • Not suitable if your WebSocket server is hosted on a different domain. If your WebSocket server is on a different domain than your web app, the browser will not send the authentication cookies to the WebSocket server, which makes the authentication fail.

  • Vulnerable to CSRF. The browser does not enforce a Same-Origin Policy for the WebSocket handshake like it would an ordinary HTTP request. A malicious website badwebsite.com could open a connection to yourwebsite.com and the browser will happily send along the authentication cookie, creating an opportunity for badwebsite.com to send and receive messages on the user’s behalf, unbeknown to them. To prevent this, it’s essential that the server checks the Origin header of the request before allowing it. Alternatively, you may choose to implement a CSRF token.
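The Origin check described in the last point can be as small as an allowlist lookup during the upgrade. A minimal sketch, where the allowlist contents and the helper name `isAllowedOrigin` are assumptions:

```javascript
// Assumption: the only page allowed to connect is served from this origin.
const ALLOWED_ORIGINS = new Set(['https://yourwebsite.com'])

// Browsers set the Origin header on the WebSocket handshake automatically,
// and page scripts cannot override it.
const isAllowedOrigin = request =>
  ALLOWED_ORIGINS.has(request.headers.origin)

// Inside server.on('upgrade', ...), before authenticating the cookie:
// if (!isAllowedOrigin(request)) {
//   socket.write('HTTP/1.1 403 Forbidden\r\n\r\n')
//   socket.destroy()
//   return
// }
```

Keep in mind that clients other than browsers can send any Origin value they like, so this check protects your users from malicious web pages but is not an authentication mechanism in itself.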

Send credentials with the Sec-WebSocket-Protocol header

While the WebSocket browser API doesn’t let you set arbitrary headers like Authorization, it does allow you to set a value for the Sec-WebSocket-Protocol header, creating an opportunity to smuggle the token in the request header!

In vanilla JavaScript, the client WebSocket code might look like this:

const ws  = new WebSocket(
  "wss://example.com/path", 
  ["Authorization", "your_token_here"]
)

And with a library like React useWebSocket, something like this:

const { sendMessage, lastMessage } = useWebSocket("wss://example.com/path", {
  protocols: ["Authorization", "your_token_here"]
})

The Sec-WebSocket-Protocol header is designed to negotiate a subprotocol, not carry authentication information, but some developers, including me and those behind Kubernetes, are asking “why not?”

You might be wondering what the downside of this neat workaround is. Every option in this list so far has a downside, and setting Sec-WebSocket-Protocol is no exception:

  • The token might get logged in plaintext on the server. Because Sec-WebSocket-Protocol is not designed to carry authentication tokens, they may end up in log files unintentionally as part of standard logging of WebSocket protocol negotiation, thus causing potential security risks.

  • You might experience unexpected behavior. It’s also important to acknowledge that such use of Sec-WebSocket-Protocol isn't standardized, meaning libraries, tooling, and middleware might not handle this kind of logic gracefully. For simple apps, this is unlikely to cause a problem, however, in a sufficiently complex system with multiple components, this could cause unexpected behavior including security issues.
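On the server, the values passed by the client arrive as a single comma-separated Sec-WebSocket-Protocol header. Here is a minimal sketch of pulling the token back out, assuming the `["Authorization", token]` convention from the client examples above; the helper name `tokenFromProtocolHeader` is hypothetical.

```javascript
// Hypothetical helper: extract the token smuggled in Sec-WebSocket-Protocol.
// Node lowercases incoming header names, hence the lowercase key.
const tokenFromProtocolHeader = request => {
  const header = request.headers['sec-websocket-protocol'] ?? ''
  const [marker, token] = header.split(',').map(value => value.trim())
  return marker === 'Authorization' && token ? token : null
}

// Inside server.on('upgrade', ...):
// const token = tokenFromProtocolHeader(request)
// if (!token || !authenticate(token)) { /* reject with 401 */ }
```

Two caveats apply. Subprotocol values are restricted to the characters allowed in an HTTP token, so a credential containing characters such as `=` or `/` would need to be encoded first. Some browsers may also fail the connection unless the server echoes back one of the offered values in its handshake response, so check how your WebSocket library selects a subprotocol.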

Send credentials with basic access authentication

Some posts out there suggest an outdated trick whereby you encode the username and password in the WebSocket URL:

const ws = new WebSocket("wss://username:password@example.com")

Under the hood, the browser will pull these out to add a basic access authentication header:

Authorization: Basic dXNlcm5hbWU6cGFzc3dvcmQ=

Apart from the fact that basic authentication is limited to a username and password (and you probably want to send a token), this method has always suffered from inconsistent browser support and is now totally deprecated.

Don’t use it. I’m only including a brief note about it here for completeness.

Forget about WebSocket authentication (mostly) with Ably

So far, we’ve weighed the benefits and limitations of each approach. However, you could just use a library that handles it all for you under the hood.

With Ably, authentication is a solved problem.

Ably is a realtime infrastructure API that makes it trivial to add realtime features to your apps compared to if you used WebSockets directly.

Apart from being easy to get started with, Ably handles authentication for you in a secure way, allowing you to focus on the features that really matter to your users.

Instead of leaving you to deliberate over the best authentication mechanism and shoulder all that responsibility even if you’re not a security expert, Ably provides “white spots” for you to fill in with code that connects to your database to authenticate the user.

The token management part, including refresh tokens and permissions, is all handled for you.

Learn more about how Ably can help you build impressive realtime features at scale here or create a free account and play around for yourself.

Conclusion

In this post, we explored several methods to send credentials from the web browser to a WebSocket server.

It would have been nice if I could recommend one go-to method but, as will now be evident to you, each method has advantages and disadvantages that must be considered relative to your project.

Here’s a quick summary for future reference:

  • Query parameter: Sending the credentials in a query parameter is dead easy, but it introduces a risk of the credentials being logged in plaintext on the server, and that could be risky! A homemade authentication service that issues ephemeral tokens for use in the query parameter would improve the security greatly. However, for many, that is a bridge too far.

  • WebSocket connection: Sending the credentials over the WebSocket connection is worth considering. However, you usually end up implementing your own authentication protocol that is finicky to maintain, potentially vulnerable to DoS attacks, and doesn’t play well with anything else.

  • Cookies: Cookie authentication is attractive due to its reliability and ease of implementation. However, it may not be compatible with your system design.

  • Sec-WebSocket-Protocol: Smuggling the token in the Sec-WebSocket-Protocol is a stroke of genius, and, if it’s good enough for Kubernetes, it might be good enough for you! At the same time, misusing the header in this way might lead to unexpected behavior. User BatteryAcid on StackOverflow summed it up pretty well when they wrote - "I implemented this and it works - just feels weird. thanks" 😂

Of course, if this all sounds like a headache, you might consider Ably. Apart from solving the authentication problem, Ably provides additional features you’d need to implement on top of WebSockets, like Presence and message queues, and provides production guarantees that will be time-consuming or costly to achieve on your own, like a 99.999% uptime guarantee, exactly-once delivery, and guaranteed message ordering.
