Skymood - Watch Bluesky's heartbeat through emojis in real-time 🌟
Sebastian Korfmann
Posted on November 18, 2024
This is a brief background story of how I built Skymood over the course of two evenings.
Since becoming more active on Bluesky again, I started looking into how Bluesky is built behind the scenes. The backbone that ties everything together is the Firehose, a full stream of events (posts, likes, follows, handle changes, etc.). While that's one of the coolest aspects of Bluesky and ATProto, it's also a ton of data (in the realm of ~50 GB/day as of today) that has to be transferred and processed, since it's an all-or-nothing approach with no filtering option.
However, there's another option which was released a few weeks ago: Jetstream. In contrast to the Firehose, it allows filtering by Collection NSIDs and Repositories. This means we can filter for, e.g., all posts, either globally or scoped to a set of given user IDs (DIDs). That sounds pretty intriguing, doesn't it?
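To make the filtering a bit more concrete, here's a minimal sketch of building a filtered subscribe URL. It assumes the filters are passed as repeated `wantedCollections` and `wantedDids` query parameters on the public endpoint used later in this post. The helper name `buildJetstreamUrl` and the DID `did:plc:example` are placeholders for illustration.

```typescript
// Hypothetical helper: builds a Jetstream subscribe URL with optional filters.
// Assumes the `wantedCollections` / `wantedDids` query parameters.
const JETSTREAM_ENDPOINT = "wss://jetstream2.us-east.bsky.network/subscribe";

interface JetstreamFilter {
  collections?: string[]; // Collection NSIDs, e.g. "app.bsky.feed.post"
  dids?: string[];        // Repositories (user DIDs) to scope the stream to
}

function buildJetstreamUrl(filter: JetstreamFilter = {}): string {
  const url = new URL(JETSTREAM_ENDPOINT);
  for (const nsid of filter.collections ?? []) {
    url.searchParams.append("wantedCollections", nsid);
  }
  for (const did of filter.dids ?? []) {
    url.searchParams.append("wantedDids", did);
  }
  return url.toString();
}

// All posts, globally:
console.log(buildJetstreamUrl({ collections: ["app.bsky.feed.post"] }));

// Posts from a single (placeholder) account only:
console.log(
  buildJetstreamUrl({
    collections: ["app.bsky.feed.post"],
    dids: ["did:plc:example"],
  }),
);
```

Leaving both lists empty yields the plain endpoint without any filter.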
As one does, I started to play around with just getting a bunch of posts and looking for a way to demonstrate what's possible. Watching the stream of posts fly by, I thought it would be nice to get some kind of moodboard based on the emojis. One prompt later:
Well, that's promising, so let's evolve that into a website we can put on the Internet.
Literally Serverless
The Jetstream Websocket endpoint is open and can be connected to from any browser. So technically, there's no reason why a server would be required. And sure enough, consuming the Jetstream filtered for posts is totally doable.
Just give it a try in your terminal
$ websocat wss://jetstream2.us-east.bsky.network/subscribe\?wantedCollections=app.bsky.feed.post
So I went ahead and threw together a quick React SPA using Waku, which subscribed straight to the Jetstream Websocket from the client side. The prototype was done pretty quickly. The only noticeable hurdle was that the Jetstream SDK depends on node:events. Rather than trying to work around that, it seemed a lot simpler to just go with a plain Websocket implementation.
const ws = new WebSocket('wss://jetstream2.us-west.bsky.network/subscribe?wantedCollections=app.bsky.feed.post');
ws.addEventListener('message', async (event) => {
  console.log(event)
})
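Each message arrives as a JSON string in `event.data`. Here's a minimal sketch of pulling the post text out of it, assuming the Jetstream commit event layout (`kind: "commit"`, with the record nested under `commit.record`). The interface is partial, `extractPostText` is a hypothetical helper name, and the field layout is worth double-checking against the Jetstream documentation.

```typescript
// Partial, assumed shape of a Jetstream event; only the fields used here.
interface JetstreamEvent {
  did: string;
  kind: string;
  commit?: {
    operation: string;
    collection: string;
    rkey: string;
    record?: { text?: unknown };
  };
}

// Hypothetical helper: returns the text of a newly created post, or null
// for anything else (deletes, other collections, malformed JSON).
function extractPostText(raw: string): string | null {
  let event: JetstreamEvent | null;
  try {
    event = JSON.parse(raw);
  } catch {
    return null;
  }
  if (!event || event.kind !== "commit" || !event.commit) return null;
  const commit = event.commit;
  if (commit.operation !== "create") return null;
  if (commit.collection !== "app.bsky.feed.post") return null;
  const text = commit.record?.text;
  return typeof text === "string" ? text : null;
}

// Made-up sample event for illustration:
const sample = JSON.stringify({
  did: "did:plc:example",
  kind: "commit",
  commit: {
    operation: "create",
    collection: "app.bsky.feed.post",
    rkey: "abc",
    record: { text: "Good morning 🥰" },
  },
});
console.log(extractPostText(sample)); // "Good morning 🥰"
```

In the listener above, that would be called as `extractPostText(event.data)`.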
I was kind of expecting some performance issues, but it was rather smooth. The only small optimization was to debounce rendering, so that not every single new emoji triggers a re-render.
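The debouncing code isn't shown in this post. One way to get the same effect is to collect incoming emojis in a buffer and hand them to the UI in batches. Strictly speaking, the sketch below is throttled batching rather than a classic debounce. `EmojiBatcher`, the 250 ms default and the callback are illustrative assumptions, not the code behind Skymood.

```typescript
// Hypothetical batcher: counts emojis as they arrive and flushes the
// accumulated counts at most once per interval, so the UI re-renders per
// batch rather than per emoji.
class EmojiBatcher {
  private pending = new Map<string, number>();
  private timer: ReturnType<typeof setTimeout> | null = null;
  private onFlush: (counts: Map<string, number>) => void;
  private intervalMs: number;

  constructor(
    onFlush: (counts: Map<string, number>) => void,
    intervalMs = 250, // assumed value; tune to taste
  ) {
    this.onFlush = onFlush;
    this.intervalMs = intervalMs;
  }

  add(emoji: string): void {
    this.pending.set(emoji, (this.pending.get(emoji) ?? 0) + 1);
    // Schedule a flush only if none is scheduled yet.
    if (this.timer === null) {
      this.timer = setTimeout(() => this.flush(), this.intervalMs);
    }
  }

  flush(): void {
    if (this.timer !== null) {
      clearTimeout(this.timer);
      this.timer = null;
    }
    if (this.pending.size === 0) return;
    const counts = this.pending;
    this.pending = new Map();
    this.onFlush(counts);
  }
}
```

In a React component, `onFlush` would be the place to merge the counts into state, which results in one state update per interval no matter how many emojis arrived in the meantime.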
Deployed this to Cloudflare and pretty much done. Well, not so fast. Let's look at the consumed bandwidth.
$ websocat wss://jetstream2.us-east.bsky.network/subscribe\?wantedCollections=app.bsky.feed.post | pv > /dev/null
Depending on the time of the day, this is currently using between ~ 60 KB/s and 150 KB/s for the global posts stream - and it's gonna increase by the day. Not really an issue on a fast, unmetered fibre connection, but on mobile it might be another story. Last but not least, it feels wrong to let the client do all the expensive work while potentially increasing the bandwidth & resources bill for the Bluesky team. Back to the drawing board.
Cloudflare Durable Objects
Cloudflare's Durable Objects had been on my list to try for a long time, and handling Websockets is one of their major use cases. Rewriting the React client-side handling was just a few prompts away and I was good to go.
The main changes were to subscribe to Jetstream as an upstream connection, match for emojis and publish the emojis to subscribed clients.
// Track connected WebSocket clients and their emoji filters
const clients = new Map<WebSocket, Set<string>>();

// When receiving a message from the data source
ws.addEventListener('message', async (event) => {
  try {
    const data = JSON.parse(event.data as string);
    const postText = data.text.toLowerCase();

    // Extract unique emojis from the text
    const emojiRegex = /[\p{Emoji_Presentation}\p{Extended_Pictographic}]/gu;
    const emojis = [...new Set(postText.match(emojiRegex) || [])];

    // Skip if no emojis found
    if (emojis.length === 0) return;

    // Notify all connected clients
    for (const [client, filters] of clients) {
      // Send the list of emojis found
      client.send(JSON.stringify({
        type: 'emojis',
        emojis
      }));

      // If client has emoji filters and post contains matching emoji,
      // send the full post details
      if (filters.size > 0 && emojis.some(emoji => filters.has(emoji))) {
        client.send(JSON.stringify({
          type: 'post',
          text: data.text,
          emojis,
          timestamp: Date.now()
        }));
      }
    }
  } catch (error) {
    console.error('Error processing message:', error);
  }
});
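The emoji matching is the core of the whole thing, so here it is pulled out into a standalone function that's easy to poke at. It uses the same regex as above; `extractEmojis` is just a name picked for this sketch.

```typescript
// One match per code point that is either rendered as an emoji by default
// (Emoji_Presentation) or part of the Extended_Pictographic set.
const EMOJI_REGEX = /[\p{Emoji_Presentation}\p{Extended_Pictographic}]/gu;

// Hypothetical helper: unique emojis of a text, in order of first appearance.
function extractEmojis(text: string): string[] {
  return [...new Set(text.match(EMOJI_REGEX) ?? [])];
}

console.log(extractEmojis("Good morning have a lovely day 🥰")); // ["🥰"]
console.log(extractEmojis("so good 🥰🥰 ✨"));                    // ["🥰", "✨"]
console.log(extractEmojis("no emojis here"));                     // []
```

One caveat: the pattern matches single code points, so variation selectors are dropped and combined emojis (ZWJ sequences such as family emojis, or skin tone variants) are reported as their individual parts. For a moodboard that's arguably fine, but it's worth knowing when filtering for a specific emoji.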
And with that, we were down to somewhere between 0.5 and 3 KB/s for a client connection, while the server does the heavy lifting only once. That's a lot better!
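To put those numbers in perspective, here's the back-of-the-envelope arithmetic for a client that stays connected for a full day. It uses the rates measured above and assumes they're sustained around the clock (they aren't, really, so treat the result as an order of magnitude).

```typescript
// Back-of-the-envelope: sustained KB/s -> GB per day (decimal units).
const SECONDS_PER_DAY = 24 * 60 * 60; // 86,400

function gbPerDay(kbPerSecond: number): number {
  return (kbPerSecond * SECONDS_PER_DAY) / 1e6;
}

const measuredRates: Record<string, number> = {
  "direct Jetstream (low)": 60,
  "direct Jetstream (high)": 150,
  "via relay (low)": 0.5,
  "via relay (high)": 3,
};

for (const [label, kbps] of Object.entries(measuredRates)) {
  console.log(`${label}: ${gbPerDay(kbps).toFixed(2)} GB/day`);
}
// direct Jetstream (low): 5.18 GB/day
// direct Jetstream (high): 12.96 GB/day
// via relay (low): 0.04 GB/day
// via relay (high): 0.26 GB/day
```

So a tab left open all day goes from several gigabytes to a few hundred megabytes at most.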
However, while the Durable Object was doing OK, it seemed a bit slow (gut feeling, no data) and was close to the maximum memory of 128 MB (see limits). What to do? So far it's a single Durable Object. Ideas from the Cloudflare forum / Discord were along the lines of sharding, i.e. introducing a few Durable Objects for client handling while a single one does the upstream processing. That's a lot of complexity right there. I'm sure there are better ways; if anyone at Cloudflare reads this, I'd be more than happy to iterate on my approach. But for now I just wanted to ship it. So off to the next chapter.
Pivot: Bun.js
Bun.js is another thing I had wanted to look into for a while. In the back of my head it's categorized as Node.js, but more performant. It turned out that Bun has a custom Websocket server baked in, which claims 7x more throughput compared to Node.js and ws. I haven't verified those claims, but that's enough of an excuse to use it in this case.
Again, a few prompts later the new version was up and running, deployed to Fly.io on a rather small machine in Frankfurt, Germany. Let's see:
$ websocat wss://skymood-bun.fly.dev
{"type":"clientCount","count":1}
{"type":"emojis","emojis":["😘"]}
{"type":"emojis","emojis":["🧐"]}
{"type":"emojis","emojis":["🙏"]}
{"type":"emojis","emojis":["😀"]}
{"type":"emojis","emojis":["😂"]}
...
and the filtered version
$ (echo '{"type":"filter","emoji":"🥰"}'; cat) | websocat wss://skymood-bun.fly.dev
{"type":"emojis","emojis":["😊"]}
{"type":"emojis","emojis":["‼","✨"]}
{"type":"emojis","emojis":["🥰"]}
{"type":"post","text":"Good morning have a lovely day 🥰","url":"https://bsky.app/profile/did:plc:q4st6fcbn5jdi7xdf4o7a2jy/post/3lb7jrngi7c2m","timestamp":1731919511338,"emojis":["🥰"]}
{"type":"emojis","emojis":["🌻","🔪"]}
{"type":"emojis","emojis":["❤"]}
Beautiful :)
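If you'd rather consume the relay from code than from websocat, here's a small sketch of the client side. The message shapes are inferred from the output above, so treat them as an assumption about the wire format rather than a spec. `ServerMessage`, `parseServerMessage` and `filterRequest` are names made up for this example.

```typescript
// Message shapes inferred from the websocat output; not an official spec.
type ServerMessage =
  | { type: "clientCount"; count: number }
  | { type: "emojis"; emojis: string[] }
  | { type: "post"; text: string; url: string; timestamp: number; emojis: string[] };

// Hypothetical helper: parses one message from the relay, or returns null
// if it isn't one of the known shapes.
function parseServerMessage(raw: string): ServerMessage | null {
  let msg: any;
  try {
    msg = JSON.parse(raw);
  } catch {
    return null;
  }
  switch (msg?.type) {
    case "clientCount":
      return typeof msg.count === "number" ? msg : null;
    case "emojis":
      return Array.isArray(msg.emojis) ? msg : null;
    case "post":
      return typeof msg.text === "string" && Array.isArray(msg.emojis) ? msg : null;
    default:
      return null;
  }
}

// Filter request in the same shape as the one piped into websocat.
function filterRequest(emoji: string): string {
  return JSON.stringify({ type: "filter", emoji });
}

// Example: tally emojis from a few received messages.
const tally = new Map<string, number>();
const received = [
  '{"type":"clientCount","count":1}',
  '{"type":"emojis","emojis":["‼","✨"]}',
  '{"type":"emojis","emojis":["🥰"]}',
];
for (const line of received) {
  const msg = parseServerMessage(line);
  if (msg !== null && msg.type === "emojis") {
    for (const emoji of msg.emojis) {
      tally.set(emoji, (tally.get(emoji) ?? 0) + 1);
    }
  }
}
console.log(tally);
```

In a browser, that would be wired up with `ws.send(filterRequest("🥰"))` once the socket is open and `parseServerMessage(event.data)` in the message listener.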
Quite a journey, but we now have a rather robust Websocket relay, which connects to the Bluesky Jetstream once and republishes only a small subset of relevant messages. Find the current server code over at GitHub.
Make sure to check out the end result https://skymood.skorfmann.com/ and don't forget to share this post, the website or both :) Also, post comments either here or on this Bluesky post