Append-Only Feeds: A Better Way To Order Feeds

Steve Krenzel

Posted on September 12, 2018


I had been working at Twitter for a few weeks and still didn’t understand why people used it. Fortunately, most of my days were spent in the app’s networking layer, making sure things worked well in countries without reliable connectivity. So it wasn’t strictly necessary that I “got” Twitter; I just had to make it more reliable.¹

I loved Reddit and Hacker News though. I had used them for almost a decade at that point and checked them both religiously². When a hackathon was approaching at Twitter, I decided to evaluate why I enjoyed one form of social news curation and didn’t quite connect with the other.

Could we learn something from Reddit that could improve Twitter? And why do people seem to prefer reverse chronological feeds on Twitter, Facebook, and Instagram, but on sites like Reddit and Hacker News they prefer feeds that may completely change ordering between every refresh?

Reddit and Twitter are both top-10 websites, so there doesn’t seem to be a general macro-level preference for one kind of feed over the other.

This exploration into feeds would come in handy again later as I set out to build Banter, a social network for podcasts. It’s an interesting question: if you could design your ideal feed, what would it look like?

Gatekeepers³

The immediate differences I explored, and the major hurdles I found with being a new user on Twitter, revolved around where the content on each site was coming from. They use very different mechanisms to source content. The choice is similar to ensembles in machine learning: do you trust a few finely tuned, highly trusted filters, or do you trust millions of random filters, each with only a marginal chance of being more accurate than a coin flip?

It turns out that both can work.

Chain of Trust

Twitter’s curation is built around a chain of trust. I choose people to follow and I trust that those people will a) generate tweets relevant to my interests or b) retweet tweets relevant to my interests. This relationship then recurses, where those people also follow other people and place the same trust in those follows.

But when it comes to bootstrapping your follows, who do you choose? Twitter has front-loaded all of the trust decisions onto the user. To further complicate things, people you trust also need to be people who tweet. I’m interested in programming, so I naturally wanted to follow Guido van Rossum, Simon Peyton Jones, etc. only to find that they don’t really tweet.

It turns out that the people who you think you want to tweet about a domain and the people who are actually good at tweeting about that domain are rarely the same. This makes bootstrapping your network very difficult.

Interestingly enough, Facebook and Instagram also use a chain of trust but bootstrapping was never an issue for them. You followed your friends and you were done. Twitter, being so open-ended and flexible, appeared to be a Turing tarpit of sorts, but applied to social media.

Beware of the Turing tar-pit in which everything is possible but nothing of interest is easy.
– Alan J. Perlis

http://pu.inf.uni-tuebingen.de/users/klaeren/epigrams.html

Wisdom of the Crowds

Reddit’s curation is, in a way, the exact opposite of Twitter’s. Instead of choosing whom to trust, you rely on millions of strangers to vote on random stories. The introduction of subreddits made things a little better by letting a user select internet strangers who would at least share a common interest (in theory).

The nice thing about Reddit’s approach is that you’ll see interesting content on day one with minimal effort on your part. The content may be lowest-common-denominator due to needing broad appeal, but it’ll be fairly consistent day-over-day. A Twitter feed, by contrast, will often have much higher variance in quality and types of content.

The key to Reddit’s success here is that the barriers to entry for participating in the community are much lower. You don’t need to know anyone on the platform to get value from it. You also don’t need to know anyone to participate in the community. Anyone can submit a story with the potential of many people seeing it. On Twitter, you’re tweeting into the void until someone explicitly chooses to trust you. That’s a big hump to get over.

Could Twitter avoid this hump somehow?

Feed Ordering

When a social platform embraces a chain-of-trust or wisdom-of-the-crowds approach, that choice constrains how content flows through the network, which in turn determines how the content should be shown to the user.

Reverse Chronological

Twitter was still strictly reverse chronological when I started working there. There was something comforting in its simplicity. You knew that if you followed someone and they tweeted something, it would be in your timeline right where they put it.

It seems that people like reverse chronological timelines. There is comfort in understanding why things are in your feed the way that they are. And there is frustration when you reload your feed and everything is in a different order and some things are just plain gone. People become super passionate when you start messing with their timelines.

The nice thing about reverse chronological feeds in a chain-of-trust environment is that in a highly connected graph (e.g. Twitter)⁴, every tweet on the platform is potentially within a few seconds of reaching you. It also turns out that recency is a pretty good heuristic for relevancy.

The major downside of a reverse chronological feed is that you may easily miss important things if you don’t check often. If you’re only on Twitter for cute animals, it’s not a big deal if you miss one or two posts. But for breaking news or organizing revolutions, you’d better be checking often and hoping that the people you follow retweet these things over and over again, just in case you didn’t scroll down far enough the last time.

Voting

Reverse chronological feeds like old-Twitter are great because they’re simple to understand, the ordering never changes, and you get to choose who can put things in your feed. But what about feeds that are ranked by physics simulations⁵, change between page refreshes, and are sourced by internet strangers?

You can’t just have any random person on the internet put something at the top of your feed. That’d be as crazy as… e-mail. So we need a way of taking millions of submissions and filtering them.

Hacker News and Reddit both use the mighty upvote, but they treat it a little differently. Hacker News has posts always falling through time, and upvotes lift them up. On Reddit, upvotes effectively move a post’s timestamp forward in time, and then stories are ranked based on that new timestamp. There are other parameters that go into these, but that’s the high-level gist.
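To make the difference concrete, here is a rough sketch of the two styles of scoring. These are simplified versions of formulas that have circulated publicly for years, not either site's current production code. The function names, the gravity value of 1.8, and the 45000-second divisor are assumptions used for illustration.

```python
import math


def hn_style_score(points: int, age_hours: float, gravity: float = 1.8) -> float:
    """Posts are always falling: the denominator grows with age.

    Upvotes raise the numerator and lift the post back up.
    The gravity default is an assumed, commonly cited value.
    """
    return (points - 1) / (age_hours + 2) ** gravity


def reddit_style_score(ups: int, downs: int, created_epoch_s: float) -> float:
    """Votes behave like a forward shift of the post's timestamp.

    With the assumed divisor of 45000 seconds, every 10x increase in net
    votes is worth the same as being posted 12.5 hours later.
    """
    net = ups - downs
    sign = (net > 0) - (net < 0)
    magnitude = math.log10(max(abs(net), 1))
    return sign * magnitude + created_epoch_s / 45000


if __name__ == "__main__":
    # Same votes, different ages: the older post ranks lower on an HN-style page.
    print(hn_style_score(100, age_hours=1) > hn_style_score(100, age_hours=10))
    # On a Reddit-style page, 10 net votes offset a 45000-second head start.
    print(reddit_style_score(10, 0, 0), reddit_style_score(1, 0, 45000))
```

In the first function time is a penalty that votes fight against. In the second, votes are converted into time, so a post's rank only moves when its votes do.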

So anytime you look at the front page of these sites, you’re looking at a sliding window of best-of posts for some period of time. The immediate benefit here is that you’re less likely to miss really important or popular things because they’ll hang out near the top for a while.

The downside is that important news may take a while to make it to the top and users are susceptible to vote brigading. You’ll also rarely see a non-popular opinion, almost by definition.

I’m Livin’ My Best Feed

Where’d that Twitter hackathon wind up? Well… I can’t really get into the details, but I can say that I focused on the bootstrapping problem with mixed results. Regardless, the process made me very mindful of how information gets to a user and how explicit their role is in both the ordering and sourcing of content. As my co-founder and I set out to build our own social network, the topic of feeds and follow models came up a lot.


We wanted your home feed to be all about seeing what friends are sharing, having conversations, and just helping you figure out what you should listen to next. For people who want a purely reverse chronological feed like every other podcast app, we included a dedicated subscriptions feed⁶, but that’s not what Banter is about.

Episodes and Clips in the Banter feed. 

Our Algorithm

So if the home feed isn’t reverse chronological, what is it? It’s almost as simple — we take everything that’s happened since the last time you opened the app, order it, and append it all to your feed. Then, most importantly, we never touch those items again. A post can get a million more likes and it doesn’t matter, it’ll be right where you last saw it.

When I say “order it”, I mean that if an item was published at noon and gets a bunch of likes, we’ll move it forward in time a little bit. We don’t actually change its published timestamp, but we may order it as though it was published at 5:00pm instead of noon. We’ll never move an item backward in time and we cap how far forward it can move. So we simply nudge popular items higher into your feed. If something is really popular, it’ll get a bigger nudge.
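As a minimal sketch of that nudge, assuming a logarithmic boost and a fixed cap: the constants and the function name below are hypothetical and are not Banter's actual formula. The point is only that the sort time is never earlier than the published time and never more than the cap later.

```python
import math
from datetime import datetime, timedelta

# Hypothetical tuning values, chosen only for illustration.
MAX_NUDGE = timedelta(hours=6)         # cap on how far forward an item can move
NUDGE_PER_DECADE = timedelta(hours=2)  # boost for each 10x increase in likes


def effective_time(published: datetime, likes: int) -> datetime:
    """Return the timestamp used for ordering, not for display.

    The boost is zero for zero likes and grows with popularity, so an item
    never moves backward. min() enforces the cap on forward movement.
    """
    boost = NUDGE_PER_DECADE * math.log10(1 + max(likes, 0))
    return published + min(boost, MAX_NUDGE)


if __name__ == "__main__":
    noon = datetime(2018, 9, 12, 12, 0)
    for likes in (0, 10, 100, 1_000_000):
        print(likes, effective_time(noon, likes).time())
```

The published timestamp itself is left untouched. Only the value used as the sort key changes.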

This approach has an immediate and fascinating consequence: If a user checks their feed often, their feed becomes purely reverse-chronological. And if a user checks their feed every few days, their feed is mostly ranked.
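A small simulation shows why this falls out of the design. It is a sketch under assumptions: times are plain hours, like counts are fixed for the whole run, and `sort_key` is a made-up nudge rather than the real ordering function.

```python
import math
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Post:
    name: str
    published: float  # hours since an arbitrary start
    likes: int


def sort_key(post: Post, cap_hours: float = 6.0) -> float:
    """Hypothetical nudge: popular posts sort as if published later, up to a cap."""
    return post.published + min(2.0 * math.log10(1 + post.likes), cap_hours)


@dataclass
class Feed:
    items: list = field(default_factory=list)  # oldest batch first
    last_opened: float = float("-inf")

    def open(self, now: float, all_posts: list) -> list:
        """Order everything new since the last open and append it.

        Items appended on earlier opens are never reordered.
        """
        batch = [p for p in all_posts if self.last_opened < p.published <= now]
        batch.sort(key=sort_key)
        self.items.extend(batch)
        self.last_opened = now
        return [p.name for p in reversed(self.items)]  # top of the feed first


posts = [Post("A", 1, 1000), Post("B", 3, 0), Post("C", 5, 0)]

frequent = Feed()
for hour in (2, 4, 6):  # checks often: every batch holds a single post
    frequent_view = frequent.open(hour, posts)

occasional = Feed()
occasional_view = occasional.open(6, posts)  # checks once: one big batch

print(frequent_view)    # ['C', 'B', 'A'], purely reverse chronological
print(occasional_view)  # ['A', 'C', 'B'], popular post A is nudged to the top
```

Ranking only happens inside a batch. Small batches leave nothing to rank, and one large batch leaves everything to rank.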

The feed naturally adapts to two major personas: 1) the frequent listener who just wants to start every morning off with their favorite podcast and 2) the occasional listener who just wants to listen to something interesting while they do chores on the weekend.

Other Consequences

On Reddit, there’s no sense of object permanence. Once you refresh your feed you may never see a post again, though you’ll want to believe that it’s below the fold somewhere. In Banter, you know that once you’ve seen an item in your feed, it’s there forever. If you want to show it to a friend later, you can always scroll on down and find it.

And on Twitter, there’s a sense that you might miss something big if you don’t check often. We reduce that likelihood by making sure popular items are higher in your feed. You don’t have to scroll down as far to find the important stuff.

We also don’t do any Facebook-style blackbox filtering where a friend may post something and it never goes into your feed or you check your feed and suddenly something from two weeks ago is there even though it wasn’t there 30 minutes ago. When you load up your feed in Banter, we don’t hold anything back.

Perfect?

The algorithm reminds me a little bit of hybrid sorting algorithms that use one algorithm for big chunks of items and then swap to another algorithm once the major chunks are ordered. Here we rely on reverse chronological for the big chunks, and ranking for refining the small chunks.

Is this the perfect feed algorithm? The jury is still out and the Banter community is still tiny (around 700 people as I write this), but our feed has been really well received by our admittedly small user base.

This may work better for podcasts too because, unlike photos or tweets, when you commit to an episode you’re probably going to give it 20 minutes to an hour+ of your life, instead of a few seconds. There’s more of a need to filter out cruft.

Bootstrapping

If you want to check out the feed for yourself, you can download Banter here.

In addition to the above feed-ordering experiment, we’re also experimenting with network bootstrapping. When you join Banter we want to make sure you’re not alone, so (at least while we’re tiny) we automatically have people follow each other. We previously had everyone follow everyone else, but now we have a sliding window where other people who recently joined will follow each other. Over time you’ll have about 100 people follow you automatically.

The theory is that if you both joined during a similar time period, then you both likely have something in common (e.g. you both read this blog post). This seemed like a more interesting way to bootstrap follows than slurping in your FB network and it makes sure that if you share something you’ve enjoyed, others are there to enjoy it too.
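One way to picture the sliding window is the sketch below. It assumes each new member mutually follows the most recent `window` joiners. The class name, the window size of 50, and the mutual-follow rule are assumptions for illustration, not a description of how Banter implements it.

```python
from collections import deque


class Bootstrapper:
    """Auto-follow between people who joined around the same time."""

    def __init__(self, window: int = 50):
        # deque(maxlen=...) drops the oldest joiner automatically.
        self.recent = deque(maxlen=window)
        self.followers = {}  # user -> set of users following them

    def join(self, user) -> None:
        self.followers[user] = set()
        for other in self.recent:
            # Mutual follow between the newcomer and each recent joiner.
            self.followers[user].add(other)
            self.followers[other].add(user)
        self.recent.append(user)


if __name__ == "__main__":
    network = Bootstrapper(window=50)
    for user_id in range(200):
        network.join(user_id)
    # 50 people who joined just before, plus 50 who joined just after.
    print(len(network.followers[100]))
```

With a window of 50, a member picks up 50 followers on joining and roughly 50 more as later members arrive, which lines up with the "about 100" figure above.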


If you found this post useful or interesting, you’ll probably also enjoy our podcasting app, Banter. We’re building a community of listeners, and we’d love to have you. Plus, we just launched the ability to share clips of any podcast.

And if you’re interested in React, React Native, GraphQL, Postgres, or just about any tech stuff, follow us here or on Twitter. (I’m @stevekrenzel and my cofounder is @jamesreggio.) You can also email us at founders@banter.fm.


[1] That may sound strange, but I also worked on Google Translate without speaking any other languages and on Windows at Microsoft while having been a devout Unix user since high school, so it’s not a particularly unique experience.

[2] Remember when PG wanted to get HN back to its roots and asked us all to submit only Erlang articles for a while?

[3] This has all since gotten murkier. Twitter may put things in your feed you’re not subscribed to. You can follow individual users on Reddit. Twitter may reorder your timeline. Facebook won’t show you every post your friends make. And so on. Please allow my oversimplification, for simplification purposes.

[4] See also: https://en.wikipedia.org/wiki/Small-world_network

[5] It’s not actually a physics simulation. Just has a variable called gravity.

[6] Always build an escape hatch.

[7] More specifically (with tweaks to avoid domain errors), we order by:

Where, currently, N = 3.

[8] With a few exceptions such as muting episodes, unsubscribing from shows, new recasts, etc.
