Examining HN Discovery Quality Using Existing Complaints

danielsgriffin

Daniel Griffin

Posted on September 10, 2024


We've launched a new Hacker News search experience, focused on discovery: hn.trieve.ai (GitHub: Trieve API backend, frontend search interface).

Hacker News has long been a playground for search innovation, with the community often leaning in to explore new possibilities in search. Over the past six months, Nick has been looking back at the various search experiences, and he detailed his findings in a post: History of HackerNews Search: From 2007 to 2024.

We combed through HN (and user issues posted to Algolia's repo for HN search) in search of search complaints. Over the years there have been some complaints about indexing issues, and we’re not covering those in this post. Instead, we looked for examples where people shared actual search queries. For each query, we looked for what they said or implied about their search intent and the search results they found. What have people said about search quality? What searches are not possible or not easy in the current HN search? When are folks resorting to running a site: search on a search engine like Google? Where can Trieve help make search better?

Discover well beyond exact matches in titles

Searching for "postgres clustering"

Algolia for "postgres clustering"

Trieve for "postgres clustering"

Searching for "AT&T says criminals stole phone records of 'nearly all' customers in new data breach"

Algolia for "AT&T says criminals stole phone records of nearly all customers in new data breach"

Trieve for "AT&T says criminals stole phone records of nearly all customers in new data breach"

Searching with special characters

Searching for "[video]"

Algolia for "[video]"

Trieve (semantic) for "[video]"

Algolia (quoted) for "[video]"

Trieve (semantic, quoted) for "[video]"

Searching for "AT&T"

Algolia (prefix=true) for "AT&T"

Algolia (prefix=false) for "AT&T"

Algolia (quoted) for "AT&T"

Trieve for "AT&T"

Out-of-domain strings

Searching for "lootitooti"

Algolia for "lootitooti"

Trieve for "lootitooti"

Presque vu searches

Searching for "deterministic Docker builds"

Algolia (type: Story) for "deterministic Docker builds"

Trieve (type: Story) for "deterministic Docker builds"

Searching for "tip of your tongue phenomenon"

Bonus! Again, the precision-focused approach of requiring "your" has downsides.
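To make the downside concrete, here is a toy sketch (not how Algolia or Trieve actually match, and the two titles are hypothetical examples invented for illustration). It contrasts a strict matcher that requires every query term with a looser score based on term overlap. A title that lacks only "your" is dropped by the strict matcher even though it is clearly on topic.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def matches_all_terms(query: str, title: str) -> bool:
    """Strict, precision-focused match: every query term must appear in the title."""
    return tokenize(query) <= tokenize(title)


def overlap_score(query: str, title: str) -> float:
    """Looser match: the fraction of query terms that appear in the title."""
    query_terms = tokenize(query)
    return len(query_terms & tokenize(title)) / len(query_terms)


# Hypothetical titles, invented for this illustration.
TITLES = [
    "The tip-of-the-tongue phenomenon",
    "Tip of your tongue phenomenon explained",
]

QUERY = "tip of your tongue phenomenon"

strict_hits = [t for t in TITLES if matches_all_terms(QUERY, t)]
scored = sorted(TITLES, key=lambda t: overlap_score(QUERY, t), reverse=True)

print("Strict (all terms required):", strict_hits)
print("Ranked by overlap:", [(t, overlap_score(QUERY, t)) for t in scored])
```

The strict matcher returns only the second title, while the overlap score still surfaces the first one with four of five terms matched.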

Algolia for "tip of your tongue phenomenon"

Trieve for "tip of your tongue phenomenon"

Filter on author with a hyphenated username

Searching for ""It Won't Fail Because of Me" by:1970-01-01"

Algolia for "It Won't Fail Because of Me" by:1970-01-01

Trieve for "It Won't Fail Because of Me"

Default sorting by relevance vs. popularity metrics

Searching for "Excel"

This is from a comparison that @airstrike shared after we launched our discovery search. He preferred the results from Algolia. Algolia defaults to a popularity-based sort. Algolia also has sort-by-date, but does not have a specific relevance-focused sorting option.

Trieve offers multiple sorting options:

  • default: relevance (not tuned to extrinsic popularity metrics)
  • number of points (similar to Algolia's "popularity")
  • date (reverse chronological)
  • descendants (number of comments)

Nick (@skeptrune) responded with some of our internal deliberations:

We went back and forth on making points sorting default and ended up deciding against it, but maybe we should have. Our thinking was that since it's focused on "discovery" it was worth prioritizing relevance, but I can see how it can feel the result quality isn't as great.

If someone is looking for more of the popularity-focused results, they can start their Trieve HN Discovery searches with the sortby= parameter set to num_value (try this link).
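As a rough illustration of the kind of link described above, the sketch below builds a search URL with the sortby parameter set to num_value. The sortby name and the num_value value come from this post; the q parameter name for the query text, the https scheme, and the build_search_url helper are assumptions made for illustration, so compare against the live site's URLs before relying on them.

```python
from typing import Optional
from urllib.parse import urlencode

# Site named in the post; the https scheme is assumed.
BASE_URL = "https://hn.trieve.ai/"


def build_search_url(query: str, sort_by: Optional[str] = None) -> str:
    """Build a search URL for the given query.

    `sortby` and its `num_value` setting are taken from the post.
    The `q` parameter name is an assumption, not confirmed by the post.
    """
    params = {"q": query}
    if sort_by is not None:
        params["sortby"] = sort_by
    # urlencode percent-encodes special characters such as "&".
    return f"{BASE_URL}?{urlencode(params)}"


# Points-sorted search for "Excel".
print(build_search_url("Excel", sort_by="num_value"))

# Special characters in the query are escaped rather than treated as separators.
print(build_search_url("AT&T"))
```

Leaving sort_by unset omits the parameter, which per the list above would fall back to the default relevance sort.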

Algolia (sorted by popularity) for "Excel"

Trieve (sorted by relevance) for "Excel"

Trieve (sorted by points) for "Excel"

Trieve (sorted by descendants) for "Excel"


If you want to explore comparisons between the Algolia HN Search and our Trieve HN Discovery, it can help to use our "Try it with Trieve!" button via our open-source unpacked Chrome extension: github.com/devflowinc/try-it-with-trieve

A "Try it with Trieve!" button in action

Learn something from this post? If you'd like to support our project, we'd be grateful if you'd explore and star our GitHub repository!
