Supercharging Obsidian Search with AI and Ollama

AIRabbit

Posted on November 26, 2024


Have you ever torn your hair out trying to find a note you know you saved, but the search bar just stares back at you? That was me last week. I was desperately searching for a one-liner command to clear Time Machine's local storage on my Mac. I typed in "clear time machine", "remove backups", "free space" - nothing. It felt like my notes had swallowed the command into a black hole.

Turns out I had saved it under "storage" and "space", not "delete" or "remove". Classic memory lapse. This got me thinking: our brains often don't remember the exact words we used when taking notes. In personal knowledge management, this "memory storage paradox" can turn finding information into a needle-in-a-haystack problem.

The Search for a Better Search

I love Obsidian for note-taking, but its search functionality relies on exact matches. I needed a way to bridge the gap between how I remember and how I write. So, I explored some existing solutions:

  1. Vector Embeddings: They offer semantic search but require complex setup and heavy resources.
  2. Full-Text Search with Indexing: Fast but limited to literal matches.
  3. Manual Tagging: Effective but demands discipline and foresight.
  4. GPT-Based Solutions: Great semantic understanding but pose privacy concerns and depend on external services.

I have adopted some of these powerful RAG-based solutions in the past, but this time I wanted to see if there was a simpler way to improve search without embeddings or indexing.

Essentially, the solution is to let the AI *formulate the search expression* rather than perform the search itself (similar to the concept of generating a SQL statement instead of executing it, as in https://github.com/vanna-ai/vanna).

Instead of overhauling my entire note collection, why not enhance the search query itself? By using a local Large Language Model (LLM) to expand my search terms, I could get a semantically rich search without sacrificing privacy or simplicity.

I find this approach appealing for a number of reasons:

  • Semantic Understanding: Captures related terms and concepts.
  • Privacy Preservation: Everything runs locally; no data leaves my machine.
  • Immediate Implementation: No need for indexing or pre-processing notes.
  • Simplicity: Minimal changes to my existing workflow.

Building the Solution

How It Works

  1. User Inputs a Search Term: Let's say "clear time machine."
  2. Local LLM Generates Related Terms: The model outputs terms like "time machine cleanup," "delete backups," etc.
  3. Construct Enhanced Search Query: Combines original and related terms using Obsidian's search syntax.
  4. Execute Search in Obsidian: Retrieves notes that match any of the expanded terms.
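
Put together, the whole flow fits in one method: take the user's term, ask the local model for variations, assemble the query, and open Obsidian's search with it. Here is a rough sketch of that orchestration (in the actual plugin the pieces are split between the plugin and a search modal, and openObsidianSearch is covered further below):

async performEnhancedSearch(searchTerm: string): Promise<void> {
    // Step 1: the term the user typed, e.g. "clear time machine"
    this.searchTerm = searchTerm;

    // Step 2: ask the local LLM for misspellings, synonyms and related terms
    const relatedTerms = await this.getRelatedTerms(searchTerm);

    // Step 3: combine the original and expanded terms into one search expression
    const query = this.buildSearchQuery(relatedTerms);

    // Step 4: open Obsidian's search pane with the enhanced query
    this.openObsidianSearch(query);
}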

Diving into the Code

Here's the function that queries the local LLM for related terms:

async getRelatedTerms(searchTerm: string): Promise<string[]> {
    try {
        // Ask the local Ollama server (non-streaming) for term variations
        const response = await fetch(`${this.settings.llamaEndpoint}/api/generate`, {
            method: 'POST',
            headers: {
                'Content-Type': 'application/json',
            },
            body: JSON.stringify({
                model: "llama3.1:latest",
                // The prompt asks for plain terms only, one per line,
                // so the output can be split without further parsing
                prompt: `For the search term "${searchTerm}", provide a list of:
                        - Common misspellings
                        - Similar terms
                        - Alternative spellings
                        - Related words
                        Return ONLY the actual terms, one per line, with no explanations or headers.
                        Focus on finding variations of the exact term first.`,
                stream: false
            })
        });

       ...
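The elided part is just response handling. With stream: false, Ollama's /api/generate returns a single JSON object whose response field holds the generated text, so the rest of the function more or less boils down to splitting that text into lines. A sketch of how the truncated part might continue (the exact cleanup in the plugin may differ):

        // Ollama returns one JSON object when stream is false;
        // the generated text is in its "response" field.
        const data = await response.json();

        // One term per line, as the prompt requested: strip list markers,
        // trim whitespace, and drop empty lines.
        return (data.response as string)
            .split('\n')
            .map(line => line.replace(/^[-*]\s*/, '').trim())
            .filter(line => line.length > 0);
    } catch (error) {
        console.error('Error fetching related terms from the local LLM:', error);
        return [];
    }
}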

And here's how the enhanced search query is built:

buildSearchQuery(terms: string[]): string {
    // Quote the original term and every LLM-suggested variation
    const allTerms = [`"${this.searchTerm.trim()}"`, ...terms.map(term => `"${term.trim()}"`)];

    // Optionally restrict results to a tag, then OR all the terms together
    if (this.plugin.settings.includeTag && this.plugin.settings.defaultTag.trim() !== '') {
        return `tag:#${this.plugin.settings.defaultTag} AND (${allTerms.join(' OR ')})`;
    }
    return allTerms.join(' OR ');
}
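The last step, actually running the search, has no official plugin API in Obsidian. A common workaround is to call the core global-search plugin's instance directly; a minimal sketch of that, with the usual caveat that this internal API is undocumented and may change:

openObsidianSearch(query: string): void {
    // Obsidian's core search is an "internal" plugin without a typed API,
    // so this reaches through app.internalPlugins and casts to any.
    const globalSearch = (this.app as any).internalPlugins?.getPluginById('global-search');
    globalSearch?.instance?.openGlobalSearch(query);
}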

Real-World Example

Let's revisit my Time Machine dilemma.

  • Original Search: "clear time machine"
  • LLM-Expanded Terms:

    • "time machine cleanup"
    • "delete time machine backups"
    • "remove old backups"
    • "free up time machine space"
  • Enhanced Search Query:

tag:#howto AND (
    "clear time machine" OR 
    "time machine cleanup" OR 
    "delete time machine backups" OR 
    "remove old backups" OR 
    "free up time machine space"
)

With this query, Obsidian pulled up the elusive note instantly!

Getting It Up and Running

Requirements

  • Obsidian: Your go-to note-taking app.
  • Local LLM API: I used Ollama running Llama 3.1 locally.
  • Basic Knowledge of Obsidian: Familiarity with search syntax helps.

Steps

  1. Set Up a Local LLM: Install and run Ollama or any other local LLM API.
  2. Install the Plugin: Place the plugin files into your Obsidian plugins directory.
  3. Configure Settings: Set your LLM endpoint and default tag in the plugin settings (see the settings sketch after this list).
  4. Start Searching: Use the enhanced search to find notes more effectively.
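
For completeness, the settings used in the snippets above boil down to three fields. A minimal sketch of how they might be declared (the interface name and defaults are illustrative; the endpoint default is simply Ollama's standard local port):

interface AISearchSettings {
    llamaEndpoint: string;   // base URL of the local Ollama server
    includeTag: boolean;     // whether to restrict results to a tag
    defaultTag: string;      // tag to AND with the expanded terms, e.g. "howto"
}

const DEFAULT_SETTINGS: AISearchSettings = {
    llamaEndpoint: 'http://localhost:11434',  // Ollama's default port
    includeTag: true,
    defaultTag: 'howto',
};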

Final Thoughts

With this little tweak, I was able to use on-device AI to improve Obsidian's existing search capability, and it really did make a difference. I am thinking of adapting a similar solution for other tools that, unlike Obsidian, do not yet have GPT or AI-powered search. I will be sure to share any findings with you soon.
