When was the last time you learned something from your LLM logs? Here is the solution: phospho
Paul
Posted on March 6, 2024
TL;DR: phospho is an open-source text analytics tool for LLM apps. It helps companies turn their LLM prototype into a product with testing, evaluation, monitoring, and guardrails at the semantic level.
Building LLM apps has never been easier. There are TONS of tools. Yet companies that ship to production are scarce, and many AI tools that have made it to production have either a HIGH churn rate or a LOW usage rate. Why?
Unfortunately, many AI builders (I was one of them) are trapped at ground zero:
- They don’t know what to improve in their products, because there are so many ways to improve (and many yet to come!)
- To make decisions, they rely either on KPIs irrelevant to their use case or on gut feelings from everyone but their users
- Who the users are, what they do, and what they say are usually big unknowns.
No wonder they feel stuck. Their best chance at improving is either guesswork or reading through thousands of logs & messages. It is like looking for a needle in a haystack.
There is no secret. Here is what the best companies shipping LLM products do that others don’t:
- They release often, and fast… because they have a clear set of custom metrics that act as a simple “green light/red light”
- They improve on precise product issues… because they understand in great detail who uses their products and why
- They act on feedback quickly… because they listen to it every day in their Slack channels or by email, and get alerted when something goes wrong
🧪 This is the purpose of phospho. phospho is an open-source text analytics tool for LLM apps.
It gathers all the tools that enable your team to go from prototype to product at record pace: testing, evaluation, and monitoring at the semantic level. Let’s dive in:
Build and Test
- Define your own textual event detection pipeline
- Set up webhooks and enforce guardrails
- Assess quality before releasing with custom evals, and A/B test continuously
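To make the idea of an event detection pipeline with a guardrail concrete, here is a minimal sketch in plain Python. It does not use phospho's SDK: `LogEntry`, `keyword_detector`, `detect_events`, `guardrail`, and the two event names are all hypothetical, and the regex detectors are a crude stand-in for detection at the semantic level.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, List


@dataclass
class LogEntry:
    """One logged interaction (hypothetical shape, not phospho's schema)."""
    user_id: str
    session_id: str
    user_message: str
    app_response: str


# A detector returns True when its event occurs in a log entry.
Detector = Callable[[LogEntry], bool]


def keyword_detector(pattern: str, field: str) -> Detector:
    """Build a detector that searches one field of the entry with a regex."""
    regex = re.compile(pattern, re.IGNORECASE)

    def detect(entry: LogEntry) -> bool:
        return regex.search(getattr(entry, field)) is not None

    return detect


# Event names and patterns are illustrative assumptions.
DETECTORS: Dict[str, Detector] = {
    "user_frustrated": keyword_detector(
        r"\b(useless|not working|doesn't work)\b", "user_message"
    ),
    "response_contains_email": keyword_detector(
        r"[\w.+-]+@[\w-]+\.[\w.]+", "app_response"
    ),
}


def detect_events(entry: LogEntry) -> List[str]:
    """Run every detector and return the names of the events that fired."""
    return [name for name, detect in DETECTORS.items() if detect(entry)]


def guardrail(
    entry: LogEntry,
    blocked: FrozenSet[str] = frozenset({"response_contains_email"}),
) -> bool:
    """Return True when the response may be shown, False when it is blocked."""
    return not blocked.intersection(detect_events(entry))
```

In a real setup, each detector would more likely be a classifier or an LLM call than a regex; the pipeline shape (named detectors, a list of fired events, a guardrail that checks that list) stays the same.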
Understand and Analyze
- Detect usage patterns, categorize interactions by type, topics, intents, and more
- Evaluate app response quality
- Run tests at scale and in real time
Improve and Take Action
- Trigger workflows, escalations, and alerts based on detected events or evaluations
- Dive deeper into the data; get consolidated reports through the platform or via the API
- Break down the analysis by users or sessions
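As a rough illustration of breaking detected events down by user or session and alerting on them, here is a small sketch in plain Python. The record fields (`user_id`, `session_id`, `events`), the function names, and the threshold are assumptions made for this example; they are not phospho's data model or API.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List


def breakdown(records: Iterable[dict], key: str) -> Dict[str, Dict[str, int]]:
    """Count detected events per value of `key` (e.g. 'user_id' or 'session_id')."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for record in records:
        counts[record[key]].update(record["events"])
    return {group: dict(counter) for group, counter in counts.items()}


def groups_to_alert(
    counts: Dict[str, Dict[str, int]], event: str, threshold: int
) -> List[str]:
    """Return the groups where `event` was detected at least `threshold` times."""
    return sorted(
        group for group, events in counts.items()
        if events.get(event, 0) >= threshold
    )


# Hypothetical records: each one is a logged interaction with its detected events.
records = [
    {"user_id": "alice", "session_id": "s1", "events": ["user_frustrated"]},
    {"user_id": "alice", "session_id": "s2", "events": ["user_frustrated", "asked_pricing"]},
    {"user_id": "bob", "session_id": "s3", "events": ["asked_pricing"]},
]

per_user = breakdown(records, "user_id")
flagged = groups_to_alert(per_user, "user_frustrated", threshold=2)
```

The list returned by `groups_to_alert` is the kind of signal that could feed a webhook, an escalation, or a Slack alert.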
Integrations
- Python and JavaScript SDKs to easily integrate into your LLM stack
- phospho can be self-hosted or used as a managed cloud version
How to use this text analytics tool?
✨ The repo is open-source on GitHub. Join the Discord here
⚙️ If you run an LLM app, try the platform for free.