2024-06-17: CoT prompting
Arman Tarkhanian
Posted on June 18, 2024
Seems like we still have a fire lit under our butts. v0 was apparently not satisfactory, so my primary goal last week was to build a more fully-fledged version of CoT (as I mentioned last time) so the AI can actually think through its responses before asking the user for further details.
It was a bit of a technical lift to get it done. The way our project is set up is a little weird: we're using Django for routing, but for some reason all our LLM logic lives in the serializer and apps.py files, even though it really should be refactored out into its own modules. I had to do a bunch of theorycrafting and planning to come up with a design I thought would work for the CoT. Obviously sending that whole massive chunk of thinking to the user is bad for UX, so it needs to happen under the hood. My plan was to run a CoT pass and THEN have a second step that refines it into a single question or statement. So I set out to do that. I ended up having to keep two separate memories, though, and we apparently have five or six variables dedicated just to handling the memory/conversation/chat history, so it got real confusing real fast.
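The two-step flow with its two memories can be sketched roughly like this. This is a minimal illustration, not our actual code: `call_llm` is a hypothetical stand-in for whatever chat-completion client you use, and the prompts are made up.

```python
def call_llm(messages):
    # Stub for illustration; in production this would hit the model API.
    return "stubbed response"

def chain_of_thought_turn(user_message, cot_history, chat_history):
    # Step 1: hidden reasoning. The model thinks against its own private
    # history (the first "memory"); none of this text is shown to the user.
    cot_history.append({"role": "user", "content": user_message})
    reasoning = call_llm(
        [{"role": "system",
          "content": "Reason step by step about what detail is still missing."}]
        + cot_history
    )
    cot_history.append({"role": "assistant", "content": reasoning})

    # Step 2: refine that reasoning into a single user-facing question or
    # statement, and record only the visible turn in the chat history
    # (the second "memory").
    visible = call_llm([
        {"role": "system",
         "content": "Condense the reasoning below into one concise question."},
        {"role": "user", "content": reasoning},
    ])
    chat_history.append({"role": "user", "content": user_message})
    chat_history.append({"role": "assistant", "content": visible})
    return visible
```

Keeping the two histories separate is what lets the reasoning grow long without ever leaking into the conversation the user sees.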
Fortunately, I managed to untangle all that, and one of the other backend engineers showed me how to add a new column to the database for the CoT message, so it can pick up where it left off on reload or when switching conversations. All of this took something like 12 hours on Tuesday.
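The idea behind the new column is simple: persist the in-progress CoT text per conversation so a page reload or conversation switch can resume it. Here's a toy stand-in using stdlib sqlite3 instead of our Django models; the table and column names (`conversation`, `cot_scratchpad`) are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE conversation (id INTEGER PRIMARY KEY, title TEXT)")
conn.execute("INSERT INTO conversation (id, title) VALUES (1, 'demo')")

# The new column, nullable so existing rows migrate cleanly.
conn.execute("ALTER TABLE conversation ADD COLUMN cot_scratchpad TEXT")

def save_cot(conv_id, text):
    # Overwrite the scratchpad each time the hidden reasoning advances.
    conn.execute(
        "UPDATE conversation SET cot_scratchpad = ? WHERE id = ?",
        (text, conv_id),
    )

def load_cot(conv_id):
    # On reload or conversation switch, pull the scratchpad back in.
    row = conn.execute(
        "SELECT cot_scratchpad FROM conversation WHERE id = ?",
        (conv_id,),
    ).fetchone()
    return row[0] if row else None

save_cot(1, "step 1: clarify budget...")
```

In Django terms this would just be a nullable `TextField` on the conversation model plus a migration, but the resume logic is the same shape.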
So it was working, but it needed further prompt engineering to actually improve the UX. I met with the product team several times to jam on it, adding and tweaking prompts until they were up to the standard we wanted. The only major issue is that it's SLOW, but I guess that's the point: they'd rather it take five seconds to generate a response than return something suboptimal.
Anyway, so all that was good and proper, and then came the menace of trying to merge it back into dev so everyone could try it. The rest of the dev team had been implementing text streaming for the chat messages, and they did it over SSE instead of simply using a websocket, so the code was absolutely grotesque.
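For context on what SSE streaming looks like on the wire: the server emits a stream of `data: ...\n\n` frames that the browser's EventSource reassembles. A minimal sketch of the framing, with the `[DONE]` sentinel being a common convention rather than anything from our codebase:

```python
def sse_frames(tokens):
    # Each token becomes one Server-Sent Events frame: "data: <payload>\n\n".
    for tok in tokens:
        yield f"data: {tok}\n\n"
    # Sentinel frame so the client knows the stream is finished and can close.
    yield "data: [DONE]\n\n"

frames = list(sse_frames(["Hel", "lo"]))
```

In Django you'd wrap a generator like this in a `StreamingHttpResponse` with `content_type="text/event-stream"`, which is a different routing path than a websocket consumer, and is part of why the two implementations diverged so badly.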
On top of that, instead of relying on version control, the one guy primarily working on it had made an entirely separate file/routing system just for the streaming, while I'd been working out of the "vanilla" ones this whole time. I immediately told them to start "zipping" it all together, and that turned into a whole mess on Friday because they didn't understand what I meant for some reason; that one dev just slapped the "new" methods into the original file instead of actually reworking the functionality. Eventually the lead dev stepped in and did it himself, and I was able to merge in my code. It didn't work properly, but by the time I had to sign off on Friday (it was a busy weekend for me) it was about 90% there, so I left it for the other guys to finish.
They ended up fixing it (mostly), but then a whole mess happened with threading that I'll cover in the next post, since that's technically been this week, not last week. The reason we were in such a rush to get this out, as I mentioned, was to get investors hooked. Hopefully it was all worth it.
Until next time, cheers.