Sophisticated Speech-to-Text Application
David Akim
Posted on November 25, 2024
This is a submission for the AssemblyAI Challenge: Sophisticated Speech-to-Text.
What I Built
I developed a Speech-to-Text application in Taipy using AssemblyAI's Universal-2 Speech-to-Text model. The application's features are:
- Transcribe spoken words into written text.
- Detect multiple speakers in an audio file and attribute what each speaker said.
- Summarize the audio with key takeaways.
- Download transcriptions to a text file.
Demo
Screenshots
Journey
The application was developed using Taipy, a Python-based framework. Because AssemblyAI can also be called from Python, integrating its Speech-to-Text model was straightforward. AssemblyAI's comprehensive documentation simplified the implementation of the transcription and diarization features. Taipy provides the user interface, while AssemblyAI handles all the Speech-to-Text processing.
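As a rough illustration of the transcription and diarization step, the sketch below uses the AssemblyAI Python SDK (the `assemblyai` package) with speaker labels enabled. It is not the application's actual code: the helper names `format_utterances` and `transcribe_with_speakers`, the audio path, and the API key argument are assumptions made for this example, and the sketch does not select a speech model explicitly.

```python
from typing import Iterable


def format_utterances(utterances: Iterable) -> str:
    """Render diarized utterances as 'Speaker X: text' lines.

    Each item is assumed to expose `.speaker` and `.text`, like the
    utterance objects on an AssemblyAI transcript.
    """
    return "\n".join(f"Speaker {u.speaker}: {u.text}" for u in utterances)


def transcribe_with_speakers(audio_path: str, api_key: str) -> str:
    """Transcribe a file with speaker labels (hypothetical helper)."""
    # Imported lazily so format_utterances works without the SDK installed.
    import assemblyai as aai

    aai.settings.api_key = api_key
    config = aai.TranscriptionConfig(speaker_labels=True)
    transcript = aai.Transcriber().transcribe(audio_path, config=config)

    # Surface API-side failures instead of returning an empty transcript.
    if transcript.status == aai.TranscriptStatus.error:
        raise RuntimeError(transcript.error)

    return format_utterances(transcript.utterances or [])
```

The string returned by `transcribe_with_speakers` could then be shown in the UI or written to a text file for download; how the application wires this into Taipy is not shown here.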
This submission also meets the criteria for the No More Monkey Business Challenge Prompt. The summarization feature was implemented using LeMUR, where a custom prompt was sent to the LLM to generate a concise summary of the transcript.
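A minimal sketch of the summarization step, assuming a completed transcript object from the AssemblyAI Python SDK, might look like the following. The prompt wording and the `summarize` helper are hypothetical; the original prompt used in the application is not given in this post.

```python
# Hypothetical prompt; the application's real prompt is not shown in the post.
SUMMARY_PROMPT = (
    "Provide a concise summary of this transcript, "
    "followed by a short list of key takeaways."
)


def summarize(transcript, prompt: str = SUMMARY_PROMPT) -> str:
    """Send a custom prompt to LeMUR for an already-completed transcript.

    `transcript` is assumed to be an SDK transcript object, which exposes
    a `lemur.task(prompt)` call whose result carries the text in `.response`.
    """
    result = transcript.lemur.task(prompt)
    return result.response.strip()
```

Keeping the prompt as a parameter makes it easy to experiment with different summary styles without touching the call itself.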
This is a solo submission, with all the work on Taipy and AssemblyAI completed by myself. It was an enjoyable learning experience. AssemblyAI has made building Speech-to-Text applications incredibly easy, and I will certainly use it again in the future.