Day 4: Voice Activity Detection with Python
Dilek Karasoy
Posted on January 5, 2023
When it comes to speech recognition, most people only know about Automatic Speech Recognition (ASR). Voice Activity Detection (VAD), however, is a fundamental piece of any speech-related product. Voice AI vendors integrate VAD into their ASR engines but do not offer it separately. Picovoice initially built Cobra for internal use, then made it public due to market demand, as there was no alternative to Google's WebRTC VAD, which does not work on all platforms.
You can read more on what voice activity detection is, but today is the day to learn how to detect voice activity with the Cobra VAD Python SDK:
1. Install VAD SDK
pip3 install pvcobra
If you haven't already, sign up for Picovoice Console (it's free) to grab your AccessKey.
2. Implement in Python
import pvcobra
# Replace the placeholder with the AccessKey from Picovoice Console
access_key = "${ACCESS_KEY}"
handle = pvcobra.create(access_key)
Once initialized, the required sample rate is available as handle.sample_rate, and the expected frame length (the number of audio samples per frame) as handle.frame_length. The engine accepts 16-bit linearly-encoded PCM and operates on single-channel audio.
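As a rough illustration of these input requirements, here is a minimal sketch that cuts a mono, 16-bit WAV file into fixed-size frames using only the Python standard library. The helper name `read_frames` is hypothetical, and the values 16000 and 512 in the demo are assumptions for illustration only; in real use you would pass `handle.sample_rate` and `handle.frame_length` instead.

```python
import io
import struct
import wave


def read_frames(wav_file, frame_length, expected_sample_rate):
    """Yield lists of `frame_length` int16 samples from a mono 16-bit WAV.

    `wav_file` may be a path or a binary file-like object.
    """
    with wave.open(wav_file, "rb") as wav:
        if wav.getnchannels() != 1:
            raise ValueError("expected single-channel audio")
        if wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit samples")
        if wav.getframerate() != expected_sample_rate:
            raise ValueError("unexpected sample rate")
        while True:
            raw = wav.readframes(frame_length)
            if len(raw) < frame_length * 2:
                break  # drop the trailing partial frame
            # "<h" = little-endian signed 16-bit integers
            yield list(struct.unpack("<%dh" % frame_length, raw))


# Demo on a synthetic in-memory clip of 1200 silent samples.
# 16000 Hz and 512 samples per frame are assumed values for this demo.
buf = io.BytesIO()
with wave.open(buf, "wb") as out:
    out.setnchannels(1)
    out.setsampwidth(2)
    out.setframerate(16000)
    out.writeframes(struct.pack("<1200h", *([0] * 1200)))
buf.seek(0)

frames = list(read_frames(buf, frame_length=512, expected_sample_rate=16000))
print(len(frames), len(frames[0]))
```

Each yielded frame is a plain list of integers, which could then be handed to the engine one at a time. Dropping the final partial frame is a design choice of this sketch; padding it with zeros would be another option.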
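def get_next_audio_frame():
    # Placeholder: return the next handle.frame_length samples of 16-bit, single-channel PCM
    pass

while True:
    # Probability that the current frame contains voice
    voice_probability = handle.process(get_next_audio_frame())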
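The loop above produces one voice probability per frame, but it does not say what to do with it. One common approach is to compare each value against a threshold and group consecutive voiced frames into segments. The sketch below assumes the probabilities are floats between 0 and 1; the helper name `voiced_segments` and the default threshold of 0.5 are illustrative choices, not part of the Cobra SDK.

```python
def voiced_segments(probabilities, threshold=0.5):
    """Return (start, end) frame-index pairs for runs of voiced frames.

    A frame counts as voiced when its probability is >= threshold.
    `end` is exclusive, so a segment covers frames start .. end - 1.
    """
    probabilities = list(probabilities)
    segments = []
    start = None  # index where the current voiced run began, if any
    for i, p in enumerate(probabilities):
        if p >= threshold and start is None:
            start = i
        elif p < threshold and start is not None:
            segments.append((start, i))
            start = None
    if start is not None:
        # Audio ended while voice was still active
        segments.append((start, len(probabilities)))
    return segments


# Example with made-up per-frame probabilities
example = [0.1, 0.8, 0.9, 0.2, 0.7]
print(voiced_segments(example))
```

A higher threshold reduces false alarms at the cost of missing quiet speech, so the right value depends on the application. Frame indices can be converted to seconds by multiplying by the frame length and dividing by the sample rate.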
Congratulations!