How to use Whisper AI (using Google Colab)

Glos Code

Posted on July 15, 2023

What is Whisper AI?

Whisper is an automatic speech recognition (ASR) system trained on 680,000 hours of multilingual, multitask supervised data collected from the web. It was created by OpenAI, the company behind ChatGPT and DALL·E. Because it is a multitask model, Whisper can both transcribe audio files into text and translate speech from other languages into English. Although it is still being developed, it has the potential to be an effective tool for many applications.

What is Google Colab?

Google Colab is a free, cloud-based Jupyter Notebook environment that runs Python code online with nothing to install. Colab provides a number of features, such as:

  • Running Python code in a web browser. You can write and run Python programmes in Colab without installing any software on your computer.
  • Access to Google's cloud computing and storage. Long-running or demanding Python programmes use Colab's resources rather than your own computer's.
  • Sharing and collaboration. You can share your Colab notebooks with other users and work on projects together in real time.

Why Google Colab?

For Whisper or other Python projects, you may prefer to use Google Colab rather than your personal computer for a number of reasons.

  • Google Colab has a free tier, so you don't need to buy or maintain a powerful machine.
  • Google Colab provides access to GPUs that speed up Python projects, including machine learning workloads such as Whisper.
  • Google Colab is cloud-based, so it is accessible from anywhere.
  • You can collaborate on projects with others in Google Colab's collaborative environment.

Google Colab does have some possible disadvantages, though:

  • Google Colab occasionally runs slowly, especially at peak usage times.
  • The storage capacity of Google Colab is limited.
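Since GPU access is one of the main reasons to run Whisper on Colab, it can be worth confirming that the current runtime actually has a GPU attached (in Colab this is chosen under Runtime > Change runtime type). The following is a minimal sketch that assumes the NVIDIA `nvidia-smi` tool is what reports the GPU; the helper name `describe_gpu` is hypothetical.

```python
import shutil
import subprocess


def describe_gpu():
    """Return one 'name, total memory' line per NVIDIA GPU, or None if none is visible."""
    # nvidia-smi is only present on runtimes with an NVIDIA driver installed.
    if shutil.which("nvidia-smi") is None:
        return None
    try:
        proc = subprocess.run(
            ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader"],
            capture_output=True,
            text=True,
            timeout=30,
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    if proc.returncode != 0:
        return None
    return proc.stdout.strip() or None


print(describe_gpu() or "No GPU detected; Whisper would run on the CPU.")
```

If this reports no GPU, Whisper should still work, but the larger models are likely to be much slower.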

Step-By-Step Guide

Setup

The following command downloads and installs the most recent version of Whisper from its GitHub repository, along with its Python dependencies:

!pip install git+https://github.com/openai/whisper.git

To update an existing installation to the latest version, run:

!pip install --upgrade --no-deps --force-reinstall git+https://github.com/openai/whisper.git

Whisper also requires the ffmpeg command-line tool, which is available through most package managers. On Colab, which runs Ubuntu, it can be installed with apt:

!sudo apt update && sudo apt install -y ffmpeg
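Before moving on, the setup can be checked from a Python cell. This is a minimal sketch, assuming the package is registered with pip under the name `openai-whisper` (the name used by the Whisper repository) and that ffmpeg should be found on the PATH; the helper name `check_setup` is hypothetical.

```python
import shutil
from importlib import metadata


def check_setup():
    """Report the installed Whisper version and the ffmpeg location (None if missing)."""
    try:
        whisper_version = metadata.version("openai-whisper")
    except metadata.PackageNotFoundError:
        whisper_version = None
    # shutil.which returns the full path of the executable, or None if it is not on PATH.
    return {"whisper": whisper_version, "ffmpeg": shutil.which("ffmpeg")}


print(check_setup())
```

A `None` for either entry suggests that the corresponding install command above has not been run in the current runtime.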

Usage (Command-line based)

Size    Parameters  English-only model  Multilingual model  Required VRAM  Relative speed
tiny    39 M        tiny.en             tiny                ~1 GB          ~32x
base    74 M        base.en             base                ~1 GB          ~16x
small   244 M       small.en            small               ~2 GB          ~6x
medium  769 M       medium.en           medium              ~5 GB          ~2x
large   1550 M      N/A                 large               ~10 GB         1x

Recommended: medium
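If the GPU memory of your runtime is known, the table can be turned into a small lookup. The sketch below copies the approximate VRAM figures from the table; they are rough guides rather than exact requirements, and the helper name `largest_model_that_fits` is hypothetical.

```python
# (model name, approximate required VRAM in GB), smallest first, taken from the table above.
MODEL_VRAM_GB = [
    ("tiny", 1),
    ("base", 1),
    ("small", 2),
    ("medium", 5),
    ("large", 10),
]


def largest_model_that_fits(available_vram_gb):
    """Return the largest model whose approximate VRAM need fits, or None if none fits."""
    best = None
    for name, needed_gb in MODEL_VRAM_GB:
        if needed_gb <= available_vram_gb:
            best = name
    return best


print(largest_model_that_fits(6))
```

Larger models are slower, so the largest model that fits is not always the best choice; the relative speed column is worth weighing as well.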

The following command will transcribe speech in audio files, using the medium model:

!whisper "[Add your audio file, Example: english.wav]" --model medium
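The same transcription can also be run from a Python cell instead of the command line. This is a minimal sketch, assuming the Whisper package installed in the Setup step and its `load_model` and `transcribe` functions. The helper name `transcribe_file` and its `load_model` parameter are illustrative additions, not part of Whisper; the parameter exists only so that a different loader can be swapped in.

```python
def transcribe_file(audio_path, model_name="medium", load_model=None, **options):
    """Transcribe one audio file and return the recognised text."""
    if load_model is None:
        # Imported lazily so this cell can be defined before Whisper is installed.
        import whisper

        load_model = whisper.load_model
    model = load_model(model_name)  # downloads the model weights on first use
    result = model.transcribe(audio_path, **options)
    return result["text"].strip()


# Example call, assuming a file named english.wav has been uploaded to the runtime:
# print(transcribe_file("english.wav"))
```

Extra keyword arguments are passed straight through to `transcribe`, so an option such as `language="ja"` can be supplied in the same call.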

The default setting (which selects the small model) works well for transcribing English. To transcribe an audio file containing non-English speech, you can specify the language using the --language option:

!whisper "[Add your language-specific audio file, Example: japanese.wav]" --language [Add language, Example: Japanese]

Adding --task translate will translate the speech into English:

!whisper "[Add your language-specific audio file, Example: japanese.wav]" --language [Add language, Example: Japanese] --task translate

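When several files need the same options, it can be convenient to assemble the command in Python rather than typing it for each file. The sketch below uses only the `--model`, `--language` and `--task` options shown above; the helper name `build_whisper_command` is hypothetical.

```python
import shlex


def build_whisper_command(audio_path, model="medium", language=None, task=None):
    """Build the whisper command line as a list of arguments."""
    cmd = ["whisper", audio_path, "--model", model]
    if language is not None:
        cmd += ["--language", language]
    if task is not None:
        cmd += ["--task", task]
    return cmd


cmd = build_whisper_command("japanese.wav", language="Japanese", task="translate")
print(shlex.join(cmd))  # the command as it would be typed in a shell
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` to run the command from a notebook cell.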

Run the following to view all available options:

!whisper --help

Outro

Thank you for reading this post. I hope you found it useful. Please feel free to leave any questions or comments below; I'd be delighted to hear from you.
If you liked this article, please share it with your friends and followers.
Once more, thanks for reading! I appreciate your support.
