Tutorial: Play with a Speech-to-Text API using Node.js
Yongchang He
Posted on March 16, 2022
Play with a Deepgram API that converts an audio file or audio stream into written text
I wrote this post to record each step I took, and to keep notes for myself, while learning Node.js.
If you are also interested and want to get your hands dirty, just follow the steps below and have fun!
Prerequisites
- Have Node.js installed
- Have a command line interface (CLI / terminal)
- Have your favourite code IDE (e.g. VS Code)
- Have created a Deepgram account
Getting started
We should first navigate to our preferred directory and create a folder (e.g. named sttApp) using this command:
mkdir sttApp
Then open the folder in your favourite IDE; mine is VS Code. For now the directory is empty.
Next, in the terminal, navigate into the new /sttApp directory:
cd sttApp
Then run the following command to initialize a new application:
npm init
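Press Enter several times to accept the default values. npm then shows the package.json it is about to write and asks for confirmation; press Enter once more to create the file.
Next, we install the Deepgram Node.js SDK using the following command. Note that the code in this tutorial uses the Deepgram class as the SDK exposed it when this post was written (March 2022); newer major versions of the SDK may expose a different interface.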
npm install @deepgram/sdk
If all the previous steps worked, your project directory should now contain a node_modules folder, package.json, and package-lock.json.
Now, in the project directory (/sttApp), create a file named index.js, paste the following code into it, and save the file:
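If you would rather check the install from the terminal than from the IDE, a small script can look for the SDK in package.json. This is only a sketch: the helper name hasDependency is made up for this example, and the script assumes it is run from /sttApp.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical helper (not part of npm or the Deepgram SDK):
// returns true if `name` is listed under "dependencies" in a parsed package.json.
function hasDependency(pkg, name) {
  const deps = (pkg && pkg.dependencies) || {};
  return Object.prototype.hasOwnProperty.call(deps, name);
}

// Assumes the script is run from the project root (e.g. /sttApp).
const pkgPath = path.join(process.cwd(), 'package.json');
if (fs.existsSync(pkgPath)) {
  const pkg = JSON.parse(fs.readFileSync(pkgPath, 'utf8'));
  console.log('@deepgram/sdk listed:', hasDependency(pkg, '@deepgram/sdk'));
} else {
  console.log('No package.json found here; run npm init first.');
}
```

Save it under any name you like (for example check.js) and run it with node from the project folder.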
const { Deepgram } = require('@deepgram/sdk');
const fs = require('fs');

// The API key from your Deepgram dashboard
const deepgramApiKey = 'YOUR_API_KEY';

// Replace with your file path and audio mimetype
const pathToFile = 'SOME_FILE.wav';
const mimetype = 'audio/wav';

// Initializes the Deepgram SDK
const deepgram = new Deepgram(deepgramApiKey);

console.log('Requesting transcript...');
console.log('Your file may take up to a couple minutes to process.');
console.log('While you wait, did you know that Deepgram accepts over 40 audio file formats? Even MP4s.');
console.log('To learn more about customizing your transcripts check out developers.deepgram.com.');

// Read the whole audio file into a buffer and send it for transcription
deepgram.transcription.preRecorded(
  { buffer: fs.readFileSync(pathToFile), mimetype },
  { punctuate: true, language: 'en-US' },
)
  .then((transcription) => {
    // Print the full response, including nested arrays
    console.dir(transcription, { depth: null });
  })
  .catch((err) => {
    // Report request or file errors on stderr
    console.error(err);
  });
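The same request can also be written with async/await, which some readers find easier to follow than a .then() chain. The following is a minimal sketch, not the SDK's recommended pattern: transcribeFile is a hypothetical helper name, and the client argument is assumed to be the deepgram object created in index.js.

```javascript
const fsPromises = require('fs/promises');

// Sketch only: `client` is assumed to expose transcription.preRecorded,
// like the `deepgram` object created in index.js.
async function transcribeFile(client, pathToFile, mimetype) {
  // Read the audio file without blocking the event loop
  const buffer = await fsPromises.readFile(pathToFile);
  // Same source and options as in index.js
  return client.transcription.preRecorded(
    { buffer, mimetype },
    { punctuate: true, language: 'en-US' },
  );
}
```

In index.js it could be called as transcribeFile(deepgram, pathToFile, mimetype) and followed by the same .then() and .catch() handlers shown above.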
The next step is to log in to your Deepgram account, navigate to your Dashboard, and choose Get a Transcript via API or SDK.
Click Reveal Key and copy your API KEY SECRET.
In the next step, paste your API KEY SECRET into index.js, replacing the 'YOUR_API_KEY' placeholder assigned to deepgramApiKey.
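Pasting the key into the file is fine for a quick experiment, but it is easy to commit it to Git by accident. One common alternative is to read it from an environment variable. This is a sketch under assumptions: the variable name DEEPGRAM_API_KEY and the helper getApiKey are choices made for this example, not something the SDK requires.

```javascript
// Assumption: the key is stored in an environment variable named
// DEEPGRAM_API_KEY (this name is the example's choice, not an SDK rule).
function getApiKey(env = process.env) {
  const key = env.DEEPGRAM_API_KEY;
  if (!key) {
    // Fail early with a readable message instead of sending an empty key
    throw new Error('Set the DEEPGRAM_API_KEY environment variable first.');
  }
  return key;
}
```

With this in index.js, the constant becomes const deepgramApiKey = getApiKey(); and on macOS or Linux the app can be started with DEEPGRAM_API_KEY=your_key node index.js.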
Then replace the values of pathToFile and mimetype with the path and MIME type of your own audio file.
(Hint: open a new terminal, navigate to the directory where your audio file is located, and run pwd
to get its absolute path.)
Finally, let's run our application with the following command (make sure you are in /sttApp):
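As an alternative to looking up the absolute path and MIME type by hand, Node's built-in path module can derive both from a file name. The sketch below is illustrative only: describeAudioFile and the small MIMETYPES table are names invented for this example, and the table covers just a few common formats.

```javascript
const path = require('path');

// Small illustrative table; extend it for other formats you use.
const MIMETYPES = {
  '.wav': 'audio/wav',
  '.mp3': 'audio/mpeg',
  '.flac': 'audio/flac',
  '.ogg': 'audio/ogg',
};

// Hypothetical helper: returns the absolute path and MIME type for a file name.
function describeAudioFile(file) {
  const ext = path.extname(file).toLowerCase();
  const mimetype = MIMETYPES[ext];
  if (!mimetype) {
    throw new Error(`Unknown audio extension: ${ext || '(none)'}`);
  }
  // path.resolve turns a relative path into an absolute one
  return { pathToFile: path.resolve(file), mimetype };
}
```

For example, const { pathToFile, mimetype } = describeAudioFile('SOME_FILE.wav'); would replace the two hand-written constants in index.js.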
node index.js
You will receive a JSON response that includes the transcript you want, along with word arrays, timings, and confidence scores.
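The full response is fairly large, so it can help to pull out only the parts you need. The sketch below assumes the response nests its data as results.channels[].alternatives[], with transcript, confidence, and words fields; check this against the output you actually get. The summarize helper and the sample object are made up for illustration.

```javascript
// Assumed response shape: results.channels[].alternatives[] with
// `transcript`, `confidence`, and `words` (each word has start/end times).
function summarize(response) {
  const alt = response?.results?.channels?.[0]?.alternatives?.[0];
  if (!alt) return null; // nothing recognisable in the response
  return {
    transcript: alt.transcript,
    confidence: alt.confidence,
    // One short line per word: the word and its time range in seconds
    words: (alt.words || []).map((w) => `${w.word} [${w.start}s-${w.end}s]`),
  };
}

// Made-up sample data, only to show the shape; not real API output.
const sample = {
  results: {
    channels: [{
      alternatives: [{
        transcript: 'hello world',
        confidence: 0.98,
        words: [
          { word: 'hello', start: 0, end: 0.4, confidence: 0.99 },
          { word: 'world', start: 0.5, end: 0.9, confidence: 0.97 },
        ],
      }],
    }],
  },
};

console.log(summarize(sample));
```

In index.js, calling summarize(transcription) inside the .then() handler would print this shorter summary instead of the whole response.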
Pretty COOL!
If anything above is still unclear, please feel free to leave a message below or refer to my Git repository for the whole project: linkToGit
References
https://console.deepgram.com/project/850abca5-449a-47fa-8c40-6a463e59ad00/mission/transcript-via-api-or-sdk
https://dev.to/devteam/join-us-for-a-new-kind-of-hackathon-on-dev-brought-to-you-by-deepgram-2bjd
Overview of My Submission
A tutorial for beginners to learn Node.js using the speech-to-text (STT) API from Deepgram.
Submission Category:
Analytics Ambassadors
Link to Code on GitHub
Additional Resources / Info
None