Pipe a file stream from AWS S3 to OpenAI Whisper in Node.js
Anton Dosov
Posted on March 2, 2023
As of 2023-03-01, the OpenAI Whisper API expects the audio file to be uploaded as part of a multipart/form-data
POST request. I initially struggled to use this API from Node.js with an audio file stored in an AWS S3 bucket, so I thought I'd share a working snippet.
The API usage example from the OpenAI docs is the following:
curl --request POST \
  --url https://api.openai.com/v1/audio/transcriptions \
  --header 'Authorization: Bearer TOKEN' \
  --header 'Content-Type: multipart/form-data' \
  --form file=@/path/to/file/openai.mp3 \
  --form model=whisper-1
The Node.js + AWS S3 snippet that worked for me looks like this:
import fetch from "node-fetch";
import AWS from "aws-sdk";
import FormData from "form-data";
// Set the region and access keys for your AWS account
AWS.config.update({
  region: 'eu-central-1',
  accessKeyId: '***',
  secretAccessKey: '***'
});
// Create an S3 client
const s3 = new AWS.S3();
// Set the bucket and file key
const bucketName = 'openai-sample';
const fileKey = 'path/to/file/openai.mp3';
// Set the parameters shared by the S3 headObject and getObject operations
const params = {
  Bucket: bucketName,
  Key: fileKey
};
// Get the audio metadata to retrieve its size and content type
s3.headObject(params, function(err, data) {
  if (err) throw err;
  // Get a readable stream for the object
  const s3Stream = s3.getObject(params)
    .createReadStream();
  // Create the form data to be sent to the Whisper API
  const formData = new FormData();
  // Append the stream as the file part; knownLength lets form-data
  // account for the size of the streamed part
  formData.append('file', s3Stream, {
    contentType: data.ContentType,
    knownLength: data.ContentLength,
    filename: fileKey
  });
  formData.append('model', 'whisper-1');
  fetch('https://api.openai.com/v1/audio/transcriptions', {
    method: 'POST',
    body: formData,
    headers: {
      'Authorization': 'Bearer TOKEN',
      // Adds the multipart Content-Type header, including the boundary
      ...formData.getHeaders()
    }
  })
    .then(res => res.json())
    .then(json => console.log(json.text))
    .catch(err => console.error(err));
});
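If you are on Node.js 18 or later, a variation without node-fetch and form-data might look like the sketch below. It relies on the global fetch, FormData, Blob and Response, and it buffers the whole file in memory instead of streaming it, which is a different trade-off from the snippet above and is only reasonable for small files. The helper names (streamToBuffer, buildWhisperForm, readTranscription, transcribe) are hypothetical and exist only for illustration; I have not run this against the live API.

```javascript
// Assumption: Node.js 18+, where fetch, FormData, Blob and Response are globals.

// Hypothetical helper: collect a readable stream (or any async iterable) into one Buffer.
async function streamToBuffer(stream) {
  const chunks = [];
  for await (const chunk of stream) {
    chunks.push(Buffer.isBuffer(chunk) ? chunk : Buffer.from(chunk));
  }
  return Buffer.concat(chunks);
}

// Hypothetical helper: build the multipart body with the same two fields as the curl example.
async function buildWhisperForm(stream, filename, contentType) {
  const buffer = await streamToBuffer(stream);
  const form = new FormData();
  form.append("file", new Blob([buffer], { type: contentType }), filename);
  form.append("model", "whisper-1");
  return form;
}

// Hypothetical helper: throw on HTTP errors instead of silently logging `undefined`.
async function readTranscription(res) {
  if (!res.ok) {
    throw new Error(`Whisper request failed: ${res.status} ${await res.text()}`);
  }
  const json = await res.json();
  return json.text;
}

// Hypothetical wrapper tying the pieces together; `token` is your OpenAI API key.
async function transcribe(stream, filename, contentType, token) {
  const form = await buildWhisperForm(stream, filename, contentType);
  const res = await fetch("https://api.openai.com/v1/audio/transcriptions", {
    method: "POST",
    // No Content-Type here: fetch sets the multipart boundary itself.
    headers: { Authorization: `Bearer ${token}` },
    body: form,
  });
  return readTranscription(res);
}
```

Inside the headObject callback you would then call something like transcribe(s3Stream, fileKey, data.ContentType, 'TOKEN') and log the resolved text.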