How to Add AI Dubbing to App

jacksssson

Jackson

Posted on March 14, 2022

Text-to-speech (TTS) is highly sought after by audio/video editors, thanks to its ability to automatically turn text into natural-sounding speech as a low-cost alternative to human dubbing. It can be used on all kinds of video, regardless of length.

I recently stumbled upon the AI dubbing capability of HMS Core Audio Editor Kit, which does just that. It turns input text into speech with just a tap, and comes with a selection of smooth, natural-sounding male and female timbres.

This is ideal for developing apps that involve e-books, audio content creation, and audio/video editing. Here is how I integrated this capability.

Making Preparations

Complete all necessary preparations by following the official guide.

Configuring the Project

1. Set the app authentication information

The information can be set via an API key or access token (recommended).

Use setAccessToken to set an access token during app initialization.

HAEApplication.getInstance().setAccessToken("your access token");

Or, use setApiKey to set an API key during app initialization. The API key needs to be set only once.

HAEApplication.getInstance().setApiKey("your ApiKey");
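
Since the API key only needs to be set once, a small guard can make repeated initialization calls harmless. The sketch below is my own illustration and not part of the kit: OneTimeCredential is a hypothetical helper, and a plain Consumer<String> stands in for the real setter so that the snippet compiles without the SDK.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Consumer;

/** Hypothetical helper: forwards a credential to a setter at most once. */
final class OneTimeCredential {
    private final AtomicBoolean applied = new AtomicBoolean(false);
    private final Consumer<String> setter;

    OneTimeCredential(Consumer<String> setter) {
        this.setter = setter;
    }

    /**
     * Passes the credential to the setter on the first call only.
     * Returns true if the setter was invoked, false if it had already run.
     */
    boolean applyOnce(String credential) {
        if (credential == null || credential.isEmpty()) {
            throw new IllegalArgumentException("credential must not be empty");
        }
        // compareAndSet lets exactly one caller win, even across threads.
        if (applied.compareAndSet(false, true)) {
            setter.accept(credential);
            return true;
        }
        return false;
    }
}
```

In an app, the setter could be wired up as new OneTimeCredential(key -> HAEApplication.getInstance().setApiKey(key)). Note that the guard is flipped before the setter runs, so if the setter throws, later calls will not retry.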

2. Initialize the runtime environment

Initialize HuaweiAudioEditor, and create a timeline and necessary lanes.

// Create a HuaweiAudioEditor instance.
HuaweiAudioEditor mEditor = HuaweiAudioEditor.create(mContext);
// Initialize the runtime environment of HuaweiAudioEditor.
mEditor.initEnvironment();
// Create a timeline.
HAETimeLine mTimeLine = mEditor.getTimeLine();
// Create a lane.
HAEAudioLane audioLane = mTimeLine.appendAudioLane();

Import audio.

// Add an audio asset to the end of the lane.
HAEAudioAsset audioAsset = audioLane.appendAudioAsset("/sdcard/download/test.mp3", mTimeLine.getCurrentTime());

3. Integrate AI dubbing

Call HAEAiDubbingEngine to implement AI dubbing.

// Configure the AI dubbing engine.
HAEAiDubbingConfig haeAiDubbingConfig = new HAEAiDubbingConfig()
    // Set the volume.
    .setVolume(volumeVal)
    // Set the speech speed.
    .setSpeed(speedVal)
    // Set the speaker.
    .setType(defaultSpeakerType);
// Create a callback for an AI dubbing task.
HAEAiDubbingCallback callback = new HAEAiDubbingCallback() {
    @Override
    public void onError(String taskId, HAEAiDubbingError err) {
        // Callback when an error occurs.
    }
    @Override
    public void onWarn(String taskId, HAEAiDubbingWarn warn) {}
    @Override
    public void onRangeStart(String taskId, int start, int end) {}
    @Override
    public void onAudioAvailable(String taskId, HAEAiDubbingAudioInfo haeAiDubbingAudioFragment, int i, Pair<Integer, Integer> pair, Bundle bundle) {
        // Start receiving and then saving the file.
    }
    @Override
    public void onEvent(String taskId, int eventID, Bundle bundle) {
        // Synthesis is complete.
        if (eventID == HAEAiDubbingConstants.EVENT_SYNTHESIS_COMPLETE) {
            // The AI dubbing task is complete: all synthesized audio data has been processed.
        }
    }
    @Override
    public void onSpeakerUpdate(List<HAEAiDubbingSpeaker> speakerList, List<String> lanList,
         List<String> lanDescList) { }
};
// AI dubbing engine.
HAEAiDubbingEngine mHAEAiDubbingEngine = new HAEAiDubbingEngine(haeAiDubbingConfig);
// Set the listener for the playback process of an AI dubbing task.
mHAEAiDubbingEngine.setAiDubbingCallback(callback);
// Convert text to speech and play the speech. In the method, text indicates the text to be converted to speech, and mode indicates the mode for playing the converted audio.
String taskId = mHAEAiDubbingEngine.speak(text, mode);
// Pause playback.
mHAEAiDubbingEngine.pause();
// Resume playback.
mHAEAiDubbingEngine.resume();
// Stop AI dubbing.
mHAEAiDubbingEngine.stop();
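
The onAudioAvailable callback above only hints at "receiving and then saving the file". One way to handle that step is to collect the raw audio fragments and wrap them in a WAV header once EVENT_SYNTHESIS_COMPLETE arrives. The following is a minimal sketch under my own assumptions: PcmWavBuffer is a hypothetical helper, the fragments are assumed to be raw PCM, and the sample rate, channel count, and bit depth must be taken from the kit's documentation rather than from this example.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper: collects raw PCM fragments and wraps them in a WAV container. */
final class PcmWavBuffer {
    private final ByteArrayOutputStream pcm = new ByteArrayOutputStream();
    private final int sampleRate;
    private final int channels;
    private final int bitsPerSample;

    PcmWavBuffer(int sampleRate, int channels, int bitsPerSample) {
        this.sampleRate = sampleRate;
        this.channels = channels;
        this.bitsPerSample = bitsPerSample;
    }

    /** Appends one fragment, e.g. the bytes received in onAudioAvailable. */
    synchronized void append(byte[] fragment) {
        pcm.write(fragment, 0, fragment.length);
    }

    /** Returns a 44-byte RIFF/WAVE header followed by all PCM data collected so far. */
    synchronized byte[] toWavBytes() {
        byte[] data = pcm.toByteArray();
        int blockAlign = channels * bitsPerSample / 8;
        int byteRate = sampleRate * blockAlign;

        ByteBuffer out = ByteBuffer.allocate(44 + data.length).order(ByteOrder.LITTLE_ENDIAN);
        out.put("RIFF".getBytes(StandardCharsets.US_ASCII));
        out.putInt(36 + data.length);            // size of everything after this field
        out.put("WAVE".getBytes(StandardCharsets.US_ASCII));
        out.put("fmt ".getBytes(StandardCharsets.US_ASCII));
        out.putInt(16);                          // fmt chunk size for plain PCM
        out.putShort((short) 1);                 // audio format 1 = uncompressed PCM
        out.putShort((short) channels);
        out.putInt(sampleRate);
        out.putInt(byteRate);
        out.putShort((short) blockAlign);
        out.putShort((short) bitsPerSample);
        out.put("data".getBytes(StandardCharsets.US_ASCII));
        out.putInt(data.length);
        out.put(data);
        return out.array();
    }
}
```

In the callback, each fragment's bytes would be passed to append (check the API reference for how HAEAiDubbingAudioInfo exposes them), and the result of toWavBytes could be written to disk when synthesis completes. Keeping everything in memory is fine for short clips; for long texts, streaming to a file and patching the header afterwards would be the safer design.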

Result

In the demo below, I successfully implemented the AI dubbing function in my app. Now I can convert text into emotionally expressive speech, using default and custom timbres.

Image description

To learn more, please visit:

Audio Editor Kit official website
Audio Editor Kit Development Guide
Reddit to join developer discussions
GitHub to download the sample code
Stack Overflow to solve integration problems

Follow our official account for the latest HMS Core-related news and updates.
