Integrating near-real-time, free multi-language translation into your application with the Chrome AI API

sagacheng

Saga

Posted on June 11, 2024


The latest Chrome Dev (version 127.0.6512.0 and above) ships with built-in AI capabilities, which are exposed behind experimental flags.

Download the latest Chrome Dev: https://www.google.com/intl/en_us/chrome/dev/

Chrome Dev Configuration

  1. Verify that your Chrome Dev version is 127.0.6512.0 or higher.
  2. Open chrome://flags/#optimization-guide-on-device-model in the address bar and choose Enabled BypassPerfRequirement so that the model can download smoothly.
  3. Open chrome://flags/#prompt-api-for-gemini-nano in the address bar and select Enabled.
  4. Wait for the model to finish downloading. You can check the progress at chrome://components/. If the download does not start automatically, click Check for update to force it; the download is about 1 GB. Once you see Version: 2024.65.2205, the model is ready. Restart Chrome Dev.
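The version check in step 1 can also be done from code. The following is only a minimal sketch and not part of the original setup: isVersionAtLeast and getChromeFullVersion are hypothetical helper names, and reading the full version through navigator.userAgentData assumes a Chromium-based browser.

```javascript
// Compare two dotted version strings numerically, e.g. "127.0.6512.0".
// Missing segments are treated as 0.
function isVersionAtLeast(actual, minimum) {
  const a = actual.split(".").map(Number);
  const m = minimum.split(".").map(Number);
  const length = Math.max(a.length, m.length);
  for (let i = 0; i < length; i++) {
    const left = a[i] ?? 0;
    const right = m[i] ?? 0;
    if (left !== right) return left > right;
  }
  return true; // the versions are equal
}

// Hypothetical helper (assumption): reads the full browser version from the
// User-Agent Client Hints API. Returns null if no "Chromium" brand is listed.
async function getChromeFullVersion(nav = globalThis.navigator) {
  const data = await nav.userAgentData.getHighEntropyValues(["fullVersionList"]);
  const entry = data.fullVersionList.find((item) => item.brand === "Chromium");
  return entry ? entry.version : null;
}
```

In the DevTools console you could call getChromeFullVersion() and, if it returns a version string, pass it to isVersionAtLeast together with "127.0.6512.0".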


API Capability Testing

Open DevTools with Cmd + Option + I (on macOS), switch to the Console, and run await window.ai.canCreateTextSession();. If it returns "readily", the model is ready to use.

[Screenshot: checking the model loading status]
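In application code, the same check can be wrapped in a small guard so that the feature simply switches off when the model is unavailable. This is a minimal sketch: isModelReady is a hypothetical name, and it assumes the experimental window.ai object of this Chrome Dev build, which may look different in later versions.

```javascript
// Returns true only when the on-device model reports "readily".
// `ai` defaults to the experimental window.ai object described in this post;
// passing it in as a parameter also makes the function easy to test.
async function isModelReady(ai = globalThis.ai) {
  if (!ai || typeof ai.canCreateTextSession !== "function") {
    return false; // flags not enabled, or the API is not present in this browser
  }
  const status = await ai.canCreateTextSession();
  return status === "readily";
}
```

Any value other than "readily" is treated as "not ready", so the caller can fall back to the untranslated text.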

Case Study 1: Rewriting the Tone of a Text

With just two lines of code, we can rewrite a text in a different tone, a task that many people struggle with. The response is extremely fast, and privacy is excellent.

[Screenshot: rewriting the tone of the text]
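The screenshot's exact code is not reproduced here, so the following is only a sketch of what such a call could look like. It assumes that this experimental build exposes createTextSession() and session.prompt() alongside canCreateTextSession(); the prompt wording and the names buildRewritePrompt and rewriteTone are hypothetical.

```javascript
// Hypothetical prompt wording; adjust it to your own use case.
function buildRewritePrompt(text, tone) {
  return `Rewrite the following text in a ${tone} tone. Reply with the rewritten text only.\n\n${text}`;
}

// Assumption: ai.createTextSession() and session.prompt() belong to the
// experimental API of this Chrome Dev build and may change in later versions.
async function rewriteTone(text, tone, ai = globalThis.ai) {
  const session = await ai.createTextSession();
  return session.prompt(buildRewritePrompt(text, tone));
}
```

In the console this could be called as await rewriteTone("send me the report now", "polite"). The quality of the result depends on the model and on the prompt wording.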

Case Study 2: Text Translation

Translation is fast and free, which makes it much easier to add multi-language display to any application.

[Screenshot: text translation]
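For repeated strings such as UI labels, it may help to cache translations so the model is only prompted once per text. The sketch below is one possible approach, not the code from the screenshot: createTranslator is a hypothetical name, the prompt wording is made up, and createTextSession() and session.prompt() are assumed to exist in this experimental build.

```javascript
// Creates a translate(text, targetLanguage) function with an in-memory cache.
// Assumption: ai.createTextSession() and session.prompt() are provided by the
// experimental window.ai API of this Chrome Dev build.
function createTranslator(ai = globalThis.ai) {
  const cache = new Map(); // key: target language + text

  return async function translate(text, targetLanguage) {
    const key = `${targetLanguage}\n${text}`;
    if (cache.has(key)) return cache.get(key);

    const session = await ai.createTextSession();
    const prompt =
      `Translate the following text into ${targetLanguage}. ` +
      `Reply with the translation only.\n\n${text}`;
    const result = (await session.prompt(prompt)).trim();

    cache.set(key, result);
    return result;
  };
}
```

Usage could look like const translate = createTranslator(); followed by await translate("Good morning", "French"). The cache lives only as long as the page, which keeps the sketch simple.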

Integration within the application

Our app, https://timmerse.com, is a customizable 3D immersive world for work and entertainment. You create a space for immersive connections between people; by combining video calls with custom 3D worlds and integrated AI NPCs, it makes gatherings at work and in everyday life more creative and enjoyable.

When a video plays in the OpenDay scene, we translate the original English subtitles in real time and display them as bilingual subtitles, choosing the target language from the user's Chrome language preference.

[Screenshot: timmerse.com]
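The subtitle logic can be sketched roughly as follows. This is not the actual timmerse.com code: languageNameFor and toBilingualSubtitle are hypothetical helpers, and the translate argument stands for any function that takes a text and a language name and resolves to the translated text.

```javascript
// Turns a language tag such as "fr" or "zh-CN" into an English language name
// that can be used in a translation prompt.
function languageNameFor(locale) {
  const names = new Intl.DisplayNames(["en"], { type: "language" });
  return names.of(locale) ?? locale;
}

// Builds a two-line subtitle: the original English line, then its translation.
// `translate` is supplied by the caller (assumption: it resolves to a string).
async function toBilingualSubtitle(englishLine, locale, translate) {
  // Users whose preferred language is English only need the original line.
  if (locale.split("-")[0].toLowerCase() === "en") return englishLine;

  const translated = await translate(englishLine, languageNameFor(locale));
  return `${englishLine}\n${translated}`;
}
```

In the browser, the locale would typically come from navigator.language, for example await toBilingualSubtitle(line, navigator.language, translate).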

Of course, an LLM is useful for much more than translation. As on-device and multimodal models become widespread, they will change how people interact with their devices in many ways and make both life and work more efficient.
