Integrating near-real-time, free multi-language translation into an application with the Chrome AI API
Saga
Posted on June 11, 2024
The latest Chrome Dev (version 127.0.6512.0 and above) ships with built-in AI capabilities, exposed through experimental flags.
Download the latest Chrome Dev: https://www.google.com/intl/en_us/chrome/dev/
Chrome Dev Configuration
- Verify that the Chrome Dev version is 127.0.6512.0 or higher.
- In the address bar, open chrome://flags/#optimization-guide-on-device-model and choose Enabled BypassPerfRequirement so that the model can download smoothly.
- In the address bar, open chrome://flags/#prompt-api-for-gemini-nano and select Enabled.
- Wait for the model to finish downloading. You can check whether the download is complete at chrome://components/. If it does not start downloading automatically, click Check for update to force the download, which is about 1 GB. When you see Version: 2024.65.2205, the model is ready to use. Restart Chrome Dev.
API Capability Testing
Open the DevTools console with Cmd + Option + I (Ctrl + Shift + I on Windows and Linux) and enter:
await window.ai.canCreateTextSession();
If it returns "readily", the API is ready to use.
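In application code, the same check can be wrapped in a small helper so the feature is only offered when the model is available. The following is a minimal sketch, not an official pattern: the helper name isModelReady is hypothetical, and the ai object is passed in as a parameter (window.ai in the browser) so the function can also be tried with a stub.

```javascript
// Sketch: report whether the on-device model can be used right now.
// `ai` is expected to be window.ai in Chrome Dev; it is a parameter so the
// function can also be tried with a stub outside the browser.
async function isModelReady(ai) {
  // Feature-detect first: the flags may be off or the browser too old.
  if (!ai || typeof ai.canCreateTextSession !== "function") {
    return false;
  }
  // Per the article, "readily" means the model is usable immediately.
  const status = await ai.canCreateTextSession();
  return status === "readily";
}

// In the DevTools console:
//   await isModelReady(window.ai)
```

Returning a plain boolean keeps the calling code simple: the application can hide its AI features whenever the result is false.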
Case 1: Rewriting the tone of the text
With just two lines of code, we can solve a problem that troubles many people: finding the right way to express something. Because the model runs on the device, the rewrite is extremely fast and the text stays private.
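A tone rewrite along these lines might look like the sketch below. It assumes that the experimental window.ai object also exposes createTextSession() and that the returned session has a prompt() method resolving to a string; the article's text only names canCreateTextSession(), and the API is experimental, so these names may differ in other Chrome versions. The helper name rewriteTone and the prompt wording are hypothetical.

```javascript
// Sketch: rewrite `text` in the requested tone with the on-device model.
// ASSUMPTION: `ai` (window.ai in Chrome Dev) exposes createTextSession(),
// and the session exposes prompt(string) resolving to a string.
// Only canCreateTextSession() is named in the article itself.
async function rewriteTone(ai, text, tone) {
  const session = await ai.createTextSession();
  const reply = await session.prompt(
    `Rewrite the following text in a ${tone} tone. ` +
      `Reply with the rewritten text only.\n\n${text}`
  );
  // Models often pad replies with whitespace; trim before displaying.
  return reply.trim();
}

// In the DevTools console:
//   await rewriteTone(window.ai, "Send me the report now.", "polite")
```

The two calls that do the actual work are createTextSession() and prompt(); everything else is prompt construction.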
Case 2: Text Translation
Text translation is just as fast and free, which makes multi-language display much easier to add to any application.
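Translation can be sketched as a prompt sent to the on-device model. As an assumption, this example relies on createTextSession() and session.prompt(), which the article's text does not name and which may change while the API is experimental. The function names and the prompt wording are hypothetical, and translation quality depends on the model.

```javascript
// Builds the instruction sent to the model (hypothetical wording).
function buildTranslationPrompt(text, targetLanguage) {
  return (
    `Translate the following text into ${targetLanguage}. ` +
    `Reply with the translation only.\n\n${text}`
  );
}

// Sketch: translate `text` into `targetLanguage` (e.g. "French").
// ASSUMPTION: `ai` (window.ai in Chrome Dev) exposes createTextSession(),
// and the session exposes prompt(string) resolving to a string.
async function translateText(ai, text, targetLanguage) {
  // Skip the model call when there is nothing to translate.
  if (text.trim() === "") {
    return "";
  }
  const session = await ai.createTextSession();
  const reply = await session.prompt(buildTranslationPrompt(text, targetLanguage));
  return reply.trim();
}

// In the DevTools console:
//   await translateText(window.ai, "Good morning", "French")
```

Keeping the prompt builder separate makes it easy to adjust the wording without touching the session handling.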
Integration within the application
Our app, https://timmerse.com, is a customizable 3D immersive world for work and entertainment. It lets you create a space for immersive connections between people, combining video calls, custom 3D worlds, and AI NPCs to make gatherings at work and in life more creative and enjoyable.
When a video plays in the OpenDay scene, we can translate the original English subtitles in real time and display them as bilingual subtitles, based on the user's Chrome language preference.
Of course, LLMs are not just for translation. As on-device and multimodal models become widespread, they will certainly change how people interact with devices in many ways and improve the efficiency of life and work.
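One way to build a bilingual subtitle line from the browser language is sketched below. This is an illustration, not the app's actual implementation: toBilingualCue is a hypothetical helper, and translate is a hypothetical callback of the form (text, languageName) => Promise<string>, which in Chrome Dev could be backed by the on-device model. In the browser, the language tag would come from navigator.language.

```javascript
// Sketch: turn an English subtitle into a two-line bilingual cue.
// `languageTag` is a BCP 47 tag such as "zh-CN" (navigator.language in the
// browser). `translate` is a hypothetical async callback:
//   (text, languageName) => Promise<string>
async function toBilingualCue(englishText, languageTag, translate) {
  // Use only the primary language subtag, e.g. "zh" from "zh-CN".
  const base = languageTag.split("-")[0].toLowerCase();
  // English-speaking users already understand the original subtitle.
  if (base === "en") {
    return englishText;
  }
  // Convert the code into an English language name for use in a prompt.
  const languageName = new Intl.DisplayNames(["en"], { type: "language" }).of(base);
  const translated = await translate(englishText, languageName);
  // Original line first, translation underneath.
  return `${englishText}\n${translated.trim()}`;
}

// In the browser, with some translate function of the shape above:
//   const cue = await toBilingualCue(subtitleText, navigator.language, translate);
```

Passing the translation step in as a callback keeps the language selection and formatting independent of whichever model API is used.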