You can now use Chrome's built-in AI in the official (stable) version of Chrome.
Saga
Posted on September 26, 2024
Live Demo: https://chrome-ai.edgeone.app
Chrome's built-in AI initially required filling out an application form and was only available in developer builds of Chrome. Now users can enable it in the official version in a few simple steps.
Once you have completed the configuration steps shown on the demo page, you can open the debug page. There you can edit the code and try out the capabilities of local AI directly.
Note: The Chrome API is still at the draft stage and may change significantly. The demo page was developed against Chrome version 129 and is not compatible with the API in version 128.
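Because the API differs between versions, a page can probe for an entry point before calling it. The following is a minimal sketch of that idea, not the demo's actual code. The property names `ai.assistant.create` and `ai.createTextSession` are assumptions about the draft shapes of the newer and older versions; the article does not list them, so check them against the Chrome build you are targeting.

```typescript
// Which draft API shape, if any, the given scope appears to expose.
type AiSurface = "assistant" | "textSession" | "none";

// Probe a global-like object for a built-in AI entry point.
// ASSUMPTION: `ai.assistant.create` (newer draft) and
// `ai.createTextSession` (older draft) are illustrative names only.
function detectAiSurface(scope: unknown): AiSurface {
  if (typeof scope !== "object" || scope === null) return "none";
  const ai = (scope as { ai?: unknown }).ai;
  if (typeof ai !== "object" || ai === null) return "none";

  const assistant = (ai as { assistant?: unknown }).assistant;
  if (
    typeof assistant === "object" &&
    assistant !== null &&
    typeof (assistant as { create?: unknown }).create === "function"
  ) {
    return "assistant";
  }
  if (
    typeof (ai as { createTextSession?: unknown }).createTextSession ===
    "function"
  ) {
    return "textSession";
  }
  return "none";
}

// Example call with a plain object standing in for the page's global scope.
console.log(detectAiSurface({ ai: {} }));
```

In a page, the function would be called with `globalThis`, and the caller would show a "not supported" message when the result is `"none"` instead of failing on a missing property.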
Why is there Chrome Local AI?
- In the past, AI applications usually relied on server-side models, which raises privacy concerns for some users.
- Some developers have tried running AI models in the browser, but a model is typically around a thousand times larger than the median webpage. Because models are not shared across websites, each site has to download its own copy when the page loads, which is costly for users.
Therefore, Chrome integrates Gemini Nano in the browser and exposes standard Web platform APIs, aiming to run on most desktops and laptops. With Chrome's built-in AI capabilities, your website can quickly execute AI-driven tasks without the need to deploy or manage your own AI models.
Today, a webpage can call the large model locally, keeping user data on the device, for tasks such as Q&A and translation.
Benefits of Built-in AI for Web Developers
- Simple Deployment: The browser distributes the models automatically, taking the device's capabilities into account and managing model updates. You are not responsible for downloading or updating large models over the network, and you do not have to deal with storage eviction, runtime memory limits, serving costs, and similar issues.
- Access to Hardware Acceleration: The browser's AI runtime is optimized to make full use of the available hardware, whether that is a GPU, an NPU, or a fallback to the CPU. As a result, your application can get the best performance available on each device.
Benefits of Running AI on Device
- Local Processing of Sensitive Data: On-device AI can strengthen privacy. For instance, if you handle sensitive data, you can offer AI features to users with end-to-end encryption.
- Responsive User Experience: In some cases, removing the round-trip to the server means results arrive almost instantly. On-device AI can be the difference between a viable feature and a suboptimal user experience.
- Broader Access to AI: Users' devices can take on part of the processing load in exchange for access to more features. For example, if you offer premium AI functionality, you can preview it with on-device AI so that potential customers see the benefits of your product at no extra cost to you. This hybrid approach can also help you manage inference costs, especially in frequently used user flows.
- Offline AI Usage: Your users can access AI features even without an internet connection. This means your website and web applications can function normally in offline or unstable network conditions.
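The hybrid approach mentioned in the list above can be sketched as an "on-device first, server as fallback" wrapper. This is an illustrative pattern under stated assumptions, not an API from Chrome: `Completer`, `HybridResult`, and `completeHybrid` are hypothetical names, and both model calls are passed in as plain functions.

```typescript
// A function that turns a prompt into a completion (hypothetical type).
type Completer = (prompt: string) => Promise<string>;

interface HybridResult {
  text: string;
  source: "on-device" | "server";
}

// Try the on-device model first; use the server when the on-device
// model is unavailable (null) or fails.
async function completeHybrid(
  prompt: string,
  onDevice: Completer | null,
  server: Completer,
): Promise<HybridResult> {
  if (onDevice !== null) {
    try {
      return { text: await onDevice(prompt), source: "on-device" };
    } catch {
      // Fall through to the server path.
    }
  }
  return { text: await server(prompt), source: "server" };
}

// Stand-in server call so the sketch runs anywhere.
const stubServer: Completer = async (p) => `server answer for: ${p}`;
completeHybrid("Summarize this page", null, stubServer).then((r) =>
  console.log(r.source, r.text),
);
```

Note that the fallback sends the prompt off the device. For flows that handle sensitive data, a caller might skip the server path and report that the feature is unavailable instead.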
Browser Architecture and APIs
The built-in AI capabilities are primarily accessed through task APIs, such as the summary and writing APIs described below. Task APIs are designed to run inference against the model best suited to the task.
In Chrome, these APIs are built to run inference against Gemini Nano, using fine-tuning or an expert model. Gemini Nano is designed to run locally on most modern devices and is best suited to language-related use cases such as summarization, rewriting, or classification.
Key Term: Fine-tuning is a dynamic approach to enhance a model's ability to perform specific tasks without the need to download a new model for each task.
- Prompt API: Send any task expressed in natural language to the built-in large language model (Gemini Nano in Chrome).
- Fine-tuning (LoRA) API: Adjust the model's weights using low-rank adaptation (LoRA) fine-tuning to improve the built-in LLM's performance on a task.
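To make the Prompt API idea concrete, here is a minimal sketch of sending one natural-language task to a session and releasing it afterwards. The interfaces are assumptions that loosely mirror a draft shape (a factory with `create()`, and a session with `prompt()` and `destroy()`); the article does not specify them, and `askOnce` and `echoFactory` are hypothetical names. The stand-in factory only echoes its input so the sketch runs outside Chrome.

```typescript
// ASSUMPTION: structural types for illustration; the real draft API
// may use different names and options.
interface PromptSession {
  prompt(input: string): Promise<string>;
  destroy(): void;
}

interface SessionFactory {
  create(): Promise<PromptSession>;
}

// Send one task to the model, then release the session even on failure.
async function askOnce(
  factory: SessionFactory,
  task: string,
): Promise<string> {
  const session = await factory.create();
  try {
    return await session.prompt(task);
  } finally {
    session.destroy();
  }
}

// Stand-in factory: echoes the task instead of running a model.
const echoFactory: SessionFactory = {
  async create(): Promise<PromptSession> {
    return {
      prompt: async (input: string) => `echo: ${input}`,
      destroy: () => {},
    };
  },
};

askOnce(echoFactory, "Translate 'hello' into French.").then((answer) =>
  console.log(answer),
);
```

In a supporting browser, the factory argument would be whatever entry point that Chrome version exposes. Keeping it as a parameter means the calling code does not change when the draft API is renamed.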
What capabilities can be provided to users?
- AI-enhanced content consumption: including summaries, translations, answering content-related questions, classification, and feature analysis.
- AI-supported content creation: including writing assistance, proofreading, grammar correction, and rewriting.
Summary API:
- Summaries of meeting notes for people who joined the meeting late or missed it entirely.
- Key points from support conversations, for customer relationship management.
- Sentence- or paragraph-sized summaries of multiple product reviews.
- Key points of long articles to help readers decide whether the article is relevant.
- Summaries of forum questions to help experts find the questions most relevant to their field of expertise.
Writing and rewriting API:
- Writing based on initial ideas and optional background. For example, writing a formal email to a bank requesting a credit limit increase, with the background being that you are a long-term customer.
- Optimizing existing content by adjusting the length or tone of the text. For example, rewriting a short email to make it sound more polite and formal.
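The tone and length adjustments described above can be expressed as options that are turned into a natural-language instruction. The sketch below is not Chrome's rewriting API; `Tone`, `Length`, `RewriteOptions`, and `buildRewritePrompt` are hypothetical names, and the output is just a prompt string that could be passed to any prompt-style interface.

```typescript
// Hypothetical option types for a rewrite request.
type Tone = "more-formal" | "more-casual" | "as-is";
type Length = "shorter" | "longer" | "as-is";

interface RewriteOptions {
  tone: Tone;
  length: Length;
  context?: string; // optional background, e.g. "long-term customer"
}

// Build a natural-language rewrite instruction from the options.
function buildRewritePrompt(text: string, options: RewriteOptions): string {
  const goals: string[] = [];
  if (options.tone === "more-formal") {
    goals.push("make the tone more polite and formal");
  }
  if (options.tone === "more-casual") {
    goals.push("make the tone more casual");
  }
  if (options.length === "shorter") goals.push("make it shorter");
  if (options.length === "longer") goals.push("make it longer");

  const instruction =
    goals.length > 0
      ? goals.join(" and ")
      : "improve clarity without changing tone or length";
  const background = options.context
    ? `\nBackground: ${options.context}`
    : "";
  return `Rewrite the following text; ${instruction}.${background}\n\nText:\n${text}`;
}

console.log(
  buildRewritePrompt("hey, can you raise my credit limit?", {
    tone: "more-formal",
    length: "as-is",
    context: "The sender is a long-term customer.",
  }),
);
```

Keeping tone and length as closed sets of values, rather than free text, makes it easier to offer them as simple controls in a page.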