Build a customer response application with MediaPipe, Chrome's Built-In Prompt API Locally

railsstudent

Connie Leung

Posted on September 24, 2024


In this blog post, I describe how to build a customer reply application locally using MediaPipe and Chrome's Built-In Prompt API. The application needs no server and does not call any vendor's LLM, so it costs nothing to run. When a user enters feedback, the text classification task of the MediaPipe SDK identifies the sentiment of the text (positive or negative), and the language detection task returns the language's ISO 639-1 code. I then formulate a prompt and ask Chrome's Built-In Prompt API to generate a response based on the sentiment, language, and feedback.

Update the Chrome Browser

Update the Chrome Browser to the latest version. As of this writing, the newest version of Chrome is 129.

Please follow the steps in this blog post to download Gemini Nano in Chrome.

https://impsbl.hatenablog.jp/entry/CallGeminiNanoLocallyInChrome_en
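Once Gemini Nano is downloaded, you can sanity-check availability from the DevTools console. The snippet below is a sketch based on the Early Preview API surface; the `capabilities()` call and its `available` field may change between Chrome versions, and the `canPrompt` helper is my own:

```typescript
// Early Preview sketch: window.ai.assistant.capabilities() resolves to an
// object whose `available` field is 'readily', 'after-download', or 'no'.
type Availability = 'readily' | 'after-download' | 'no';

// Hypothetical helper: only 'readily' means the model can be prompted now.
export function canPrompt(available: Availability): boolean {
  return available === 'readily';
}

// In the DevTools console of a supported Chrome build (sketch):
// const caps = await (window as any).ai.assistant.capabilities();
// canPrompt(caps.available);
```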

Angular Application

Scaffold an Angular Application

ng new ng-ai-sprint-customer-response-demo

Install dependencies

npm i --save-exact @mediapipe/tasks-text iso-639-language

The @mediapipe/tasks-text package provides the text classification and language detection tasks. The iso-639-language package returns the language name for an ISO 639-1 code.
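To illustrate what the lookup does, here is a hand-rolled sketch of mapping an ISO 639-1 code to an English language name. The real application delegates this to iso-639-language, which covers the full code list; the map below is a toy illustration only:

```typescript
// Toy ISO 639-1 lookup for illustration; iso-639-language handles all codes.
const ISO_639_1_NAMES: Record<string, string> = {
  en: 'English',
  es: 'Spanish',
  fr: 'French',
  de: 'German',
  zh: 'Chinese',
};

// Return the English name for a code, or an empty string if unknown.
export function languageNameFromCode(code: string): string {
  return ISO_639_1_NAMES[code] ?? '';
}
```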

Upload models to Google Cloud Storage

First, I downloaded the text classifier model (bert_classifier.tflite) and the language detector model (language_detector.tflite) from the MediaPipe model pages.

Then, I uploaded them to a new GCS bucket to keep my project's bundle size small. Next, I updated the bucket's CORS policy so the Angular application could load these files.

// cors.json

[
    {
      "origin": ["http://localhost:4200"],
      "responseHeader": ["Content-Type"],
      "method": ["GET", "HEAD", "PUT", "POST"],
      "maxAgeSeconds": 3600
    }
]

cd ~/google-cloud-sdk
gcloud storage buckets update gs://<bucket name> --cors-file=cors.json

I installed the Google Cloud SDK at ~/google-cloud-sdk. The gcloud command above updates the CORS policy of the GCS bucket.

Load the models during application startup

I use the APP_INITIALIZER token to load the models during application startup, so they are available before users make their first classification request.

// assets/config.json

{
    "taskTextUrl": "https://cdn.jsdelivr.net/npm/@mediapipe/tasks-text@latest/wasm",
    "textClassifier": {
        "name": "BERT-classifier",
        "path": "https://storage.googleapis.com/ai-sprint-2024/mediapipe-models/text-classification/bert_classifier.tflite",
        "maxResults": 3
    },
    "languageDetector": {
        "name": "Language Detector",
        "path": "https://storage.googleapis.com/ai-sprint-2024/mediapipe-models/language-detection/language_detector.tflite",
        "maxResults": 1
    }
}

The config.json file stores the MediaPipe SDK configuration and the models' public URL.

// ai/utils/load-models.ts

import config from '~assets/config.json';
import { TextClassifier, FilesetResolver, LanguageDetector } from "@mediapipe/tasks-text";

export async function createTextClassifier(): Promise<TextClassifier> {
    const text = await FilesetResolver.forTextTasks(config.taskTextUrl);
    return TextClassifier.createFromOptions(text, {
      baseOptions: {
        modelAssetPath: config.textClassifier.path
      },
      maxResults: config.textClassifier.maxResults,
    });
}

export async function createLanguageDetector(): Promise<LanguageDetector> {
  const text = await FilesetResolver.forTextTasks(config.taskTextUrl);
  return LanguageDetector.createFromOptions(text, {
    baseOptions: {
      modelAssetPath: config.languageDetector.path,
    },
    maxResults: config.languageDetector.maxResults,
  });
}

The createTextClassifier function loads the text classification model, while the createLanguageDetector function loads the language detection model. With both models in memory, the application can classify feedback and detect its language.

// ai/services/model.service.ts

import { Injectable, signal } from '@angular/core';
import { LanguageDetector, TextClassifier } from '@mediapipe/tasks-text';
import { createLanguageDetector, createTextClassifier } from '../utils/load-models';

@Injectable({
  providedIn: 'root'
})
export class ModelService {
  #textClassifier = signal<TextClassifier | null>(null);
  #languageDetector = signal<LanguageDetector | null>(null);

  async init() {
    const [textClassifier, languageDetector] = await Promise.all([createTextClassifier(), createLanguageDetector()]);
    this.#textClassifier.set(textClassifier);
    this.#languageDetector.set(languageDetector);
  }
}

// ai/providers/models.provider.ts

import { APP_INITIALIZER, Provider } from '@angular/core';
import { ModelService } from '../services/model.service';

export function provideModels(): Provider {
    return {
        provide: APP_INITIALIZER,
        multi: true,
        useFactory: (service: ModelService) => () => service.init(),
        deps: [ModelService]
    } as Provider;
}

The init method of the ModelService loads the model files and stores the loaded models in signals.

Obtain Window AI Assistant during application startup

If Chrome supports the Prompt API, window.ai.assistant returns an AI assistant instance. In the application, I declare an InjectionToken that provides the AI assistant, which can create an AI text session. The session prompts the local LLM to generate a reply to customer feedback.

// ai/constants/core.constant.ts

import { InjectionToken } from '@angular/core';

export const AI_ASSISTANT_TOKEN = new InjectionToken<{ create: Function, capabilities: Function } | undefined>('AI_ASSISTANT_TOKEN');

// ai/providers/ai-assistant.provider.ts

import { isPlatformBrowser } from '@angular/common';
import { EnvironmentProviders, inject, makeEnvironmentProviders, PLATFORM_ID } from '@angular/core';
import { AI_ASSISTANT_TOKEN } from '../constants/core.constant';

export function provideAIAssistant(): EnvironmentProviders {
    return makeEnvironmentProviders([
        {
            provide: AI_ASSISTANT_TOKEN,
            useFactory: () => {
                const platformId = inject(PLATFORM_ID);
                const objWindow = isPlatformBrowser(platformId) ? window : undefined;
                if (objWindow) {
                    const winWithAI = objWindow as any;
                    if (winWithAI?.ai?.assistant) {
                        return winWithAI.ai.assistant;
                    }
                }

                return undefined;
            },
        }
    ]);
}

// app.config.ts

import { ApplicationConfig, provideExperimentalZonelessChangeDetection } from '@angular/core';
import { provideAIAssistant } from './ai/providers/ai-assistant.provider';
import { provideModels } from './ai/providers/models.provider';

export const appConfig: ApplicationConfig = {
  providers: [
    provideExperimentalZonelessChangeDetection(),
    provideAIAssistant(),
    provideModels(),
  ]
};

In appConfig, the provideAIAssistant and provideModels functions provide the Window AI assistant and load the models into memory.

Create the Services

// ai/services/model.service.ts

import { Injectable, signal } from '@angular/core';
import { LanguageDetector, TextClassifier } from '@mediapipe/tasks-text';
import Iso639Type from 'iso-639-language';
import { createLanguageDetector, createTextClassifier } from '../utils/load-models';

@Injectable({
  providedIn: 'root'
})
export class ModelService {
  #textClassifier = signal<TextClassifier | null>(null);
  #languageDetector = signal<LanguageDetector | null>(null);

  // init() omitted; see the earlier listing

  classifyText(query: string): { sentiment: string; score: number }[] {
    const classifier = this.#textClassifier();
    if (classifier) {
      const result = classifier.classify(query);
      if (result.classifications.length) {
        return result.classifications[0].categories.map(({ categoryName, score }) => ({
          sentiment: categoryName,
          score,
        }));
      }
    }

    return [];
  }

  detectLanguage(query: string): string {
    const langDetector = this.#languageDetector();
    if (langDetector) {
      const result = langDetector.detect(query);
      if (result.languages.length) {
        const iso639_1 = Iso639Type.getType(1);
        return iso639_1.getNameByCodeEnglish(result.languages[0].languageCode);
      }
    }

    return '';
  }
}

The classifyText method of the ModelService accepts a query and returns the name and score of each sentiment category. The detectLanguage method determines the language's ISO 639-1 code and uses the iso-639-language library to look up the language name.
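To make the mapping concrete, here is the shape of a classification result and how classifyText flattens it. The literal below is hand-written sample data mirroring the result shape, not real model output:

```typescript
// Hand-written sample mirroring the text-classification result shape.
const result = {
  classifications: [
    {
      categories: [
        { categoryName: 'positive', score: 0.92 },
        { categoryName: 'negative', score: 0.08 },
      ],
    },
  ],
};

// Same flattening as classifyText(): keep only the name and score.
export const sentiments = result.classifications[0].categories.map(
  ({ categoryName, score }) => ({ sentiment: categoryName, score })
);
// sentiments[0] is the top-scoring category
```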

// ai/services/prompt.service.ts

import { inject, Injectable } from '@angular/core';
import { AI_ASSISTANT_TOKEN } from '../constants/core.constant';

@Injectable({
  providedIn: 'root'
})
export class PromptService {
  #aiAssistant = inject(AI_ASSISTANT_TOKEN);

  async prompt(query: string): Promise<string> {
    if (!this.#aiAssistant) {
      throw new Error(`Your browser doesn't support the Prompt API. If you are on Chrome, join the Early Preview Program to enable it.`);
    }

    const session = await this.#aiAssistant.create();
    if (!session) {
      throw new Error('Failed to create AITextSession.');
    }

    try {
      return await session.prompt(query);
    } finally {
      // destroy the session even if prompt() throws
      session.destroy();
    }
  }
}
}

The prompt method creates an AI text session and prompts the local LLM to generate a reply. Before returning the text, it destroys the session to release its resources.
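The create → prompt → destroy lifecycle can be exercised without a browser by stubbing the assistant. The interfaces below are my own minimal sketch of the Early Preview surface (the real object comes from window.ai.assistant), and the try/finally guarantees the session is destroyed even if prompt() throws:

```typescript
// Minimal sketch of the Early Preview surface; field names are assumptions.
interface AITextSession {
  prompt(query: string): Promise<string>;
  destroy(): void;
}

interface AIAssistant {
  create(): Promise<AITextSession>;
}

// Same flow as PromptService.prompt, with the session always cleaned up.
export async function ask(assistant: AIAssistant, query: string): Promise<string> {
  const session = await assistant.create();
  try {
    return await session.prompt(query);
  } finally {
    session.destroy();
  }
}

// Stub assistant for local testing: echoes the query back.
export const stubAssistant: AIAssistant = {
  create: async () => ({
    prompt: async (query) => `echo: ${query}`,
    destroy: () => {},
  }),
};
```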

Create a Feedback Service

// feedback/services/feedback.service.ts

import { inject, Injectable, signal } from '@angular/core';
import { ModelService } from '~app/ai/services/model.service';
import { PromptService } from '~app/ai/services/prompt.service';

@Injectable({
  providedIn: 'root'
})
export class FeedbackService {
  promptService = inject(PromptService);
  modelService = inject(ModelService);

  categories = signal<{ sentiment: string; score: number }[]>([]);
  language = signal('');
  prompt = signal('');

  async generateReply(query: string): Promise<string> {
    this.categories.set([]);
    this.language.set('');
    this.prompt.set('');

    const language = this.modelService.detectLanguage(query);
    const categories = this.modelService.classifyText(query);
    const sentiment = categories[0].sentiment;
    const responsePrompt = `
      The customer wrote a ${sentiment} feedback in ${language}. 
      Please write the response in one paragraph in ${language}, 100 words max.
      Feedback: ${query} 
    `;

    const response = await this.promptService.prompt(responsePrompt);

    this.categories.set(categories);
    this.language.set(language);
    this.prompt.set(responsePrompt);

    return response;
  }
}

The feedback service first detects the language of the feedback and then classifies its sentiment. The first category has the highest score, so it serves as the sentiment of the feedback. I include the language and sentiment in a prompt and ask the local LLM to generate a short reply.

const responsePrompt = `
      The customer wrote a ${sentiment} feedback in ${language}. 
      Please write the response in one paragraph in ${language}, 100 words max.
      Feedback: ${query} 
`;
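Extracting the template into a small function makes it easy to see the prompt a given review produces. buildPrompt is my own refactoring of the inline template, not part of the original service:

```typescript
// Hypothetical refactoring of the inline prompt template in FeedbackService.
export function buildPrompt(sentiment: string, language: string, query: string): string {
  return `
      The customer wrote a ${sentiment} feedback in ${language}.
      Please write the response in one paragraph in ${language}, 100 words max.
      Feedback: ${query}
  `;
}

// Example: a negative English review produces a prompt mentioning both.
const example = buildPrompt('negative', 'English', 'The soup was cold.');
```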

The prompt service generates the response and returns the result. The feedback service also stores the categories, language, and prompt in signals so that the user interface can display them.

Build the user interface

@Component({
  selector: 'app-root',
  standalone: true,
  imports: [DetectAIComponent],
  template: `
    <h2>Generate Response for Customer Feedback</h2>
    <h3>Use MediaPipe Text Classifier and Language Detection Tasks, and Chrome Built-In AI</h3>
    <app-detect-ai />
  `,
})
export class AppComponent {}

The AppComponent renders a component that detects whether the browser supports Chrome's Built-In Prompt API.

import { inject } from '@angular/core';
import { AI_ASSISTANT_TOKEN } from '../constants/core.constant';

export function isPromptAPISupported(): boolean {
   return !!inject(AI_ASSISTANT_TOKEN);
}

@Component({
  selector: 'app-detect-ai',
  standalone: true,
  imports: [FeedbackInputComponent],
  template: `
    <div>
      @if (hasCapability) {
        <app-feedback-input />
      } @else {
          <p>Your browser doesn't support the Prompt API.</p>
          <p>If you're on Chrome, join the <a href="https://developer.chrome.com/docs/ai/built-in#get_an_early_preview" target="_blank">
            Early Preview Program</a> to enable it.
          </p>
      }
    </div>
  `,
})
export class DetectAIComponent {
  hasCapability = isPromptAPISupported();
}

If the browser supports the Prompt API, this component displays the FeedbackInputComponent component. Otherwise, it instructs users to sign up for the early preview program.

import { Component, computed, inject, signal } from '@angular/core';
import { FormsModule } from '@angular/forms';
import { FeedbackService } from './services/feedback.service';

@Component({
  selector: 'app-feedback-input',
  standalone: true,
  imports: [FormsModule],
  template: `
    <label class="label" for="input">Input customer feedback: </label>
    <textarea rows="8" id="input" name="input" [(ngModel)]="feedback"></textarea>
    <button (click)="submit()" [disabled]="buttonState().disabled">{{ buttonState().text }}</button>
    <div>
      <p>
        <span class="label">Language: </span>{{ language() }}
      </p>
      <div>
        <span class="label">Categories: </span>
        @for (category of categories(); track $index) {
          <p>{{ category.sentiment }}, {{ category.score }}</p>
        }
      </div>
      <p>
        <span class="label">Prompt: </span>{{ prompt() }}
      </p>
    </div>
    <div>
      <span class="label">Response:</span>
      <p>{{ response() }}</p>
    </div>
  `,
})
export class FeedbackInputComponent {
  feedbackService = inject(FeedbackService);

  feedback = signal('', { equal: () => false });
  isLoading = signal(false);
  response = signal('');

  language = this.feedbackService.language;
  categories = this.feedbackService.categories;
  prompt = this.feedbackService.prompt;

  buttonState = computed(() => {
    return {
      text: this.isLoading() ? 'Processing...' : 'Submit',
      disabled: this.isLoading() || this.feedback().trim() === '',
    };
  });

  async submit() {
    this.isLoading.set(true);
    this.response.set('');
    const result = await this.feedbackService.generateReply(this.feedback());
    this.response.set(result);
    this.isLoading.set(false);
  }
}

The FeedbackInputComponent lets users enter feedback and submit it with a click. The feedback service generates a reply, and the template displays the categories, category scores, language, prompt, and generated reply.

In conclusion, software engineers can build Web AI applications without setting up a backend server or paying for a cloud-hosted LLM.

