Text to speech tutorial using RxJS and Angular

Introduction

This is day 23 of Wes Bos's JavaScript 30 challenge, and I am going to use RxJS and Angular to create an English text to speech application. The Web Speech API provides interfaces to make a speech request and turn text into speech using the selected voice.

In this blog post, I describe how to use RxJS fromEvent to listen to the change event of input controls and update the properties of a SpeechSynthesisUtterance object. The SpeechSynthesisUtterance interface represents a speech request, and the SpeechSynthesis interface speaks that request to turn the English text into speech.

Create a new Angular project

ng generate application day23-speech-synthesis

Create Speech feature module

First, we create a Speech feature module and import it into AppModule. The feature module encapsulates SpeechSynthesisComponent, SpeechTextComponent and SpeechVoiceComponent.

Import SpeechModule in AppModule

// speech.module.ts

import { CommonModule } from '@angular/common';
import { NgModule } from '@angular/core';
import { FormsModule } from '@angular/forms';
import { SpeechSynthesisComponent } from './speech-synthesis/speech-synthesis.component';
import { SpeechTextComponent } from './speech-text/speech-text.component';
import { SpeechVoiceComponent } from './speech-voice/speech-voice.component';

@NgModule({
  declarations: [
    SpeechSynthesisComponent,
    SpeechVoiceComponent,
    SpeechTextComponent
  ],
  imports: [
    CommonModule,
    FormsModule,
  ],
  exports: [
    SpeechSynthesisComponent
  ]
})
export class SpeechModule { }

// app.module.ts

import { NgModule } from '@angular/core';
import { BrowserModule } from '@angular/platform-browser';

import { AppComponent } from './app.component';
import { SpeechModule } from './speech';

@NgModule({
  declarations: [
    AppComponent
  ],
  imports: [
    BrowserModule,
    SpeechModule
  ],
  providers: [],
  bootstrap: [AppComponent]
})
export class AppModule { }

Declare Speech components in feature module

In the Speech feature module, we declare three Angular components, SpeechSynthesisComponent, SpeechTextComponent and SpeechVoiceComponent, to build an English text to speech application.

src/app
├── app.component.ts
├── app.module.ts
└── speech
    ├── index.ts
    ├── interfaces
    │   └── speech.interface.ts
    ├── services
    │   └── speech.service.ts
    ├── speech-synthesis
    │   └── speech-synthesis.component.ts
    ├── speech-text
    │   └── speech-text.component.ts
    ├── speech-voice
    │   └── speech-voice.component.ts
    └── speech.module.ts

SpeechSynthesisComponent acts as a shell that encloses SpeechVoiceComponent and SpeechTextComponent. For your information, <app-speech-synthesis> is the tag of SpeechSynthesisComponent.

// speech-synthesis.component.ts

import { ChangeDetectionStrategy, Component } from '@angular/core';

@Component({
  selector: 'app-speech-synthesis',
  template: `
    <div class="voiceinator">
      <h1>The Voiceinator 5000</h1>
      <app-speech-voice></app-speech-voice>
      <app-speech-text></app-speech-text>
    </div>`,
  styles: [` ...omitted for brevity... `],
  changeDetection: ChangeDetectionStrategy.OnPush
})
export class SpeechSynthesisComponent {}

SpeechVoiceComponent encapsulates the input controls that change the rate, pitch and voice of the speech, whereas SpeechTextComponent is composed of a text area and the speak and stop buttons that decide what to say and when to say it.

// speech-voice.component.ts

import { ChangeDetectionStrategy, Component, ElementRef, OnDestroy, OnInit, ViewChild } from '@angular/core';
import { Observable, of } from 'rxjs';
import { SpeechService } from '../services/speech.service';

@Component({
  selector: 'app-speech-voice',
  template: `
    <ng-container>
      <select name="voice" id="voices" #voices>
        <option *ngFor="let voice of voices$ | async" [value]="voice.name">{{voice.name}} ({{voice.lang}})</option>
      </select>
      <label for="rate">Rate:</label>
      <input name="rate" type="range" min="0" max="3" value="1" step="0.1" #rate>
      <label for="pitch">Pitch:</label>
      <input name="pitch" type="range" min="0" max="2" step="0.1" #pitch value="1">
    </ng-container>
  `,
  styles: [` ...omitted for brevity... `],
  changeDetection: ChangeDetectionStrategy.OnPush
})
export class SpeechVoiceComponent implements OnInit, OnDestroy {
  @ViewChild('rate', { static: true, read: ElementRef })
  rate!: ElementRef<HTMLInputElement>;

  @ViewChild('pitch', { static: true, read: ElementRef })
  pitch!: ElementRef<HTMLInputElement>;

  @ViewChild('voices', { static: true, read: ElementRef })
  voiceDropdown!: ElementRef<HTMLSelectElement>;

  voices$!: Observable<SpeechSynthesisVoice[]>;

  constructor(private speechService: SpeechService) { }

  ngOnInit(): void {
    this.voices$ = of([]);
  }

  ngOnDestroy(): void {}
}

// speech-text.component.ts

import { ChangeDetectionStrategy, Component, ElementRef, OnDestroy, OnInit, ViewChild } from '@angular/core';
import { Subject, Subscription, fromEvent, map, merge, tap } from 'rxjs';
import { SpeechService } from '../services/speech.service';

@Component({
  selector: 'app-speech-text',
  template: `
    <ng-container>
      <textarea name="text" [(ngModel)]="msg" (change)="textChanged$.next()"></textarea>
      <button id="stop" #stop>Stop!</button>
      <button id="speak" #speak>Speak</button>
    </ng-container>
  `,
  styles: [` ...omitted for brevity... `],
  changeDetection: ChangeDetectionStrategy.OnPush
})
export class SpeechTextComponent implements OnInit, OnDestroy {
  @ViewChild('stop', { static: true, read: ElementRef })
  btnStop!: ElementRef<HTMLButtonElement>;

  @ViewChild('speak', { static: true, read: ElementRef })
  btnSpeak!: ElementRef<HTMLButtonElement>;

  // The name must match the textChanged$ subject referenced by (change) in the template
  textChanged$ = new Subject<void>();
  msg = 'Hello! I love JavaScript 👍';

  constructor(private speechService: SpeechService) { }

  ngOnInit(): void {
    this.speechService.updateSpeech({ name: 'text', value: this.msg });
  }

  ngOnDestroy(): void {}
}

Next, I delete the boilerplate code in AppComponent and render SpeechSynthesisComponent in the inline template.

// app.component.ts

import { Component } from '@angular/core';
import { Title } from '@angular/platform-browser';

@Component({
  selector: 'app-root',
  template: '<app-speech-synthesis></app-speech-synthesis>',
  styles: [`
    :host {
      display: block;
    }
  `]
})
export class AppComponent {
  title = 'Day 23 Speech Synthesis';

  constructor(titleService: Title) {
    titleService.setTitle(this.title);
  }
}

Add speech service to implement text to speech

I create a shared service that adds a layer on top of the Web Speech API to make a speech request and speak the text according to the selected voice, rate and pitch.

// speech.service.ts

import { Injectable } from '@angular/core';
import { SpeechProperties } from '../interfaces/speech.interface';

@Injectable({
  providedIn: 'root'
})
export class SpeechService {
  private voices: SpeechSynthesisVoice[] = [];

  updateSpeech(property: SpeechProperties): void {
    const { name, value } = property;
    if ((name === 'text')) {
      localStorage.setItem(name, value);
    } else if (['rate', 'pitch'].includes(name)) {
      localStorage.setItem(name, `${value}`);
    }
    this.toggle();
  }

  setVoices(voices: SpeechSynthesisVoice[]): void {
    this.voices = voices;
  }

  updateVoice(voiceName: string): void {
    localStorage.setItem('voice', voiceName);
    this.toggle();
  }

  private findVoice(voiceName: string): SpeechSynthesisVoice | null {
    const voice = this.voices.find(v => v.name === voiceName);
    return voice ? voice : null;
  }

  toggle(startOver = true): void {
    const speech = this.makeRequest();
    speechSynthesis.cancel();
    if (startOver) {
      speechSynthesis.speak(speech);
    }
  }

  private makeRequest() {
    const speech = new SpeechSynthesisUtterance();
    speech.text = localStorage.getItem('text') || '';
    speech.rate = +(localStorage.getItem('rate') || '1');
    speech.pitch = +(localStorage.getItem('pitch') || '1');
    const voice = this.findVoice(localStorage.getItem('voice') || '');
    if (voice) {
      speech.voice = voice;
    }
    return speech;
  }
}

updateSpeech updates the pitch, rate or text in local storage and speaks the text again
setVoices stores the English voices in an internal member of SpeechService
findVoice finds a voice by voice name
updateVoice updates the voice name in local storage and speaks the text again
makeRequest loads the property values from local storage and creates a SpeechSynthesisUtterance request
toggle cancels the current speech and, when startOver is true, speaks the text again
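To make the fallbacks in makeRequest easier to see, here is a minimal sketch that reads the same four keys from an in-memory store instead of localStorage. The names KeyValueReader, SpeechSettings, readSpeechSettings and memoryStore, as well as the voice name 'Alex', are my own assumptions for illustration and are not part of the project.

```typescript
// Hypothetical stand-in for the part of localStorage that makeRequest reads.
interface KeyValueReader {
  getItem(key: string): string | null;
}

// Plain-data view of the values that makeRequest copies onto the utterance.
interface SpeechSettings {
  text: string;
  rate: number;
  pitch: number;
  voiceName: string;
}

// Mirrors the fallbacks in makeRequest: empty text, rate 1, pitch 1, no voice name.
function readSpeechSettings(store: KeyValueReader): SpeechSettings {
  return {
    text: store.getItem('text') || '',
    rate: +(store.getItem('rate') || '1'),
    pitch: +(store.getItem('pitch') || '1'),
    voiceName: store.getItem('voice') || '',
  };
}

// In-memory store used only for illustration; missing keys return null like localStorage.
function memoryStore(entries: [string, string][]): KeyValueReader {
  const map = new Map<string, string>(entries);
  return { getItem: (key: string) => map.get(key) ?? null };
}

// Nothing stored yet: every property falls back to its default.
const defaults = readSpeechSettings(memoryStore([]));

// Some keys stored: stored strings are converted, the rest still fall back.
const custom = readSpeechSettings(
  memoryStore([['text', 'Hi'], ['rate', '1.5'], ['voice', 'Alex']])
);
```

Because every value is read back from storage on each request, the service does not need to keep the text, rate or pitch in its own fields.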

Use RxJS and Angular to implement SpeechVoiceComponent

I am going to define an Observable to retrieve English voices and populate voices dropdown.

Use ViewChild to obtain references to input ranges and voice dropdown

@ViewChild('rate', { static: true, read: ElementRef })
rate!: ElementRef<HTMLInputElement>;

@ViewChild('pitch', { static: true, read: ElementRef })
pitch!: ElementRef<HTMLInputElement>;

@ViewChild('voices', { static: true, read: ElementRef })
voiceDropdown!: ElementRef<HTMLSelectElement>;

Declare a subscription instance member and unsubscribe in ngOnDestroy()

subscription = new Subscription();

ngOnDestroy(): void {
    this.subscription.unsubscribe();
}

Declare voices$ Observable and populate options in voices dropdown in ngOnInit().

ngOnInit(): void {
    this.voices$ = fromEvent(speechSynthesis, 'voiceschanged')
       .pipe(
          map(() => speechSynthesis.getVoices().filter(voice => voice.lang.includes('en'))),
          tap((voices) => this.speechService.setVoices(voices)),
       );
}
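One detail of the filter is worth a closer look: includes('en') is a substring match, so it also accepts a language tag that merely contains the letters "en". The sketch below compares it with a stricter check on the primary language subtag. VoiceLike, filterEnglishVoices and the sample voices are hypothetical names and data that I made up for illustration; real voice names and tags depend on the browser and operating system.

```typescript
// Minimal shape of the fields used from SpeechSynthesisVoice.
interface VoiceLike {
  name: string;
  lang: string;
}

// Keeps voices whose primary language subtag is "en", splitting on "-" or "_".
function filterEnglishVoices<T extends VoiceLike>(voices: T[]): T[] {
  return voices.filter((voice) => voice.lang.toLowerCase().split(/[-_]/)[0] === 'en');
}

// Hypothetical sample data.
const sampleVoices: VoiceLike[] = [
  { name: 'Voice A', lang: 'en-US' },
  { name: 'Voice B', lang: 'en-GB' },
  { name: 'Voice C', lang: 'ca-ES-valencia' },
  { name: 'Voice D', lang: 'fr-FR' },
];

// Stricter check: only the two English voices remain.
const englishOnly = filterEnglishVoices(sampleVoices).map((voice) => voice.name);

// Substring check used in the post: "ca-ES-valencia" also passes because it contains "en".
const substringMatch = sampleVoices
  .filter((voice) => voice.lang.includes('en'))
  .map((voice) => voice.name);
```

If the voices on your machine only use tags such as en-US or en-GB, both checks give the same result, so treat the stricter version as an optional hardening step.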

In inline template, use async pipe to resolve this.voices$ and populate options in voices dropdown

<select name="voice" id="voices" #voices>
     <option *ngFor="let voice of voices$ | async" [value]="voice.name">{{voice.name}} ({{voice.lang}})</option>
</select>

Use fromEvent to listen to the change event of the dropdown, update the voice name in local storage and speak the text.

const voiceDropdownNative = this.voiceDropdown.nativeElement;
this.subscription.add(
   fromEvent(this.voiceDropdown.nativeElement, 'change')
     .pipe(
        tap(() => this.speechService.updateVoice(voiceDropdownNative.value))
     ).subscribe()
);

Similarly, use fromEvent to listen to the change event of the input ranges, update the rate and pitch in local storage and speak the text.

const rateNative = this.rate.nativeElement;
const pitchNative = this.pitch.nativeElement;
this.subscription.add(
      merge(fromEvent(rateNative, 'change'), fromEvent(pitchNative, 'change'))
        .pipe(
          map((e) => e.target as HTMLInputElement),
          map((e) => ({ name: e.name as 'rate' | 'pitch', value: e.value })),
          tap((property) => this.speechService.updateSpeech(property))
      ).subscribe()
);
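The range inputs emit their values as strings, and the rate slider in the template starts at 0 while the Web Speech API describes rate as ranging from 0.1 to 10 and pitch from 0 to 2. Here is a minimal sketch of a helper that parses the slider value and keeps it inside those ranges before it is stored. RangeName, LIMITS and normalizeRangeValue are hypothetical names of my own; the project itself passes the raw string through.

```typescript
type RangeName = 'rate' | 'pitch';

// Ranges described for SpeechSynthesisUtterance: rate 0.1 to 10, pitch 0 to 2, both default to 1.
const LIMITS: Record<RangeName, { min: number; max: number; fallback: number }> = {
  rate: { min: 0.1, max: 10, fallback: 1 },
  pitch: { min: 0, max: 2, fallback: 1 },
};

// Hypothetical helper: parse the slider string and clamp it to the allowed range.
function normalizeRangeValue(name: RangeName, raw: string): number {
  const { min, max, fallback } = LIMITS[name];
  const parsed = Number.parseFloat(raw);
  if (Number.isNaN(parsed)) {
    // Unparseable input (for example an empty string) falls back to the default.
    return fallback;
  }
  return Math.min(max, Math.max(min, parsed));
}

const lowestRate = normalizeRangeValue('rate', '0');     // clamped up to the minimum rate
const normalPitch = normalizeRangeValue('pitch', '1.3'); // already in range, kept as-is
const highPitch = normalizeRangeValue('pitch', '5');     // clamped down to the maximum pitch
const emptyRate = normalizeRangeValue('rate', '');       // not a number, default is used
```

One place to call such a helper would be the second map operator above, before the property is handed to updateSpeech.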

Use RxJS and Angular to implement SpeechTextComponent

Use ViewChild to obtain references to text area and buttons

@ViewChild('stop', { static: true, read: ElementRef })
btnStop!: ElementRef<HTMLButtonElement>;

@ViewChild('speak', { static: true, read: ElementRef })
btnSpeak!: ElementRef<HTMLButtonElement>;

Declare a subscription instance member and unsubscribe in ngOnDestroy()

subscription = new Subscription();

ngOnDestroy(): void {
    this.subscription.unsubscribe();
}

Declare a textChanged$ subject that emits when the text in the text area has changed and the text area loses focus, for example after tabbing out.

// speech-text.component.ts

textChanged$ = new Subject<void>();

ngOnInit() {
    this.subscription.add(
       this.textChanged$
         .pipe(tap(() => this.speechService.updateSpeech({ name: 'text', value: this.msg })))
         .subscribe()
    );
}

The subscription invokes SpeechService to update the message in local storage and say the text.

Similarly, use fromEvent to listen to the click event of the buttons. When the speak button is clicked, the stream cancels any speech in progress and starts saying the text in the text area. When the stop button is clicked, the stream stops the speech immediately.

ngOnInit() {
    const btnStop$ = fromEvent(this.btnStop.nativeElement, 'click').pipe(map(() => false));
    const btnSpeak$ = fromEvent(this.btnSpeak.nativeElement, 'click').pipe(map(() => true));
    this.subscription.add(
      merge(btnStop$, btnSpeak$)
        .pipe(tap(() => this.speechService.updateSpeech({ name: 'text', value: this.msg })))
        .subscribe((startOver) => this.speechService.toggle(startOver))
    );
}

fromEvent(this.btnStop.nativeElement, 'click').pipe(map(() => false)) – the stop button maps to false to end the speech immediately
fromEvent(this.btnSpeak.nativeElement, 'click').pipe(map(() => true)) – the speak button maps to true to stop and restart the speech
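
To show what the two boolean values lead to, here is a minimal sketch of the same control flow as SpeechService.toggle, run against a recording fake instead of the real speechSynthesis object. SynthLike and recordingSynth are hypothetical stand-ins that I introduce for illustration, and the fake takes a plain string where the real API takes a SpeechSynthesisUtterance.

```typescript
// Hypothetical stand-in for the two speechSynthesis methods that toggle() calls.
interface SynthLike {
  cancel(): void;
  speak(text: string): void;
}

// Same control flow as SpeechService.toggle: always cancel, speak again only when startOver is true.
function toggle(synth: SynthLike, text: string, startOver = true): void {
  synth.cancel();
  if (startOver) {
    synth.speak(text);
  }
}

// Recording fake so that the sequence of calls can be inspected outside a browser.
function recordingSynth(): SynthLike & { calls: string[] } {
  const calls: string[] = [];
  return {
    calls,
    cancel: (): void => { calls.push('cancel'); },
    speak: (text: string): void => { calls.push(`speak:${text}`); },
  };
}

// The speak button emits true: cancel first, then speak the text again.
const speakClick = recordingSynth();
toggle(speakClick, 'Hello', true);

// The stop button emits false: cancel only.
const stopClick = recordingSynth();
toggle(stopClick, 'Hello', false);
```

After these calls, speakClick.calls holds 'cancel' followed by 'speak:Hello', while stopClick.calls holds only 'cancel'.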

The example is done, and we have a page that takes English text and is able to say it.

Final Thoughts

In this post, I show how to use RxJS and Angular to build a text to speech application that reads and speaks English text. The Web Speech API supports various spoken languages and voices to say text with different parameters (rate, pitch, text and volume). Child components create Observables to pass values to the shared SpeechService, which updates local storage and makes a new speech request to speak.

This is the end of the blog post and I hope you like the content and continue to follow my learning experience in Angular and other technologies.

Resources:

Repo: https://github.com/railsstudent/ng-rxjs-30/tree/main/projects/day23-speech-synthesis
Live demo: https://railsstudent.github.io/ng-rxjs-30/day23-speech-synthesis/
Wes Bos’s JavaScript 30 Challenge: https://github.com/wesbos/JavaScript30
