Screen recording with Agora.io: overcoming obstacles

Hi all!

Let's say someone is working with your website and you want to record everything they do: not just the website, but their whole screen. What for? In our case, we needed to record classes taught by a teacher, for quality assurance and safety reasons.

Prerequisites

Nuxt.js SPA - the main app where classes take place and the teacher and students communicate in real time.
Agora.io used for WebRTC
Socket.io backend

But that's not all. The solution has to satisfy the following constraints:

1. No human factor. A teacher should not be able to stop the recording at will.
2. All recordings should be available in the cloud.
3. New dependencies added to the project should be kept to a minimum.

Because of rule 1, screen recording apps are not an option, since they depend on the human factor. Any app that stores recordings locally won't work either (rule 2). Using external libraries or a SaaS for screen recording violates rule 3.
So the obvious solution is to use the Agora.io cloud recording feature.

According to the docs, there are 3 modes in which recording works:

1. Individual recording. Essentially, it records the video/audio streams of a single room participant. That looks like exactly what we need, but for some reason the output is not an .mp4 or any other single video file. Instead, it is a collection of multiple .ts files with the audio/video streams, plus an .m3u8 playlist file that links them together, which is obviously inconvenient for end users.
2. Composite recording. The settings are basically the same as for individual recording, and it has a huge plus: the resulting file is an .mp4! But it is meant for multiple participants, and the surprise was that... it simply does not start if you don't include a second participant. That makes it useless for us, since we only record the teacher's screen.
3. Web Page recording. This one is interesting. It doesn't take any media streams as input. Instead, you just send the URL of a web page, and it records that page until you call an API method to stop. It looks cool, but it's very hard to implement, since the classes are private and there are no free-to-join links out there.

It seems there is no viable option other than 1.
OK, let's do it:

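import axios, { AxiosError, AxiosInstance } from 'axios'

// Everything the backend needs to know about a participant's Agora session
export type AgoraToken = {
  token: string
  room: string
  user: string
  appId: string
  classId: number
}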

export class AgoraScreenRecordingService {
  private APP_ID = process.env.AGORA_APP_ID ?? 'APP_ID_NOT_DEFINED'
  private API_KEY = process.env.AGORA_API_KEY ?? 'API_KEY_NOT_SET'
  private API_SECRET = process.env.AGORA_API_SECRET ?? 'API_SECRET_NOT_SET'
  private AWS_BUCKET = process.env.AGORA_AWS_BUCKET ?? 'AWS_BUCKET_NOT_SET'
  private AWS_KEY = process.env.AGORA_AWS_KEY ?? 'AWS_KEY_NOT_SET'
  private AWS_SECRET = process.env.AGORA_AWS_SECRET ?? 'AWS_SECRET_NOT_SET'
  private mode = 'individual'

  private axios: AxiosInstance

  private static instance: AgoraScreenRecordingService

  public static getInstance () {
    return this.instance ?? (this.instance = new AgoraScreenRecordingService())
  }

  constructor () {
    this.axios = axios.create({
      baseURL: `https://api.agora.io/v1/apps/${this.APP_ID}/cloud_recording/`,
      auth: {
        username: this.API_KEY,
        password: this.API_SECRET,
      },
      headers: {
        'Content-Type': 'application/json;charset=utf-8'
      }
    })
  }

  private async acquire (token: AgoraToken) {
    return (await this.axios.post('acquire', {
      cname: token.room,
      uid: token.user,
      clientRequest: {
        resourceExpiredHour: 24,
        scene: 0
      }
    })).data.resourceId
  }

  async startRecording (token: AgoraToken): Promise<{ sid: string, resourceId: string }> {
    const resourceId = await this.acquire(token)

    const { sid } = (await this.axios.post(`resourceid/${resourceId}/mode/${this.mode}/start`, {
      cname: token.room,
      uid: token.user,
      clientRequest: {
        token: token.token,
        recordingConfig: {
          maxIdleTime: 300,
          channelType: 1,
          streamTypes: 2,
          subscribeUidGroup: 0,
          streamMode: 'standard',
        },
        storageConfig: {
          accessKey: this.AWS_KEY,
          region: 0,
          bucket: this.AWS_BUCKET,
          secretKey: this.AWS_SECRET,
          vendor: 1,
          fileNamePrefix: [
            `class${token.classId}`,
          ]
        }
      },
    })).data

    return { sid, resourceId }
  }

  async stopRecording (token: AgoraToken, resourceId: string, sid: string) {
    try {
      await this.axios.post(`resourceid/${resourceId}/sid/${sid}/mode/${this.mode}/stop`, {
        cname: token.room,
        uid: token.user,
        clientRequest: {}
      })
    } catch (e) {
      console.log('Recording wasn\'t stopped', (e as AxiosError).message)
    }
  }
}

  // Teacher's Agora.io access params
  let accessParams: AgoraToken
  // IDs of the currently active recording, if any
  let currentRecording: { sid: string, resourceId: string } | undefined

  classState.on('status-changed', async status => {
    const recorder = AgoraScreenRecordingService.getInstance()
    if (status === 'OPEN') {
      currentRecording = await recorder.startRecording(accessParams)
    } else if (currentRecording) {
      // Only stop if a recording was actually started
      await recorder.stopRecording(accessParams, currentRecording.resourceId, currentRecording.sid)
      currentRecording = undefined
    }
  })

It works! Now we have to compile the bunch of resulting .ts files into a watchable .mp4. What could be easier, right? There is FFmpeg for that!
(If you thought there would be only a few .ts files per session, you were wrong. There are hundreds of them per hour.)

  ffmpeg -i file_av.m3u8 -vsync 0 -vcodec libx264 -crf 27 -preset veryfast -c:a aac output.mp4

Seems like it worked! Wait... why is the video 8 minutes long instead of 1.5 hours? Why is the FPS 20000? OK, let's try this:

  ffmpeg -i $path -vsync 0 -vcodec libx264 -crf 27 -preset veryfast -c:a aac -filter:v fps=fps=30:round=up $outFilepath

OK, the FPS is fine now. But now there is a 20-minute offset between video and audio... Maybe the official Agora script will help us?

(And why is it still Python 2 in 2022?)

Apparently not. None of these approaches works properly.
Spoiler: we also tried compiling it manually chunk by chunk, using different FFmpeg versions, different FFmpeg settings and filters, and tools other than FFmpeg. Nope. Just nope.
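
One sanity check that can help when debugging a duration mismatch like this is to compute the expected length straight from the playlist, by summing the #EXTINF durations, and compare it with what the converted file reports. The sketch below is only an illustration: the `summarizePlaylist` helper and the sample playlist are made up for this example, and it assumes a plain HLS media playlist.

```typescript
// Summary of an HLS media playlist: how many segments it lists and how long
// they are in total, according to the #EXTINF tags.
interface PlaylistSummary {
  segmentCount: number
  totalSeconds: number
}

function summarizePlaylist (m3u8: string): PlaylistSummary {
  let segmentCount = 0
  let totalSeconds = 0

  for (const rawLine of m3u8.split(/\r?\n/)) {
    const line = rawLine.trim()
    // Each media segment is preceded by "#EXTINF:<duration>,[<title>]"
    const match = /^#EXTINF:([0-9]+(?:\.[0-9]+)?)/.exec(line)
    if (match) {
      segmentCount += 1
      totalSeconds += Number(match[1])
    }
  }

  return { segmentCount, totalSeconds }
}

// Made-up sample playlist, for illustration only
const sample = [
  '#EXTM3U',
  '#EXT-X-VERSION:3',
  '#EXT-X-TARGETDURATION:16',
  '#EXTINF:15.5,',
  'segment_0.ts',
  '#EXTINF:14.5,',
  'segment_1.ts',
  '#EXT-X-ENDLIST'
].join('\n')

const summary = summarizePlaylist(sample)
console.log(`${summary.segmentCount} segments, ${summary.totalSeconds} s`)
```

If the total from the playlist is close to the real class length while the converted file is much shorter, the problem is in the timestamps or the conversion step rather than in missing segments.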

Now it's time to reveal the solution we came up with.
And it's...
...
Web Page recording!
Yes!
But not in the way you might think.
We still need to record the teacher's whole screen because the lesson involves other applications as well.
What we did was create a new page that displays only the teacher's screen, stretched to fill the viewport, with all the audio streams playing in the background. Yes, we basically used a web page for stream-to-mp4 conversion. Now, instead of individual recording, we create a one-time session that allows the Agora.io robot to visit the page and record its contents.
And it works just fine now!
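
To give an idea of what a one-time session can look like, here is a minimal sketch of a single-use ticket store. It is not our production code: `OneTimeTicketStore`, the `ticket` query parameter, the example.com URL and the 60-second lifetime are all hypothetical. The idea is that the backend issues a ticket, appends it to the recording page URL, and the page is served only if the ticket is redeemed successfully.

```typescript
// Hypothetical one-time ticket store for the recording page.
interface Ticket {
  classId: number
  expiresAt: number // epoch milliseconds
}

class OneTimeTicketStore {
  private tickets = new Map<string, Ticket>()
  private generateId: () => string
  private now: () => number

  // generateId should return an unguessable string; in Node.js that could be
  // something like crypto.randomBytes(32).toString('hex')
  constructor (generateId: () => string, now: () => number = Date.now) {
    this.generateId = generateId
    this.now = now
  }

  // Issue a ticket that lets the recorder open the page for one class
  issue (classId: number, ttlMs = 60000): string {
    const id = this.generateId()
    this.tickets.set(id, { classId, expiresAt: this.now() + ttlMs })
    return id
  }

  // Validate and consume: returns the class id on the first valid use,
  // and null for unknown, already used, or expired tickets
  redeem (id: string): number | null {
    const ticket = this.tickets.get(id)
    if (!ticket) return null
    this.tickets.delete(id) // single use, even if it has expired
    return ticket.expiresAt > this.now() ? ticket.classId : null
  }
}

// Demo only: predictable ids are NOT safe outside of an example
let counter = 0
const store = new OneTimeTicketStore(() => `demo-${++counter}`)
const ticket = store.issue(42)
const recordingUrl = `https://example.com/recording?ticket=${ticket}`
console.log(recordingUrl)
console.log(store.redeem(ticket)) // 42
console.log(store.redeem(ticket)) // null, the ticket was already used
```

An in-memory Map is enough for a single backend process; with several instances the tickets would have to live in shared storage.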
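Now our startRecording looks like this (the service's mode is now set to 'web'; per the Agora docs, web page recording also expects scene: 1 in the acquire request):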


  async startRecording (token: AgoraToken, url: string): Promise<{ sid: string, resourceId: string }> {
    const resourceId = await this.acquire(token)

    const { sid } = (await this.axios.post(`resourceid/${resourceId}/mode/${this.mode}/start`, {
      cname: token.room,
      uid: token.user,
      clientRequest: {
        token: token.token,
        extensionServiceConfig: {
          errorHandlePolicy: 'error_abort',
          extensionServices: [
            {
              serviceName: 'web_recorder_service',
              errorHandlePolicy: 'error_abort',
              serviceParam: {
                url,
                audioProfile: 0,
                videoWidth: 1920,
                videoHeight: 1080,
                maxRecordingHour: 3
              }
            }
          ]
        },
        recordingFileConfig: {
          avFileType: ['hls', 'mp4']
        },
        storageConfig: {
          accessKey: this.AWS_KEY,
          region: 0,
          bucket: this.AWS_BUCKET,
          secretKey: this.AWS_SECRET,
          vendor: 1,
          fileNamePrefix: [
            `class${token.classId}`,
          ]
        }
      },
    })).data

    return { sid, resourceId }
  }
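
Since the recorder is configured with error_abort, it can be worth checking that a recording is actually still running. Agora's REST API has a query call for that, which is a GET request. The helper below only builds its path: `queryPath` and `RecordingMode` are hypothetical names, and the shape of the response differs between modes, so it should be checked against the docs.

```typescript
// Mode segments used in the Cloud Recording REST path
type RecordingMode = 'individual' | 'mix' | 'web'

// Path of the "query" call, relative to
// https://api.agora.io/v1/apps/<appId>/cloud_recording/
function queryPath (resourceId: string, sid: string, mode: RecordingMode): string {
  return `resourceid/${resourceId}/sid/${sid}/mode/${mode}/query`
}

console.log(queryPath('my-resource-id', 'my-sid', 'web'))
```

Inside the service this could be used as this.axios.get(queryPath(resourceId, sid, 'web')), with the same basic auth as the other calls.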

Conclusion

Honestly, I didn't expect that much pain from using a service in the most straightforward way. You're just trying to use the most basic feature, and you don't get the expected result. I really hope the Agora.io team eventually implements individual recording so that it outputs a single file.

Thanks for reading! I hope this helps some of you avoid repeating our mistakes.

Alex Popov
