ReadMeAloud
A simple tool that provides an easy and efficient way to understand open-source projects for everyone, especially visually impaired or sight-impaired friends.
Architecture
Read the detailed article from
Posted on March 24, 2021
As developers, we often come across many open-source projects on GitHub, in which Readme.md is one of the first files we see 🤩. It is the simplest way to understand what the project is about, how to use it, and other related information (a kind of documentation).
Here are some known facts about readme files from GitHub:
A README is often the first item a visitor will see when visiting your repository. README files typically include information on:
- What the project does
- Why the project is useful
- How users can get started with the project
- Where users can get help with your project
- Who maintains and contributes to the project
Moreover, a good readme file helps attract many contributors. However, it has always been challenging for low-vision or visually impaired developers and contributors, because most readme content is text, which is quite difficult for them to read and understand. So I developed a very simple tool called "ReadMeAloud" that can convert the raw text of any public GitHub readme file to speech and also provides a way to download it as an mp3 file 🎵
Many existing solutions on the market already make it easy to convert text to speech. Some of them are:
Read aloud highlights each word on the webpage as it's being read. To stop listening, select the Pause button or the X to close Read aloud.
Google Chrome Extension - Read Aloud: A Text to Speech Voice Reader
Word for Microsoft 365 - Read Aloud
Well, I agree that most of these tools are great and helpful for everyone. However, I felt a couple of things were missing, such as the ability to download the converted speech as a file (say, mp3). Also, most existing tools either require a paid license or need an internet connection to work. That is why I came up with my own little tool for low-vision developers, focused on the place where they need it most: the GitHub readme file.
Here I have focused on the use case rather than the architecture or technology, which is why the architecture is very simple:
- Azure Front Door
- Azure Web App
- Azure Text-to-Speech Cognitive API
This is an open-source project.
The user provides a valid GitHub raw Readme.md URL, e.g., https://raw.githubusercontent.com/jayendranarumugam/DemoSecrets/master/README.md, at the Azure Front Door endpoint.
Once the user clicks the Search button, Azure Front Door securely routes the traffic to the Azure Web App (Blazor), which converts the text from the URL to speech using the Azure Cognitive API.
Once the speech is converted successfully, the audio bytes are used to play the audio in the browser and to offer it for download as an mp3 file.
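The first step of this flow, checking the URL and fetching the readme text, could look roughly like the sketch below. It is only an illustration: the `RawReadmeFetcher` class and its method names are hypothetical and not taken from the project, and restricting the host to `raw.githubusercontent.com` is an assumption that can be relaxed.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

// Hypothetical helper; the class and method names are illustrative, not from the project.
public static class RawReadmeFetcher
{
    private static readonly HttpClient Http = new HttpClient();

    // Accepts only absolute HTTPS URLs on raw.githubusercontent.com (an assumed restriction).
    public static bool IsRawGitHubUrl(string url)
    {
        return Uri.TryCreate(url, UriKind.Absolute, out var uri)
            && uri.Scheme == Uri.UriSchemeHttps
            && string.Equals(uri.Host, "raw.githubusercontent.com", StringComparison.OrdinalIgnoreCase);
    }

    // Downloads the readme text that would then be handed to the speech service.
    public static async Task<string> FetchReadmeTextAsync(string url)
    {
        if (!IsRawGitHubUrl(url))
        {
            throw new ArgumentException("Expected a raw.githubusercontent.com URL.", nameof(url));
        }
        return await Http.GetStringAsync(url);
    }
}
```

Rejecting anything that is not a raw readme URL before making the request keeps invalid input away from the speech call.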
This is my first time coding server-side Blazor. I mostly reused the default Blazor boilerplate code, modifying it and adding some parts for my project, e.g., SpeechService. I hope I have learned something about Blazor now. I have also listed some great GitHub repos in the References section below, which helped me understand Blazor.
The entire logic of converting text to speech uses the Azure Cognitive Speech SDK, which you can find in SpeechService.cs:
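```csharp
// Requires: using Microsoft.CognitiveServices.Speech;
//           using Microsoft.CognitiveServices.Speech.Audio;
public async Task<byte[]> SynthesizeAudioAsync(string text)
{
    // The subscription key and region are read from the app configuration.
    SpeechConfig speechConfigForAudioAsync = SpeechConfig.FromSubscription(Configuration["CognitiveAPIKey"], Configuration["CognitiveAPIRegion"]);
    speechConfigForAudioAsync.SetSpeechSynthesisOutputFormat(SpeechSynthesisOutputFormat.Audio16Khz32KBitRateMonoMp3);

    // A null AudioConfig keeps the audio in memory instead of playing it on the server.
    using (var synthesizer = new SpeechSynthesizer(speechConfigForAudioAsync, null as AudioConfig))
    {
        using (var result = await synthesizer.SpeakTextAsync(text))
        {
            if (result.Reason == ResultReason.SynthesizingAudioCompleted)
            {
                return result.AudioData;
            }
            else if (result.Reason == ResultReason.Canceled)
            {
                // cancellation.Reason and cancellation.ErrorDetails describe why synthesis stopped.
                var cancellation = SpeechSynthesisCancellationDetails.FromResult(result);
            }
            // Callers must handle null, which signals that no audio was produced.
            return null;
        }
    }
}
```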
The result.AudioData is the converted audio as a byte array (byte[]), which we use to play in the browser and download as an mp3 file using JavaScript functions like downloadFromByteArray, playAudio and stopAudio, which you can find in helper.js.
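For comparison, here is a different, simpler way to hand the byte array to the browser: encoding it as a data URI on the C# side. This is not how helper.js works (it receives the byte array directly); the `AudioDataUri` class is hypothetical and only shows the idea.

```csharp
using System;

// Hypothetical helper, not part of the project.
public static class AudioDataUri
{
    // Encodes mp3 bytes as a data URI that an <audio> element or a download link can use directly.
    public static string FromMp3Bytes(byte[] audioData)
    {
        if (audioData == null || audioData.Length == 0)
        {
            throw new ArgumentException("No audio data to encode.", nameof(audioData));
        }
        return "data:audio/mpeg;base64," + Convert.ToBase64String(audioData);
    }
}
```

Base64 makes the payload roughly a third larger, so this approach only suits short clips.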
ReadMeAloud is more of a small prototype on which many features can easily be built. That said, the current design has some limitations that we can improve in the future. Some of them are:
- The architecture is currently tightly coupled to the server (Blazor). We can make it more scalable by introducing a separate layer for the cognitive calls, i.e., an Azure Function.
- Currently only public GitHub repos can be converted. We can extend this to private repos by adding authentication.
- English is the default language for both text and speech conversion. We can improve that easily, since Azure Cognitive Services supports a very wide variety of languages.
- I'm not a front-end guy, so feel free to contribute to ReadMeAloud with your creative ideas or UI/
- If the Readme file is too long, the speech conversion takes more time and sometimes eventually times out. Currently, it is suited to small readme files. We can improve that by changing the architecture as discussed above.
- We can also make the design more accessible for people with low vision, for example by providing voice input for the GitHub repo details.
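One possible way to soften the long-readme limitation, independent of the architecture change, is to split the text into smaller pieces and call SynthesizeAudioAsync once per piece. The sketch below shows only the splitting step; the `TextChunker` class and the `maxChars` limit are assumptions for illustration, and how the resulting audio pieces are joined or streamed is left open.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper, not part of the project.
public static class TextChunker
{
    // Splits text into chunks of at most maxChars, breaking on line boundaries where possible.
    public static List<string> Split(string text, int maxChars)
    {
        if (maxChars <= 0) throw new ArgumentOutOfRangeException(nameof(maxChars));
        var chunks = new List<string>();
        var current = new StringBuilder();

        foreach (var rawLine in (text ?? string.Empty).Split('\n'))
        {
            var line = rawLine.TrimEnd('\r');

            // A single line longer than the limit is cut into fixed-size pieces.
            while (line.Length > maxChars)
            {
                Flush(chunks, current);
                chunks.Add(line.Substring(0, maxChars));
                line = line.Substring(maxChars);
            }

            // Start a new chunk when adding this line (plus a newline) would exceed the limit.
            if (current.Length > 0 && current.Length + 1 + line.Length > maxChars)
            {
                Flush(chunks, current);
            }
            if (current.Length > 0) current.Append('\n');
            current.Append(line);
        }
        Flush(chunks, current);
        return chunks;
    }

    // Moves the buffered text into the chunk list and resets the buffer.
    private static void Flush(List<string> chunks, StringBuilder current)
    {
        if (current.Length > 0)
        {
            chunks.Add(current.ToString());
            current.Clear();
        }
    }
}
```

Breaking on line boundaries keeps headings and list items intact, so each piece still sounds natural when read aloud.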
The whole idea of ReadMeAloud is to provide an easy and efficient way to understand open-source projects for everyone, especially visually impaired or sight-impaired friends. Although I showcased this with a GitHub readme URL, we can use any valid URL that returns plain-text content. This is just a small idea, and I hope it will reach its own audience.
This is my simple Blazor application that demonstrates how to build an SPA on Blazor and how to communicate with an ASP.NET Core backend. The demo application is a simple books database.