Content Sentiment Analysis: Explore the emotion score of your content with Google API
Bogdan Covrig
Posted on August 22, 2020
My Workflow
I spent a few nights wrapping Google's analyzeSentiment API into a GitHub Action. The Action runs sentiment analysis over the content of HTML files and provides an overview of the overall emotion of all (or only the selected) pages in your project.
The API returns a score from -1 to 1 that indicates how negative or positive the text is: values near -1 are strongly negative, values near 1 are strongly positive, and values near 0 are roughly neutral. After the Action runs, it prints a table with the score for each page in its logs. Read more about Interpreting sentiment analysis values.
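To make the score range a bit more concrete, here is a minimal Python sketch of how per-page scores could be turned into labels and a small table. It is not the Action's own code: the function names, the sample pages, and the 0.25 cutoff for "neutral" are all assumptions made for illustration, and the API itself does not define such a cutoff.

```python
def label_sentiment(score: float, neutral_band: float = 0.25) -> str:
    """Map a score in [-1, 1] to a coarse label.

    neutral_band is an illustrative cutoff (an assumption), not an API value.
    """
    if not -1.0 <= score <= 1.0:
        raise ValueError(f"score out of range: {score}")
    if score >= neutral_band:
        return "positive"
    if score <= -neutral_band:
        return "negative"
    return "neutral"


def format_table(scores: dict[str, float]) -> str:
    """Render one row per page, similar in spirit to a table printed in logs."""
    width = max([len("page"), *map(len, scores)])
    rows = [f"{'page'.ljust(width)}  score  label"]
    for page, score in sorted(scores.items()):
        rows.append(f"{page.ljust(width)}  {score:+.2f}  {label_sentiment(score)}")
    return "\n".join(rows)


if __name__ == "__main__":
    # Hypothetical pages and scores, for illustration only.
    print(format_table({"index.html": 0.6, "about.html": -0.4, "faq.html": 0.1}))
```

A wider neutral band makes the labels more conservative; picking the right cutoff depends on the kind of content you publish.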
Along with other content analysis tools, it might come in handy to maintainers who want to understand the text that is pushed to the project every day. See it in action 🚀
⚠️ I got really excited about this and will keep developing it, along with other automation ideas that I have. Since this is an early release of the Action, please take a look at the roadmap and submit issues describing the features you would like to see in future releases.
Submission Category:
Maintainer Must-Haves
Yaml File or Link to Code
Here is an example of how to use the Action on public .html files.
name: Sentiment analysis on public
on: push
jobs:
  analysis:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2 # Be sure you checkout the files beforehand
      - name: Run sentiment analysis on HTML files
        uses: bogdaaamn/copy-sentiment-analysis@v0.6.1
        with:
          gcp_key: ${{ secrets.GCP_KEY }} # Google Cloud Platform API key. Read the README for instructions
Run sentiment analysis over the text of your website using Google API.
Copy Sentiment Analysis
This GitHub Action runs Sentiment Analysis over the built text of your GitHub project. It uses Google's analyzeSentiment API, evaluating the overall emotion score (from positive to negative) of a page. The Action provides an overview of the scores of all the pages from your project (more on interpreting the scores).
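For readers curious what a call to analyzeSentiment involves, here is a minimal Python sketch of the two steps around it: extracting the visible text from an HTML page, and building the JSON body for the REST endpoint. This is not how the Action is implemented (its internals are not shown here); the `TextExtractor`, `html_to_text`, `build_request`, and `read_score` names are hypothetical, and the sketch deliberately stops short of sending the request so that it runs without an API key.

```python
from html.parser import HTMLParser

# REST endpoint of the Cloud Natural Language API; the API key is passed
# as a `key` query parameter when the request is actually sent.
ENDPOINT = "https://language.googleapis.com/v1/documents:analyzeSentiment"


class TextExtractor(HTMLParser):
    """Collect visible text, skipping the contents of <script> and <style>."""

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self.chunks: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def html_to_text(html: str) -> str:
    """Return the text content of an HTML document as one string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def build_request(html: str) -> dict:
    """Build the JSON body for documents:analyzeSentiment."""
    return {
        "document": {"type": "PLAIN_TEXT", "content": html_to_text(html)},
        "encodingType": "UTF8",
    }


def read_score(response: dict) -> float:
    """Pull the document-level score (-1 to 1) out of a parsed response."""
    return response["documentSentiment"]["score"]
```

The body returned by `build_request` would be sent as a POST to `ENDPOINT`, and `read_score` applied to the parsed JSON response.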
🚀 Usage
This is a workflow example of using the Action on plain .html files from the public folder (by default).
name: Sentiment analysis on public
on: push
jobs:
  analysis:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2 # Be sure you checkout the files beforehand
      - name: Run sentiment analysis on HTML files
        uses: bogdaaamn/copy-sentiment-analysis@v0.6.1
        with:
          gcp_key: ${{ secrets.GCP_KEY }} # Google Cloud Platform API key. Read the README for instructions
Although, if your project needs to be built beforehand, be sure you place…
At the moment, the overview table is printed in the Actions tab after the workflow has run. That seems counter-intuitive, and it puts too much friction between pushing content and seeing the results.
I am curious about what the community thinks: How would you see this Action printing the results? A comment to the PR? A table in Action's log? Failing if there are too many negative results?
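As one possible shape for the last option, here is a small Python sketch of a check that fails when too many pages score negative. Everything in it is hypothetical: the `should_fail` name, the -0.25 cutoff, and the 50% ratio are assumptions for discussion, not existing behavior of the Action.

```python
import sys


def should_fail(
    scores: dict[str, float],
    negative_below: float = -0.25,
    max_negative_ratio: float = 0.5,
) -> bool:
    """Return True when the share of negative pages exceeds the allowed ratio.

    Both thresholds are illustrative assumptions, not values from the Action.
    """
    if not scores:
        return False
    negative = [page for page, score in scores.items() if score < negative_below]
    return len(negative) / len(scores) > max_negative_ratio


if __name__ == "__main__":
    # Hypothetical scores; a real run would use the values returned by the API.
    example = {"index.html": 0.6, "about.html": -0.4, "faq.html": 0.1}
    if should_fail(example):
        # A non-zero exit code marks the workflow step as failed.
        sys.exit(1)
```

Making both thresholds configurable would let maintainers decide how strict the check should be for their own content.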
⚠️ GCP's bias in sentiment analysis
A few years ago, Google's API was criticized in the media for producing biased results with respect to race, gender, and religion. So I had mixed feelings about using a pre-trained model in this Action.
It is hard to know what is going on inside Google's proprietary algorithm and how they fight unwanted bias, but more recent research (see charlescearl, 2019) concluded that GCP seems to be less sensitive to the race or gender of participants than competing platforms. The same article recommends that users proceed with caution and conduct their own evaluations. The tests that I've done had neutral results, but I am ready to expand my research and pull the plug if needed.
Moreover, there is extraordinary research being done on identifying, analyzing, and reducing bias in data (see Dixon et al., 2018 from Google Research, or Caliskan et al., 2016 and May et al., 2019), and all I hope is that Google (or really any cloud provider out there) keeps getting better at this. I believe it is really important to enable and support bias and fairness research, especially now, after the recent rise of GPT-3 (see Burus, 2020), as society is exposed more and more to this technology.