Getting started with Sentiment Analysis

How do customers feel about your products or services? That’s an important question that business owners shouldn’t neglect. Positive and negative words matter: they can boost your business efforts or set off a crisis. Luckily, you can measure customer satisfaction through sentiment analysis.

Sentiment analysis is the process of analyzing online pieces of writing to determine the emotional tone they carry, whether they’re positive, negative, or neutral. In simple words, sentiment analysis helps to find the author’s attitude towards a topic.
Essentially, sentiment analysis (or sentiment classification) falls into the broad category of text classification tasks, where you are supplied with a phrase or a list of phrases and your classifier is supposed to tell whether the sentiment behind it is positive, negative, or neutral. Sometimes the neutral class is dropped to keep it a binary classification problem.
Sentiment analysis tools will collect all publicly available mentions containing your predefined keyword and analyze the emotions behind the message. The results of sentiment analysis are a wealth of information for your customer service teams, product development, or marketing.

Types of Sentiment Analysis
Sentiment analysis focuses on the polarity of a text (positive, negative, neutral), but it can also go beyond polarity to detect specific feelings and emotions (angry, happy, sad, etc.), urgency (urgent, not urgent), and even intentions (interested vs. not interested).

Depending on how you want to interpret customer feedback and queries, you can define and tailor your categories to meet your sentiment analysis needs. In the meantime, here are some of the most popular types of sentiment analysis:

Graded Sentiment Analysis
If polarity precision is important to your business, you might consider expanding your polarity categories to include different levels of positive and negative:

Very positive
Positive
Neutral
Negative
Very negative
This is usually referred to as graded or fine-grained sentiment analysis, and could be used to interpret 5-star ratings in a review, for example:

Very Positive = 5 stars
Very Negative = 1 star
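
A minimal sketch of that mapping in Python follows. Only the 5-star and 1-star endpoints come from the text above; the labels assigned to 2, 3, and 4 stars and the names STAR_LABELS and stars_to_label are assumptions made for illustration.

```python
# Hypothetical mapping from a 1-5 star rating to a graded sentiment label.
# Only 5 -> "very positive" and 1 -> "very negative" come from the article;
# the three middle assignments are an assumption.
STAR_LABELS = {
    5: "very positive",
    4: "positive",
    3: "neutral",
    2: "negative",
    1: "very negative",
}


def stars_to_label(stars: int) -> str:
    """Return the graded sentiment label for a whole-number star rating."""
    if stars not in STAR_LABELS:
        raise ValueError(f"expected a rating from 1 to 5, got {stars!r}")
    return STAR_LABELS[stars]


print(stars_to_label(5))  # very positive
print(stars_to_label(1))  # very negative
```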

Emotion detection
Emotion detection sentiment analysis allows you to go beyond polarity to detect emotions, like happiness, frustration, anger, and sadness.

Many emotion detection systems use lexicons (i.e. lists of words and the emotions they convey) or complex machine learning algorithms.

One of the downsides of using lexicons is that people express emotions in different ways. Some words that typically express anger, like "bad" or "kill" (e.g. "your product is so bad" or "your customer support is killing me"), might also express happiness (e.g. "this is badass" or "you are killing it").
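
The lexicon problem can be seen in a few lines of Python. The tiny EMOTION_LEXICON below and the detect_emotions helper are invented for this sketch and do not come from any published lexicon.

```python
from typing import List

# Toy emotion lexicon: each word maps to the emotion it usually conveys.
# The entries are illustrative assumptions, not a real published lexicon.
EMOTION_LEXICON = {
    "bad": "anger",
    "killing": "anger",
    "love": "happiness",
    "great": "happiness",
}


def detect_emotions(text: str) -> List[str]:
    """Return the emotion of every lexicon word found in the text, in order."""
    tokens = text.lower().split()
    return [EMOTION_LEXICON[token] for token in tokens if token in EMOTION_LEXICON]


# The lexicon handles the literal complaint as intended...
print(detect_emotions("your product is so bad"))  # ['anger']
# ...but mislabels praise that uses the same word figuratively.
print(detect_emotions("you are killing it"))  # ['anger']
```

A word-by-word lookup has no way to tell the two uses apart, which is one reason many systems turn to machine learning models that look at context.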

Aspect-based Sentiment Analysis
Usually, when analyzing sentiments of texts you’ll want to know which particular aspects or features people are mentioning in a positive, neutral, or negative way.

That's where aspect-based sentiment analysis can help. For example, given the product review "The battery life of this camera is too short", an aspect-based classifier would be able to determine that the sentence expresses a negative opinion about the battery life of the product in question.
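
To make the idea concrete, here is a deliberately simple rule-based sketch rather than a trained aspect-based classifier. The aspect list, the opinion words, and the aspect_sentiments function are all assumptions chosen to fit the camera example.

```python
from typing import Dict

# Hypothetical aspect terms and opinion words for a camera review domain.
ASPECT_TERMS = ["battery life", "picture quality", "price"]
OPINION_POLARITY = {
    "short": "negative",
    "poor": "negative",
    "great": "positive",
    "sharp": "positive",
}


def aspect_sentiments(sentence: str) -> Dict[str, str]:
    """Map each aspect mentioned in the sentence to a polarity label."""
    text = sentence.lower()
    # Use the polarity of the first opinion word found, or neutral if none.
    polarity = "neutral"
    for token in text.replace(",", " ").replace(".", " ").split():
        if token in OPINION_POLARITY:
            polarity = OPINION_POLARITY[token]
            break
    # Simplification: every aspect in the sentence receives the same polarity.
    return {aspect: polarity for aspect in ASPECT_TERMS if aspect in text}


print(aspect_sentiments("The battery life of this camera is too short"))
# {'battery life': 'negative'}
```

Because every aspect in a sentence gets the same label, this sketch would fail on a review that praises one feature and criticizes another, which is exactly the case real aspect-based models are built to handle.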

Multilingual sentiment analysis
Multilingual sentiment analysis can be difficult. It involves a lot of preprocessing and resources. Most of these resources are available online (e.g. sentiment lexicons), while others need to be created (e.g. translated corpora or noise detection algorithms), but you’ll need to know how to code to use them.

Alternatively, you could detect language in texts automatically with a language classifier, then train a custom sentiment analysis model to classify texts in the language of your choice.
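
The detect-then-classify pipeline might look roughly like the sketch below. The stop-word-overlap language detector and the per-language word lists are toy stand-ins (assumptions) for a real language classifier and a trained sentiment model.

```python
from typing import Tuple

# Toy resources: a few stop words and sentiment words per language.
STOP_WORDS = {
    "en": {"the", "is", "and", "this", "of"},
    "es": {"el", "la", "es", "y", "este", "de"},
}
POSITIVE = {"en": {"great", "good"}, "es": {"bueno", "excelente"}}
NEGATIVE = {"en": {"bad", "poor"}, "es": {"malo", "terrible"}}


def detect_language(text: str) -> str:
    """Pick the language whose stop words overlap most with the text."""
    tokens = set(text.lower().split())
    return max(STOP_WORDS, key=lambda lang: len(tokens & STOP_WORDS[lang]))


def classify(text: str) -> Tuple[str, str]:
    """Detect the language, then score the text with that language's word lists."""
    lang = detect_language(text)
    tokens = set(text.lower().split())
    score = len(tokens & POSITIVE[lang]) - len(tokens & NEGATIVE[lang])
    if score > 0:
        label = "positive"
    elif score < 0:
        label = "negative"
    else:
        label = "neutral"
    return lang, label


print(classify("this camera is great"))   # ('en', 'positive')
print(classify("este producto es malo"))  # ('es', 'negative')
```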

Why Is Sentiment Analysis Important?
Since humans express their thoughts and feelings more openly than ever before, sentiment analysis is fast becoming an essential tool to monitor and understand sentiment in all types of data.

Automatically analyzing customer feedback, such as opinions in survey responses and social media conversations, allows brands to learn what makes customers happy or frustrated, so that they can tailor products and services to meet their customers’ needs.

For example, using sentiment analysis to automatically analyze 4,000+ open-ended responses in your customer satisfaction surveys could help you discover why customers are happy or unhappy at each stage of the customer journey.

Maybe you want to track brand sentiment so you can detect disgruntled customers immediately and respond as soon as possible. Maybe you want to compare sentiment from one quarter to the next to see if you need to take action. Then you could dig deeper into your qualitative data to see why sentiment is falling or rising.
The overall benefits of sentiment analysis include:

1. Better customer insights: Sentiment analysis can help businesses to understand their customers' needs, preferences, and opinions. By analyzing customer feedback, companies can identify areas for improvement and tailor their products and services to better meet customer needs.

2. Improved brand reputation: By analyzing social media posts and customer reviews, businesses can monitor their brand reputation and identify potential issues before they become major problems. This allows businesses to proactively address customer concerns and maintain a positive brand image.

3. Increased customer satisfaction: By understanding customer feedback and sentiment, businesses can improve their products and services to better meet customer needs. This can lead to increased customer satisfaction and loyalty.

4. Enhanced marketing campaigns: Sentiment analysis can help businesses to develop more effective marketing campaigns by identifying customer preferences and trends. By analyzing social media posts and other online content, businesses can identify key influencers and tailor their messaging to better resonate with their target audience.

5. Streamlined customer support: Sentiment analysis can help businesses to quickly identify and prioritize customer issues. By analyzing customer feedback in real-time, businesses can respond to customer inquiries and complaints more efficiently, leading to improved customer support and satisfaction.

6. Competitive advantage: By analyzing customer sentiment and feedback, businesses can gain valuable insights into their competitors' strengths and weaknesses. This can help businesses to develop more effective marketing strategies and stay ahead of their competitors.

7. Scalability: Sentiment analysis can analyze large volumes of textual data quickly and accurately, making it an ideal solution for businesses with large customer bases or high volumes of social media posts and other online content to monitor.

Overall, sentiment analysis in machine learning can provide businesses with valuable insights into their customers, products, and competitors, leading to improved customer satisfaction, brand reputation, and competitive advantage.

Formulating the problem statement of sentiment analysis:
Before understanding the problem statement of a sentiment classification task, you need to have a clear idea of the general text classification problem. Let's define it formally.

Input:
- A document d
- A fixed set of classes C = {c1, c2, ..., cn}

Output: A predicted class c $\in$ C

The term "document" is used loosely in the text classification world. A document can be a tweet, a phrase, part of a news article, a whole article, a product manual, a story, and so on. The reason for this terminology is that a word is the atomic unit in this context, so a general term is needed to denote longer sequences of words. A tweet is a short document, whereas an article is a long one.

So, a training set of m labeled documents looks like (d1, c1), (d2, c2), ..., (dm, cm), where each label is one of the classes in C, and the ultimate output is a learned classifier.
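
In code, this formulation amounts to a function that takes labeled pairs and returns another function from documents to classes. The sketch below uses a majority-class baseline as the "learned classifier" purely to show that shape; the type aliases and the train_majority_baseline name are assumptions.

```python
from collections import Counter
from typing import Callable, List, Tuple

Document = str
Label = str


def train_majority_baseline(
    training_set: List[Tuple[Document, Label]],
) -> Callable[[Document], Label]:
    """Learn a trivial classifier that always predicts the most common label."""
    label_counts = Counter(label for _, label in training_set)
    majority_label = label_counts.most_common(1)[0][0]

    def classifier(document: Document) -> Label:
        # A real classifier would inspect the document; this baseline does not.
        return majority_label

    return classifier


training_set = [
    ("I love this movie", "pos"),
    ("great fun", "pos"),
    ("boring plot", "neg"),
]
classifier = train_majority_baseline(training_set)
print(classifier("an unseen review"))  # pos
```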

You are doing well! One question you probably have at this point is: where are the features of the documents? Good question! You will get to that a bit later.

Now, let's move on with the problem formulation and slowly build the intuition behind sentiment classification.

One crucial point to keep in mind while working on sentiment analysis is that not all the words in a phrase convey its sentiment. Words like "I", "are", and "am" do not convey any sentiment, so they are not relevant in a sentiment classification context. Think of this as a feature selection problem: in feature selection, you try to identify the features that relate most strongly to the class label, and the same idea applies here. Only a handful of words in a phrase carry the sentiment, and identifying and extracting them proves to be challenging. But don't worry, you will get to that.
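
As a rough illustration of dropping words that carry no sentiment, the sketch below filters a phrase against a small stop-word set. The STOP_WORDS set and the content_words helper are assumptions made for this example; real projects usually rely on a curated list such as the stop-word corpus that ships with NLTK.

```python
import re
from typing import List

# A tiny hand-picked stop-word set, used only for this example.
STOP_WORDS = {"i", "am", "are", "is", "it", "this", "the", "a", "to"}


def content_words(phrase: str) -> List[str]:
    """Lowercase the phrase, split it into words, and drop stop words."""
    tokens = re.findall(r"[a-z']+", phrase.lower())
    return [token for token in tokens if token not in STOP_WORDS]


print(content_words("I love this movie!"))  # ['love', 'movie']
```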

Consider the following movie review to understand this better:

"I love this movie! It's sweet, but with satirical humor. The dialogs are great and the adventure scenes are fun. It manages to be romantic and whimsical while laughing at the conventions of the fairy tale genre. I would recommend it to just about anyone. I have seen it several times and I'm always happy to see it again......."

Yes, this is undoubtedly a review which carries positive sentiments regarding a particular movie. But what are those specific words which define this positivity?

Take another look at the review.

You should have a clearer picture now. Words such as "love", "sweet", "great", "fun", "romantic", "recommend", and "happy" are the ones that give the text its positive sentiment.

A simple sentiment classifier in Python:
Here's an example of a simple sentiment classifier in Python using the Natural Language Toolkit (NLTK) library. For this case study, you'll use the offline movie review corpus covered in the NLTK book (https://www.nltk.org/book/ch06.html#document-classification), which can be downloaded from http://www.nltk.org/nltk_data/. NLTK provides a version of the dataset that categorizes each review as positive or negative. You need to download it first as follows:
python -m nltk.downloader all
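Run this from a terminal (the command prompt on Windows) rather than from a Jupyter Notebook. Downloading everything takes some time, so be patient. If you only need the corpus used here, python -m nltk.downloader movie_reviews fetches just the movie review data.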

For more information about NLTK datasets, visit https://www.nltk.org/data.html.

You will be implementing a Naive Bayes classifier using NLTK, which stands for Natural Language Toolkit. It is a library dedicated to NLP and NLU tasks, and its documentation is very good. It covers many techniques in great detail and also provides free datasets for experiments.

NLTK's official website (https://www.nltk.org/) is worth checking out because it has well-written tutorials covering different NLP concepts.

After all the data is downloaded, start by importing the movie reviews dataset with from nltk.corpus import movie_reviews. Then construct a list of documents, each labeled with its category.

# Load and prepare the dataset
import nltk
from nltk.corpus import movie_reviews
import random

documents = [(list(movie_reviews.words(fileid)), category)
              for category in movie_reviews.categories()
              for fileid in movie_reviews.fileids(category)]

random.shuffle(documents)

Next, you will define a feature extractor for documents, so the classifier knows which aspects of the data it should pay attention to. "In this case, you can define a feature for each word, indicating whether the document contains that word. To limit the number of features that the classifier needs to process, you start by constructing a list of the 2000 most frequent words in the overall corpus" (source: the NLTK book, chapter 6). You can then define a feature extractor that simply checks whether each of these words is present in a given document.

# Define the feature extractor

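# Count how often each lowercased word occurs across the whole corpus
all_words = nltk.FreqDist(w.lower() for w in movie_reviews.words())
# Keep the 2000 most frequent words (list(all_words)[:2000] is not sorted by frequency)
word_features = [word for word, _ in all_words.most_common(2000)]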

def document_features(document):
    document_words = set(document)
    features = {}
    for word in word_features:
        features['contains({})'.format(word)] = (word in document_words)
    return features
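
The extractor converts each document to a set before testing membership. A quick timing sketch with made-up data illustrates why: a list lookup scans the elements one by one, while a set lookup uses hashing. The list size, the target word, and the repetition count are arbitrary choices, and the measured times will differ from machine to machine.

```python
import timeit

# Made-up vocabulary of 50,000 distinct "words".
words_list = [f"word{i}" for i in range(50_000)]
words_set = set(words_list)
target = "word49999"  # last element: the slowest case for a list scan

# Time 200 membership checks against each container.
list_time = timeit.timeit(lambda: target in words_list, number=200)
set_time = timeit.timeit(lambda: target in words_set, number=200)

print(f"list: {list_time:.5f}s  set: {set_time:.5f}s")
```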

"The reason that you computed the set of all words in a document (document_words = set(document)), rather than just checking if the word is in the document, is that checking whether a word occurs in a set is much faster than checking whether it occurs in a list" (source: the NLTK book, chapter 6).

You have defined the feature extractor. Now, you can use it to train a Naive Bayes classifier to predict the sentiments of new movie reviews. To check your classifier's performance, you will compute its accuracy on the test set. NLTK provides show_most_informative_features() to see which features the classifier found to be most informative.

# Train Naive Bayes classifier
featuresets = [(document_features(d), c) for (d,c) in documents]
train_set, test_set = featuresets[100:], featuresets[:100]
classifier = nltk.NaiveBayesClassifier.train(train_set)

# Test the classifier
print(nltk.classify.accuracy(classifier, test_set))

0.71
Wow! The classifier was able to achieve an accuracy of 71% without any parameter tweaking or fine-tuning. This is great for a first go! Your exact number will vary from run to run because the documents are shuffled randomly before the split.
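
Accuracy here simply means the fraction of test reviews whose predicted label matches the true label. The helper below is a plain-Python sketch of that calculation, not NLTK's own implementation, and the prediction lists are made up to reproduce a 71-out-of-100 result.

```python
from typing import List


def accuracy(predictions: List[str], gold_labels: List[str]) -> float:
    """Return the fraction of predictions that match the gold labels."""
    if len(predictions) != len(gold_labels):
        raise ValueError("predictions and gold_labels must have the same length")
    correct = sum(p == g for p, g in zip(predictions, gold_labels))
    return correct / len(gold_labels)


# Made-up example: 71 of 100 predictions agree with the true labels.
predictions = ["pos"] * 71 + ["neg"] * 29
gold_labels = ["pos"] * 100
print(accuracy(predictions, gold_labels))  # 0.71
```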

# Show the most important features as interpreted by Naive Bayes
classifier.show_most_informative_features(5)

Most Informative Features
       contains(winslet) = True              pos : neg    =      8.4 : 1.0
     contains(illogical) = True              neg : pos    =      7.6 : 1.0
      contains(captures) = True              pos : neg    =      7.0 : 1.0
        contains(turkey) = True              neg : pos    =      6.5 : 1.0
        contains(doubts) = True              pos : neg    =      5.8 : 1.0
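
Each ratio in this table compares how likely a feature value is under one label versus the other. A simplified sketch of that calculation is shown below, using made-up counts (a word found in 38 of 1,000 negative reviews and 5 of 1,000 positive reviews) and plain relative frequencies. NLTK smooths its probability estimates, so its figures will not match a raw count ratio exactly; the likelihood_ratio function is an assumption for illustration.

```python
def likelihood_ratio(count_a: int, total_a: int, count_b: int, total_b: int) -> float:
    """Ratio of P(feature | label a) to P(feature | label b) from raw counts."""
    p_given_a = count_a / total_a
    p_given_b = count_b / total_b
    return p_given_a / p_given_b


# Hypothetical counts: the word appears in 38 of 1000 negative reviews
# and in 5 of 1000 positive reviews.
ratio = likelihood_ratio(38, 1000, 5, 1000)
print(f"neg : pos = {ratio:.1f} : 1.0")  # neg : pos = 7.6 : 1.0
```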

"In the dataset, a review that mentions "illogical" is almost 8 times more likely to be negative than positive, while a review that mentions "captures" is about 7 times more likely to be positive" (adapted from the NLTK book, chapter 6).

Now the question - why Naive Bayes?

You chose Naive Bayes because of the way it is designed. Text data has some practical and sophisticated characteristics that map well onto Naive Bayes, provided you are not considering neural networks. Besides, it's easy to interpret and does not behave like a black-box model.
Naive Bayes has a disadvantage as well: its main limitation is the assumption of independent predictors. In real life, it is almost impossible to get a set of predictors that are entirely independent.
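
To see where the independence assumption enters, here is a from-scratch sketch of a word-count (multinomial) Naive Bayes with add-one smoothing. It is written for illustration and is not NLTK's implementation; the train and predict functions and the four toy reviews are assumptions.

```python
import math
from collections import Counter, defaultdict


def train(labeled_docs):
    """Count labels and per-label word occurrences from (words, label) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for words, label in labeled_docs:
        label_counts[label] += 1
        word_counts[label].update(words)
        vocabulary.update(words)
    return label_counts, word_counts, vocabulary


def predict(words, model):
    """Return the label with the highest log-probability score."""
    label_counts, word_counts, vocabulary = model
    total_docs = sum(label_counts.values())
    scores = {}
    for label, doc_count in label_counts.items():
        total_words = sum(word_counts[label].values())
        score = math.log(doc_count / total_docs)  # log prior, log P(c)
        for word in words:
            if word in vocabulary:  # ignore words never seen in training
                # Independence assumption: each word contributes its own
                # log P(word | c) term, regardless of the other words.
                smoothed = (word_counts[label][word] + 1) / (total_words + len(vocabulary))
                score += math.log(smoothed)
        scores[label] = score
    return max(scores, key=scores.get)


toy_reviews = [
    ("great fun movie".split(), "pos"),
    ("love this great film".split(), "pos"),
    ("boring illogical plot".split(), "neg"),
    ("bad boring film".split(), "neg"),
]
model = train(toy_reviews)
print(predict("great fun".split(), model))    # pos
print(predict("boring plot".split(), model))  # neg
```

The score for a review is just a sum of per-word terms, so a phrase like "not great" is treated as the separate words "not" and "great". That is the practical cost of assuming the predictors are independent.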


Rodney Kirui
