Hugging Face 101: A Tutorial for Absolute Beginners!

Pavan Belagatti

Posted on September 12, 2023

Welcome to this beginner-friendly tutorial on sentiment analysis using Hugging Face's transformers library! Sentiment analysis is a Natural Language Processing (NLP) technique used to determine the emotional tone or attitude expressed in a piece of text.

In this tutorial, you'll learn how to use pre-trained machine learning models from Hugging Face to perform sentiment analysis on various text examples. We'll walk you through the entire process, from installing the required packages to running and interpreting the model's output, all within a SingleStore Notebook environment, which works much like a Jupyter Notebook.

By the end of this tutorial, you'll know how to use the Hugging Face Transformers library to analyze the sentiment of text data.

What is Hugging Face🤗?

Hugging Face🤗 is a company and community specializing in Natural Language Processing (NLP) and artificial intelligence (AI). Founded in 2016, it has made significant contributions to the field of NLP by democratizing access to state-of-the-art machine learning models and tools.

Hugging Face has a strong community focus. They provide a platform where researchers and developers can share their trained models, thereby fostering collaboration and accelerating progress in the field.

NLP is a field of artificial intelligence that focuses on the interaction between computers and human language. Hugging Face is known for its contributions to NLP through its open-source libraries, pre-trained models, and community platforms.

Hugging Face🤗 Transformers as a Library:

Hugging Face's Transformers library is an open-source library for NLP and machine learning. It provides a wide variety of pre-trained models and architectures like BERT, GPT-2, T5, and many others. The library is designed to be highly modular and easy to use, allowing for the quick development of both research and production projects. It supports multiple languages and tasks like text classification, question-answering, text generation, translation, and more.

Prerequisites

Before you start with this tutorial, make sure you have the following prerequisites in place:

  • SingleStore Notebook is the only prerequisite, and the tutorial is designed to be followed in it. If you haven't set up SingleStore Notebook yet, you can do so by signing up at SingleStore and then selecting the Notebook feature.

Create a new blank Notebook.

You will land on a SingleStore Notebook dashboard.

We will use this notebook as our Python playground to run the commands below.

Step 1: Install Required Packages

First, you'll need to install the transformers library from Hugging Face. You can do this using pip:



!pip install transformers



The code in this tutorial uses PyTorch as the backend for the Hugging Face transformers library, so it needs to be installed as well.

You can install PyTorch by running the following command in your SingleStore Notebook:



!pip install torch



Restart the Kernel: After installing, you may need to restart the SingleStore Notebook kernel to ensure that the newly installed packages are recognized. You can usually do this by clicking on "Kernel" in the menu and then selecting "Restart Kernel".
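
After the restart, a quick check can confirm that both packages are visible to the kernel before moving on. The snippet below is a minimal sketch that uses only the Python standard library; the helper name `missing_packages` is made up for this illustration and is not part of transformers or SingleStore.

```python
import importlib.util


def missing_packages(names):
    """Return the package names that Python cannot locate for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# The two packages installed in Step 1
missing = missing_packages(["transformers", "torch"])
if missing:
    print("Not importable yet:", ", ".join(missing))
else:
    print("All required packages are importable.")
```

If a package is still reported as not importable, re-running the matching pip command and restarting the kernel again is usually the first thing to try.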

Step 2: Import Libraries

Import the necessary Python libraries.



from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch



Step 3: Load Pre-trained Model and Tokenizer

Load a pre-trained model and its corresponding tokenizer. For this example, let's use the distilbert-base-uncased-finetuned-sst-2-english model for sentiment analysis.



tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased-finetuned-sst-2-english")
model = AutoModelForSequenceClassification.from_pretrained("distilbert-base-uncased-finetuned-sst-2-english")



Step 4: Preprocess Text

Tokenize the text you want to analyze.



text = "I love programming!"
# Convert the text to token IDs and return them as PyTorch tensors ("pt")
tokens = tokenizer(text, padding=True, truncation=True, return_tensors="pt")


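
The `padding=True` and `truncation=True` arguments matter most when several texts of different lengths are tokenized together. The toy sketch below illustrates the idea in plain Python. It is not the tokenizer's actual implementation, the function name `pad_and_truncate` and the token IDs are made up, and the real tokenizer also handles special tokens, which this sketch ignores.

```python
def pad_and_truncate(sequences, max_length, pad_id=0):
    """Toy illustration: cut long sequences, pad short ones, build masks."""
    # Truncation: keep at most max_length token IDs per sequence
    truncated = [seq[:max_length] for seq in sequences]
    longest = max(len(seq) for seq in truncated)
    # Padding: extend shorter sequences with pad_id up to the longest length
    input_ids = [seq + [pad_id] * (longest - len(seq)) for seq in truncated]
    # Attention mask: 1 marks a real token, 0 marks padding
    attention_mask = [
        [1] * len(seq) + [0] * (longest - len(seq)) for seq in truncated
    ]
    return {"input_ids": input_ids, "attention_mask": attention_mask}


# Made-up token IDs for two "sentences" of different lengths
batch = pad_and_truncate([[7, 8, 9, 10], [7, 8]], max_length=3)
print(batch["input_ids"])       # [[7, 8, 9], [7, 8, 0]]
print(batch["attention_mask"])  # [[1, 1, 1], [1, 1, 0]]
```

The attention mask tells the model which positions hold real tokens and which are padding that should be ignored.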

Step 5: Model Inference

Pass the tokenized text through the model.



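# Gradients are only needed for training, so disable them for inference
with torch.no_grad():
    outputs = model(**tokens)
    logits = outputs.logits  # raw, unnormalized scores for each class
    probabilities = torch.softmax(logits, dim=1)  # scores converted to probabilities that sum to 1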


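
To see what the softmax step does to the logits, here is a minimal plain-Python sketch of the same calculation for a single row of scores. The logit values are made up for illustration, and in the tutorial `torch.softmax` does this work for you.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    # Subtracting the maximum keeps math.exp from overflowing on large scores
    highest = max(logits)
    exps = [math.exp(score - highest) for score in logits]
    total = sum(exps)
    return [value / total for value in exps]


# Made-up logits in the order [negative, positive]
probs = softmax([-4.0, 4.0])
print(probs)
print(sum(probs))
```

The larger logit ends up with the larger probability, so softmax does not change which class wins; it only puts the scores on a 0 to 1 scale that is easier to read.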

Step 6: Interpret Results

Interpret the model's output to get the sentiment.



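# Pick the index of the highest probability; .item() converts the
# one-element tensor into a plain Python int so it can index the list.
label_id = torch.argmax(probabilities, dim=1).item()
labels = ['Negative', 'Positive']  # for this model, index 0 is negative and 1 is positive
label = labels[label_id]
print(f"The sentiment is: {label}")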



This should print "The sentiment is: Positive" or "The sentiment is: Negative", depending on the sentiment of the text.
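
The interpretation step above handles one sentence at a time. If you tokenize several sentences together, the model returns one row of probabilities per sentence, and each row needs its own label. The sketch below shows that logic in plain Python on made-up probabilities; the function name `interpret` is hypothetical, and with a real PyTorch tensor you could first convert it to nested lists with `probabilities.tolist()`.

```python
def interpret(probabilities, labels=("Negative", "Positive")):
    """Return a (label, confidence) pair for each row of probabilities."""
    results = []
    for row in probabilities:
        # Index of the largest probability in this row (a plain-Python argmax)
        best = max(range(len(row)), key=lambda i: row[i])
        results.append((labels[best], row[best]))
    return results


# Made-up probabilities for two sentences, in the order [negative, positive]
for label, confidence in interpret([[0.02, 0.98], [0.91, 0.09]]):
    print(f"The sentiment is: {label} (confidence {confidence:.2f})")
```

Reporting the confidence next to the label makes it easier to spot borderline cases where the two probabilities are close.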

Make sure you are executing your code in the SingleStore Notebook playground.

Let's change the text we want to analyze from "I love programming!" to "I hate programming!". You should now see a Negative result.

Let's analyze one more sentence, "SingleStore's Notebook feature is just mind blowing!", and see the response. As expected, it should be Positive.

Congratulations on completing this beginner-friendly tutorial on sentiment analysis using Hugging Face's transformers library! By now, you should have a solid understanding of how to use pre-trained models to analyze the sentiment of text. You've learned how to tokenize text, run it through a model, and interpret the output—all within a SingleStore Notebook environment.
