Hugging Face 101: A Tutorial for Absolute Beginners!

Welcome to this beginner-friendly tutorial on sentiment analysis using Hugging Face's transformers library! Sentiment analysis is a Natural Language Processing (NLP) technique used to determine the emotional tone or attitude expressed in a piece of text.

In this tutorial, you'll learn how to use pre-trained machine learning models from Hugging Face to perform sentiment analysis on various text examples. We'll walk you through the entire process, from installing the required packages to running the model and interpreting its output, all within a SingleStore Notebook environment, which works much like a Jupyter Notebook.

By the end of this tutorial, you'll know how to use the Hugging Face Transformers library to analyze the sentiment of text data.

What is Hugging Face🤗?

Hugging Face🤗 is a company and community specializing in Natural Language Processing (NLP) and artificial intelligence (AI). Founded in 2016, the company has made significant contributions to the field of NLP by democratizing access to state-of-the-art machine learning models and tools.

Hugging Face has a strong community focus. They provide a platform where researchers and developers can share their trained models, thereby fostering collaboration and accelerating progress in the field.

NLP is a field of artificial intelligence that focuses on the interaction between computers and human language. Hugging Face is known in this field for its open-source libraries, pre-trained models, and community platforms.

Hugging Face🤗 Transformers as a Library:

Hugging Face's Transformers library is an open-source library for NLP and machine learning. It provides a wide variety of pre-trained models and architectures like BERT, GPT-2, T5, and many others. The library is designed to be highly modular and easy to use, allowing for the quick development of both research and production projects. It supports multiple languages and tasks like text classification, question-answering, text generation, translation, and more.

Prerequisites

The only prerequisite for this tutorial is access to a SingleStore Notebook, since the tutorial is designed to be followed there. If you don't have access yet, you can sign up at SingleStore and then select the Notebook feature.

Create a new blank Notebook.

You will land on a SingleStore Notebook dashboard.

From here, we will use the notebook as our Python playground to run the commands.

Step 1: Install Required Packages

First, you'll need to install the transformers library from Hugging Face. You can do this using pip:



!pip install transformers

The code in this tutorial uses PyTorch to run the model, so PyTorch must be installed alongside the transformers library.

You can install PyTorch by running the following command in your SingleStore Notebook:



!pip install torch

Restart the Kernel: After installing, you may need to restart the SingleStore Notebook kernel to ensure that the newly installed packages are recognized. You can usually do this by clicking on "Kernel" in the menu and then selecting "Restart Kernel".
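After restarting, a quick check can confirm that both packages are importable. The following is a minimal sketch using only the Python standard library; the helper name `is_installed` is hypothetical and is not part of either package.

```python
import importlib.util

def is_installed(package_name):
    """Return True if the import system can locate the named top-level package."""
    return importlib.util.find_spec(package_name) is not None

# Report the status of the two packages this tutorial relies on.
for name in ("transformers", "torch"):
    status = "found" if is_installed(name) else "missing (re-run the pip install above)"
    print(f"{name}: {status}")
```

If either package is reported as missing, run the install command again and restart the kernel once more.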

Step 2: Import Libraries

Import the necessary Python libraries.



from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

Step 3: Load Pre-trained Model and Tokenizer

Load a pre-trained model and its corresponding tokenizer. For this example, let's use the distilbert-base-uncased-finetuned-sst-2-english model for sentiment analysis.



tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased-finetuned-sst-2-english")
model = AutoModelForSequenceClassification.from_pretrained("distilbert-base-uncased-finetuned-sst-2-english")

Step 4: Preprocess Text

Tokenize the text you want to analyze.



text = "I love programming!"
tokens = tokenizer(text, padding=True, truncation=True, return_tensors="pt")
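To see what `padding=True` and `truncation=True` do conceptually, here is a toy sketch in plain Python. It is not the library's implementation: the whitespace splitting, the tiny vocabulary, the `max_length` of 4, the `pad_id` of 0, and the helper name `toy_encode_batch` are all assumptions made for illustration. The real tokenizer works on subword units and uses a much larger vocabulary.

```python
def toy_encode_batch(texts, vocab, max_length=4, pad_id=0, unk_id=1):
    """Toy stand-in for a tokenizer call on a batch of texts (illustration only)."""
    # Map each lowercase, whitespace-separated word to an integer id.
    encoded = [[vocab.get(w, unk_id) for w in t.lower().split()] for t in texts]
    # Truncation: cut every sequence down to at most max_length ids.
    encoded = [ids[:max_length] for ids in encoded]
    # Padding: extend shorter sequences so all match the longest one in the batch.
    longest = max(len(ids) for ids in encoded)
    input_ids = [ids + [pad_id] * (longest - len(ids)) for ids in encoded]
    # Attention mask: 1 marks a real token, 0 marks padding the model should ignore.
    attention_mask = [[1] * len(ids) + [0] * (longest - len(ids)) for ids in encoded]
    return {"input_ids": input_ids, "attention_mask": attention_mask}

vocab = {"i": 2, "love": 3, "programming": 4, "hate": 5, "bugs": 6}
batch = toy_encode_batch(["I love programming", "I hate bugs in my code today"], vocab)
print(batch["input_ids"])       # [[2, 3, 4, 0], [2, 5, 6, 1]]
print(batch["attention_mask"])  # [[1, 1, 1, 0], [1, 1, 1, 1]]
```

The first sentence is padded with one extra id, and the second is truncated to four ids. With a single input sentence, as in this step, there is nothing to pad, so these options mainly matter once you pass several texts at once.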

Step 5: Model Inference

Pass the tokenized text through the model.



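# no_grad() turns off gradient tracking, which is not needed for inference
with torch.no_grad():
    outputs = model(**tokens)
    logits = outputs.logits  # raw, unnormalized scores, one per class
    probabilities = torch.softmax(logits, dim=1)  # turn scores into probabilities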
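To make the softmax step concrete, here is a small worked example in plain Python. The logit values are made up for illustration and are not actual model output.

```python
import math

def softmax(scores):
    """Convert a list of raw scores (logits) into probabilities that sum to 1."""
    # Subtracting the maximum first keeps exp() from overflowing; the result is unchanged.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

example_logits = [-2.0, 3.0]  # hypothetical [negative, positive] scores
probs = softmax(example_logits)
print([round(p, 4) for p in probs])  # [0.0067, 0.9933]
```

The larger logit ends up with almost all of the probability mass, which is why a clear gap between the two scores translates into a confident prediction.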

Step 6: Interpret Results

Interpret the model's output to get the sentiment.



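# argmax returns a one-element tensor; .item() converts it to a plain Python int
label_id = torch.argmax(probabilities, dim=1).item()
# For this model, index 0 is the negative class and index 1 is the positive class
labels = ['Negative', 'Positive']
label = labels[label_id]
print(f"The sentiment is: {label}")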

This should print either "The sentiment is: Positive" or "The sentiment is: Negative", depending on the sentiment of the text.

Make sure you are executing your code in the SingleStore Notebook playground.
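The predicted label can also be reported together with its probability. The following plain-Python sketch assumes the probabilities have already been converted to a list of floats (for example with `probabilities[0].tolist()`); the helper name `interpret` and the sample values are hypothetical.

```python
def interpret(probs, labels=("Negative", "Positive")):
    """Return the most likely label and its probability."""
    # Pick the index whose probability is largest (the same job torch.argmax does).
    best = max(range(len(probs)), key=lambda i: probs[i])
    return labels[best], probs[best]

label, confidence = interpret([0.0067, 0.9933])  # made-up probabilities
print(f"The sentiment is: {label} ({confidence:.1%} confidence)")
```

Showing the probability next to the label makes it easier to spot borderline cases, where the two classes receive similar scores.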

Now let's change the text we want to analyze from "I love programming!" to "I hate programming!" and run the steps again. You should see a Negative result.

Let's analyze one more sentence, "SingleStore's Notebook feature is just mind blowing!", and see the response. It should be Positive, as expected.

Congratulations on completing this beginner-friendly tutorial on sentiment analysis using Hugging Face's transformers library! By now, you should have a solid understanding of how to use pre-trained models to analyze the sentiment of text. You've learned how to tokenize text, run it through a model, and interpret the output, all within a SingleStore Notebook environment.
