BERT WordPiece Tokenizer Explained

jamescalam

James Briggs

Posted on September 14, 2021

Building a transformer model from scratch is often the only option for more specialized use cases. Although BERT and other transformer models have been pre-trained for many languages and domains, they do not cover everything.

Often, these less common use cases stand to gain the most from a purpose-built transformer model, whether for an uncommon language or a less tech-savvy domain.

BERT is the most popular transformer for a wide range of language-based machine learning tasks, from sentiment analysis to question answering. BERT has enabled a diverse range of innovation across many borders and industries.

The first step for many in designing a new BERT model is the tokenizer. In this article, we'll look at the WordPiece tokenizer used by BERT and see how we can build our own from scratch.
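To give a sense of what that involves, here is a minimal sketch that trains a BERT-style WordPiece tokenizer with Hugging Face's `tokenizers` library. The training file `data.txt`, the output directory, and the hyperparameter values are placeholder assumptions for illustration, not values from the full walkthrough.

```python
from tokenizers import BertWordPieceTokenizer

# Initialize a WordPiece tokenizer with BERT-style preprocessing
# (lowercasing and accent stripping, as in bert-base-uncased).
tokenizer = BertWordPieceTokenizer(
    clean_text=True,
    handle_chinese_chars=True,
    strip_accents=True,
    lowercase=True
)

# Train on one or more plain-text files, one sample per line.
# 'data.txt' and the values below are placeholder assumptions.
tokenizer.train(
    files=["data.txt"],
    vocab_size=30_522,   # same vocab size as the original BERT
    min_frequency=2,     # drop very rare tokens
    special_tokens=["[PAD]", "[UNK]", "[CLS]", "[SEP]", "[MASK]"],
    wordpieces_prefix="##"  # prefix marking subword continuations
)

# Writes vocab.txt to the output directory.
tokenizer.save_model("./my-bert-tokenizer")
```

The saved `vocab.txt` can then be loaded with `transformers.BertTokenizer.from_pretrained("./my-bert-tokenizer")`, so the new tokenizer plugs into the usual BERT training pipeline.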

A full walkthrough is available on Medium, with a free link if you don't have a subscription!

