NLP@AWS Newsletter 04/2022
Wong Voon Wong
Posted on April 4, 2022
Hello world. This is the monthly AWS Natural Language Processing (NLP) newsletter covering everything related to NLP at AWS. Feel free to leave comments and share it on your social network.
NLP@AWS Customer Success Story
How to Build a Scalable Chatbot and Deploy to Europe’s Largest Airline
Ryanair ran a customer care improvement initiative to enable customers to access support 24x7 while allowing agents to spend more time with customers whose needs require a human touch. In this video, the CTO of Cation Consulting explains the architecture, which uses Amazon Translate, Amazon Lex and Amazon SageMaker to help Ryanair implement a multilingual, multi-channel chatbot that optimises customer responses, while using Amazon Comprehend for sentiment analysis of customer interactions.
AI Language Services
Extract granular sentiment in text
When using NLP to perform sentiment analysis on an input document, typically only the dominant sentiment is determined, even though the document may contain nuanced sentiment referring to multiple entities. Businesses can exploit this level of granularity with the Amazon Comprehend Targeted Sentiment feature to understand which specific attributes of their products or services were well received by customers and should therefore be retained or strengthened, and which attributes need to be improved.
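As a rough illustration (the review text is made up), the Targeted Sentiment API can be called with boto3 to get a sentiment per entity mention rather than a single label for the whole document:

```python
# Minimal sketch of calling Comprehend Targeted Sentiment with boto3;
# the review text and region are illustrative, not from the post.
import boto3

comprehend = boto3.client("comprehend", region_name="us-east-1")

review = "The hotel room was spotless, but the check-in process was painfully slow."

response = comprehend.detect_targeted_sentiment(Text=review, LanguageCode="en")

# Each detected entity carries its own sentiment, independent of the overall document tone.
for entity in response["Entities"]:
    for mention in entity["Mentions"]:
        print(mention["Text"], mention["MentionSentiment"]["Sentiment"])
```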
Automate email responses using Amazon Comprehend custom classification and entity detection
Organizations and businesses that provide customer care spend a lot of resources and manpower being responsive to customers' needs and ensuring a good customer experience. An automated approach to answering customer queries can often improve the customer care experience while lowering costs. Many organisations have the requisite data assets but are held back by a lack of AI/ML expertise. This blog post shows how you can use Amazon Comprehend to identify the intent of a customer care email and automate the response using AWS services.
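A minimal sketch of the intent-classification step only, assuming a custom classifier has already been trained and deployed to a real-time endpoint (the endpoint ARN, email text and intent label below are placeholders):

```python
# Hypothetical sketch: classify an incoming email against a Comprehend custom classifier endpoint.
import boto3

comprehend = boto3.client("comprehend")

email_body = "Hi, I was charged twice for my last order. Can you refund the duplicate payment?"

result = comprehend.classify_document(
    Text=email_body,
    EndpointArn="arn:aws:comprehend:us-east-1:123456789012:document-classifier-endpoint/email-intents",
)

# Pick the highest-scoring intent and route it to the matching response template.
top_intent = max(result["Classes"], key=lambda c: c["Score"])
print(top_intent["Name"], top_intent["Score"])
```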
Build a traceable, custom, multi-format document parsing pipeline with Amazon Textract
Many businesses today still rely on using forms as a necessary business tool due to compliance reasons or where digital data capture is not possible.
These businesses often have to release new versions of their forms, which can break traditional OCR systems and disrupt downstream processing tools.
Amazon Textract is a machine learning (ML) service that automatically extracts printed text and handwriting from forms in minutes. You can build a serverless, event-driven, multi-format document parsing pipeline using Amazon Textract. This post demonstrates how to design a robust pipeline that can easily handle multiple versions of forms.
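The post's pipeline is event-driven and serverless; as a minimal sketch of just the extraction call at its core (the file name is a placeholder), Amazon Textract's AnalyzeDocument API can be invoked with boto3:

```python
# Hypothetical sketch: extract form fields from a single-page document image with Textract.
import boto3

textract = boto3.client("textract")

with open("claim_form_v2.png", "rb") as f:  # placeholder form image
    document_bytes = f.read()

response = textract.analyze_document(
    Document={"Bytes": document_bytes},
    FeatureTypes=["FORMS", "TABLES"],
)

# Blocks of type KEY_VALUE_SET hold the detected form key/value pairs.
kv_blocks = [b for b in response["Blocks"] if b["BlockType"] == "KEY_VALUE_SET"]
print(f"Detected {len(kv_blocks)} key/value blocks")
```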
Enable conversational chatbots for telephony using Amazon Lex and the Amazon Chime SDK
Conversational AI can deliver powerful, automated, and interactive experiences through voice and text. A common application is AI-powered self-service, such as conversational interactive voice response (IVR) systems that handle voice calls by automating informational responses. This blog post shows how you can build such an application in a serverless manner without an expensive IVR platform. It uses an Amazon Lex-driven chatbot with audio powered by the Amazon Chime SDK Public Switched Telephone Network (PSTN) audio service, and natively integrates with Amazon Polly's text-to-speech capabilities to convert text responses into speech.
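The telephony wiring itself lives in the Chime SDK PSTN audio service, but the underlying bot is an ordinary Amazon Lex V2 bot. As a rough, text-only illustration (bot IDs, alias and session values below are placeholders), the same bot can be exercised directly with boto3:

```python
# Hypothetical sketch: send a text utterance to a Lex V2 bot and print its replies.
import boto3

lex = boto3.client("lexv2-runtime")

response = lex.recognize_text(
    botId="ABCDEFGHIJ",          # placeholder bot ID
    botAliasId="TSTALIASID",     # placeholder alias ID
    localeId="en_US",
    sessionId="caller-1234",
    text="What time do you open tomorrow?",
)

for message in response.get("messages", []):
    print(message["content"])  # the reply that Polly would speak back on a real call
```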
NLP on Amazon SageMaker
Train EleutherAI GPT-J using SageMaker
EleutherAI released GPT-J 6B as an open-source alternative to OpenAI's GPT-3. EleutherAI's goal is to train a model of comparable scale to GPT-3 and make it available to the public under an open license, and GPT-J has since gained a lot of interest from researchers, data scientists, and software developers. This notebook shows you how to easily train and tune GPT-J using Amazon SageMaker distributed training and Hugging Face on NVIDIA GPU instances.
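For orientation, here is a condensed sketch of launching such a job with the SageMaker Hugging Face estimator; the training script name, role, container versions, instance type, hyperparameters and S3 paths are illustrative assumptions, not the notebook's exact settings:

```python
# Hypothetical sketch: fine-tune GPT-J with the SageMaker Hugging Face estimator
# and SageMaker distributed model parallelism.
from sagemaker.huggingface import HuggingFace

estimator = HuggingFace(
    entry_point="train_gptj.py",          # placeholder training script
    source_dir="./scripts",
    role="arn:aws:iam::123456789012:role/SageMakerExecutionRole",  # placeholder role
    instance_type="ml.p3.16xlarge",       # 8x V100 GPUs
    instance_count=1,
    transformers_version="4.17",
    pytorch_version="1.10",
    py_version="py38",
    hyperparameters={"model_name_or_path": "EleutherAI/gpt-j-6B", "epochs": 1},
    # Model parallelism splits the 6B-parameter model across the GPUs on the instance.
    distribution={
        "smdistributed": {"modelparallel": {"enabled": True, "parameters": {"partitions": 4}}},
        "mpi": {"enabled": True, "processes_per_host": 8},
    },
)

estimator.fit({"train": "s3://my-bucket/gptj-train"})  # placeholder dataset location
```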
Have fun with your own private GPT text generation playground
Want to create your very own text generation playground without having to pay for GPT-3 usage or even having to train a GPT-J model? This blog shows how you can easily deploy GPT-J on Amazon SageMaker and create a web interface to interact with the model. Have fun!
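If you just want an endpoint to experiment with before building the web front end, a minimal deployment with the SageMaker Hugging Face container looks roughly like this; the model artefact location, role, container versions and instance type are assumptions, not the blog's exact configuration:

```python
# Hypothetical sketch: deploy a packaged GPT-J model to a SageMaker real-time endpoint.
from sagemaker.huggingface import HuggingFaceModel

model = HuggingFaceModel(
    model_data="s3://my-bucket/gpt-j-6b/model.tar.gz",  # placeholder model artefact
    role="arn:aws:iam::123456789012:role/SageMakerExecutionRole",  # placeholder role
    transformers_version="4.17",
    pytorch_version="1.10",
    py_version="py38",
)

predictor = model.deploy(initial_instance_count=1, instance_type="ml.g4dn.xlarge")

# The Hugging Face inference container accepts the standard {"inputs": ...} payload.
output = predictor.predict({"inputs": "Once upon a time in the cloud,"})
print(output)
```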
Accelerate BERT inference with AWS Inferentia
BERT and Transformer-based models tend to be relatively large, complex, and slow compared to traditional machine learning algorithms. One way of accelerating these models is to deploy them on AWS Inferentia-based EC2 instances. AWS Inferentia is a chip designed to deliver up to 2.3x higher throughput and up to 70% lower cost per inference than comparable current-generation GPU-based Amazon EC2 instances. This post shows you how to use the AWS Neuron SDK to convert a BERT-based model to run on AWS Inferentia-powered EC2 Inf1 instances.
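A minimal sketch of the compilation step with the torch-neuron package (the model name, sequence length and file names are illustrative, not the post's exact choices):

```python
# Hypothetical sketch: compile a BERT classifier for Inferentia with torch-neuron.
import torch
import torch.neuron
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
# return_dict=False so the traced model returns plain tensors.
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", return_dict=False)
model.eval()

# Neuron compiles against fixed input shapes, so trace with a padded example.
example = tokenizer("Compile me for Inferentia", padding="max_length",
                    max_length=128, return_tensors="pt")
neuron_model = torch.neuron.trace(model, (example["input_ids"], example["attention_mask"]))

neuron_model.save("bert_neuron.pt")  # load this artefact on an Inf1 instance for inference
```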
A practical guide to assessing text summarisation models
Many organisations have huge amounts of text documents that often need to be summarised, and NLP-based text summarisation is one way of automating such tasks. This two-part series proposes a practical approach to assessing summarisation models at scale. The first part introduces the dataset and the metrics used to evaluate a simple heuristic approach. The second part uses a zero-shot learning model and discusses an approach to comparing the two models that can be applied to many other models during the experimentation phase.
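The series builds its own evaluation harness, but as a tiny standalone illustration of the kind of metric involved, here is ROUGE scoring of a single candidate summary with the rouge-score package (the example texts are made up):

```python
# Illustrative sketch: score one candidate summary against a reference with ROUGE.
from rouge_score import rouge_scorer

reference = "The committee approved the budget and postponed the vote on the new policy."
candidate = "The budget was approved; the policy vote was postponed."

scorer = rouge_scorer.RougeScorer(["rouge1", "rougeL"], use_stemmer=True)
scores = scorer.score(reference, candidate)

for name, score in scores.items():
    print(f"{name}: precision={score.precision:.2f} recall={score.recall:.2f} f1={score.fmeasure:.2f}")
```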
Introduction to Hugging Face on Amazon SageMaker
Transformers are state-of-the-art NLP models that display almost human-level performance on NLP tasks like text generation, text classification, and question answering. You can take pretrained transformer models and fine-tune them on your own data to improve the performance of your NLP applications. This video introduces you to Hugging Face and how you can easily build, train, and deploy state-of-the-art NLP models using Amazon SageMaker.
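As a quick local taste of what a pretrained transformer gives you before moving any training to SageMaker (not taken from the video), the transformers pipeline API is enough:

```python
# Illustrative sketch: run a pretrained transformer locally via the pipeline API.
from transformers import pipeline

classifier = pipeline("sentiment-analysis")  # downloads a default pretrained model
print(classifier("Fine-tuning on our own data made this model far more useful."))
```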
Hugging Face on Amazon SageMaker and AWS Workshop
Interested in using NLP to generate text in the style of your favourite poets? This workshop shows you how to fine-tune text generation models to do exactly that using Hugging Face on Amazon SageMaker.
NLP@AWS Community Content
Decoding Sperm Whale language using NLP models
Project CETI was selected for an Amazon Research Award and is an Imagine Grant winner. AWS ML teams are working with the Open Data program team to help decode sperm whale language. By building NLP models to parse and interpret sperm whales’ voices and hopefully develop frameworks for communicating back, CETI aims to show that today’s most cutting-edge technologies can be used to not only drive business impact, but can enable a deeper understanding of other species on this planet. CETI’s work is taking place in the Caribbean, off the coast of Dominica, and remotely with AWS via Chime.
Upcoming Events
Workshop: Techniques to accelerate BERT Inference
Learn how to apply knowledge distillation to compress a large BERT model into a small model, and then into an optimised Neuron model with AWS Inferentia. By the end of this process, the model goes from 100ms+ to around 5ms latency - a 20x improvement! Click here for registration.
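For context on the first step, knowledge distillation trains a small student model to imitate a large teacher. A minimal sketch of the usual combined loss (the temperature and weighting values here are arbitrary illustrations, not the workshop's settings):

```python
# Illustrative sketch of a knowledge distillation loss in PyTorch:
# the student matches the teacher's softened output distribution and the true labels.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, temperature=2.0, alpha=0.5):
    soft_targets = F.softmax(teacher_logits / temperature, dim=-1)
    soft_student = F.log_softmax(student_logits / temperature, dim=-1)
    kd = F.kl_div(soft_student, soft_targets, reduction="batchmean") * temperature ** 2
    ce = F.cross_entropy(student_logits, labels)
    return alpha * kd + (1 - alpha) * ce
```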
AWS Summits around the globe
AWS Global Summits are free events that bring the cloud computing community together to connect, collaborate, and learn through technical breakout sessions, demonstrations, interactive workshops, labs, and team challenges. Summits are held in major cities around the world, both virtually and in person, starting April 2022. Register for one near you!
Stay in touch with NLP on AWS
Our contact: aws-nlp@amazon.com
Email us to (1) share your awesome project about NLP on AWS, (2) let us know which post in the newsletter helped your NLP journey, or (3) suggest other things you want us to cover in the newsletter. Talk to you soon.