AWS - NLP newsletter August 2021
Mia Chang
Posted on August 27, 2021
Hello world. This is the monthly Natural Language Processing (NLP) newsletter covering everything related to NLP at AWS. This is our first newsletter on Dev.to. Special thanks to Ricardo Sueiras, who helped us make this happen. Feel free to leave comments and share it on your social networks to celebrate this new launch with us!
Service updates about NLP on AWS
Amazon Transcribe Call Analytics
Amazon Transcribe Call Analytics is a new machine learning (ML) powered conversation insights API that enables developers to improve customer experience and agent productivity. This API can analyze call recordings to generate turn-by-turn call transcripts and actionable insights for understanding customer-agent interactions, identifying trending issues, and tracking performance metrics. Launch content: AWS News Blog, What's New Post, Webpage, Documentation, GitHub Demo, LinkedIn.
Amazon Connect now works with Amazon Lex V2
Amazon Lex allows customers to create intelligent chatbots that turn their Amazon Connect contact flows into natural conversations. Amazon Lex V2 console and API enhancements include: 1) support for multiple languages in a single bot and the ability to manage them as a single resource through the life cycle (build, test, and deploy), 2) the ability for end users to ask a bot to wait (“Can you wait while I get my credit card?”) and to interrupt a bot mid-sentence, 3) simplified bot versioning, and 4) new productivity features such as support for saving partially completed bots, bulk upload of sample utterances, and navigation via a dynamic ‘Conversation flow’ for more flexibility and control in the bot design process. Share the news: What’s New post, Amazon Lex Blog post, Amazon Connect documentation, Amazon Lex documentation.
AWS Blog posts, papers, and more
Challenge entry will remain open until September 21, and research teams from academia, industry, and nonprofit and government sectors are welcome to participate. Amazon has open-sourced the development data, evaluation scripts, and baseline systems for challenge participants and other researchers in the field.
Regression bugs are in your NLP model! This study showed that in updated models, negative flip rates (NFRs) are often much higher than the total accuracy gains, from two to eight times as high. This implies that simply aiming for greater accuracy improvements in updated models will not ensure a decrease in regression; i.e., improving accuracy and minimizing regression are related but separate learning targets. The post also covers ways to mitigate these regression bugs.
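To make the comparison between NFR and accuracy gain concrete, here is a minimal sketch. It assumes the usual definition of a negative flip: a sample the old model classified correctly and the updated model gets wrong, with NFR being the share of such samples in the evaluation set. The function names and the toy labels and predictions below are hypothetical and are not taken from the study.

```python
from typing import Sequence


def accuracy(labels: Sequence[int], preds: Sequence[int]) -> float:
    """Fraction of samples where the prediction matches the label."""
    if len(labels) != len(preds) or not labels:
        raise ValueError("labels and preds must be non-empty and equal in length")
    return sum(1 for y, p in zip(labels, preds) if y == p) / len(labels)


def negative_flip_rate(
    labels: Sequence[int],
    old_preds: Sequence[int],
    new_preds: Sequence[int],
) -> float:
    """Fraction of samples the old model got right and the new model gets wrong."""
    if not (len(labels) == len(old_preds) == len(new_preds)) or not labels:
        raise ValueError("all inputs must be non-empty and equal in length")
    flips = sum(
        1
        for y, old, new in zip(labels, old_preds, new_preds)
        if old == y and new != y
    )
    return flips / len(labels)


# Toy data (made up for illustration): the new model fixes three old errors
# but breaks two samples that the old model handled correctly.
labels = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]
old_preds = [1, 0, 0, 1, 0, 0, 0, 1, 1, 1]
new_preds = [0, 1, 1, 1, 0, 1, 0, 0, 1, 1]

gain = accuracy(labels, new_preds) - accuracy(labels, old_preds)
nfr = negative_flip_rate(labels, old_preds, new_preds)
print(f"accuracy gain: {gain:.2f}, NFR: {nfr:.2f}")
```

In this toy case the updated model is more accurate overall, yet its NFR is about twice the accuracy gain, which is why the two quantities need to be tracked separately.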
The post shares how to make improvements to the F1 score while training models with different dataset sampling configurations, including multi-lingual models. With this updated model, Amazon Comprehend makes it easy to train custom entity recognition models. Limits have been lowered to 100 annotations per entity and 250 documents for training, while offering improved accuracy with your models.
The combination of Amazon Transcribe and Amazon Kendra enables a scalable, cost-effective solution to make your media files discoverable. You can use the content of your media files to find accurate answers to your users’ questions, whether they’re from text documents or media files, and consume them in their native format.
This post explains an end-to-end solution for creating a customer churn prediction model based on customer profiles and customer-agent call transcriptions. It includes training a PyTorch model with a custom script and creating an endpoint for real-time model hosting. Start by creating a public-facing API Gateway that can be securely used in your mobile applications or website, then use Amazon Transcribe for batch or real-time transcription of customer-agent conversations, which you can use for training your model or for real-time inference.
Helping home shoppers connect to the services they need in time can make the difference in whether they succeed in securing a property. In this episode, we explore how Zillow built a natural language processing solution using Amazon Transcribe and leveraged the Elastic Container Service to quickly scale their machine learning engine to match customer requests to agents. We dive into how they deploy models into the environment using GitLab pipelines to simplify the job for data scientists.
Community content
NLP to help pregnant mothers in Kenya
In Kenya, 33% of maternal deaths are caused by delays in seeking care, and 55% of maternal deaths are caused by delays in action or inadequate care by providers. Jacaranda Health is employing NLP and dialogue system techniques to help mothers experience childbirth safely and with respect, and to help newborns get a safe start in life.
HuggingFace Processing Jobs on Amazon SageMaker
Prepare text data for your NLP pipeline in a scalable and reproducible way. This has two principal benefits: (1) For large datasets, data preparation can take a long time. Choosing dedicated EC2 instances allows us to pick the right processing power for the task at hand. (2) Codifying the data preparation via a processing job enables us to integrate the data processing step into a CI/CD pipeline for NLP tasks in a scalable and reproducible way.
How Good Is Your NLP Model Really?
How to evaluate NLP models with Amazon SageMaker Processing jobs for Hugging Face’s Transformer models. NLP model evaluation can be resource-intensive, especially when it comes to Transformer models that benefit greatly from GPU acceleration. The post shows that this process can be sped up by up to 267(!) times by using SageMaker’s Hugging Face Processing jobs.
Code samples
HuggingFace
Amazon SageMaker enables customers to train, fine-tune, and run inference using Hugging Face models for Natural Language Processing (NLP) on SageMaker.
- Bring your own HuggingFace pretrained BERT container to SageMaker Tutorial.
- TensorFlow 2 HuggingFace distilBERT Tutorial.
NLP with BlazingText with Amazon SageMaker and AWS Lambda
This example illustrates how to train a BlazingText model with SageMaker and serve it with AWS Lambda, in both supervised (text classification) and unsupervised (Word2Vec) modes. The repository comes with a Jupyter notebook, a Docker container, and an events file you can work with.
Upcoming events
Best Practices in Conversational AI Design
Sep 01, 2021 | 07:00 PM CEST
Join conversational design leaders from Amazon, Alexa, and AWS as we discuss best practices in conversational AI design. Building conversational interfaces can be challenging given the free-form nature of communication and unstructured data. Users can say whatever they like, however they like. It is quite a bit different from web and mobile design. Our experts will cover the best practices in conversational design, including how to design for voice assistants versus text chatbots, handling fallbacks gracefully, the role of context and personalization, how to guide the users along the happy path to successful engagement, tips for creating an intuitive flow, and more.
Stay in touch with NLP on AWS
Our contact: aws-nlp@amazon.com
Email us about (1) your awesome NLP project on AWS, (2) which post in the newsletter helped your NLP journey, and (3) anything else you would like us to cover in the newsletter. Talk to you soon.