Translating S3-hosted static website using AWS Translate

m-ahmed10

Mohamed Ahmed

Posted on September 6, 2024


Contents:

  1. The idea💡
  2. Brief about Amazon Translate
  3. The design
  4. Lambda function
  5. S3 lambda trigger
  6. AWS Translate
  7. Debugging and logs
  8. Danger Zone⚠️ (Be Careful about those)!
  9. Result🎉
  10. Recognition🙏

1. The idea💡

I was working on the Cloud Resume Challenge for AWS, a great hands-on project about building a personal website using mostly AWS offerings.
Since I speak French and was in the middle of French courses and practice, I immediately thought that my website should be available in French as well. That's when I found out about Amazon Translate, and I thought it would be a very nice extra to add to the project.


2. Brief about Amazon Translate

Amazon Translate is a service that translates a wide range of content, which helps with global reach and cross-lingual communication.
Some of the benefits:

  • Translates text, files, or batch jobs
  • Supports a wide range of languages
  • Lets you customize translations with your own custom terminology
  • Provides real-time translation
  • And of course, offers an API that you can call from all your applications
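
To give a feel for the API side, here is a minimal sketch of a real-time text translation call. The helper name translate_snippet and the sample sentence are made up for illustration; the client is assumed to be a boto3 Translate client created with boto3.client("translate"), which needs AWS credentials and a region.

```python
def translate_snippet(translate_client, text, source="en", target="fr"):
    """Translate a short string in real time and return the translated text.

    translate_client is assumed to be boto3.client("translate").
    """
    response = translate_client.translate_text(
        Text=text,
        SourceLanguageCode=source,
        TargetLanguageCode=target,
    )
    return response["TranslatedText"]


# Hypothetical usage (requires boto3 and AWS credentials):
#   import boto3
#   print(translate_snippet(boto3.client("translate"), "Welcome to my resume"))
```

Passing the client in as an argument keeps the helper easy to try out with a stand-in object before touching a real AWS account.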

3. The Complete design

We have a static website hosted on AWS S3. When the HTML page gets modified, a trigger on S3 runs a Lambda function. The Lambda function takes the uploaded file and passes it to the Translate API, then saves the translated copy in the same bucket.
On the main page of the website, I added a language switch that redirects to the translated page.


4. Lambda function

I created a Lambda function (in Python, my favorite❤️). This function takes in the event sent by the S3 trigger, which describes the uploaded HTML file (we limited which filename/extension runs the trigger).

The first step, which is very important in order to avoid falling into a recursive loop (as the trigger setup already warns us), is a check on the incoming file to make sure the function runs only when it receives the page that needs translation:

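# S3 URL-encodes object keys in event notifications, so decode the key first
key = urllib.parse.unquote_plus(event['Records'][0]['s3']['object']['key'], encoding='utf-8')
# Only continue for the source page; any other upload is ignored
if key == "mainpage.html":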
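
As a rough, self-contained sketch of this guard, the check can be pulled into small helpers that are easy to test without AWS. The names SOURCE_KEY, extract_key and should_translate are made up for illustration, and "mainpage_fr.html" in the usage note is only an assumed name for the translated copy.

```python
import urllib.parse

SOURCE_KEY = "mainpage.html"  # the page to translate, as named in this post


def extract_key(event):
    """Return the decoded object key from the first record of an S3 event."""
    raw_key = event["Records"][0]["s3"]["object"]["key"]
    # Keys arrive URL-encoded in S3 notifications (a space arrives as '+').
    return urllib.parse.unquote_plus(raw_key, encoding="utf-8")


def should_translate(event):
    """True only for the source page, so any other upload is ignored."""
    return extract_key(event) == SOURCE_KEY
```

With this in place, an event for "mainpage.html" passes the check, while an event for a translated copy such as "mainpage_fr.html" makes the function return early instead of translating again.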

Then the function calls translate_document(), passing the file content read from the S3 object, along with the other parameters and the name of the custom terminology:

Result = awstranslate.translate_document(
    Document={
        'Content': file['Body'].read(),
        'ContentType': 'text/html'
    },
    TerminologyNames=["customTerminology"],
    SourceLanguageCode='en',
    TargetLanguageCode='fr'
)
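
Putting the pieces together, the read, translate and save steps might look like the sketch below. This is not the exact function from the project: translate_page and the output_key argument are hypothetical, and the two clients are assumed to be boto3.client("s3") and boto3.client("translate"). Note that translate_document() returns the translated bytes in its response, so the function has to write them back to the bucket itself.

```python
def translate_page(s3_client, translate_client, bucket, source_key, output_key):
    """Read an HTML page from S3, translate it en -> fr, and save the result."""
    page = s3_client.get_object(Bucket=bucket, Key=source_key)
    result = translate_client.translate_document(
        Document={"Content": page["Body"].read(), "ContentType": "text/html"},
        TerminologyNames=["customTerminology"],
        SourceLanguageCode="en",
        TargetLanguageCode="fr",
    )
    # The translated document comes back as bytes inside the response.
    translated = result["TranslatedDocument"]["Content"]
    s3_client.put_object(
        Bucket=bucket,
        Key=output_key,
        Body=translated,
        ContentType="text/html",  # so browsers render the copy as a page
    )
    return output_key
```

Whatever name is chosen for output_key, it should differ from the source key, otherwise the function would overwrite the original page and trigger itself again.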

5. S3 lambda trigger

I created a trigger for the ObjectCreated event, which invokes the Lambda function and passes it the details of the uploaded file.
A very important point here is avoiding the recursive loop (you already get a warning when creating the trigger), because later we save the translated file in the same bucket, which then counts as another ObjectCreated event.
So I put two safeguards in place. The first one we already did in the Lambda function by checking the filename. The second is in the S3 trigger, where you have two options to filter which files raise the event:

  • filter_prefix
  • filter_suffix

So in this example, we limit the prefix to "mainpage" and the suffix to ".html".
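
To see which uploads such a filter lets through, here is a small local simulation of the matching rule (the key must start with the prefix and end with the suffix). The function name and the sample keys are made up for illustration; in particular, "mainpage_fr.html" is only an assumed name for the translated copy.

```python
def matches_trigger_filter(key, prefix="mainpage", suffix=".html"):
    """Mimic the trigger filter: key must start with prefix and end with suffix."""
    return key.startswith(prefix) and key.endswith(suffix)


# "mainpage_fr.html" is a made-up name for the translated copy.
for key in ["mainpage.html", "index.html", "mainpage.css", "mainpage_fr.html"]:
    print(key, matches_trigger_filter(key))
```

Under these assumptions, a translated copy whose name also starts with "mainpage" and ends with ".html" would still pass the filter, which is why the filename check inside the Lambda function remains useful as a second safeguard.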


6. AWS Translate

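Here we used the function translate_document(). It takes the document content (as bytes), the content type, the source language, the target language, and the names of any custom terminologies to apply. It returns the translated content in its response, which the Lambda function then saves to the bucket under the output filename.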
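
The custom terminology passed by name has to exist in Amazon Translate before the call. A minimal sketch of preparing one from Python, assuming the CSV format (a header row of language codes, then one row per term): the helper names are hypothetical, and the client is assumed to be boto3.client("translate").

```python
import csv
import io


def build_terminology_csv(source_lang, target_lang, pairs):
    """Build CSV bytes: a header row of language codes, then one row per term."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow([source_lang, target_lang])
    writer.writerows(pairs)
    return buffer.getvalue().encode("utf-8")


def upload_terminology(translate_client, name, csv_bytes):
    """Create the named custom terminology, or overwrite it if it exists."""
    return translate_client.import_terminology(
        Name=name,
        MergeStrategy="OVERWRITE",
        TerminologyData={"File": csv_bytes, "Format": "CSV"},
    )
```

For example, build_terminology_csv("en", "fr", [("Cloud Resume Challenge", "Cloud Resume Challenge")]) would produce a one-term file that keeps that name unchanged in French; the term itself is just an illustration.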


7. Debugging and logs

I mainly used AWS CloudWatch, which gives you lots of insight into what ran and what errors came out. Adding print statements at different steps of the Lambda function helps a lot.
Check the translated output file: you might find translations that are out of context, or improper HTML tags that cause content to get translated or moved. Read the translation, then tweak the terminology file and use the translate="no" HTML attribute to get the result you want.

<span translate="no">Workflow engine</span>

In this example, we use the translate="no" HTML attribute to avoid translating "engine" as motor.
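
For the logging side of debugging, here is a minimal sketch of summarizing the incoming S3 event so that each invocation leaves a readable line in the logs. It uses Python's standard logging module rather than print; the helper names describe_event and log_event are made up for illustration.

```python
import logging

logger = logging.getLogger(__name__)
logger.setLevel(logging.INFO)


def describe_event(event):
    """Summarize each S3 record as 'bucket/key (eventName)'."""
    lines = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        lines.append(f"{bucket}/{key} ({record.get('eventName', 'unknown')})")
    return lines


def log_event(event):
    """Write one log line per record in the event."""
    for line in describe_event(event):
        logger.info("Received S3 event: %s", line)
```

Calling log_event(event) at the top of the handler makes it easy to spot in the logs whether the function was invoked for the source page or for some other upload.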


8. Danger Zone⚠️ (Be Careful about those)!

Take care not to fall into a recursive loop between Lambda and the S3 trigger, because we save the translation output file in the same bucket.
Make sure to use the "Prefix" filter in the S3 trigger, and to put the necessary fail-safe code in the Lambda function.


9. Result🎉

Every time you edit your website, a translated version will be there automatically! 👏
It was a great experience; I enjoyed working with AWS Translate and tweaking its options.
I loved the flexibility it provides, especially the ability to supply your own list of custom translations.


10. Recognition🙏

I always appreciate the culture of support and sharing, and giving credit where it is deserved.
As such, in all my posts or projects, I must mention everyone who was helpful to me and the resources that assisted me.
In this project, I would like to thank:
Ether Lee Introduction to Amazon Translate
Greg Rushing Real-time Translation with Amazon Translate
Watson Srivathsan Customize your Translation Output
