Mohamed Ahmed
Posted on September 6, 2024
Contents:
- The idea💡
- Brief about Amazon Translate
- The design
- Lambda function
- S3 lambda trigger
- AWS Translate
- Debugging and logs
- Danger Zone⚠️ (Be Careful about those)!
- Result🎉
- Recognition🙏
1. The idea💡
I was working on the Cloud Resume Challenge for AWS, a great hands-on project about creating a personal website using mostly AWS offerings (here is the link if you want to know more about it).
Since I speak French and was in the middle of French courses and practice, I immediately thought that my website should be available in French as well. That's when I found out about Amazon Translate, and I thought it would be a very nice extra to add to the project.
2. Brief about Amazon Translate
Amazon Translate is a service that lets you translate a wide range of content, which helps with global reach and cross-lingual communication.
Some of the benefits:
- Translating text, file, or batch jobs
- Wide range of languages
- Ability to customize translations with user custom terminology
- Provides real time translation
- And of course, API provided to be used from all your applications
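To give a feel for the API, here is a minimal sketch of a real-time text translation call from Python. It assumes boto3 is installed and AWS credentials are configured; the helper name `translate_snippet` and its optional `client` parameter (handy for passing a stub in tests) are my own illustration, not part of the project.

```python
def translate_snippet(text, source="en", target="fr", client=None):
    """Translate a short string with Amazon Translate's real-time API."""
    if client is None:
        # Imported lazily so the helper can also be exercised with a stub client.
        import boto3
        client = boto3.client("translate")

    response = client.translate_text(
        Text=text,
        SourceLanguageCode=source,
        TargetLanguageCode=target,
    )
    # The translated string comes back under the "TranslatedText" key.
    return response["TranslatedText"]
```

Calling `translate_snippet("Welcome to my resume")` would send the text to Amazon Translate and return the French version as a string.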
3. The Complete design
We have a static website hosted on AWS S3. When the HTML content page is modified, a trigger on S3 runs a Lambda function. The Lambda function takes the uploaded file and passes it to the translate function, which produces a translated copy that is saved in the bucket.
On the main page of the website, I added a language switch that redirects to the translated page.
4. Lambda function
I created a Lambda function in Python (my favorite❤️). It takes in the event sent by the S3 trigger, which describes the uploaded HTML file (we limited which filenames and extensions fire the trigger).
The first step is very important for avoiding a recursive loop, which the trigger setup already warns about: we check the incoming file so that the function only runs when it receives the page that needs translation.
key = urllib.parse.unquote_plus(event['Records'][0]['s3']['object']['key'], encoding='utf-8')
if key == "mainpage.html":
Then the function runs translate_document(), passing the file contents read from the stream, along with the other parameters and the name of the custom terminology:
Result = awstranslate.translate_document(
    Document={
        'Content': file['Body'].read(),
        'ContentType': 'text/html'
    },
    TerminologyNames=["customTerminology"],
    SourceLanguageCode='en',
    TargetLanguageCode='fr'
)
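Putting the filename check and the translate_document() call together, the whole handler might look roughly like the sketch below. This is not the original function: the output key `mainpage_fr.html`, the return values, and the optional `s3`/`translate` parameters (used to pass in stub clients for testing) are assumptions I made for illustration. boto3 is imported lazily, and it is available in the Lambda Python runtime.

```python
import urllib.parse

SOURCE_KEY = "mainpage.html"     # the only page we translate
OUTPUT_KEY = "mainpage_fr.html"  # hypothetical name for the translated copy


def lambda_handler(event, context, s3=None, translate=None):
    if s3 is None or translate is None:
        import boto3  # lazy import so stubs can be injected in tests
        s3 = s3 or boto3.client("s3")
        translate = translate or boto3.client("translate")

    record = event["Records"][0]["s3"]
    bucket = record["bucket"]["name"]
    # Object keys arrive URL-encoded in the event, so decode them first.
    key = urllib.parse.unquote_plus(record["object"]["key"], encoding="utf-8")

    # Fail-safe against the recursive loop: ignore anything but the source page.
    if key != SOURCE_KEY:
        print(f"Skipping {key}")
        return {"translated": False, "key": key}

    file = s3.get_object(Bucket=bucket, Key=key)
    result = translate.translate_document(
        Document={"Content": file["Body"].read(), "ContentType": "text/html"},
        TerminologyNames=["customTerminology"],
        SourceLanguageCode="en",
        TargetLanguageCode="fr",
    )

    # The translated bytes come back in the response; we write them ourselves.
    s3.put_object(
        Bucket=bucket,
        Key=OUTPUT_KEY,
        Body=result["TranslatedDocument"]["Content"],
        ContentType="text/html",
    )
    return {"translated": True, "key": OUTPUT_KEY}
```

Writing the output back to the same bucket fires the trigger again, but the second invocation sees a key other than `mainpage.html` and exits immediately.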
5. S3 lambda trigger
I created a trigger for the ObjectCreated event, which invokes the Lambda function and passes it information about the uploaded files.
A very important point here is avoiding the recursive loop (you already get a warning when creating the trigger). Later we save the translated file in the same bucket, and that save counts as another ObjectCreated event.
So I used two methods to avoid this. The first one we already did in the Lambda function by checking the filename. The second one is in the S3 trigger, where you have two options to filter the files included in the event:
- filter_prefix
- filter_suffix
So in this example, we limit the prefix to "mainpage" and the suffix to ".html".
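The same trigger can also be set up from code. The sketch below builds the notification configuration for boto3's `put_bucket_notification_configuration` with the prefix and suffix rules from this example. The bucket name and function ARN you pass in are placeholders, and `matches_filter` is only my local approximation of how the two rules select keys, not something S3 exposes.

```python
def build_notification_config(function_arn, prefix="mainpage", suffix=".html"):
    """Build the S3 notification config for an ObjectCreated Lambda trigger."""
    return {
        "LambdaFunctionConfigurations": [
            {
                "LambdaFunctionArn": function_arn,
                "Events": ["s3:ObjectCreated:*"],
                "Filter": {
                    "Key": {
                        "FilterRules": [
                            {"Name": "prefix", "Value": prefix},
                            {"Name": "suffix", "Value": suffix},
                        ]
                    }
                },
            }
        ]
    }


def matches_filter(key, prefix="mainpage", suffix=".html"):
    """Approximate which object keys the prefix and suffix rules let through."""
    return key.startswith(prefix) and key.endswith(suffix)


def apply_trigger(bucket, function_arn, s3=None):
    if s3 is None:
        import boto3  # lazy import so a stub client can be passed in tests
        s3 = boto3.client("s3")
    # Note: this call replaces the bucket's existing notification configuration.
    s3.put_bucket_notification_configuration(
        Bucket=bucket,
        NotificationConfiguration=build_notification_config(function_arn),
    )
```

One thing this makes visible: if the translated copy were named something like `mainpage_fr.html` (a hypothetical name), `matches_filter` would still return True for it, so the filters alone would not stop the loop. That is why the filename check inside the Lambda function still matters.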
6. AWS Translate
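Here we used the function translate_document(). It takes the document content, the content type, the source language, the target language, and the names of the custom terminologies to apply. It returns the translated content in its response rather than writing a file, so the Lambda function saves the output under the new filename itself.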
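The custom terminology has to exist in Amazon Translate before translate_document() can reference it by name. One way to create it, sketched here, is boto3's `import_terminology` with a CSV file whose first row lists the language codes. Only the name `customTerminology` comes from the code above; the helper names and the term pairs you pass in are my assumptions.

```python
import csv
import io


def build_terminology_csv(pairs, source="en", target="fr"):
    """Return CSV bytes: a header row of language codes, then one term pair per row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow([source, target])
    writer.writerows(pairs)
    return buffer.getvalue().encode("utf-8")


def upload_terminology(pairs, name="customTerminology", client=None):
    if client is None:
        import boto3  # lazy import so a stub client can be passed in tests
        client = boto3.client("translate")
    client.import_terminology(
        Name=name,
        MergeStrategy="OVERWRITE",  # replace any existing terminology with this name
        TerminologyData={"File": build_terminology_csv(pairs), "Format": "CSV"},
    )
```

After each tweak to the term list, re-running `upload_terminology` overwrites the stored terminology, so the next translation picks up the change.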
7. Debugging and logs
I mainly used AWS CloudWatch, which gives you lots of insight into what ran and which errors came up. Adding print statements at different steps of the Lambda function helps a lot.
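As an alternative to bare print statements, a small logging helper can make the CloudWatch output easier to search. This is just a sketch of one possible approach; the logger name `translator` and the `log_step` helper are made up for illustration.

```python
import json
import logging

logger = logging.getLogger("translator")
logger.setLevel(logging.INFO)


def log_step(step, **details):
    """Log one JSON line per step and return it, so each step is easy to find later."""
    message = json.dumps({"step": step, **details}, sort_keys=True)
    logger.info(message)
    return message
```

For example, calling `log_step("translate", key="mainpage.html", target="fr")` right before the translate_document() call records which file was about to be translated.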
Check the translated output file. You might find translations that are out of context, or improper HTML tags that cause text to get translated or moved. Read through the translation, tweak the terminology file, and use the translate="no" HTML attribute to get the result you want.
<span translate="no">Workflow engine</span>
In this example, we use the translate="no" HTML attribute to avoid "engine" being translated as "motor".
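If there are several such terms, wrapping them by hand gets tedious. Below is a naive sketch of a helper that wraps given terms in the span before the page is sent for translation. It works on plain strings and does not parse the HTML, so it could also touch text inside attributes; treat it as a starting point only. The name `protect_terms` is hypothetical.

```python
import re


def protect_terms(page_html, terms):
    """Wrap each term in <span translate="no"> so it is left untranslated."""
    for term in terms:
        wrapped = f'<span translate="no">{term}</span>'
        # Skip occurrences that already sit right after the opening span,
        # so running the helper twice does not double-wrap.
        pattern = re.compile(rf'(?<!<span translate="no">){re.escape(term)}')
        page_html = pattern.sub(lambda match: wrapped, page_html)
    return page_html
```

For instance, `protect_terms("<p>Workflow engine rocks</p>", ["Workflow engine"])` wraps the term in the span and leaves the rest of the paragraph untouched.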
8. Danger Zone⚠️ (Be Careful about those)!
Take care not to fall into a recursive loop with the Lambda function and the S3 trigger, because we are saving the translation output file in the same bucket.
Make sure to use the "Prefix" filter in the S3 trigger, and to put the necessary fail-safe code in the Lambda function.
9. Result🎉
Every time you edit your website, a translated version will be there automatically! 👏
It was a great experience. I enjoyed working with Amazon Translate and tweaking its options.
I loved the flexibility it provides, especially the ability to supply your own list of custom translations.
10. Recognition🙏
I always appreciate the culture of support and sharing, and of giving credit where it is deserved.
As such, in all my posts and projects, I make a point of mentioning everyone who was helpful to me and the resources that assisted me.
In this project, I would like to thank:
Ether Lee Introduction to Amazon Translate
Greg Rushing Real-time Translation with Amazon Translate
Watson Srivathsan Customize your Translation Output