Cody Bontecou

Hosting Open-Source Translation Models on AWS SageMaker for Automated Blog Localization

January 14, 2025 · 10 minute read · aws,sagemaker,ai,localization,i18n,huggingface

Introduction

Creating multilingual content is often tedious and expensive. Let’s automate it into our blog's build process!

In this post, we’ll take a deep dive into deploying an open-source text-to-text translation (T2TT) model on AWS SageMaker and seamlessly integrating it into a Nuxt Content blog. Better yet, we’ll automate the workflow through a CI pipeline powered by GitHub Actions.

Using these modern tools, we'll be able to fully automate the internationalization of our blog, enabling it to be read in nearly 100 languages.

Why AWS SageMaker?

AWS SageMaker is a managed service for hosting and serving ML models. It fits this project for a few reasons:

  • Flexibility: Easily host pre-trained models like Hugging Face transformers.
  • Scalability: Handles traffic spikes without manual intervention.
  • Cost-efficiency: Pay only for what you use.
  • Integration with AWS ecosystem: Perfect for end-to-end workflows.
  • Developer experience: Well-documented, easy-to-use SDKs.

Setting up the translation model

We'll be using the pre-trained model SeamlessM4T-v2 for our translations. It is a multimodal and multilingual AI translation model built and released by Meta.

SeamlessM4T-v2 supports:

  • Speech recognition for nearly 100 languages
  • Speech-to-text translation for nearly 100 input and output languages
  • Speech-to-speech translation, supporting nearly 100 input languages and 36 (including English) output languages
  • Text-to-text translation for nearly 100 languages
  • Text-to-speech translation, supporting nearly 100 input languages and 35 (including English) output languages

What I'm interested in is its text-to-text translation capabilities. According to my simple vibe-based development experience, SeamlessM4T-v2 is the most capable open-source model for the problem we are solving.

AWS SageMaker permissions

To create an AWS IAM role for your SageMaker application, follow these steps:

Step 1: Log in to AWS Management Console

  1. Go to the IAM service in the AWS Management Console.

Step 2: Create a New Role

  1. In the IAM dashboard, click on Roles in the left-hand menu.
  2. Click the Create Role button.

Step 3: Select the Trusted Entity

  1. Choose AWS Service as the trusted entity type.
  2. Under "Use case," select SageMaker and click Next.

Step 4: Attach Policies

  1. Attach the necessary policies to allow SageMaker to access resources like S3 and other AWS services:
    • AmazonSageMakerFullAccess: Provides full access to SageMaker features.
  2. Click Next.

Step 5: Name and Review

  1. Give your role a meaningful name, e.g., SageMakerExecutionRole.
  2. Review the details and click Create Role.

Step 6: Copy the Role ARN

  1. Find your new role in the list of roles on the IAM dashboard.
  2. Click on the role name to open its details.
  3. Copy the Role ARN (it will look something like arn:aws:iam::123456789012:role/SageMakerExecutionRole).
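
For reference, choosing SageMaker as the trusted entity in Step 3 boils down to a trust policy attached to the role. Here's a sketch of what that document looks like, written as a TypeScript object. The helper names buildSageMakerTrustPolicy and looksLikeRoleArn are hypothetical and only here for illustration, and the ARN check assumes the standard aws partition.

```typescript
// Shape of the trust policy that lets the SageMaker service assume the role.
interface TrustPolicy {
    Version: string
    Statement: Array<{
        Effect: 'Allow'
        Principal: { Service: string }
        Action: string
    }>
}

// Hypothetical helper: builds the policy equivalent to picking
// "AWS Service" and then "SageMaker" in the console.
function buildSageMakerTrustPolicy(): TrustPolicy {
    return {
        Version: '2012-10-17',
        Statement: [
            {
                Effect: 'Allow',
                Principal: { Service: 'sagemaker.amazonaws.com' },
                Action: 'sts:AssumeRole',
            },
        ],
    }
}

// Hypothetical helper: sanity check that a string looks like an IAM role ARN
// (assumes the standard "aws" partition and a 12-digit account ID).
function looksLikeRoleArn(arn: string): boolean {
    return /^arn:aws:iam::\d{12}:role\/[\w+=,.@\/-]+$/.test(arn)
}
```

A check like looksLikeRoleArn is handy for catching a mis-pasted ARN before it reaches the deploy script.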

Deploying the model

We'll use SageMaker's SDK to deploy the model. At the time of writing this, the JavaScript SDK does not support model deployment, so I had to resort to using Python.

Hugging Face and SageMaker make deploying the model simple enough to manage within a single script, so delegating this piece of the project to Python is acceptable.

I prefer to use uv for my Python dependency management, but you are free to use whatever you're most comfortable with here.

uv venv --python 3.11.6
source .venv/bin/activate
uv add sagemaker

Note: At the time of writing, the SageMaker SDK only supports Python versions 3.8, 3.9, 3.10, and 3.11.

One of my favorite parts of the SageMaker SDK is that it has first-class Hugging Face support. Providing the HuggingFaceModel a Hugging Face model ID is enough to define and work with the model within our code.

Then all it takes is a simple .deploy() call with our desired instance count and instance type, and within a few minutes our model is online!

The cherry on top is that Hugging Face provides most of the code for us! Just click the deploy button on the facebook/seamless-m4t-v2-large page and copy the code over.

Figure: Hugging Face's autogenerated code snippets

We are going to make a few adjustments to tailor the code to our use case. The provided snippet sets 'HF_TASK': 'automatic-speech-recognition'. Because the model is multimodal, we have to be explicit here and provide 'HF_TASK': 'translation' instead.

import sagemaker
import boto3
from sagemaker.huggingface import HuggingFaceModel

try:
    role = sagemaker.get_execution_role()
except ValueError:
    iam = boto3.client("iam")
    role = iam.get_role(RoleName="SageMakerExecutionRole")["Role"]["Arn"]

# Hub Model configuration. https://huggingface.co/models
hub = {
    "HF_MODEL_ID": "facebook/seamless-m4t-v2-large",
    "HF_TASK": "translation",
}

# create Hugging Face Model Class
huggingface_model = HuggingFaceModel(
    transformers_version="4.37.0",
    pytorch_version="2.1.0",
    py_version="py310",
    env=hub,
    role=role,
)

# deploy model to SageMaker Inference
predictor = huggingface_model.deploy(
    initial_instance_count=1,  # number of instances
    instance_type="ml.m5.xlarge",  # ec2 instance type
)

# log the endpoint name; we'll need it for the .env file later
print(predictor.endpoint_name)

It's worth browsing the source code and documentation around the HuggingFaceModel class. The snippet I provided is the bare minimum to get our model online, but there are a handful of parameters you can set when instantiating this class to customize your model's deployment.

Interacting with the model during Nuxt content build

With our model online, all that is left is to interact with it via our blog's build process. There are a few bits of configuration needed to allow our client to talk to our AWS SageMaker endpoint.

Start by creating your Nuxt app with the required dependencies:

npx nuxi@latest init content-app -t content
npx nuxi module add i18n
npm install @aws-sdk/client-sagemaker-runtime

Nuxt config and .env variables

If you didn't get the endpoint name when running the deploy script, run this command in your terminal and it should output the endpoint name:

aws sagemaker list-endpoints --query "Endpoints[].EndpointName" --output table

Create a .env file to manage the environment variables we require. In it, store the endpoint name that was logged at the end of our deploy script as well as the region you configured.

AWS_ENDPOINT_NAME='huggingface-pytorch-inference-2025-01-14-22-34-04-107'
AWS_REGION='us-west-2'

Update your nuxt.config.ts file with these env variables. Here's a barebones example of our Nuxt config file so far:

export default defineNuxtConfig({
    modules: ['@nuxt/content', '@nuxtjs/i18n'],

    runtimeConfig: {
        AWS_ENDPOINT_NAME: process.env.AWS_ENDPOINT_NAME,
        AWS_REGION: process.env.AWS_REGION,
    },

    compatibilityDate: '2025-01-14',
})
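
Since a missing variable would otherwise only surface as a confusing AWS error halfway through the build, it can help to fail fast. Here's a minimal sketch of that idea; readSageMakerConfig is a hypothetical helper, not something Nuxt or the AWS SDK provides, and it takes the environment as a plain object so it's easy to test.

```typescript
interface SageMakerConfig {
    endpointName: string
    region: string
}

// Hypothetical helper: reads the two variables from an env-like object
// and throws a single readable error if either one is missing or blank.
function readSageMakerConfig(env: {
    [key: string]: string | undefined
}): SageMakerConfig {
    const endpointName = (env['AWS_ENDPOINT_NAME'] || '').trim()
    const region = (env['AWS_REGION'] || '').trim()

    const missing: string[] = []
    if (endpointName === '') missing.push('AWS_ENDPOINT_NAME')
    if (region === '') missing.push('AWS_REGION')

    if (missing.length > 0) {
        throw new Error(`Missing environment variables: ${missing.join(', ')}`)
    }

    return { endpointName, region }
}
```

In a Node context you could call it as readSageMakerConfig(process.env) before kicking off any translations.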

Invoking our SageMaker endpoint

With our environment in place, we are ready to interact with our hosted endpoint using the AWS SageMaker JavaScript SDK. This SDK handles a lot of the heavy lifting, taking care of aspects like authentication, so we can use the model easily.

Just make sure you have the AWS CLI installed and you have authenticated there using the aws configure command. After that, the SDK will let you work with any of your hosted models within a few simple method calls.

Here we create a utility function, invokeSageMakerEndpoint, to initialize a SageMakerRuntimeClient as well as an InvokeEndpointCommand.

Once initialized, we can .send() the command to our deployed model and receive the response in JSON.

import {
    SageMakerRuntimeClient,
    InvokeEndpointCommand,
} from '@aws-sdk/client-sagemaker-runtime'

export async function invokeSageMakerEndpoint(
    endpointName: string,
    region: string,
    inputText: string,
    srcLang: string,
    targetLang: string
) {
    // Initialize the SageMaker Runtime Client
    const client = new SageMakerRuntimeClient({ region })

    // Create the command to invoke the endpoint
    const command = new InvokeEndpointCommand({
        EndpointName: endpointName,
        // Tell the inference container that the body is JSON
        ContentType: 'application/json',
        Body: JSON.stringify({
            inputs: inputText,
            // These parameters are specific to the model we are using
            parameters: {
                src_lang: srcLang,
                tgt_lang: targetLang,
            },
        }),
    })

    // Send the command and get the response
    const response = await client.send(command)
    const decodedResponse = JSON.parse(new TextDecoder().decode(response.Body))

    return decodedResponse
}
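
One thing the utility above doesn't handle is long posts. Real-time endpoints cap the size of a request and translation models have input length limits, so a full article may need to be sent in pieces. Here's a minimal sketch, assuming that splitting on blank lines is good enough. The chunkMarkdown function and its maxChars parameter are hypothetical names, and the right limit for SeamlessM4T-v2 is something you'd have to find by testing.

```typescript
// Hypothetical helper: groups paragraphs (separated by blank lines) into
// chunks of at most maxChars characters. A single paragraph longer than
// maxChars is kept whole rather than cut mid-sentence.
function chunkMarkdown(markdown: string, maxChars: number): string[] {
    const paragraphs = markdown
        .split(/\n{2,}/)
        .filter(paragraph => paragraph.trim() !== '')

    const chunks: string[] = []
    let current = ''

    for (const paragraph of paragraphs) {
        const candidate =
            current === '' ? paragraph : `${current}\n\n${paragraph}`

        if (current === '' || candidate.length <= maxChars) {
            // Either the chunk is empty or the paragraph still fits
            current = candidate
        } else {
            // Paragraph doesn't fit: close the chunk and start a new one
            chunks.push(current)
            current = paragraph
        }
    }

    if (current !== '') chunks.push(current)
    return chunks
}
```

Each chunk can then be passed to invokeSageMakerEndpoint in turn and the translated pieces joined back together with blank lines. Be aware that this naive split can cut a fenced code block in half if the block contains blank lines.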

Hooking into our blog's build hooks

Nuxt Content exposes hooks to allow you to modify the content before it is parsed and after it is parsed. We just need to create a custom Nitro plugin.

Create a new file at server/plugins/translate.ts and paste the code snippet:

import { invokeSageMakerEndpoint } from '../utils/invokeSageMakerEndpoint'

export default defineNitroPlugin(async nitroApp => {
    const { AWS_ENDPOINT_NAME, AWS_REGION } = useRuntimeConfig()
    // SeamlessM4T-v2 expects three-letter language codes
    const lang = {
        src: 'eng',
        tgt: 'spa',
    }

    nitroApp.hooks.hook('content:file:beforeParse', async file => {
        const response = await invokeSageMakerEndpoint(
            AWS_ENDPOINT_NAME,
            AWS_REGION,
            file.body,
            lang.src,
            lang.tgt
        )

        // Log the raw response so we can inspect its shape
        console.log(response)
    })
})

The content:file:beforeParse hook runs for every markdown file within our project's content directory, giving us the file's id and its text body.

In this simple example, I am taking the text and passing it directly to invokeSageMakerEndpoint along with a source language of English and a target language of Spanish.

Note: Language codes differ between models, e.g., nllb-200 requires eng_Latn for English, but SeamlessM4T-v2 uses eng.
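
Because the i18n side of a blog usually works with two-letter locales like en or es, a small lookup table keeps the two worlds in sync. This is a sketch under assumptions: toSeamlessCode and the table are hypothetical, the table only lists a few languages, and you should check each code against the model card before relying on it.

```typescript
// Assumed mapping from two-letter locales to the three-letter codes
// SeamlessM4T-v2 expects. Extend it with the languages you need.
const SEAMLESS_LANGUAGE_CODES: { [locale: string]: string } = {
    en: 'eng',
    es: 'spa',
    fr: 'fra',
    de: 'deu',
    ja: 'jpn',
}

// Hypothetical helper: accepts 'es', 'es-MX', or 'es_MX' and returns the
// model's code, throwing on anything that is not in the table.
function toSeamlessCode(locale: string): string {
    // Keep only the language subtag, e.g. 'es-MX' becomes 'es'
    const language = locale.toLowerCase().split(/[-_]/)[0] || ''

    const known = Object.prototype.hasOwnProperty.call(
        SEAMLESS_LANGUAGE_CODES,
        language
    )
    const code = known ? SEAMLESS_LANGUAGE_CODES[language] : undefined

    if (typeof code !== 'string') {
        throw new Error(`No SeamlessM4T-v2 code configured for "${locale}"`)
    }
    return code
}
```

Throwing on unknown locales is a deliberate choice here: a silent fallback to English would produce "translated" files that are not actually translated.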

In my case, I only have a single markdown file located at content/index.md with the content:

# Hello world

If you log the response returned by invokeSageMakerEndpoint, you will see that it looks something like this:

[ { translation_text: 'Hola mundo' } ] 
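
Since the endpoint's response is parsed from JSON, TypeScript can't guarantee it has this shape, for example if the container returns an error object instead. Here's a minimal sketch of a defensive accessor; extractTranslation is a hypothetical helper, and it assumes the response always looks like the array shown above when the call succeeds.

```typescript
interface TranslationResult {
    translation_text: string
}

// Hypothetical helper: returns the first translation_text from the
// response, or throws if the response has an unexpected shape.
function extractTranslation(response: unknown): string {
    if (!Array.isArray(response) || response.length === 0) {
        throw new Error('Expected a non-empty array from the endpoint')
    }

    const first: unknown = response[0]
    const isResult =
        typeof first === 'object' &&
        first !== null &&
        typeof (first as { [key: string]: unknown })['translation_text'] ===
            'string'

    if (!isResult) {
        throw new Error('Response item is missing a translation_text string')
    }

    return (first as TranslationResult).translation_text
}
```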

Saving the translations into localized files

Now that we're getting translations back, we can write the content to its own file. This is also a step towards automating the entire process through GitHub Actions.

I'm just passing the translated text that was returned by our invokeSageMakerEndpoint function to another utility function that manages the post-processing of the text:

import { invokeSageMakerEndpoint } from '../utils/invokeSageMakerEndpoint'
import { handleFileCreation } from '../utils/handleFileCreation'

export default defineNitroPlugin(async nitroApp => {
    const { AWS_ENDPOINT_NAME, AWS_REGION } = useRuntimeConfig()
    const lang = {
        src: 'eng',
        tgt: 'spa',
    }

    nitroApp.hooks.hook('content:file:beforeParse', async file => {
        const response: [{ translation_text: string }] =
            await invokeSageMakerEndpoint(
                AWS_ENDPOINT_NAME,
                AWS_REGION,
                file.body,
                lang.src,
                lang.tgt
            )

        handleFileCreation(file, response[0].translation_text, lang.tgt)
    })
})

The key here is that we are taking the translated content and writing it into the appropriate file. For example, the content/about.md file will be translated to Spanish and its content will be written to content/spa/about.md.

import { promises as fs } from 'fs'
import path from 'path'

interface ContentObject {
    _id: string
    body: string
}

interface ProcessResult {
    originalId: string
    writtenTo: string
}

export async function handleFileCreation(
    contentObj: ContentObject,
    translatedText: string,
    languageDirectory: string
): Promise<ProcessResult> {
    // Validate the 'content:' prefix before stripping it
    if (!contentObj._id.startsWith('content:')) {
        throw new Error('Content object ID must start with "content:"')
    }

    // Strip the prefix, split the remaining path, and remove empty parts
    const parts = contentObj._id
        .slice('content:'.length)
        .split(':')
        .filter(Boolean)

    if (parts.length === 0) {
        throw new Error('Invalid content object ID format')
    }

    // Construct the full file path by joining all parts
    const filePath = path.join('content', languageDirectory, ...parts)
    console.log(filePath)

    // Create directory if it doesn't exist
    const dirPath = path.dirname(filePath)
    await fs.mkdir(dirPath, { recursive: true })

    // Write the content to the file
    await fs.writeFile(filePath, translatedText, 'utf-8')

    return {
        originalId: contentObj._id,
        writtenTo: filePath,
    }
}
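
One caveat with writing into content/spa: those files live inside the content directory too, so a later build may pick them up and try to translate them again. Here's a sketch of a guard, assuming translated files always sit under a top-level language directory and that ids follow the content:dir:file.md pattern used above. The name isAlreadyTranslated is hypothetical.

```typescript
// Hypothetical helper: returns true when the file's first path segment
// is one of our language directories, e.g. 'content:spa:about.md'.
function isAlreadyTranslated(
    id: string,
    languageDirectories: string[]
): boolean {
    const parts = id.split(':').filter(Boolean)

    // parts[0] is the 'content' prefix, parts[1] the first path segment
    const firstSegment = parts[1]

    return (
        firstSegment !== undefined &&
        languageDirectories.indexOf(firstSegment) !== -1
    )
}
```

Inside the hook, returning early when isAlreadyTranslated(file._id, ['spa']) is true would skip the endpoint call for files that are already output.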

To keep this tutorial focused, I opted not to go too in-depth on the post-processing of our markdown content.

This could be expanded to handle frontmatter and the edge cases that the AI model may introduce in links, images, alt text, etc.
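
As a starting point for the frontmatter case, here's a minimal sketch that separates a leading frontmatter block from the body, so only the body is sent to the model. It assumes frontmatter is delimited by lines containing exactly three dashes at the very top of the file. splitFrontmatter is a hypothetical helper, not part of Nuxt Content.

```typescript
interface SplitMarkdown {
    frontmatter: string
    body: string
}

// Hypothetical helper: splits a leading '---' delimited block from the
// rest of the file. Files without frontmatter are returned unchanged.
function splitFrontmatter(markdown: string): SplitMarkdown {
    // Opening '---' line, optional content lines, closing '---' line
    const match = /^---\r?\n(?:[\s\S]*?\r?\n)?---(?:\r?\n|$)/.exec(markdown)

    if (match === null) {
        return { frontmatter: '', body: markdown }
    }

    const frontmatter = match[0]
    return { frontmatter, body: markdown.slice(frontmatter.length) }
}
```

After translating the body, concatenating frontmatter and the translated text restores the file. Fields like title or description would stay in the source language with this approach, so they would need their own translation pass.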

Scaling down resources to avoid costs

Make sure to run the following command once you are done to avoid unnecessary charges from Amazon:

aws sagemaker delete-endpoint --endpoint-name <ENDPOINT_NAME>

Conclusion

We've walked through the complete process of automating multilingual content generation for your Nuxt Content blog using AWS SageMaker. By leveraging the power of Meta's SeamlessM4T-v2 model, AWS's scalable infrastructure, and Nuxt's flexible content system, we've created a workflow that automatically translates your content into nearly 100 languages during the build process.

This automation brings several key benefits:

  • Eliminates the manual effort of managing translations
  • Significantly reduces localization costs
  • Expands your blog's reach to a global audience
  • Maintains content consistency across all languages
  • Scales effortlessly as your content grows

The best part? Once set up, this system requires minimal maintenance. Your content creators can focus on writing great content in their primary language, while the automation handles the rest.

You can find all the code from this tutorial in our GitHub repository, complete with a working demo and additional documentation. Feel free to fork it, customize it, and make it your own.

I'm curious to hear how you might adapt this workflow for your needs. Could this automate documentation translation for your open-source project? Perhaps help with internationalizing your marketing materials? Or maybe you're thinking about using it for something entirely different?

Remember to clean up your AWS resources when you're done experimenting by deleting your endpoint, as shown in the cleanup section above.

What language will your next blog post speak? With this setup, the answer might just be "all of them."
