200 Languages Including Low-Resource African Languages
Machine translation has historically concentrated on high-resource language pairs. Meta's No Language Left Behind (NLLB) paper describes a deliberate effort to cover 55 African languages, many of them low-resource (Ewe, Wolof, Twi, Yoruba, and others), that have negligible representation in existing MT systems.
The training data came from a mining pipeline that crawled web text, identified its language, and aligned parallel sentences specifically for these languages. Human translators produced the FLORES-200 benchmark used to assess translation quality.
Model Variants
- NLLB-200-54B MoE: Mixture of Experts, highest quality, requires multi-GPU
- NLLB-200-3.3B: Dense, high quality, fits on single A10G (24GB VRAM)
- NLLB-200-1.3B: Dense, good quality, fits on consumer GPU
- NLLB-200-distilled-600M: Distilled from larger model, best quality/speed tradeoff, runs on CPU
The Hugging Face model page for NLLB-200-distilled-600M (facebook/nllb-200-distilled-600M) is the practical starting point for most developers.
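To sanity-check which variant fits on a given machine, a rough estimate of weight memory is parameters times bytes per parameter. The sketch below assumes the parameter counts implied by the model names above and covers weights only; activations, generation caches, and framework overhead come on top, so treat the numbers as lower bounds. The names `PARAM_COUNTS` and `weight_memory_gb` are illustrative, not part of any library.

```python
# Approximate parameter counts, read off the variant names (an assumption).
PARAM_COUNTS = {
    "NLLB-200-54B MoE": 54e9,
    "NLLB-200-3.3B": 3.3e9,
    "NLLB-200-1.3B": 1.3e9,
    "NLLB-200-distilled-600M": 600e6,
}

# Bytes used to store one parameter at common precisions.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "int8": 1}


def weight_memory_gb(num_params: float, dtype: str = "float32") -> float:
    """Return the approximate size of the model weights in gigabytes."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


if __name__ == "__main__":
    for name, params in PARAM_COUNTS.items():
        sizes = ", ".join(
            f"{dtype}: {weight_memory_gb(params, dtype):.1f} GB"
            for dtype in BYTES_PER_PARAM
        )
        print(f"{name}: {sizes}")
```

Under these assumptions the 3.3B model needs about 13 GB for float32 weights, which is consistent with it fitting on a 24 GB card, while the 54B MoE exceeds any single GPU at every precision listed.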
Python Translation With Transformers
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

MODEL_NAME = "facebook/nllb-200-distilled-600M"
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_NAME)


def translate(text: str, src_lang: str, tgt_lang: str) -> str:
    """
    Translate text between two FLORES-200 language codes.

    Codes combine an ISO 639-3 language code with an ISO 15924 script code.
    English: eng_Latn, French: fra_Latn, Arabic: arb_Arab, Yoruba: yor_Latn
    Full list: https://github.com/facebookresearch/flores/blob/main/flores200/README.md
    """
    tokenizer.src_lang = src_lang
    inputs = tokenizer(text, return_tensors="pt")
    # Language codes are ordinary vocabulary tokens. The older
    # tokenizer.lang_code_to_id mapping is deprecated in recent transformers releases.
    target_lang_id = tokenizer.convert_tokens_to_ids(tgt_lang)
    translated = model.generate(
        **inputs,
        forced_bos_token_id=target_lang_id,  # force the target language as the first token
        max_length=512,
    )
    return tokenizer.batch_decode(translated, skip_special_tokens=True)[0]


result = translate(
    "Machine learning is transforming healthcare.",
    src_lang="eng_Latn",
    tgt_lang="fra_Latn",
)
print(result)
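The function above handles one short string. For paragraphs or documents, a common approach is to split the text into sentences and translate each one separately, since the `max_length=512` cap would otherwise truncate long outputs. The following is a minimal sketch of that idea, not a production splitter: the regex is a naive assumption that sentences end in `.`, `!`, or `?`, which fails on abbreviations and on scripts with other sentence punctuation. `split_sentences` and `translate_long` are hypothetical helper names, and `translate_fn` stands for any callable with the same signature as `translate`.

```python
import re
from typing import Callable, List


def split_sentences(text: str) -> List[str]:
    """Naively split text after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


def translate_long(
    text: str,
    src_lang: str,
    tgt_lang: str,
    translate_fn: Callable[[str, str, str], str],
) -> str:
    """Translate sentence by sentence and rejoin the results with spaces."""
    sentences = split_sentences(text)
    return " ".join(translate_fn(s, src_lang, tgt_lang) for s in sentences)
```

With the model loaded, `translate_long(document, "eng_Latn", "fra_Latn", translate)` would pass each sentence through the function defined earlier. Joining with a single space discards the original paragraph breaks, so a real pipeline would likely split on paragraphs first.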
Fast Inference With ctranslate2
CTranslate2 can provide roughly 2-5x faster inference than PyTorch for sequence-to-sequence models, with the actual speedup depending on hardware, batch size, and quantization:
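from ctranslate2 import Translator
from transformers import AutoTokenizer

# Convert the checkpoint once beforehand with:
#   ct2-transformers-converter --model facebook/nllb-200-distilled-600M --output_dir nllb-200-ct2
ct2_model_path = "nllb-200-ct2"
translator = Translator(ct2_model_path, device="cpu", inter_threads=4)
tokenizer = AutoTokenizer.from_pretrained("facebook/nllb-200-distilled-600M")
tokenizer.src_lang = "eng_Latn"

# encode() adds the source language token and </s>, which the model expects;
# tokenize() alone would omit them.
tokens = tokenizer.convert_ids_to_tokens(tokenizer.encode("Hello, world!"))
results = translator.translate_batch([tokens], target_prefix=[["fra_Latn"]])
# The first output token is the target language code, so drop it before decoding.
output_tokens = results[0].hypotheses[0][1:]
translated = tokenizer.decode(tokenizer.convert_tokens_to_ids(output_tokens))
print(translated)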
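The converted model directory has to exist before the `Translator` can load it. Besides the command-line converter, CTranslate2 exposes a Python converter class, so the one-time conversion can be scripted. The sketch below wraps it in a hypothetical helper, `convert_nllb`; the default `int8` quantization is an assumption aimed at CPU serving and may cost some translation quality, so it is worth comparing against an unquantized conversion.

```python
def convert_nllb(
    model_id: str = "facebook/nllb-200-distilled-600M",
    output_dir: str = "nllb-200-ct2",
    quantization: str = "int8",
    force: bool = False,
) -> str:
    """Convert a Hugging Face NLLB checkpoint to CTranslate2 format.

    Downloads the checkpoint if it is not cached, writes the converted
    model to output_dir, and returns that path. Set force=True to
    overwrite an existing output directory.
    """
    # Imported lazily so this module can be loaded without ctranslate2 installed.
    from ctranslate2.converters import TransformersConverter

    converter = TransformersConverter(model_id)
    return converter.convert(output_dir, quantization=quantization, force=force)
```

Calling `convert_nllb()` once produces the `nllb-200-ct2` directory used above; later runs can skip the conversion and load the directory directly.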
Cost Comparison vs Google Translate API
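Google Translate charges $20 per 1 million characters. For a SaaS product translating 50 million characters per month, that is $1,000/month. Self-hosting NLLB-200-distilled-600M on a single c5.2xlarge instance (8 vCPU, 16GB RAM) costs approximately $200/month, depending on region and pricing model. Averaged over a month, 50 million characters is only about 19 characters per second, a load the instance handles comfortably on CPU with ctranslate2, although real traffic will peak well above that average.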
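The arithmetic behind this comparison is easy to rerun for other volumes. The sketch below uses only the figures quoted above ($20 per million characters, about $200/month for the instance) and ignores engineering time, monitoring, and redundancy, which a fuller comparison would need to include. The function names are illustrative.

```python
def api_cost_usd(chars_per_month: float, price_per_million: float = 20.0) -> float:
    """Monthly cost of a per-character API at the given price."""
    return chars_per_month / 1_000_000 * price_per_million


def break_even_chars(self_host_usd: float, price_per_million: float = 20.0) -> float:
    """Monthly character volume at which the API bill equals the hosting bill."""
    return self_host_usd / price_per_million * 1_000_000


def average_chars_per_second(chars_per_month: float, days: int = 30) -> float:
    """Average sustained load, assuming traffic is spread evenly over the month."""
    return chars_per_month / (days * 24 * 3600)


if __name__ == "__main__":
    volume = 50_000_000
    print(api_cost_usd(volume))                        # 1000.0
    print(break_even_chars(200.0))                     # 10000000.0
    print(round(average_chars_per_second(volume), 1))  # 19.3
```

With these inputs, self-hosting breaks even at 10 million characters per month; below that volume the API is the cheaper option on infrastructure cost alone.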
The quality difference versus Google Translate varies by language: comparable on major European languages, often better on low-resource African languages where NLLB-200 was specifically optimized.