The Strongest Open-Source Multilingual Model
Qwen 2.5 72B from Alibaba Cloud's Qwen team is a 72-billion-parameter model that scores close to GPT-4o on general benchmarks and supports 29 languages, with strong performance on Chinese, Arabic, and Japanese, which Western-focused models often handle poorly.
MT-Bench: 9.12 (vs. GPT-4o at 9.18), a gap of 0.06 points. That is essentially a tie on one of the most widely used multi-turn instruction-following evaluations.
What's New in 2.5 vs 2.0
- 128K-token context window, matching Qwen2-72B-Instruct, with up to 8K tokens of generated output
- Improved instruction following with fewer refusals on legitimate tasks
- Better structured output (JSON mode reliability)
- Enhanced long-document processing with reduced hallucination on citations
- Stronger coding capability (HumanEval 86.7%)
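To make the long-context point concrete, here is a minimal sketch for checking whether a document is likely to fit in a 128K-token window and splitting it if not. The 4-characters-per-token ratio, the reserved headroom, and the names `estimate_tokens` and `split_to_budget` are all illustrative assumptions; real counts depend on Qwen's tokenizer and vary considerably for Chinese, Japanese, or Arabic text.

```python
# Rough planning helper: split a long document so each chunk fits a token budget.
# ASSUMPTION: ~4 characters per token is a crude English-text heuristic, not
# Qwen's real tokenizer; ratios differ for Chinese, Japanese, or Arabic.

CONTEXT_TOKENS = 128_000     # context window cited in the list above
RESERVED_TOKENS = 8_000      # hypothetical headroom for instructions and the reply
CHARS_PER_TOKEN = 4          # hypothetical heuristic


def estimate_tokens(text: str) -> int:
    """Very rough token estimate from character count (rounded up)."""
    return -(-len(text) // CHARS_PER_TOKEN)  # ceiling division


def split_to_budget(
    text: str, budget_tokens: int = CONTEXT_TOKENS - RESERVED_TOKENS
) -> list[str]:
    """Split text into pieces whose estimated size stays within budget_tokens."""
    max_chars = budget_tokens * CHARS_PER_TOKEN
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]
```

For production use, counting tokens with the model's own tokenizer would be more reliable than a character heuristic.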
Running With Ollama
# Pull the 72B model (requires ~45GB VRAM, or ~55GB of unified memory on Apple Silicon, where Ollama uses Metal)
ollama pull qwen2.5:72b
# Interactive chat
ollama run qwen2.5:72b
# Single prompt
ollama run qwen2.5:72b "Explain transformer attention in Japanese."
# Smaller variants for lower memory
ollama pull qwen2.5:32b # ~20GB VRAM
ollama pull qwen2.5:14b # ~9GB VRAM
ollama pull qwen2.5:7b # ~5GB VRAM
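Once a model is pulled, the same local server can be called from code. The sketch below uses only the Python standard library against Ollama's `/api/generate` endpoint, assuming the server runs on its default port 11434. The helper names `build_generate_request` and `generate` are illustrative, not part of Ollama.

```python
import json
import urllib.request

# Default address of a locally running Ollama server (assumes default port).
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_request(
    prompt: str, model: str = "qwen2.5:72b"
) -> urllib.request.Request:
    """Build a non-streaming POST request for Ollama's /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, model: str = "qwen2.5:72b") -> str:
    """Send the prompt to the local server and return the generated text."""
    with urllib.request.urlopen(build_generate_request(prompt, model)) as resp:
        return json.loads(resp.read())["response"]
```

With the server running, `generate("Explain transformer attention in Japanese.")` should return the completion as a single string; passing `model="qwen2.5:7b"` targets a smaller variant.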
Python API (Via OpenAI-Compatible Endpoint)
import os

from openai import OpenAI

# Using Together AI or another OpenAI-compatible provider
client = OpenAI(
    # Read the key from the environment rather than hard-coding it
    api_key=os.environ["TOGETHER_API_KEY"],
    base_url="https://api.together.xyz/v1",
)

response = client.chat.completions.create(
    model="Qwen/Qwen2.5-72B-Instruct-Turbo",
    messages=[
        {"role": "system", "content": "You are a helpful multilingual assistant."},
        {"role": "user", "content": "Summarize this document in both English and Arabic."},
    ],
    temperature=0.7,
    max_tokens=1024,
)
print(response.choices[0].message.content)
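The user message in the example refers to "this document" without including one. A small helper can put the document text and the target languages into the `messages` list. This is a sketch only: `build_summary_messages` is a hypothetical name, and the prompt wording is an assumption rather than a recommended template.

```python
def build_summary_messages(document: str, languages: list[str]) -> list[dict]:
    """Build a chat `messages` list asking for a summary in each given language."""
    if not languages:
        raise ValueError("at least one target language is required")
    language_list = ", ".join(languages)
    return [
        {"role": "system", "content": "You are a helpful multilingual assistant."},
        {
            "role": "user",
            "content": (
                "Summarize the following document in each of these languages: "
                f"{language_list}.\n\n{document}"
            ),
        },
    ]
```

The result can be passed directly as `messages=build_summary_messages(text, ["English", "Arabic"])` in the `client.chat.completions.create` call.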
Structured Output
Qwen 2.5 has improved JSON mode reliability for extraction tasks:
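import json

# Reuses the `client` defined above
response = client.chat.completions.create(
    model="Qwen/Qwen2.5-72B-Instruct-Turbo",
    messages=[{
        "role": "user",
        "content": 'Extract the fields {"name": str, "date": str, "amount": float} as a JSON object from: "Invoice from Acme Corp dated March 15 for $1,250.00"'
    }],
    response_format={"type": "json_object"},
)
# JSON mode returns the object as a string; parse it before use
invoice = json.loads(response.choices[0].message.content)
print(invoice)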
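JSON mode helps the reply parse as JSON, but it does not by itself ensure that the keys and types match what the prompt asked for. Here is a minimal validation sketch for the invoice fields above. The name `parse_invoice` and the strictness rules are assumptions for illustration, not part of any provider's API.

```python
import json

# Fields requested in the extraction prompt and the Python types accepted for each.
EXPECTED_FIELDS = {"name": str, "date": str, "amount": (int, float)}


def parse_invoice(raw: str) -> dict:
    """Parse the model's JSON reply and check the expected keys and types."""
    data = json.loads(raw)  # raises json.JSONDecodeError (a ValueError) on bad JSON
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    for field, expected in EXPECTED_FIELDS.items():
        if field not in data:
            raise ValueError(f"missing field: {field}")
        value = data[field]
        # bool is a subclass of int in Python, so reject it explicitly
        if isinstance(value, bool) or not isinstance(value, expected):
            raise ValueError(f"wrong type for field: {field}")
    data["amount"] = float(data["amount"])  # normalize ints such as 1250 to 1250.0
    return data
```

Calling `parse_invoice(response.choices[0].message.content)` raises `ValueError` when the reply is malformed, which gives the caller a single place to retry or log the failure.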
Language Coverage
Qwen 2.5 performs well across 29 languages, including English, Chinese (Simplified and Traditional), French, Spanish, German, Arabic, Japanese, Korean, Russian, Italian, Portuguese, Dutch, Polish, Turkish, Vietnamese, Thai, and Indonesian.
Summary
Qwen 2.5 72B is the best open-source choice for multilingual applications, particularly those serving Asian and Middle Eastern markets where other open models underperform. Full model weights are available on Hugging Face, and detailed benchmarks are published on the Qwen blog.