InternLM 2.5: Shanghai AI Lab's Long-Context Multilingual LLM

InternLM 2.5 20B offers a 1M token context window, strong Chinese-English bilingual reasoning, and native tool calling - all in a model small enough to serve on a single A100.

Mahmudul Haque Qudrati

CEO & ML Engineer

April 8, 2026

7 min read

// tags

#internlm#multilingual#long-context#chinese#open-source

FIG. ART-28

7 min read

“

InternLM 2.5: Shanghai AI Lab's Long-Context Multilingual LLM

// reading plan

sections

427

words

min read

// Developer Tools

Open Code Review – An AI-powered code review CLI tool: A Practical Overview

Open Code Review is an open-source CLI tool from Alibaba that uses AI to review code changes. It runs locally, supports multiple LLMs, and costs about $0.01 per review. Here's a practical breakdown.

4 min read

// LLMs & Language Models

Claude Opus 4.8 vs GPT-5.5 vs Gemini 3.1 Pro: June 2026 Benchmarks and Pricing

Chain-of-Thought Reasoning

InternLM 2.5 was trained with explicit chain-of-thought (CoT) supervision, meaning it performs better on multi-step reasoning tasks when you prompt it to think step by step. On Chinese mathematical reasoning benchmarks:

CMath (Chinese math): InternLM 2.5 20B scores above Qwen 2.5 14B and competitive with Qwen 2.5 32B
CEVAL (Chinese comprehensive benchmark): scores 78.3%, above Yi-1.5 34B's 77.4% at less than half the parameter count

Tool Calling

The model includes native function calling support compatible with the OpenAI tool use format. This makes it drop-in compatible with agent frameworks like LangChain and LlamaIndex:

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_stock_price",
            "description": "Get the current price of a stock by ticker symbol",
            "parameters": {
                "type": "object",
                "properties": {
                    "ticker": {"type": "string", "description": "Stock ticker (e.g. AAPL)"},
                },
                "required": ["ticker"],
            },
        },
    }
]

Multilingual Coverage

InternLM 2.5 supports 10+ languages with particular strength in:

Chinese (Simplified and Traditional)
English
Japanese
Korean
Multiple Southeast Asian languages

This is weaker coverage than Cohere's embed-multilingual-v3 (108 languages) but stronger than most models of this parameter count on Asian languages specifically.

Deployment with LMDeploy

Shanghai AI Lab maintains LMDeploy, an optimized inference framework for InternLM models with TurboMind backend:

pip install lmdeploy
lmdeploy serve api_server internlm/internlm2_5-20b-chat --server-port 23333

This exposes an OpenAI-compatible endpoint. LMDeploy achieves roughly 2 - 3x higher throughput than naive HuggingFace generation through continuous batching and KV cache optimization.

InternLM 2.5: Shanghai AI Lab's Long-Context Multilingual LLM

Related Articles

Open Code Review – An AI-powered code review CLI tool: A Practical Overview

What Is InternLM 2.5?

The 1M Token Context Window

Chain-of-Thought Reasoning

Tool Calling

Multilingual Coverage

Deployment with LMDeploy

Links

The workspace your team
actually needs

AI & ML insights, weekly

Mahmudul Haque Qudrati

Claude Opus 4.8 vs GPT-5.5 vs Gemini 3.1 Pro: June 2026 Benchmarks and Pricing

DeepSeek-R1: Architectures, Training Methods, and Why Reasoning Models Matter

InternLM 2.5: Shanghai AI Lab's Long-Context Multilingual LLM

Related Articles

Open Code Review – An AI-powered code review CLI tool: A Practical Overview

What Is InternLM 2.5?

The 1M Token Context Window

Chain-of-Thought Reasoning

Tool Calling

Multilingual Coverage

Deployment with LMDeploy

Links

The workspace your teamactually needs

AI & ML insights, weekly

Mahmudul Haque Qudrati

Claude Opus 4.8 vs GPT-5.5 vs Gemini 3.1 Pro: June 2026 Benchmarks and Pricing

DeepSeek-R1: Architectures, Training Methods, and Why Reasoning Models Matter

The workspace your team
actually needs