StarCoder2: The Open Coding Model Trained on 600+ Programming Languages

BigCode's StarCoder2-15B is trained on The Stack v2 covering 619 programming languages, bringing fill-in-the-middle completion and strong HumanEval scores to a model you can run without a vendor contract.

Mahmudul Haque Qudrati

CEO & ML Engineer

March 12, 2026

7 min read

// tags

#starcoder2#coding#bigcode#open-source#fill-in-middle

HumanEval and Benchmark Comparisons

Model	HumanEval pass@1	Parameters
StarCoder2-15B	46.3% (base), ~72% (instruct)	15B
Code Llama 34B (Python variant)	53.7%	34B
StarCoder (v1) 15B	33.6%	15B
Qwen2.5-Coder 7B (instruct)	88.4%	7B

The instruct-tuned version (StarCoder2-15B-Instruct-v0.1) reaches competitive scores on instruction following, though newer models like Qwen2.5-Coder have surpassed it on raw HumanEval. StarCoder2 remains relevant for its breadth of language coverage and fill-in-the-middle support.
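For readers unfamiliar with the metric, the sketch below shows how a pass@k score is commonly computed from sampled completions, using the unbiased estimator introduced alongside HumanEval. It assumes the scores in the table follow that convention; the per-problem counts in `results` are made up for illustration and are not measurements of any model.

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for one problem: n samples drawn, c of them correct."""
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # Probability that a random size-k subset of the n samples contains no correct one
    return 1.0 - comb(n - c, k) / comb(n, k)

def benchmark_pass_at_k(results: list[tuple[int, int]], k: int) -> float:
    """Average the per-problem estimate over (n, c) pairs."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)

# Hypothetical (n, c) counts for three problems, 10 samples each
results = [(10, 10), (10, 5), (10, 0)]
print(f"pass@1 = {benchmark_pass_at_k(results, 1):.3f}")  # pass@1 = 0.500
```

With a single greedy sample per problem (n = 1), pass@1 reduces to the fraction of problems whose one completion passes the tests.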

Fill-in-the-Middle Training

StarCoder2 is trained with the FIM (fill-in-the-middle) objective. Special sentinel tokens mark the code before and after the cursor, so the model generates the missing middle conditioned on both sides:

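from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# Load the base checkpoint in bfloat16 and let Accelerate place it on available devices
tokenizer = AutoTokenizer.from_pretrained("bigcode/starcoder2-15b")
model = AutoModelForCausalLM.from_pretrained(
    "bigcode/starcoder2-15b",
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

# Code before and after the gap the model should fill
prefix = "def is_palindrome(s: str) -> bool:\n    "
suffix = "\n\nassert is_palindrome('racecar') == True"

# Prefix-suffix-middle layout: generation continues after <fim_middle>
fim_prompt = f"<fim_prefix>{prefix}<fim_suffix>{suffix}<fim_middle>"

inputs = tokenizer(fim_prompt, return_tensors="pt").to(model.device)
# temperature only takes effect when sampling is enabled
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=True, temperature=0.2)

# Decode only the newly generated tokens, i.e. the filled-in middle
new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))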
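An editor integration has to do two things around the model call: split the buffer at the cursor to build the prompt, then splice the completion back in. The sketch below illustrates both steps without loading a model. The helper names `build_fim_prompt` and `splice_completion` are hypothetical, the sentinel strings are the ones used in the example above, and `middle` is a hard-coded stand-in for model output.

```python
FIM_PREFIX = "<fim_prefix>"
FIM_SUFFIX = "<fim_suffix>"
FIM_MIDDLE = "<fim_middle>"

def build_fim_prompt(source: str, cursor: int) -> str:
    """Split source at the cursor offset and wrap it in prefix-suffix-middle order."""
    if not 0 <= cursor <= len(source):
        raise ValueError("cursor must lie within the source text")
    prefix, suffix = source[:cursor], source[cursor:]
    return f"{FIM_PREFIX}{prefix}{FIM_SUFFIX}{suffix}{FIM_MIDDLE}"

def splice_completion(source: str, cursor: int, middle: str) -> str:
    """Insert the generated middle back into the source at the cursor."""
    return source[:cursor] + middle + source[cursor:]

source = "def add(a, b):\n    return \n"
cursor = source.index("return ") + len("return ")
prompt = build_fim_prompt(source, cursor)

# Stand-in for model output; a real completion would come from model.generate
middle = "a + b"
print(splice_completion(source, cursor, middle))
```

Keeping prompt construction separate from generation makes it easy to unit test the cursor handling on its own.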

VS Code Integration via Continue

The easiest way to use StarCoder2-15B for day-to-day coding is through the Continue extension:

1. Install Continue from the VS Code marketplace
2. In ~/.continue/config.json, add:

{
  "tabAutocompleteModel": {
    "title": "StarCoder2",
    "provider": "ollama",
    "model": "starcoder2:15b"
  }
}

3. Run ollama pull starcoder2:15b

You now have local, private tab completion that never sends your code to a third-party server.
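If config.json already holds other settings, pasting the snippet by hand risks overwriting them. The sketch below merges the tabAutocompleteModel entry into an existing file instead. It assumes the file is plain JSON without comments; `set_autocomplete_model` is a hypothetical helper and not part of Continue or Ollama. The demo runs against a temporary file so nothing in your home directory is touched.

```python
import json
import tempfile
from pathlib import Path

def set_autocomplete_model(config_path: Path, model: str = "starcoder2:15b") -> dict:
    """Add or replace tabAutocompleteModel while keeping other settings intact."""
    config = {}
    if config_path.exists():
        config = json.loads(config_path.read_text(encoding="utf-8"))
    config["tabAutocompleteModel"] = {
        "title": "StarCoder2",
        "provider": "ollama",
        "model": model,
    }
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config

# Demo on a throwaway file that already contains an unrelated setting
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "config.json"
    path.write_text('{"models": []}', encoding="utf-8")
    updated = set_autocomplete_model(path)
    print(json.dumps(updated, indent=2))
```

To apply it for real, pass Path.home() / ".continue" / "config.json" and back up the file first.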

Hugging Face Code Leaderboard

The Big Code Models Leaderboard tracks models on HumanEval, MBPP, MultiPL-E, and DS-1000. As of early 2025, StarCoder2-15B sits in the top 10 for models under 20B parameters, though the Qwen2.5-Coder family has moved above it in absolute scores.

When to Choose StarCoder2

You need maximum language breadth (619 languages vs ~100 for most models)
You want IDE tab completion with FIM and full local privacy
You need a model in the 7B to 15B range that balances quality and hardware requirements
You want to build on top of a permissively licensed, academically documented model
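To put the hardware point in rough numbers, the sketch below estimates the memory needed for model weights alone at a few common precisions. It assumes nominal parameter counts (7B and 15B) and ignores the KV cache, activations, and runtime overhead, so treat the figures as lower bounds. Quantized formats also carry some extra metadata that is not modeled here.

```python
# Bytes used to store one parameter at each precision
BYTES_PER_PARAM = {"fp32": 4.0, "bf16": 2.0, "int8": 1.0, "int4": 0.5}

def weight_memory_gb(params_billions: float, dtype: str) -> float:
    """Approximate memory for the weights alone, in gigabytes (10^9 bytes)."""
    return params_billions * BYTES_PER_PARAM[dtype]

for size in (7, 15):
    for dtype in ("bf16", "int8", "int4"):
        print(f"{size}B @ {dtype}: ~{weight_memory_gb(size, dtype):.1f} GB")
```

Under these assumptions a 15B model needs about 30 GB for bf16 weights, while a 4-bit quantization brings that down to roughly 7.5 GB.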
