DeepSeek R1: The Open-Source Reasoning Model That Beat o1

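DeepSeek R1 scores 79.8% on AIME 2024 (vs. 63.6% for o1-mini) with a training pipeline built around large-scale reinforcement learning, and it's MIT licensed. Its precursor, R1-Zero, was trained with reinforcement learning alone, without supervised fine-tuning.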

Mahmudul Haque Qudrati

CEO & ML Engineer

March 22, 2026

7 min read

// tags

#deepseek #reasoning #r1 #chain-of-thought #open-source

Architecture: MoE With 37B Active Parameters

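R1 uses a 671B-parameter Mixture of Experts (MoE) architecture, but only about 37B parameters are active for each token. This gives it frontier-level capacity while keeping per-token compute closer to that of a 37B dense model. All 671B parameters still have to be held in memory, so the savings are in compute rather than in weight storage.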
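To make the "only some parameters are active" idea concrete, here is a minimal sketch of generic top-k expert routing. It is not DeepSeek's actual router: the four toy experts, the gate scores, and the function names `top_k_route` and `moe_forward` are all hypothetical, chosen only to show that unselected experts never run.

```python
import math

TOTAL_PARAMS_B = 671   # total parameters in billions, from the text
ACTIVE_PARAMS_B = 37   # parameters active per token in billions, from the text


def top_k_route(gate_logits, k):
    """Pick the k highest-scoring experts and normalize their weights to sum to 1."""
    top = sorted(range(len(gate_logits)), key=lambda i: gate_logits[i], reverse=True)[:k]
    peak = max(gate_logits[i] for i in top)           # subtract the max for numerical stability
    exps = [math.exp(gate_logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


def moe_forward(x, experts, gate_logits, k):
    """Run only the selected experts and return their weighted sum."""
    routed = top_k_route(gate_logits, k)
    output = sum(weight * experts[i](x) for i, weight in routed)
    return output, [i for i, _ in routed]


# Toy setup (hypothetical): 4 experts, each of which just scales its input.
experts = [lambda x, s=s: s * x for s in (1.0, 2.0, 3.0, 4.0)]
gate_logits = [0.1, 2.0, -1.0, 1.5]   # made-up router scores for one token

output, used = moe_forward(10.0, experts, gate_logits, k=2)
print("experts used:", used)
print(f"active fraction in R1: {ACTIVE_PARAMS_B / TOTAL_PARAMS_B:.1%}")
```

With `k=2`, two of the four toy experts are evaluated for this token and the other two are skipped entirely, which is the same mechanism that lets a 671B model spend compute on roughly 37B parameters per token.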

Distilled variants, smaller Qwen- and Llama-based models fine-tuned with SFT to imitate R1's reasoning traces, are available at sizes including 7B, 14B, 32B, and 70B. The smaller ones make reasoning capability accessible on consumer hardware.
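A rough way to judge which distilled size fits a given machine is to estimate the memory needed for the weights alone. The sketch below assumes weight memory is simply parameter count times bits per weight; it ignores the KV cache, activations, and runtime overhead, so real usage will be higher. The function name `weight_gib` and the choice of 4-bit and 16-bit precisions are illustrative assumptions.

```python
DISTILLED_SIZES_B = [7, 14, 32, 70]   # model sizes in billions of parameters, from the text


def weight_gib(params_billion, bits_per_weight):
    """Approximate memory for the weights alone, in GiB."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


for size in DISTILLED_SIZES_B:
    four_bit = weight_gib(size, 4)     # assumption: roughly 4-bit quantization
    half = weight_gib(size, 16)        # assumption: unquantized 16-bit weights
    print(f"{size}B: ~{four_bit:.1f} GiB at 4-bit, ~{half:.1f} GiB at 16-bit")
```

Under these assumptions a 4-bit 7B model needs only a few GiB for weights, while a 16-bit 70B model is well beyond any single consumer GPU.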

API Access

The DeepSeek API is OpenAI-compatible, so the standard openai Python client works once base_url points at DeepSeek. Pricing is listed at $0.14 per million input tokens ($0.014 on cache hits) and $2.19 per million output tokens, a fraction of o1's cost.
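The per-token prices above translate into a request cost with simple arithmetic. The sketch below hard-codes the prices quoted in this article, which may not match current pricing, and the function name `estimate_cost_usd` is hypothetical. It also assumes chain-of-thought tokens are billed as output tokens, which is worth confirming against DeepSeek's pricing page.

```python
# Prices in USD per million tokens, as quoted in this article (may be out of date).
PRICE_INPUT_PER_M = 0.14
PRICE_INPUT_CACHED_PER_M = 0.014
PRICE_OUTPUT_PER_M = 2.19


def estimate_cost_usd(input_tokens, output_tokens, cached_input_tokens=0):
    """Estimate the cost of one request from its token counts."""
    if cached_input_tokens > input_tokens:
        raise ValueError("cached_input_tokens cannot exceed input_tokens")
    uncached = input_tokens - cached_input_tokens
    total = (
        uncached * PRICE_INPUT_PER_M
        + cached_input_tokens * PRICE_INPUT_CACHED_PER_M
        + output_tokens * PRICE_OUTPUT_PER_M   # assumption: includes reasoning tokens
    )
    return total / 1_000_000


# Example: a 2,000-token prompt that produces 6,000 tokens of reasoning plus answer.
print(f"${estimate_cost_usd(2_000, 6_000):.4f}")
```

Because reasoning models tend to emit long chains of thought, output tokens usually dominate the bill even though the prompt may be short.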

from openai import OpenAI

client = OpenAI(
    api_key="your-deepseek-api-key",
    base_url="https://api.deepseek.com"
)

response = client.chat.completions.create(
    model="deepseek-reasoner",
    messages=[
        {"role": "user", "content": "Prove that sqrt(2) is irrational."}
    ]
)

# R1 returns both thinking and answer
print(response.choices[0].message.reasoning_content)  # chain of thought
print(response.choices[0].message.content)            # final answer
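For multi-turn conversations, the chain of thought is normally not sent back to the model: only the final answer goes into the next request. The helper below is a minimal sketch of that pattern, assuming (as I understand DeepSeek's documentation for the reasoning model) that `reasoning_content` should be left out of the message history. The function name `next_turn_messages` is hypothetical.

```python
def next_turn_messages(history, assistant_answer, user_followup):
    """Build the message list for the next request.

    Only the assistant's final answer is appended; the reasoning text is
    deliberately dropped. The input list is not modified.
    """
    return history + [
        {"role": "assistant", "content": assistant_answer},
        {"role": "user", "content": user_followup},
    ]


history = [{"role": "user", "content": "Prove that sqrt(2) is irrational."}]
messages = next_turn_messages(
    history,
    assistant_answer="Assume sqrt(2) = p/q in lowest terms...",  # stand-in for message.content
    user_followup="Now do the same for sqrt(3).",
)
print(len(messages), "messages in the next request")
```

In real use, `assistant_answer` would be `response.choices[0].message.content` from the call above, and `messages` would be passed to the next `client.chat.completions.create` call.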

Running Locally With Ollama

# 7B distilled: runs on a consumer GPU (8 GB VRAM)
ollama pull deepseek-r1:7b
ollama run deepseek-r1:7b "What is the integral of x^2 from 0 to 3?"

# 32B distilled: runs on 2× RTX 3090
ollama pull deepseek-r1:32b
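Once a model is pulled, it can also be called from Python through Ollama's local HTTP API. The sketch below assumes Ollama is running on its default port (11434) and that the distilled model returns its chain of thought inline, wrapped in `<think>...</think>` tags; depending on the Ollama version, the reasoning may instead arrive in a separate field. The helper names `ask_ollama` and `split_reasoning` are hypothetical.

```python
import json
import re
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"   # Ollama's default local endpoint


def ask_ollama(prompt, model="deepseek-r1:7b"):
    """Send one non-streaming prompt to a local Ollama server and return the raw text."""
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as resp:
        return json.loads(resp.read().decode("utf-8"))["response"]


_THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(text):
    """Split raw model output into (reasoning, answer).

    Assumes the reasoning, if present, is wrapped in <think> tags.
    Returns an empty reasoning string when no tags are found.
    """
    match = _THINK_RE.search(text)
    if match is None:
        return "", text.strip()
    reasoning = match.group(1).strip()
    answer = (text[:match.start()] + text[match.end():]).strip()
    return reasoning, answer
```

A typical call would be `reasoning, answer = split_reasoning(ask_ollama("What is the integral of x^2 from 0 to 3?"))`, which keeps the final answer separate from the chain of thought in the same way the hosted API does with `reasoning_content` and `content`.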

When to Use R1

R1 is best for tasks with verifiable correct answers: math, formal proofs, competitive programming, structured data extraction with strict schema constraints. For open-ended creative or conversational tasks, GPT-4o or Claude 3.5 Sonnet often produce more natural output.
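"Verifiable" here means a program can check the answer without human judgment. As a minimal sketch of that idea, the checker below pulls the last number out of a model's final answer and compares it exactly to a known result. It is deliberately crude: the regex and the helper names `last_number` and `is_correct` are illustrative assumptions, not part of any DeepSeek tooling.

```python
import re
from fractions import Fraction

# Matches simple fractions like 9/2 first, then integers and decimals like 9 or 4.5.
_NUMBER_RE = re.compile(r"-?\d+/\d+|-?\d+(?:\.\d+)?")


def last_number(text):
    """Return the last number in the text as an exact Fraction, or None."""
    matches = _NUMBER_RE.findall(text)
    if not matches:
        return None
    try:
        return Fraction(matches[-1])
    except (ValueError, ZeroDivisionError):
        return None


def is_correct(answer_text, expected):
    """Check whether the last number in the answer equals the expected value exactly."""
    return last_number(answer_text) == Fraction(expected)


print(is_correct("The integral of x^2 from 0 to 3 is 9.", 9))
```

Checks like this are what make math and coding tasks a good fit for a reasoning model: a wrong answer can be detected automatically and the request retried, which is much harder for open-ended writing.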

Summary

DeepSeek R1 is a landmark in open-source AI: frontier reasoning capability, an MIT license, and a published training methodology that's already influencing how the industry thinks about RL-based training. Download the weights from Hugging Face or call the API at deepseek.com.

Benchmark Results

Benchmark	DeepSeek R1	o1-mini	o1
AIME 2024	79.8%	63.6%	74.4%
MATH-500	97.3%	90.0%	96.4%
Codeforces (percentile)	96.3	93.4	96.6
MMLU	90.8%	85.2%	91.8%
