Deepseek: The Open Source LLM That Changed the Cost Equation

Deepseek trained a GPT-4o-competitive model for a reported $5.6M, roughly 1/20th of comparable frontier model training costs, and released it under the MIT license.

Mahmudul Haque Qudrati

CEO & ML Engineer

May 17, 2026

8 min read

// tags

#deepseek#open-source-llm#deepseek-v3#deepseek-r1#cost-efficient-ai


Deepseek V3 and Deepseek R1 are large language models from Deepseek AI, a Chinese company, released under the MIT license in late 2024 and early 2025. Deepseek V3 performs comparably to GPT-4o on most benchmarks. Deepseek R1 rivals OpenAI's o1 on reasoning tasks. Both were reportedly trained for far less than comparable Western frontier models, and both are available as open weights you can run yourself.

The Training Cost Story

Deepseek AI reported training Deepseek V3 for approximately $5.6 million in compute costs. That figure covers the GPU time of the final training run and excludes earlier research and ablation experiments. For context, training GPT-4 reportedly cost over $100 million, and frontier model training costs have continued to increase. Even if the true cost is 2-3x higher than Deepseek's reported figure, the efficiency gap is extraordinary.
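The headline number can be reproduced from two inputs given in the V3 technical report: roughly 2.788M H800 GPU-hours for the final run and an assumed rental rate of $2 per GPU-hour. The sketch below is only that arithmetic; the function name is illustrative, and the $100M comparison point is the reported (unconfirmed) GPT-4 figure quoted above.

```typescript
// GPU-hours and rental rate as stated in the Deepseek V3 technical report.
const H800_GPU_HOURS = 2.788e6;
const USD_PER_GPU_HOUR = 2;

// Reported (not confirmed) GPT-4 training cost, used only as a comparison point.
const GPT4_REPORTED_USD = 100e6;

function trainingCostUsd(gpuHours: number, usdPerGpuHour: number): number {
  return gpuHours * usdPerGpuHour;
}

const v3Cost = trainingCostUsd(H800_GPU_HOURS, USD_PER_GPU_HOUR);
const ratio = GPT4_REPORTED_USD / v3Cost;

console.log(`V3 final-run compute: $${(v3Cost / 1e6).toFixed(3)}M`);
console.log(`GPT-4 reported cost is ~${ratio.toFixed(1)}x higher`);
// Sensitivity check: how large is the gap if V3's real cost were 3x the reported one?
console.log(`Still ~${(ratio / 3).toFixed(1)}x higher at 3x the reported V3 cost`);
```

Even under the pessimistic 3x assumption, the gap stays at roughly 6x.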

How did they achieve this? A combination of factors:

Mixture-of-experts architecture (only a subset of parameters active per token, reducing compute per token)
Multi-head Latent Attention (MLA), a novel attention mechanism that reduces KV cache memory
FP8 mixed-precision training (lower precision, so less memory use and memory bandwidth)
Aggressive pipeline and communication optimization for their H800 GPU cluster

The result: a model that competes with GPT-4o at dramatically lower training cost, and dramatically lower inference cost because of the MoE architecture.
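To make the mixture-of-experts point from the list above concrete, here is a toy top-k gate. It is a generic illustration of the technique, not Deepseek's router: the expert functions, scores, and the choice of k = 2 are all made up for the example. The point is that only the selected experts execute for a given token.

```typescript
// Toy top-k mixture-of-experts gate: a generic technique, not Deepseek's router.
type Expert = (x: number[]) => number[];

function softmax(xs: number[]): number[] {
  const max = Math.max(...xs);
  const exps = xs.map((v) => Math.exp(v - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((v) => v / sum);
}

// Pick the k highest-scoring experts and renormalize their weights.
function topKGate(scores: number[], k: number): { index: number; weight: number }[] {
  const ranked = scores
    .map((score, index) => ({ score, index }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k);
  const weights = softmax(ranked.map((r) => r.score));
  return ranked.map((r, i) => ({ index: r.index, weight: weights[i] }));
}

// Only the selected experts run; the others cost no compute for this token.
function moeForward(x: number[], experts: Expert[], scores: number[], k: number): number[] {
  const out = new Array<number>(x.length).fill(0);
  for (const { index, weight } of topKGate(scores, k)) {
    const y = experts[index](x);
    for (let i = 0; i < out.length; i++) out[i] += weight * y[i];
  }
  return out;
}

// Four hypothetical experts that each scale the input by a different factor.
const experts: Expert[] = [1, 2, 3, 4].map((s) => (x: number[]) => x.map((v) => v * s));
const routerScores = [0.1, 2.0, -1.0, 2.0];
const gate = topKGate(routerScores, 2);
const mixed = moeForward([1, 1], experts, routerScores, 2);
console.log(gate, mixed);
```

With 2 of 4 experts active, half of the expert parameters are skipped per token. Deepseek V3's ratio is far more aggressive: 37B active out of 671B total.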

Deepseek V3: The General-Purpose Model

Deepseek V3 is a 671B parameter model (37B active parameters per token due to MoE). Benchmark performance:

MMLU: 88.5%, close to GPT-4o's 88.7%
MATH-500: 90.2%, significantly stronger than GPT-4o
HumanEval: 89.0%, competitive with leading coding models
Multilingual (Chinese): significantly stronger than Western models for Chinese-language tasks

These benchmarks are from Deepseek's technical report (arXiv:2412.19437, December 2024), and third-party evaluations have largely corroborated them.

Deepseek R1: The Reasoning Model

Deepseek R1 is a reasoning-optimized model, analogous to OpenAI's o1. It uses chain-of-thought reasoning (visible to the user, unlike o1) and was trained using reinforcement learning on verifiable tasks (math, code).
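"Verifiable" here means the reward can be computed by a rule instead of a learned judge. The sketch below shows what such a rule-based reward might look like for a math task. It is an illustration under stated assumptions, not Deepseek's training code: the `<think>` tag check, the `\boxed{}` answer convention, and the 0.5 format weight are all chosen for the example.

```typescript
// Rule-based reward sketch: score a completion without any learned judge.
// Tag names, the \boxed{} convention, and reward weights are illustrative assumptions.
function extractBoxedAnswer(completion: string): string | null {
  const match = completion.match(/\\boxed\{([^{}]*)\}/);
  return match ? match[1].trim() : null;
}

// 1 if the completion opens with a <think>...</think> reasoning block, else 0.
function formatReward(completion: string): number {
  return /^<think>[\s\S]+?<\/think>/.test(completion.trim()) ? 1 : 0;
}

// 1 if the boxed final answer exactly matches the known ground truth, else 0.
function accuracyReward(completion: string, groundTruth: string): number {
  return extractBoxedAnswer(completion) === groundTruth.trim() ? 1 : 0;
}

function totalReward(completion: string, groundTruth: string): number {
  return accuracyReward(completion, groundTruth) + 0.5 * formatReward(completion);
}

const correct = "<think>7 * 6 = 42</think> The answer is \\boxed{42}.";
const incorrect = "<think>7 * 6 = 48</think> The answer is \\boxed{48}.";
const unformatted = "The answer is \\boxed{42}.";

console.log(
  totalReward(correct, "42"),
  totalReward(incorrect, "42"),
  totalReward(unformatted, "42"),
);
```

Code tasks can be scored the same way by replacing the string comparison with a test-suite run.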

Benchmark performance:

MATH-500: 97.3%, slightly ahead of o1's 96.4%
AIME 2024: 79.8%, competitive with o1's 79.2%
Codeforces rating: 2029, placing it in the top competitive programming tier

These figures are from the Deepseek R1 technical report (arXiv:2501.12948, January 2025).

For math and reasoning-intensive tasks, R1 is genuinely competitive with the best models in the world.

How to Use Deepseek

Deepseek API (Cheapest Option)

Deepseek's own API is priced dramatically lower than OpenAI:

Deepseek V3: $0.27 per 1M input tokens, $1.10 per 1M output tokens (Deepseek pricing, 2025)
Deepseek R1: $0.55 per 1M input tokens, $2.19 per 1M output tokens

Compare that with GPT-4o at $2.50 per 1M input tokens and $10.00 per 1M output tokens. For equivalent-quality general tasks, Deepseek V3 via the Deepseek API costs roughly 1/9th of GPT-4o.
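To see what those per-token prices mean for a real bill, the sketch below applies them to a monthly workload. The prices are the ones quoted above; the workload (500M input and 100M output tokens) and the helper name are hypothetical.

```typescript
// Prices in USD per 1M tokens, copied from the figures quoted in this article.
interface Pricing {
  inputPerM: number;
  outputPerM: number;
}

const PRICES: Record<string, Pricing> = {
  "deepseek-v3": { inputPerM: 0.27, outputPerM: 1.1 },
  "deepseek-r1": { inputPerM: 0.55, outputPerM: 2.19 },
  "gpt-4o": { inputPerM: 2.5, outputPerM: 10.0 },
};

function monthlyCostUsd(model: string, inputTokens: number, outputTokens: number): number {
  const p = PRICES[model];
  if (!p) throw new Error(`Unknown model: ${model}`);
  return (inputTokens / 1e6) * p.inputPerM + (outputTokens / 1e6) * p.outputPerM;
}

// Hypothetical workload: 500M input tokens and 100M output tokens per month.
const gpt4oCost = monthlyCostUsd("gpt-4o", 500e6, 100e6);
const v3ApiCost = monthlyCostUsd("deepseek-v3", 500e6, 100e6);

console.log(`GPT-4o: $${gpt4oCost.toFixed(2)}`);
console.log(`Deepseek V3: $${v3ApiCost.toFixed(2)}`);
console.log(`Ratio: ${(gpt4oCost / v3ApiCost).toFixed(1)}x`);
```

For this workload the bill drops from $2,250 to $245, a ratio of about 9.2x. The exact ratio shifts slightly with the input/output mix.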

The Deepseek API is OpenAI-compatible, so you can use the OpenAI SDK by changing the base URL:

import OpenAI from "openai";

// Point the standard OpenAI SDK at Deepseek's endpoint.
const client = new OpenAI({
  apiKey: process.env.DEEPSEEK_API_KEY,
  baseURL: "https://api.deepseek.com",
});

const response = await client.chat.completions.create({
  model: "deepseek-chat", // V3
  messages: [{ role: "user", content: "Your prompt here" }],
});

// Print the assistant's reply.
console.log(response.choices[0].message.content);
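The same client can target R1 by switching the model name to "deepseek-reasoner". According to Deepseek's API documentation, the reply then carries the chain of thought in a `reasoning_content` field next to the usual `content`. The OpenAI SDK's types do not declare that field, so the sketch below reads it through a local interface. The `splitReasoning` helper and the sample message are hypothetical, and no network call is made.

```typescript
// Minimal shape of an R1 chat message. reasoning_content is a Deepseek extension
// that the OpenAI SDK types do not declare (assumed from Deepseek's API docs).
interface ReasonerMessage {
  content: string | null;
  reasoning_content?: string | null;
}

// Hypothetical helper: separate the visible reasoning from the final answer.
function splitReasoning(message: ReasonerMessage): { reasoning: string; answer: string } {
  return {
    reasoning: message.reasoning_content ?? "",
    answer: message.content ?? "",
  };
}

// Stand-in for response.choices[0].message from a "deepseek-reasoner" call.
const sampleMessage: ReasonerMessage = {
  content: "x = 4",
  reasoning_content: "2x + 1 = 9, so 2x = 8 and x = 4.",
};

const { reasoning, answer } = splitReasoning(sampleMessage);
console.log("Reasoning:", reasoning);
console.log("Answer:", answer);
```

With a live response, pass `response.choices[0].message as ReasonerMessage` to the helper. Keeping the two parts separate makes it easy to log the reasoning while showing users only the answer.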

OpenRouter

OpenRouter aggregates multiple providers and offers Deepseek V3 and R1 through them. It is useful if you want a single API key for multiple models with automatic fallback.
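As a rough sketch of what that looks like in practice, the helper below builds a chat request body with a primary model and ordered fallbacks. Several details are assumptions to verify against OpenRouter's current documentation: the base URL, the model slugs, and the `models` fallback field. The `buildRequest` helper itself is hypothetical, and the snippet only constructs the payload without sending it.

```typescript
// Assumed values: OpenRouter's OpenAI-compatible base URL and its model slugs.
const OPENROUTER_BASE_URL = "https://openrouter.ai/api/v1";

interface ChatRequest {
  model: string;
  models?: string[]; // assumed OpenRouter extension: fallbacks tried in order
  messages: { role: "system" | "user" | "assistant"; content: string }[];
}

// Hypothetical helper: first entry is the primary model, the rest are fallbacks.
function buildRequest(prompt: string, modelsInOrder: string[]): ChatRequest {
  if (modelsInOrder.length === 0) throw new Error("At least one model is required");
  const [primary, ...fallbacks] = modelsInOrder;
  const chatRequest: ChatRequest = {
    model: primary,
    messages: [{ role: "user", content: prompt }],
  };
  if (fallbacks.length > 0) chatRequest.models = fallbacks;
  return chatRequest;
}

const payload = buildRequest("Your prompt here", [
  "deepseek/deepseek-chat",
  "deepseek/deepseek-r1",
]);
console.log(OPENROUTER_BASE_URL, JSON.stringify(payload, null, 2));
```

Because the endpoint is OpenAI-compatible, the payload could be sent with the same OpenAI SDK client shown earlier, with `baseURL` set to the OpenRouter URL.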

Ollama (Local)

To run the model locally:

ollama pull deepseek-v3
ollama run deepseek-v3

Note: the full 671B model in FP16 requires approximately 1.3TB of GPU memory for the weights alone, which is impractical for most users. Quantized versions (Q4) reduce this to roughly 400GB. In practice, running the full model locally requires a multi-GPU server or a machine with several hundred gigabytes of unified or system memory, not a typical workstation. For most users, the Deepseek API or OpenRouter is more practical.
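The memory figures follow from simple arithmetic: parameter count times bytes per parameter. The sketch below covers weights only and ignores KV cache, activations, and runtime overhead. The 4.5 effective bits per parameter for a Q4-style format is an assumption; actual quantization formats vary.

```typescript
// Weight-only memory estimate: parameters x bits per parameter.
// Ignores KV cache, activations, and runtime overhead, so real needs are higher.
const TOTAL_PARAMS = 671e9;

function weightMemoryGB(params: number, bitsPerParam: number): number {
  return (params * bitsPerParam) / 8 / 1e9;
}

const fp16GB = weightMemoryGB(TOTAL_PARAMS, 16);
// Assumption: a Q4-style format averages ~4.5 bits per parameter including scales.
const q4GB = weightMemoryGB(TOTAL_PARAMS, 4.5);

console.log(`FP16 weights: ${fp16GB.toFixed(0)} GB`);
console.log(`Q4 weights: ${q4GB.toFixed(0)} GB`);
```

This gives about 1,342GB for FP16 and about 377GB for Q4, consistent with the rounded figures above once runtime overhead is added.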

Distilled versions (Deepseek-R1-Distill-Llama-70B and Deepseek-R1-Distill-Qwen-32B) are smaller Llama and Qwen base models fine-tuned on reasoning traces generated by R1. They retain much of the reasoning capability in a locally runnable package.

The Privacy and Regulatory Controversy

Deepseek is a Chinese company, and its privacy policy states that data is stored on servers in China. For companies with sensitive data, regulatory requirements (HIPAA, GDPR, FedRAMP), or policies against Chinese data jurisdiction, using the Deepseek API directly is not appropriate.

Mitigations:

Run the open-weight model on your own infrastructure (no data leaves your environment)
Use OpenRouter, restricted to providers whose data terms you accept (OpenRouter forwards requests to third-party hosts, so check which provider actually serves the model)
Use cloud providers that host Deepseek under their own data terms (Azure and AWS Bedrock offer or plan to offer this)

The open weights mean that privacy concerns about the API do not apply to self-hosted deployments.

When Deepseek Is the Right Choice

Cost-sensitive production at scale: if you are processing high volumes of tokens and GPT-4o quality is sufficient for your use case, Deepseek V3 at 1/9th the price is worth serious evaluation.

Chinese language tasks: Deepseek significantly outperforms Western models on Chinese language understanding and generation. For Chinese-language applications, it is often the best choice.

Reasoning-heavy tasks: Deepseek R1's performance on MATH and logical reasoning is genuinely competitive with o1. For applications that benefit from chain-of-thought reasoning, R1 is worth benchmarking.

Self-hosted open source: for teams that want GPT-4o-class performance with fully self-hosted open weights, Deepseek is the strongest current option.

Keep Reading

Llama 3.3 Complete Guide — The other major open source option
LLM Comparison Guide 2026 — How Deepseek fits in the full model landscape
Cutting LLM API Costs: Complete Guide — Deepseek fits into a broader cost optimization strategy

