The Two New Models
OpenAI released two new embedding models in January 2024 as successors to text-embedding-ada-002:
- text-embedding-3-small — 1536 dimensions, $0.02/1M tokens
- text-embedding-3-large — 3072 dimensions, $0.13/1M tokens
Ada-002 costs $0.10/1M tokens, and its MTEB scores were falling behind newer alternatives. text-embedding-3-small is both cheaper and more capable; text-embedding-3-large costs 30% more than ada-002 but scores substantially higher.
MTEB Leaderboard Performance
The English Massive Text Embedding Benchmark (MTEB) averages scores over 56 datasets spanning retrieval, classification, clustering, semantic similarity, and other task types.
| Model | MTEB Average | Dimensions | Cost/1M tokens |
|---|---|---|---|
| text-embedding-3-large | 64.6 | 3072 | $0.13 |
| text-embedding-3-small | 62.3 | 1536 | $0.02 |
| text-embedding-ada-002 | 61.0 | 1536 | $0.10 |
| Cohere embed-v3 | 64.5 | 1024 | $0.10 |
| Voyage-3 | 67.1 | 1024 | $0.06 |
text-embedding-3-small beats ada-002 at one-fifth the price — for most RAG use cases, it is the obvious default.
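To make the price gap concrete, the sketch below estimates the one-time cost of embedding a corpus at the per-million-token prices listed above. The corpus size (50 million tokens) and the helper name embedding_cost are illustrative assumptions, not figures from any benchmark.

```python
# Prices in USD per 1M tokens, copied from the comparison table above.
PRICE_PER_MILLION = {
    "text-embedding-3-large": 0.13,
    "text-embedding-3-small": 0.02,
    "text-embedding-ada-002": 0.10,
}


def embedding_cost(model: str, total_tokens: int) -> float:
    """Return the estimated USD cost of embedding total_tokens with model."""
    return PRICE_PER_MILLION[model] * total_tokens / 1_000_000


if __name__ == "__main__":
    corpus_tokens = 50_000_000  # hypothetical corpus size
    for model in PRICE_PER_MILLION:
        print(f"{model}: ${embedding_cost(model, corpus_tokens):.2f}")
```

Query-time embedding adds to this figure, but for most workloads the initial corpus pass dominates.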
Matryoshka Representation Learning
The headline technical feature is Matryoshka representation learning: the models are trained so that the first N dimensions of an embedding carry most of the information in the full vector. This means you can request shorter vectors through the dimensions parameter without retraining anything, as long as documents and queries are embedded at the same dimension count.
```python
from openai import OpenAI
import numpy as np

client = OpenAI()


def get_embedding(text: str, dimensions: int = 1536) -> list[float]:
    response = client.embeddings.create(
        model="text-embedding-3-small",
        input=text,
        dimensions=dimensions,  # truncate here, not post-hoc
    )
    return response.data[0].embedding


def cosine_similarity(a: list[float], b: list[float]) -> float:
    a_arr = np.array(a)
    b_arr = np.array(b)
    return float(np.dot(a_arr, b_arr) / (np.linalg.norm(a_arr) * np.linalg.norm(b_arr)))


query_emb = get_embedding("How do transformers handle long sequences?", dimensions=256)
doc_emb = get_embedding("Attention mechanisms scale quadratically with sequence length.", dimensions=256)
print(f"Similarity: {cosine_similarity(query_emb, doc_emb):.4f}")
```
Using 256 dimensions instead of 1536 reduces vector storage by 6x while retaining roughly 92% of retrieval quality on most benchmarks.
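If full-length vectors are already stored, shortening them after the fact is also an option: keep the first N components and rescale to unit length so cosine and dot-product scores stay comparable. The sketch below shows that step along with the storage arithmetic behind the 6x figure. It assumes float32 storage (4 bytes per component) and a corpus of one million vectors; the helper names are hypothetical, and requesting the target size through the dimensions parameter remains the simpler route.

```python
import math


def truncate_and_normalize(vec: list[float], dims: int) -> list[float]:
    """Keep the first `dims` components and rescale to unit L2 norm."""
    if not 0 < dims <= len(vec):
        raise ValueError("dims must be between 1 and len(vec)")
    head = vec[:dims]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return head  # all-zero prefix: nothing to rescale
    return [x / norm for x in head]


def storage_bytes(n_vectors: int, dims: int, bytes_per_value: int = 4) -> int:
    """Raw vector storage, ignoring index overhead (float32 assumed)."""
    return n_vectors * dims * bytes_per_value


if __name__ == "__main__":
    toy = [3.0, 4.0, 12.0]  # toy vector, not a real embedding
    print(truncate_and_normalize(toy, 2))
    n = 1_000_000  # hypothetical corpus size
    print(storage_bytes(n, 1536) / storage_bytes(n, 256))
```

Under these assumptions, one million vectors take about 6.1 GB at 1536 dimensions and about 1.0 GB at 256, before any index overhead.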
Migration from Ada-002
The embeddings are not backward compatible — ada-002 vectors and text-embedding-3 vectors live in different spaces and cannot be compared. If you are migrating a production vector database:
- Keep ada-002 running for existing queries
- Re-embed your entire corpus with text-embedding-3-small
- Build a new vector store index from the re-embedded vectors
- Cut over traffic and deprecate ada-002
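For Pinecone, create a new index sized for the new model (1536 dimensions for small, 3072 for large). For pgvector, alter the column's dimension or create a new column. Because text-embedding-3-small has the same 1536 dimensions as ada-002, the database will not stop you from mixing the two, so keep old and new vectors in separate indexes or columns.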
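The re-embedding step can be sketched as a batched loop that writes new vectors under the same document IDs into a separate store. Everything here is an assumption for illustration: embed_batch stands in for whatever function calls the embeddings API, new_index is a plain dict standing in for the new vector index, fake_embed_batch is a dummy embedder for local testing, and the batch size of 100 is arbitrary.

```python
from typing import Callable, Iterator

Vector = list[float]
Doc = tuple[str, str]  # (doc_id, text)


def batched(items: list[Doc], size: int) -> Iterator[list[Doc]]:
    """Yield consecutive slices of at most `size` documents."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def reembed_corpus(
    docs: list[Doc],
    embed_batch: Callable[[list[str]], list[Vector]],
    new_index: dict[str, Vector],
    batch_size: int = 100,
) -> int:
    """Re-embed every document into new_index; return how many were written."""
    written = 0
    for batch in batched(docs, batch_size):
        ids = [doc_id for doc_id, _ in batch]
        vectors = embed_batch([text for _, text in batch])
        if len(vectors) != len(ids):
            raise RuntimeError("embedder returned a different number of vectors")
        for doc_id, vector in zip(ids, vectors):
            new_index[doc_id] = vector  # the old ada-002 index is left untouched
        written += len(batch)
    return written


def fake_embed_batch(texts: list[str]) -> list[Vector]:
    """Stand-in embedder for local testing; NOT a real model."""
    return [[float(len(t)), 1.0] for t in texts]


if __name__ == "__main__":
    corpus = [(f"doc-{i}", "x" * i) for i in range(250)]
    index: dict[str, Vector] = {}
    print(reembed_corpus(corpus, fake_embed_batch, index), len(index))
```

In a real migration, embed_batch would presumably wrap client.embeddings.create with a list of strings as input, and new_index would be replaced by upserts into the new Pinecone index or pgvector column. Writing to a separate store keeps ada-002 queries working until cutover.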
When Voyage or Cohere Beat OpenAI
- Voyage-3 consistently leads MTEB for English retrieval tasks — if maximum retrieval accuracy is the priority and you can afford slightly more complex integration, Voyage is worth testing.
- Cohere embed-multilingual-v3 dominates when you need 100+ languages — OpenAI's multilingual performance is good but not best-in-class.
- OpenAI wins on simplicity (one SDK, one billing account) and latency (well-optimized inference infrastructure).