Sentence Transformers: The Go-To Library for Text Embeddings in 2026

The Sentence Transformers library provides a unified interface for generating text embeddings, enabling semantic search, clustering, and fine-tuning on custom similarity tasks with minimal code.

Mahmudul Haque Qudrati

CEO & ML Engineer

March 21, 2026

7 min read

// tags

#sentence-transformers #embeddings #semantic-search #sbert #fine-tuning


Why Sentence Transformers

Raw BERT produces token embeddings, not sentence embeddings. Averaging BERT's token outputs without further training gives poor semantic representations: sentences that mean the same thing can land far apart in vector space. Sentence Transformers (SBERT) fixes this by fine-tuning BERT-style models in a siamese setup on sentence pairs (originally natural language inference data) and pooling the token outputs into one fixed-size vector per sentence, so that cosine similarity between two embeddings tracks their semantic similarity.
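To make the pooling step concrete, here is a minimal sketch of mean pooling, a common way to collapse token vectors into one sentence vector. The function name `mean_pool` and the toy 2-dimensional vectors are illustrative assumptions, not part of the library; the point is only that padding positions are masked out before averaging. What SBERT changes is the training, not this arithmetic.

```python
def mean_pool(token_vectors, attention_mask):
    """Average the token vectors whose mask entry is 1, ignoring padding."""
    dim = len(token_vectors[0])
    total = [0.0] * dim
    count = 0
    for vector, keep in zip(token_vectors, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                total[i] += value
    if count == 0:
        raise ValueError("attention_mask selects no tokens")
    return [value / count for value in total]


# Toy "token embeddings": three real tokens plus one padding slot (hypothetical values).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [9.0, 9.0]]
mask = [1, 1, 1, 0]  # the last position is padding and must not affect the result

sentence_vector = mean_pool(tokens, mask)
print(sentence_vector)
```

The padding vector `[9.0, 9.0]` is deliberately large so that any masking mistake would be obvious in the output.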

The Hugging Face Sentence Transformers collection hosts 200+ pre-trained models covering different size/quality tradeoffs.

Encoding and Cosine Similarity

from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

sentences = [
    "The quick brown fox jumps over the lazy dog",
    "A fast auburn fox leaps above a sleepy canine",
    "The stock market closed higher today",
]

embeddings = model.encode(sentences, convert_to_tensor=True)

# Pairwise cosine similarity
cos_sim = util.cos_sim(embeddings, embeddings)
print(f"Sentences 0 and 1 similarity: {cos_sim[0][1]:.4f}")  # ~0.72 (semantically similar)
print(f"Sentences 0 and 2 similarity: {cos_sim[0][2]:.4f}")  # ~0.05 (unrelated)
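For readers who want to see what a cosine similarity matrix contains, the following is a rough pure-Python sketch of the underlying formula (dot product divided by the product of the vector norms). The helper names and the toy vectors are assumptions for illustration; the library's own implementation operates on tensors and handles edge cases differently.

```python
import math


def cosine_similarity(a, b):
    """Dot product of a and b divided by the product of their Euclidean norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def pairwise_cosine(vectors):
    """Return a square matrix where entry [i][j] compares vectors i and j."""
    return [[cosine_similarity(a, b) for b in vectors] for a in vectors]


# Hypothetical 2-d vectors: the first two point the same way, the third is orthogonal.
toy = [[1.0, 0.0], [2.0, 0.0], [0.0, 3.0]]
matrix = pairwise_cosine(toy)
for row in matrix:
    print([round(value, 4) for value in row])
```

Note that vectors 0 and 1 differ in length but not direction, which is why cosine similarity treats them as identical.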

Semantic Search With util.semantic_search

from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-mpnet-base-v2")

corpus = [
    "Python is a high-level programming language",
    "Machine learning requires large datasets",
    "Neural networks are inspired by the human brain",
    "Flask is a lightweight web framework",
]

corpus_embeddings = model.encode(corpus, convert_to_tensor=True)
query = "web development frameworks"
query_embedding = model.encode(query, convert_to_tensor=True)

hits = util.semantic_search(query_embedding, corpus_embeddings, top_k=2)
for hit in hits[0]:
    print(f"Score: {hit['score']:.4f} | {corpus[hit['corpus_id']]}")
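Conceptually, this kind of search scores every corpus vector against the query and keeps the best `top_k`. The sketch below mimics the hit format used above (dicts with `corpus_id` and `score`) in plain Python; `semantic_search_sketch` and the 3-dimensional vectors are hypothetical stand-ins, and a real deployment would rely on the library or a vector index rather than this loop.

```python
import heapq
import math


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def semantic_search_sketch(query_vec, corpus_vecs, top_k=2):
    """Score every corpus vector against the query and return the top_k hits."""
    scored = [
        {"corpus_id": idx, "score": cosine(query_vec, vec)}
        for idx, vec in enumerate(corpus_vecs)
    ]
    # nlargest returns the hits sorted from highest to lowest score.
    return heapq.nlargest(top_k, scored, key=lambda hit: hit["score"])


# Hypothetical embeddings; real ones would come from model.encode(...).
corpus_vectors = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.9, 0.1],
    [0.7, 0.0, 0.7],
]
query_vector = [0.6, 0.0, 0.8]

hits = semantic_search_sketch(query_vector, corpus_vectors, top_k=2)
for hit in hits:
    print(f"Score: {hit['score']:.4f} | corpus_id={hit['corpus_id']}")
```

Brute-force scoring like this is linear in corpus size, which is usually fine for small corpora and a reason to consider an approximate nearest neighbor index for large ones.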

Fine-Tuning on Custom Pairs

Use MultipleNegativesRankingLoss when you have (anchor, positive) pairs without explicit negatives. For each anchor, the positives of the other pairs in the batch serve as negatives:

from sentence_transformers import SentenceTransformer, InputExample, losses
from torch.utils.data import DataLoader

model = SentenceTransformer("all-MiniLM-L6-v2")

train_examples = [
    InputExample(texts=["What is machine learning?", "ML is a type of AI that learns from data"]),
    InputExample(texts=["How do I fix a bug?", "Debugging requires isolating the failing component"]),
]

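# MultipleNegativesRankingLoss uses the other pairs in each batch as negatives,
# so larger batches (and far more than two training pairs) give a stronger signal.
train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=16)
train_loss = losses.MultipleNegativesRankingLoss(model)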

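# model.fit is the library's older training entry point; releases from v3 onward
# also provide SentenceTransformerTrainer.
model.fit(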
    train_objectives=[(train_dataloader, train_loss)],
    epochs=3,
    warmup_steps=100,
)
model.save("my-finetuned-model")
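The in-batch negatives idea can be sketched numerically. The code below is an illustrative reimplementation, not the library's code: for each anchor it treats the matching positive as the correct class and every other positive in the batch as a wrong class, then applies cross-entropy over scaled cosine similarities. The `scale=20.0` value and the toy 2-dimensional vectors are assumptions made for the example.

```python
import math


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def mnr_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy where positives[i] is the correct match for anchors[i]."""
    per_anchor = []
    for i, anchor in enumerate(anchors):
        logits = [scale * cosine(anchor, positive) for positive in positives]
        # Numerically stable log-sum-exp over all candidates in the batch.
        max_logit = max(logits)
        log_sum_exp = max_logit + math.log(sum(math.exp(l - max_logit) for l in logits))
        per_anchor.append(log_sum_exp - logits[i])
    return sum(per_anchor) / len(per_anchor)


# Hypothetical embeddings for a batch of two (anchor, positive) pairs.
anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[0.9, 0.1], [0.1, 0.9]]   # each positive sits close to its own anchor
swapped = [[0.1, 0.9], [0.9, 0.1]]   # each positive sits close to the wrong anchor

loss_aligned = mnr_loss(anchors, aligned)
loss_swapped = mnr_loss(anchors, swapped)
print(f"aligned: {loss_aligned:.6f}  swapped: {loss_swapped:.6f}")
```

The loss is near zero when each anchor is closest to its own positive and large when it is closest to someone else's, which is the signal training uses to pull matching pairs together.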

Best Models Comparison

| Model | Dimensions | Speed | Quality |
|---|---|---|---|
| all-MiniLM-L6-v2 | 384 | Very fast | Good |
| all-mpnet-base-v2 | 768 | Fast | Better |
| multi-qa-mpnet-base-dot-v1 | 768 | Fast | Best for QA |
| BGE-M3 | 1024 | Moderate | Best overall |

The GitHub repository includes pretrained model benchmarks on STS, QA, and retrieval tasks. For most production RAG use cases, all-mpnet-base-v2 is the baseline to beat before reaching for larger models.
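One practical consequence of the Dimensions column is storage cost. As a back-of-the-envelope sketch, assuming embeddings are stored as uncompressed float32 values with no index overhead, and using a hypothetical corpus of one million vectors, the raw size is simply vectors × dimensions × 4 bytes:

```python
# Dimensions taken from the comparison table above.
MODEL_DIMS = {
    "all-MiniLM-L6-v2": 384,
    "all-mpnet-base-v2": 768,
    "multi-qa-mpnet-base-dot-v1": 768,
    "BGE-M3": 1024,
}
BYTES_PER_FLOAT32 = 4


def index_size_bytes(num_vectors, dim, bytes_per_value=BYTES_PER_FLOAT32):
    """Raw storage for num_vectors embeddings of the given dimension."""
    return num_vectors * dim * bytes_per_value


NUM_VECTORS = 1_000_000  # hypothetical corpus size, chosen for illustration

for name, dim in MODEL_DIMS.items():
    size_gb = index_size_bytes(NUM_VECTORS, dim) / 1e9
    print(f"{name}: {size_gb:.2f} GB")
```

Doubling the dimension doubles both storage and the cost of each similarity computation, which is part of why a smaller model can be a sensible starting point.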
