sentence-transformers
About
sentence-transformers is a Python framework for generating state-of-the-art sentence, text, and image embeddings. It provides over 5,000 pre-trained models for tasks like semantic search, clustering, and retrieval, supporting multilingual and domain-specific use cases. It's ideal for developers needing a cost-effective, local alternative to API-based solutions for generating embeddings in RAG systems or similarity tasks.
Quick Install
Claude Code
Recommended
/plugin add https://github.com/zechenzhangAGI/AI-research-SKILLs
git clone https://github.com/zechenzhangAGI/AI-research-SKILLs.git ~/.claude/skills/sentence-transformers
Copy and paste this command in Claude Code to install this skill
Documentation
Sentence Transformers - State-of-the-Art Embeddings
Python framework for sentence and text embeddings using transformers.
When to use Sentence Transformers
Use when:
- Need high-quality embeddings for RAG
- Semantic similarity and search
- Text clustering and classification
- Multilingual embeddings (100+ languages)
- Running embeddings locally (no API)
- Cost-effective alternative to OpenAI embeddings
Metrics:
- 15,700+ GitHub stars
- 5000+ pre-trained models
- 100+ languages supported
- Based on PyTorch/Transformers
Use an alternative instead when:
- OpenAI Embeddings: you want an API-based, managed option with the highest quality
- Instructor: you need embeddings conditioned on task-specific instructions
- Cohere Embed: you prefer a fully managed embedding service
Quick start
Installation
pip install sentence-transformers
Basic usage
from sentence_transformers import SentenceTransformer
# Load model
model = SentenceTransformer('all-MiniLM-L6-v2')
# Generate embeddings
sentences = [
    "This is an example sentence",
    "Each sentence is converted to a vector"
]
embeddings = model.encode(sentences)
print(embeddings.shape) # (2, 384)
# Cosine similarity
from sentence_transformers.util import cos_sim
similarity = cos_sim(embeddings[0], embeddings[1])
print(f"Similarity: {similarity.item():.4f}")
Popular models
General purpose
# Fast, good quality (384 dim)
model = SentenceTransformer('all-MiniLM-L6-v2')
# Better quality (768 dim)
model = SentenceTransformer('all-mpnet-base-v2')
# Best quality (1024 dim, slower)
model = SentenceTransformer('all-roberta-large-v1')
Multilingual
# 50+ languages
model = SentenceTransformer('paraphrase-multilingual-MiniLM-L12-v2')
# 100+ languages
model = SentenceTransformer('paraphrase-multilingual-mpnet-base-v2')
Domain-specific
# Legal domain
model = SentenceTransformer('nlpaueb/legal-bert-base-uncased')
# Scientific papers
model = SentenceTransformer('allenai/specter')
# Code
model = SentenceTransformer('microsoft/codebert-base')
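The domain checkpoints above are plain Hugging Face models rather than native sentence-transformers models, so the library wraps them with a default pooling layer when loaded this way. If you want to control pooling explicitly, a minimal sketch (mean pooling and the max_seq_length value are assumptions, using the same codebert checkpoint as above):
# Explicitly wrap a plain Hugging Face checkpoint with mean pooling
# instead of relying on the automatic default
from sentence_transformers import SentenceTransformer, models
word_embedding = models.Transformer('microsoft/codebert-base', max_seq_length=256)
pooling = models.Pooling(word_embedding.get_word_embedding_dimension(), pooling_mode='mean')
model = SentenceTransformer(modules=[word_embedding, pooling])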
Semantic search
from sentence_transformers import SentenceTransformer, util
model = SentenceTransformer('all-MiniLM-L6-v2')
# Corpus
corpus = [
    "Python is a programming language",
    "Machine learning uses algorithms",
    "Neural networks are powerful"
]
# Encode corpus
corpus_embeddings = model.encode(corpus, convert_to_tensor=True)
# Query
query = "What is Python?"
query_embedding = model.encode(query, convert_to_tensor=True)
# Find most similar
hits = util.semantic_search(query_embedding, corpus_embeddings, top_k=3)
print(hits)
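semantic_search returns one list of hits per query; each hit is a dict with a corpus_id index and a score. To resolve hits back to the matching corpus sentences:
# Print the corpus entries behind each hit for the first (only) query
for hit in hits[0]:
    print(f"{corpus[hit['corpus_id']]} (score: {hit['score']:.4f})")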
Similarity computation
# Cosine similarity
similarity = util.cos_sim(embedding1, embedding2)
# Dot product
similarity = util.dot_score(embedding1, embedding2)
# Pairwise cosine similarity
similarities = util.cos_sim(embeddings, embeddings)
Batch encoding
# Efficient batch processing
sentences = ["sentence 1", "sentence 2", ...] * 1000
embeddings = model.encode(
    sentences,
    batch_size=32,
    show_progress_bar=True,
    convert_to_tensor=False  # or True for PyTorch tensors
)
Fine-tuning
from sentence_transformers import InputExample, losses
from torch.utils.data import DataLoader
# Training data
train_examples = [
    InputExample(texts=['sentence 1', 'sentence 2'], label=0.8),
    InputExample(texts=['sentence 3', 'sentence 4'], label=0.3),
]
train_dataloader = DataLoader(train_examples, batch_size=16)
# Loss function
train_loss = losses.CosineSimilarityLoss(model)
# Train
model.fit(
    train_objectives=[(train_dataloader, train_loss)],
    epochs=10,
    warmup_steps=100
)
# Save
model.save('my-finetuned-model')
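The fit() interface above still works; newer releases (v3+) also offer a Trainer-style API built on Hugging Face datasets. A rough sketch, assuming the same cosine-similarity setup and the (sentence1, sentence2, score) column layout commonly used with that loss:
# Sketch only: column names follow the pair/score layout expected by
# CosineSimilarityLoss in the v3+ trainer
from datasets import Dataset
from sentence_transformers import SentenceTransformerTrainer, losses
train_dataset = Dataset.from_dict({
    "sentence1": ["sentence 1", "sentence 3"],
    "sentence2": ["sentence 2", "sentence 4"],
    "score": [0.8, 0.3],
})
trainer = SentenceTransformerTrainer(
    model=model,
    train_dataset=train_dataset,
    loss=losses.CosineSimilarityLoss(model),
)
trainer.train()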
LangChain integration
from langchain_community.embeddings import HuggingFaceEmbeddings
embeddings = HuggingFaceEmbeddings(
    model_name="sentence-transformers/all-mpnet-base-v2"
)
# Use with vector stores
from langchain_chroma import Chroma
vectorstore = Chroma.from_documents(
    documents=docs,
    embedding=embeddings
)
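Once the store is built, retrieval is a single call (the query text here is just an example):
# Return the 3 most similar documents to the query
results = vectorstore.similarity_search("What is Python?", k=3)
for doc in results:
    print(doc.page_content)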
LlamaIndex integration
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
embed_model = HuggingFaceEmbedding(
    model_name="sentence-transformers/all-mpnet-base-v2"
)
from llama_index.core import Settings, VectorStoreIndex
Settings.embed_model = embed_model
# Use in index
index = VectorStoreIndex.from_documents(documents)
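With the embed model set globally, you can test retrieval without configuring an LLM. A minimal sketch (the query and top_k value are placeholders):
# Retrieval-only check: no LLM needed
retriever = index.as_retriever(similarity_top_k=3)
for node_with_score in retriever.retrieve("What is Python?"):
    print(node_with_score.score, node_with_score.node.get_content())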
Model selection guide
| Model | Dimensions | Speed | Quality | Use Case |
|---|---|---|---|---|
| all-MiniLM-L6-v2 | 384 | Fast | Good | General, prototyping |
| all-mpnet-base-v2 | 768 | Medium | Better | Production RAG |
| all-roberta-large-v1 | 1024 | Slow | Best | High accuracy needed |
| paraphrase-multilingual | 768 | Medium | Good | Multilingual |
Best practices
- Start with all-MiniLM-L6-v2 - Good baseline
- Normalize embeddings - Better for cosine similarity (see the sketch after this list)
- Use GPU if available - 10× faster encoding
- Batch encoding - More efficient
- Cache embeddings - Expensive to recompute
- Fine-tune for domain - Improves quality
- Test different models - Quality varies by task
- Monitor memory - Large models need more RAM
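A minimal sketch combining two of the practices above, GPU encoding when available and normalized embeddings (with unit-length vectors, the dot product equals cosine similarity):
import torch
from sentence_transformers import SentenceTransformer
# Use the GPU when one is available
device = "cuda" if torch.cuda.is_available() else "cpu"
model = SentenceTransformer('all-MiniLM-L6-v2', device=device)
# normalize_embeddings=True returns unit-length vectors
embeddings = model.encode(
    ["first example sentence", "second example sentence"],
    batch_size=32,
    normalize_embeddings=True
)
print(embeddings[0] @ embeddings[1])  # dot product == cosine similarity here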
Performance
| Model | Speed (sentences/sec) | Memory | Dimension |
|---|---|---|---|
| MiniLM | ~2000 | 120MB | 384 |
| MPNet | ~600 | 420MB | 768 |
| RoBERTa | ~300 | 1.3GB | 1024 |
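Throughput depends heavily on hardware, batch size, and sentence length, so treat the table above as rough guidance. A quick way to measure it on your own machine:
# Rough local throughput check (numbers will differ from the table)
import time
from sentence_transformers import SentenceTransformer
model = SentenceTransformer('all-MiniLM-L6-v2')
sentences = ["A short benchmark sentence."] * 2000
start = time.time()
model.encode(sentences, batch_size=64, show_progress_bar=False)
print(f"{len(sentences) / (time.time() - start):.0f} sentences/sec")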
Resources
- GitHub: https://github.com/UKPLab/sentence-transformers ⭐ 15,700+
- Models: https://huggingface.co/sentence-transformers
- Docs: https://www.sbert.net
- License: Apache 2.0
Related Skills
sglang
SGLang is a high-performance LLM serving framework that specializes in fast, structured generation for JSON, regex, and agentic workflows using its RadixAttention prefix caching. It delivers significantly faster inference, especially for tasks with repeated prefixes, making it ideal for complex, structured outputs and multi-turn conversations. Choose SGLang over alternatives like vLLM when you need constrained decoding or are building applications with extensive prefix sharing.
evaluating-llms-harness
This Claude Skill runs the lm-evaluation-harness to benchmark LLMs across 60+ standardized academic tasks like MMLU and GSM8K. It's designed for developers to compare model quality, track training progress, or report academic results. The tool supports various backends including HuggingFace and vLLM models.
llamaguard
LlamaGuard is Meta's 7-8B parameter model for moderating LLM inputs and outputs across six safety categories like violence and hate speech. It offers 94-95% accuracy and can be deployed using vLLM, Hugging Face, or Amazon SageMaker. Use this skill to easily integrate content filtering and safety guardrails into your AI applications.
langchain
LangChain is a framework for building LLM applications using agents, chains, and RAG pipelines. It supports multiple LLM providers, offers 500+ integrations, and includes features like tool calling and memory management. Use it for rapid prototyping and deploying production systems like chatbots, autonomous agents, and question-answering services.
