Embeddings Comparison Lab

Compare and benchmark multiple embedding models across various NLP tasks


Benchmark Tasks

Semantic Similarity

Test how well models capture semantic similarity between sentence pairs.

Word Analogies

Test analogical reasoning: "A is to B as C is to ?"


Topic Categorization

Test how well models group similar items using k-means clustering.
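
The categorization task above can be sketched with a minimal k-means implementation. The toy 2-D "embeddings" and the first-k-points initialisation are illustrative assumptions; real runs would cluster high-dimensional sentence vectors.

```python
import numpy as np

def kmeans(X, k, iters=20):
    """Plain k-means: assign each vector to its nearest centroid, then move
    each centroid to the mean of its assigned vectors. Initialisation here is
    simply the first k points, which is enough for a toy demo."""
    centroids = X[:k].copy()
    for _ in range(iters):
        # Euclidean distance from every point to every centroid.
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = X[labels == j].mean(axis=0)
    return labels

# Toy "embeddings": two well-separated groups in 2-D.
X = np.array([[0.0, 0.1], [0.9, 1.0], [0.1, 0.0], [1.0, 0.9]])
labels = kmeans(X, k=2)  # points 0 & 2 share a cluster, as do 1 & 3
```

In the lab, the clustering quality of each model can then be scored by how often items from the same topic land in the same cluster.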

How Embedding Models Work

Understanding the transformer architecture behind modern text embeddings.

Sentence Embedding Pipeline

Input Text: "The cat sat on the mat"
  ↓ Tokenizer → ["The", "cat", "sat", "on", "the", "mat"]
  ↓ Transformer Encoder → token embeddings → self-attention (×N layers) → feed-forward network
  ↓ Pooling → mean of all token vectors
Sentence Embedding: [0.12, -0.45, 0.78, ...] (384 or 768 dims)
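
The pipeline above can be mimicked end to end with a toy stand-in: the whitespace "tokenizer" and the hash-seeded pseudo-embeddings are hypothetical placeholders for the real tokenizer and transformer layers, but the shape of the computation (tokens → token vectors → mean pooling → one fixed-size sentence vector) is the same.

```python
import zlib
import numpy as np

DIM = 384  # MiniLM-sized output, matching the diagram above

def token_embedding(token, dim=DIM):
    # Deterministic pseudo-embedding per token (illustrative only); a real
    # model looks up learned vectors and runs them through transformer layers.
    rng = np.random.default_rng(zlib.crc32(token.encode("utf-8")))
    return rng.standard_normal(dim)

def embed_sentence(text):
    tokens = text.lower().split()                      # stand-in tokenizer
    token_vecs = np.stack([token_embedding(t) for t in tokens])
    return token_vecs.mean(axis=0)                     # mean pooling

vec = embed_sentence("The cat sat on the mat")
# vec.shape == (384,): one fixed-size vector regardless of sentence length
```

The key property to notice is that sentences of any length collapse to the same dimensionality, which is what makes vector comparison between arbitrary texts possible.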

Key Concepts

Tokenization

Text is split into subword units (WordPiece/BPE). "embedding" → ["em", "##bed", "##ding"]. This handles unknown words gracefully.
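
The split shown above can be reproduced with a greedy longest-match routine in the spirit of WordPiece. The tiny vocabulary here is a hypothetical stand-in; real models ship vocabularies of roughly 30k pieces.

```python
# Hypothetical mini-vocabulary; "##" marks a continuation piece.
VOCAB = {"em", "##bed", "##ding", "[UNK]"}

def wordpiece(word, vocab=VOCAB):
    """Greedily take the longest vocabulary piece at each position."""
    pieces, start = [], 0
    while start < len(word):
        end = len(word)
        # Shrink the candidate substring until it matches a vocab entry.
        while end > start:
            piece = word[start:end]
            if start > 0:
                piece = "##" + piece
            if piece in vocab:
                pieces.append(piece)
                break
            end -= 1
        if end == start:              # nothing matched -> unknown token
            return ["[UNK]"]
        start = end
    return pieces

wordpiece("embedding")  # -> ['em', '##bed', '##ding']
```

Because any unseen word either decomposes into known pieces or falls back to `[UNK]`, the model never has to reject input outright.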

Self-Attention

Each token attends to all other tokens, learning contextual relationships. "bank" gets different representations in "river bank" vs "bank account".
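
Single-head scaled dot-product attention, the mechanism described above, can be sketched in a few lines of numpy. The random projection matrices stand in for learned weights.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product attention over token vectors X of shape (n, d)."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])  # every token scores every token
    weights = softmax(scores, axis=-1)       # each row sums to 1
    return weights @ V                       # context-mixed representations

rng = np.random.default_rng(0)
n, d = 6, 8                                  # 6 tokens, toy dimension 8
X = rng.standard_normal((n, d))
Wq, Wk, Wv = (rng.standard_normal((d, d)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)          # shape (6, 8)
```

Because each output row is a weighted mix of all value vectors, the same surface token ("bank") ends up with different representations in different contexts.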

Mean Pooling

Token vectors are averaged to create a single sentence vector. This captures the overall semantic meaning of the text.
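
In practice the average usually respects an attention mask so that padding tokens are excluded, which a short sketch makes concrete:

```python
import numpy as np

def mean_pool(token_vecs, mask):
    """Average only the real tokens; padded positions (mask == 0) are ignored."""
    mask = np.asarray(mask, dtype=float)[:, None]   # shape (n_tokens, 1)
    return (token_vecs * mask).sum(axis=0) / mask.sum()

vecs = np.array([[1.0, 2.0],
                 [3.0, 4.0],
                 [0.0, 0.0]])                       # last row is padding
v = mean_pool(vecs, mask=[1, 1, 0])
# v == [2.0, 3.0]: the mean of the two real token vectors only
```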

Contrastive Learning

Models are trained to make similar sentences close in vector space and dissimilar sentences far apart.
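
The training objective can be sketched as an InfoNCE-style contrastive loss: each anchor sentence should be most similar to its own paired positive, with the other positives in the batch serving as in-batch negatives. The temperature value is an illustrative default, not taken from any specific model.

```python
import numpy as np

def info_nce(anchors, positives, temperature=0.05):
    """Contrastive loss over a batch: row i of `anchors` should match
    row i of `positives`; every other row acts as a negative."""
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = a @ p.T / temperature                 # scaled cosine similarities
    logits -= logits.max(axis=1, keepdims=True)    # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))            # correct match is column i
```

Minimising this loss pulls matched pairs together and pushes mismatched pairs apart, which is exactly the geometry the similarity tasks above rely on.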

Model Specifications

Model                Dimensions  Parameters  Layers  Best For
MiniLM-L6-v2         384         22M         6       Fast inference, real-time apps
MPNet-base-v2        768         110M        12      High quality, semantic search
Multilingual-MiniLM  384         118M        12      50+ languages, cross-lingual
BGE-small-en         384         33M         6       Retrieval, RAG systems

Similarity Computation

Cosine Similarity:

cos(A, B) = (A · B) / (||A|| × ||B||)

Measures the angle between two vectors; values range from -1 (opposite directions) to 1 (identical direction), with 0 meaning the vectors are orthogonal.
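
The formula translates directly into code:

```python
import numpy as np

def cosine(a, b):
    """cos(A, B) = (A · B) / (||A|| * ||B||)"""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

cosine([1, 0], [1, 0])   # ->  1.0 (same direction)
cosine([1, 0], [0, 1])   # ->  0.0 (orthogonal)
cosine([1, 0], [-1, 0])  # -> -1.0 (opposite)
```

Because it depends only on direction, cosine similarity ignores vector magnitude, which is why embeddings are often compared this way rather than with raw Euclidean distance.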

Word Analogy (Vector Arithmetic):

king - man + woman ≈ queen

Semantic relationships are encoded as vector differences, so adding and subtracting embedding vectors can solve analogies.
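
The king/queen example can be demonstrated with hand-built toy embeddings. The 3-D vectors below are hypothetical, constructed so that "royalty" and gender occupy separate dimensions; real models learn hundreds of entangled dimensions from data.

```python
import numpy as np

# Hypothetical embeddings: dim 0 ~ royalty, dim 1 ~ male, dim 2 ~ female.
emb = {
    "king":  np.array([1.0, 1.0, 0.0]),
    "queen": np.array([1.0, 0.0, 1.0]),
    "man":   np.array([0.0, 1.0, 0.0]),
    "woman": np.array([0.0, 0.0, 1.0]),
    "apple": np.array([0.1, 0.5, 0.4]),   # unrelated distractor
}

def analogy(a, b, c, emb=emb):
    """Answer 'a is to b as c is to ?' via emb[b] - emb[a] + emb[c],
    returning the nearest word that is not one of the inputs."""
    target = emb[b] - emb[a] + emb[c]
    def cos(u, v):
        return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))
    candidates = {w: v for w, v in emb.items() if w not in (a, b, c)}
    return max(candidates, key=lambda w: cos(candidates[w], target))

analogy("man", "king", "woman")  # -> 'queen'
```

Excluding the three input words from the candidates is standard practice, since the nearest neighbour of the target vector is frequently one of the inputs themselves.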