How to build sparse semantic search with Sentence Transformers

Sparse retrieval is useful when a search feature must keep exact terms such as product labels, error codes, and password-reset phrases visible in the ranking. Sentence Transformers supports SPLADE-style SparseEncoder models that produce high-dimensional sparse vectors for queries and documents, so a small application can test sparse semantic search before wiring it into a search backend.

The SparseEncoder API keeps the query and document sides explicit with encode_query() and encode_document(). Passing those embeddings to semantic_search() with model.similarity as the score function ranks the corpus and returns each hit's position in the corpus list, which maps back to the document ID stored beside each text record.
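To make the ranking step concrete, the following is a minimal sketch of dot-product scoring over sparse vectors, written in plain Python with no model involved. The token weights, the `sparse_dot` and `rank` helpers, and the two-document corpus are all invented for illustration. Real SparseEncoder embeddings are tensors over the model vocabulary, and the sketch assumes dot-product similarity, which may differ in detail from what a given model uses.

```python
# Hand-made sparse vectors: only non-zero token weights are stored.
# All weights below are invented for illustration, not model output.
query_vec = {"password": 2.0, "reset": 1.5, "expired": 1.0}

corpus_vecs = [
    {"password": 1.8, "reset": 1.2, "account": 0.9},   # index 0
    {"tls": 2.1, "certificate": 1.7, "expired": 0.4},  # index 1
]


def sparse_dot(a, b):
    """Dot product over the tokens the two sparse vectors share."""
    return sum(weight * b[token] for token, weight in a.items() if token in b)


def rank(query, corpus, top_k=3):
    """Return hits shaped like semantic_search() output, best first."""
    hits = [
        {"corpus_id": idx, "score": sparse_dot(query, doc)}
        for idx, doc in enumerate(corpus)
    ]
    hits.sort(key=lambda hit: hit["score"], reverse=True)
    return hits[:top_k]


hits = rank(query_vec, corpus_vecs)
print([hit["corpus_id"] for hit in hits])  # prints [0, 1]
```

Only tokens present in both vectors contribute to the score, which is why exact terms stay visible in a sparse ranking.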

A small in-memory corpus is enough to verify the query and document encoding paths before moving vectors into Qdrant, OpenSearch, Elasticsearch, or another sparse-vector backend. Treat the numeric scores as model-dependent; the stable check is that the expected document ID ranks first and the script exits after printing the pass marker.
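Before moving to a backend, it can help to see what a sparse vector looks like once the zeros are dropped. The sketch below converts one mostly-zero row into parallel index and value lists. The `to_sparse_payload` helper, the field names `indices` and `values`, and the sample row are assumptions for illustration; check the chosen backend's documentation for its actual payload format.

```python
def to_sparse_payload(row, threshold=0.0):
    """Keep only entries above the threshold as parallel index/value lists."""
    indices = [i for i, weight in enumerate(row) if weight > threshold]
    values = [row[i] for i in indices]
    return {"indices": indices, "values": values}


# A toy 8-dimensional row; a real sparse embedding row is far wider.
row = [0.0, 0.0, 1.25, 0.0, 0.0, 0.5, 0.0, 0.0]
payload = to_sparse_payload(row)
print(payload)  # prints {'indices': [2, 5], 'values': [1.25, 0.5]}
```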

Steps to build sparse semantic search with Sentence Transformers:

Install Sentence Transformers in the active Python environment.
```
$ python -m pip install --upgrade sentence-transformers
```
The first sparse model run may download files from Hugging Face before printing search output.
Related: How to install Sentence Transformers with pip

Create the sparse search script.

sparse_search.py
```
from sentence_transformers import SparseEncoder
from sentence_transformers.util import semantic_search


# Load a SPLADE-style sparse encoder (downloaded on first use).
model = SparseEncoder("naver/splade-cocondenser-ensembledistil")

# Each record keeps its application ID beside the text to be embedded.
documents = [
    {
        "id": "doc-001",
        "text": "Reset expired password links from the account security page.",
    },
    {
        "id": "doc-002",
        "text": "Rotate SSH deployment keys before a release window.",
    },
    {
        "id": "doc-003",
        "text": "Renew TLS certificates before restarting the web server.",
    },
    {
        "id": "doc-004",
        "text": "Export invoice PDFs from the billing dashboard.",
    },
    {
        "id": "doc-005",
        "text": "Troubleshoot SAML login errors from the identity provider logs.",
    },
]

query = "account password reset link expired"
# The corpus list order defines the corpus_id returned in each hit.
corpus = [item["text"] for item in documents]

# Encode documents and the query through their separate entry points.
corpus_embeddings = model.encode_document(
    corpus,
    convert_to_tensor=True,
    show_progress_bar=False,
)
query_embedding = model.encode_query(
    query,
    convert_to_tensor=True,
    show_progress_bar=False,
)

# Rank the corpus for the single query; [0] selects that query's hit list.
hits = semantic_search(
    query_embedding,
    corpus_embeddings,
    top_k=3,
    score_function=model.similarity,
)[0]

print(f"Query: {query}")
print(f"Sparse corpus embeddings: {tuple(corpus_embeddings.shape)}")

for rank, hit in enumerate(hits, start=1):
    # Map the positional corpus_id back to the source record.
    item = documents[hit["corpus_id"]]
    print(
        f"{rank}. {item['id']} "
        f"score={hit['score']:.4f} "
        f"text={item['text']}"
    )

# Fail with a non-zero exit if the expected document is not ranked first.
top_item = documents[hits[0]["corpus_id"]]
if top_item["id"] != "doc-001":
    raise SystemExit(f"unexpected top sparse result: {top_item['id']}")

print("Sparse semantic search check: pass")
```

Keep each application record ID beside its text: corpus_id in each hit is the document's position in the corpus list, so the same index into documents recovers the source record.
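The mapping can be kept in one small helper. The sketch below assumes hits shaped like the dictionaries that semantic_search() returns (`corpus_id` and `score` keys) and that `documents` is in the same order as the encoded corpus. The `resolve_hits` name, the shortened records, and the sample scores are hypothetical.

```python
def resolve_hits(hits, documents):
    """Attach the application record ID and text to each ranked hit."""
    resolved = []
    for rank, hit in enumerate(hits, start=1):
        record = documents[hit["corpus_id"]]  # positional lookup
        resolved.append(
            {
                "rank": rank,
                "id": record["id"],
                "score": hit["score"],
                "text": record["text"],
            }
        )
    return resolved


documents = [
    {"id": "doc-001", "text": "Reset expired password links."},
    {"id": "doc-002", "text": "Rotate SSH deployment keys."},
]
# Sample hits with made-up scores, in the shape semantic_search() returns.
sample_hits = [{"corpus_id": 1, "score": 4.0}, {"corpus_id": 0, "score": 1.5}]

for row in resolve_hits(sample_hits, documents):
    print(row["rank"], row["id"], row["score"])
```

Because the lookup is positional, the mapping only holds while the documents list keeps the order that was used when the corpus was encoded.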

Run the sparse search script and inspect the ranked IDs.

```
$ python sparse_search.py
Query: account password reset link expired
Sparse corpus embeddings: (5, 30522)
1. doc-001 score=23.3164 text=Reset expired password links from the account security page.
2. doc-003 score=6.9136 text=Renew TLS certificates before restarting the web server.
3. doc-005 score=2.6232 text=Troubleshoot SAML login errors from the identity provider logs.
Sparse semantic search check: pass
```

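doc-001 should rank first because it contains the matching password-reset terms and the sparse encoder expands the query into related vocabulary. The second number in the embedding shape, 30522 in this run, is the size of the model's token vocabulary. Different sparse models can change both the score scale and that vocabulary dimension.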