Sparse retrieval is useful when a search feature must keep exact terms such as product labels, error codes, and password-reset phrases visible in the ranking. Sentence Transformers supports SPLADE-style SparseEncoder models that produce high-dimensional sparse vectors for queries and documents, so a small application can test sparse semantic search before wiring it into a search backend.
The SparseEncoder API keeps the query and document sides explicit with encode_query() and encode_document(). Passing those embeddings to semantic_search() with model.similarity as the score function ranks the corpus and returns a corpus_id for each hit, a positional index that maps back to the original document ID stored beside each text record.
A small in-memory corpus is enough to prove the embedding boundary before moving vectors into Qdrant, OpenSearch, Elasticsearch, or another sparse-vector backend. Treat the numeric scores as model-dependent; the stable check is that the expected document ID ranks first and the script exits after printing the pass marker.
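The ID check described above can be sketched without loading a model. semantic_search() returns, for each query, a list of dictionaries with corpus_id and score keys. The helper names resolve_hits and top_id_matches below are hypothetical, and the sample hits are hand-written placeholders rather than real model output.

```python
def resolve_hits(hits, documents):
    """Map semantic_search-style hits back to application record IDs."""
    return [
        {"id": documents[hit["corpus_id"]]["id"], "score": hit["score"]}
        for hit in hits
    ]


def top_id_matches(hits, documents, expected_id):
    """Return True when the highest-ranked hit maps to expected_id."""
    resolved = resolve_hits(hits, documents)
    return bool(resolved) and resolved[0]["id"] == expected_id


# Hand-written stand-ins; the scores are placeholders, not model output.
documents = [
    {"id": "doc-001", "text": "Reset expired password links."},
    {"id": "doc-002", "text": "Rotate SSH deployment keys."},
]
sample_hits = [
    {"corpus_id": 0, "score": 2.0},
    {"corpus_id": 1, "score": 0.5},
]

print(resolve_hits(sample_hits, documents))
print(top_id_matches(sample_hits, documents, "doc-001"))
```

Comparing IDs rather than scores keeps the check usable if the model is swapped for one with a different score scale.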
Steps to build sparse semantic search with Sentence Transformers:
- Install Sentence Transformers in the active Python environment.
$ python -m pip install --upgrade sentence-transformers
The first sparse model run may download files from Hugging Face before printing search output.
Related: How to install Sentence Transformers with pip
- Create the sparse search script.
- sparse_search.py
from sentence_transformers import SparseEncoder
from sentence_transformers.util import semantic_search

# Load a SPLADE-style sparse encoder (downloaded from Hugging Face on first use).
model = SparseEncoder("naver/splade-cocondenser-ensembledistil")

# Keep each application record ID beside its text.
documents = [
    {
        "id": "doc-001",
        "text": "Reset expired password links from the account security page.",
    },
    {
        "id": "doc-002",
        "text": "Rotate SSH deployment keys before a release window.",
    },
    {
        "id": "doc-003",
        "text": "Renew TLS certificates before restarting the web server.",
    },
    {
        "id": "doc-004",
        "text": "Export invoice PDFs from the billing dashboard.",
    },
    {
        "id": "doc-005",
        "text": "Troubleshoot SAML login errors from the identity provider logs.",
    },
]

query = "account password reset link expired"
corpus = [item["text"] for item in documents]

# Encode documents and the query with their dedicated methods.
corpus_embeddings = model.encode_document(
    corpus,
    convert_to_tensor=True,
    show_progress_bar=False,
)
query_embedding = model.encode_query(
    query,
    convert_to_tensor=True,
    show_progress_bar=False,
)

# semantic_search returns one hit list per query; take the list for the single query.
hits = semantic_search(
    query_embedding,
    corpus_embeddings,
    top_k=3,
    score_function=model.similarity,
)[0]

print(f"Query: {query}")
print(f"Sparse corpus embeddings: {tuple(corpus_embeddings.shape)}")

for rank, hit in enumerate(hits, start=1):
    # corpus_id is the position of the document in the corpus list.
    item = documents[hit["corpus_id"]]
    print(
        f"{rank}. {item['id']} "
        f"score={hit['score']:.4f} "
        f"text={item['text']}"
    )

# Exit with an error unless the expected document ranks first.
top_item = documents[hits[0]["corpus_id"]]
if top_item["id"] != "doc-001":
    raise SystemExit(f"unexpected top sparse result: {top_item['id']}")

print("Sparse semantic search check: pass")
Keep each application record ID beside its text so the hit returned by corpus_id can be mapped back to the source document.
- Run the sparse search script and inspect the ranked IDs.
$ python sparse_search.py
Query: account password reset link expired
Sparse corpus embeddings: (5, 30522)
1. doc-001 score=23.3164 text=Reset expired password links from the account security page.
2. doc-003 score=6.9136 text=Renew TLS certificates before restarting the web server.
3. doc-005 score=2.6232 text=Troubleshoot SAML login errors from the identity provider logs.
Sparse semantic search check: pass
doc-001 should rank first because it contains the matching password-reset terms and the sparse encoder expands the query into related vocabulary. Different sparse models can change the score scale and vocabulary dimension.
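To make the ranking intuition concrete, here is a toy sketch of how a sparse score can be read as a dot product over the vocabulary terms that both the query and a document activate. It assumes dot-product similarity. The function sparse_dot is hypothetical and every weight is invented for illustration, so none of the numbers reflect real SPLADE output.

```python
def sparse_dot(query_weights, doc_weights):
    """Dot product over the terms that both sparse vectors activate."""
    shared = query_weights.keys() & doc_weights.keys()
    return sum(query_weights[term] * doc_weights[term] for term in shared)


# Invented weights for illustration only; "login" stands in for a term the
# encoder might add to the query through expansion.
query_weights = {"password": 2.0, "reset": 1.5, "expired": 1.0, "login": 0.5}
doc_weights = {
    "doc-001": {"password": 1.8, "reset": 1.2, "expired": 0.9, "link": 0.7},
    "doc-005": {"login": 1.4, "saml": 2.1, "error": 1.0},
}

scores = {
    doc_id: sparse_dot(query_weights, weights)
    for doc_id, weights in doc_weights.items()
}
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

In this toy setup doc-001 wins because it shares three heavily weighted terms with the query, while doc-005 scores only through the single expanded term.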
Mohd Shakir Zakaria is a cloud architect with deep roots in software development and open-source advocacy. Certified in AWS, Red Hat, VMware, ITIL, and Linux, he specializes in designing and managing robust cloud and on-premises infrastructures.