How to build sparse semantic search with Sentence Transformers

Sparse retrieval is useful when exact terms such as product labels, error codes, and password-reset phrases must keep their weight in the ranking. Sentence Transformers supports SPLADE-style SparseEncoder models that produce high-dimensional sparse vectors for queries and documents, so a small application can test sparse semantic search before wiring it into a search backend.

The SparseEncoder API keeps the query and document sides explicit with encode_query() and encode_document(). Passing those embeddings to semantic_search() with model.similarity as the score function ranks the corpus. Each hit carries a corpus_id index, so keeping the original document ID beside each text record lets every hit be mapped back to its source.
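To make the scoring step concrete, the sketch below computes a dot product between sparse vectors stored as token-to-weight dictionaries. It assumes dot-product scoring, which is the usual choice for SPLADE-style vectors. The tokens and weights are invented for illustration and are not output from any model, and the real API works on tensors over the full vocabulary dimension rather than dictionaries.

```python
def sparse_dot(query_vec, doc_vec):
    """Dot product over the tokens that both sparse vectors share."""
    # Iterate over the smaller dict so the cost tracks the sparser side.
    if len(query_vec) > len(doc_vec):
        query_vec, doc_vec = doc_vec, query_vec
    return sum(w * doc_vec[t] for t, w in query_vec.items() if t in doc_vec)


# Hypothetical weights, chosen only to illustrate the arithmetic.
query_vec = {"password": 2.0, "reset": 1.5, "expired": 1.0}
doc_vecs = {
    "doc-001": {"password": 1.8, "reset": 1.2, "link": 0.9, "expired": 0.7},
    "doc-003": {"certificate": 2.1, "renew": 1.4, "expired": 0.3},
}

scores = {doc_id: sparse_dot(query_vec, vec) for doc_id, vec in doc_vecs.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

Only overlapping tokens contribute to the score, which is why a document that shares several weighted terms with the query separates clearly from one that shares a single low-weight term.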

A small in-memory corpus is enough to verify the encoding and ranking steps before moving vectors into Qdrant, OpenSearch, Elasticsearch, or another sparse-vector backend. Treat the numeric scores as model-dependent; the stable check is that the expected document ID ranks first and the script prints the pass marker before exiting.
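Before moving vectors into a backend, it can help to see how little of each vector is actually populated. The sketch below converts one dense row into parallel index and value lists, a layout that many sparse-vector stores use in some form. The helper name to_index_value_pairs and the toy row are hypothetical, and the exact payload each backend expects is not covered here.

```python
def to_index_value_pairs(row, threshold=0.0):
    """Return (indices, values) for the entries of `row` above `threshold`."""
    indices, values = [], []
    for index, weight in enumerate(row):
        if weight > threshold:
            indices.append(index)
            values.append(float(weight))
    return indices, values


# Toy 8-dimensional row standing in for one vocabulary-sized embedding.
row = [0.0, 1.25, 0.0, 0.0, 0.5, 0.0, 0.0, 2.0]
indices, values = to_index_value_pairs(row)
print(indices, values)
```

With a real embedding, the row would first need to be converted to a plain sequence of floats; how to do that depends on the tensor format the encoder returns.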

Steps to build sparse semantic search with Sentence Transformers:

  1. Install Sentence Transformers in the active Python environment.
    $ python -m pip install --upgrade sentence-transformers

    The first sparse model run may download files from Hugging Face before printing search output.
    Related: How to install Sentence Transformers with pip

  2. Create the sparse search script.
    sparse_search.py
    from sentence_transformers import SparseEncoder
    from sentence_transformers.util import semantic_search
     
     
    model = SparseEncoder("naver/splade-cocondenser-ensembledistil")
     
    documents = [
        {
            "id": "doc-001",
            "text": "Reset expired password links from the account security page.",
        },
        {
            "id": "doc-002",
            "text": "Rotate SSH deployment keys before a release window.",
        },
        {
            "id": "doc-003",
            "text": "Renew TLS certificates before restarting the web server.",
        },
        {
            "id": "doc-004",
            "text": "Export invoice PDFs from the billing dashboard.",
        },
        {
            "id": "doc-005",
            "text": "Troubleshoot SAML login errors from the identity provider logs.",
        },
    ]
     
    query = "account password reset link expired"
    corpus = [item["text"] for item in documents]
     
    corpus_embeddings = model.encode_document(
        corpus,
        convert_to_tensor=True,
        show_progress_bar=False,
    )
    query_embedding = model.encode_query(
        query,
        convert_to_tensor=True,
        show_progress_bar=False,
    )
     
    # semantic_search() returns one hit list per query; [0] selects the single query.
    hits = semantic_search(
        query_embedding,
        corpus_embeddings,
        top_k=3,
        score_function=model.similarity,
    )[0]
     
    print(f"Query: {query}")
    print(f"Sparse corpus embeddings: {tuple(corpus_embeddings.shape)}")
     
    for rank, hit in enumerate(hits, start=1):
        # corpus_id is the position in the corpus list, which matches documents.
        item = documents[hit["corpus_id"]]
        print(
            f"{rank}. {item['id']} "
            f"score={hit['score']:.4f} "
            f"text={item['text']}"
        )
     
    # Exit with a non-zero status if the expected document is not ranked first.
    top_item = documents[hits[0]["corpus_id"]]
    if top_item["id"] != "doc-001":
        raise SystemExit(f"unexpected top sparse result: {top_item['id']}")
     
    print("Sparse semantic search check: pass")

    Keep each application record ID beside its text so the corpus_id in each hit can be mapped back to the source document.

  3. Run the sparse search script and inspect the ranked IDs.
    $ python sparse_search.py
    Query: account password reset link expired
    Sparse corpus embeddings: (5, 30522)
    1. doc-001 score=23.3164 text=Reset expired password links from the account security page.
    2. doc-003 score=6.9136 text=Renew TLS certificates before restarting the web server.
    3. doc-005 score=2.6232 text=Troubleshoot SAML login errors from the identity provider logs.
    Sparse semantic search check: pass

    doc-001 should rank first because it contains the matching password-reset terms and the sparse encoder expands the query into related vocabulary. Different sparse models can change the score scale and vocabulary dimension.
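
Because scores vary by model, a rank-based assertion is a more portable check than a score threshold. The sketch below uses a hypothetical helper, not part of Sentence Transformers, that takes hits in the shape semantic_search() returns (dictionaries with corpus_id and score) and verifies only the ordering. The sample hits reuse the scores from the run shown above.

```python
def top_document_id(hits, documents):
    """Map the best-scoring hit back to its application record ID."""
    best = max(hits, key=lambda hit: hit["score"])
    return documents[best["corpus_id"]]["id"]


# Minimal records; only the IDs matter for this check.
documents = [{"id": f"doc-00{n}"} for n in range(1, 6)]

# Hits shaped like semantic_search() output; scores copied from the sample run.
hits = [
    {"corpus_id": 0, "score": 23.3164},
    {"corpus_id": 2, "score": 6.9136},
    {"corpus_id": 4, "score": 2.6232},
]

assert top_document_id(hits, documents) == "doc-001"
print("rank check: pass")
```

The same helper can be reused across several query and expected-ID pairs when swapping in a different sparse model, since it never compares absolute score values.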