How to encode queries and documents with Sentence Transformers

Search queries and indexed documents often carry different amounts of information. Sentence Transformers keeps that split explicit with encode_query() for user input and encode_document() for corpus text, which helps retrieval code stay aligned with models that define separate query and document behavior.

The role-specific methods accept the same options as encode(), including batch_size, precision, device, and normalize_embeddings. When a model defines query/document prompts or a Router module, the matching method selects that query or document path. Models without separate role behavior return the same vectors from either method, so application code written this way is already set up for a retrieval-specific model.
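
Before relying on role-specific behavior, it can help to check whether a model actually defines query and document prompts. The following is a minimal sketch, assuming the model keeps its prompts in a dict attribute named prompts. The helper role_prompts() and the stand-in objects are hypothetical, used here so the example runs without downloading a model.

```python
from types import SimpleNamespace


def role_prompts(model):
    """Return the non-empty query/document prompts a model exposes, if any."""
    # Assumption: prompts live in a dict attribute called `prompts`.
    prompts = getattr(model, "prompts", None) or {}
    return {role: prompts[role] for role in ("query", "document") if prompts.get(role)}


# Stand-in objects (not real models) so the sketch runs offline.
plain_model = SimpleNamespace(prompts={"query": "", "document": ""})
prompted_model = SimpleNamespace(prompts={"query": "query: ", "document": "passage: "})

print(role_prompts(plain_model))     # {}
print(role_prompts(prompted_model))  # {'query': 'query: ', 'document': 'passage: '}
```

An empty result suggests that encode_query() and encode_document() would apply no role-specific prompt for that model. It says nothing about a Router module, which this sketch does not inspect.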

Use the same normalization, precision, device, and truncation settings on both sides of the retrieval index so that query and document vectors stay comparable. A small password-reset corpus is enough to check the wiring: the query vector and document matrix should share the same embedding dimension, and the password document should rank first for the password query.
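
To illustrate why both sides need the same normalization, here is a small NumPy sketch with made-up 4-dimensional vectors standing in for real embeddings. The helpers l2_normalize() and check_comparable() are hypothetical names for this example, not part of Sentence Transformers.

```python
import numpy as np


def l2_normalize(matrix):
    """Scale each row to unit length."""
    norms = np.linalg.norm(matrix, axis=1, keepdims=True)
    return matrix / norms


def check_comparable(query_vectors, document_vectors):
    """Raise if the two sides do not share an embedding dimension."""
    if query_vectors.shape[1] != document_vectors.shape[1]:
        raise ValueError(
            f"dimension mismatch: {query_vectors.shape[1]} vs {document_vectors.shape[1]}"
        )


# Made-up vectors; real embeddings would come from the model.
query = np.array([[1.0, 2.0, 0.0, 1.0]])
documents = np.array([
    [0.0, 1.0, 3.0, 0.0],
    [2.0, 4.0, 0.0, 2.0],   # same direction as the query
    [1.0, 0.0, 0.0, -1.0],
])

check_comparable(query, documents)

# Cosine similarity computed directly on the raw vectors.
cosine = (query @ documents.T) / (
    np.linalg.norm(query, axis=1, keepdims=True) * np.linalg.norm(documents, axis=1)
)

# Dot product after normalizing both sides gives the same scores.
dot_normalized = l2_normalize(query) @ l2_normalize(documents).T

# Normalizing only the query leaves document lengths in the scores.
dot_mixed = l2_normalize(query) @ documents.T

print(np.allclose(cosine, dot_normalized))  # True
print(np.allclose(cosine, dot_mixed))       # False
```

When only one side is normalized, the scores are scaled by each document vector's length, so they are no longer cosine similarities and longer vectors are favored.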

Steps to encode query and document embeddings with Sentence Transformers:

  1. Create a temporary Python script that encodes a query and a document corpus with separate role methods.
    query_document_encode.py
    from sentence_transformers import SentenceTransformer

    # Small sample model, used here only to check the wiring.
    model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

    query = "How do I reset a forgotten password?"
    documents = [
        "Generate quarterly revenue charts from a CSV export.",
        "Reset a lost account password from the profile security page.",
        "Tune the database connection pool for a busy API server.",
    ]

    # Encode the corpus through the document path.
    document_embeddings = model.encode_document(
        documents,
        normalize_embeddings=True,
        show_progress_bar=False,
    )
    # Encode the query through the query path; a one-item list yields a (1, dim) matrix.
    query_embedding = model.encode_query(
        [query],
        normalize_embeddings=True,
        show_progress_bar=False,
    )

    # Score the query against every document and pick the highest-scoring one.
    scores = model.similarity(query_embedding, document_embeddings)[0]
    best_index = int(scores.argmax())

    print(f"query shape: {query_embedding.shape}")
    print(f"document shape: {document_embeddings.shape}")
    print(f"top match: doc-{best_index + 1}")
    print(f"score: {scores[best_index]:.3f}")
    print(f"text: {documents[best_index]}")

    normalize_embeddings=True keeps both sides on the same unit-length scale before similarity scoring. Replace the sample model with a retrieval model that matches your corpus before creating a production index.
    Related: How to normalize embeddings with Sentence Transformers
    Related: How to choose a Sentence Transformers model for semantic search

  2. Run the script.
    $ python query_document_encode.py
    query shape: (1, 384)
    document shape: (3, 384)
    top match: doc-2
    score: 0.731
    text: Reset a lost account password from the profile security page.

    The matching 384-column shapes confirm the query and document vectors can be compared directly. The top match should be the password-reset document for this query; the exact score may differ slightly across library versions and hardware.

  3. Remove the temporary script after copying the code into the retrieval project.
    $ rm query_document_encode.py
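
Once the code lives in the retrieval project, the single argmax from the script is often extended to return several candidates. The following is a minimal NumPy sketch of that idea, assuming both sides are already unit-length (as with normalize_embeddings=True) and that the query is passed as a 1-D vector, for example query_embedding[0]. The function name top_k and the sample vectors are made up for illustration.

```python
import numpy as np


def top_k(query_embedding, document_embeddings, k=2):
    """Return (index, score) pairs for the k highest-scoring documents."""
    # With unit-length inputs, the dot product equals the cosine similarity.
    scores = document_embeddings @ query_embedding
    # Sort descending by score and keep at most k rows.
    order = np.argsort(-scores)[:k]
    return [(int(i), float(scores[i])) for i in order]


# Made-up unit-length vectors standing in for real embeddings.
documents = np.array([
    [1.0, 0.0, 0.0],
    [0.6, 0.8, 0.0],
    [0.0, 0.0, 1.0],
])
query = np.array([0.8, 0.6, 0.0])

for index, score in top_k(query, documents, k=2):
    print(f"doc-{index + 1}: {score:.3f}")  # doc-2 first, then doc-1
```

A full sort is fine for a small corpus like this one; larger collections usually call for a dedicated vector index instead.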