How to build a Qdrant index with Sentence Transformers

A vector database moves semantic matching out of an in-memory prototype and into a collection that stores vectors with IDs and payloads and returns nearest-neighbor results. Pairing Sentence Transformers with Qdrant lets a Python application encode support documents, upload their vectors, and retrieve the closest record for a natural-language query.

Sentence Transformers handles the dense embeddings with encode_document() and encode_query(), which keep retrieval code ready for models that define separate document and query prompts. Qdrant stores each vector with payload fields so a hit can be mapped back to the source document instead of only returning an array position.
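
To make the payload idea concrete, the sketch below is a toy, pure-Python stand-in for a collection. It uses hand-written 3-dimensional vectors instead of model embeddings and a brute-force cosine scan instead of Qdrant's index, so it only illustrates the shape of the data. The names toy_points, cosine, and search are made up for this example and are not part of either library.

```python
import math

# Hypothetical toy data: each point keeps an id, a vector, and a payload,
# mirroring the shape of a point stored in a collection.
toy_points = [
    {"id": 1, "vector": [0.9, 0.1, 0.0], "payload": {"doc_id": "doc-001"}},
    {"id": 2, "vector": [0.0, 1.0, 0.0], "payload": {"doc_id": "doc-002"}},
    {"id": 3, "vector": [0.1, 0.0, 0.9], "payload": {"doc_id": "doc-003"}},
]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def search(points, query_vector, limit=2):
    """Brute-force scan: score every point and return (score, payload) pairs."""
    scored = [
        (cosine(point["vector"], query_vector), point["payload"])
        for point in points
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:limit]


results = search(toy_points, [1.0, 0.0, 0.0])
for score, payload in results:
    # The payload identifies the source document directly.
    print(f"{payload['doc_id']} score={score:.4f}")
```

Because each result carries its payload, the caller reads doc_id from the hit itself and never has to keep a separate list in the same order as the vectors.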

A local in-memory Qdrant client keeps the smoke test reproducible without a running database container. Persistent Qdrant and Qdrant Cloud deployments use the same collection, vector size, upload, and query calls after the client connection is pointed at the service.
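
One way to keep the rest of the script unchanged across those deployments is to choose the client arguments in a single place. The helper below is a minimal sketch under stated assumptions: the function name and the setting names QDRANT_URL, QDRANT_API_KEY, and QDRANT_PATH are hypothetical choices for this example, not names that qdrant-client reads on its own. It only builds a dictionary, so it runs without a Qdrant installation.

```python
def qdrant_client_kwargs(env):
    """Pick QdrantClient keyword arguments from a mapping of settings.

    QDRANT_URL, QDRANT_API_KEY, and QDRANT_PATH are hypothetical names
    chosen for this sketch.
    """
    url = env.get("QDRANT_URL")
    if url:
        # Remote server or Qdrant Cloud endpoint.
        kwargs = {"url": url}
        api_key = env.get("QDRANT_API_KEY")
        if api_key:
            kwargs["api_key"] = api_key
        return kwargs
    path = env.get("QDRANT_PATH")
    if path:
        # Local on-disk mode: data is kept in a directory, no server needed.
        return {"path": path}
    # Default: throwaway in-memory storage for smoke tests.
    return {"location": ":memory:"}


print(qdrant_client_kwargs({}))  # prints {'location': ':memory:'}
print(qdrant_client_kwargs({"QDRANT_PATH": "./qdrant_data"}))
```

The result can be unpacked into the constructor, for example QdrantClient(**qdrant_client_kwargs(os.environ)), since location, path, url, and api_key are all keyword parameters of QdrantClient.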

Steps to build a Qdrant index with Sentence Transformers:

  1. Install the Python packages in the active environment.
    $ python -m pip install --upgrade sentence-transformers qdrant-client

    The first model run may download model files from Hugging Face.
    Related: How to install Sentence Transformers with pip

  2. Create the Qdrant indexing script.
    build_qdrant_index.py
    import numpy as np
    from qdrant_client import QdrantClient, models
    from sentence_transformers import SentenceTransformer
     
     
    collection_name = "support_docs"
    query = "password reset instructions"
     
    corpus = [
        {
            "doc_id": "doc-001",
            "title": "Reset a forgotten password",
            "text": "Reset a forgotten password from account settings and confirm the email link.",
        },
        {
            "doc_id": "doc-002",
            "title": "Create an invoice receipt",
            "text": "Create a billing invoice and download a PDF receipt.",
        },
        {
            "doc_id": "doc-003",
            "title": "Rotate API tokens",
            "text": "Rotate API tokens before sharing a new integration with a teammate.",
        },
        {
            "doc_id": "doc-004",
            "title": "Store semantic vectors",
            "text": "Qdrant stores Sentence Transformers embeddings for semantic search.",
        },
    ]
     
    model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
    documents = [item["text"] for item in corpus]
     
    document_embeddings = model.encode_document(
        documents,
        normalize_embeddings=True,
        convert_to_numpy=True,
        show_progress_bar=False,
    )
    document_embeddings = np.asarray(document_embeddings, dtype="float32")
    dimension = document_embeddings.shape[1]
     
    # In-memory mode: the collection is discarded when the process exits.
    client = QdrantClient(":memory:")
    client.create_collection(
        collection_name=collection_name,
        vectors_config=models.VectorParams(
            size=dimension,
            distance=models.Distance.COSINE,
        ),
    )
     
    client.upload_points(
        collection_name=collection_name,
        points=[
            models.PointStruct(
                id=index,
                vector=vector.tolist(),
                payload=item,
            )
            for index, (item, vector) in enumerate(zip(corpus, document_embeddings), start=1)
        ],
    )
     
    query_embedding = model.encode_query(
        query,
        normalize_embeddings=True,
        convert_to_numpy=True,
        show_progress_bar=False,
    )
     
    hits = client.query_points(
        collection_name=collection_name,
        query=query_embedding.tolist(),
        limit=2,
        with_payload=True,
    ).points
     
    point_count = client.count(collection_name=collection_name, exact=True).count
     
    print(f"collection: {collection_name}")
    print(f"vector size: {dimension}")
    print(f"points: {point_count}")
    print(f"query: {query}")
    print("top matches:")
    for rank, hit in enumerate(hits, start=1):
        payload = hit.payload
        print(
            f"{rank}. {payload['doc_id']} score={hit.score:.4f} "
            f"title={payload['title']}"
        )
     
    # Smoke-test check: the password reset document must rank first.
    if hits[0].payload["doc_id"] != "doc-001":
        raise SystemExit(f"unexpected top match: {hits[0].payload['doc_id']}")
     
    print("verification: PASS query returned the password reset document")

    Replace the in-memory client with QdrantClient(url="https://qdrant.example.com", api_key="qdrant-api-key") when the collection must persist outside the Python process. Keep the collection vector size tied to the same embedding model used for query vectors.

  3. Run the script to create the collection, upload points, and query the index.
    $ python build_qdrant_index.py
    collection: support_docs
    vector size: 384
    points: 4
    query: password reset instructions
    top matches:
    1. doc-001 score=0.5070 title=Reset a forgotten password
    2. doc-003 score=0.1281 title=Rotate API tokens
    verification: PASS query returned the password reset document

    doc-001 ranks first because the embedding of its text is closest to the query embedding under cosine distance; Qdrant compares the vectors, not the payload text. The point count should match the number of source records (4 here) before the collection is used by an application.
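
The two checks mentioned in the notes above (point count against source records, and query vector length against the collection's vector size) can be gathered into a small pre-flight helper. This is a sketch only: check_index is a hypothetical name, and in the script its arguments would come from len(corpus), point_count, dimension, and query_embedding.

```python
def check_index(expected_points, actual_points, expected_dim, query_vector):
    """Return a list of problems; an empty list means both checks passed."""
    problems = []
    if actual_points != expected_points:
        problems.append(
            f"point count {actual_points} != source records {expected_points}"
        )
    if len(query_vector) != expected_dim:
        problems.append(
            f"query vector has {len(query_vector)} dimensions, "
            f"collection expects {expected_dim}"
        )
    return problems


# Values taken from the run above: 4 records, 384-dimensional vectors.
print(check_index(4, 4, 384, [0.0] * 384))  # prints []

# A missing point and a query vector from a different-sized model.
for problem in check_index(4, 3, 384, [0.0] * 768):
    print(problem)
```

Returning a list instead of raising on the first failure lets a deployment script report every mismatch in one pass.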