How to build an Elasticsearch vector index with Sentence Transformers

Semantic search needs a place to keep document vectors beside the records an application already serves. Pairing Sentence Transformers with Elasticsearch lets a Python application encode text, index each dense vector beside its source fields, and run kNN searches through the Elasticsearch API.

The Python script uses sentence-transformers/all-MiniLM-L6-v2 to create normalized document and query embeddings. The Elasticsearch index maps the embedding field as dense_vector with cosine similarity, so kNN search ranks documents by how closely their embeddings align with the query embedding rather than by exact keyword overlap.
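To make the pairing of normalized embeddings and cosine similarity concrete, the following is a minimal sketch using only NumPy. The toy 3-dimensional vectors and the helper names l2_normalize and cosine are illustrative assumptions, not part of the script; real all-MiniLM-L6-v2 embeddings have 384 dimensions.

```python
import numpy as np


def l2_normalize(vectors: np.ndarray) -> np.ndarray:
    """Scale each row to unit length, the effect normalize_embeddings=True is meant to have."""
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    return vectors / norms


def cosine(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity of two 1-D vectors of any length."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


# Toy vectors standing in for real embeddings (assumption for illustration).
raw = np.array([[3.0, 4.0, 0.0], [1.0, 2.0, 2.0]], dtype="float32")
unit = l2_normalize(raw)

# Cosine similarity ignores vector length, so normalizing does not change it.
cosine_raw = cosine(raw[0], raw[1])
# For unit-length vectors, cosine similarity reduces to a plain dot product.
dot_unit = float(np.dot(unit[0], unit[1]))

print(f"cosine(raw)={cosine_raw:.4f} dot(unit)={dot_unit:.4f}")
```

Both printed values should agree, which is why normalizing at encode time does not change the ranking under a cosine mapping.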

A reachable Elasticsearch endpoint and a Python environment that can install packages are enough for the smoke test. The configured demo index is reset on each run, which keeps repeated tests from mixing old vectors with the current corpus.

Steps to build an Elasticsearch vector index with Sentence Transformers:

Install the Python packages in the active environment.
```
$ python -m pip install --upgrade sentence-transformers elasticsearch
```
The first model run may download model files from Hugging Face.
Set the Elasticsearch endpoint for the Python client.
```
$ export ELASTICSEARCH_URL=http://localhost:9200
```
For secured clusters, set ELASTICSEARCH_API_KEY and ELASTICSEARCH_CA_CERTS before running the script. Keep credentials in environment variables or a secret manager, not in the source file.
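One possible way to fail fast on a misconfigured secured cluster is to validate the environment before constructing the client. The sketch below is an assumption-laden illustration: the helper name client_options and the rule that an API key must only travel over https are hypothetical choices, not something the script enforces. The hosts, api_key, ca_certs, and request_timeout keys match keyword arguments accepted by the Elasticsearch Python client.

```python
from typing import Mapping
from urllib.parse import urlparse


def client_options(env: Mapping[str, str]) -> dict:
    """Build keyword arguments for Elasticsearch(...) from an environment mapping."""
    url = env.get("ELASTICSEARCH_URL", "http://localhost:9200")
    api_key = env.get("ELASTICSEARCH_API_KEY")
    ca_certs = env.get("ELASTICSEARCH_CA_CERTS")

    # Hypothetical policy: never send an API key over plain http.
    if api_key and urlparse(url).scheme != "https":
        raise ValueError("ELASTICSEARCH_API_KEY is set but ELASTICSEARCH_URL is not https")

    options = {"hosts": url, "request_timeout": 60}
    if api_key:
        options["api_key"] = api_key
    if ca_certs:
        options["ca_certs"] = ca_certs
    return options


# Local, unsecured endpoint: only the URL and timeout are passed through.
print(client_options({"ELASTICSEARCH_URL": "http://localhost:9200"}))
```

In a real run the mapping would be os.environ, and the result could be passed as Elasticsearch(**client_options(os.environ)).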

Create the Elasticsearch indexing script.

build_elasticsearch_index.py

```
import os

import numpy as np
from elasticsearch import Elasticsearch, helpers
from sentence_transformers import SentenceTransformer


INDEX_NAME = os.environ.get("ELASTICSEARCH_INDEX", "support-docs-demo")
QUERY = "password reset instructions"

CORPUS = [
    {
        "doc_id": "doc-001",
        "title": "Reset a forgotten password",
        "text": "Reset a forgotten password from account settings and confirm the email link.",
    },
    {
        "doc_id": "doc-002",
        "title": "Create an invoice receipt",
        "text": "Create a billing invoice and download a PDF receipt.",
    },
    {
        "doc_id": "doc-003",
        "title": "Rotate API tokens",
        "text": "Rotate API tokens before sharing a new integration with a teammate.",
    },
    {
        "doc_id": "doc-004",
        "title": "Store semantic vectors",
        "text": "Elasticsearch stores Sentence Transformers embeddings for vector search.",
    },
]


def elasticsearch_client() -> Elasticsearch:
    # Optional credentials come from the environment, never from the source file.
    api_key = os.environ.get("ELASTICSEARCH_API_KEY")
    ca_certs = os.environ.get("ELASTICSEARCH_CA_CERTS")
    options = {}
    if api_key:
        options["api_key"] = api_key
    if ca_certs:
        options["ca_certs"] = ca_certs
    return Elasticsearch(
        os.environ.get("ELASTICSEARCH_URL", "http://localhost:9200"),
        request_timeout=60,
        **options,
    )


client = elasticsearch_client()
model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

# Encode the corpus once; the embedding width sets the dense_vector dims.
texts = [item["text"] for item in CORPUS]
document_embeddings = model.encode_document(
    texts,
    normalize_embeddings=True,
    convert_to_numpy=True,
    show_progress_bar=False,
)
document_embeddings = np.asarray(document_embeddings, dtype="float32")
dimension = document_embeddings.shape[1]

# Reset the demo index so old vectors never mix with the current corpus.
if client.indices.exists(index=INDEX_NAME):
    client.indices.delete(index=INDEX_NAME)

client.indices.create(
    index=INDEX_NAME,
    mappings={
        "properties": {
            "doc_id": {"type": "keyword"},
            "title": {"type": "text"},
            "text": {"type": "text"},
            "embedding": {
                "type": "dense_vector",
                "dims": dimension,
                "index": True,
                "similarity": "cosine",
            },
        }
    },
)

# One bulk action per document, keyed by doc_id so reruns overwrite in place.
actions = []
for item, embedding in zip(CORPUS, document_embeddings):
    actions.append(
        {
            "_op_type": "index",
            "_index": INDEX_NAME,
            "_id": item["doc_id"],
            "_source": {**item, "embedding": embedding.tolist()},
        }
    )

indexed_count, errors = helpers.bulk(client, actions)
if errors:
    raise SystemExit(errors)

# Make the new documents visible to search immediately.
client.indices.refresh(index=INDEX_NAME)

query_embedding = model.encode_query(
    QUERY,
    normalize_embeddings=True,
    convert_to_numpy=True,
    show_progress_bar=False,
).astype("float32")

response = client.search(
    index=INDEX_NAME,
    knn={
        "field": "embedding",
        "query_vector": query_embedding.tolist(),
        "k": 2,
        "num_candidates": 4,
    },
    source=["doc_id", "title", "text"],
)
hits = response["hits"]["hits"]
stored_count = client.count(index=INDEX_NAME)["count"]

print(f"index: {INDEX_NAME}")
print(f"embedding dimension: {dimension}")
print(f"indexed documents: {indexed_count}")
print(f"stored documents: {stored_count}")
print(f"query: {QUERY}")
print("top matches:")
for rank, hit in enumerate(hits, start=1):
    source = hit["_source"]
    print(
        f"{rank}. {source['doc_id']} score={hit['_score']:.4f} "
        f"title={source['title']}"
    )

# Smoke test: the password reset document must rank first.
if not hits:
    raise SystemExit("kNN search returned no hits")
if hits[0]["_source"]["doc_id"] != "doc-001":
    raise SystemExit(f"unexpected top match: {hits[0]['_source']['doc_id']}")

print("verification: PASS query returned the password reset document")
```

The sample script deletes and recreates only the index named by ELASTICSEARCH_INDEX, defaulting to support-docs-demo. Change that value before using a shared or production cluster.
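Because the script deletes whatever index ELASTICSEARCH_INDEX names, a small guard before the delete call can reduce the risk of pointing it at real data. The following is a sketch only: the helper assert_safe_to_reset and the "-demo" suffix convention are hypothetical and would need adapting to a team's own naming rules.

```python
def assert_safe_to_reset(index_name: str, allowed_suffix: str = "-demo") -> None:
    """Exit unless the index name looks like a disposable demo index."""
    # Hypothetical convention: only names ending in "-demo" may be dropped.
    if not index_name.endswith(allowed_suffix):
        raise SystemExit(
            f"refusing to delete {index_name!r}: name does not end with {allowed_suffix!r}"
        )


# The script's default index name passes the guard.
assert_safe_to_reset("support-docs-demo")
print("ok to reset: support-docs-demo")
```

Under this assumption, the guard would be called with INDEX_NAME just before the client.indices.exists check.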

Run the script to create the vector index, bulk-index documents, and search it.

```
$ python build_elasticsearch_index.py
index: support-docs-demo
embedding dimension: 384
indexed documents: 4
stored documents: 4
query: password reset instructions
top matches:
1. doc-001 score=0.7535 title=Reset a forgotten password
2. doc-003 score=0.5641 title=Rotate API tokens
verification: PASS query returned the password reset document
```

The document count should match the source corpus, and the top hit should be the record that answers the query. A different embedding model changes the vector dimension and can change similarity scores.
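When comparing scores across models, it can help to convert the reported score back to a raw cosine value. Elasticsearch documents the kNN score for a cosine dense_vector field as (1 + cosine) / 2, so the inverse is a one-line calculation. The helper name cosine_from_score is an assumption for illustration, and the two scores are the ones shown in the sample output.

```python
def cosine_from_score(score: float) -> float:
    """Invert the cosine kNN scoring formula, _score = (1 + cosine) / 2."""
    return 2.0 * score - 1.0


# Scores copied from the sample run of the script.
for doc_id, score in [("doc-001", 0.7535), ("doc-003", 0.5641)]:
    print(f"{doc_id}: score={score:.4f} cosine={cosine_from_score(score):.4f}")
```

For the sample run this gives cosine values of roughly 0.507 and 0.128, so the gap between the top hit and the runner-up is wider than the raw scores suggest.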

Author: Mohd Shakir Zakaria