Haystack query pipelines need a text embedder when a retriever expects vectors instead of raw strings. A Sentence Transformers text embedder lets a Haystack project create those query vectors locally from a model that is compatible with the Sentence Transformers library, avoiding a hosted embedding API for this part of retrieval.
The current Haystack integration is installed with the sentence-transformers-haystack package. It provides SentenceTransformersTextEmbedder for single query strings and companion document embedders for indexing pipelines.
Configure the model, embedding normalization, and progress output before wiring the component into a retriever. Calling warm_up() loads the model before the first real query, and the smoke test checks both vector length and normalization so a missing integration package or wrong model setting is caught early.
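The two smoke-test checks can be sketched independently of Haystack. The helper below is a hypothetical illustration rather than part of the integration: the names check_embedding, expected_dim, and tol are assumptions introduced here, and the function takes a plain list of floats such as the vector an embedder returns.

```python
import math


def check_embedding(embedding, expected_dim, tol=1e-3):
    """Return a list of problems found in a query vector (empty if none).

    Hypothetical helper: the name, parameters, and tolerance are assumptions
    for illustration, not part of Haystack or Sentence Transformers.
    """
    problems = []
    if len(embedding) != expected_dim:
        problems.append(f"expected {expected_dim} dimensions, got {len(embedding)}")
    # L2 norm: square root of the sum of squared components.
    l2_norm = math.sqrt(sum(v * v for v in embedding))
    if not math.isclose(l2_norm, 1.0, abs_tol=tol):
        problems.append(f"L2 norm is {l2_norm:.6f}, expected about 1.0")
    return problems


# Toy vectors stand in for real model output.
print(check_embedding([0.6, 0.8], expected_dim=2))
print(check_embedding([3.0, 4.0], expected_dim=3))
```

Returning a list of problems instead of raising on the first failure lets a single run report a wrong dimension and a missing normalization together.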
Steps to configure a Sentence Transformers text embedder in Haystack:
- Install the Haystack Sentence Transformers integration in the active Python environment.
$ python3 -m pip install --upgrade sentence-transformers-haystack
Successfully installed haystack-ai-2.30.2 sentence-transformers-haystack-0.1.0
Install this package in the same virtual environment that runs the Haystack pipeline.
- Create a query embedder script.
$ cat > haystack-query-embedder.py <<'PY'
from haystack_integrations.components.embedders.sentence_transformers import SentenceTransformersTextEmbedder

embedder = SentenceTransformersTextEmbedder(
    model="sentence-transformers/all-MiniLM-L6-v2",
    normalize_embeddings=True,
    progress_bar=False,
)
# Load the model now instead of on the first real query.
embedder.warm_up()

result = embedder.run(text="how to reset a forgotten password")
embedding = result["embedding"]

print(f"embedding_dimensions={len(embedding)}")
print(f"first_value={embedding[0]:.6f}")
# Sum of squares (the squared L2 norm); 1.0 for a unit-length vector.
print(f"normalized_l2={sum(value * value for value in embedding):.6f}")
PY
normalize_embeddings=True returns vectors with L2 norm near 1.0, which fits cosine-similarity retrieval. Add a prefix only for models whose model card or Haystack documentation requires a query instruction.
- Run the script to load the model and generate a query embedding.
$ python3 haystack-query-embedder.py
embedding_dimensions=384
first_value=-0.020695
normalized_l2=1.000000
embedding_dimensions=384 matches sentence-transformers/all-MiniLM-L6-v2, and normalized_l2=1.000000 confirms that the configured embedder normalized the vector.
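Normalized vectors suit cosine-similarity retrieval because the dot product of two unit-length vectors equals their cosine similarity. The standard-library sketch below illustrates this with toy three-dimensional vectors; the helper names and the values of query and document are assumptions made up for the example, not output from the model.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def l2_normalize(vec):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(dot(vec, vec))
    return [x / norm for x in vec]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors of any length."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


# Toy vectors, not real embeddings.
query = [1.0, 2.0, 2.0]
document = [2.0, 1.0, 2.0]

cosine = cosine_similarity(query, document)
dot_of_unit_vectors = dot(l2_normalize(query), l2_normalize(document))

print(f"cosine={cosine:.6f}")
print(f"dot_of_unit_vectors={dot_of_unit_vectors:.6f}")
```

Both printed values agree, so a retriever that scores by dot product should rank normalized embeddings the same way a cosine-similarity scorer would.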
Mohd Shakir Zakaria is a cloud architect with deep roots in software development and open-source advocacy. Certified in AWS, Red Hat, VMware, ITIL, and Linux, he specializes in designing and managing robust cloud and on-premises infrastructures.