How to generate embeddings with Sentence Transformers

Semantic search, clustering, and retrieval pipelines need text represented as dense numeric vectors before they can compare meaning. Sentence Transformers provides pretrained models that turn strings into embeddings with a single Python encode call.

SentenceTransformer loads the selected model from the local cache or downloads it from the Hugging Face Hub, and encode() returns one vector for each input sentence. For a list of texts, the default output is a two-dimensional NumPy array with one row per input text.

sentence-transformers/all-MiniLM-L6-v2 is small enough for a local shape check and returns 384-dimensional vectors. Use the same model ID in later vector storage, similarity scoring, or search code so saved embeddings keep the expected shape.
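One way to keep saved embeddings tied to the model that produced them is to write the model ID and dimension next to the vectors and check both on load. The sketch below is a minimal illustration, not part of Sentence Transformers: the function names save_embeddings and load_embeddings and the JSON layout are hypothetical, and the vectors are assumed to be plain Python lists (a NumPy array can be converted with .tolist() first).

```python
import json
from pathlib import Path


def save_embeddings(path, model_id, vectors):
    """Write vectors plus the model ID and dimension that describe them."""
    dimension = len(vectors[0]) if vectors else 0
    record = {"model_id": model_id, "dimension": dimension, "vectors": vectors}
    Path(path).write_text(json.dumps(record), encoding="utf-8")


def load_embeddings(path, expected_model_id):
    """Read vectors back, refusing files produced by a different model."""
    record = json.loads(Path(path).read_text(encoding="utf-8"))
    if record["model_id"] != expected_model_id:
        raise ValueError(
            f"file was built with {record['model_id']!r}, "
            f"not {expected_model_id!r}"
        )
    # Every stored vector should have the dimension recorded at save time.
    for vector in record["vectors"]:
        if len(vector) != record["dimension"]:
            raise ValueError("stored vector length does not match stored dimension")
    return record["vectors"]
```

JSON is used here only because it needs no extra packages; a real pipeline would more likely keep the same two fields in its vector store's metadata.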

Steps to generate Sentence Transformers embeddings:

Open a Python environment with Sentence Transformers available.

Install the package first if the import fails.
Related: How to install Sentence Transformers with pip
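A quick way to find out whether the import will succeed, without loading the library itself, is to ask Python whether the module can be located. This is a small sketch using only the standard library; the helper name package_available is hypothetical.

```python
import importlib.util


def package_available(module_name):
    """Return True when the named top-level module can be located."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    if package_available("sentence_transformers"):
        print("sentence_transformers is importable")
    else:
        print("sentence_transformers is missing; install it before continuing")
```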

Create a script that loads a model and encodes a list of sentences.

generate_embeddings.py

```
from sentence_transformers import SentenceTransformer

model_id = "sentence-transformers/all-MiniLM-L6-v2"
sentences = [
    "Reset a user password",
    "Create a private S3 bucket",
    "Rotate an SSH key",
]

model = SentenceTransformer(model_id)
embeddings = model.encode(sentences, show_progress_bar=False)

print(f"model={model_id}")
print(f"input_count={len(sentences)}")
print(f"embedding_shape={embeddings.shape}")
print(f"embedding_dtype={embeddings.dtype}")
print(f"model_dimension={model.get_embedding_dimension()}")
print(f"first_vector_preview={[round(float(value), 4) for value in embeddings[0][:5]]}")
```

show_progress_bar=False keeps short scripts and saved command transcripts from printing progress-bar control characters.

Run the script.

```
$ python generate_embeddings.py
model=sentence-transformers/all-MiniLM-L6-v2
input_count=3
embedding_shape=(3, 384)
embedding_dtype=float32
model_dimension=384
first_vector_preview=[-0.0378, -0.0338, -0.0607, -0.0331, -0.1292]
```

The first run downloads model files if they are not already in the local cache.

Check that the first shape value matches the number of input sentences.

For this run, embedding_shape=(3, 384) means three input strings produced three vectors, each with 384 numbers.

Check that the second shape value matches model_dimension before storing the vectors.

A vector database collection or FAISS index created for a different dimension rejects these embeddings or returns invalid comparisons.

Remove the temporary script after copying the encode block into the application code.
```
$ rm generate_embeddings.py
```
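The two shape checks from the steps above can also run as a guard in application code, so a mismatch stops the program before anything is written to a vector store. The following is a minimal sketch under that assumption; check_embedding_shape is a hypothetical helper, not a Sentence Transformers function, and it takes a plain shape tuple so it works with any array type.

```python
def check_embedding_shape(shape, input_count, expected_dimension):
    """Raise ValueError unless shape is (input_count, expected_dimension)."""
    if len(shape) != 2:
        raise ValueError(f"expected a 2-D array, got shape {shape}")
    rows, columns = shape
    if rows != input_count:
        raise ValueError(f"{rows} vectors returned for {input_count} inputs")
    if columns != expected_dimension:
        raise ValueError(
            f"vectors have {columns} values, index expects {expected_dimension}"
        )


# Values from the run shown earlier: three sentences, 384-dimensional model.
check_embedding_shape((3, 384), input_count=3, expected_dimension=384)
```

In an application, the arguments would typically be embeddings.shape, len(sentences), and the dimension the collection or index was created with.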

Author: Mohd Shakir Zakaria
Mohd Shakir Zakaria is a cloud architect with deep roots in software development and open-source advocacy. Certified in AWS, Red Hat, VMware, ITIL, and Linux, he specializes in designing and managing robust cloud and on-premises infrastructures.