Semantic search, clustering, and retrieval pipelines need text represented as dense numeric vectors before they can compare meaning. Sentence Transformers provides pretrained models that turn strings into embeddings with a single Python encode call.
SentenceTransformer loads the selected model from the local cache or Hugging Face Hub, and encode() returns one vector for each input sentence. For a list of texts, the default output is a NumPy array whose row count should match the input count.
sentence-transformers/all-MiniLM-L6-v2 is small enough for a local shape check and returns 384-dimensional vectors. Use the same model ID in later vector storage, similarity scoring, or search code so saved embeddings keep the expected shape.
Install the package first if the import fails.
Related: How to install Sentence Transformers with pip
from sentence_transformers import SentenceTransformer

model_id = "sentence-transformers/all-MiniLM-L6-v2"
sentences = [
    "Reset a user password",
    "Create a private S3 bucket",
    "Rotate an SSH key",
]

model = SentenceTransformer(model_id)
embeddings = model.encode(sentences, show_progress_bar=False)

print(f"model={model_id}")
print(f"input_count={len(sentences)}")
print(f"embedding_shape={embeddings.shape}")
print(f"embedding_dtype={embeddings.dtype}")
print(f"model_dimension={model.get_embedding_dimension()}")
print(f"first_vector_preview={[round(float(value), 4) for value in embeddings[0][:5]]}")
show_progress_bar=False keeps short scripts and saved command transcripts from printing progress-bar control characters.
$ python generate_embeddings.py
model=sentence-transformers/all-MiniLM-L6-v2
input_count=3
embedding_shape=(3, 384)
embedding_dtype=float32
model_dimension=384
first_vector_preview=[-0.0378, -0.0338, -0.0607, -0.0331, -0.1292]
The first run downloads model files if they are not already in the local cache.
For this run, embedding_shape=(3, 384) means three input strings produced three vectors, each with 384 numbers.
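The same check can be made explicit in a script so that a mismatch fails early instead of surfacing later in storage or search code. The sketch below is a minimal example, not part of Sentence Transformers: check_embedding_shape is a hypothetical helper that only inspects a shape tuple such as embeddings.shape.

```python
def check_embedding_shape(shape, input_count, expected_dim):
    """Raise ValueError unless shape equals (input_count, expected_dim).

    Hypothetical helper for illustration; pass embeddings.shape as `shape`.
    """
    if len(shape) != 2:
        raise ValueError(f"expected a 2-D array, got shape {tuple(shape)}")
    rows, dim = shape
    if rows != input_count:
        raise ValueError(f"expected {input_count} vectors, got {rows}")
    if dim != expected_dim:
        raise ValueError(f"expected dimension {expected_dim}, got {dim}")


# Shape values taken from the run above: three inputs, 384 dimensions.
check_embedding_shape((3, 384), input_count=3, expected_dim=384)
print("shape ok")
```

In the script above, the call would presumably look like check_embedding_shape(embeddings.shape, len(sentences), 384), placed right after encode().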
A vector database collection or FAISS index created for a different dimension will reject these embeddings or, if the mismatch goes unchecked, return meaningless comparisons.
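One way to catch such a mismatch before it reaches the index is to record the model ID and dimension next to the saved vectors and compare them before adding new ones. The following is a sketch under assumptions: the JSON layout and the names build_metadata and assert_compatible are hypothetical, not a format defined by Sentence Transformers, FAISS, or any vector database.

```python
import json


def build_metadata(model_id, dimension):
    """Return a JSON string describing how a set of embeddings was produced."""
    return json.dumps({"model_id": model_id, "dimension": dimension})


def assert_compatible(metadata_json, model_id, dimension):
    """Raise ValueError if new embeddings do not match the stored metadata."""
    stored = json.loads(metadata_json)
    if stored["model_id"] != model_id:
        raise ValueError(
            f"index was built with {stored['model_id']}, not {model_id}"
        )
    if stored["dimension"] != dimension:
        raise ValueError(
            f"index expects dimension {stored['dimension']}, got {dimension}"
        )


# Hypothetical round trip: save metadata once, check it before each write.
saved = build_metadata("sentence-transformers/all-MiniLM-L6-v2", 384)
assert_compatible(saved, "sentence-transformers/all-MiniLM-L6-v2", 384)
print("metadata matches")
```

Checking the model ID as well as the dimension matters because two different models can share a dimension while producing vectors that are not comparable with each other.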
$ rm generate_embeddings.py