Vector stores and similarity indexes need one fixed length for every dense vector in a collection. With Sentence Transformers, that length comes from the loaded model and the encode settings, so checking it before schema creation prevents dimension mismatch errors later.
get_embedding_dimension() reports the output length for encode(), while the encoded array shape confirms the value against actual input text. Reading both values is a quick way to catch the wrong model ID, a stale index setting, or an unexpected truncation path before embeddings are written.
Use the same model ID and encode options that the application will use for indexing and querying. If production code sets truncate_dim, verify that shortened shape instead of the model's full vector length.
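The effect of truncate_dim on the expected length can be sketched as plain arithmetic. The helper name expected_dim is hypothetical and not part of Sentence Transformers. It assumes truncation keeps only the leading components of each vector, so the length that reaches the index is the smaller of the model's full dimension and truncate_dim.

```python
from typing import Optional


def expected_dim(full_dim: int, truncate_dim: Optional[int] = None) -> int:
    """Return the vector length to expect after optional truncation.

    Hypothetical helper, not a Sentence Transformers API. It assumes
    truncation keeps the first `truncate_dim` components and can never
    lengthen a vector.
    """
    if truncate_dim is None:
        return full_dim
    if truncate_dim <= 0:
        raise ValueError("truncate_dim must be a positive integer")
    return min(full_dim, truncate_dim)


# 384 is the full length shown for all-MiniLM-L6-v2 in this article.
print(expected_dim(384))        # no truncation configured
print(expected_dim(384, 256))   # shortened vectors
print(expected_dim(384, 1024))  # larger than the model output, so no effect
```

Writing the expected value down first gives the shape check a concrete number to compare against.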
Use the same environment that will build the vector index or write embeddings for the application.
Related: How to install Sentence Transformers with pip
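    # check_dim.py
    import sentence_transformers as st

    # Build the model ID from its organization and name.
    org = "sentence-transformers"
    model_name = "all-MiniLM-L6-v2"
    model_id = f"{org}/{model_name}"
    model = st.SentenceTransformer(model_id)

    # Encode a small batch so the array shape reflects real input.
    sentences = [
        "Reset a user password",
        "Create an S3 bucket",
    ]
    embeddings = model.encode(sentences)

    # Compare the dimension the model reports with the encoded shape.
    # On releases without get_embedding_dimension(), the longer-standing
    # name is get_sentence_embedding_dimension().
    reported_dim = model.get_embedding_dimension()
    actual_dim = embeddings.shape[1]
    match = actual_dim == reported_dim

    print(f"model={model_name}")
    print(f"reported_dim={reported_dim}")
    print(f"shape={embeddings.shape}")
    print(f"match={match}")

    # Exit with a non-zero status when the two values disagree.
    if not match:
        raise SystemExit(
            f"mismatch: shape={actual_dim}, dim={reported_dim}"
        )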
Change org and model_name so that model_id names the model that will write the production vectors.
    $ python check_dim.py
    model=all-MiniLM-L6-v2
    reported_dim=384
    shape=(2, 384)
    match=True
The first number in shape is the number of input sentences. The second number is the embedding dimension that downstream indexes and vector fields must accept.
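The same reading of shape can be wrapped in a small helper. This sketch rests on one assumption: encode() called with a single string instead of a list returns a one-dimensional array, so the dimension is the last entry of shape and not always the second. The name dim_from_shape is hypothetical, and the function takes a plain tuple so it runs without loading a model.

```python
from typing import Tuple


def dim_from_shape(shape: Tuple[int, ...]) -> int:
    """Return the embedding dimension from an array shape.

    Hypothetical helper. Accepts (n_sentences, dim) for a batch or
    (dim,) for a single encoded string.
    """
    if len(shape) not in (1, 2):
        raise ValueError(f"unexpected embedding shape: {shape}")
    return shape[-1]


print(dim_from_shape((2, 384)))  # batch of two sentences
print(dim_from_shape((384,)))    # single string input
```

Reading the last axis keeps the check valid whether the application encodes batches or single queries.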
Recreate or rebuild an existing vector index when its configured dimension differs from the new model output. A collection created for 768-dimension vectors cannot accept 384-dimension embeddings.
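A guard along these lines could run before any write. It is a minimal sketch: index_dim stands for the dimension the vector store reports for the existing collection. How that value is fetched depends on the store, so it is passed in here as a plain integer. The function name needs_rebuild is hypothetical.

```python
def needs_rebuild(index_dim: int, model_dim: int) -> bool:
    """Return True when the index cannot accept the model's vectors.

    Hypothetical helper. `index_dim` is the dimension configured on the
    existing collection; `model_dim` is the checked encode() output length.
    """
    return index_dim != model_dim


# A collection created for 768-dimension vectors and a model that emits 384.
index_dim, model_dim = 768, 384
if needs_rebuild(index_dim, model_dim):
    print(f"rebuild required: index={index_dim}, model={model_dim}")
```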
Remove the test script after the dimension is confirmed.
    $ rm check_dim.py