Sentence Transformers truncates input text that exceeds the model's maximum sequence length before it creates embeddings. Setting a shorter maximum sequence length lets an embedding job cap the number of tokens processed per text, which suits documents that have already been chunked, latency-sensitive tests, or workloads where only the beginning of each text should be embedded.
The max_seq_length property belongs to the loaded SentenceTransformer model object. Each model starts with its own limit, and assigning a lower value before encode() changes how much tokenized text reaches the transformer in that Python process.
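Because the limit lives on the loaded model object, code that shares one model between jobs may want to lower it for a single batch and then put it back. The following is a minimal sketch of that idea. The helper name temporary_max_seq_length is hypothetical and not part of the sentence-transformers package, and a small stub class stands in for a real model so the example runs without downloading weights.

```python
from contextlib import contextmanager


@contextmanager
def temporary_max_seq_length(model, limit):
    """Hypothetical helper: set model.max_seq_length, then restore it on exit."""
    previous = model.max_seq_length
    model.max_seq_length = limit
    try:
        yield model
    finally:
        # Restore the earlier limit even if the body raises.
        model.max_seq_length = previous


class StubModel:
    """Stand-in for a loaded SentenceTransformer; only the attribute matters here."""

    def __init__(self, max_seq_length):
        self.max_seq_length = max_seq_length


model = StubModel(max_seq_length=256)

with temporary_max_seq_length(model, 128):
    inside = model.max_seq_length  # limit that encode() calls here would use

after = model.max_seq_length  # earlier limit is back in place
print(f"inside={inside} after={after}")
```

With a real model, the encode() call would go inside the with block in place of the attribute read.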
Use the setting as a truncation guard, not as a replacement for document chunking. Raising the value above what the underlying transformer supports does not make that transformer accept longer inputs, and heavily clipped documents can lose important information near the end of the text.
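When whole documents matter, splitting them before encoding keeps the text that truncation would discard. The sketch below shows one simple approach, overlapping word windows. The function name chunk_words and its parameters are assumptions for illustration, and word counts only approximate token counts, so a production pipeline would measure chunk length with the model's own tokenizer.

```python
def chunk_words(text, max_words=100, overlap=20):
    """Split text into overlapping windows of at most max_words words.

    Illustrative only: words are a rough proxy for tokens.
    """
    if overlap >= max_words:
        raise ValueError("overlap must be smaller than max_words")
    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # this window already reached the end of the text
    return chunks


# Synthetic 250-word document used only to exercise the function.
long_text = " ".join(f"word{i}" for i in range(250))
chunks = chunk_words(long_text, max_words=100, overlap=20)
chunk_lengths = [len(chunk.split()) for chunk in chunks]
print(len(chunks), chunk_lengths)
```

Each chunk can then be passed to encode() as its own text, with max_seq_length acting only as a guard against the occasional oversized chunk.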
$ python3 - <<'PY'
from sentence_transformers import SentenceTransformer
model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
print(model.max_seq_length)
PY
256
Use model.max_seq_length directly. get_max_seq_length() remains for older code, but the package reference marks it deprecated in favor of the property.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

sentence = (
    "Sentence Transformers truncates input tokens when a document is longer "
    "than the configured sequence length. "
)
long_text = sentence * 20

print(f"original max_seq_length={model.max_seq_length}")

model.max_seq_length = 128
features = model.preprocess([long_text])
print(f"updated max_seq_length={model.max_seq_length}")
print(f"tokenized sequence length={features['input_ids'].shape[1]}")

embedding = model.encode([long_text], show_progress_bar=False)
print(f"embedding shape={embedding.shape}")
preprocess() shows the tokenized input length after the limit is changed. Application code can set max_seq_length and call encode() without using preprocess().
$ python3 set_max_seq_length.py
original max_seq_length=256
updated max_seq_length=128
tokenized sequence length=128
embedding shape=(1, 384)
The embedding width stays 384 for all-MiniLM-L6-v2 because sequence length controls input truncation, not output vector dimension.
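One way to see why the width does not change is to look at pooling. The sketch below assumes mean pooling over per-token vectors and uses made-up token vectors in place of real transformer output. The names mean_pool and fake_token_vectors are illustrative, not library functions.

```python
HIDDEN_SIZE = 384  # token vector width for all-MiniLM-L6-v2


def mean_pool(token_vectors):
    """Average per-token vectors into a single sentence vector."""
    count = len(token_vectors)
    width = len(token_vectors[0])
    return [sum(vec[i] for vec in token_vectors) / count for i in range(width)]


def fake_token_vectors(num_tokens):
    """Placeholder values; a real model would produce these per token."""
    return [[float(t + i) for i in range(HIDDEN_SIZE)] for t in range(num_tokens)]


# Two inputs truncated to different sequence lengths.
pooled_128 = mean_pool(fake_token_vectors(128))
pooled_256 = mean_pool(fake_token_vectors(256))

# The number of tokens is averaged away; only the hidden size remains.
print(len(pooled_128), len(pooled_256))
```

The sequence length decides how many rows are averaged, while the number of columns, and therefore the embedding width, comes from the model itself.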
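from sentence_transformers import SentenceTransformer

# Placeholder input; replace with the application's own texts.
documents = ["First document text.", "Second document text."]

model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
model.max_seq_length = 128  # set after loading, before any encode() call
embeddings = model.encode(documents, show_progress_bar=False)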
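The assignment applies only to the loaded model object, so set it in every process that loads the model, before any encode() call that depends on the same truncation policy. Changing the limit after a vector index is built does not update the stored embeddings. New vectors are then truncated differently from the indexed ones, which can change retrieval scores, so re-embed the indexed documents if the limit changes.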
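One way to catch a mismatch between an existing index and the current truncation policy is to store the limit next to the index and compare it before querying. This is a rough sketch under assumptions: the JSON file layout and the helper names save_index_metadata and check_index_metadata are invented for illustration and do not come from any vector store or from sentence-transformers.

```python
import json
import os
import tempfile

MODEL_NAME = "sentence-transformers/all-MiniLM-L6-v2"


def save_index_metadata(path, model_name, max_seq_length):
    """Record the settings that were used to build the index."""
    settings = {"model_name": model_name, "max_seq_length": max_seq_length}
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(settings, handle)


def check_index_metadata(path, model_name, max_seq_length):
    """Raise ValueError if the current settings differ from the stored ones."""
    with open(path, encoding="utf-8") as handle:
        stored = json.load(handle)
    current = {"model_name": model_name, "max_seq_length": max_seq_length}
    if stored != current:
        raise ValueError(f"index built with {stored}, current settings are {current}")


# Hypothetical metadata file kept alongside the index.
metadata_path = os.path.join(tempfile.mkdtemp(), "index_metadata.json")
save_index_metadata(metadata_path, MODEL_NAME, 128)

check_index_metadata(metadata_path, MODEL_NAME, 128)  # same policy: no error

try:
    check_index_metadata(metadata_path, MODEL_NAME, 256)
    mismatch_detected = False
except ValueError:
    mismatch_detected = True

print(f"mismatch_detected={mismatch_detected}")
```

In application code, the value passed to the check would be model.max_seq_length read from the loaded model just before encoding queries.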
$ rm set_max_seq_length.py