A saved Sentence Transformers directory is the handoff point between training, packaging, and later embedding jobs. Saving the model locally keeps the tokenizer, pooling configuration, and weight files together so a later Python process can reload the same embedding behavior without depending on the original model object.
The save_pretrained() method writes the model modules, configuration, tokenizer files, and weights to a directory. Loading that saved path with SentenceTransformer() should recreate the model stack, and local_files_only=True makes the reload fail if the path is incomplete instead of falling back to a remote download.
A public MiniLM embedding model can stand in for a fine-tuned checkpoint during smoke testing. The target path should not exist before export, and the pass condition is a matching embedding shape plus a maximum absolute difference within 1e-6 (ideally zero) after reload.
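To make the saved layout concrete, the module list in modules.json can be read directly to see which modules a reload will rebuild. This is a minimal sketch: it assumes each entry carries `idx`, `path`, and `type` keys, which is how exported directories commonly look but may differ between library versions, and `list_saved_modules` is a hypothetical helper name.

```python
import json
from pathlib import Path


def list_saved_modules(save_dir):
    """Return (subfolder, module type) pairs recorded in modules.json."""
    raw = (Path(save_dir) / "modules.json").read_text(encoding="utf-8")
    entries = json.loads(raw)
    # Assumption: "idx" gives the order in which the modules are applied.
    entries = sorted(entries, key=lambda entry: entry.get("idx", 0))
    # An empty "path" is reported as "." (files in the directory root).
    return [
        (entry.get("path", "") or ".", entry.get("type", ""))
        for entry in entries
    ]


# Usage, once an export exists at the path used in the steps:
# for folder, module_type in list_saved_modules("models/support-embeddings"):
#     print(folder, module_type)
```

A stack that lists a pooling module here but has no matching subfolder on disk is a sign of an incomplete export.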
Steps to save and reload a Sentence Transformers model:
- Choose the source model and local directory for the saved copy.
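The smoke-test script uses the model ID sentence-transformers/all-MiniLM-L6-v2 and the local path models/support-embeddings; substitute your own values and start with a path that does not exist yet. A path that already contains another model can leave stale tokenizer, pooling, or weight files beside the new export.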
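The stale-file risk can be checked before any model is loaded. The sketch below is one possible pre-flight check, not part of the library; `export_target_is_clean` is a hypothetical helper, and it is slightly more permissive than the smoke-test script because it also accepts an existing but empty directory.

```python
from pathlib import Path


def export_target_is_clean(path):
    """Return True when path does not exist or is an empty directory."""
    target = Path(path)
    if not target.exists():
        return True
    # A regular file at this path cannot hold a model directory.
    if not target.is_dir():
        return False
    # any() stops at the first entry, so large directories stay cheap.
    return not any(target.iterdir())


# Usage:
# if not export_target_is_clean("models/support-embeddings"):
#     raise SystemExit("target already has content; choose another path")
```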
- Create a save and reload smoke-test script.
<file python save_reload.py>
from pathlib import Path

import numpy as np
from sentence_transformers import SentenceTransformer

model_id = "sentence-transformers/all-MiniLM-L6-v2"
save_path = Path("models/support-embeddings")
texts = ["reset password", "change billing address"]
# Files a complete export is expected to contain.
files = [
    "modules.json",
    "config_sentence_transformers.json",
    "model.safetensors",
    "1_Pooling/config.json",
]

# Refuse to write into an existing path so stale files cannot mix in.
if save_path.exists():
    raise SystemExit(f"{save_path} exists; choose an empty path")

model = SentenceTransformer(model_id)
before = model.encode(texts)

model.save_pretrained(str(save_path), safe_serialization=True)

missing = [name for name in files if not (save_path / name).exists()]
if missing:
    raise SystemExit("missing saved files: " + ", ".join(missing))

# local_files_only=True prevents a fallback to a remote download.
reloaded = SentenceTransformer(str(save_path), local_files_only=True)
after = reloaded.encode(texts)

# Compare shapes first: mismatched arrays either fail to subtract
# or broadcast silently.
if before.shape != after.shape:
    raise SystemExit("shape mismatch")

diff = float(np.max(np.abs(before - after)))
if diff > 1e-6:
    raise SystemExit("embedding mismatch")

print("saved path: ok")
print("checked files:", len(files))
print("shape before:", before.shape)
print("shape after:", after.shape)
print("max diff:", f"{diff:.8f}")
</file>
safe_serialization=True writes the model weights as .safetensors. Keep it enabled unless a downstream runtime specifically requires legacy PyTorch weight files.
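To confirm which weight format actually landed on disk, the saved directory can be scanned by file name. This is a rough sketch under the assumption that legacy PyTorch weights use the conventional `pytorch_model*.bin` naming; `saved_weight_formats` is a hypothetical helper, not a library call.

```python
from pathlib import Path


def saved_weight_formats(save_dir):
    """Return the weight formats found anywhere under save_dir."""
    root = Path(save_dir)
    formats = []
    # rglob also covers weights stored inside module subfolders.
    if any(root.rglob("*.safetensors")):
        formats.append("safetensors")
    # Assumption: legacy pickle-based weights are named pytorch_model*.bin.
    if any(root.rglob("pytorch_model*.bin")):
        formats.append("pytorch-bin")
    return formats


# Usage:
# print(saved_weight_formats("models/support-embeddings"))
```

A result that contains both formats usually points to an export written on top of an older one.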
- Run the smoke-test script.
$ python save_reload.py
saved path: ok
checked files: 4
shape before: (2, 384)
shape after: (2, 384)
max diff: 0.00000000
The printed shapes should match the number of input texts and the embedding dimension of the saved model, here 2 texts by 384 dimensions. A max difference above the script's 1e-6 tolerance means the reload did not reproduce the same embeddings.
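The pass condition can also be packaged as a small reusable check. The sketch below assumes NumPy is installed and that both inputs are array-like embeddings; `embeddings_match` and its default tolerance of 1e-6 are illustrative choices that mirror the smoke test rather than anything the library provides.

```python
import numpy as np


def embeddings_match(before, after, tol=1e-6):
    """Return (ok, max_abs_diff); max_abs_diff is None on a shape mismatch."""
    before = np.asarray(before)
    after = np.asarray(after)
    # Shapes are compared first because NumPy may broadcast unequal
    # shapes silently instead of raising.
    if before.shape != after.shape:
        return False, None
    diff = float(np.max(np.abs(before - after))) if before.size else 0.0
    return diff <= tol, diff


# Usage with the arrays from the smoke test:
# ok, diff = embeddings_match(before, after)
```

Returning the difference alongside the verdict keeps the number available for logging when a reload fails the check.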
- Create a separate local-path reload script for the application handoff.
<file python load_model.py>
from sentence_transformers import SentenceTransformer

# Load strictly from disk; fail rather than download if files are missing.
model = SentenceTransformer(
    "models/support-embeddings",
    local_files_only=True,
)
embeddings = model.encode(["reset password"])
print("local shape:", embeddings.shape)
</file>
- Run the local-path reload script from the same project directory.
$ python load_model.py
local shape: (1, 384)
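The reload script depends on the working directory because models/support-embeddings is a relative path. One way to remove that dependency, sketched here as an assumption about how an application might be laid out, is to resolve the path against the script's own location; `model_dir_from` is a hypothetical helper.

```python
from pathlib import Path


def model_dir_from(anchor_file, relative="models/support-embeddings"):
    """Resolve the saved model path against anchor_file's folder, not the cwd."""
    return Path(anchor_file).resolve().parent / relative


# Usage inside load_model.py (assumes the models/ folder sits beside it):
# model = SentenceTransformer(
#     str(model_dir_from(__file__)),
#     local_files_only=True,
# )
```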
- Remove the smoke-test scripts after the saved model has passed the reload checks.
$ rm save_reload.py load_model.py
This removes only the temporary validation scripts. Keep models/support-embeddings as the saved model artifact.
Mohd Shakir Zakaria is a cloud architect with deep roots in software development and open-source advocacy. Certified in AWS, Red Hat, VMware, ITIL, and Linux, he specializes in designing and managing robust cloud and on-premises infrastructures.