How to set the cache directory for Sentence Transformers models

Sentence Transformers stores downloaded model snapshots through the Hugging Face Hub cache, which can grow quickly on containers, shared hosts, and systems with small home directories. Pointing a model load at a selected cache directory keeps those files on a volume with enough space and makes later runs reuse the same snapshot.

The cache_folder argument on SentenceTransformer() is the most explicit control because it belongs to the model load that downloads or reuses the files. Use it when one application, job, or service should keep its embedding model cache separate from the user's default ~/.cache/huggingface tree.

A completed setup leaves a models--sentence-transformers--... directory, which holds the downloaded snapshot, under the selected path. Loading the same model again with local_files_only=True confirms that the cache is usable without contacting the Hub.
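The directory naming described above can be checked with a few lines of Python. This is a minimal sketch, assuming the usual Hub cache layout in which a repository folder is named "models--" plus the model ID with each "/" replaced by "--", and holds a snapshots subfolder with one folder per revision. The helper names repo_cache_dir and has_snapshot are hypothetical and not part of Sentence Transformers.

```python
from pathlib import Path


def repo_cache_dir(cache_dir: Path, model_id: str) -> Path:
    """Return the folder the Hub cache is expected to use for a model repo."""
    # Assumed layout: "models--" followed by the repo id with "/" replaced by "--".
    return cache_dir / ("models--" + model_id.replace("/", "--"))


def has_snapshot(cache_dir: Path, model_id: str) -> bool:
    """Report whether at least one snapshot revision folder exists for the model."""
    snapshots = repo_cache_dir(cache_dir, model_id) / "snapshots"
    return snapshots.is_dir() and any(p.is_dir() for p in snapshots.iterdir())


if __name__ == "__main__":
    # Same path and model ID as the steps in this guide.
    cache_dir = Path("~/models/sentence-transformers").expanduser()
    model_id = "sentence-transformers/all-MiniLM-L6-v2"
    print(repo_cache_dir(cache_dir, model_id))
    print(has_snapshot(cache_dir, model_id))
```

Before the first download the second line prints False. After a successful load into the selected directory it should print True.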

Steps to set the Sentence Transformers model cache directory:

  1. Create the model cache directory.
    $ mkdir -p ~/models/sentence-transformers
  2. Add cache_folder to the SentenceTransformer loader in the application code.
    from pathlib import Path
    from sentence_transformers import SentenceTransformer
     
    cache_dir = Path("~/models/sentence-transformers").expanduser()
    model = SentenceTransformer(
        "sentence-transformers/all-MiniLM-L6-v2",
        cache_folder=str(cache_dir),
    )

    SENTENCE_TRANSFORMERS_HOME sets the default cache directory for Sentence Transformers loads that omit cache_folder. Set HF_HOME or HF_HUB_CACHE only when other Hugging Face Hub clients should share the same cache root, and set them before Python starts because the Hub library reads them when it is imported.

  3. Run a small model load through the configured cache path.
    $ python3 - <<'PY'
    from pathlib import Path
    from sentence_transformers import SentenceTransformer
    
    cache_dir = Path("~/models/sentence-transformers").expanduser()
    model = SentenceTransformer(
        "sentence-transformers/all-MiniLM-L6-v2",
        cache_folder=str(cache_dir),
    )
    embedding = model.encode(["custom cache test"])
    print(f"cache_dir={cache_dir}")
    print(f"embedding_shape={embedding.shape}")
    PY
    cache_dir=/home/user/models/sentence-transformers
    embedding_shape=(1, 384)
  4. List the cache directory to confirm the model snapshot landed there.
    $ ls -1 ~/models/sentence-transformers
    models--sentence-transformers--all-MiniLM-L6-v2
  5. Reload the model from the same cache without network access.
    $ python3 - <<'PY'
    from pathlib import Path
    from sentence_transformers import SentenceTransformer
    
    cache_dir = Path("~/models/sentence-transformers").expanduser()
    model = SentenceTransformer(
        "sentence-transformers/all-MiniLM-L6-v2",
        cache_folder=str(cache_dir),
        local_files_only=True,
    )
    embedding = model.encode(["cache-only reload"])
    print(embedding.shape)
    PY
    (1, 384)

    If this load fails with a missing-file error, the earlier run did not populate the selected cache path or the application is pointing at a different directory.
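When the cache-only reload fails, it can help to list every directory the load might have used. The sketch below is one way to do that. It assumes the precedence implied by the note in step 2: an explicit cache_folder first, then SENTENCE_TRANSFORMERS_HOME, then HF_HUB_CACHE, then the hub folder under HF_HOME, then the default under ~/.cache/huggingface. The function name candidate_cache_dirs is hypothetical, and the ordering is an assumption to verify against the installed library versions. Other variables such as XDG_CACHE_HOME are not covered.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def candidate_cache_dirs(
    explicit: Optional[str] = None,
    env: Mapping[str, str] = os.environ,
) -> list[tuple[str, Path]]:
    """List (source, path) pairs in the assumed order of precedence."""
    candidates: list[tuple[str, Path]] = []
    if explicit:
        # Value passed as cache_folder=... in application code.
        candidates.append(("cache_folder", Path(explicit).expanduser()))
    if env.get("SENTENCE_TRANSFORMERS_HOME"):
        candidates.append(
            ("SENTENCE_TRANSFORMERS_HOME",
             Path(env["SENTENCE_TRANSFORMERS_HOME"]).expanduser())
        )
    if env.get("HF_HUB_CACHE"):
        candidates.append(("HF_HUB_CACHE", Path(env["HF_HUB_CACHE"]).expanduser()))
    if env.get("HF_HOME"):
        # Assumption: the Hub cache sits in the "hub" folder under HF_HOME.
        candidates.append(("HF_HOME", Path(env["HF_HOME"]).expanduser() / "hub"))
    # Fallback when nothing else is configured.
    candidates.append(("default", Path("~/.cache/huggingface/hub").expanduser()))
    return candidates


if __name__ == "__main__":
    for source, path in candidate_cache_dirs("~/models/sentence-transformers"):
        print(f"{source}: {path} (exists={path.is_dir()})")
```

If the model folder shows up under a path other than the first entry, the application is most likely loading without the cache_folder argument or with a different value.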