Semantic search compares the embedding of a user query with document embeddings, so a support note can match a query even when the two use different wording. Sentence Transformers provides bi-encoder models that turn queries and documents into vectors, which are then ranked by similarity to find the nearest matches.
This small in-memory pattern fits product FAQs, notes, tickets, and other corpora that can be embedded during startup or a batch refresh. Each text is stored with an application ID, and the model encodes the document texts. Each search hit carries the document's position in the corpus plus a similarity score; the script maps that position back to the original ID, so the application receives the ID and the score.
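The ranking step can be illustrated without a model. The sketch below is only an illustration: the three-dimensional vectors are hand-written stand-ins for real embeddings, and `cosine` and `search` are hypothetical helper names, not part of Sentence Transformers.

```python
import math

# Hypothetical records: hand-written 3-d vectors stand in for real embeddings.
records = [
    {"id": "doc-001", "vector": [0.9, 0.1, 0.0]},
    {"id": "doc-002", "vector": [0.1, 0.9, 0.2]},
    {"id": "doc-003", "vector": [0.0, 0.2, 0.9]},
]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def search(query_vector, records, top_k=2):
    """Score every record against the query and return the best top_k."""
    scored = [
        {"id": record["id"], "score": cosine(query_vector, record["vector"])}
        for record in records
    ]
    scored.sort(key=lambda hit: hit["score"], reverse=True)
    return scored[:top_k]


# The query vector points mostly along the second axis, like doc-002.
results = search([0.2, 0.8, 0.1], records)
for hit in results:
    print(f"{hit['id']} score={hit['score']:.4f}")
```

Because the ID travels with each vector, the caller never needs to know where a document sits in the list.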
Use encode_document() for corpus text and encode_query() for user queries when the model supports query/document prompts. For corpora that outgrow exact in-memory search, keep the same embedding boundary and move the stored vectors into FAISS, Qdrant, or another vector index.
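One way to keep the embedding boundary stable is to hide the store behind a small interface, so that application code does not change when the exact search is later replaced. The sketch below assumes that design: `VectorIndex`, `ExactIndex`, and `find_ids` are hypothetical names, and no FAISS or Qdrant calls are shown. An adapter for either store would need to provide the same two methods.

```python
from typing import Protocol, Sequence

Vector = Sequence[float]


class VectorIndex(Protocol):
    """Hypothetical interface: store vectors by ID, return the nearest IDs."""

    def add(self, doc_id: str, vector: Vector) -> None: ...

    def search(self, query: Vector, top_k: int) -> list[tuple[str, float]]: ...


class ExactIndex:
    """Brute-force index that scores every stored vector with a dot product.

    For normalized embeddings the dot product equals cosine similarity.
    """

    def __init__(self) -> None:
        self._items: list[tuple[str, Vector]] = []

    def add(self, doc_id: str, vector: Vector) -> None:
        self._items.append((doc_id, vector))

    def search(self, query: Vector, top_k: int) -> list[tuple[str, float]]:
        scored = [
            (doc_id, sum(q * v for q, v in zip(query, vector)))
            for doc_id, vector in self._items
        ]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:top_k]


def find_ids(index: VectorIndex, query_vector: Vector, top_k: int = 3) -> list[str]:
    """Application-facing call: depends only on the interface, not the store."""
    return [doc_id for doc_id, _ in index.search(query_vector, top_k)]


index = ExactIndex()
index.add("doc-001", [1.0, 0.0])
index.add("doc-002", [0.0, 1.0])
print(find_ids(index, [0.1, 0.9], top_k=1))  # prints ['doc-002']
```

With this split, only the class behind `VectorIndex` changes when the corpus grows; the encoding calls and `find_ids` stay the same.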
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

# Each record keeps its application ID beside the text to embed.
documents = [
    {"id": "doc-001", "text": "Set up billing alerts for monthly cloud spending."},
    {"id": "doc-002", "text": "Reset expired password links from the account security page."},
    {"id": "doc-003", "text": "Rotate SSH keys for production deployment hosts."},
    {"id": "doc-004", "text": "Renew TLS certificates before the web server reload."},
    {"id": "doc-005", "text": "Export customer invoices from the finance dashboard."},
]

query = "password reset link expired"
corpus = [item["text"] for item in documents]

# Embed the corpus once; normalized vectors make the cosine scores comparable.
corpus_embeddings = model.encode_document(
    corpus,
    convert_to_tensor=True,
    normalize_embeddings=True,
    show_progress_bar=False,
)

query_embedding = model.encode_query(
    query,
    convert_to_tensor=True,
    normalize_embeddings=True,
    show_progress_bar=False,
)

# semantic_search returns one list of hits per query; take the first (only) one.
hits = util.semantic_search(query_embedding, corpus_embeddings, top_k=3)[0]

print(f"Corpus embeddings: {tuple(corpus_embeddings.shape)}")
print(f"Query: {query}")

# hit["corpus_id"] is the position in the corpus list, so map it back to the record.
for rank, hit in enumerate(hits, start=1):
    item = documents[hit["corpus_id"]]
    print(f"{rank}. {item['id']} score={hit['score']:.4f} text={item['text']}")

top_item = documents[hits[0]["corpus_id"]]
if top_item["id"] == "doc-002":
    print("Semantic search check: pass")
Use a Python environment where Sentence Transformers is installed before running the script.
Related: How to install Sentence Transformers with pip
$ python semantic_search_build.py
Corpus embeddings: (5, 384)
Query: password reset link expired
1. doc-002 score=0.8394 text=Reset expired password links from the account security page.
2. doc-004 score=0.3238 text=Renew TLS certificates before the web server reload.
3. doc-001 score=0.0895 text=Set up billing alerts for monthly cloud spending.
Semantic search check: pass
The first tuple value is the number of indexed documents, and the second value is the embedding dimension from the selected model.
The exact score can change with a different model or corpus, but the highest-ranked ID should belong to the record that matches the query intent.
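Since scores drift but the top ID should not, a check can compare IDs only. The sketch below assumes hits shaped like the output of util.semantic_search (dictionaries with `corpus_id` and `score`); `top_hit_id` is a hypothetical helper, and the sample hits reuse the scores from the run shown above.

```python
def top_hit_id(hits, documents):
    """Map the best-scoring hit's corpus position back to the application ID."""
    if not hits:
        return None
    best = max(hits, key=lambda hit: hit["score"])
    return documents[best["corpus_id"]]["id"]


# Same five IDs as the script; the texts are not needed for this check.
documents = [{"id": f"doc-00{n}"} for n in range(1, 6)]

# Hits copied from the sample run: corpus positions 1, 3, and 0.
hits = [
    {"corpus_id": 1, "score": 0.8394},
    {"corpus_id": 3, "score": 0.3238},
    {"corpus_id": 0, "score": 0.0895},
]

# Assert on the ID, not the score, so a model change does not break the check.
assert top_hit_id(hits, documents) == "doc-002"
print("Top hit:", top_hit_id(hits, documents))
```

The same helper could be run over a short list of query and expected-ID pairs after each model or corpus refresh.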