Overview
The Knowledge Base Search Node (embed-search) retrieves the most relevant chunks from a Knowledge Base for a given search query. It uses vector search (with optional reranking and continuation) so you can feed retrieved context into LLM nodes, a common Retrieval-Augmented Generation (RAG) pattern.

Recommendation: For most RAG and Q&A use cases, we recommend using the Agent Node with the Knowledge Base Search tool configured instead of this node. The agent can decide when and how to query the knowledge base, and can combine search results with other tools (e.g. web search). Use the Knowledge Base Search Node when you need a single, deterministic search step in the graph (e.g. a fixed query that always runs and passes results to an LLM).
Configuration Parameters
- Knowledge Base: Select the knowledge base to search. If you haven't created one yet, do so from the Knowledge Base page. The node retrieves chunks from this index.
- Search String: The query used for semantic search. Supports format strings from workflow input (e.g. `{user_question}`). Example: "What are the benefits of AI?"
- Include Document Metadata for Citations: When enabled, the output includes XML-formatted metadata for citations.
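The format-string substitution described above can be sketched as follows. This is a minimal illustration that assumes placeholders behave like Python's `str.format` fields; the node's exact substitution rules may differ, and `render_search_string` is a hypothetical helper, not part of the product:

```python
def render_search_string(template: str, inputs: dict) -> str:
    """Substitute {placeholder} fields from workflow input into the Search String."""
    query = template.format(**inputs)
    if not query.strip():
        # Mirrors the node's "A search string must be provided" validation.
        raise ValueError("A search string must be provided (non-empty after formatting).")
    return query

query = render_search_string(
    "{user_question}",
    {"user_question": "What are the benefits of AI?"},
)
# query == "What are the benefits of AI?"
```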
Advanced Settings
- [VectorDB] Search Type
  - Similarity: Retrieve by vector similarity only.
  - MMR: Maximal Marginal Relevance; balances similarity with diversity to reduce redundant chunks.
- [VectorDB] Retrieval Mode
  - Dense: Use dense embeddings only (default).
  - Sparse: Use sparse embeddings only.
  - Hybrid: Combine dense and sparse embeddings.
  Sparse and Hybrid require the knowledge base to have a sparse embedding model configured; otherwise only Dense is supported and the node returns an error.
- [VectorDB] Retrieve Top-N Chunks: Number of most similar chunks to fetch from the vector store before reranking (0–100). Default: 40.
- [VectorDB] Similarity Score Threshold: Minimum similarity score for results (0.0–1.0). Chunks below this threshold are excluded.
- [Reranker] Rerank Model: Cohere model used to rerank retrieved chunks:
  - Cohere Rerank-v3.5
  - Cohere Rerank-English-v3.0
  - Cohere Rerank-Multilingual-v3.0
- [Reranker] Return Top-N Chunks: Number of chunks to keep after reranking (0–100). Default: 20.
- [Advanced RAG] Max Continuation Radius: Maximum number of neighboring continuation chunks to add to each returned chunk for better context (0–20, step 5). Helps avoid cutting off mid-sentence or mid-paragraph.
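To illustrate the MMR search type above, here is a minimal sketch of greedy Maximal Marginal Relevance selection over pre-computed similarity scores. This is a simplified model of the technique, not the vector store's actual implementation; `mmr_select` and the `lambda_` trade-off parameter are illustrative:

```python
def mmr_select(query_sim, doc_sims, top_n, lambda_=0.5):
    """Greedily pick chunks that are similar to the query but
    dissimilar to chunks already selected.

    query_sim: query_sim[i] = similarity(query, chunk i)
    doc_sims:  doc_sims[i][j] = similarity(chunk i, chunk j)
    """
    selected = []
    candidates = list(range(len(query_sim)))
    while candidates and len(selected) < top_n:
        def mmr_score(i):
            # Penalize redundancy with the most similar already-selected chunk.
            redundancy = max((doc_sims[i][j] for j in selected), default=0.0)
            return lambda_ * query_sim[i] - (1 - lambda_) * redundancy
        best = max(candidates, key=mmr_score)
        selected.append(best)
        candidates.remove(best)
    return selected

# Chunks 0 and 1 are near-duplicates; MMR picks 0 first, then prefers
# the more diverse chunk 2 over the redundant chunk 1.
query_sim = [0.9, 0.88, 0.7]
doc_sims = [
    [1.0, 0.95, 0.2],
    [0.95, 1.0, 0.25],
    [0.2, 0.25, 1.0],
]
order = mmr_select(query_sim, doc_sims, top_n=2)
# order == [0, 2]
```

Plain Similarity search would have returned chunks 0 and 1 (the two highest query similarities), which is exactly the redundancy MMR is designed to avoid.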
Expected Inputs and Outputs
- Inputs:
  - input: Optional. Values can be referenced in the Search String using format strings (e.g. `{user_question}`). If omitted, the Search String is used as-is. The Search String must be non-empty after formatting.
- Outputs:
  - output: A string containing the retrieved data chunks. If "Include Document Metadata for Citations" is enabled, the output includes XML-formatted citations.
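The output string is typically wired into a downstream LLM node. A rough sketch of that common RAG prompt assembly follows; the function and variable names are illustrative, not the node's actual field names, and the prompt wording is an assumption:

```python
def build_rag_prompt(retrieved_chunks: str, user_question: str) -> str:
    """Assemble a simple RAG prompt from the node's output string."""
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{retrieved_chunks}\n\n"
        f"Question: {user_question}"
    )

prompt = build_rag_prompt("Chunk 1: ...", "What are the benefits of AI?")
```

In a workflow graph this corresponds to connecting the node's `output` to an LLM node's input and referencing it in the LLM prompt template.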
Use Case Examples
- Research Analysis: Quickly find specific information or references within a large collection of research documents, streamlining your research workflow.
- Content Recommendation: Content creators can use this node to identify similar articles or documents, aiding in the development of well-researched and informed content.
- Customer Support: Enhance customer support by enabling chatbots to search and retrieve relevant knowledge base articles, providing quick and accurate responses to customer inquiries.
- Data Insights Extraction: Extract specific insights from large datasets by setting a relevant search string, allowing for efficient analysis without manual data sifting.
Error Handling and Troubleshooting
- Missing Knowledge Base: If you see "A knowledge base must be selected," select a knowledge base from the dropdown.
- Search String Required: If you see "A search string must be provided," ensure the Search String is non-empty (after format-string substitution from input).
- Sparse / Hybrid Retrieval: If you see "Sparse embedding model is not set for this knowledge base. Only 'dense' retrieval mode is supported," the knowledge base does not have a sparse embedding model configured. Use Dense retrieval mode, or configure a sparse embedding model for the knowledge base to use Sparse or Hybrid.