Describe the bug
meta_fields_to_embed silently drops valid falsy metadata values (such as 0 and False) during text preparation for embedding and ranking. The filtering logic uses a Python truthiness check instead of an explicit None guard.
Affected components:
SentenceTransformersDocumentEmbedder
SentenceTransformersSparseDocumentEmbedder
TransformersSimilarityRanker
SentenceTransformersSimilarityRanker
SentenceTransformersDiversityRanker
Error message
No error is raised. The values are excluded without any warning, which makes this a silent correctness bug.
Expected behavior
All metadata values specified in meta_fields_to_embed should be included in the embedded text unless they are explicitly None or the key is absent from doc.meta.
To Reproduce
from haystack import Document
from haystack.components.embedders import SentenceTransformersDocumentEmbedder
embedder = SentenceTransformersDocumentEmbedder(
    model="sentence-transformers/all-MiniLM-L6-v2",
    meta_fields_to_embed=["rating", "is_available"],
    embedding_separator="\n",
)
doc = Document(content="some content", meta={"rating": 0, "is_available": False})
# Expected embedded text: "0\nFalse\nsome content"
# Actual embedded text: "some content" ← both fields silently dropped
Root cause — all 5 affected components use:
if key in doc.meta and doc.meta[key]
which treats any falsy value as absent. Should be:
if key in doc.meta and doc.meta[key] is not None
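The difference between the two guards can be checked without loading a model. The following is a minimal sketch, not the components' actual code: it operates on a plain dict instead of a Document, and the helper names join_meta_truthy and join_meta_not_none are hypothetical, introduced only for illustration.

```python
from typing import Any


def join_meta_truthy(meta: dict[str, Any], fields: list[str], content: str, sep: str = "\n") -> str:
    # Hypothetical helper mirroring the truthiness check: 0, False, "" are dropped.
    values = [str(meta[key]) for key in fields if key in meta and meta[key]]
    return sep.join(values + [content])


def join_meta_not_none(meta: dict[str, Any], fields: list[str], content: str, sep: str = "\n") -> str:
    # Hypothetical helper mirroring the proposed guard: only None and missing keys are dropped.
    values = [str(meta[key]) for key in fields if key in meta and meta[key] is not None]
    return sep.join(values + [content])


meta = {"rating": 0, "is_available": False, "notes": None}
fields = ["rating", "is_available", "notes", "missing"]

truthy_text = join_meta_truthy(meta, fields, "some content")
fixed_text = join_meta_not_none(meta, fields, "some content")

print(repr(truthy_text))  # 'some content'
print(repr(fixed_text))   # '0\nFalse\nsome content'
```

With the explicit None guard, 0 and False survive, while the None value and the missing key are still skipped, matching the expected behavior described above.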
Additional context
OpenAIDocumentEmbedder and AzureOpenAIDocumentEmbedder already use the correct is not None pattern and are unaffected.
FAQ Check
System:
- OS: Fedora Linux 42 (KDE Plasma) x86_64
- GPU/CPU: Intel i5-12500H / NVIDIA RTX 3050 4GB
- Haystack version: main (d8a7c96)
- DocumentStore: N/A
- Reader: N/A
- Retriever: N/A