Retriever¶
FoxNoseRetriever is a LangChain BaseRetriever implementation that queries the FoxNose Flux _search endpoint.
Architecture¶
FoxNoseRetriever
├── _execute_search() → dispatches to the appropriate SDK method
│   ├── client.vector_search()        (vector mode, auto embeddings)
│   ├── client.vector_field_search()  (vector mode, custom embeddings)
│   ├── client.hybrid_search()        (hybrid mode)
│   ├── client.boosted_search()       (vector_boosted mode)
│   └── client.search()               (text mode)
└── map_results_to_documents() → converts results to LangChain Documents
The retriever uses SDK v0.5.0 convenience methods for validated, type-safe search requests.
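The dispatch shown in the tree can be pictured as a small lookup. This is a minimal sketch, not the retriever's actual code: select_sdk_method is a hypothetical helper, the mode strings "hybrid" and "text" are assumed from the tree's annotations, and the rule that custom embeddings switch vector mode to vector_field_search() is read off those same annotations.

```python
def select_sdk_method(search_mode: str, has_custom_vector: bool = False) -> str:
    """Return the SDK method name the tree associates with a search mode.

    Hypothetical helper for illustration only; not part of the retriever's API.
    """
    if search_mode == "vector":
        # Custom embeddings (embeddings= or query_vector=) go through the
        # field-level search; auto embeddings use the plain vector search.
        return "vector_field_search" if has_custom_vector else "vector_search"

    methods = {
        "hybrid": "hybrid_search",
        "vector_boosted": "boosted_search",
        "text": "search",
    }
    if search_mode not in methods:
        raise ValueError(f"unknown search_mode: {search_mode!r}")
    return methods[search_mode]


chosen = select_sdk_method("vector", has_custom_vector=True)
```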
Content Mapping¶
FoxNose returns structured results with _sys (system metadata) and data (your fields). You must tell the retriever which field(s) become page_content.
Single field¶
Multiple fields (concatenated)¶
retriever = FoxNoseRetriever(
    client=client,
    folder_path="articles",
    page_content_fields=["title", "body"],
    page_content_separator="\n\n",  # default
)
Custom mapper¶
For full control, pass a callable that receives the raw result dict:
retriever = FoxNoseRetriever(
    client=client,
    folder_path="articles",
    page_content_mapper=lambda result: (
        f"# {result['data']['title']}\n\n{result['data']['body']}"
    ),
)
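To make the three mapping options concrete, here is a rough, standalone sketch of how a raw result could be turned into page_content. build_page_content is a hypothetical function, not the retriever's implementation; the precedence (mapper first, then multiple fields, then single field) and the choice to raise on a missing field are assumptions made for the example.

```python
from typing import Callable, Optional, Sequence


def build_page_content(
    result: dict,
    page_content_field: Optional[str] = None,
    page_content_fields: Optional[Sequence[str]] = None,
    page_content_separator: str = "\n\n",
    page_content_mapper: Optional[Callable[[dict], str]] = None,
) -> str:
    """Illustrative only: derive page_content from a result's data fields."""
    data = result["data"]
    if page_content_mapper is not None:
        # Custom mapper receives the whole raw result dict.
        return page_content_mapper(result)
    if page_content_fields:
        # Multiple fields are joined with the separator, in the given order.
        return page_content_separator.join(str(data[name]) for name in page_content_fields)
    if page_content_field is not None:
        return str(data[page_content_field])
    raise ValueError("no page_content option was provided")


# Sample result shaped like the _sys / data structure described above.
sample = {
    "_sys": {"key": "a1"},
    "data": {"title": "Reset", "body": "Open settings."},
}

single = build_page_content(sample, page_content_field="body")
joined = build_page_content(sample, page_content_fields=["title", "body"])
mapped = build_page_content(
    sample,
    page_content_mapper=lambda r: f"# {r['data']['title']}\n\n{r['data']['body']}",
)
```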
Metadata¶
By default, metadata includes:
- _sys fields: key, folder, created_at, updated_at
- All data fields except those used for page_content
Whitelist¶
retriever = FoxNoseRetriever(
    ...,
    metadata_fields=["title", "category"],  # only these data fields
)
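The default and whitelist behaviour can be sketched with plain dictionaries. This is an illustration under stated assumptions, not the retriever's code: build_metadata is a hypothetical function, system and data fields are assumed to be merged into one flat dict, and a whitelist is assumed to select exactly the listed data fields.

```python
from typing import Optional, Sequence

# System fields included by default, as listed above.
SYS_KEYS = ("key", "folder", "created_at", "updated_at")


def build_metadata(
    result: dict,
    content_fields: Sequence[str],
    metadata_fields: Optional[Sequence[str]] = None,
) -> dict:
    """Illustrative only: assemble Document metadata from a raw result."""
    sys_meta = result.get("_sys", {})
    metadata = {k: sys_meta[k] for k in SYS_KEYS if k in sys_meta}

    data = result.get("data", {})
    if metadata_fields is not None:
        # Whitelist: keep only the named data fields (assumed semantics).
        metadata.update({k: data[k] for k in metadata_fields if k in data})
    else:
        # Default: every data field that was not used for page_content.
        metadata.update({k: v for k, v in data.items() if k not in content_fields})
    return metadata


sample = {
    "_sys": {
        "key": "a1",
        "folder": "articles",
        "created_at": "2024-01-01",
        "updated_at": "2024-01-02",
    },
    "data": {"title": "Reset", "body": "Open settings.", "category": "faq"},
}

default_meta = build_metadata(sample, content_fields=["body"])
whitelisted = build_metadata(sample, content_fields=["body"], metadata_fields=["category"])
```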
Blacklist¶
Disable system metadata¶
Custom Embeddings (Vector Field Search)¶
When you have your own embedding model or pre-computed vectors, use vector_field together with embeddings or query_vector to search via the SDK's vector_field_search() method.
With a LangChain Embeddings model¶
The retriever converts the query text into a vector at query time:
from langchain_openai import OpenAIEmbeddings

retriever = FoxNoseRetriever(
    client=client,
    folder_path="articles",
    page_content_field="body",
    search_mode="vector",
    embeddings=OpenAIEmbeddings(model="text-embedding-3-small"),
    vector_field="embedding",  # field name in FoxNose
    similarity_threshold=0.75,
)

docs = retriever.invoke("How do I reset my password?")
Warning
The query text is sent to the embedding provider (e.g. OpenAI) on every invocation.
With a static query vector¶
If you already have a vector, pass it directly:
retriever = FoxNoseRetriever(
    client=client,
    folder_path="articles",
    page_content_field="body",
    search_mode="vector",
    query_vector=[0.1, 0.2, ...],  # your pre-computed vector
    vector_field="embedding",
)
With vector-boosted mode¶
Custom embeddings also work in vector_boosted mode. The retriever sends both text and vector, using the vector for similarity boosting:
retriever = FoxNoseRetriever(
    client=client,
    folder_path="articles",
    page_content_field="body",
    search_mode="vector_boosted",
    embeddings=OpenAIEmbeddings(model="text-embedding-3-small"),
    vector_field="embedding",
    vector_boost_config={"boost_factor": 1.5},
)
Validation rules¶
- embeddings and query_vector are mutually exclusive
- vector_field is required when either is set
- vector_field and vector_fields are mutually exclusive (vector_field for custom embeddings, vector_fields for auto-generated)
- Custom embeddings are only supported in vector and vector_boosted modes
- query_vector must be non-empty with finite values (no NaN/Inf)
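The rules above can be expressed as a standalone check. This is a sketch of the rules as written, not the retriever's validator: check_vector_config is a hypothetical function, and returning a list of messages (rather than raising) is a choice made to keep the example easy to inspect.

```python
import math
from typing import List, Optional, Sequence


def check_vector_config(
    search_mode: str,
    embeddings: Optional[object] = None,
    query_vector: Optional[Sequence[float]] = None,
    vector_field: Optional[str] = None,
    vector_fields: Optional[Sequence[str]] = None,
) -> List[str]:
    """Illustrative only: return the list of violated rules (empty if valid)."""
    errors = []
    custom = embeddings is not None or query_vector is not None

    if embeddings is not None and query_vector is not None:
        errors.append("embeddings and query_vector are mutually exclusive")
    if custom and vector_field is None:
        errors.append("vector_field is required with embeddings or query_vector")
    if vector_field is not None and vector_fields is not None:
        errors.append("vector_field and vector_fields are mutually exclusive")
    if custom and search_mode not in ("vector", "vector_boosted"):
        errors.append("custom embeddings need vector or vector_boosted mode")
    if query_vector is not None:
        if len(query_vector) == 0:
            errors.append("query_vector must be non-empty")
        elif not all(math.isfinite(x) for x in query_vector):
            errors.append("query_vector must contain only finite values")
    return errors


ok = check_vector_config("vector", query_vector=[0.1, 0.2], vector_field="embedding")
bad = check_vector_config("vector", query_vector=[0.1, float("nan")], vector_field="embedding")
```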
Sync vs Async¶
Sync (default)¶
from foxnose_sdk.flux import FluxClient
client = FluxClient(...)
retriever = FoxNoseRetriever(client=client, ...)
docs = retriever.invoke("query")
Native async¶
from foxnose_sdk.flux import AsyncFluxClient
async_client = AsyncFluxClient(...)
retriever = FoxNoseRetriever(async_client=async_client, ...)
docs = await retriever.ainvoke("query")
Fallback¶
When only a sync client is provided, ainvoke() falls back to running the sync search in an executor (the default LangChain behaviour).
Both clients¶
You can provide both. Sync calls use client, async calls use async_client:
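The routing rules in this section, including the executor fallback, can be illustrated with stand-in classes. Everything here is hypothetical and uses only the standard library: RoutingSketch is a toy model rather than FoxNoseRetriever, the fake clients and their search() method are placeholders for the SDK clients, and raising when a sync call has no sync client is an assumption.

```python
import asyncio


class RoutingSketch:
    """Toy model of the sync/async client routing; not the real retriever."""

    def __init__(self, client=None, async_client=None):
        if client is None and async_client is None:
            raise ValueError("provide client, async_client, or both")
        self.client = client
        self.async_client = async_client

    def invoke(self, query):
        # Sync calls always use the sync client.
        if self.client is None:
            raise RuntimeError("sync calls need a sync client")
        return self.client.search(query)

    async def ainvoke(self, query):
        # Async calls prefer the async client when one is available.
        if self.async_client is not None:
            return await self.async_client.search(query)
        # Fallback: run the sync search in the default executor.
        loop = asyncio.get_running_loop()
        return await loop.run_in_executor(None, self.invoke, query)


class FakeSyncClient:
    def search(self, query):
        return f"sync:{query}"


class FakeAsyncClient:
    async def search(self, query):
        return f"async:{query}"


both = RoutingSketch(client=FakeSyncClient(), async_client=FakeAsyncClient())
sync_only = RoutingSketch(client=FakeSyncClient())

sync_result = both.invoke("query")
async_result = asyncio.run(both.ainvoke("query"))
fallback_result = asyncio.run(sync_only.ainvoke("query"))
```

With both clients set, the async path never touches the sync client; with only a sync client, the async call still completes but occupies an executor thread for the duration of the search.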