
We are super excited to release an update on the DeepSearch API! This release was shaped by the conversations we've had with you: the feedback, questions, and friction you’ve shared while building everything from quick internal tools to full-blown agentic & search systems. Thank you.
Across use cases, the pain was familiar. The system worked, but the interface and content weren’t doing enough of the heavy lifting. You had to work around results that weren’t quite ready to plug into your workflows: citation fields that didn’t go deep enough, references that weren’t always structured cleanly, formatting that fell apart across modalities. The content was there, but not always in a form that played well with how you were actually using it: in apps, chains, agents, and UIs. You weren’t building search boxes; you were building systems that needed to reason, cite, and compose. So we fixed that, while also expanding coverage in areas like biomedical research and financial content.
This update doesn’t reset anything. It builds directly on what already worked: deep search across proprietary full-text content, web, and financial data. What’s new is better coverage, more modalities (including images and figures), smarter reranking, and a cleaner experience for agent workflows and tool-calling.
Whether you're hitting us from a simple prompt or stitching us into multi-step chains, this is about closing the gap between what you retrieve and what your system can actually use. Here’s what’s new and why it matters.
New Content You Can Retrieve
1. Wiley Academic
You can now access full-text journals and textbooks from Wiley across Business, Finance, and Accounting.
Each result includes:
- Full body text, structured by section
- Lists of authors and affiliations
- Inline citation strings (for attribution in generated answers)
- Structured references (for graph building, citation following)
Example: A student builds a study assistant that retrieves textbook content and generates summaries, lesson plans and study guides. The assistant uses citation strings to footnote its answers and links to source sections for follow-up reading.
2. PubMed (2022–Present)
Multimodal biomedical articles: full text, images, captions, references, and metadata.
Use cases we’ve seen:
- A team working on a clinical research summariser that uses our API to retrieve only recent studies (post-2022), extracts findings sections, and links out references for traceability.
- An LLM-powered medical search assistant that filters by author and publication date, and includes figures in the results.
3. arXiv (Complete)
The full arXiv archive structured for retrieval.
Includes:
- Full paper text
- Equations and math blocks
- Structured citations and reference lists
- Metadata for downstream processing
Example: A developer building a research agent for ML papers can retrieve the full text, parse out formulas, and follow references without scraping, cleaning, or guessing.
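Following references can be sketched as a breadth-first walk over the structured reference lists. This is a minimal sketch under stated assumptions, not part of our client: `fetch_paper` is a hypothetical callable you would back with your own search call, and the `references` key holding a list of paper IDs is an assumed result shape used only for illustration.

```python
from collections import deque
from typing import Callable, Optional


def follow_references(
    start_id: str,
    fetch_paper: Callable[[str], Optional[dict]],
    max_depth: int = 2,
) -> list[str]:
    """Breadth-first walk over reference lists, returning paper IDs in visit order.

    fetch_paper is a hypothetical lookup: it takes a paper ID and returns a
    dict with a "references" list of IDs, or None if the paper is unavailable.
    """
    visited = {start_id}
    order = []
    queue = deque([(start_id, 0)])
    while queue:
        paper_id, depth = queue.popleft()
        paper = fetch_paper(paper_id)
        if paper is None:
            continue  # skip papers we could not retrieve
        order.append(paper_id)
        if depth == max_depth:
            continue  # do not expand references beyond the depth limit
        for ref_id in paper.get("references", []):
            if ref_id not in visited:
                visited.add(ref_id)
                queue.append((ref_id, depth + 1))
    return order


# Tiny in-memory corpus standing in for real API results (made-up IDs).
corpus = {
    "A": {"references": ["B", "C"]},
    "B": {"references": ["C", "D"]},
    "C": {"references": []},
    "D": {"references": ["A"]},
}
print(follow_references("A", corpus.get, max_depth=1))
```

The visited set keeps citation cycles (here, D citing A) from looping forever, and the depth limit keeps the number of retrievals bounded.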
4. Financial Data
We’ve integrated:
- Real-time and historical pricing across equities, FX, options, and indices
- Editorial and news content from financial sources (min. 180K retrievals/month)
What this enables:
- Copilots that combine pricing lookups with market commentary
- RAG systems that anchor financial recommendations in historical trends and cited sources
- Dashboards or interfaces that pull context for earnings calls, volatility spikes, or macroeconomic events
5. Web Search (Improved)
We’ve upgraded web extraction:
- More complete full-text content
- Support for embedded images
- Faster and more consistent latency
Example: An LLM toolchain retrieves both proprietary content and recent web articles about a public company. The API returns images (e.g. charts), author metadata, and clean paragraphs ready for summarisation.
API v2.0: Cleaner, More Controllable
No breaking changes. But far more structure and control.
What's new:
- citation_string: drop-in ready citations for completions or footnotes
- references: full lists of outbound citations for traceability
- authors: structured lists to support summarisation or entity linking
- start_date / end_date: filter by publication window
- included_sources: explicitly choose what datasets to retrieve from
- is_tool_call: format optimized for tool calls (but optional)
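One way to keep these filters tidy is to collect them in a small helper before calling the API. The sketch below is illustrative only: the parameter names come from the list above, but the ISO date format, the source identifier, and the idea of passing them as keyword arguments to `search` are assumptions to check against the API reference.

```python
from datetime import date
from typing import Optional


def build_search_filters(
    start_date: Optional[date] = None,
    end_date: Optional[date] = None,
    included_sources: Optional[list[str]] = None,
    is_tool_call: bool = False,
) -> dict:
    """Collect the optional filters into keyword arguments for a search call."""
    if start_date and end_date and start_date > end_date:
        raise ValueError("start_date must not be after end_date")
    filters: dict = {"is_tool_call": is_tool_call}
    if start_date:
        filters["start_date"] = start_date.isoformat()  # assumed format: YYYY-MM-DD
    if end_date:
        filters["end_date"] = end_date.isoformat()  # assumed format: YYYY-MM-DD
    if included_sources:
        filters["included_sources"] = list(included_sources)
    return filters


filters = build_search_filters(
    start_date=date(2022, 1, 1),
    included_sources=["example-source"],  # placeholder, not a real dataset ID
)
print(filters)
# These could then be passed along as keyword arguments, e.g.
# valyu.search("your query", **filters), assuming the client accepts them.
```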
Example: A RAG backend retrieves papers using start_date and category filters, injects the citation_string into the prompt, and shows the references as expandable footnotes in a UI.
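The prompt-injection step might look something like this. It is a rough sketch that assumes each result is a dict with `content`, `citation_string`, and `references` keys. The `content` key and the dict shape are assumptions for illustration, and the sample result is made up.

```python
def build_cited_prompt(question: str, results: list[dict]) -> tuple[str, list[str]]:
    """Return (prompt, footnotes) with numbered sources the model can cite."""
    context_blocks = []
    footnotes = []
    for number, result in enumerate(results, start=1):
        content = result.get("content", "")
        citation = result.get("citation_string", "Unknown source")
        context_blocks.append(f"[{number}] {content}\nSource: {citation}")
        # Keep outbound references separately so a UI can expand them on demand.
        for reference in result.get("references", []):
            footnotes.append(f"[{number}] cites: {reference}")
    prompt = (
        "Answer the question using only the numbered sources, "
        "and cite them as [n].\n\n"
        + "\n\n".join(context_blocks)
        + f"\n\nQuestion: {question}"
    )
    return prompt, footnotes


# Made-up result purely for illustration.
sample_results = [
    {
        "content": "Placeholder passage.",
        "citation_string": "Doe, J. (2024). Placeholder Title.",
        "references": ["Placeholder Reference A"],
    }
]
prompt, footnotes = build_cited_prompt("What does the passage say?", sample_results)
print(prompt)
print(footnotes)
```

Numbering the sources in the prompt lets the model cite them as [n], and the same numbers tie each footnote back to the result it came from.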
Get started in 4 lines of code:
```python
from valyu import Valyu

valyu = Valyu(api_key="your-api-key-here")

response = valyu.search(
    "Implementation details of agentic search-enhanced large reasoning models",
    max_num_results=5,  # Limit to top 5 results
    max_price=10  # Maximum price per thousand queries (CPM)
)

print(response)

# Feed the results to your AI agent as you would with other search APIs
```
Ranking That Thinks Like a Researcher
Relevance isn’t enough. We now score documents with additional signals:
- Citation count
- Publisher trust level
- Metadata completeness
This helps LLMs and agents that use retrieval to pick results that aren’t just related, but reliable.
Example: A scientific assistant ranks competing sources for a claim. With our reranker, results from heavily cited review papers rank above less-cited papers.
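To make the idea concrete, here is a toy scoring function that blends relevance with the three signals above. It is not our production reranker: the weights, the log scaling, and the 0-to-1 ranges for trust and completeness are all illustrative assumptions, and the two documents are made up.

```python
import math


def rerank_score(
    relevance: float,
    citation_count: int,
    publisher_trust: float,
    metadata_completeness: float,
) -> float:
    """Blend relevance with quality signals; all weights are illustrative."""
    # Log-scale citations so one very highly cited paper does not swamp
    # everything, then cap at 1.0 (reached at roughly 10,000 citations).
    citation_signal = min(math.log10(1 + citation_count) / 4.0, 1.0)
    return (
        0.6 * relevance
        + 0.2 * citation_signal
        + 0.1 * publisher_trust
        + 0.1 * metadata_completeness
    )


# Made-up candidates: the preprint is slightly more relevant, the review
# is far more cited and comes from a more trusted publisher.
docs = [
    {"title": "Preprint", "relevance": 0.80, "citation_count": 3,
     "publisher_trust": 0.5, "metadata_completeness": 0.6},
    {"title": "Review", "relevance": 0.78, "citation_count": 999,
     "publisher_trust": 0.9, "metadata_completeness": 0.9},
]
ranked = sorted(
    docs,
    key=lambda d: rerank_score(
        d["relevance"],
        d["citation_count"],
        d["publisher_trust"],
        d["metadata_completeness"],
    ),
    reverse=True,
)
print([d["title"] for d in ranked])
```

With these weights, the heavily cited review overtakes the slightly more relevant preprint, which is the behaviour the example above describes.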
Developer Tools
The new Playground:
- Visualises results by modality (text, images, etc.)
- Lets you toggle between raw and structured output
- Supports time filtering, source targeting, and more
We also added Google Auth so teams can get started faster (GitHub coming soon).
What You Can Build With This
This release is designed for teams building awesome things:
For Knowledge Work & Research
- Context-aware tools with inline citations and structured references
- Study assistants that surface summaries, authors, and relevant citations
- Generative video tutorials that need diagrams: 3B1B for any paper (please build this).
For Finance
- Assistants that answer with real-time data and editorial context
- Research workflows that trace historical pricing alongside market commentary
- Event-driven agents that retrieve and explain volatility spikes with cited data
For Advanced Retrieval Systems
- Semantic search engines that return structured context
- Tool-calling chains that can follow references, not just find results
Let us know what you need next.
We Build 🛠️