

DeepSearch v2.0: Tool-Call Friendly, More Proprietary Content, and Easier to Use

>_ Hirsh P


We are super excited to release an update to the DeepSearch API! This release was shaped by the conversations we’ve had with you: the feedback, questions, and friction you’ve shared while building everything from quick internal tools to full-blown agentic and search systems. Thank you.

Across use cases, the pain was familiar. The system worked, but the interface and content weren’t doing enough of the heavy lifting. You had to work around results that weren’t quite ready to plug into your workflows: citation fields that didn’t go deep enough, references that weren’t always structured cleanly, formatting that fell apart across modalities. The content was there, but not always in a form that played well with how you were actually using it: in apps, chains, agents, and UIs. You weren’t building search boxes; you were building systems that needed to reason, cite, and compose. So we fixed that, while also expanding coverage in areas like biomedical research and financial content.

This update doesn’t reset anything; it builds directly on what already worked: deep search across proprietary full-text content, web, and financial data. What’s new is better coverage, more modalities (including images and figures), smarter reranking, and a cleaner experience for agent workflows and tool-calling.

Whether you're hitting us from a simple prompt or stitching us into multi-step chains, this is about closing the gap between what you retrieve and what your system can actually use. Here’s what’s new and why it matters.


New Content You Can Retrieve

1. Wiley Academic

You can now access full-text journals and textbooks from Wiley across Business, Finance, and Accounting.

Each result includes:

  • Full body text, structured by section
  • Lists of authors and affiliations
  • Inline citation strings (for attribution in generated answers)
  • Structured references (for graph building, citation following)

Example: A student builds a study assistant that retrieves textbook content and generates summaries, lesson plans and study guides. The assistant uses citation strings to footnote its answers and links to source sections for follow-up reading.
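
To make the footnoting idea concrete, here is a minimal sketch of how an assistant might turn citation strings into numbered footnotes. It works on plain dictionaries; the `citation_string` key mirrors the field name in this post, but the overall result shape, the `footnote_answer` helper, and the `(sentence, result index)` pairing are assumptions made for illustration.

```python
def footnote_answer(claims, results):
    """Attach numbered footnotes to generated sentences.

    claims  -- list of (sentence, index into results) pairs (hypothetical shape)
    results -- list of dicts, each with a "citation_string" key (assumed shape)
    """
    numbers = {}  # result index -> footnote number, in order of first use
    body = []
    for sentence, idx in claims:
        if idx not in numbers:
            numbers[idx] = len(numbers) + 1
        body.append(f"{sentence} [{numbers[idx]}]")
    # Dicts keep insertion order, so footnotes come out numbered 1, 2, 3, ...
    notes = [
        f"[{n}] {results[idx]['citation_string']}" for idx, n in numbers.items()
    ]
    return " ".join(body) + "\n\n" + "\n".join(notes)
```

A source that is cited twice keeps the same footnote number, so the footnote list stays short even when an answer leans heavily on one textbook section.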

2. PubMed (2022–Present)

Multimodal biomedical articles: full text, images, captions, references, and metadata.

Use cases we’ve seen:

  • A team working on a clinical research summariser that uses our API to retrieve only recent studies (post-2022), extracts findings sections, and links out references for traceability.
  • An LLM-powered medical search assistant that filters by author and publication date and includes figures in the results.

3. arXiv (Complete)

The full arXiv archive structured for retrieval.

Includes:

  • Full paper text
  • Equations and math blocks
  • Structured citations and reference lists
  • Metadata for downstream processing

Example: A developer building a research agent for ML papers can retrieve the full text, parse out formulas, and follow references without scraping, cleaning, or guessing.
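
Following references can be sketched as a breadth-first walk over structured reference lists. The snippet below is a rough illustration, not part of the API: it assumes each record is a dictionary whose `references` entry is a list of paper identifiers, and that `lookup` is a hypothetical callable (for example a cache in front of your search calls) that returns a record or `None`.

```python
from collections import deque


def follow_references(start_id, lookup, max_depth=2):
    """Return paper ids reachable from start_id within max_depth hops.

    lookup -- hypothetical callable: paper id -> record dict or None.
              Each record is assumed to hold a "references" list of ids.
    """
    seen = {start_id}
    order = [start_id]
    queue = deque([(start_id, 0)])
    while queue:
        paper_id, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand papers at the depth limit
        record = lookup(paper_id)
        if record is None:
            continue  # unknown paper: nothing to follow
        for ref_id in record.get("references", []):
            if ref_id not in seen:  # guards against citation cycles
                seen.add(ref_id)
                order.append(ref_id)
                queue.append((ref_id, depth + 1))
    return order
```

The depth limit matters in practice: citation graphs fan out quickly, and an agent usually only needs one or two hops to find the foundational papers behind a claim.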

4. Financial Data

We’ve integrated:

  • Real-time and historical pricing across equities, FX, options, and indices
  • Editorial and news content from financial sources (min. 180K retrievals/month)

What this enables:

  • Copilots that combine pricing lookups with market commentary
  • RAG systems that anchor financial recommendations in historical trends and cited sources
  • Dashboards or interfaces that pull context for earnings calls, volatility spikes, or macroeconomic events

5. Web Search (Improved)

We’ve upgraded web extraction:

  • More complete full-text content
  • Support for embedded images
  • Faster and more consistent latency

Example: An LLM toolchain retrieves both proprietary content and recent web articles about a public company. The API returns images (e.g. charts), author metadata, and clean paragraphs ready for summarisation.


API v2.0: Cleaner, More Controllable

No breaking changes. But far more structure and control.

What's new:

  • citation_string: drop-in ready citations for completions or footnotes
  • references: full lists of outbound citations for traceability
  • authors: structured lists to support summarisation or entity linking
  • start_date / end_date: filter by publication window
  • included_sources: explicitly choose what datasets to retrieve from
  • is_tool_call: format optimized for tool calls (but optional)

Example: A RAG backend retrieves papers using start_date and category filters, injects the citation_string into the prompt, and shows the references as expandable footnotes in a UI.
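
As a rough sketch of that flow, the helpers below assemble the filter arguments and build a prompt that carries each citation string. The parameter names `start_date`, `end_date`, and `included_sources` come from the list above; the ISO `YYYY-MM-DD` date format, the source identifier used in the comment, the `content` key, and both helper functions are assumptions for illustration and should be checked against the API reference.

```python
from datetime import date


def build_search_kwargs(start, end, sources):
    """Assemble filter arguments for a search call (assumed ISO date strings)."""
    if start > end:
        raise ValueError("start_date must not be after end_date")
    return {
        "start_date": start.isoformat(),
        "end_date": end.isoformat(),
        "included_sources": list(sources),
    }


def build_prompt(question, results):
    """Number each result and attach its citation string as the source line.

    results -- list of dicts with "content" and "citation_string" keys
               (an assumed shape, not the documented response format).
    """
    blocks = [
        f"[{i}] {r['content']}\nSource: {r['citation_string']}"
        for i, r in enumerate(results, start=1)
    ]
    context = "\n\n".join(blocks)
    return (
        "Answer using only the numbered sources below and cite them by number.\n\n"
        f"{context}\n\nQuestion: {question}"
    )


# Hypothetical usage; "example-source-id" is a placeholder, not a real dataset name.
# kwargs = build_search_kwargs(date(2022, 1, 1), date(2024, 6, 30), ["example-source-id"])
```

Keeping the citation string next to each passage means the model can cite by number while the UI maps that number back to the full reference list.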

Get started in 4 lines of code:

from valyu import Valyu

valyu = Valyu(api_key="your-api-key-here")

response = valyu.search(
    "Implementation details of agentic search-enhanced large reasoning models",
    max_num_results=5,  # Limit to top 5 results
    max_price=10,  # Maximum price per thousand queries (CPM)
)

print(response)

# Feed the results to your AI agent as you would with other search APIs
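
For agent workflows, one way to expose search as a tool is sketched below. The tool definition follows the JSON-Schema style that many function-calling interfaces use; the tool name `deep_search`, the `handle_tool_call` helper, and the idea of passing `is_tool_call=True` as a keyword argument are assumptions for illustration. The search function is injected, so the sketch runs without network access.

```python
import json

# Hypothetical tool definition in a JSON-Schema style.
SEARCH_TOOL = {
    "name": "deep_search",
    "description": "Search proprietary, web, and financial content.",
    "parameters": {
        "type": "object",
        "properties": {
            "query": {"type": "string"},
            "max_num_results": {"type": "integer", "minimum": 1},
        },
        "required": ["query"],
    },
}


def handle_tool_call(arguments_json, search_fn, default_limit=5):
    """Decode the model's tool arguments and forward them to search_fn.

    search_fn -- any callable taking (query, **kwargs); assumed to accept
                 max_num_results and is_tool_call keyword arguments.
    """
    args = json.loads(arguments_json)
    query = args["query"]  # required by the schema above
    limit = int(args.get("max_num_results", default_limit))
    return search_fn(query, max_num_results=limit, is_tool_call=True)
```

With the SDK client from the snippet above, `search_fn` would presumably be `valyu.search`; injecting it also makes the handler easy to test with a stub.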


Ranking That Thinks Like a Researcher

Relevance isn’t enough. We now score documents with additional signals:

  • Citation count
  • Publisher trust level
  • Metadata completeness

This helps LLMs and agents that use retrieval to pick results that are not just related but reliable.

Example: A scientific assistant ranks competing sources for a claim. With our reranker, results from heavily cited review papers rank above less-cited papers.
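
This post does not publish the scoring formula, so the snippet below is only a toy illustration of how relevance could be blended with the three signals listed above. The weights, the log scaling of citation counts, the cap of 1000 citations, and every field name are assumptions, not the production reranker.

```python
import math


def rerank(docs, w_rel=0.6, w_cite=0.2, w_trust=0.1, w_meta=0.1, cite_cap=1000):
    """Sort docs by a weighted blend of relevance and reliability signals.

    Each doc is assumed to carry "relevance", "publisher_trust", and
    "metadata_completeness" in [0, 1], plus an integer "citation_count".
    """

    def score(doc):
        # Log scaling stops a handful of mega-cited papers from dominating.
        cite = min(math.log1p(doc["citation_count"]) / math.log1p(cite_cap), 1.0)
        return (
            w_rel * doc["relevance"]
            + w_cite * cite
            + w_trust * doc["publisher_trust"]
            + w_meta * doc["metadata_completeness"]
        )

    return sorted(docs, key=score, reverse=True)
```

With equal relevance, a heavily cited review from a trusted publisher outranks a sparsely cited paper, but relevance still carries most of the weight, so an off-topic classic does not crowd out an on-topic result.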


Developer Tools

The new Playground:

  • Visualises results by modality (text, images, etc.)
  • Lets you toggle between raw and structured output
  • Supports time filtering, source targeting, and more

We also added Google Auth so teams can get started faster (GitHub coming soon).


What You Can Build With This

This release is designed for teams building awesome things:

For Knowledge Work & Research

  • Context-aware tools with inline citations and structured references
  • Study assistants that surface summaries, authors, and relevant citations
  • Generative video tutorials that need diagrams: 3B1B for any paper (please build this).

For Finance

  • Assistants that answer with real-time data and editorial context
  • Research workflows that trace historical pricing alongside market commentary
  • Event-driven agents that retrieve and explain volatility spikes with cited data

For Advanced Retrieval Systems

  • Semantic search engines that return structured context
  • Tool-calling chains that can follow references, not just find results

Let us know what you need next.

We Build 🛠️