
Our June 2025 release of the Valyu DeepSearch API brings three highly requested improvements designed to remove friction from your search and retrieval workflows. You’ll now spend less time tweaking parameters and more time getting the exact context you need.
- Understands Your Intent
We’ve supercharged our Named Entity Recognition model so it automatically picks out author names, paper/article titles, conferences and years. It is entirely intent-driven: no special flags, no extra parameters. DeepSearch “gets” what you mean.
- Search Inside Any Open-Access Article
Fetch, parse and search within an article, all in one call. Just hand DeepSearch a URL to an open-access PDF and ask your question. It works seamlessly with arXiv, PubMed Central and similar sources.
- Rich Images from Web Results
When your query returns web pages, DeepSearch now crawls and serves any embedded images alongside the text. Ideal for grabbing figures, diagrams or photos without extra work.
Smarter Search That Gets Your Intent
DeepSearch’s improved NER engine tags entities such as author names, paper titles, conference names and dates. We feed those signals into our ranking algorithm so your results automatically align with your intent.
For example, if you ask for:
from valyu import Valyu

valyu = Valyu(api_key="YOUR_KEY")

response = valyu.search("Andrew Ng's 2012 ICML paper on deep learning")

print(response[0].title)
# ➜ Deep Learning (ICML 2012)
DeepSearch spots “Andrew Ng”, “2012”, and “ICML” and elevates the exact paper you were looking for. As you can see, no extra parameters are needed.
Search Inside Any Open-Access Article
Found an article and want your agent or app to dive deeper into it? Just pass the open-access article’s URL to DeepSearch and ask your follow-up question.
# Step 1: Broad query
survey = valyu.search(
    "Transformer RLHF survey recent advances",
    max_num_results=3,
    is_tool_call=True,
)

paper = survey[0].url

# Step 2: Ask a specific question
conclusion = valyu.search(
    "Return the conclusion section",
    included_sources=[paper],
    is_tool_call=True,
)

print(conclusion[0].content[:200])
It works with open-access sources like arXiv and PubMed Central. There’s no scraping or downloading involved. Just clean, agent-ready search.
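A PubMed Central article can be queried the same way. Here’s a minimal sketch; the PMC URL below is a placeholder you’d swap for a real open-access article:
# Ask a question directly against an open-access PubMed Central article
# (placeholder URL - substitute a real PMC article link)
methods = valyu.search(
    "Summarise the methods section",
    included_sources=["https://pmc.ncbi.nlm.nih.gov/articles/PMC0000000/"],
    is_tool_call=True,
)

print(methods[0].content[:200])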
Quick demo:
# Search over an arXiv paper
valyu.search(
    "compute dataset size parameters scaling law diagram",
    included_sources=["https://arxiv.org/abs/2001.08361"],
    is_tool_call=True,
)
Response:
[
  {
    ...
    "query": "compute dataset size parameters scaling law diagram",
    "results": [
      {
        "id": "47586890-c304-45a0-90f5-ab1e2d21538a:2001.08361:4",
        "title": "Scaling Laws for Neural Language Models",
        "url": "https://arxiv.org/abs/2001.08361?utm_source=valyu.network&utm_medium=referral&utm_campaign=ai_pilot&utm_content=ai_query",
        "content": "Content: #### 3.3 Performance with Dataset Size and Compute\n\nWe display empirical trends for the test loss as a function of dataset size D (in tokens) and training compute C in Figure 1 \n\nFor the trend with D we trained a model with (nlayer, nembd) = (36, 1280) on fixed subsets of the WebText2 dataset. We stopped training once the test loss ceased to decrease. We see that the resulting test losses can be fit with simple power-law\n\n$$L(D)\\approx\\left(\\frac{D_{c}}{D}\\right)^{\\alpha_{D}}\\tag{3.2}$$\n\nin the dataset size. The data and fit appear in Figure 1 \n\nThe total amount of non-embedding compute used during training can be estimated as C = 6NBS, where B is the batch size, S is the number of parameter updates, and the factor of 6 accounts for the forward and backward passes. Thus for a given value of C we can scan over all models with various N to find the model\n\nwith the best performance on step S = C 6BS . Note that in these results *the batch size* B *remains fixed for all models*, which means that these empirical results are not truly optimal. We will account for this in later sections using an adjusted Cmin to produce cleaner trends.\n\nThe result appears as the heavy black line on the left-hand plot in Figure 1 It can be fit with\n\n$$L(C)\\approx\\left(\\frac{C_{c}}{C}\\right)^{\\alpha_{C}}\\tag{3.3}$$\n\nThe figure also includes images of individual learning curves to clarify when individual models are optimal. We will study the optimal allocation of compute more closely later on. The data strongly suggests that sample efficiency improves with model size, and we also illustrate this directly in Figure 19 in the appendix.\n\n\n\n## 4 Charting the Infinite Data Limit and Overfitting\n\nIn Section 3 we found a number of basic scaling laws for language modeling performance. Here we will study the performance of a model of size N trained on a dataset with D tokens while varying N and D simultaneously. We will empirically demonstrate that the optimally trained test loss accords with the scaling law of Equation (1.5). This provides guidance on how much data we would need to train models of increasing size while keeping overfitting under control.\n\n\n\n#### 4.1 Proposed L(N, D) Equation\n\nWe have chosen the parameterization (1.5) (repeated here for convenience):\n\n$$L(N,D)=\\left[\\left(\\frac{N_{c}}{N}\\right)^{\\frac{\\alpha_{N}}{D}}+\\frac{D_{c}}{D}\\right]^{\\alpha_{D}}\\tag{4.1}$$\n\nusing three principles:\n\n- 1. Changes in vocabulary size or tokenization are expected to rescale the loss by an overall factor. The parameterization of L(N, D) (and all models of the loss) must naturally allow for such a rescaling.\n- 2. Fixing D and sending N → ∞, the overall loss should approach L(D). Conversely, fixing N and sending D → ∞ the loss must approach L(N).\n- 3. L(N, D) should be analytic at D = ∞, so that it has a series expansion in 1/D with integer powers. Theoretical support for this principle is significantly weaker than for the first two.\n\nOur choice of L(N, D) satisfies the first requirement because we can rescale Nc, Dc with changes in the vocabulary. This also implies that the values of Nc, Dc have no fundamental meaning.\n\nSince we stop training early when the test loss ceases to improve and optimize all models in the same way, we expect that larger models should always perform better than smaller models. But with fixed finite D, we also do not expect any model to be capable of approaching the best possible loss (ie the entropy of text). Similarly, a model with fixed size will be capacity-limited. These considerations motivate our second principle. Note that knowledge of L(N) at infinite D and L(D) at infinite N fully determines all the parameters in L(N, D).\n\nThe third principle is more speculative. There is a simple and general reason one might expect overfitting to scale ∝ 1/D at very large D. Overfitting should be related to the variance or the signal-to-noise ratio of the dataset [AS17], and this scales as 1/D. This expectation should hold for any smooth loss function, since we expect to be able to expand the loss about the D → ∞ limit. However, this argument assumes that 1/D corrections dominate over other sources of variance, such as the finite batch size and other limits on the efficacy of optimization. Without empirical confirmation, we would not be very confident of its applicability.\n\nOur third principle explains the asymmetry between the roles of N and D in Equation (1.5). Very similar symmetric expressions4 are possible, but they would not have a 1/D expansion with integer powers, and would require the introduction of an additional parameter.\n\nIn any case, we will see that our equation for L(N, D) fits the data well, which is the most important justification for our L(N, D) ansatz.\n<F>\n**Figure 1**: Figure 1\n<img>_page_3_Figure_1.jpeg</img>\nFigure 1: Smooth power laws: Performance has a power-law relationship with each of the three scale factors N, D, C when not bottlenecked by the other two, with trends spanning more than six orders of magnitude (see Figure 1 ). We observe no signs of deviation from these trends on the upper end, though performance must flatten out eventually before reaching zero loss. (Section 3)\n</F>\n",
        "source": "valyu/valyu-arxiv",
        "length": 801,
        "image_url": {
          "_page_3_Figure_1.jpeg": "SEE BELOW"
        },
        "publication_date": "2020-01-01",
        "doi": "https://doi.org/10.48550/arxiv.2001.08361",
        "citation": "Jared Kaplan et al. (2020). Scaling Laws for Neural Language Models.https://doi.org/10.48550/arxiv.2001.08361",
        "citation_count": 1162,
        "authors": [
          "Jared Kaplan",
          "Sam McCandlish",
          "Tom Henighan",
          "T. B. Brown",
          "Benjamin Chess",
          "Rewon Child",
          "Scott Gray",
          "Alec Radford",
          "Jeffrey Wu",
          "Dario Amodei"
        ],
        "references": "...",
        "price": 0.0005,
        "data_type": "unstructured",
        "source_type": "paper",
        "relevance_score": 0.97
      }
    ]
  }
]
The image returned:
That’s a single call to pull out the relevant section of “Scaling Laws for Neural Language Models”, together with its scaling-law figure.
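To grab that figure programmatically, each result carries an image_url field mapping embedded figure names to the image returned alongside the text. A minimal sketch against the JSON above, assuming payload is the already-parsed response:
# payload: the parsed JSON response shown above (a list with one query object)
result = payload[0]["results"][0]

# image_url maps figure names (e.g. "_page_3_Figure_1.jpeg") to the returned image
for name, image in result["image_url"].items():
    print(name, "->", image)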
Rich Images for Web Search Results
Now, when your search taps into web pages, DeepSearch will extract any embedded images and include their URLs in the result payload. Perfect for grabbing infographics, diagrams or photos.
1{2 "success": true,3 "error": "",4 "tx_id": "tx_db349fe1-0dba-4ccf-9602-42c76936b013",5 "query": "national geographic 2018 best photos",6 "results": [7 {8 "title": "National Geographic's Best Pictures of 2018",9 "url": "https://www.nationalgeographic.com/photography/article/best-pictures-2018?utm_source=valyu.network&utm_medium=referral&utm_campaign=ai_pilot&utm_content=ai_query",10 "content": "# Best photos of 2018\n\nNational Geographic's 100 best images of the year—curated from 107 photographers, 119 stories, and more than two million photographs....",11 "description": "National Geographic's 100 best images of the year—curated from 107 photographers, 119 stories, and more than two million photographs.",12 "source": "web",13 "price": 0.0015,14 "length": 24996,15 "data_type": "unstructured",16 "source_type": "website",17 "id": "https://www.nationalgeographic.com/photography/article/best-pictures-2018",18 "image_url": {19 "0": "https://i.natgeofe.com/n/9bc51240-32d4-478c-b20a-2e836a0832a6/last-ice-climate-change-ocean-sea-animals-3_16x9.jpg?w=1200",20 "1": "https://i.natgeofe.com/n/9bc51240-32d4-478c-b20a-2e836a0832a6/last-ice-climate-change-ocean-sea-animals-3.jpg",21 "2": "https://i.natgeofe.com/n/b8a3be63-5693-4dfc-8bd3-4699c632f6bc/poisoning-africa-herders.jpg",22 "3": "https://i.natgeofe.com/n/8c1128d0-52e5-43db-8caa-60a3c4682af4/shark-frenzy-breeding-groupers-french-polynesia-channel-17.jpg",23 "4": "https://i.natgeofe.com/n/b7b28736-5c05-4f05-ae34-ea7ab6fdd6ed/muslims-in-america-children-bounce-house-eid-celebration-lynsey-addario.jpg",24 "5": "https://i.natgeofe.com/n/8e49ca1c-c97d-4620-bc09-c1d9c18d5094/jellyfish-diversity-species-17.jpg",25 "6": "https://i.natgeofe.com/n/cb6f2d42-467e-48c6-bc11-21054bf4e227/latino-power-palmer-society-graduation.jpg",26 "7": "https://i.natgeofe.com/n/70e9140f-9f0a-49c8-b204-cd561da303e9/butterfly-trade-catchers-insects-6.jpg",27 "8": "https://i.natgeofe.com/n/a32d0198-e98f-47a7-a336-ff1a6bb29f17/plastic-waste-single-use-worldwide-consumption-9.jpg",28 "9": "https://i.natgeofe.com/n/0e380c1e-b15b-4938-b7fc-bb9f1789d737/hungry-tiger-china-food-industry-animals-18.jpg"29 },30 "relevance_score": 0.934821706216866731 },
You get both the text and any visuals in one go, so your agents can assemble richer, more context-aware responses.
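Because the web payload carries plain image URLs, saving the figures locally takes only a few lines. A minimal sketch using the requests library, assuming the response above has been parsed into data:
import requests

# data: the parsed JSON response shown above
for item in data["results"]:
    for key, url in item["image_url"].items():
        img = requests.get(url, timeout=30)
        # e.g. image_0.jpg, image_1.jpg, ...
        with open(f"image_{key}.jpg", "wb") as f:
            f.write(img.content)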
How to use these features?
Our design philosophy for the DeepSearch API has always been to make it agent-native. As such, we’ve made it super simple for agents in tool-calling workflows to interact with these features and get the context they need. For example, a research agent might:
- Make a high-level query such as “research on agentic search-enhanced large reasoning models”
1{2 "query": "research on agentic search-enhanced large reasoning models",3 "search_type": "proprietary",4 "max_num_results": 105}67# example response item8"id": "e8dfe45e-93d5-4c75-a66f-a258f372c7ab:2501.05366:1",9"title": "Search-o1: Agentic Search-Enhanced Large Reasoning Models",10"url": "https://arxiv.org/abs/2501.05366?utm_source=valyu.network&utm_medium=referral&utm_campaign=ai_pilot&utm_content=ai_query",11"content": "#### 3.2 Overview of the Search-o1 Framework...",
- Using the response returned from the API, the agent can dive deeper into any of the returned papers by passing its arXiv URL in the included_sources field (a Python sketch of the full flow appears after these examples):
1{2 "query": "conclusion of the paper",3 "search_type": "proprietary",4 "included_sources": ["https://arxiv.org/abs/2501.05366"],5 "max_num_results": 36}78# example reponse item9"id": "c5c50f09-9536-47b4-a35f-7727633f0495:2501.05366:6",10"title": "Search-o1: Agentic Search-Enhanced Large Reasoning Models",11"url": "https://arxiv.org/abs/2501.05366?utm_source=valyu.network&utm_medium=referral&utm_campaign=ai_pilot&utm_content=ai_query",12"content": "# 5 Conclusion\n\nIn this work, we present Search-o1, a framework that addresses...",
How to upgrade
There’s nothing you need to change in your code or settings. These enhancements live in our backend and apply automatically for every existing API key.
Why we built it
We know AI agents live and die by the quality of their retrieval. With smarter intent detection, in-paper search and built-in image support, your agents spend less time retrying queries and more time delivering precise, actionable insights.
Give it a try or explore our docs. We’re excited to see what you’ll build with the next generation of DeepSearch. If you want us to help you index a specific source of information, reach out at founders@valyu.network