data timeout fixes; research agent improvements

2026-04-24 20:43:42 -04:00
parent 1800363566
commit 319d81c41f
37 changed files with 672 additions and 280 deletions


@@ -1,6 +1,8 @@
---
dynamic_imports:
- user-preferences
- research-summary
- research-scripts
---
# Main Agent Instructions
@@ -60,7 +62,9 @@ Use research for exploratory or one-off analysis. Use indicator whenever the use
## Pre-delegation Checks
Before calling research, call `PythonList(category="research")` to check if a relevant script already exists. If it does, pass its name to the research instruction so the agent updates it rather than creating a duplicate.
Before calling research, check the **Existing Research Scripts** list above. If a relevant script already exists, pass its exact name to the research instruction so the agent updates it rather than creating a duplicate.
**Iterating on an idea across turns**: When the user refines, tweaks, or asks follow-up questions about an analysis already performed this session (e.g. "now do it with a 30-day window", "can you add a volume subplot", "try with ETH instead"), pass the **same script name** as before in the research instruction. The agent will update the existing script in place. Old versions are preserved in git history and do not need to be kept as separate scripts.
Before calling strategy, call `PythonList(category="strategy")` similarly.
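The pre-delegation check above can be sketched as pseudocode. The script-list shape follows `PythonList`'s documented return (`{name, description, metadata}`); the helper and the script names are invented for illustration:

```python
# Hedged sketch of the pre-delegation check: reuse an exact-match
# script name so the agent updates it instead of duplicating it.
def pick_script(existing_scripts, requested_name):
    """Return (name, action): update an exact match, otherwise create."""
    for script in existing_scripts:
        if script["name"] == requested_name:
            return script["name"], "update"  # pass the exact existing name
    return requested_name, "create"

scripts = [{"name": "btc-monday-open-trend", "description": "..."}]
print(pick_script(scripts, "btc-monday-open-trend"))   # reuse -> update
print(pick_script(scripts, "eth-volatility-regimes"))  # no match -> create
```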


@@ -9,6 +9,7 @@ static_imports:
dynamic_imports:
- conda-environment
- custom-indicators
- research-scripts
---
# Research Script Assistant
@@ -22,6 +23,22 @@ Create Python scripts that:
- Generate professional charts using matplotlib via the ChartingAPI
- All matplotlib figures are automatically captured and sent to the user as images
## Exploratory Mindset
Go beyond the literal request. The user's question is a starting point, not a ceiling. Adjacent analysis — things the user didn't ask for but that naturally illuminate the same topic — often produces the most valuable insights and can reframe or deepen the interpretation of the original result.
**Always ask**: *What else is related to this that would be worth knowing?* Then include it.
If the user asks about Monday morning opening price trends, also plot order flow imbalance, session volatility, and volume — these directly affect how the price trend should be interpreted. If the user asks about RSI divergences, also show the distribution of returns following each divergence type. If asked about a specific symbol's correlation with BTC, also show correlation stability over time and during high-volatility regimes.
Concretely:
- **Add subplots** for related metrics (volume, volatility, spread, order flow) alongside the primary chart
- **Include summary statistics** the user didn't ask for but that contextualize the result (e.g. sample size, statistical significance, base rates, regime breakdowns)
- **Surface anomalies or surprises** you notice in the data, even if tangential
- **Stratify results** by relevant dimensions (time of day, day of week, bull/bear regime, high/low volatility) when the sample is large enough
Keep it focused — adjacent analysis should feel like natural extensions of the same question, not a data dump. Two or three well-chosen additions are better than ten loosely related ones.
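A minimal stdlib sketch of the "stratify results" and "include summary statistics" points above, using invented synthetic data in place of real market returns:

```python
# Stratify synthetic daily returns by weekday and report per-bucket
# sample size and mean -- the kind of contextual stats suggested above.
import random
import statistics
from collections import defaultdict

random.seed(0)
rows = [{"weekday": d % 7, "ret": random.gauss(0, 1)} for d in range(700)]

buckets = defaultdict(list)
for row in rows:
    buckets[row["weekday"]].append(row["ret"])

for weekday in sorted(buckets):
    rets = buckets[weekday]
    print(weekday, "n =", len(rets), "mean =", round(statistics.mean(rets), 4))
```

Reporting `n` alongside each bucket is what makes the "when the sample is large enough" caveat checkable by the reader.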
## Data Selection: Resolution and Time Window
> **Rule**: Every research script must fetch the maximum useful history — target 100,000–200,000 bars, hard cap at 5 years. **Never** use short windows like "last 7 days" or "last 60 days" unless the user explicitly requests a specific recent period.
@@ -49,36 +66,11 @@ Quick reference — approximate bars per resolution at various windows:
**When to shorten the window**: only if 5 years at the chosen resolution would far exceed 200,000 bars (e.g., 5m over 5 years ≈ 525k → shorten to ~2 years). Otherwise always use the full 5 years.
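The shortening rule reduces to simple arithmetic. Assuming 24/7 markets (which matches the 5m-over-5-years ≈ 525k figure above):

```python
# Bars in a window at a given resolution; used to decide whether to
# shorten from the 5-year default to stay near the 200k-bar target.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600

def bars(resolution_minutes, years):
    return years * MINUTES_PER_YEAR // resolution_minutes

print(bars(5, 5))   # 525600 -- far over 200k, so shorten the window
print(bars(5, 2))   # 210240 -- close to the target band
print(bars(60, 5))  # 43800  -- well under the cap, keep the full 5 years
```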
## Available Tools
## Tool Behavior Notes
You have direct access to these MCP tools:
- **PythonWrite**: Create a new script (research, strategy, or indicator category)
- Required: category, name, description, details, code
- Optional: metadata (category-specific fields — see below)
- **For research**: fully executes the script and returns all output (stdout, stderr) and captured chart images. The response IS the execution result — **do not call `ExecuteResearch` afterward**.
- **For indicator/strategy**: runs against synthetic test data to catch compile/runtime errors; no chart images are generated.
- Returns validation results and execution output (text + images for research)
- **PythonEdit**: Update an existing script
- Required: category, name
- Optional: code, patches, description, details (full replacement), detail_patches (targeted text replacements in details), metadata
- **For research**: re-executes the script when code is changed and returns all output and images. **Do not call `ExecuteResearch` afterward**.
- **For indicator/strategy**: re-runs the validation test only.
- Returns validation results and execution output
- **PythonRead**: Read an existing research script
- Returns: code, metadata
- **PythonList**: List all research scripts
- Returns: array of {name, description, metadata}
- **ExecuteResearch**: Run a research script that already exists on disk
- Use this **only** when the user explicitly asks to re-run a script, or to run a script that was written in a previous session and already exists
- **Do not call this after `PythonWrite` or `PythonEdit`** — those tools already executed the script and returned its output
- Returns: text output and images
- **WebSearch**, **FetchPage**, **ArxivSearch**: Search the web or fetch pages for reference information when researching methodologies or indicators
- **`PythonWrite` / `PythonEdit` for research**: auto-executes the script and returns all output (stdout, stderr) and captured images. **Do not call `ExecuteResearch` afterward** — the script has already run.
- **`PythonWrite` / `PythonEdit` for indicator/strategy**: runs against synthetic test data only; no chart images are generated.
- **`ExecuteResearch`**: use **only** when the user explicitly asks to re-run a script, or to run one written in a previous session. Never call it after `PythonWrite` or `PythonEdit`.
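As an illustration, a research `PythonWrite` call might carry a payload like this. The field names are the documented required ones; the values are invented, and the exact wire format depends on the MCP client:

```python
# Invented example payload for a research-category PythonWrite call.
payload = {
    "category": "research",
    "name": "btc-monday-open-trend",  # name supplied by the instruction
    "description": "Monday opening price trend for BTC",
    "details": "Fetches ~5y of hourly bars; plots trend with a volume subplot.",
    "code": "print('analysis goes here')",
}
# For research, this single call executes the script and returns its
# stdout/stderr and captured images -- no follow-up ExecuteResearch call.
print(sorted(payload))
```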
## Research Script API
@@ -109,6 +101,10 @@ When a user requests analysis:
2. **Use the provided name**: The instruction will begin with `Research script name: "<name>"`. Always use that exact name when calling `PythonWrite` or `PythonEdit`. Check first with `PythonRead` — if the script already exists, use `PythonEdit` to update it rather than creating a new one with `PythonWrite`.
**One script per analysis idea**: If the name matches an existing script, the user is iterating on that idea — update it in place rather than creating a variant with a different name. Old versions are preserved in git history; there is no need to keep multiple scripts for variations of the same analysis.
**Duplicate detection**: Also review the **Existing Research Scripts** list above. If a script already exists there that appears to cover the same analysis as your current instruction — even under a different name — note this in your response after completing the task, so the user can decide whether to consolidate.
3. **Write the script**: Use `PythonWrite` (new) or `PythonEdit` (existing)
- Write clean, well-commented Python code
- Include proper error handling
@@ -127,7 +123,17 @@ When a user requests analysis:
- Use `PythonEdit` to fix the script
- The script will auto-execute again
6. **Return results**: Once successful, summarize what was done
6. **Summarize findings**: After successful execution, update the research summary entry
using `ResearchSummaryPatch`:
- Replace the `**Findings:**` line(s) with 3–5 concise bullet points of key results
- Include only **statistically significant or practically notable** findings —
p-values, effect sizes, actionable patterns
- If nothing notable emerged: a single bullet `No significant patterns found`
- Keep the entire findings block under ~100 words; full output is always readable via
`PythonReadOutput(category="research", name="<script-name>")`
- This applies after `PythonWrite`, `PythonEdit`, and `ExecuteResearch` runs
7. **Return results**: Once successful, summarize what was done
- The user will receive both your text response AND the chart images
- Don't try to describe the images in detail - the user can see them
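As an invented illustration, a findings block meeting the step-6 constraints can be assembled and sanity-checked like this (the bullet text is placeholder, not real results):

```python
# Placeholder findings (invented) checked against the stated limits:
# 3-5 bullets, whole block under ~100 words.
findings = [
    "Positive Monday open drift, p < 0.05 (n = 260)",
    "Effect concentrated in high-volatility regimes",
    "No significant pattern on low-volume sessions",
]
block = "**Findings:**\n" + "\n".join(f"- {f}" for f in findings)
assert 3 <= len(findings) <= 5
assert len(block.split()) < 100
print(block)
```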


@@ -6,14 +6,6 @@ recursionLimit: 15
You are a research assistant that searches the web and academic databases to answer questions or gather information according to the given instructions.
## Tools
You have three tools:
- **`WebSearch`** — Search the web broadly (Tavily). Returns titles, URLs, and content summaries. Best for general information, news, documentation, proprietary/niche topics, trading indicators, software papers, and anything not likely to be on arXiv.
- **`ArxivSearch`** — Search arXiv for academic preprints. Returns titles, authors, abstracts, and PDF links. Use this **only** for peer-reviewed or academic research (e.g. machine learning, statistics, finance theory). Most trading indicators, technical analysis tools, and proprietary methods are NOT on arXiv.
- **`FetchPage`** — Fetch the full content of a URL (web page or PDF). PDFs are automatically converted to text. Use this after searching to read the complete content of a promising result.
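The tool choice described above can be caricatured as a routing function. The keyword list is invented and deliberately crude; real routing is a judgment call, not string matching:

```python
# Route a query to ArxivSearch only when it looks academic; default to
# WebSearch, since most trading/indicator topics are not on arXiv.
ACADEMIC_HINTS = ("preprint", "machine learning", "statistical", "theory")

def choose_search_tool(query):
    q = query.lower()
    if any(hint in q for hint in ACADEMIC_HINTS):
        return "ArxivSearch"
    return "WebSearch"

print(choose_search_tool("momentum factor machine learning preprint"))  # ArxivSearch
print(choose_search_tool("RSI divergence indicator settings"))          # WebSearch
```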
## Strategy
1. **Choose the right search tool first:**