OpenTradex Guide
Hedge funds spend real money assembling the same building blocks this guide walks through: prompting, validation, domain adaptation, retrieval, debate, monitoring, and risk controls. The goal here is not fantasy returns. The goal is to help you build a serious trading-LLM workflow from first signal to production guardrails.
Prompted LLM
55-62%
Draft benchmark band for raw prompting without task-specific tuning.
Fine-tuned
65-72%
Illustrative range for a LoRA-tuned financial classification task.
Fine-tuned + RAG
68-75%
Adds current retrieval context on top of a domain-adapted model.
Reality check
Validate it
Slippage, data leakage, and regime change still matter more than slide-deck numbers.
Six build levels
The baseline is a raw prompted LLM with no optimization. The stronger system adds data hygiene, task adaptation, retrieval, debate, and production controls one layer at a time.
Level 1
Start with a structured analyst prompt, force JSON output, and map one news item into a next-day directional signal.
Working result
You finish with a working analysis function that can classify a headline as bullish, bearish, or neutral.
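Whatever the model returns, treat it as untrusted input. The sketch below is a hypothetical helper (not part of the guide's listing) that normalizes the analyzer's JSON before anything downstream acts on it; the 0-10 confidence scale and the NEUTRAL fallback are assumptions.

```python
# Hypothetical helper: defensively validate the JSON an LLM returns
# before trading on it. Anything malformed collapses to NEUTRAL.
ALLOWED_SIGNALS = {"BULLISH", "BEARISH", "NEUTRAL"}

def validate_signal(payload: dict) -> dict:
    """Normalize one model response; fall back to NEUTRAL on anything odd."""
    signal = str(payload.get("signal", "")).upper()
    if signal not in ALLOWED_SIGNALS:
        signal = "NEUTRAL"
    try:
        confidence = int(payload.get("confidence", 0))
    except (TypeError, ValueError):
        confidence = 0
    confidence = max(0, min(10, confidence))  # clamp to an assumed 0-10 scale
    return {"signal": signal, "confidence": confidence,
            "reasoning": str(payload.get("reasoning", ""))}
```

A gate like this costs nothing and prevents one malformed response from poisoning a backtest.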
Prompt + analyzer
```python
import json
import os

from openai import OpenAI

client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

SYSTEM_PROMPT = """You are a senior equity research analyst.
Analyze financial news and predict the most likely stock move for the NEXT TRADING DAY.
Respond only with JSON: {signal, confidence, reasoning, key_factors}"""

def analyze_news(news_text: str, ticker: str = "") -> dict:
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        temperature=0.1,
        response_format={"type": "json_object"},
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": f"Ticker: {ticker}\nNews: {news_text}"},
        ],
    )
    return json.loads(response.choices[0].message.content)
```
Level 2
Join dated news to next-day returns, convert returns into labeled outcomes, and compute whether the signals would have helped.
Working result
You stop guessing whether the prompt is useful and start measuring a real signal-to-result pipeline.
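Raw accuracy only means something relative to the dumbest strategy available. A quick sketch (my addition, not the guide's code) of the floor any signal has to beat: always predict the most frequent class in the backtest window.

```python
# Hypothetical sanity check: accuracy of a strategy that always
# predicts the most common actual signal in the evaluation window.
import pandas as pd

def majority_baseline_accuracy(actual_signals: pd.Series) -> float:
    """Share of rows a constant majority-class prediction would get right."""
    if actual_signals.empty:
        return 0.0
    # value_counts(normalize=True) sorts descending, so iloc[0] is the top class share.
    return float(actual_signals.value_counts(normalize=True).iloc[0])
```

If the model's accuracy does not clear this number, the prompt is adding noise, not signal.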
Dataset + backtest core
```python
import numpy as np
import pandas as pd
import yfinance as yf

def build_backtest_dataset(ticker: str, news_list: list[dict]) -> pd.DataFrame:
    stock = yf.download(ticker, start="2024-01-01", end="2025-12-31")
    stock["next_day_return"] = stock["Close"].pct_change().shift(-1)
    rows = []
    for item in news_list:
        date = pd.Timestamp(item["date"])
        future = stock.loc[stock.index >= date, "next_day_return"]
        if future.empty:
            continue
        ret = float(future.iloc[0])
        actual = "BULLISH" if ret > 0.005 else ("BEARISH" if ret < -0.005 else "NEUTRAL")
        pred = analyze_news(item["text"], ticker)
        rows.append({"actual_return": ret, "actual_signal": actual, **pred})
    return pd.DataFrame(rows)

def run_backtest(df: pd.DataFrame) -> dict:
    accuracy = (df["signal"] == df["actual_signal"]).mean()
    returns = np.where((df["signal"] == "BULLISH") & (df["confidence"] >= 6), df["actual_return"], 0.0)
    sharpe = (returns.mean() / (returns.std() + 1e-9)) * np.sqrt(252)
    return {"accuracy": float(accuracy), "sharpe": float(sharpe)}
```
Level 3
Build a date-split dataset, then use 4-bit loading plus LoRA adapters to teach a local model your label format and financial framing.
Working result
You get a portable adapter tuned for your signal format, your examples, and your evaluation loop.
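A chronological split is only safe if no training example postdates a validation or test example. The helper below is hypothetical (mine, not the guide's); it assumes you run it on the raw examples before the `_date` field is stripped, and that dates are ISO strings, which compare correctly as plain strings.

```python
# Hypothetical leakage check on date-sorted splits: every train date
# must precede every val date, and every val date every test date.
def assert_chronological(train, val, test, key="_date"):
    assert max(r[key] for r in train) <= min(r[key] for r in val), "train/val date leakage"
    assert max(r[key] for r in val) <= min(r[key] for r in test), "val/test date leakage"
    return True
```

Run it once per rebuild; a silent shuffle anywhere upstream turns the whole evaluation into fiction.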
Chronological dataset builder
```python
import json
from pathlib import Path

import pandas as pd
import yfinance as yf

class TradingDatasetBuilder:
    def __init__(self, output_dir="./dataset"):
        self.output_dir = Path(output_dir)
        self.output_dir.mkdir(exist_ok=True)
        self.examples = []

    def add_news_with_prices(self, news_items: list[dict], ticker: str):
        stock = yf.download(ticker, start="2023-01-01", end="2025-12-31", progress=False)
        stock["ret"] = stock["Close"].pct_change().shift(-1)
        for item in news_items:
            date = pd.Timestamp(item["date"])
            future = stock[stock.index >= date]
            if future.empty:
                continue
            ret = float(future["ret"].iloc[0])
            signal = "BULLISH" if ret > 0.01 else ("BEARISH" if ret < -0.01 else "NEUTRAL")
            self.examples.append({"input": item["text"], "output": json.dumps({"signal": signal}), "_date": item["date"]})

    def save(self):
        # Sort by date before splitting so train/val/test stay chronological.
        self.examples.sort(key=lambda row: row["_date"])
        clean = lambda rows: [{k: v for k, v in row.items() if not k.startswith("_")} for row in rows]
        n = len(self.examples)
        splits = {
            "train": clean(self.examples[: int(n * 0.85)]),
            "val": clean(self.examples[int(n * 0.85) : int(n * 0.95)]),
            "test": clean(self.examples[int(n * 0.95) :]),
        }
        for name, rows in splits.items():
            with open(self.output_dir / f"{name}.json", "w", encoding="utf-8") as handle:
                json.dump(rows, handle, indent=2)
```
Level 4
Use embeddings plus a vector store to retrieve recent market context and inject it at inference time.
Working result
Your model keeps its learned analysis style while receiving fresh context from current articles and watchlist-specific news.
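Under the hood, the vector store is just ranking stored texts by similarity to a query embedding. The toy below (my illustration, not the guide's code, and not a FAISS replacement) shows the core operation with plain NumPy cosine similarity over pre-computed vectors.

```python
# Toy illustration of what a vector store does at query time:
# normalize vectors, score by cosine similarity, return the top k texts.
import numpy as np

def top_k(query_vec, doc_vecs, texts, k=2):
    """Return the k stored texts most similar to the query embedding."""
    q = query_vec / (np.linalg.norm(query_vec) + 1e-9)
    d = doc_vecs / (np.linalg.norm(doc_vecs, axis=1, keepdims=True) + 1e-9)
    scores = d @ q  # cosine similarity of each document to the query
    order = np.argsort(scores)[::-1][:k]
    return [texts[i] for i in order]
```

FAISS does the same thing with indexes that scale to millions of vectors; the semantics are identical.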
Financial RAG skeleton
```python
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS

class FinancialRAG:
    def __init__(self):
        self.embeddings = HuggingFaceEmbeddings(model_name="BAAI/bge-small-en-v1.5")
        self.vectorstore = None

    def add_news(self, articles: list[dict]):
        texts = [article["text"] for article in articles]
        if self.vectorstore is None:
            self.vectorstore = FAISS.from_texts(texts, self.embeddings)
        else:
            self.vectorstore.add_texts(texts)

    def context_for(self, news: str, ticker: str = "") -> str:
        docs = self.vectorstore.similarity_search(f"{ticker} {news}", k=3) if self.vectorstore else []
        return "\n---\n".join(doc.page_content for doc in docs)
```
Level 5
Break the job into specialized analysts, then synthesize their arguments with a portfolio-manager style final pass.
Working result
Instead of one opaque answer, you get structured disagreement, consensus, and role-specific reasoning.
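The portfolio-manager pass that resolves the disagreement can be as simple as a confidence-weighted vote. This sketch is my assumption of one reasonable shape for it (the guide leaves the synthesis step open); the +/-2 deadband that maps small score differences to NEUTRAL is arbitrary.

```python
# Hypothetical synthesis step: combine per-agent {signal, confidence}
# views into one decision via a confidence-weighted vote.
def synthesize(views: dict) -> dict:
    """Net bullish-minus-bearish confidence, with a deadband for NEUTRAL."""
    score = 0.0
    for view in views.values():
        weight = float(view.get("confidence", 5))
        if view.get("signal") == "BULLISH":
            score += weight
        elif view.get("signal") == "BEARISH":
            score -= weight
    if score > 2:
        final = "BULLISH"
    elif score < -2:
        final = "BEARISH"
    else:
        final = "NEUTRAL"
    return {"signal": final, "score": score}
```

A stronger variant feeds all three arguments back to the model for a written verdict, but a deterministic vote is easier to audit.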
Minimal multi-agent trader
```python
import json

class MultiAgentTrader:
    def __init__(self, client, model="trading-lora"):
        self.client = client
        self.model = model

    def _ask(self, role_prompt, news, ticker):
        response = self.client.chat.completions.create(
            model=self.model,
            messages=[
                {"role": "system", "content": role_prompt},
                {"role": "user", "content": f"Ticker: {ticker}\nNews: {news}\nJSON only."},
            ],
        )
        return json.loads(response.choices[0].message.content)

    def analyze(self, news, ticker):
        bull = self._ask("BULLISH analyst. Find every positive signal.", news, ticker)
        bear = self._ask("BEARISH analyst. Find every risk.", news, ticker)
        quant = self._ask("QUANT analyst. Only numbers and measurable deltas.", news, ticker)
        return {"bull": bull, "bear": bear, "quant": quant}
```
Level 6
Serve the model behind an OpenAI-compatible API, monitor drift, gate trades through hard risk rules, and start on paper.
Working result
You end with a system that can say stop, reduce size, or skip entirely when signal quality degrades.
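The simplest way to make the system actually say "reduce size" is to route every order through a status-to-size mapping. This gate is a hypothetical sketch of mine, assuming the monitor's four status strings; the multipliers are illustrative, not recommendations.

```python
# Hypothetical risk gate: map monitor health to a position-size
# multiplier so degraded models trade smaller or not at all.
SIZE_MULTIPLIER = {
    "HEALTHY": 1.0,
    "CAUTION": 0.5,      # illustrative: halve size on early degradation
    "WARMING_UP": 0.0,   # no live trades until the window fills
    "HALT": 0.0,
}

def gated_size(base_size: float, status: str) -> float:
    """Scale the intended position; unknown statuses default to zero."""
    return base_size * SIZE_MULTIPLIER.get(status, 0.0)
```

Defaulting unknown statuses to zero is the important design choice: the gate fails closed, not open.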
Monitor + risk layer
```python
from collections import deque
from dataclasses import dataclass

import numpy as np

class ModelMonitor:
    def __init__(self, window=100):
        self.predictions = deque(maxlen=window)
        self.actuals = deque(maxlen=window)
        self.returns = deque(maxlen=window)
        self.baseline_accuracy = 0.68

    def status(self):
        if len(self.predictions) < 30:
            return {"status": "WARMING_UP"}
        acc = sum(p == a for p, a in zip(self.predictions, self.actuals)) / len(self.predictions)
        sharpe = (np.mean(self.returns) / (np.std(self.returns) + 1e-9)) * np.sqrt(252)
        if acc < self.baseline_accuracy * 0.8 or sharpe < 0:
            return {"status": "HALT"}
        if acc < self.baseline_accuracy * 0.9:
            return {"status": "CAUTION"}
        return {"status": "HEALTHY"}

@dataclass
class TradeOrder:
    ticker: str
    side: str
    size: float
    stop_loss: float
    take_profit: float
    reason: str
```
What stays honest
The strongest systems still fail when data leaks, costs are ignored, or models drift into a new market regime. Good architecture helps. Good evaluation and hard risk boundaries matter more.
Start building
Python starter stack
```shell
pip install openai yfinance pandas numpy scikit-learn
```
OpenTradex onboarding
```shell
npm install -g opentradex@latest && opentradex onboard
```
Alt package flow
```shell
npx opentradex@latest onboard
```
Hosted installer
```shell
curl -fsSL https://opentradex.vercel.app/install.sh | bash
```