RAG-Based AI Stock Investment Agent Part 4 — Portfolio Construction, Risk Rules, and Backtesting

Stock Analysis and Portfolio Management Are Different Problems

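An Agent can be correct about a stock and still be wrong about taking the trade: the name may already sit at its weight limit, the sector may be crowded, or the portfolio may be in drawdown.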

A practical system needs to separate:

stock quality
portfolio state
risk tolerance

That is why portfolio logic should exist as a separate layer.

Why a Portfolio Layer Matters

The overall flow should look like:

Signal Layer
 -> Candidate Selection
 -> Position Sizing
 -> Portfolio Constraints
 -> Risk Checks
 -> Paper Order Proposal

The Agent is usually strongest in signal generation and candidate ranking. Sizing and constraints should remain mostly deterministic.
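
As a rough sketch of that split, the function below wires the stages together with plain Python callables. Every name here (`Candidate`, `build_proposals`, `min_score`) is hypothetical and not part of any library. The Agent's scores are assumed to arrive as inputs; everything after that is deterministic.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Candidate:
    symbol: str
    score: float           # assumed to come from the Agent's signal layer
    volatility_20d: float

def build_proposals(
    candidates: list[Candidate],
    size_fn: Callable[[Candidate], float],
    risk_fn: Callable[[str, float], str],
    min_score: float = 0.5,   # hypothetical selection threshold
) -> list[dict]:
    """Run selection, sizing, and risk checks in a fixed order."""
    proposals = []
    # Candidate selection: strongest signals first, weak ones dropped.
    ranked = sorted(candidates, key=lambda c: c.score, reverse=True)
    for c in ranked:
        if c.score < min_score:
            continue
        weight = size_fn(c)                  # position sizing
        verdict = risk_fn(c.symbol, weight)  # constraints and risk checks
        proposals.append({"symbol": c.symbol, "weight": weight, "verdict": verdict})
    return proposals
```

Passing the sizing and risk functions in as arguments keeps them testable on their own and makes it obvious that neither step calls the LLM.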

Position Sizing Does Not Need to Be Fancy at First

A good first version often uses simple rules:

max single-name weight
max sector exposure
per-strategy capital cap
smaller size for high-volatility symbols

Example:

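def target_weight(volatility_20d: float) -> float:
    """Map 20-day volatility to a target portfolio weight."""
    if volatility_20d > 0.45:
        return 0.02  # most volatile names get the smallest size
    if volatility_20d > 0.30:
        return 0.03
    return 0.05  # default weight for calmer names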

Even simple rules like these prevent the most common over-sizing mistakes, such as giving a highly volatile name the same weight as a stable one.

Risk Rules Should Not Live Only in Prompts

The application layer should enforce rules such as:

maximum single-name exposure
maximum sector exposure
max new entries per day
no new positions right before earnings
pause new entries after drawdown thresholds

Example:

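def evaluate_trade(symbol, candidate_weight, portfolio):
    # Limits apply to exposure after the proposed trade, not before it.
    if portfolio.single_name_exposure(symbol) + candidate_weight > 0.10:
        return "reject: single-name limit exceeded"
    # sector_exposure is assumed to resolve the sector that symbol belongs to.
    if portfolio.sector_exposure(symbol) + candidate_weight > 0.25:
        return "reject: sector limit exceeded"
    return "pass"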

These rules should be explicit, explainable, and auditable.
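
The remaining rules in the list above, such as the daily entry cap and the drawdown pause, can be handled the same way. The sketch below is one possible shape. `RiskState`, `check_entry_allowed`, and both limit values are illustrative assumptions, not recommendations.

```python
from dataclasses import dataclass

@dataclass
class RiskState:
    new_entries_today: int
    peak_equity: float
    current_equity: float

MAX_NEW_ENTRIES_PER_DAY = 3       # hypothetical limit
DRAWDOWN_PAUSE_THRESHOLD = 0.10   # hypothetical: pause entries after a 10% drawdown

def check_entry_allowed(state: RiskState) -> str:
    """Return "pass" or a rejection string with the rule that fired."""
    # Drawdown is measured from the running equity peak.
    drawdown = 1.0 - state.current_equity / state.peak_equity
    if drawdown >= DRAWDOWN_PAUSE_THRESHOLD:
        return "reject: drawdown pause active"
    if state.new_entries_today >= MAX_NEW_ENTRIES_PER_DAY:
        return "reject: daily entry limit reached"
    return "pass"
```

Returning the reason as a string, in the same style as `evaluate_trade`, means every rejection can be logged and reviewed later.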

Event Risk Deserves Its Own Check

Market volatility often clusters around events:

earnings
FOMC
CPI
major regulation updates
product launches

Even when a stock looks attractive, event proximity may justify delaying entry.

Example:

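from datetime import timedelta

def event_checks(now, earnings_date, is_fomc_day):
    # Block entries only when earnings fall inside the next 24 hours.
    block_new_position = now <= earnings_date <= now + timedelta(hours=24)
    # On FOMC days, switch off the high-risk strategy.
    disable_high_risk_strategy = is_fomc_day
    return block_new_position, disable_high_risk_strategy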

Backtesting Has Limits

A strong backtest does not guarantee production success.

Common pitfalls include:

lookahead bias
document timestamp misalignment
survivorship bias
overfitting
missing transaction costs

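RAG-based strategies are especially sensitive to data timing: a retrieved document that was published after the simulated decision time leaks future information into the signal.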

What the Backtest Engine Needs

At minimum:

universe definition
point-in-time data snapshot
signal generation rules
risk rules
transaction cost model
slippage assumptions
performance metrics
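
For the transaction cost and slippage items, one simple starting point is a flat basis-point charge on turnover. The function name and both default values below are assumptions chosen for illustration. Real costs vary by venue, liquidity, and order size.

```python
def apply_costs(
    gross_return: float,
    turnover: float,
    commission_bps: float = 1.0,   # assumed commission, in basis points
    slippage_bps: float = 5.0,     # assumed slippage, in basis points
) -> float:
    """Subtract trading costs from one period's gross return.

    turnover is the fraction of portfolio value traded in the period
    (buys plus sells), so 0.20 means 20% of the book changed hands.
    """
    cost = turnover * (commission_bps + slippage_bps) / 10_000
    return gross_return - cost
```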

Useful performance metrics:

CAGR
Sharpe
max drawdown
hit ratio
average holding period
turnover
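
Three of these metrics can be computed from an equity curve or a return series with the standard library alone. This is a minimal sketch that assumes daily data, 252 trading days per year, and a zero risk-free rate for the Sharpe ratio. The function names are illustrative.

```python
import math

TRADING_DAYS = 252  # assumed annualization factor for daily data

def cagr(equity: list[float], periods_per_year: int = TRADING_DAYS) -> float:
    """Compound annual growth rate from an equity curve."""
    years = (len(equity) - 1) / periods_per_year
    return (equity[-1] / equity[0]) ** (1 / years) - 1

def sharpe(returns: list[float], periods_per_year: int = TRADING_DAYS) -> float:
    """Annualized Sharpe ratio, with the risk-free rate assumed to be zero."""
    mean = sum(returns) / len(returns)
    var = sum((r - mean) ** 2 for r in returns) / (len(returns) - 1)
    return mean / math.sqrt(var) * math.sqrt(periods_per_year)

def max_drawdown(equity: list[float]) -> float:
    """Largest peak-to-trough decline, as a positive fraction."""
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)
        worst = max(worst, 1 - value / peak)
    return worst
```

For example, an equity curve of 100, 120, 90, 110 has a maximum drawdown of 25%, the fall from 120 to 90.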

Why RAG Backtests Are Harder

Structured price-based strategies are easier to replay historically.

RAG-based systems depend on text data becoming available at specific times:

an earnings release at 08:30
a syndicated article at 09:10
a transcript summary at 10:00

Your historical store needs point-in-time fidelity, not just the latest final version.
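
One way to get that fidelity is to store every version of a document with the time it became available, then filter by the simulated clock at retrieval. The sketch below assumes a hypothetical `DocVersion` record and an in-memory list. A real store would apply the same filter inside the vector or document database query.

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass(frozen=True)
class DocVersion:
    doc_id: str
    available_at: datetime   # when the system could first have seen this version
    text: str

def visible_docs(store: list[DocVersion], as_of: datetime) -> list[DocVersion]:
    """Return the latest version of each document that was available at as_of."""
    latest: dict[str, DocVersion] = {}
    for version in store:
        if version.available_at > as_of:
            continue  # not yet available at the simulated time: skip to avoid lookahead
        current = latest.get(version.doc_id)
        if current is None or version.available_at > current.available_at:
            latest[version.doc_id] = version
    return list(latest.values())
```

Replaying the three timestamps above with `as_of` set to 09:30 would return the earnings release and the article, but not the transcript summary, and not any later revision of the release.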

Paper Trading Is Not the Same as Backtesting

Paper trading validates the live operational workflow:

scheduler reliability
data delays
duplicate signal creation
approval workflow
alerts
order proposal logic

Many operational failures appear in paper trading long before real capital is involved.
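
Duplicate signal creation is a typical example: a scheduler retry can emit the same signal twice. A minimal guard, assuming a signal is uniquely identified by strategy, symbol, and trade date, is a deterministic key checked before each write. `signal_key` and `SignalLog` are hypothetical names, and the in-memory set stands in for a unique constraint in a database.

```python
import hashlib

def signal_key(strategy: str, symbol: str, trade_date: str) -> str:
    """Deterministic key: the same strategy, symbol, and date always hash the same."""
    raw = f"{strategy}|{symbol}|{trade_date}"
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()

class SignalLog:
    def __init__(self) -> None:
        self._seen: set[str] = set()

    def record(self, strategy: str, symbol: str, trade_date: str) -> bool:
        """Return True if the signal is new, False if it is a duplicate."""
        key = signal_key(strategy, symbol, trade_date)
        if key in self._seen:
            return False
        self._seen.add(key)
        return True
```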

A Good Initial Workflow

Agent generates daily candidates
Risk engine runs rule checks
Passing candidates become paper proposals
A human approves or rejects them
Results and reasons are stored
Post-trade performance is tracked

This makes experimentation safer and easier to explain.
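
The approval and storage steps of this workflow can be modeled as a small record that keeps its own decision history. This is a sketch under the assumption that each proposal is decided exactly once. `PaperProposal`, `Status`, and the field names are illustrative.

```python
from dataclasses import dataclass, field
from enum import Enum

class Status(Enum):
    PROPOSED = "proposed"
    APPROVED = "approved"
    REJECTED = "rejected"

@dataclass
class PaperProposal:
    symbol: str
    weight: float
    status: Status = Status.PROPOSED
    history: list[tuple[str, str]] = field(default_factory=list)  # (status, reason)

    def decide(self, approve: bool, reviewer: str, reason: str) -> None:
        """Record a human decision; a proposal can only be decided once."""
        if self.status is not Status.PROPOSED:
            raise ValueError("proposal already decided")
        self.status = Status.APPROVED if approve else Status.REJECTED
        self.history.append((self.status.value, f"{reviewer}: {reason}"))
```

Storing the reviewer and reason next to the status gives the post-trade review something concrete to compare against realized performance.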

Example Portfolio Output

Candidates:
- NVDA 3%
- MSFT 4%

Rejected:
- AMD: semiconductor exposure limit exceeded
- TSLA: earnings event within 12 hours

Portfolio impact:
- AI infrastructure exposure 18% -> 22%
- no single-name limit violation

Outputs like this make the Agent much closer to a usable research tool.

Closing Thoughts

A real investment Agent needs a portfolio and risk layer behind the analysis engine.

The key principles are:

good analysis is not the same as a good trade
sizing and limits should stay outside the LLM
backtests require point-in-time data discipline
paper trading should come before real execution

That is how the system becomes operationally credible.