TestForge Blog

RAG-Based AI Stock Investment Agent Part 5 — FastAPI, PostgreSQL, and pgvector System Design

A practical implementation blueprint for a RAG-based stock investment Agent using FastAPI, PostgreSQL, pgvector, Redis, async workers, and domain-separated service modules.

TestForge Team

Turning Architecture Into Code Structure

Once the architecture is clear, the next job is to turn it into a maintainable codebase.

A practical baseline stack:

  • FastAPI for APIs
  • PostgreSQL for structured and time-series data
  • pgvector for document embeddings
  • Redis for cache and transient state
  • background workers for ingestion and embeddings

A directory layout that reflects those boundaries:

app/
├─ api/
│  └─ routes/
├─ core/
├─ models/
├─ services/
├─ agents/
├─ workers/
└─ schemas/

The goal is to separate HTTP routing, domain logic, orchestration, and background jobs.

API Endpoints Should Map to Real Use Cases

A good first set might be:

  • POST /analysis/stock
  • POST /analysis/screen
  • POST /portfolio/evaluate
  • POST /paper-trading/proposals
  • GET /analysis/{run_id}

These are user-oriented workflows, not arbitrary CRUD endpoints.

Example Request Flow

POST /analysis/stock

Input:

{
  "symbol": "NVDA",
  "question": "Analyze new-entry risk over the next 30 days",
  "portfolio_id": "pf_001"
}

Internal flow:

  1. the route validates the request
  2. agent_service is invoked
  3. the query router parses the question into a structured intent
  4. market_data_service builds a structured price summary
  5. retrieval_service retrieves relevant context
  6. risk_service evaluates portfolio rules
  7. the response composer creates the final output
  8. the result is stored and returned

PostgreSQL Schema Design

Example core tables:

create table symbol (
  id bigserial primary key,
  ticker text not null unique,
  name text not null,
  sector text,
  industry text,
  market text
);

create table news_article (
  id bigserial primary key,
  symbol_id bigint references symbol(id),
  title text not null,
  body text not null,
  source text,
  published_at timestamptz not null
);

Timestamp fidelity is especially important in this domain: recency filters, backtests, and point-in-time retrieval all depend on accurate published_at values.
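
As a sketch of why published_at earns its not-null constraint, a typical per-symbol recency query and a supporting composite index (parameter placeholders are illustrative):

```sql
-- Composite index so per-symbol recency scans stay cheap
create index news_article_symbol_recent_idx
  on news_article (symbol_id, published_at desc);

select title, source, published_at
  from news_article
 where symbol_id = $1
   and published_at >= now() - interval '30 days'
 order by published_at desc
 limit 20;
```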

How pgvector Fits In

Example chunk table:

create table document_chunk (
  id bigserial primary key,
  symbol_id bigint references symbol(id),
  source_type text not null,
  source_document_id bigint not null,
  chunk_text text not null,
  embedding vector(1536),
  published_at timestamptz,
  metadata jsonb default '{}'::jsonb
);

Filtering by symbol, source type, and time matters just as much as similarity itself.
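
A sketch of what that filtered similarity search can look like in SQL. The HNSW index and the `<=>` cosine-distance operator are standard pgvector features; the parameter placeholders and the specific filters are assumptions:

```sql
-- Approximate-nearest-neighbour index for cosine distance
create index document_chunk_embedding_idx
  on document_chunk using hnsw (embedding vector_cosine_ops);

-- Similarity search constrained by symbol, source type, and recency
select chunk_text, published_at,
       embedding <=> $1 as distance
  from document_chunk
 where symbol_id = $2
   and source_type = any($3)
   and published_at >= now() - interval '30 days'
 order by embedding <=> $1
 limit 5;
```

One caveat worth knowing: with an ANN index, heavy metadata filtering can reduce recall, so very selective filters may warrant partial indexes or exact search.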

Retrieval Service Interface

A practical service boundary might be:

class RetrievalService:
    async def search_symbol_context(
        self,
        symbol: str,
        query: str,
        days: int = 30,
        top_k: int = 5,
    ) -> list[dict]:
        ...

This keeps the API layer clean and makes later replacement easier.
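
One way to make that replaceability explicit is a Protocol, so a pgvector-backed implementation and an in-memory fake for tests are interchangeable. The fake below is an illustrative assumption, not part of the original design:

```python
from typing import Protocol

class SymbolContextSearcher(Protocol):
    async def search_symbol_context(
        self, symbol: str, query: str, days: int = 30, top_k: int = 5
    ) -> list[dict]: ...

class InMemoryRetrievalService:
    """Test double: serves canned chunks instead of querying pgvector."""

    def __init__(self, chunks_by_symbol: dict[str, list[dict]]):
        self._chunks = chunks_by_symbol

    async def search_symbol_context(
        self, symbol: str, query: str, days: int = 30, top_k: int = 5
    ) -> list[dict]:
        # Ignores days/query here; a real implementation would filter and rank.
        return self._chunks.get(symbol, [])[:top_k]
```

Anything that depends on `SymbolContextSearcher` never needs to know whether it is talking to Postgres or to a dict.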

Agent Service as the Orchestration Layer

Example shape:

class StockAnalysisAgentService:
    async def run(self, symbol: str, question: str, portfolio_id: str | None):
        query = self.router.parse(question)
        price_summary = await self.market_data.get_price_summary(symbol)
        contexts = await self.retrieval.search_symbol_context(symbol, question)
        risk = await self.risk.evaluate(symbol, portfolio_id)
        return await self.composer.compose(
            symbol=symbol,
            query=query,
            price_summary=price_summary,
            contexts=contexts,
            risk=risk,
        )

The main value is keeping orchestration reusable and testable.
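
That testability is concrete: with every collaborator stubbed out, the whole run can be exercised without a database, a market-data feed, or an LLM. A minimal sketch, where the stub shapes are assumptions matching the class above:

```python
import asyncio

# Stubs with the same method shapes the service expects.
class StubRouter:
    def parse(self, question: str) -> dict:
        return {"intent": "risk", "raw": question}

class StubMarketData:
    async def get_price_summary(self, symbol: str) -> dict:
        return {"symbol": symbol, "last_close": 100.0}

class StubRetrieval:
    async def search_symbol_context(self, symbol, query, days=30, top_k=5):
        return [{"chunk_text": "stub context"}]

class StubRisk:
    async def evaluate(self, symbol, portfolio_id):
        return {"within_limits": True}

class StubComposer:
    async def compose(self, **parts):
        return parts  # echo everything so the test can inspect the wiring

class StockAnalysisAgentService:
    def __init__(self, router, market_data, retrieval, risk, composer):
        self.router, self.market_data = router, market_data
        self.retrieval, self.risk, self.composer = retrieval, risk, composer

    async def run(self, symbol, question, portfolio_id=None):
        query = self.router.parse(question)
        price_summary = await self.market_data.get_price_summary(symbol)
        contexts = await self.retrieval.search_symbol_context(symbol, question)
        risk = await self.risk.evaluate(symbol, portfolio_id)
        return await self.composer.compose(
            symbol=symbol, query=query, price_summary=price_summary,
            contexts=contexts, risk=risk,
        )

service = StockAnalysisAgentService(
    StubRouter(), StubMarketData(), StubRetrieval(), StubRisk(), StubComposer()
)
result = asyncio.run(service.run("NVDA", "Analyze new-entry risk"))
```

If the orchestration lived inside a route handler, this kind of test would require spinning up the whole HTTP stack.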

What Runs Inline, and What Runs in the Background?

Inline (request/response) paths

  • user-triggered stock analysis
  • portfolio review

Background paths

  • news ingestion
  • filing ingestion
  • transcript storage
  • embedding generation
  • scheduled daily research jobs

Separating those two paths is essential for reliability.
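
The background side can start as simple as a queue-consuming worker loop. A sketch with asyncio, where the job shape and the embed/store callables are illustrative assumptions:

```python
import asyncio

async def embed_chunks_worker(queue: asyncio.Queue, embed_fn, store_fn):
    """Consume embedding jobs off the queue; never runs in a user request."""
    while True:
        job = await queue.get()
        if job is None:           # shutdown sentinel
            queue.task_done()
            break
        vector = await embed_fn(job["chunk_text"])
        await store_fn(job["chunk_id"], vector)
        queue.task_done()

async def demo():
    stored = {}

    async def fake_embed(text):   # stand-in for a real embedding API call
        return [float(len(text))]

    async def fake_store(chunk_id, vector):
        stored[chunk_id] = vector

    q: asyncio.Queue = asyncio.Queue()
    await q.put({"chunk_id": 1, "chunk_text": "hello"})
    await q.put(None)
    await embed_chunks_worker(q, fake_embed, fake_store)
    return stored

stored = asyncio.run(demo())
```

In production the in-process queue would typically be replaced by Redis or a task queue, but the boundary stays the same: ingestion work never blocks a user-facing request.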

Redis Use Cases

Redis can support:

  • recent analysis cache
  • status tracking for async runs
  • duplicate request suppression
  • worker coordination

Example keys:

analysis:run:{id}
analysis:latest:NVDA
screening:daily:semiconductor
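
The key patterns above translate into a handful of commands. Sketched here against a minimal in-memory stand-in so the logic is visible without a Redis server; with real redis-py these would be `set(..., nx=True, ex=...)` and `get`:

```python
import time

class FakeRedis:
    """Tiny stand-in for the few Redis operations used here."""
    def __init__(self):
        self._data: dict[str, tuple[str, float | None]] = {}

    def set(self, key, value, nx=False, ex=None):
        if nx and self._live(key):
            return None           # key already exists: SET NX fails
        expires = time.monotonic() + ex if ex else None
        self._data[key] = (value, expires)
        return True

    def get(self, key):
        return self._data[key][0] if self._live(key) else None

    def _live(self, key):
        item = self._data.get(key)
        if item is None:
            return False
        if item[1] is not None and time.monotonic() > item[1]:
            del self._data[key]
            return False
        return True

r = FakeRedis()

# Duplicate-request suppression: only the first caller acquires the slot.
first = r.set("analysis:run:abc123", "running", nx=True, ex=600)
second = r.set("analysis:run:abc123", "running", nx=True, ex=600)

# Recent-analysis cache keyed per symbol, expiring on its own.
r.set("analysis:latest:NVDA", '{"verdict": "hold"}', ex=3600)
```

TTLs do most of the work here: stale run statuses and cached analyses clean themselves up without a separate eviction job.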

Connect Analysis to Paper Trading Safely

It helps to separate:

  • analysis_run
  • trade_proposal
  • proposal_approval
  • paper_order

That keeps analysis and execution loosely coupled and approval-friendly.
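
A sketch of how those objects can be linked so nothing executes without an explicit approval; the column choices are assumptions in the style of the earlier tables:

```sql
create table trade_proposal (
  id bigserial primary key,
  analysis_run_id bigint not null,   -- which analysis produced this idea
  symbol_id bigint references symbol(id),
  side text not null check (side in ('buy', 'sell')),
  quantity numeric not null,
  rationale text,
  created_at timestamptz not null default now()
);

create table proposal_approval (
  id bigserial primary key,
  proposal_id bigint not null references trade_proposal(id),
  approved boolean not null,
  decided_at timestamptz not null default now()
);

-- paper_order rows reference an approval, never an analysis_run directly
```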

Closing Thoughts

A useful implementation is not about using the most complex stack. It is about clean layering.

Good design here means:

  • separate ingestion from user-facing analysis
  • keep orchestration outside routes
  • treat retrieval as symbol-aware and time-aware
  • separate analysis from execution through proposal objects

That structure gives the system room to grow.