Building an AI Chatbot for Investment Portfolios on AWS Bedrock

The Problem with Financial Interfaces

Most retail investors in India interact with their portfolio through a fragmented set of apps: one for stocks, another for mutual funds, a banking app for cash. Each shows a slice of your financial picture; none shows the whole thing. And none of them lets you just ask a question.

"How much of my portfolio is in tech stocks?" "If I sell my HDFC Flexi Cap units, what does my total liquid position look like?" "What's my overall P&L this month across everything?" These are simple questions. Getting answers requires switching between apps, doing mental arithmetic, and hoping you haven't missed something.

This project was built to explore whether a conversational AI layer — one that has full visibility across stocks, mutual funds, and banking — could make that experience genuinely useful. Not a dashboard with more charts. A chatbot you can actually talk to.

Architecture: Orchestrator + Specialist Agents

The core architectural decision was to use a supervisor-agent pattern rather than a single monolithic assistant. Financial queries fall into fairly distinct domains — stock analysis, mutual fund analysis, banking — and each domain benefits from a focused context window rather than one agent trying to hold everything in mind at once.

The system has four agents:

OrchestratorAgent — the router

Every user message hits the orchestrator first. It inspects the query and routes it to the most appropriate specialist based on keyword matching and query semantics. If a query spans domains ("show me my overall portfolio value") or no specialist is a clear fit, the orchestrator answers it directly using a consolidated portfolio view.
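The keyword part of this routing might look roughly like the sketch below. The keyword sets, the agent names used as dictionary keys, and the route function are all hypothetical; the post does not show the real domain terms, and the "query semantics" part is not modelled here.

```python
import re

# Hypothetical keyword sets; the project's real domain terms are not shown in the post.
DOMAIN_KEYWORDS = {
    "stock": {"stock", "stocks", "share", "shares", "equity", "equities", "sector"},
    "mutual_fund": {"fund", "funds", "nav", "sip", "folio", "xirr"},
    "banking": {"bank", "cash", "balance", "savings", "liquid"},
}


def route(query: str) -> str:
    """Return the agent name for a query, or "orchestrator" as the fallback."""
    words = set(re.findall(r"[a-z]+", query.lower()))
    matched = [name for name, keywords in DOMAIN_KEYWORDS.items() if words & keywords]
    # Exactly one matching domain goes to that specialist. Zero matches, or
    # several (a cross-domain query), stay with the orchestrator.
    return matched[0] if len(matched) == 1 else "orchestrator"
```

With these assumed keywords, "How much of my portfolio is in tech stocks?" goes to the stock specialist, while "show me my overall portfolio value" matches nothing and stays with the orchestrator.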

StockAgent — equities specialist

Handles anything related to the user's equity holdings — individual stock P&L, sector exposure, holding values, unrealised gains. Its system prompt is injected with the user's actual holdings data at runtime, so every response is grounded in real numbers rather than generic financial knowledge.
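A minimal sketch of that injection, assuming holdings arrive as a list of dictionaries. The field names, the sample rows (placeholder symbols and prices), the prompt wording, and build_stock_system_prompt are illustrative assumptions, not the project's actual schema or prompt. The return value uses the list-of-text-blocks shape that the Bedrock Converse API accepts for its system parameter.

```python
import json

# Hypothetical holdings shape with placeholder symbols and prices.
SAMPLE_HOLDINGS = [
    {"symbol": "ABC", "quantity": 10, "avg_cost": 100.0, "last_price": 110.0},
    {"symbol": "XYZ", "quantity": 5, "avg_cost": 400.0, "last_price": 360.0},
]


def build_stock_system_prompt(holdings: list[dict]) -> list[dict]:
    """Return a Converse-style `system` list with the holdings embedded."""
    # Assumed wording; the post only states that advice must be refused.
    instructions = (
        "You are an equities assistant. Answer only from the holdings below. "
        "Do not give investment advice."
    )
    data = json.dumps(holdings, indent=2)
    return [{"text": f"{instructions}\n\nHOLDINGS (JSON):\n{data}"}]
```

Rebuilding the prompt on every request means the model always sees the current holdings rather than whatever was true when the session started.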

MutualFundAgent — funds specialist

Covers mutual fund queries — NAV, SIP status, fund category breakdown, returns across different horizons. Same pattern: live portfolio data injected into the system prompt, responses grounded in the user's actual folio.

BankingAgent — liquidity specialist

Handles banking and cash position queries. Useful for questions like "how much liquid cash do I have available?" or cross-domain queries that need to factor in both invested and uninvested capital.

Stack

Python · FastAPI · AWS Bedrock (Claude 3.5 Sonnet) · Bedrock Converse API · DynamoDB · Next.js 14 · Server-Sent Events

Model Choice: Why Claude 3.5 Sonnet on Bedrock

Model selection for a financial chatbot involves tradeoffs that aren't just about benchmark performance. Here's how the decision broke down:

Why Claude over other options

Financial conversations require a model that handles two things well: following precise constraints and maintaining context across a multi-turn conversation. The constraint that mattered most here was "do not give investment advice" — a hard requirement for any financial application operating in a regulated context. Claude's instruction-following is strong enough that this constraint holds reliably across varied query types, including edge cases designed to get around it.

Claude also handles numerical reasoning well. When a user asks about P&L percentages or cross-portfolio totals, the model needs to do basic arithmetic on injected data accurately. This is an area where model quality matters and where Claude 3.5 Sonnet consistently outperformed lighter alternatives tested during development.
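For reference, the arithmetic in question is small enough to compute directly, which is handy for spot-checking model answers during evaluation. The sketch below is an assumption about how such a check could be written, not part of the project; the field names and function names are hypothetical.

```python
def unrealised_pnl(quantity: float, avg_cost: float, last_price: float) -> tuple[float, float]:
    """Return (absolute P&L, P&L as a percentage of cost) for one holding."""
    invested = quantity * avg_cost
    current = quantity * last_price
    pnl = current - invested
    return pnl, pnl / invested * 100


def portfolio_pnl(holdings: list[dict]) -> tuple[float, float]:
    """Return (absolute P&L, percentage P&L) across all holdings."""
    invested = sum(h["quantity"] * h["avg_cost"] for h in holdings)
    current = sum(h["quantity"] * h["last_price"] for h in holdings)
    # The portfolio percentage is weighted by invested amount, so it is
    # not the plain average of the per-holding percentages.
    return current - invested, (current - invested) / invested * 100
```

A holding up 10% on 1,000 invested and another down 10% on 2,000 invested give a portfolio P&L of about -3.3%, not 0%. This weighting is exactly the kind of detail a weaker model can get wrong on cross-portfolio totals.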

Why Claude 3.5 Sonnet specifically, not Opus or Haiku

Opus was evaluated and rejected on latency grounds. For a streaming chat interface, time-to-first-token matters — users feel the difference between a 400ms and 1200ms initial response even when total generation time is similar. Sonnet hits the right balance of quality and speed for conversational use.

Haiku was evaluated and rejected on quality grounds. For nuanced financial queries — "explain the difference between my XIRR and absolute return on this fund" — Haiku's responses were noticeably shallower and required more back-and-forth to get to a useful answer, which defeats the purpose of a conversational interface.

Claude 3.5 Sonnet sits in the sweet spot: fast enough for streaming chat, capable enough for multi-turn financial reasoning.

APAC cross-region inference and data residency

The production deployment uses apac.anthropic.claude-3-5-sonnet-20240620-v1:0, the APAC cross-region inference profile, with the Bedrock client pinned to ap-south-1 (Mumbai) as the source region. This was a deliberate data residency decision: financial portfolio data should not leave the Asia Pacific geography. A cross-region inference profile can route an individual request to another region within its geography, so the guarantee here is geography-level rather than Mumbai-only: invocations stay within Asia Pacific while still benefiting from Bedrock's cross-region routing for availability and throughput.

This is an underappreciated dimension of model choice in enterprise deployments. It's not just "which model" — it's "which model, on which infrastructure, in which region." For financial applications serving Indian users, the answer has to account for data localisation requirements from the outset.
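As a rough sketch of how the model ID and region described above fit together in code: the model ID and region come from the post, and boto3.client("bedrock-runtime", ...) and converse_stream are the standard SDK calls. The function names and the inference settings are assumptions.

```python
MODEL_ID = "apac.anthropic.claude-3-5-sonnet-20240620-v1:0"  # from the post
REGION = "ap-south-1"  # Mumbai, from the post


def build_converse_request(system_prompt: str, messages: list[dict]) -> dict:
    """Assemble keyword arguments for bedrock-runtime's converse_stream call."""
    return {
        "modelId": MODEL_ID,
        "system": [{"text": system_prompt}],
        "messages": messages,
        # Assumed values; the post does not state its inference settings.
        "inferenceConfig": {"maxTokens": 1024, "temperature": 0.2},
    }


def open_stream(request: dict):
    """Call Bedrock and return the event stream (needs boto3 and AWS credentials)."""
    import boto3  # imported here so the builder above runs without the AWS SDK

    client = boto3.client("bedrock-runtime", region_name=REGION)
    return client.converse_stream(**request)["stream"]
```

Keeping the region and profile ID as module-level constants makes the residency decision visible in one place instead of scattering it across call sites.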

Streaming Responses

The backend uses Bedrock's converse_stream API and streams tokens to the frontend via Server-Sent Events. This was a non-negotiable UX requirement — financial responses can be verbose (a full portfolio breakdown is a lot of text) and a blank screen followed by a wall of text feels broken compared to watching the answer build in real time.

The streaming implementation is straightforward. The orchestrator yields text chunks as they arrive from Bedrock's event stream:

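def stream_text(stream):
    """Yield text chunks from response["stream"] of a converse_stream call."""
    for event in stream:
        # Only contentBlockDelta events carry generated text; other event
        # types (message start/stop, metadata) are skipped.
        delta = event.get("contentBlockDelta", {}).get("delta", {})
        if "text" in delta:
            yield delta["text"]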

The FastAPI route wraps this in a StreamingResponse with text/event-stream content type. The Next.js frontend consumes the SSE stream and appends tokens to the chat bubble as they arrive. Time-to-first-token is typically under 500ms.
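One detail worth sketching is the SSE framing itself. A blank line ends an SSE event, so raw model text containing newlines needs encoding before it goes on the wire. The version below JSON-encodes each chunk; the to_sse name, the payload shape, and the "done" event are assumptions rather than the project's actual wire format.

```python
import json
from typing import Iterable, Iterator


def to_sse(chunks: Iterable[str]) -> Iterator[str]:
    """Frame text chunks as Server-Sent Events."""
    for chunk in chunks:
        # JSON-encode so newlines inside a chunk cannot break SSE framing,
        # where a blank line terminates the event.
        payload = json.dumps({"text": chunk})
        yield f"data: {payload}\n\n"
    # Hypothetical end-of-stream marker so the client knows when to stop.
    yield "event: done\ndata: {}\n\n"
```

In a FastAPI route, a generator like this would be passed to StreamingResponse with media_type="text/event-stream", and the frontend would JSON-decode each data payload before appending it to the chat bubble.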

Conversation Memory

The chatbot maintains conversation history per session, stored in DynamoDB in production and in-memory for the POC. Every request sent to Bedrock includes the full conversation history, so the model can answer follow-up questions coherently: "and what about the mutual fund side?" works because the model has context from the previous turns.

Each agent receives the same history, which means context is preserved even when the orchestrator routes consecutive messages to different specialists. The user doesn't experience a reset when their question shifts from stocks to funds.
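The in-memory variant of this can be sketched in a few lines. The class and method names below are hypothetical, and the DynamoDB-backed store is not shown; messages are kept in the role/content shape the Converse API expects so that any agent can send them unchanged.

```python
class InMemoryHistory:
    """Per-session message store in the Converse API's message shape."""

    def __init__(self) -> None:
        self._sessions: dict[str, list[dict]] = {}

    def append(self, session_id: str, role: str, text: str) -> None:
        if role not in ("user", "assistant"):
            raise ValueError(f"unsupported role: {role}")
        message = {"role": role, "content": [{"text": text}]}
        self._sessions.setdefault(session_id, []).append(message)

    def messages(self, session_id: str) -> list[dict]:
        # Return a copy of the list so callers cannot add or remove
        # stored messages by mutating the result.
        return list(self._sessions.get(session_id, []))
```

Because the store is keyed by session rather than by agent, whichever specialist handles the next turn reads the same list.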

What I Learned

Inject live data into system prompts, not the conversation history. Portfolio data belongs in the system prompt — it's context for every turn, not a message in the conversation. Putting it in the system prompt keeps it separate from the dialogue and ensures the model treats it as ground truth rather than something the user said.
Model selection is a product decision, not just a technical one. The choice between Haiku, Sonnet, and Opus directly affected the quality of the user experience. Treat it with the same rigour as any other product decision — evaluate on your actual use case, not benchmark leaderboards.
Data residency is a first-class constraint for financial applications. It needs to be resolved at architecture design time, not retrofitted after the fact. The region, inference profile, and memory store all need to be consistent with where data is allowed to live.
Keyword-based routing works for well-defined domains. The orchestrator's routing logic is intentionally simple: keyword matching against known domain terms. For clearly scoped financial domains (stocks vs funds vs banking), this is reliable and fast, and a more complex routing mechanism would add latency without meaningful accuracy gains. The weak spot is ambiguous queries that contain no obvious domain keyword.
Streaming is table stakes for chat. Users tolerate latency far better when they can see progress. For financial responses that can run to several paragraphs, non-streaming responses feel broken. Build streaming in from the start.

What's Next

Replace keyword routing with a lightweight intent classifier — more robust for ambiguous queries that don't contain obvious domain keywords.
Add buy/sell action capabilities: the chatbot currently only answers questions; the natural next step is letting it initiate transactions with appropriate confirmation flows.
Connect to live market data APIs so stock prices and NAVs are real-time rather than static.
Add a summarisation step that compresses old conversation history before it hits the context limit, enabling very long sessions without degradation.
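The summarisation step in the last item could take a shape like the sketch below. It is a guess at one possible design, not something the project implements yet: compress_history and its parameters are hypothetical, and summarize stands in for whatever produces the summary (most likely another model call).

```python
from typing import Callable, Optional


def compress_history(
    messages: list[dict],
    keep_last: int,
    summarize: Callable[[list[dict]], str],
) -> tuple[Optional[str], list[dict]]:
    """Split history into (summary of old turns, recent turns kept verbatim)."""
    if len(messages) <= keep_last:
        return None, messages
    split = len(messages) - keep_last
    # Advance to the next user turn, since the kept history is sent as the
    # start of the conversation and should begin with a user message.
    while split < len(messages) and messages[split]["role"] != "user":
        split += 1
    return summarize(messages[:split]), messages[split:]
```

The returned summary could then be appended to the system prompt, which keeps the remaining message list a clean user-first sequence.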