AI Conversation History in Google Sheets for FP&A

Marc Sean · May 4, 2026 · 5 min read

For FP&A work specifically, this matters more than it does for most users. A board pack build spans days. A DCF iteration lives across multiple sessions. If the AI can't reference what it told you yesterday, you're not working with an assistant — you're working with a very fast autocomplete.

What Persistent Conversation History Actually Enables

Without persistence, every session starts at zero. You paste in your assumptions tab, explain that column G is the revenue bridge, clarify that "plan" means FY26 board-approved targets and not the revised April reforecast, and 10-15 minutes later you're finally asking the actual question you came to ask.

With persistent conversation history, the AI already knows your model structure. It can pick up a runway sensitivity analysis you started Tuesday and continue it Thursday. It can cross-reference a decision you made in session 14 when answering a question in session 22.

The practical unlock isn't raw capability — it's compounding context. Each session builds on the last instead of starting from scratch.

Why Conversation History Is a Structural Control on Consistency

Here's the underrated problem: without conversation history, AI behavior is fundamentally inconsistent across sessions — not because the model changed, but because it has no idea who you are or what you've established.

You tell it once that you use a 10.5% WACC for your core scenario. Next session, it defaults to 12% because that's "typical" for companies in your sector. Your numbers don't tie. You don't immediately know why.

Conversation history isn't just a convenience feature — it's a consistency control. If the AI can reference session 3 where you locked in WACC assumptions, session 8 where you agreed on a revenue recognition policy, and session 15 where you established how to handle the minority interest, it will produce consistent outputs. Without that thread, every session is a fresh negotiation over your model's ground rules.

This is the gap that makes most AI tools feel useful for isolated tasks but unreliable for anything spanning a multi-week model build.
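To make the consistency point concrete, here is a minimal Python sketch of a decision log that a tool could consult before falling back to a generic default. The `Decision` and `DecisionLog` names, the `wacc_core` key, and the structure are hypothetical illustrations, not how any particular product stores history.

```python
from dataclasses import dataclass, field


@dataclass
class Decision:
    key: str        # hypothetical identifier, e.g. "wacc_core"
    value: object   # the value that was locked in
    session: int    # session number in which it was agreed


@dataclass
class DecisionLog:
    """Hypothetical store of assumptions agreed across sessions."""
    entries: dict = field(default_factory=dict)

    def lock(self, key, value, session):
        # Record (or overwrite) an agreed assumption.
        self.entries[key] = Decision(key, value, session)

    def resolve(self, key, default):
        # Prefer the locked value; fall back to the generic default otherwise.
        entry = self.entries.get(key)
        return entry.value if entry is not None else default


GENERIC_WACC = 0.12  # the "typical" sector figure a stateless session falls back to

log = DecisionLog()
log.lock("wacc_core", 0.105, session=3)

with_history = log.resolve("wacc_core", GENERIC_WACC)               # 0.105
without_history = DecisionLog().resolve("wacc_core", GENERIC_WACC)  # 0.12
```

The two results differ only because one lookup had access to the session 3 decision, which is the mismatch described above when numbers stop tying across sessions.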

How Token Limits Shape Your Conversation History

Every AI model has a context window: the maximum text it can "see" at once, including your conversation history. As of May 2026, windows on widely used models commonly run from about 128K to 200K tokens, with some models and deployments offering more. That sounds enormous until you realize a dense financial model description, a full conversation transcript, and a few tool outputs can consume that headroom quickly.

When you hit the limit, something has to give. Most implementations silently truncate the oldest messages — meaning the AI loses the context you built at the start of a long session.

The approaches for handling this vary significantly:

| Approach | What happens at the limit | Quality impact |
|---|---|---|
| Silent truncation | Oldest messages dropped | AI loses early context with no warning |
| Hard cutoff | Conversation stops | Disruptive but at least transparent |
| Summarization checkpoint | AI summarizes before continuing | Context preserved in compressed form |
| Natural completion prompt | AI signals limit, asks how to proceed | User stays in control |

The last approach is what you actually want for complex FP&A work. ModelMonkey handles token-limit events by generating a completion summary — what it's accomplished, what comes next, and a prompt for how you'd like to proceed — rather than silently dropping context or abruptly stopping. That keeps you oriented even across long multi-turn sessions.
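The difference between silent truncation and a summarization checkpoint can be sketched in a few lines of Python. This is an illustration under stated assumptions, not ModelMonkey's implementation: the four-characters-per-token estimate is a crude stand-in for a real tokenizer, and `summarize` is a placeholder where a real system would ask the model for a summary.

```python
def estimate_tokens(text):
    # Crude assumption: roughly 4 characters per token. Real tokenizers differ.
    return max(1, len(text) // 4)


def total_tokens(messages):
    return sum(estimate_tokens(m) for m in messages)


def silent_truncate(messages, budget):
    """Drop the oldest messages until the history fits the budget."""
    kept = list(messages)
    while kept and total_tokens(kept) > budget:
        kept.pop(0)
    return kept


def summarize(messages):
    # Placeholder: keeps the first 40 characters of each message.
    return "SUMMARY: " + " | ".join(m[:40] for m in messages)


def checkpoint(messages, budget, keep_recent=2):
    """Compress all but the most recent messages into a single summary.

    Note: the result is not guaranteed to fit the budget; a real system
    would re-check and compress further if needed.
    """
    if total_tokens(messages) <= budget:
        return list(messages)
    old, recent = messages[:-keep_recent], messages[-keep_recent:]
    return [summarize(old)] + recent


history = [
    "Core scenario uses a 10.5% WACC.",
    "Column G is the revenue bridge." + " detail" * 50,
    "Plan means FY26 board-approved targets." + " detail" * 50,
    "What is runway under the downside case?",
]

truncated = silent_truncate(history, budget=150)
compressed = checkpoint(history, budget=150)
```

With this sample history, `truncated` no longer contains the WACC message at all, while `compressed` still carries it inside the summary line. That is the practical gap between the first and third rows of the table.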

How Custom Instructions Complement Conversation History

Custom instructions are session-agnostic — they apply to every conversation regardless of history. Conversation history is session-specific — it builds within and across sessions.

The two work together. Custom instructions handle standing rules: your WACC methodology, preferred output format, whether you want formula explanations or just the formula. Conversation history handles accumulating context: decisions you've made, iterations you've run, model-specific choices that are unique to this engagement.

Without custom instructions, you're re-establishing preferences every session. Without conversation history, you're re-establishing facts every session. Both gaps compound into the same problem: a stateless assistant, and stateless tools don't work well for multi-week deliverables.

The right pattern is to use custom instructions (typically supporting up to 4,000 characters) for standing rules and let conversation history carry model-specific context. Don't conflate them — custom instructions set in week 1 of an engagement shouldn't be patched with session-specific decisions that belong in conversation history.
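As a rough sketch of that separation, the Python below assembles the context sent to a model from two distinct sources. The `build_context` function, the section labels, and the sample strings are hypothetical; the 4,000-character cap is the typical figure mentioned above, and actual limits vary by tool.

```python
MAX_INSTRUCTION_CHARS = 4000  # typical figure cited above; actual limits vary by tool


def build_context(custom_instructions, session_history):
    """Combine standing rules and accumulated session context, kept separate."""
    if len(custom_instructions) > MAX_INSTRUCTION_CHARS:
        raise ValueError("custom instructions exceed the character limit")
    parts = ["[STANDING RULES]", custom_instructions.strip(), "[SESSION CONTEXT]"]
    parts.extend(session_history)
    return "\n".join(parts)


standing_rules = "Explain every formula. Use the house WACC methodology."
session_context = [
    "Session 3: core scenario WACC locked at 10.5%.",
    "Session 8: revenue recognition policy agreed.",
]

context = build_context(standing_rules, session_context)
```

Keeping the two inputs as separate arguments means a session-specific decision never has to be patched into the standing rules, and the character cap applies only to the part that is meant to stay short and stable.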

What to Evaluate When Choosing an AI Tool for FP&A

The checklist isn't long, but most tools fail at least 2 of these:

  • Session persistence: Can you resume a conversation started 3 days ago, with full message history intact?
  • History browsability: Can you search or scroll past conversations, or is yesterday's session gone?
  • Truncation behavior: Does the tool tell you when it's approaching token limits, or does it quietly drop context?
  • Cross-session coherence: If you reference a decision from a prior session, does the AI know what you're talking about?
  • Export: Can you pull a conversation transcript for documentation or audit purposes?

Most off-the-shelf tools built for general productivity handle the first two adequately. The third and fourth are where FP&A-specific use cases expose the gaps that matter.


Frequently Asked Questions