The honest answer: it depends on the task. AI handles roughly 60-80% of the mechanical cleanup work — pulling actuals, reformatting dates, fixing broken references — and approximately 0% of validating assumptions against comps. That's not a bug. It's just the boundary between computation and judgment.
The Tab-Rename Nightmare: Before and After
Here's the scenario Remy from our finance team described: inherited a model where someone renamed three tabs — P&L became Income Statement, Assumptions became Inputs, Returns became Returns Analysis — and broke roughly 40 formulas across the workbook. Every cross-tab reference that used the old name was now returning #REF!.
The broken state looks like this across the Returns Analysis tab:
=('P&L'!C15-'P&L'!C8)/'Assumptions'!$B$3 → #REF!
=SUMIFS('P&L'!C:C,'P&L'!B:B,">="&'Assumptions'!$B$3) → #REF!
='P&L'!C22*'Assumptions'!$D$14 → #REF!
Roughly forty of these, spread across four tabs. The manual fix is find-and-replace, but only if the renamed tabs are consistent — if someone made a typo in the new name, or if ranges shifted when rows were inserted, you're doing it cell by cell.
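The mechanical core of that fix is a mapping of old-to-new tab names plus a pattern replace over each formula string. A minimal sketch of the idea (hypothetical helper, not ModelMonkey's implementation; it only handles quoted tab references like the ones in this workbook):

```python
import re

# Old tab name -> new tab name, per the scenario above.
RENAMES = {
    "P&L": "Income Statement",
    "Assumptions": "Inputs",
    "Returns": "Returns Analysis",
}

def fix_refs(formula: str) -> str:
    """Rewrite quoted cross-tab references like 'P&L'! to the new tab name."""
    for old, new in RENAMES.items():
        # Require the trailing '!' so plain cell text mentioning "P&L"
        # is left alone; re.escape handles the '&' in the tab name.
        formula = re.sub(rf"'{re.escape(old)}'!", f"'{new}'!", formula)
    return formula

broken = "=('P&L'!C15-'P&L'!C8)/'Assumptions'!$B$3"
print(fix_refs(broken))
# =('Income Statement'!C15-'Income Statement'!C8)/'Inputs'!$B$3
```

Unquoted references (a tab named without spaces can appear as Returns!B2) would need a second pattern, which is exactly the kind of edge case that makes the "simple" find-and-replace slower than it looks.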
The prompt that fixed it in ModelMonkey:
"Three tabs were renamed. 'P&L' is now 'Income Statement', 'Assumptions' is now 'Inputs', 'Returns' is now 'Returns Analysis'. Find every broken cross-tab reference in the workbook and rewrite them with the correct tab names."
What came back: The sidebar identified 43 broken references across the Returns Analysis, LBO Summary, and Debt Schedule tabs. It proposed a diff — old formula vs. new formula, cell by cell — before writing anything. The diff looked like this for the first batch:
Returns Analysis!B12
BEFORE: =('P&L'!C15-'P&L'!C8)/'Assumptions'!$B$3
AFTER: =('Income Statement'!C15-'Income Statement'!C8)/'Inputs'!$B$3
Returns Analysis!B13
BEFORE: =SUMIFS('P&L'!C:C,'P&L'!B:B,">="&'Assumptions'!$B$3)
AFTER: =SUMIFS('Income Statement'!C:C,'Income Statement'!B:B,">="&'Inputs'!$B$3)
All 43 rewrites proposed, reviewed, approved. Total time: about 4 minutes including the review. Manual equivalent with find-and-replace: 25-35 minutes, and you'd still need to audit for any that got missed.
The key thing the diff review caught: 3 of the 43 references had also shifted row numbers when someone inserted rows in Q3. The AI flagged those as potentially needing manual review, rather than silently applying the wrong fix. That's the right behavior.
Where AI Reliably Saves Time
The tab-rename scenario is representative of a class of problems where the work is mechanical but fiddly: the logic is clear, the pattern is repetitive, and a mistake costs an audit rather than just a minute. As of April 2026, these are the tasks where AI earns its keep in a financial model:
| Task | Time saved | Failure mode to watch |
|---|---|---|
| Cross-tab reference repair after renaming | 25-35 min → 4 min | Row/column shifts not caught by name scan alone |
| Date format normalization (Salesforce exports) | 45 min → 8 min | Single-digit months if source isn't zero-padded |
| Pulling actuals from ERP dump into P&L | 60-80% time reduction | Mapping errors when account codes change mid-year |
| ARRAYFORMULA across 5,000 rows | Near-instant | Formula correct, assumption wrong |
| Building SUMIFS across 8-tab model | 5 min → 30 sec | Misidentified header row |
What's not in that table: validating your WACC assumptions against comps, stress-testing your terminal growth rate, checking whether your exit multiple makes sense for the sector. Those still require a human who knows what the numbers mean.
The WACC Example Done Properly
The previous version of this article name-dropped Damodaran's ERP and stopped there. That's not useful to anyone. Here's how the WACC calculation actually plays out for a Series B SaaS company in 2026, and where AI helps in the model.
Inputs as of April 2026:
- Risk-free rate: 4.35% (10-year US Treasury)
- Equity risk premium: 4.7% (Damodaran's January 2026 estimate for the US market)
- Beta: 1.35 (raw peer beta for mid-size SaaS, unlevered, re-levered at 20% debt/equity)
- Size premium: 1.8% (Duff & Phelps CRSP decile 9 for $200-500M market cap range)
- Cost of debt: 7.2% (senior secured rate for B/B+ credits)
- Tax rate: 21%
That build-up puts the cost of equity at roughly 12.5% (4.35% + 1.35 × 4.7% + 1.8%), and blending in the after-tax cost of debt at the 20% debt/equity target lands the WACC near 11.4%. Either way, nowhere near 9%. The Damodaran ERP alone isn't enough — you need the size premium, and for a pre-profitability SaaS name the beta assumption is doing a lot of work. A peer group of 8-12 public SaaS comps with negative EBITDA will give you levered betas in the 1.2-1.6 range depending on growth profile.
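The build-up itself is plain arithmetic. A sketch using only the stated inputs (the debt weight converts 20% debt/equity into debt over total capital):

```python
# Cost of equity build-up, then blend with after-tax debt.
rf = 4.35            # 10-year Treasury, %
erp = 4.7            # Damodaran ERP, %
beta = 1.35          # re-levered peer beta
size_premium = 1.8   # CRSP decile 9, %
kd = 7.2             # pre-tax cost of debt, %
tax = 0.21
de = 0.20            # target debt/equity

cost_of_equity = rf + beta * erp + size_premium        # ~12.5%
kd_after_tax = kd * (1 - tax)                          # ~5.7%
wd = de / (1 + de)                                     # debt weight ~16.7%
wacc = (1 - wd) * cost_of_equity + wd * kd_after_tax   # ~11.4%
print(round(cost_of_equity, 2), round(wacc, 2))
```

Note how much of the rate comes from the beta term (1.35 × 4.7% is over 6 points): nudge the beta assumption and the valuation moves materially, which is why the peer set matters more than the formula.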
Where AI helps in this calculation: building the peer beta lookup. If you have a tab Comps with 12 ticker rows and want to pull unlevered betas, re-lever at your target capital structure, and average them, the formula set is tedious but mechanical. ModelMonkey can write the full AVERAGE(ARRAYFORMULA(...)) structure across the tab faster than you can type it. It cannot tell you whether your peer set is right.
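The unlever/re-lever step that formula set implements is the standard Hamada adjustment. A sketch with made-up peer numbers (the betas and debt/equity ratios are illustrative, not a real peer set):

```python
# Hamada: unlever each peer beta at its own D/E, average,
# then re-lever the average at the target capital structure.
TAX = 0.21
TARGET_DE = 0.20

# (levered beta, debt/equity) for a hypothetical 4-name peer set
peers = [(1.45, 0.10), (1.25, 0.05), (1.60, 0.15), (1.30, 0.00)]

unlevered = [b / (1 + (1 - TAX) * de) for b, de in peers]
avg_unlevered = sum(unlevered) / len(unlevered)
relevered = avg_unlevered * (1 + (1 - TAX) * TARGET_DE)
print(round(relevered, 3))  # ~1.53 for these inputs
```

The sheet version is the same arithmetic spread across three columns of the Comps tab; the judgment call — which twelve tickers belong in `peers` — is the part no formula writes for you.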
The ARRAYFORMULA Date Problem
The previous version also used a date parsing example that would break on real Salesforce exports. Here's the honest version.
Salesforce exports dates as M/D/YYYY — not zero-padded. So January 5, 2026 comes out as 1/5/2026, not 01/05/2026. A MID(A2,4,2) approach to extract the day fails on single-digit months because the character positions shift.
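You can see the failure in two lines. A quick sketch of why fixed-position extraction breaks while a real date parser does not (Python standing in for the sheet functions):

```python
from datetime import datetime

padded = "01/05/2026"   # zero-padded form
raw = "1/5/2026"        # actual Salesforce output for Jan 5, 2026

# MID(A2,4,2) is a fixed-position slice: characters 4-5, 1-indexed.
print(padded[3:5])  # "05", works on the padded form
print(raw[3:5])     # "/2", garbage once the month loses its zero

# A parser keys off the separators, not the positions.
print(datetime.strptime(raw, "%m/%d/%Y").strftime("%Y-%m-%d"))  # "2026-01-05"
```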
The robust formula for normalizing these to YYYY-MM-DD across a 3,000-row export:
=ARRAYFORMULA(
IF(A2:A="","",
TEXT(DATEVALUE(A2:A),"YYYY-MM-DD")
)
)
DATEVALUE handles M/D/YYYY natively — it doesn't care about zero-padding. The TEXT wrapper converts to ISO format for SUMIFS date comparisons across your P&L tab:
=SUMIFS('Income Statement'!C:C,
'Income Statement'!B:B,">="&TEXT(Inputs!$B$3,"YYYY-MM-DD"),
'Income Statement'!B:B,"<="&TEXT(Inputs!$B$4,"YYYY-MM-DD"))
This is the kind of thing where spending 3 minutes prompting ModelMonkey to write the date normalization pipeline — including the cross-tab SUMIFS — beats 20 minutes of trial and error when the source format isn't what you expected.
What AI Writing Formulas Actually Means
The failure mode is AI writing technically correct formulas for the wrong question. It's fast enough at the mechanical part that you can outrun your own understanding of the model — approve a SUMIFS that references the right cells but aggregates the wrong thing, and your variance analysis is wrong in a way that passes sanity checks until the CFO asks why revenue is $400K higher than the management accounts.
The fix isn't slower AI — it's tighter prompts and the habit of reviewing proposed changes before approving them. ModelMonkey's approval-before-write workflow exists for exactly this reason: every formula change is a diff you can read before it touches the sheet.