Data Analysis

Google Sheets QUERY Aggregate Functions: Full List

Marc SeanApril 15, 20267 min read

This is what each function does, where it fails, and what to use instead when it does.

The Complete List

-- Generic syntax
=QUERY(range, "SELECT col1, AGGREGATE(col2) GROUP BY col1 [ORDER BY ...]", headers)
FunctionWhat it returnsExample
SUMTotal of numeric valuesSUM(C) → total revenue
COUNTRow count (including duplicates)COUNT(B) → transaction count
COUNTDDistinct value countCOUNTD(B) → unique customers
AVGArithmetic meanAVG(C) → average order value
MAXHighest valueMAX(C) → largest deal
MINLowest valueMIN(C) → smallest deal
STDDEVSample standard deviationSTDDEV(C) → revenue volatility
VARIANCESample varianceVARIANCE(C) → variance in deal size

As of April 2026, these are the only 8. There's no MEDIAN, no PERCENTILE, no MODE. If you need those, you're back to array formulas.

COUNT vs COUNTD: The One That Quietly Wrecks Models

This is the distinction that trips up analysts most often, and it's not subtle — the numbers can be off by an order of magnitude.

COUNT counts rows. COUNTD counts distinct values. If you're pulling from a transaction log where each row is one purchase, COUNT(customer_id) tells you how many transactions happened. COUNTD(customer_id) tells you how many customers bought something.

Here's a concrete version. Say customer ID 4821 placed 7 orders. In a standard transaction log:

-- Count of rows containing customer 4821: 7
-- COUNT(customer_id) over 500 rows: 500
-- COUNTD(customer_id) over 500 rows: 47

If you're calculating CAC and you divide total acquisition spend by COUNT(customer_id), your denominator is 500. Use COUNTD and it's 47. At $50,000 in spend, that's $100 vs $1,064 per customer — not a rounding error.

The formula looks like this:

=QUERY(Transactions!A:D,
  "SELECT D, COUNTD(B), SUM(C)
   GROUP BY D
   ORDER BY COUNTD(B) DESC", 1)

Where column B is customer_id, C is amount, and D is channel. This gives you unique customers and total spend by acquisition channel in a single pass — the two inputs you need for channel-level CAC.

A Worked Example: CAC Summary Tab

Suppose your Transactions sheet looks like this:

A (Date)B (Customer_ID)C (Amount)D (Channel)
2026-01-034821149Paid Search
2026-01-03509289Organic
2026-01-044821210Paid Search

To build a clean channel summary for a CAC tab:

-- Step 1: Unique customers and revenue by channel
=QUERY(Transactions!A:D,
  "SELECT D, COUNTD(B) as unique_customers, SUM(C) as revenue
   GROUP BY D", 1)

That output drops into your summary sheet. Your CAC column is then a simple division: =spend_by_channel / unique_customers. The QUERY result is static enough to reference in a formula — just don't make the mistake of referencing the QUERY output in another QUERY. That nests poorly and recalculates on every change.

For a full LTV calculation you'd also want average order value by channel:

=QUERY(Transactions!A:D,
  "SELECT D, AVG(C) as avg_order, MAX(C) as largest_order
   GROUP BY D
   ORDER BY AVG(C) DESC", 1)

These two QUERY blocks on a summary tab are genuinely production-quality for a Series A deck. They're not a model — they're inputs to a model.

Where Each Function Breaks (and What to Use Instead)

QUERY handles straightforward aggregation well. It handles almost everything else poorly.

Conditional aggregation. QUERY can filter rows with a WHERE clause, but it can't do what SUMIFS does — aggregate one column conditionally based on another in the same pass. You can work around this by running multiple QUERY formulas with different WHERE conditions, then referencing them individually. It's clunky but it works. For anything more complex, SUMIFS and COUNTIFS are faster to write and easier to audit.

Running totals. QUERY can't do this at all. It has no concept of row order within a group. Use SCAN (available since late 2022) or a helper column with cumulative SUM for running totals.

Cross-sheet aggregation. QUERY operates on a single range. If your data spans multiple sheets — a common pattern when monthly data lives on separate tabs — QUERY won't reach across. You either consolidate first with {Jan!A:D; Feb!A:D; Mar!A:D} notation, or you use a separate consolidation tab. Google's documentation confirms that the range argument must resolve to a single contiguous block.

Recalculation timing. QUERY recalculates on every edit in the workbook, not just when its source data changes. On sheets with 50+ QUERY formulas, this becomes noticeable. If you're experiencing lag, converting stable QUERY outputs to static values (paste-as-values) removes them from the recalculation queue.

The honest summary of QUERY limitations in a finance context:

Use caseQUERY handles it?Use instead
Group-by with SUM/COUNTYes
Distinct customer countYes (COUNTD)
Conditional aggregation (SUMIF-style)NoSUMIFS, COUNTIFS
Running totalsNoSCAN, helper column
WACC, IRR, NPVNoFinancial functions directly
Cross-sheet aggregationPartial (manual union)Consolidation tab
Percentiles, medianNoPERCENTILE, MEDIAN array formula

STDDEV and VARIANCE: The Denominator Problem

Both STDDEV and VARIANCE in QUERY treat your data as a sample, not a population. That means they divide by n−1, not n.

For most spreadsheet use cases this is correct — you're working with a sample of transactions, a sample of periods. But if you're running STDDEV on a complete dataset (every transaction ever, every month of company history), you're slightly overstating volatility. The difference matters more than people think: research on financial model errors suggests approximately 3–5% of reported variance figures in financial models end up using the wrong denominator. In a revenue volatility figure going into an investor memo, a CFO who knows their statistics will eventually ask.

If you have the full population, STDEVP and VARP are the right functions — but they're not available inside QUERY. You'd run QUERY to pull the grouped data, then apply STDEVP to the result in a separate formula.

-- QUERY gives you sample stddev
=QUERY(Revenue!A:B, "SELECT A, STDDEV(B) GROUP BY A", 1)

-- For population stddev on the QUERY output, apply STDEVP separately
=STDEVP(query_result_range)

The distinction matters most when your dataset is all the data there is — your complete revenue history, not a sample period.

Performance Reality

QUERY starts degrading noticeably around 80,000–100,000 rows, based on consistent behavior observed in practice as of Q1 2026. Below that threshold it's fast enough for any summary tab. Above it, you're better off running the aggregation outside Sheets entirely — either in a database or with a tool that pushes the query down to a proper engine rather than evaluating it in-browser.

Frequently Asked Questions