Data Analysis · Intermediate · 10 min read

How to Clean Messy Data in Google Sheets

Master data cleaning techniques with formulas, functions, and AI automation. Turn messy spreadsheets into analysis-ready datasets in minutes.

Data cleaning is often the most time-consuming part of an analysis project; a commonly cited estimate puts it at 60-80% of total work time. Duplicate entries, inconsistent formatting, missing values, and numbers stored as text all stand between you and usable insights. This guide walks through practical data cleaning techniques, from built-in Google Sheets functions to AI-powered automation, for turning chaotic datasets into clean, analysis-ready tables. The goal is to cut your data cleaning time from hours to minutes.

What You'll Need

  • Basic familiarity with Google Sheets
  • A messy dataset that needs cleaning (we will cover common scenarios)
  • Understanding of basic formulas (TRIM, CLEAN)

Step-by-Step Guide

1. Identify Common Data Quality Issues

Scan your dataset to find the types of problems you need to fix.

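  • Duplicate rows: Flag repeated keys in a helper column with =COUNTIF(A:A,A2)>1 (Data → Data cleanup → Remove duplicates also reports a count, but it deletes the rows as it does so)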
  • Inconsistent formatting: Look for mixed date formats, text vs numbers, capitalization issues
  • Missing values: Scan for blank cells, "N/A", "null", or placeholder text
  • Extra whitespace: Check for leading/trailing spaces that break formulas
  • Merged cells: These break most analysis functions—flag them for unmerging
  • Data type mismatches: Numbers stored as text, dates formatted incorrectly
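
For readers who would rather script this first scan on an exported copy of the sheet, the checklist above can be approximated in a few lines of Python. This is a minimal sketch, not part of the original guide: the function name `profile`, the `PLACEHOLDERS` set, and the sample rows are all assumptions made for illustration.

```python
from collections import Counter

# Strings treated as "missing". This set is an assumption; extend it for your data.
PLACEHOLDERS = {"", "n/a", "null"}

def profile(rows):
    """Count common quality issues in a list of rows (each row is a list of cell strings)."""
    cells = [cell for row in rows for cell in row]
    row_counts = Counter(tuple(row) for row in rows)
    return {
        # Every occurrence of a row beyond its first counts as one duplicate.
        "duplicate_rows": sum(n - 1 for n in row_counts.values()),
        # Blank cells and placeholder text such as "N/A" or "null".
        "missing_cells": sum(1 for c in cells if c.strip().lower() in PLACEHOLDERS),
        # Cells with leading or trailing whitespace.
        "padded_cells": sum(1 for c in cells if c != c.strip()),
    }

# Hypothetical sample data.
rows = [
    ["Ada", " 42", "2024-01-15"],
    ["Ada", " 42", "2024-01-15"],
    ["Bob", "N/A", ""],
]
report = profile(rows)
print(report)
```

The counts give a rough baseline to compare against once cleaning is done.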

Pro Tip

Create a "Data Quality Checklist" sheet listing all issues found. Check them off as you clean—this prevents missing problems.

2. Remove Duplicates

Find and eliminate duplicate entries that skew your analysis.

  • Select your data range including headers
  • Go to Data → Data cleanup → Remove duplicates
  • Check "Data has header row" if your first row contains column names
  • Select which columns to check for duplicates (usually all columns, or just key ID fields)
  • Google Sheets will show how many duplicates were removed
  • Alternative: Use UNIQUE() function for a formula-based approach: =UNIQUE(A2:D100)
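
The choice between checking all columns and checking only key ID fields changes which rows survive. The Python sketch below illustrates that difference on hypothetical data; `remove_duplicates` and its `key_columns` parameter are names invented here, and keeping the first occurrence of each key is an assumption about the desired behavior. Like UNIQUE(), it returns a new list and leaves the input untouched.

```python
def remove_duplicates(rows, key_columns=None):
    """Return rows with later duplicates dropped, keeping the first occurrence.

    key_columns is a list of column indexes to compare; None compares whole rows.
    """
    seen = set()
    kept = []
    for row in rows:
        key = tuple(row) if key_columns is None else tuple(row[i] for i in key_columns)
        if key not in seen:
            seen.add(key)
            kept.append(row)
    return kept

# Hypothetical sample data.
rows = [
    ["1001", "Ada", "ada@example.com"],
    ["1002", "Bob", "bob@example.com"],
    ["1001", "Ada", "ada@example.com"],     # exact duplicate of the first row
    ["1002", "Robert", "bob@example.com"],  # same ID, different name
]
exact = remove_duplicates(rows)        # compares every column
by_id = remove_duplicates(rows, [0])   # compares only the ID column
print(len(exact), len(by_id))
```

Comparing every column keeps the "Robert" row because it differs in one field, while comparing only the ID drops it. Which result is right depends on whether that row is a typo or a legitimate entry.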

Pro Tip

Before removing duplicates, create a backup sheet. Sometimes "duplicates" are actually legitimate entries with slight differences.

3. Clean Text: Remove Spaces and Special Characters

Fix whitespace and unwanted characters that break formulas.

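  • Remove leading/trailing spaces: =TRIM(A2) strips leading and trailing spaces and collapses repeated internal spaces to one
  • Remove ALL spaces: =SUBSTITUTE(A2," ","") if you need to remove internal spaces too
  • Remove line breaks: =CLEAN(A2) removes non-printable characters, including line breaks (it deletes them without inserting a space, so adjacent words can run together)
  • Combine both: =TRIM(CLEAN(A2)) for thorough text cleaning
  • Remove specific characters: =SUBSTITUTE(A2,"$","") removes dollar signs
  • Apply to entire column: Wrap the formula in ARRAYFORMULA to clean all rows at once, e.g. =ARRAYFORMULA(TRIM(A2:A))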
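
To make the order of operations in =TRIM(CLEAN(A2)) concrete, here is a small Python sketch of the same two passes. The functions `clean`, `trim`, and `clean_text` are hypothetical stand-ins written for this example; they approximate the Sheets functions (control characters with codes 0-31, and plain space characters) and are not guaranteed to match them on every input.

```python
import re

def clean(text):
    """Drop ASCII control characters (codes 0-31), roughly what CLEAN targets."""
    return "".join(ch for ch in text if ord(ch) >= 32)

def trim(text):
    """Strip leading/trailing spaces and collapse runs of spaces to one, like TRIM."""
    return re.sub(r" +", " ", text).strip(" ")

def clean_text(text):
    """Apply clean first, then trim, mirroring =TRIM(CLEAN(A2))."""
    return trim(clean(text))

print(repr(clean_text("  Acme \n  Corp  ")))
print(repr(clean_text("line1\nline2")))
print(repr("$1,200".replace("$", "")))
```

The second call shows a caveat worth checking in your own data: removing a line break outright joins the words on either side of it.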

Pro Tip

Use Find & Replace (Ctrl+H) for simple character removal. For complex cleaning, formulas give you more control and can be reused.

4. Standardize Text Formatting

Make text consistent for proper grouping and analysis.

  • Convert to UPPERCASE: =UPPER(A2) for consistent capitalization
  • Convert to lowercase: =LOWER(A2) for email addresses or URLs
  • Convert to Title Case: =PROPER(A2) capitalizes first letter of each word
  • Fix inconsistent names: Use Find & Replace to standardize "Inc" vs "Inc." vs "Incorporated"
  • Extract first/last names: =SPLIT(A2," ") separates a full name into columns at each space (names with more than two words spill into extra columns)
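
Standardizing variants such as "Inc", "Inc." and "Incorporated" is easier to get right with an explicit lookup than with repeated blanket replacements, which can turn "Inc." into "Inc..". The Python sketch below shows one way to do that; the `SUFFIXES` mapping and both function names are hypothetical, and the choice of "Inc." as the canonical form is an assumption.

```python
# Hypothetical mapping from suffix variants to one canonical form.
SUFFIXES = {"inc": "Inc.", "inc.": "Inc.", "incorporated": "Inc."}

def standardize_company(name):
    """Normalize only the final word of a company name, and only if it is a known suffix."""
    words = name.split()
    if words and words[-1].lower() in SUFFIXES:
        words[-1] = SUFFIXES[words[-1].lower()]
    return " ".join(words)

def split_name(full_name):
    """Split on the first space only, so middle names stay with the last name."""
    first, _, rest = full_name.strip().partition(" ")
    return first, rest

for raw in ["Acme Incorporated", "Acme inc", "Acme Inc."]:
    print(standardize_company(raw))
print(split_name("Mary Ann Lee"))
print(" JANE.Doe@Example.COM ".strip().lower())
```

Matching only the last word leaves a name like "Incorporated Widgets" untouched, which a global Find & Replace would not.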

Pro Tip

After applying text formulas, copy results and "Paste special → Values only" to replace formulas with clean data.

5

Fix Data Types and Formatting

Convert text to numbers, fix date formats, and ensure data types match.

  • Convert text to numbers: =VALUE(A2) or multiply by 1: =A2*1
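  • Fix date formatting: Format → Number → Date, then choose your preferred format (this changes how true dates display; dates stored as text must be parsed first)
  • Parse dates from text: =DATEVALUE("January 15, 2024") converts text to a date serial number; apply a date format to the cell to display it as a date
  • Convert numbers to text: =TO_TEXT(A2) or add apostrophe prefix
  • Identify data type issues: With default alignment, numbers stored as text sit left-aligned and true numbers right-aligned; =ISNUMBER(A2) confirms which one you have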
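
The same conversions can be sketched in Python for data that has been exported from the sheet. This is an illustration under stated assumptions rather than a mirror of VALUE() or DATEVALUE(): the helper names, the accepted number pattern, and the `DATE_FORMATS` list are all choices made for this example, and the `%B` month-name format depends on the system locale being English.

```python
import re
from datetime import datetime

# Plain integers or decimals with an optional leading minus sign.
NUMBER = re.compile(r"-?\d+(\.\d+)?")

def to_number(text):
    """Parse '$1,200.50'-style text into a float; return None if it does not look numeric."""
    stripped = text.strip().replace("$", "").replace(",", "")
    return float(stripped) if NUMBER.fullmatch(stripped) else None

# Formats tried in order. This list is an assumption about the input data.
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%B %d, %Y")

def to_iso_date(text):
    """Return the date as YYYY-MM-DD, or None if no known format matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None

print(to_number(" $1,200.50 "), to_number("N/A"))
print(to_iso_date("January 15, 2024"), to_iso_date("01/15/2024"), to_iso_date("15/01/2024"))
```

Returning None for anything unparseable keeps bad values visible, so they can be reviewed instead of being silently coerced.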

Pro Tip

Use Conditional Formatting to highlight cells with data type issues. In Format → Conditional formatting, choose "Custom formula is" and enter a test such as =ISTEXT(A2) to flag text values in a column that should be numeric.

6. Handle Missing Values

Decide how to treat blank cells and incomplete data.

  • Find blank cells: Sheets has no equivalent of Excel's Go To Special, so add a filter (Data → Create a filter) and filter the column on (Blanks)
  • Replace blanks with zero: =IF(ISBLANK(A2),0,A2)
  • Replace blanks with "Unknown": =IF(ISBLANK(A2),"Unknown",A2)
  • Fill down from above: Select a range and press Ctrl+D to copy the top cell into every cell below it in the selection (this overwrites non-blank cells too)
  • Flag missing data: =IF(ISBLANK(A2),"MISSING",A2) to track incomplete rows
  • Remove rows with missing critical fields: =FILTER(A2:D100, NOT(ISBLANK(A2:A100))) keeps only rows where column A is filled
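
The strategies above behave quite differently, which a short Python sketch on hypothetical data makes visible. All names here are invented for the example. Note one deliberate difference from ISBLANK(): `is_blank` also treats whitespace-only cells as blank.

```python
def is_blank(cell):
    """True for None, empty strings, and whitespace-only strings."""
    return cell is None or str(cell).strip() == ""

def fill_blanks(column, default):
    """Replace every blank with a fixed default value."""
    return [default if is_blank(c) else c for c in column]

def fill_down(column):
    """Replace each blank with the most recent non-blank value above it."""
    filled, last = [], None
    for c in column:
        if not is_blank(c):
            last = c
        filled.append(last)  # stays None while nothing non-blank has been seen yet
    return filled

def drop_rows_missing(rows, required):
    """Keep only rows where every required column index is filled."""
    return [r for r in rows if not any(is_blank(r[i]) for i in required)]

# Zero-filling changes the average: 90.0 when the blank is ignored, 60.0 when it becomes 0.
scores = [80, None, 100]
present = [s for s in scores if not is_blank(s)]
mean_ignoring_blanks = sum(present) / len(present)
zero_filled = fill_blanks(scores, 0)
mean_zero_filled = sum(zero_filled) / len(zero_filled)

print(fill_down(["East", "", "", "West", ""]))
print(mean_ignoring_blanks, mean_zero_filled)
```

The two means show why a blank and a zero are not interchangeable: the same three cells average 90 or 60 depending on how the gap is treated.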

Pro Tip

Do not automatically replace all blanks with zero—this can distort averages. Only fill blanks when it makes business sense.

7. Split and Combine Columns

Separate combined data or merge split information.

  • Split text into columns: Data → Split text to columns (choose delimiter: comma, space, etc.)
  • Split with formula: =SPLIT(A2,",") separates "Last, First" into two columns (wrap the results in TRIM if a space is left in front of the second part)
  • Combine columns: =A2&" "&B2 or =CONCATENATE(A2," ",B2) to merge
  • Join with delimiter: =JOIN(", ",A2:D2) to combine with commas
  • Extract specific parts: =LEFT(A2,3) for first 3 characters, =RIGHT(A2,4) for last 4
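
As a rough Python counterpart to the operations above, the sketch below splits a "Last, First" value, recombines it, joins a row with a delimiter, and slices fixed-width parts from a code. The function names and sample values are hypothetical, and splitting on the first comma only is an assumption about the name format.

```python
def split_last_first(name):
    """Split 'Last, First' on the first comma and trim each part."""
    last, _, first = name.partition(",")
    return last.strip(), first.strip()

def join_cells(cells, delimiter=", "):
    """Join a row of cells into one string, converting non-text values first."""
    return delimiter.join(str(c) for c in cells)

last, first = split_last_first("Lovelace, Ada")
full_name = f"{first} {last}"             # same idea as =A2&" "&B2
joined = join_cells(["a", 1, "b"])        # same idea as =JOIN(", ",A2:D2)
code = "INV-2024-0042"
prefix, suffix = code[:3], code[-4:]      # same idea as =LEFT(A2,3) and =RIGHT(A2,4)

print(full_name, "|", joined, "|", prefix, suffix)
```

Trimming each part after the split removes the space that follows the comma, so the pieces are ready to recombine.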

Pro Tip

When splitting columns, make sure you have enough empty columns to the right. Data → Split text to columns overwrites existing data, while a SPLIT formula returns a #REF! error if other cells are in its way.

8. Validate and Verify Clean Data

Check that your cleaning worked and data is analysis-ready.

  • Check for remaining duplicates: =COUNTA(A2:A1000)-COUNTUNIQUE(A2:A1000) should return zero for a column of unique IDs (Remove duplicates has no preview mode; it always deletes)
  • Verify data types: Sort each column—numbers and dates should sort logically
  • Count blanks: =COUNTBLANK(A2:A1000) should return zero for required fields
  • Check for outliers: Use MIN/MAX functions to spot impossible values
  • Test formulas: Run a sample analysis to ensure cleaned data works correctly
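
These checks can be bundled into one report that is re-run after each cleaning pass. The Python sketch below is one possible shape for it, under assumptions: the `validate` function, its parameters, and the 0-120 range for a hypothetical age column are all invented for illustration.

```python
def validate(rows, key_index, required, numeric_index, low, high):
    """Summarize remaining problems: duplicate keys, blanks, wrong types, out-of-range values."""
    keys = [r[key_index] for r in rows]
    # Only true numbers are range-checked; anything else is counted as non-numeric.
    values = [r[numeric_index] for r in rows if isinstance(r[numeric_index], (int, float))]
    return {
        "rows": len(rows),
        "duplicate_keys": len(keys) - len(set(keys)),
        "blank_required": sum(1 for r in rows for i in required if r[i] in ("", None)),
        "non_numeric": len(rows) - len(values),
        "out_of_range": sum(1 for v in values if not low <= v <= high),
    }

# Hypothetical sample data: ID, name, age.
rows = [
    ["1001", "Ada", 36],
    ["1002", "Bob", 210],  # impossible age
    ["1002", "", "41"],    # repeated ID, blank name, age stored as text
]
report = validate(rows, key_index=0, required=[0, 1], numeric_index=2, low=0, high=120)
print(report)
```

A fully clean dataset would show zero for every field except "rows".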

Pro Tip

Create a "Data Quality Dashboard" with counts of rows, blanks, duplicates, and data type checks. Update it after each cleaning step.

Wrapping Up

You now have professional data cleaning skills that transform messy spreadsheets into analysis-ready datasets. From removing duplicates to standardizing formats and handling missing values, you can tackle the most common data quality issues with confidence.

However, if you are spending hours on repetitive cleaning tasks, there is a faster way. ModelMonkey can automate your entire data cleaning workflow with simple AI prompts. Instead of writing formulas and clicking through menus, just tell ModelMonkey what you need:

  • "Remove duplicates and trim all whitespace"
  • "Convert all dates to YYYY-MM-DD format and replace blanks with 'Unknown'"
  • "Standardize company names and split full names into first/last columns"

ModelMonkey analyzes your data, applies the right cleaning functions, and explains what it did, all in seconds. It even learns your preferences, so repetitive cleaning becomes a one-click task.

Ready to stop wasting time on data cleaning? Start your free 14-day trial and let AI handle the tedious work.

Frequently Asked Questions

What is the fastest way to clean messy data in Google Sheets?

Use built-in tools like Data → Data cleanup → Remove duplicates and Data → Split text to columns for quick fixes. For complex cleaning, combine TRIM(), CLEAN(), and ARRAYFORMULA to process entire columns at once. However, AI tools like ModelMonkey can automate the entire process with simple prompts, reducing hours of work to seconds.

How do I remove duplicates in Google Sheets without deleting important data?

Before removing duplicates, create a backup copy of your sheet. Use Data → Data cleanup → Remove duplicates and carefully select which columns to check, because rows that look like duplicates sometimes differ in important fields. Alternatively, use =UNIQUE(A2:D100) to create a cleaned copy while keeping your original data intact.

How can I convert text to numbers in Google Sheets?

Use the VALUE() function: =VALUE(A2) converts text numbers to real numbers. Alternatively, multiply by 1: =A2*1. For entire columns, use Format → Number → Number to force correct data type. If numbers have currency symbols or commas, remove them first with SUBSTITUTE().

What formulas clean data in Google Sheets?

Essential data cleaning formulas: TRIM() removes extra spaces, CLEAN() removes non-printable characters, UPPER()/LOWER()/PROPER() standardize capitalization, SUBSTITUTE() replaces unwanted characters, SPLIT() separates combined data, and VALUE() converts text to numbers. Combine these with ARRAYFORMULA to clean entire columns instantly.

How do I handle missing values in my dataset?

It depends on your analysis goals. Use =IF(ISBLANK(A2),0,A2) to replace blanks with zero, or =IF(ISBLANK(A2),"Unknown",A2) for text placeholders. For critical fields, filter out incomplete rows with =FILTER(A2:D100, NOT(ISBLANK(A2:A100))). Never arbitrarily fill blanks—this can skew statistical analysis.
